# Homework: Implementing a RAG Example with Ollama and Mistral LLM

## Overview

In this homework, you will work on a practical application of the DataStax RAGStack. The goal is to modify this Jupyter Notebook, which currently uses an OpenAI LLM (large language model) for a RAG example. Your task is to adapt the notebook to use Ollama running the Mistral LLM as the backbone of the RAG implementation.

## Why Ollama?

Ollama lets you run an LLM on a local machine. Self-managed LLMs are of particular interest to customers running Cassandra or DSE on-prem or in internet-restricted environments, and to Cassandra, DSE, and Astra DB users who are cautious about sending sensitive data to cloud-based LLM services because of privacy and cost concerns. Because the model runs locally, prompts and documents sent to the LLM stay on your machine. This is useful for demonstrations and matches the requirements of businesses that must keep critical data within their own controlled environment.

For those seeking self-managed LLMs, models such as Mistral are available and offer performance comparable to OpenAI's models for many tasks. See the [Mistral documentation](https://huggingface.co/docs/transformers/main/en/model_doc/mistral) for details. Mistral is designed for easy installation and can be hosted on the computing resources available in customer data centers.

## Objectives

1. **Understand the Current Implementation**: Begin by familiarizing yourself with the existing Jupyter Notebook. It uses DataStax's RAGStack, integrating Astra DB as a vector store, and employs an OpenAI LLM for generating responses.

2. **Transition to Ollama and Mistral LLM**: Your primary task is to modify the code in the notebook to replace the OpenAI LLM with the Mistral LLM served by Ollama. This involves understanding the differences between the two setups and adapting the API calls and data handling accordingly.

3. **Test and Validate**: Keep in mind that both the notebook and Ollama run on your local machine. After implementing the changes, run the notebook end to end to confirm that it works correctly with the new LLM.
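
Before running the notebook, it can help to confirm that the local Ollama server is reachable and that a Mistral model has been pulled. The following is a minimal sketch, assuming Ollama is listening on its default address `http://localhost:11434` and exposes the model list at `/api/tags`; the helper names (`model_names`, `has_model`, `ollama_has_model`) are hypothetical and not part of the original notebook.

```python
import json
import urllib.error
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_BASE_URL = "http://localhost:11434"


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON returned by Ollama's /api/tags."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]


def has_model(tags_payload: dict, wanted: str = "mistral") -> bool:
    """Return True if a local model matches `wanted`, ignoring the ':tag' suffix."""
    return any(name.split(":")[0] == wanted for name in model_names(tags_payload))


def ollama_has_model(wanted: str = "mistral", base_url: str = OLLAMA_BASE_URL) -> bool:
    """Ask the local Ollama server for its models; return False if it is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Server not running, connection refused, timeout, or invalid JSON
        return False
    return has_model(payload, wanted)
```

Calling `ollama_has_model()` in a notebook cell should return `True` once the server is running and the model is available; if it returns `False`, starting Ollama and running `ollama pull mistral` is the usual first thing to check.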

## Resources

- **Ollama Documentation**: [How to install Ollama](https://ollama.com/download) and [How to run Mistral LLM powered by Ollama](https://ollama.com/library/mistral/tags)

## Submission Guidelines

- Complete the task in the provided Jupyter Notebook.
- Ensure all code cells are well-documented.
- Submit the final notebook along with a brief report summarizing your approach, key challenges, and solutions.

Good luck, and feel free to reach out if you have any questions or need further clarifications!


In [1]:
!pip install ragstack-ai sentence-transformers

Collecting ragstack-ai
  Downloading ragstack_ai-0.10.0-py3-none-any.whl.metadata (3.7 kB)
Collecting sentence-transformers
  Using cached sentence_transformers-2.5.1-py3-none-any.whl.metadata (11 kB)
Collecting astrapy<0.8.0,>=0.7.0 (from ragstack-ai)
  Downloading astrapy-0.7.7-py3-none-any.whl.metadata (12 kB)
Collecting cassio<0.2.0,>=0.1.3 (from ragstack-ai)
  Using cached cassio-0.1.5-py3-none-any.whl.metadata (4.1 kB)
Collecting langchain==0.1.12 (from ragstack-ai)
  Downloading langchain-0.1.12-py3-none-any.whl.metadata (13 kB)
Collecting langchain-astradb==0.1.0 (from ragstack-ai)
  Downloading langchain_astradb-0.1.0-py3-none-any.whl.metadata (2.6 kB)
Collecting langchain-community==0.0.28 (from ragstack-ai)
  Downloading langchain_community-0.0.28-py3-none-any.whl.metadata (8.3 kB)
Collecting langchain-core==0.1.31 (from ragstack-ai)
  Downloading langchain_core-0.1.31-py3-none-any.whl.metadata (6.0 kB)
Collecting langchain-openai==0.0.8 (from ragstack-ai)
  Downloading lang

In [1]:
import getpass

ASTRA_DB_API_ENDPOINT = input("Please provide your ASTRA_DB_API_ENDPOINT: ")
ASTRA_DB_APPLICATION_TOKEN = getpass.getpass("Enter your ASTRA_DB_APPLICATION_TOKEN: ")

In [5]:
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"

In [6]:
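import requests

# Download the sample essay used as the RAG corpus
response = requests.get('https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt')
response.raise_for_status()  # fail early if the download did not succeed
text = response.text

# Save the essay locally so TextLoader can read it later
with open('essay.txt', 'w') as f:
    f.write(text)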

In [7]:
!pip install -U langchain-mistralai

Collecting langchain-mistralai
  Downloading langchain_mistralai-0.0.5-py3-none-any.whl.metadata (2.3 kB)
Collecting mistralai<0.2,>=0.1 (from langchain-mistralai)
  Downloading mistralai-0.1.6-py3-none-any.whl.metadata (1.9 kB)
Collecting httpx<0.26.0,>=0.25.2 (from mistralai<0.2,>=0.1->langchain-mistralai)
  Using cached httpx-0.25.2-py3-none-any.whl.metadata (6.9 kB)
Collecting pyarrow<16.0.0,>=15.0.0 (from mistralai<0.2,>=0.1->langchain-mistralai)
  Downloading pyarrow-15.0.1-cp311-cp311-macosx_11_0_arm64.whl.metadata (3.0 kB)
Downloading langchain_mistralai-0.0.5-py3-none-any.whl (10 kB)
Downloading mistralai-0.1.6-py3-none-any.whl (15 kB)
Using cached httpx-0.25.2-py3-none-any.whl (74 kB)
Downloading pyarrow-15.0.1-cp311-cp311-macosx_11_0_arm64.whl (24.2 MB)
Installing collected packages: pyarrow, httpx, mistralai, langchain-mistralai
  At

In [15]:
from langchain_community.document_loaders import TextLoader
from langchain_community.chat_models import ChatOllama  # ChatOllama replaces ChatOpenAI (or ChatMistralAI)
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import AstraDB
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains import create_retrieval_chain

In [16]:
# Load data
loader = TextLoader("essay.txt")
docs = loader.load()
# Split text into chunks
text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)
# Define the embedding model
embeddings = HuggingFaceEmbeddings()
#vector = load_vector_store()
#vector = AstraDB.from_documents(documents, embeddings,collection_name="openai_demo", api_endpoint=ASTRA_DB_API_ENDPOINT, token=ASTRA_DB_APPLICATION_TOKEN)
vector = AstraDB.from_documents(documents, embeddings, collection_name="mistral_demo", api_endpoint=ASTRA_DB_API_ENDPOINT, token=ASTRA_DB_APPLICATION_TOKEN)
# Define a retriever interface
retriever = vector.as_retriever()

# Use the Mistral model served by the local Ollama instance
# (ChatOllama connects to http://localhost:11434 by default)
model = ChatOllama(model="mistral")
# Define prompt template
prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")

# Create a retrieval chain to answer questions
document_chain = create_stuff_documents_chain(model, prompt)
retrieval_chain = create_retrieval_chain(retriever, document_chain)
response = retrieval_chain.invoke({"input": "What were the two main things the author worked on before college?"})
print(response["answer"])


 The two main things the author worked on before college were learning piano and programming.
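
To validate the answer, it can be useful to look at which chunks the retriever actually passed to the model. The dictionary returned by `create_retrieval_chain` carries the retrieved documents under the `context` key alongside `answer`. The sketch below previews them; `summarize_context` is a hypothetical helper, and the `SimpleNamespace` object is only a stand-in for a LangChain `Document` so that the example runs without a database or model.

```python
from types import SimpleNamespace


def summarize_context(response: dict, max_chars: int = 80) -> list[str]:
    """Return one short preview line per retrieved document in `response`."""
    previews = []
    for i, doc in enumerate(response.get("context", []), start=1):
        # Collapse newlines and repeated spaces so each preview fits on one line
        text = " ".join(doc.page_content.split())
        if len(text) > max_chars:
            text = text[:max_chars].rstrip() + "..."
        previews.append(f"[{i}] {text}")
    return previews


# Stand-in response for illustration only; in the notebook, pass the real `response`.
fake_response = {
    "answer": "placeholder answer",
    "context": [SimpleNamespace(page_content="first   chunk\nof text")],
}
for line in summarize_context(fake_response):
    print(line)
```

In the notebook itself, `summarize_context(response)` can be printed right after `response["answer"]` to check that the answer is grounded in the retrieved text.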
