# Homework: Implementing a RAG Example with Ollama and Mistral LLM

## Overview

In this homework, you will work on a practical application of the DataStax RAGStack. The goal is to modify this Jupyter Notebook, which currently uses an OpenAI LLM (Large Language Model) for a RAG example. Your task is to adapt the notebook to use Ollama running the Mistral LLM as the backbone of the RAG implementation.

## Why Ollama?

Ollama lets you run an LLM on a local machine. Self-managed LLMs are of particular interest to customers running Cassandra or DSE on-premises or in internet-restricted environments, and to Cassandra, DSE, and Astra DB users who are cautious about sending sensitive data to cloud-based LLM services because of privacy and cost concerns. Because the model runs locally, the data stays within the organization's controlled environment, which suits both demonstrations and businesses handling critical data.

For those seeking self-managed LLMs, alternatives like Mistral are available, offering performance comparable to OpenAI's models. Interested parties are encouraged to [review the Mistral documentation](https://huggingface.co/docs/transformers/main/en/model_doc/mistral). Mistral is designed for easy installation and can be efficiently hosted on the robust computing resources available in customer data centers.

## Objectives

1. **Understand the Current Implementation**: Begin by familiarizing yourself with the existing Jupyter Notebook. It uses DataStax's RAGStack, integrating Astra DB as a vector store, and employs an OpenAI LLM for generating responses.

2. **Transition to Ollama and Mistral LLM**: Your primary task is to modify the code in the notebook to replace the OpenAI LLM with Ollama running Mistral LLM. This will involve understanding the differences between the two models and adapting the API calls and data handling accordingly.

3. **Test and Validate**: Keep in mind that both the notebook and Ollama run on your local machine. After implementing the changes, test the notebook to ensure that it functions correctly with the new LLM.
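
As one possible starting point for objective 2, the sketch below collects the connection settings for a local Ollama server. It assumes Ollama is listening on its default address `http://localhost:11434`, that its OpenAI-compatible API is served under `/v1`, and that the model was pulled under the name `mistral`. The helper name `ollama_chat_kwargs` is hypothetical and not part of the notebook; the keyword names mirror the ones the notebook already passes to `ChatOpenAI`.

```python
# Sketch only: connection settings for pointing an OpenAI-style chat client
# at a local Ollama server. The helper name and defaults are assumptions.

def ollama_chat_kwargs(model="mistral", host="http://localhost:11434"):
    """Return keyword arguments for an OpenAI-compatible chat client."""
    return {
        # Ollama serves its OpenAI-compatible API under the /v1 path.
        "base_url": host.rstrip("/") + "/v1",
        # A local Ollama server ignores the key, but OpenAI-style clients
        # expect a non-empty value.
        "api_key": "ollama",
        "model": model,
    }


kwargs = ollama_chat_kwargs()
print(kwargs["base_url"])

# In the notebook, the model definition could then become:
#   model = ChatOpenAI(**ollama_chat_kwargs())
```

LangChain also provides a dedicated `ChatOllama` class in `langchain_community.chat_models`, which is another route worth comparing in your report.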

## Resources

- **Ollama Documentation**: [How to install Ollama](https://ollama.com/download) and [How to run Mistral LLM powered by Ollama](https://ollama.com/library/mistral/tags)
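
Before running the notebook against Ollama, it can help to confirm that the local server is up and that a Mistral model has been pulled (for example with `ollama pull mistral`). The sketch below uses only the standard library and assumes Ollama's default address `http://localhost:11434` and its `/api/tags` endpoint, which lists the locally available models. The function names `has_model` and `fetch_tags` are hypothetical.

```python
import json
import urllib.error
import urllib.request


def has_model(tags_payload, name="mistral"):
    """Return True if a model called `name` (any tag) is listed in an /api/tags payload."""
    for entry in tags_payload.get("models", []):
        # Model names look like "mistral:latest"; compare the part before the tag.
        if entry.get("name", "").split(":")[0] == name:
            return True
    return False


def fetch_tags(host="http://localhost:11434", timeout=5):
    """Fetch the list of local models, or return None if the server cannot be reached."""
    try:
        with urllib.request.urlopen(host + "/api/tags", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None


if __name__ == "__main__":
    tags = fetch_tags()
    if tags is None:
        print("Ollama does not appear to be running on localhost:11434")
    elif has_model(tags):
        print("A mistral model is available locally")
    else:
        print("Ollama is running, but no mistral model was found")
```

Keeping the payload check separate from the network call means the logic can be exercised without a running server, which is convenient when documenting your validation steps.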

## Submission Guidelines

- Complete the task in the provided Jupyter Notebook.
- Ensure all code cells are well-documented.
- Submit the final notebook along with a brief report summarizing your approach, key challenges, and solutions.

Good luck, and feel free to reach out if you have any questions or need further clarifications!


In [None]:
%pip install ragstack-ai sentence-transformers

In [27]:
import configparser

# Create a config parser object
config = configparser.ConfigParser()

# Read the config.ini file
config.read('config.ini')

# Access the configurations
ASTRA_DB_API_ENDPOINT = config['Settings']['ASTRA_DB_API_ENDPOINT']
ASTRA_DB_APPLICATION_TOKEN = config['Settings']['ASTRA_DB_APPLICATION_TOKEN']
OPENAI_API_KEY = config['Settings']['OPENAI_API_KEY']
lm_studio_model = config['Settings']['LM_STUDIO_MODEL']

In [2]:
import os
os.environ["TOKENIZERS_PARALLELISM"] = "false"

In [3]:
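import requests

# Download the sample essay used as the RAG source document
response = requests.get('https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt')
response.raise_for_status()  # fail early if the download did not succeed
text = response.text

# Save the essay locally; the context manager closes the file even on error
with open('essay.txt', 'w', encoding='utf-8') as f:
    f.write(text)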

In [21]:
from langchain_community.document_loaders import TextLoader
from langchain_openai import ChatOpenAI
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import AstraDB
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains import create_retrieval_chain

In [None]:
%pip show langchain_community

In [29]:
# Load data
loader = TextLoader("essay.txt")
docs = loader.load()
# Split text into chunks
text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)
# Define the embedding model (HuggingFaceEmbeddings with its default model)
embeddings = HuggingFaceEmbeddings()
# Embed the chunks and store them in an Astra DB collection
vector = AstraDB.from_documents(
    documents,
    embeddings,
    collection_name="openai_demo",
    api_endpoint=ASTRA_DB_API_ENDPOINT,
    token=ASTRA_DB_APPLICATION_TOKEN,
)
# Define a retriever interface
retriever = vector.as_retriever()
# Define LLM: an OpenAI-compatible server on localhost, so no real API key is needed
#model = ChatOpenAI(openai_api_key=OPENAI_API_KEY)
model = ChatOpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")
# Define prompt template
prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:

<context>
{context}
</context>

Question: {input}""")

# Create a retrieval chain to answer questions
document_chain = create_stuff_documents_chain(model, prompt)
retrieval_chain = create_retrieval_chain(retriever, document_chain)
response = retrieval_chain.invoke({"input": "What were the two main things the author worked on before college?"})
print(response["answer"])



Based on the provided context, there are no specific two main things that the author worked on before college. The response is not relevant to the given context.
