# Build Your Own RAG using RAGStack
This notebook walks through using the DataStax Enterprise v7 Vector Store to make LLM interactions meaningful and to reduce hallucinations. The approach taken here is Retrieval Augmented Generation (RAG).

You'll learn:
1. About the content in a CNN dataset
2. How to interact with the Mistral Chat Model *without* providing this context
3. How to load this context into DataStax Enterprise v7
4. How to run a semantic similarity search on DataStax Enterprise v7
5. How to use this context *with* the local Mistral Chat Model

## Install dependencies

In [None]:
!pip install ragstack-ai sentence-transformers datasets pipdeptree langchain_ollama pypdf



## Visualize RAGStack dependencies
RAGStack is a curated stack of the best open-source software for easing implementation of the RAG pattern in production-ready applications using DataStax Enterprise, Astra Vector DB or Apache Cassandra as a vector store.

A single command (pip install ragstack-ai) unlocks all the open-source packages required to build production-ready RAG applications with LangChain and DataStax Enterprise, Astra Vector DB or Apache Cassandra.

For each open-source project included in RAGStack, we select a version lineup and then test the combination for compatibility, performance, and security. Our extensive test suite ensures that RAGStack components work well together so you can confidently deploy them in production. We also run security scans on all components using industry-standard tools to ensure that you are not exposed to known vulnerabilities.

In [2]:
!pipdeptree -p ragstack-ai

ragstack-ai==1.1.0
├── ragstack-ai-colbert [required: ==1.0.6, installed: 1.0.6]
│   ├── cassio [required: >=0.1.7,<0.2.0, installed: 0.1.10]
│   │   ├── cassandra-driver [required: >=3.28.0,<4.0.0, installed: 3.29.2]
│   │   │   └── geomet [required: >=0.1,<0.3, installed: 0.2.1.post1]
│   │   │       ├── click [required: Any, installed: 8.1.7]
│   │   │       └── six [required: Any, installed: 1.16.0]
│   │   ├── numpy [required: >=1.0, installed: 1.26.4]
│   │   └── requests [required: >=2.31.0,<3.0.0, installed: 2.32.3]
│   │       ├── certifi [required: >=2017.4.17, installed: 2023.7.22]
│   │       ├── charset-normalizer [required: >=2,<4, installed: 3.3.0]
│   │       ├── idna [required: >=2.5,<4, installed: 3.4]
│   │       └── urllib3 [required: >=1.21.1,<3, installed: 2.0.7]
│   ├── colbert-ai [required: ==0.2.19, installed: 0.2.19]
│   │   ├── bitarray [required: Any, installed: 3.0.0]
│   │   ├── datasets [required: Any, installed: 2.19.2]
│   │   │   ├── aiohttp [required:

## Keeping it all local and within the enterprise firewall
In this notebook we'll keep all services local to ensure maximum safety:

- For the Vector Database, [DataStax Enterprise 7](https://www.datastax.com/blog/get-started-with-the-datastax-enterprise-7-0-developer-vector-search-preview) will be used.
- For the Foundational Model we'll be using [Mistral](https://mistral.ai/).

Read more about Mistral and how it stacks up to GPT-4 [here](https://www.zdnet.com/article/what-to-know-about-mistral-ai-the-company-behind-the-latest-gpt-4-rival/).

# Get an inference engine with Mistral started
There are a multitude of inference engines. You can go for [LM Studio](https://lmstudio.ai/) which has a nice UI. In this notebook, we'll use [Ollama](https://ollama.com/).

1. Get started by [downloading](https://ollama.com/download)
2. Install it to your machine
3. Start the inference engine, while downloading Mistral (~4GB) with the command `ollama run mistral` in a terminal

In case this all fails, because of RAM limitations, you can opt to use [tinyllama](https://ollama.com/library/tinyllama) as a model.
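
Before calling the model from the notebook, it can help to confirm that Ollama is reachable. The sketch below is not part of the original notebook: it assumes Ollama's REST API exposes its model list at `/api/tags`, and the helper names `ollama_tags_url` and `list_ollama_models` are hypothetical. It returns `None` when the server cannot be reached.

```python
import json
import urllib.request
from typing import List, Optional


def ollama_tags_url(base_url: str) -> str:
    """Build the model-listing URL (assumed endpoint: /api/tags)."""
    return base_url.rstrip("/") + "/api/tags"


def list_ollama_models(base_url: str, timeout: float = 2.0) -> Optional[List[str]]:
    """Return the names of locally available models, or None if unreachable."""
    try:
        with urllib.request.urlopen(ollama_tags_url(base_url), timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # OSError covers connection errors and timeouts; ValueError covers bad JSON.
        return None
    if not isinstance(payload, dict):
        return None
    return [model.get("name") for model in payload.get("models", [])]
```

For example, `list_ollama_models("http://host.docker.internal:11434")` uses the same base URL as the cells in this notebook. If `mistral:latest` is missing from the result, the model download has probably not finished yet.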

## Call Mistral's Chat Model
In this example we'll first run a quick sanity check with a general-knowledge question, and then ask what Daniel Radcliffe receives when he turns 18.

As Mistral has no access to the CNN documents, it will come up with some answer that is very generic.

In [2]:
from langchain_ollama import ChatOllama

chat_model = ChatOllama(model="mistral:latest", 
    num_ctx=4096,
    base_url="http://host.docker.internal:11434")

chat_model.invoke("Who was the first man on the moon?")

AIMessage(content=' The first man on the moon was Neil Armstrong. He walked on the lunar surface as part of the Apollo 11 mission, which launched from Kennedy Space Center on July 16, 1969, and landed on the Moon on July 20, 1969. Armstrong\'s famous words upon taking his first steps on the moon were "That\'s one small step for man, one giant leap for mankind." Buzz Aldrin was the second man to walk on the moon, following Armstrong after they had landed the Apollo Lunar Module Eagle.', response_metadata={'model': 'mistral:latest', 'created_at': '2024-11-18T13:47:41.363715Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 5093749875, 'load_duration': 11863834, 'prompt_eval_count': 14, 'prompt_eval_duration': 462000000, 'eval_count': 130, 'eval_duration': 4618000000}, id='run-520719e8-a2d7-4882-b628-050c9c7a6e07-0', usage_metadata={'input_tokens': 14, 'output_tokens': 130, 'total_tokens': 144})

In [3]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
from langchain.schema.runnable import RunnableMap
from langchain.schema.output_parser import StrOutputParser

template = """
You are a philosopher that draws inspiration from great thinkers of the past
to craft well-thought answers to user questions. Use the provided context as the basis
for your answers and do not make up new reasoning paths - just mix-and-match what you are given.
Your answers must be extensively written.
"""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", 
         template
         ),
         ("human", "QUESTION: {question}"
          ),
    ]
)

llm = ChatOllama(
    model="mistral:latest", 
    num_ctx=4096,
    base_url="http://host.docker.internal:11434"
)

# inputs = RunnableMap({
#   'question': lambda x: x['question']
# })
chain =  prompt | llm | StrOutputParser()

chain.invoke({"question": "What kind of fortune does Daniel Radcliffe get when he turns 18?"})

" To address your question, it is important to note that the person in question, Daniel Radcliffe, is a fictional character from the Harry Potter series, and his fortune or events related to turning 18 are part of a narrative constructed by author J.K. Rowling rather than real-life events. However, we can draw upon philosophical insights from various thinkers to discuss the concept of fortune and its interpretation in this context.\n\nFirstly, Aristotle's Nicomachean Ethics offers a detailed exploration of the meaning of happiness or eudaimonia. For Aristotle, eudaimonia is not simply a matter of possessing external goods like wealth, power, or fame, but rather it is a state of living well and flourishing as a human being. In this sense, we can say that Daniel Radcliffe's fortune at the age of 18, when measured by his external success as Harry Potter, is significant, but it does not necessarily equate to true eudaimonia unless he is using that fortune to live a virtuous and fulfilling 

## Load data from CNN

In [4]:
import datasets

def load_articles(n=5):
  dataset = datasets.load_dataset('cnn_dailymail', '3.0.0', split='train', streaming=True)
  data = dataset.take(n)
  return [d['article']
          for d in data]

articles = load_articles()



## Check out some content
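One of the loaded articles reports that when Daniel Radcliffe turns 18, he'll gain access to a reported £20 million. The cell below prints the last loaded article, which covers the suspension of NFL quarterback Michael Vick.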

In [35]:
print(articles[-1])

(CNN)  -- The National Football League has indefinitely suspended Atlanta Falcons quarterback Michael Vick without pay, officials with the league said Friday. NFL star Michael Vick is set to appear in court Monday. A judge will have the final say on a plea deal. Earlier, Vick admitted to participating in a dogfighting ring as part of a plea agreement with federal prosecutors in Virginia. "Your admitted conduct was not only illegal, but also cruel and reprehensible. Your team, the NFL, and NFL fans have all been hurt by your actions," NFL Commissioner Roger Goodell said in a letter to Vick. Goodell said he would review the status of the suspension after the legal proceedings are over. In papers filed Friday with a federal court in Virginia, Vick also admitted that he and two co-conspirators killed dogs that did not fight well. Falcons owner Arthur Blank said Vick's admissions describe actions that are "incomprehensible and unacceptable." The suspension makes "a strong statement that con

## Generate chunks to load into the Vector Store
Now let's load the CNN data into the DataStax Enterprise Vector Store.
1. First we'll chunk up the data so that it can be loaded in multiple pieces.
2. Then we'll create a new Vector Store on DataStax Enterprise.
3. Lastly, we'll load up the documents. As part of this step, the data will be vectorized and its embeddings stored in the Vector Store.
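
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of fixed-size chunking. It is an illustration only: `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraphs and line breaks before falling back to raw character counts, and the helper name `chunk_text` is hypothetical.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 0) -> List[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so each new
    window starts chunk_size - chunk_overlap characters after the last.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]
```

With `chunk_overlap=0`, as used in this notebook, the chunks do not share any text. A small overlap can help when an answer straddles a chunk boundary, at the cost of storing some text twice.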

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)

documents = splitter.create_documents(articles)
document_chunks = splitter.split_documents(documents)

print(document_chunks[0])

page_content='for release later this year. He will also appear in "December Boys," an Australian film about four boys who escape an orphanage. Earlier this year, he made his stage debut playing a tortured teenager in Peter Shaffer's "Equus." Meanwhile, he is braced for even closer media scrutiny now that he's legally an adult: "I just think I'm going to be more sort of fair game," he told Reuters. E-mail to a friend . Copyright 2007 Reuters. All rights reserved.This material may not be published, broadcast, rewritten, or redistributed.'
page_content='LONDON, England (Reuters) -- Harry Potter star Daniel Radcliffe gains access to a reported £20 million ($41.1 million) fortune as he turns 18 on Monday, but he insists the money won't cast a spell on him. Daniel Radcliffe as Harry Potter in "Harry Potter and the Order of the Phoenix" To the disappointment of gossip columnists around the world, the young actor says he has no plans to fritter his cash away on fast cars, drink and celebrity p

# Now let's run DSE 7 Vector Store
Make sure you have [Docker](https://www.docker.com/) installed.

Run DSE 7 in either of these two ways from a terminal window:
1. `docker-compose up` (using the docker-compose.yml file in the root of this repository)
2. `docker run -e DS_LICENSE=accept -p 9042:9042 datastax/dse-server:7.0.0-alpha.4`

And then create a default keyspace as follows:

In [7]:
from cassandra.cluster import Cluster

# Connect to DSE7
cluster = Cluster(["host.docker.internal"])
session = cluster.connect()

# Create the default keyspace
session.execute("CREATE KEYSPACE IF NOT EXISTS default_keyspace WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}")

<cassandra.cluster.ResultSet at 0xfffecb3becd0>

# Get the Vector Store
The following code will create a new Vector Store in DataStax Enterprise. For embeddings we'll use the default model that `HuggingFaceEmbeddings` loads from Hugging Face.

In [8]:
from langchain_community.vectorstores import Cassandra
from langchain_community.embeddings import HuggingFaceEmbeddings

# Create a new DataStax Enterprise Vector Store
vector_store = Cassandra(
    session=session,
    keyspace="default_keyspace",
    table_name="dse_vector_table",
    embedding=HuggingFaceEmbeddings()
)



In [9]:
# Load the CNN documents into the DataStax Enterprise Vector Store (only the first time)
vector_store.add_documents(document_chunks)

['8c0f759b967a4ae0b9efb799cc7c25d5',
 '417061129386498d890cd802c057e69c',
 'c7f973d591b743e98271bb1a613618db',
 '8b8c204c7aee473680faa6fc48d7bfc3',
 '99ca949816e54028bdf5c830586e4717',
 '1fa55fb3ab1247f59d47a0c3ce0dd7a3',
 '6e971c98c3574fd4858b6ae1c6d64bf0',
 'b08e86d1684340a8b9e8a5a3e585642d',
 '892f87d29454419ab11241a74f787b23',
 '5df038e187ff45f6ac45a5ce373b8aea',
 'ccb0e52de79f4d95add65422bb0cdf83',
 '98679b0cbd8d42ed899b89816028de45',
 '0537a8d708874bca92b14114dab3c303',
 '6589d009310d4781aab981a01360c8cf',
 '811ef753d17d4935978034aa11e08177',
 'a9febff2a4f742879cc458aadc258cd9',
 '17f530fbdfd84e7496ca6a3a50876e68',
 'df09378443504729bcfdc32e91e6ec38',
 'da018ab1937641db8a1e0943b73185f5',
 'a83a33c484514154828df34159a70c4f',
 'f99b713018634ea699c3a5d566e87fd7']

## Run a semantic query on the DataStax Enterprise Vector Store
Here you'll see that the Vector Store retrieves relevant documents given the query.
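
Semantic search ranks chunks by how close their embedding vectors are to the embedding of the query. The sketch below illustrates the idea with cosine similarity on toy two-dimensional vectors. It is an assumption-laden illustration: real embeddings have hundreds of dimensions, a vector store uses an index rather than this brute-force loop, and the metric the store actually applies depends on its configuration. The helper names `cosine_similarity` and `top_k` are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float], doc_vecs: List[Sequence[float]], k: int = 2) -> List[int]:
    """Return the indices of the k vectors most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]
```

The `k` argument plays the same role as `k=2` in the `similarity_search` call: it caps how many of the closest chunks are returned.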

In [10]:
query = 'What kind of fortune does Daniel Radcliffe get when he turns 18?'
vector_store.similarity_search(query, k=2)

[Document(page_content='LONDON, England (Reuters) -- Harry Potter star Daniel Radcliffe gains access to a reported £20 million ($41.1 million) fortune as he turns 18 on Monday, but he insists the money won\'t cast a spell on him. Daniel Radcliffe as Harry Potter in "Harry Potter and the Order of the Phoenix" To the disappointment of gossip columnists around the world, the young actor says he has no plans to fritter his cash away on fast cars, drink and celebrity parties. "I don\'t plan to be one of those people who, as soon as they turn 18, suddenly buy themselves a massive sports car collection or something similar," he told an Australian interviewer earlier this month. "I don\'t think I\'ll be particularly extravagant. "The things I like buying are things that cost about 10 pounds -- books and CDs and DVDs." At 18, Radcliffe will be able to gamble in a casino, buy a drink in a pub or see the horror film "Hostel: Part II," currently six places below his number one movie on the UK box 

## Call Mistral's Chat Model again
Now let's run the query again on the Mistral Chat Model, this time inserting the relevant context from the DataStax Enterprise Vector Store so that the response is meaningful and grounded in the source documents rather than hallucinated.
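
One detail worth knowing about the chain in the next cell: the `context` entry returns a list of document objects, and that list is rendered into the `{context}` placeholder as-is. A common refinement is to join only the text of each document first. The sketch below shows that step in isolation. It is not what the notebook does: `Doc` is a hypothetical stand-in for LangChain's `Document` class, and `format_docs` is an illustrative helper name.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document with a page_content field."""
    page_content: str


def format_docs(docs: Iterable[Doc], separator: str = "\n\n") -> str:
    """Join the text of the retrieved documents into one context string."""
    return separator.join(doc.page_content.strip() for doc in docs)
```

Under these assumptions, the `context` lambda could wrap the retriever call in `format_docs(...)`, which keeps metadata and object wrappers out of the prompt and leaves more of the 4096-token context window for the text itself.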

In [20]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
from langchain_core.runnables import RunnableMap
from langchain.schema.output_parser import StrOutputParser

# Get the retriever for the Chat Model
retriever = vector_store.as_retriever(
    search_kwargs={"k": 5}
)


# Create the prompt template
template = """
You are a philosopher that draws inspiration from great thinkers of the past
to craft well-thought answers to user questions. Use the provided context as the basis
for your answers and do not make up new reasoning paths - just mix-and-match what you are given.
Your answers must be extensively written.

CONTEXT:
{context}
"""

llm = ChatOllama(
    model="mistral:latest", 
    num_ctx=4096,
    base_url="http://host.docker.internal:11434"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", 
         template
         ),
         ("human", "QUESTION: {question}"
          ),
    ]
)

# Define the chain
inputs = RunnableMap({
  'context': lambda x: retriever.get_relevant_documents(x['question']),
  'question': lambda x: x['question']
})
print(inputs)
chain = inputs | prompt | llm | StrOutputParser()

# Call the chain with the question
chain.invoke({"question": "What kind of fortune does Daniel Radcliffe get when he turns 18?"})

steps__={'context': RunnableLambda(...), 'question': RunnableLambda(...)}


" Daniel Radcliffe, the renowned actor known for playing Harry Potter in the popular film series, gains access to a reported £20 million ($41.1 million) fortune upon turning 18 years old. This substantial wealth is the result of earnings from the first five films in the Harry Potter franchise, which have been held in a trust fund until his coming of age. Despite this significant financial windfall, Radcliffe has expressed that he does not plan to be extravagant with his spending and prefers buying relatively modest items such as books, CDs, and DVDs. It's worth noting that the specific details regarding how Radcliffe will celebrate his 18th birthday have remained private."

# CV processing example


## Load the data about the position into the document store

In [None]:
# Load the position description from the position.txt file.
# Replace this file with the description of the position you want to tailor your CV to.
with open("position.txt", "r") as file:
    text = file.read()
    # Load the text into a single element of the array
    position = [text]
print(position)

['Job description\nStart Date: ASAP\n\nWork hours: 40 hrs p/w (full-time, flexible)\n\nLocation: Remote/Hybrid (within +/- 2 hours CET; candidate must have an existing permit to live and work in the country they reside)\n\nContract: 1 year with the possibility for an extension by mutual consent\n\n\n\nPosition Summary\n\nGRI is an independent NGO and standard-setter and over the last 26 years, our standards have created a global language used by organizations to provide transparency.\u202fAt GRI we enable organizations to assess and report on the environmental, social and economic impacts of their activities.\u202fWe also help build organizational capacity for sustainability reporting: from our Academy training courses to working with licensing partners to enable digital reporting with GRI’s standards.\u202fTogether, the skills, capabilities, and data we create help build sustainable, long-term value.\u202fWorking at GRI, you will be part of unlocking positive change in the world.\u202

## Split the documents

In [51]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)

documents = splitter.create_documents(position)
document_chunks_work = splitter.split_documents(documents)

print(document_chunks_work[0])

page_content='Job description
Start Date: ASAP

Work hours: 40 hrs p/w (full-time, flexible)

Location: Remote/Hybrid (within +/- 2 hours CET; candidate must have an existing permit to live and work in the country they reside)

Contract: 1 year with the possibility for an extension by mutual consent



Position Summary

GRI is an independent NGO and standard-setter and over the last 26 years, our standards have created a global language used by organizations to provide transparency. At GRI we enable organizations to assess and report on the environmental, social and economic impacts of their activities. We also help build organizational capacity for sustainability reporting: from our Academy training courses to working with licensing partners to enable digital reporting with GRI’s standards. Together, the skills, capabilities, and data we create help build sustainable, long-term value. Working at GRI, you will be part of unlocking positive change in the world.'


In [52]:
# Load the data into the DataStax Enterprise Vector Store
work_store = Cassandra(
    session=session,
    keyspace="default_keyspace",
    table_name="work_database",
    embedding=HuggingFaceEmbeddings()
)
work_store.add_documents(document_chunks_work)


['c6dfc9b3d6d145c595bee994c1651889',
 '8a54ddfcb46e44c88d2e1d71ec93c289',
 'ab564f628ae8434099d0c11e0a0046a4',
 '954ebd604ebb4a4784fb0308180e343f',
 '30cf351917e34e02959cbfad6902a080']

## Create the query

In [54]:
from langchain_community.document_loaders import PyPDFLoader

file_path = "cv.pdf"
loader = PyPDFLoader(file_path)
pages = []
for page in loader.lazy_load():
    pages.append(page)

In [60]:
pdf_content = " ".join([page.page_content for page in pages])


In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama
from langchain_core.runnables import RunnableMap
from langchain.schema.output_parser import StrOutputParser

# Get the retriever for the Chat Model
retriever = work_store.as_retriever(
    search_kwargs={"k": 5}
)


# Create the prompt template
# Update the template to match what type of advisor you want. 
template = """
You are a curriculum advisor that is specialized in the field of product management, technical writing and fund acquisition. 
You advise people on how to improve their CVs and what to include in them. Use the provided context as the basis for your answers and do not make up new reasoning paths 
- just mix-and-match what you are given. You should focus on the context given for a position and the CV provided.

CONTEXT:
{context}

CV:{cv}
"""

llm = ChatOllama(
    model="mistral:latest", 
    num_ctx=4096,
    base_url="http://host.docker.internal:11434"
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", 
         template
         ),
         ("human", "QUESTION: {question}"
          ),
    ]
)

# Define the chain
inputs = RunnableMap({
  'context': lambda x: retriever.get_relevant_documents(x['question']),
  'cv': lambda x: pdf_content,
  'question': lambda x: x['question']
})
print(inputs)
chain = inputs | prompt | llm | StrOutputParser()

# Call the chain with the question
chain.invoke({"question": "can you print what you know about Noemi Ayala?"})

steps__={'context': RunnableLambda(...), 'cv': RunnableLambda(...), 'question': RunnableLambda(...)}


" Based on the provided CV and context of the job position, here is a summary of Noemi Ayala's qualifications that could be relevant to the grant proposal development role at GRI:\n\n1. Education:\n   - Ph.D. in Food Science (Excellent) from Universitat Autònoma de Barcelona\n   - M.Sc. in Quality of Food of Animal Origin, with a high GPA of 8.3/10 from the same university\n   - Degree in Human Nutrition (GPA: 4.15/5.0) from Universidad Autónoma del Sur\n\n2. Experience:\n   - Product Manager and Technical Writer at Leica Biosystems, where she managed the process of generating product documentation, ensured compliance with quality standards, and liaised between various departments to ensure updates were integrated in a timely manner. This experience demonstrates her ability to coordinate and develop projects, as well as her strong organizational and project management skills.\n\n3. Skills:\n   - Excellent analytical and persuasive writing skills, demonstrated through her work at Leica 