Welcome to a tutorial on RAG with LangChain!
----

We will be following [this](https://graphacademy.neo4j.com/courses/llm-fundamentals/) official course by Neo4j, which introduces you to LangChain and to creating agents with Neo4j as the vector store. However, because the course uses the paid OpenAI APIs for both the embeddings and the LLM, I have adapted the tutorial to open-source offerings:

- The embedding size and model embedding sizes must match, which narrows the scope of open-source choices. Because this notebook is meant to give a consolidated overview of RAG agents, we will not worry about the performance of the chat model. We take the smaller Falcon-7B-instruct, as it is small enough to be served by the HuggingFace API and is also fine-tuned for chat (its 40B version is also said to perform on par with LLaMA models of similar size). This narrows our embedder options, but all-mpnet-base is fortunately trained for semantic search, making it well suited to the RAG component of this application (we need to match a query to documents).
- Though the vector store provided by default already has embeddings, they do not work with a model of a different embedding size. I have therefore recreated the embeddings with the chosen embedder. Details appear in the relevant section.

Use the index to navigate through the notebook. I have not explained the details of my tweaks and adaptations, but a general understanding from starter LangChain documentation should get you going. In the index:
- The [MAIN CODE] indicates the full code by the tutorial, where you can plug and play your LLMs/Embeddings and run the agent code.
- The [DEMO] code is where I plug my choices as justified above, to see a sample of how the main code could run (details in the relevant section).
- The [CHAT] cell does away with the agents to allow us to have an actual chat with memory enabled with RAG from the Neo4j vector store.

- Finally, the last section has a simple application of RAG on local PDFs, which has been deployed on [Streamlit](https://demo-pdf-rag.streamlit.app/).  


>[Installing Prerequisites](#scrollTo=dzncwwjDjJy8)

>[Preparing Embeddings](#scrollTo=vA_47m-X_OPM)

>>[Cypher with vector semantic search](#scrollTo=tA6A5s9YGlzP)

>[LangChain](#scrollTo=rN4Q18Q4lVIp)

>>[Prompting](#scrollTo=ihVG4Ecgldhj)

>>[Chaining](#scrollTo=t2d972H-lmIE)

>>[Chat model](#scrollTo=IN2e0kuJlqcq)

>>>[Memory](#scrollTo=0Ojfgat9zy7o)

>>>[Memory storage in Neo4j](#scrollTo=uDHHzLiNFlws)

>[Agents](#scrollTo=dImrap2jL8b4)

>[Retrievers](#scrollTo=R7QlMk5-m3JK)

>>>[[Additional] Neo4j index at runtime](#scrollTo=A0Gu2vV2sEUh)

>>[Full RetrievalQA chain](#scrollTo=I39pwzoX3DFc)

>>>[[MAIN CODE] RetrieverQA + Agents (Optional Exercise)](#scrollTo=2onNUexwrTbU)

>>>[[DEMO] With Falcon](#scrollTo=9WxCAmBXKBae)

>>>[[CHAT] Falcon + Neo4j RAG without agents](#scrollTo=P61zOYnbPf6v)

>[[Optional] Cypher query generation by LLMs](#scrollTo=aN11etjfzg5-)

>[PDF RAG](#scrollTo=C-kVR781BedD)



# Installing Prerequisites

In [2]:
%pip install langchain langchain-community neo4j transformers sentence-transformers youtube-search
# !pip install openai langchain-openai

Collecting langchain
  Downloading langchain-0.3.3-py3-none-any.whl.metadata (7.1 kB)
Collecting langchain-community
  Downloading langchain_community-0.3.2-py3-none-any.whl.metadata (2.8 kB)
Collecting neo4j
  Downloading neo4j-5.25.0-py3-none-any.whl.metadata (5.7 kB)
Collecting sentence-transformers
  Downloading sentence_transformers-3.2.0-py3-none-any.whl.metadata (10 kB)
Collecting youtube-search
  Downloading youtube_search-2.1.2-py3-none-any.whl.metadata (1.2 kB)
Collecting langchain-core<0.4.0,>=0.3.10 (from langchain)
  Downloading langchain_core-0.3.10-py3-none-any.whl.metadata (6.3 kB)
Collecting langchain-text-splitters<0.4.0,>=0.3.0 (from langchain)
  Downloading langchain_text_splitters-0.3.0-py3-none-any.whl.metadata (2.3 kB)
Collecting langsmith<0.2.0,>=0.1.17 (from langchain)
  Downloading langsmith-0.1.134-py3-none-any.whl.metadata (13 kB)
Collecting tenacity!=8.4.0,<9.0.0,>=8.1.0 (from langchain)
  Downloading tenacity-8.5.0-py3-none-any.whl.metadata (1.2 kB)
Collec

# Preparing Embeddings
To match the Falcon embedding dimensions, we have to recreate the embeddings and upload them back to our Neo4j DB. The MPNet model (all-mpnet-base-v2) has the embedding size we want and is trained with a sentence-similarity objective, which suits our needs here.

We first generate the embeddings from the movie plots stored in Neo4j, then write the results to a CSV file. Because this is a demo project, we keep it to the first 100 movies.

In [None]:
import os
import csv

# from openai import OpenAI
from neo4j import GraphDatabase, basic_auth  # basic_auth is used in get_movie_plots below

from dotenv import load_dotenv
load_dotenv()

from langchain import HuggingFaceHub

# OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "hf_..."

from langchain_community.embeddings import HuggingFaceEmbeddings

model_name = "sentence-transformers/all-mpnet-base-v2"
# model_kwargs = {'device': 'cpu'}
# encode_kwargs = {'normalize_embeddings': False}
hf = HuggingFaceEmbeddings(
    model_name=model_name,
    # model_kwargs=model_kwargs,
    # encode_kwargs=encode_kwargs
)

def get_movie_plots(limit=None):

    driver = GraphDatabase.driver(
        "bolt://44.220.93.128:7687",
        auth=basic_auth("neo4j", "mechanisms-facility-nose"),
    )


    driver.verify_connectivity()

    query = """MATCH (m:Movie) WHERE m.plot IS NOT NULL
    RETURN m.movieId AS movieId, m.title AS title, m.plot AS plot"""

    if limit is not None:
        query += f' LIMIT {limit}'

    movies, summary, keys = driver.execute_query(
        query
    )

    driver.close()

    return movies

def generate_embeddings(file_name, limit=None):

    csvfile_out = open(file_name, 'w', encoding='utf8', newline='')
    fieldnames = ['movieId','embedding']
    output_plot = csv.DictWriter(csvfile_out, fieldnames=fieldnames)
    output_plot.writeheader()

    movies = get_movie_plots(limit=limit)

    print(len(movies))

    for movie in movies[:100]:  # demo only: embed just the first 100 movies
        print(movie['title'])

        plot = f"{movie['title']}: {movie['plot']}"
        response = hf.embed_query(
            plot,
            # model='all-mpnet-base-v2'
        )

        output_plot.writerow({
            'movieId': movie['movieId'],
            'embedding': response
        })

    csvfile_out.close()

# generate_embeddings('.\data\\movie-plot-embeddings.csv',limit=1)
generate_embeddings('movie-plot-embeddings.csv')



9083
Toy Story
Jumanji
Grumpier Old Men
Waiting to Exhale
Father of the Bride Part II
Heat
Sabrina
Tom and Huck
Sudden Death
GoldenEye
American President, The
Dracula: Dead and Loving It
Balto
Nixon
Cutthroat Island
Casino
Sense and Sensibility
Four Rooms
Ace Ventura: When Nature Calls
Money Train
Get Shorty
Copycat
Assassins
Powder
Leaving Las Vegas
Othello
Now and Then
Persuasion
City of Lost Children, The (Cité des enfants perdus, La)
Shanghai Triad (Yao a yao yao dao waipo qiao)
Dangerous Minds
Twelve Monkeys (a.k.a. 12 Monkeys)
Babe
Carrington
Dead Man Walking
Across the Sea of Time
It Takes Two
Clueless
Cry, the Beloved Country
Richard III
Dead Presidents
Restoration
Mortal Kombat
To Die For
How to Make an American Quilt
Seven (a.k.a. Se7en)
Pocahontas
When Night Is Falling
Usual Suspects, The
Mighty Aphrodite
Lamerica
Big Green, The
Georgia
Home for the Holidays
Postman, The (Postino, Il)
Confessional, The (Confessionnal, Le)
Indian in the Cupboard, The
Eye for an Eye
Mr. Hollan

Because we are working on the Sandbox, its security configuration does not allow us to import a local file into our Neo4j DB. If you are creating your own CSV, I recommend uploading it somewhere it can be fetched by URL (GDrive, GitHub, etc.) and keeping that URL ready. If you push it to GitHub like I did, make sure you copy the link to the raw version.
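
Before uploading, it can be worth checking that the CSV parses the way the import query expects. The sketch below is only an offline approximation: it writes a few rows the same way `generate_embeddings` does (the list goes through `str()`), then reads them back with `json.loads`, which I am assuming behaves like `apoc.convert.fromJsonList` for a plain list of floats. The 3-dimensional vectors and the in-memory buffer are hypothetical stand-ins for the real 768-dimensional data and file.

```python
import csv
import io
import json

# Hypothetical 3-dimensional vectors standing in for the real embeddings.
rows = {"1": [0.0574, -0.0051, 1e-05], "2": [0.25, 0.5, -0.75]}

# Write the rows the same way generate_embeddings does: the list is stored via str().
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["movieId", "embedding"])
writer.writeheader()
for movie_id, embedding in rows.items():
    writer.writerow({"movieId": movie_id, "embedding": embedding})

# Read the rows back and parse each embedding cell as a JSON list.
buffer.seek(0)
parsed = {
    row["movieId"]: json.loads(row["embedding"])
    for row in csv.DictReader(buffer)
}

# Every vector should have the same length; that length is what the index must declare.
dimensions = {len(vector) for vector in parsed.values()}
print(dimensions)
```

If `dimensions` ever contains more than one value, or a value other than the one used when creating the index, the import or the index population is likely to go wrong.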



## Cypher with vector semantic search

Next, we will upload our embeddings back to our Neo4j DB and create an index on it. I am adapting the same queries from the tutorial, reproduced here for ready reference:

1. MATCH (m:Movie {title: "Toy Story"})
RETURN m.title AS title, m.plot AS plot

 #Sanity check

2. LOAD CSV WITH HEADERS
FROM 'https://data.neo4j.com/llm-fundamentals/openai-embeddings.csv'
AS row
MATCH (m:Movie {movieId: row.movieId})
CALL db.create.setNodeVectorProperty(m, 'plotEmbedding', apoc.convert.fromJsonList(row.embedding))
RETURN count(*)

  #Import CSV and make a new field for the relevant existing nodes

3. MATCH (m:Movie {title: "Toy Story"})
RETURN m.title AS title, m.plot AS plot, m.plotEmbedding

  #Updated DB Sanity check

4. CREATE VECTOR INDEX moviePlots IF NOT EXISTS
FOR (m:Movie)
ON m.plotEmbedding
OPTIONS {indexConfig: {
 `vector.dimensions`: 1536,
 `vector.similarity_function`: 'cosine'
}}

  #Index creation (no output)

5. SHOW INDEXES  YIELD id, name, type, state, populationPercent WHERE type = "VECTOR"

  #Sanity check (population% should be 100)

6. MATCH (m:Movie {title: 'Toy Story'})
CALL db.index.vector.queryNodes('moviePlots', 6, m.plotEmbedding)
YIELD node, score
RETURN node.title AS title, node.plot AS plot, score

  #Testing semantic vector search

PS: The queries above are reproduced as-is from the course for reference. The code below uses tweaked versions (my own CSV URL, and 768 vector dimensions instead of 1536). You may tweak them further for your own setup.
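
To make the index options less abstract, here is a minimal sketch in plain Python of what a cosine-based nearest-neighbour query computes. The toy 3-dimensional vectors and the helper names (`cosine_similarity`, `top_k`) are my own assumptions for illustration. Neo4j relies on the vector index rather than this brute-force scan, and the score it reports may be scaled differently from the raw cosine.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        # Same reason the index dimension must match the embedder's output size.
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k):
    """Score every stored vector against the query and keep the k best."""
    scored = [(title, cosine_similarity(query, v)) for title, v in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical toy embeddings, not real model output.
plots = {
    "Toy Story": [1.0, 0.0, 0.0],
    "Jumanji": [0.8, 0.6, 0.0],
    "Heat": [0.0, 0.0, 1.0],
}

nearest = top_k(plots["Toy Story"], plots, k=2)
for title, score in nearest:
    print(title, round(score, 3))
```

As in the last query above, the movie used as the query comes back as its own best match, so asking for 6 neighbours gives you 5 other movies.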

In [None]:
# pip3 install neo4j-driver - not needed
# python3 example.py

from neo4j import GraphDatabase, basic_auth

driver = GraphDatabase.driver(
    "bolt://44.220.93.128:7687",
  auth=basic_auth("neo4j", "mechanisms-facility-nose"))


driver.verify_connectivity()

queries = [
    # 1. Sanity check
    """MATCH (m:Movie {title: "Toy Story"})
    RETURN m.title AS title, m.plot AS plot""",
    # 2. Import the CSV and set the embedding property on the matching nodes
    """LOAD CSV WITH HEADERS FROM
    'https://raw.githubusercontent.com/FauzanFarooqui/hello-world/refs/heads/main/movie-plot-embeddings.csv' AS row
    MATCH (m:Movie {movieId: row.movieId})
    CALL db.create.setNodeVectorProperty(m, 'plotEmbedding', apoc.convert.fromJsonList(row.embedding))
    RETURN count(*)""",
    # 3. Sanity check on the updated DB
    """MATCH (m:Movie {title: "Toy Story"})
    RETURN m.title AS title, m.plot AS plot, m.plotEmbedding""",
    # 4. Drop any existing index (e.g. one created with 1536 dimensions), then recreate it with 768
    """DROP INDEX moviePlots IF EXISTS""",
    """CREATE VECTOR INDEX moviePlots IF NOT EXISTS FOR (m:Movie) ON m.plotEmbedding
    OPTIONS {indexConfig: { `vector.dimensions`: 768, `vector.similarity_function`: 'cosine' }}""",
    # 5. Sanity check (populationPercent should be 100)
    """SHOW INDEXES YIELD id, name, type, state, populationPercent WHERE type = 'VECTOR'""",
    # 6. Test the semantic vector search
    """MATCH (m:Movie {title: 'Toy Story'})
    CALL db.index.vector.queryNodes('moviePlots', 6, m.plotEmbedding) YIELD node, score
    RETURN node.title AS title, node.plot AS plot, score""",
]

# if limit is not None:
#   query += f' LIMIT {limit}'
for query in queries:
  movies, summary, keys = driver.execute_query(
    query
  )
  print ("Movies", movies, "\nSum", summary, "\nKey", keys, "\n")

driver.close()

Movies [<Record title='Toy Story' plot="A cowboy doll is profoundly threatened and jealous when a new spaceman figure supplants him as top toy in a boy's room.">] 
Sum <neo4j._work.summary.ResultSummary object at 0x78262e736e30> 
Key ['title', 'plot'] 

Movies [<Record count(*)=100>] 
Sum <neo4j._work.summary.ResultSummary object at 0x78262e7d0160> 
Key ['count(*)'] 

Movies [<Record title='Toy Story' plot="A cowboy doll is profoundly threatened and jealous when a new spaceman figure supplants him as top toy in a boy's room." m.plotEmbedding=[0.05740255489945412, 0.043752674013376236, -0.005099424161016941, 0.0011854032054543495, -0.008661516942083836, 0.02697652392089367, 0.007698857691138983, -0.008475186303257942, -0.02351144514977932, 0.010201675817370415, 0.01100873202085495, -0.09389089792966843, 0.016855904832482338, 0.049800675362348557, -0.0035002632066607475, -0.019216863438487053, 0.021932777017354965, -0.022039487957954407, 0.037455376237630844, -0.016466926783323288, 0.009

# LangChain


The cells below follow the guide from the official course, which requires access to the OpenAI API. The last sub-section gives the full code for the Neo4j + LLM + RAG example as in the course. However, as I don't have access to the OpenAI API, I show an "MVP" working RAG example with Falcon in the last cell of this notebook (without a memory-chat-like interface, just independent querying).

## Prompting

The first step to talking to an LLM is to see how you can pass a prompt with an input variable.

In [None]:
from langchain_openai import OpenAI

llm = OpenAI(
    openai_api_key="sk-...",
    model="gpt-3.5-turbo-instruct",
    temperature=0
)

response = llm.invoke("What is Neo4j?")

print(response)

from langchain.prompts import PromptTemplate

template = PromptTemplate(template="""
You are a cockney fruit and vegetable seller.
Your role is to assist your customer with their fruit and vegetable needs.
Respond using cockney rhyming slang.

Tell me about the following fruit: {fruit}
""", input_variables=["fruit"])

response = llm.invoke(template.format(fruit="apple"))

print(response)

## Chaining

The next step is to see how you can "chain" various components together and invoke them in one go, in a very simple way.

In [None]:
from langchain_openai import OpenAI
from langchain.prompts import PromptTemplate

llm = OpenAI(openai_api_key="sk-...")

template = PromptTemplate.from_template("""
You are a cockney fruit and vegetable seller.
Your role is to assist your customer with their fruit and vegetable needs.
Respond using cockney rhyming slang.

Output JSON as {{"description": "your response here"}}

Tell me about the following fruit: {fruit}
""")

from langchain.output_parsers.json import SimpleJsonOutputParser

llm_chain = template | llm | SimpleJsonOutputParser() # Default: from langchain.schema import StrOutputParser -> StrOutputParser()

response = llm_chain.invoke({"fruit": "apple"})

print(response)

ModuleNotFoundError: No module named 'langchain_openai'
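
Since the cell above cannot run without the OpenAI package, here is a small offline sketch of the two ideas it relies on: composing steps with `|`, and doubling the braces so the JSON instruction survives template formatting. The `Step` class and the fake model are hypothetical stand-ins that I made up for illustration; this is not how LangChain implements its runnables, only a way to see the data flow.

```python
import json

class Step:
    """Hypothetical stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` builds a new Step that feeds the output of a into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)

# Doubled braces come out of str.format as literal braces; {fruit} is substituted.
TEMPLATE = (
    'Output JSON as {{"description": "your response here"}}\n'
    "Tell me about the following fruit: {fruit}"
)

template = Step(lambda inputs: TEMPLATE.format(**inputs))

# A fake model so the sketch runs offline: it only reports the prompt length.
fake_llm = Step(
    lambda prompt: json.dumps({"description": f"prompt had {len(prompt)} characters"})
)

# The parser turns the model's JSON text into a Python dict.
parser = Step(json.loads)

chain = template | fake_llm | parser
response = chain.invoke({"fruit": "apple"})
print(response)
```

Without the doubled braces, `str.format` would treat `{"description": ...}` as a placeholder and fail, which is why the prompt in the cell above writes `{{` and `}}`.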

## Chat model
Querying a model with context.

In [None]:
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage

chat_llm = ChatOpenAI(
    openai_api_key="sk-..."
)

instructions = SystemMessage(content="""
You are a surfer dude, having a conversation about the surf conditions on the beach.
Respond using surfer slang.
""")

question = HumanMessage(content="What is the weather like?")

response = chat_llm.invoke([
    instructions,
    question
])

print(response.content) #AIMessage(content="Dude, the weather is totally gnarly! It's sunny with some epic offshore winds. Perfect conditions for shredding some sick waves!", additional_kwargs={}, example=False)

# Above as a chain with context:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser

chat_llm = ChatOpenAI(openai_api_key="sk-...")

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a surfer dude, having a conversation about the surf conditions on the beach. Respond using surfer slang.",
        ),
        ( "system", "{context}" ),
        ( "human", "{question}" ),
    ]
)

chat_chain = prompt | chat_llm | StrOutputParser()

current_weather = """
    {
        "surf": [
            {"beach": "Fistral", "conditions": "6ft waves and offshore winds"},
            {"beach": "Polzeath", "conditions": "Flat and calm"},
            {"beach": "Watergate Bay", "conditions": "3ft waves and onshore winds"}
        ]
    }"""

response = chat_chain.invoke(
    {
        "context": current_weather,
        "question": "What is the weather like on Watergate Bay?",
    }
)

print(response)

### Memory

In [None]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a surfer dude, having a conversation about the surf conditions on the beach. Respond using surfer slang.",
        ),
        ("system", "{context}"),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{question}"),
    ]
)

from langchain_community.chat_message_histories import ChatMessageHistory

memory = ChatMessageHistory()

def get_memory(session_id):
    return memory

from langchain_core.runnables.history import RunnableWithMessageHistory

chat_chain = prompt | chat_llm | StrOutputParser()

chat_with_message_history = RunnableWithMessageHistory(
    chat_chain,
    get_memory,
    input_messages_key="question",
    history_messages_key="chat_history",
)

response = chat_with_message_history.invoke(
    {
        "context": current_weather,
        "question": "Hi, I am at Watergate Bay. What is the surf like?"
    },
    config={"configurable": {"session_id": "none"}}
)
print(response)

response = chat_with_message_history.invoke(
    {
        "context": current_weather,
        "question": "Where am I?"
    },
    config={"configurable": {"session_id": "none"}}
)
print(response)

In [None]:
while True:
    question = input("> ")

    response = chat_with_message_history.invoke(
        {
            "context": current_weather,
            "question": question,

        },
        config={
            "configurable": {"session_id": "none"}
        }
    )

    print(response)
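
Note that `get_memory` in the cells above ignores its `session_id` argument and returns the same history object every time, so all sessions share one conversation. A minimal sketch of a per-session version follows, with plain lists standing in for the chat history objects. The names `get_session_memory` and `chat`, and the canned reply, are hypothetical and only there so that the sketch runs without a model.

```python
from collections import defaultdict

# One history per session_id; plain lists stand in for chat history objects.
_histories = defaultdict(list)

def get_session_memory(session_id):
    """Return the history for this session, creating it on first use."""
    return _histories[session_id]

def chat(session_id, question):
    history = get_session_memory(session_id)
    # Fake reply: report how many earlier messages this session can see.
    answer = f"I can see {len(history)} earlier messages."
    history.append(("human", question))
    history.append(("ai", answer))
    return answer

first = chat("alice", "Hi, I am at Watergate Bay.")
second = chat("alice", "Where am I?")
other = chat("bob", "Where am I?")

print(first)
print(second)
print(other)
```

The second call for the same session sees the first exchange, while a different session starts from an empty history.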

### Memory storage in Neo4j

In [None]:
from langchain_community.graphs import Neo4jGraph

graph = Neo4jGraph(
    url="bolt://18.233.226.221:7687",
    username="neo4j",
    password="monitors-terminators-mile"
)

result = graph.query("""
MATCH (m:Movie{title: 'Toy Story'})
RETURN m.title, m.plot, m.poster
""")

print(result)

print(graph.schema) #graph.refresh_schema()

from uuid import uuid4

SESSION_ID = str(uuid4())
print(f"Session ID: {SESSION_ID}")

from langchain_community.chat_message_histories import Neo4jChatMessageHistory

def get_memory(session_id):
    return Neo4jChatMessageHistory(session_id=session_id, graph=graph)



[{'m.title': 'Toy Story', 'm.plot': "A cowboy doll is profoundly threatened and jealous when a new spaceman figure supplants him as top toy in a boy's room.", 'm.poster': 'https://image.tmdb.org/t/p/w440_and_h660_face/uXDfjJbdP4ijW5hWSBrPrlKpxab.jpg'}]
Node properties:
Movie {url: STRING, runtime: INTEGER, revenue: INTEGER, budget: INTEGER, plotEmbedding: LIST, imdbRating: FLOAT, released: STRING, countries: LIST, languages: LIST, plot: STRING, imdbVotes: INTEGER, imdbId: STRING, year: INTEGER, poster: STRING, movieId: STRING, tmdbId: STRING, title: STRING}
Genre {name: STRING}
User {userId: STRING, name: STRING}
Actor {url: STRING, bornIn: STRING, bio: STRING, died: DATE, born: DATE, imdbId: STRING, name: STRING, poster: STRING, tmdbId: STRING}
Director {url: STRING, bornIn: STRING, bio: STRING, died: DATE, born: DATE, imdbId: STRING, name: STRING, poster: STRING, tmdbId: STRING}
Person {url: STRING, bornIn: STRING, bio: STRING, died: DATE, born: DATE, imdbId: STRING, name: STRING, po

In [None]:
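# Assumes the earlier Chat model and Memory cells have run (chat_chain,
# current_weather, RunnableWithMessageHistory); copy them here again if you like.
# Rebuild the wrapper so it picks up the Neo4j-backed get_memory defined above;
# the earlier wrapper still holds a reference to the in-memory version.
chat_with_message_history = RunnableWithMessageHistory(
    chat_chain,
    get_memory,
    input_messages_key="question",
    history_messages_key="chat_history",
)

while True:
    question = input("> ")

    response = chat_with_message_history.invoke(
        {
            "context": current_weather,
            "question": question,
        },
        config={
            "configurable": {"session_id": SESSION_ID}
        }
    )

    print(response)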

# Agents

Follow along with the course.

In [None]:
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool
from langchain import hub
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain.schema import StrOutputParser
from langchain_community.tools import YouTubeSearchTool
from langchain_community.chat_message_histories import Neo4jChatMessageHistory
from langchain_community.graphs import Neo4jGraph
from uuid import uuid4

SESSION_ID = str(uuid4())
print(f"Session ID: {SESSION_ID}")

llm = ChatOpenAI(openai_api_key="sk-...")

graph = Neo4jGraph(
    url="bolt://54.159.230.252:7687",
    username="neo4j",
    password="boom-expansion-sterilizer"
)

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a movie expert. You find movies from a genre or plot.",
        ),
        ("human", "{input}"),
    ]
)

movie_chat = prompt | llm | StrOutputParser()

youtube = YouTubeSearchTool()

def get_memory(session_id):
    return Neo4jChatMessageHistory(session_id=session_id, graph=graph)

def call_trailer_search(input):
    input = input.replace(",", " ")
    return youtube.run(input)

tools = [
    Tool.from_function(
        name="Movie Chat",
        description="For when you need to chat about movies. The question will be a string. Return a string.",
        func=movie_chat.invoke,
    ),
    Tool.from_function(
        name="Movie Trailer Search",
        description="Use when needing to find a movie trailer. The question will include the word trailer. Return a link to a YouTube video.",
        func=call_trailer_search,
    ),
]

agent_prompt = hub.pull("hwchase17/react-chat") #https://smith.langchain.com/hub/hwchase17/react-chat?organizationId=d9a804f5-9c91-5073-8980-3d7112f1cbd3
agent = create_react_agent(llm, tools, agent_prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)  # optional: max_iterations=3, verbose=True, handle_parsing_errors=True

chat_agent = RunnableWithMessageHistory(
    agent_executor,
    get_memory,
    input_messages_key="input",
    history_messages_key="chat_history",
)

while True:
    q = input("> ")

    response = chat_agent.invoke(
        {
            "input": q
        },
        {"configurable": {"session_id": SESSION_ID}},
    )

    print(response["output"])

Session ID: 8112f867-6a08-4b9e-9005-439eb6bee4b6




> A science-fiction movie about climate change.




AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: sk-.... You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}
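
Since the agent cell cannot run without an API key, here is a rough offline sketch of the loop a ReAct-style agent depends on: the model writes either an `Action` / `Action Input` pair or a `Final Answer`, and the executor parses that text to decide what to do next. The regular expression, the function names, and the fake tool are my own simplifications, not LangChain's actual parser.

```python
import re

# Simplified pattern for the "Action: ... / Action Input: ..." text format.
ACTION_PATTERN = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)", re.DOTALL)

def parse_step(text):
    """Return ("final", answer) or ("action", tool_name, tool_input)."""
    if "Final Answer:" in text:
        return ("final", text.split("Final Answer:", 1)[1].strip())
    match = ACTION_PATTERN.search(text)
    if match is None:
        raise ValueError("could not parse model output")
    return ("action", match.group(1).strip(), match.group(2).strip())

def run_step(text, tools):
    """Either return the final answer or call the tool the model asked for."""
    parsed = parse_step(text)
    if parsed[0] == "final":
        return parsed[1]
    _, tool_name, tool_input = parsed
    return tools[tool_name](tool_input)

# Hypothetical tool that only echoes what it would search for.
tools = {"Movie Trailer Search": lambda query: f"(searching YouTube for: {query})"}

tool_text = (
    "Thought: Do I need to use a tool? Yes\n"
    "Action: Movie Trailer Search\n"
    "Action Input: Toy Story trailer"
)
final_text = "Thought: Do I need to use a tool? No\nFinal Answer: Try Toy Story."

print(run_step(tool_text, tools))
print(run_step(final_text, tools))
```

Smaller models often drift from this format, in which case a parser like this raises an error; that is the situation options such as `handle_parsing_errors` on the executor are meant to deal with.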

# Retrievers

Let's first see how similarity search works in Neo4j (no LLMs are involved in the cell below).

In [None]:
from langchain_openai import OpenAIEmbeddings
from langchain_community.graphs import Neo4jGraph
from langchain_community.vectorstores import Neo4jVector

embedding_provider = OpenAIEmbeddings(
    openai_api_key="sk-..."
)

graph = Neo4jGraph(
    url="bolt://18.233.226.221:7687",
    username="neo4j",
    password="monitors-terminators-mile"
)

movie_plot_vector = Neo4jVector.from_existing_index(
    embedding_provider,
    graph=graph,
    index_name="moviePlots",
    embedding_node_property="plotEmbedding",
    text_node_property="plot",
)

result = movie_plot_vector.similarity_search("A movie where aliens land and attack earth.") #query, k=4
for doc in result:
    print(doc.metadata["title"], "-", doc.page_content)

ModuleNotFoundError: No module named 'langchain_openai'

### [Additional] Neo4j index at runtime

This is for reference only, in case you wish to create an index at runtime. The cell below isn't used in this course, so you may skip it.

In [None]:
from langchain_openai import OpenAIEmbeddings
from langchain_community.graphs import Neo4jGraph
from langchain_community.vectorstores import Neo4jVector
from langchain.schema import Document

# A list of Documents
documents = [
    Document(
        page_content="Text to be indexed",
        metadata={"source": "local"}
    )
]

# Service used to create the embeddings
embedding_provider = OpenAIEmbeddings(
    openai_api_key="sk-..."
)

graph = Neo4jGraph(
    url="bolt://18.233.226.221:7687",
    username="neo4j",
    password="monitors-terminators-mile"
)

new_vector = Neo4jVector.from_documents(
    documents,
    embedding_provider,
    graph=graph,
    index_name="myVectorIndex",
    node_label="Chunk",
    text_node_property="text",
    embedding_node_property="embedding",
    create_id_index=True,
)

ModuleNotFoundError: No module named 'langchain_openai'

## Full RetrievalQA chain

The retriever, now with LLMs (no agents involved in the below cell).

In [None]:
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.graphs import Neo4jGraph
from langchain_community.vectorstores import Neo4jVector

OPENAI_API_KEY = "sk-..."

llm = ChatOpenAI(openai_api_key=OPENAI_API_KEY)

embedding_provider = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)

graph = Neo4jGraph(
    url="bolt://18.233.226.221:7687",
    username="neo4j",
    password="monitors-terminators-mile"
)

movie_plot_vector = Neo4jVector.from_existing_index(
    embedding_provider,
    graph=graph,
    index_name="moviePlots",
    embedding_node_property="plotEmbedding",
    text_node_property="plot",
)

plot_retriever = RetrievalQA.from_llm(
    llm=llm,
    retriever=movie_plot_vector.as_retriever(),
    #  verbose=True,
    # return_source_documents=True
)

response = plot_retriever.invoke(
    {"query": "A movie where a mission to the moon goes wrong"}
)

print(response)
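
Roughly speaking, a retrieval QA chain does two things before the LLM is called: it fetches the most relevant documents, then pastes them into the prompt as context. The sketch below imitates that flow offline. The word-overlap retriever, the prompt wording, and the placeholder movies are all assumptions of mine for illustration; the real chain uses the vector index and its own prompt.

```python
def retrieve(query, documents, k=2):
    """Toy retriever: rank documents by how many query words their plot shares."""
    query_words = set(query.lower().split())

    def overlap(doc):
        return len(query_words & set(doc["plot"].lower().split()))

    return sorted(documents, key=overlap, reverse=True)[:k]

def build_prompt(query, retrieved):
    """Paste the retrieved plots into the prompt as context for the model."""
    context = "\n".join(f"- {doc['title']}: {doc['plot']}" for doc in retrieved)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# Hypothetical placeholder documents, not rows from the movie database.
documents = [
    {"title": "Movie A", "plot": "a mission to the moon goes wrong and the crew must return"},
    {"title": "Movie B", "plot": "a cowboy doll is jealous of a new spaceman toy"},
    {"title": "Movie C", "plot": "two friends open a bakery"},
]

query = "A movie where a mission to the moon goes wrong"
retrieved = retrieve(query, documents, k=2)
prompt = build_prompt(query, retrieved)
print(prompt)
```

The model only ever sees what ends up in `prompt`, so the quality of the retrieval step bounds the quality of the answer.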

### [MAIN CODE] RetrieverQA + Agents (Optional Exercise)
The main code, combining the "intelligent" agent with retrievers for your RAG application.

In [None]:
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool
from langchain import hub
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain.schema import StrOutputParser
from langchain_community.tools import YouTubeSearchTool
from langchain_community.chat_message_histories import Neo4jChatMessageHistory
from langchain_community.graphs import Neo4jGraph
from uuid import uuid4
from langchain_community.vectorstores import Neo4jVector
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.chains import RetrievalQA



SESSION_ID = str(uuid4())
print(f"Session ID: {SESSION_ID}")

OPENAI_API_KEY = "sk-..."

llm = ChatOpenAI(openai_api_key=OPENAI_API_KEY)

embedding_provider = OpenAIEmbeddings(openai_api_key=OPENAI_API_KEY)

graph = Neo4jGraph(
    url="neo4j://3.84.134.243:7687",
    username="neo4j",
    password="tuition-experiences-colors"
)


movie_plot_vector = Neo4jVector.from_existing_index(
    embedding_provider,
    graph=graph,
    index_name="moviePlots",
    embedding_node_property="plotEmbedding",
    text_node_property="plot",
)

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a movie expert. You find movies from a genre or plot.",
        ),
        ("human", "{input}"),
    ]
)

plot_retriever = RetrievalQA.from_llm(
    llm=llm,
    retriever=movie_plot_vector.as_retriever(),
    verbose=True,
    return_source_documents=True
)
response = plot_retriever.invoke(
    {"query": "A movie where a mission to the moon goes wrong"}
)
print(response)

movie_chat = prompt | llm | StrOutputParser()

youtube = YouTubeSearchTool()

def get_memory(session_id):
    return Neo4jChatMessageHistory(session_id=session_id, graph=graph)


def call_trailer_search(input):
    input = input.replace(",", " ")
    return youtube.run(input)

tools = [
    Tool.from_function(
        name="Movie Chat",
        description="For when you need to chat about movies. The question will be a string. Return a string.",
        func=movie_chat.invoke,
    ),
    Tool.from_function(
        name="Movie Trailer Search",
        description="Use when needing to find a movie trailer. The question will include the word trailer. Return a link to a YouTube video.",
        func=call_trailer_search,
    ),
    Tool.from_function( #RAG
        name="Movie Plot Search",
        description="For when you need to compare a plot to a movie. The question will be a string. Return a string.",
        func=plot_retriever.invoke,  # swapping the func alone is not enough: the description wording (here, "compare") also decides whether the agent picks this tool
    ),
]

agent_prompt = hub.pull("hwchase17/react-chat") #https://smith.langchain.com/hub/hwchase17/react-chat?organizationId=d9a804f5-9c91-5073-8980-3d7112f1cbd3
agent = create_react_agent(llm, tools, agent_prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)  # optional: max_iterations=3, handle_parsing_errors=True

chat_agent = RunnableWithMessageHistory(
    agent_executor,
    get_memory,
    input_messages_key="input",
    history_messages_key="chat_history",
)

while True:
    q = input("> ")

    response = chat_agent.invoke(
        {
            "input": q
        },
        {"configurable": {"session_id": SESSION_ID}},
    )

    print(response["output"])

### [DEMO] With Falcon

The main code, but with our created embeddings and an accessible LLM (both open-source).


In [3]:
import os

# embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

from langchain import HuggingFaceHub

os.environ["HUGGINGFACEHUB_API_TOKEN"] = "hf_...." # Put your own HF API token here - it's free!

llm=HuggingFaceHub(repo_id="tiiuae/falcon-7b-instruct", model_kwargs={"temperature":0.01}) #Play around with the temp



In [None]:
llm.invoke("What is the meaning of the word \"model\"?") #Sanity check

'What is the meaning of the word "model"?\nThe word "model" can refer to a physical or ideal representation of something, or a set of instructions or guidelines for achieving a particular goal. It can also refer to a specific type of model, such as a mathematical model or a scientific model.'

In [None]:
from langchain.agents import AgentExecutor, create_react_agent
from langchain.tools import Tool
from langchain import hub
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain.schema import StrOutputParser
from langchain_community.tools import YouTubeSearchTool
from langchain_community.chat_message_histories import Neo4jChatMessageHistory
from langchain_community.graphs import Neo4jGraph
from uuid import uuid4
from langchain_community.vectorstores import Neo4jVector
from langchain_community.embeddings import HuggingFaceEmbeddings  # same import path as in the Preparing Embeddings cell
from langchain.chains import RetrievalQA

SESSION_ID = str(uuid4())
print(f"Session ID: {SESSION_ID}")


embedding_provider = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

graph = Neo4jGraph(
    url="bolt://44.220.93.128:7687",
    username="neo4j",
    password="mechanisms-facility-nose"
)


movie_plot_vector = Neo4jVector.from_existing_index(
    embedding_provider,
    graph=graph,
    index_name="moviePlots",
    embedding_node_property="plotEmbedding",
    text_node_property="plot",
)

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a movie expert. You find movies from a genre or plot.",
        ),
        ("placeholder", "{chat_history}"),
        ("human", "{input}"),
    ]
)
template = """
TOOLS:

------

You have access to the following tools:

{tools}

CHOOSE ONE FROM {tool_names} for the "Action".

Action: Movie Plot Search
Action Input: {input}
Observation: the result of the action

```

When you have a response to say to the Human, or if you do not need to use a tool, you MUST use the format:

```
Thought: Do I need to use a tool? No

Final Answer: [your response here]



```

Begin!



New input: {input}

{agent_scratchpad}

"""

agent_prompt = ChatPromptTemplate.from_template(template)


plot_retriever = RetrievalQA.from_llm(
    llm=llm,
    retriever=movie_plot_vector.as_retriever(),
    verbose=True,
    return_source_documents=True
)
# response = plot_retriever.invoke(
#     {"query": "A movie where a mission to the moon goes wrong"}
# )
# print(response)

movie_chat = prompt | llm | StrOutputParser()

youtube = YouTubeSearchTool()

# def get_memory(session_id): #persisting memory
#     return Neo4jChatMessageHistory(session_id=session_id, graph=graph)

# Use the in-memory message history below if you don't want to keep polluting your Sandbox with persisted sessions (each run of this cell creates a new session)
from langchain_community.chat_message_histories import ChatMessageHistory

memory = ChatMessageHistory() #ephemeral memory for the current session

def get_memory(session_id):
    return memory

def call_trailer_search(input):
    input = input.replace(",", " ")
    return youtube.run(input)

tools = [
    Tool.from_function(
        name="Movie Chat",
        description="For when you need to chat about movies. The question will be a string. Return a string.",
        func=movie_chat.invoke,
    ),
    Tool.from_function(
        name="Movie Trailer Search",
        description="Use when needing to find a movie trailer. The question will include the word trailer. Return a link to a YouTube video.",
        func=call_trailer_search,
    ),
    Tool.from_function( #RAG
        name="Movie Plot Search",
        description="Use when retrieving the title for a given plot. The question will include the word title. Return the closest matched plot's title from the context you are given.",
        func=plot_retriever.invoke,  # Swapping this in as the first tool's func was not enough: the tool description in the prompt also matters (e.g. the word "compare")
    ),
]

agent = create_react_agent(llm, tools, agent_prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    max_iterations=3,
    handle_parsing_errors=True,
    use_function_response=True,
)

chat_agent = RunnableWithMessageHistory(
    agent_executor, #If no agent, direct chain also (movie_chat)
    get_memory,
    input_messages_key="input",
    history_messages_key="chat_history",
)

while True:
    q = input("> ")

    response = chat_agent.invoke(  # Alternatively, call agent_executor.invoke directly to skip the memory, or movie_chat.invoke to skip the agent
        {
            "input": q
        },
        {"configurable": {"session_id": SESSION_ID}},
    )

    print(response['output'])

Session ID: 4e58005e-959b-4536-987c-08405ded8eb7




> A movie about a cowboy doll and a spaceman toy.


> Entering new AgentExecutor chain...
Human: 
TOOLS:

------

You have access to the following tools:

Movie Chat(input: 'Input', config: 'Optional[RunnableConfig]' = None, **kwargs: 'Any') -> 'Output' - For when you need to chat about movies. The question will be a string. Return a string.
Movie Trailer Search(input) - Use when needing to find a movie trailer. The question will include the word trailer. Return a link to a YouTube video.
Movie Plot Search(input: Dict[str, Any], config: Optional[langchain_core.runnables.config.RunnableConfig] = None, **kwargs: Any) -> Dict[str, Any] - Use when retrieving the title for a given plot. The question will include the word title. Return the closest matched plot's title from the context you are given.

CHOOSE ONE FROM Movie Chat, Movie Trailer Search, Movie Plot Search for the "Action".

Action: Movie Plot Search
Action Input: A movie about a cowboy doll and a spaceman toy

KeyboardInterrupt: Interrupted by user

Note that "Action" and "Action Input" must appear in the prompt with exactly this casing. The original prompt expects the LLM to choose one of the tools, but our LLM simply passes the exact string that follows "Action" on to the agent instead of reasoning over it, so I had to hardcode the tool name. The same goes for the Action Input. The LLM/agent also doesn't return the final output in the format we expect, which you can see by hardcoding any of the three tool names after "Action".

- When given the plot search tool, the agent does enter the RetrievalQA chain and gets a response with the retrieved documents. In fact, it correctly retrieves the plots from the source documents as context and, at the end, under "Helpful Answer", gives the correct title (this depends heavily on how the user input is worded). However, the response does not seem to be parsed back to the agent executor call, which keeps waiting for a reply and eventually times out. The same issue appears with the simpler chat invoke and when the YouTube links are returned, even though all three tools do their part.

Overall, the LLM is unable to reason through the agent, perhaps because Falcon-7B-instruct was developed for conversation rather than internal reasoning. (Feel free to try the offline, non-HF-API 40B model and see if it works!)
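
To make the casing requirement concrete, here is a simplified sketch of how a ReAct-style output parser might pull the tool name and its input out of the LLM's text. This is not LangChain's actual parser: the regex and the `parse_react_output` name are assumptions made for illustration only.

```python
import re

# Simplified, hypothetical ReAct-style parser (NOT LangChain's implementation).
# The pattern is case-sensitive, which is why "Action" and "Action Input"
# must appear with exactly this casing.
ACTION_RE = re.compile(r"Action\s*:\s*(.*?)\s*Action Input\s*:\s*(.*)", re.DOTALL)


def parse_react_output(text):
    """Return ("final", answer) or ("action", tool_name, tool_input)."""
    if "Final Answer:" in text:
        return ("final", text.split("Final Answer:")[-1].strip())
    match = ACTION_RE.search(text)
    if match is None:
        raise ValueError("Could not find 'Action:' / 'Action Input:' in the LLM output")
    return ("action", match.group(1).strip(), match.group(2).strip())


llm_text = (
    "Action: Movie Plot Search\n"
    "Action Input: A movie about a cowboy doll and a spaceman toy"
)
print(parse_react_output(llm_text))
```

If the model writes `action:` in lowercase, or omits either marker, a parser of this kind has nothing to match, and the executor cannot tell which tool to call.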

Each iteration shows the thinking of the model. To disable the output, switch "verbose" to `False` in the agent_executor.

**Cypher for returning a graph of the conversation history (as stored earlier in Neo4j)**

Execute these on the sandbox website (refer to the course for what these do):


- MATCH (s:Session)-[:LAST_MESSAGE]->(last:Message)<-[:NEXT*]-(msg:Message)
RETURN s, last, msg

- MATCH (s:Session)-[:LAST_MESSAGE]->(last:Message)
WHERE s.id = 'your session id'
MATCH p = (last)<-[:NEXT*]-(msg:Message)
UNWIND nodes(p) as msgs
RETURN DISTINCT msgs.type, msgs.content
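
If you would rather run the second query from the notebook than from the sandbox website, a minimal sketch could look like the following. The `build_history_query` helper is hypothetical (it is not part of the course), and it passes the session id as a Cypher parameter instead of pasting it into the query string.

```python
def build_history_query(session_id):
    """Return a parameterised Cypher query and its parameters for one session."""
    query = (
        "MATCH (s:Session)-[:LAST_MESSAGE]->(last:Message) "
        "WHERE s.id = $session_id "
        "MATCH p = (last)<-[:NEXT*]-(msg:Message) "
        "UNWIND nodes(p) AS msgs "
        "RETURN DISTINCT msgs.type, msgs.content"
    )
    return query, {"session_id": session_id}


query, params = build_history_query("your session id")
print(query)
print(params)
# Assumption: with a live connection, graph.query(query, params) on the
# Neo4jGraph object created earlier should return the matching rows.
```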

### [CHAT] Falcon + Neo4j RAG without agents

In [None]:
from langchain import hub
from langchain_core.prompts import ChatPromptTemplate
from langchain.schema import StrOutputParser
from operator import itemgetter
from langchain_community.graphs import Neo4jGraph

from langchain_community.vectorstores import Neo4jVector
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.chains import RetrievalQA
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.runnables import RunnablePassthrough
from langchain_core.runnables import RunnableParallel
from langchain_community.chat_message_histories import ChatMessageHistory
from uuid import uuid4

SESSION_ID = str(uuid4())
print(f"Session ID: {SESSION_ID}")

memory = ChatMessageHistory() #ephemeral memory for the current session

def get_memory(session_id):
    return memory


embedding_provider = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

graph = Neo4jGraph(
    url="bolt://44.220.93.128:7687",
    username="neo4j",
    password="mechanisms-facility-nose"
)


movie_plot_vector = Neo4jVector.from_existing_index(
    embedding_provider,
    graph=graph,
    index_name="moviePlots",
    embedding_node_property="plotEmbedding",
    text_node_property="plot",
)

# prompt = ChatPromptTemplate.from_messages(
#     [
#         (
#             "system",
#             "You are a movie expert with the knowledge given to you as context from the retrieval database. The human user gives a plot, which you must match with the plots given to you as context.",
#         ),
#         ("human", "{input}"),
#     ]
# )
template = """
You are a movie expert. You are given a context that has information about four movies, including their titles and plots.
Choose only one movie from the context given whose plot best correlates with the human question given.
Return only the title of your chosen movie.

Context:
{context}


Chat History:
{chat_history}

Question:
{query}
"""
prompt = ChatPromptTemplate.from_template(template)

# plot_retriever = RetrievalQA.from_llm(
#     llm=llm,
#     retriever=movie_plot_vector.as_retriever(),
#     prompt = prompt,
#      verbose=True,
#     return_source_documents=True
# )
retriever = movie_plot_vector.as_retriever()
# response = plot_retriever.invoke(
#     {"query": "A movie where a mission to the moon goes wrong"}
# )
# print(response)

# movie_chat = prompt | llm | StrOutputParser()
# movie_chat = prompt | llm | plot_retriever
# chat_history must be forwarded explicitly: RunnableParallel only passes on the keys
# it defines, so the history injected by RunnableWithMessageHistory would otherwise be dropped.
chain = (
    RunnableParallel(
        {
            "chat_history": itemgetter("chat_history"),
            "query": itemgetter("query"),
            "context": itemgetter("query") | retriever,
        }
    )
    | prompt
    | llm
    | StrOutputParser()
)
chat_agent = RunnableWithMessageHistory(
    chain,
    get_memory,
    input_messages_key="query",
    history_messages_key="chat_history",
)

while True:
    q = input("> ")

    response = chat_agent.invoke(
        {
            "query": q,
        },
        {"configurable": {"session_id": SESSION_ID}},
    )

    print(response)

# Issues: dict invoke doesn't work with retriever

Session ID: 4ced1bc7-90fe-49d3-a8d2-7011c34ba562



> Tell me a title for a movie about a cowboy doll and a spaceman toy.
Human: 
You are a movie expert. You are given a context that has information about four movies, including their titles and plots.
Choose only one movie from the context given whose plot best correlates with the human question given.
Return only the title of your chosen movie.

Context:
[Document(metadata={'budget': 30000000, 'movieId': '1', 'tmdbId': '862', 'imdbVotes': 591836, 'runtime': 81, 'countries': ['USA'], 'imdbId': '0114709', 'url': 'https://themoviedb.org/movie/862', 'released': '1995-11-22', 'languages': ['English'], 'imdbRating': 8.3, 'title': 'Toy Story', 'poster': 'https://image.tmdb.org/t/p/w440_and_h660_face/uXDfjJbdP4ijW5hWSBrPrlKpxab.jpg', 'year': 1995, 'revenue': 373554033}, page_content="A cowboy doll is profoundly threatened and jealous when a new spaceman figure supplants him as top toy in a boy's room."), Document(metadata={'budget': 7000000, 'movieId': '101', 'tmdbId': '13685', 'imdbVotes': 52

HfHubHTTPError: 422 Client Error: Unprocessable Entity for url: https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct (Request ID: sZgnODYJd4WDNZVJl77iA)

Input validation error: `inputs` tokens + `max_new_tokens` must be <= 8192. Given: 10725 `inputs` tokens and 100 `max_new_tokens`
Make sure 'text-generation' task is supported by the model.

Note how the model can both retrieve and generate answers based on the chat history: I first asked for a movie that matches the plot, then asked for its genre, which the prompt did not explicitly instruct it to answer but which it generated correctly from its internal knowledge. However, the model does not seem to handle larger chat histories. When I remove the chat history component, it continues to chat quite well.
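
The echoed prompt above shows the context being passed as raw `Document` reprs, metadata included, which may inflate the input well beyond what the model needs. A minimal sketch of one possible mitigation is to reduce each retrieved document to its title and plot before it reaches the prompt. The `Doc` class below is only a stand-in for a LangChain `Document` (same `page_content` and `metadata` attributes), and `format_docs` is a hypothetical helper name.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Minimal stand-in for a LangChain Document (same two attributes)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    """Keep only the title and plot of each retrieved document."""
    lines = []
    for doc in docs:
        title = doc.metadata.get("title", "Unknown title")
        lines.append(f"Title: {title}\nPlot: {doc.page_content}")
    return "\n\n".join(lines)


docs = [
    Doc(
        page_content="A cowboy doll is profoundly threatened and jealous when a new "
        "spaceman figure supplants him as top toy in a boy's room.",
        metadata={"title": "Toy Story", "year": 1995, "runtime": 81},
    )
]
print(format_docs(docs))
```

In the chain, a function like this could probably be piped after the retriever, e.g. `itemgetter("query") | retriever | format_docs`. I have not verified that this alone brings the request under the token limit.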

# [Optional] Cypher query generation by LLMs

You can ask the LLM to generate Cypher queries from a natural-language description and execute them on the graph DB. Again, this is as-is from the course: you will need an OpenAI API key, which I do not have, so I have not executed the code below.

In [None]:
from langchain_openai import ChatOpenAI
from langchain_community.graphs import Neo4jGraph
from langchain.chains import GraphCypherQAChain
from langchain.prompts import PromptTemplate

llm = ChatOpenAI(
    openai_api_key="sk-..."
)

graph = Neo4jGraph(
    url="bolt://localhost:7687",
    username="neo4j",
    password="pleaseletmein",
)

CYPHER_GENERATION_TEMPLATE = """
You are an expert Neo4j Developer translating user questions into Cypher to answer questions about movies and provide recommendations.
Convert the user's question based on the schema.

Instructions:
Use only the provided relationship types and properties in the schema.
Do not use any other relationship types or properties that are not provided.
For movie titles that begin with "The", move "the" to the end, For example "The 39 Steps" becomes "39 Steps, The" or "The Matrix" becomes "Matrix, The".
If no data is returned, do not attempt to answer the question.
Only respond to questions that require you to construct a Cypher statement.
Do not include any explanations or apologies in your responses.

Examples:

Find movies and genres:
MATCH (m:Movie)-[:IN_GENRE]->(g)
RETURN m.title, g.name

Find roles for actors:
MATCH (m:Movie)<-[r:ACTED_IN]-(p:Person)
WHERE m.title = 'movie title' AND p.name = 'actor name'
RETURN m.title, r.role, p.name

Schema: {schema}
Question: {question}
"""

cypher_generation_prompt = PromptTemplate(
    template=CYPHER_GENERATION_TEMPLATE,
    input_variables=["schema", "question"],
)

cypher_chain = GraphCypherQAChain.from_llm(  # Generates the Cypher query and then executes it over the graph
    llm,
    graph=graph,
    cypher_prompt=cypher_generation_prompt,
    verbose=True
)

cypher_chain.invoke({"query": "What is the plot of the movie Toy Story?"})
#Also: a) A different context - "What movies did Meg Ryan act in?" b) An aggregate query - "How many movies has Tom Hanks directed?"
# Other examples:
# What movies has Tom Hanks directed and what are the genres?
# MATCH (p:Person)-[:DIRECTED]->(m:Movie)-[:IN_GENRE]->(g:Genre)
# WHERE p.name = 'Tom Hanks'
# RETURN DISTINCT g.name
# [{'g.name': 'Drama'}, {'g.name': 'Comedy'}, {'g.name': 'Romance'}]

# What genre of film is Toy Story?
# MATCH (m:Movie {title: 'Toy Story'})-[:IN_GENRE]->(g:Genre)
# RETURN g.name
# [{'g.name': 'Adventure'}, {'g.name': 'Animation'}, {'g.name': 'Children'}, {'g.name': 'Comedy'}, {'g.name': 'Fantasy'}]

- Modifying Chat History (trimming, sum, itemgetter as well): https://python.langchain.com/v0.2/docs/how_to/chatbots_memory/#chat-history
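
As a rough illustration of the trimming idea mentioned above, the sketch below keeps only the most recent messages before they are placed in the prompt. It works on plain `(role, content)` tuples rather than LangChain message objects, and `trim_history` is a hypothetical helper, not an API from the linked guide.

```python
def trim_history(messages, max_messages=4):
    """Keep only the most recent `max_messages` (role, content) pairs."""
    if max_messages <= 0:
        # messages[-0:] would return the whole list, so handle this case explicitly.
        return []
    return messages[-max_messages:]


history = [
    ("human", "Tell me a title for a movie about a cowboy doll and a spaceman toy."),
    ("ai", "Toy Story"),
    ("human", "What genre is it?"),
    ("ai", "Animation"),
    ("human", "Who directed it?"),
]
print(trim_history(history, max_messages=2))
```

Counting messages is the simplest policy. A token-based budget would track the model's context limit more closely, at the cost of needing a tokenizer.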

# PDF RAG

Given the knowledge above, the same pipeline can be adapted for chatting with your local PDFs! The only difference is that the vector store now holds paragraphs from a chunked-and-embedded PDF. Since the previous example showed that the model cannot repeatedly handle larger memory contexts, we do not include memory here.
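
To give a feel for what "chunked" means here, the sketch below splits text into fixed-size character windows that overlap. It is a deliberately simplified stand-in: the notebook itself uses `RecursiveCharacterTextSplitter`, which additionally tries to split at natural separators. The `chunk_text` name is hypothetical, and the defaults mirror the `chunk_size=1000`, `chunk_overlap=200` values used in this section.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size character windows that overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

The overlap means that a sentence cut at a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.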

In [4]:
%pip install -qU pypdf langchain_community

In [5]:
from langchain_community.document_loaders import PyPDFLoader

file_path = "NRUP_content.pdf"
loader = PyPDFLoader(file_path)

docs = loader.load()

print(len(docs))

8


In [6]:
print(docs[0].page_content[0:])
print(docs[0].metadata)

 
ETSI ETSI TS 138 415 V15.0.0 (2018 -07) 4 3GPP TS 38.415 version 15.0.0 Release 15
Foreword 
This Technical Specification has been produced by  the 3rd Generation Partnership Project (3GPP). 
The contents of the present document are subject to conti nuing work within the TSG and may change following formal 
TSG approval. Should the TSG modify the contents of the present document, it will be re-released by the TSG with an 
identifying change of release date and an increase in version number as follows: 
Version x.y.z 
where: 
x the first digit: 
1 presented to TSG for information; 2 presented to TSG for approval; 3 or greater indicates TSG approved document under change control. 
y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, 
updates, etc. 
z the third digit is incremented when editorial only  changes have been incor porated in the document. 
{'source': 'NRUP_content.pdf', 'page': 0}


In [7]:
from langchain import HuggingFaceHub
import os

os.environ["HUGGINGFACEHUB_API_TOKEN"] = "hf_...."
from langchain_community.embeddings import HuggingFaceEmbeddings
model_name = "sentence-transformers/all-mpnet-base-v2"

from langchain_core.vectorstores import InMemoryVectorStore
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = InMemoryVectorStore.from_documents(
    documents=splits, embedding=HuggingFaceEmbeddings(model_name=model_name)
)

retriever = vectorstore.as_retriever()

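
Conceptually, the retriever embeds the query and returns the chunks whose embeddings are closest to it. The sketch below shows that idea with plain Python lists and cosine similarity. It is not the vector store's actual code: `cosine_similarity` and `top_k` are hypothetical helpers, and `k=4` is an assumption chosen to match the four context documents the [CHAT] prompt expects.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-dimensional "embeddings"; real ones from all-mpnet-base-v2 have 768 dimensions.
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
print(top_k([1.0, 0.1], doc_vecs, k=2))
```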

In [18]:
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.runnables import RunnablePassthrough
from langchain_core.runnables import RunnableParallel
from langchain_community.chat_message_histories import ChatMessageHistory
from operator import itemgetter
# from uuid import uuid4

# SESSION_ID = str(uuid4())
# print(f"Session ID: {SESSION_ID}")

# memory = ChatMessageHistory() #ephemeral memory for the current session

# def get_memory(session_id):
#     return memory

system_prompt = (
    """You are a 5G assistant for question-answering tasks on the NRUP ETSI TS Specification.
    Use the following pieces of retrieved context to answer the human question:
    \n\n Context:
    {context}
    \n
"""
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)



question_answer_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)

# results = rag_chain.invoke({"input": "Describe the elementary successful transfer of DL PDU session."})
# chat_mem = RunnableWithMessageHistory(
#     rag_chain,
#     get_memory,
#     input_messages_key="input",
#     history_messages_key="chat_history",
# )

while True:
  q = input("> ")

  results = rag_chain.invoke(
          {"input": q},
          # {"configurable": {"session_id": SESSION_ID}},
          )
  # The answer proper is on the last line (after "Human: ..."). The model appears to echo
  # the full input, so the page_content of the context is repeated in the output as well.
  print(results['answer'])

  print("\n ****************** Context ******************** \n")
  for doc in results["context"]:
    print(doc.metadata, "\n")
    print(doc.page_content)
    print("\n--------------------------------------\n")

> Define QFI
System: You are a 5G assistant for question-answering tasks on the NRUP ETSI TS Specification.
    Use the following pieces of retrieved context to answer the human question:
    

 Context:
    reserved for later versions. Value range:  (0–2
n-1). 
Field Length:  n bits. 
5.5.3.3 QoS Flow Identifier (QFI) 
Description:  When present this parameter indicates the QoS Flow Identifier of the QoS flow to which the transferred 
packet belongs. Value range:  {0..2
6-1}.  
Field length:  6 bits. 
5.5.3.4 Reflective QoS Indicator (RQI) 
Description:  This parameter indicates activation of the reflective  QoS towards the UE for the transferred packet as 
described in clause 5.4.1.1. It is used only in the downlink direction. If RQA (Reflective QoS Activation) has not been 
configured for the involved QoS flow, the RQI shall be ignored by the NG-RAN node. 
Value range:  {0= Reflective QoS activation not triggered, 1= Reflective QoS activation triggered}. 
Field length:  1 bit. 
5.5.

KeyboardInterrupt: Interrupted by user
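
Because the model echoes the whole prompt before answering, it can help to cut the answer down to what follows the echoed question. The sketch below assumes the echo ends with a `Human: <question>` line, as in the output above. `extract_answer` is a hypothetical helper, and the sample string is illustrative, not real model output.

```python
def extract_answer(generated_text, question):
    """Return the text after the echoed question, or the full text if no echo is found."""
    marker = f"Human: {question}"
    _, found, tail = generated_text.rpartition(marker)
    return tail.strip() if found else generated_text.strip()


# Illustrative text only (not real model output).
echoed = (
    "System: You are a 5G assistant for question-answering tasks...\n"
    "Human: Define QFI\n"
    "QFI is the QoS Flow Identifier."
)
print(extract_answer(echoed, "Define QFI"))
```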

In [19]:
print(results["context"][0].page_content)

5.3 Services expected from the Transport Network Layer 
The PDU session UP layer expects the following services from the Transport Network Layer: 
- Transfer of PDU session User Plane PDUs.  
5.4 Elementary procedures 
5.4.1 Transfer of DL PDU Session Information  
5.4.1.1 Successful operation 
The purpose of the Transfer of DL PDU Session Information procedure is to send control information elements related 
to the PDU Session from UPF/NG-RAN to NG-RAN.  
A PDU Session user plane instance making use of the Transfer  of DL PDU Session Information procedure is associated 
to a single PDU Session. The Transfer of DL PDU Session  Information procedure may be invoked whenever packets 
for that particular PDU Session need to be transferred across the related interface instance. 
The DL PDU Session Information frame includes a QoS Flow Identifier (QFI) field associated with the transferred


In [20]:
print(results["context"][0].metadata)

{'source': 'NRUP_content.pdf', 'page': 2}
