<a href="https://colab.research.google.com/github/meta-llama/llama-recipes/blob/main/recipes/use_cases/agents/langchain/langgraph-rag-agent-local.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Local LangGraph Vector + Graph Ingestion with Llama 3


A simple example of ingesting web documents into both a Milvus vector store and a Neo4j graph database, using Llama 3.1 (served locally by Ollama) to extract the graph.



## Local models

### LLM

Use [Ollama](https://ollama.ai/) to pull a Llama 3.1 model from the [llama3](https://ollama.ai/library/llama3) family. The code below expects the `llama3.1` tag:

```
ollama pull llama3.1
```
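
To confirm the pull succeeded before running the notebook, the local Ollama server can be asked which models it has. The sketch below assumes Ollama is listening on its default address (`http://localhost:11434`) and that `GET /api/tags` returns a JSON object with a `models` list whose entries carry a `name` field. The helper names `model_is_pulled` and `fetch_tags` are hypothetical and not part of the original notebook.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_is_pulled(tags_payload: dict, model: str) -> bool:
    """Return True if `model` appears in an /api/tags style payload."""
    names = [m.get("name", "") for m in tags_payload.get("models", [])]
    # Names usually carry a tag suffix such as "llama3.1:latest",
    # so compare both the full name and the part before the colon.
    return any(n == model or n.split(":")[0] == model for n in names)


def fetch_tags(url: str = OLLAMA_TAGS_URL) -> dict:
    """Fetch the list of locally available models from the Ollama server."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

With the server running, `model_is_pulled(fetch_tags(), "llama3.1")` should be truthy once the pull has finished.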

### Env Variables
Set the following variables in a `.env` file, or export them in the environment before starting the notebook:

Required:
```
NEO4J_URI=...
NEO4J_USERNAME=...
NEO4J_PASSWORD=...
```
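
A missing variable typically only surfaces later, as a connection error when the Neo4j client is created. As a minimal sketch, the three names above can be checked up front. The helper `missing_env_vars` is hypothetical and not part of the original notebook.

```python
import os
from typing import Mapping, Sequence

# The three variables listed as required above.
REQUIRED_VARS = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD")


def missing_env_vars(
    required: Sequence[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the required variable names that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars()` after `load_dotenv()` returns an empty list when everything is set, or the names that still need a value.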

In [12]:
pip install -U beautifulsoup4 langchain langchain_community langchain-experimental langchain-huggingface langchain-milvus neo4j sentence_transformers tiktoken pymilvus python-dotenv


Note: you may need to restart the kernel to use updated packages.


In [13]:
### Load credentials from .env file
from dotenv import load_dotenv

load_dotenv()

True

In [14]:
from langchain.globals import set_verbose, set_debug

set_debug(True)
set_verbose(True)


In [19]:
### Milvus Lite Vectorstore

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_milvus import Milvus
from langchain_huggingface import HuggingFaceEmbeddings

urls = [
    "https://lilianweng.github.io/posts/2023-06-23-agent/",
    "https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/",
    "https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/",
]

docs = [WebBaseLoader(url).load() for url in urls]
docs_list = [item for sublist in docs for item in sublist if item]
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=250, chunk_overlap=0
)
doc_splits = text_splitter.split_documents(docs_list)

print(f'Number of docs: {len(docs_list)}')
print(f'Number of chunks: {len(doc_splits)}')

# Add to Milvus
vectorstore = Milvus.from_documents(
    documents=doc_splits,
    collection_name="rag_milvus",
    embedding=HuggingFaceEmbeddings(),
    # A local file path as the URI selects Milvus Lite (no separate server)
    connection_args={"uri": "./milvus_ingest.db"},
)
retriever = vectorstore.as_retriever()



Number of docs: 3
Number of chunks: 194
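
The `chunk_size=250, chunk_overlap=0` settings above are measured in tokens, because the splitter was built with `from_tiktoken_encoder`. To illustrate what the two parameters control, here is a simplified sketch that cuts a token list into fixed-size windows. It is not how `RecursiveCharacterTextSplitter` works internally (that class splits recursively on separators and then merges pieces up to the size limit). The function `window_chunks` is hypothetical.

```python
def window_chunks(tokens, chunk_size=250, chunk_overlap=0):
    """Split `tokens` into windows of at most `chunk_size` items.

    Consecutive windows share `chunk_overlap` items.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


tokens = list(range(10))
no_overlap = window_chunks(tokens, chunk_size=4, chunk_overlap=0)
with_overlap = window_chunks(tokens, chunk_size=4, chunk_overlap=2)
```

With no overlap, each token lands in exactly one chunk. A non-zero overlap repeats the tail of one chunk at the head of the next, which can help retrieval when a relevant passage straddles a chunk boundary, at the cost of more chunks to embed.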




In [21]:
# Neo4j Graphstore
from langchain_community.graphs import Neo4jGraph
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_experimental.llms.ollama_functions import OllamaFunctions

# Initialize Neo4j
graph = Neo4jGraph()

# Graph conversion requires an LLM with function calling enabled
graph_llm = OllamaFunctions(model="llama3.1", format="json")

# Filtered graph transformer
graph_transformer = LLMGraphTransformer(
    llm=graph_llm,
    allowed_nodes=["Person","Concept","Technology"],
    node_properties=["name","description","source"],
    allowed_relationships=["WROTE", "MENTIONS", "RELATED_TO"],
)

# Convert list of Document objects to Graph Document
graph_documents = graph_transformer.convert_to_graph_documents(doc_splits)

# Drop graph documents that contain neither nodes nor relationships
filtered_graph_documents = [
    g_doc
    for g_doc in graph_documents
    if len(g_doc.nodes) > 0 or len(g_doc.relationships) > 0
]

# Add Graph Documents to Neo4j
graph.add_graph_documents(filtered_graph_documents)

print(f"Graph documents pre-filter: {len(graph_documents)}, post-filter: {len(filtered_graph_documents)}")
print(f'1st Graph Doc: {filtered_graph_documents[0].__dict__}')
print(f"Nodes from 1st graph doc:{filtered_graph_documents[0].nodes}")
print(f"Relationships from 1st graph doc:{filtered_graph_documents[0].relationships}")
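
Printing only the first graph document gives a narrow view of what the LLM extracted. As a rough sanity check, the node and relationship types can be counted across all graph documents and compared against `allowed_nodes` and `allowed_relationships`. The sketch below assumes each graph document exposes `.nodes` and `.relationships`, and that each node and relationship has a `.type` attribute. The helper `summarize_graph_documents` is hypothetical, and `SimpleNamespace` objects stand in for real graph documents so the example runs without Neo4j or Ollama.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_graph_documents(graph_documents):
    """Count node and relationship types across a list of graph documents."""
    node_types = Counter()
    rel_types = Counter()
    for g_doc in graph_documents:
        node_types.update(node.type for node in g_doc.nodes)
        rel_types.update(rel.type for rel in g_doc.relationships)
    return {"nodes": dict(node_types), "relationships": dict(rel_types)}


# Stand-in data (made up for illustration, not real extraction output).
example_docs = [
    SimpleNamespace(
        nodes=[SimpleNamespace(type="Person"), SimpleNamespace(type="Concept")],
        relationships=[SimpleNamespace(type="WROTE")],
    ),
    SimpleNamespace(nodes=[SimpleNamespace(type="Concept")], relationships=[]),
]
summary = summarize_graph_documents(example_docs)
```

In the notebook, `summarize_graph_documents(filtered_graph_documents)` would give the same kind of summary for the real extraction.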

Failed to write data to connection IPv4Address(('localhost', 7687)) (ResolvedIPv4Address(('127.0.0.1', 7687)))


[chain/start] [chain:RunnableSequence] Entering Chain run with input:
{
  "input": "LLM Powered Autonomous Agents | Lil'Log\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nLil'Log\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\nPosts\n\n\n\n\nArchive\n\n\n\n\nSearch\n\n\n\n\nTags\n\n\n\n\nFAQ\n\n\n\n\nemojisearch.app\n\n\n\n\n\n\n\n\n\n      LLM Powered Autonomous Agents\n    \nDate: June 23, 2023  |  Estimated Reading Time: 31 min  |  Author: Lilian Weng\n\n\n \n\n\nTable of Contents\n\n\n\nAgent System Overview\n\nComponent One: Planning\n\nTask Decomposition\n\nSelf-Reflection\n\n\nComponent Two: Memory\n\nTypes of Memory\n\nMaximum Inner Product Search (MIPS)\n\n\nComponent Three: Tool Use\n\nCase Studies\n\nScientific Discovery Agent\n\nGenerative Agents Simulation\n\nProof-of-Concept Examples\n\n\nChallenges\n\nCitation\n\nReferences"
}
[chain/start] [chain:RunnableSequence > prompt:ChatPromptTemplate] Ente