## Setup (Installs, Data, Models)

In [1]:
!pip install llama-index
!pip install llama-index-core==0.10.42
!pip install llama-index-embeddings-openai
!pip install llama-index-postprocessor-flag-embedding-reranker
!pip install git+https://github.com/FlagOpen/FlagEmbedding.git
!pip install llama-index-graph-stores-neo4j
!pip install llama-cloud-services

Collecting llama-index-core==0.10.42
  Using cached llama_index_core-0.10.42-py3-none-any.whl.metadata (2.4 kB)
Using cached llama_index_core-0.10.42-py3-none-any.whl (15.4 MB)
Installing collected packages: llama-index-core
  Attempting uninstall: llama-index-core
    Found existing installation: llama-index-core 0.14.12
    Uninstalling llama-index-core-0.14.12:
      Successfully uninstalled llama-index-core-0.14.12
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
llama-cloud-services 0.6.54 requires llama-index-core>=0.12.0, but you have llama-index-core 0.10.42 which is incompatible.
llama-index-indices-managed-llama-cloud 0.9.4 requires llama-index-core<0.15,>=0.13.0, but you have llama-index-core 0.10.42 which is incompatible.
llama-index 0.14.12 requires llama-index-core<0.15.0,>=0.14.12, but you have llama-index-core 0.10.42 which is incompatible.


In [2]:
import nest_asyncio

nest_asyncio.apply()

In [None]:
import os

# API access to llama-cloud
os.environ["LLAMA_CLOUD_API_KEY"] = "llx-..."
os.environ["OPENAI_API_KEY"] = "sk-..."

#### Setup Model

Here we use `gpt-4o` as the LLM and OpenAI's `text-embedding-3-small` model for embeddings.

In [4]:
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings

llm = OpenAI(model="gpt-4o")
embed_model = OpenAIEmbedding(model="text-embedding-3-small")

Settings.llm = llm
Settings.embed_model = embed_model

#### Load Data

Here we load the PDF and parse it with LlamaParse.

In [None]:
from llama_cloud_services import LlamaParse

docs = LlamaParse(result_type="text").load_data("./data/gear_diff_m.pdf")

Started parsing the file under job_id 7d603e17-c8be-4f5d-80b0-d6987e859fed
.

In [6]:
from copy import deepcopy
from llama_index.core.schema import TextNode, Document
from llama_index.core import VectorStoreIndex


def get_sub_docs(docs):
    """Split docs into pages, by separator."""
    sub_docs = []
    for doc in docs:
        doc_chunks = doc.text.split("\n---\n")
        for doc_chunk in doc_chunks:
            sub_doc = Document(
                text=doc_chunk,
                metadata=deepcopy(doc.metadata),
            )
            sub_docs.append(sub_doc)

    return sub_docs

In [7]:
# this will split into pages
sub_docs = get_sub_docs(docs)
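To see what the page split does without calling LlamaParse, here is a minimal sketch on a made-up two-page string. `FakeDoc` is a hypothetical stand-in for `Document`, and the `page` metadata key and the empty-page filter are assumptions added for illustration; they are not part of `get_sub_docs` above.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class FakeDoc:
    """Hypothetical stand-in for llama_index's Document (text + metadata only)."""

    text: str
    metadata: dict = field(default_factory=dict)


def split_pages(docs, separator="\n---\n"):
    """Split each doc on the page separator, copying metadata per page."""
    pages = []
    for doc in docs:
        for page_no, chunk in enumerate(doc.text.split(separator), start=1):
            if not chunk.strip():
                continue  # skip empty pages, e.g. from a trailing separator
            metadata = deepcopy(doc.metadata)
            metadata["page"] = page_no  # assumption: a 1-based page index
            pages.append(FakeDoc(text=chunk, metadata=metadata))
    return pages


parsed = FakeDoc(
    text="Modul 1,5 table\n---\nModul 2,0 table\n---\n",
    metadata={"file": "gear.pdf"},
)
pages = split_pages([parsed])
print([(p.metadata["page"], p.text) for p in pages])
```

The `deepcopy` matters here: each page gets its own metadata dict, so tagging one page does not leak into the parent document or its siblings.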

#### Initialize Graph Store

Here we use Neo4j, but you can also use our other integrations, such as Nebula.

To launch Neo4j locally, first make sure Docker is installed. Then launch the database with the following command:

```bash
docker run \
    -p 7474:7474 -p 7687:7687 \
    -v $PWD/data:/data -v $PWD/plugins:/plugins \
    --name neo4j-apoc \
    -e NEO4J_apoc_export_file_enabled=true \
    -e NEO4J_apoc_import_file_enabled=true \
    -e NEO4J_apoc_import_file_use__neo4j__config=true \
    -e NEO4JLABS_PLUGINS=\[\"apoc\"\] \
    neo4j:latest
```

From here, you can open the db at [http://localhost:7474/](http://localhost:7474/). On this page, you will be asked to sign in. Use the default username/password of `neo4j` and `neo4j`.

The first time you log in, you will be asked to change the password.

After this, you are ready to create your first property graph!

In [8]:
from llama_index.graph_stores.neo4j import Neo4jPGStore

graph_store = Neo4jPGStore(
    username="neo4j",
    password="graph1312",
    url="bolt://localhost:7687",
)
vec_store = None

## Construct Knowledge Graph, Get Retrievers

This section shows you how to construct the knowledge graph over the existing documents.

**Note**: we configure the default extractors here (`ImplicitPathExtractor` and `SimpleLLMPathExtractor`). You can also use a pre-defined schema, as described in this [notebook](https://github.com/run-llama/llama_index/blob/main/docs/docs/examples/property_graph/property_graph_advanced.ipynb).

In [9]:
from llama_index.core.indices.property_graph import (
    ImplicitPathExtractor,
    SimpleLLMPathExtractor,
)
from llama_index.core import PropertyGraphIndex
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

In [10]:
index = PropertyGraphIndex.from_documents(
    sub_docs,
    embed_model=OpenAIEmbedding(model_name="text-embedding-3-small"),
    kg_extractors=[
        ImplicitPathExtractor(),
        SimpleLLMPathExtractor(
            llm=OpenAI(model="gpt-3.5-turbo", temperature=0.3),
            num_workers=4,
            max_paths_per_chunk=10,
        ),
    ],
    property_graph_store=graph_store,
    show_progress=True,
)

Parsing nodes:   0%|          | 0/2 [00:00<?, ?it/s]

Extracting implicit paths: 100%|██████████| 6/6 [00:00<00:00, 24504.21it/s]
Extracting paths from text: 100%|██████████| 6/6 [00:04<00:00,  1.32it/s]
Generating embeddings: 100%|██████████| 1/1 [00:01<00:00,  1.70s/it]
Generating embeddings: 100%|██████████| 1/1 [00:01<00:00,  1.86s/it]


In [11]:
# Run this instead of the previous cell if the graph was already built and stored in Neo4j
index = PropertyGraphIndex.from_existing(
    graph_store,
    embed_model=OpenAIEmbedding(model_name="text-embedding-3-small"),
    kg_extractors=[
        ImplicitPathExtractor(),
        SimpleLLMPathExtractor(
            llm=OpenAI(model="gpt-3.5-turbo", temperature=0.3),
            num_workers=4,
            max_paths_per_chunk=10,
        ),
    ],
    show_progress=True,
)

#### Define Vector Retriever

Here we define our vector context retriever: it finds initial nodes via vector search, then traverses their relations to pull in more nodes/context.

In [12]:
from llama_index.core.indices.property_graph import VectorContextRetriever

kg_retriever = VectorContextRetriever(
    index.property_graph_store,
    embed_model=OpenAIEmbedding(model_name="text-embedding-3-small"),
    similarity_top_k=2,
    path_depth=1,
    # include_text=False,
    include_text=True,
)
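The idea behind this retriever can be sketched without Neo4j or OpenAI: rank nodes by embedding similarity, keep the top k, then follow graph edges for a fixed number of hops. The embeddings, entity names and triples below are made up for illustration, and `retrieve` is a hypothetical helper, not the `VectorContextRetriever` implementation.

```python
import math

# Toy data: hypothetical entity embeddings and triples, not from the real graph.
EMBEDDINGS = {
    "Stirnrad": [0.9, 0.1, 0.0],
    "Polyacetal": [0.7, 0.6, 0.1],
    "Anfrage": [0.0, 0.2, 0.9],
}
TRIPLES = [
    ("Stirnrad", "MADE_OF", "Polyacetal"),
    ("Stirnrad", "HAS_MODULE", "Modul 1,5"),
    ("Anfrage", "SEND_TO", "Mailbox"),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def retrieve(query_vec, top_k=2, depth=1):
    """Return the top-k seed entities and the triples reachable within `depth` hops."""
    ranked = sorted(
        EMBEDDINGS, key=lambda name: cosine(query_vec, EMBEDDINGS[name]), reverse=True
    )
    seeds = ranked[:top_k]
    frontier, seen, found = set(seeds), set(seeds), []
    for _ in range(depth):
        next_frontier = set()
        for triple in TRIPLES:
            subj, _rel, obj = triple
            if (subj in frontier or obj in frontier) and triple not in found:
                found.append(triple)
                next_frontier.update({subj, obj} - seen)
        seen |= next_frontier
        frontier = next_frontier
    return seeds, found


seeds, triples = retrieve([1.0, 0.2, 0.0], top_k=2, depth=1)
print(seeds)
print(triples)
```

In this toy setup `top_k` plays the role of `similarity_top_k` and `depth` the role of `path_depth`: a larger depth pulls in triples further away from the seed nodes, at the cost of more (and noisier) context.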

In [None]:
nodes = kg_retriever.retrieve(
    "Give me all the modules possible for the Spur gears made of polyacetal (POM)."
)
print(len(nodes))
for idx, node in enumerate(nodes):
    print(f">> IDX: {idx}, {node.get_content()}")

1
>> IDX: 0, Here are some facts extracted from the provided text:

Anfrage -> Senden sie an -> Mailbox@zipperle-antriebstechnik.de
Anfrage -> Senden an -> Mailbox@zipperle-antriebstechnik.de
Standardteile -> Durch einbringung von -> Gewinden

5   1074,42     SH2090HF
 95    19      20     190     194      40      34     163      8       464    1134,11     SH2095HF
100    19      20     200     204      40      34     178      8      478,6   1193,81     SH20100HF
110    19      20     220     224      40      34     193      8      580,6   1313,19     SH20110HF

*) Freimaß nur auf Anguss-Seite
**) Bitte Angaben zu Drehmoment auf S. 30 beachten.

42     Gerne bearbeiten wir diese Standardteile auf Ihren Wunsch hin durch Einbringung von veränderten Bohrungsdurchmessern, zusätzlichen Bohrungen, Gewinden,
       Nuten etc. Außerdem fertigen wir auch spezifisch nach Ihren Zeichnungen. Ihre Anfrage senden Sie an mailbox@zipperle-antriebstechnik.de


## Build Baseline Vector Index

We also build a "baseline" vector index. This follows the "naive" RAG pipeline approach of chunking and vector embedding. We use this as a comparison point.

In [32]:
from llama_index.core import VectorStoreIndex
from llama_index.core.query_engine import RetrieverQueryEngine

base_index = VectorStoreIndex.from_documents(sub_docs, embed_model=embed_model)
base_retriever = base_index.as_retriever(similarity_top_k=4)
base_query_engine = RetrieverQueryEngine(base_retriever)

In [None]:
print(Settings.chunk_size)
print(Settings.chunk_overlap)
print(Settings.context_window)

1024
200
128000
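To get a feel for what a chunk size of 1024 with an overlap of 200 means, here is a simplified sliding-window sketch. It treats the input as a flat list of tokens, whereas LlamaIndex's default splitter is sentence-aware, so this is only an approximation of how chunks overlap, not the library's algorithm. `chunk_with_overlap` is a hypothetical helper.

```python
def chunk_with_overlap(tokens, chunk_size, chunk_overlap):
    """Return windows of `chunk_size` tokens; neighbours share `chunk_overlap` tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start : start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


tokens = list(range(2500))  # stand-in for a tokenized page
chunks = chunk_with_overlap(tokens, chunk_size=1024, chunk_overlap=200)
print([(c[0], c[-1]) for c in chunks])  # first and last token index of each chunk
```

A long table can therefore be spread over several chunks, which is why `similarity_top_k` matters for questions that need every row.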


In [51]:
response = base_query_engine.query(
    "How many rows and columns does the table for module=1,5 have? "
    "Give me the first and last ZZ value in the table"
)
print(str(response))

response = base_query_engine.query(
    "How many rows and columns does the table for module=2.0 have? "
    "Give me the first and last ZZ value in the table"
)
print(str(response))

The table for module 1.5 has 15 rows and 11 columns. The first ZZ value in the table is 12, and the last ZZ value is 26.
The table for module=2.0 has 23 rows and 12 columns. The first ZZ value in the table is 5, and the last ZZ value is 130.
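One detail worth keeping in mind with multi-line prompts like these: Python joins adjacent string literals with nothing in between, so each piece needs its own trailing space (or newline). A small sketch with made-up prompt text:

```python
# Adjacent literals are concatenated at compile time, with no separator.
joined = (
    "How many rows does the table have?"
    "Give me the first ZZ value"
)
# An explicit trailing space keeps the two sentences apart.
spaced = (
    "How many rows does the table have? "
    "Give me the first ZZ value"
)
print(repr(joined))
print(repr(spaced))
```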


In [None]:
response = base_query_engine.query(
    "Give me all the modules possible for the Spur gears made of polyacetal (POM)."
)
print(str(response))

The possible modules for the spur gears made of polyacetal (POM) are 1.5 and 2.0.


In [35]:
print(len(response.source_nodes))
for node in response.source_nodes:
    print("---")
    print(node.get_content())

4
---
Stirnräder aus Polyacetal (POM)

Modul 1,5

Ausführung: geradverzahnt, gespritzt, Eingriffswinkel 20°, Bohrung
spanabhebend  bearbeitet.
Maßänderung vorbehalten.                                                                        Abbildung beispielhaft

ZZ     ZB      ØB     ØTK     ØKK      ØN      L      ØFM  WS           G        DM**  Art.-Nr.
      [mm]    [mm]    [mm]    [mm]    [mm]    [mm]    [mm]    [mm]     [g]      [Ncm]
 12    12      6       18      21      14      23      –       12      5,66     50,89      SH1512HF
 13    12      6      19,5    22,5     14      23      –       12      6,14     55,13      SH1513HF
 14    12      6       21      24      14      23     13*     10,5     6,95     59,38      SH1514HF    WS
 15    12      6      22,5    25,5     14      23     16*     10,5     7,9      63,62      SH1515HF
 16    12      6       24      27      14      23     16*     10,5     8,69     67,86      SH1516HF
 17    12      6      25,5    28,5     14      23

In [36]:
response = base_query_engine.query(
    "Give me the entire row of spur gears made of polyacetal (POM) where ZZ=12"
)
print(str(response))

There are two rows of spur gears made of polyacetal (POM) where ZZ=12:

1. For Modul 1,5:
   - ZB: 12
   - ØB: 6 mm
   - ØTK: 18 mm
   - ØKK: 21 mm
   - ØN: 14 mm
   - L: 23 mm
   - ØFM: –
   - WS: 12 mm
   - G: 5,66 g
   - DM**: 50,89 Ncm
   - Art.-Nr.: SH1512HF

2. For Modul 2,0:
   - ZB: 15
   - ØB: 8 mm
   - ØTK: 24 mm
   - ØKK: 28 mm
   - ØN: 18,5 mm
   - L: 27 mm
   - ØFM: 16*
   - WS: 13,5 mm
   - G: 11,74 g
   - DM**: 113,1 Ncm
   - Art.-Nr.: SH2012HF


In [43]:
response = base_query_engine.query(
    "Give me the entire row of spur gears made of polyacetal (POM) where ZZ=24 and ØKK=52. "
    "For which module?"
)
print(str(response))

The entire row of spur gears made of polyacetal (POM) where ZZ=24 and ØKK=52 is for module 2.0. The details are:

- ZZ: 24
- ZB: 15
- ØB: 10 mm
- ØTK: 48 mm
- ØKK: 52 mm
- ØN: 24 mm
- L: 27 mm
- ØFM: 35 mm
- WS: 6 mm
- G: 35.4 g
- DM**: 226.19 Ncm
- Art.-Nr.: SH2024HF


## Testing negative cases

In [None]:
response = base_query_engine.query(
    "Give me the entire row of spur gears made of polyacetal (POM) where ZZ=31. "
    "Give me the entire row of spur gears made of polyacetal (POM) where ZZ=29"
)
print(str(response))

There is no row with ZZ=31 or ZZ=29 in the provided information.
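For negative cases it helps to have a deterministic check to compare the LLM's answer against. The sketch below parses a few rows copied from the retrieved "Modul 1,5" chunk shown earlier and looks up a row by ZZ. `parse_rows` and `find_row` are hypothetical helpers, and the parser assumes whitespace-separated cells in the column order of the table header.

```python
COLUMNS = ["ZZ", "ZB", "ØB", "ØTK", "ØKK", "ØN", "L", "ØFM", "WS", "G", "DM**", "Art.-Nr."]

# Rows copied from the retrieved "Modul 1,5" chunk (German decimal commas kept as-is).
TABLE_TEXT = """
 12    12      6       18      21      14      23      –       12      5,66     50,89      SH1512HF
 13    12      6      19,5    22,5     14      23      –       12      6,14     55,13      SH1513HF
 14    12      6       21      24      14      23     13*     10,5     6,95     59,38      SH1514HF    WS
"""


def parse_rows(text):
    """Map ZZ (as int) to a dict of column name -> raw cell string."""
    rows = {}
    for line in text.splitlines():
        cells = line.split()
        # Skip blank, header, or truncated lines.
        if len(cells) < len(COLUMNS) or not cells[0].isdigit():
            continue
        # zip stops at the shorter input, dropping stray trailing cells such as "WS".
        row = dict(zip(COLUMNS, cells))
        rows[int(row["ZZ"])] = row
    return rows


def find_row(rows, zz):
    """Return the row for `zz`, or None if the table has no such row."""
    return rows.get(zz)


rows = parse_rows(TABLE_TEXT)
print(find_row(rows, 12))
print(find_row(rows, 31))
```

A missing row comes back as `None` rather than a plausible-looking guess, which makes it a useful reference when judging whether the query engine answered a negative case correctly.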


## Testing chunk limit

In [70]:
from llama_index.core import VectorStoreIndex
from llama_index.core.query_engine import RetrieverQueryEngine

base_index = VectorStoreIndex.from_documents(sub_docs, embed_model=embed_model)
base_retriever = base_index.as_retriever(similarity_top_k=10)
base_query_engine = RetrieverQueryEngine(base_retriever)

In [71]:
response = base_query_engine.query(
    "How many rows does the table for module=1,5 have? List their ZZ values"
)
print(str(response))

The table for module 1.5 has 15 rows. The ZZ values are 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, and 26.


In [65]:
response = base_query_engine.query(
    "Give me the entire row of spur gears made of polyacetal (POM) where ZZ=32"
)
print(str(response))

For spur gears made of polyacetal (POM) where ZZ=32, the details are as follows:

- Modul 1,5: ØB=10 mm, ØTK=48 mm, ØKK=51 mm, ØN=24 mm, L=23 mm, ØFM=33,5 mm, WS=5 mm, G=30,19 g, DM=135,72 Ncm, Art.-Nr.=SH1532HF
- Modul 2,0: ØB=10 mm, ØTK=64 mm, ØKK=68 mm, ØN=26 mm, L=27 mm, ØFM=44 mm, WS=6 mm, G=59,73 g, DM=301,59 Ncm, Art.-Nr.=SH2032HF


In [66]:
print(len(response.source_nodes))
for node in response.source_nodes:
    print("---")
    print(node.get_content())

6
---
Stirnräder aus Polyacetal (POM)

Modul 1,5

Ausführung: geradverzahnt, gespritzt, Eingriffswinkel 20°, Bohrung
spanabhebend  bearbeitet.
Maßänderung vorbehalten.                                                                        Abbildung beispielhaft

ZZ     ZB      ØB     ØTK     ØKK      ØN      L      ØFM  WS           G        DM**  Art.-Nr.
      [mm]    [mm]    [mm]    [mm]    [mm]    [mm]    [mm]    [mm]     [g]      [Ncm]
 12    12      6       18      21      14      23      –       12      5,66     50,89      SH1512HF
 13    12      6      19,5    22,5     14      23      –       12      6,14     55,13      SH1513HF
 14    12      6       21      24      14      23     13*     10,5     6,95     59,38      SH1514HF    WS
 15    12      6      22,5    25,5     14      23     16*     10,5     7,9      63,62      SH1515HF
 16    12      6       24      27      14      23     16*     10,5     8,69     67,86      SH1516HF
 17    12      6      25,5    28,5     14      23

In [99]:
response = base_query_engine.query(
    "Give me the entire row of spur gears made of polyacetal (POM) where ZZ=110 for both modules"
)
print(str(response))

For Module 1.5, the spur gear with ZZ=110 has the following specifications:

- ZB: 19
- ØB: 20 mm
- ØTK: 165 mm
- ØKK: 168 mm
- ØN: 40 mm
- L: 34 mm
- ØFM: 148 mm
- WS: 8 mm
- G: 335.5 g
- DM**: 738.67 Ncm
- Art.-Nr.: SH15110HF

For Module 2.0, the spur gear with ZZ=110 has the following specifications:

- ZB: 19
- ØB: 20 mm
- ØTK: 220 mm
- ØKK: 224 mm
- ØN: 40 mm
- L: 34 mm
- ØFM: 193 mm
- WS: 8 mm
- G: 580.6 g
- DM**: 1313.19 Ncm
- Art.-Nr.: SH20110HF


In [98]:
print(len(response.source_nodes))
for node in response.source_nodes:
    print("---")
    print(node.get_content())

6
---
Stirnräder aus Polyacetal (POM)

Modul 1,5

Ausführung: geradverzahnt, gespritzt, Eingriffswinkel 20°, Bohrung
spanabhebend  bearbeitet.
Maßänderung vorbehalten.                                                                        Abbildung beispielhaft

ZZ     ZB      ØB     ØTK     ØKK      ØN      L      ØFM  WS           G        DM**  Art.-Nr.
      [mm]    [mm]    [mm]    [mm]    [mm]    [mm]    [mm]    [mm]     [g]      [Ncm]
 12    12      6       18      21      14      23      –       12      5,66     50,89      SH1512HF
 13    12      6      19,5    22,5     14      23      –       12      6,14     55,13      SH1513HF
 14    12      6       21      24      14      23     13*     10,5     6,95     59,38      SH1514HF    WS
 15    12      6      22,5    25,5     14      23     16*     10,5     7,9      63,62      SH1515HF
 16    12      6       24      27      14      23     16*     10,5     8,69     67,86      SH1516HF
 17    12      6      25,5    28,5     14      23

## Build Custom Retriever

Build a joint retriever that combines vector search and KG search.

In [72]:
from llama_index.core.retrievers import BaseRetriever
from llama_index.core.schema import NodeWithScore
from typing import List


class CustomRetriever(BaseRetriever):
    """Custom retriever that performs both KG vector search and direct vector search."""

    def __init__(self, kg_retriever, vector_retriever):
        self._kg_retriever = kg_retriever
        self._vector_retriever = vector_retriever
        # Let BaseRetriever set up its own state (callback manager, etc.)
        super().__init__()

    def _retrieve(self, query_bundle) -> List[NodeWithScore]:
        """Retrieve nodes given query."""
        kg_nodes = self._kg_retriever.retrieve(query_bundle)
        vector_nodes = self._vector_retriever.retrieve(query_bundle)

        unique_nodes = {n.node_id: n for n in kg_nodes}
        unique_nodes.update({n.node_id: n for n in vector_nodes})
        return list(unique_nodes.values())
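The merge step relies on plain dict semantics, which can be checked in isolation. In this sketch `ScoredNode` is a hypothetical stand-in for `NodeWithScore`, and the ids and scores are made up.

```python
from typing import List, NamedTuple


class ScoredNode(NamedTuple):
    """Hypothetical stand-in for NodeWithScore."""

    node_id: str
    score: float


def merge_unique(kg_nodes: List[ScoredNode], vector_nodes: List[ScoredNode]) -> List[ScoredNode]:
    """Merge two result lists, keeping one entry per node_id."""
    unique = {n.node_id: n for n in kg_nodes}
    # On a duplicate id the vector result replaces the KG result,
    # but the entry keeps its original position in the dict.
    unique.update({n.node_id: n for n in vector_nodes})
    return list(unique.values())


kg = [ScoredNode("a", 0.9), ScoredNode("b", 0.4)]
vec = [ScoredNode("b", 0.8), ScoredNode("c", 0.7)]
merged = merge_unique(kg, vec)
print(merged)
```

Note that the merged list is ordered by first appearance, not by score. If downstream components expect a ranking, one option is to sort the result, for example with `sorted(merged, key=lambda n: n.score, reverse=True)`, keeping in mind that scores from the two retrievers may not be directly comparable.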

In [73]:
custom_retriever = CustomRetriever(kg_retriever, base_retriever)

In [74]:
nodes = custom_retriever.retrieve(
    "Give me the entire row of spur gears made of polyacetal (POM) where ZZ=32"
)
print(len(nodes))
for idx, node in enumerate(nodes):
    print(f">> IDX: {idx}, {node.get_content()}")

7
>> IDX: 0, Here are some facts extracted from the provided text:

Anfrage -> Senden sie an -> Mailbox@zipperle-antriebstechnik.de
Anfrage -> Senden an -> Mailbox@zipperle-antriebstechnik.de
Standardteile -> Durch einbringung von -> Gewinden

5   1074,42     SH2090HF
 95    19      20     190     194      40      34     163      8       464    1134,11     SH2095HF
100    19      20     200     204      40      34     178      8      478,6   1193,81     SH20100HF
110    19      20     220     224      40      34     193      8      580,6   1313,19     SH20110HF

*) Freimaß nur auf Anguss-Seite
**) Bitte Angaben zu Drehmoment auf S. 30 beachten.

42     Gerne bearbeiten wir diese Standardteile auf Ihren Wunsch hin durch Einbringung von veränderten Bohrungsdurchmessern, zusätzlichen Bohrungen, Gewinden,
       Nuten etc. Außerdem fertigen wir auch spezifisch nach Ihren Zeichnungen. Ihre Anfrage senden Sie an mailbox@zipperle-antriebstechnik.de
>> IDX: 1, Stirnräder aus Polyacetal (POM)



## Build Agent

Now that we have the retriever, we can treat it as a RAG pipeline tool, and wrap it with an agent that can perform basic CoT reasoning and maintain conversation memory over time.

In [75]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import RetrieverQueryEngine

kg_query_engine = RetrieverQueryEngine(custom_retriever)
kg_query_tool = QueryEngineTool(
    query_engine=kg_query_engine,
    metadata=ToolMetadata(
        name="query_tool",
        description="Looks up rows in the spur gear (POM) specification tables.",
    ),
)

In [81]:
# from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.agent.workflow import ReActAgent
from llama_index.core.workflow import Context

agent = ReActAgent(
    tools=[kg_query_tool],
    llm=llm,
    verbose=True,
    allow_parallel_tool_calls=False,
)

# context to hold this session/state
ctx = Context(agent)

## Try out Queries

Now that the agent is set up, let's try out some queries.

In [85]:
handler = agent.run("Give me the entire row of spur gears made of polyacetal (POM) where ZZ=32", ctx=ctx)

In [None]:
from llama_index.core.agent.workflow import ToolCallResult, AgentStream

# A handler's events can be streamed only once, so start a fresh run in this
# cell; looping over an already-consumed handler raises WorkflowRuntimeError.
handler = agent.run(
    "Give me the entire row of spur gears made of polyacetal (POM) where ZZ=32",
    ctx=ctx,
)

async for ev in handler.stream_events():
    # if isinstance(ev, ToolCallResult):
    #     print(f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}")
    if isinstance(ev, AgentStream):
        print(f"{ev.delta}", end="", flush=True)

response = await handler
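A run handler's event stream behaves like a one-shot iterator: once the events have been read, iterating again on the same handler fails. The toy class below imitates that behaviour with the standard library only. `OneShotStream` is a hypothetical stand-in, not the LlamaIndex handler, and it raises a plain `RuntimeError` instead of `WorkflowRuntimeError`.

```python
import asyncio


class OneShotStream:
    """Toy stand-in for a run handler whose events can be streamed only once."""

    def __init__(self, events):
        self._events = list(events)
        self._consumed = False

    async def stream_events(self):
        if self._consumed:
            raise RuntimeError("All the streamed events have already been consumed.")
        self._consumed = True
        for ev in self._events:
            await asyncio.sleep(0)  # hand control back to the event loop
            yield ev


async def collect(stream):
    """Drain the stream into a list."""
    return [ev async for ev in stream.stream_events()]


async def demo():
    handler = OneShotStream(["a", "b"])
    first = await collect(handler)  # first pass works
    try:
        await collect(handler)  # second pass on the same handler fails
    except RuntimeError as err:
        return first, str(err)
    return first, None


print(asyncio.run(demo()))
```

In a notebook, the practical consequence is to create the handler and consume its stream in the same cell, so that re-running the cell always starts from a fresh run.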