# Graph Agents
* This short project combines the concepts of **multiagent systems** and **knowledge graphs**.
* A multiagent system (MAS) consists of multiple artificial intelligence (AI) agents working collectively to perform tasks on behalf of a user or another system ([IBM](https://www.ibm.com/think/topics/multiagent-system)).
* A knowledge graph is an organized representation of real-world entities and their relationships. It is typically stored in a graph database, which natively stores the relationships between data entities ([Neo4j](https://neo4j.com/blog/what-is-knowledge-graph/)).
* The idea behind combining multiagent systems with knowledge graphs is to **enhance query answering** by leveraging the **agents' specialized reasoning capabilities** and the **knowledge graph's structured semantic relationships**, enabling more accurate, context-aware, and scalable solutions.
* The implementation consists of three steps:
    1. *Creation of the knowledge graph*: this is done with [GraphRAG](https://github.com/microsoft/graphrag).
    2. *Saving the graph in a database*: a [Neo4j](https://neo4j.com/) graph database is used.
    3. *Building the multiagent system for question answering*: using [autogen](https://github.com/ag2ai/ag2).
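To make the notion of a knowledge graph concrete, here is a minimal sketch that stores relationships as (subject, relation, object) triples and looks up the relationships leaving an entity. The triples, the relation names, and the helper functions `build_graph` and `neighbors` are hypothetical and chosen only for illustration; they are not produced by GraphRAG.

```python
from collections import defaultdict

# Hypothetical triples (subject, relation, object), used only for illustration.
TRIPLES = [
    ("Alice", "FOLLOWS", "White Rabbit"),
    ("Alice", "MEETS", "Cheshire Cat"),
    ("Queen of Hearts", "THREATENS", "Alice"),
]


def build_graph(triples):
    """Index triples by subject so outgoing relationships can be looked up."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def neighbors(graph, entity):
    """Return the (relation, object) pairs that start at `entity`."""
    return graph.get(entity, [])


graph = build_graph(TRIPLES)
print(neighbors(graph, "Alice"))
```

A graph database stores the same kind of structure natively, so such lookups do not need a hand-built index.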

## Acknowledgements
* Three resources influenced the implementation:
    1. [GraphRAG: Getting Started](https://microsoft.github.io/graphrag/get_started/)
    2. [Commit `cb0aae7` in GraphRAG](https://github.com/microsoft/graphrag/commit/cb0aae7e6bf1763ca5a7540d2220c11162863915)
    3. [Knowledge Graphs for RAG course by DeepLearning.AI](https://learn.deeplearning.ai/courses/knowledge-graphs-rag)

## 1. Creation of the knowledge graph
* GraphRAG calls this process [indexing](https://microsoft.github.io/graphrag/index/overview/).
* It is designed to:
    1. extract entities, relationships and claims from raw text
    2. perform community detection on the entities
    3. generate community summaries and reports at multiple levels of granularity
    4. embed entities into a graph vector space
    5. embed text chunks into a textual vector space
* The process can be executed from the command line following the guidelines described in the [Getting Started](https://microsoft.github.io/graphrag/get_started/) page.


### Create the folder
* The indexing files will be saved in this folder.

In [1]:
import os

os.makedirs('./graph_agents_indexing/input', exist_ok=True)

### Download test data
* Download *Alice's Adventures in Wonderland* by Lewis Carroll (a personal favorite) from [Project Gutenberg](https://www.gutenberg.org/).

In [2]:
!curl https://www.gutenberg.org/files/11/11-0.txt -o ./graph_agents_indexing/input/alice_in_wonderland.txt   


### Initialize GraphRAG
* Set up GraphRAG.
* To do so, follow the guidelines in the section [Set Up your Workspace Variables](https://microsoft.github.io/graphrag/get_started/#:~:text=Set%20Up%20Your%20Workspace%20Variables) of the *Getting Started* page.
* *Important*: Additionally, set the `embeddings` value under `snapshots` to `true` in `settings.yaml` so that the entity embeddings are saved.

In [3]:
!graphrag init --root ./graph_agents_indexing > NUL 2>&1
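Once `graphrag init` has generated `settings.yaml`, the `snapshots` setting mentioned above can be changed by hand or with a small script. The sketch below is one possible way to do it with plain text processing. It assumes that `settings.yaml` contains a top-level `snapshots:` block with an indented `embeddings:` line, which may not hold for every GraphRAG version. The function name `enable_embedding_snapshots` is hypothetical.

```python
import re
from pathlib import Path


def enable_embedding_snapshots(settings_text: str) -> str:
    """Set `embeddings: true` inside the top-level `snapshots:` block only."""
    lines = settings_text.splitlines()
    in_snapshots = False
    for i, line in enumerate(lines):
        if re.match(r"^snapshots:\s*$", line):
            in_snapshots = True
            continue
        if in_snapshots:
            # A non-empty line without indentation ends the block.
            if line and not line[0].isspace():
                in_snapshots = False
                continue
            match = re.match(r"^(\s+embeddings:\s*)\S+(.*)$", line)
            if match:
                lines[i] = f"{match.group(1)}true{match.group(2)}"
    trailing_newline = "\n" if settings_text.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline


# Assumed location of the generated settings file.
settings_path = Path("./graph_agents_indexing/settings.yaml")
if settings_path.exists():
    original = settings_path.read_text(encoding="utf-8")
    settings_path.write_text(enable_embedding_snapshots(original), encoding="utf-8")
```

Other `embeddings` keys elsewhere in the file are left untouched, because only lines inside the `snapshots:` block are rewritten.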

### Index data
* Run the indexing process.

In [4]:
!graphrag index --root ./graph_agents_indexing > NUL 2>&1
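The `> NUL 2>&1` redirection used in the cells above is specific to the Windows shell. A portable alternative is to call the same CLI through `subprocess` and discard its output there. This is a minimal sketch; the helper names `graphrag_command` and `run_graphrag` are hypothetical, and only the `init` and `index` subcommands with `--root` shown in this notebook are used.

```python
import subprocess


def graphrag_command(step: str, root: str) -> list[str]:
    """Build the argument list for `graphrag init` or `graphrag index`."""
    if step not in {"init", "index"}:
        raise ValueError(f"unsupported step: {step!r}")
    return ["graphrag", step, "--root", root]


def run_graphrag(step: str, root: str) -> None:
    """Run the CLI quietly; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(
        graphrag_command(step, root),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=True,
    )


# Example usage (not executed here because indexing calls the configured LLM):
# run_graphrag("index", "./graph_agents_indexing")
```

With `check=True`, a failed run raises an exception instead of passing silently, which the shell redirection would hide.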

* The results of the process are now saved as Parquet files in the `./graph_agents_indexing/output` folder.
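To check what the indexing run produced, the output folder can be scanned for Parquet files. The sketch below only uses the standard library; the helper name `list_parquet_files` is hypothetical. It searches recursively because the exact file names and subfolder layout depend on the GraphRAG version.

```python
from pathlib import Path


def list_parquet_files(output_dir: str) -> list[Path]:
    """Return all .parquet files below `output_dir`, sorted by path."""
    root = Path(output_dir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.parquet"))


for path in list_parquet_files("./graph_agents_indexing/output"):
    print(path.name, path.stat().st_size, "bytes")
```

Each file can then be opened with `pandas.read_parquet`, which needs a Parquet engine such as `pyarrow` to be installed.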

## 2. Saving the graph in a database
* A powerful database option for graphs is [Neo4j](https://neo4j.com/).
* It uses nodes, relationships, and properties to represent and store data, enabling highly efficient querying and analysis of complex, interconnected information.
* The commit [`cb0aae7`](https://github.com/microsoft/graphrag/commit/cb0aae7e6bf1763ca5a7540d2220c11162863915) in GraphRAG was used as the base for the `neo4j_loading.py` script.
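As an illustration of how rows can become nodes with properties, the following sketch sends batches of dictionaries to Neo4j with a single `UNWIND ... MERGE` statement. It is not the content of `neo4j_loading.py`. The `Entity` label, the `id` property, and the helper names are assumptions made for this example, and `execute_query` requires a recent 5.x version of the official `neo4j` Python driver.

```python
from typing import Iterator

# Assumed label and key property; adapt them to the actual data model.
MERGE_ENTITIES = """
UNWIND $rows AS row
MERGE (e:Entity {id: row.id})
SET e += row
"""


def chunked(rows: list[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of `rows` with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def load_entities(uri: str, user: str, password: str,
                  rows: list[dict], batch_size: int = 1000) -> None:
    """Create or update one node per row, one batch per query."""
    from neo4j import GraphDatabase  # imported lazily; needs the neo4j package

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        for batch in chunked(rows, batch_size):
            # Keyword arguments are passed to Cypher as query parameters.
            driver.execute_query(MERGE_ENTITIES, rows=batch)
```

`MERGE` keeps the import idempotent: running it twice updates the existing nodes instead of duplicating them.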

### Neo4j installation (from [here](https://github.com/microsoft/graphrag/blob/1a13e0fd93cecca8b10eaa59860e5000d691d417/examples_notebooks/community_contrib/neo4j/graphrag_import_neo4j_cypher.ipynb#L19))
* You can create a free Neo4j instance online; it comes with a credentials file that holds the connection details. Instances are also available in the cloud marketplaces.
* If you want to install Neo4j locally, use either Neo4j Desktop or the official Docker image: `docker run -e NEO4J_AUTH=neo4j/password -p 7687:7687 -p 7474:7474 neo4j`
* *Important*: To execute the cell below, fill `neo4j_config.json` with the Neo4j credentials.
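Before loading, it can help to confirm that the credentials file is complete and that the database is reachable. The sketch below assumes that `neo4j_config.json` holds the keys `uri`, `username`, and `password`; these key names are a guess for illustration and may differ from what `neo4j_loading.py` expects. The helper names are hypothetical as well.

```python
import json
from pathlib import Path

# Assumed key names; check them against the actual neo4j_config.json.
REQUIRED_KEYS = ("uri", "username", "password")


def read_neo4j_config(path: str = "neo4j_config.json") -> dict:
    """Load the JSON config and fail early if a required value is empty."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"{path} is missing values for: {', '.join(missing)}")
    return config


def check_connection(config: dict) -> None:
    """Raise an exception if the server cannot be reached or rejects the login."""
    from neo4j import GraphDatabase  # imported lazily; needs the neo4j package

    auth = (config["username"], config["password"])
    with GraphDatabase.driver(config["uri"], auth=auth) as driver:
        driver.verify_connectivity()


# Example usage (requires a running Neo4j instance):
# check_connection(read_neo4j_config())
```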

In [5]:
import neo4j_loading

neo4j_loading.load()

Data loaded successfully.
