# NetworkX ATLAS KG construction and RAG example
This notebook walks through the full process of building a knowledge graph (KG) with the atlas-rag package and then running retrieval-augmented generation (RAG) over it with the retrievers the package provides.

## ATLAS KG Construction
We suggest running the KG construction code with a local Hugging Face model. LLM API providers often serve optimized, lighter-weight variants of a model to reduce costs (for example, quantized from fp16 to fp8), which can lower output quality and makes performance hard to guarantee.

ATLAS KG construction consists of five steps:
- Triples JSON generation (base KG JSON)
- Convert triples JSON to triples CSV
- Conceptualize entities in the triples CSV
- Merge the concept CSV into the triples CSV
- Convert CSV to GraphML for NetworkX RAG, or to Neo4j dumps for billion-scale KG RAG
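
The five steps map onto the `KnowledgeGraphExtractor` methods called later in this notebook. A minimal sketch of that order, assuming `extractor` is any object exposing those methods; the `run_kg_pipeline` helper and `PIPELINE_STEPS` list are hypothetical and not part of atlas-rag:

```python
# Hypothetical helper: runs the five construction steps in order.
# The method names are the ones called on kg_extractor in this notebook.
PIPELINE_STEPS = [
    "run_extraction",             # 1. triples JSON generation
    "convert_json_to_csv",        # 2. triples JSON -> triples CSV
    "generate_concept_csv_temp",  # 3. conceptualization
    "create_concept_csv",         # 4. merge concept CSV into triples CSV
    "convert_to_graphml",         # 5. CSV -> GraphML for NetworkX
]


def run_kg_pipeline(extractor, steps=PIPELINE_STEPS):
    """Call each step on the extractor in order; return the names that ran."""
    completed = []
    for name in steps:
        getattr(extractor, name)()
        completed.append(name)
    return completed
```

Calling `run_kg_pipeline(kg_extractor)` would run the NetworkX route end to end. For the Neo4j route, the last step is replaced by the numeric-ID, embedding, and FAISS index calls shown under Choice 2.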

In [None]:
import os 
os.environ['CUDA_VISIBLE_DEVICES'] = '1'
from atlas_rag.kg_construction.triple_extraction import KnowledgeGraphExtractor
from atlas_rag.kg_construction.triple_config import ProcessingConfig
from atlas_rag.llm_generator import LLMGenerator
from openai import OpenAI
#from transformers import pipeline
from configparser import ConfigParser

# Load API key from config file
config = ConfigParser()
config.read('config.ini')

# model_name = "meta-llama/Llama-3.3-70B-Instruct"
client = OpenAI(
  #base_url="https://api.deepinfra.com/v1/openai",
  api_key=config['settings']['OPENAI_API_KEY']
)
model_name = "gpt-4o"
#model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
# model_name = "meta-llama/Llama-3.2-3B-Instruct"
# client = pipeline(
#     "text-generation",
#     model=model_name,
#     device_map="auto",
# )
filename_pattern = 'CICGPC_Glazing_ver1.0a'
output_directory = f'import/{filename_pattern}'
triple_generator = LLMGenerator(client, model_name=model_name)
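
The cell above reads the API key from `config.ini`. A minimal sketch of the layout it expects, assuming a `[settings]` section with an `OPENAI_API_KEY` entry; the key value below is a placeholder:

```python
from configparser import ConfigParser

# Placeholder contents for config.ini; replace the value with a real key.
EXAMPLE_CONFIG = """
[settings]
OPENAI_API_KEY = sk-your-key-here
"""

config = ConfigParser()
# config.read('config.ini') would load the same layout from disk.
config.read_string(EXAMPLE_CONFIG)

# Fail early with a clear message if the section or key is missing.
if not config.has_option("settings", "OPENAI_API_KEY"):
    raise KeyError("config.ini needs a [settings] section with OPENAI_API_KEY")

api_key = config["settings"]["OPENAI_API_KEY"]
```

Keeping the key in a separate file that is excluded from version control avoids committing credentials together with the notebook.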

In [3]:
kg_extraction_config = ProcessingConfig(
      model_path=model_name,
      data_directory="example_data",
      filename_pattern=filename_pattern,
      batch_size_triple=3,
      batch_size_concept=16,
      output_directory=f"{output_directory}",
      max_new_tokens=2048,
      max_workers=3,
      remove_doc_spaces=True, # For removing duplicated spaces in the document text
)
kg_extractor = KnowledgeGraphExtractor(model=triple_generator, config=kg_extraction_config)

### Triples Generation

In [4]:
# construct entity&event graph
kg_extractor.run_extraction()

Found data files: ['CICGPC_Glazing_ver1.0a.json']


Generating train split: 0 examples [00:00, ? examples/s]

Processing shard 1/1 (texts 0-0 of 1, 1 documents)
Generated 10 chunks for shard 1/1
Model: gpt-4o


 25%|███████████████████████████████▊                                                                                               | 1/4 [00:36<01:49, 36.42s/it]

Processed 1 batches (3 chunks)


 50%|███████████████████████████████████████████████████████████████▌                                                               | 2/4 [01:15<01:15, 37.84s/it]

Processed 2 batches (6 chunks)
Item 0 Entity must be a non-empty array. Problematic item: {'Event': 'Initiatives taken to reduce energy use and improve energy efficiency.', 'Entity': []}


 75%|███████████████████████████████████████████████████████████████████████████████████████████████▎                               | 3/4 [01:43<00:33, 33.61s/it]

Processed 3 batches (9 chunks)


100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [02:01<00:00, 30.32s/it]

Processed 4 batches (12 chunks)
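
The progress bar shows 4 iterations for 10 chunks, which is consistent with the chunks being sent in groups of `batch_size_triple=3`. A quick check of that arithmetic, assuming batches are formed by ceiling division (an inference from the log above, not confirmed against the library code):

```python
import math


def num_batches(num_chunks: int, batch_size: int) -> int:
    """Number of requests needed when chunks are grouped into fixed-size batches."""
    return math.ceil(num_chunks / batch_size)


print(num_batches(10, 3))  # 10 chunks in groups of 3
```

Under the same assumption, the "(12 chunks)" in the final log line looks like batches multiplied by batch size rather than the true chunk count.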





In [5]:
# Convert Triples Json to CSV
kg_extractor.convert_json_to_csv()

Loading data from the json files
Number of files:  2


100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 57.29it/s]

Processing file for file ids:  gpt-4o_CICGPC_Glazing_ver1.0a_output_20250723162400_1_in_1.json
Processing file for file ids:  meta-llama_Meta-Llama-3.1-8B-Instruct_CICGPC_Glazing_ver1.0a_output_20250706122933_1_in_1.json
Data to CSV completed successfully.





In [6]:
# Concept Generation
kg_extractor.generate_concept_csv_temp()

all_batches 64


Shard_0:   0%|                                                                                                                             | 0/64 [00:00<?, ?it/s]2025-07-23 16:26:43,847 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-07-23 16:26:44,028 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-07-23 16:26:44,218 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-07-23 16:26:44,583 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-07-23 16:26:44,711 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-07-23 16:26:44,969 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-07-23 16:26:45,408 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-07-23 16:26:45,770 - INFO - HTTP Request: 

Number of unique conceptualized nodes: 1317
Number of unique conceptualized events: 351
Number of unique conceptualized entities: 660
Number of unique conceptualized relations: 485





In [7]:
kg_extractor.create_concept_csv()

Loading concepts...


998it [00:00, 142674.10it/s]


Loading concepts done.
Relation to concepts: 137
Node to concepts: 861
Processing triple nodes...


861it [00:00, 74927.81it/s]


Processing concept nodes...


100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 860/860 [00:00<00:00, 558115.65it/s]


Processing triple edges...


978it [00:00, 186965.78it/s]


# Choice 1: Convert to GraphML for NetworkX RAG

In [4]:
# convert csv to graphml for networkx
kg_extractor.convert_to_graphml()
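
GraphML is plain XML, so the exported file can be sanity-checked without loading the whole graph. A minimal standard-library sketch that counts nodes and edges; `count_graphml_elements` is a hypothetical helper and the sample document is made up for illustration:

```python
import io
import xml.etree.ElementTree as ET

# Standard GraphML XML namespace, in ElementTree's {uri}tag notation.
GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"


def count_graphml_elements(source):
    """Return (node_count, edge_count) for a GraphML file path or file object."""
    nodes = edges = 0
    # iterparse streams the document; each element is cleared once counted.
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == GRAPHML_NS + "node":
            nodes += 1
        elif elem.tag == GRAPHML_NS + "edge":
            edges += 1
        elem.clear()
    return nodes, edges


SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="directed">
    <node id="n0"/>
    <node id="n1"/>
    <edge source="n0" target="n1"/>
  </graph>
</graphml>"""

print(count_graphml_elements(io.BytesIO(SAMPLE)))
```

For this notebook, the path to pass would be the one reported when the graph is loaded later: `import/CICGPC_Glazing_ver1.0a/kg_graphml/CICGPC_Glazing_ver1.0a_graph.graphml`.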

## ATLAS Multihop QA

To perform RAG, first create embeddings and a FAISS index for the constructed KG.

[Note: for NV-Embed-v2, results may differ depending on whether the model is loaded through AutoModel or SentenceTransformer.]
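
The indexing call below passes `normalize_embeddings=True`. With unit-length vectors, the inner product between a query and a node embedding equals their cosine similarity, so an inner-product search ranks by cosine. A toy pure-Python sketch of that idea; the vectors are made up, and the notebook itself relies on the encoder and a FAISS index rather than this code:

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query, vectors, k=2):
    """Return indices of the k vectors with the highest inner product."""
    q = normalize(query)
    scores = [sum(a * b for a, b in zip(q, normalize(v))) for v in vectors]
    return sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)[:k]


# Three toy "node embeddings" and one query vector.
nodes = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(top_k([1.0, 0.05], nodes))
```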

In [7]:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1'
from sentence_transformers import SentenceTransformer
from atlas_rag.vectorstore.embedding_model import NvEmbed, SentenceEmbedding
from transformers import AutoModel
# Load the SentenceTransformer model
encoder_model_name = "sentence-transformers/all-MiniLM-L6-v2"
sentence_model = SentenceTransformer(encoder_model_name, trust_remote_code=True, model_kwargs={'device_map': "auto"})
sentence_encoder = SentenceEmbedding(sentence_model)
# sentence_model.max_seq_length = 32768
# sentence_model.tokenizer.padding_side="right"
# sentence_model = AutoModel.from_pretrained(encoder_model_name, trust_remote_code=True, device_map="auto")
# sentence_encoder = NvEmbed(sentence_model)

In [None]:
from openai import OpenAI
from atlas_rag.llm_generator import LLMGenerator
from configparser import ConfigParser

# Load API key from config file
config = ConfigParser()
config.read('config.ini')

# reader_model_name = "meta-llama/llama-3.3-70b-instruct"
reader_model_name = "gpt-4o"

# Alternative API providers (commented out)
#client = OpenAI(
  # base_url="https://openrouter.ai/api/v1",
  # api_key=config['settings']['OPENROUTER_API_KEY'],
  #base_url="https://api.deepinfra.com/v1/openai",
  #api_key=config['settings']['DEEPINFRA_API_KEY'],
#)

client = OpenAI(
  #base_url="https://api.deepinfra.com/v1/openai",
  api_key=config['settings']['OPENAI_API_KEY']
)
llm_generator = LLMGenerator(client=client, model_name=reader_model_name)

In [10]:
from atlas_rag.vectorstore import create_embeddings_and_index
keyword = 'CICGPC_Glazing_ver1.0a'
working_directory = f'import/{keyword}'
data = create_embeddings_and_index(
    sentence_encoder=sentence_encoder,
    model_name = encoder_model_name,
    working_directory=working_directory,
    keyword=keyword,
    include_concept=True,
    include_events=True,
    normalize_embeddings= True,
    text_batch_size=64,
    node_and_edge_batch_size=64,
)

Using encoder model: all-MiniLM-L6-v2
Loading graph from import/CICGPC_Glazing_ver1.0a/kg_graphml/CICGPC_Glazing_ver1.0a_graph.graphml


100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1731/1731 [00:00<00:00, 1122675.15it/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1731/1731 [00:00<00:00, 2391416.41it/s]
5886it [00:00, 4548208.06it/s]


Computing text embeddings...


Encoding texts: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.86s/it]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 448.64it/s]


Node and edge embeddings not found, computing...


Encoding nodes: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 27/27 [00:04<00:00,  6.31it/s]
Encoding edges: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 79/79 [00:06<00:00, 12.17it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 54/54 [00:00<00:00, 1065.81it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 158/158 [00:00<00:00, 1035.48it/s]

Node and edge embeddings already computed.





In [11]:
# Initialize desired RAG method for benchmarking
from atlas_rag.retriever import HippoRAG2Retriever
from atlas_rag import setup_logger

hipporag2_retriever = HippoRAG2Retriever(
    llm_generator=llm_generator,
    sentence_encoder=sentence_encoder,
    data = data,
)

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1731/1731 [00:00<00:00, 387424.77it/s]


In [12]:
# perform retrieval
content, sorted_context_ids = hipporag2_retriever.retrieve("How is the U-value relevant to thermal insulation performance in glazing products?", topN=3)
print(f"Retrieved content: {content}")

Retrieved content: ['he limits below (Table 1). Bonus points will be awarded if the U-value of the product reaches the standard in Table 1. | U-value (W/m2K) | single glazing | double or tripled glazing | Points | |-----------------|----------------|---------------------------|--------| | | ≤ 5.8 | ≤ 3.30 | | | ≤ 3.7 | ≤ 2.30 | +5<br>(bonus) | | *Table 1: Limits of U-value for single, double or tripled glazing products* #### Verification Laboratory test report(s) on the U-value of the glazing product. The U-value should be measured according to the applicable ISO 8990 or EN-ISO 12567 standard. Alternatively, the U-value can be calculated according to the standard EN673 (Glazing) and EN ISO 10077 (Frame / Casement). ### <span id="page-10-2"></span>*4.2.2 Shading Coefficient* #### 15 Basic + 5 / 10 Bonus Points (Core Criterion) Shading coefficient (SC) and solar heat gain coefficient **(**SHGC) can portray how well a product blocks heat caused by sunlight. The lower the glazing\'s SHGC, 

In [13]:
# start benchmarking
sorted_context = "\n".join(content)
llm_generator.generate_with_context("How is the U-value relevant to thermal insulation performance in glazing products?", sorted_context, max_new_tokens=2048, temperature=0.5)

'Thought: The U-value is a measure of thermal transmittance, indicating how well a material conducts heat. In the context of glazing products, a lower U-value signifies better insulation properties, as it means less heat is transferred through the glazing. This is crucial for energy efficiency in buildings, as it helps maintain indoor temperatures and reduce the need for heating or cooling. The assessment criteria provided in the document outline specific U-value limits for single, double, or triple glazing, with bonus points awarded for achieving lower U-values, thereby incentivizing better thermal insulation performance.\n\nAnswer: Measure of thermal transmittance indicating insulation efficiency.'
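
The generator returns a single string containing a `Thought:` section followed by an `Answer:` section, as in the output above. A small sketch for pulling out just the answer, assuming that format holds; `extract_answer` is a hypothetical helper, not part of atlas-rag:

```python
def extract_answer(generation: str) -> str:
    """Return the text after the last 'Answer:' marker, or the whole string if absent."""
    _, found, tail = generation.rpartition("Answer:")
    return tail.strip() if found else generation.strip()


sample = (
    "Thought: a lower U-value means less heat transfer.\n\n"
    "Answer: Measure of thermal transmittance."
)
print(extract_answer(sample))
```

Falling back to the full string keeps the helper usable when a model ignores the expected format.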

# Choice 2: Convert to Neo4j dumps

In [14]:
from sentence_transformers import SentenceTransformer
from atlas_rag.vectorstore.embedding_model import SentenceEmbedding
# use sentence embedding if you want to use sentence transformer
# use NvEmbed if you want to use NvEmbed-v2 model
sentence_model = SentenceTransformer('all-MiniLM-L6-v2')
sentence_encoder = SentenceEmbedding(sentence_model)

In [None]:
# add numeric id to the csv so that we can use vector indices
kg_extractor.add_numeric_id()

# compute embedding
kg_extractor.compute_kg_embedding(sentence_encoder) # default encoder_model_name="all-MiniLM-L12-v2"; computes all embeddings except concept-related ones
# kg_extractor.compute_embedding(encoder_model_name="all-MiniLM-L12-v2")
# kg_extractor.compute_embedding(encoder_model_name="nvidia/NV-Embed-v2")

# create faiss index
kg_extractor.create_faiss_index() # default index_type="HNSW,Flat", other options: "IVF65536_HNSW32,Flat" for large KG
# kg_extractor.create_faiss_index(index_type="HNSW,Flat")
# kg_extractor.create_faiss_index(index_type="IVF65536_HNSW32,Flat")


['name:ID', 'type', 'concepts', 'synsets', ':LABEL']


Adding numeric ID: 861it [00:00, 480993.04it/s]


[':START_ID', ':END_ID', 'relation', 'concepts', 'synsets', ':TYPE']


Adding numeric ID: 978it [00:00, 271531.69it/s]


['text_id:ID', 'original_text', ':LABEL']


Adding numeric ID: 10it [00:00, 13319.48it/s]


Processed chunk 0 (100000 rows)
Total number of embeddings: 861
Conversion complete!
Processed chunk 0 (100000 rows)
Total number of embeddings: 10
Conversion complete!


## Install Neo4j Server

Go to the AutoschemaKG/neo4j_scripts directory and run:

```sh get_neo4j_demo.sh```

This installs a Neo4j server in the ```neo4j-server-dulce``` directory.

Start the newly installed, empty Neo4j server to test it:

```sh start_neo4j_demo.sh```



## Configure Neo4j Server

Stop the server before changing the configuration and importing data:

```sh stop_neo4j_demo.sh```


Copy the ```AutoschemaKG/neo4j_scripts/neo4j.conf``` file to the conf directory of the Neo4j server (```neo4j-server-dulce/conf```). Then update the following settings as needed:

1. Set ```dbms.default_database``` to the desired dataset name, such as ```wiki-csv-json-text```, ```pes2o-csv-json-text```, or ```cc-csv-json-text```. In this example it is ```dulce-csv-json-text```.
2. Configure the Bolt, HTTP, and HTTPS connectors according to your requirements.

In ```neo4j-server-dulce/conf/neo4j.conf```, the connector ports are set to non-default values to avoid port conflicts:

 
``` 
# Bolt connector
server.bolt.enabled=true
#server.bolt.tls_level=DISABLED
server.bolt.listen_address=0.0.0.0:8612
server.bolt.advertised_address=:8612

# HTTP Connector. There can be zero or one HTTP connectors.
server.http.enabled=true
server.http.listen_address=0.0.0.0:7612
server.http.advertised_address=:7612

# HTTPS Connector. There can be zero or one HTTPS connectors.
server.https.enabled=false
server.https.listen_address=0.0.0.0:7781
server.https.advertised_address=:7781
```
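
Before starting the server, it can help to confirm that nothing else is already listening on the chosen ports. A minimal standard-library sketch; `port_is_free` is a hypothetical helper, and the port numbers are the ones from the configuration above:

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when a listener accepts the connection.
        return sock.connect_ex((host, port)) != 0


for name, port in {"bolt": 8612, "http": 7612, "https": 7781}.items():
    status = "free" if port_is_free(port) else "in use"
    print(f"{name} port {port}: {status}")
```

If a port is reported as in use, pick another value and update both the `listen_address` and `advertised_address` lines for that connector.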


## Import Data
We use the admin import method to import data, which is the fastest way. Other methods are too slow for large graphs.


## Load the CSV files into Neo4j

We import data from the previously constructed CSV files with numeric IDs. All of the CSV files are in ```import/Dulce```.
There are six CSV files in total, covering the nodes and edges of triples, text chunks, and concepts.

``` shell
./neo4j-server-dulce/bin/neo4j-admin database import full dulce-csv-json-text \
    --nodes ./import/Dulce/triples_csv/triple_nodes_Dulce_from_json_without_emb_with_numeric_id.csv \
    --nodes ./import/Dulce/triples_csv/text_nodes_Dulce_from_json_with_numeric_id.csv \
    --nodes ./import/Dulce/concept_csv/concept_nodes_Dulce_from_json_with_concept.csv \
    --relationships ./import/Dulce/triples_csv/triple_edges_Dulce_from_json_without_emb_with_numeric_id.csv \
    --relationships ./import/Dulce/triples_csv/text_edges_Dulce_from_json.csv \
    --relationships ./import/Dulce/concept_csv/concept_edges_Dulce_from_json_with_concept.csv  \
    --overwrite-destination \
    --multiline-fields=true \
    --id-type=string \
    --verbose --skip-bad-relationships=true
```
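
The same command can be assembled in Python when importing several datasets. A minimal sketch, assuming other datasets follow the same file-naming pattern as `Dulce`; `build_import_command` is a hypothetical helper, and it only uses the flags shown in the command above:

```python
import shlex


def build_import_command(server_dir: str, database: str, dataset: str) -> list:
    """Assemble the neo4j-admin import call for CSVs under ./import/<dataset>."""
    triples = f"./import/{dataset}/triples_csv"
    concepts = f"./import/{dataset}/concept_csv"
    return [
        f"./{server_dir}/bin/neo4j-admin", "database", "import", "full", database,
        "--nodes", f"{triples}/triple_nodes_{dataset}_from_json_without_emb_with_numeric_id.csv",
        "--nodes", f"{triples}/text_nodes_{dataset}_from_json_with_numeric_id.csv",
        "--nodes", f"{concepts}/concept_nodes_{dataset}_from_json_with_concept.csv",
        "--relationships", f"{triples}/triple_edges_{dataset}_from_json_without_emb_with_numeric_id.csv",
        "--relationships", f"{triples}/text_edges_{dataset}_from_json.csv",
        "--relationships", f"{concepts}/concept_edges_{dataset}_from_json_with_concept.csv",
        "--overwrite-destination",
        "--multiline-fields=true",
        "--id-type=string",
        "--verbose",
        "--skip-bad-relationships=true",
    ]


cmd = build_import_command("neo4j-server-dulce", "dulce-csv-json-text", "Dulce")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains both the server folder and `import/`.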

When the import finishes, you should see output like the following:

```shell
IMPORT DONE in 2s 475ms. 
Imported:
  1183 nodes
  2519 relationships
  6743 properties
Peak memory usage: 1.032GiB
```

Then start hosting the database by running the following in ```./neo4j_scripts```:

```sh start_neo4j_demo.sh```

If you see a line like the following, the server is running:


```Started neo4j (pid:742490). It is available at http://0.0.0.0:7612```



To connect with the Python driver, use the Bolt port 8612. You can also open http://0.0.0.0:7612 in a browser to use the Neo4j GUI.

The default user is ```neo4j``` with password ```admin2024```. 
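
A minimal connectivity check with the official `neo4j` Python driver, assuming the Bolt port, credentials, and database name given above and that the driver is installed (`pip install neo4j`); `check_neo4j` is a hypothetical helper:

```python
def check_neo4j(uri="bolt://localhost:8612", user="neo4j", password="admin2024",
                database="dulce-csv-json-text"):
    """Connect over Bolt and return the number of nodes in the database."""
    # Imported here so the function can be defined without the driver installed.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        driver.verify_connectivity()  # raises if the server is unreachable
        with driver.session(database=database) as session:
            record = session.run("MATCH (n) RETURN count(n) AS nodes").single()
            return record["nodes"]
```

Calling `check_neo4j()` against the imported database should return the node count reported by the import step.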


## ATLAS Billion Level RAG
The LargeKGRetriever is designed to perform retrieval on a billion-level graph. 

There is a trade-off between retrieval performance and speed; this serves as a proof of concept for a billion-level knowledge graph.

After successfully hosting the Neo4j database, you can run the provided Python script to host the RAG API:
```shell
python neo4j_api_host/atlas_api_demo.py 
```

During the first startup of the API, it will create the necessary indexes and projection graphs in the Neo4j database for faster queries and computations. The time required for this process may vary depending on the size of the database. You can monitor the creation of these items in http://localhost:7612 by using the following commands:

To view the projected graphs:
```cypher
CALL gds.graph.list()
```
To view the indexes:
```cypher
SHOW INDEXES
```

The projected graph will be deleted after the database is shut down, while the indexes will not be removed.

Once you see the following messages: \
Index NodeNumericIDIndex created in 0.09 seconds \
Index TextNumericIDIndex created in 0.11 seconds \
Index EntityEventEdgeNumericIDIndex created in 0.02 seconds \
Projection graph largekgrag_graph created in 5.42 seconds 

You can perform RAG as follows:

In [None]:
from openai import OpenAI

base_url ="http://0.0.0.0:10085/v1/"
client = OpenAI(api_key="EMPTY", base_url=base_url)

# knowledge graph en_simple_wiki_v0
message = [
    {
        "role": "system",
        "content": "You are a helpful assistant that answers questions based on the knowledge graph.",
    },
    {
        "role": "user",
        "content": "Question: Who is Alex Mercer?",
    }
]
response = client.chat.completions.create(
    model="llama",
    messages=message,
    max_tokens=2048,
    temperature=0.5
)
print(response.choices[0].message.content)