# Knowledge Graph Builder Example

This notebook builds a knowledge graph from DataFrames using the functions defined in `kg_builder.py`. The steps are modular and run in order:

1. Import libraries and parse XML data

2. Load configuration and initialize LLM

3. Describe tables using LLM

4. Extract entities & relationships

5. Generate intent training data for chatbot

In [1]:
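# Step 1: Import libraries and parse XML data
import os
import pathlib
import sys
import pandas as pd
from kg_builder import GoogleAIStudioLLM
# Make the modules in ../parser importable from the notebook's working directory.
notebook_dir = pathlib.Path().resolve()
parser_dir = (notebook_dir / '../parser').resolve()
# Insert at the front of sys.path so the local module wins over any other module
# named `parser` (Python 3.9 and older ship a standard-library module with that name).
sys.path.insert(0, str(parser_dir))
from parser import parse_xml
xml_file_path = '../parser/data/RAN_CM_DATA_SAMPLES.xml'  # Adjust path as needed
dfs, metadata, metadata2 = parse_xml(xml_file_path)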

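
The implementation of `parse_xml` is not shown in this notebook. As a rough illustration of the idea behind Step 1, the sketch below groups XML elements by tag into lists of row dictionaries, one list per table, using only the standard library. The XML layout, the `rows_by_tag` helper, and the sample attribute names are assumptions made for illustration; they are not the actual RAN CM schema or the real parser.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample; the real RAN_CM_DATA_SAMPLES.xml layout may differ.
SAMPLE_XML = """
<root>
  <SectorCarrier id="1" txPower="20"/>
  <SectorCarrier id="2" txPower="23"/>
  <DnsClient id="1" host="ns1.example.com"/>
</root>
"""

def rows_by_tag(xml_text):
    """Group the root's direct children by tag; each element's attributes become one row."""
    tables = defaultdict(list)
    root = ET.fromstring(xml_text)
    for element in root:
        tables[element.tag].append(dict(element.attrib))
    return dict(tables)

tables = rows_by_tag(SAMPLE_XML)
for name, rows in tables.items():
    print(name, len(rows), "rows")
```

Each list of row dictionaries could then be turned into a DataFrame with `pd.DataFrame(rows)`, which is one plausible way to arrive at a per-table collection like `dfs`.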

In [15]:
# Step 2: Load configuration and initialize LLM
import yaml
CONFIG_PATH = os.path.join(notebook_dir, "../config.yaml")
with open(CONFIG_PATH, "r") as f:
    config = yaml.safe_load(f)
GOOGLE_API_KEY = config["google_ai_studio"]["api_key"]
GOOGLE_MODEL = config["google_ai_studio"].get("model", "gemini-1.5-flash-latest")
llm = GoogleAIStudioLLM(api_key=GOOGLE_API_KEY, model=GOOGLE_MODEL)
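
Step 2 reads the API key straight from `config.yaml`. If the goal is to keep secrets out of version control, one option is to prefer an environment variable and fall back to the config file. The sketch below assumes a parsed config dictionary shaped like the one above; the `resolve_api_key` helper and the `GOOGLE_API_KEY` environment variable name are hypothetical and not part of `kg_builder`.

```python
import os

def resolve_api_key(config, env_var="GOOGLE_API_KEY"):
    """Return the API key from the environment if set, else from the parsed config dict."""
    key = os.environ.get(env_var)
    if key:
        return key
    key = config.get("google_ai_studio", {}).get("api_key")
    if not key:
        raise KeyError(f"No API key found in ${env_var} or config['google_ai_studio']['api_key']")
    return key

# In-memory config shaped like the one loaded from config.yaml (values are placeholders).
example_config = {
    "google_ai_studio": {"api_key": "key-from-config", "model": "gemini-1.5-flash-latest"}
}
# KG_DEMO_UNSET_VAR is a made-up variable name, so this falls back to the config value
# unless that variable happens to be set in the environment.
print(resolve_api_key(example_config, env_var="KG_DEMO_UNSET_VAR"))
```

Failing early with a clear error when no key is found tends to be easier to debug than a failed request later in the pipeline.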

In [16]:
# Step 3: Describe tables using LLM
from kg_builder import describe_tables
descriptions = describe_tables(dfs, llm)
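
How `describe_tables` builds its prompts is not shown here. As a minimal sketch of the general idea, a table description prompt can be assembled from the table name, its column names, and a few sample rows. The `build_table_prompt` helper and the prompt wording below are hypothetical.

```python
def build_table_prompt(table_name, columns, sample_rows, max_rows=3):
    """Format a table's name, columns, and a few sample rows into a description prompt."""
    lines = [
        f"Describe the purpose of the table '{table_name}' in one or two sentences.",
        f"Columns: {', '.join(columns)}",
        "Sample rows:",
    ]
    for row in sample_rows[:max_rows]:
        lines.append("  " + ", ".join(f"{col}={row.get(col)!r}" for col in columns))
    return "\n".join(lines)

# With a DataFrame `df`, the inputs could come from list(df.columns) and
# df.head(3).to_dict("records"); plain Python values are used here instead.
prompt = build_table_prompt(
    "SectorCarrier",
    ["id", "txPower"],
    [{"id": "1", "txPower": "20"}, {"id": "2", "txPower": "23"}],
)
print(prompt)
```

Capping the number of sample rows keeps the prompt short, which matters when many tables are described in one run.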

In [17]:
# Step 4: Extract entities & relationships from descriptions
from kg_builder import extract_entities_and_relations
entities, relationships = extract_entities_and_relations(descriptions, llm)
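
The return format of `extract_entities_and_relations` is not documented in this notebook. The sketch below shows one way an LLM reply could be turned into entities and relationships, assuming the model is asked to answer in JSON. The JSON schema, the `Relationship` dataclass, the `parse_llm_graph` helper, and the relation names are all hypothetical.

```python
import json
from dataclasses import dataclass

FENCE = "`" * 3  # Markdown code-fence marker that models often wrap JSON in

@dataclass(frozen=True)
class Relationship:
    source: str
    relation: str
    target: str

def parse_llm_graph(response_text):
    """Parse a JSON reply into (entities, relationships), dropping edges to unknown entities."""
    text = response_text.strip()
    if text.startswith(FENCE):
        text = text.strip("`")          # remove the surrounding fence markers
        if text.startswith("json"):
            text = text[len("json"):]   # remove the optional language tag
    data = json.loads(text)
    entities = sorted(set(data.get("entities", [])))
    relationships = [
        Relationship(r["source"], r["relation"], r["target"])
        for r in data.get("relationships", [])
        if r["source"] in entities and r["target"] in entities
    ]
    return entities, relationships

# Made-up reply in the assumed schema, wrapped in a code fence.
payload = {
    "entities": ["SectorCarrier", "EUtraNetwork"],
    "relationships": [
        {"source": "SectorCarrier", "relation": "BELONGS_TO", "target": "EUtraNetwork"},
        {"source": "SectorCarrier", "relation": "USES", "target": "Unknown"},
    ],
}
reply = FENCE + "json\n" + json.dumps(payload) + "\n" + FENCE
entities, relationships = parse_llm_graph(reply)
print(entities)
print(relationships)
```

Dropping relationships that point at entities the model never listed is a design choice in this sketch; a real pipeline might log them instead.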

In [18]:
# Step 5: Generate intent training data for chatbot
from kg_builder import generate_intent_training_data
intent_csv = generate_intent_training_data(
    cache_dir=".kg_cache",
    output_csv="intent_training_data.csv",
    llm=llm,
    queries_per_entity=5,
)
print(f"Intent training data saved to: {intent_csv}")

[WARN] LLM failed for DnsClient: 429 Client Error: Too Many Requests for url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent
[WARN] LLM failed for EndcProfile: 429 Client Error: Too Many Requests for url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent
[WARN] LLM failed for EndcProfile: 429 Client Error: Too Many Requests for url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent
[WARN] LLM failed for AutoProvisioning: 429 Client Error: Too Many Requests for url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent
[WARN] LLM failed for AutoProvisioning: 429 Client Error: Too Many Requests for url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent
[WARN] LLM failed for SectorCarrier: 429 Client Error: Too Many Requests for url: https://generativelanguage.goo

[WARN] LLM failed for GeranFrequency: 429 Client Error: Too Many Requests for url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent
[WARN] LLM failed for EUtraNetwork: 429 Client Error: Too Many Requests for url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent
[WARN] LLM failed for EUtraNetwork: 429 Client Error: Too Many Requests for url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent
[WARN] LLM failed for NonPlannedPciDrxProfile: 429 Client Error: Too Many Requests for url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent
[WARN] LLM failed for NonPlannedPciDrxProfile: 429 Client Error: Too Many Requests for url: https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent
[WARN] LLM failed for AnrFunctionNR: 429 Client Error: Too Many Requests for url: https://g
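
The warnings above show the API rejecting requests with HTTP 429 (Too Many Requests), which means the calls hit a rate limit. A common mitigation is to retry with exponential backoff and jitter. Whether `kg_builder` already retries is not shown here, so the sketch below is a generic illustration: `RateLimitError` and `call_with_backoff` are hypothetical names, and in a real client the `except` clause would need to match whatever exception is actually raised for a 429 response.

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for whatever exception the client raises on HTTP 429."""

def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call func(); on RateLimitError wait base_delay * 2**attempt (capped, plus jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))

# Demo: a fake call that is rate-limited twice before succeeding.
calls = {"count": 0}

def flaky_call():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"

waits = []
result = call_with_backoff(flaky_call, sleep=waits.append)  # record waits instead of sleeping
print(result, calls["count"], len(waits))
```

Lowering `queries_per_entity` or spacing out requests would also reduce the request volume, independently of any retry logic.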