# Lesson 8 - Knowledge Graph Construction - Part I

With all the plans in place, it's time to construct the knowledge graph. 

For the **domain graph** construction, no agent is required. The construction plan has all the information needed to drive a rule-based import.

<img src="images/domain.png" width="600">

**Note**: This notebook uses Cypher queries to build the domain graph from CSV files. Don't worry if you're unfamiliar with Cypher — focus on understanding the big picture of how the structured data is transformed into a graph structure based on the construction plan.

## 8.1. Tool

A single tool builds the knowledge graph from the defined construction rules.
- Input: `approved_construction_plan`
- Output: a domain graph in Neo4j
- Tools: `construct_domain_graph` + helper functions

**Workflow**

1. The context is initialized with an `approved_construction_plan` and `approved_files`
2. Process all the node construction rules
3. Process all the relationship construction rules


## 8.2. Setup

The usual import of needed libraries, loading of environment variables, and connection to Neo4j.

## 8.3. Tool Definitions (Domain Graph Construction)

The `construct_domain_graph` tool is responsible for constructing the "domain graph" from CSV files,
according to the approved construction plan.

### Function: create_uniqueness_constraint

This function creates a uniqueness constraint in Neo4j so that no two nodes with the same label can share the same value for the given property.
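As a rough illustration, the constraint could be created along the following lines. This is a minimal sketch, not the notebook's actual code: the `run_query` parameter is a hypothetical callable standing in for whatever sends Cypher to Neo4j, and the constraint naming scheme is an assumption.

```python
def create_uniqueness_constraint(run_query, label, unique_property_key):
    """Create a uniqueness constraint for (label, property) if none exists.

    run_query is a hypothetical callable (query, params) -> result that
    forwards the Cypher statement to Neo4j.
    """
    # The naming scheme is an assumption of this sketch.
    constraint_name = f"{label}_{unique_property_key}_constraint"
    # Backticks escape labels and property keys that contain unusual characters.
    query = (
        f"CREATE CONSTRAINT `{constraint_name}` IF NOT EXISTS "
        f"FOR (n:`{label}`) REQUIRE n.`{unique_property_key}` IS UNIQUE"
    )
    return run_query(query, {})
```

`IF NOT EXISTS` makes the statement safe to re-run, which matters when the notebook is executed more than once.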

### Function: load_nodes_from_csv

This function performs batch loading of nodes from a CSV file into Neo4j. It uses the `LOAD CSV` command with the `MERGE` operation to create nodes while avoiding duplicates based on the unique column. The Cypher query processes data in batches of 1000 rows for better performance.

**Note**: The CSV files are stored in the `/import` directory of the Neo4j database. When a query uses `LOAD CSV FROM "file:///" + $source_file`, Neo4j resolves the file name against the `/import` directory by default.
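The batch load described above might look roughly like this. It is a sketch under assumptions: `run_query` is a hypothetical callable that sends Cypher to Neo4j, the node's key property is assumed to have the same name as the CSV column, and the scoped `CALL (row) { ... }` form assumes a recent Neo4j 5 release.

```python
def load_nodes_from_csv(run_query, source_file, label, unique_column_name, properties):
    """Batch-load nodes from a CSV file, merging on the unique column.

    run_query is a hypothetical callable (query, params) -> result.
    """
    # One SET clause per imported property, generated on the Python side.
    set_clauses = "\n".join(
        f"    SET n.`{prop}` = row.`{prop}`" for prop in properties
    )
    query = f"""
LOAD CSV WITH HEADERS FROM "file:///" + $source_file AS row
CALL (row) {{
    MERGE (n:`{label}` {{`{unique_column_name}`: row.`{unique_column_name}`}})
{set_clauses}
}} IN TRANSACTIONS OF 1000 ROWS
"""
    # Only the file name travels as a query parameter; labels and keys are
    # interpolated because this sketch avoids dynamic-label syntax.
    return run_query(query, {"source_file": source_file})
```

`MERGE` keys only on the unique column, so re-importing the same file updates existing nodes instead of duplicating them.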

### Execute Domain Graph Construction

This cell executes the main construction function using the approved construction plan. It builds the complete knowledge graph by importing all nodes and relationships according to the defined rules.

### Function: import_nodes

This function orchestrates the node import process by first creating a uniqueness constraint and then loading nodes from the CSV file. It ensures data integrity by establishing constraints before importing data.
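The constraint-then-load ordering can be sketched as follows. The rule keys (`label`, `unique_column_name`, `source_file`, `properties`) and the `status` field in results are assumptions made for illustration, and the two helpers are passed in as callables only to keep the example self-contained.

```python
def import_nodes(node_rule, create_constraint, load_nodes):
    """Import one node type: constraint first, then the CSV load.

    create_constraint and load_nodes are hypothetical callables that wrap
    the two helper functions described earlier.
    """
    constraint_result = create_constraint(
        node_rule["label"], node_rule["unique_column_name"]
    )
    # Stop early if the constraint could not be created (assumed result shape).
    if constraint_result.get("status") == "error":
        return constraint_result
    return load_nodes(
        node_rule["source_file"],
        node_rule["label"],
        node_rule["unique_column_name"],
        node_rule["properties"],
    )
```

Creating the constraint first also gives Neo4j an index on the unique property, which the subsequent `MERGE` lookups can use.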

### Function: import_relationships

This function imports relationships between nodes from a CSV file. For each row, its Cypher query matches the pair of existing nodes the row refers to and creates a relationship between them, carrying the specified properties.
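A relationship import in this style might be sketched as below. The rule keys and the `run_query` callable are hypothetical, each node's key property is assumed to share its name with the CSV column, and the sketch uses `MERGE` rather than `CREATE` so that re-running the import does not duplicate relationships.

```python
def import_relationships(run_query, rule):
    """Create relationships between existing nodes from a CSV file.

    run_query is a hypothetical callable (query, params) -> result;
    the keys read from rule are assumptions of this sketch.
    """
    rel_type = rule["relationship_type"]
    from_label, from_col = rule["from_node_label"], rule["from_node_column"]
    to_label, to_col = rule["to_node_label"], rule["to_node_column"]
    # One SET clause per relationship property.
    set_clauses = "\n".join(
        f"    SET r.`{prop}` = row.`{prop}`" for prop in rule["properties"]
    )
    query = f"""
LOAD CSV WITH HEADERS FROM "file:///" + $source_file AS row
CALL (row) {{
    MATCH (from_node:`{from_label}` {{`{from_col}`: row.`{from_col}`}})
    MATCH (to_node:`{to_label}` {{`{to_col}`: row.`{to_col}`}})
    MERGE (from_node)-[r:`{rel_type}`]->(to_node)
{set_clauses}
}} IN TRANSACTIONS OF 1000 ROWS
"""
    return run_query(query, {"source_file": rule["source_file"]})
```

Rows whose endpoints cannot be matched simply produce no relationship, which is why the nodes need to be loaded first.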

### Function: construct_domain_graph

This is the main orchestration function that builds the entire domain graph. It processes the construction plan in two phases:
1. **Node Construction**: First imports all nodes to ensure they exist before creating relationships
2. **Relationship Construction**: Then creates relationships between the existing nodes

This two-phase approach prevents relationship creation failures due to missing nodes.
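The two phases can be sketched as follows. The `construction_type` key and its `"node"` / `"relationship"` values are assumptions about the plan's shape, and the import functions are injected as callables purely to keep the example self-contained.

```python
def construct_domain_graph(construction_plan, import_nodes, import_relationships):
    """Build the domain graph in two phases: all nodes, then all relationships.

    import_nodes and import_relationships are hypothetical callables that
    each take a single construction rule.
    """
    rules = list(construction_plan.values())

    # Phase 1: every node rule runs before any relationship rule.
    for rule in rules:
        if rule["construction_type"] == "node":
            import_nodes(rule)

    # Phase 2: relationships can now match their endpoint nodes.
    for rule in rules:
        if rule["construction_type"] == "relationship":
            import_relationships(rule)
```

Because the plan is filtered twice rather than processed in dictionary order, a relationship rule listed before its node rules still runs after them.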

## 8.4. Run construct_domain_graph()

This cell defines the approved construction plan as a dictionary containing rules for creating nodes and relationships. The plan includes:

- **Node Rules**: Define how to create Assembly, Part, Product, and Supplier nodes from CSV files
- **Relationship Rules**: Define how to create Contains, Is_Part_Of, and Supplied_By relationships

Each rule specifies the source file, labels, unique identifiers, and properties to be imported.
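To make the shape of such a plan concrete, here is a hypothetical two-entry fragment. Every key name, file name, column name, and the direction of the relationship is invented for illustration; the actual plan in the notebook may differ.

```python
# Hypothetical fragment of a construction plan (all values are illustrative).
approved_construction_plan = {
    "Assembly": {
        "construction_type": "node",
        "source_file": "assemblies.csv",        # assumed file name
        "label": "Assembly",
        "unique_column_name": "assembly_id",    # assumed column name
        "properties": ["assembly_name"],
    },
    "Contains": {
        "construction_type": "relationship",
        "source_file": "assemblies.csv",        # assumed file name
        "relationship_type": "Contains",
        "from_node_label": "Product",           # assumed direction
        "from_node_column": "product_id",
        "to_node_label": "Assembly",
        "to_node_column": "assembly_id",
        "properties": ["quantity"],
    },
}

# Split the plan by rule type, as the construction function would.
node_rules = [
    name for name, rule in approved_construction_plan.items()
    if rule["construction_type"] == "node"
]
relationship_rules = [
    name for name, rule in approved_construction_plan.items()
    if rule["construction_type"] == "relationship"
]
```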

## 8.5. Inspect the Domain Graph

This cell filters the construction plan to extract only the relationship construction rules. This list will be used in the next cell to verify that all relationships were successfully created in the graph.

The next cell then builds and runs a Cypher query that checks whether every relationship type in the construction plan was actually created in the graph.

The query uses several advanced Cypher features:
- `UNWIND`: Iterates through each relationship construction rule
- `CALL (construction) { ... }`: Subquery that executes for each construction rule
- `MATCH (from)-[r:relationship_type]->(to)`: Matches relationships of the given type, along with their start and end nodes
- `LIMIT 1`: Returns only one example per relationship type

This provides a summary view showing one instance of each relationship pattern in the constructed graph.
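The filtering step and the verification query could be sketched like this. The `construction_type` and `relationship_type` keys are assumptions about the plan's shape, and this variant filters with `WHERE type(r) = ...` instead of a dynamic relationship type in the pattern, so it is not necessarily the notebook's exact query.

```python
def relationship_constructions(construction_plan):
    """Return only the relationship rules from a construction plan."""
    return [
        rule for rule in construction_plan.values()
        if rule.get("construction_type") == "relationship"
    ]

# One sample row per relationship type listed in the plan.
# The rule list is passed as the $relationship_constructions parameter.
VERIFY_RELATIONSHIPS_QUERY = """
UNWIND $relationship_constructions AS construction
CALL (construction) {
    MATCH (from)-[r]->(to)
    WHERE type(r) = construction.relationship_type
    RETURN labels(from) AS from_node, type(r) AS relationship_type, labels(to) AS to_node
    LIMIT 1
}
RETURN from_node, relationship_type, to_node
"""
```

A relationship type that is missing from the graph produces no row at all, so comparing the number of returned rows with the number of relationship rules is a quick completeness check.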