# 🌍 Urban Air Quality Knowledge Graph Workflow

This notebook walks through the full workflow:

- **Validating** structured urban air quality JSON data.
- **Merging** new knowledge into an existing JSON dataset.
- **Importing** validated and merged data into a local Neo4j knowledge graph.

## 🔧 Environment Setup

Before running the notebook, install the required Python packages:

```bash
!pip install -r ../requirements.txt
```

In [3]:
import sys
from pathlib import Path

# Add the parent directory to sys.path so the local `src` package can be imported
sys.path.append(str(Path.cwd().parent))

# Import the project's custom modules
from src.Json_validator import AQ_Json_validator
from src.merge_knowledge import merge_knowledge
from src.neo4j_import import import_air_quality_json_to_neo4j

## ✅ Step 1: Validate JSON Files

First, validate the JSON file to check that every entity referenced in a relation is also defined under a category.

In [5]:
# Path to the baseline (previously validated) JSON file
validated_json_path = Path("../data/baseline KG/Validated_air_quality_knowledge.json")

print("🔍 Validating base JSON file:")
AQ_Json_validator(validated_json_path)

🔍 Validating base JSON file:
✅ JSON validation passed. All entities explicitly match correctly.
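
The kind of check described above can be sketched roughly as follows. This is a minimal illustration, not the actual `AQ_Json_validator` implementation: the `mitigation_measures` layout (category name mapped to a list of entity names) matches what this notebook uses later, but the `source_mitigation_relations` key, its `measure` field, the helper name `find_undefined_measures`, and the sample values are all assumptions made for the example.

```python
from typing import Any


def find_undefined_measures(data: dict[str, Any]) -> list[str]:
    """Return measures used in relations but not defined under any category."""
    # Flatten every category's list into one set of defined entity names.
    defined = {
        name
        for names in data.get("mitigation_measures", {}).values()
        for name in names
    }
    # Hypothetical relation layout: a list of {"source": ..., "measure": ...}.
    relations = data.get("source_mitigation_relations", [])
    missing: list[str] = []
    for relation in relations:
        measure = relation["measure"]
        if measure not in defined and measure not in missing:
            missing.append(measure)
    return missing


# Made-up sample data, for illustration only.
sample = {
    "mitigation_measures": {"TechnologicalMeasure": ["Low emission zones"]},
    "source_mitigation_relations": [
        {"source": "Road transport", "measure": "Low emission zones"},
        {"source": "Road transport", "measure": "Congestion charging"},
    ],
}
print(find_undefined_measures(sample))
```

An empty result would correspond to the "validation passed" case; any returned names are entities that a relation refers to without a matching category entry.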


## 🔗 Step 2: Merge Knowledge

Merge the newly extracted knowledge into the existing validated knowledge.

In [7]:
# Paths to the newly extracted knowledge (input) and the merged result (output)
new_extracted_json_path = Path("../data/output/extracted_knowledge.json")
merged_json_output_path = Path("../data/output/merged_air_quality_knowledge.json")

# Merge the new knowledge into the base file and write the result to disk
merge_knowledge(
    base_filepath=validated_json_path,
    new_filepath=new_extracted_json_path,
    output_filepath=merged_json_output_path
)

✅ JSON files merged successfully into: ../data/output/merged_air_quality_knowledge.json
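
For intuition, merging two category-to-entity mappings without duplicating entries might look like the sketch below. It is not the code inside `merge_knowledge`, whose internals are not shown in this notebook; the helper name `merge_category_lists` and the sample values are hypothetical.

```python
def merge_category_lists(
    base: dict[str, list[str]], new: dict[str, list[str]]
) -> dict[str, list[str]]:
    """Union two category -> entity-list mappings, keeping first-seen order."""
    # Copy the base lists so the inputs are left unmodified.
    merged = {category: list(names) for category, names in base.items()}
    for category, names in new.items():
        target = merged.setdefault(category, [])
        for name in names:
            if name not in target:
                target.append(name)
    return merged


# Made-up sample data, for illustration only.
base = {"TechnologicalMeasure": ["Low emission zones"]}
new = {
    "TechnologicalMeasure": ["Low emission zones", "Electric buses"],
    "PolicyMeasure": ["Congestion charging"],
}
print(merge_category_lists(base, new))
```

A union like this only combines entity lists. Relations coming from the new file may still reference entities that no category defines, which is why re-validating the merged output is worthwhile.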


## 🧪 Step 3 (Optional): Validate Merged JSON

Re-validate the merged JSON to confirm that the merge did not introduce relations that reference undefined entities.

In [9]:
print("🔍 Validating merged JSON file:")
AQ_Json_validator(merged_json_output_path)

🔍 Validating merged JSON file:
🚨 JSON validation errors found:
- Missing mitigation measure entity: 'Cars and vans (petrol and diesel)' in source-mitigation relation.

The merged file fails validation because `'Cars and vans (petrol and diesel)'` appears in a source-mitigation relation but is not defined under any mitigation measure category. The next cell patches the merged file by adding the entity under `TechnologicalMeasure`, then re-validates. Confirm that this category fits your data before reusing the fix.


In [11]:
import json

# Load the merged JSON
with open(merged_json_output_path, "r", encoding="utf-8") as file:
    merged_data = json.load(file)

# Entity reported as missing by the validator, and the category to file it under
missing_measure = 'Cars and vans (petrol and diesel)'
measure_category = 'TechnologicalMeasure'  # adjust if another category fits better

# Create the category if it does not exist yet
if measure_category not in merged_data['mitigation_measures']:
    merged_data['mitigation_measures'][measure_category] = []

# Add the missing entity unless it is already present
if missing_measure not in merged_data['mitigation_measures'][measure_category]:
    merged_data['mitigation_measures'][measure_category].append(missing_measure)
    print(f"✅ Explicitly added missing mitigation measure: '{missing_measure}' to category '{measure_category}'")
else:
    print(f"ℹ️ '{missing_measure}' already explicitly exists in the category '{measure_category}'.")

# Write the corrected JSON back to the same file
with open(merged_json_output_path, "w", encoding="utf-8") as file:
    json.dump(merged_data, file, indent=2)

print("🔍 Re-validating merged JSON file explicitly after correction:")
AQ_Json_validator(merged_json_output_path)

✅ Explicitly added missing mitigation measure: 'Cars and vans (petrol and diesel)' to category 'TechnologicalMeasure'
🔍 Re-validating merged JSON file explicitly after correction:
✅ JSON validation passed. All entities explicitly match correctly.
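
If the validator reports several missing entities, the same fix can be wrapped in a small helper that works on the loaded dictionary. The sketch below is one possible shape; the function name `add_entity` and its return convention are assumptions, not part of the project's `src` modules.

```python
from typing import Any


def add_entity(data: dict[str, Any], section: str, category: str, name: str) -> bool:
    """Add `name` under data[section][category]; return True if it was added."""
    # setdefault creates the section and category on first use.
    entities = data.setdefault(section, {}).setdefault(category, [])
    if name in entities:
        return False
    entities.append(name)
    return True


data = {"mitigation_measures": {}}
added = add_entity(
    data,
    "mitigation_measures",
    "TechnologicalMeasure",
    "Cars and vans (petrol and diesel)",
)
print(added, data)
```

Calling the helper a second time with the same arguments returns `False` and leaves the data unchanged, so it is safe to rerun the cell.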


## 🗄️ Step 4: Import Data into Neo4j

Import the validated, merged JSON data into your local Neo4j knowledge graph.

### 🚩 Neo4j Local Setup (required):

- Ensure Neo4j is running at `bolt://localhost:7687`.
- Confirm your Neo4j credentials (username/password).
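
A quick way to see whether anything is listening on the Bolt port before running the import is a plain TCP connection attempt. The sketch below uses only the standard library; the helper name `bolt_port_is_open` is hypothetical. It checks reachability only, so it does not verify credentials or that the listener is actually Neo4j.

```python
import socket
from urllib.parse import urlparse


def bolt_port_is_open(uri: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URI's host and port succeeds."""
    parsed = urlparse(uri)
    host = parsed.hostname or "localhost"
    port = parsed.port or 7687  # fall back to the port used in this notebook
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


print(bolt_port_is_open("bolt://localhost:7687"))
```

If this prints `False`, start the Neo4j server (or check the port) before moving on to the import cell.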

In [None]:
# Neo4j connection details (adjust to your local setup)
neo4j_uri = "bolt://localhost:7687"
neo4j_user = "neo4j"
neo4j_password = "66666666"  # <-- Replace with your own password

# Import the merged JSON data into Neo4j
import_air_quality_json_to_neo4j(
    json_filepath=merged_json_output_path,
    uri=neo4j_uri,
    username=neo4j_user,
    password=neo4j_password
)
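
To avoid keeping a password in the notebook, the connection details could instead be read from environment variables. The sketch below is one way to do that; the variable names `NEO4J_URI`, `NEO4J_USER`, and `NEO4J_PASSWORD` and the helper `load_neo4j_settings` are assumptions for illustration, not something the project defines.

```python
import os
from collections.abc import Mapping
from typing import Optional


def load_neo4j_settings(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Read connection settings from environment variables, with local defaults."""
    env = os.environ if env is None else env
    password = env.get("NEO4J_PASSWORD")
    if not password:
        raise RuntimeError("Set the NEO4J_PASSWORD environment variable first.")
    return {
        "uri": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "username": env.get("NEO4J_USER", "neo4j"),
        "password": password,
    }


# Demonstration with an explicit mapping instead of the real environment.
settings = load_neo4j_settings({"NEO4J_PASSWORD": "example-password"})
print(settings["uri"], settings["username"])

# The keys match the keyword arguments used in the cell above, so in the
# notebook the call could become:
# import_air_quality_json_to_neo4j(json_filepath=merged_json_output_path, **settings)
```

Raising an error when the password is missing makes a misconfigured environment fail early, before any connection attempt.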
