### 📦 Installing Required Packages

Install the project dependencies listed in `requirements.txt` with the following command:

In [None]:
!pip install -r requirements.txt

# Import Required Modules and Classes

This block imports all the necessary modules and classes from the `sokegraph` package and other dependencies. These components provide core functionalities such as:

- Paper sources (retrieving papers from Semantic Scholar, PDFs, or a journal API)
- Paper ranking logic
- Knowledge graph management
- AI agents (OpenAI, Gemini, Llama, Ollama, Claude)
- Ontology updating utilities
- Logging utilities
- Knowledge graph interfaces (Neo4j and NetworkX)
- JSON handling

These imports set up the environment for later steps like fetching papers, ranking them, updating ontologies, and building the knowledge graph.


In [38]:
from sokegraph.base_paper_source import BasePaperSource
from sokegraph.semantic_scholar_source import SemanticScholarPaperSource
from sokegraph.pdf_paper_source import PDFPaperSource
from sokegraph.paper_ranker import PaperRanker
from sokegraph.knowledge_graph import KnowledgeGraph
from sokegraph.util.logger import LOG
from sokegraph.ai_agent import AIAgent
from sokegraph.openai_agent import OpenAIAgent
from sokegraph.gemini_agent import GeminiAgent
from sokegraph.ontology_updater import OntologyUpdater
from sokegraph.neo4j_knowledge_graph import Neo4jKnowledgeGraph
from sokegraph.llama_agent import LlamaAgent
from sokegraph.ollama_agent import OllamaAgent
from sokegraph.claude_agent import ClaudeAgent
from sokegraph.journal_api_source import JournalApiPaperSource
from sokegraph.networkx_knowledge_graph import NetworkXKnowledgeGraph
import json

# Initialize and Display the User Interface

- The `SOKEGraphUI` class is imported from the `sokegraph.ui_inputs` module.
- An instance of `SOKEGraphUI` is created and assigned to `ui`.
- The `display_ui()` method is called to render the interactive user interface.

> **Note:**  
> The cell calls `importlib.reload` before importing `SOKEGraphUI`, so changes to `ui_inputs.py` take effect when you re-run the cell, without restarting the notebook.
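
The later cells read their settings from `ui.params` by attribute name. For experimenting without the widget, a plain object with the same attribute names could stand in for it. The sketch below is only an illustration: the attribute names are the ones read by the cells in this notebook, while `make_params` and every default value (file names, counts) are hypothetical placeholders that are not part of `sokegraph`.

```python
from types import SimpleNamespace


def make_params(**overrides):
    """Return a stand-in for ui.params; all default values are placeholders."""
    defaults = dict(
        AI="openAI",
        api_keys_file="api_keys.json",            # hypothetical path
        paper_source="Semantic Scholar",
        number_papers="3",
        paper_query_file="query.txt",             # hypothetical path
        pdfs_file=None,
        api_key_file=None,
        journal_api_key_file=None,
        ontology_file="ontology.json",            # hypothetical path
        keywords_file="keywords.txt",             # hypothetical path
        output_dir="output",
        kg_type="networkx",
        kg_credentials_file="credentials.json",   # hypothetical path
    )
    # Reject misspelled names instead of silently adding new attributes.
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    defaults.update(overrides)
    return SimpleNamespace(**defaults)


params = make_params(kg_type="neo4j")
print(params.AI, params.paper_source, params.kg_type)
```

Rejecting unknown names catches typos such as `api_key_file` versus `api_keys_file` early, which matters here because the notebook reads several similarly named attributes.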


In [None]:
import importlib
import sokegraph.ui_inputs
importlib.reload(sokegraph.ui_inputs)
from sokegraph.ui_inputs import SOKEGraphUI


# Create UI instance
ui = SOKEGraphUI()

# Display the UI in the notebook
ui.display_ui()


VBox(children=(Dropdown(description='Paper Source:', options=('Select Source', 'Semantic Scholar', 'PDF Zip Fi…

## Step 0: Setup AI Agent

- Logs the start of the full pipeline.
- Selects and initializes the AI agent based on the user interface (UI) parameter `ui.params.AI`.
- Supports five AI providers:
  - `openAI` initializes an `OpenAIAgent` with the API keys file.
  - `gemini` initializes a `GeminiAgent` with the API keys file.
  - `llama` initializes a `LlamaAgent` with the API keys file.
  - `ollama` initializes an `OllamaAgent`, which takes no arguments.
  - `claude` initializes a `ClaudeAgent` with the file given in `ui.params.journal_api_key_file`.
- Raises a `ValueError` if an unsupported AI provider is selected.

> **Note:**  
> Ensure that the API keys file path provided via `ui.params.api_keys_file` is correct and contains valid credentials for the selected AI provider.
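
The provider selection described above is a chain of string comparisons. One alternative, sketched here under assumptions, is a lookup table that maps each provider name to a factory. The `Stub...` classes and `build_agent` are hypothetical stand-ins so the example runs on its own; they are not the `sokegraph` agent classes.

```python
class StubAgent:
    """Stand-in for an agent class; records the arguments it was built with."""

    def __init__(self, *args):
        self.args = args


class StubOpenAIAgent(StubAgent):
    """Placeholder for an agent that needs an API keys file."""


class StubOllamaAgent(StubAgent):
    """Placeholder for an agent that needs no API keys file."""


def build_agent(provider, api_keys_file):
    # Each entry is a zero-argument factory, so agents are only built on demand.
    factories = {
        "openAI": lambda: StubOpenAIAgent(api_keys_file),
        "ollama": lambda: StubOllamaAgent(),
    }
    try:
        factory = factories[provider]
    except KeyError:
        raise ValueError(f"Unsupported AI provider: {provider}") from None
    return factory()


agent = build_agent("openAI", "api_keys.json")  # placeholder file name
print(type(agent).__name__, agent.args)
```

With a table like this, adding a provider means adding one entry, and the error path for unknown names stays in a single place.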


In [41]:
LOG.info("🚀 Starting Full Pipeline")

# 0. Setup AI agent
ai_tool: AIAgent
if ui.params.AI == "openAI":
    ai_tool = OpenAIAgent(ui.params.api_keys_file)
elif ui.params.AI == "gemini":
    ai_tool = GeminiAgent(ui.params.api_keys_file)
elif ui.params.AI == "llama":
    ai_tool = LlamaAgent(ui.params.api_keys_file)
elif ui.params.AI == "ollama":
    ai_tool = OllamaAgent()
elif ui.params.AI == "claude":
    ai_tool = ClaudeAgent(ui.params.journal_api_key_file)
else:
    raise ValueError(f"Unsupported AI provider: {ui.params.AI}")

2025-07-15 10:59:29,929 [INFO ]	🚀 Starting Full Pipeline
INFO:sokegraph:🚀 Starting Full Pipeline


In [42]:
ai_tool

<sokegraph.openai_agent.OpenAIAgent at 0x1270e6160>

## Step 1: Select Paper Source

- Based on the UI parameter `ui.params.paper_source`, select the source for retrieving papers:
  - `Semantic Scholar`: requires `number_papers` and a query file (`paper_query_file`); papers are fetched with the `SemanticScholarPaperSource` class.
  - `PDF Zip`: requires a ZIP file containing PDFs (`pdfs_file`); papers are read with the `PDFPaperSource` class.
  - `Journal API`: requires a query file (`paper_query_file`) and an API key file (`api_key_file`); papers are fetched with the `JournalApiPaperSource` class.
- Logs an error if:
  - A parameter required by the selected source is missing.
  - The selected source is not one of the three above.
- Finally, fetches papers from the selected source by calling `fetch_papers()`. If the selected source is not recognized, `papers` is set to an empty list.

> **Important:**  
> - Provide every parameter required by the source you selected.
> - The `fetch_papers()` method returns a list of paper metadata dictionaries.
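
The per-source checks in the cell below can also be written as a small table of required parameters. This is a sketch only: the source names and parameter names are taken from that cell, but `REQUIRED` and `missing_params` are hypothetical helpers, and a plain dictionary stands in for `ui.params`.

```python
REQUIRED = {
    "Semantic Scholar": ("number_papers", "paper_query_file"),
    "PDF Zip": ("pdfs_file",),
    "Journal API": ("paper_query_file", "api_key_file"),
}


def missing_params(source, params):
    """Return the required parameter names that are empty or absent."""
    if source not in REQUIRED:
        raise ValueError(f"Unsupported paper source: {source!r}")
    # An empty string, None, or 0 counts as "not provided".
    return [name for name in REQUIRED[source] if not params.get(name)]


print(missing_params("Semantic Scholar", {"number_papers": "3"}))
```

Returning the list of missing names, rather than a boolean, lets the caller log exactly which input the user still has to supply.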


In [43]:
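# 1. Select paper source
from typing import Optional

# Start from None so that a failed validation below still leaves the name defined.
paper_source: Optional[BasePaperSource] = None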

if ui.params.paper_source == "Semantic Scholar":
    if not ui.params.number_papers or not ui.params.paper_query_file:
        LOG.error("❌ 'number_papers' and 'paper_query_file' are required for Semantic Scholar source.")
    else:
        paper_source = SemanticScholarPaperSource(
            num_papers=int(ui.params.number_papers),
            query_file=ui.params.paper_query_file,
            output_dir=ui.params.output_dir
        )

elif ui.params.paper_source == "PDF Zip":
    if not ui.params.pdfs_file:
        LOG.error("❌ 'pdfs_file' (ZIP file) is required for PDF source.")
    else:
        paper_source = PDFPaperSource(
            zip_path=ui.params.pdfs_file,
            output_dir=ui.params.output_dir
        )

elif ui.params.paper_source == "Journal API":
    if not ui.params.paper_query_file or not ui.params.api_key_file:
        LOG.error("❌ 'paper_query_file' and 'api_key_file' are required for Journal API source.")
    else:
        paper_source = JournalApiPaperSource(
            query_file=ui.params.paper_query_file,
            api_key_file=ui.params.api_key_file,
            output_dir=ui.params.output_dir
        )

else:
    LOG.error("❌ Invalid or unsupported paper source selected.")
    paper_source = None

# Fetch papers from the selected source
if paper_source:
    papers = paper_source.fetch_papers()
else:
    papers = []


2025-07-15 10:59:34,781 [INFO ]	Searching Semantic Scholar for: Acidic earth abundant catalysts for water splitting
INFO:sokegraph:Searching Semantic Scholar for: Acidic earth abundant catalysts for water splitting


## Step 2: Update Ontology Using Retrieved Papers

- Create an instance of `OntologyUpdater`, which enriches the ontology file using:
  - The retrieved papers,
  - The selected AI tool (`ai_tool`),
  - The output directory for saving results.
- The ontology file (`ontology_file`) contains a structured hierarchy of materials science concepts.
- Calls `enrich_with_papers()` to extract relevant concepts and keywords from the papers and integrate them into the ontology.

> **What this step does:**
> - Associates papers with ontology categories by analyzing their content using an LLM agent.
> - Produces structured `ontology_extractions` used for graph construction and ranking in later steps.
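
LLM replies do not always arrive as bare JSON. The logs in this notebook report JSON parsing failures, and the raw replies printed under Step 3 are wrapped in a Markdown code fence. One possible way to handle such replies before calling `json.loads` is sketched below. `parse_llm_json` is a hypothetical helper, not part of `sokegraph`, and it does not reproduce the package's regex fallback.

```python
import json
import re

TICKS = "`" * 3  # Markdown code fence marker
# Opening fence with optional language tag, body, closing fence.
FENCE = re.compile(rf"^{TICKS}[A-Za-z0-9_-]*[ \t]*\n(.*?)\n{TICKS}$", re.DOTALL)


def parse_llm_json(reply):
    """Parse JSON from an LLM reply, tolerating a surrounding Markdown fence."""
    text = reply.strip()
    match = FENCE.match(text)
    if match:
        text = match.group(1)
    return json.loads(text)


fenced = TICKS + 'json\n{"acidic": ["alkaline", "basic"]}\n' + TICKS
print(parse_llm_json(fenced))
print(parse_llm_json('{"her": ["oer"]}'))
```

Replies that are still not valid JSON after the fence is removed raise `json.JSONDecodeError`, so a caller can keep a fallback path for them.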


In [None]:
# 2. Update ontology with the fetched papers
ontology_updater = OntologyUpdater(ui.params.ontology_file, papers, ai_tool, ui.params.output_dir)
updated_ontology_path = ontology_updater.enrich_with_papers()

2025-07-15 11:00:14,303 [INFO ]	🔍 Processing paper: Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting
INFO:sokegraph.ai_agent:🔍 Processing paper: Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting


xxxx


2025-07-15 11:00:42,388 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:00:42,389 [INFO ]	✅ Extracted 5 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 5 items via regex.
2025-07-15 11:00:42,390 [INFO ]	✅ Extracted 5 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Environment'
INFO:sokegraph.ai_agent:✅ Extracted 5 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Environment'


xxxx


2025-07-15 11:00:52,052 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:00:52,054 [INFO ]	✅ Extracted 10 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 10 items via regex.
2025-07-15 11:00:52,060 [INFO ]	✅ Extracted 10 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Process'
INFO:sokegraph.ai_agent:✅ Extracted 10 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Process'


xxxx


2025-07-15 11:00:59,173 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:00:59,175 [INFO ]	✅ Extracted 8 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 8 items via regex.
2025-07-15 11:00:59,179 [INFO ]	✅ Extracted 8 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Reaction'
INFO:sokegraph.ai_agent:✅ Extracted 8 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Reaction'


xxxx


2025-07-15 11:01:07,054 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:01:07,056 [INFO ]	✅ Extracted 11 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 11 items via regex.
2025-07-15 11:01:07,057 [INFO ]	✅ Extracted 11 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Elemental Composition'
INFO:sokegraph.ai_agent:✅ Extracted 11 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Elemental Composition'


xxxx


2025-07-15 11:01:12,868 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:01:12,873 [INFO ]	✅ Extracted 7 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 7 items via regex.
2025-07-15 11:01:12,875 [INFO ]	✅ Extracted 7 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Material'
INFO:sokegraph.ai_agent:✅ Extracted 7 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Material'


xxxx


2025-07-15 11:01:15,538 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:01:15,540 [INFO ]	✅ Extracted 3 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 3 items via regex.
2025-07-15 11:01:15,542 [INFO ]	✅ Extracted 3 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Performance & Stability'
INFO:sokegraph.ai_agent:✅ Extracted 3 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Performance & Stability'


xxxx


2025-07-15 11:01:23,374 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:01:23,376 [INFO ]	✅ Extracted 12 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 12 items via regex.
2025-07-15 11:01:23,377 [INFO ]	✅ Extracted 12 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Application'
INFO:sokegraph.ai_agent:✅ Extracted 12 items for paper Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting layer 'Application'
2025-07-15 11:01:23,378 [INFO ]	🔍 Processing paper: Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for
INFO:sokegraph.ai_agent:🔍 Processing paper: Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for


xxxx


2025-07-15 11:01:26,962 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:01:26,965 [INFO ]	✅ Extracted 4 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 4 items via regex.
2025-07-15 11:01:26,967 [INFO ]	✅ Extracted 4 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Environment'
INFO:sokegraph.ai_agent:✅ Extracted 4 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Environment'


xxxx


2025-07-15 11:01:32,904 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:01:32,906 [INFO ]	✅ Extracted 6 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 6 items via regex.
2025-07-15 11:01:32,910 [INFO ]	✅ Extracted 6 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Process'
INFO:sokegraph.ai_agent:✅ Extracted 6 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Process'


xxxx


2025-07-15 11:01:39,024 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:01:39,026 [INFO ]	✅ Extracted 7 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 7 items via regex.
2025-07-15 11:01:39,028 [INFO ]	✅ Extracted 7 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Reaction'
INFO:sokegraph.ai_agent:✅ Extracted 7 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Reaction'


xxxx


2025-07-15 11:01:44,561 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:01:44,562 [INFO ]	✅ Extracted 7 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 7 items via regex.
2025-07-15 11:01:44,566 [INFO ]	✅ Extracted 7 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Elemental Composition'
INFO:sokegraph.ai_agent:✅ Extracted 7 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Elemental Composition'


xxxx


2025-07-15 11:01:55,201 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:01:55,204 [INFO ]	✅ Extracted 5 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 5 items via regex.
2025-07-15 11:01:55,207 [INFO ]	✅ Extracted 5 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Material'
INFO:sokegraph.ai_agent:✅ Extracted 5 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Material'


xxxx


2025-07-15 11:02:04,563 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:02:04,566 [INFO ]	✅ Extracted 4 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 4 items via regex.
2025-07-15 11:02:04,567 [INFO ]	✅ Extracted 4 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Performance & Stability'
INFO:sokegraph.ai_agent:✅ Extracted 4 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Performance & Stability'


xxxx


2025-07-15 11:02:08,371 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:02:08,373 [INFO ]	✅ Extracted 6 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 6 items via regex.
2025-07-15 11:02:08,375 [INFO ]	✅ Extracted 6 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Application'
INFO:sokegraph.ai_agent:✅ Extracted 6 items for paper Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for layer 'Application'
2025-07-15 11:02:08,378 [INFO ]	🔍 Processing paper: Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci
INFO:sokegraph.ai_agent:🔍 Processing paper: Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci


xxxx


2025-07-15 11:02:08,849 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:02:08,851 [INFO ]	✅ Extracted 0 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 0 items via regex.
2025-07-15 11:02:08,852 [INFO ]	✅ Extracted 0 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Environment'
INFO:sokegraph.ai_agent:✅ Extracted 0 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Environment'


xxxx


2025-07-15 11:02:10,255 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:02:10,257 [INFO ]	✅ Extracted 0 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 0 items via regex.
2025-07-15 11:02:10,258 [INFO ]	✅ Extracted 0 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Process'
INFO:sokegraph.ai_agent:✅ Extracted 0 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Process'


xxxx


2025-07-15 11:02:10,954 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:02:10,955 [INFO ]	✅ Extracted 0 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 0 items via regex.
2025-07-15 11:02:10,957 [INFO ]	✅ Extracted 0 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Reaction'
INFO:sokegraph.ai_agent:✅ Extracted 0 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Reaction'


xxxx


2025-07-15 11:02:18,044 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:02:18,046 [INFO ]	✅ Extracted 9 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 9 items via regex.
2025-07-15 11:02:18,048 [INFO ]	✅ Extracted 9 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Elemental Composition'
INFO:sokegraph.ai_agent:✅ Extracted 9 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Elemental Composition'


xxxx


2025-07-15 11:02:19,683 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:02:19,685 [INFO ]	✅ Extracted 0 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 0 items via regex.
2025-07-15 11:02:19,686 [INFO ]	✅ Extracted 0 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Material'
INFO:sokegraph.ai_agent:✅ Extracted 0 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Material'


xxxx


2025-07-15 11:02:21,254 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:02:21,258 [INFO ]	✅ Extracted 0 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 0 items via regex.
2025-07-15 11:02:21,263 [INFO ]	✅ Extracted 0 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Performance & Stability'
INFO:sokegraph.ai_agent:✅ Extracted 0 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Performance & Stability'


xxxx


2025-07-15 11:02:23,780 [WARNI]	⚠️ JSON parsing failed. Falling back to regex.
2025-07-15 11:02:23,783 [INFO ]	✅ Extracted 5 items via regex.
INFO:sokegraph.ai_agent:✅ Extracted 5 items via regex.
2025-07-15 11:02:23,784 [INFO ]	✅ Extracted 5 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Application'
INFO:sokegraph.ai_agent:✅ Extracted 5 items for paper Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci layer 'Application'


## Step 3: Rank Papers Based on Ontology and Keywords

- Instantiate the `PaperRanker` with:
  - The selected AI tool (`ai_tool`)
  - The list of fetched papers
  - The enriched ontology output file (`ontology_updater.output_path`)
  - A keyword file containing user-defined or domain-relevant keywords
  - The output directory for storing results

- Call `rank_papers()` to:
  - Match papers to relevant ontology categories using keywords
  - Score and rank papers based on how well they align with the user’s interests
  - Optionally filter out papers dominated by **opposite concepts** (e.g., if a paper mentions “insulator” often when you are looking for “conductor”)

> **What this step does:**
> - Produces a ranked list of relevant papers
> - Generates visual summaries and a CSV of shared ontology category overlaps
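
The opposite-concept filter can be pictured as a ratio of opposing mentions to relevant mentions per keyword. The sketch below is an assumption: it is consistent with the `Ratio` and `Status` columns in the output of this step (a ratio of `0.0` is kept, `inf` is filtered), but the actual formula inside `PaperRanker` is not shown in this notebook, and the `threshold` value is hypothetical.

```python
import math


def opposing_ratio(relevant, opposing):
    """Opposing-to-relevant mention ratio; infinite when nothing relevant matched."""
    if relevant == 0:
        return math.inf
    return opposing / relevant


def status(relevant, opposing, threshold=1.0):
    # threshold is a hypothetical cut-off, not a value taken from sokegraph
    return "Filtered" if opposing_ratio(relevant, opposing) > threshold else "Kept"


for relevant, opposing in [(1, 0), (0, 0), (2, 5)]:
    print(relevant, opposing, opposing_ratio(relevant, opposing), status(relevant, opposing))
```

Under this reading, a keyword with no relevant mentions at all is filtered even when no opposing keyword was found.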


In [45]:
# 3. Rank papers
#LOG.info("ranking papers ....")
ranker = PaperRanker(ai_tool, papers, ontology_updater.output_path, ui.params.keywords_file, ui.params.output_dir)
rank_paper_output = ranker.rank_papers()

2025-07-15 11:04:29,729 [INFO ]	🔍 User query: acidic HER water splitting
INFO:sokegraph:🔍 User query: acidic HER water splitting
2025-07-15 11:04:34,442 [INFO ]	
✅ Categories Used: 3
INFO:sokegraph:
✅ Categories Used: 3


response :  
[
    {
        "keyword": "acidic",
        "category": "Acidic",
        "layer": "Environment"
    },
    {
        "keyword": "HER",
        "category": "Hydrogen Evolution Reaction (HER)",
        "layer": "Reaction"
    },
    {
        "keyword": "water splitting",
        "category": "Water Electrolysis",
        "layer": "Process"
    }
]

✅ OpenAI: 'acidic' → Environment / Acidic
✅ OpenAI: 'her' → Reaction / Hydrogen Evolution Reaction (HER)
✅ OpenAI: 'water splitting' → Process / Water Electrolysis
❌ No match for: 'water'
❌ No match for: 'splitting'
  1. 'acidic' → Environment / Acidic
  2. 'her' → Reaction / Hydrogen Evolution Reaction (HER)
  3. 'water splitting' → Process / Water Electrolysis


2025-07-15 11:04:35,719 [WARNI]	⚠️ Could not parse opposites JSON. Raw:
```json
{
    "acidic": ["alkaline", "basic"],
    "her": ["orr", "oxygen evolution reaction", "oer"],
    "water splitting": ["water formation", "hydrogenation", "fuel cell operation"]
}
```
2025-07-15 11:04:39,358 [WARNI]	⚠️ Could not parse synonym JSON. Raw:
```json
{
    "acidic": ["low pH", "acid media", "acidic conditions", "proton-rich environment"],
    "her": ["hydrogen evolution reaction", "H2 generation", "hydrogen production", "cathodic hydrogen evolution"],
    "water splitting": ["electrochemical water splitting", "water electrolysis", "H2O splitting", "water decomposition"]
}
```


🧪 Opposites used for filtering:
  - 'acidic': []
  - 'her': []
  - 'water splitting': []


| | Paper ID | Query Keyword | Title Relevant Count | Title Opposing Count | Abstract Relevant Count | Abstract Opposing Count | Total Relevant Count | Total Opposing Count | Matched Opposing Keywords | Ratio | Status |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Activating Ru in the pyramidal sites of Ru2Pty... | acidic | 1 | 0 | 0 | 0 | 1 | 0 | | 0.0 | Kept |
| 1 | Activating Ru in the pyramidal sites of Ru2Pty... | her | 0 | 0 | 0 | 0 | 0 | 0 | | inf | Filtered |
| 2 | Activating Ru in the pyramidal sites of Ru2Pty... | water splitting | 1 | 0 | 0 | 0 | 1 | 0 | | 0.0 | Kept |


2025-07-15 11:04:39,381 [INFO ]	
🚫 Filtered out 1 papers due to dominance of opposite keywords.
INFO:sokegraph:
🚫 Filtered out 1 papers due to dominance of opposite keywords.
2025-07-15 11:04:39,382 [INFO ]	
📚 Ranked Papers:
INFO:sokegraph:
📚 Ranked Papers:
2025-07-15 11:04:39,383 [INFO ]	
🟢🟡 High & Moderate Relevance:
INFO:sokegraph:
🟢🟡 High & Moderate Relevance:
2025-07-15 11:04:39,384 [INFO ]	
🔴 Low Relevance (sorted by number of over-threshold keywords):
INFO:sokegraph:
🔴 Low Relevance (sorted by number of over-threshold keywords):
2025-07-15 11:04:39,385 [INFO ]	
🏆 Full Ranking by Number of Category Pairs Shared (out of 3 pairs):
INFO:sokegraph:
🏆 Full Ranking by Number of Category Pairs Shared (out of 3 pairs):
2025-07-15 11:04:39,387 [INFO ]	✅ Exported shared papers to 'shared_pair_ranked_papers.csv'
INFO:sokegraph:✅ Exported shared papers to 'shared_pair_ranked_papers.csv'
2025-07-15 11:04:39,388 [INFO ]	
🔄 Overlaps Between Categories:
INFO:sokegraph:
🔄 Overlaps Between Categor

  1. Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting (mentions=56)
  2. Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for (mentions=49)
  3. Polyoxometalate electrocatalysts based on earth-abundant metals for efficient water oxidation in aci (mentions=0)
  1. Invited Atomic Layer Deposited Sulfides As PreCatalysts for Electrochemical Water Splitting → shared in 3/3 pair(s)
     ⤷ Pairs: ['Acidic ↔ Hydrogen Evolution Reaction (HER)', 'Acidic ↔ Water Electrolysis', 'Water Electrolysis ↔ Hydrogen Evolution Reaction (HER)']
  2. Activating Ru in the pyramidal sites of Ru2Ptype structures with earthabundant transition metals for → shared in 3/3 pair(s)
     ⤷ Pairs: ['Acidic ↔ Hydrogen Evolution Reaction (HER)', 'Acidic ↔ Water Electrolysis', 'Water Electrolysis ↔ Hydrogen Evolution Reaction (HER)']


## Step 4: Build the Knowledge Graph

This step constructs a **knowledge graph** from the updated ontology, allowing for structured querying and visualization of relationships between materials, methods, and properties.

### What happens here:
- Credentials for the graph database (Neo4j) are loaded from a JSON file.
- A `KnowledgeGraph` builder is instantiated according to `ui.params.kg_type`:
  - `neo4j` creates a `Neo4jKnowledgeGraph` from the enriched ontology file (`updated_ontology.json`) and the connection details of the Neo4j server (URI, username, password).
  - `networkx` creates a `NetworkXKnowledgeGraph` from the enriched ontology file alone.
- The `.build_graph()` method builds the actual graph.

> ✅ **Result**: A knowledge graph (a Neo4j database or a NetworkX graph) containing categorized concepts and their relationships, ready for exploration or reasoning tasks.
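
The Neo4j branch of the cell below reads three keys from the credentials file: `neo4j_uri`, `neo4j_user`, and `neo4j_pass`. A minimal sketch of such a file, with a check that all three keys are present, might look like this. `load_credentials` is a hypothetical helper, and the URI, user name, and password are placeholder values rather than real settings.

```python
import json
import os
import tempfile

REQUIRED_KEYS = ("neo4j_uri", "neo4j_user", "neo4j_pass")


def load_credentials(path):
    """Load the credentials JSON and check that every required key is present."""
    with open(path, "r", encoding="utf-8") as f:
        credentials = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in credentials]
    if missing:
        raise KeyError(f"Missing credential key(s): {missing}")
    return credentials


# Demo with placeholder values written to a temporary file
example = {
    "neo4j_uri": "bolt://localhost:7687",
    "neo4j_user": "neo4j",
    "neo4j_pass": "change-me",
}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "credentials.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(example, f)
    loaded = load_credentials(path)

print(sorted(loaded))
```

Checking the keys at load time gives a clearer message than a bare `KeyError` raised in the middle of the builder call.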

---

🎉 **The full pipeline is now complete!** You’ve fetched papers, enriched the ontology, ranked relevant publications, and built a graph-based representation of your domain knowledge.


In [None]:
# 4. Build knowledge graph
LOG.info(" Building knowledge graph ....")

graph_builder: KnowledgeGraph
if ui.params.kg_type == "neo4j":
    # Only Neo4j needs connection details, so load the credentials here
    with open(ui.params.kg_credentials_file, "r") as f:
        credentials = json.load(f)
    graph_builder = Neo4jKnowledgeGraph(
        ontology_updater.output_path,
        credentials["neo4j_uri"],
        credentials["neo4j_user"],
        credentials["neo4j_pass"],
    )
elif ui.params.kg_type == "networkx":
    graph_builder = NetworkXKnowledgeGraph(ontology_updater.output_path)
else:
    # Fail early instead of hitting a NameError on graph_builder below
    raise ValueError(f"Unsupported knowledge graph type: {ui.params.kg_type}")

graph_builder.build_graph()

LOG.info("🎉 Pipeline Completed Successfully")

2025-07-15 11:05:59,096 [INFO ]	 Building knowledge graph ....
INFO:sokegraph: Building knowledge graph ....
2025-07-15 11:05:59,108 [INFO ]	🎉 Pipeline Completed Successfully
INFO:sokegraph:🎉 Pipeline Completed Successfully


🎉 NetworkX knowledge graph construction complete.
