
An end-to-end NLP pipeline that extracts entities, relationships, and facts from documents to construct an interactive Knowledge Graph with reasoning and visualization capabilities.


itzTiru/knowledge-graph-NLP-pipeline

knowledge-graph-NLP-pipeline

Intro

An end-to-end system that extracts entities and relationships from unstructured text (PDF, Text, HTML) and constructs an interactive Knowledge Graph in Neo4j.

Visualization Demo

Features

1. Document Processing

Load and clean PDF, text, and HTML files.
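To make the cleaning step concrete, here is a minimal sketch of how HTML could be reduced to plain text before NLP. It uses only the Python standard library and is not the loader in `src/document_processing`; the names `_TextExtractor` and `clean_html` are hypothetical.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a <script> or <style> element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def clean_html(raw_html: str) -> str:
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(parser.parts)
    return re.sub(r"\s+", " ", text).strip()
```

The same whitespace normalization could be applied to text extracted from PDF or plain-text files so that every format reaches the NLP stage in one consistent shape.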

2. NLP Pipeline & Knowledge Graph

Automated graph construction using Named Entity Recognition (BERT) and Relationship Extraction.
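One way the extracted (subject, relation, object) triples could be written to Neo4j is with parameterized Cypher `MERGE` statements, sketched below. The `Entity` label, the `name` property, and the helper `triple_to_cypher` are assumptions for illustration, not the schema used in `src/graph`.

```python
import re


def triple_to_cypher(subject: str, relation: str, obj: str):
    """Build a Cypher MERGE query and its parameters for one triple.

    Relationship types cannot be passed as Cypher parameters, so the
    relation text is reduced to letters, digits, and underscores before
    it is embedded in the query string.
    """
    rel_type = re.sub(r"[^A-Za-z0-9]+", "_", relation).strip("_").upper()
    if not rel_type:
        raise ValueError(f"relation {relation!r} has no usable characters")
    query = (
        "MERGE (s:Entity {name: $subject}) "
        "MERGE (o:Entity {name: $object}) "
        f"MERGE (s)-[:`{rel_type}`]->(o)"
    )
    return query, {"subject": subject, "object": obj}


query, params = triple_to_cypher("Marie Curie", "won the", "Nobel Prize")
```

Using `MERGE` rather than `CREATE` means that reprocessing the same document would not duplicate nodes or relationships.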

3. Interactive Visualization

Explore the graph with zooming and panning capabilities.

  • Reasoning: Basic graph analysis (Centrality) using NetworkX.
  • Visualization: Interactive Streamlit dashboard with PyVis graph view.
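As an illustration of the centrality analysis mentioned above, the sketch below ranks entities by degree centrality with NetworkX. The triples are illustrative sample data and `rank_entities` is a hypothetical helper; the project may use a different centrality measure or load the graph directly from Neo4j.

```python
import networkx as nx


def rank_entities(triples):
    """Return (entity, degree centrality) pairs, most central first."""
    graph = nx.DiGraph()
    for subject, relation, obj in triples:
        # Keep the relation name as an edge attribute for later display.
        graph.add_edge(subject, obj, relation=relation)
    centrality = nx.degree_centrality(graph)
    return sorted(centrality.items(), key=lambda item: item[1], reverse=True)


# Illustrative sample data, not output from the pipeline.
sample_triples = [
    ("Marie Curie", "WON", "Nobel Prize"),
    ("Pierre Curie", "WON", "Nobel Prize"),
    ("Marie Curie", "MARRIED_TO", "Pierre Curie"),
    ("Marie Curie", "BORN_IN", "Warsaw"),
]
ranking = rank_entities(sample_triples)
```

For a directed graph, `nx.degree_centrality` counts both incoming and outgoing edges and divides by the number of other nodes, so an entity connected to every other node scores 1.0.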

Prerequisites

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd knowledge-graph-NLP-pipeline
  2. Set up Virtual Environment:

    python -m venv venv
    .\venv\Scripts\activate  # Windows
    # source venv/bin/activate  # Mac/Linux
  3. Install Dependencies:

    pip install -r requirements.txt
    python -m spacy download en_core_web_sm
  4. Configure Environment: Create a .env file with your Neo4j credentials:

    NEO4J_URI=bolt://localhost:7687
    NEO4J_USER=neo4j
    NEO4J_PASSWORD=antigravity
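To show how these settings could be consumed, here is a minimal sketch that reads the three keys from a `.env` file using only the standard library. The helpers `load_env_file` and `neo4j_settings` are hypothetical; the application itself may load the file differently, for example through a dotenv package.

```python
import os

REQUIRED_KEYS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def neo4j_settings(path=".env"):
    """Return the Neo4j settings; real environment variables take priority."""
    file_values = load_env_file(path) if os.path.exists(path) else {}
    settings = {}
    for key in REQUIRED_KEYS:
        value = os.environ.get(key, file_values.get(key))
        if value is None:
            raise KeyError(f"missing required setting: {key}")
        settings[key] = value
    return settings
```

If the project uses the official `neo4j` Python driver, the resulting values could be passed as `GraphDatabase.driver(uri, auth=(user, password))`. Keep the `.env` file out of version control, since it holds a password.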

Usage

Run the Streamlit application:

streamlit run src/app/main.py

Upload a document, click "Build Knowledge Graph", and explore the results.

Project Structure

  • src/document_processing: Loaders and cleaners.
  • src/nlp: NER and Relation Extraction models.
  • src/graph: Neo4j interaction logic.
  • src/reasoning: Graph analysis algorithms.
  • src/app: Streamlit dashboard.
