Knowledge Graph Generator

Added Crew4AI for URL extraction alongside unstructured.io for PDF extraction. Crew4AI extracts structured data from unstructured sources such as web pages and documents. With this integration, the application can pull content from URLs into the knowledge graph generation process, so a single knowledge graph can combine information from both PDF documents and web sources.

Prerequisites

Docker
Docker Compose
Google API Key for Gemini model

Project Structure

.
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── src/
│   └── kg_generator/
├── tests/
├── examples/
├── initial_pdfs/    # Mount point for initial PDF files
└── additional_pdfs/ # Mount point for additional PDF files to update the graph

Quick Start

Clone the repository:

git clone https://github.com/OGsiji/Enhanced_GraphRAG.git
cd Enhanced_GraphRAG

Create a .env file:

touch .env

Edit the .env file and add your Google API key and any other required keys:

GOOGLE_API_KEY=your_google_api_key_here
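As a rough illustration of how the key might be read at runtime, the sketch below checks that `GOOGLE_API_KEY` is present and is not still the placeholder value. It assumes the container receives the variable from the .env file; the helper name `load_google_api_key` is hypothetical and not part of the project.

```python
import os


def load_google_api_key(env=None):
    """Return GOOGLE_API_KEY from the given mapping (default: os.environ).

    Hypothetical helper for illustration; raises if the key is missing
    or still set to the placeholder from the example above.
    """
    env = os.environ if env is None else env
    key = env.get("GOOGLE_API_KEY", "").strip()
    if not key or key == "your_google_api_key_here":
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file")
    return key
```

Failing early like this gives a clearer error than a rejected request to the Gemini API later in the run.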

Create PDF directories and add your PDF files:

mkdir initial_pdfs additional_pdfs
# Add initial PDFs
cp path/to/your/initial/pdfs/*.pdf initial_pdfs/
# Add additional PDFs (optional)
cp path/to/your/additional/pdfs/*.pdf additional_pdfs/


Add URLs

Open `src/kg_generator/url.py` and add or edit the URLs you want to process.
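The exact layout of `url.py` is not shown here, so the following is only a sketch of what a URL list with a basic sanity check could look like. The variable name `URLS`, the example address, and the helper `is_valid_url` are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical list; the real variable name in url.py may differ.
URLS = [
    "https://example.com/article",
]


def is_valid_url(url):
    """Return True for absolute http(s) URLs that include a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# Keep only entries that look like fetchable web addresses.
valid_urls = [u for u in URLS if is_valid_url(u)]
```

Checking entries before a run helps catch typos such as a missing `https://` prefix.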


Select what to run

You can choose whether to process URLs, PDFs, or both in `src/kg_generator/config.py` by setting `LinkConfig.url` and `LinkConfig.pdf`. Each defaults to `true`.
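To make the two switches concrete, here is a minimal stand-in for such a configuration object. Only the names `LinkConfig`, `url`, and `pdf` come from the text above; the dataclass layout and the `enabled_sources` helper are assumptions, and the project's actual `config.py` may be organized differently.

```python
from dataclasses import dataclass


@dataclass
class LinkConfig:
    """Hypothetical stand-in for the project's LinkConfig."""

    url: bool = True  # process the URLs listed in url.py
    pdf: bool = True  # process the PDFs in the mounted directories


def enabled_sources(config):
    """Return the names of the sources that are switched on."""
    sources = []
    if config.pdf:
        sources.append("pdf")
    if config.url:
        sources.append("url")
    return sources
```

For example, `enabled_sources(LinkConfig(url=False))` yields only the PDF source, which corresponds to a PDF-only run.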

NOTE: If you use the Streamlit UI, you can upload your files directly through Streamlit.

Build and run the containers:

docker-compose up --build

Processing Flow

The system first processes all PDFs in the initial_pdfs directory to create the base knowledge graph.
If any PDFs exist in the additional_pdfs directory, they are then processed and used to update the existing knowledge graph.
Both directories are mounted as volumes, so you can add or remove PDFs without rebuilding the container.
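The ordering described above can be sketched as follows. This is not the project's implementation: `list_pdfs` and `plan_processing` are hypothetical helpers that only decide which files would be used to create the graph and which to update it, without calling GraphRAG-SDK or FalkorDB.

```python
from pathlib import Path


def list_pdfs(directory):
    """Return the PDF files in a directory, sorted by path (empty if missing)."""
    path = Path(directory)
    if not path.is_dir():
        return []
    return sorted(p for p in path.iterdir() if p.suffix.lower() == ".pdf")


def plan_processing(initial_dir="initial_pdfs", additional_dir="additional_pdfs"):
    """Return ordered (action, path) steps: create first, then update."""
    steps = [("create", p) for p in list_pdfs(initial_dir)]
    steps += [("update", p) for p in list_pdfs(additional_dir)]
    return steps
```

Because the directories are read at call time, files added to a mounted volume are picked up on the next run without rebuilding the image.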

Running Tests

To run the tests in a Docker container:

docker-compose run --rm kg_generator pytest

A Python application that generates knowledge graphs from PDF documents using FalkorDB and Google's Gemini model. The Knowledge Graph generator extends the GraphRAG-SDK framework to handle PDF files using the Unstructured-IO library.
