📊 # GraphRAG-genomics

GraphRAG-Omics extends Microsoft's GraphRAG library and the TheAiSingularity/graphrag-local-ollama library. It lets users convert unstructured documents into knowledge graphs and query them in natural language.

This is an experimental repository in which the prompts have been tailored for genomics and clinical documents.

🚀 Main Highlights

📄 Document Indexing: Convert raw .txt documents into .parquet files using the graphrag library.
🧠 Knowledge Graph Generation: Transform indexed documents into a structured knowledge graph stored in a Neo4j server.
💬 Natural Language Querying: Interact with your knowledge graph through a Streamlit web interface: ask questions, get insights.

🗂️ Project Structure

graphrag-omics/
│
├── graphrag_workflow.bat      # Command-line script to index documents
├── app.py                     # Streamlit app for graph creation and querying
├── input/                     # Directory to place raw .txt documents
└── proj_<project_name>/       # Generated output for each project

Components

Command-Line Indexing Script
- Takes input .txt documents
- Outputs .parquet files into a project-specific folder
Streamlit Web App
- Indexing Tab: Load .parquet files and generate a knowledge graph in Neo4j
- Query Tab: Use natural language to query your knowledge graph (GraphRAG interface)
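
As a rough illustration of how the two components meet, the sketch below scans a directory for proj_* folders and collects the .parquet files inside each one, which is the kind of lookup a project selector in the Indexing tab might need. The function name find_projects is hypothetical and not taken from app.py, and the recursive search is an assumption, since this README does not say where inside the project folder the files are written.

```python
from pathlib import Path


def find_projects(root: Path) -> dict[str, list[Path]]:
    """Map each proj_* folder under `root` to the .parquet files inside it."""
    projects = {}
    for folder in sorted(root.glob("proj_*")):
        if folder.is_dir():
            # Search recursively: the exact location of the .parquet files
            # inside the project folder is an assumption left open here.
            projects[folder.name] = sorted(folder.rglob("*.parquet"))
    return projects


if __name__ == "__main__":
    for name, files in find_projects(Path(".")).items():
        print(f"{name}: {len(files)} parquet file(s)")
```

A project that shows zero .parquet files has most likely not been indexed yet.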

🧪 How to Run

Prerequisites

Install all required libraries.
Install Neo4j Desktop.
Install graphrag by running the following command in the root directory of the project:

pip install -e .

Step 1: Index Your Documents

Place your .txt documents inside the input/ folder (located in the root of the project).
Run the following command:

bash graphrag_workflow.bat proj_<project_name>

🔒 The project name must start with proj_
✅ Example: For a project named "med", use:

bash graphrag_workflow.bat proj_med

This will create a folder proj_med/ and generate the .parquet files inside it.
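
The two rules for this step (a proj_ prefix and .txt-only input) can be checked before the script runs. The following is a minimal sketch of such a pre-flight check. The helper names check_project_name, list_input_documents, and run_indexing are hypothetical and not part of this repository. The subprocess call simply repeats the command shown above.

```python
import subprocess
from pathlib import Path


def check_project_name(name: str) -> str:
    """Return `name` unchanged, or raise if it lacks the required proj_ prefix."""
    if not name.startswith("proj_"):
        raise ValueError(f"project name must start with 'proj_': {name!r}")
    return name


def list_input_documents(input_dir: Path) -> list[Path]:
    """Return the .txt files in input_dir; other file types are ignored."""
    return sorted(p for p in input_dir.glob("*.txt") if p.is_file())


def run_indexing(name: str, input_dir: Path = Path("input")) -> None:
    """Validate the inputs, then call the workflow script as shown above."""
    check_project_name(name)
    if not list_input_documents(input_dir):
        raise FileNotFoundError(f"no .txt documents found in {input_dir}")
    # Same command as in the README; check=True raises if the script fails.
    subprocess.run(["bash", "graphrag_workflow.bat", name], check=True)
```

Calling run_indexing("proj_med") from the project root would then be equivalent to the example command, with the two checks applied first.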

Step 2: Generate the Knowledge Graph & Query

Start the Streamlit app:

streamlit run app.py

The app opens automatically in your browser.
Use the following tabs inside the app:
- Indexing: Select a project (e.g., proj_med) and generate the knowledge graph in Neo4j.
- Query: Ask questions in natural language, answered from the generated knowledge graph (GraphRAG style).
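
To give a feel for what generating the graph in Neo4j can involve, here is a minimal sketch that upserts entity records in batches with a Cypher UNWIND ... MERGE statement. It is not the code in app.py. The Entity label, the id/name/type/description fields, and all function names are assumptions made for illustration. The records are assumed to be plain dictionaries, for example obtained from a .parquet file via pandas.read_parquet(path).to_dict("records").

```python
from typing import Any

# Hypothetical node label and property names; the schema actually used by
# app.py is not described in this README.
ENTITY_UPSERT = (
    "UNWIND $rows AS row "
    "MERGE (e:Entity {id: row.id}) "
    "SET e.name = row.name, e.type = row.type, e.description = row.description"
)


def to_entity_rows(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only the fields the query uses, skipping records without an id."""
    rows = []
    for rec in records:
        if rec.get("id") is None:
            continue
        rows.append({
            "id": str(rec["id"]),
            "name": rec.get("name"),
            "type": rec.get("type"),
            "description": rec.get("description"),
        })
    return rows


def batches(rows: list, size: int = 500):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def upsert_entities(uri: str, user: str, password: str, records: list[dict[str, Any]]) -> None:
    """Write the records to Neo4j; requires the `neo4j` Python driver."""
    # Imported lazily so the helpers above can be used without the driver.
    from neo4j import GraphDatabase

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            for batch in batches(to_entity_rows(records)):
                session.run(ENTITY_UPSERT, rows=batch).consume()
```

MERGE on the id property makes the load idempotent, so regenerating the graph for the same project would update existing nodes rather than duplicate them.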

📌 Notes

Only .txt documents are currently supported.
Ensure that the Neo4j server is running before using the Indexing or Query functionality in the app.
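
One quick way to confirm the server is up before opening the app is to try a TCP connection to its Bolt port. The sketch below assumes a local server on Neo4j's default Bolt port, 7687. The function name neo4j_is_reachable is hypothetical, and the host and port should be adjusted if your setup differs.

```python
import socket


def neo4j_is_reachable(host: str = "localhost", port: int = 7687, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the Bolt port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unresolved host names.
        return False


if __name__ == "__main__":
    print("Neo4j reachable:", neo4j_is_reachable())
```

This only shows that something is listening on the port; it does not verify credentials or that the database itself has finished starting.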

🧬 Use Cases

Genomics research papers
Clinical documents & patient summaries
Biomedical literature mining
Interactive Q&A from specialized unstructured data
