Skip to content

neo4j-product-examples/graphrag-ebook

Repository files navigation

Neo4j GraphRAG Book Example

This repository contains all code, data, and notebooks referenced in the Neo4j GraphRAG book. It provides a step-by-step guide to building a Retrieval-Augmented Generation (RAG) system using Neo4j, OpenAI, LangChain, and real-world financial filings.


📖 Project Overview

  • Ingest and model company, filing, and holdings data as a knowledge graph in Neo4j
  • Run retrieval-augmented queries and semantic search over filings and financial data
  • Use Jupyter notebooks for hands-on experimentation

📂 Project Structure

  • 01_PDF_Loader_for_Neo4j_GraphRAG.ipynb — Loads PDF and CSV data into Neo4j, builds the graph, and creates vector embeddings
  • 02_Retreivers_notebook.ipynb — Interactive notebook for querying the graph, running RAG workflows, and experimenting with retrieval
  • 03_Neo4j_agent_notebook.ipynb — (Optional) Agent-based exploration and advanced workflows
  • data/ — Contains:
    • Asset_Manager_Holdings.csv, Company_Filings.csv: Sample data files
    • form10k-sample/: Example 10-K filings for graph ingestion
  • Additional_Info/ — (Optional) Additional scripts and explained notebooks
  • .env.sample — Template for required environment variables

🛠️ Setup Instructions

  1. Clone the repository
  2. Install dependencies
    pip install -r requirements.txt
  3. Configure environment variables:
    • Copy .env.sample to .env and fill in:
      • OPENAI_API_KEY
      • NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD
  4. Start Neo4j AuraDB (see book for connection details)
  5. Run the loader notebook (01_PDF_Loader_for_Neo4j_GraphRAG.ipynb) to ingest data
  6. Open 02_Retreivers_notebook.ipynb to run retrievals and explore the graph

📊 Data Files

  • All sample data used in the book is in the data/ directory
  • You may add or replace files in form10k-sample/ as needed for your experiments

🧹 Cleaning Up Unnecessary Files

  • You may safely delete the following if not used in your workflow:
    • Additional_Info/__pycache__/
    • Any script or notebook in Additional_Info/ not referenced in the book
    • neo4j_importer_model_2025-05-08.json if not required
  • Keep only the notebooks and data files you use in your book walkthrough

📝 Notes

  • Never commit secrets.env is in .gitignore
  • Use Jupyter or VS Code for interactive exploration
  • All notebooks are designed to be run step-by-step as described in the book

📚 Resources


For questions or help, open an issue in this repo.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •