YiGraph

End-to-End Intelligent Graph Data Analysis Agent System Based on the AAG Framework

📖 Project Introduction

YiGraph is an end-to-end intelligent graph data analysis agent system designed to help users quickly gain insights into key relationships from complex data.

YiGraph can automatically extract entities and relationships from various raw data sources such as logs, documents, and tables to build structured graph data. Users only need to describe business problems in natural language, and the system will automatically plan the analysis process, complete calculations, and generate clear, interpretable, and traceable analysis reports.

Internally, large language models are responsible for understanding user intent, breaking down analysis tasks, and organizing final outputs. The core technology supporting the reliability of analysis results is the AAG (Analytics-Augmented Generation) framework. AAG treats analytical computation as a core capability, invoking graph algorithms and graph systems at key stages to complete verifiable calculations, which are then interpreted and summarized by the model.

Therefore, YiGraph is not just a conversational AI that "answers questions", but an intelligent graph analysis agent that can transform business problems into executable and reviewable analysis processes.

Applicable Scenarios

YiGraph can flexibly adapt to different industries and business needs, covering various complex relational data analysis scenarios, including but not limited to:

Financial anti-money laundering and suspicious transaction analysis: Automatically build transaction networks from massive transaction flows to identify abnormal fund paths and suspicious transaction loops
E-commerce risk control and promotion-abuse ("wool party") identification: Integrate multi-source data such as accounts, devices, and addresses to build graphs and discover organized fraud and associated malicious behavior
Enterprise association and risk investigation: Build graphs through enterprise, equity, and transaction relationships to penetrate complex structures and identify potential compliance and operational risks
Park/city event analysis: Unify access control, trajectory, and event data into graphs to restore personnel relationships and event evolution processes
Supply chain risk analysis: Integrate enterprise and transaction data to build supply chain networks, locate hidden associated risks and transmission paths

⚡ Core Features

1. Knowledge-Driven Task Planning

The system first works out what problem the user's question is trying to solve, then breaks it down into executable analysis steps:

What data fields and relationships are needed
What kind of graph should be built (which entities, which relationships)
What analysis methods and parameters should be used
How analysis results should be interpreted and presented

You don't need to understand graph algorithms; the system will translate "what I want to query" into "how to do the analysis".
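As a rough illustration of the decomposition above, a plan could be represented as a small data structure with one field per question the planner has to answer. The class and field names below (AnalysisPlan, AnalysisStep, max_length, and so on) are hypothetical and are not taken from YiGraph's code; the column names are borrowed from the AMLSim1K dataset example later in this README.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisStep:
    """One executable step of a plan (hypothetical structure, for illustration)."""
    algorithm: str                              # which analysis method to run
    params: dict = field(default_factory=dict)  # parameters for that method


@dataclass
class AnalysisPlan:
    """Answers to the four planning questions for a single user request."""
    question: str
    required_fields: list[str]   # what data fields are needed
    entities: list[str]          # which entities the graph should contain
    relationships: list[str]     # which relationships the graph should contain
    steps: list[AnalysisStep]    # which methods and parameters to use
    presentation: str            # how results should be presented


plan = AnalysisPlan(
    question="Which accounts move funds in a loop?",
    required_fields=["orig_acct", "bene_acct"],
    entities=["account"],
    relationships=["transfer"],
    steps=[AnalysisStep("cycle_detection", {"max_length": 5})],
    presentation="List each loop with the accounts involved",
)

for i, step in enumerate(plan.steps, start=1):
    print(f"step {i}: {step.algorithm} {step.params}")
```

Keeping the plan as plain data, separate from execution, is one way to make it reviewable before anything runs.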

2. Algorithm-Centric Reliable Execution

YiGraph does not let the model write and run arbitrary, uncontrolled code. Instead, it invokes and composes verifiable algorithm modules, so that each analysis step is:

Reproducible: Same input yields stable and consistent output
Traceable: Know which algorithms were used and which steps were executed
More reliable: Key calculations are completed by professional modules rather than pure text reasoning
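The three properties above can be sketched with a small registry: a plan may only call functions that have been registered, and every call is recorded in a trace. This is a minimal sketch under assumed names (REGISTRY, run_step, bfs_order are all hypothetical), not YiGraph's actual execution layer.

```python
from collections import deque


def bfs_order(adj, source):
    """Breadth-first visit order from `source` over an adjacency dict."""
    seen, order, queue = {source}, [], deque([source])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in sorted(adj.get(node, ())):  # sorted, so output is deterministic
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


# Only functions registered here may be invoked by a plan.
REGISTRY = {"bfs": bfs_order}


def run_step(name, graph, trace, **params):
    """Run a registered algorithm and record what was executed."""
    if name not in REGISTRY:
        raise KeyError(f"unknown algorithm: {name}")
    result = REGISTRY[name](graph, **params)
    trace.append({"algorithm": name, "params": params})
    return result


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
trace = []
order = run_step("bfs", graph, trace, source="a")
print(order)  # ['a', 'b', 'c', 'd']
print(trace)  # [{'algorithm': 'bfs', 'params': {'source': 'a'}}]
```

Rejecting unregistered names is what keeps model output from turning into arbitrary code execution, and the trace is what makes a run reviewable afterwards.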

3. Task-Aware Graph Construction

YiGraph does not indiscriminately load all raw data into one large graph. Based on the current task, it selectively extracts and constructs only the entities and relationships relevant to the problem. This avoids interference from irrelevant structures and organizes the graph into a form better suited to execution, improving both efficiency and result quality.
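To make the idea concrete, the sketch below keeps only the relationship types a task needs when building a graph from raw records. The function name build_task_graph, the record layout, and the sample relation names are assumptions for illustration only.

```python
def build_task_graph(records, wanted_relations):
    """Return an adjacency dict using only edges whose type the task needs."""
    adj = {}
    for src, rel, dst in records:
        if rel not in wanted_relations:
            continue  # skip structure that is irrelevant to the question
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set())  # make sure the target node exists too
    return adj


# Hypothetical raw records: (source, relation type, target).
records = [
    ("acct_1", "transfer", "acct_2"),
    ("acct_2", "transfer", "acct_3"),
    ("acct_1", "logged_in_from", "device_9"),
    ("acct_3", "ships_to", "addr_4"),
]

# A fund-flow question only needs transfer edges between accounts.
task_graph = build_task_graph(records, wanted_relations={"transfer"})
print(sorted(task_graph))  # ['acct_1', 'acct_2', 'acct_3']
```

Devices and addresses never enter the graph here, so downstream algorithms run on a smaller structure and cannot be distracted by them.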

4. Rich Graph Algorithm Library

YiGraph ships with 100+ built-in graph algorithms across 11 major categories, providing professional algorithm support for a wide range of graph analysis scenarios:

Algorithm Category	Number of Algorithms	Typical Algorithms	Application Scenarios
Basics	10	BFS, DFS, Topological Sort, DAG Detection, Ancestor/Descendant Query	Graph structure validation, dependency analysis, hierarchical traversal
Path	13	Dijkstra, Bellman-Ford, Floyd-Warshall, Eulerian Path, DAG Longest Path	Path planning, relationship chain analysis, critical path
Centrality	14	PageRank, Betweenness Centrality, Closeness Centrality, Eigenvector Centrality, HITS, VoteRank	Key node identification, influence assessment, seed selection
Connectivity & Components	13	Connected Components, Strongly Connected Components, Cut Vertices/Edges, Minimum Cut, Node/Edge Connectivity	Network robustness, vulnerability analysis, island identification
Clustering & Community	17	Louvain, Leiden, Label Propagation, k-clique, Girvan-Newman, Clustering Coefficient, Cycle Detection	Circle identification, gang discovery, tightness analysis
Tree & Spanning Tree	3	Minimum Spanning Tree, Maximum Spanning Tree, Random Spanning Tree	Network skeleton extraction, cost optimization
Flow & Cut	5	Edmonds-Karp, Maximum Flow, Minimum Cut, Gomory-Hu Tree	Capacity planning, bottleneck analysis, network resilience
Matching & Coloring	6	Maximum/Minimum Weight Matching, Greedy Coloring, Minimum Edge Cover	Resource allocation, conflict detection, task scheduling
Cliques & Cores	4	Maximal Clique Enumeration, Maximum Weight Clique, k-core, Core Number	Tight group discovery, core member identification
Distance & Measures	8	Eccentricity, Diameter, Radius, Center/Periphery, Wiener Index, Assortativity Coefficient	Network health check, topology comparison, structural preference analysis
Graph Query	8	Node Query, Relationship Filtering, Neighbor Query, Path Query, Common Neighbors, Subgraph Extraction, Aggregation Statistics	Data retrieval and filtering, interactive exploration, risk control investigation

For detailed algorithm descriptions and usage guides, please refer to 📚 Online Documentation

5. Flexible Data Support

Supports multiple data source inputs:

Graph Data
Text Data: Documents, logs, reports, and other unstructured data

The system will automatically extract entities and relationships from raw data to build structured graph data.

6. Multiple Operating Modes

Normal Mode: Users only need to submit their business questions. YiGraph will automatically parse the problem, select appropriate graph algorithms, execute the computation, and generate an analysis report. This mode is suitable for non-technical or general business users.
Interactive Mode: Users collaborate with YiGraph to analyze business problems. For a given question, YiGraph interacts with the LLM to determine the computation workflow and graph algorithms, then executes the plan and returns an analysis report. This mode is suitable for advanced users who are familiar with both the business and graph algorithms.
Expert Mode: Users directly specify the business problem along with the solution approach, computation steps, and graph algorithms. YiGraph then executes the provided plan and returns an analysis report. This mode is intended for expert users with deep knowledge of the business and graph algorithms.

🎯 Version Release

v0.1.0 (Current Version)

Core Capabilities

✅ Complete graph computing engine (based on NetworkX and Neo4j)
✅ Intelligent task planning and execution
✅ 100+ graph algorithms support, covering 11 major categories
✅ Multi-data source support (graph/text)
✅ Interactive dialogue interface

Roadmap

v0.2.0 (Planned)

🔄 Expand the graph algorithm library to 200–300 algorithms
🔄 Add an integrated graph learning module (training/inference)

🚀 Quick Start

1. Environment Preparation

1.1 Python Version Requirements

Python >= 3.11

Please confirm that the current Python version meets the requirements:

python --version
# or
python3 --version
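If you prefer to check from inside the interpreter that will actually run YiGraph (for example after activating an environment), a short script such as the following works. The helper name meets_requirement is illustrative only.

```python
import sys

REQUIRED = (3, 11)  # minimum version stated above


def meets_requirement(version_info, required=REQUIRED):
    """Compare the (major, minor) part of a version against the minimum."""
    return tuple(version_info[:2]) >= required


current = ".".join(str(part) for part in sys.version_info[:3])
if meets_requirement(sys.version_info):
    print(f"Python {current} is supported")
else:
    print(f"Python {current} is too old; 3.11 or newer is required")
```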

1.2 Create Virtual Environment with Conda (Recommended)

conda create -n AAG python=3.11
conda activate AAG

1.3 Neo4j Installation and Configuration

YiGraph requires Neo4j as the graph database. This guide uses Neo4j 3.5.25.

1.3.1 Java Version Requirements

Neo4j 3.5.25 requires Java 8 or Java 11. Please check your Java version:

java -version

If Java is not installed, please install the appropriate version first.

1.3.2 Download and Extract Neo4j

Download the Neo4j 3.5.25 installation package from the official website (usually in .tar.gz or .zip format)
Extract the package to your desired location:

Linux/Mac systems (.tar.gz format):

tar -xzf neo4j-community-3.5.25-unix.tar.gz
cd neo4j-community-3.5.25

Windows systems (.zip format):

Right-click the archive and select "Extract to current folder"
Or use command: unzip neo4j-community-3.5.25-windows.zip
Navigate to the extracted directory

1.3.3 Configure Neo4j

Enter the conf directory and edit the neo4j.conf file:

cd conf

Add or modify the following settings in neo4j.conf:

dbms.connectors.default_listen_address=0.0.0.0
dbms.connectors.default_advertised_address=localhost
dbms.connector.bolt.listen_address=0.0.0.0:7687
dbms.connector.http.listen_address=0.0.0.0:7474
dbms.connector.https.enabled=true

1.3.4 Start and Stop Neo4j

Navigate to the bin directory to start or stop Neo4j:

Start Neo4j:

cd bin
./neo4j start

Stop Neo4j:

./neo4j stop

After starting Neo4j, you can access the web interface at http://localhost:7474 to verify the installation.
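For a scripted check, you can test whether the HTTP and Bolt ports configured above accept TCP connections. This is a minimal sketch using only the Python standard library; it confirms that something is listening on the port, not that it is Neo4j or that authentication will succeed. The helper name port_is_open is illustrative.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


# Ports taken from the neo4j.conf settings in section 1.3.3.
for name, port in (("HTTP", 7474), ("Bolt", 7687)):
    status = "reachable" if port_is_open("localhost", port) else "not reachable"
    print(f"Neo4j {name} port {port}: {status}")
```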

2. Get Source Code and Install Dependencies

2.1 Download Source Code

git clone https://github.com/iDC-NEU/YiGraph.git
cd YiGraph

2.2 Install Dependencies

pip install -r requirements.txt

3. Configure System Parameters

3.1 Configure Inference and Retrieval Engine

Edit the configuration file:

config/engine_config.yaml

Example configuration:

# Running mode: interactive / batch
mode: interactive

# Reasoner module configuration
reasoner:
  llm:
    provider: "openai"   # Options: ollama / openai
    openai:
      base_url: "https://your-api-endpoint/v1/"
      api_key: "your-api-key"
      model: "gpt-4o-mini"

# Retrieval module configuration
retrieval:
  database:
    graph:
      space_name: "AMLSim1K"
      server_ip: "127.0.0.1"
      server_port: "9669"
    vector:
      collection_name: "graphllm_collection"
      host: "localhost"
      port: 19530
  embedding:
    model_name: "BAAI/bge-large-en-v1.5"
    device: "cuda:2"
  rag:
    graph:
      k_hop: 2
    vector:
      k_similarity: 5

3.2 Configure Dataset

Edit the configuration file:

config/data_upload_config.yaml

Example configuration:

datasets:
  - name: AMLSim1K
    type: graph
    schema:
      vertex:
        - type: account
          path: "/path/to/accounts.csv"
          format: csv
          id_field: acct_id
      edge:
        - type: transfer
          path: "/path/to/transactions.csv"
          format: csv
          source_field: orig_acct
          target_field: bene_acct

Please modify path to your local actual data file path.
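A common source of load errors is a field name in this file that does not match the CSV header. The following sketch checks that the columns named by id_field, source_field, and target_field are present. It uses in-memory text in place of real files, and the helper missing_columns is illustrative, not part of YiGraph.

```python
import csv
import io


def missing_columns(csv_file, required):
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(csv_file), [])  # empty file yields an empty header
    return [name for name in required if name not in header]


# In-memory stand-ins for accounts.csv and transactions.csv.
accounts = io.StringIO("acct_id,name\n1,Alice\n")
transactions = io.StringIO("orig_acct,bene,amount\n1,2,10.0\n")

print(missing_columns(accounts, ["acct_id"]))                     # []
print(missing_columns(transactions, ["orig_acct", "bene_acct"]))  # ['bene_acct']
```

With real data, pass a file opened with open(path, newline="") instead of the StringIO objects.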

4. Start YiGraph

Important Note: Before starting YiGraph, please ensure that the Neo4j database is already running. If Neo4j is not started, YiGraph will not be able to connect to the graph database. Please refer to 1.3.4 Start and Stop Neo4j to start Neo4j.

YiGraph supports the following two operating modes:

Web Interactive Mode (Recommended): Perform interactive analysis through a browser. Suitable for daily use, demonstrations, and business analysis scenarios.
Terminal Interactive Mode: Interact through the command line. Suitable for development debugging, quick verification, and batch testing scenarios.

4.1 Web Interactive Mode

Execute the following command in the project root directory to start the Web service:

python web/frontend/run.py

After successful startup, the terminal will output the accessible service address. Please open the corresponding address in your browser according to the prompt to enter YiGraph's Web interface.

In the Web interface, users can input business questions in natural language, and the system will automatically complete the analysis process and display analysis results and reports.

Web Interface Usage Guide

Basic steps for using the YiGraph web interface for analysis:

Start Conversation: Start a new conversation or select an existing conversation from history.
Select Mode: Choose the mode that best suits your needs.
Select Dataset: The system will list your uploaded datasets. For example: DocumentDemo.
Enter Your Request: Type your instructions or questions in the input box. Please be as clear and specific as possible.
Submit: Click the send button.
Monitor Progress: Observe status updates in the main chat area (Running, Planning, Analyzing, etc.).
View Results: After processing is complete, results will be displayed in the main chat area. You can then ask follow-up questions or start a new request.

Dataset Management

In the web interface, you can conveniently manage datasets:

Create Dataset: Click the "Create New" button.
Fill in Dataset Information:
- Enter the dataset name
- Select the file type for the dataset
Upload Data Files: Upload corresponding data files based on the selected file type.
Save Dataset: After completing the configuration, save it. The dataset will be available for selection in conversations.

File Management

In the file management interface, you can manage and visualize files in datasets:

Select Dataset: Choose the corresponding dataset from the dropdown list.
Upload Files: Upload files to the selected dataset.
View Parsing Progress: The system will display file parsing progress and provide real-time status feedback.
Visualize Knowledge Graph: After file parsing is complete, click the "Visualization" button to view the knowledge graph visualization for that dataset.

4.2 Terminal Interactive Mode

To interact with YiGraph directly from the command line, run the following command in the project root directory:

python aag/main.py

After startup, the system will enter terminal interactive mode. Users can input questions according to terminal prompts, and YiGraph will complete the analysis and output results in the command line.

Terminal Interactive Usage Guide

Basic steps for using terminal interactive mode:

View Available Datasets: Use commands to view available datasets in the system.
Select Dataset: Select the dataset you want to use according to the prompts.
Enter Questions: Directly input your business questions or analysis requirements in the terminal.
View Results: The system will display the analysis process and final results in real-time in the terminal.

This mode is mainly used for development debugging, algorithm verification, or quick testing scenarios.

5. Using YiGraph

Whether using Web mode or terminal mode, YiGraph's basic usage process is consistent:

Start the corresponding operating mode
Input natural language business questions according to prompts
The system automatically completes task understanding, analysis execution, and result generation

For more advanced features, parameter descriptions, and usage examples, please refer to the online documentation or the prompts in the interface.

6. Common Issues and Suggestions

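GPU device unavailable: Confirm that embedding.device in config/engine_config.yaml points to a device that exists on your machine (the example uses cuda:2)
Port conflict: Check whether the graph database and vector database services are already running, and that their configured ports are not in use by another process
Model cannot be loaded: Confirm that the API key and model name are valid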

📖 Documentation & Resources

📚 Online Documentation

Access the complete user manual and developer guide:

http://iDC-NEU.github.io/YiGraphDocs/

Documentation includes:

Quick Start: System installation, configuration, and basic usage
Core Concepts: AAG framework principles and architecture design
Algorithm Documentation: Detailed descriptions and usage examples of 100+ graph algorithms
API Reference: Complete API interface documentation
Best Practices: Analysis cases and experience summaries for typical scenarios

📞 Contact Us

Contribution Guidelines

We welcome all forms of contributions:

🐛 Report Bugs
💡 Suggest New Features
📝 Improve Documentation
🔧 Submit Code

Community Communication

WeChat	Xiaohongshu	Twitter

📚 Citation

If you use YiGraph or the AAG framework in your research, please cite our paper:

@article{YiGraph2026,
  title={Towards Autonomous Graph Data Analytics with Analytics-Augmented Generation},
  author={Qiange Wang and Chaoyi Chen and Jingqi Gao and Zihan Wang and Yanfeng Zhang and Ge Yu},
  journal={arXiv preprint arXiv:2602.21604},
  year={2026}
}

Acknowledgments

This project benefits from the following open source projects:

NetworkX - Graph analysis and algorithm library
PyTorch Geometric - Graph deep learning framework
NebulaGraph - Distributed graph database
Milvus - Vector database
LlamaIndex - RAG framework

Thanks to all contributors for their hard work!

📄 License

This project is licensed under the MIT License.

⭐ Star History

If this project helps you, please Star ⭐ to support us!

Making Graph Data Analysis Simpler and Smarter
