
Commit f54dce1

added graphrag uses cases, reorganize content

1 parent 50ea476 commit f54dce1

5 files changed: +283 −136 lines changed

site/content/gen-ai/graphrag/_index.md

Lines changed: 33 additions & 134 deletions
@@ -14,148 +14,47 @@ exclusive early access, [get in touch](https://arangodb.com/contact/) with
 the ArangoDB team.
 {{< /tip >}}
 
-## Introduction
+## Transform unstructured documents into intelligent knowledge graphs
 
-Large language models (LLMs) and knowledge graphs are two prominent and
-contrasting concepts, each possessing unique characteristics and functionalities
-that significantly impact the methods we employ to extract valuable insights from
-constantly expanding and complex datasets.
+ArangoDB's GraphRAG solution enables organizations to extract meaningful insights
+from their document collections by creating knowledge graphs that capture not just
+individual facts, but the intricate relationships between concepts across documents.
+This approach goes beyond traditional RAG systems by understanding document
+interconnections and providing both granular detail-level responses and high-level
+conceptual understanding.
 
-LLMs, such as those powering OpenAI's ChatGPT, represent a class of powerful language
-transformers. These models leverage advanced neural networks to exhibit a
-remarkable proficiency in understanding, generating, and participating in
-contextually-aware conversations.
+- **Intelligent document understanding**: Automatically extracts and connects knowledge across multiple document sources
+- **Contextual intelligence**: Maintains relationships between concepts, enabling more accurate and comprehensive responses
+- **Multi-level insights**: Provides both detailed technical answers and strategic high-level understanding
+- **Seamless knowledge access**: Natural language interface for querying complex document relationships
 
-On the other hand, knowledge graphs contain carefully structured data and are
-designed to capture intricate relationships among discrete and seemingly
-unrelated information.
+## Key benefits for enterprise applications
 
-ArangoDB's unique capabilities and flexible integration of knowledge graphs and
-LLMs provide a powerful and efficient solution for anyone seeking to extract
-valuable insights from diverse datasets.
+- **Cross-document relationship intelligence**:
+  Unlike traditional RAG systems that treat documents in isolation, ArangoDB's GraphRAG
+  pipeline detects and leverages references between documents and chunks. This enables
+  more accurate responses by understanding how concepts relate across your entire knowledge base.
 
-The GraphRAG component of the GenAI Suite brings all the capabilities
-together with an easy-to-use interface, so you can make the knowledge accessible
-to your organization.
+- **Multi-level understanding architecture**:
+  The system provides both detailed technical responses and high-level strategic insights
+  from the same knowledge base, adapting response depth based on query complexity and user intent.
 
-GraphRAG is particularly valuable for use cases like the following:
-- Applications requiring in-depth knowledge retrieval
-- Contextual question answering
-- Reasoning over interconnected information
+- **Reference-aware knowledge graph**:
+  GraphRAG automatically detects and maps relationships between document chunks while
+  maintaining context of how information connects across different sources.
 
-## How GraphRAG works
+- **Dynamic knowledge evolution**:
+  The system learns and improves understanding as more documents are added, with
+  relationships and connections becoming more sophisticated over time.
 
-ArangoDB's GraphRAG solution democratizes the creation and usage of knowledge
-graphs with a unique combination of vector search, graphs, and LLMs (privately or publicly hosted)
-in a single product.
 
-The overall workflow involves the following steps:
-1. **Chunking**:
-   - Breaking down raw documents into text chunks
-2. **Entity and relation extraction for Knowledge Graph construction**:
-   - LLM-assisted description of entities and relations
-   - Entities get inserted as nodes with embeddings
-   - Relations get inserted as edges, these include: entity-entity, entity-chunk, chunk-document
-3. **Topology-based clustering into mini-topics (called communities)**:
-   - Each entity points to its community
-   - Each community points to its higher-level community, if available
-     (mini-topics point to major topics)
-4. **LLM-assisted community summarization**:
-   - Community summarization is based on all information available about each topic
+## What's next
 
-### Turn text files into a Knowledge Graph
+- **[GraphRAG Enterprise Use Cases](use-cases.md)**: Understand the business value through real-world scenarios.
+- **[GraphRAG Technical Overview](technical-overview.md)**: Dive into the architecture, services, and implementation details.
+- **[GraphRAG Web Interface](web-interface.md)**: Try GraphRAG using the interactive web interface.
+- **[GraphRAG Tutorial using integrated Notebook servers](tutorial-notebook.md)**: Follow hands-on examples and implementation guidance via Jupyter Notebooks.
 
-The Importer service is the entry point of the GraphRAG pipeline. It takes a
-raw text file as input, processes it using an LLM to extract entities and
-relationships, and generates a Knowledge Graph. The Knowledge Graph is then
-stored in an ArangoDB database for further use. The Knowledge Graph represents
-information in a structured graph format, allowing efficient querying and retrieval.
-
-1. Pre-process the raw text file to identify entities and their relationships.
-2. Use LLMs to infer connections and context, enriching the Knowledge Graph.
-3. Store the generated Knowledge Graph in the database for retrieval and reasoning.
-
-For detailed information about the service, see the
-[Importer](../services/importer.md) service documentation.
-
-### Extract information from the Knowledge Graph
-
-The Retriever service enables intelligent search and retrieval of information
-from your previously created Knowledge Graph.
-You can extract information from Knowledge Graphs using two distinct methods:
-- Global retrieval
-- Local retrieval
-
-For detailed information about the service, see the
-[Retriever](../services/retriever.md) service documentation.
-
-#### Global retrieval
-
-Global retrieval focuses on:
-- Extracting information from the entire Knowledge Graph, regardless of specific
-  contexts or constraints.
-- Provides a comprehensive overview and answers queries that span across multiple
-  entities and relationships in the graph.
-
-**Use cases:**
-- Answering broad questions that require a holistic understanding of the Knowledge Graph.
-- Aggregating information from diverse parts of the Knowledge Graph for high-level insights.
-
-**Example query:**
-
-Global retrieval can answer questions like _**What are the main themes or topics covered in the document**_?
-
-During import, the entire Knowledge Graph is analyzed to identify and summarize
-the dominant entities, their relationships, and associated themes. Global
-retrieval uses these community summaries to answer questions from different
-perspectives, then the information gets aggregated into the final response.
-
-#### Local retrieval
-
-Local retrieval is a more focused approach for:
-- Queries that are constrained to specific subgraphs or contextual clusters
-  within the Knowledge Graph.
-- Targeted and precise information extraction, often using localized sections
-  of the Knowledge Graph.
-
-**Use cases:**
-- Answering detailed questions about a specific entity or a related group of entities.
-- Retrieving information relevant to a particular topic or section in the Knowledge Graph.
-
-**Example query:**
-
-Local retrieval can answer questions like _**What is the relationship between entity X and entity Y**_?
-
-Local queries use hybrid search (semantic and lexical) over the Entities
-collection, and then it expands that subgraph over related entities, relations
-(and its LLM-generated verbal descriptions), text chunks, and communities.
-
-### Private LLMs
-
-If you're working in an air-gapped environment or need to keep your data
-private, you can use the private LLM mode with
-[Triton Inference Server](../services/triton-inference-server.md).
-
-This option allows you to run the service completely within your own
-infrastructure. The Triton Inference Server is a crucial component when
-running in private LLM mode. It serves as the backbone for running your
-language (LLM) and embedding models on your own machines, ensuring your
-data never leaves your infrastructure. The server handles all the complex
-model operations, from processing text to generating embeddings, and provides
-both HTTP and gRPC interfaces for communication.
-
-### Public LLMs
-
-Alternatively, if you prefer a simpler setup and don't have specific privacy
-requirements, you can use the public LLM mode. This option connects to cloud-based
-services like OpenAI's models via the OpenAI API or a large array of models
-(Gemini, Anthropic, publicly hosted open-source models, etc.) via the OpenRouter option.
-
-## Limitations
-
-The pre-release version of ArangoDB GraphRAG has the following limitations:
-
-- You can only import a single file.
-- The knowledge graph generated from the file is imported into a named graph
-  with a fixed name of `KnowledgeGraph` and set of collections which also have
-  fixed names.
+For deeper implementation details, explore the individual services:
+- **[Importer Service](services/importer.md)**: Transform documents into knowledge graphs.
+- **[Retriever Service](services/retriever.md)**: Query and extract insights from your knowledge graphs.
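The local retrieval flow that this commit moves out of `_index.md` (hybrid seed search over entities, followed by subgraph expansion) can be illustrated with a toy Python sketch. Everything below is hypothetical: the in-memory `entities`/`edges` data and the `local_retrieve` helper are stand-ins rather than ArangoDB's actual collections or API, and plain substring matching stands in for hybrid semantic and lexical search.

```python
# Toy local-retrieval sketch: seed entities, expand one hop, collect context.
entities = {
    "ArangoDB": "A multi-model graph database.",
    "GraphRAG": "A retrieval pipeline combining knowledge graphs and LLMs.",
    "Importer": "Service that turns raw text into a knowledge graph.",
}
edges = [("GraphRAG", "ArangoDB"), ("Importer", "GraphRAG")]

def local_retrieve(query: str) -> dict:
    # 1. Seed selection: simple lexical match over entity names
    #    (stand-in for hybrid semantic + lexical search).
    seeds = {name for name in entities if name.lower() in query.lower()}
    # 2. Subgraph expansion: add one-hop neighbors of every seed entity.
    neighbors = {b for a, b in edges if a in seeds} | {a for a, b in edges if b in seeds}
    # 3. The collected entity descriptions would be handed to an LLM as context.
    return {name: entities[name] for name in sorted(seeds | neighbors)}

print(local_retrieve("What is the relationship between GraphRAG and ArangoDB?"))
```

The expansion step is what distinguishes this from plain vector RAG: the `Importer` entity is pulled in purely because it is a graph neighbor of a seed, not because it matched the query text.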

site/content/gen-ai/graphrag/tutorial-notebook.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ title: GraphRAG Notebook Tutorial
 menuTitle: Notebook Tutorial
 description: >-
   Building a GraphRAG pipeline using ArangoDB's integrated notebook servers
-weight: 10
+weight: 25
 ---
 {{< tip >}}
 The Arango Data Platform & GenAI Suite is available as a pre-release. To get

site/content/gen-ai/graphrag/web-interface.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 ---
 title: How to use GraphRAG in the Arango Data Platform web interface
 menuTitle: Web Interface
-weight: 5
+weight: 20
 description: >-
   Learn how to create, configure, and run a full GraphRAG workflow in four steps
   using the Platform web interface
site/content/gen-ai/graphrag/technical-overview.md

Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
+---
+title: GraphRAG Technical Overview
+menuTitle: Technical Overview
+weight: 15
+description: >-
+  Technical overview of ArangoDB's GraphRAG solution, including
+  architecture, services, and deployment options
+---
+{{< tag "ArangoDB Platform" >}}
+
+{{< tip >}}
+The ArangoDB Platform & GenAI Suite is available as a pre-release. To get
+exclusive early access, [get in touch](https://arangodb.com/contact/) with
+the ArangoDB team.
+{{< /tip >}}
+
+## Introduction
+
+Large language models (LLMs) and knowledge graphs are two prominent and
+contrasting concepts, each possessing unique characteristics and functionalities
+that significantly impact the methods we employ to extract valuable insights from
+constantly expanding and complex datasets.
+
+LLMs, such as those powering OpenAI's ChatGPT, represent a class of powerful language
+transformers. These models leverage advanced neural networks to exhibit a
+remarkable proficiency in understanding, generating, and participating in
+contextually-aware conversations.
+
+On the other hand, knowledge graphs contain carefully structured data and are
+designed to capture intricate relationships among discrete and seemingly
+unrelated information.
+
+ArangoDB's unique capabilities and flexible integration of knowledge graphs and
+LLMs provide a powerful and efficient solution for anyone seeking to extract
+valuable insights from diverse datasets.
+
+The GraphRAG component of the GenAI Suite brings all the capabilities
+together with an easy-to-use interface, so you can make the knowledge accessible
+to your organization.
+
+GraphRAG is particularly valuable for use cases like the following:
+- Applications requiring in-depth knowledge retrieval
+- Contextual question answering
+- Reasoning over interconnected information
+
+## How GraphRAG works
+
+ArangoDB's GraphRAG solution democratizes the creation and usage of knowledge
+graphs with a unique combination of vector search, graphs, and LLMs (privately or publicly hosted)
+in a single product.
+
+The overall workflow involves the following steps:
+1. **Chunking**:
+   - Breaking down raw documents into text chunks
+2. **Entity and relation extraction for Knowledge Graph construction**:
+   - LLM-assisted description of entities and relations
+   - Entities get inserted as nodes with embeddings
+   - Relations get inserted as edges, these include: entity-entity, entity-chunk, chunk-document
+3. **Topology-based clustering into mini-topics (called communities)**:
+   - Each entity points to its community
+   - Each community points to its higher-level community, if available
+     (mini-topics point to major topics)
+4. **LLM-assisted community summarization**:
+   - Community summarization is based on all information available about each topic
+
+### Turn text files into a Knowledge Graph
+
+The Importer service is the entry point of the GraphRAG pipeline. It takes a
+raw text file as input, processes it using an LLM to extract entities and
+relationships, and generates a Knowledge Graph. The Knowledge Graph is then
+stored in an ArangoDB database for further use. The Knowledge Graph represents
+information in a structured graph format, allowing efficient querying and retrieval.
+
+1. Pre-process the raw text file to identify entities and their relationships.
+2. Use LLMs to infer connections and context, enriching the Knowledge Graph.
+3. Store the generated Knowledge Graph in the database for retrieval and reasoning.
+
+For detailed information about the service, see the
+[Importer](services/importer.md) service documentation.
+
+### Extract information from the Knowledge Graph
+
+The Retriever service enables intelligent search and retrieval of information
+from your previously created Knowledge Graph.
+You can extract information from Knowledge Graphs using two distinct methods:
+- Global retrieval
+- Local retrieval
+
+For detailed information about the service, see the
+[Retriever](services/retriever.md) service documentation.
+
+#### Global retrieval
+
+Global retrieval focuses on:
+- Extracting information from the entire Knowledge Graph, regardless of specific
+  contexts or constraints.
+- Provides a comprehensive overview and answers queries that span across multiple
+  entities and relationships in the graph.
+
+**Use cases:**
+- Answering broad questions that require a holistic understanding of the Knowledge Graph.
+- Aggregating information from diverse parts of the Knowledge Graph for high-level insights.
+
+**Example query:**
+
+Global retrieval can answer questions like _**What are the main themes or topics covered in the document**_?
+
+During import, the entire Knowledge Graph is analyzed to identify and summarize
+the dominant entities, their relationships, and associated themes. Global
+retrieval uses these community summaries to answer questions from different
+perspectives, then the information gets aggregated into the final response.
+
+#### Local retrieval
+
+Local retrieval is a more focused approach for:
+- Queries that are constrained to specific subgraphs or contextual clusters
+  within the Knowledge Graph.
+- Targeted and precise information extraction, often using localized sections
+  of the Knowledge Graph.
+
+**Use cases:**
+- Answering detailed questions about a specific entity or a related group of entities.
+- Retrieving information relevant to a particular topic or section in the Knowledge Graph.
+
+**Example query:**
+
+Local retrieval can answer questions like _**What is the relationship between entity X and entity Y**_?
+
+Local queries use hybrid search (semantic and lexical) over the Entities
+collection, and then it expands that subgraph over related entities, relations
+(and its LLM-generated verbal descriptions), text chunks, and communities.
+
+### Private LLMs
+
+If you're working in an air-gapped environment or need to keep your data
+private, you can use the private LLM mode with
+[Triton Inference Server](services/triton-inference-server.md).
+
+This option allows you to run the service completely within your own
+infrastructure. The Triton Inference Server is a crucial component when
+running in private LLM mode. It serves as the backbone for running your
+language (LLM) and embedding models on your own machines, ensuring your
+data never leaves your infrastructure. The server handles all the complex
+model operations, from processing text to generating embeddings, and provides
+both HTTP and gRPC interfaces for communication.
+
+### Public LLMs
+
+Alternatively, if you prefer a simpler setup and don't have specific privacy
+requirements, you can use the public LLM mode. This option connects to cloud-based
+services like OpenAI's models via the OpenAI API or a large array of models
+(Gemini, Anthropic, publicly hosted open-source models, etc.) via the OpenRouter option.
+
+## Limitations
+
+The pre-release version of ArangoDB GraphRAG has the following limitations:
+
+- You can only import a single file.
+- The knowledge graph generated from the file is imported into a named graph
+  with a fixed name of `KnowledgeGraph` and set of collections which also have
+  fixed names.
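The four-step workflow added in this commit (chunking, entity/relation extraction, community clustering, summarization) can be condensed into a self-contained Python sketch. It is purely illustrative: paragraph splitting stands in for chunking, picking capitalized words stands in for LLM entity extraction, and connected components stand in for topology-based community detection; none of the names below correspond to ArangoDB's actual services or collections.

```python
# Toy sketch of the GraphRAG construction workflow described above.
from itertools import combinations

def chunk(text: str) -> list[str]:
    # Step 1: break the raw document into text chunks (here: paragraphs).
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def extract(chunks: list[str]):
    # Step 2: stand-in for LLM entity/relation extraction -- treat
    # capitalized words as entities, co-occurrence as a relation.
    nodes, edges = set(), set()
    for i, c in enumerate(chunks):
        ents = {w.strip(".,") for w in c.split() if w[0].isupper() and len(w) > 2}
        nodes |= ents
        edges |= {(a, b) for a, b in combinations(sorted(ents), 2)}  # entity-entity
        edges |= {(e, f"chunk{i}") for e in ents}                    # entity-chunk
    return nodes, edges

def communities(nodes: set, edges: set) -> list[set]:
    # Step 3: trivial clustering -- connected components over the
    # entity-entity edges (stand-in for topology-based clustering).
    parent = {n: n for n in nodes}
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    for a, b in edges:
        if a in parent and b in parent:  # skip entity-chunk edges
            parent[find(a)] = find(b)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())

doc = "ArangoDB stores the Knowledge Graph.\n\nThe Importer feeds ArangoDB."
nodes, edges = extract(chunk(doc))
# Step 4 would summarize each community with an LLM; here we just list members.
print(communities(nodes, edges))
```

Because both toy chunks mention ArangoDB, their entities fall into a single connected component, mirroring how entities shared across chunks glue a community together in the real pipeline.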
