KGraph is a sophisticated C++ library that implements a hierarchical knowledge graph system designed for semantic document processing and retrieval-augmented generation (RAG) applications. It provides efficient graph-based storage and querying capabilities optimized for structured document relationships rather than vector-based similarity searches.
- Hierarchical Graph Structure: Supports complex node relationships with inheritance and semantic linking
- Path Finding Algorithms: Advanced path discovery including shortest path, strongest path, and type-specific traversal
- Cycle Detection: Built-in cycle detection and validation algorithms
- RAG Integration: Optimized for hierarchical retrieval-augmented generation systems
- Relational Database Compatible: Designed for traditional RDBMS systems rather than NoSQL solutions
- Thread-Safe Operations: Mutex-protected operations for concurrent access
- Memory Efficient: Optimized memory usage with RAII principles
┌─────────────────────────────────────────────────────────────┐
│ KGraph System │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ KGraph │ │ Supporting │ │ Graph │ │
│ │ Engine │ │ Types │ │Orchestrator│ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
├─────────────────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ BaseNode │ │ KGEdge │ │ KGPath │ │
│ │ System │ │ │ │ │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────┘
- KGraph: Main graph engine providing high-level graph operations
- SupportingTypes: Fundamental data structures and utilities
- BaseNode: Core node implementation with attribute management
- KGEdge: Edge definitions with relationship types and strength
- KGPath: Path representation and analysis utilities
- C++17 compatible compiler (GCC 7+, Clang 5+, MSVC 2017+)
- CMake 3.15 or higher
- C++ Standard Library
# Clone the repository
git clone https://github.com/AlotfyDev/KGraph.git
cd KGraph
# Create build directory
mkdir build && cd build
# Configure with CMake
cmake .. -DCMAKE_BUILD_TYPE=Release
# Build the library
cmake --build . --config Release
# Install (optional)
cmake --install . --prefix /usr/local
- Standard Library: Full C++17 support required
- Threading: Uses std::mutex for thread safety
- Optional: Integration with external storage systems
#include "KGraph/KGraph.hpp"
#include "SupportingTypes/KGBaseNode.hpp"
#include "SupportingTypes/EKGNodeType.hpp"
// Create a knowledge graph instance
KGraph graph;
// Create nodes
auto node1 = std::make_shared<BaseNode>("node1", NodeType::PROJECT, "Project Alpha");
auto node2 = std::make_shared<BaseNode>("node2", NodeType::TASK, "Task Beta");
// Add nodes to graph
graph.addNode(node1);
graph.addNode(node2);
// Create relationship
graph.addRelationship(node1, node2, EdgeRelationshipType::CONTAINS, 0.8);
// Query neighbors
auto neighbors = graph.getNeighbors("node1");
// Find shortest path between nodes
KGPath shortestPath = graph.findShortestPath("node1", "node2");
// Find all paths with constraints
std::vector<KGPath> allPaths = graph.findAllPaths(
"node1", "node2",
5, // max depth
0.3 // minimum strength
);
// Find paths by relationship type
std::vector<KGPath> dependencyPaths = graph.findPathsByType(
"node1", "node2",
EdgeRelationshipType::DEPENDS_ON,
10
);
// Query nodes by type
auto projectNodes = graph.queryNodesByType(NodeType::PROJECT);
// Extract subgraph
KGraph subgraph = graph.extractSubgraph({"node1", "node2", "node3"});
// Validate graph structure
bool isValid = graph.validateGraph();
auto orphaned = graph.findOrphanedNodes();
class KGraph {
public:
// Node management
void addNode(std::shared_ptr<BaseNode> node);
std::shared_ptr<BaseNode> getNode(const std::string& node_id) const;
void removeNode(const std::string& node_id);
bool containsNode(const std::string& node_id) const;
// Relationship management
void addRelationship(std::shared_ptr<BaseNode> source,
std::shared_ptr<BaseNode> target,
EdgeRelationshipType rel_type,
double strength = 1.0);
// Path finding
std::vector<KGPath> findAllPaths(const std::string& source_id,
const std::string& target_id,
int max_depth = 10,
double min_strength = 0.1) const;
KGPath findShortestPath(const std::string& source_id,
const std::string& target_id,
double min_strength = 0.1) const;
// Graph analysis
bool validateGraph() const;
std::vector<std::string> findOrphanedNodes() const;
std::vector<std::vector<std::string>> findConnectedComponents() const;
};
class BaseNode {
public:
// Node properties
std::string getId() const;
NodeType getType() const;
std::string getDescription() const;
std::time_t getTimestamp() const;
// Attribute management
void addAttribute(std::shared_ptr<KGStructuredAttribute> attr);
std::shared_ptr<KGStructuredAttribute> getAttributeByName(const std::string& name) const;
// Edge management
void addIncomingEdgeId(const std::string& edgeId);
void addOutgoingEdgeId(const std::string& edgeId);
// Serialization
std::string toYAML(int indent = 0) const;
std::string toJSON(int indent = 0) const;
};
KGraph/
├── KGraph/ # Core KGraph implementation
│ ├── KGraph.cpp # Main graph engine implementation
│ └── KGraph.hpp # KGraph class definition
├── SupportingTypes/ # Supporting type definitions
│ ├── KGBaseNode.cpp # Base node implementation
│ ├── KGBaseNode.hpp # Base node interface
│ ├── KGEdge.cpp # Edge implementation
│ ├── KGEdge.hpp # Edge definitions
│ ├── KGPath.cpp # Path implementation
│ ├── KGPath.hpp # Path definitions
│ ├── KGStructuredAttribute.cpp # Structured attribute implementation
│ ├── KGStructuredAttribute.hpp # Structured attribute interface
│ ├── GraphOrchestrator.cpp # Graph orchestration utilities
│ ├── GraphOrchestrator.hpp # Graph orchestration interface
│ ├── NodePipelineManager.cpp # Node pipeline management
│ ├── NodePipelineManager.hpp # Node pipeline interface
│ ├── EAttributeContentType.hpp # Attribute content type enums
│ ├── EAttributeType.hpp # Attribute type enums
│ ├── EKGNodeType.hpp # Node type enums
│ ├── EKGPathType.hpp # Path type enums
│ ├── EKGRelation.hpp # Relationship type enums
│ └── Struture.md # Structure documentation
├── CMakeLists.txt # Build configuration
└── README.md # This file
The system supports various node types for different semantic contexts:
Node Type | Description | Use Case |
---|---|---|
PROJECT |
Project-level nodes | Software projects, initiatives |
TASK |
Task or activity nodes | Individual tasks, activities |
DOCUMENT |
Document nodes | Text documents, specifications |
CONCEPT |
Conceptual nodes | Ideas, requirements, features |
PERSON |
People or stakeholder nodes | Team members, stakeholders |
RESOURCE |
Resource nodes | Tools, systems, artifacts |
Predefined relationship types for semantic linking:
Relationship Type | Description | Strength Range |
---|---|---|
CONTAINS |
Containment relationship | 0.1 - 1.0 |
DEPENDS_ON |
Dependency relationship | 0.1 - 1.0 |
RELATED_TO |
General relationship | 0.1 - 0.8 |
IMPLEMENTS |
Implementation relationship | 0.3 - 1.0 |
REFERENCES |
Reference relationship | 0.1 - 0.7 |
HIERARCHICAL |
Parent-child relationship | 0.2 - 1.0 |
KGraph is designed to work seamlessly with ArchiNote for semantic document management:
// ArchiNote integration example
KGraph graph;
graph.initializeGraph("archinote.db");
// Ingest ArchiNote documents
graph.ingestFromArchiNotes(yamlDocumentData);
// Export for RAG systems
std::string ragData = graph.exportToRAG(documentNodeIds);
Operation | Average Case | Worst Case |
---|---|---|
Node Addition | O(1) | O(1) |
Edge Addition | O(1) | O(1) |
Neighbor Query | O(degree) | O(degree) |
Shortest Path | O((V+E)log V) | O((V+E)log V) |
Cycle Detection | O(V+E) | O(V+E) |
- Node Storage: ~200-500 bytes per node (depending on attributes)
- Edge Storage: ~100-200 bytes per edge
- Path Storage: Variable based on path length and complexity
// Hierarchical document retrieval
std::string retrieveContext(const std::string& query) {
// Find relevant document nodes
auto relevantNodes = graph.retrieveForQuery(query);
// Extract hierarchical context
std::vector<std::string> nodeIds = {"doc1", "doc2", "doc3"};
std::string subgraph = graph.getSubgraph(nodeIds);
return subgraph;
}
// Comprehensive graph analysis
bool isValid = graph.validateGraph();
if (!isValid) {
auto orphanedNodes = graph.findOrphanedNodes();
auto components = graph.findConnectedComponents();
// Handle disconnected components
for (const auto& component : components) {
if (component.size() < expectedSize) {
// Reconnect or analyze isolated component
}
}
}
- Follow C++17 best practices
- Use RAII for resource management
- Implement thread-safe operations where needed
- Document public APIs with examples
// Example test structure
TEST(KGraphTest, BasicGraphOperations) {
KGraph graph;
auto node1 = std::make_shared<BaseNode>("test1", NodeType::PROJECT);
auto node2 = std::make_shared<BaseNode>("test2", NodeType::TASK);
graph.addNode(node1);
graph.addNode(node2);
EXPECT_TRUE(graph.containsNode("test1"));
EXPECT_EQ(graph.getNodeCount(), 2);
}
- Fork the repository
- Create a feature branch
- Add comprehensive tests
- Ensure thread safety where applicable
- Submit a pull request
// Optimize for different use cases
graph.setMaxPathDepth(5); // Limit search depth
graph.setMinRelationshipStrength(0.3); // Filter weak relationships
graph.enableCaching(true); // Enable result caching
- Use appropriate node and edge types
- Implement connection pooling for large graphs
- Consider graph partitioning for very large datasets
- Distributed Graph Processing: Support for multi-node graph operations
- Advanced Analytics: Graph metrics and visualization support
- Machine Learning Integration: ML-enhanced relationship discovery
- REST API: HTTP interface for web integration
- Plugin System: Extensible node and relationship types
This project is licensed under the MIT License - see the LICENSE file for details.
For questions, issues, or contributions, please:
- Check existing documentation
- Review open issues on GitHub
- Create new issues for bugs or feature requests
- Submit pull requests for improvements
KGraph Version 1.0.0 - Hierarchical Knowledge Graph System for Semantic Processing