Skip to content

AlotfyDev/KGraph

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 

Repository files navigation

KGraph - Hierarchical Knowledge Graph System

C++ License: MIT Version: 1.0.0

Overview

KGraph is a sophisticated C++ library that implements a hierarchical knowledge graph system designed for semantic document processing and retrieval-augmented generation (RAG) applications. It provides efficient graph-based storage and querying capabilities optimized for structured document relationships rather than vector-based similarity searches.

Key Features

  • Hierarchical Graph Structure: Supports complex node relationships with inheritance and semantic linking
  • Path Finding Algorithms: Advanced path discovery including shortest path, strongest path, and type-specific traversal
  • Cycle Detection: Built-in cycle detection and validation algorithms
  • RAG Integration: Optimized for hierarchical retrieval-augmented generation systems
  • Relational Database Compatible: Designed for traditional RDBMS systems rather than NoSQL solutions
  • Thread-Safe Operations: Mutex-protected operations for concurrent access
  • Memory Efficient: Optimized memory usage with RAII principles

Architecture

Core Components

┌─────────────────────────────────────────────────────────────┐
│                      KGraph System                         │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐        │
│  │   KGraph    │  │ Supporting  │  │  Graph      │        │
│  │   Engine    │  │   Types     │  │Orchestrator│        │
│  └─────────────┘  └─────────────┘  └─────────────┘        │
├─────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐        │
│  │  BaseNode   │  │   KGEdge    │  │   KGPath    │        │
│  │   System    │  │             │  │             │        │
│  └─────────────┘  └─────────────┘  └─────────────┘        │
└─────────────────────────────────────────────────────────────┘

Component Relationships

  • KGraph: Main graph engine providing high-level graph operations
  • SupportingTypes: Fundamental data structures and utilities
  • BaseNode: Core node implementation with attribute management
  • KGEdge: Edge definitions with relationship types and strength
  • KGPath: Path representation and analysis utilities

Installation

Prerequisites

  • C++17 compatible compiler (GCC 7+, Clang 5+, MSVC 2017+)
  • CMake 3.15 or higher
  • C++ Standard Library

Build Instructions

# Clone the repository
git clone https://github.com/AlotfyDev/KGraph.git
cd KGraph

# Create build directory
mkdir build && cd build

# Configure with CMake
cmake .. -DCMAKE_BUILD_TYPE=Release

# Build the library
cmake --build . --config Release

# Install (optional)
cmake --install . --prefix /usr/local

Dependencies

  • Standard Library: Full C++17 support required
  • Threading: Uses std::mutex for thread safety
  • Optional: Integration with external storage systems

Usage

Basic Graph Operations

#include "KGraph/KGraph.hpp"
#include "SupportingTypes/KGBaseNode.hpp"
#include "SupportingTypes/EKGNodeType.hpp"

// Create a knowledge graph instance
KGraph graph;

// Create nodes
auto node1 = std::make_shared<BaseNode>("node1", NodeType::PROJECT, "Project Alpha");
auto node2 = std::make_shared<BaseNode>("node2", NodeType::TASK, "Task Beta");

// Add nodes to graph
graph.addNode(node1);
graph.addNode(node2);

// Create relationship
graph.addRelationship(node1, node2, EdgeRelationshipType::CONTAINS, 0.8);

// Query neighbors
auto neighbors = graph.getNeighbors("node1");

Path Finding

// Find shortest path between nodes
KGPath shortestPath = graph.findShortestPath("node1", "node2");

// Find all paths with constraints
std::vector<KGPath> allPaths = graph.findAllPaths(
    "node1", "node2",
    5,      // max depth
    0.3     // minimum strength
);

// Find paths by relationship type
std::vector<KGPath> dependencyPaths = graph.findPathsByType(
    "node1", "node2",
    EdgeRelationshipType::DEPENDS_ON,
    10
);

Advanced Querying

// Query nodes by type
auto projectNodes = graph.queryNodesByType(NodeType::PROJECT);

// Extract subgraph
KGraph subgraph = graph.extractSubgraph({"node1", "node2", "node3"});

// Validate graph structure
bool isValid = graph.validateGraph();
auto orphaned = graph.findOrphanedNodes();

API Reference

KGraph Class

Core Methods

class KGraph {
public:
    // Node management
    void addNode(std::shared_ptr<BaseNode> node);
    std::shared_ptr<BaseNode> getNode(const std::string& node_id) const;
    void removeNode(const std::string& node_id);
    bool containsNode(const std::string& node_id) const;

    // Relationship management
    void addRelationship(std::shared_ptr<BaseNode> source,
                        std::shared_ptr<BaseNode> target,
                        EdgeRelationshipType rel_type,
                        double strength = 1.0);

    // Path finding
    std::vector<KGPath> findAllPaths(const std::string& source_id,
                                   const std::string& target_id,
                                   int max_depth = 10,
                                   double min_strength = 0.1) const;

    KGPath findShortestPath(const std::string& source_id,
                           const std::string& target_id,
                           double min_strength = 0.1) const;

    // Graph analysis
    bool validateGraph() const;
    std::vector<std::string> findOrphanedNodes() const;
    std::vector<std::vector<std::string>> findConnectedComponents() const;
};

BaseNode Class

Key Features

class BaseNode {
public:
    // Node properties
    std::string getId() const;
    NodeType getType() const;
    std::string getDescription() const;
    std::time_t getTimestamp() const;

    // Attribute management
    void addAttribute(std::shared_ptr<KGStructuredAttribute> attr);
    std::shared_ptr<KGStructuredAttribute> getAttributeByName(const std::string& name) const;

    // Edge management
    void addIncomingEdgeId(const std::string& edgeId);
    void addOutgoingEdgeId(const std::string& edgeId);

    // Serialization
    std::string toYAML(int indent = 0) const;
    std::string toJSON(int indent = 0) const;
};

Project Structure

KGraph/
├── KGraph/                    # Core KGraph implementation
│   ├── KGraph.cpp            # Main graph engine implementation
│   └── KGraph.hpp            # KGraph class definition
├── SupportingTypes/          # Supporting type definitions
│   ├── KGBaseNode.cpp        # Base node implementation
│   ├── KGBaseNode.hpp        # Base node interface
│   ├── KGEdge.cpp            # Edge implementation
│   ├── KGEdge.hpp            # Edge definitions
│   ├── KGPath.cpp            # Path implementation
│   ├── KGPath.hpp            # Path definitions
│   ├── KGStructuredAttribute.cpp  # Structured attribute implementation
│   ├── KGStructuredAttribute.hpp  # Structured attribute interface
│   ├── GraphOrchestrator.cpp # Graph orchestration utilities
│   ├── GraphOrchestrator.hpp # Graph orchestration interface
│   ├── NodePipelineManager.cpp   # Node pipeline management
│   ├── NodePipelineManager.hpp   # Node pipeline interface
│   ├── EAttributeContentType.hpp # Attribute content type enums
│   ├── EAttributeType.hpp    # Attribute type enums
│   ├── EKGNodeType.hpp       # Node type enums
│   ├── EKGPathType.hpp       # Path type enums
│   ├── EKGRelation.hpp       # Relationship type enums
│   └── Struture.md           # Structure documentation
├── CMakeLists.txt            # Build configuration
└── README.md                 # This file

Node Types

The system supports various node types for different semantic contexts:

Node Type Description Use Case
PROJECT Project-level nodes Software projects, initiatives
TASK Task or activity nodes Individual tasks, activities
DOCUMENT Document nodes Text documents, specifications
CONCEPT Conceptual nodes Ideas, requirements, features
PERSON People or stakeholder nodes Team members, stakeholders
RESOURCE Resource nodes Tools, systems, artifacts

Relationship Types

Predefined relationship types for semantic linking:

Relationship Type Description Strength Range
CONTAINS Containment relationship 0.1 - 1.0
DEPENDS_ON Dependency relationship 0.1 - 1.0
RELATED_TO General relationship 0.1 - 0.8
IMPLEMENTS Implementation relationship 0.3 - 1.0
REFERENCES Reference relationship 0.1 - 0.7
HIERARCHICAL Parent-child relationship 0.2 - 1.0

Integration with ArchiNote

KGraph is designed to work seamlessly with ArchiNote for semantic document management:

// ArchiNote integration example
KGraph graph;
graph.initializeGraph("archinote.db");

// Ingest ArchiNote documents
graph.ingestFromArchiNotes(yamlDocumentData);

// Export for RAG systems
std::string ragData = graph.exportToRAG(documentNodeIds);

Performance Characteristics

Time Complexity

Operation Average Case Worst Case
Node Addition O(1) O(1)
Edge Addition O(1) O(1)
Neighbor Query O(degree) O(degree)
Shortest Path O((V+E)log V) O((V+E)log V)
Cycle Detection O(V+E) O(V+E)

Memory Usage

  • Node Storage: ~200-500 bytes per node (depending on attributes)
  • Edge Storage: ~100-200 bytes per edge
  • Path Storage: Variable based on path length and complexity

Advanced Features

Hierarchical RAG Integration

// Hierarchical document retrieval
std::string retrieveContext(const std::string& query) {
    // Find relevant document nodes
    auto relevantNodes = graph.retrieveForQuery(query);

    // Extract hierarchical context
    std::vector<std::string> nodeIds = {"doc1", "doc2", "doc3"};
    std::string subgraph = graph.getSubgraph(nodeIds);

    return subgraph;
}

Graph Validation and Analysis

// Comprehensive graph analysis
bool isValid = graph.validateGraph();

if (!isValid) {
    auto orphanedNodes = graph.findOrphanedNodes();
    auto components = graph.findConnectedComponents();

    // Handle disconnected components
    for (const auto& component : components) {
        if (component.size() < expectedSize) {
            // Reconnect or analyze isolated component
        }
    }
}

Development Guidelines

Code Style

  • Follow C++17 best practices
  • Use RAII for resource management
  • Implement thread-safe operations where needed
  • Document public APIs with examples

Testing Strategy

// Example test structure
TEST(KGraphTest, BasicGraphOperations) {
    KGraph graph;

    auto node1 = std::make_shared<BaseNode>("test1", NodeType::PROJECT);
    auto node2 = std::make_shared<BaseNode>("test2", NodeType::TASK);

    graph.addNode(node1);
    graph.addNode(node2);

    EXPECT_TRUE(graph.containsNode("test1"));
    EXPECT_EQ(graph.getNodeCount(), 2);
}

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add comprehensive tests
  4. Ensure thread safety where applicable
  5. Submit a pull request

Performance Tuning

Configuration Options

// Optimize for different use cases
graph.setMaxPathDepth(5);           // Limit search depth
graph.setMinRelationshipStrength(0.3); // Filter weak relationships
graph.enableCaching(true);          // Enable result caching

Memory Optimization

  • Use appropriate node and edge types
  • Implement connection pooling for large graphs
  • Consider graph partitioning for very large datasets

Future Enhancements

Planned Features

  • Distributed Graph Processing: Support for multi-node graph operations
  • Advanced Analytics: Graph metrics and visualization support
  • Machine Learning Integration: ML-enhanced relationship discovery
  • REST API: HTTP interface for web integration
  • Plugin System: Extensible node and relationship types

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For questions, issues, or contributions, please:

  1. Check existing documentation
  2. Review open issues on GitHub
  3. Create new issues for bugs or feature requests
  4. Submit pull requests for improvements

KGraph Version 1.0.0 - Hierarchical Knowledge Graph System for Semantic Processing

About

C++ Knowledge Graph Abstract Framework

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published