# fastkg

> Fast and efficient storage solutions for RDF graphs. A set of utilities for [RDFLib](https://github.com/RDFLib/rdflib) that use Parquet as storage, which is somewhat faster than Turtle or N-Triples serialization. The SQLite store extends this storage to use SQLite as a persistent graph store.

This library provides optimized storage solutions for RDFLib graphs, focusing on:

1. Parquet storage for efficient columnar compression
2. SQLite storage for portable, indexed graph databases

## Developer Guide

If you are new to `nbdev`, here are some useful pointers to get you started.

### Install fastkg in Development mode

```sh
# make sure the fastkg package is installed in development mode
$ pip install -e .

# make changes under nbs/ directory
# ...

# compile to have changes apply to fastkg
$ nbdev_prepare
```

## Usage

### Installation

Install the latest version from the GitHub [repository][repo]:

```sh
$ pip install git+https://github.com/la3d/fastkg.git
```


[repo]: https://github.com/la3d/fastkg
[docs]: https://la3d.github.io/fastkg/

## Quick Start

### Using parquet as a fast store

```python
from fastkg.core import KnowledgeGraph
import rdflib

# Create a knowledge graph
kg = KnowledgeGraph()

# Add some triples
ex = rdflib.Namespace("http://example.org/")
kg.bind_ns("ex", ex)
kg.add((ex.John, rdflib.RDF.type, ex.Person))
kg.add((ex.John, ex.name, rdflib.Literal("John Doe")))
kg.add((ex.John, ex.knows, ex.Jane))

print(f"Created graph with {len(kg)} triples")

# Save to Parquet file
kg.save_parquet("example.parquet")
print("Saved graph to Parquet file")

# Load from Parquet file
kg2 = KnowledgeGraph().load_parquet("example.parquet")
print(f"Loaded {len(kg2)} triples from Parquet file")

# Query the graph
results = list(kg2.query("""
    SELECT ?name WHERE {
        ?person a <http://example.org/Person> .
        ?person <http://example.org/name> ?name .
    }
"""))

for row in results:
    print(f"Found person: {row[0]}")
```

Output:

```
Created graph with 3 triples
Saved graph to Parquet file
Loaded 3 triples from Parquet file
Found person: "John Doe"
```
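
One reason a columnar format suits RDF data is that the subject, predicate, and object columns each tend to repeat a small set of values, which compresses well under dictionary encoding. The sketch below illustrates that idea with plain Python lists. It is not how fastkg or Parquet lay out data internally, and `dictionary_encode` is a hypothetical helper introduced only for illustration.

```python
# Illustrative only: plain strings stand in for RDF terms, and the
# encoding mimics the idea behind dictionary-encoded columns.
triples = [
    ("ex:John", "rdf:type", "ex:Person"),
    ("ex:John", "ex:name", '"John Doe"'),
    ("ex:John", "ex:knows", "ex:Jane"),
    ("ex:Jane", "rdf:type", "ex:Person"),
]


def dictionary_encode(column):
    """Return (distinct values, integer codes) for one column (hypothetical helper)."""
    values = []  # distinct values in first-seen order
    index = {}   # value -> position in `values`
    codes = []   # one small integer per row
    for item in column:
        if item not in index:
            index[item] = len(values)
            values.append(item)
        codes.append(index[item])
    return values, codes


# Pivot the row-oriented triples into three columns.
subjects, predicates, objects = (list(col) for col in zip(*triples))

subject_values, subject_codes = dictionary_encode(subjects)
predicate_values, predicate_codes = dictionary_encode(predicates)

print(subject_values, subject_codes)
print(predicate_values, predicate_codes)
```

Each repeated IRI is stored once and referenced by a small integer, which is why graphs with many triples about the same entities or predicates can shrink considerably in a columnar file.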


### Using SQLite as a simple triple store

```python
from fastkg.core import KnowledgeGraph
from fastkg.sqlite import *
import rdflib

# Create a knowledge graph and connect to SQLite
kg = KnowledgeGraph()
kg.connect_sqlite("example.db", create=True)

# Add some triples directly to the SQLite-backed graph
ex = rdflib.Namespace("http://example.org/")
kg.bind_ns("ex", ex)
kg.add((ex.John, rdflib.RDF.type, ex.Person))
kg.add((ex.John, ex.name, rdflib.Literal("John Doe")))
kg.add((ex.John, ex.knows, ex.Jane))

print(f"Added {len(kg)} triples to the database")

# Close the connection when done
kg.close()

# Load from SQLite
kg2 = KnowledgeGraph()
kg2.connect_sqlite("example.db", create=False)

print(f"Loaded {len(kg2)} triples from the database")

# Query the graph
results = list(kg2.query("""
    SELECT ?name WHERE {
        ?person a <http://example.org/Person> .
        ?person <http://example.org/name> ?name .
    }
"""))

for row in results:
    print(f"Found person: {row[0]}")

# Don't forget to close the connection
kg2.close()
```

Output:

```
Added 3 triples to the database
Loaded 3 triples from the database
Found person: John Doe
```
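
For intuition about what an indexed SQLite triple store involves, here is a minimal sketch using only the standard-library `sqlite3` module. The table name, column names, and indexes are assumptions chosen for illustration. They are not fastkg's actual schema.

```python
import sqlite3

# Hypothetical schema: one row per triple, terms stored as text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE triples ("
    " s TEXT NOT NULL, p TEXT NOT NULL, o TEXT NOT NULL,"
    " PRIMARY KEY (s, p, o))"
)
# Secondary indexes so lookups by predicate or by object avoid full scans.
conn.execute("CREATE INDEX idx_pos ON triples (p, o, s)")
conn.execute("CREATE INDEX idx_osp ON triples (o, s, p)")

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"
rows = [
    (EX + "John", RDF_TYPE, EX + "Person"),
    (EX + "John", EX + "name", "John Doe"),
    (EX + "John", EX + "knows", EX + "Jane"),
    (EX + "John", EX + "knows", EX + "Jane"),  # duplicate, ignored on insert
]
# The primary key gives set semantics: a triple is stored at most once.
conn.executemany("INSERT OR IGNORE INTO triples VALUES (?, ?, ?)", rows)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM triples").fetchone()[0]

# SQL equivalent of the triple pattern (?person, ex:name, ?name).
names = conn.execute(
    "SELECT s, o FROM triples WHERE p = ?", (EX + "name",)
).fetchall()

print(count, names)
conn.close()
```

A real store would also need to record whether each object is an IRI, a blank node, or a literal (with datatype and language tag), which this sketch leaves out for brevity.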


## Use Cases for RAG Systems

This library is particularly useful for LLM-based Retrieval-Augmented Generation (RAG) systems:

- **Agent Memory**: Store structured knowledge that persists between sessions
- **Knowledge Graphs**: Maintain entity relationships for complex reasoning
- **Efficient Retrieval**: Query relevant subgraphs to include in LLM context windows
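
As a rough illustration of the retrieval use case, the sketch below collects the triples within a given number of hops of an entity and renders them as lines of text suitable for a prompt. It uses plain tuples instead of RDFLib terms to stay self-contained. `neighborhood` and `to_context` are hypothetical helpers, not part of fastkg.

```python
def neighborhood(triples, start, hops=1):
    """Collect triples reachable from `start` by following up to `hops` outgoing edges."""
    frontier = {start}
    seen_nodes = {start}
    selected = []
    for _ in range(hops):
        next_frontier = set()
        for s, p, o in triples:
            if s in frontier and (s, p, o) not in selected:
                selected.append((s, p, o))
                if o not in seen_nodes:
                    seen_nodes.add(o)
                    next_frontier.add(o)
        frontier = next_frontier
    return selected


def to_context(triples):
    """Render triples as one statement per line for inclusion in a prompt."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Plain strings stand in for RDF terms in this sketch.
graph = [
    ("ex:John", "rdf:type", "ex:Person"),
    ("ex:John", "ex:name", '"John Doe"'),
    ("ex:John", "ex:knows", "ex:Jane"),
    ("ex:Jane", "ex:name", '"Jane Roe"'),
    ("ex:Bob", "ex:name", '"Bob"'),
]

one_hop = neighborhood(graph, "ex:John", hops=1)
two_hop = neighborhood(graph, "ex:John", hops=2)
context = to_context(two_hop)
print(context)
```

Bounding the number of hops keeps the retrieved subgraph small enough to fit in a context window. With an RDFLib-backed graph, the same selection could be expressed as a SPARQL query instead.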

## Core Features

The library includes:

1. `KnowledgeGraph` class - A wrapper around RDFLib's Graph with additional storage capabilities
2. Parquet storage - Fast columnar storage for large graphs
3. SQLite storage - Indexed, portable database storage
4. Helper methods for common graph operations

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Documentation is hosted on this GitHub [repository][repo]'s [pages][docs]. Package-manager-specific guidelines are available on [conda][conda] and [pypi][pypi].
