Lettria/perseus-client

Perseus Text-to-Graph

License: MIT Documentation

In today's world, a vast amount of valuable information is locked away in unstructured text: documents, articles, emails, and more. While AI and analytics tools are incredibly powerful, they struggle to make sense of this chaotic data. They need structured, connected information to reason effectively.

This is where the gap lies:

| What Organizations Have | What AI Systems Need |
|---|---|
| 📄 Unstructured Text | 🔗 Connected Knowledge |
| Chaotic, disconnected data | Structured, queryable graphs |
| Implicit relationships | Explicit entities and relations |
| Hard to query and analyze | Ready for deep analysis |

Without a way to bridge this gap, AI systems can't unlock the full potential of your data. They might miss critical insights, provide incomplete answers, or fail to see the bigger picture.

Lettria's Perseus service is designed to solve this problem. It transforms your raw text into a structured knowledge graph that AI applications can use directly, from advanced search to complex reasoning. The SDK also lets you supply your own ontology to define the target data schema, which reduces data complexity and keeps the generated knowledge graph tailored to your use case.

🌟 Features

  • Asynchronous Client: High-performance, non-blocking API calls using asyncio and aiohttp.
  • Simple Interface: Easy-to-use methods for file operations, ontology management, and graph building.
  • Data Validation: Robust data modeling with pydantic.
  • Neo4j Integration: Directly save your graph data to a Neo4j instance.
  • FalkorDB Integration: Directly save your graph data to a FalkorDB instance.
  • Flexible Configuration: Configure via environment variables or directly in code.

📦 Installation

# For both Neo4j and FalkorDB support
pip install "perseus-client[all]==1.0.0-rc.16"

# For Neo4j support
pip install "perseus-client[neo4j]==1.0.0-rc.16"

# For FalkorDB support
pip install "perseus-client[falkordb]==1.0.0-rc.16"

🚀 Quick Start

To start using the SDK, you'll need an API key from Lettria, which you can create in the Lettria app.

Configuration

The SDK can be configured via environment variables. The PerseusClient will automatically load them. You can place them in a .env file in your project root.

| Variable | Description | Required |
|---|---|---|
| PERSEUS_API_KEY | Your unique API key for the Lettria API. | Yes |
| LOGLEVEL | The log level for the client. | No |
| NEO4J_URI | The URI for your Neo4j database instance. | No |
| NEO4J_USER | The username for your Neo4j database. | No |
| NEO4J_PASSWORD | The password for your Neo4j database. | No |
| FALKORDB_HOST | The host for your FalkorDB instance. | No |
| FALKORDB_PORT | The port for your FalkorDB (default 6379). | No |
| FALKORDB_GRAPH_NAME | The name of the graph key to use. | No |
| FALKORDB_PASSWORD | The password for your FalkorDB instance. | No |

By default, the log level is set to INFO. You can change it by setting the LOGLEVEL environment variable to DEBUG, WARNING, ERROR, or CRITICAL.
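
Because a missing variable typically only surfaces once a request or database write fails, it can help to check the environment up front. The following is a minimal sketch that does not call the SDK: the helper `missing_settings` is hypothetical, and treating all three Neo4j variables as mandatory when `use_neo4j=True` is an assumption made for illustration (the table above marks them as optional).

```python
import os
from typing import List, Mapping

# Names come from the configuration table; only the API key is marked required.
REQUIRED_VARS = ("PERSEUS_API_KEY",)
# Assumption: you want all three set explicitly before saving to Neo4j.
NEO4J_VARS = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")


def missing_settings(env: Mapping[str, str], use_neo4j: bool = False) -> List[str]:
    """Return the names of expected variables that are unset or empty."""
    names = REQUIRED_VARS + (NEO4J_VARS if use_neo4j else ())
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_settings(os.environ, use_neo4j=True)
    if missing:
        print("Missing settings: " + ", ".join(missing))
```

Passing the environment in as a mapping keeps the check easy to exercise with a plain dictionary.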

Example: Build a Graph

This example shows how to build a graph from a text file.

import perseus_client

knowledge_graphs = perseus_client.build_graph(
    file_paths=["path/to/your/document.txt"],
)
for graph in knowledge_graphs:
    print(f"🎉 Graph built successfully with {len(graph.entities)} entities and {len(graph.relations)} relations!")

The KnowledgeGraph Object

The build_graph function and the build_graph_async method each return a list of KnowledgeGraph objects. Each KnowledgeGraph holds the structured data of one graph.

Properties

| Property | Type | Description |
|---|---|---|
| entities | List[Entity] | A list of nodes (entities) in the graph. |
| relations | List[Relation] | A list of relationships (facts) connecting the entities. |
| documents | List[Document] | A list of source documents used to generate the graph. |
| ttl_content | Optional[str] | The raw TTL content of the graph. |
| cql_content | Optional[str] | The raw Cypher Query Language content of the graph. |

Methods

The KnowledgeGraph object also has several built-in methods to save or convert the data to different formats and databases.

| Method | Return Type | Description |
|---|---|---|
| save_ttl(file_path: str) | None | Saves the graph to a TTL file. |
| to_ttl() | str | Returns the graph as a TTL string. |
| save_cql(file_path: str, strip_prefixes: bool = True) | None | Saves the graph to a CQL file. |
| to_cql(strip_prefixes: bool = True) | str | Returns the graph as a CQL string. |
| save_to_neo4j(strip_prefixes: bool = True) | None | Saves the graph to a Neo4j instance synchronously. |
| save_to_neo4j_async(strip_prefixes: bool = True) | None | Saves the graph to a Neo4j instance asynchronously. |
| save_to_falkordb(strip_prefixes: bool = True) | None | Saves the graph to a FalkorDB instance synchronously. |
| save_to_falkordb_async(strip_prefixes: bool = True) | None | Saves the graph to a FalkorDB instance asynchronously. |
| to_json() | dict | Converts the knowledge graph to a JSON serializable dictionary. |
| interlink(kbs: List[KnowledgeGraph], ...) | KnowledgeGraph | Merges multiple KnowledgeGraph objects into a single one. |
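
As an illustration of how the export methods might be combined, here is a small sketch that writes both a TTL file and a JSON file for one graph. The helper `export_graph` and its `stem` parameter are hypothetical. It relies only on `save_ttl(file_path)` and `to_json()` as listed above and accepts any object exposing those two methods.

```python
import json
from pathlib import Path
from typing import Any


def export_graph(graph: Any, out_dir: str, stem: str = "graph") -> Path:
    """Write TTL and JSON copies of a graph and return the output directory."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    # save_ttl(file_path) writes the TTL file itself.
    graph.save_ttl(str(target / f"{stem}.ttl"))
    # to_json() is documented as returning a JSON serializable dictionary.
    with open(target / f"{stem}.json", "w", encoding="utf-8") as handle:
        json.dump(graph.to_json(), handle, indent=2)
    return target
```

With a graph returned by `build_graph`, a call could look like `export_graph(graph, "out", stem="document1")`.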

Merging KnowledgeGraphs

You can merge multiple KnowledgeGraph objects using the perseus_client.interlink function.

import perseus_client

try:
    # Build two graphs in a single call
    knowledge_graphs = perseus_client.build_graph(
        file_paths=["path/to/document1.txt", "path/to/document2.txt"]
    )

    # Interlink them
    if len(knowledge_graphs) >= 2:
        merged_graph = perseus_client.interlink(kbs=knowledge_graphs)
        print(f"🎉 Graphs merged successfully with {len(merged_graph.entities)} entities and {len(merged_graph.relations)} relations!")

except Exception as e:
    print(f"An error occurred: {e}")

⚡ Advanced Usage: Asynchronous Client

For long-running asynchronous applications, or when you need explicit control over the client's lifecycle, you can use PerseusClient as an asynchronous context manager. The client's connections then stay open for the duration of the async with block and are closed when the block exits.

import asyncio
from typing import List
from perseus_client import PerseusClient
from perseus_client.models import KnowledgeGraph

async def main():
    async with PerseusClient() as client:
        try:
            graphs: List[KnowledgeGraph] = await client.build.build_graph_async(
                file_paths=["path/to/your/async_document.txt"],
                ontology_path="path/to/your/ontology.ttl",
            )
            for graph in graphs:
                print(f"⚡ Async Graph built successfully with {len(graph.entities)} entities and {len(graph.relations)} relations!")
        except Exception as e:
            print(f"An async error occurred: {e}")

if __name__ == "__main__":
    asyncio.run(main())
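
One reason to reach for the asynchronous client is to run several builds at once. The sketch below shows one possible way to do that with `asyncio.gather`. The helper `build_batches` is hypothetical, it uses only the `client.build.build_graph_async(file_paths=...)` call shown above, and it assumes the service accepts concurrent jobs for a single API key, which this README does not confirm.

```python
import asyncio
from typing import Any, List, Sequence


async def build_batches(client: Any, batches: Sequence[Sequence[str]]) -> List[Any]:
    """Run one build_graph_async call per batch concurrently and flatten the results."""
    calls = [
        client.build.build_graph_async(file_paths=list(batch))
        for batch in batches
    ]
    # gather returns results in the same order as the calls were listed.
    per_batch = await asyncio.gather(*calls)
    return [graph for graphs in per_batch for graph in graphs]
```

Inside an `async with PerseusClient() as client:` block, this could be called as `await build_batches(client, [["a.txt"], ["b.txt", "c.txt"]])`.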

📚 API Reference

client.build_graph

def build_graph(
    file_paths: List[str],
    ontology_path: Optional[str] = None,
    refresh_graph: bool = False,
    metadata: Optional[Dict[str, Any]] = None,
) -> List[KnowledgeGraph]:

Uploads one or more files, optionally with an ontology, runs the corresponding jobs, and returns the resulting KnowledgeGraph objects. The call is synchronous: it blocks until the graphs are available.

| Parameter | Type | Description | Default |
|---|---|---|---|
| file_paths | List[str] | A list of file paths to process. | |
| ontology_path | Optional[str] | The path to the ontology file to use. | None |
| refresh_graph | bool | Whether to force a new job to be created (refresh the graph). | False |
| metadata | Optional[Dict[str, Any]] | A dictionary of metadata to add to all nodes and relationships. | None |
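
To show how the optional parameters fit together, here is a sketch of a thin wrapper that attaches provenance metadata to every build. The wrapper `build_with_provenance` and the metadata keys `source` and `ingested_on` are hypothetical; only the four keyword arguments come from the signature above. The build function is passed in as an argument so the sketch stays runnable on its own. In real use you would pass `perseus_client.build_graph`.

```python
from datetime import date
from typing import Any, Callable, Dict, List, Optional


def build_with_provenance(
    build_graph: Callable[..., List[Any]],
    file_paths: List[str],
    source: str,
    ontology_path: Optional[str] = None,
    force: bool = False,
) -> List[Any]:
    """Call build_graph with provenance metadata for all nodes and relationships."""
    # Hypothetical metadata keys, chosen for illustration only.
    metadata: Dict[str, Any] = {
        "source": source,
        "ingested_on": date.today().isoformat(),
    }
    return build_graph(
        file_paths=file_paths,
        ontology_path=ontology_path,
        refresh_graph=force,  # True forces a new job instead of reusing one
        metadata=metadata,
    )
```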

KnowledgeGraph.interlink

@staticmethod
def interlink(
    kbs: List["KnowledgeGraph"],
    interlinking_key_uris: List[str] = ["http://www.w3.org/2000/01/rdf-schema#label"],
    immutable_properties: Optional[List[str]] = None,
    merge_properties_on_conflict: bool = False,
) -> "KnowledgeGraph":

Merges multiple KnowledgeGraph objects into a single one based on a linking key. Entities are deduplicated and their properties are combined. By default, entities of different types will not be merged, even if they share the same linking key. If immutable_properties are specified, entities with conflicting values for these properties will also not be merged, resulting in separate entities in the final graph.

| Parameter | Type | Description | Default |
|---|---|---|---|
| kbs | List[KnowledgeGraph] | A list of KnowledgeGraph objects to merge. | |
| interlinking_key_uris | List[str] | The URIs of the properties used to link entities (e.g., rdfs:label). | ["http://www.w3.org/2000/01/rdf-schema#label"] |
| immutable_properties | Optional[List[str]] | A list of property URIs (e.g., "http://purl.org/dc/elements/1.1/title" or "hasJobTitle"). Entities whose values for any of these properties conflict are not merged and are kept as separate entities. | None |
| merge_properties_on_conflict | bool | If True, merges properties when a conflict occurs. Otherwise, keeps the first one. | False |
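
The merge rules described above can be illustrated with a toy model. This is not the library's implementation: the function `toy_interlink` and the dictionary-based entity shape are assumptions made for the sketch, and it handles a single linking key with the "keep the first value" conflict behaviour only.

```python
from typing import Any, Dict, List, Optional

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
Entity = Dict[str, Any]  # toy shape: {"type": str, "props": {uri: value}}


def toy_interlink(
    entities: List[Entity],
    key_uri: str = RDFS_LABEL,
    immutable_properties: Optional[List[str]] = None,
) -> List[Entity]:
    """Merge entities sharing a type and key value unless an immutable property conflicts."""
    immutable = immutable_properties or []
    merged: List[Entity] = []
    for entity in entities:
        for kept in merged:
            same_key = (
                kept["type"] == entity["type"]
                and key_uri in entity["props"]
                and kept["props"].get(key_uri) == entity["props"][key_uri]
            )
            conflict = any(
                p in kept["props"] and p in entity["props"]
                and kept["props"][p] != entity["props"][p]
                for p in immutable
            )
            if same_key and not conflict:
                # Combine properties; on a clash the first value seen wins.
                for uri, value in entity["props"].items():
                    kept["props"].setdefault(uri, value)
                break
        else:
            # No compatible entity found: keep this one as a separate entity.
            merged.append({"type": entity["type"], "props": dict(entity["props"])})
    return merged
```

In this model, two "Person" entities with the same label collapse into one, a "Company" with that label stays separate, and listing "hasJobTitle" in `immutable_properties` keeps two people with different job titles apart.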

📂 Examples

For more detailed examples, check out the examples/ directory. Each example has its own README with instructions.

Simple Examples

Advanced Examples

  • Finance Compliance: A complete pipeline to convert unstructured sustainability disclosures into a knowledge graph and produce CSRD-compliant reports.
  • Graph RAG Reporting FalkorDB: A complete workflow to turn a PDF into a knowledge graph and generate a report. Graph is saved in FalkorDB.
  • Graph RAG Reporting Neo4j: A complete workflow to turn a PDF into a knowledge graph and generate a report. Graph is saved in Neo4j.

🤝 Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request.

📧 Contact

For support or questions, please reach out at hello@lettria.com.

📄 License

This SDK is licensed under the MIT License.
