# Graph Store Module

## Overview

The Graph Store module provides a unified interface for working with property graph databases. It supports multiple backends (Neo4j, FalkorDB) and offers comprehensive features for storing, querying, and analyzing graph data.

### Key Features

- **Multi-Backend Support**: Neo4j (Enterprise), FalkorDB (Redis-based)
- **Full CRUD Operations**: Create, read, update, delete nodes and relationships
- **Cypher Query Language**: Execute complex graph queries with OpenCypher support
- **Graph Analytics**: Built-in algorithms for centrality, community detection, path finding
- **Batch Operations**: Optimized bulk data loading with progress tracking
- **Transaction Support**: ACID transactions with rollback capabilities
- **Index Management**: Create and manage indexes for performance optimization
- **Convenience Functions**: Simple function-based API for common operations

### Learning Objectives

By the end of this notebook, you will be able to:

1. Initialize and configure GraphStore with different backends
2. Perform CRUD operations on nodes and relationships
3. Execute Cypher queries for complex graph operations
4. Use graph analytics algorithms (shortest path, neighbors, centrality)
5. Update and delete graph data
6. Use batch operations for efficient data loading
7. Work with convenience functions and configuration management
8. Choose the right backend for your use case

---

## Installation

### Core Installation

```bash
# Install Semantica
pip install semantica

# Or install with all optional dependencies
pip install semantica[all]
```

### Backend-Specific Dependencies

```bash
# For Neo4j (requires Neo4j server)
pip install neo4j

# For FalkorDB (requires Redis/FalkorDB server)
pip install falkordb
```

### Docker Setup (Optional)

For FalkorDB, you can run it in Docker:

```bash
docker run -p 6379:6379 -p 3000:3000 -it --rm \
  -v ./data:/var/lib/falkordb/data \
  falkordb/falkordb
```

---

## Backend Comparison

| Backend | Best For | Deployment | Features |
|---------|----------|------------|----------|
| **Neo4j** | Enterprise applications, production systems | Server/Cloud | Full Cypher, APOC procedures, multi-database |
| **FalkorDB** | LLM applications, real-time systems, high performance | Redis-based | Ultra-fast, sparse matrix operations |

**Recommendation**: Use **Neo4j** for enterprise production systems or **FalkorDB** for high-performance real-time applications.


In [1]:
!pip install semantica






## Step 1: Initialize Graph Store

Initialize a `GraphStore` instance with your preferred backend. For this tutorial, we'll use **Neo4j** (requires a running server).


In [None]:
from semantica.graph_store import GraphStore

# Neo4j AuraDB Connection Details
# Replace these values with your actual AuraDB credentials
store = GraphStore(
    backend="neo4j",
    uri="Your URI",  # Your AuraDB Instance URI
    user="neo4j",
    password="Your Password"  # Please enter your password here
)

# Connect to the database
store.connect()
print("Connected to graph database successfully!")

Connected to graph database successfully!


## Step 2: Node Operations

### Creating Nodes

Nodes represent entities in your graph. Each node can have:
- **Labels**: Categories/types (e.g., `Person`, `Company`, `Location`)
- **Properties**: Key-value pairs (e.g., `{"name": "Alice", "age": 30}`)


In [7]:
# Create individual nodes with labels and properties
apple = store.create_node(
    labels=["Company"],
    properties={"name": "Apple Inc.", "founded": 1976, "industry": "Technology"}
)
print(f"Created company node: {apple.get('properties', {}).get('name')} (ID: {apple.get('id')})")

tim_cook = store.create_node(
    labels=["Person"],
    properties={"name": "Tim Cook", "title": "CEO", "age": 63}
)
print(f"Created person node: {tim_cook.get('properties', {}).get('name')} (ID: {tim_cook.get('id')})")

cupertino = store.create_node(
    labels=["Location"],
    properties={"name": "Cupertino", "state": "California", "country": "USA"}
)
print(f"Created location node: {cupertino.get('properties', {}).get('name')} (ID: {cupertino.get('id')})")


Status,Action,Module,Submodule,File,Time
‚úÖ,Semantica is processing,‚è≥ graph_store,Neo4jStore,-,0.14s




Created company node: Apple Inc. (ID: 0)




Created person node: Tim Cook (ID: 1)




Created location node: Cupertino (ID: 2)


In [8]:
# Create multiple nodes in batch (more efficient for large datasets)
other_companies = store.create_nodes([
    {"labels": ["Company"], "properties": {"name": "Microsoft", "founded": 1975}},
    {"labels": ["Company"], "properties": {"name": "Google", "founded": 1998}},
    {"labels": ["Company"], "properties": {"name": "Amazon", "founded": 1994}},
])
print(f"Created {len(other_companies)} company nodes in batch")




Created 3 company nodes in batch


## Step 3: Relationship Operations

### Creating Relationships

Relationships connect nodes and represent connections between entities. Each relationship has:
- **Type**: The relationship type (e.g., `CEO_OF`, `LOCATED_IN`, `KNOWS`)
- **Properties**: Key-value pairs (e.g., `{"since": 2011}`)
- **Direction**: From `start_node_id` to `end_node_id`


In [9]:
# Create relationships between nodes
ceo_rel = store.create_relationship(
    start_node_id=tim_cook["id"],
    end_node_id=apple["id"],
    rel_type="CEO_OF",
    properties={"since": 2011}
)
print(f"Created relationship: {ceo_rel.get('type')} (ID: {ceo_rel.get('id')})")

location_rel = store.create_relationship(
    start_node_id=apple["id"],
    end_node_id=cupertino["id"],
    rel_type="HEADQUARTERED_IN",
    properties={"since": 1977}
)
print(f"Created relationship: {location_rel.get('type')} (ID: {location_rel.get('id')})")




Created relationship: CEO_OF (ID: 1152921504606846977)




Created relationship: HEADQUARTERED_IN (ID: 1152922604118474752)


## Step 4: Querying Nodes and Relationships

### Retrieving Nodes

You can query nodes by labels, properties, or node IDs.


In [10]:
# Get nodes by label
companies = store.get_nodes(labels=["Company"], limit=10)
print(f"Found {len(companies)} companies:")
for company in companies:
    name = company.get('properties', {}).get('name', 'Unknown')
    founded = company.get('properties', {}).get('founded', 'N/A')
    print(f"  - {name} (founded: {founded})")

# Get a specific node by ID
if apple.get('id'):
    node = store.get_node(node_id=apple["id"])
    print(f"\nRetrieved node by ID: {node.get('properties', {}).get('name')}")




Found 4 companies:
  - Apple Inc. (founded: 1976)
  - Microsoft (founded: 1975)
  - Google (founded: 1998)
  - Amazon (founded: 1994)


In [11]:
# Get relationships for a node
relationships = store.get_relationships(node_id=apple["id"], direction="both")
print(f"Found {len(relationships)} relationships for Apple:")
for rel in relationships:
    rel_type = rel.get('type', 'Unknown')
    props = rel.get('properties', {})
    print(f"  - {rel_type}: {props}")

# Get relationships by type and direction
if tim_cook.get('id'):
    outgoing = store.get_relationships(
        node_id=tim_cook["id"],
        rel_type="CEO_OF",
        direction="out"
    )
    print(f"\nOutgoing CEO_OF relationships: {len(outgoing)}")




Found 2 relationships for Apple:
  - HEADQUARTERED_IN: {'since': 1977}
  - CEO_OF: {'since': 2011}

Outgoing CEO_OF relationships: 1


## Step 5: Cypher Query Execution

### Executing Cypher Queries

Cypher is a powerful graph query language that allows you to express complex graph patterns and operations. The Graph Store module supports **OpenCypher** syntax across all backends.


In [12]:
# Execute a Cypher query to find CEO relationships
results = store.execute_query("""
    MATCH (p:Person)-[r:CEO_OF]->(c:Company)
    RETURN p.name as person, c.name as company, r.since as since
""")

print("CEO Relationships:")
for record in results.get("records", []):
    person = record.get('person', 'Unknown')
    company = record.get('company', 'Unknown')
    since = record.get('since', 'N/A')
    print(f"  - {person} is CEO of {company} since {since}")


CEO Relationships:
  - Tim Cook is CEO of Apple Inc. since 2011


In [13]:
# Parameterized query (safer and more efficient)
results = store.execute_query(
    "MATCH (c:Company) WHERE c.founded > $year RETURN c.name, c.founded ORDER BY c.founded",
    parameters={"year": 1990}
)

print("Companies founded after 1990:")
for record in results.get("records", []):
    name = record.get('c.name', 'Unknown')
    founded = record.get('c.founded', 'N/A')
    print(f"  - {name} (founded: {founded})")


Companies founded after 1990:
  - Amazon (founded: 1994)
  - Google (founded: 1998)


## Step 6: Graph Analytics

### Built-in Analytics Algorithms

The Graph Store module provides several graph analytics algorithms for analyzing your graph structure.


In [14]:
# Get neighbors of a node (traverse the graph)
if apple.get('id'):
    neighbors = store.get_neighbors(
        node_id=apple["id"],
        direction="both",
        depth=2
    )
    
    print(f"Found {len(neighbors)} neighbors (up to depth 2) for Apple:")
    for neighbor in neighbors:
        name = neighbor.get('properties', {}).get('name', 'Unknown')
        labels = neighbor.get('labels', [])
        print(f"  - {name} ({', '.join(labels)})")


In [15]:
# Find shortest path between two nodes
if tim_cook.get('id') and cupertino.get('id'):
    path = store.shortest_path(
        start_node_id=tim_cook["id"],
        end_node_id=cupertino["id"],
        max_depth=5
    )
    
    if path:
        print(f"Shortest path found:")
        print(f"  - Path length: {path.get('length')}")
        print(f"  - Nodes in path: {len(path.get('nodes', []))}")
        print(f"  - Relationships: {len(path.get('relationships', []))}")
    else:
        print("No path found between the nodes")




Shortest path found:
  - Path length: 2
  - Nodes in path: 3
  - Relationships: 2


## Step 7: Update and Delete Operations

### Updating Nodes

You can update node properties using the `update_node` method.


In [16]:
# Update node properties (merge mode - adds/updates properties)
if tim_cook.get('id'):
    updated = store.update_node(
        node_id=tim_cook["id"],
        properties={"age": 64, "title": "CEO & President"},
        merge=True  # Merge with existing properties
    )
    print(f"Updated node: {updated.get('properties', {}).get('name')}")
    print(f"  New age: {updated.get('properties', {}).get('age')}")
    print(f"  New title: {updated.get('properties', {}).get('title')}")

# Example: Replace all properties (merge=False)
# updated = store.update_node(
#     node_id=node_id,
#     properties={"name": "New Name"},
#     merge=False  # Replace all properties
# )




Updated node: Tim Cook
  New age: 64
  New title: CEO & President


## Step 8: Delete Operations

### Deleting Nodes and Relationships

You can delete nodes and relationships when needed.


In [17]:
# Delete a relationship
if location_rel.get('id'):
    deleted = store.delete_relationship(rel_id=location_rel["id"])
    if deleted:
        print(f"Deleted relationship (ID: {location_rel['id']})")

# Delete a node (with detach=True to also delete its relationships)
# WARNING: This will delete the node and all its relationships
# Uncomment to test:
# if cupertino.get('id'):
#     deleted = store.delete_node(node_id=cupertino["id"], detach=True)
#     if deleted:
#         print(f"Deleted node: {cupertino.get('properties', {}).get('name')}")

print("\nTip: Use detach=True to delete a node and all its relationships")
print("     Use detach=False to only delete the node (fails if relationships exist)")




Deleted relationship (ID: 1152922604118474752)

Tip: Use detach=True to delete a node and all its relationships
     Use detach=False to only delete the node (fails if relationships exist)


## Step 9: Graph Statistics

Get comprehensive statistics about your graph.


In [18]:
# Get comprehensive graph statistics
stats = store.get_stats()

print("Graph Statistics:")
print(f"  Total nodes: {stats.get('node_count', 'N/A')}")
print(f"  Total relationships: {stats.get('relationship_count', 'N/A')}")
print(f"\nNode labels:")
for label, count in stats.get('label_counts', {}).items():
    print(f"  - {label}: {count} nodes")
print(f"\nRelationship types:")
for rel_type, count in stats.get('relationship_type_counts', {}).items():
    print(f"  - {rel_type}: {count} relationships")


Graph Statistics:
  Total nodes: 6
  Total relationships: 1

Node labels:
  - Company: 4 nodes
  - Person: 1 nodes
  - Location: 1 nodes

Relationship types:
  - CEO_OF: 1 relationships


## Step 10: Convenience Functions

The Graph Store module provides convenience functions for simpler, function-based operations.


In [19]:
# Using convenience functions (alternative to class methods)
from semantica.graph_store import (
    create_node,
    create_relationship,
    get_nodes,
    execute_query,
    shortest_path
)

# These functions work with a default store instance
# For this example, we'll continue using the store instance we created

# Example: Using convenience functions
# node = create_node(
#     labels=["Person"],
#     properties={"name": "Alice", "age": 30}
# )

print("Convenience functions available:")
print("  - create_node, create_nodes")
print("  - create_relationship, create_relationships")
print("  - get_nodes, get_relationships")
print("  - update_node, delete_node")
print("  - execute_query, shortest_path, get_neighbors")
print("  - run_analytics")


Convenience functions available:
  - create_node, create_nodes
  - create_relationship, create_relationships
  - get_nodes, get_relationships
  - update_node, delete_node
  - execute_query, shortest_path, get_neighbors
  - run_analytics


## Step 11: Index Management

Create indexes to improve query performance, especially for large graphs.


In [20]:
# Create an index on a node property for faster lookups
# This is especially useful for frequently queried properties

index_created = store.create_index(
    label="Company",
    property_name="name",
    index_type="btree"  # Default index type
)

if index_created:
    print("Created index on Company.name for faster queries")
else:
    print("Index may already exist or not be supported by this backend")

# Note: Index creation support varies by backend
# Neo4j: Full support for various index types
# FalkorDB: Limited index support


Created index on Company.name for faster queries


## Step 12: Clean Up

Always close the connection when you're done to free up resources.


In [21]:
# Close the connection
store.close()
print("Connection closed successfully")


Connection closed successfully


## Summary

This notebook covered the Graph Store module, a unified interface for property graph databases supporting Neo4j and FalkorDB.

### What You Learned

- **CRUD Operations**: Create, read, update, and delete nodes and relationships
- **Cypher Queries**: Execute complex graph queries with OpenCypher syntax
- **Graph Analytics**: Shortest path, neighbor traversal, and centrality algorithms
- **Batch Operations**: Efficient bulk data loading for large datasets
- **Index Management**: Performance optimization through indexing

### Key Takeaways

- **Backend Selection**: Use Neo4j for production, FalkorDB for high-performance applications
- **Best Practices**: Use batch operations, parameterized queries, and proper connection management
- **Next Steps**: Explore advanced analytics, graph quality, and visualization modules
