# GraphVerse Demonstration

This notebook demonstrates the GraphVerse library, which combines random graph generation with a small language model (referred to here as the LLM) to generate and analyze walk sequences on graphs.

In [None]:
# -y skips the confirmation prompt, which would otherwise block the notebook cell
%pip uninstall -y torch
%pip uninstall -y torchvision

In [None]:
%pip install torch==1.5.0+cpu torchvision==0.6.0+cpu -f

In [None]:
%reset -f

In [None]:
# Install the GraphVerse package (uncomment if not already installed)
# !pip install GraphVerse

In [None]:
import random

import matplotlib.pyplot as plt
import networkx as nx
import torch

import graphverse

# Set random seeds for reproducibility
random.seed(42)
torch.manual_seed(42)

## 1. Generate a Graph

In [None]:
n = 1000  # Number of vertices
G = graphverse.graph_generator.graph_generation.generate_interesting_graph(n)
graphverse.graph_generator.vertex_designation.designate_special_vertices(G)

print(f"Graph generated with {G.number_of_nodes()} nodes and {G.number_of_edges()} edges")

# Visualize a small subgraph
subgraph_nodes = list(G.nodes())[:20]  # First 20 nodes
subgraph = G.subgraph(subgraph_nodes)
pos = nx.spring_layout(subgraph)
nx.draw(subgraph, pos, with_labels=True, node_color='lightblue', node_size=500, font_size=10)
plt.title("Subgraph of the Generated Graph")
plt.show()

## 2. Generate Random Walks

In [None]:
num_walks = 10000
walks = []
violating_walks = []

for i in range(num_walks):
    start = random.randint(0, n-1)  # assumes vertices are labeled 0..n-1
    walk, rule_violated, violation_details = graphverse.graph_generator.random_walks.random_walk(G, start)
    walks.append(walk)
    if rule_violated:
        violating_walks.append((i, walk, violation_details))

print(f"Generated {len(walks)} walks")
print(f"Number of walks violating rules: {len(violating_walks)}")

# Display a few sample walks
print("\nSample walks:")
for i in range(5):
    print(f"Walk {i}: {walks[i][:10]}... (length: {len(walks[i])})")
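
For readers who want to see what a walk is without the library, here is a minimal sketch of an unconstrained random walk on an adjacency dictionary. It does not reproduce GraphVerse's `random_walk`: the names `simple_random_walk`, `adjacency`, and `max_steps` are hypothetical, and the special-vertex rule checks are left out entirely.

```python
import random

def simple_random_walk(adjacency, start, max_steps, rng):
    """Walk from `start`, moving to a uniformly chosen neighbor each step.

    Hypothetical helper for illustration; not part of GraphVerse.
    Stops early if the current vertex has no outgoing neighbors.
    """
    walk = [start]
    current = start
    for _ in range(max_steps):
        neighbors = adjacency.get(current, [])
        if not neighbors:
            break  # dead end: nowhere left to go
        current = rng.choice(neighbors)
        walk.append(current)
    return walk

# Toy directed graph: a cycle 0 -> 1 -> 2 -> 0, plus a dead end at vertex 3.
adjacency = {0: [1], 1: [2, 3], 2: [0], 3: []}
rng = random.Random(42)  # local generator keeps the example reproducible
walk = simple_random_walk(adjacency, start=0, max_steps=10, rng=rng)
print(walk)
```

Every consecutive pair in the returned walk is an edge of the toy graph, which is the property the later error analysis checks for LLM-generated sequences.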

## 3. Train LLM on Walk Sequences

In [None]:
model = graphverse.llm.training.train_llm(walks, n, epochs=5)
print("LLM training completed")
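
Training a next-vertex model on walks usually means turning each walk into (context, target) pairs. The sketch below shows one common way to do that. Whether `train_llm` prepares its data like this is an assumption, and `make_training_pairs` and `context_size` are hypothetical names.

```python
def make_training_pairs(walks, context_size):
    """Split each walk into (context, next_vertex) pairs.

    The context holds up to `context_size` vertices preceding the target.
    Hypothetical helper for illustration; not part of GraphVerse.
    """
    pairs = []
    for walk in walks:
        for i in range(1, len(walk)):
            context = tuple(walk[max(0, i - context_size):i])
            pairs.append((context, walk[i]))
    return pairs

pairs = make_training_pairs([[3, 7, 2, 9]], context_size=2)
for context, target in pairs:
    print(context, "->", target)
```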

## 4. Generate Sequences Using Trained LLM

In [None]:
num_sequences = 1000
max_length = 50
generated_sequences = []

for _ in range(num_sequences):
    start = [random.randint(0, n-1) for _ in range(5)]  # 5-vertex prompt
    generated_sequences.append(graphverse.llm.inference.generate_sequence(model, start, max_length))

print(f"Generated {len(generated_sequences)} sequences")
print("\nSample generated sequences:")
for i in range(5):
    print(f"Sequence {i}: {generated_sequences[i][:10]}... (length: {len(generated_sequences[i])})")
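
The generation loop above extends a prompt one vertex at a time. As a rough illustration of that idea, the sketch below swaps the trained model for a bigram frequency table. The names `build_bigram_table` and `extend_prompt` are hypothetical and are not GraphVerse functions, and a real model would condition on more than the last vertex.

```python
import random
from collections import Counter, defaultdict

def build_bigram_table(walks):
    """Count how often each vertex follows each other vertex in the walks."""
    table = defaultdict(Counter)
    for walk in walks:
        for a, b in zip(walk, walk[1:]):
            table[a][b] += 1
    return table

def extend_prompt(table, prompt, max_length, rng):
    """Append sampled vertices until max_length is reached or no successor is known."""
    sequence = list(prompt)
    while len(sequence) < max_length:
        followers = table.get(sequence[-1])
        if not followers:
            break  # last vertex never appeared with a successor in the walks
        vertices = list(followers.keys())
        weights = list(followers.values())
        sequence.append(rng.choices(vertices, weights=weights, k=1)[0])
    return sequence

# Toy walks on the cycle 0 -> 1 -> 2 -> 0 (stand-in for real training walks).
toy_walks = [[0, 1, 2, 0, 1], [1, 2, 0], [2, 0, 1, 2]]
table = build_bigram_table(toy_walks)
sequence = extend_prompt(table, prompt=[0, 1], max_length=8, rng=random.Random(0))
print(sequence)
```

Because each vertex in the toy walks has exactly one observed successor, the output simply continues the cycle. On a richer graph, the sampling step would pick among several observed successors in proportion to their counts.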

## 5. Analyze LLM Output for Errors

In [None]:
ascenders = [v for v, data in G.nodes(data=True) if data.get('special') == 'ascender']
descenders = [v for v, data in G.nodes(data=True) if data.get('special') == 'descender']

invalid_edges, rule_violations = graphverse.analysis.analyze_llm_output(G, generated_sequences, ascenders, descenders)

total_edges = sum(len(seq) - 1 for seq in generated_sequences)

print(f"Total generated edges: {total_edges}")
print(f"Invalid edges: {invalid_edges}")
print(f"Rule violations: {rule_violations}")

# Both rates are normalized by the total number of generated edges
invalid_edge_rate = invalid_edges / total_edges
rule_violation_rate = rule_violations / total_edges

print(f"\nInvalid edge rate: {invalid_edge_rate:.2%}")
print(f"Rule violation rate: {rule_violation_rate:.2%}")
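
To make the invalid edge rate concrete, here is a minimal sketch of how such a count could be computed: every consecutive pair in a generated sequence is checked against the graph's edge set. This is an assumption about what `analyze_llm_output` measures, not its actual implementation. `count_invalid_edges` and the toy data are hypothetical, and rule violations are not covered.

```python
def count_invalid_edges(edges, sequences):
    """Return (invalid, total) over consecutive vertex pairs in the sequences.

    `edges` is a set of directed (u, v) pairs.
    Hypothetical helper for illustration; not part of GraphVerse.
    """
    invalid = 0
    total = 0
    for seq in sequences:
        for u, v in zip(seq, seq[1:]):
            total += 1
            if (u, v) not in edges:
                invalid += 1
    return invalid, total

# Toy graph: the directed cycle 0 -> 1 -> 2 -> 0.
edges = {(0, 1), (1, 2), (2, 0)}
# The first sequence follows the cycle; the second uses two nonexistent edges.
sequences = [[0, 1, 2, 0], [0, 2, 1]]
invalid, total = count_invalid_edges(edges, sequences)
rate = invalid / total if total else 0.0  # guard against empty input
print(f"{invalid} invalid of {total} edges ({rate:.2%})")
```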

## Conclusion

This notebook used the GraphVerse library to generate a random graph, sample random walks on it, train an LLM on those walks, and check the LLM's generated sequences for errors. The invalid edge rate and the rule violation rate indicate how well the LLM has learned the structure of the graph and the special vertex rules.