In [1]:
# Install GraphVerse (uncomment if not already installed)
# !pip install GraphVerse

# GraphVerse Demonstration

This notebook demonstrates the GraphVerse library, which generates random graphs, samples random walks on them, and trains a simple language model (referred to here as the LLM) on those walks so that the sequences it generates can be analyzed.

In [1]:
from graphverse import run_simulation
import networkx as nx
import matplotlib.pyplot as plt


## Run the Simulation

We'll now run the simulation on a relatively small graph (1,000 vertices) for demonstration purposes. The call below generates a graph, performs random walks on it, trains the LLM on those walks, and analyzes the sequences the LLM generates.

In [2]:
n = 1000  # Number of vertices
num_walks = 10000  # Number of random walks
llm_training_epochs = 5  # Number of training epochs for the LLM

results = run_simulation(n=n, num_walks=num_walks, llm_training_epochs=llm_training_epochs)
G, walks, violating_walks, model, generated_sequences, invalid_edges, rule_violations = results

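To make the random-walk step concrete, here is a minimal sketch that is independent of GraphVerse's own implementation, which this notebook does not show. The function `random_walk`, its parameters, and the toy adjacency dict are all illustrative assumptions. A walk starts at a node and repeatedly hops to a randomly chosen neighbor, so every consecutive pair in the walk is an edge of the graph.

```python
import random

def random_walk(neighbors, start, length, rng):
    """Build a walk of at most `length` nodes by hopping to random neighbors.

    `neighbors` is any callable mapping a node to its adjacent nodes;
    for a networkx graph G, `G.neighbors` can play this role.
    """
    walk = [start]
    while len(walk) < length:
        options = list(neighbors(walk[-1]))
        if not options:  # dead end: stop early rather than invent an edge
            break
        walk.append(rng.choice(options))
    return walk

# Toy directed graph as an adjacency dict (hypothetical data, not GraphVerse output).
toy_adjacency = {0: [1, 2], 1: [2], 2: [0, 3], 3: []}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
toy_walks = [random_walk(toy_adjacency.__getitem__, start, 5, rng)
             for start in toy_adjacency]
```

Node 3 has no outgoing edges in the toy graph, so a walk that starts there contains only that node. How GraphVerse handles dead ends and its ascender/descender rules is not covered by this sketch.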

## Analyze the Results

In [None]:
print("Graph Information:")
print(f"Number of nodes: {G.number_of_nodes()}")
print(f"Number of edges: {G.number_of_edges()}")
print("\nRandom Walks:")
print(f"Total number of walks: {len(walks)}")
print(f"Number of violating walks: {len(violating_walks)}")
print("\nLLM Output Analysis:")
print(f"Number of generated sequences: {len(generated_sequences)}")
print(f"Invalid edges in generated sequences: {invalid_edges}")
print(f"Rule violations in generated sequences: {rule_violations}")

# Calculate error rates per generated step.
# A sequence of k nodes contains k - 1 steps; guard against division by zero.
total_edges = sum(max(len(seq) - 1, 0) for seq in generated_sequences)
invalid_edge_rate = invalid_edges / total_edges if total_edges else 0.0
rule_violation_rate = rule_violations / total_edges if total_edges else 0.0

print("\nError Rates:")
print(f"Invalid edge rate: {invalid_edge_rate:.2%}")
print(f"Rule violation rate: {rule_violation_rate:.2%}")
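As an illustration of what an invalid edge rate measures, the sketch below counts consecutive node pairs in a sequence that are not edges of the graph. This is an assumption about the metric, not GraphVerse's actual code; the helper `count_invalid_edges` and the toy data are hypothetical.

```python
def count_invalid_edges(has_edge, sequences):
    """Return (invalid, total): steps that are not edges, and all steps.

    `has_edge` is any callable (u, v) -> bool; for a networkx graph G,
    `G.has_edge` can play this role.
    """
    invalid = 0
    total = 0
    for seq in sequences:
        for u, v in zip(seq, seq[1:]):  # consecutive node pairs
            total += 1
            if not has_edge(u, v):
                invalid += 1
    return invalid, total

# Toy directed edges and sequences (hypothetical data).
toy_edges = {(0, 1), (1, 2), (2, 0)}
toy_sequences = [[0, 1, 2, 0], [0, 2, 1], [1]]

invalid, total = count_invalid_edges(lambda u, v: (u, v) in toy_edges, toy_sequences)
rate = invalid / total if total else 0.0
print(f"{invalid} invalid out of {total} steps ({rate:.2%})")
# prints: 2 invalid out of 5 steps (40.00%)
```

The single-node sequence `[1]` contributes no steps, which is why the denominator counts steps rather than sequences.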

## Visualize a Subgraph

To get a sense of the graph structure, let's visualize a small subgraph.

In [None]:
# Select a subgraph for visualization
subgraph_size = 20
subgraph_nodes = list(G.nodes())[:subgraph_size]
subgraph = G.subgraph(subgraph_nodes)

# Set up the plot
plt.figure(figsize=(12, 8))
pos = nx.spring_layout(subgraph, seed=42)  # fixed seed keeps the layout reproducible

# Draw the subgraph
nx.draw(subgraph, pos, with_labels=True, node_color='lightblue', node_size=500, font_size=10, font_weight='bold')

# Highlight special nodes by drawing them on top of the regular nodes.
# The loop variable is named `node` to avoid confusion with the vertex count `n`.
ascenders = [node for node, d in subgraph.nodes(data=True) if d.get('special') == 'ascender']
descenders = [node for node, d in subgraph.nodes(data=True) if d.get('special') == 'descender']
nx.draw_networkx_nodes(subgraph, pos, nodelist=ascenders, node_color='r', node_size=600)
nx.draw_networkx_nodes(subgraph, pos, nodelist=descenders, node_color='g', node_size=600)

plt.title("Subgraph Visualization")
plt.axis('off')
plt.tight_layout()
plt.show()

print("Red nodes: Ascenders")
print("Green nodes: Descenders")
print("Light blue nodes: Regular nodes")

## Conclusion

This demonstration shows how GraphVerse can be used to generate a graph, perform random walks, train an LLM on these walks, and analyze the LLM's output. The invalid edge rate indicates how well the LLM has learned the graph's structure, and the rule violation rate indicates how well it has learned the graph's rules.

The LLM's error rates can often be reduced by:
1. Increasing the number of training epochs
2. Generating more random walks for training data
3. Adjusting the LLM architecture (e.g., increasing model size or complexity)

Feel free to experiment with different parameters to see how they affect the results!
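One way to organize such an experiment is sketched below. The helper `sweep_epochs` is hypothetical and not part of GraphVerse. It assumes the simulation function accepts the same keyword arguments and returns the same 7-tuple as the `run_simulation` call earlier in this notebook, and it takes that function as an argument so the helper itself has no dependencies.

```python
def sweep_epochs(simulate, epoch_values, n=1000, num_walks=10000):
    """Run `simulate` once per epoch count and collect the invalid edge rate.

    `simulate` is assumed to behave like `run_simulation` as used above:
    same keyword arguments, same 7-tuple of results.
    """
    rates = {}
    for epochs in epoch_values:
        results = simulate(n=n, num_walks=num_walks, llm_training_epochs=epochs)
        # Only the generated sequences and the invalid edge count are needed here.
        _, _, _, _, sequences, invalid_edges, _ = results
        total_edges = sum(max(len(seq) - 1, 0) for seq in sequences)
        rates[epochs] = invalid_edges / total_edges if total_edges else 0.0
    return rates

# Intended usage, once GraphVerse is installed:
# rates = sweep_epochs(run_simulation, [1, 5, 10])
# for epochs, rate in rates.items():
#     print(f"{epochs} epochs: invalid edge rate {rate:.2%}")
```

Each setting triggers a full simulation, including training, so a sweep over several values may take noticeably longer than a single run.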