# Small World Feature Extraction 

By: Jimuel Celeste, Jr. 

Objective: To develop a tool that extracts Small World Representation Features from audio recordings.

In [1]:
import pandas as pd
import numpy as np
import networkx as nx
from scipy.spatial.distance import pdist, squareform

# 1. Load or create your time series data
# Example: synthetic time series
time_series_data = pd.Series(np.sin(np.linspace(0, 100, 100)) + np.random.rand(100) * 0.1)

# 2. Define a similarity metric and build the graph
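# For simplicity, connect two samples when their absolute difference is below a threshold
threshold = 0.5  # Adjust based on your data and desired connectivity
distance_matrix = squareform(pdist(time_series_data.values.reshape(-1, 1)))  # pdist expects a 2-D array
adjacency_matrix = (distance_matrix < threshold).astype(int)
np.fill_diagonal(adjacency_matrix, 0)  # The diagonal is 0 < threshold; clear it to avoid self-loops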

# Create a NetworkX graph from the adjacency matrix
G = nx.from_numpy_array(adjacency_matrix)

# 3. Generate a Watts-Strogatz small-world graph with the same number of nodes
# n: number of nodes, k: number of nearest neighbors in the initial ring lattice, p: rewiring probability
# You might need to experiment with k and p to get the desired small-world characteristics
n_nodes = len(time_series_data)
k_neighbors = 4  # Example: connect to 4 nearest neighbors
rewiring_probability = 0.1 # Example: 10% chance of rewiring an edge

# Note: watts_strogatz_graph builds a new random graph from n, k, and p only.
# It does not take G as input, so G_small_world is independent of the time series above.
# The result is not guaranteed to be connected; nx.connected_watts_strogatz_graph retries until it is.
G_small_world = nx.watts_strogatz_graph(n_nodes, k_neighbors, rewiring_probability)

# 4. Analyze the small-world network (optional)
# Calculate the average shortest path length and the average clustering coefficient
# (average_shortest_path_length raises NetworkXError if the graph is disconnected)
avg_path_length = nx.average_shortest_path_length(G_small_world)
clustering_coefficient = nx.average_clustering(G_small_world)

print(f"Average Path Length: {avg_path_length}")
print(f"Average Clustering Coefficient: {clustering_coefficient}")

# Visualize the network (optional)
# import matplotlib.pyplot as plt
# nx.draw(G_small_world, with_labels=False, node_size=10)
# plt.show()

Average Path Length: 4.748080808080808
Average Clustering Coefficient: 0.3722857142857142
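
The cell above reports the path length and clustering of a randomly generated Watts-Strogatz graph, so the numbers do not depend on the time series. The following is a minimal sketch of how the same two measures could be taken from the data-derived threshold graph instead, together with a small-world coefficient relative to random graphs. The function names (`threshold_graph`, `largest_component`, `small_world_features`) are hypothetical, and the reference graphs here are G(n, m) random graphs with the same node and edge counts, which is a simplification and not the degree-preserving reference used by `nx.sigma`.

```python
import networkx as nx
import numpy as np
from scipy.spatial.distance import pdist, squareform


def threshold_graph(series, threshold):
    """Connect two samples when their absolute difference is below threshold."""
    values = np.asarray(series, dtype=float).reshape(-1, 1)
    distances = squareform(pdist(values))
    adjacency = (distances < threshold).astype(int)
    np.fill_diagonal(adjacency, 0)  # no self-loops
    return nx.from_numpy_array(adjacency)


def largest_component(graph):
    """Return a copy of the largest connected component of graph."""
    nodes = max(nx.connected_components(graph), key=len)
    return graph.subgraph(nodes).copy()


def small_world_features(graph, n_random=5, seed=0):
    """Clustering, path length, and a sigma-style ratio against G(n, m) graphs.

    Assumption: path lengths are measured on the largest connected component,
    because average_shortest_path_length is undefined on disconnected graphs.
    """
    core = largest_component(graph)
    n, m = core.number_of_nodes(), core.number_of_edges()
    clustering = nx.average_clustering(core)
    path_length = nx.average_shortest_path_length(core)

    rng = np.random.default_rng(seed)
    rand_clustering, rand_path_length = [], []
    for _ in range(n_random):
        ref_seed = int(rng.integers(1 << 31))
        ref = largest_component(nx.gnm_random_graph(n, m, seed=ref_seed))
        rand_clustering.append(nx.average_clustering(ref))
        rand_path_length.append(nx.average_shortest_path_length(ref))

    mean_c = float(np.mean(rand_clustering))
    mean_l = float(np.mean(rand_path_length))
    if mean_c > 0 and mean_l > 0 and path_length > 0:
        sigma = (clustering / mean_c) / (path_length / mean_l)
    else:
        sigma = float("nan")  # ratio is undefined for degenerate graphs

    return {"clustering": clustering, "path_length": path_length, "sigma": sigma}


# Usage with a seeded synthetic series, so repeated runs give the same graph
rng = np.random.default_rng(42)
series = np.sin(np.linspace(0, 100, 100)) + rng.random(100) * 0.1
features = small_world_features(threshold_graph(series, threshold=0.5))
print(features)
```

A sigma value above 1 is commonly read as a sign of small-world structure, although the value depends on the choice of reference graph and on the threshold used to build the graph.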


## Links
1. NetworkX - Average Shortest Path Length: https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.shortest_paths.generic.average_shortest_path_length.html
2. NetworkX - Average Clustering: https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.cluster.average_clustering.html#networkx.algorithms.cluster.average_clustering
3. NetworkX - Small-World Functions: https://networkx.org/documentation/stable/reference/algorithms/smallworld.html
4. NetworkX - Small-World Sigma: https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.smallworld.sigma.html
5. NetworkX - Small-World Omega: https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.smallworld.omega.html
6. Google - WebRTC VAD Segmentation: https://github.com/wiseman/py-webrtcvad
7. CarlosBergillos - ts2vg: Time series to visibility graphs: https://github.com/CarlosBergillos/ts2vg?tab=readme-ov-file#lacasa2008