In [1]:
import networkx as nx
from itertools import combinations
import pandas as pd

In [2]:
def link_prediction(G):
    """Rank non-adjacent node pairs by their number of common neighbors."""
    # Store the predicted links with their scores
    predicted_links = []

    # Iterate over all unordered pairs of nodes
    for u, v in combinations(G.nodes, 2):
        if not G.has_edge(u, v):
            # Count the neighbors shared by u and v
            common_neighbors = len(list(nx.common_neighbors(G, u, v)))
            if common_neighbors > 0:
                predicted_links.append(((u, v), common_neighbors))

    # Sort the predicted links by the number of common neighbors in descending order
    predicted_links.sort(key=lambda x: x[1], reverse=True)

    return predicted_links

In [3]:
# Load the edge list from tvshow_edges.csv and build an undirected graph from it
df = pd.read_csv('../../facebook_clean_data/tvshow_edges.csv')
G = nx.from_pandas_edgelist(df, 'node_1', 'node_2')

# Perform link prediction
predicted_links = link_prediction(G)

# Print the predicted links
for (u, v), score in predicted_links:
    print(f"Predicted link between node {u} and node {v} with common neighbors count: {score}")


Predicted link between node 2008 and node 3254 with common neighbors count: 122
Predicted link between node 1925 and node 2972 with common neighbors count: 56
Predicted link between node 1925 and node 1457 with common neighbors count: 56
Predicted link between node 1925 and node 2935 with common neighbors count: 56
Predicted link between node 1925 and node 1788 with common neighbors count: 56
Predicted link between node 1925 and node 3283 with common neighbors count: 56
Predicted link between node 2303 and node 1457 with common neighbors count: 56
Predicted link between node 2972 and node 2935 with common neighbors count: 56
Predicted link between node 2972 and node 1788 with common neighbors count: 56
Predicted link between node 2972 and node 3283 with common neighbors count: 56
Predicted link between node 2935 and node 1788 with common neighbors count: 56
Predicted link between node 2935 and node 3283 with common neighbors count: 56
Predicted link between node 1788 and node 3283 with

Functionality:

The link_prediction function iterates over every unordered pair of nodes in the graph and skips pairs that are already connected by an edge.

For each remaining pair, the function counts the common neighbors, that is, the nodes adjacent to both members of the pair.

Pairs with at least one common neighbor are stored and sorted in descending order of that count, so the pairs at the top of the list are the strongest predicted links.
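
To make the three steps above concrete, here is a small worked example. The five-node graph `toy` is hypothetical and is not taken from the dataset; it only illustrates how the counts arise.

```python
import networkx as nx
from itertools import combinations

# Hypothetical toy graph (an assumption for illustration, not the dataset above)
toy = nx.Graph([("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("d", "e")])

scores = []
for u, v in combinations(toy.nodes, 2):
    if not toy.has_edge(u, v):
        # Nodes adjacent to both u and v
        shared = sorted(nx.common_neighbors(toy, u, v))
        if shared:
            scores.append(((u, v), len(shared)))

# Highest common-neighbor count first; ties keep their discovery order
scores.sort(key=lambda item: item[1], reverse=True)
print(scores)
# [(('a', 'b'), 2), (('c', 'd'), 2), (('a', 'e'), 1), (('b', 'e'), 1)]
```

Nodes a and b are not connected but share both c and d, so the pair ranks first; the pair (c, e) has no shared neighbor and is dropped.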

Performance:

Efficiency: The method is simple and adequate for small or moderately sized graphs. However, it examines all n(n-1)/2 node pairs, so the number of pairs grows quadratically with the number of nodes n, and every non-adjacent pair additionally requires a common-neighbor computation.
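
One possible way to reduce this cost, sketched here rather than taken from the notebook, is to generate candidates from each node's neighborhood: two nodes share a neighbor w exactly when both appear in w's adjacency list. The function name `common_neighbor_scores` is hypothetical. The sketch assumes a simple undirected graph; on sparse graphs it touches far fewer pairs, but a very high-degree hub can still make it expensive.

```python
from collections import Counter
from itertools import combinations

import networkx as nx


def common_neighbor_scores(G):
    """Count common neighbors for non-adjacent pairs that share at least one."""
    counts = Counter()
    for w in G.nodes:
        # Ignore a self-loop on w so that w is never paired with itself
        nbrs = [n for n in G[w] if n != w]
        for u, v in combinations(nbrs, 2):
            if not G.has_edge(u, v):
                # w is a common neighbor of u and v; frozenset makes the pair unordered
                counts[frozenset((u, v))] += 1
    # List of (pair, count), highest count first
    return counts.most_common()
```

Each pair is returned as a frozenset, so `tuple(pair)` recovers the two node ids if the original `((u, v), score)` shape is needed.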

Predictive Power: The algorithm suits networks in which shared neighbors play a significant role in link formation, such as social networks where friends of friends tend to connect. It may not perform as well on graphs whose links form through other mechanisms.
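
Predictive power can be checked empirically rather than assumed. The sketch below is one common approach, not something the notebook does: hide a random fraction of edges, score the remaining non-edges by common neighbors, and measure how many of the top-ranked pairs are hidden edges. The function name `precision_at_k` and the parameters `k`, `holdout`, and `seed` are hypothetical. It scores every non-edge, so it is only practical on small graphs.

```python
import random

import networkx as nx


def precision_at_k(G, k=100, holdout=0.1, seed=0):
    """Fraction of the top-k common-neighbor predictions that are hidden edges."""
    rng = random.Random(seed)
    edges = list(G.edges)
    hidden = rng.sample(edges, int(len(edges) * holdout))

    # Training graph: the original graph minus the hidden edges
    train = G.copy()
    train.remove_edges_from(hidden)
    hidden_set = {frozenset(e) for e in hidden}

    # Score every non-edge of the training graph by its common-neighbor count
    scored = [
        (frozenset((u, v)), len(list(nx.common_neighbors(train, u, v))))
        for u, v in nx.non_edges(train)
    ]
    top = sorted(scored, key=lambda t: t[1], reverse=True)[:k]
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if pair in hidden_set)
    return hits / len(top)
```

A value close to 1 means the highest-scored pairs are mostly real (hidden) edges; repeating the measurement with several seeds gives a less noisy estimate.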

Limitations:

The algorithm assumes that the likelihood of a link forming increases with the number of common neighbors, which might not be true for all networks.

The score ignores other factors that can influence link formation, such as node degree or other structural properties of the graph.
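
NetworkX ships link-prediction indices that do take degree into account, for example `nx.jaccard_coefficient` (shared neighbors divided by the size of the combined neighborhood) and `nx.adamic_adar_index` (shared neighbors weighted down when they have many connections). The wrapper below is a minimal sketch; the name `rank_non_edges` and the `top` parameter are hypothetical. With no explicit pair list, these indices score every non-edge, so the quadratic cost remains.

```python
import networkx as nx


def rank_non_edges(G, index=nx.jaccard_coefficient, top=10):
    """Return the `top` highest-scoring non-edges as (u, v, score) tuples."""
    # Called without a pair list, the index scores all non-existent edges of G
    scored = index(G)
    return sorted(scored, key=lambda t: t[2], reverse=True)[:top]


# Hypothetical toy graph for illustration only
toy = nx.Graph([("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("d", "e")])
by_jaccard = rank_non_edges(toy, nx.jaccard_coefficient, top=3)
by_adamic_adar = rank_non_edges(toy, nx.adamic_adar_index, top=3)
```

Both functions also accept an iterable of candidate pairs as a second argument, which allows scoring only a pre-filtered set of pairs instead of all non-edges.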