# Data Preparation and Analysis Setup

This first step sets up the environment: it imports the required libraries (pandas for tabular data, networkx for graph analysis) and loads the spreadsheet into a DataFrame.

In [1]:
import pandas as pd
import networkx as nx

# Load the spreadsheet of (name_id, used_id) pairs into a DataFrame
file_path = 'C:/Users/Timur/Desktop/connection_groups/1.xlsx'
data = pd.read_excel(file_path)

# Preview the first five rows
data.head()

Unnamed: 0,name_id,used_id
0,0,0
1,0,1
2,0,2
3,0,3
4,0,4
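
The `Unnamed: 0` column in the preview is the label pandas assigns to a column that has no header, which usually means a row index was written into the spreadsheet when it was saved. The analysis only uses `name_id` and `used_id`, so the extra column is harmless, but it can be dropped. The sketch below shows one way to do that on a small made-up frame; the frame `raw` and its values are illustrative assumptions, not the real data.

```python
import pandas as pd

# Toy frame mimicking the column layout of the preview (values are illustrative only).
raw = pd.DataFrame(
    {"Unnamed: 0": [0, 1, 2], "name_id": [0, 0, 0], "used_id": [0, 1, 2]}
)

# Find every column whose label pandas auto-generated as "Unnamed: N".
unnamed = [c for c in raw.columns if str(c).startswith("Unnamed:")]

# Keep only the columns that carry real data.
cleaned = raw.drop(columns=unnamed)

print(list(cleaned.columns))  # ['name_id', 'used_id']
```

Alternatively, passing `index_col=0` to `pd.read_excel` would treat the first column as the index at load time. Whether that is appropriate depends on how the file was written.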


# Graph Construction and Analysis

With the data loaded, we build an undirected graph in which each node is a `name_id` and an edge joins two `name_id`s whenever they share a `used_id`. The connected components of this graph are the groups we are after: every `name_id` in a component is linked to every other one, either directly or through a chain of shared `used_id`s.
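
To make the grouping rule concrete, here is a minimal, self-contained sketch on made-up toy data. It is not the notebook's method: instead of networkx it uses a union-find (disjoint-set) structure built from the standard library, which yields the same groups without creating an edge for every pair. The helper `group_by_shared_ids` and the `toy_pairs` values are hypothetical names introduced only for this illustration.

```python
from collections import defaultdict


def group_by_shared_ids(pairs):
    """Group name_ids linked, directly or transitively, by a shared used_id.

    `pairs` is an iterable of (name_id, used_id) tuples.
    """
    parent = {}

    def find(x):
        # Register unseen ids as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # used_id -> first name_id observed with that used_id
    for name_id, used_id in pairs:
        find(name_id)  # make sure isolated name_ids still get a group
        if used_id in first_seen:
            union(first_seen[used_id], name_id)
        else:
            first_seen[used_id] = name_id

    groups = defaultdict(set)
    for name_id in parent:
        groups[find(name_id)].add(name_id)
    return sorted(groups.values(), key=min)


# Toy data: 0 and 1 share used_id 10, 1 and 2 share 11, 4 and 5 share 13.
toy_pairs = [(0, 10), (1, 10), (1, 11), (2, 11), (3, 12), (4, 13), (5, 13)]
groups = group_by_shared_ids(toy_pairs)
print(groups)  # [{0, 1, 2}, {3}, {4, 5}]
```

Note that 0 and 2 never share a `used_id` directly, yet they land in the same group because both are linked to 1. This transitive behavior is exactly what connected components capture.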

In [2]:
from itertools import combinations

G = nx.Graph()

# One node per distinct name_id, so ids with no shared used_id still form a group
G.add_nodes_from(data['name_id'].unique())

# For each used_id, collect the name_ids that share it
used_id_to_name_ids = data.groupby('used_id')['name_id'].apply(list)

# Connect every pair of name_ids that share a used_id
for name_ids in used_id_to_name_ids:
    G.add_edges_from(combinations(name_ids, 2))

# Each connected component is one group of linked name_ids
connected_components = list(nx.connected_components(G))

connected_components

[{0,
  25,
  1927,
  3044,
  3132,
  3569,
  4391,
  4393,
  5527,
  5528,
  5531,
  5877,
  6212,
  7410,
  7686,
  7848,
  7923,
  7970,
  8029,
  8031,
  8185,
  8446,
  8753,
  8791,
  9157,
  9332,
  9576,
  9718,
  9867,
  10410,
  10903},
 {1, 52, 5303, 9571, 9864},
 {2, 5304},
 {3},
 {4},
 {5},
 {6, 3029},
 {7, 6320},
 {8},
 {8195,
  8197,
  9,
  10,
  11,
  12,
  8204,
  8206,
  21,
  8216,
  8219,
  8220,
  29,
  8221,
  31,
  8222,
  8223,
  8226,
  35,
  8224,
  8225,
  42,
  44,
  8239,
  51,
  54,
  8246,
  57,
  8254,
  64,
  65,
  8261,
  8265,
  83,
  88,
  8287,
  96,
  97,
  100,
  109,
  8302,
  110,
  8303,
  8304,
  8308,
  117,
  116,
  8312,
  122,
  123,
  124,
  8321,
  129,
  8323,
  8322,
  8329,
  138,
  139,
  143,
  146,
  152,
  153,
  8344,
  154,
  156,
  8349,
  158,
  159,
  8355,
  165,
  8358,
  8359,
  167,
  168,
  8361,
  171,
  174,
  175,
  176,
  8369,
  178,
  179,
  180,
  181,
  182,
  8375,
  184,
  185,
  186,
  189,
  8381,
  193,
  194