<center><img src="https://github.com/MagallanesTalks/OpenBigData_atPUCP/blob/main/logo.png?raw=true" width="1000"></center>

# A COSPONSORSHIP STORY

In this example we will look at a network that represents bill cosponsorship among legislators. Let me show you the network for the period "July 1995-July 1996".

Let's fetch the data first:

In [None]:
# fetch the file (this will appear in your main folder after running the code)
# just do it once!
!wget 'https://github.com/MagallanesTalks/OpenBigData_atPUCP/raw/refs/heads/main/data/cospon_9596.gexf'

Now, let's read the file and show the relationships:

In [None]:
import networkx as nx

# read in the data
net95_1=nx.read_gexf("cospon_9596.gexf")


# show net adjacency as a pandas data frame
nx.to_pandas_adjacency(net95_1)

The adjacency matrix tells us how many times two legislators appear together as supporters of a bill proposal. The row and column names are the legislators' codes. Zero means the pair never supported a bill together during the period the data covers (notice the zeroes on the diagonal). A value greater than zero means there is a link between the two legislators, and the value is the weight of that relationship.
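
To make the reading of that table concrete, here is a minimal sketch on a toy network. The legislator codes and the counts are made up for illustration; they do not come from the file.

```python
import networkx as nx

# Hypothetical legislators and cosponsorship counts (not from the real file)
toy = nx.Graph()
toy.add_edge("leg_A", "leg_B", weight=3)  # A and B backed 3 bills together
toy.add_edge("leg_B", "leg_C", weight=1)  # B and C backed 1 bill together
toy.add_node("leg_D")                     # D never cosponsored with anyone

# Same call used above: rows and columns are node ids, cells are edge weights
adj = nx.to_pandas_adjacency(toy)
print(adj)
```

The cell for `leg_A` and `leg_B` holds 3, the matrix is symmetric because the graph is undirected, and the row for `leg_D` is all zeroes.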

Besides links (or edges), there are nodes, which represent the legislators. The nodes carry some attributes:

In [None]:
import pandas as pd

pd.DataFrame.from_dict(net95_1.nodes, orient='index')

# <div class="alert alert-success" role="alert">Relevant nodes</div>
Let's compute some important measures that seek to identify actors whose relationship patterns may make them relevant in the network:

In [None]:
# relevant to connect groups
nx.set_node_attributes(net95_1, nx.betweenness_centrality(net95_1), "betweenness")
# relevant to spread information
nx.set_node_attributes(net95_1, nx.closeness_centrality(net95_1), "closeness")
# connected to well connected nodes
nx.set_node_attributes(net95_1, nx.eigenvector_centrality(net95_1), "eigenvector")

You can see those values as new columns here:

In [None]:
net95_1_NodeData=pd.DataFrame.from_dict(net95_1.nodes, orient='index')
net95_1_NodeData
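
To get a feel for what the three measures capture, here is a small sketch on a five-node chain. This toy graph is an assumption made for illustration only; it has nothing to do with the legislators' data.

```python
import networkx as nx

# A chain 0 - 1 - 2 - 3 - 4: node 2 sits in the middle
chain = nx.path_graph(5)

bet = nx.betweenness_centrality(chain)   # share of shortest paths passing through a node
clo = nx.closeness_centrality(chain)     # inverse of the average distance to the others
eig = nx.eigenvector_centrality(chain)   # high when the neighbors are themselves central

for node in chain:
    print(node, round(bet[node], 3), round(clo[node], 3), round(eig[node], 3))
```

The middle node gets the highest value on all three measures, while the end nodes have zero betweenness because no shortest path passes through them. Keep in mind that, as called here and in the cell above, the three functions ignore edge weights; networkx exposes a `weight` parameter (`distance` for closeness) if you want the weights to count.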

We could use a scatter plot:

In [None]:
#updating altair in colab
# !pip install altair -U

In [None]:
import altair as alt

ALT_net=alt.Chart(net95_1_NodeData).properties(width=300,
                                               height=300)

ENC_net=ALT_net.encode(
    alt.X('betweenness:Q'),
    alt.Y('closeness:Q'),
    alt.Size("eigenvector:Q"),
    alt.Color("eigenvector:Q"),
    tooltip=['label','party']
).interactive()
ENC_net.mark_circle()

Let me add another attribute to show whether a legislator belongs to the governing party:

In [None]:
# flag legislators from the governing party (1) versus everyone else (0)
govParty='CAMBIO 90 - NUEVA MAYORIA'
isPartyInGov={l: int(p==govParty) for (l,p) in nx.get_node_attributes(net95_1, 'party').items()}
nx.set_node_attributes(net95_1, isPartyInGov,'isPartyInGov')
net95_1_NodeData=pd.DataFrame.from_dict(net95_1.nodes, orient='index')

In [None]:
ALT_net=alt.Chart(net95_1_NodeData).properties(width=300,
                                               height=300)

ENC_net=ALT_net.encode(
    alt.X('betweenness'),
    alt.Y('closeness'),
    alt.Size("eigenvector:Q"),
    alt.Color('isPartyInGov:N'),
    tooltip=['label','party']
).interactive()

ENC_net.mark_circle()

Networks are complex to visualize, so combining several plots may help. One technique worth knowing here is **brushing**, where a selection made in one plot filters or highlights another.

# <div class="alert alert-success" role="alert">Network as a whole</div>

The default network viz is not very promising in most cases:


In [None]:
nx.draw(net95_1)

## <div class="alert alert-danger" role="alert">Looking for Communities</div>

We cannot go very far with the previous plot. The next step is to find out whether communities actually emerge from the relationships. Let's compute some basic network statistics to see if we can suspect the existence of communities.

In [None]:
# transitivity: share of connected triples (paths of length two) that close into a triangle
nx.transitivity(net95_1)

In [None]:
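# average clustering: for each node, the share of pairs of its neighbors that are connected,
# averaged over nodes (count_zeros=False leaves out nodes whose coefficient is zero)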
nx.average_clustering(net95_1,count_zeros=False)

In [None]:
# the number of maximal cliques
len(list(nx.find_cliques(net95_1)))

In [None]:
# the size of the largest clique
maxsize_clique=max(len(c) for c in nx.find_cliques_recursive(net95_1))
maxsize_clique

With this information, we can suspect nodes are organised into communities.
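
The difference between these statistics is easier to see on a graph small enough to check by hand. The sketch below uses a made-up graph: a triangle with one extra node hanging from it.

```python
import networkx as nx

# Triangle a-b-c plus a pendant node d attached to c (toy graph)
toy = nx.Graph([("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")])

# 1 triangle and 5 connected triples, so 3 * 1 / 5
print(nx.transitivity(toy))                           # 0.6

# local coefficients: a=1, b=1, c=1/3, d=0
print(nx.clustering(toy))
print(nx.average_clustering(toy))                     # mean of all four
print(nx.average_clustering(toy, count_zeros=False))  # mean without d

# maximal cliques: {a, b, c} and {c, d}
cliques = list(nx.find_cliques(toy))
print(len(cliques), max(len(c) for c in cliques))     # 2 3
```

Transitivity is a single ratio for the whole graph, while average clustering averages node-level values, so the two can diverge when a few high-degree nodes hold many open triples.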

There are several algorithms for community detection. Let's use the [Louvain algorithm](https://arxiv.org/abs/0803.0476):

In [None]:
# detect communities (a list of sets of node ids); the seed fixes the random choices
legisLouvain=nx.community.louvain_communities(net95_1, seed=123)

# map every node id to the index of its community
legisLouvain_attr={z:x for x,y  in enumerate(legisLouvain) for z in y }

# store the community index as a node attribute
nx.set_node_attributes(net95_1, legisLouvain_attr,'louvain')

# how many?
print('communities found:',len(legisLouvain))
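
The step from a list of node sets to a node attribute can be checked on a toy graph. This is a sketch on made-up data: two five-node cliques joined by a single edge, where the expected communities are obvious.

```python
import networkx as nx

# Two 5-node cliques (nodes 0-4 and 5-9) joined by the edge 4-5
toy = nx.barbell_graph(5, 0)

# list of sets, one set of node ids per community
communities = nx.community.louvain_communities(toy, seed=123)

# invert it: node id -> community index
membership = {node: idx for idx, members in enumerate(communities) for node in members}
nx.set_node_attributes(toy, membership, "louvain")

print(communities)
print(toy.nodes[0]["louvain"], toy.nodes[9]["louvain"])
```

Each clique ends up as its own community. The index itself (0 or 1) carries no meaning; it only tells you which nodes go together.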

The community label has been assigned to the nodes. Let's recover the attributes as a data frame again:

In [None]:
net95_1_NodeData=pd.DataFrame.from_dict(net95_1.nodes, orient='index')
net95_1_NodeData.iloc[:,-5:]

Let's create a viz:

In [None]:
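# nx_altair draws networkx graphs as Altair charts (package nx_altair)
import nx_altair as nxa

# let Altair embed up to 10000 rows in the chart
alt.data_transformers.enable('default', max_rows=10000)
# position of nodes
nodePos=nx.spring_layout(net95_1,k=0.5)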

# drawing
chart = nxa.draw_networkx(G=net95_1,
                          pos=nodePos,
                          edge_color='grey',
                          width='weight',
                          alpha=0.6,
                          node_size='isPartyInGov:N',
                          node_color='louvain:N',
                          cmap='set1',
                          linewidths=0,
                          node_tooltip=['label','party'])
chart.properties(
    width=600,
    height=600,
).interactive()

The library **netgraph** can arrange the nodes according to their communities:

In [None]:
# !pip install netgraph

In [None]:
# custom colors, one per community index
# (this dict covers six communities, 0 to 5; extend it if Louvain returns more)
community_to_color = {0: 'blue', 1: 'orange', 2: 'green', 3: 'white', 4: 'black', 5: 'magenta'}
# node -> color dict
custom_node_color = {node: community_to_color[community_id] for node, community_id in legisLouvain_attr.items()}

from netgraph import Graph
Graph(net95_1,
      node_layout='community',edge_color='lightgrey',edge_alpha=0.3,edge_width=0.5,
      node_layout_kwargs=dict(node_to_community=legisLouvain_attr),
      node_color=custom_node_color)

You can try **hive plots** if you are interested in displaying interactions:

In [None]:
# !pip install hiveplotlib

In [None]:
import matplotlib.pyplot as plt  # needed for plt.show() in the cells below

from hiveplotlib import hive_plot_n_axes
from hiveplotlib.converters import networkx_to_nodes_edges
from hiveplotlib.node import split_nodes_on_variable
from hiveplotlib.viz import hive_plot_viz

# setup
## convert from networkx
nodes, edges = networkx_to_nodes_edges(net95_1)
## organize nodes into communities
communities_dict = split_nodes_on_variable(nodes, variable_name="louvain")
nodes_by_community_toAxes = list(communities_dict.values())
# number of communities
amountOf_communities=len(nodes_by_community_toAxes)

Time to plot:

In [None]:
hp = hive_plot_n_axes(node_list=nodes,
                      edges=edges,
                      axes_assignments=nodes_by_community_toAxes,
                      sorting_variables=["eigenvector"] * amountOf_communities
)
fig, ax = hive_plot_viz(hp)
ax.set_title("Interaction among communities", y=1.05, size=20)
plt.show()

We could add some color:

In [None]:
fig, ax = hive_plot_viz(
    hp,
    node_kwargs={"color": "red", "s": 10},
    axes_kwargs={"color": "yellow"},
    color="grey",
    ls="dotted"
)
ax.set_title("Interaction among communities", y=1.05, size=20)
plt.show()

This library can also reveal the interactions within communities:

In [None]:
hp = hive_plot_n_axes(node_list=nodes,
                      edges=edges,
                      axes_assignments=nodes_by_community_toAxes,
                      sorting_variables=["eigenvector"] * amountOf_communities,
                      repeat_axes=[True]*amountOf_communities,
                      all_edge_kwargs={"color": "darkgrey"},
                      repeat_edge_kwargs={"color": "magenta"})
fig, ax = hive_plot_viz(hp)
ax.set_title("Interaction between and within communities", y=1.05, size=20)
plt.show()
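
A numeric companion to the plot above is to count how many edges stay inside a community and how many cross between communities. The sketch below does that on a toy graph with hand-assigned community labels (an assumption for illustration, not the Louvain result from the real network).

```python
import networkx as nx

# Two 5-node cliques joined by one edge (toy graph)
toy = nx.barbell_graph(5, 0)

# hand-assigned labels: nodes 0-4 in community 0, nodes 5-9 in community 1
membership = {n: 0 if n < 5 else 1 for n in toy}

# an edge is "within" when both ends share the same label
within = sum(1 for u, v in toy.edges if membership[u] == membership[v])
between = toy.number_of_edges() - within

print("within:", within, "between:", between)   # within: 20 between: 1
```

On the real network you could use the `louvain` node attribute in place of `membership`, and sum the `weight` of each edge if you prefer to count cosponsorships instead of links.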

Finally, let's use heatmaps:

In [None]:

nodelist = list(net95_1.nodes)
A = nx.to_numpy_array(net95_1, nodelist=nodelist)
A

Let's use **graspologic**. Note that it requires a particular version of scipy.

In [None]:
# !pip install scipy==1.10.1

In [None]:
# !pip install graspologic

**I recommend we restart the session after the last installations.**

In [None]:
from graspologic.plot import heatmap

heatmap(A, cbar=True)

We cannot say much from that plot. But let me recover the party of each legislator:

In [None]:
# despite the variable name, this list holds each legislator's party, in node order
isPartyInGov_Values=list(nx.get_node_attributes(net95_1, 'party').values())
isPartyInGov_Values

Then, this nice heatmap appears:

In [None]:
heatmap(A,
        inner_hier_labels=isPartyInGov_Values,
        sort_nodes=True,
        cbar=False,
        hier_label_fontsize=4,
        transform='simple-all')
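
Why does grouping by party help? Reordering rows and columns so that members of the same group sit next to each other turns scattered cells into visible blocks. The sketch below shows the idea with plain numpy on a made-up four-node network; it illustrates the principle and is not a description of what graspologic does internally.

```python
import networkx as nx
import numpy as np

# Toy network: parties interleaved in the node order, ties only within a party
toy = nx.Graph()
toy.add_nodes_from([("a1", {"party": "A"}), ("b1", {"party": "B"}),
                    ("a2", {"party": "A"}), ("b2", {"party": "B"})])
toy.add_edge("a1", "a2", weight=4)
toy.add_edge("b1", "b2", weight=2)

nodelist = list(toy.nodes)
A_toy = nx.to_numpy_array(toy, nodelist=nodelist)
parties = [toy.nodes[n]["party"] for n in nodelist]

# stable sort of positions by party label, applied to rows and columns alike
order = np.argsort(parties, kind="stable")
A_sorted = A_toy[np.ix_(order, order)]

print(A_toy)
print(A_sorted)
```

In `A_sorted` the two parties form two blocks on the diagonal and the off-diagonal blocks are empty, which is the pattern to look for in the heatmap.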

Finally, we could save the net with all the attributes added:

In [None]:
# nx.write_graphml(net95_1,'net95_1.graphml')
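
If you want to confirm that the attributes survive the trip to disk, a quick round trip on a toy graph does the job. The node names, attribute values and file name below are made up for illustration.

```python
import os
import tempfile
import networkx as nx

# Toy graph carrying the same kind of attributes added in this notebook
toy = nx.Graph()
toy.add_node("leg_A", party="X", louvain=0, betweenness=0.25)
toy.add_node("leg_B", party="Y", louvain=1, betweenness=0.0)
toy.add_edge("leg_A", "leg_B", weight=3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.graphml")  # hypothetical file name
    nx.write_graphml(toy, path)
    restored = nx.read_graphml(path)

print(restored.nodes["leg_A"])
print(restored.edges["leg_A", "leg_B"])
```

Node and edge attributes come back with their values and basic types (text, integer, float), so the saved file can be reopened later without recomputing the measures.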