# LINATE Quickstart Tutorial

LINATE stands for "Language-Independent Network ATtitudinal Embedding". As its name suggests, it's a module for embedding social networks (graphs) in attitudinal spaces. Attitudinal spaces are geometrical opinion spaces where dimensions act as indicators of positive or negative opinions (i.e., attitudes) towards identifiable attitudinal objects (e.g., ideological positions such as left- or right-wing ideologies, or policy positions such as increasing tax redistribution, or increasing environmental protection).

This module provides tools for two methods: 

1) Ideological embedding: producing a graph embedding in a latent ideological space, where dimensions don't have explicit meaning, but are related to a homophilic model underlying the choices of the users forming the graph.

2) Attitudinal embedding: mapping this embedded graph onto a second space that does have explicit meaning for its dimensions. For this, the module uses the position of some reference points that have known positions in both spaces.

Check our publication for further details:

Ramaciotti Morales, Pedro, Jean-Philippe Cointet, Gabriel Muñoz Zolotoochin, Antonio Fernández Peralta, Gerardo Iñiguez, and Armin Pournaki. "Inferring Attitudinal Spaces in Social Networks." (2022).
https://hal.archives-ouvertes.fr/hal-03573188/document


## Embedding a bipartite graph in its latent ideological space

In [None]:
import pandas as pd
from linate import IdeologicalEmbedding

We load a bipartite social graph of reference users $i$ being followed (on Twitter) by users $j$. Each row must be an edge, i.e., a comma-separated pair of node names. In this example, nodes $i$ are French parliamentarians on Twitter, and nodes $j$ are their followers.

In [None]:
bipartite = pd.read_csv('bipartite_graph.csv',dtype=str)
print('columns: '+str(bipartite.columns))
print('edges: '+str(bipartite.shape[0]))
print('num. of reference nodes i: '+ str(bipartite['i'].nunique()))
print('num. of follower nodes j: '+ str(bipartite['j'].nunique()))
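
As a rough illustration of the expected input, the sketch below builds a tiny edge list in memory and reads it the same way as above. The node names (`mp_a`, `user_1`, and so on) are made up for the example; only the column names `i` and `j` come from this tutorial.

```python
import io
import pandas as pd

# Hypothetical toy edge list: each row is one edge "i,j",
# meaning follower j follows reference user i.
toy_csv = io.StringIO(
    "i,j\n"
    "mp_a,user_1\n"
    "mp_a,user_2\n"
    "mp_b,user_1\n"
    "mp_c,user_3\n"
)

# Read node names as strings, as in the cell above.
toy_bipartite = pd.read_csv(toy_csv, dtype=str)

n_edges = toy_bipartite.shape[0]
n_reference = toy_bipartite['i'].nunique()
n_followers = toy_bipartite['j'].nunique()
print(n_edges, n_reference, n_followers)
```

Reading with `dtype=str` keeps numeric-looking identifiers (such as account IDs) from being parsed as numbers.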

We load the model, choosing the number of latent dimensions of the embedding, and the number of neighbors that a user $j$ must have to be kept in the bipartite graph (we normally want users that have made enough choices).

In [None]:
ideoembedding_model = IdeologicalEmbedding(n_latent_dimensions = 2,in_degree_threshold = 3)
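
To make the idea of a degree threshold concrete, here is a minimal sketch in plain Python that keeps only followers connected to at least a given number of reference users. It illustrates the kind of filtering described above and is not LINATE's internal code; the edge list and the `min_neighbors` name are hypothetical.

```python
from collections import Counter

# Hypothetical edges as (reference user i, follower j) pairs.
edges = [
    ('mp_a', 'user_1'), ('mp_b', 'user_1'), ('mp_c', 'user_1'),
    ('mp_a', 'user_2'),
    ('mp_b', 'user_3'), ('mp_c', 'user_3'),
]
min_neighbors = 3  # illustrative threshold, not a LINATE parameter name

# Count how many distinct reference users each follower is connected to
# (set() removes any duplicated edges first).
neighbor_counts = Counter(j for _, j in set(edges))

# Keep only the edges whose follower meets the threshold.
kept_edges = [(i, j) for i, j in edges if neighbor_counts[j] >= min_neighbors]
kept_followers = sorted({j for _, j in kept_edges})
print(kept_followers)
```

With a threshold of 3, only the follower with three connections survives; lowering the threshold keeps more users but includes those who have made fewer choices.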

LINATE works with directed graphs because it models networks as social choices: who chooses to connect or follow whom. Thus, we need to specify the direction of edges, which nodes are the source (those that chose) and which ones are the target (those that are chosen).

Note: several "engines" are available; which one to choose depends on how much memory your machine has.

In [None]:
bipartite.rename(columns={'i':'target','j':'source'},inplace=True)
bipartite

Alternatively, you can use the provided data loader.

In [None]:
bipartite = ideoembedding_model.load_input_from_file(
    'bipartite_graph.csv',
    header_names={'target':'i','source':'j'})

In [None]:
ideoembedding_model.fit(bipartite)

Once the ideological model is computed, we can retrieve the coordinates of the target nodes in the selected number of dimensions...

In [None]:
target_coords = ideoembedding_model.ideological_embedding_target_latent_dimensions_
target_coords

... and the coordinates of the followers.

In [None]:
source_coords = ideoembedding_model.ideological_embedding_source_latent_dimensions_
source_coords

Reference users often come in groups, which is helpful for interpreting what the dimensions are capturing. For this, we need a file identifying each reference user $i$ with a group $k$. In our example, parliamentarians belong to parties.

In [None]:
df_ref_group=pd.read_csv('reference_group.csv', dtype=str)
df_ref_group

Let's plot the ideological positions of references, followers, and groups. To plot users and groups, we attribute groups to target users:

In [None]:
import seaborn as sn
import matplotlib.pyplot as plt
color_dic = {'0':'blue','1':'red','2':'gold','3':'orange','4':'green',
             '5':'violet','6':'cyan','7':'magenta','8':'brown','9':'gray'}

In [None]:
target_coords['k'] = target_coords.index.map(df_ref_group.set_index('i')['k'])
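
The label lookup in the cell above can be tried on a toy example. The sketch below uses made-up node names and coordinates, and shows that a reference user missing from the group file ends up with a missing value for `k`.

```python
import pandas as pd

# Hypothetical coordinates indexed by reference user name.
toy_coords = pd.DataFrame(
    {'latent_dimension_0': [0.1, -0.3, 0.7]},
    index=['mp_a', 'mp_b', 'mp_c'],
)

# Hypothetical group file: 'mp_c' has no group on purpose.
toy_ref_group = pd.DataFrame({'i': ['mp_a', 'mp_b'], 'k': ['0', '1']})

# Look up each index value in a Series indexed by user name.
toy_coords['k'] = toy_coords.index.map(toy_ref_group.set_index('i')['k'])

# Users without a group get a missing value; drop them before plotting by group.
toy_with_group = toy_coords.dropna(subset=['k'])
print(toy_with_group)
```

Dropping rows without a group before plotting avoids looking up a missing value in a color dictionary such as `color_dic`.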

In [None]:
g = sn.jointplot(data=source_coords.drop_duplicates(),x='latent_dimension_0',y='latent_dimension_1',kind="hex")
ax=g.ax_joint
for k in target_coords['k'].unique():
    df_k = target_coords[target_coords['k']==k]
    ax.scatter(df_k['latent_dimension_0'],df_k['latent_dimension_1'],
               marker='+',s=30,alpha=0.5,color=color_dic[k])

## Embedding a bipartite graph in an attitudinal reference space

To map the network onto a space with explicit meanings for its dimensions, we need reference points, such as the positions of parties on some issues.

In [None]:
group_attitudes = pd.read_csv('group_attitudes.csv')
group_attitudes['k'] = group_attitudes['k'].astype(str)
group_attitudes

Because we know the positions of targets and their groups, we can compute group positions as means.

In [None]:
group_ideologies = target_coords.groupby('k').mean()
group_ideologies
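
As a small worked example of computing group positions as means, the sketch below uses made-up coordinates for four reference users in two groups; the names and values are hypothetical.

```python
import pandas as pd

# Hypothetical latent coordinates for four reference users in two groups.
toy_targets = pd.DataFrame(
    {
        'latent_dimension_0': [1.0, 3.0, -2.0, -4.0],
        'latent_dimension_1': [0.0, 2.0, 1.0, 3.0],
        'k': ['0', '0', '1', '1'],
    },
    index=['mp_a', 'mp_b', 'mp_c', 'mp_d'],
)

# Each group's position is the mean of its members' coordinates.
toy_group_means = toy_targets.groupby('k').mean()
print(toy_group_means)
```

Group `'0'` lands at the midpoint of its two members, and likewise for group `'1'`.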

In [None]:
fig = plt.figure(figsize=(10,4))  # width, height in inches
ax = {1:fig.add_subplot(1,2,1),2:fig.add_subplot(1,2,2)}
for k,row in group_ideologies[group_ideologies.index.isin(group_attitudes['k'])].iterrows():
    ax[1].plot(row['latent_dimension_0'],row['latent_dimension_1'],'o',mec='k',color=color_dic[k])
ax[1].set_xlabel('latent_dimension_0'),ax[1].set_ylabel('latent_dimension_1')
ax[1].set_title('Group positions in ideological space')
for k,row in group_attitudes.iterrows():
    ax[2].plot(row['issue_1'],row['issue_2'],'o',mec='k',color=color_dic[row['k']])
ax[2].set_xlabel('issue_1'),ax[2].set_ylabel('issue_2')
ax[2].set_title('Group positions in attitudinal space')

In [None]:
from linate import AttitudinalEmbedding
attiembedding_model = AttitudinalEmbedding(N = 2)

We need a DataFrame containing the latent coordinates as well as the names of the nodes:

In [None]:
target_coords['entity'] = target_coords.index 
target_coords

In [None]:
X = attiembedding_model.convert_to_group_ideological_embedding(target_coords, df_ref_group.rename(columns={'i':'entity','k':'group'}))
X

In [None]:
Y = group_attitudes.rename(columns={'k':'entity'})

Using positions of groups, we can compute a map between ideological and attitudinal space.

In [None]:
attiembedding_model.fit(X, Y)
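
One simple way to picture such a map is an affine transformation fitted by least squares on the group positions. The sketch below is only an illustration under that assumption: it is not taken from LINATE's source, and all coordinates and variable names are made up.

```python
import numpy as np

# Hypothetical group positions in ideological space (one row per group).
X_groups = np.array([[0.0, 0.0],
                     [1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])

# Hypothetical positions of the same groups in attitudinal space,
# generated from a known affine map so that the fit can be checked.
true_A = np.array([[2.0, 0.5],
                   [-1.0, 3.0]])
true_b = np.array([5.0, -2.0])
Y_groups = X_groups @ true_A.T + true_b

# Append a column of ones so the intercept is estimated with the matrix.
X_aug = np.hstack([X_groups, np.ones((X_groups.shape[0], 1))])

# Solve min ||X_aug @ W - Y_groups|| in the least-squares sense.
W, *_ = np.linalg.lstsq(X_aug, Y_groups, rcond=None)
A_hat, b_hat = W[:-1].T, W[-1]

# Apply the fitted map to a new point in ideological space.
new_point = np.array([0.5, 0.5])
mapped = A_hat @ new_point + b_hat
print(mapped)
```

Because the toy attitudinal positions are generated without noise, the fit recovers the generating map exactly; with real group positions the fit would only be approximate.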

If we provide our target coordinates with an entity column, we can transform their coordinates to attitudinal space.

In [None]:
target_coords['entity'] = target_coords.index
target_attitudinal = attiembedding_model.transform(target_coords)
target_attitudinal

Similarly, for source nodes:

In [None]:
source_coords['entity'] = source_coords.index
source_attitudinal = attiembedding_model.transform(source_coords)
source_attitudinal

If we attach each target user's group again, we can compute party positions according to the social network, and plot all nodes and groups in attitudinal space.

In [None]:
target_attitudinal['k'] = target_attitudinal['entity'].map(pd.Series(index=df_ref_group['i'].values,data=df_ref_group['k'].values))

g = sn.jointplot(data=source_attitudinal.drop_duplicates(),x='issue_1',y='issue_2',kind="hex")
ax=g.ax_joint
for k in target_attitudinal['k'].unique():
    df_k = target_attitudinal[target_attitudinal['k']==k]
    df_k_mean = df_k[['issue_1','issue_2']].mean()
    ax.scatter(df_k['issue_1'],df_k['issue_2'],marker='+',s=30,alpha=0.5,color=color_dic[k])
    ax.plot(df_k_mean['issue_1'],df_k_mean['issue_2'],'o',mec='k',color=color_dic[k],ms=7)