This notebook contains an example of how to use the Python version of the package to analyze the [Marvel character social network](https://arxiv.org/abs/cond-mat/0202174), a bipartite network in which the top nodes are the characters and the bottom nodes are the comic books.

A cleaned edge list can be found at [http://syntagmatic.github.io/exposedata/marvel/data/source.csv](http://syntagmatic.github.io/exposedata/marvel/data/source.csv).
An edge between a character and a comic book indicates the character appears in that book.

Let's first load the package.
You will need to install `birankpy` beforehand.

In [1]:
import birankpy
import pandas as pd

First, download the edge list.

In [2]:
marvel_df = pd.read_csv(
    "http://syntagmatic.github.io/exposedata/marvel/data/source.csv",
    names=['character', 'comic_book']
)

The number of distinct characters is:

In [3]:
marvel_df.character.nunique()

6444

The number of distinct comic books is:

In [4]:
marvel_df.comic_book.nunique()

12849
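To make the two node sets concrete, here is a minimal sketch on a made-up edge list (the names are illustrative and not taken from the Marvel data). It counts distinct top and bottom nodes in the same spirit as `nunique()`, using only the standard library.

```python
# Toy bipartite edge list of (character, comic_book) pairs. Hypothetical data.
toy_edges = [
    ("HERO A", "BOOK 1"),
    ("HERO A", "BOOK 2"),
    ("HERO B", "BOOK 1"),
    ("HERO C", "BOOK 3"),
]

# Distinct top nodes (characters) and bottom nodes (comic books).
toy_characters = {character for character, _ in toy_edges}
toy_books = {book for _, book in toy_edges}

n_characters = len(toy_characters)
n_books = len(toy_books)
print(n_characters, n_books)
```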

Now, let's create a `BipartiteNetwork` instance to handle the network.

In [5]:
bn = birankpy.BipartiteNetwork()

Set the edge list.

In [6]:
bn.set_edgelist(
    marvel_df,
    top_col='character', bottom_col='comic_book',
    weight_col=None
)

Remember to specify the columns for top and bottom nodes.

Note that this network has no edge weight, but for other networks, you can also specify the column for edge weight.
The ranking will change accordingly.
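As a rough illustration of why weights matter, the sketch below sums edge weights per character on a small hypothetical edge list (the weights are invented, and `node_strength` is a helper defined here, not part of `birankpy`). It only shows that a node's total weight can order characters differently from a plain edge count.

```python
# Hypothetical weighted edges: (character, comic_book, weight).
weighted_edges = [
    ("HERO A", "BOOK 1", 1.0),
    ("HERO A", "BOOK 2", 1.0),
    ("HERO B", "BOOK 1", 5.0),
]

def node_strength(edges, use_weights):
    """Sum edge weights per character; every edge counts as 1 when unweighted."""
    totals = {}
    for character, _book, weight in edges:
        increment = weight if use_weights else 1.0
        totals[character] = totals.get(character, 0.0) + increment
    return totals

unweighted = node_strength(weighted_edges, use_weights=False)
weighted = node_strength(weighted_edges, use_weights=True)
print(unweighted)
print(weighted)
```

Without weights HERO A has the larger total; with weights HERO B does, so any ranking built on these totals can change.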

Now we can try the BiRank algorithm with different normalizers:

In [7]:
char_birank_df, _ = bn.generate_birank(normalizer='HITS')
char_birank_df.sort_values(by='character_birank', ascending=False).head()

Unnamed: 0,character,character_birank
14,CAPTAIN AMERICA,0.027031
63,IRON MAN/TONY STARK,0.019933
34,THING/BENJAMIN J. GR,0.019904
115,HUMAN TORCH/JOHNNY S,0.019248
113,MR. FANTASTIC/REED R,0.018823


In [8]:
char_birank_df, _ = bn.generate_birank(normalizer='CoHITS')
char_birank_df.sort_values(by='character_birank', ascending=False).head()

Unnamed: 0,character,character_birank
80,SPIDER-MAN/PETER PAR,0.014299
14,CAPTAIN AMERICA,0.011232
63,IRON MAN/TONY STARK,0.009827
23,HULK/DR. ROBERT BRUC,0.007843
34,THING/BENJAMIN J. GR,0.007838
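
The two runs differ only in how edge weights are normalized before scores are passed between characters and comic books. The following is a simplified, standard-library sketch of that kind of iteration, written from the general idea of HITS-style (no degree normalization) and CoHITS-style (divide by the sender's degree) propagation. It is not the `birankpy` implementation; the function name `toy_birank`, the damping value of 0.85, and the toy edges are assumptions made for illustration.

```python
def toy_birank(edges, normalizer="HITS", damping=0.85, n_iter=100):
    """Score the top nodes of an unweighted bipartite edge list (illustrative only)."""
    tops = sorted({t for t, _ in edges})
    bottoms = sorted({b for _, b in edges})
    top_deg = {t: 0 for t in tops}
    bottom_deg = {b: 0 for b in bottoms}
    for t, b in edges:
        top_deg[t] += 1
        bottom_deg[b] += 1

    # Uniform starting (and teleport) scores for both sides.
    top0 = {t: 1.0 / len(tops) for t in tops}
    bottom0 = {b: 1.0 / len(bottoms) for b in bottoms}
    top_score = dict(top0)

    for _ in range(n_iter):
        # Push scores from top nodes to bottom nodes.
        bottom_score = {b: (1 - damping) * bottom0[b] for b in bottoms}
        for t, b in edges:
            share = 1.0 if normalizer == "HITS" else 1.0 / top_deg[t]
            bottom_score[b] += damping * share * top_score[t]
        # Push scores back from bottom nodes to top nodes.
        new_top = {t: (1 - damping) * top0[t] for t in tops}
        for t, b in edges:
            share = 1.0 if normalizer == "HITS" else 1.0 / bottom_deg[b]
            new_top[t] += damping * share * bottom_score[b]
        # HITS-style propagation does not preserve the total, so rescale
        # (this is a no-op for the CoHITS-style variant).
        total = sum(new_top.values())
        top_score = {t: s / total for t, s in new_top.items()}
    return top_score

# Hypothetical data: HERO A appears in three books, the others in one each.
toy_edges = [
    ("HERO A", "BOOK 1"), ("HERO A", "BOOK 2"), ("HERO A", "BOOK 3"),
    ("HERO B", "BOOK 1"), ("HERO C", "BOOK 2"),
]
hits = toy_birank(toy_edges, normalizer="HITS")
cohits = toy_birank(toy_edges, normalizer="CoHITS")
print(hits)
print(cohits)
```

On this toy input both variants rank HERO A first, but the scores themselves differ, which is why the two tables can order the same characters differently.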


Another way of performing PageRank on bipartite networks is to first convert them to their unipartite counterparts through projection, and then apply PageRank.
This can also be done with our package.

First, project the network onto the characters:

In [10]:
un = bn.unipartite_projection(on='character')
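
Conceptually, projecting onto the characters links two characters whenever they share at least one comic book. A minimal sketch of that idea in plain Python follows. Weighting each link by the number of shared books is one common choice and an assumption here, not necessarily what `unipartite_projection` does; the edge list is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (character, comic_book) pairs.
toy_edges = [
    ("HERO A", "BOOK 1"),
    ("HERO B", "BOOK 1"),
    ("HERO A", "BOOK 2"),
    ("HERO B", "BOOK 2"),
    ("HERO C", "BOOK 2"),
]

# Group characters by the comic book they appear in.
cast_by_book = defaultdict(set)
for character, book in toy_edges:
    cast_by_book[book].add(character)

# Every pair of characters sharing a book gets a link; count the shared books.
projection = defaultdict(int)
for cast in cast_by_book.values():
    for a, b in combinations(sorted(cast), 2):
        projection[(a, b)] += 1

print(dict(projection))
```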

In [12]:
char_projected_pagerank_df = un.generate_pagerank()

In [14]:
char_projected_pagerank_df.sort_values(by='pagerank', ascending=False).head()

Unnamed: 0,top_index,character,pagerank
14,14,CAPTAIN AMERICA,0.010981
80,80,SPIDER-MAN/PETER PAR,0.010856
63,63,IRON MAN/TONY STARK,0.008206
123,123,WOLVERINE/LOGAN,0.007166
105,105,THOR/DR. DONALD BLAK,0.007095
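
For intuition about the last step, here is a simplified power-iteration PageRank on a small, hypothetical weighted projection. The function name `toy_pagerank`, the damping factor of 0.85, and the link weights are assumptions for illustration; `generate_pagerank` may differ in its details.

```python
def toy_pagerank(weighted_links, damping=0.85, n_iter=100):
    """PageRank on an undirected weighted graph given as {(u, v): weight}."""
    # Build a symmetric adjacency map.
    neighbors = {}
    for (u, v), w in weighted_links.items():
        neighbors.setdefault(u, {})[v] = w
        neighbors.setdefault(v, {})[u] = w
    nodes = sorted(neighbors)
    strength = {u: sum(neighbors[u].values()) for u in nodes}

    rank = {u: 1.0 / len(nodes) for u in nodes}
    for _ in range(n_iter):
        # Teleport term: every node receives the same baseline share.
        new_rank = {u: (1 - damping) / len(nodes) for u in nodes}
        for u in nodes:
            # Each node splits its rank among neighbors in proportion to weight.
            for v, w in neighbors[u].items():
                new_rank[v] += damping * rank[u] * w / strength[u]
        rank = new_rank
    return rank

# Hypothetical projection: link weight = number of shared comic books.
toy_links = {
    ("HERO A", "HERO B"): 2,
    ("HERO A", "HERO C"): 1,
    ("HERO B", "HERO C"): 1,
}
ranks = toy_pagerank(toy_links)
print(ranks)
```

In this toy graph HERO A and HERO B are tied and both outrank HERO C, because they carry more total link weight.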
