# (Personalized) PageRank

## Graph

* **indegree**: number of incoming edges of a vertex
* **outdegree**: number of outgoing edges of a vertex
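
A minimal sketch of both definitions on a toy directed edge list. The edge list and variable names are made up for illustration and are not part of the WordNet graph used later.

```python
from collections import Counter

# Toy directed edges as (source, target) pairs; purely illustrative.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c")]

# outdegree: how many edges leave each vertex
outdegree = Counter(src for src, _ in toy_edges)
# indegree: how many edges arrive at each vertex
indegree = Counter(dst for _, dst in toy_edges)

print(dict(outdegree))
print(dict(indegree))
```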

## PageRank
PageRank ranks the vertices of a graph according to their structural importance.

Given a link from v<sub>i</sub> to v<sub>j</sub>:
* v<sub>i</sub> casts a vote for v<sub>j</sub>
* the rank of v<sub>j</sub> increases
* the strength of the vote depends on the rank of v<sub>i</sub>



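P = cMP + (1-c)v

Important terms
* **P**: vector of ranks, one entry per vertex
* **M**: transition matrix of the graph; the vote of v<sub>i</sub> is split evenly over its outgoing links
* **first part** (cMP): voting scheme
* **second part** ((1-c)v): smoothing factor (random jump) -> makes the chain aperiodic + irreducible -> the iteration converges to a unique stationary distribution
* **damping** (c): usually set to 0.85, weights the first part against the second part
* **v**: stochastic vector (entries sum to 1), uniform in standard PageRank -> modifying this makes it **Personalized**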
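
The formula can be solved by simply iterating it until P stops changing (power iteration). Below is a minimal pure-Python sketch, not the snap implementation. The function name `pagerank`, the dict-based graph, and the handling of vertices without outgoing links (their mass is redistributed according to v) are assumptions made for this example.

```python
def pagerank(out_links, v, c=0.85, tol=1e-10, max_iter=200):
    """Iterate P = c*M*P + (1-c)*v until P stops changing.

    out_links: dict mapping each node to the list of nodes it links to
    v: dict mapping each node to a probability (values sum to 1)
    """
    nodes = list(out_links)
    p = dict(v)  # start from v; any probability distribution would do
    for _ in range(max_iter):
        # second part of the formula: (1-c) * v
        new_p = {n: (1.0 - c) * v[n] for n in nodes}
        # first part: every node splits c * rank evenly over its out-links
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = c * p[src] / len(targets)
                for dst in targets:
                    new_p[dst] += share
            else:
                # assumption: a node without out-links hands its mass back via v
                for n in nodes:
                    new_p[n] += c * p[src] * v[n]
        diff = sum(abs(new_p[n] - p[n]) for n in nodes)
        p = new_p
        if diff < tol:
            break
    return p


toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
uniform_v = {n: 1.0 / len(toy_graph) for n in toy_graph}
ranks = pagerank(toy_graph, uniform_v)
print(ranks)
```

With a uniform v this is standard PageRank. In the toy graph, "c" receives votes from both "a" and "b", so it ends up ranked above "b", which only gets half of the vote of "a".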

## Personalized PageRank

Given **input text**:
* extract all content words (list *W<sub>i</sub>*)

Insert **context words**:
* [DONE]: add them as nodes in the graph
* [TODO]: link the word nodes with directed edges to their candidate concepts
    * [QUESTION]: the graph is currently a snap.PUNGraph (undirected); how can a directed edge be added to it?
* [TODO]: concentrate the initial probability mass uniformly over the newly introduced word nodes
    * [QUESTION]: how?
* [TODO]: run PageRank?
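
One possible reading of the open steps, sketched in pure Python on a toy graph rather than with snap. All concept names (`coach.vehicle`, `coach.trainer`, ...) and the `w:` prefix for word nodes are hypothetical. The sketch assumes that an undirected concept edge can be stored as two directed edges, that word nodes only get outgoing edges to their candidate concepts, and that "concentrating the mass" means v = 1/(number of word nodes) on word nodes and 0 elsewhere.

```python
def personalized_pagerank(out_links, v, c=0.85, iterations=50):
    """Power iteration for P = c*M*P + (1-c)*v; every node needs an out-link."""
    p = dict(v)
    for _ in range(iterations):
        new_p = {n: (1.0 - c) * v[n] for n in out_links}
        for src, targets in out_links.items():
            for dst in targets:
                new_p[dst] += c * p[src] / len(targets)
        p = new_p
    return p


# Undirected concept graph, stored as directed edges in both directions.
out_links = {
    "coach.vehicle": ["fleet.vehicles", "seat.furniture"],
    "fleet.vehicles": ["coach.vehicle"],
    "seat.furniture": ["coach.vehicle"],
    "coach.trainer": ["team.sport"],
    "team.sport": ["coach.trainer"],
}

# Word nodes point to their candidate concepts; nothing points back at them.
candidates = {
    "w:coach": ["coach.vehicle", "coach.trainer"],
    "w:fleet": ["fleet.vehicles"],
    "w:seat": ["seat.furniture"],
}
out_links.update(candidates)

# Initial probability mass: uniform over the word nodes, zero on concepts.
v = {n: 0.0 for n in out_links}
for word in candidates:
    v[word] = 1.0 / len(candidates)

ranks = personalized_pagerank(out_links, v)
best = max(candidates["w:coach"], key=ranks.get)
print(best)
```

In this toy setup the vehicle sense of "coach" collects rank from the neighbouring concepts reached via "fleet" and "seat", so it outranks the trainer sense.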

In [5]:
import snap

In [10]:
import wn_utils
import snap_utils

In [11]:
lemmapos2mfs_offset = wn_utils.get_lemmapos2mfs_offset()
the_candidates = wn_utils.get_candidates(lemmapos2mfs_offset)

In [7]:
edges = wn_utils.load_wn_edges(categories={'hyponym_hypernym', 'meronym_holonym', 'others'})

In [21]:
g = snap_utils.load_wn_as_undirected_graph(edges)

In [22]:
g.GetNodes()

108908

#### example sentence
```
Our fleet comprises coaches from 35 to 58 seats.
```

In [18]:
content_words = {1 : ('fleet', 'n'),
                 2: ('comprise', 'v'),
                 3: ('coach', 'n'),
                 4: ('seat', 'n')}

#### add nodes to the graph

In [24]:
for identifier, (lemma, pos) in content_words.items():
    g.AddNode(identifier)

In [25]:
g.GetNodes()

108912