# Implementing the model in Python

To begin creating this graph in Python, we can import the nodes from *musae_facebook_target.csv*, using the standard Python csv library:

In [2]:
import csv
with open('data/musae_facebook_target.csv', 'r', encoding='utf-8') as csv_file:
    reader = csv.reader(csv_file)
    data = [line for line in reader]
    print(data[:10])
    print(len(data))

[['id', 'facebook_id', 'page_name', 'page_type'], ['0', '145647315578475', 'The Voice of China 中国好声音', 'tvshow'], ['1', '191483281412', 'U.S. Consulate General Mumbai', 'government'], ['2', '144761358898518', 'ESET', 'company'], ['3', '568700043198473', 'Consulate General of Switzerland in Montreal', 'government'], ['4', '1408935539376139', 'Mark Bailey MP - Labor for Miller', 'politician'], ['5', '134464673284112', 'Victor Dominello MP', 'politician'], ['6', '282657255260177', 'Jean-Claude Poissant', 'politician'], ['7', '239338246176789', 'Deputado Ademir Camilo', 'politician'], ['8', '544818128942324', 'T.C. Mezar-ı Şerif Başkonsolosluğu', 'government']]
22471


Here, we open the *csv* file with *utf-8* encoding, as some page names contain non-ASCII characters. We use *csv.reader* to read the file, and convert the result into a list of lists with a list comprehension (a compact construct that builds a new list by looping over another iterable). Finally, we confirm that the *csv* has loaded correctly by examining the first few rows and checking the length of the imported list, which should equal 22,471: one header row plus 22,470 data rows.
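To make the list comprehension concrete, the following minimal sketch reads a small in-memory csv (a header and two rows copied from the output above, wrapped in *io.StringIO* as a stand-in for the real file) and builds the same list of lists twice, once with an explicit loop and once with a comprehension. The names *sample_csv*, *rows_loop* and *rows_comprehension* are illustrative only and do not appear elsewhere in this chapter.

```python
import csv
import io

# A small stand-in for the real file: header plus two rows from the output above.
sample_csv = io.StringIO(
    "id,facebook_id,page_name,page_type\n"
    "0,145647315578475,The Voice of China 中国好声音,tvshow\n"
    "2,144761358898518,ESET,company\n"
)

# Explicit loop: append each parsed row to a list.
rows_loop = []
for line in csv.reader(sample_csv):
    rows_loop.append(line)

# Rewind the in-memory file, then build the same list with a comprehension.
sample_csv.seek(0)
rows_comprehension = [line for line in csv.reader(sample_csv)]

print(rows_loop == rows_comprehension)  # True
print(len(rows_comprehension))          # 3
```

Both approaches produce the same list; the comprehension simply expresses the loop in a single line.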

## Adding nodes and attributes

In *igraph*, we could add nodes one at a time to begin creating our graph, as we did in the previous chapter. However, as with many operations in Python, it is faster to add nodes listwise, in a single operation.

To prepare our data for this, we need a list of node ids and a list for each node attribute. We can prepare these lists using more list comprehensions:

In [3]:
node_ids = [int(row[0]) for row in data[1:]]
page_names = [row[2] for row in data[1:]]
page_types = [row[3] for row in data[1:]]

Note that the *[1:]* slice on *data* removes the csv header row from each list, since we do not want to include it as a node.

We need to confirm that the *id* column of our data increases sequentially from 0. As mentioned in chapter 1, *igraph* uses a sequentially increasing integer index for every node added to the graph. If our id column follows the same scheme, adding nodes will be a simple process. To confirm that this is the case, we can compare the id column to a Python *range()*:

In [4]:
assert node_ids == list(range(len(node_ids)))

This assert makes sure that a *range()* from 0 up to (but not including) the len() of the node_ids list is equivalent to the list of node ids in our data. It should raise no AssertionError, as the two are identical.
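The same comparison can be tried on toy data. The sketch below uses two hypothetical id lists (not taken from the dataset) to show the check passing and failing. As an assumption about how the failing case might be handled, it also builds a dictionary that maps each original id to a sequential position; the helper *is_sequential* and the name *id_to_index* are illustrative only.

```python
def is_sequential(ids):
    """Return True if ids are exactly 0, 1, 2, ... in order."""
    return ids == list(range(len(ids)))

# Hypothetical id lists, for illustration only.
good_ids = [0, 1, 2, 3]
gappy_ids = [0, 1, 3, 7]

print(is_sequential(good_ids))   # True
print(is_sequential(gappy_ids))  # False

# One possible fallback when ids are not sequential: map each original id
# to its position in the list, which could then serve as the graph index.
id_to_index = {original: index for index, original in enumerate(gappy_ids)}
print(id_to_index[7])  # 3
```

For the Facebook page data this fallback is not needed, since the assert above passes.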

Because the ids are already sequential, importing our nodes into *igraph* is as simple as creating a new, undirected, empty graph and telling igraph how many nodes we would like:

In [5]:
import igraph as ig
g = ig.Graph(directed=False)
g.add_vertices(len(node_ids))


We can confirm how many nodes have been added by accessing the *vs* attribute of the *Graph()* object, and check that this is equal to the length of the *node_ids* list using another *assert*:

In [6]:
print(len(g.vs))
assert len(node_ids) == len(g.vs)

22470


This shows that the graph has 22470 nodes, one fewer than the number of rows in the original csv file, because the header row was removed. The assert compares the two lengths with the == equality operator and raises an AssertionError if they differ.

Now that nodes have been added, we can add our attributes to the nodes in a listwise operation, using the attribute lists *page_names* and *page_types* that were prepared earlier:

In [7]:
g.vs['page_name'] = page_names
g.vs['page_type'] = page_types

Here we use the *vs* attribute of the graph to write the page names and page types in order, from the node with id 0 to the node with id 22469. Because each list was built from the rows in their original order, assigning a whole list at once attaches each value to the correct node, and is the quickest way to add all of our node attributes.
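Why the ordering matters can be seen on a small sample. This sketch uses a header and three rows copied from the csv output shown earlier (the *sample* variables are illustrative), builds the lists in the same way as before, and checks that position i in every list describes the same page.

```python
# Header plus three rows, copied from the csv output shown earlier.
sample = [
    ['id', 'facebook_id', 'page_name', 'page_type'],
    ['0', '145647315578475', 'The Voice of China 中国好声音', 'tvshow'],
    ['1', '191483281412', 'U.S. Consulate General Mumbai', 'government'],
    ['2', '144761358898518', 'ESET', 'company'],
]

# Each comprehension walks the data rows in the same order.
sample_ids = [int(row[0]) for row in sample[1:]]
sample_names = [row[2] for row in sample[1:]]
sample_types = [row[3] for row in sample[1:]]

# zip() pairs up the i-th entry of every list; each triple should match
# the original row for that id.
for node_id, name, page_type in zip(sample_ids, sample_names, sample_types):
    original_row = sample[node_id + 1]  # +1 skips the header row
    assert original_row[2] == name
    assert original_row[3] == page_type

print(sample_names[2], sample_types[2])  # ESET company
```

If the lists had been sorted or filtered independently, these checks would fail, and a listwise assignment would attach attributes to the wrong nodes.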

We can confirm that node attributes have been written to the graph with:

In [8]:
print(g.vs[0]['page_name'])
print(g.vs[0]['page_type'])

The Voice of China 中国好声音
tvshow


This should print the page name and page type from the first data row of our original csv.

## Adding edges

Now that our nodes have been added to the graph, we can begin to connect them together. All the information we need to do this is contained in *musae_facebook_edges.csv*, so let's import this file: