## Generate a simple social graph

[python-igraph Manual](http://igraph.org/python/doc/tutorial/tutorial.html)

In [10]:
import igraph
print(igraph.__version__)


0.7.1


In [11]:
from igraph import *

In [12]:
g = Graph()

In [25]:
print(g)

IGRAPH U--- 0 0 --


In [26]:
g.add_vertices(3)

In [27]:
g.add_edges([(0,1), (1,2)])

In [23]:
print(g)

IGRAPH U--- 3 2 --
+ edges:
0--1 1--2


In [29]:
g.add_edges([(2,0)])
g.add_vertices(3)
g.add_edges([(2,3),(3,4),(4,5),(5,3)])
print(g)

IGRAPH U--- 6 7 --
+ edges:
0--1 1--2 0--2 2--3 3--4 4--5 3--5


In [30]:
g.add_edges([(2,0)])
print(g)  # adding (2,0) a second time creates a multi-edge

IGRAPH U--- 6 8 --
+ edges:
0--1 1--2 0--2 2--3 3--4 4--5 3--5 0--2


In [31]:
g.get_eid(2,3)

3

In [32]:
g.delete_edges(3)

In [33]:
summary(g)

IGRAPH U--- 6 7 -- 


Graph.Tree() generates a regular tree graph. The one generated below has 127 vertices, and each vertex (apart from the leaves) has two children (and, apart from the root, one parent). Put simply, every internal vertex has exactly two branches. No matter how many times you call Graph.Tree(), the generated graph will always be the same if you use the same parameters:

In [34]:
g = Graph.Tree(127, 2)

In [35]:
summary(g)

IGRAPH U--- 127 126 -- 


In [39]:
g2 = Graph.Tree(127, 2)
g2.get_edgelist() == g.get_edgelist()

True

In [42]:
g2.get_edgelist()[0:10]  # plot the graph and the structure becomes obvious

[(0, 1),
 (0, 2),
 (1, 3),
 (1, 4),
 (2, 5),
 (2, 6),
 (3, 7),
 (3, 8),
 (4, 9),
 (4, 10)]
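The edge list above follows a simple pattern: with breadth-first numbering, vertex `c` hangs below vertex `(c - 1) // 2`. The sketch below rebuilds the same list in plain Python. `tree_edges` is a hypothetical helper written for these notes, not igraph's own implementation.

```python
def tree_edges(n, children=2):
    """Edge list of a regular tree whose vertices are numbered breadth-first."""
    # Every vertex c >= 1 is attached to its parent (c - 1) // children.
    return [((c - 1) // children, c) for c in range(1, n)]


edges = tree_edges(127, 2)
print(len(edges))   # a tree on n vertices has n - 1 edges
print(edges[:10])
```

Because the rule is deterministic, two calls with the same parameters always give identical edge lists, which matches the `True` result above.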

Graph.GRG() generates a geometric random graph: n points are chosen randomly and uniformly inside the unit square, and pairs of points closer to each other than a predefined distance d are connected by an edge. In our case, n is 100 and d is 0.2. Because the algorithm is random, the graph you get will almost certainly differ from the one generated when this tutorial was written, so the values in the summary below will not match yours. This is normal and expected. Even if you generate two geometric random graphs on the same machine with the same parameters, they will be different:

In [41]:
g = Graph.GRG(100, 0.2)
summary(g)

IGRAPH U--- 100 551 -- 
+ attr: x (v), y (v)


In [43]:
g2 = Graph.GRG(100, 0.2)
g.get_edgelist() == g2.get_edgelist()

False
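The construction described above is short enough to sketch in plain Python. This is an illustration of the idea only, not igraph's implementation; `geometric_random_graph` and its `seed` parameter are hypothetical names introduced for these notes.

```python
import math
import random


def geometric_random_graph(n, d, seed=None):
    """Drop n random points in the unit square; link pairs closer than d."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < d
    ]
    return points, edges


points_a, edges_a = geometric_random_graph(100, 0.2, seed=1)
points_b, edges_b = geometric_random_graph(100, 0.2, seed=1)
print(edges_a == edges_b)  # same seed, same points, same edges
```

Without a fixed seed, two calls draw different points, which is why the two edge lists compared above differ.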

isomorphic() tells you whether two graphs are isomorphic or not. In general, this can take quite a lot of time, especially for large graphs, but in our case the answer can be given quickly by checking the degree distributions of the two graphs.

In [45]:
g.isomorphic(g2)

False
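The degree-based shortcut mentioned above works in one direction only: different degree sequences prove that two graphs are not isomorphic, while equal sequences prove nothing. A minimal plain-Python sketch follows; the helper names are hypothetical, and vertices without edges are ignored for brevity.

```python
from collections import Counter


def degree_sequence(edges):
    """Sorted vertex degrees of an undirected graph given as an edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree.values())


def may_be_isomorphic(edges_a, edges_b):
    """False means certainly not isomorphic; True means a full test is needed."""
    return degree_sequence(edges_a) == degree_sequence(edges_b)


triangle = [(0, 1), (1, 2), (2, 0)]
path = [(0, 1), (1, 2), (2, 3)]
relabelled_path = [(3, 2), (2, 0), (0, 1)]

print(may_be_isomorphic(triangle, path))
print(may_be_isomorphic(path, relabelled_path))
```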

## [Setting and retrieving attributes](http://igraph.org/python/doc/tutorial/tutorial.html#setting-and-retrieving-attributes)

igraph uses vertex and edge IDs in its core. These IDs are integers, starting from zero, and they are always contiguous at any given moment during the lifetime of the graph. This means that whenever vertices or edges are deleted, many edge IDs, and possibly vertex IDs, are renumbered to keep the range contiguous. Now, let us assume that our graph is a social network where vertices represent people and edges represent social connections between them. One way to maintain the association between vertex IDs and, say, the corresponding names is to keep an additional Python list that maps vertex IDs to names. The drawback of this approach is that the additional list must be updated in parallel with every modification of the graph. Luckily, igraph supports attributes, i.e., auxiliary objects associated with a given vertex or edge of a graph, or even with the graph as a whole. Every igraph Graph, vertex and edge behaves like a standard Python dictionary in some sense: you can add key-value pairs to any of them, with the key representing the name of your attribute (the only restriction is that it must be a string) and the value representing the attribute itself. In other words, an attribute is identified by its key.

In [13]:
g = Graph([(0,1), (0,2), (2,3), (3,4), (4,2), (2,5), (5,0), (6,3), (5,6)])
g.summary()

'IGRAPH U--- 7 9 -- '

Now, let us assume that we want to store the names, ages and genders of people in this network as vertex attributes, and for every connection, we want to store whether it is an informal friendship tie or a formal tie. Every Graph object contains two special members called vs and es, standing for the sequence of all vertices and all edges, respectively. If you use vs or es as a Python dictionary, you manipulate the attribute storage area of the graph:

In [14]:
g.vs

<igraph.VertexSeq at 0x10a572a48>

In [15]:
g.vs["name"] = ["Alice", "Bob", "Claire", "Dennis", "Esther", "Frank", "George"] # .vs assigns attributes to vertices

In [16]:
g.vs["age"] = [25, 31, 18, 47, 22, 23, 50]

In [17]:
g.vs["gender"] = ["f", "m", "f", "m", "f", "m", "m"]

In [18]:
g.es["is_formal"] = [False, False, True, True, True, False, True, False, False] # .es assigns attributes to edges

Whenever you use vs (VertexSeq) or es (EdgeSeq) as a dictionary, you are assigning attributes to all vertices or edges of the graph. However, you can also alter the attributes of vertices and edges individually by indexing vs or es with integers as if they were lists (remember, they are sequences that contain all the vertices or all the edges). When you index them, you obtain a Vertex or Edge object, which refers to a single vertex or a single edge of the graph. Vertex and Edge objects can also be used as dictionaries to alter the attributes of that single vertex or edge:

In [19]:
g.es[0]

igraph.Edge(<igraph.Graph object at 0x10a4c7a98>, 0, {'is_formal': False})

In [62]:
g.vs[0]

igraph.Vertex(<igraph.Graph object at 0x10c38a228>, 0, {'name': 'Alice', 'age': 25, 'gender': 'f'})

In [63]:
g.es[0].attributes()

{'is_formal': False}

In [66]:
g.es[0]["is_formal"] = True # reassign the attribute

In [67]:
g.es[0]

igraph.Edge(<igraph.Graph object at 0x10c38a228>, 0, {'is_formal': True})

In [70]:
g.es

<igraph.EdgeSeq at 0x10c294888>

Not too surprisingly, Graph objects themselves can also behave as dictionaries:

In [71]:
g["date"] = "2009-01-10"

In [72]:
print(g["date"])

2009-01-10


Finally, it should be mentioned that attributes can be deleted by the Python keyword `del` just as you would do with any member of an ordinary dictionary:

In [74]:
g.vs[3]["foo"] = "bar"

In [75]:
g.vs["foo"]

[None, None, None, 'bar', None, None, None]

In [76]:
del g.vs["foo"]

In [77]:
g.vs["foo"]

KeyError: 'Attribute does not exist'

## [Structural properties of graphs](http://igraph.org/python/doc/tutorial/tutorial.html#structural-properties-of-graphs)

Besides the simple graph and attribute manipulation routines described above, igraph provides a large set of methods to calculate various structural properties of graphs. It is beyond the scope of this tutorial to document all of them, hence this section will only introduce a few of them for illustrative purposes. We will work on the small social network we built in the previous section.

Probably the simplest property one can think of is the vertex degree. The degree of a vertex equals the number of edges incident to it. In directed networks, we can also define the in-degree (the number of edges pointing towards the vertex) and the out-degree (the number of edges originating from the vertex). igraph can calculate all of them using a simple syntax:

In [78]:
g.degree()

[3, 1, 4, 3, 2, 3, 2]

If the graph were directed, we could calculate the in- and out-degrees separately using g.degree(type="in") and g.degree(type="out"). You can also pass a single vertex ID or a list of vertex IDs to degree() if you want to calculate the degrees for only a subset of vertices:

In [80]:
g.degree(6) # degree of the last vertex

2

In [81]:
g.degree([2,3,4])

[4, 3, 2]

Besides degree, igraph includes built-in routines to calculate many other centrality properties, including vertex and edge betweenness (`Graph.betweenness()`, `Graph.edge_betweenness()`) and Google’s PageRank (`Graph.pagerank()`), to name just a few. The cells below call all three:

In [82]:
g.edge_betweenness()

[6.0, 6.0, 4.0, 2.0, 4.0, 3.0, 4.0, 3.0, 4.0]

In [83]:
g.betweenness()

[5.0, 0.0, 5.5, 1.5, 0.0, 2.5, 0.5]

In [84]:
g.pagerank()

[0.1715187083669299,
 0.07002553879920158,
 0.20933537164407268,
 0.16151684644322287,
 0.11167544439518333,
 0.16265174994590004,
 0.11327634040548959]

Now we can also figure out which connections have the highest betweenness centrality with some Python magic:

In [86]:
ebs = g.edge_betweenness()
max_eb = max(ebs)
[g.es[idx].tuple for idx, eb in enumerate(ebs) if eb == max_eb] # calling this Python magic is no exaggeration

[(0, 1), (0, 2)]

In [95]:
[g.es[idx] for idx, eb in enumerate(ebs) if eb == max_eb] # so .tuple shows the edge as the pair of vertex IDs it connects

[igraph.Edge(<igraph.Graph object at 0x10c38a228>, 0, {'is_formal': True}),
 igraph.Edge(<igraph.Graph object at 0x10c38a228>, 1, {'is_formal': False})]

Most structural properties can also be retrieved for a subset of vertices or edges or for a single vertex or edge by calling the appropriate method on the VertexSeq, EdgeSeq, Vertex or Edge object of interest:

In [96]:
g.vs.degree()

[3, 1, 4, 3, 2, 3, 2]

In [97]:
g.es.edge_betweenness()

[6.0, 6.0, 4.0, 2.0, 4.0, 3.0, 4.0, 3.0, 4.0]

In [98]:
g.vs[2].degree()

4

## [Querying vertices and edges based on attributes](http://igraph.org/python/doc/tutorial/tutorial.html#querying-vertices-and-edges-based-on-attributes)

Imagine that in a given social network, you would like to find out who has the largest degree or betweenness centrality. You can do that with the tools presented so far and some basic Python knowledge, but since it is a common task to select vertices and edges based on attributes or structural properties, igraph gives you an easier way to do that:

In [5]:
from igraph import *
g = Graph([(0,1), (0,2), (2,3), (3,4), (4,2), (2,5), (5,0), (6,3), (5,6)])
g.summary()
g.vs["name"] = ["Alice", "Bob", "Claire", "Dennis", "Esther", "Frank", "George"] # .vs assigns attributes to vertices
g.vs["age"] = [25, 31, 18, 47, 22, 23, 50]
g.vs["gender"] = ["f", "m", "f", "m", "f", "m", "m"]
g.es["is_formal"] = [False, False, True, True, True, False, True, False, False] # .es assigns attributes to edges

In [6]:
g.vs.attribute_names() # >>> ['name', 'age', 'gender']
g.vs.select(_degree = g.maxdegree())["name"] # g.maxdegree() >>> 4

['Claire']

### More on the .select() method (filters out the vertices that match a condition)

The syntax may seem a little awkward at first sight, so let’s interpret it step by step. `select()` is a method of VertexSeq and its sole purpose is to filter a VertexSeq based on the properties of individual vertices. The way it filters the vertices depends on its positional and keyword arguments. Positional arguments (the ones without an explicit name, unlike `_degree` above) are always processed before keyword arguments, as follows:

1. If the first positional argument is None, an empty sequence (containing no vertices) is returned:

In [18]:
seq = g.vs.select(None)
len(seq)

0

2. If the first positional argument is a callable object (i.e., a function, a bound method or anything that behaves like a function), the object will be called for every vertex that’s currently in the sequence. If the function returns True, the vertex will be included, otherwise it will be excluded:

In [19]:
graph = Graph.Full(10)
only_odd_vertices = graph.vs.select(lambda vertex: vertex.index % 2 == 1) # the function is called for every vertex in graph.vs; each vertex has an index
len(only_odd_vertices)

5

In [20]:
graph.vs.indices

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

3. If the first positional argument is an iterable (i.e., a list, a generator or anything that can be iterated over), it must return integers and these integers will be considered as indices into the current vertex set (which is not necessarily the whole graph). Only those vertices that match the given indices will be included in the filtered vertex set. Floats, strings, invalid vertex IDs will silently be ignored:

In [26]:
seq = graph.vs.select([2, 3, 7])
len(seq)

3

In [27]:
[v.index for v in seq]

[2, 3, 7]

In [28]:
seq = seq.select([0, 2])         # filtering an existing vertex set further

In [29]:
[v.index for v in seq]

[2, 7]

In [32]:
seq = graph.vs.select([2, 3, 7, "foo", 3.5])
len(seq)  # only vertices that match the given indices are kept;
          # floats, strings and invalid vertex IDs are silently ignored

3

4. If the first positional argument is an integer, all remaining arguments are also expected to be integers and they are interpreted as indices into the current vertex set. This is just syntactic sugar, you could achieve an equivalent effect by passing a list as the first positional argument, but this way you can omit the square brackets:

In [33]:
seq = graph.vs.select(2, 3, 7)  # the lazy shortcut: no square brackets needed
len(seq)

3

Keyword arguments can be used to filter the vertices based on their attributes or their structural properties. The name of each keyword argument should consist of at most two parts: the name of the attribute or structural property and the filtering operator. The operator can be omitted; in that case, we automatically assume the equality operator. The possibilities are as follows (where name denotes the name of the attribute or property):

`name_eq` : The attribute/property value must be equal to the value of the keyword argument

`name_ne` : The attribute/property value must not be equal to the value of the keyword argument

`name_lt` : The attribute/property value must be less than the value of the keyword argument

`name_le` : The attribute/property value must be less than or equal to the value of the keyword argument

`name_gt` : The attribute/property value must be greater than the value of the keyword argument

`name_ge` : The attribute/property value must be greater than or equal to the value of the keyword argument

`name_in` : The attribute/property value must be included in the value of the keyword argument, which must be a sequence in this case

`name_notin` : The attribute/property value must not be included in the value of the keyword argument, which must be a sequence in this case



For instance, the following command gives you people younger than 30 years in our imaginary social network:

In [48]:
len(g.vs.select(age_lt=30)) # >>> 4: four people (vertices) are younger than thirty; the filtered sequence can be bound to a new name

4
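To make the keyword naming scheme concrete, here is a plain-Python sketch of how a keyword such as `age_lt` could be split into an attribute name and a comparison. It illustrates the convention only and is not igraph's actual code; `parse_keyword`, `select` and the `people` records are hypothetical.

```python
import operator

OPERATORS = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
    "in": lambda value, seq: value in seq,
    "notin": lambda value, seq: value not in seq,
}


def parse_keyword(keyword):
    """Split 'age_lt' into ('age', operator.lt); default to equality."""
    name, _, suffix = keyword.rpartition("_")
    if name and suffix in OPERATORS:
        return name, OPERATORS[suffix]
    return keyword, operator.eq   # no recognised suffix: whole keyword is the name


def select(records, **conditions):
    """Keep the records (dicts) that satisfy every keyword condition."""
    result = list(records)
    for keyword, wanted in conditions.items():
        name, op = parse_keyword(keyword)
        result = [r for r in result if op(r[name], wanted)]
    return result


names = ["Alice", "Bob", "Claire", "Dennis", "Esther", "Frank", "George"]
ages = [25, 31, 18, 47, 22, 23, 50]
people = [{"name": n, "age": a} for n, a in zip(names, ages)]

young = select(people, age_lt=30)
print([p["name"] for p in young])
```

A keyword like `is_formal` has no recognised suffix, so the whole string is treated as the attribute name and compared for equality, which mirrors the "operator can be omitted" rule above.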

**Note:** Due to the syntactical constraints of Python, you cannot use the admittedly simpler syntax of g.vs.select(age < 30), because only the `=` sign of a keyword argument is allowed to appear in an argument list in Python.



To save you some typing, you can even omit the `select()` method if you wish (very user-friendly: forgetting to type `select()` does no harm):


In [50]:
len(g.vs(age_lt=30))

4

Theoretically, it can happen that there exists an attribute and a structural property with the same name (e.g., you could have a vertex attribute named `degree`). In that case, we would not be able to decide whether the user meant `degree` as a **structural property** or as a vertex attribute. To resolve this ambiguity, structural property names must always be preceded by an underscore (`_`) when used for filtering. For example, to find vertices with degree larger than 2:

In [51]:
len(g.vs(_degree_gt=2))

4

There are also a few special structural properties for selecting edges:

1. Using `_source` or `_from` in the keyword argument list of EdgeSeq.select() filters based on the source vertices of the edges. E.g., to select all the edges originating from Claire (who has vertex index 2):

In [53]:
len(g.es.select(_source=2)) ##  g.vs["name"] >> ['Alice', 'Bob', 'Claire', 'Dennis', 'Esther', 'Frank', 'George']

3

2. Using `_target` or `_to` filters based on the target vertices. This is different from _source and _from if the graph is directed.

3. `_within` takes a VertexSeq object or a list or set of vertex indices and selects all the edges that originate and terminate in the given vertex set. For instance, the following expression selects all the edges between Claire (vertex index 2), Dennis (vertex index 3) and Esther (vertex index 4):

In [57]:
len(g.es.select(_within=[2,3,4]))

3

We could also have used a VertexSeq object:

In [59]:
len(g.es.select(_within=g.vs[2:5]))

3

4. `_between` takes a tuple consisting of two VertexSeq objects or lists containing vertex indices or Vertex objects and selects all the edges that originate in one of the sets and terminate in the other. E.g., to select all the edges that connect men to women:

In [61]:
men = g.vs.select(gender="m")
women = g.vs.select(gender="f")
g.es.select(_between=(men, women)) # nice, this works too; quite flexible

<igraph.EdgeSeq at 0x10a2b5bf8>

### Finding a single vertex or edge with some properties (the find() method)

In many cases we are looking for a single vertex or edge of a graph with some properties, and either we do not care which one of the matches is returned if there are multiple matches, or we know in advance that there will be only one match. A typical example is looking up vertices by their names in the name property. VertexSeq and EdgeSeq objects provide the find() method for such use-cases. `find()` works similarly to `select()`, but it **returns only the first match** if there are multiple matches, and throws an exception if no match is found. For instance, to look up the vertex corresponding to Claire, one can do this:

In [62]:
claire = g.vs.find(name="Claire")
type(claire)

igraph.Vertex

In [63]:
claire.index

2

Looking up an unknown name will yield an exception:



In [64]:
g.vs.find(name="Joe")

ValueError: no such vertex: 'Joe'

## [Layouts and plotting](http://igraph.org/python/doc/tutorial/tutorial.html#layouts-and-plotting) (installing the plotting dependency is a hassle, so plotting is left for later; adjacency comes first)

A graph is an abstract mathematical object without a specific representation in 2D or 3D space. This means that whenever we want to visualise a graph, we first have to find a mapping from vertices to coordinates in two- or three-dimensional space, preferably in a way that is pleasing to the eye. A separate branch of graph theory, namely graph drawing, tries to solve this problem via several graph layout algorithms. igraph implements quite a few layout algorithms and can also draw the resulting layouts on the screen or to a PDF, PNG or SVG file using the Cairo library.

In [65]:
layout = g.layout_kamada_kawai()

In [67]:
layout = g.layout("kamada_kawai")

In [69]:
layout = g.layout("rt", 2)

In [71]:
layout = g.layout("kk")

In [73]:
karate = Graph.Read_GraphML("karate.GraphML")

FileNotFoundError: [Errno 2] No such file or directory: 'karate.GraphML'

## Converting igraph data to an adjacency matrix

In [134]:
A = Graph.get_adjacency(g)
A

Matrix([[0, 1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 0, 0, 0], [1, 0, 0, 1, 1, 1, 0], [0, 0, 1, 0, 1, 0, 1], [0, 0, 1, 1, 0, 0, 0], [1, 0, 1, 0, 0, 0, 1], [0, 0, 0, 1, 0, 1, 0]])

In [130]:
Graph.get_adjlist(g)


[[1, 2, 5], [0], [0, 3, 4, 5], [2, 4, 6], [2, 3], [0, 2, 6], [3, 5]]

[adjacency](http://igraph.org/python/doc/igraph.GraphBase-class.html#Adjacency)

In [139]:
p = Graph.Adjacency([[0,1],[1,0]],mode = ADJ_UNDIRECTED)
summary(p)

IGRAPH U--- 2 1 -- 


In [142]:
A = list(A)
p = Graph.Adjacency(A,mode = ADJ_UNDIRECTED)
summary(p)

IGRAPH U--- 7 9 -- 


In [117]:
import pandas as pd
f = pd.read_table("facebook_combined.txt",sep = " ") # split on spaces; the default separator is a tab

In [12]:
help(VertexDendrogram)

Help on class VertexDendrogram in module igraph.clustering:

class VertexDendrogram(Dendrogram)
 |  The dendrogram resulting from the hierarchical clustering of the
 |  vertex set of a graph.
 |  
 |  Method resolution order:
 |      VertexDendrogram
 |      Dendrogram
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, graph, merges, optimal_count=None, params=None, modularity_params=None)
 |      Creates a dendrogram object for a given graph.
 |      
 |      @param graph: the graph that will be associated to the clustering
 |      @param merges: the merges performed given in matrix form.
 |      @param optimal_count: the optimal number of clusters where the
 |        dendrogram should be cut. This is a hint usually provided by the
 |        clustering algorithm that produces the dendrogram. C{None} means
 |        that such a hint is not available; the optimal count will then be
 |        selected based on the modularity in such a case.
 |      @param para

[Look up igraph functions on this site](http://igraph.org/python/doc/igraph.GraphBase-class.html)

In [1]:
from igraph import *
GG = Graph.Read_Edgelist("facebook_combined.txt", directed=False)
AA = Graph.get_adjacency(GG)

In [2]:
AA.shape

(4039, 4039)

In [3]:
fc = Graph.community_fastgreedy(GG)

In [20]:
tmp = fc.as_clustering(13) 

In [8]:
type(fc)

igraph.clustering.VertexDendrogram

In [19]:
fc.format()

'((((2596,((2470,(2457,(2195,(2079,((2058,(2576,(2383,(2391,((2044,(2429,(2476,(2478,(1942,(2303,(2234,(2449,(2393,(2067,(2019,(2584,(2092,(2387,(2147,(2262,(2545,(2350,(2541,(2382,(2481,(2167,(2094,(2245,(2487,(2281,(1956,(2535,(2301,(1961,(2427,(2168,(2106,(2437,(2204,(2421,(2634,(2565,(1996,(((2017,(2034,(2321,(2041,(2014,(1950,(2012,(2016,(2050,(2435,(2000,(2337,(1958,(2494,(2626,(2025,(2346,(2658,(2097,(2660,(2639,(2011,(2225,(1957,(1931,(2636,(2061,(2633,(2627,(1914,(1991,(1935,(2585,(2647,(2645,(2583,(2657,(2648,(1998,(2284,(2006,(1980,(1919,(2297,(2159,(2192,(2364,(2620,(1995,(1927,(2001,(2171,(2027,(2202,(2365,(2157,(2022,(2024,(1951,(2004,(2538,(2447,(2378,(2640,(1999,(2018,(1928,(2013,(1921,(1968,(1976,(1973,(2009,(1972,(2431,(2528,(2263,(2621,(2441,(2422,(1937,(2015,(2456,(2375,(2439,(2548,(2100,(2113,(2217,(2450,(2632,(2051,(2558,((2227,(2230,(2358,(2580,(2388,(2008,((2316,(2251,(2080,(2659,(2357,(2360,(2105,(2238,(2130,(1933,(1915,(2424,(2614,(1934,(2373,(2193,(2152,(2455

In [22]:
help(VertexClustering)

Help on class VertexClustering in module igraph.clustering:

class VertexClustering(Clustering)
 |  The clustering of the vertex set of a graph.
 |  
 |  This class extends L{Clustering} by linking it to a specific L{Graph} object
 |  and by optionally storing the modularity score of the clustering.
 |  It also provides some handy methods like getting the subgraph corresponding
 |  to a cluster and such.
 |  
 |  @note: since this class is linked to a L{Graph}, destroying the graph by the
 |    C{del} operator does not free the memory occupied by the graph if there
 |    exists a L{VertexClustering} that references the L{Graph}.
 |  
 |  @undocumented: _formatted_cluster_iterator
 |  
 |  Method resolution order:
 |      VertexClustering
 |      Clustering
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, graph, membership=None, modularity=None, params=None, modularity_params=None)
 |      Creates a clustering object for a given graph.
 |      
 |      @par

In [25]:
tmp.graph

<igraph.Graph at 0x1041c8a98>