## Adjacency Matrix with Pandas

In [18]:
import pandas as pd
import networkx as nx

In [19]:
df = pd.DataFrame({"from": {0: "Died", 1: "Elected Rep", 2: "Married", 3: "Born", 4: "Elected Pres"},
                    "to": {0: "Born", 1: "Elected Pres", 2: "Elected Rep", 3: "Married", 4: "Died"},
                    "weight": {0: 0.1, 1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0},
                })
print(df)

           from            to  weight
0          Died          Born     0.1
1   Elected Rep  Elected Pres     1.0
2       Married   Elected Rep     1.0
3          Born       Married     1.0
4  Elected Pres          Died     1.0


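The first parameter of `nx.from_pandas_edgelist()` is a DataFrame in which each row defines one edge. Two of its columns, passed as the source and target parameters, designate the start and end nodes of the edge. The `edge_attr` parameter converts the remaining columns into edge attributes. By default the result is an undirected `nx.Graph`, so the printed edges may list their endpoints in either order.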

In [20]:
G = nx.from_pandas_edgelist(df, "from", "to", edge_attr=["weight"])
print(G.edges(data=True))

[('Died', 'Born', {'weight': 0.1}), ('Died', 'Elected Pres', {'weight': 1.0}), ('Born', 'Married', {'weight': 1.0}), ('Elected Rep', 'Elected Pres', {'weight': 1.0}), ('Elected Rep', 'Married', {'weight': 1.0})]
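Each row of the edge table runs from one event to another, so direction may matter for a timeline. The following is a minimal sketch, assuming a directed graph is wanted: passing `create_using=nx.DiGraph` keeps the from/to direction, and `nx.to_pandas_adjacency()` turns the graph into an adjacency matrix held in a DataFrame. The names `edges`, `order`, `DG`, and `adj` are illustrative and do not appear in the notebook.

```python
import networkx as nx
import pandas as pd

# Same edge table as above, rebuilt here so the snippet runs on its own.
edges = pd.DataFrame({
    "from": ["Died", "Elected Rep", "Married", "Born", "Elected Pres"],
    "to": ["Born", "Elected Pres", "Elected Rep", "Married", "Died"],
    "weight": [0.1, 1.0, 1.0, 1.0, 1.0],
})

# create_using=nx.DiGraph preserves the from -> to direction of each row.
DG = nx.from_pandas_edgelist(
    edges, "from", "to", edge_attr=["weight"], create_using=nx.DiGraph
)

# Rows are source nodes, columns are target nodes, cells hold the edge weight.
# Missing edges are filled with 0.0 by default.
order = ["Born", "Married", "Elected Rep", "Elected Pres", "Died"]
adj = nx.to_pandas_adjacency(DG, nodelist=order, weight="weight")
print(adj)
```

Because the graph is directed, the matrix is not symmetric: the cell at row "Born", column "Married" holds the weight, while the mirrored cell stays at zero.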


### Handling Node Attributes

* Let’s add a "date" attribute to each node in Lincoln’s timeline:

In [24]:
events = {"Died": 1865, "Born": 1809, "Elected Rep": 1847, "Elected Pres": 1861, "Married": 1842}
nx.set_node_attributes(G, events, "date")
node_data = G.nodes(data=True)
print(node_data)

[('Died', {'date': 1865}), ('Born', {'date': 1809}), ('Elected Rep', {'date': 1847}), ('Elected Pres', {'date': 1861}), ('Married', {'date': 1842})]


Since `node_data` yields `(node, attributes)` tuples, we can build a DataFrame from it, then use the node names as the index and keep the column of attribute dictionaries.

In [29]:
lincoln_ser = pd.DataFrame(node_data).set_index(0)[1]
print(lincoln_ser)
print(type(lincoln_ser))

0
Died            {'date': 1865}
Born            {'date': 1809}
Elected Rep     {'date': 1847}
Elected Pres    {'date': 1861}
Married         {'date': 1842}
Name: 1, dtype: object
<class 'pandas.core.series.Series'>


* The values in the Series are node attribute dictionaries, and one of the `pd.Series` constructors builds a Series from a dictionary. Let’s apply the constructor to each value.

In [31]:
df = lincoln_ser.apply(pd.Series)
df

              date
0                 
Died          1865
Born          1809
Elected Rep   1847
Elected Pres  1861
Married       1842
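As an alternative to applying `pd.Series` to every value, the attribute dictionaries can be handed to pandas in one step. This is a sketch under the assumption that `dict(G.nodes(data=True))` gives a node-to-attributes mapping; the name `attrs` is illustrative, and the graph is rebuilt here with nodes only so that the snippet runs on its own.

```python
import networkx as nx
import pandas as pd

events = {"Died": 1865, "Born": 1809, "Elected Rep": 1847,
          "Elected Pres": 1861, "Married": 1842}

# Stand-in graph: nodes only, which is enough to demonstrate node attributes.
G = nx.Graph()
G.add_nodes_from(events)
nx.set_node_attributes(G, events, "date")

# Each key becomes a row label; each attribute name becomes a column.
attrs = pd.DataFrame.from_dict(dict(G.nodes(data=True)), orient="index")
print(attrs)
```

This avoids the intermediate Series of dictionaries, and it tends to be faster on large graphs because pandas builds the frame once instead of constructing one Series per node.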


* The result is a DataFrame suitable for further processing. For example, you can
calculate the duration, in years, of each span of Lincoln’s biography:

In [35]:
spans = df.sort_values('date').diff()
print(spans)

              date
0                 
Born           NaN
Married       33.0
Elected Rep    5.0
Elected Pres  14.0
Died           4.0
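The `diff()` output labels each span only by the event that ends it. A small sketch, assuming it helps to see both endpoints, pairs each event with its predecessor; the names `dates`, `ordered`, and `labeled` are illustrative.

```python
import pandas as pd

# Same dates as the node attributes above.
dates = pd.Series({"Died": 1865, "Born": 1809, "Elected Rep": 1847,
                   "Elected Pres": 1861, "Married": 1842}, name="date")

ordered = dates.sort_values()

# Pair consecutive events; diff() leaves NaN for the first one, so drop it.
labeled = pd.DataFrame({
    "from": ordered.index[:-1],
    "to": ordered.index[1:],
    "years": ordered.diff().dropna().astype(int).to_numpy(),
})
print(labeled)
```

The spans add up to the number of years between the earliest and latest events, which makes a quick sanity check.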

