# Reading networks from CSV files

[Run notebook in Google Colab](https://colab.research.google.com/github/pathpy/pathpy/blob/master/doc/tutorial/csv.ipynb)  
[Download notebook](https://github.com/pathpy/pathpy/raw/master/doc/tutorial/csv.ipynb)

This notebook shows how to read network data from `.csv` files and how to write it to them.

In [1]:
pip install git+git://github.com/pathpy/pathpy.git

Collecting git+git://github.com/pathpy/pathpy.git
  Cloning git://github.com/pathpy/pathpy.git to /tmp/pip-req-build-l221hiix
  Running command git clone -q git://github.com/pathpy/pathpy.git /tmp/pip-req-build-l221hiix
You should consider upgrading via the '/home/max/py3-venv/pathp/bin/python3 -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.


In [1]:
import pathpy as pp

import pandas as pd
from pprint import pprint

The simplest way to store network data is as an edge list, i.e. a plain text file in which each line contains the uids of the source and target nodes of one edge, separated by a delimiter character. This widely used format is the default file format of `pathpy` (and of many other network analysis packages). We can use the `write` function in the `io.csv` module to save a network instance in this format. While generating the toy network below, we also demonstrate how to add multiple nodes and edges at once from a sequence of node uid tuples. The call to `add_edges` below generates three nodes and two edges:

In [2]:
n = pp.Network()
n.add_edges(('a', 'b'), ('a', 'c'))
print(n)

Uid:			0x7f08c5606c70
Type:			Network
Directed:		True
Multi-Edges:		False
Number of nodes:	3
Number of edges:	2


To store this network in a `.csv` file, we call:

In [3]:
pp.io.csv.write(n, 'network.csv')

If you inspect this file, you will find that it contains the source and target node uids of all edges, as well as the edge uids. By default a comma separator is used, but we can easily change this using the `sep` parameter of the function.
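To make the file layout and the role of the separator concrete, the sketch below writes a comparable edge list with Python's standard `csv` module rather than with `pathpy`. The column names `v`, `w` and `uid` and the edge uids `a-b` and `a-c` are assumptions made for illustration; the header that `pathpy` actually writes may differ.

```python
import csv
import io

# Hypothetical edge records: source uid, target uid, edge uid.
edges = [("a", "b", "a-b"), ("a", "c", "a-c")]


def edge_list_text(rows, sep=","):
    """Serialise edge records as delimited text, one edge per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=sep, lineterminator="\n")
    writer.writerow(["v", "w", "uid"])  # assumed header names
    writer.writerows(rows)
    return buffer.getvalue()


comma_text = edge_list_text(edges)           # comma-separated
semicolon_text = edge_list_text(edges, ";")  # semicolon-separated

print(comma_text)
print(semicolon_text)

# Reading back requires the same separator that was used for writing.
parsed = list(csv.reader(io.StringIO(semicolon_text), delimiter=";"))
```

Whatever separator is chosen when writing has to be passed again when the file is read, otherwise each line is parsed as a single column.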

To directly load a network from a csv file, we can use the `read_network` function in the `io.csv` module. To read the network that we saved before, we can write:

In [4]:
n = pp.io.csv.read_network('network.csv')
print(n)

Since the `io` functions are internally based on pandas data frames, we would get the same network (with a different uid though) if we did the following:

In [28]:
df = pd.read_csv('network.csv')
n = pp.io.csv.from_dataframe(df)
print(n)

TypeError: argument of type 'method' is not iterable
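Independently of `pathpy`, the data frame holds one row per edge, so the edges can be recovered from it directly. A minimal sketch, assuming the file has columns named `v` and `w` for source and target (hypothetical names; check the header of your own file). The CSV text is inlined so that the example runs on its own:

```python
import io

import pandas as pd

# Inline stand-in for 'network.csv'; the column names are assumptions.
csv_text = "v,w,uid\na,b,a-b\na,c,a-c\n"
df = pd.read_csv(io.StringIO(csv_text))

# One (source, target) tuple per row of the data frame.
edge_tuples = list(zip(df["v"], df["w"]))

# The node set is the union of all sources and targets.
nodes = sorted(set(df["v"]) | set(df["w"]))

print(edge_tuples)
print(nodes)
```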

The call above generates a network with a new uid and no attributes. If we want to assign attributes or a custom uid to the newly generated network, we can pass them as keyword arguments when reading the file:

In [None]:
n = pp.io.csv.read_network(filename='network.csv', uid='csvnetwork')
print(n)

All of the functions above also work with edges that have arbitrary attributes. If only some of the edges have a given attribute, a NaN value is automatically assigned to the other edges. Let's create a small example where this is the case:

In [1]:
n = pp.Network(directed=False)
n.add_edge('a', 'b', weight=2.0)
n.add_edge('a', 'c', type='friendship')
print(n)

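The NaN filling described above can be reproduced with pandas alone, without `pathpy`. In this sketch the record layout and the column names `v` and `w` are assumptions for illustration; it only shows how a data frame represents attributes that some edges lack:

```python
import pandas as pd

# Two edges with different attribute sets, mirroring the example above.
records = [
    {"v": "a", "w": "b", "weight": 2.0},
    {"v": "a", "w": "c", "type": "friendship"},
]
df = pd.DataFrame(records)

# Each attribute becomes a column; missing entries are filled with NaN.
print(df)

missing_weight = df["weight"].isna().tolist()
missing_type = df["type"].isna().tolist()
```

Here `missing_weight` flags the second edge and `missing_type` flags the first, which is the pattern to expect in a file written from such a network.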