In [None]:
import warnings

warnings.filterwarnings("ignore")

In [None]:
import networkx as nx
import pandas as pd
import osmnx as ox

## Introduction

We're going to go through graph I/O,
specifically the APIs for converting
graph data, in whatever form it reaches you,
into that magical NetworkX object `G`.

NetworkX can read and write graphs in many on-disk formats,
including the ones listed below. The full reference is at
https://networkx.org/documentation/latest/reference/readwrite/index.html

- DOT
- GEXF
- GraphML
- GML
- JSON
- LEDA
- Pajek
- Matrix Market
- ...

In this notebook, though, we will look specifically at tabular data.
Let's get going!

## Graph Data as Tables

Let's recall what we've learned in the introductory chapters.
Graphs can be represented using two **sets**:

- Node set
- Edge set

### Node set as tables

Let's say we had a graph with 3 nodes in it: `A, B, C`.
We could represent it in a plain-text, computer-readable format:

```csv
A
B
C
```

Suppose the nodes also had metadata.
Then, we could tag on metadata as well:

```csv
A, circle, 5
B, circle, 7
C, square, 9
```

Does this look familiar to you?
Yes, node sets can be stored in CSV format,
with one of the columns being node ID,
and the rest of the columns being metadata.
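
As a minimal sketch of that idea, the node table above can be read with pandas and turned into graph nodes. The column names `node_id`, `shape`, and `size` are assumptions made up for this illustration, since the CSV above has no header row.

```python
import io

import networkx as nx
import pandas as pd

# The node table from above, held in memory instead of a file on disk.
node_csv = io.StringIO("A, circle, 5\nB, circle, 7\nC, square, 9\n")

# Column names are hypothetical; the original table has no header.
node_df = pd.read_csv(
    node_csv,
    header=None,
    names=["node_id", "shape", "size"],
    skipinitialspace=True,  # drop the space that follows each comma
)

# One dict of metadata per node ID: {"A": {"shape": "circle", "size": 5}, ...}
node_attrs = node_df.set_index("node_id").to_dict(orient="index")

G_nodes = nx.DiGraph()
# add_nodes_from accepts (node, attribute_dict) pairs.
G_nodes.add_nodes_from(node_attrs.items())
```

After this, `G_nodes.nodes["C"]` holds the metadata for node `C`.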

### Edge set as tables

If, between the nodes, we had 4 edges (this is a directed graph),
we could also represent those edges in a plain-text, computer-readable format:

```csv
A, C
B, C
A, B
C, A
```

And if the edges also had metadata,
we could represent it in the same CSV format:

```csv
A, C, red
B, C, orange
A, B, yellow
C, A, green
```

If you've been in the data world for a while,
this should not look foreign to you.
Yes, edge sets can be stored in CSV format too!
Two of the columns represent the nodes involved in an edge,
and the rest of the columns represent the metadata.
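
Here is a small sketch of loading that edge table, assuming the hypothetical column names `source`, `target`, and `color` (the CSV above has no header). It uses `nx.from_pandas_edgelist` with `create_using=nx.DiGraph` because the example is a directed graph.

```python
import io

import networkx as nx
import pandas as pd

# The edge table from above, held in memory instead of a file on disk.
edge_csv = io.StringIO("A, C, red\nB, C, orange\nA, B, yellow\nC, A, green\n")

# Column names are hypothetical; the original table has no header.
edge_df = pd.read_csv(
    edge_csv,
    header=None,
    names=["source", "target", "color"],
    skipinitialspace=True,  # drop the space that follows each comma
)

# Two columns name the endpoints; the remaining column becomes edge metadata.
G_edges = nx.from_pandas_edgelist(
    edge_df,
    source="source",
    target="target",
    edge_attr="color",
    create_using=nx.DiGraph,
)
```

Because the graph is directed, `A -> C` and `C -> A` are stored as two separate edges, each with its own `color`.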

## Dataset

We will look at how far you can run once
you see the ghost of the manor!

We will use OSMnx to fetch the data from OpenStreetMap.

In [None]:
G, coords = ox.graph_from_address(
    address="Dr. Holms Hotel, Geilo, Norway",
    dist=20_000,  # in meters.
    dist_type="network",
    return_coords=True,
)

In [None]:
coords

In [None]:
bbox_bound = ox.utils_geo.bbox_from_point(coords, dist=20_000, project_utm=True)

In [None]:
G_projected = ox.project_graph(G)

In [None]:
ox.plot_graph(G_projected, node_size=5, bbox=bbox_bound)

Let's look at the graph to see what is inside this data we just fetched.

In [None]:
list(G_projected.nodes.data())[0:2]

In [None]:
list(G_projected.edges.data())[0:2]

## Exercise

Find all the streets in the graph `G_projected` whose `highway` attribute is `primary`.

Hint:

This is a MultiDiGraph!!

When you iterate through the edges, you will need to do something like:
```python
for u, v, key, ddict in G.edges(data=True, keys=True):
    ...
    ...
```
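
To see the pattern without touching the real data, here is a sketch on a tiny hand-made `MultiDiGraph`. The graph `toy` and its street names are invented for illustration. In OSM-derived graphs an attribute value may also be a list rather than a single string, and that case is not handled here.

```python
import networkx as nx

# A hypothetical toy network with two parallel edges between nodes 1 and 2.
toy = nx.MultiDiGraph()
toy.add_edge(1, 2, key=0, highway="primary", name="Main Road")
toy.add_edge(1, 2, key=1, highway="service", name="Back Alley")
toy.add_edge(2, 3, key=0, highway="primary", name="Main Road")
toy.add_edge(3, 1, key=0, highway="residential", name="Side Street")

# keys=True yields the edge key, which tells parallel edges apart.
primary_edges = [
    (u, v, key)
    for u, v, key, ddict in toy.edges(data=True, keys=True)
    if ddict.get("highway") == "primary"
]
```

Using `ddict.get("highway")` rather than `ddict["highway"]` avoids a `KeyError` on edges that carry no `highway` attribute.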


Now let's export this network to a pandas dataframe
and use dataframe operations to do the same thing!

In [None]:
streets = nx.to_pandas_edgelist(G_projected, edge_key="edge_key")

In [None]:
len(streets[streets.highway == "primary"])

In [None]:
streets[streets.highway == "primary"].head()
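
The same export can be tried on a small hand-made graph to see what the resulting table looks like. This is only a sketch: the graph `toy` is invented, and `edge_key="edge_key"` mirrors the column name used above.

```python
import networkx as nx

# A hypothetical toy network; the second a -> b edge is a parallel edge.
toy = nx.MultiDiGraph()
toy.add_edge("a", "b", highway="primary")
toy.add_edge("a", "b", highway="service")  # automatically gets key 1
toy.add_edge("b", "c", highway="primary")

# One row per edge; edge_key adds a column holding each edge's key.
toy_streets = nx.to_pandas_edgelist(toy, edge_key="edge_key")

# Ordinary boolean indexing replaces the explicit loop over edges.
primary_rows = toy_streets[toy_streets.highway == "primary"]
```

Without the `edge_key` column, the two parallel `a -> b` rows could not be mapped back to distinct edges in the graph.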

## Exercise

Extract the primary highway streets from the dataframe and then use that
information to plot a subgraph street network with only primary highways.

In [None]:
nodes = ...

In [None]:
# ox.plot_graph(G_projected.subgraph(nodes), bbox=bbox_bound)

In [None]:
nodes = streets[streets.highway == "primary"][["source", "target"]].values.flatten()
ox.plot_graph(G_projected.subgraph(nodes), bbox=bbox_bound)
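
One thing worth knowing: `subgraph(nodes)` keeps every edge between the selected nodes, not only the rows that matched the filter. A possible alternative, sketched here on an invented toy graph, is `edge_subgraph`, which takes `(u, v, key)` triples for a multigraph and keeps exactly those edges.

```python
import networkx as nx

# A hypothetical toy network with one non-primary parallel edge.
toy = nx.MultiDiGraph()
toy.add_edge("a", "b", key=0, highway="primary")
toy.add_edge("a", "b", key=1, highway="service")
toy.add_edge("b", "c", key=0, highway="primary")

toy_streets = nx.to_pandas_edgelist(toy, edge_key="edge_key")
primary = toy_streets[toy_streets.highway == "primary"]

# Node-induced: all edges among the selected nodes survive.
node_induced = toy.subgraph(primary[["source", "target"]].values.flatten())

# Edge-induced: only the listed (u, v, key) edges survive.
primary_triples = list(zip(primary.source, primary.target, primary.edge_key))
edge_induced = toy.edge_subgraph(primary_triples)
```

In this toy case the node-induced view still contains the `service` edge, while the edge-induced view does not.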

### But what about the node attributes??

In [None]:
list(G_projected.nodes.data())[0:5]

In [None]:
df = pd.DataFrame.from_dict(dict(G_projected.nodes(data=True)), orient="index")
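
The conversion above works because `dict(G.nodes(data=True))` maps each node ID to its attribute dict, and `orient="index"` turns each of those into one row. A sketch on an invented two-node graph, with made-up attribute values:

```python
import networkx as nx
import pandas as pd

# A hypothetical toy graph; attribute names imitate the ones seen above.
toy = nx.MultiDiGraph()
toy.add_node(1, x=0.0, y=0.0, street_count=3)
toy.add_node(2, x=1.0, y=0.5, street_count=1)

# {1: {"x": 0.0, ...}, 2: {"x": 1.0, ...}} -> one row per node.
toy_node_df = pd.DataFrame.from_dict(dict(toy.nodes(data=True)), orient="index")

# Going the other way: write a derived column back onto the nodes.
is_dead_end = (toy_node_df["street_count"] == 1).to_dict()
nx.set_node_attributes(toy, is_dead_end, "is_dead_end")
```

The node IDs end up as the dataframe index, so `toy_node_df.loc[1]` returns the attributes of node `1`. The `is_dead_end` attribute is purely illustrative.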

In [None]:
df.street_count.unique()

In [None]:
df.plot.scatter(x="lon", y="lat", s=3)

In [None]:
edge_centrality = nx.closeness_centrality(nx.line_graph(G_projected))
nx.set_edge_attributes(G_projected, edge_centrality, "edge_centrality")

In [None]:
# color edges in original graph with closeness centralities from line graph
ec = ox.plot.get_edge_colors_by_attr(G_projected, "edge_centrality", cmap="YlGnBu")
fig, ax = ox.plot_graph(G_projected, edge_color=ec, edge_linewidth=2, node_size=0)
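
The line-graph trick used above can be checked on a small example. In the line graph of a `MultiDiGraph`, each node is an edge `(u, v, key)` of the original graph, so centralities computed on it can be written straight back as edge attributes. The three-edge cycle below is invented for illustration.

```python
import networkx as nx

# A hypothetical toy network: a directed cycle a -> b -> c -> a.
toy = nx.MultiDiGraph()
toy.add_edge("a", "b")
toy.add_edge("b", "c")
toy.add_edge("c", "a")

# Nodes of the line graph are the (u, v, key) edges of the original graph.
toy_line = nx.line_graph(toy)

# Node centrality on the line graph is edge centrality on the original.
toy_edge_centrality = nx.closeness_centrality(toy_line)

# The dict is keyed by (u, v, key), which is what a multigraph expects here.
nx.set_edge_attributes(toy, toy_edge_centrality, "edge_centrality")
```

By symmetry, every edge in this cycle receives the same centrality value.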