# Network Poet

This is an example of using Python to download data from Google Sheets, turn it into a network graph, and create poetry by traversing / wandering around on the graph!



In [2]:
# Run this cell once to install the necessary libraries.
# Then comment out the `!pip install` line below, since you only have to run it once.

!pip install requests jgraph tracery

In [3]:
import requests
# this downloads/requests stuff from the internet 

import csv
# this helps you parse csv

import re
# this lets us use regular expressions, or regex, a way to search for and grab chunks of text from a string

from pprint import pprint
# this helps us 'pretty print' data -- 'pretty print' is a term that means nicely formatting & outputting data

import random
# this helps us choose items from a list randomly, or to create random numbers, etc

import jgraph
# this helps us visualize the graph!

import tracery
# this is a library for generative text
# the Python port was made by none other than Allison Parrish!

## Downloading data

How do we get the data?
We're going to use a semi-hidden feature of Google Sheets: if you go to a specially formatted URL, Google Sheets will return the spreadsheet's data as CSV (comma-separated values).

### Let's get ready to download things by generating a URL..

In [4]:
# You can replace `gsheet_url` with a link to a publicly viewable Google Sheet

gsheet_url = "https://docs.google.com/spreadsheets/d/10Pg_M4B0oSclEu0V-JSi3WTa1yISBjpjZ9KTVHDuUD8/edit#gid=443195507"

This cell uses RegEx, or regular expressions, to do an advanced search and extract the document ID and sheet ID from `gsheet_url`. 

Regular expressions are incredibly helpful, and can also be a bit cryptic (for everyone), so for right now, you don't have to understand exactly how `getDocIdFromUrl` and `getSheetIdFromUrl` work.

In [5]:
# define functions that extract the doc ID / sheet ID from gsheet_url
def getDocIdFromUrl(url):
    # raw strings (r"...") keep Python from treating \w as a string escape
    return re.search(r"([-\w]{25,})", url).group(1)
def getSheetIdFromUrl(url):
    return re.search(r"gid=([-\w]+)", url).group(1)


# craft the URL that gives us a CSV
gsheet_csv_url = "https://docs.google.com/spreadsheets/d/" + getDocIdFromUrl(gsheet_url) + "/export?gid=" + getSheetIdFromUrl(gsheet_url) + "&format=csv"


# if you're curious, try going to this url in your browser! It will automatically download a CSV file
print(gsheet_csv_url)

https://docs.google.com/spreadsheets/d/10Pg_M4B0oSclEu0V-JSi3WTa1yISBjpjZ9KTVHDuUD8/export?gid=443195507&format=csv
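If you're curious what the two regular expressions are doing, here's a small sketch that runs them step by step on the same URL. The names `example_url`, `doc_match`, and `sheet_match` are just illustrative; the patterns are the same ones used in the cell above, written here as raw strings.

```python
import re

example_url = "https://docs.google.com/spreadsheets/d/10Pg_M4B0oSclEu0V-JSi3WTa1yISBjpjZ9KTVHDuUD8/edit#gid=443195507"

# [-\w] matches one letter, digit, underscore, or hyphen.
# {25,} means "at least 25 of those in a row", which skips short pieces
# like "docs" or "spreadsheets" and lands on the long document ID.
doc_match = re.search(r"([-\w]{25,})", example_url)

# "gid=" has to appear literally; the parentheses capture what follows it.
sheet_match = re.search(r"gid=([-\w]+)", example_url)

doc_id = doc_match.group(1)
sheet_id = sheet_match.group(1)

print(doc_id)
print(sheet_id)
```

Note that `re.search` returns `None` when nothing matches, so a URL without a `gid=` part would raise an error at `.group(1)`.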


### Now we download the data! 


In [6]:
# get response from URL
response = requests.get(gsheet_csv_url)


# make sure it's read/decoded with the right text encoding!
decoded_content = response.content.decode('utf-8')


# using the CSV library, parse the data! 
# Split the content on each 'newline', then make sure we use a comma as a delimiter to understand the difference between cells.
# also - ignore the first line, because it's the header info for the csv
edges = list(csv.reader(decoded_content.splitlines()[1:], delimiter=','))


print(edges)

[['clap', 'sit'], ['clap', 'jump'], ['jump', 'laugh'], ['laugh', 'twirl'], ['twirl', 'jump'], ['twirl', 'clap'], ['hop', 'skip'], ['skip', 'jump'], ['hop', 'clap']]
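To see what the parsing step does without touching the network, here's a minimal sketch that parses a small CSV string by hand. The sample text and its `from,to` header are assumptions made up for illustration, not the actual contents of the sheet.

```python
import csv
import io

# a tiny made-up CSV: a header row, two edges, and one stray blank line
sample_csv = "from,to\nclap,sit\n\nclap,jump\n"

reader = csv.reader(io.StringIO(sample_csv))

# read the header row separately so it doesn't end up in the edges
header = next(reader)

# keep only rows that actually have two cells (this drops the blank line)
sample_edges = [row for row in reader if len(row) >= 2]

print(header)
print(sample_edges)
```

Filtering out short rows is optional, but it keeps a blank line in the spreadsheet from breaking the later steps that read `edge[0]` and `edge[1]`.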


Optionally, if you want to open a local CSV file on your computer instead, you can uncomment the code in the next cell! Just make sure that `sheet.csv` is in the same directory as this Jupyter notebook file.

In [7]:
#with open("sheet.csv", "r", newline="") as f:
#    reader = csv.reader(f)
#    next(reader)  # skip the header row, like we did for the downloaded data
#    edges = list(reader)
#print(edges)

### Let's quickly visualize the data

Jgraph is a library designed to visualize network graphs in 3D. It doesn't show labels, but you can get a general, visual/spatial sense of what the network _feels_ like.

In [13]:
jgraph.draw(edges)

## Let's create a network graph data structure!

The network graph, like the exercise in Graph Commons, assumes that each entry or row in the spreadsheet is a link between the `from` and `to` nodes.

We're going to run through our data, get all the unique nodes, and then use that list to fill a `dict` to represent a graph. 

### Get unique nodes

In [9]:
all_nodes = []

for e in edges:
    # add the first node in the edge to all_nodes
    all_nodes.append(e[0])
    # add the second node in the edge to all_nodes
    all_nodes.append(e[1])
    
# a 'set' is a data structure that can only contain unique elements --
# so by converting to a set and back to a list, we get only a unique list of nodes.
unique_nodes = list(set(all_nodes))

print(unique_nodes)

#### here's a challenge: the single line below does the exact same thing as the lines of code above. can you see why?
# unique_nodes = list(set([e for edge in edges for e in edge]))


['sit', 'twirl', 'laugh', 'clap', 'hop', 'skip', 'jump']
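One thing to keep in mind: a `set` doesn't remember the order things were added in, so the order of `unique_nodes` can change between runs. If you want the nodes in first-seen order, one option (a sketch, not a required change) is `dict.fromkeys`, since dicts keep insertion order in Python 3.7+. The `sample_edges` list below is copied from the downloaded data so the cell runs by itself.

```python
sample_edges = [['clap', 'sit'], ['clap', 'jump'], ['jump', 'laugh'],
                ['laugh', 'twirl'], ['twirl', 'jump'], ['twirl', 'clap'],
                ['hop', 'skip'], ['skip', 'jump'], ['hop', 'clap']]

# dict keys are unique, and a dict remembers the order keys were first added,
# so this removes duplicates while keeping first-seen order
ordered_nodes = list(dict.fromkeys(node for edge in sample_edges for node in edge))

print(ordered_nodes)
```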


#### (On dicts)

Now, we're going to create a `dict` that represents a graph.

A dict stores things in a `key-value` structure. Metaphorically, you can think of `keys` as the `names on mailboxes` and `values` as the `content inside each mailbox`.

You can 'look up' data using the `key`, and get the `value` that's inside. 

For example,
```
sfpc = { "address": "155 bank st", "year founded": "2013" }
print(sfpc['year founded']) ## 2013
```

Dicts are a really helpful (and opinionated!) way to structure data.
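Here are a few more dict operations that come up when building the graph, sketched with the mailbox metaphor. The `mailboxes` dict and the names in it are made up for illustration.

```python
# hypothetical mailboxes: names are keys, contents are values
mailboxes = {"ada": ["postcard"], "grace": []}

# check whether a key exists before looking it up
has_ada = "ada" in mailboxes
has_alan = "alan" in mailboxes

# .get() returns a fallback instead of raising a KeyError for a missing key
alan_mail = mailboxes.get("alan", [])

# add a new key, then append to the list stored under it
mailboxes["alan"] = []
mailboxes["alan"].append("letter")

print(has_ada, has_alan, alan_mail)
print(mailboxes)
```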







### Create a graph

For example, if you have data `a <-> b, b <-> c, c <-> a`, the `graph` dict will look like:

```
graph = {
        'a': ['b', 'c'],
        'b': ['c', 'a'],
        'c': ['a', 'b']
        }
```
This is what's called an [adjacency list](https://en.wikipedia.org/wiki/Adjacency_list), because the data stores the list of adjacencies -- what's close to what.

For example: `graph['b']` will return `['c', 'a']`
which lets us know that "adjacent to node `b` are nodes `c` and `a`"

(Imagine what the data would look like in jgraph - any ideas?)

In [10]:
graph = {}

# iterate through edges

for edge in edges:
    
    # for each edge, let's define 'node_from' and 'node_to' so that the code is easier for humans to read

    node_from = edge[0]
    node_to = edge[1]
    
    # if the node_from doesn't already exist in the 'graph' dict,
    # let's insert node_from as a key, and have the value be an empty list

    if(node_from not in graph):
        graph[node_from] = []
    
    # in the list that exists in graph[node_from], add node_to

    graph[node_from].append(node_to)
    
    # let's do it in the reverse, since we'll assume that the graph isn't directed
    # which means that there's no directional relationship.
 
    if(node_to not in graph):
        graph[node_to] = []
    graph[node_to].append(node_from)
    
# now you can see how the data is structured as an adjacency list!
# compare this against the data output from the CSV and see if you can tell what's going on intuitively.

# pprint will nicely format the dict.
pprint(graph)

{'clap': ['sit', 'jump', 'twirl', 'hop'],
 'hop': ['skip', 'clap'],
 'jump': ['clap', 'laugh', 'twirl', 'skip'],
 'laugh': ['jump', 'twirl'],
 'sit': ['clap'],
 'skip': ['hop', 'jump'],
 'twirl': ['laugh', 'jump', 'clap']}
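The "create the list if the key is missing" check is common enough that Python's standard library has a shortcut for it. As an alternative sketch (not the approach this notebook uses), `collections.defaultdict(list)` creates the empty list automatically the first time a key is touched. The `abc_edges` list is the small `a`, `b`, `c` example from earlier; the order of neighbors depends on the order of the edges, so it can differ from the hand-written illustration.

```python
from collections import defaultdict

# the small example: a <-> b, b <-> c, c <-> a
abc_edges = [("a", "b"), ("b", "c"), ("c", "a")]

# any missing key automatically starts out as an empty list
abc_graph = defaultdict(list)

for node_from, node_to in abc_edges:
    # add the link in both directions, since the graph isn't directed
    abc_graph[node_from].append(node_to)
    abc_graph[node_to].append(node_from)

# convert back to a plain dict for printing
print(dict(abc_graph))
```

One trade-off: looking up a node that doesn't exist in a `defaultdict` silently creates it, so a plain dict can be safer once the graph is built.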


In [11]:
# let's see all the keys of the graph!
# this happens to equal the list of 'unique_nodes'.. and that's not a coincidence

print(graph.keys())

dict_keys(['clap', 'sit', 'jump', 'laugh', 'twirl', 'hop', 'skip'])


## Poetry

Here comes the poetry part! 

Imagine our graph, with a word on each node. There are many different ways this could turn into poetry: different games and processes that would alter how the graph is 'read'.

### Randomly walking around on the graph

We'll try a `random graph traversal`. 
1. We'll start at a random node.
2. Then, we'll examine the nodes we can connect to, and randomly jump to one of those nodes.
3. And then we'll repeat step 2 a few times.

Consider this traversal to be in line with the spirit of a [Situationist dérive](https://en.wikipedia.org/wiki/D%C3%A9rive), or a Fluxus art game -- but with a graph!

In [12]:
# choose a random node and hop skip randomly to what it's connected to

# choose a random node!
current_node = random.choice(list(graph.keys()))

print (current_node + " ==> ")

# do this five times
for i in range(5):
    
    # find all the options we can go to in the graph from 'current_node'
    to_options = graph[current_node]
    # pick one of those randomly and set 'current_node' to that node
    current_node = random.choice(to_options)
    
    print (current_node + " ==> ")
    
    
# try running this repeatedly


That worked great! Let's put this inside a function, and then define a *list of verbs* we'll use to make the output a little bit more poetic.

In [None]:

def randomly_traverse_graph(graph, number_of_steps):

    current_node = random.choice(list(graph.keys()))
    
    steps = []
    steps.append(current_node)
    
    for i in range(number_of_steps - 1):
        to_options = graph[current_node]
        current_node = random.choice(to_options)
        steps.append(current_node)
        
    return steps
    
    
    
print(randomly_traverse_graph(graph, 5))
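Because the walk is random, every run gives a different path. If you ever want to get the same path back (say, to share a poem you liked), one option is to use a seeded random generator. This is a sketch: `seeded_traverse` and `small_graph` are hypothetical names, not part of the notebook above.

```python
import random

def seeded_traverse(graph, number_of_steps, seed):
    # a private random generator; the same seed gives the same choices
    rng = random.Random(seed)

    current_node = rng.choice(list(graph.keys()))
    steps = [current_node]

    for _ in range(number_of_steps - 1):
        # jump to a random neighbor of the current node
        current_node = rng.choice(graph[current_node])
        steps.append(current_node)

    return steps

small_graph = {'a': ['b', 'c'], 'b': ['a', 'c'], 'c': ['b', 'a']}

walk_one = seeded_traverse(small_graph, 5, seed=7)
walk_two = seeded_traverse(small_graph, 5, seed=7)

print(walk_one)
print(walk_one == walk_two)
```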


In [None]:
list_of_verbs = ["affects", "changes", "alters", "relates to", "thinks about","touches","creates dreams about","helps","also is connected to"]


### Words from our random walk

Now we'll generate a random path through the graph; then, for each word in the path, we'll randomly choose a verb and print the two together.

In [None]:
steps = randomly_traverse_graph(graph, 5)

for s in steps:
    random_verb = random.choice(list_of_verbs)
    print(s + " " + random_verb)


This is getting somewhere! What if we printed out the list in pairs, so that the verb is between two of the words?

In [None]:
poem = ""

for i in range(len(steps) - 1):
    word = steps[i]
    next_word = steps[i + 1]
    
    # each line is a word, a random verb, and then the next word
    poem += word + " " + random.choice(list_of_verbs) + " " + next_word
    
    # if it's the last line add a period; otherwise add a semicolon
    if(i == len(steps) - 2):
        poem += "."
    else:
        poem += "; "    


print(poem)

This works great! Imagine ways you could alter this: changing the list of verbs, adding a list of adverbs alongside the verbs, or changing the formatting.
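As one possible variation on the formatting, here's a sketch of the same pairing idea using `zip` and `join`, which avoids the index arithmetic and the special case for the last line. The `sample_steps` and `sample_verbs` lists are stand-ins so that the cell runs by itself.

```python
import random

sample_steps = ["clap", "jump", "laugh", "twirl"]
sample_verbs = ["affects", "changes", "alters"]

# zip pairs each word with the word that comes after it
pairs = list(zip(sample_steps, sample_steps[1:]))

# each line is a word, a random verb, and then the next word
lines = [word + " " + random.choice(sample_verbs) + " " + next_word
         for word, next_word in pairs]

# join puts "; " between the lines only, then we add the final period
joined_poem = "; ".join(lines) + "."

print(joined_poem)
```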

## Poetry with Tracery 

In this next and last section, we'll use Tracery, a library created by Kate Compton (and ported to Python by Allison Parrish) to do some more complex poetry creation. 

For a really good intro to Tracery, see Allison Parrish's tutorial from her *Reading and Writing Electronic Text* class on [Tracery and Python](https://github.com/aparrish/rwet/blob/master/tracery-and-python.ipynb).

**The simple explanation is:**

In Tracery, you define a 'grammar', consisting of lists of words. 




In [None]:
testGrammar = { "season": ["summer", "spring", "autumn", "winter"] }

Each list of words is identified by a symbol (in the above case, `season`).

If you write that symbol with # characters around it (`#season#`), Tracery will randomly pick one word from that list and use it.



In [None]:
print(tracery.Grammar(testGrammar).flatten("#season#"))

# try running this repeatedly

Rules can be nested: one rule can refer to other symbols in the same grammar. Here's an example using our graph data:

In [None]:
def generate_simple_line(node_from, node_to):
    grammarSource = {
        "node_from": node_from,
        "node_to": node_to,
        "adverb": "suddenly|silently|happily|surprisingly|calmly|beautifully".split("|"),
        "verb":  ["affects", "dances around", "changes", "alters", "relates to", "thinks about","touches","creates dreams about","helps","also is connected to"],
        "poem": "#node_from# #adverb# #verb# #node_to#"
    }
    grammar = tracery.Grammar(grammarSource)
    return grammar.flatten("#poem#")
    
print(generate_simple_line("the moon", "the sun"))


In [None]:
steps = randomly_traverse_graph(graph, 10)

poem = ""
for i in range(len(steps) - 1):
    step = steps[i]
    next_step = steps[i + 1]
    poem += generate_simple_line(step, next_step) + "\n"

print(poem)
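To make the nesting idea more concrete, here's a rough pure-Python sketch of how `#symbol#` expansion could work. This is only an illustration of the concept under simplifying assumptions; it is not Tracery's actual implementation, and `expand` and `toy_rules` are hypothetical names.

```python
import random
import re

def expand(text, rules, rng):
    # replace every #symbol# in text with a random option from rules[symbol]
    def replace(match):
        options = rules[match.group(1)]
        # allow a rule to be a single string instead of a list
        if isinstance(options, str):
            options = [options]
        # the chosen option may contain more #symbols#, so expand it too
        return expand(rng.choice(options), rules, rng)

    return re.sub(r"#(\w+)#", replace, text)

toy_rules = {
    "season": ["summer", "spring", "autumn", "winter"],
    "earlylate": ["early ", "late ", "mid-"],
    "when": "in #earlylate##season#",
}

rng = random.Random(0)
line = expand("#when#", toy_rules, rng)

print(line)
```

Expanding `#when#` first picks the `when` rule, which itself contains `#earlylate#` and `#season#`, so those get expanded in turn. That chain of lookups is what "nested" means here.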

In [None]:
def generate_poem_line(node_from, node_to):
    grammarSource = {
        "node_from": node_from,
        "node_to": node_to,
        "wonderadj": "quiet|solemn|contemplative|still|tender|harmonious".split("|"),
        "tensionadj": "trembling|buzzing|vibrating|swirling".split("|"),
        "adj": "#wonderadj#,|#tensionadj#,".split("|"),
        "name": "fire|bother|wonder|winter".split("|"),
        "earlylate": "early |late |mid-".split("|"),
        "season": "autumn|summer|fall|winter".split("|"),
        "seasonornot": "|in #earlylate##season#, ".split("|"),
        "adverbornot": "|#adverb#".split("|"),
        "adverb": "briskly|tartly|simply|harshly|fuzzily|freely|ably|copiously|furtively|endlessly|sarcastically|generatively|slowly|distinctly".split("|"),
        "verbs": "dances|thinks|touches|dreams|hesitates|loves|feels|wanders|does|sits|reads|looks|runs|asks|collapses|chats|donates".split("|"),
        "preposition": "around|about|with|of|to|from".split("|"),
        "lineending": ['',',',';','--','.'],
        "poemline1": "#adj# #node_from# #adverb# #verbs# #preposition# #node_to##lineending#",
        "poemline2": "#seasonornot##node_from# and #node_to# #adverb# #verbs##lineending#",
        "poem": "#poemline1#|#poemline2#".split("|")
    }
    grammar = tracery.Grammar(grammarSource)
    return grammar.flatten("#poem#")
    
print(generate_poem_line("moon", "sun"))


In [None]:
steps = randomly_traverse_graph(graph, 10)

poem = ""
for i in range(len(steps) - 1):
    step = steps[i]
    next_step = steps[i + 1]
    poem += generate_poem_line(step, next_step) + "\n"

print(poem)