# Docassemble Beyond Docs

Catherine Devlin

Docacon 2018

https://github.com/catherinedevlin/talks/docassemble

# Me

- [18F](https://18f.gsa.gov/) alum 
- [Code for Dayton](https://codefordayton.org/)
- Now at [Miro](https://miro.one)
- [PyOhio](https://www.pyohio.org/2018/), [PyCon](https://www.youtube.com/watch?v=3kta4GB3PAw) 

![Magnetron](img/magnetron.resized.jpg)

![Radar](img/radar.resized.jpg)

![Microwave oven](img/microwave.resized.jpg)

![Docassemble](img/docassemble.png)

![Full stack web development](img/fullstack.jpeg)

![API-centered development](img/API-centered.svg)

# What's a RESTful API?

[Public APIs](https://github.com/toddmotto/public-apis)

# JSON in, JSON out

In [None]:
!curl https://ghibliapi.herokuapp.com/films

[g1-data-table.yml](g1-data-table.yml)

[g2-use-json.yml](g2-use-json.yml)

In [None]:
import requests 
response = requests.get('https://ghibliapi.herokuapp.com/films')
response.ok

In [None]:
dir(response)

In [None]:
response.json()

[g3-json-get.yml](g3-json-get.yml)
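
As a rough sketch of what `response.json()` hands back: a JSON array becomes a plain Python list of dicts. The payload below is made up for illustration (its field names are assumptions, not the real API's response), and it uses the standard-library `json` module so it runs without a network connection.

```python
import json

# Hypothetical payload standing in for the body of an API response.
sample_body = (
    '[{"id": 1, "title": "Castle in the Sky"},'
    ' {"id": 2, "title": "My Neighbor Totoro"}]'
)

# json.loads does for a string what response.json() does for a response body.
films = json.loads(sample_body)

# From here on it is ordinary Python: loop, index, filter.
titles = [film["title"] for film in films]
print(titles)
```

Once the data is a list of dicts, an interview can treat it like any other Python variable.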

    
# Let's run an API server!

[JSON-server](https://github.com/typicode/json-server)

In [None]:
!cat films.json

`json-server --watch films.json -H 0.0.0.0 `

In [None]:
!curl http://localhost:3000/films/

In [None]:
!curl http://localhost:3000/films/2

In [None]:
!cat run.sh

[g4-local-get.yml](g4-local-get.yml)

# POST!

[g5-local-post.yml](g5-local-post.yml)
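
A minimal sketch of what a POST looks like from Python, assuming the local json-server from the previous section is listening on `http://localhost:3000/films`. The film record is illustrative. The request is only built here, not sent, so the snippet runs even when no server is up.

```python
import json
import urllib.request

# Assumed local endpoint; adjust to wherever the API server is running.
url = "http://localhost:3000/films"
new_film = {"title": "Princess Bride", "description": "As you wish"}

# A POST is a request that carries a body; here the body is JSON-encoded bytes.
request = urllib.request.Request(
    url,
    data=json.dumps(new_film).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(request.get_method(), request.full_url)

# To actually send it (requires the server to be running):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

With `requests` installed, `requests.post(url, json=new_film)` does the encoding and header-setting in one call.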

# Let's run a real API server

[Django REST Framework](http://www.django-rest-framework.org/)

# [PostgREST](http://postgrest.org/en/v5.0/tutorials/tut0.html)

In [None]:
!./postgrest --help 


[pg_schema.sql](pg_schema.sql)

In [None]:
!createdb films 

In [None]:
!psql -f pg_schema.sql films

In [None]:
!cat films_api.conf

In [None]:
!./postgrest films_api.conf
 

In [None]:
!curl http://localhost:3000/films

In [None]:
!curl 'http://localhost:3000/films?id=eq.2'

In [None]:
!curl 'http://localhost:3000/films?title=eq.Totoro'

[So much more](http://postgrest.org/en/v5.0/api.html)
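
The filter syntax in the curl calls above is `column=operator.value`. A small sketch of building those URLs from Python follows; `filter_url` is a hypothetical helper written for this example, not part of PostgREST, and the base URL assumes the local server shown here.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:3000/films"  # assumed local PostgREST endpoint


def filter_url(column, operator, value):
    """Build a PostgREST-style filter URL such as ?id=eq.2 (hypothetical helper)."""
    # urlencode escapes anything in the value that is not URL-safe.
    query = urlencode({column: f"{operator}.{value}"})
    return f"{BASE_URL}?{query}"


print(filter_url("id", "eq", 2))
print(filter_url("title", "eq", "Totoro"))
```

The resulting string can be passed straight to `requests.get`.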

# POSTing in PostgREST

In [None]:
import requests 
url = 'http://localhost:3000/films'
data = {'title': 'Princess Bride', 'description': 'As you wish'}
resp = requests.post(url, json=data)
resp 

In [None]:
resp.ok

In [None]:
resp.json()

In [None]:
!psql -c "grant insert on api.films to web_anon" films
!psql -c "GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA api TO web_anon" films

In [None]:
resp = requests.post(url, json=data)
resp

In [None]:
resp.ok

In [None]:
!psql -c "SELECT * FROM api.films" films

# Let's see that again 
    
[g5-local-post.yml](g5-local-post.yml)

In [None]:
!psql -c "SELECT * FROM api.films" films

# More tricks

RESTful APIs are the easiest, but not the only, way to get at data.

> It is not necessary to use Python code in an interview, but it is an extremely powerful tool.

# More data

[data.gov](https://www.data.gov/)

[Population](https://catalog.data.gov/dataset/population-by-country-1980-2010)

[file](populationbycountry19802010millions.csv)

# Consuming a CSV 

[csv](https://pymotw.com/3/csv/)

In [None]:
import csv
with open('populationbycountry19802010millions.csv') as infile:
    reader = csv.DictReader(infile)
    content = list(reader)

In [None]:
content[0]
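
One thing to watch for: `csv.DictReader` returns every value as a string. The sketch below uses a made-up two-country sample (the headers and numbers are assumptions and may not match the real file) to show the conversion needed before doing arithmetic.

```python
import csv
import io

# Made-up sample shaped roughly like a population file: one row per
# country, one column per year. The real file's headers may differ.
sample = io.StringIO(
    "country,1980,1981\n"
    "Atlantis,1.5,1.6\n"
    "Erewhon,2.25,2.3\n"
)

rows = list(csv.DictReader(sample))

# Values arrive as strings, so convert them before subtracting.
growth = {
    row["country"]: float(row["1981"]) - float(row["1980"])
    for row in rows
}

print(rows[0]["country"], rows[0]["1980"])
```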

# Webscraping

[Dangnab HTML tables](https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population)

[Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)

[Dependencies in Docassemble](https://docassemble.org/docs/packages.html#tocAnchor-1-4)

In [None]:
import bs4

resp = requests.get('https://en.wikipedia.org/wiki/List_of_United_States_cities_by_population')
soup = bs4.BeautifulSoup(resp.text, 'html.parser')
population_table = soup.select('table.wikitable.sortable')[0]

In [None]:
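def cleanup(val):
    # Collapse newlines and trim surrounding whitespace.
    result = val.replace('\n', ' ').strip()
    # Swap non-ASCII characters for spaces (this also replaces any literal '?').
    result = result.encode('ascii', errors='replace').decode('utf8').replace('?', ' ')
    try:
        # Values like '8,175,133' become ints.
        return int(result.replace(',', ''))
    except ValueError:
        return result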

def html_table_to_dicts(html_table):
    rows = html_table.select('tr')
    headers = [cleanup(th.text) for th in rows[0].select('th')]
    for row in rows[1:]:
        values = [cleanup(td.text) for td in row.select('td')]
        yield dict(zip(headers, values))

In [None]:
import pprint
pprint.pprint(next(html_table_to_dicts(population_table)))
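
If adding Beautiful Soup as a dependency is not an option, the same header-and-row pairing can be sketched with the standard library's `html.parser`. This is a swapped-in technique, not the approach used above; `TableParser` and the sample HTML are made up for illustration, and real-world tables (nested tags, header cells inside data rows) would need more care.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of each <th>/<td> cell, row by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


sample_html = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Springfield</td><td>30,720</td></tr>
</table>
"""

parser = TableParser()
parser.feed(sample_html)

# First row is the header; pair it with each remaining row.
headers, *data_rows = parser.rows
records = [dict(zip(headers, row)) for row in data_rows]
print(records)
```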

# You have the power!

[Python Beginners' Guide](https://wiki.python.org/moin/BeginnersGuide)

[Automate the Boring Stuff](https://automatetheboringstuff.com/)

[ChiPy](http://www.chipy.org/)

[PyOhio](https://www.pyohio.org/)
  

# Photo credits

Magnetron: CC BY 2.0 https://www.flickr.com/photos/nickhubbard/26136477417

Radar: public domain, https://commons.wikimedia.org/wiki/File:Radar-p011990.jpg

Microwave oven: CC BY-SA 3.0 https://commons.wikimedia.org/wiki/File:Microwave_oven_(interior).jpg

 
