## HCI 574 - lecture 35  - Web + HTML 

April 12, 2024 (with after lecture corrections)

- The next 4 lectures are about web programming and you will need the material covered for HW10
- If you don't want to write a Web UI, you can instead use Tkinter
- If you are curious, HW10 instructions are already on Canvas. (But you should wait until after lecture 38 before you start ...)

Today:
- Learn how to start a web browser with a URL from within Python 
- A bit about HTML ([good intro here](https://developer.mozilla.org/en-US/docs/Learn/HTML/Introduction_to_HTML)) 
- A tiny bit of JavaScript (for in-browser UI elements) 
- Web crawling: opening, saving and searching (parsing) an HTML page from the web 
- In much more detail + lotsa good links: [https://automatetheboringstuff.com/chapter11/](https://automatetheboringstuff.com/chapter11/) 
- HTML text analysis (parsing) with [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
- How to use the urllib and [requests](http://docs.python-requests.org/en/master/) modules to download web files from Python (multiple, optional examples) 


### Some web Basics:

##### Browser:
- Runs a Web application 
- acts as "Operating System" or "Desktop" for the application
- acts as client (frontend), connects to server (backend)
- application connects to an internet server via [Hypertext Transfer Protocol (HTTP)](https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol) 
- client requests a "resource" from a web server via a [URL](http://www.internet-guide.co.uk/url.html) (Uniform Resource Locator): `http://www.mystuff.com/`, which by convention defaults to `http://www.mystuff.com/index.html` 
- Server responds by "serving up" the file, written in [HTML code](http://www.w3schools.com/html/), which your browser interprets

#### HTML  (HyperText Markup Language): 
- uses __tags__ (stuff with < > around it) to format text for your browser
- tags can also have other roles, e.g. script tags (script nodes) contain JavaScript code
- tags can be nested!
- Example:

```HTML
<!DOCTYPE html>
<html>

  <head>  
    <title> Chris Harding's Webpage </title> <!-- title, shown only in browser tab -->
  </head>
  
  <body>
    <h2>Chris Harding's Webpage</h2> <!-- heading of size 2 -->
    Hi there! This page has been under construction since 1996 ... <!-- normal text -->
    <br>  <!-- line break, i.e. newline -->
    <a href="https://www.iastate.edu/">Iowa State University</a> <!-- link (anchor tag) -->
  </body>
  
</html>
```

- `<tag>` opens a tag, `</tag>` closes it
- some tags have info "inside", like the URL (href) in the anchor tag:    
   `<a href="https://www.iastate.edu">Iowa State University</a> `
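
To make the "tags open, close and can be nested" idea more concrete, here is a small sketch that walks through a snippet of HTML and prints every opening and closing tag, indented by its nesting depth. It uses Python's standard `html.parser` module; the class name `TagPrinter` and the sample string are made up for this illustration, and the sample deliberately avoids tags like `<br>` that have no closing tag.

```python
from html.parser import HTMLParser


class TagPrinter(HTMLParser):
    """Collect one indented line per opening/closing tag to show the nesting."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # how deeply nested we currently are
        self.lines = []  # one text line per tag event

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, e.g. [("href", "https://...")]
        attr_text = " ".join(f'{name}="{value}"' for name, value in attrs)
        self.lines.append("  " * self.depth + f"open  {tag} {attr_text}".rstrip())
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1
        self.lines.append("  " * self.depth + f"close {tag}")


sample = '<body><h2>Hello</h2><a href="https://www.iastate.edu/">Iowa State University</a></body>'

printer = TagPrinter()
printer.feed(sample)
printer.close()
print("\n".join(printer.lines))
```

The `h2` and `a` tags show up indented inside `body`, and the `href` value appears as an attribute of the opening `a` tag.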

### Use Python to fill an HTML file template
- Use case: 
    - Maybe you need to make several, very similar HTML docs (web pages)
    - Make a mini info page for each person working here, using the company web styling (css)
    - for each person we'd only need name, title, phone, room, maybe an image
    - maybe have a link to a central "home" page
    - that page could be served as the result of a web search

<p>

- Setup:
    - use a template and fill in the blanks
    - uses a specific name and title for each person, rest of the page is the same for all people
    - use a css file (examples are in the css folder) to make the page fit the company wide style
    
<p>

- Note: the [Html module](https://pypi.python.org/pypi/html/) would be an easier way to create HTML documents, but I want you to work with the raw HTML for now.

This is an example of the filled-in template:

```HTML
<!DOCTYPE html>
<html lang="en-us">
<head>
<meta charset="utf-8">
<title> Chris Harding's Webpage </title> 
</head>
<body>
<div>
<h1>Chris Harding's Webpage</h1>
This is my web page! Gef&#228;llt sie euch?<br><br>
<a href="https://www.iastate.edu/"> Click here for the ISU home page<br> </a>
</div>
</body>
</html>
```

In [1]:

# Make HTML code by filling in super simple template for a web page
def fill_in_template(my_title, my_text):
    
    home_page_link = 'href="https://www.iastate.edu/"'

    doctype = "<!DOCTYPE html>\n"

    html = '<html lang="en-us">\n' # open html tag
    
    head = "<head>\n" # open head tag
    head += '<meta charset="utf-8">\n' # add meta info so we can use utf8 chars, like german umlauts
    
    title = "<title> " + my_title + " </title> \n" # make title  tag
    
    head += title  + "</head>\n" # add title to head and close the head tag


    body = '<body>\n' # open body tag
    
    div = '<div>\n' # open div tag (a division or section)
    div += "<h1>" + my_title + "</h1>\n"     # add title inside heading 1
    div +=  my_text + "<br><br>\n"           # add the text and insert line breaks

    anchor = "<a " + home_page_link + ">" # make home page link, its URL is inside the tag
    anchor = anchor + " " + "Click here for the ISU home page<br>" + " </a>\n"  # close anchor tag
    
    div += anchor  # add anchor node to div tag
    div += '</div>' # close div tag
    
    body += div + "\n</body>\n" # add div to body tag and close it

    # Assemble the page from the parts we created earlier
    p = doctype  + html + head + body + "</html>\n"

    return p

In [2]:
# Make and save a personal web page
page = fill_in_template("Chris Harding's Webpage", "This is my web page! Gef&#228;llt sie euch?")
print(page) # html page we created

<!DOCTYPE html>
<html lang="en-us">
<head>
<meta charset="utf-8">
<title> Chris Harding's Webpage </title> 
</head>
<body>
<div>
<h1>Chris Harding's Webpage</h1>
This is my web page! Gef&#228;llt sie euch?<br><br>
<a href="https://www.iastate.edu/"> Click here for the ISU home page<br> </a>
</div>
</body>
</html>
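
The use case mentioned earlier was one mini page per person. As a rough sketch of how that could look, the example below fills the same kind of page for several people using `string.Template` from the standard library instead of string concatenation. The people, their phone numbers and rooms are invented for this example, and the file naming scheme is just one possible choice.

```python
from string import Template

# Template with $placeholders; the rest of the page is identical for everyone
PAGE_TEMPLATE = Template("""<!DOCTYPE html>
<html lang="en-us">
<head>
<meta charset="utf-8">
<title> $name </title>
</head>
<body>
<div>
<h1>$name</h1>
$title<br>
Phone: $phone, Room: $room<br><br>
<a href="https://www.iastate.edu/"> Click here for the ISU home page<br> </a>
</div>
</body>
</html>
""")

# Hypothetical staff records, made up for this example
people = [
    {"name": "Ada Example", "title": "Lecturer", "phone": "555-0100", "room": "101"},
    {"name": "Bob Sample", "title": "Lab Manager", "phone": "555-0101", "room": "102"},
]

# Build one page per person, keyed by the file name it could be saved under
pages = {}
for person in people:
    filename = person["name"].lower().replace(" ", "_") + ".html"
    pages[filename] = PAGE_TEMPLATE.substitute(person)

for filename in pages:
    print(filename, "has", len(pages[filename]), "characters")
```

Each entry of `pages` could then be written to disk with `open(filename, "w")`, just like a single page.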



In [3]:
# save page to file
filename = "index.html"
with open(filename, "w+") as f:
    print(page, file=f)

In [4]:
# this should open the saved page automatically in a new browser window
import webbrowser # https://docs.python.org/3.7/library/webbrowser.html#module-webbrowser
webbrowser.open('index.html',  autoraise=True);

# this may not work! Alternatively, find the file in your file manager and open it there.
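
One possible reason for `webbrowser.open('index.html')` not working is that some browsers want a full URL rather than a bare file name. A minimal sketch of a workaround, assuming the file sits in the current working directory: turn the file name into an absolute `file://` URL first. The helper name `local_file_url` is made up for this example.

```python
import webbrowser
from pathlib import Path


def local_file_url(filename):
    """Turn a file name in the current folder into an absolute file:// URL."""
    return (Path.cwd() / filename).as_uri()


url = local_file_url("index.html")
print(url)

# webbrowser.open(url)  # uncomment to actually open the page in a browser
```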

### Running a simple local http server to serve the index.html file
- run the cell below, then open a browser (Chrome?) and go to this URL: http://127.0.0.1:8080/
- 127.0.0.1 is the loopback address (localhost), i.e. it always refers to your own machine, which is where our local http server runs
- 8080 is the port on which this server listens for connections 
- this will load `index.html` by default and render it in the browser

<p>

- Once a web browser connects, you'll see `127.0.0.1 - - [07/Apr/2020 13:32:29] "GET / HTTP/1.1" 200 -` in the log
- the server received a GET request and successfully (code 200) fulfilled it
- the server is able to handle multiple clients, connect to it with multiple pages or multiple browsers!

<p>

- ignore the `code 404, message File not found` message: the browser wants to load a mini image file as icon (favicon.ico), which we don't provide
 
<p>

- to shut down the server, you must use Stop-cell-execution (or Interrupt), otherwise jupyter will loop forever
- if you get [OSError: [Errno 48] Address already in use](https://stackoverflow.com/questions/19071512/socket-error-errno-48-address-already-in-use), restart your notebook, reopen all browsers or use a different port (8081, etc.) 

<p>

- This provides the so-called backend to the web page code (the front end)
- Later we will use Flask as server framework to dig much deeper into back-end programming

(Note: this did not seem to work during the recorded lecture, but when I ran it again 10 min after the lecture it worked fine. Very confused as to what I did wrong during the lecture ...)

In [None]:
import http.server
import socketserver

PORT = 8080
Handler = http.server.SimpleHTTPRequestHandler

# open a browser to this URL: http://127.0.0.1:8080/

with socketserver.TCPServer(("", PORT), Handler) as httpd:
    print("serving at port", PORT)
    httpd.serve_forever()
print("done")
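
If you'd rather see the 200 and 404 codes without switching to a browser, here is a sketch that starts the same kind of server in a background thread, requests two files from it with `urllib` and then shuts it down again. The setup is an assumption made for this example: it serves a throwaway temporary folder (not your notebook folder) and asks the operating system for any free port by passing port 0.

```python
import functools
import http.server
import tempfile
import threading
import urllib.error
import urllib.request
from pathlib import Path

# Opener that ignores system proxy settings, so 127.0.0.1 is contacted directly
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_status(url):
    """Send a GET request and return the numeric HTTP status code."""
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a missing file


with tempfile.TemporaryDirectory() as folder:
    # a tiny index.html for the server to serve
    Path(folder, "index.html").write_text("<html><body>Hello</body></html>", encoding="utf-8")

    # same handler class as above, but pointed at the temporary folder
    handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=folder)
    httpd = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)  # port 0 = any free port
    port = httpd.server_address[1]
    threading.Thread(target=httpd.serve_forever, daemon=True).start()

    base = f"http://127.0.0.1:{port}"
    status_index = fetch_status(base + "/")               # index.html is served by default
    status_favicon = fetch_status(base + "/favicon.ico")  # we don't provide this file

    httpd.shutdown()
    httpd.server_close()

print("GET /            ->", status_index)
print("GET /favicon.ico ->", status_favicon)
```

Because the server runs in its own thread, the cell returns on its own and there is no need to interrupt the kernel.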


#### Other things to explore
- when opening index.html in the browser (manually or via your local server), try this:
    - look at the html source code: Right-click => view source (or use this URL: `view-source:http://127.0.0.1:8080/`)
    - open a Web debugger and look at the messages (Web developer - Debugger)

## Browser UI elements (Javascript)
- What about interactive elements (GUI)?  
- [Javascript](http://www.w3schools.com/js/js_examples.asp) runs inside the browser (client side), so you can't use Python there* (*unless you use something like https://brython.info)  
- HTML is used to define the GUI elements (widgets)
- callback functions inside the javascript snippets (script nodes) are used to respond to events 
- Interactive Web pages use GUI elements (forms) and Javascript together
- Typically, higher level "tools" like [jquery](http://jquery.com/) are used instead of pure JS

<br>

- example: `javascript_example.html`  (open it via the jupyter browser) to look at the code that creates this:


![Image](https://lh5.googleusercontent.com/VnePLzOuE3fZvyyhGNLn3VwDUlDhMpKV3YXQJLJft4Z1yIffsHKTodjjn6sE7cGuNZlrGS4K1byFo1M7JnwJfg4HZEpJ6tiOVArrK_Re8iyPOd7iWR994bV8JM9HPFM3zXXgrr_t) 



- This is the html doc the browser evaluates
- the javascript code is only inside the  __`<script>`__ tag
- I defined 2 callback functions for mouse over and mouse out events (only for the heading tag, h1) and one for the submit button:
    - `mOver()` and `mOut()` react to mouse enter and leave
    - `submitStuff()` reacts to the submit button, which is just a dummy here b/c we don't have a server to receive any of the browser's values 
- Note how setting `.innerHTML` writes text into a previously empty paragraph, which is found via its `id`

<p>

- special tags (`select`, `input`, `form`, etc.) are used as GUI widgets
- they seemingly work but won't automatically send the new value to the server
- `action="/processStuff" method="post"` is how we would connect to the server
- we will see this in action later when we set up a Flask web server framework (more complex than the simple http server from earlier ...)

```HTML
<!DOCTYPE html>
<html>
  <body>
  
	<!-- script node with callbacks in JavaScript, which runs inside the browser -->
	<script>
	    function mOver(obj){
            obj.innerHTML=Date()
	    }
	    function mOut(obj){
            obj.innerHTML="Mouse Over Me to see the date";
	    }
	    
	    // simulates giving values set in the GUI to the server	    
	    function submitStuff()
	    {        
            // get the values and transmit them to processStuff() on the server
            v = document.getElementById('ddlist').value;
		
            // sets the text of the message 	        
	        document.getElementById('message').innerHTML ="sending list value: " + v;
	    }
    </script>


	<!-- callbacks for mouse-over (enter) and mouseout (leave) -->
	<h1 onmouseover="mOver(this)" onmouseout="mOut(this)" >
		Mouse Over Me to see the date   <!-- Initial text -->
	</h1>


	<!-- form widget: values are sent to the server when submit is pressed -->
	<form action="/processStuff" method="post">
    
       		Example of two Radio buttons: <br>       
        	<input type="radio" name="Status" value="On">On<br>
        	<input type="radio" name="Status" value="Off" checked>Off<br>
            
        	Example of a drop-down list with a pre selected value: <br>     
        	<select name="Languages" id="ddlist">
        		<option value="C++">C++</option>
                <option value="Java">Java</option>
                <option value="Python" selected>Python</option>
                <option value="Fortran">Fortran</option>
        	</select>
	</form>
    
    <!--  Submit button -->
	<input type="submit" value="Submit stuff to server" 	    
		  title="Submit values of the form to the server"
		  onclick="submitStuff()"
	>    
   
    <!-- the message will appear here-->
	<p id="message"></p> 
    
  </body>
</html>
```
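
To get a feeling for what "sending the values to the server" means, here is a rough Python sketch of the text a browser would put into the body of a POST request if a form with these two fields were actually submitted (by default, form values are URL-encoded). The field names `Status` and `Languages` come from the HTML above; the chosen values are just an example.

```python
from urllib.parse import parse_qs, urlencode

# Values as they might be selected in the form (field names taken from the HTML)
form_values = {"Status": "Off", "Languages": "C++"}

# Roughly what the browser would send as the body of the POST request
body = urlencode(form_values)
print(body)

# What server-side code could do to get the values back out
received = parse_qs(body)
print(received)
```

Note that `parse_qs` returns a list for each field name, because a form may send several values under the same name.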

## Requests - downloading files from the web  

- It's pretty easy to download (simple) web content using Python:
- Open a remote connection to URL and read file content into a string (text)
- We could now process the string and write it into a local folder as a file 
- Python modules: urllib (standard library, lower level) or __requests__ (third-party, higher level, recommended)
- https://docs.python-requests.org/en/latest/index.html

### Analyzing (parsing) a html document with BeautifulSoup
- Task: Given a URL, display all URL links it contains
- (Could also be, download all images ...)
- HTML contains structured text as "data", e.g. it may contain links to other HTML documents
- We will now download an HTML file (similar to the index.html file we created earlier) and analyze it: we'll treat the HTML as a text file with special tags embedded in it.

<p>

- __BeautifulSoup__ is a module that provides a parser for html (as a "text file")
- (we will first need to download the text via a URL)
- BeautifulSoup will go through this text and convert it into a linked (tree-like) data structure of tags that we can search; a tag's attributes can be looked up similar to a dict
- we need to find all the `<a> </a>` (anchor) tags, e.g.


```
<a href="http://whatever.com">SomeText</a>
```

- we need to pull out the value of `href` which contains the URL of the link, here: `http://whatever.com`

In [5]:
%pip install beautifulsoup4
%pip install requests

Collecting beautifulsoup4
  Downloading beautifulsoup4-4.12.3-py3-none-any.whl (147 kB)
     ---------------------------------------- 0.0/147.9 kB ? eta -:--:--
     -------- ------------------------------ 30.7/147.9 kB 1.3 MB/s eta 0:00:01
     ---------------------------- --------- 112.6/147.9 kB 1.3 MB/s eta 0:00:01
     -------------------------------------- 147.9/147.9 kB 1.3 MB/s eta 0:00:00
Collecting soupsieve>1.2
  Downloading soupsieve-2.5-py3-none-any.whl (36 kB)
Installing collected packages: soupsieve, beautifulsoup4
Successfully installed beautifulsoup4-4.12.3 soupsieve-2.5
Note: you may need to restart the kernel to use updated packages.





In [6]:
from bs4 import BeautifulSoup 
import requests 

# The URL of the HTML doc we want to analyze
url= "http://www.iastate.edu/"

# Use requests to grab a HTML document
resp = requests.get(url)

# .text contains the "source" of the page (as simple text)
page_content = resp.text 
print(page_content[:1000]) # show only the first 1000 characters of the page

<!DOCTYPE html>
<html lang="en" dir="ltr" prefix="og: https://ogp.me/ns#">
  <head>
    <meta charset="utf-8">
    <!-- Prevents GDPR-dependent scripts from executing on load -->
    <script>
      window.YETT_BLACKLIST = [
        /addthis\.com/
      ];
    </script>
    <script src="//unpkg.com/yett"></script>
    <link rel="preconnect" href="https://fonts.googleapis.com">
    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
    <link href="https://fonts.googleapis.com/css2?family=Merriweather:wght@400;700&display=swap" rel="stylesheet">
    <link rel="icon" href="https://cdn.theme.iastate.edu/favicon/favicon.ico" sizes="any"><!-- 48×48 -->
    <link rel="icon" href="https://cdn.theme.iastate.edu/favicon/favicon.svg" type="image/svg+xml">
    <link rel="apple-touch-icon" href="https://cdn.theme.iastate.edu/favicon/apple-touch-icon.png"><!-- 180×180 -->
    <link rel="manifest" href="https://cdn.theme.iastate.edu/favicon/manifest.webmanifest">
    <meta charset="u

In [7]:
# Analyze ("parse") the page content as html (there are parsers for other types of data)
soup = BeautifulSoup(page_content, "html.parser")  

# find all anchor tags; the result is a list of tag objects (not plain strings)
anchor_tags = soup.find_all('a')
print("found", len(anchor_tags), "tags:\n")

# and print them out
for a in anchor_tags:
    print(a)
    

found 126 tags:

<a class="skip-link" href="#main-content">
      Skip to main content
    </a>
<a class="site-header__logo" href="https://www.iastate.edu/">
<img alt="Iowa State University" class="site-header__logo-mobile" src="/themes/custom/iastate2022/img/iowa-state-university-logo-no-tagline-red.svg"/>
<img alt="Iowa State University of Science and Technology" class="site-header__logo-desktop" src="/themes/custom/iastate2022/img/iowa-state-university-logo-with-tagline-red.svg"/>
</a>
<a href="https://apps.admissions.iastate.edu/myaccount/">Admissions MyAccount</a>
<a href="/request-info">Request Info</a>
<a data-drupal-link-system-path="node/78" href="/admission-and-aid/apply"> Apply</a>
<a data-drupal-link-system-path="node/79" href="/admission-and-aid/visit">Visit</a>
<a href="https://www.foundation.iastate.edu/s/1463/giving/start.aspx"> Give</a>
<a href="https://students.info.iastate.edu">Current Students</a>
<a href="https://facultystaff.info.iastate.edu">Faculty and Staff</a>

In [8]:
# Grab only the value of href (the link URL), e.g.:

link = anchor_tags[0]
print(link.get('href')) # this starts with a #, so it's an internal link


link = anchor_tags[7]
print(link.get('href')) # this is a proper external link

#main-content
https://students.info.iastate.edu


In [9]:
# pull out all link URLs  (the <a> or anchor tag)
links = []
for link in anchor_tags:
    print("\n", link)
    href = link.get('href')  # None if the tag has no href attribute
    if href is not None and href.startswith("http"): # only collect external (internet) links
        print(href)
        links.append(href)


 <a class="skip-link" href="#main-content">
      Skip to main content
    </a>

 <a class="site-header__logo" href="https://www.iastate.edu/">
<img alt="Iowa State University" class="site-header__logo-mobile" src="/themes/custom/iastate2022/img/iowa-state-university-logo-no-tagline-red.svg"/>
<img alt="Iowa State University of Science and Technology" class="site-header__logo-desktop" src="/themes/custom/iastate2022/img/iowa-state-university-logo-with-tagline-red.svg"/>
</a>
https://www.iastate.edu/

 <a href="https://apps.admissions.iastate.edu/myaccount/">Admissions MyAccount</a>
https://apps.admissions.iastate.edu/myaccount/

 <a href="/request-info">Request Info</a>

 <a data-drupal-link-system-path="node/78" href="/admission-and-aid/apply"> Apply</a>

 <a data-drupal-link-system-path="node/79" href="/admission-and-aid/visit">Visit</a>

 <a href="https://www.foundation.iastate.edu/s/1463/giving/start.aspx"> Give</a>
https://www.foundation.iastate.edu/s/1463/giving/start.aspx

 <a 
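
The loop above simply skips links like `/request-info` because they don't start with `http`. If you wanted to keep those too, one option is to turn them into full URLs with `urljoin` from the standard library. This is only a sketch: the `hrefs` list is copied by hand from the output above (plus a `None` for a tag without href) rather than taken from `anchor_tags`.

```python
from urllib.parse import urljoin

base_url = "http://www.iastate.edu/"  # the page the links were found on

# href values copied from the output above; None stands for a tag without href
hrefs = ["#main-content", "/request-info", "/admission-and-aid/apply",
         "https://students.info.iastate.edu", None]

absolute_links = []
for href in hrefs:
    if href is None or href.startswith("#"):
        continue  # no href at all, or just a jump within the same page
    # relative links get the base URL in front, full URLs are left unchanged
    absolute_links.append(urljoin(base_url, href))

for link in absolute_links:
    print(link)
```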

In [None]:
# Write out a minimal html page with just the links and dump into a file
with open("ISU_mainpage_external_links.html", "w+", encoding="latin-1") as f: 
    print("<HTML>\n", file=f)
    for link_url in links:
        # single quotes outside, so the double quotes around the URL need no escaping
        anchor_tag = '<a href="' + link_url + '"> ' + link_url + ' </a><br>' # <br> is a line break in HTML
        print(anchor_tag + "\n", file=f)
    print("</HTML>\n", file=f)

- note that the link with index 0 starts with a __`#`__, not with __`http`__, so it's a local (in-page) link!
- link 7 (https://students.info.iastate.edu) starts with http, so it's a proper external link URL

<br>

- The loop above pulled out all the links but kept only those starting with `http`
- those URLs were stored in a list of links and then written into a minimal HTML file

- Open `ISU_mainpage_external_links.html` in a browser to view it

In [None]:
webbrowser.open_new_tab("ISU_mainpage_external_links.html");


### Optional: Search wikipedia for info on a title
- our search term must be transmitted to the server as special parts of the URL
- Example: `http://en.wikipedia.org/w/api.php?titles=Python%20Programming`
- the part of the URL after the __?__ holds the query parameters (or query arguments)
- query parameter syntax: `parameter_name`=`parameter_value`
- multiple query parameters are separated by &
- you must use __`%20`__ instead of spaces!
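
You don't have to type the `%20` and `&` by hand. As a small sketch, the standard library's `urllib.parse` can build the query part of the URL from a dictionary; the parameter names below are the ones from the Wikipedia link used in this section.

```python
from urllib.parse import quote, urlencode

# quote() replaces characters that are not allowed in a URL, e.g. spaces
print(quote("Python Programming"))

# urlencode() joins name=value pairs with & and quotes each value
params = {"action": "query", "prop": "info", "format": "json",
          "titles": "Python Programming"}
query_string = urlencode(params, quote_via=quote)

url = "http://en.wikipedia.org/w/api.php?" + query_string
print(url)
```

Passing `quote_via=quote` makes spaces come out as `%20`; without it, `urlencode` would write them as `+`, which is the other common way of encoding a space in a query string.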

<p>

- Click on this link:
http://en.wikipedia.org/w/api.php?action=query&prop=info&format=json&titles=Python%20Programming 

- this requests info about `Python Programming`
- the response will be some data about the searched-for item in JSON format
- Depending on your browser this should give you something like this:


```
pageid	8531522
ns	0
title	"Python Programming"
contentmodel	"wikitext"
pagelanguage	"en"
pagelanguagehtmlcode	"en"
pagelanguagedir	"ltr"
touched	"2020-03-31T12:06:51Z"
lastrevid	95547385
length	43
redirect	""
```

- We can simulate this in Python and process the returned JSON data
- We'll use the `requests` module and give it a URL with our search term
- It will contact the server and grab its response

In [None]:
import requests

# Make the URL:

# 1) base URL for query for JSON type data return 
url = "http://en.wikipedia.org/w/api.php?action=query&prop=info&format=json&titles="

# 2) the actual search term
url += "Iowa%20State%20University" # need %20 for space!
#url += "Python%20Programming" #
#url += "Chris%20Harding" # try this to see how a failed search looks
print(url)

In [None]:
# issue a HTTP protocol GET request to the server at that URL
# get back a response object
resp = requests.get(url)

# 200 means we got a good response, 404 means the requested resource was not found
print(resp)  

- the response object has multiple attributes
- `.text` contains our data:

In [None]:
print(resp.text) # text attribute of response object

- the raw text of the response is a string in JSON format (similar to a Python dictionary)
- the response object's method `.json()` converts that string into a Python dictionary
- result: the server response was converted into a standard Python data structure

In [None]:
from pprint import pprint

# now convert the JSON response, which is essentially a python dictionary
# more on JSON format: https://www.w3schools.com/js/js_json.asp
resp_as_dict = resp.json() # method!
pprint(resp_as_dict)

### Parsing the response

- Parsing means converting the information contained in the dictionary into a format we can use.
- Requires understanding the "rules" (structure) of the information 
- In our case, we want to be able to access the actual wikipedia web pages found in our query

Rules:
- `query` is a dict that contains another dict (`pages`)
- `pages` is a dict of pages it found for this search term (we only have one but there could be multiple!)
- each page is stored with its pageid as key and another dict with some more info (language, when last edited, etc.) as value
- with this pageid we can put together a URL for the actual webpage

In [None]:
pprint(resp_as_dict['query']) 

In [None]:
pages = resp_as_dict['query']['pages']
pprint(pages) # dict with one or more(!) pages

In [None]:
pageids = list(pages.keys()) # list of all keys, essentially the page ids (as strings)
print(pageids[0]) # let's just take the first of possible multiple pageids

In [None]:
# make a URL that will get us that first page from wikipedia
url = "http://en.wikipedia.org/wiki?curid="  + pageids[0]

print(url)  

### Accessing the wikipedia page(s) we searched for
- we can now construct the URL(s) for the wikipedia pages our query returned
- Putting `http://en.wikipedia.org/wiki?curid=` in front of the page id will get us to the actual web page
- Example: `http://en.wikipedia.org/wiki?curid=8531522` 
- We could again download the html code for this URL and process (scrape) it
- For now we'll just look at the result in a browser
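
The cells above only use the first page id. As a sketch of how the same rules could be applied to every page in a response, the function below walks through `query` and `pages` and builds one URL per page. It runs on a hand-written sample instead of a live response; the names `page_urls` and `sample_dict` are made up, and the `"missing"` entry is my assumption of how a title that was not found is flagged.

```python
import json

# Shortened, hand-written sample shaped like the structure described above
sample_text = """
{"query": {"pages": {
    "8531522": {"pageid": 8531522, "ns": 0, "title": "Python Programming"},
    "-1": {"ns": 0, "title": "No Such Page", "missing": ""}
}}}
"""


def page_urls(resp_as_dict):
    """Return one Wikipedia URL for each page that was actually found."""
    urls = []
    for pageid, info in resp_as_dict["query"]["pages"].items():
        if "missing" in info:  # assumption: failed lookups are flagged like this
            continue
        urls.append("http://en.wikipedia.org/wiki?curid=" + pageid)
    return urls


sample_dict = json.loads(sample_text)  # same conversion that resp.json() performs
print(page_urls(sample_dict))
```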

## Optional: Scraping web content (e.g. images from xkcd.com)
- from Al Sweigart's book: https://automatetheboringstuff.com/chapter11/
- xkcd has the current (latest) image at this page `http://xkcd.com`
- we will parse each page with BeautifulSoup
- the element containing the actual image is selected with `#comic img`
- we will download the image and store it in a folder called `xkcd`
- there's also a previous button at the bottom 
- the element `a[rel="prev"]` links to the __previous__ page, so we'll load that page and get its image, etc.
- we will terminate after a few images, otherwise it would download thousands of images!

In [None]:
# downloadXkcd.py - Downloads the most recent XKCD comics (up to max_num_dls).
# from https://automatetheboringstuff.com/chapter11/
# written by Al Sweigart, with some small fixes

import requests, os, bs4
from IPython.display import display  
from PIL import Image

max_num_dls = 5 # number of images to download
url = 'http://xkcd.com' # starting url
os.makedirs('xkcd', exist_ok=True) # store comics in ./xkcd


while not url.endswith('#'):
    
    # Download the page.
    print('\n\nPage URL:', url)
    res = requests.get(url)
    res.raise_for_status()

    # Use BSoup to find the URL of the comic image.
    soup = bs4.BeautifulSoup(res.text,  "html.parser") # CH
    comicElem = soup.select('#comic img') # grab this tag
    
    if comicElem == []:
        print('Could not find comic image.')
    else:
        comicUrl = "http:" + comicElem[0].get('src') # get the link URL
        
        # Download the image.
        print("image URL", comicUrl)
        res = requests.get(comicUrl)
        res.raise_for_status()

        # Save the image to ./xkcd (the with block closes the file for us)
        filename = os.path.join('xkcd', os.path.basename(comicUrl))
        with open(filename, 'wb') as imageFile:
            for chunk in res.iter_content(100000):
                imageFile.write(chunk)
        
        # read image back in and show it
        img = Image.open(filename)
        display(img)

    # Check for limit: stop once max_num_dls pages have been processed
    max_num_dls -= 1
    if max_num_dls <= 0: 
        print("I think you've got enough images now!")
        break
              
    # Get the Prev button's url.
    prevLink = soup.select('a[rel="prev"]')[0]    
    url = 'http://xkcd.com' + prevLink.get('href')
    print("Url for previous link:", url)

print('Done.')


- better: use Scrapy https://scrapy.org/ for web crawling and scraping
- scraping with login via cookies: https://www.youtube.com/watch?v=PpaCpudEh2o 

## Optional: Travel analysis with the Google Directions API 

- many organizations provide a web API (Application Programming Interface) to access their data
- here I will talk about the Google Directions API only
- caveat: you need to get a Google developers key for this, which you can use for this and many other APIs.
- Start here: https://developers.google.com/maps/documentation/javascript/get-api-key
- Note that you will need a credit card to set up a billing account but there's a pretty generous number of free requests (up to 100,000 per month) to use for experiments like these. If you go over this quota you will be billed.
- (Yes, it's sad that a global near monopolistic company can no longer afford to give you a play around account that is limited to a small number of requests per day, like they used to do. Things must be grim at Google ...)
- Once you have an account you'll get a key (something like AIzaSyCkUOdZ5y7hMm0yrcCQoCvLwzdM6M8s5qk) which your request to the API must transmit to authenticate. 

<p>

- The API key used below is my own key. You are welcome to use it to play around but __please don't abuse it!__ I will delete the key at the end of the semester.

<p>

- Beyond this you'll need to get your own key: 
    - Make sure you enable the Directions API in the Google Cloud Platform Console 
    - In APIs & Services, go to Credentials and create an API key. You could restrict this key or simply delete it after you're done experimenting so nobody misuses it.
    - Paste your key into `key=` below.

### Problem: You are planning a bike trip. How long is the journey from A to B going to take?
- a request will ask Google for directions (route description) from A to B 
- the request parameters are sent as part of the URL (query parameters); the response comes back as JSON
- more on the parameters you can request: https://developers.google.com/maps/documentation/directions/intro
- Google will return trip data as a series of legs, each of which has a distance and a duration
```
...
'legs': [{'distance': {'text': '0.6 mi', 'value': 901},
          'duration': {'text': '2 mins', 'value': 143},
...
```
- the __value__ key of `duration` holds the duration in seconds, here 143 seconds
- All we need to do is add up the durations of all legs to get the total trip duration
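
Here is the adding-up idea as a small offline sketch, without any request to Google. The first leg is the one shown above; the second leg and the names `sample_routes` and `total_duration_seconds` are made up for this example.

```python
# Hand-written stand-in for the 'routes' part of a response
sample_routes = [
    {"legs": [
        {"distance": {"text": "0.6 mi", "value": 901},
         "duration": {"text": "2 mins", "value": 143}},
        # second leg is made up for this example
        {"distance": {"text": "5.0 mi", "value": 8047},
         "duration": {"text": "25 mins", "value": 1500}},
    ]}
]


def total_duration_seconds(routes):
    """Add up the duration values (in seconds) of all legs of all routes."""
    return sum(leg["duration"]["value"] for route in routes for leg in route["legs"])


total = total_duration_seconds(sample_routes)

# split the seconds into hours, minutes and seconds for a friendlier printout
hours, remainder = divmod(total, 3600)
minutes, seconds = divmod(remainder, 60)
print(f"total travel time: {total} secs = {hours} h {minutes} min {seconds} s")
```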

#### Example: Scenic bike trip from Ames to Boone (via Slater ...)
- all parameter values get stuffed into a dictionary
- origin and destination must be resolvable, so add the State if in doubt! `Ames,IA` to `Boone,IA`
- road crossings can be specified as well, so let's make a little detour through the junction of Grand and JunctionCity in Slater: `Slater,IA|Grand+JunctionCity,IA`

<p>

- make a GET request to the server at https://maps.googleapis.com/maps/api/directions/json and give it the dictionary as query parameters (`params=`)
- the result is a JSON response (a dictionary with structured route information), which we'll analyze

<p>

- note that the result contains a ton more information (as you will see when we look at the JSON response), but we're only interested in duration
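
To see what such a request looks like as a URL, and to avoid pasting a key directly into a notebook, here is a rough sketch that builds the request URL from the parameters without sending anything. The function name `build_directions_url`, the environment variable name `GOOGLE_MAPS_API_KEY` and the placeholder `YOUR_API_KEY` are assumptions made for this example.

```python
import os
from urllib.parse import urlencode

BASE_URL = "https://maps.googleapis.com/maps/api/directions/json"


def build_directions_url(origin, destination, waypoints="", mode="bicycling", key=""):
    """Put the trip parameters into the query part of the Directions URL."""
    params = {"origin": origin, "destination": destination, "mode": mode, "key": key}
    if waypoints:
        params["waypoints"] = waypoints
    return BASE_URL + "?" + urlencode(params)


# GOOGLE_MAPS_API_KEY is a hypothetical environment variable name;
# if it is not set, a placeholder is used so the example still runs
api_key = os.environ.get("GOOGLE_MAPS_API_KEY", "YOUR_API_KEY")

url = build_directions_url("Ames,IA", "Boone,IA", waypoints="Slater,IA", key=api_key)
print(url)
```

Note how the comma in `Ames,IA` shows up as `%2C` in the printed URL: special characters in the values get URL-encoded.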

In [None]:
import requests
from pprint import pprint

url = 'https://maps.googleapis.com/maps/api/directions/json'

# make a dict of request parameters (they are sent as URL query parameters)
json_params = dict(
    origin='Ames,IA',
    waypoints='Slater,IA|Grand+JunctionCity,IA', # scenic route
    destination='Boone,IA',
    mode='bicycling', # or walking or driving (default)
    sensor='false',
    key='AIzaSyDNjIgvOxDeDCVd5HbyjxOSVKIeYeA725U' # if this doesn't work, use you own key here!
)

# give trip specification parameters to the direction server 
resp = requests.get(url=url, params=json_params)

# look at the response in JSON format
data = resp.json()
pprint(data)

In [None]:
# find value for routes, will be a list!
routes = data['routes']
pprint(data['routes'][0]) # look at the first route

# There could be multiple routes (hence the list) but most likely there will be only 1

In [None]:
# loop over all routes, pull out each route's legs
# for each leg get the value of duration (in seconds) and add them all up
# Note that each leg is described as a series of steps with even more data, 
# but we'll just take the per leg summary

total_duration = 0

for r in routes:
    
    # for the current route, get the legs (another list)
    legs = r["legs"]
    
    # loop over all legs
    for l in legs:
        #pprint(l)
        
        # for the current leg, get the duration (dict)
        duration = l["duration"]
        
        # get the key called 'value' and add to total duration
        dur_secs = duration["value"]
        total_duration += dur_secs
        
        # Print out trip information
        print(l["start_address"], "to:", l["end_address"], ":", l["duration"]["text"], ",", l["distance"]["text"])
        

        
print("total travel time: ",total_duration, "secs or", total_duration/3600, "hours")

- Caveat: I'm unclear if the time of day of the request makes a difference in duration, so your results may be different




- Once you're done with this example, try planning some other trips:
    - a journey you've done many times (commute to work, vacation trip)
    - set mode to `driving` to simulate a journey by car instead of by bike
    
    
