#### week1 ~ the-web-as-filesystem...  &nbsp;&nbsp; (hw1pr1.ipynb)

[the google doc with hw1's details](https://docs.google.com/document/d/11ALzpsANe3ZDR5sk8-kgaElaX_fwlDINVlQ8WS8JfQE/edit)
<hr>

#### Problem 1:  Traversing the world - and web - without a browser.   

<b>Using the ISS + USGS APIs</b> 

(hw1pr1.ipynb)

+ Here, you'll make sure you have the `requests` library and then you'll
+ make some calls using `requests` to the International Space Station API and the US Geological Survey's earthquake API
+ "API" is short for "Application Programming Interface" 
  + Admittedly, this is not a very informative name, even fully expanded!
  + The API refers to the set of services, usually functions and/or urls, provided by some software (or site)

In [None]:
#
# see if you have the requests library
#

import requests

In [None]:
#
# If you _don't_ have the requests library, let's install it!
#

# for me, it worked to uncomment and run this command, here in this cell:
# !pip3 install requests  OR   !pip install requests

# an alternative is to run, in a terminal, the command  
#  pip3 install requests  OR    pip install requests   (the ! is needed only if inside Python)

# however, the "restart" button (with the loop-arrow) 
#  was _not_ enough for my notebook to recognize the library installed

# for me, I had to (1) disable + re-enable the jupyter extension
# then, (2) disable + re-enable the python extension
# which was enough to now "see" the newly installed library

# 
# My hunch is that some systems will need a full vscode shutdown/restart...
# The _Python: Select Interpreter_ command seems to help when it can find no kernels...
# That command can be accessed with command-shift-p (Mac) or control-shift-p (Win)
#

In [None]:
#
# hopefully, this now works! (if so, it will succeed silently)
#

import requests

Let's try it with the International Space Station api at [http://api.open-notify.org/iss-now.json](http://api.open-notify.org/iss-now.json)
+ [This page has documentation on the ISS API](http://open-notify.org/Open-Notify-API/ISS-Location-Now/)
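
Before making a live call, it can help to see the shape of the reply. The sketch below parses a hand-written sample in the reply format the ISS API documents (`message`, `timestamp`, `iss_position`). The specific numbers are invented for illustration, and `parse_iss_reply` is a hypothetical helper, not part of `requests` or the API.

```python
import json
from datetime import datetime, timezone

# A hand-written sample in the documented reply format (the values are invented)
sample_text = (
    '{"message": "success", "timestamp": 1643000000, '
    '"iss_position": {"latitude": "12.3456", "longitude": "-98.7654"}}'
)

def parse_iss_reply(text):
    """Hypothetical helper: turn the reply text into (lat, long, when)."""
    reply = json.loads(text)                 # JSON text -> Python dictionary
    position = reply["iss_position"]         # a nested dictionary
    lat = float(position["latitude"])        # the values arrive as strings
    long = float(position["longitude"])
    # the timestamp is in seconds since 1970 (Unix time); convert to a UTC datetime
    when = datetime.fromtimestamp(reply["timestamp"], tz=timezone.utc)
    return lat, long, when

lat, long, when = parse_iss_reply(sample_text)
print(f"lat = {lat}, long = {long}, at {when.isoformat()}")
```

The same digging works on `result.json()` from a live call, since it returns a dictionary with these keys.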

In [None]:
#
# we assign the url and obtain the api-call result into result
#    Note that result will be an object that contains many fields (not a simple string)
# 

url = "http://api.open-notify.org/iss-now.json"
result = requests.get(url)
result    

# if it succeeded, you should see <Response [200]>

In [None]:
#
# when exploring, you'll often obtain an unfamiliar object. 
# Here, we'll ask what type it is 
type(result)

In [None]:
#
# Here, we'll ask what fields it contains:  
#       dir(ob) returns Python's "directory" of fields available in the object ob
print(dir(result))

In [None]:
#
# Let's try printing a few of those fields: 
print(f"result.url is {result.url}")  # the original url
print(f"result.raw is {result.raw}")  # another object!
print(f"result.encoding is {result.encoding}")  # utf-8 is very common
print(f"result.status_code is {result.status_code}")  # 200 is success!

In [None]:
#
# In this case, we know the result is a JSON file, and we can obtain it:
json_contents = result.json()
print(json_contents)

# JSON is a text format modeled on JavaScript objects; result.json() parses it
#   into what is (almost) the same thing in Python: a dictionary

In [None]:
#
# In Python, we can use the resulting dictionary... let's see its keys:
print(list(json_contents.keys()))  

# it has three keys :-)   Let's see the value for the key 'iss_position':
print(f"json_contents['iss_position'] is {json_contents['iss_position']}")

# It's another dictionary!  Let's give it a name -- and then look at its keys!
val = json_contents['iss_position']
print(list(val.keys()))     # it has the lat and long!

In [None]:
#
# Notice that obtaining a specific piece of data may involve "digging" into the structure.
# Here's the latitude of the ISS:
val = json_contents['iss_position']
print(f"The ISS's latitude is {val['latitude']}")

# this is a string... if we want to compute with it, we'll need to convert to a numeric type!
lat = float(val['latitude']) 
claremont_lat = 34.0967
print(f"The ISS is {abs(lat - claremont_lat)}° away from Claremont, latitudinally!")

In [None]:
#
# Your task!     (Task #1)
#

#
# Continue the above reasoning to compute 
# (a) how many degrees of longitude the ISS is away from Claremont
# (b) whether the ISS is closer to Claremont, longitudinally or latitudinally
# (c) extra: estimate how many _miles_ away the ISS is from you right now...  
#            (this will require some extra web-searching and estimating! :-)



#### Not every url returns json data!
+ The url [https://www.cs.hmc.edu/~dodds/demo.html](https://www.cs.hmc.edu/~dodds/demo.html) returns a plain-text file with _markup_ text
+ that is to say, with HTML tags, such as `<title>Title</title>` to designate the components of its content
+ HTML stands for _hypertext markup language_   
+ Often anything with tags similar to `<b>be bold!</b>` is called "markup." 
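
To make the idea of tags concrete, here is a minimal sketch that pulls the title out of a small HTML string using the standard library's `html.parser`. The sample string is assembled from the two tag examples above, and `TitleFinder` is a hypothetical class name chosen for illustration.

```python
from html.parser import HTMLParser

class TitleFinder(HTMLParser):
    """Hypothetical helper: collects the text inside <title>...</title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False   # are we currently between <title> and </title>?
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:       # only keep text that sits inside the title tags
            self.title += data

sample_html = "<html><head><title>Title</title></head><body><b>be bold!</b></body></html>"
finder = TitleFinder()
finder.feed(sample_html)
print(f"The title is {finder.title!r}")
```

The parser calls the three `handle_...` methods as it meets each opening tag, closing tag, and run of text, which is why the tags are enough to locate each component.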

In [None]:
#
# here, we will obtain plain-text results from a request
url = "https://www.cs.hmc.edu/~dodds/demo.html"  # try it + source
#url = "https://www.webb.org/"  # try it + source
result = requests.get(url)        
print(f"result is {result}")        # hopefully it's 200
text = result.text                  # provides the HTML page as a large string...
print(f"len(text) is {len(text)}")  # let's see how large the HTML page is... 

In [None]:
print("The first 42 characters are\n")
print(text[:42])                  # we'll print the first few characters...  

# change this to text[:] to see the whole document...
# Notice that we can run many different analyses without having to re-call/re-scrape the page (this is good!)

<br>

#### But, we're going to focus on json-providing API calls for now 
+ for pr1 and pr2, at least, we'll use json
+ pr3 has the _option_ of using BeautifulSoup to parse raw html (up to you)
+ Next is another example API call to the ISS (to obtain the current astronaut list), and
+ an API call to the USGS, for earthquake data
+ Below, you'll use earthquake data to investigate seismic activity for your choice of place...
+   ... summarizing over quake-magnitude, areas-of-relevance, and different months
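
Since JSON will carry most of the data from here on, here is a small sketch of how JSON text and Python dictionaries relate, using the standard library's `json` module. The sample text is invented for illustration (the `crewed` and `notes` keys are not from any real API).

```python
import json

# JSON is text: note the lowercase true and the null, which are not Python names
json_text = '{"message": "success", "number": 2, "crewed": true, "notes": null}'

d = json.loads(json_text)        # text -> Python dictionary
print(type(d))
print(d)                         # true became True, null became None

round_trip = json.dumps(d)       # Python dictionary -> text again
print(round_trip)
```

This is the "almost" in "almost the same as a Python dictionary": the structure matches, but a few spellings differ between the two.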

<br>



In [None]:
#
# json takes some getting used to!
# 

# These two json files (really dictionaries) are the first of this week's two "quizzes" (in-class exercises)

#
# Task #2:    try out the following examples, with an "ear" toward digesting what's going on...
#

In [None]:
#
# The astros!       http://api.open-notify.org/astros.json

url = "http://api.open-notify.org/astros.json"
result = requests.get(url)
print(f"result is {result}")     # we will hope for a 200
print(f"the url used was {result.url}")   # let's see the url again...
json_contents = result.json()    # would throw an exception/error, if it were not json...  
print(f"json_contents.keys() are {list(json_contents.keys())}")

#
# It's best to separate the obtaining of the data from its analysis.
#   otherwise, you might accidentally obtain the data too often, 
#   angering the data-provider, who can stop listening to you...

In [None]:
#
# The astros as a json structure (dictionary)
d = json_contents

# equivalently,
# d = {"people": [{"craft": "ISS", "name": "Mark Vande Hei"}, {"craft": "ISS", "name": "Pyotr Dubrov"}, {"craft": "ISS", "name": "Anton Shkaplerov"}, {"craft": "Shenzhou 13", "name": "Zhai Zhigang"}, {"craft": "Shenzhou 13", "name": "Wang Yaping"}, {"craft": "Shenzhou 13", "name": "Ye Guangfu"}, {"craft": "ISS", "name": "Raja Chari"}, {"craft": "ISS", "name": "Tom Marshburn"}, 
# {"craft": "ISS", "name": "Kayla Barron"}, {"craft": "ISS", "name": "Matthias Maurer"}], "message": "success", "number": 10}

print(d)    # the printed form: everything on one long line
d           # as the cell's last line, this plain value displays more neatly than print(d)
            #   (a notebook only auto-displays the final expression in a cell)

In [None]:
#
# let's explore d...  Can we access 'Raja'?
d["people"][2]

In [None]:
#
# Let's loop over all of the names:
N = len(d["people"])

for i in range(N):  # loop over each index of d["people"]
    craft = d["people"][i]["craft"]
    person = d["people"][i]["name"]
    print(f"Item {i} is {person} on the {craft}")
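
The index-based loop above can also be written by looping over the entries directly, and `collections.Counter` can tally how many people are on each craft. This is a minimal sketch that uses a shortened copy of the sample dictionary shown earlier (with `number` adjusted to match), so it runs without a network call.

```python
from collections import Counter

# a shortened copy of the sample astronaut data (three of the entries)
d = {"people": [{"craft": "ISS", "name": "Mark Vande Hei"},
                {"craft": "Shenzhou 13", "name": "Zhai Zhigang"},
                {"craft": "ISS", "name": "Raja Chari"}],
     "message": "success", "number": 3}

# element-by-element looping; enumerate supplies the index alongside each entry
for i, entry in enumerate(d["people"]):
    print(f"Item {i} is {entry['name']} on the {entry['craft']}")

# tally the crafts: one count per distinct value of "craft"
craft_counts = Counter(entry["craft"] for entry in d["people"])
print(craft_counts)
```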

<br>

#### Earthquake data
[Here is the USGS Earthquake data API documentation](https://earthquake.usgs.gov/fdsnws/event/1/)
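
The USGS query endpoint takes its inputs as URL parameters. As a rough sketch of what happens when a parameter dictionary is attached to a base URL, the standard library's `urllib.parse.urlencode` can build the `key=value&key=value` query string by hand; `requests` does the equivalent for you when given `params=`. The parameter values here are the ones from the example URL in the next cell.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = "https://earthquake.usgs.gov/fdsnws/event/1/query"
param_d = {"format": "geojson",
           "starttime": "2022-01-23",
           "endtime": "2022-01-26",
           "minmagnitude": 5.6}

# join the base url and the encoded parameters with a "?"
full_url = base_url + "?" + urlencode(param_d)
print(full_url)

# going the other way: split a full url back into its parameters
recovered = parse_qs(urlsplit(full_url).query)
print(recovered)      # each value comes back as a list of strings
```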

<br>

In [None]:
#
# The quakes!   https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&minmagnitude=5.6&starttime=2022-01-23&endtime=2022-01-26

url = "https://earthquake.usgs.gov/fdsnws/event/1/query"
# note that this is much shorter than the above full url...
# that latter half consists of parameters to the API call...
# the parameters are documented here: 
#     https://earthquake.usgs.gov/fdsnws/event/1/#parameters

# for three of the parameters, let's use variables:
min_mag = 5.6               # the minimum magnitude considered a quake (min_mag)
start_time = "2022-01-23"   # this is the year-month-day format of the start
finish_time = "2022-01-26"  # similar for the end

# we assemble a dictionary of our parameters, named param_d
# there are many more parameters available. The problems below ask you to explore them...
param_d = {  "format":"geojson",         # this is simply hard-coded to obtain json
             "starttime":start_time,
             "endtime":finish_time,
             "minmagnitude":min_mag,
          }

result = requests.get(url, params=param_d)     # a named input, params, taking the value param_d, above
print(f"result is {result}")                   # hopefully, this is 200
print(f"the full url used was\n {result.url}")   # this includes the parameters!
json_contents = result.json()    # would throw an exception/error, if it were not json...  
print(f"json_contents.keys() are {list(json_contents.keys())}")

#
# It's best to separate the obtaining of the data from its analysis.
#   otherwise, you might accidentally obtain the data too often, 
#   angering the data-provider, who can stop listening to you...

In [None]:
#
# The quakes as a json structure (dictionary)
d = json_contents

d    # this plain-value formats more neatly than print(d)

In [None]:
#
# let's explore:
print(f'len(d["features"]) is {len(d["features"])}')

i = 0
d["features"][i]["properties"]


In [None]:
#
# with the structure, we can loop over them, extracting what we want, e.g., place:

# loop over them
N = len(d["features"])
for i in range(N):
    print(f'Item {i} was in this place: {d["features"][i]["properties"]["place"]}')

In [None]:
#
# Your task!     (Tasks #3 and #4)
#

#
# Continue the example below to create  
# (a) a larger set of API calls to the USGS "count" endpoint (each returns a json result)
# (b) a correspondingly larger text-formatted table of seismic activity (for Claremont, to start...)
# (c) the same larger table for another place (lat/long) of your choice 
# (d) a measurement-value ("metric") of your own design, of "seismic volatility"
# (e) answer which of Claremont + your other spot is more "seismically volatile"?
# (f) answer which _month_ is the most seismically volatile (your choice of location)
# 
# additional details (and hints) are in the hw1 google doc, as well...    On to these _unshakable_ inquiries: 

In [None]:
#
# This example uses a different endpoint of the same USGS API:     https://earthquake.usgs.gov/fdsnws/event/1/count
#

import time
url = "https://earthquake.usgs.gov/fdsnws/event/1/count"

# the parameters are documented here: 
#     https://earthquake.usgs.gov/fdsnws/event/1/#parameters

# we will be more ambitious with our API calls and parameters:
min_mag_list = [2.42, 4.42]      # now, a list of these!
start_time = "2022-01-01"   # this is the year-month-day format of the start
finish_time = "2022-01-31"  # similar for the end

all_jsons = {}  # an empty dictionary, to hold all of the resulting jsons...

# This time, we loop over our different min-magnitudes
for min_mag in min_mag_list:   # element-by-element looping, not index-based looping

    param_d = { "format":"geojson",         # this is simply hard-coded
                "starttime":start_time,
                "endtime":finish_time,
                "minmagnitude":min_mag,
            }

    result = requests.get(url, params=param_d)     # a named input, params, taking the value param_d, above
    print(f"for min_mag = {min_mag}, the result is {result}")                   # hopefully, this is 200
    print(f"the full url used was {result.url}")   # this includes the parameters!
    json_contents = result.json()    
    all_jsons[min_mag] = json_contents    # STORE INTO OUR STRUCTURE, named all_jsons
    print(f"json_contents is {json_contents}")
    time.sleep(0.1)                       # polite!

#
# It's best to separate the obtaining of the data from its analysis.
#   otherwise, you might accidentally obtain the data too often, 
#   angering the data-provider, who can stop listening to you...

In [None]:
#
# A starter table, with the two json results now in all_jsons
#

print(f'| {"-- globally --":^25s} |')                # ^ means "center"
print(f'|   {"min_mag":>10s} ~ {"count":>8s}   |')   # > means "right justify"
for min_mag in all_jsons:                            # loop over the keys, which are the min_mag floats
    current_json = all_jsons[min_mag]                # get the current json result
    current_count = current_json['count']            # obtain its count value
    print(f"|   {min_mag:>10.2f} ~ {current_count:>8d}   |") 
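
For the month-by-month comparison in task (f), one possible starting point is to generate the start and end date strings for every month, in the same year-month-day style used above. This is only a sketch: `month_ranges` is a hypothetical helper, and whether the final day is fully included depends on how the API interprets `endtime`, so it is worth checking the parameter documentation.

```python
import calendar

def month_ranges(year):
    """Hypothetical helper: a list of (starttime, endtime) strings, one per month."""
    ranges = []
    for month in range(1, 13):
        # monthrange returns (weekday of the 1st, number of days in the month)
        last_day = calendar.monthrange(year, month)[1]
        start = f"{year}-{month:02d}-01"
        end = f"{year}-{month:02d}-{last_day:02d}"
        ranges.append((start, end))
    return ranges

# show the first three months of 2022
for start, end in month_ranges(2022)[:3]:
    print(f"{start} to {end}")
```

Each pair could then be dropped into `param_d` as `starttime` and `endtime` inside a loop like the one over `min_mag_list`, keeping the `time.sleep` call between requests.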