# Homework 01 of EPS 88

The goal of this course is to empower you to learn about the Earth using computation.

The tools of data science enable us to learn about the Earth through the approaches of:
- Exploration
  - Identifying patterns in data through visualization
- Inference
  - Using data to obtain reliable insights about the Earth through the application of statistics
- Prediction
  - Using machine learning to analyze data we can observe and make informed predictions about things we cannot observe

You should have taken Data 8, be taking it concurrently, or have other Python experience from outside of Data 8. All Data 8 materials, including lecture videos, are openly available for self-study: http://www.data8.org/fa24/ The Data 8 textbook is also openly available: https://inferentialthinking.com/chapters/intro.html

## The Jupyter notebook environment

The Jupyter notebook is the environment we will be working in this semester in EPS 88.

### Markdown cells

This cell is Markdown text. It is the cell type where we can type text that isn't code. Go ahead and double click in this cell and you will see that you can edit it. **Type something here:**

### Code cells

Let's get going right away by dealing with some data within this Jupyter notebook. The first bit of code you will run is in the cell below. This is a code cell rather than a markdown cell (you can change the cell type using the drop-down box above). You can either hit the play button above, or more efficiently press *shift+enter* on your keyboard to run the code.

In [1]:
#This cell is a code cell. It is where we can type code that 
#can be run. The hashtag at the start of this line makes it 
#so that this text is a comment not code. 

import pandas as pd
pd.set_option('display.max_columns', None)

The reason why we execute the code ```import pandas as pd``` is so that we can use the functions of the ```pandas``` library which provides really helpful data structures and data analysis tools. We are using the standard convention of importing it using the nickname ```pd```. One of the fantastic things about doing data analysis in Python is the availability of great data analysis tools such as ```pandas```. One of the frustrating things can be learning how to use these diverse tools and which to use when. You will get more and more comfortable with these tools as the term progresses.

# Finding Birthquakes

Your birthquake is the largest magnitude earthquake that occurred on the day you were born. In this in-class exercise, we are going to search an earthquake catalog and find your birthquake.

To do so, we are going to download data from the US Geological Survey (USGS) Earthquake Hazards Program: https://earthquake.usgs.gov

We are going to use an API that lets us send a URL to the USGS and get earthquake information for a set of specified parameters.

## Finding my birthquake

Let's do it first for my birthday. We will define my birthday in year-month-day format and the day after my birthday in year-month-day format in order to make a url that gets data starting at 12 am on my birthday and ending at 12 am the next day. We are putting quote marks (' ') around the dates so that they are **strings** (the Python data type for a sequence of text characters).

In [2]:
## e.g., my_birthday = '1990-01-01'


What we just did in the code above is create the variable ```my_birthday``` and assign it the string 'year-mm-dd'. Run a code cell with just that variable to show its value as the code output.

Another way to see the variable is to tell python to print it using the ```print()``` function.
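As a small illustration of displaying a string, the sketch below uses a hypothetical variable `example_birthday` (a stand-in, not your actual `my_birthday`) and also checks its data type with the built-in `type()` function:

```python
# Hypothetical example date; a separate name leaves my_birthday untouched
example_birthday = '1990-01-01'

# print() displays the text of the string without the quote marks
print(example_birthday)

# type() reports the data type, which is str (string) here
print(type(example_birthday))
```

Note that ending a cell with just the variable name shows the quote marks, while `print()` does not.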

We need to tell the USGS to look for earthquakes between the start and the end of the day on my birthday. The end of that day is the start of the next day, which we can assign to a variable named `day_after_my_birthday`. Python variable names cannot have spaces in them, which is why I am using underscores `_`.

In [3]:
## e.g., day_after_my_birthday = '1990-01-02'
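Typing the next day by hand works fine here. If you are curious, the date could also be computed with Python's built-in `datetime` module. This is only a sketch using the example date from the comments above; the names `example_birthday` and `example_day_after` are hypothetical:

```python
from datetime import date, timedelta

# Hypothetical example date in year-month-day format
example_birthday = '1990-01-01'

# Convert the string to a date, add one day, and convert back to a string
next_day = date.fromisoformat(example_birthday) + timedelta(days=1)
example_day_after = next_day.isoformat()

print(example_day_after)
```

Computing the date this way takes care of month ends and leap years, which are easy to get wrong by hand.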


### Defining my birthday earthquake URL

To make a url that we can send to the USGS and get back data, we need to insert these dates into the USGS earthquake API url format (https://earthquake.usgs.gov/fdsnws/event/1/). API stands for 'application programming interface' and an API url path provides a way to access data that are available online. A nice thing about using an API url to access data is that it makes code portable and reproducible as the code can grab the data from the internet without needing a local copy.

We will define the `standard_url` as a string that specifies that we want the data as a `csv` with `format=csv` and that we want the data ordered by magnitude with `&orderby=magnitude`. We can then add the dates that were set above as the `&starttime` and `&endtime`.

In [None]:
standard_url = 'https://earthquake.usgs.gov/fdsnws/event/1/query?format=csv&orderby=magnitude'

my_birthquake_url = (standard_url + '&starttime=' + 
                       my_birthday + '&endtime=' + 
                       day_after_my_birthday)

Take a look at the url that is created:

In [None]:
my_birthquake_url

In [None]:
print(my_birthquake_url)
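Adding strings together with `+` is the approach used in this notebook. As an alternative sketch, Python's standard library can assemble the same kind of query string from a dictionary of parameters. The dates and the names `base_url`, `params`, and `example_url` below are hypothetical stand-ins:

```python
from urllib.parse import urlencode

# Base address of the USGS earthquake query service (from the url above)
base_url = 'https://earthquake.usgs.gov/fdsnws/event/1/query'

# Each key/value pair becomes key=value in the url, joined by &
params = {
    'format': 'csv',
    'orderby': 'magnitude',
    'starttime': '1990-01-01',  # hypothetical example date
    'endtime': '1990-01-02',    # hypothetical example date
}

example_url = base_url + '?' + urlencode(params)
print(example_url)
```

Keeping the parameters in a dictionary makes it easier to see, and later change, which options are being sent.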

### Getting my birthday earthquakes

We now have a url that we can use to get data from the USGS. We could cut and paste this url into a web browser. Go ahead and give that a try.

Alternatively, we can use functions from the `pandas` module that we imported at the top of this notebook to get these data. The standard way to use ```pandas``` is to import it with the shorthand ```pd```. We will use the ```pd.read_csv()``` function which will take the data that comes from the USGS url and make it into a `DataFrame`. A `DataFrame` is a data structure with columns of different data that are defined with column names.

In [6]:
## e.g., my_birthday_earthquakes = pd.read_csv(my_birthquake_url)
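To get a feel for what `pd.read_csv()` does without going to the internet, here is a minimal sketch that reads a tiny piece of CSV text. The rows are made up for illustration (they are not real USGS data), and only a few of the columns that the USGS returns are assumed here:

```python
import io
import pandas as pd

# Made-up CSV text: the first line holds the column names
csv_text = """time,latitude,longitude,mag,place
1990-01-01T00:00:00.000Z,10.0,20.0,5.5,Hypothetical place A
1990-01-01T12:00:00.000Z,-30.0,150.0,4.2,Hypothetical place B
"""

# io.StringIO lets read_csv treat the text as if it were a file
toy_earthquakes = pd.read_csv(io.StringIO(csv_text))

print(toy_earthquakes.columns.tolist())
print(toy_earthquakes.shape)  # (number of rows, number of columns)
```

Each line of text after the first becomes one row of the `DataFrame`, and each comma-separated value lands in the column named in the first line.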


These data are sorted by magnitude with the largest magnitude earthquakes at the top. Let's look at the first 5 rows of the DataFrame using the ```.head()``` function.

We can just look at values in one row by applying the ```.loc``` function to the DataFrame and calling the index 0. Python is zero-indexed (just like human ages!) so the first row is row zero. Please apply .loc to the first row to see all the details about my birthquake.

It can be useful to return a single value, which can be done by giving both the row and the column to ```.loc```. Please apply .loc to the first row and the column 'mag' to see the magnitude of my birthquake.
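Here is a sketch of how `.head()` and `.loc` behave, using a small made-up DataFrame named `toy_quakes` (hypothetical values, not real earthquake data) so that it does not give away your own answer:

```python
import pandas as pd

# Made-up values for illustration only
toy_quakes = pd.DataFrame({
    'mag': [6.1, 5.4, 4.9],
    'place': ['Place A', 'Place B', 'Place C'],
})

# The first 2 rows (with no number given, .head() returns 5 rows)
first_rows = toy_quakes.head(2)

# All of the values in the row with index label 0
first_row = toy_quakes.loc[0]

# A single value: the row with label 0 and the column 'mag'
first_mag = toy_quakes.loc[0, 'mag']

print(first_rows)
print(first_row)
print(first_mag)
```

Giving `.loc` the row and the column together, separated by a comma, goes straight to one cell of the table.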

What is the magnitude of your birthquake? Where did it occur? Use print statements to show your answers.

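What is the largest earthquake of your birthquake day? Use the ```.sort_values()``` function to sort the DataFrame by the 'mag' column and print the first row, which is the largest earthquake of your birthquake day. Note that ```.sort_values()``` sorts from smallest to largest unless you tell it otherwise with ```ascending=False```.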
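As a sketch of how sorting works, the example below sorts a small made-up DataFrame (`toy_quakes`, hypothetical values) from largest to smallest magnitude:

```python
import pandas as pd

# Made-up values, deliberately not in order of magnitude
toy_quakes = pd.DataFrame({
    'mag': [4.9, 6.1, 5.4],
    'place': ['Place A', 'Place B', 'Place C'],
})

# ascending=False puts the largest magnitude first
sorted_quakes = toy_quakes.sort_values('mag', ascending=False)

# .iloc[0] selects the first row by position after sorting
largest = sorted_quakes.iloc[0]

print(sorted_quakes)
print(largest)
```

One thing to watch for: sorting keeps the original index labels attached to their rows, so after sorting `.loc[0]` still returns the row that was originally labeled 0. Selecting by position with `.iloc[0]` is what returns the top row.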

When working with Python for this course, you are going to get errors. I get errors almost every day. Programming languages are not flexible and want things to be just so.

Error messages can look quite intimidating, but they often are informative (particularly if you look at the bottom). For example, the code cell below should result in an error. Go ahead and execute it and let's have a look at the result.

In [10]:
my_birthday_earthquakes.loc[0]['birthday_cake']
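The bottom of that error message should report a `KeyError` for `'birthday_cake'`, which means there is no column with that name. A common way to track down this kind of error is to list the column names that do exist. The sketch below does this for a small made-up DataFrame (`toy_quakes`, hypothetical values) and uses `try`/`except` so that the cell keeps running after the error:

```python
import pandas as pd

# Made-up one-row DataFrame for illustration
toy_quakes = pd.DataFrame({'mag': [6.1], 'place': ['Place A']})

caught_key_error = False
try:
    # There is no 'birthday_cake' column, so this raises a KeyError
    toy_quakes.loc[0]['birthday_cake']
except KeyError:
    caught_key_error = True
    print('There is no column with that name.')

# Listing the columns shows which names are valid
print(toy_quakes.columns.tolist())
```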

# Map projections and making your birthquake map

The purpose of this introduction is to give you a bit of a background on map projections and other geospatial concepts. This will help you to:
 
 * choose map projections that are appropriate for plotting data
 * understand the terms used in the functions of ```cartopy```, a function library we will use for plotting geospatial data

# The world *is not* flat / 2D (sorry flat-Earthers)

<img src="figures/azim-eq.png" style="max-height: 55vh; margin-left: auto; margin-right: auto;">

"Azimuthal equidistant projections of the sphere ... have been co-opted as images of the flat Earth model, depicting Antarctica as an ice wall surrounding a disk-shaped Earth." ([Wikipedia: Flat Earth](https://en.wikipedia.org/wiki/Flat_Earth#Flat_Earth_Society))

## Most of our media for visualization *are* flat

Our two most common media are flat:

 * Paper
 * Screen

### But there are *a few* that aren't...

For example:

 * 3D rendering engine (the engine is then typically responsible for projecting the data to 2D for presentation to screen)
 * A Spherical Projector...


## [Map] Projections: Taking us from spherical to flat

A map projection (more commonly referred to as just a "projection") is:

> a systematic transformation of the latitudes and longitudes of locations from the surface of a sphere or an ellipsoid into locations on a plane. [[Wikipedia: Map projection](https://en.wikipedia.org/wiki/Map_projection)].

## The major problem with map projections

<img src="figures/orange_peel.jpg" style="margin-left: auto; margin-right: auto;">

 * The surface of a sphere is topologically different to a 2D surface, therefore we *have* to cut the sphere *somewhere*
 * A sphere's surface cannot be represented on a plane without distortion.
 
**Watch the video embedded below** (*click the notebook play button to embed it in the notebook or watch it at this link: https://youtu.be/kIID5FDi2JQ*). This video gives an introduction (with nice accompanying visualizations) of this issue and different projections along with the positives and negatives of different commonly used ones.

In [None]:
from IPython.lib.display import YouTubeVideo
YouTubeVideo('kIID5FDi2JQ')

### Different projections

We are going to use the function library `cartopy` to make maps. `cartopy`  supports a number of different map projections which enable the 3 dimensional surface of Earth to be shown in 2 dimensions on our computer screens. Having watched the above video will give you some context to appreciate these jokes:

<img src="figures/map_projections.png" style="margin-left: auto; margin-right: auto;">

You can check out the list of projections supported by `cartopy` here: https://scitools.org.uk/cartopy/docs/v0.15/crs/projections.html

### Common distortions of map projections

Properties of maps that are often not preserved in projections:

* Area
* Shape
* Direction
* Distance
* Scale

> all ~~models~~ map projections are wrong, but some are useful - Phileas Elson (SciPy 2018)

## Classifying projections

Two common approaches:

 1. By [2D] surface classification
 2. By preserving a given property (metric)

### Projections by surface classification

<!-- ![](./figures/projections.png) -->
<img src="figures/projections.png">

*Downside: Not all projections can be classified in this way -> Leads to big "pseudo" and "other" groups.*

## Surface classification: Cylindrical

<img src="figures/cylindrical.png">
<p style="font-size: xx-small; float: right;">
Source: http://ayresriverblog.com/2011/05/19/the-world-is-flat/
</p>


* Meridians and parallels are straight and perpendicular.


## Surface classification: Azimuthal

<img src="figures/azimuthal.png">
<p style="font-size: xx-small; float: right;">
Source: http://ayresriverblog.com/2011/05/19/the-world-is-flat/
</p>


* Parallels are complete circles
* Great circles through the central point are straight lines.

## Surface classification: Conic
<img src="figures/conic.png">
<p style="font-size: xx-small; float: right;">
Source: http://ayresriverblog.com/2011/05/19/the-world-is-flat/
</p>

* Meridians are straight equally-spaced lines
* Parallels are circular arcs.

### Projections by preserving metric

Downside: Some projections can live in multiple groups.

## Preserving metric: Conformal

Also known as Orthomorphic.

These projections preserve angles locally, implying that small circles anywhere on the Earth's surface map to circles of *varying size* in the projected space.

Examples of conformal projections:

 * Mercator
 * Transverse Mercator
 * Stereographic
 * Lambert conformal conic

## Preserving metric: Conformal

### Use in large scale maps (zoomed in)

Conformal projections are often used here so that shapes on the map resemble their physical counterparts.
Seamless online maps like OSM/Google/Bing typically use a Mercator projection, although Google Maps has begun using a 3D-rendered globe projection when the user zooms out:

> The first launch of [Google] Maps actually did not use Mercator, and streets in high latitude places like Stockholm did not meet at right angles on the map the way they do in reality. [[ref](https://productforums.google.com/d/msg/maps/A2ygEJ5eG-o/KbZr_B0h2hkJ)]

The major drawback: it is difficult to compare lengths or areas
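To put a number on this drawback, here is a minimal sketch that assumes a spherical Earth, for which the Mercator projection stretches lengths at latitude φ by a factor of 1/cos(φ) in every direction (and areas by the square of that factor). The function name `mercator_scale_factor` is hypothetical and is not part of `cartopy`:

```python
import math

def mercator_scale_factor(latitude_deg):
    """Length stretch of a spherical Mercator map at a given latitude."""
    return 1 / math.cos(math.radians(latitude_deg))

for latitude in [0, 30, 60]:
    length_factor = mercator_scale_factor(latitude)
    area_factor = length_factor ** 2
    print(latitude, round(length_factor, 2), round(area_factor, 2))
```

At 60° latitude, lengths are drawn twice as long and areas four times as large as they would be at the equator, which is why high-latitude regions look so big on Mercator maps.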

## Preserving metric: Conformal

### Use in small scale maps (zoomed out)

Maps reflecting directions, such as [aero]nautical charts, or whose gradients are important,
such as weather maps, are often drawn with conformal projections.

Historically, many world maps were drawn with conformal projections, but the fact that the scale of the map
varies by location makes it difficult to compare lengths or areas.
Some have gone as far as calling the Mercator projection imperialistic and racist.


## Preserving metric: Equidistant

No map projection can be universally equidistant.

Some projections preserve distance from some standard point or line.

Examples of projections that preserve distances along meridians (but not parallels):

 * Equirectangular / Plate Carree
 * Azimuthal equidistant


## Preserving metric: Equal-area


Equal-area maps preserve area measure, generally distorting shapes in order to do so.

Examples of equal area projections:
 * Albers conic
 * Eckert IV
 * Goode's homolosine
 * Lambert azimuthal equal-area
 * Lambert cylindrical equal-area
 * Mollweide
 * Sinusoidal

## Preserving metric: Compromise

Rather than perfectly preserving any metric properties, compromise
projections aim to strike a balance between distortions.
These compromises are often at the cost of polar distortions.

Examples:
    
 * Miller
 * Robinson
 * Winkel Tripel

## Tissot's indicatrix

A mathematical contrivance used to characterize local distortions of a map projection. Multiple circles of constant area (on the sphere/ellipsoid) are drawn on the map. By analysing the distortions, we can identify (or more often rule out) particular preserved metrics. You can see how dramatic the distortion is in an equirectangular projection.
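The stretching of the circles in an equirectangular (Plate Carree) map can be estimated with a little trigonometry. This sketch assumes a spherical Earth and a map whose standard parallel is the equator, where north-south distances keep their scale while east-west distances are stretched by 1/cos(latitude). The function name `plate_carree_stretch` is hypothetical:

```python
import math

def plate_carree_stretch(latitude_deg):
    """Width-to-height ratio of a small circle drawn on a Plate Carree map."""
    return 1 / math.cos(math.radians(latitude_deg))

for latitude in [0, 30, 60, 80]:
    print(latitude, round(plate_carree_stretch(latitude), 2))
```

Because the stretch only happens in the east-west direction, circles turn into wider and wider ellipses toward the poles. This tells us the projection is neither conformal (shapes change) nor equal-area (areas grow).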


<img src="figures/tissot.platecarree.1000km.png" style="margin-left: auto; margin-right: auto;">


## Now let's make your first map!

We are going to use ```cartopy``` in conjunction with ```matplotlib``` to make maps. ```cartopy``` can transform points, lines and images into different map projections. ```matplotlib``` provides tools to visualize these projections. We will import them using the standard conventions. **You must press play (or more efficiently shift+enter) on the cell that imports these function libraries for the rest of the code to work.**

In [12]:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt

The syntax of using these functions takes some getting used to. Here we will make a figure, create an axis object with a defined projection, and then plot coastlines and a stock image that shows elevation.

In [None]:
plt.figure(figsize=(8, 4))
ax = plt.axes(projection=ccrs.Mollweide())
ax.coastlines()
ax.stock_img()
plt.show()

Let's plot the location of Berkeley on a map. First we want to assign the latitude and longitude of Berkeley to variables:

In [14]:
Berkeley_latitude = 37.8715
Berkeley_longitude = -122.2730

Now we can use the ```plt.scatter()``` function to plot the location of Berkeley. ```plt.scatter()``` is one of the many plotting functions available using the ```matplotlib``` library that we imported above using ```import matplotlib.pyplot as plt```.

We give the ```plt.scatter()``` function ```Berkeley_longitude``` as the x-value, ```Berkeley_latitude``` as the y-value while also telling it to transform it into map coordinates (```transform=ccrs.PlateCarree()```) and to make the point red (```color='red'```). 

In [None]:
plt.figure(figsize=(8, 4))
ax = plt.axes(projection=ccrs.Mollweide())
ax.stock_img()
plt.scatter(Berkeley_longitude, Berkeley_latitude, 
            transform=ccrs.PlateCarree(), color='red')
plt.show()

Revisit the part of the notebook above where you found your birthquake and enter its latitude and longitude in the cell below, assigning them to ```birthquake_latitude``` and ```birthquake_longitude```.

In [None]:
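birthquake_longitude = ...  # replace the ... with the longitude of your birthquake
birthquake_latitude = ...  # replace the ... with the latitude of your birthquake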

Once ```birthquake_latitude``` and ```birthquake_longitude``` are defined, we can plot them instead of the position of Berkeley. Replace the ellipsis (`...`) with ```birthquake_latitude``` and ```birthquake_longitude``` in the cell below:

In [None]:
plt.figure(figsize=(8, 4))
ax = plt.axes(projection=ccrs.Mollweide())
ax.stock_img()
plt.scatter(..., ..., 
            transform=ccrs.PlateCarree(), color='red')
plt.show()

Now let's plot both the location of Berkeley and the location of your birthquake. Rather than a single value for x (i.e. a single value of longitude) and a single value for y (i.e. a single value of latitude), we want a list of x values and a list of y values. A list can be defined with square brackets with values separated by commas (e.g. ```[value1, value2]```).

In [None]:
plt.figure(figsize=(8, 4))
ax = plt.axes(projection=ccrs.Robinson())
ax.stock_img()
plt.scatter(
    [Berkeley_longitude, birthquake_longitude],
    [Berkeley_latitude, birthquake_latitude],
    transform=ccrs.PlateCarree(),
    color='red')
plt.show()
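The two lists given to `plt.scatter()` are matched up by position: the first longitude goes with the first latitude, the second with the second, and so on. The sketch below shows that pairing using the built-in `zip()` function. The second location is a hypothetical stand-in, not a real birthquake:

```python
# Longitude and latitude of Berkeley in degrees
berkeley_lon, berkeley_lat = -122.2730, 37.8715

# Hypothetical second location for illustration only
example_lon, example_lat = 140.0, 35.0

longitudes = [berkeley_lon, example_lon]
latitudes = [berkeley_lat, example_lat]

# zip() pairs up the items of the two lists by position
for lon, lat in zip(longitudes, latitudes):
    print('longitude:', lon, 'latitude:', lat)
```

This is why the two lists need to be the same length and in the same order.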

We can save the figure using ```plt.savefig()``` putting the name of the file with the extension within the ```()```. In this case, let's call it ```'map_w_Berkeley_and_birthquake.png'```. Let's also go ahead and add a title to the plot using ```plt.title()```.

In [None]:
plt.figure(figsize=(8, 4))
ax = plt.axes(projection=ccrs.Robinson())
ax.stock_img()
plt.scatter(
    [Berkeley_longitude, birthquake_longitude],
    [Berkeley_latitude, birthquake_latitude],
    transform=ccrs.PlateCarree(),
    color='red')
plt.title('map with location of Berkeley and my Birthquake')
plt.savefig('map_w_Berkeley_and_birthquake.png')

## Make a map of the 5 largest birthdate earthquakes

Use the code cells below to make another map where you plot the locations of the 5 largest magnitude earthquakes that occurred on the day you were born. Choose any projection you want (https://scitools.org.uk/cartopy/docs/v0.15/crs/projections.html). You can see that the examples above use Robinson and Mollweide projections.

Add a title to the map that has your name in it.

# Acknowledgments

This introductory text was modified by Prof. Nicholas Swanson-Hysell from a tutorial on working with geospatial data using the library ```cartopy``` that was presented at the 2018 SciPy conference by Phileas Elson (lots of great things to learn in this tutorial if you want to dig into it at some point):

https://youtu.be/AmidIx6Jmn8

https://github.com/SciTools/cartopy-tutorial

The materials in the linked tutorial were made available under an open license, provided that the original source is acknowledged.