# Project Lab 2: Part 1
## Core Functions and Initial Figures
___ 



### A) Download the necessary data files
    
Download the following data files from the `Data` folder inside the wk13 lab folder on Canvas and put them in a local or hub folder called `Data` in your own working folder for the project lab.  
- `IRIS_eq_010100_112422_mag4.csv`
- `m_coasts.csv`
- `all_boundaries.csv`

See section **2. Project Lab 2 Data** in the `project_lab2_background.ipynb` reading for file format details and how to provide the appropriate path (e.g. `./Data/m_coasts.csv`) to load them properly.


___ 

### B) Make a module `earthquake_fns`

You will create a new python (`.py`) file called `earthquake_fns.py` which contains the four functions detailed below. 

<!-- Your python file *must* contain the following `import` statements: 

```python
import numpy as np
from datetime import datetime  # to parse dates and times
import pandas as pd            # read, write and rearrange tabular datasets
``` -->

Make sure that your `.py` file can be imported and run without error by importing the file as a module from a new notebook. You will use this module for the remainder of the project lab and will add additional functions to it in **Part 2**. 
> e.g. `import earthquake_fns as quake` will import your module, naming it `quake` in your notebook, so you can call e.g. `get_coastlines` with `quake.get_coastlines(...)`.


<div class="alert alert-block alert-info">
    
**Note**: There are many lines of code that you can use to help you write the functions involving reading / manipulating `DataFrame` objects in the Week 13 notebook `wk13_pandas_getfamiliar.ipynb`.

</div> 

 

### function 1:  `get_coastlines`

**Code:** The body of `get_coastlines` should 

- take as input a `.csv` file containing two columns (column 1: longitudes of the world's coastlines, and column 2: latitudes of the world's coastlines) and read these columns into individual 1D arrays. 
- The function should also raise an `IOError` and exit with a helpful error message if an exception is encountered inside of the function. 

<!-- Note: the file contains pairs of `NaN` values to break up different coastlines -->

**Usage:**
```python
lon_coast, lat_coast = get_coastlines(coasts_file)  
```
**Inputs:** 
> -  `coasts_file`:  file containing coastline lon, lat coordinates:  column 1 = longitudes and column 2 = latitudes.

**Outputs:** 
> - *lon_coast*: an array of longitudes along world coastlines.
> - *lat_coast*: an array of latitudes along world coastlines.

***Note**: all inputs and outputs must be provided in the exact order as above. Output variable names in italics are suggested names.* 
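
A minimal sketch of what `get_coastlines` could look like, assuming the file has no header row and that wrapping every failure in an `IOError` is an acceptable reading of the error-handling requirement. Check the file format notes in the background reading and adjust `header=` if your file does have a header line.

```python
import numpy as np
import pandas as pd


def get_coastlines(coasts_file):
    """Return (lon_coast, lat_coast) as 1D float arrays read from a two-column CSV."""
    try:
        # ASSUMPTION: no header row; column 0 = longitude, column 1 = latitude.
        coasts = pd.read_csv(coasts_file, header=None)
        lon_coast = coasts.iloc[:, 0].to_numpy(dtype=float)
        lat_coast = coasts.iloc[:, 1].to_numpy(dtype=float)
    except Exception as err:
        # Re-raise any failure as an IOError with a message naming the file.
        raise IOError(f"get_coastlines: could not read '{coasts_file}': {err}") from err
    return lon_coast, lat_coast
```

Any `NaN` rows in the file are kept as `NaN` in both arrays, which is convenient because matplotlib breaks a plotted line wherever it meets a `NaN`.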

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import earthquake_fns as eq

___ 

### function 2:  `get_plate_boundaries`

**Code:** The body of `get_plate_boundaries` should 

- take as input a `.csv` file containing three columns (column 1: plate boundary name abbreviations, column 2: latitudes in degrees, column 3: longitudes in degrees) and read these columns into a dictionary. Each tectonic plate will have a key in the dictionary (the key should be the plate's name abbreviation), and the associated value for each key should be a $N\times2$ array  with longitudes in the first column and latitudes in the second column. 
- The function should also raise an `IOError` and exit with a helpful error message if an exception is encountered inside of the function. 

**Usage:**
```python
pb_dict = get_plate_boundaries(plates_file)  # returns a dictionary
```
**Inputs:** 
-  `plates_file`:  file containing plate boundary names, latitudes and longitudes.

**Outputs:** 
- *pb_dict*: a dictionary with a key-value pair for each unique tectonic plate in the file.
> - The keys for the dictionary should be the tectonic plate name abbreviation from the first column of the file.
> - The corresponding value for the key should be a $N \times 2$ array with longitudes for the plate in the first column, and latitudes in the second column. 
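
One possible way to build the dictionary is to group the rows by plate abbreviation. This is a sketch only: it assumes the file has no header row, and the column labels `plate`, `lat` and `lon` are names chosen here for illustration.

```python
import numpy as np
import pandas as pd


def get_plate_boundaries(plates_file):
    """Return {plate abbreviation: N x 2 array of [longitude, latitude]}."""
    try:
        # ASSUMPTION: no header row; columns are abbreviation, latitude, longitude.
        plates = pd.read_csv(plates_file, header=None, names=["plate", "lat", "lon"])
        pb_dict = {}
        # sort=False keeps the plates in the order they first appear in the file.
        for name, group in plates.groupby("plate", sort=False):
            # Swap the column order so that longitude comes first.
            pb_dict[name] = group[["lon", "lat"]].to_numpy(dtype=float)
    except Exception as err:
        raise IOError(f"get_plate_boundaries: could not read '{plates_file}': {err}") from err
    return pb_dict
```

Be aware that `pd.read_csv` treats the literal string `NA` as a missing value by default, so if one of the abbreviations in your file is `NA` it is worth checking that its key actually appears in the dictionary.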

 

___ 

### function 3: `get_earthquakes`

**Code:** The body of `get_earthquakes` should 

- read an input `.csv` file containing an arbitrary number of columns and rows into a `DataFrame` and return it.
- The function should also raise an `IOError` and exit with a helpful error message if an exception is encountered inside of the function. 


**Usage:**
```python
earthquakes = get_earthquakes(filename)  # returns a DataFrame
```
**Inputs:** 
-  `filename`:  path to a `.csv` file. In practice this will be the IRIS earthquakes file. 

**Outputs:** 
- *earthquakes*: a `DataFrame` containing the contents of the `.csv` file


*Note*: This function will be reusable for any problem that involves loading a `.csv` file and outputting a `DataFrame` object. *Output variable name in italics is a suggested name.* 



___ 

### function 4:  `parse_earthquakes_to_np`

**Code:** The body of `parse_earthquakes_to_np` should 

- take an input `DataFrame` containing earthquake data and extract the columns `Latitude`, `Longitude`, `Depth`, `Magnitude` into individual 1D arrays. 
- also extract the column `Time`, convert each entry from its ISO-format string into a `datetime` object, and return a 1D array of these `datetime` objects (see the notebook `wk13_pandas_getfamiliar.ipynb`, which shows how to do this, if you are stuck). 

**Usage:**
```python
lats, lons, depths, magnitudes, times = parse_earthquakes_to_np(df)  
```
**Inputs:** 
-  `df`:  a `DataFrame` containing earthquake data.  This must contain the columns `Latitude`, `Longitude`, `Depth`, `Magnitude` and `Time` (may also contain others that are not parsed here). 

**Outputs:** 
- *lats*: numpy array of floats containing earthquake latitudes in degrees
- *lons*: numpy array of floats containing earthquake longitudes in degrees
- *depths*: numpy array of floats containing earthquake depths in km
- *magnitudes*: numpy array of floats containing earthquake magnitudes (unitless)
- *times*: numpy array of **datetime objects** containing earthquake dates/times

***Note**: all inputs and outputs must be provided in the exact order as above. Output variable names in italics are suggested names.* 
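
A sketch of the parsing step, assuming the `Time` column holds ISO-format strings such as `2011-03-11T05:46:23`. If your strings carry a trailing `Z` or another suffix, strip it first or follow the approach in the Week 13 notebook.

```python
from datetime import datetime

import numpy as np
import pandas as pd


def parse_earthquakes_to_np(df):
    """Split an earthquake DataFrame into 1D numpy arrays."""
    lats = df["Latitude"].to_numpy(dtype=float)
    lons = df["Longitude"].to_numpy(dtype=float)
    depths = df["Depth"].to_numpy(dtype=float)
    magnitudes = df["Magnitude"].to_numpy(dtype=float)
    # ASSUMPTION: each Time entry is an ISO-format string that
    # datetime.fromisoformat can parse directly.
    times = np.array([datetime.fromisoformat(t) for t in df["Time"]], dtype=object)
    return lats, lons, depths, magnitudes, times
```

Passing `dtype=object` keeps the entries as Python `datetime` objects, so methods such as `strftime` remain available on each element.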

___ 

## C. Global Seismicity Characteristics:

### C1. Preliminaries:
- Open a new notebook and import your module `earthquake_fns`

- Using the functions in `earthquake_fns`: 
> - Load the coastlines data set using `get_coastlines`.
> - Load the plate boundaries data set using `get_plate_boundaries`.
> - Load the entire earthquake data set into a `DataFrame` using `get_earthquakes`.
> - Parse the `DataFrame` into arrays `lats`, `lons`, `depths`, `magnitudes`, `times` using `parse_earthquakes_to_np`.

<br>


- Write code to find the **largest magnitude** earthquake recorded this century. Identify and save the magnitude, date/time, latitude, longitude, depth of this earthquake in a new variable.  A dictionary would be a good data type to use to store the information about location, time, magnitude, depth that you find. You can use either numpy or pandas to accomplish this. Print your answer to your screen using a formatted string. 

- Similarly, find and save the magnitude, date/time, latitude, longitude, and depth of the **deepest** earthquake this century. Print your answer to your screen using a formatted string. 

In [5]:
# Data files live in the local `Data` folder, as described in Section A
longitudes, latitudes = eq.get_coastlines("./Data/m_coasts.csv")
pb_dict = eq.get_plate_boundaries("./Data/all_boundaries.csv")
earthquakes = eq.get_earthquakes("./Data/IRIS_eq_010100_112422_mag4.csv")
lats, lons, depths, magnitudes, times = eq.parse_earthquakes_to_np(earthquakes)

In [6]:
# Index of the largest-magnitude earthquake in the full data set
largest_magnitude_index = np.argmax(magnitudes)
largest_magnitude = {
    'magnitude': magnitudes[largest_magnitude_index],
    'date/time': times[largest_magnitude_index],
    'latitude': lats[largest_magnitude_index],
    'longitude': lons[largest_magnitude_index],
    'depth': depths[largest_magnitude_index],
}

# Adjacent f-strings are joined into one string, so no stray leading spaces are printed
print(f"Largest Magnitude: {largest_magnitude['magnitude']}\n"
      f"Date/Time: {largest_magnitude['date/time']}\n"
      f"Latitude: {largest_magnitude['latitude']}\n"
      f"Longitude: {largest_magnitude['longitude']}\n"
      f"Depth: {largest_magnitude['depth']} km")


Largest Magnitude: 9.1
Date/Time: 2011-03-11 05:46:23
Latitude: 38.2963
Longitude: 142.498
Depth: 19.7 km
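
C1 also asks for the deepest earthquake, which the cell above does not compute. A minimal sketch of one way to do it follows, assuming depths are stored as positive kilometres so that the deepest event is the maximum. The helper name `summarize_event` is hypothetical, and the short arrays are made-up placeholders standing in for the output of `parse_earthquakes_to_np`, not real catalogue values.

```python
from datetime import datetime

import numpy as np


def summarize_event(index, lats, lons, depths, magnitudes, times):
    """Collect the properties of one earthquake into a dictionary."""
    return {
        'magnitude': magnitudes[index],
        'date/time': times[index],
        'latitude': lats[index],
        'longitude': lons[index],
        'depth': depths[index],
    }


# Placeholder arrays for illustration only (NOT real data).
lats = np.array([10.0, -20.0, 35.0])
lons = np.array([100.0, -175.0, 140.0])
depths = np.array([15.0, 600.0, 30.0])
magnitudes = np.array([5.0, 6.5, 7.2])
times = np.array([datetime(2001, 1, 1), datetime(2010, 6, 1), datetime(2020, 12, 31)],
                 dtype=object)

# ASSUMPTION: depth is positive downward, so the deepest quake has the largest value.
deepest = summarize_event(int(np.argmax(depths)), lats, lons, depths, magnitudes, times)

print(f"Deepest quake: {deepest['depth']} km\n"
      f"Magnitude: {deepest['magnitude']}\n"
      f"Date/Time: {deepest['date/time']:%Y-%m-%d %H:%M:%S}\n"
      f"Latitude: {deepest['latitude']}\n"
      f"Longitude: {deepest['longitude']}")
```

The same helper works for the largest-magnitude event by passing `int(np.argmax(magnitudes))` as the index.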


### C2.  Make a Global Map: 

- Make a map of the locations of earthquakes listed in the last 2500 lines of the file.
> -  Plot the earthquakes as gray filled dots.
> -  Make sure your map has a 2:1 aspect ratio (because there are 360° of longitude and 180° of latitude) so that geography looks nice.

- Add the world coastlines and plate boundaries to the map.

- Add the locations of the largest magnitude and deepest quakes as distinct symbols/colors.

- As usual make sure your map is scientifically useful (e.g. annotate information on the deepest, largest quakes) and aesthetically pleasing.

- Find the dates spanned by the last 2500 lines in the data frame and give your plot a title indicating these e.g. `Seismicity from XX to YY`, where `XX` and `YY` are the dates you found.  See background reading notes in `project_lab2_background.pdf`: section **5. Coding notes** for converting datetime objects to strings.  



In [None]:
last_2500_lats = lats[-2500:]
last_2500_lons = lons[-2500:]
last_2500_depths = depths[-2500:]
last_2500_magnitudes = magnitudes[-2500:]
last_2500_times = times[-2500:]

largest_magnitude_index_2 = np.argmax(last_2500_magnitudes)
deepest_depth_index = np.argmax(last_2500_depths)

largest_magnitude_lat = last_2500_lats[largest_magnitude_index_2]
largest_magnitude_lon = last_2500_lons[largest_magnitude_index_2]

deepest_depth_lat = last_2500_lats[deepest_depth_index]
deepest_depth_lon = last_2500_lons[deepest_depth_index]

start_date = last_2500_times.min().strftime('%Y-%m-%d %H:%M:%S')
end_date = last_2500_times.max().strftime('%Y-%m-%d %H:%M:%S')

fig, ax = plt.subplots(figsize=(10, 5))


ax.plot(longitudes, latitudes, color='black', linewidth=0.5)


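for plate_name, plate_data in pb_dict.items():
    plate_lons = plate_data[:, 0].astype(float)
    plate_lats = plate_data[:, 1].astype(float)
    # A boundary that crosses the antimeridian jumps between +180 and -180,
    # which draws a spurious line across the whole map. Insert NaN at each
    # jump so that matplotlib breaks the line there.
    jumps = np.where(np.abs(np.diff(plate_lons)) > 180)[0] + 1
    plate_lons = np.insert(plate_lons, jumps, np.nan)
    plate_lats = np.insert(plate_lats, jumps, np.nan)
    ax.plot(plate_lons, plate_lats, label=plate_name, linewidth=1)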

ax.scatter(last_2500_lons, last_2500_lats, c='gray', s=20, edgecolors='none', alpha=0.5) #earthquake locations

ax.scatter(largest_magnitude_lon, largest_magnitude_lat, c='red', marker='*', s=100, label='Largest Magnitude') 
ax.scatter(deepest_depth_lon, deepest_depth_lat, c='blue', marker='^', s=100, label='Deepest Depth')
ax.set_xlabel("Longitude (degrees)")
ax.set_ylabel("Latitude (degrees)")
ax.set_title(f"Seismicity from {start_date} to {end_date}")
ax.legend(fontsize='small', loc='upper center', bbox_to_anchor=(0.5, -0.15), fancybox=True, shadow=True, ncol=3)

ax.set_xlim(-180, 180)
ax.set_ylim(-90, 90)


plt.show()



### C3: Earthquakes in 2022: 

- Make a second `DataFrame`, `df_2022`, containing only earthquakes from 2022. (Consult `wk13_pandas_getfamiliar.ipynb` for how to subselect from a `DataFrame`.) 

- Make a figure with 2 subplots side-by-side (1 row, 2 columns). 
> - Subplot 1: plot a histogram of quake depths in 2022.
> - Subplot 2: plot a histogram of quake magnitudes in 2022.
>
> *Tip*: Experiment with the number of bins in the histogram and choose one that you think results in a nice figure.


- On each subplot, add a labelled vertical dashed line marking the corresponding record for the entire dataset that you found in Section **C1** above: the deepest quake on the depth histogram and the largest-magnitude quake on the magnitude histogram.
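
The steps in C3 could be sketched as below. This assumes the `Time` column of the `DataFrame` still holds ISO-format strings that begin with the four-digit year. The function names `select_year` and `plot_depth_magnitude_hists` are hypothetical, and the bin count and colours are arbitrary starting points.

```python
import pandas as pd
import matplotlib.pyplot as plt


def select_year(df, year):
    """Return only the rows whose ISO-format Time string starts with the given year."""
    # ASSUMPTION: Time values are strings such as "2022-06-15T12:00:00".
    return df[df["Time"].str.startswith(str(year))]


def plot_depth_magnitude_hists(df_year, deepest_km, largest_mag, bins=30):
    """Side-by-side histograms of depth and magnitude with dashed reference lines."""
    fig, (ax_depth, ax_mag) = plt.subplots(1, 2, figsize=(12, 4))

    ax_depth.hist(df_year["Depth"], bins=bins, color="gray")
    ax_depth.axvline(deepest_km, linestyle="--", color="blue",
                     label=f"Deepest quake in dataset ({deepest_km} km)")
    ax_depth.set_xlabel("Depth (km)")
    ax_depth.set_ylabel("Number of earthquakes")
    ax_depth.legend()

    ax_mag.hist(df_year["Magnitude"], bins=bins, color="gray")
    ax_mag.axvline(largest_mag, linestyle="--", color="red",
                   label=f"Largest quake in dataset (M{largest_mag})")
    ax_mag.set_xlabel("Magnitude")
    ax_mag.set_ylabel("Number of earthquakes")
    ax_mag.legend()

    return fig, (ax_depth, ax_mag)
```

In the notebook this might be used as `df_2022 = select_year(earthquakes, 2022)` followed by `plot_depth_magnitude_hists(df_2022, deepest['depth'], largest_magnitude['magnitude'])`, where `deepest` and `largest_magnitude` are whatever you stored in C1.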

<div class="alert alert-block alert-success">

# Submission Instructions

###  You should submit: 
- Your python file `earthquake_fns.py`, which is the module described in Part 1
    
- A single Jupyter notebook with ONLY 2 cells with contents as follows
> - Cell 1: A markdown cell with your names and student numbers
> - Cell 2: A code cell with your import statements, function calls, and code to produce the requested figures / analyses for **Section C**. 

- A `.pdf` file of your Jupyter Notebook which includes any output requested (figures, print statements, etc.)

</div>