# Project Lab 2:  Background and Useful Info
___ 

## 1. Use of `pandas`: 

In the project lab you will load a data file that contains recorded earthquake data, and extract from it columns with information on earthquake location, depth, magnitude and time of occurrence.  

The `pandas` library allows you to do many of the things we have learned how to do with `numpy` and `matplotlib`, using different syntax. `pandas` excels with tabular data and the manipulation of large datasets. It comes with many helpful file IO tools, and the data are stored in a data object called a `DataFrame`. Each column in the `DataFrame` is called a `Series`. 

To avoid confusion with too much new syntax that is *only* applicable to `pandas`, we will use this library mainly to read in and extract tabular data, and to write your own `DataFrame` with modified contents back out to a `.csv` file.
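As a minimal sketch of that read, extract, write workflow: the column names and the small in-memory table below are made up for illustration and are not the lab's data, so check the real column names in your own file.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a file on disk; for a real file,
# pass its path to pd.read_csv instead of a StringIO object.
csv_text = """Latitude,Longitude,Depth,Magnitude
49.3,-123.1,10.0,4.2
35.7,139.7,35.5,5.1
-33.4,-70.7,80.0,6.3
"""

# Read the table into a DataFrame
df = pd.read_csv(io.StringIO(csv_text))

# Extract one column (a Series) as a NumPy array
mags = df["Magnitude"].to_numpy()

# Keep only the larger events using a NumPy boolean mask
big = df[mags >= 5.0]

# Write the modified DataFrame back out as CSV text (no index column);
# passing a filename instead, e.g. big.to_csv("out.csv", index=False),
# would write a file.
out_text = big.to_csv(index=False)
print(out_text)
```

Once a column has been pulled out with `.to_numpy()`, everything after that point is ordinary `numpy`.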

<div class="alert alert-block alert-success">

> Make sure to read the notebook `pl2_pandas_get_familiar.ipynb` before attempting PL2.

> We will go over pandas in class Tues/Thurs of week 13.  File i/o for .csv files is also described in the week 12 notebook on file i/o.
    
</div>

___ 

## 2. Project Lab 2 Data:

#### Overview of data files: 
There are three data files for this lab. Earthquake data for this lab were retrieved from **IRIS**: [&nearr; Incorporated Research Institutions for Seismology](http://ds.iris.edu/ieb/index.html?format=text&nodata=404&starttime=1970-01-01&endtime=2025-01-01&minmag=0&maxmag=10&mindepth=0&maxdepth=900&orderby=time-desc&src=&limit=1000&maxlat=80.56&minlat=-80.56&maxlon=180.00&minlon=-180.00&zm=2&mt=ter)  


**1. Earthquake Data**

The file `'IRIS_eq_010100_112422_mag4.csv'` contains earthquake data for earthquakes with magnitudes 4 or above ($M \geq 4$) that have occurred globally in the 21st century, i.e. between `2000-01-01` and `2022-11-24`.
> **NOTE:  this is NOT the same file as in the `pl2_pandas_getfamiliar.ipynb`.**  It contains a lot more data. In the file, latitudes and longitudes are in degrees, depths are in kilometers, magnitudes are unitless, and times are strings written in what is known as ISO format. The time strings contain the date as `year-month-day`, then the letter `T`, then the time as `hr:min:sec`, followed by an offset from UTC, e.g. "2000-10-31T01:30:00.000-05:00".
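To see how the pieces of such a time string map onto a `datetime` object, here is a small sketch using the standard library's `datetime.fromisoformat` on the example string above. This is only one possible way to parse it; the lab instructions may ask you to use a different approach.

```python
from datetime import datetime, timedelta

# The example ISO-format time string from the note above
time_str = "2000-10-31T01:30:00.000-05:00"

# Parse the string into a (timezone-aware) datetime object
dtime = datetime.fromisoformat(time_str)

print(dtime.year, dtime.month, dtime.day)      # date part, before the "T"
print(dtime.hour, dtime.minute, dtime.second)  # time part, after the "T"
print(dtime.utcoffset())                       # offset from UTC (-5 hours here)
```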


**2. Coastline Data**

The coastlines file `m_coasts.csv` contains two columns: 
> - Column 1: longitudes of the world's coastlines.  
> - Column 2: latitudes of the world's coastlines.  
        
**Note**: This file contains pairs of `NaN` values to break up different coastlines, so world coastlines can be plotted in one line with: `plt.plot(lon_coast, lat_coast)`
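The effect of the `NaN` pairs can be checked with a toy example. The coordinates below are invented (two short "coastlines" separated by one `NaN` pair), not values from `m_coasts.csv`:

```python
import numpy as np
import matplotlib.pyplot as plt

# Two made-up "coastlines" (a triangle and a short segment),
# separated by a single NaN pair
lon_coast = np.array([0.0, 1.0, 0.5, 0.0, np.nan, 3.0, 4.0])
lat_coast = np.array([0.0, 0.0, 1.0, 0.0, np.nan, 2.0, 2.5])

fig, ax = plt.subplots()

# One plot call: matplotlib leaves a gap at the NaN pair, so the
# triangle and the segment are not joined by a stray line
(line,) = ax.plot(lon_coast, lat_coast, color="k")
ax.set_xlabel("Longitude (degrees)")
ax.set_ylabel("Latitude (degrees)")
```

Without the `NaN` pair, the last point of the triangle would be connected to the first point of the segment.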


**3. Tectonic Plates Data**

The plate boundaries file `all_boundaries.csv` contains information on the positions of the boundaries of the major tectonic plates on the Earth.  There are 3 comma-separated columns:
> - Column 1: plate boundary name (abbreviations).
> - Column 2: latitude in degrees.
> - Column 3: longitude in degrees.


#### Instructions for download and organization: 
Download the data files above from the `wk13_project_lab` folder on Canvas and put them in a local or hub folder called `Data` in whatever folder you are using to work on project lab 2. As an example, your file structure should look something like this:  
```
project_lab2/
    |
    |__ Data/
    |   |__ m_coasts.csv
    |   |__ all_boundaries.csv
    |   |__ IRIS_eq_010100_112422_mag4.csv
    |
    |__ Info/
    |   |__ project_lab2_overview.pdf
    |   |__ project_lab2_background.ipynb
    |   |__ project_lab2_part1_instructions.ipynb
    |   |__ project_lab2_part2_instructions.ipynb
    |
    |__ project_lab2_part1.ipynb
    |__ project_lab2_part2.ipynb
    |__ earthquake_fns.py
```

From the folder you are working in (called `project_lab2/` in the example above), you will access your files using the relative path to each file, e.g. `"./Data/IRIS_eq_010100_112422_mag4.csv"`.  In the example above, I saved all of the lab instructions in a folder named `Info/`, and the main level folder only contains the Jupyter Notebooks for Parts 1 and 2, as well as the `earthquake_fns` module.

You must supply your file IO functions with a *path* to the file that you are attempting to load. The file path includes the name of the file (the filename) and the path to where the file is stored (the folder, or directory). 
> `./Data/` is the folder path which the python interpreter will use to search for the file named `IRIS_eq_010100_112422_mag4.csv`.  
> 
> In more detail: `./Data/IRIS_eq_010100_112422_mag4.csv` means *in the current folder/directory (`./`) look for a folder/directory called `Data` and inside that directory (`/`) find the file `IRIS_eq_010100_112422_mag4.csv`*.
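If a file fails to load, it can help to take the path apart and check each piece. The sketch below uses the standard library's `pathlib` for this; using `pathlib` is a choice made for this illustration, and a plain string path works just as well with file IO functions.

```python
from pathlib import Path

# Relative path: start in the current working directory, then go into Data/
data_dir = Path("./Data")
eq_path = data_dir / "IRIS_eq_010100_112422_mag4.csv"

print(eq_path.name)      # the filename part
print(eq_path.parent)    # the folder part
print(Path.cwd())        # the folder that "./" refers to right now
print(eq_path.exists())  # True only if the file really is at that location
```

If `eq_path.exists()` prints `False`, compare the output of `Path.cwd()` with the folder that actually holds your `Data` folder.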



___ 

## 3. For Part 2:  Geometry of subducting slabs

The seismicity in the region of interest for the Project Lab occurs in a *subduction zone*, where one tectonic plate (the strong outer part of the rocky planet comprising the crust and upper part of the mantle) subducts beneath another plate. 

**Figure 1** below (image from the USGS) shows the subduction zone that we live on: the Cascadia subduction zone, which stretches from northern Vancouver Island to California.  As the two tectonic plates slide past each other, earthquakes can occur in the *subducting plate* (pink dots, Fig. 1), the *overriding plate* (yellow dots, Fig. 1), and especially at the interface between them (red dots, Fig. 1).



<center><img src="./Cascadia_earthquake_sources.png"/></center>

**Figure 1:** Schematic showing the geometry of the Cascadia subduction zone, and the sources of different earthquakes generated by the motion of the subducting plate. 





___ 
## 4.  For Part 2:  Concepts from Earlier Labs

In Part 2 we will use two concepts from earlier labs:

### Distance along a line of constant latitude on a sphere:     
1. Recall from the Week 4 lab that the *length* of one degree of longitude, $\phi$, at some latitude, $\lambda$, is:

$$ L_{\phi} = \frac{2 \pi R \cos{\lambda}}{360} $$

where $R$ is the radius of the Earth (6371 km). 

To calculate the distance, $d$, along a line of constant latitude relative to a reference longitude, $\phi_1$ (in the Week 4 lab this was the longitude of Vancouver), we need the difference between a second longitude, $\phi_2$, and the reference longitude: $\Delta \phi =  (\phi_2 - \phi_1)$, in degrees. The distance along the surface between $\phi_1$ and $\phi_2$ is then  $d = L_{\phi} \cdot  \Delta\phi$, in the same length units as $R$.
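The two formulas above can be sketched as a short function. The function and variable names here are hypothetical, chosen for this illustration rather than taken from the Week 4 lab:

```python
import numpy as np

R_EARTH_KM = 6371.0  # radius of the Earth in km, as given above


def distance_along_latitude(lon2_deg, lon1_deg, lat_deg):
    """Distance in km from reference longitude lon1_deg to lon2_deg,
    measured along the line of constant latitude lat_deg (all in degrees)."""
    # Length of one degree of longitude at this latitude
    # (np.cos expects radians, so convert the latitude first)
    len_per_deg = 2 * np.pi * R_EARTH_KM * np.cos(np.deg2rad(lat_deg)) / 360
    # Multiply by the longitude difference in degrees
    return len_per_deg * (lon2_deg - lon1_deg)


# One degree of longitude at the equator
d_equator = distance_along_latitude(1.0, 0.0, 0.0)

# Works on arrays too: one and two degrees of longitude at 60 degrees north
d_60 = distance_along_latitude(np.array([1.0, 2.0]), 0.0, 60.0)

print(d_equator, d_60)
```

Note that $d$ is negative for longitudes west of the reference longitude, because $\Delta\phi$ is negative there.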

### Slope of a line segment:     
2. Recall from the week 6 lab that the slope of a line segment is given by:

$$ m = \frac{rise}{run} \equiv \frac{\Delta y}{\Delta x}$$
    
We can convert this value of slope to an angle using `np.arctan(m)`.  Remember that the result is in radians, so we can get the angle in degrees using `np.rad2deg()`. 
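As a quick check of the slope-to-angle conversion, here is a sketch with invented end points. Rise and run must be in the same units for the angle to be meaningful:

```python
import numpy as np

# Hypothetical end points of a line segment (same units for x and y)
x1, y1 = 0.0, 0.0
x2, y2 = 100.0, 100.0

# Slope = rise / run
m = (y2 - y1) / (x2 - x1)

# Convert the slope to an angle: radians first, then degrees
angle_rad = np.arctan(m)
angle_deg = np.rad2deg(angle_rad)

print(m, angle_rad, angle_deg)
```

A slope of 1 corresponds to an angle of 45 degrees, which makes this a convenient sanity check.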


___

## 5. Coding Notes

### Extracting strings from datetime objects

#### Using `datetime.strftime` to extract strings from `datetime` objects:  
This can be very useful for labeling plots. 

e.g., this code writes a specific date to the string variable `my_date`: `my_date = dtimes[ii].strftime("%m%d%Y")`, where `dtimes` is a sequence of datetime objects and `ii` is the index of the one you want to extract.  `my_date` can then be used in plot titles, annotations, etc.

For example: 

```python
from datetime import datetime

# get a datetime object that represents the date and time right *now*
now = datetime.now()

# print the type of now and the value
print(type(now), now)

# convert the datetime object to a formatted string: 
now_str = now.strftime('%m-%d-%Y')
print(type(now_str), now_str)
```

Output:

```
<class 'datetime.datetime'> 2023-11-25 23:09:52.855575
<class 'str'> 11-25-2023
```
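The same idea applied to one element of a list of datetime objects can be sketched as follows. The dates in `dtimes` and the title text are made up for illustration:

```python
from datetime import datetime

# Hypothetical list of datetime objects (not values from the lab data)
dtimes = [
    datetime(2004, 12, 26, 0, 58),
    datetime(2011, 3, 11, 5, 46),
]

ii = 1  # index of the datetime object to extract

# Compact date string: month, day, 4-digit year
my_date = dtimes[ii].strftime("%m%d%Y")

# A more readable format for a plot title
title_str = "Earthquakes on " + dtimes[ii].strftime("%Y-%m-%d")

print(my_date)
print(title_str)
```

`title_str` could then be passed to `plt.title()`, and `my_date` is handy for building output filenames.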
