# Class 7: Data Applications (Continued)

Here, we take a different source of data (same satellite) and look at a different field of study or application.

### 7.1 Semester-Long Project

Remember that we have used DSCOVR's solar wind products to develop a "monitor" plot of the data that lets us see when certain events occur. Now, we will use a different DSCOVR instrument to look at an Earth science application.

### 7.1.1 Goal: Obtain Data

---

1. Obtain cloud data for the period of the 2017 eclipse, when DSCOVR imagery showed the shadow of the Moon on the face of the Earth.

### 7.1.1 Aside: Installing Packages

__Note:__ We will be using three new packages, `xarray`, `netCDF4`, and `cartopy`, that do not come bundled in Anaconda.

In [None]:
# Install conda packages into the environment of the current Jupyter kernel
import sys
!conda install --yes --prefix {sys.prefix} xarray netCDF4 cartopy

If this doesn't work, I recommend quitting the current Jupyter session and running the following from the command line (within the Anaconda environment):

``conda install xarray netCDF4 cartopy``

and follow the prompts to install them.

Reference: [Installing Python Packages from a Jupyter Notebook](http://jakevdp.github.io/blog/2017/12/05/installing-python-packages-from-jupyter/)
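Before installing anything, it can help to check which of the three packages are actually missing from the current environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for this example, and it only checks that a package can be found, not that it imports without errors.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names of the three packages used in this class.
required = ["xarray", "netCDF4", "cartopy"]
missing = missing_packages(required)
print("Missing packages:", missing or "none")
```

If the printed list is empty, the install step above can be skipped.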

Now, let's continue on to accessing the data.

In [None]:
import xarray as xr
%matplotlib inline

data = xr.open_dataset(
    'https://opendap.larc.nasa.gov:443/opendap/DSCOVR/EPIC/AER/2017/08/DSCOVR_EPIC_L2_AER_01_20170821174450_02.he5'
)

What this has done is go to an OPeNDAP web server and reference the data file found there. Most of the time, Earth scientists work with common data formats: ASCII, binary, or hierarchical formats such as HDF or netCDF. This one is an HDF-EOS file, which is a normal HDF file that also follows EOS (Earth Observing System) conventions. Let's see the structure of this data.

In [None]:
data

We notice that the dataset contains a number of variables, and that xarray automatically detects certain coordinates such as latitude, longitude, and time. So far we have only referenced the file; to physically grab the values, we use the `load` method that xarray provides (xarray builds on pandas, a great package that is itself built on top of NumPy). Here, let's download the Cloud Optical Depth data.
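As an aside on the file itself: the file name in the URL above appears to encode the observation time as a 14-digit field (`20170821174450`). The sketch below pulls that field out with the standard library. Treating the second-to-last underscore-separated field as a `YYYYMMDDhhmmss` timestamp in UTC is an assumption made for this example, not something stated in the notebook, and `granule_time` is a made-up helper name.

```python
from datetime import datetime, timezone
from urllib.parse import urlparse

URL = ("https://opendap.larc.nasa.gov:443/opendap/DSCOVR/EPIC/AER/2017/08/"
       "DSCOVR_EPIC_L2_AER_01_20170821174450_02.he5")


def granule_time(url):
    """Parse the timestamp field from a file name like the one in URL.

    Assumption: the second-to-last underscore-separated field is
    YYYYMMDDhhmmss and the time is in UTC.
    """
    filename = urlparse(url).path.rsplit("/", 1)[-1]
    stem = filename.rsplit(".", 1)[0]          # drop the ".he5" extension
    stamp = stem.split("_")[-2]                # e.g. "20170821174450"
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


print(granule_time(URL).isoformat())
```

Under that assumption, this file corresponds to 21 August 2017 at 17:44:50, which falls on the day of the eclipse.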

### 7.1.2 Obtain Specific Data

In [None]:
cloud_optical_depth = data['CloudOpticalDepth']
cloud_optical_depth = cloud_optical_depth.load()
cloud_optical_depth.shape

Notice how much longer the kernel took to execute that cell? That's because we are connecting to the server, selecting just that variable, and downloading its data. Let's take a quick look at this data through xarray.

### 7.1.3 Visualize the Data

In [None]:
cloud_optical_depth.plot()

Let's modify this plot so we can get a better view with a larger image.

In [None]:
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [20, 10]

In [None]:
cloud_optical_depth.plot()

Something doesn't look right here. Can you see what is wrong?

### 7.2.1 Manipulate Data

---

Here, we will work with the Cloud Optical Depth data for more insight.

In [None]:
cloud_optical_depth = cloud_optical_depth.transpose()

In [None]:
cloud_optical_depth.plot()
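Why does a transpose fix the picture? Image-style plots put the first array axis on the vertical axis and the second on the horizontal axis. The NumPy sketch below illustrates this with a tiny made-up grid; it assumes the variable is stored with `XDim` (longitude) first and `YDim` (latitude) second, which is what the plots above suggest. The grid size is hypothetical.

```python
import numpy as np

# Hypothetical tiny grid: 4 longitude columns (XDim) by 2 latitude rows (YDim).
n_x, n_y = 4, 2

# Assumed storage order: (XDim, YDim), so longitude runs down the rows.
as_stored = np.arange(n_x * n_y).reshape(n_x, n_y)

# After the transpose the order is (YDim, XDim): latitude down the rows,
# longitude across the columns, which is how a map is normally drawn.
as_plotted = as_stored.transpose()

print("stored shape: ", as_stored.shape)
print("plotted shape:", as_plotted.shape)
```

No values change in the transpose; the element at `[x, y]` in the stored array is simply found at `[y, x]` afterwards.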

Let's also fix those coordinates by assigning the proper values to this dataset:

In [None]:
cloud_optical_depth['XDim'] = -180 + cloud_optical_depth.XDim/len(cloud_optical_depth.XDim)*360.
cloud_optical_depth['YDim'] = -90 + cloud_optical_depth.YDim/len(cloud_optical_depth.YDim)*180.
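The coordinate formula in the cell above can be checked on a small example. This sketch assumes that `XDim` initially holds the integer indices 0 to n-1, and it uses a hypothetical 8-cell grid. With this formula, index 0 maps to -180 and the last index stops one cell short of +180, so the values mark the western edge of each cell. Whether the product intends cell edges or cell centres is not stated in the notebook; the centres are shown only for comparison.

```python
import numpy as np


def index_to_degrees(n_cells, start, span):
    """Map indices 0..n_cells-1 to start + index / n_cells * span."""
    index = np.arange(n_cells)
    return start + index / n_cells * span


n_lon = 8  # hypothetical grid size, chosen to keep the output short
lon_edges = index_to_degrees(n_lon, -180.0, 360.0)

# Shifting by half a cell width gives the cell centres instead.
lon_centers = lon_edges + 360.0 / n_lon / 2

print("edges:  ", lon_edges)
print("centers:", lon_centers)
```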

In [None]:
cloud_optical_depth.transpose().plot(robust=True)

In [None]:
cloud_optical_depth.max()
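The `robust=True` option used above and the maximum just computed are related: a few extreme values can stretch the colour scale so far that most of the image looks flat. With `robust=True`, xarray sets the colour limits from the 2nd and 98th percentiles rather than the minimum and maximum. The NumPy sketch below mimics that idea on synthetic data; the helper `robust_limits` and the sample values are made up for this example and are not the real cloud data.

```python
import numpy as np


def robust_limits(values, lower=2.0, upper=98.0):
    """Return (vmin, vmax) from percentiles of the finite values."""
    finite = values[np.isfinite(values)]
    vmin, vmax = np.percentile(finite, [lower, upper])
    return vmin, vmax


# Synthetic sample: mostly values between 0 and 10, plus outliers and gaps.
rng = np.random.default_rng(0)
sample = rng.uniform(0.0, 10.0, size=1000)
sample[:5] = 500.0        # a handful of extreme values
sample[5:10] = np.nan     # missing data

vmin, vmax = robust_limits(sample)
print("true maximum:     ", np.nanmax(sample))
print("robust colour max:", vmax)
```

The true maximum is dominated by the outliers, while the percentile-based limit stays within the range of the bulk of the data.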

In [None]:
cloud_optical_depth.isel(XDim=20)[500:600].to_dataframe()

After some experimentation, here is the code to "see" the eclipse:

In [None]:
# Timestamp field from each of the four file names, in time order
timestamps = ['20170821171450', '20170821184450', '20170821191450', '20170821194450']
base_url = 'https://opendap.larc.nasa.gov:443/opendap/DSCOVR/EPIC/AER/2017/08/'

eclipse = []
for timestamp in timestamps:
    url = f'{base_url}DSCOVR_EPIC_L2_AER_01_{timestamp}_02.he5'
    # Download only the CloudFraction variable, then transpose it for plotting
    eclipse.append(xr.open_dataset(url)['CloudFraction'].load().transpose())

In [None]:
fig, axes = plt.subplots(ncols=4)
# Plot the four files side by side with the same style
for ax, cloud_fraction in zip(axes, eclipse):
    cloud_fraction.plot(robust=True, cmap=plt.cm.Blues_r, yincrease=False, ax=ax)