# Day 5

## Part 1. Xarray and its application

Xarray extends the capabilities of NumPy by providing a data structure for labeled, multi-dimensional arrays. The two main data structures in Xarray are:


- DataArray

- Dataset

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

xr.set_options(keep_attrs=True, display_expand_data=False)
np.set_printoptions(threshold=10, edgeitems=2)

### Xarray Data Structures
Xarray provides two core data structures:

 - DataArray: A single multi-dimensional array with labeled dimensions, coordinates, and metadata.

 - Dataset: A collection of DataArray objects, each corresponding to a variable, sharing the same dimensions and coordinates.
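To make the two structures concrete, here is a minimal sketch that builds them by hand from a NumPy array. The grid values, coordinate labels, and the `air_celsius` variable are made up for illustration and do not come from the tutorial dataset.

```python
import numpy as np
import xarray as xr

# Hypothetical 2 x 3 temperature grid; values and labels are invented.
temps = np.array([[280.0, 281.5, 283.0],
                  [275.0, 276.5, 278.0]])

# A DataArray: one array plus dimension names, coordinates, and metadata.
da = xr.DataArray(
    temps,
    dims=("lat", "lon"),
    coords={"lat": [50.0, 60.0], "lon": [200.0, 210.0, 220.0]},
    name="air",
    attrs={"units": "K"},
)

# A Dataset: several named variables sharing the same dimensions and coordinates.
ds = xr.Dataset({"air": da, "air_celsius": da - 273.15})

print(da.dims)
print(list(ds.data_vars))

# Label-based lookup: select by coordinate value rather than by integer position.
value = ds["air"].sel(lat=60.0, lon=210.0).item()
print(value)
```

Because the dimensions are labeled, the lookup at the end reads as "latitude 60, longitude 210" instead of relying on remembering which axis is which.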

### Loading a Dataset
Learning objectives for this part:
- Understand the basic data structures in Xarray.
- Inspect `DataArray` and `Dataset` objects.
- Read and write netCDF files using Xarray.
- Recognize that many packages build on top of Xarray.
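The netCDF objective can be illustrated with a small round trip. This is a sketch under assumptions: the dataset, the variable name `air`, and the file name `example.nc` are invented, and writing requires a netCDF backend (for example `netCDF4` or `scipy`) to be installed.

```python
import os
import tempfile

import numpy as np
import xarray as xr

# Small made-up dataset with float coordinates.
ds_small = xr.Dataset(
    {"air": (("lat", "lon"), np.arange(6.0).reshape(2, 3))},
    coords={"lat": [40.0, 50.0], "lon": [200.0, 210.0, 220.0]},
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.nc")  # hypothetical file name
    ds_small.to_netcdf(path)                   # write to disk
    # load_dataset reads the file fully into memory and then closes it
    ds_back = xr.load_dataset(path)

print(ds_back.equals(ds_small))
```

`xr.open_dataset` is the lazy counterpart of `xr.load_dataset`; it keeps the file open and reads values on demand, which is usually preferable for large files.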

In [None]:
ds = xr.tutorial.load_dataset("air_temperature")
ds

In [None]:
# pull out "air" dataarray with dictionary syntax
ds["air"]

In [None]:
# pull out dataarray using dot notation
ds.air

In [None]:
da = ds.air

da.name

In [None]:
da.dims

In [None]:
# extracting coordinate variables
da.lon

In [None]:
# extracting coordinate variables from .coords
da.coords["lon"]

In [None]:
# extracting the underlying array of values
da.data

In [None]:
# what is the type of the underlying data
type(da.data)

## Part 2. Projections


A map projection (more commonly referred to as just a “projection”) is:

a systematic transformation of the latitudes and longitudes of locations from the surface of a sphere or an ellipsoid into locations on a plane.

In Cartopy, each projection is a class. Most classes of projection can be configured in projection-specific ways, although Cartopy takes an opinionated stance on sensible defaults.

We need cartopy’s crs module. This is typically imported as ccrs (Cartopy Coordinate Reference Systems).



In [None]:
import cartopy.crs as ccrs
import cartopy

In [None]:
ccrs.PlateCarree()

### Drawing a map
Cartopy optionally depends upon matplotlib, and each projection knows how to create a matplotlib Axes (or AxesSubplot) that can represent itself.

In [None]:

import matplotlib.pyplot as plt

plt.axes(projection=ccrs.PlateCarree())

In [None]:
plt.figure()
ax = plt.axes(projection=ccrs.PlateCarree())
ax.coastlines()

In [None]:
# That was a little underwhelming, but we can see that the Axes created is indeed one of those GeoAxes[Subplot] instances.

# One of the most useful methods that this class adds on top of the standard matplotlib Axes class is the coastlines method. With no arguments, it will add the Natural Earth 1:110,000,000 scale coastline data to the map.

In [None]:
fig, ax = plt.subplots(subplot_kw={'projection': ccrs.PlateCarree()})
ax.coastlines()

In [None]:
ccrs.PlateCarree?

In [None]:
ax = plt.axes(projection=ccrs.PlateCarree(central_longitude=180))
ax.coastlines()

### Examples of Different Projections

In [None]:
projections = [ccrs.PlateCarree(),
               ccrs.Robinson(),
               ccrs.Mercator(),
               ccrs.Orthographic(),
               ccrs.InterruptedGoodeHomolosine()
              ]


for proj in projections:
    plt.figure()
    ax = plt.axes(projection=proj)
    ax.stock_img()
    ax.coastlines()
    ax.set_title(type(proj).__name__)  # short class name, e.g. "Robinson"

## Part 3. GeoPandas

This quick tutorial introduces the key concepts and basic features of GeoPandas to help you get started with your projects.


In [None]:
import geopandas
from geodatasets import get_path

path_to_data = get_path("nybb")
gdf = geopandas.read_file(path_to_data)

gdf


In [None]:
# Writing the GeoDataFrame to a GeoJSON file
gdf.to_file("my_file.geojson", driver="GeoJSON")

## Simple accessors and methods
Now we have our GeoDataFrame and can start working with its geometry.
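Before applying these accessors to the borough polygons, it can help to try them on shapes whose answers are easy to verify by hand. The sketch below uses two made-up squares in an arbitrary planar coordinate system (no CRS is set); the names `a` and `b` are hypothetical.

```python
import geopandas
from shapely.geometry import Polygon

# Two invented squares: a 2 x 2 square and a 1 x 1 square.
squares = geopandas.GeoDataFrame(
    {"name": ["a", "b"]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),
        Polygon([(4, 0), (5, 0), (5, 1), (4, 1)]),
    ],
).set_index("name")

areas = squares.area          # one value per polygon, in squared coordinate units
centroids = squares.centroid  # a GeoSeries of points
# Distance from every centroid to the centroid of the first square.
dist = centroids.distance(centroids.iloc[0])

print(areas)
print(centroids)
print(dist)
```

The results come back in the units of the coordinate system, so for real data the numbers are only meaningful in a projected CRS with known units.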

In [None]:
# Measuring area
# Index by borough name first so that the per-polygon results are labeled.
gdf = gdf.set_index("BoroName")

In [None]:
gdf["area"] = gdf.area
gdf["area"]

In [None]:
# Getting the polygon boundary
gdf["boundary"] = gdf.boundary
gdf["boundary"]

In [None]:
# We can also create new geometries, for example a buffered version of the original (i.e., GeoDataFrame.buffer(10)) or its centroid:
gdf["centroid"] = gdf.centroid
gdf["centroid"]
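The buffer operation mentioned in the comment above can be sketched on a single point. The point location and the radius of 10 are arbitrary; the buffer is a polygon approximating a circle, so its area is slightly below πr².

```python
import math

import geopandas
from shapely.geometry import Point

pts = geopandas.GeoSeries([Point(0, 0)])
buffered = pts.buffer(10)  # distance is in the units of the coordinate system

geom_type = buffered.geom_type.iloc[0]
buffer_area = buffered.area.iloc[0]
circle_area = math.pi * 10 ** 2

print(geom_type)
print(buffer_area, circle_area)
```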

In [None]:
# Measuring distance
first_point = gdf["centroid"].iloc[0]
gdf["distance"] = gdf["centroid"].distance(first_point)
gdf["distance"]

In [None]:
# For example, to calculate the average of the distances measured above, access the ‘distance’ column and call the mean() method on it:

gdf["distance"].mean()

In [None]:
# Making maps

gdf.plot("area", legend=True)

In [None]:
# Switch the active geometry to the centroid column so plot() draws points
gdf = gdf.set_geometry("centroid")
gdf.plot("area", legend=True)
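A GeoDataFrame can hold several geometry columns, but only one is active at a time, and that is the one used by `plot()`, `area`, and similar methods. The sketch below shows the switch on a made-up one-row frame; `gdf_demo` is a hypothetical name.

```python
import geopandas
from shapely.geometry import Polygon

gdf_demo = geopandas.GeoDataFrame(
    geometry=[Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])]
)
gdf_demo["centroid"] = gdf_demo.centroid  # a second geometry column

before = gdf_demo.geometry.name           # name of the active geometry column
switched = gdf_demo.set_geometry("centroid")
after = switched.geometry.name

print(before, after)
print(switched.geom_type.iloc[0])
```

`set_geometry` returns a new GeoDataFrame, so the original keeps its polygon geometry unless the result is assigned back to the same name.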

## Part 5. Air quality Analysis
