# Scatter Plot

A scatter plot typically displays two variables for a set of data. The resulting pattern can visualize statistical correlations in the data.

In geoscience, scatter plots are commonly used for weather station data or data from any other geo-located collection system, such as tide gauges. Stations report values at discrete locations rather than on a continuous grid, which makes them a good candidate for this kind of plot.
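As a quick illustration of how a scatter pattern can reveal a correlation, here is a minimal sketch. The data are synthetic (generated with NumPy, not taken from any station dataset), and the variable names `x`, `y`, and `r` are chosen only for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data (not from the station dataset): y depends linearly on x plus noise
rng = np.random.default_rng(seed=0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(scale=0.5, size=200)

# Pearson correlation coefficient between the two variables
r = np.corrcoef(x, y)[0, 1]

# A tight, upward-sloping cloud of points corresponds to a strong positive correlation
fig, ax = plt.subplots(figsize=(6, 4))
ax.scatter(x, y, s=10)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title(f"Synthetic data, r = {r:.2f}");
```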

In [None]:
import xarray as xr
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt

import geocat.datafiles as gdf

## Get Station Dataset

In [None]:
# Use Xarray and GeoCAT datafiles to pull up a sample station dataset
ds = xr.open_dataset(gdf.get("netcdf_files/95031800_sao.cdf"),
                     decode_times=False)

# Get station lat and lon data. Note that `.isel()` with no indexers
# selects every station unchanged; it does not filter out NaNs.
lat = ds.lat.isel()
lon = ds.lon.isel()
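Calling `.isel()` without indexers does not remove missing values. If you do want to discard stations with missing coordinates before plotting, one option is a finite-value mask. The sketch below assumes the coordinates are available as NumPy arrays (for example via `.values`); the arrays `lat_raw` and `lon_raw` are hypothetical stand-ins, not values from the sample file.

```python
import numpy as np

# Hypothetical coordinates standing in for ds.lat.values and ds.lon.values
lat_raw = np.array([40.0, np.nan, 35.5, 61.2])
lon_raw = np.array([-105.3, -98.0, np.nan, -149.9])

# Keep only stations where both coordinates are finite
valid = np.isfinite(lat_raw) & np.isfinite(lon_raw)
lat_clean = lat_raw[valid]
lon_clean = lon_raw[valid]

print(f"{valid.sum()} of {valid.size} stations have usable coordinates")
```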

## Simple Scatter Plot

The simplest way to create a scatter plot is to call `ax.scatter(x, y)`.

### Base Case

In [None]:
# Generate figure (set its size (width, height) in inches)
fig, ax = plt.subplots(figsize=(8, 6))

# Scatter-plot the location data
ax.scatter(lon, lat);

### Simple Customizations

In this example we demonstrate:
- setting the marker size with `s=10`,
- the marker color with `c='blue'`,
- the marker style with `marker='+'`,
- and the marker linewidth with `linewidth=0.5`.

Check out [Matplotlib's `scatter` documentation](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.scatter.html) to see all the keyword argument customization options available to you.

In [None]:
# Generate figure (set its size (width, height) in inches)
fig, ax = plt.subplots(figsize=(8, 6))

# Scatter-plot the location data
ax.scatter(lon, lat, s=10, c='blue', marker='+', linewidth=0.5);
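To get a feel for how the `marker` kwarg changes the result, the following sketch draws the same points with three marker styles side by side. The points are synthetic stand-ins for station coordinates, and the choice of `'+'`, `'x'`, and `'o'` is only illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic points standing in for station longitudes and latitudes
rng = np.random.default_rng(seed=1)
x = rng.uniform(-120, -70, size=50)
y = rng.uniform(25, 50, size=50)

markers = ['+', 'x', 'o']
fig, axes = plt.subplots(1, 3, figsize=(12, 4), sharex=True, sharey=True)

# One panel per marker style, with all other kwargs held fixed
for ax, marker in zip(axes, markers):
    ax.scatter(x, y, s=10, c='blue', marker=marker, linewidth=0.5)
    ax.set_title(f"marker='{marker}'")
```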

## Scatter Plot with Cartopy

The above plot accurately displays the data, but it doesn't look the way we'd expect for latitude/longitude data. To draw the scatter plot on a map, we use Cartopy and select the desired projection.

In the example below, we demonstrate adding Cartopy Plate Carree axes and land features.

One thing to note here is the `zorder` kwarg in both the land feature and the scatter plot. If we're looking down at an x-y plot, the z axis points out of the screen, and `zorder` controls how these layers are stacked: the lower the `zorder`, the closer to the back; the higher the `zorder`, the closer to the front. This ensures that our scatter points are not hidden underneath the land feature.

In [None]:
# Generate figure (set its size (width, height) in inches) and axes using Cartopy projection
fig = plt.figure(figsize=(8, 6))

# Generate axes using Cartopy
ax = plt.axes(projection=ccrs.PlateCarree())

# Turn on continent shading
ax.add_feature(cfeature.LAND,
                zorder=0)

# Scatter-plot the location data on the map
plt.scatter(lon, lat, s=10, c='blue', marker='+', linewidth=0.5, zorder=1)

plt.title("Station Locations");
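The stacking behavior of `zorder` can be seen without a map at all. In this minimal sketch, a gray rectangle is a hypothetical stand-in for the land feature, and plain Matplotlib axes replace the Cartopy axes; only the layering is being illustrated.

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

fig, ax = plt.subplots(figsize=(4, 4))

# A gray rectangle stands in for the land feature (lowest layer)
land = Rectangle((0.2, 0.2), 0.6, 0.6, facecolor='lightgray', zorder=0)
ax.add_patch(land)

# Points drawn with a higher zorder sit on top of the rectangle
points = ax.scatter([0.3, 0.5, 0.7], [0.3, 0.5, 0.7], c='blue', zorder=1)

ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
```

Swapping the two `zorder` values would draw the rectangle over the points instead.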

### Cartopy Customization

You can change the look of the land feature by specifying `edgecolor` and `facecolor`. Light gray tends to look nice.

In [None]:
# Generate figure (set its size (width, height) in inches) and axes using Cartopy projection
fig = plt.figure(figsize=(8, 6))

# Generate axes using Cartopy
ax = plt.axes(projection=ccrs.PlateCarree())

# Turn on continent shading
ax.add_feature(cfeature.LAND,
                edgecolor='lightgray',
                facecolor='lightgray',
                zorder=0)

# Scatter-plot the location data on the map
plt.scatter(lon, lat, s=10, c='blue', marker='+', linewidth=0.5, zorder=1)

plt.title("Station Locations");

## Coded Scatter Plot

The above figure is a good representation of where our weather stations are. By coding the data points, we can display one more variable. Scatter plots are typically coded by varying either the size or the color of the points.

Let's grab the stations' maximum temperature data and add it to our plot.

In [None]:
tmax = ds.Tmax.isel()
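The rest of this notebook codes the points by color. For completeness, here is a minimal sketch of the other option, coding by size. It uses synthetic values (`temp` is a made-up temperature-like variable), and the rescaling to marker areas between 10 and 110 is an arbitrary choice for readability.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic values standing in for station coordinates and temperatures
rng = np.random.default_rng(seed=2)
x = rng.uniform(-120, -70, size=40)
y = rng.uniform(25, 50, size=40)
temp = rng.uniform(-10, 30, size=40)

# Rescale the variable to marker areas between 10 and 110 (points squared)
sizes = 10 + 100 * (temp - temp.min()) / (temp.max() - temp.min())

# Larger markers correspond to larger values of temp
fig, ax = plt.subplots(figsize=(6, 4))
points = ax.scatter(x, y, s=sizes, c='blue', alpha=0.5)
```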

### Default Colorbar

In the following plot, the scatter plot's color kwarg is no longer set to blue but to a variable, `c=tmax`, which is drawn with the default colormap, "viridis".

We also add a default colorbar.

In [None]:
# Generate figure (set its size (width, height) in inches) and axes using Cartopy projection
fig = plt.figure(figsize=(8, 6))

# Generate axes using Cartopy
ax = plt.axes(projection=ccrs.PlateCarree())

# Turn on continent shading
ax.add_feature(cfeature.LAND,
                edgecolor='lightgray',
                facecolor='lightgray',
                zorder=0)

# Scatter-plot the location data on the map
plt.scatter(lon, lat, s=10, c=tmax, marker='+', linewidth=1, zorder=1)

plt.xlim(-170,-50)
plt.ylim(0)

c = plt.colorbar()

### Colorbar Customization

Now we've added a `cmap` kwarg to specify the colormap used for the temperature data (Matplotlib's built-in colormaps can be reversed by adding `_r` to their names).

We shrank the colorbar so that it wouldn't be taller than the plot, and added a label. `$^\circ$` uses Matplotlib's mathtext rendering to add a degree symbol to the label.

In [None]:
# Generate figure (set its size (width, height) in inches) and axes using Cartopy projection
fig = plt.figure(figsize=(8, 6))

# Generate axes using Cartopy
ax = plt.axes(projection=ccrs.PlateCarree())

# Turn on continent shading
ax.add_feature(cfeature.LAND,
                edgecolor='lightgray',
                facecolor='lightgray',
                zorder=0)

# Scatter-plot the location data on the map
plt.scatter(lon, lat, s=10, c=tmax, cmap='hot_r', marker='+', linewidth=1, zorder=1)

plt.xlim(-170,-50)
plt.ylim(0)

# Raw string so that the backslash in \circ is not treated as a Python escape
c = plt.colorbar(shrink=0.75, label=r'Maximum Temperature ($^\circ$C)')
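The `_r` naming convention and the degree-symbol label can be checked in isolation. This sketch compares the end points of `'hot'` and `'hot_r'`, and builds the label as a raw string so the backslash reaches Matplotlib's mathtext untouched; the variable names are chosen only for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

hot = plt.get_cmap('hot')
hot_r = plt.get_cmap('hot_r')

# The low end of the reversed map matches the high end of the original
print(np.allclose(hot_r(0.0), hot(1.0)))

# A raw string keeps the backslash in \circ for Matplotlib's mathtext
label = r'Maximum Temperature ($^\circ$C)'
```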

## Finishing Touches

### Grid Lines

When adding gridlines, we again use the `zorder` keyword argument. Gridlines tend to look best above the land features but under the plotted data, so we bumped the scatter plot's `zorder` up to 2, left the land features at 0, and added gridlines at a `zorder` of 1.

In [None]:
# Generate figure (set its size (width, height) in inches) and axes using Cartopy projection
fig = plt.figure(figsize=(8, 6))

# Generate axes using Cartopy
ax = plt.axes(projection=ccrs.PlateCarree())

# Turn on continent shading
ax.add_feature(cfeature.LAND,
                edgecolor='lightgray',
                facecolor='lightgray',
                zorder=0)

# Scatter-plot the location data on the map
plt.scatter(lon, lat, s=10, c=tmax, cmap='hot_r', marker='+', linewidth=1, zorder=2)

plt.xlim(-170,-50)
plt.ylim(0)

# Raw string so that the backslash in \circ is not treated as a Python escape
c = plt.colorbar(shrink=0.75, label=r'Maximum Temperature ($^\circ$C)')

gl = ax.gridlines(draw_labels=True, zorder=1)

### Titles and Labels

All plots should have an informative title. In this final step, we add a title to our plot and turn off the redundant top and right grid labels.

In [None]:
# Generate figure (set its size (width, height) in inches) and axes using Cartopy projection
fig = plt.figure(figsize=(8, 6))

# Generate axes using Cartopy
ax = plt.axes(projection=ccrs.PlateCarree())

# Turn on continent shading
ax.add_feature(cfeature.LAND,
                edgecolor='lightgray',
                facecolor='lightgray',
                zorder=0)

# Scatter-plot the location data on the map
plt.scatter(lon, lat, s=10, c=tmax, cmap='hot_r', marker='+', linewidth=1, zorder=2)

plt.xlim(-170,-50)
plt.ylim(0)

# Raw string so that the backslash in \circ is not treated as a Python escape
c = plt.colorbar(shrink=0.75, label=r'Maximum Temperature ($^\circ$C)')

gl = ax.gridlines(draw_labels=True, zorder=1)
gl.top_labels = False
gl.right_labels = False

plt.title("Station Locations");