# **2. PyData and PyGMT - plotting data on maps**

In this tutorial, we'll be using tools from the [PyData](https://pydata.org) ecosystem's stack
to load in point and grid datasets via
[pandas](https://pandas.pydata.org) and [xarray](http://xarray.pydata.org) respectively.
We will then learn how to plot these on a map via [PyGMT](https://www.pygmt.org).
Those familiar with [matplotlib](https://matplotlib.org) will find the syntax rather similar,
but [PyGMT](https://www.pygmt.org/dev/overview) has a greater focus on producing high quality vector graphic maps
for publications, posters, talks, etc., which makes it better suited to a geospatial audience.

## **Getting started**

We'll start by importing the Python libraries we'll be using.
Following common conventions, we'll abbreviate
[pandas](https://pandas.pydata.org) as `pd` and
[xarray](http://xarray.pydata.org) as `xr`.

In [None]:
import pandas as pd
import pygmt
import xarray as xr

## **2.1 Loading and plotting points**

Now let's have a go at plotting our earthquake points on a map!
GMT really shines when it comes to plotting data on a map.

We'll use pandas to read in a CSV file of historical earthquakes from October 2018 to October 2019
directly via the [GeoNet Quake Search](https://quakesearch.geonet.org.nz) API kindly provided
by the New Zealand GeoNet project and its sponsors EQC, GNS Science and LINZ.
Alternatively, you can read in your own CSV file by passing in the path to a local file.

In [None]:
quakes = pd.read_csv(
    "https://quakesearch.geonet.org.nz/csv?minmag=4&bbox=160,-50,185,-10&startdate=2018-10-01T0:00:00&enddate=2019-10-31T1:00:00",
    skipinitialspace=True
)
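
As a minimal sketch of what `skipinitialspace=True` does, assuming a CSV file that puts a space after each comma:
the rows below are made-up sample values held in memory (not real GeoNet records), standing in for a local file.

```python
import io

import pandas as pd

# Made-up sample rows (not real GeoNet data), written with a space
# after each comma, including in the header line.
sample_csv = io.StringIO(
    "publicid, longitude, latitude, magnitude, depth\n"
    "sample001, 178.1, -17.9, 4.6, 560.0\n"
    "sample002, 179.3, -21.4, 5.2, 610.5\n"
)

# skipinitialspace=True strips the blank after each delimiter, so the
# column is named "longitude" rather than " longitude".
local_quakes = pd.read_csv(sample_csv, skipinitialspace=True)
print(local_quakes.columns.tolist())
```

Replacing `sample_csv` with a path such as `"my_quakes.csv"` (a hypothetical filename) would read from disk instead.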

We can preview the first few rows of the pandas.DataFrame
to see what the earthquake data looks like.

In [None]:
quakes.head()

Now let's start a new figure by creating a blank instance of `pygmt.Figure`.
This time, though, we'll set our focus on Fiji (FJ) instead.
We'll also use [pygmt.Figure.text](https://www.pygmt.org/dev/api/generated/pygmt.Figure.text)
to plot the country's capital Suva at a suitable coordinate.

In [None]:
fig = pygmt.Figure()
fig.basemap(region="FJ", frame=True)
fig.coast(land="grey", water="lightblue")
fig.text(x=178.4, y=-18.2, text="Suva", font="12p,Helvetica-Bold,black")
fig.show()

The [pygmt.Figure.plot](https://www.pygmt.org/dev/api/generated/pygmt.Figure.plot) method
is used to plot the points.

You'll notice that our `quakes` table has 'longitude' and 'latitude' columns,
and we'll pass those into the `x` and `y` arguments of `plot`.
We set the `style` to 'c0.3c', which means circles 0.3 cm in diameter.
The `pen` argument controls the outline of the symbols and `color` controls the fill.

In [None]:
fig.plot(x=quakes.longitude, y=quakes.latitude, style="c0.3c", pen="black", color="orange")
fig.show()

### **Changing the colour of points**

We can also colour the circles according to the earthquake's magnitude.
Let's start by creating a new `pygmt.Figure` once more.

In [None]:
fig = pygmt.Figure()
fig.basemap(region="FJ", frame=["af", '+t"Earthquakes near Fiji"'])
fig.coast(land="grey", water="lightblue")
fig.text(x=178.4, y=-18.2, text="Suva", font="12p,Helvetica-Bold,black")

Next, we'll create a color palette table (cpt) using
[pygmt.makecpt](https://www.pygmt.org/dev/api/generated/pygmt.makecpt),
scaling it to an appropriate range or `series`, which is about 3.5 to 6.5 in this case.
We need to do this because most of GMT's
[built-in color palette tables](https://docs.generic-mapping-tools.org/latest/cookbook/cpts.html#built-in-color-palette-tables-cpt)
assume a data range of values between 0 and 1,
and our earthquake magnitudes fall outside of that range.

We'll use the [scientific colourmap](http://www.fabiocrameri.ch/colourmaps.php) 'batlow' here,
but feel free to explore other alternatives for your own use case!

In [None]:
pygmt.makecpt(cmap="batlow", series=[3.5, 6.5])
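
Rather than reading the range off by eye, one option is to derive it from the data.
The sketch below uses a hypothetical helper, `padded_range`, and made-up magnitudes standing in for `quakes.magnitude`;
it rounds the minimum down and the maximum up to the nearest multiple of a chosen step.

```python
import math

import pandas as pd


def padded_range(values, step=0.5):
    """Return [low, high], rounded outwards to the nearest multiple of step."""
    low = math.floor(values.min() / step) * step
    high = math.ceil(values.max() / step) * step
    return [low, high]


# Made-up magnitudes standing in for quakes.magnitude
magnitudes = pd.Series([4.0, 4.3, 5.1, 6.2])
series = padded_range(magnitudes)
print(series)
```

The resulting two-element list could then be passed as the `series` argument of `pygmt.makecpt`.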

After that, we then tell [pygmt.Figure.plot](https://www.pygmt.org/dev/api/generated/pygmt.Figure.plot)
to `color` the points using the 'magnitude' values.
GMT will automatically use the scaled cpt we just made if we set `cmap` to True.

In [None]:
fig.plot(x=quakes.longitude, y=quakes.latitude, color=quakes.magnitude, style="c0.3c", cmap=True)

Lastly, we'll add a colorbar just outside the map frame at the Bottom Center (BC), with an appropriate label.

In [None]:
fig.colorbar(position="JBC", frame=["af", "y+lMagnitude"])
fig.show()

There are many other ways we can improve the display of points and other vector datasets on a map.
For example, you can follow this [tutorial](https://www.pygmt.org/dev/tutorials/plot.html#plotting-data-points)
to change the size of the points depending on the earthquake's magnitude.
There are also some [PyGMT gallery examples](https://www.pygmt.org/dev/gallery/index.html#plotting-map-items)
and pure [GMT gallery examples](https://docs.generic-mapping-tools.org/latest/gallery.html) you can
refer to for more inspiration.
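
As a rough sketch of the size-by-magnitude idea, the symbol sizes can be computed with pandas before plotting.
The magnitudes below are made up, and the 0.02 scaling factor is an arbitrary choice that you would tune for your own map.

```python
import pandas as pd

# Made-up magnitudes standing in for quakes.magnitude
magnitudes = pd.Series([4.0, 5.0, 6.0])

# Exponential scaling: each whole step in magnitude doubles the symbol size.
# The 0.02 factor is an arbitrary choice that keeps the symbols small.
sizes_cm = 0.02 * 2**magnitudes
print(sizes_cm.tolist())
```

These values could then be passed to `fig.plot` through its `size` parameter, together with `style="cc"` (circles whose size is read from the data, in centimetres).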

We'll leave it here for now,
but just realize that [pygmt.Figure.plot](https://www.pygmt.org/dev/api/generated/pygmt.Figure.plot)
is a very powerful tool that can be used for plotting lines, polygons and other symbols,
for not only maps but also non-geographic 2D plots as well!

## **2.2 Loading and plotting grids**

Now let's have a go at plotting some raster grids again!

This time though, we'll use [xarray](http://xarray.pydata.org) and [rasterio](https://github.com/mapbox/rasterio)
to load in a seamless 10 m spatial resolution bathymetry and topography DEM of Sydney Harbour
(published under a [CC BY 3.0](https://creativecommons.org/licenses/by/3.0/) license
by [Wilson and Power, 2018](https://doi.org/10.1594/PANGAEA.885014)).
The file is in an ESRI ASCII Raster format, but the workflow will be very similar for
[GeoTIFF](https://en.wikipedia.org/wiki/GeoTIFF) and [NetCDF](https://en.wikipedia.org/wiki/NetCDF) files,
or any of the file formats which xarray/rasterio can read in.

Note: The filename below looks funny because we're reading it directly from the web,
and unzipping the zip file 'on the fly' to get at the file we want
(see [here](https://rasterio.readthedocs.io/en/stable/topics/datasets.html#advanced-datasets) for more info).
Alternatively, you can substitute the filename to the path of a dataset stored locally.

In [None]:
grid = xr.open_rasterio(
    filename="zip+https://store.pangaea.de/Publications/WilsonK_etal_2018/Sydney.zip!Sydney/syd_10m_utm56.txt"
)

We can preview the resulting xarray.DataArray to look at some of the metadata and attributes.

In [None]:
grid

PyGMT's [grdimage](https://www.pygmt.org/dev/api/generated/pygmt.Figure.grdimage) method
requires the grid to have only 2 dimensions (i.e. x and y),
but our grid currently has 3 (band, y and x), each marked with an asterisk * in the coordinates listing.
We'll need to select just one of the bands (number 1 in this case)
so that things will work properly.

In [None]:
grid = grid.sel(band=1)
grid

The next steps are just some additional preprocessing.
We'll remove the NaN values so that they don't get plotted,
and also sort our grid so that the x and y dimensions
go in ascending order (i.e. West to East, South to North).

In [None]:
grid = grid.where(grid != grid.nodatavals, drop=True)
grid = grid.sortby(variables=list(grid.dims))
grid
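
To see what these two preprocessing steps do, here is a small sketch on a toy grid.
The 3x3 values and the `NODATA` marker are made up for illustration; the real grid takes its nodata value from `grid.nodatavals`.

```python
import numpy as np
import xarray as xr

NODATA = -9999.0  # hypothetical nodata marker for this toy grid

# Toy 3x3 grid: y runs North to South (descending), and the last
# column holds nothing but nodata values.
toy = xr.DataArray(
    data=np.array(
        [
            [1.0, 2.0, NODATA],
            [NODATA, 4.0, NODATA],
            [5.0, 6.0, NODATA],
        ]
    ),
    coords={"y": [2.0, 1.0, 0.0], "x": [0.0, 1.0, 2.0]},
    dims=("y", "x"),
)

# Mask nodata cells as NaN; drop=True also trims any row or column
# that consists entirely of nodata values.
masked = toy.where(toy != NODATA, drop=True)

# Sort every dimension into ascending order (South to North, West to East)
cleaned = masked.sortby(variables=list(masked.dims))
print(cleaned)
```

Note that `drop=True` only removes rows or columns that are empty throughout; an isolated nodata cell stays in the grid as NaN.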

Finally, we can make our plot!
We'll create a color palette table (cpt) scaled to our minimum and maximum elevation.
Our colorbar will be plotted on the Middle Right (MR) side with an X-offset of 1 cm,
and there will be a little square showing the NaN color (+n).

In [None]:
fig = pygmt.Figure()
pygmt.makecpt(series=[float(grid.min()), float(grid.max())], cmap="geo")
fig.grdimage(grid=grid, frame=["af", 'WSne+t"Sydney Harbour Topography"'])
fig.colorbar(position="JMR+n", frame=["af", 'y+l"Elevation (m)"'], X="1c")
fig.show()

**Additional Notes**

PyGMT currently has a few limitations when it comes to plotting grids.
Here is a list of some of the common ones (and their workarounds):

- The datatype (dtype) of the xarray.DataArray values must be one of int/uint/float 32/64
  (i.e. the standard uint/int16 found in many GeoTIFF files won't work out of the box).
  - Workaround: Convert the dtype using `grid = grid.astype(dtype=np.uint32)` after `import numpy as np`
    (pick a signed or float type such as `np.int32` if the grid holds negative values)
- Large datasets stored in an xarray.DataArray may take a long time for PyGMT to plot.
  - Workaround: Subset your xarray grid using `grid.sel(x=slice(minx, maxx), y=slice(miny, maxy))`
  - Alternative: Pass in the filename to your grid directly, instead of loading it via xarray.
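
The first two workarounds above can be sketched on a toy grid as follows.
The int16 values and coordinates are made up, and the choice of `np.int32` is an assumption that suits grids with negative values (e.g. depths); `np.uint32` would do for non-negative data.

```python
import numpy as np
import xarray as xr

# Toy int16 grid standing in for a GeoTIFF band; the values are made up
toy = xr.DataArray(
    data=np.arange(-8, 8, dtype=np.int16).reshape(4, 4),
    coords={"y": [0, 10, 20, 30], "x": [0, 10, 20, 30]},
    dims=("y", "x"),
)

# A signed 32-bit type keeps the negative values intact
converted = toy.astype(np.int32)

# Label-based slices include both end points and follow the coordinate
# order, so the coordinates must be ascending for this to select anything.
subset = converted.sel(x=slice(10, 20), y=slice(0, 10))
print(subset.dtype, subset.shape)
```

This is one reason the grid was sorted into ascending order earlier: a slice such as `slice(miny, maxy)` would select nothing on a descending y coordinate.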
  

So far we've gone through a good chunk of PyGMT's plotting modules,
but there's a whole lot more to PyGMT than that!
Next up, we'll look at one example of PyGMT's data processing functionality.
Specifically, taking a point cloud and processing it into a Digital Elevation Model (DEM) grid!