# Visualize Climate Data with R

In this notebook, we will learn how to read the metadata from a NetCDF file and visualize data on a map.

Specifically, we will be using `raster`, an R package for spatial data analysis.

In [None]:
library(raster)
library(fs)
Sys.setenv(PROJ_LIB = "/opt/conda/envs/r-kernel/share/proj")
home = path_expand("~")

As an example, we use one of the NetCDF files stored in the ENES Data Space archive. The file contains the **tasmax** variable, the *Daily-Maximum Near-Surface Air Temperature*.

In [None]:
tasmax_file_path = home+'/data/CMIP6/ScenarioMIP/CMCC/CMCC-ESM2/ssp585/r1i1p1f1/Amon/tasmax/gn/v20210126/tasmax_Amon_CMCC-ESM2_ssp585_r1i1p1f1_gn_201501-210012.nc'

We use the `raster` function to create a RasterLayer from the NetCDF file. In this case, the following additional arguments may be provided:

- `varname`: The variable name, such as 'tasmax' or 'pr'. If not supplied and the file includes multiple variables, a guess will be made (and reported).

- `lvar`: default=3. To select the 'level variable' (3rd dimension variable) to use, if the file has 4 dimensions (e.g. depth instead of time)

- `level`: default=1. To select the 'level' (4th dimension variable) to use, if the file has 4 dimensions, e.g. to create a RasterBrick of weather over time at a certain height.

In [None]:
dset <- raster(tasmax_file_path) # We get only the first timestep (band)
dset
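
The arguments listed above can also be passed explicitly. The sketch below wraps the call in a small helper, `read_timestep`, a hypothetical name introduced only for illustration. It assumes the `ncdf4` package is installed so that `raster` can read NetCDF files, and it uses the `band` argument of `raster` to pick a time step other than the first.

```r
library(raster)

# Hypothetical helper (not part of raster): read one time step of one variable.
# `band` is the index of the layer (here, the time step) to read.
read_timestep <- function(path, varname, band = 1) {
  stopifnot(file.exists(path))
  raster(path, varname = varname, band = band)
}

# Example usage with the file defined above (uncomment to run in this notebook):
# dset_t2 <- read_timestep(tasmax_file_path, varname = "tasmax", band = 2)
# dset_t2
```

Naming the variable explicitly avoids relying on the guess that `raster` makes when a file holds more than one variable.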

From the output above, we can see that our `dset` object is a RasterLayer, and we get additional metadata about it:

- name of the variable
- dimensions
- resolution
- coordinate reference system

Calling `print` on the object shows the metadata associated with our NetCDF data file.

In [None]:
print(dset)

## Quick visualization

A quick visualization can be obtained by using the `plot` function.

In [None]:
plot(dset)

To shift longitudes to `[-180; 180]` instead of `[0; 360]`, we can use the `rotate` function. This function rotates a `Raster*` object that has x coordinates (longitude) from 0 to 360, to standard coordinates between -180 and 180 degrees. Longitude between 0 and 360 is frequently used in data from global climate models.

In [None]:
dset_r <- rotate(dset)
dset_r
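
To see what `rotate` does without reading a file, here is a minimal sketch on a synthetic grid. The 10-degree grid and its values are made up for illustration; only `raster`, `values`, `rotate`, `extent` and `extract` from the `raster` package are used.

```r
library(raster)

# Synthetic global grid with longitudes in [0, 360] and 10-degree cells
r <- raster(nrows = 18, ncols = 36, xmn = 0, xmx = 360, ymn = -90, ymx = 90)
values(r) <- seq_len(ncell(r))

r_rot <- rotate(r)

extent(r)      # x runs from 0 to 360
extent(r_rot)  # x runs from -180 to 180

# The cell centred at 185°E is the same cell as the one centred at 175°W
extract(r, cbind(185, 5))
extract(r_rot, cbind(-175, 5))
```

The two `extract` calls return the same value: the data are unchanged, only the longitude labels of the eastern half of the grid are shifted by 360 degrees.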

In [None]:
plot(dset_r)

## Plotting with ggplot2

`ggplot2` is a plotting package that makes it simple to create complex plots from data in a data frame. It provides a more programmatic interface for specifying what variables to plot, how they are to be displayed, and to define general visual properties. This helps in creating publication quality plots with minimal amounts of adjustments and tweaking.

In [None]:
library(ggplot2)

ggplot graphics are built step by step by adding new elements. Adding layers in this fashion allows for extensive flexibility and customization of plots. To build a ggplot, we use the following basic template, which works for different types of plots:

    ggplot(data = <DATA>, mapping = aes(<MAPPINGS>)) +  <GEOM_FUNCTION>()
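
As a concrete instance of this template, the sketch below fills in the three placeholders with `mtcars`, a data set that ships with R; the choice of data set and variables is purely illustrative and unrelated to the climate data.

```r
library(ggplot2)

# <DATA> = mtcars, <MAPPINGS> = wt on the x axis and mpg on the y axis,
# <GEOM_FUNCTION> = geom_point
p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point()
p
```

Further layers, scales and coordinate systems are added to `p` with `+`, which is the pattern used for the maps in the rest of this notebook.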

To visualise our data (`dset_r`) in R using `ggplot2`, we need to convert it to a dataframe. The raster package has a built-in function for conversion to a plottable dataframe:

In [None]:
df <- as.data.frame(dset_r, xy = TRUE) 

Inspecting the structure of our data, we can see a standard dataframe format:

In [None]:
str(df)
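
The layout of the converted dataframe is easier to see on a tiny raster. The sketch below uses a synthetic 2 x 3 layer with made-up values and an assumed short layer name, `tasmax`, chosen only for this illustration.

```r
library(raster)

# Small synthetic layer standing in for `dset_r`
r <- raster(nrows = 2, ncols = 3, xmn = -180, xmx = 180, ymn = -90, ymx = 90)
values(r) <- c(280, 285, 290, 295, 300, 305)
names(r) <- "tasmax"   # assumed short name, chosen for this illustration

df_demo <- as.data.frame(r, xy = TRUE)
df_demo   # one row per cell: x and y are cell centres, the third column holds the values
```

The third column takes its name from the layer, which is why the plotting code in this notebook refers to the value column by a long, layer-derived name.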

Once converted to a dataframe, we can plot it. We will also use the `coord_quickmap()` function to use an approximate Mercator projection for our plot.

In [None]:
ggplot() +
  geom_raster(data = df , aes(x = x, y = y, fill = Daily.Maximum.Near.Surface.Air.Temperature)) + 
  coord_quickmap()

We can then set the color scale to `scale_fill_viridis_c` which is a color-blindness friendly color scale.

In [None]:
ggplot() +
  geom_raster(data = df , aes(x = x, y = y, fill = Daily.Maximum.Near.Surface.Air.Temperature)) +
  scale_fill_viridis_c() +
  coord_quickmap()

We can also create our own colormap using `colorRampPalette`:

In [None]:
jet.colors <- colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan", "#7FFF7F", "yellow", "#FF7F00", "red", "#7F0000"))
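
Note that `colorRampPalette` returns a function, not a vector of colors: calling that function with `n` returns `n` interpolated colors. The short sketch below repeats the definition so that it runs on its own and inspects the result with base R only.

```r
jet.colors <- colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan", "#7FFF7F",
                                 "yellow", "#FF7F00", "red", "#7F0000"))

jet.colors(7)   # 7 colours interpolated between the 9 anchor colours
jet.colors(9)   # with n = 9 the anchors themselves are returned, as hex codes

# Quick visual check of the palette using base graphics
barplot(rep(1, 7), col = jet.colors(7), border = NA, axes = FALSE)
```

Asking for more colors, for example `jet.colors(100)`, gives a smoother gradient when the vector is passed to a continuous scale.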

Then, we can use this new colormap with `scale_fill_gradientn`:

In [None]:
ggplot() +
  geom_raster(data = df, aes(x=x, y=y, fill=Daily.Maximum.Near.Surface.Air.Temperature)) + 
  scale_fill_gradientn(colors = jet.colors(7)) + 
  coord_quickmap()

Let’s add country outlines using `borders`, keeping the `coord_quickmap` projection:

In [None]:
ggplot() +
  geom_raster(data = df, aes(x=x, y=y, fill=Daily.Maximum.Near.Surface.Air.Temperature)) + 
  scale_fill_gradientn(colors = jet.colors(7)) + 
  borders() + 
  coord_quickmap()

We can then improve our map by using other functions, such as:

- `options`: sets and examines global options that affect how R computes and displays its results. Here we set the `repr.plot.width` and `repr.plot.height` options to override the default size of the plotting area (7 inches for both).
- `coord_sf`: ensures that all layers use a common CRS.
- `annotate`: adds small annotations, such as text labels.

In [None]:
options(repr.plot.width=15, repr.plot.height=8)
ggplot() +
  geom_raster(data = df, aes(x=x, y=y, fill=Daily.Maximum.Near.Surface.Air.Temperature)) + 
  scale_fill_gradientn(colors = jet.colors(7)) + 
  borders()+
  coord_sf(xlim = c(-180, 180), ylim = c(-90, 90), expand = FALSE) +
  annotate(geom = "text", x = -100, y = 40, label = "USA", 
    fontface = "italic", color = "black", size = 5) +
  annotate(geom = "text", x = 12, y = 40, label = "Italy", 
    fontface = "italic", color = "black", size = 5) + 
  annotate(geom = "text", x = 80, y = -10, label = "Indian Ocean", 
    fontface = "italic", color = "black", size = 5)

The `coord_sf` function is also useful if we want the map to zoom in on a specific area.

#### Zoom on EUROPE

In [None]:
# EUROPE
north=75
south=35
east=40
west=-25

options(repr.plot.width=15, repr.plot.height=8)
ggplot() +
  geom_raster(data = df, aes(x=x, y=y, fill=Daily.Maximum.Near.Surface.Air.Temperature)) + 
  scale_fill_gradientn(colors = jet.colors(7)) + 
  borders()+
  coord_sf(xlim = c(west, east), ylim = c(south, north), expand = FALSE)

#### Zoom on AFRICA

In [None]:
# AFRICA
north=40
south=-35
east=60
west=-25

options(repr.plot.width=15, repr.plot.height=8)
ggplot() +
  geom_raster(data = df, aes(x=x, y=y, fill=Daily.Maximum.Near.Surface.Air.Temperature)) + 
  scale_fill_gradientn(colors = jet.colors(7)) + 
  borders()+
  coord_sf(xlim = c(west, east), ylim = c(south, north), expand = FALSE)

#### Zoom on AMERICA

In [None]:
# AMERICA
north=90
south=-60
east=-10
west=-180

options(repr.plot.width=15, repr.plot.height=8)
ggplot() +
  geom_raster(data = df, aes(x=x, y=y, fill=Daily.Maximum.Near.Surface.Air.Temperature)) + 
  scale_fill_gradientn(colors = jet.colors(7)) + 
  borders()+
  coord_sf(xlim = c(west, east), ylim = c(south, north), expand = FALSE)
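
The three zoom cells differ only in their bounding box, so the repeated code could be wrapped in a function. The sketch below is one possible way to do that; `plot_region` and its arguments are hypothetical names introduced here, and `borders()` assumes the `maps` package is installed, as in the cells above.

```r
library(ggplot2)

# Hypothetical helper (a name introduced for this sketch): plot a data frame of
# gridded values restricted to a longitude/latitude box.
plot_region <- function(data, value_col, west, east, south, north, palette) {
  ggplot() +
    geom_raster(data = data, aes(x = x, y = y, fill = .data[[value_col]])) +
    scale_fill_gradientn(colors = palette) +
    borders() +
    coord_sf(xlim = c(west, east), ylim = c(south, north), expand = FALSE)
}

# Example usage with the objects defined in this notebook (uncomment to run):
# plot_region(df, "Daily.Maximum.Near.Surface.Air.Temperature",
#             west = -25, east = 40, south = 35, north = 75,
#             palette = jet.colors(7))
```

Passing the column name as a string and looking it up with `.data[[value_col]]` lets the same helper be reused for other variables, such as precipitation.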

We can then save the plot using the [ggsave](https://ggplot2.tidyverse.org/reference/ggsave.html) function, which by default saves the last plot displayed.

In [None]:
ggsave(home+"/work/tasmax_america_201501.png", width = 15, height = 8)
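
Relying on the last plot displayed can be fragile when cells are run out of order. A minimal sketch of the alternative is to keep the plot in a variable and pass it through the `plot` argument of `ggsave`. The example plot and the temporary output file below are placeholders chosen for illustration.

```r
library(ggplot2)

p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point()

# Save a specific plot object rather than relying on the last plot displayed
out_file <- file.path(tempdir(), "example_plot.png")
ggsave(out_file, plot = p, width = 15, height = 8, dpi = 100)

file.exists(out_file)
```

In this notebook, the same approach would mean assigning the America map to a variable and passing that variable to `ggsave`.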

Check and display the exported PNG.

In [None]:
library("IRdisplay")
display_png(file=home+"/work/tasmax_america_201501.png")

### Change projection

It is often convenient to visualize data using a different projection than the original data. 

The projection is specified with `coord_map`. The `orientation` argument takes three values: latitude, longitude, and rotation.

We also use `scale_color_distiller` to change the palette.

In [None]:
ggplot(df, aes(y=y, x=x, color=Daily.Maximum.Near.Surface.Air.Temperature)) +
  geom_point(size=2, shape=15) +
  borders('world', xlim=range(df$x), ylim=range(df$y), colour='black') +  
  scale_color_distiller(palette='Spectral') +
  coord_map('ortho', orientation = c(40, 20, 0))

## A second example

As another example, let's use a NetCDF file containing the **pr** variable (Precipitation).

In [None]:
pr_file_path = home+'/data/CMIP6/ScenarioMIP/CMCC/CMCC-ESM2/ssp585/r1i1p1f1/Amon/pr/gn/v20210126/pr_Amon_CMCC-ESM2_ssp585_r1i1p1f1_gn_201501-210012.nc'

In this case, we are going to use the `stack` function from the `raster` package. The `stack` function is used to create a `RasterStack` object, which is a collection of `RasterLayer` objects with the same spatial extent and resolution. Here we want to load all bands of a multi-band raster (in our file, all time steps).

In [None]:
dset_pr <- stack(pr_file_path)
dset_pr

We can look at the bands (times):

In [None]:
options(max.print=3)
dset_pr@layers

Let’s now select one layer (time), for example "June 2050".

To ease our search, we grep for "2050.06" in the names of the layers. The precipitation variable is named **pr** (it is the only variable in the file), so let’s look at its metadata.

In [None]:
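# Match the layer name literally (fixed = TRUE), so the dots are not treated as regex wildcards
dset_205006 <- raster::subset(dset_pr, grep('2050.06.', names(dset_pr), value = TRUE, fixed = TRUE))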

In [None]:
print(dset_205006)
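
The selection by layer name can be tried on a small synthetic stack. In the sketch below the three layers and their date-like names are made up, in the style of the names seen in this notebook, and the match is done on the literal string rather than as a regular expression.

```r
library(raster)

# Three synthetic monthly layers standing in for the time steps of `dset_pr`
r <- raster(nrows = 2, ncols = 2)
s <- stack(setValues(r, rep(1, ncell(r))),
           setValues(r, rep(2, ncell(r))),
           setValues(r, rep(3, ncell(r))))
names(s) <- c("X2050.05.16", "X2050.06.16", "X2050.07.16")

nlayers(s)   # number of layers (time steps)

# Select the layer whose name contains the literal string "2050.06."
target <- grep("2050.06.", names(s), value = TRUE, fixed = TRUE)
june <- raster::subset(s, target)
june
```

Because a single layer is selected, `subset` returns a `RasterLayer` rather than a `RasterStack`.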

As before, we shift the longitudes from [0, 360] to [-180, 180] with `rotate`:

In [None]:
dset_205006_r <- rotate(dset_205006)

We then convert it to a dataframe for plotting:

In [None]:
df <- as.data.frame(dset_205006_r, xy = TRUE)
df

The column names are not very meaningful, so let’s rename them using the `dplyr` R package. (Please ignore the warnings, if any)

In [None]:
library(dplyr)
df <- df %>% 
  rename(
    precipitation = X2050.06.16,
    longitude = x,
    latitude = y
  )
df

The unit is **kg m-2 s-1**. We want to convert it to something more familiar, like **mm day-1** or **m day-1** (metres per day).

To do so, note that 1 kg of rain water spread over 1 m2 of surface forms a layer 1 mm thick, and that there are 86400 seconds in one day. Therefore:

    1 kg m-2 s-1 = 86400 mm day-1 = 86.40 m day-1

So we can multiply the precipitation column by 86.4 to obtain values in m day-1.

In [None]:
df$precipitation <- df$precipitation * 86.4
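
The conversion can be written out as two small functions, which makes the factor of 86.4 easy to check. The function names below are hypothetical and the input fluxes are illustrative numbers, not values taken from the file.

```r
seconds_per_day <- 24 * 60 * 60   # 86400

# 1 kg of water over 1 m2 is a 1 mm layer, so kg m-2 s-1 is numerically mm s-1
flux_to_mm_day <- function(flux) flux * seconds_per_day
flux_to_m_day  <- function(flux) flux_to_mm_day(flux) / 1000

flux_to_mm_day(1)      # 86400
flux_to_m_day(1)       # 86.4
flux_to_mm_day(5e-5)   # 4.32
```

With values in m day-1, an upper color limit of 0.02 corresponds to 20 mm day-1.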

Then we can plot the precipitation field

In [None]:
ggplot() +
  geom_raster(data = df , aes(x = longitude, y = latitude, fill = precipitation)) +
  scale_fill_viridis_c(limits = c(0.0, 0.02)) + 
  borders() + 
  coord_quickmap()

Or using the custom jet colormap

In [None]:
options(repr.plot.width=15, repr.plot.height=8)
jet.colors <- colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan", "#7FFF7F", "yellow", "#FF7F00", "red", "#7F0000"))

ggplot() +
  geom_raster(data = df, aes(x=longitude, y=latitude, fill=precipitation)) + 
  scale_fill_gradientn(colors = jet.colors(7),limits = c(0.0, 0.02)) +  
  borders()+
  coord_sf(xlim = c(-180, 180), ylim = c(-90, 90), expand = FALSE)