# Section 6. Satellite Remote Sensing

#### Instructor: Pierre Biscaye

The content of this notebook draws on material from UC Berkeley's Spatial Data Analysis [course](https://docs.google.com/document/d/1oC10pjyeBQTenQazCpaB8Lx1b5PC1SR3WFiPgCtXqcs/edit?tab=t.0) notes by [Jaecheol Lee](https://sites.google.com/view/jaecheollee) and from a University of Bordeaux Machine Learning [course](https://github.com/jdnmiguel/Applied-ML) by [Jeremy do Nascimento Miguel](https://jdnmiguel.github.io/).

In developing countries, we often lack information on weather events, agricultural production, and urban settlements that could help explain policy responses and ease policy targeting. One way to overcome this data scarcity is to complement traditional data sources with remote sensing data. 

Data from remote sensing cover a wide range of topics: for instance land use, urban settlements, population density, agricultural production, weather, etc. 

If you want to know more about these data sources, I strongly recommend reading [Donaldson, Dave, and Adam Storeygard. "The view from above: Applications of satellite data in economics." Journal of Economic Perspectives 30.4 (2016): 171-198.](https://pubs.aeaweb.org/doi/pdf/10.1257/jep.30.4.171)
Another great paper is [Jain, Meha. "The benefits and pitfalls of using satellite data for causal inference." Review of Environmental Economics and Policy (2020).](https://www.journals.uchicago.edu/doi/abs/10.1093/reep/rez023?journalCode=reep)
But there are many others! We will talk about how to identify some of these data sources today.
    
### Learning Objectives 
    
* Understand what satellite imagery looks like in raw form
* Learn how to access remotely sensed data through Google Earth Engine
* Get familiar with how to plot spatial data available online
* Think about different ways of measuring the same thing with spatial data

### Sections

1. Satellite imagery
2. Google's Earth Engine
3. Accessing spatial data
4. Comparing data sources: mapping floods in Nigeria

In [None]:
# Import Packages

import random
random.seed(0)
import numpy as np
np.random.seed(0)

import matplotlib.pyplot as plt
import geopandas as gpd
import pandas as pd
import sys
import rasterio
from matplotlib.colors import LinearSegmentedColormap

%matplotlib inline

# 1. Satellite imagery

Many remotely sensed data sources are derived from raw **satellite data**, whether imagery or other sensor measurements. Let's have a look at an example of pulling raw satellite imagery from the web directly in Python.

Note that there are python integrations to **pull satellite imagery from other sources**, such as **Google Earth Engine**. We will not cover that today but we will cover working in Google Earth Engine to access geospatial data.

### Pulling Landsat images from the NASA API

We will pull imagery from the Landsat satellite using a NASA API. The NASA API handles the indexing for a Google Earth API, which provides the satellite imagery itself. 

The NASA API is described in the imagery section of this link: https://api.nasa.gov/api.html#earth. You need to provide the API with the location of the image and a date, and it will return the image taken by Landsat 8 that is closest to that date, and an estimate of how much the image is covered in clouds. Note that this API returns near-true-color satellite imagery with just 3 RGB bands, rather than the full Landsat stack of spectral bands. Different services (such as Google Earth Engine) can allow access to the full stack.

The NASA site allows a computer to make 30 queries per hour using the generic API key DEMO_KEY. If you need more than 30 requests per hour, you will need to register for your own key by going here: https://api.nasa.gov/index.html#apply-for-an-api-key.
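
To keep an eye on the hourly quota, one option is to read the rate-limit headers that come back with each response. The sketch below assumes the response carries `X-RateLimit-Limit` and `X-RateLimit-Remaining` headers; the helper name `remaining_requests` and the example header values are made up for illustration.

```python
def remaining_requests(headers):
    """Return (limit, remaining) parsed from rate-limit headers, or None if absent."""
    limit = headers.get("X-RateLimit-Limit")
    remaining = headers.get("X-RateLimit-Remaining")
    if limit is None or remaining is None:
        return None
    return int(limit), int(remaining)

# Illustrative header values, not real API output
example_headers = {"X-RateLimit-Limit": "30", "X-RateLimit-Remaining": "27"}
quota = remaining_requests(example_headers)
print(quota)
```

With the `requests` library, the same helper could be called on the `headers` attribute of a response object.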

**Querying an API** means sending a URL that contains information about the query; the server receives the request and sends back the requested information. Both the query and the returned information must follow standard formats. In this exercise, the NASA API will return information stored in JSON and the Google Earth API will return an 8-bit RGB image. You must construct a URL defining your query. One way is to construct the URL as a string, using the standard format described by NASA. You assign values (stored as strings) to fields using '=' and separate fields using '&'. The URL is built by simply concatenating several strings.
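
As a minimal sketch of the `field=value&field=value` format, the standard library's `urllib.parse` can build the query string from a dictionary and parse it back. The variable names `params`, `query_url`, and `parsed` are only illustrative; nothing here contacts the API.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = "https://api.nasa.gov/planetary/earth/imagery/"
params = {"lat": "48.861", "lon": "2.295", "date": "2022-06-15", "api_key": "DEMO_KEY"}

# urlencode joins field=value pairs with '&' and escapes special characters
query_url = base_url + "?" + urlencode(params)
print(query_url)

# Parsing the URL back recovers the individual fields
parsed = parse_qs(urlsplit(query_url).query)
print(parsed["date"])
```

Building the query from a dictionary avoids typos in the separators and makes it easy to change one field at a time.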

We'll start by looking at the area around the Eiffel Tower in Paris. Note that individual images are 0.025 degree squares.
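
To get a feel for what a 0.025 degree square means on the ground, here is a rough conversion to kilometers. It assumes a spherical-Earth approximation of about 111.32 km per degree of latitude, with east-west distances shrinking by the cosine of the latitude; the function name `tile_size_km` is hypothetical.

```python
import math

KM_PER_DEG_LAT = 111.32  # approximate length of one degree of latitude

def tile_size_km(lat_deg, dim_deg=0.025):
    """Approximate (north-south, east-west) extent in km of a dim_deg square tile."""
    ns = dim_deg * KM_PER_DEG_LAT
    ew = dim_deg * KM_PER_DEG_LAT * math.cos(math.radians(lat_deg))
    return ns, ew

ns_km, ew_km = tile_size_km(48.861)
print(f"{ns_km:.2f} km north-south by {ew_km:.2f} km east-west")
```

At the latitude of Paris the tile is therefore noticeably narrower east-west than it is tall, even though it is a square in degrees.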

In [None]:
# Build query
lat="48.861"
lon="2.295"
date="2022-06-15"
my_api_key = "DEMO_KEY"
base_url="https://api.nasa.gov/planetary/earth/imagery/"
image_query=base_url+"?lat="+lat+"&lon="+lon+"&date="+date
image_query=image_query+"&api_key="+my_api_key
print(image_query)

In [None]:
# Request data and parse reply using requests library
import requests as re
import json
r = re.get(image_query)
print(r.url)
# Alternative
#r = re.get(base_url, params={'lat':lat,'lon':lon,
#                             'date':date,'api_key':my_api_key})
#print(r.url)

In [None]:
# Retrieve satellite image
def download_file(url,local_filename):
    r = re.get(url, stream=True)
    with open(local_filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:
                f.write(chunk)
    return local_filename

download_file(image_query,"Data/paris.png")

We'll now use the `skimage` library to load the satellite image file. `skimage` is the import name for `scikit-image`, a collection of algorithms for image processing. 

In [None]:
from skimage import io

In [None]:
# Display satellite image
im = io.imread('Data/paris.png')
plt.imshow(im)
plt.show()

It's hard to tell what we are seeing, and notice that we can clearly see the cloud cover! That is an important limitation of satellite imagery.

Let's try another date.

In [None]:
date="2022-07-15"
image_query=base_url+"?lat="+lat+"&lon="+lon+"&date="+date
image_query=image_query+"&api_key="+my_api_key
download_file(image_query,"Data/paris2.png")
im2 = io.imread('Data/paris2.png')
fig, ax = plt.subplots(figsize=(10, 10))
ax.imshow(im2)
plt.show()

That's clearer, but it's still a little hard to tell what we are looking at. 

We can **stitch together satellite images** to give us a broader picture.

In [None]:
lat_n=str(48.861+0.025)
# Use the shifted latitude (lat_n) so that we request the tile directly to the north
image_query=base_url+"?lat="+lat_n+"&lon="+lon+"&date="+date
image_query=image_query+"&api_key="+my_api_key
download_file(image_query,"Data/paris2_n.png")
im2_n = io.imread('Data/paris2_n.png')

In [None]:
# Concatenate images
big_im = np.vstack((im2_n,im2))

# Plot together
fig, ax = plt.subplots(ncols=1, figsize=(6, 6))
ax.imshow(big_im)
plt.show()
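
To see what the stacking does to the arrays, here is a toy sketch with two tiny synthetic tiles in place of real imagery. It assumes both tiles have the same pixel dimensions and that their centers are exactly one tile height (or width) apart, so that they share an edge.

```python
import numpy as np

# Two toy 4x4 RGB "tiles": the northern tile is white, the southern tile is black
tile_north = np.full((4, 4, 3), 255, dtype=np.uint8)
tile_south = np.zeros((4, 4, 3), dtype=np.uint8)

# Row 0 of an image array is the top edge, so the northern tile goes first
stacked_ns = np.vstack((tile_north, tile_south))

# East-west neighbors would be joined along the column axis instead
stacked_ew = np.hstack((tile_north, tile_south))

print(stacked_ns.shape, stacked_ew.shape)
```

The order of the arguments matters: putting the southern tile first in `np.vstack` would flip the map.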

Now let's zoom out a bit using a built-in feature of the NASA API call.

In [None]:
# Zooming out
# The dim argument tells the width and height in degrees of the square area desired, at same resolution; the default is 0.025
image_query=base_url+"?lat="+lat+"&lon="+lon+"&date="+date+"&dim=0.1"
image_query=image_query+"&api_key="+my_api_key
download_file(image_query,"Data/paris3.png")
im3 = io.imread('Data/paris3.png')
fig, ax = plt.subplots(figsize=(10, 10))
ax.imshow(im3)
plt.show()

### Equalization and sharpening

Let's try two approaches to make the images a bit sharper. 

1. **Histogram equalization** redistributes the intensities to use the full available range more evenly. This can help make details more visible, and works well with images that are too dark or too bright.
2. **Unsharp masking** sharpens images by subtracting a blurred version from the original.

Functions for both approaches are available from the `skimage` library.
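
To build intuition for histogram equalization, here is a minimal NumPy sketch of the idea: each pixel value is replaced by its position in the empirical cumulative distribution of values. This is an illustration on a synthetic dark band, not the `skimage` implementation, and the function name `equalize_hist_sketch` is hypothetical.

```python
import numpy as np

def equalize_hist_sketch(band, nbins=256):
    """Map pixel values through their empirical CDF so outputs spread over [0, 1]."""
    hist, bin_edges = np.histogram(band.ravel(), bins=nbins)
    cdf = hist.cumsum() / hist.sum()
    bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
    return np.interp(band.ravel(), bin_centers, cdf).reshape(band.shape)

rng = np.random.default_rng(0)
# Synthetic "too dark" band: values only use the bottom of the 0-255 range
dark = rng.integers(0, 60, size=(50, 50)).astype(float)
eq = equalize_hist_sketch(dark)

print("input range:", dark.min(), dark.max())
print("output range:", round(eq.min(), 3), round(eq.max(), 3))
```

The output uses nearly the whole 0 to 1 range even though the input was squeezed into dark values, which is why equalization tends to reveal detail in under- or over-exposed images.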

We'll start with the smaller image.

In [None]:
from skimage import exposure
from skimage.filters import unsharp_mask

# Global histogram equalization
im2_eq = exposure.equalize_hist(im2)

# Unsharp masking
# radius controls the size of the blur
# amount controls how much the unsharp mask effect is amplified
im2_sharp = unsharp_mask(im2_eq, radius=1, amount=1)
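
# The arithmetic behind unsharp masking is
# sharpened = original + amount * (original - blurred).
# The sketch below is only an illustration of that formula: it swaps in a
# simple 3x3 mean blur for the Gaussian blur that `unsharp_mask` uses, and the
# function names `box_blur` and `unsharp_sketch` are hypothetical.

```python
import numpy as np

def box_blur(band):
    """Blur a 2D array with a 3x3 mean filter (edges handled by edge padding)."""
    padded = np.pad(band, 1, mode="edge")
    rows, cols = band.shape
    total = np.zeros_like(band, dtype=float)
    for dr in range(3):
        for dc in range(3):
            total += padded[dr:dr + rows, dc:dc + cols]
    return total / 9.0

def unsharp_sketch(band, amount=1.0):
    """Add back the difference between the original and its blurred version."""
    band = band.astype(float)
    return band + amount * (band - box_blur(band))

# A vertical edge: values step from 0.2 to 0.8 halfway across
band = np.full((5, 6), 0.2)
band[:, 3:] = 0.8
sharp = unsharp_sketch(band)
print(np.round(sharp[0], 2))
```

# Flat areas are unchanged, while values on either side of the edge are pushed
# further apart, which is what makes edges look crisper.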

In [None]:
# Plot images together
fig, axes = plt.subplots(ncols=3, figsize=(14, 7))

axes[0].imshow(im2)
axes[0].set_title("Original Image")
axes[0].axis("off")

axes[1].imshow(im2_eq, cmap='gray')
axes[1].set_title("Enhanced Image (Equalized)")
axes[1].axis("off")

axes[2].imshow(im2_sharp, cmap='gray')
axes[2].set_title("Enhanced Image (Equalized + Sharpened)")
axes[2].axis("off")

plt.tight_layout()
plt.show()

Now we can see things a bit more clearly! Certainly equalizing helped highlight the contrast across pixels in the original image. It's not clear how much we gained from sharpening but there are some differences. 

Here's what it looks like in [Google Maps](https://www.google.com/maps/place/Eiffel+Tower/@48.8586559,2.2848253,2483m/data=!3m1!1e3!4m6!3m5!1s0x47e66e2964e34e2d:0x8ddca9ee380ef7e0!8m2!3d48.8583701!4d2.2944813!16zL20vMDJqODE?entry=ttu&g_ep=EgoyMDI1MDExNS4wIKXMDSoASAFQAw%3D%3D).

Let's see how it works with the zoomed out image.

In [None]:
# Global histogram equalization
im3_eq = exposure.equalize_hist(im3)

# Unsharp masking
im3_sharp = unsharp_mask(im3_eq, radius=1, amount=1)

# Plot images together
fig, axes = plt.subplots(ncols=2, figsize=(14, 7))

axes[0].imshow(im3)
axes[0].set_title("Original Image")
axes[0].axis("off")

axes[1].imshow(im3_sharp, cmap='gray')
axes[1].set_title("Enhanced Image (Equalized + sharpened)")
axes[1].axis("off")

plt.tight_layout()
plt.show()

### Exploring image bands

Most imagery comes with multiple spectral bands. Analyzing them can give us specific information about the image and its contents.

In [None]:
im3_sharp.shape

We can see it is a square of pixels with 4 bands. The first three correspond to red, green, and blue.

Let's plot the different color bands and see how they look.

In [None]:
fig, ax = plt.subplots(ncols=3, nrows=2, figsize=(12, 6))
ax[0,0].imshow(im3[:, :, 0], cmap='binary')
ax[0,1].imshow(im3[:, :, 1], cmap='binary')
ax[0,2].imshow(im3[:, :, 2], cmap='binary')
ax[1,0].imshow(im3_sharp[:, :, 0], cmap='binary')
ax[1,1].imshow(im3_sharp[:, :, 1], cmap='binary')
ax[1,2].imshow(im3_sharp[:, :, 2], cmap='binary')
ax[0,0].set_title('Red band')
ax[0,1].set_title('Green band')
ax[0,2].set_title('Blue band')
plt.show()

It is not very easy to tell apart the images from the red, green, and blue bands when they are plotted separately in black and white in the raw image, but it is easier after equalizing and sharpening. The red band has somewhat lower values than the blue band, while the green band is in the middle, but they all look similar. Water, as in the river, and greenery, as in the Bois de Boulogne, stand out more in the blue band than in the red band. 

What does this imply about how much blue light buildings reflect relative to water and vegetation?

With this image we do not have additional image bands, but if we had the near infrared band we could use it to calculate NDWI, the normalized difference water index.
$$ NDWI = \frac{GREEN - NIR}{GREEN + NIR} $$

We could then plot that to see where water is identified.
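
As a sketch of what that calculation would look like, the example below computes NDWI on a tiny synthetic array. The reflectance values are made up for illustration (the left column mimics water, the right column vegetation), and the threshold of 0 for flagging water is an assumption rather than a fixed rule.

```python
import numpy as np

def normalized_difference(a, b, eps=1e-7):
    """Compute (a - b) / (a + b), with a small eps to avoid division by zero."""
    a = a.astype(float)
    b = b.astype(float)
    return (a - b) / (a + b + eps)

# Synthetic reflectances (made-up values): water is dark in NIR, vegetation is bright
green = np.array([[0.10, 0.08],
                  [0.10, 0.08]])
nir = np.array([[0.02, 0.40],
                [0.02, 0.40]])

ndwi = normalized_difference(green, nir)
water_mask = ndwi > 0  # assumed threshold for this illustration

print(np.round(ndwi, 2))
print(water_mask)
```

The same helper works for any normalized difference index by swapping in a different pair of bands.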

Let's simulate what plotting an index would look like, using $\frac{Green- Red}{Green + Red}$

In [None]:
# Extract the red and green bands
red   = im3_sharp[..., 0].astype(float)  # Band 0
green = im3_sharp[..., 1].astype(float)  # Band 1

# Compute the index: (G - R) / (G + R)
#    Add a small epsilon to avoid possible division by zero
eps = 1e-7
index = (green - red) / (green + red + eps)

# Plot the resulting index
plt.figure(figsize=(8, 6))
# Use a diverging colormap like 'RdYlGn', 'bwr', or 'seismic'
plt.imshow(index, cmap='RdYlGn', vmin=-1, vmax=1)
plt.colorbar(label='(Green - Red) / (Green + Red)')
plt.title('(G - R) / (G + R)')
plt.axis('off')
plt.show()

What do we observe from this image? What are we most identifying? Compare to the [Google Maps](https://www.google.com/maps/place/Eiffel+Tower/@48.8586559,2.2848253,2483m/data=!3m1!1e3!4m6!3m5!1s0x47e66e2964e34e2d:0x8ddca9ee380ef7e0!8m2!3d48.8583701!4d2.2944813!16zL20vMDJqODE?entry=ttu&g_ep=EgoyMDI1MDExNS4wIKXMDSoASAFQAw%3D%3D) image.

### Identifying features

We've seen how we could use even just 3 bands of optical imagery to make some progress toward identifying water and vegetation.

The applications are almost limitless, and often combine remotely sensed data and some measures of 'ground truth' using machine learning.
* [Measuring economic growth](https://www.aeaweb.org/articles?id=10.1257/aer.102.2.994)
* [Mapping poverty](https://cdn.vanderbilt.edu/vu-my/wp-content/uploads/sites/2095/2019/04/14134552/ScienceMachineLearningArticle.pdf)
* [Estimating crop yield](https://www.sciencedirect.com/science/article/pii/S2352938522000015)
* [Predicting air quality](https://www.sciencedirect.com/science/article/pii/S0034425723001608)
* and so much more...

We are limited in what we can do here with just 3 bands of imagery, but let's try a rough proxy of **mapping buildings**.

[Several](https://onlinelibrary.wiley.com/doi/10.1155/2022/4831223) [papers](https://www.researchgate.net/publication/360414085_Detecting_Buildings_and_Nonbuildings_from_Satellite_Images_Using_U-Net) have worked on models incorporating machine learning and satellite data for building identification.

One common measure is the Normalized Difference Built-up Index (NDBI): $\frac{SWIR-NIR}{SWIR+NIR}$. We cannot calculate this without the shortwave infrared (SWIR) and near infrared (NIR) bands, but we could potentially create a proxy that relies on the fact that buildings are typically brighter in RGB imagery: more 'white', meaning more reflectance on all bands.

Can we do a rough proxy ourselves?

Let's try this with images of the University of California at Berkeley, where there is a nice combination of scattered buildings and vegetation.

In [None]:
Berkeley_lat="37.87"
Berkeley_lon="-122.26"
date="2014-05-21"
my_api_key = "DEMO_KEY"
base_url="https://api.nasa.gov/planetary/earth/imagery/"
image_query=base_url+"?lat="+Berkeley_lat+"&lon="+Berkeley_lon+"&date="+date
image_query=image_query+"&api_key="+my_api_key
download_file(image_query,"Data/image.png")
berk = io.imread('Data/image.png')
plt.imshow(berk)
plt.show()

In [None]:
fig, ax = plt.subplots(ncols=3, figsize=(12, 6))
ax[0].imshow(berk[:, :, 0], cmap='binary')
ax[1].imshow(berk[:, :, 1], cmap='binary')
ax[2].imshow(berk[:, :, 2], cmap='binary')
ax[0].set_title('Red band')
ax[1].set_title('Green band')
ax[2].set_title('Blue band')
plt.show()

Buildings appear to show up as darker in the blue band, but generally as brighter overall.

Let's see if we can use pixel brightness to proxy for buildings. We'll define 'bright' pixels as those with values above 50 on all color bands.

In [None]:
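# np.logical_and takes only two input arrays (a third positional argument is treated as `out`),
# so use reduce to require brightness on all three bands
bright = np.logical_and.reduce((berk[:,:,0]>50, 
                                berk[:,:,1]>50, 
                                berk[:,:,2]>50))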

Now, let's plot this! 

We can also use the bright pixels as a mask to show what the satellite imagery looks like only in the identified areas.

In [None]:
# Keep the RGB channels and zero out pixels outside the mask (the 2D mask is broadcast across bands)
berk_masked = berk[..., :3] * bright[..., np.newaxis]
#berk_masked[berk_masked==0]=255 #Change masked out pixels to white

fig, ax = plt.subplots(ncols=3, figsize=(12,18))
ax[0].imshow(berk)
ax[1].imshow(bright, cmap='coolwarm')
ax[2].imshow(berk_masked)
ax[0].set_title('Satellite image of \n north-south area around \n UC Berkeley campus')
ax[1].set_title('Building locations predicted \n based on pixel brightness')
ax[2].set_title('Satellite image of only \n predicted building locations')
plt.show()

The mask appears to do a good job identifying buildings; this is clear in both the plot of the mask and the plot of the masked satellite image, though choosing to white out all other locations makes it a bit harder to distinguish some of the buildings that are very bright/close to white. 

Varying the thresholds on some of the bands might lead to a slightly more effective mask. But this is just a rough proxy and there are better ways to identify buildings from satellites.
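
One way to explore the effect of the threshold is to loop over a few candidate values and check what share of pixels gets flagged. The sketch below uses a random synthetic image as a stand-in for the Berkeley tile, so the shares themselves are meaningless; the function name `bright_share` and the candidate thresholds are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 100x100 RGB image with 8-bit values (stand-in for a real tile)
img = rng.integers(0, 256, size=(100, 100, 3), dtype=np.uint8)

def bright_share(image, threshold):
    """Share of pixels whose R, G, and B values all exceed the threshold."""
    mask = (image[..., :3] > threshold).all(axis=2)
    return mask.mean()

shares = {t: bright_share(img, t) for t in (50, 100, 150, 200)}
for t, s in shares.items():
    print(f"threshold {t}: {s:.1%} of pixels flagged")
```

On a real image, comparing the flagged share and the resulting mask against a reference map would be a reasonable way to pick a threshold.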

For example, you can check out this blog summarizing the Mapping Africa's Buildings [project](https://research.google/blog/mapping-africas-buildings-with-satellite-imagery/) to get a sense of how they proceeded.

# 2. Using Google's Earth Engine

An important tool for working with remotely sensed and satellite data is Google's **Earth Engine**.

I have used Earth Engine to access and process weather and climate data (CHIRPS, ERA5, SPEI), agricultural/vegetation data (NDVI), land cover data (from Copernicus and others), flood risk data (JRC, WRI), and more.

### What is Earth Engine?

Earth Engine (EE) is a cloud/browser-based platform for planetary scale geospatial analysis that relies on Google's processing and storage capabilities to enable large analyses in very little time. 
It’s most relevant for people that are interested in using satellite and aerial imagery to study large areas, long time periods, or both. 
Earth Engine is home to hundreds of public remote sensing/geospatial datasets totaling more than thirty petabytes, and is continuously updated as images are captured.

Check out the [Data Catalog](https://developers.google.com/earth-engine/datasets) for more information.

### How do you work with Earth Engine?

First, you can create a free account - there is no cost for using Earth Engine for academic purposes. Once you have an account, you can log in to EE and start writing scripts to work with spatial data.

EE is an Application Programming Interface (API), meaning that we request data (raw or processed in some way) using a programming language. 
EE has both a JavaScript and Python API – I have primarily used the JavaScript API because the JavaScript Playground is more visually interactive and is easier to set up. The Python API requires some additional authentication and set up, but it can be very useful to have it integrated into your Python code.

### Is it worth learning?

**Absolutely**, if you plan to work with large-scale spatial data. 

The main **constraint** is learning how to write the code in JavaScript to access and manipulate the data. But LLMs can really help to overcome this.

The huge **advantage** is access to massive computational power and speed for treating big spatial data. It facilitates analysis that would crash or take a very long time on most personal machines.

### Seeing an example

We won't go into detail on how to use EE today, but I will walk you through some example code so you can see it.

Let's look at some of my Earth Engine [code](https://code.earthengine.google.com/?project=ee-pbiscaye).

1. Browsing for datasets: floods
2. Asking ChatGPT for help with EE JavaScript code
3. Looking through and running code for mapping flood hazard
4. Using the console to understand the data
5. Interacting with the map and using the Inspector
6. Running a task/exporting data
7. Showing how I then used the data in Python
8. Looking briefly at code for other EE datasets (SPEI prep)

# 3. Accessing spatial data

There are many places to access spatial data. Many datasets are just a quick Google search away!

For example, here are the web addresses of some datasets I have used in my work that we will look at below:

* Gridded population of the world: https://sedac.ciesin.columbia.edu/gpw-v2/index.html?main.html&2
* Global administrative boundaries: https://gadm.org/download_country.html
* Global agricultural lands: https://www.earthdata.nasa.gov/data/catalog/sedac-ciesin-sedac-aglands-pas2000-1.00
* WorldClim monthly temperature and precipitation data: https://www.worldclim.org/data/monthlywth.html
* Global land cover classifications: https://gaez.fao.org/

These datasets come at different spatial and temporal resolutions, so they require some preparation before they can be merged and analyzed together. 
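
As a sketch of one common preparation step, the example below aggregates a fine grid to a coarser one by grouping cells into square blocks. It assumes the coarse cell size is an exact multiple of the fine cell size and that the grids are aligned; the function name `aggregate_blocks` is hypothetical, and real rasters with different projections or offsets would need a dedicated resampling tool instead.

```python
import numpy as np

def aggregate_blocks(grid, factor, func=np.sum):
    """Aggregate a 2D grid into coarser cells made of factor x factor fine cells."""
    rows, cols = grid.shape
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    blocks = grid.reshape(rows // factor, factor, cols // factor, factor)
    return func(blocks, axis=(1, 3))

fine = np.arange(36, dtype=float).reshape(6, 6)

coarse_sum = aggregate_blocks(fine, 3)             # suited to counts, e.g. population
coarse_mean = aggregate_blocks(fine, 3, np.mean)   # suited to intensities, e.g. rainfall

print(coarse_sum)
print(coarse_mean)
```

The choice between summing and averaging depends on the variable: counts should be preserved in total, while rates or intensities should be averaged.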

Let's have a look at some of them!

In [None]:
# Setting my data path
#path="/Users/pierrebiscaye/Dropbox/Data/"
path="C:/Users/pibiscay/Dropbox/Data/"

### Gridded population of the world from CIESIN

We've already seen the GPW data. The GPWv4 rev11 population estimates are available globally at a 15 arcminute (0.25 degree) resolution, interpolated for every 5 years from 2000 to 2020. 
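
Resolutions in this notebook are quoted sometimes in arcminutes and sometimes in degrees. A quick conversion helps compare them: there are 60 arcminutes in a degree. The sketch below also computes the number of rows and columns of a grid at a given resolution, assuming the grid covers the full globe (180 degrees of latitude by 360 degrees of longitude); the function names are hypothetical.

```python
def arcmin_to_deg(arcmin):
    """Convert a cell size in arcminutes to degrees."""
    return arcmin / 60.0

def global_grid_shape(arcmin):
    """Rows and columns of a full-globe grid with the given cell size."""
    cell = arcmin_to_deg(arcmin)
    return round(180 / cell), round(360 / cell)

for res in (15, 5, 2.5):
    print(f"{res} arcmin = {arcmin_to_deg(res):.4f} degrees, grid shape {global_grid_shape(res)}")
```

Moving from 15 to 2.5 arcminutes multiplies the number of cells by 36, which is one reason finer datasets take noticeably longer to load and plot.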

There are a lot of calculations under the hood already to estimate population. We won't get into those here but you should be aware that every data source you find has a variety of calculations under the hood that may affect the quality of the data.

Let's have another look, considering 2020.

In [None]:
pop20=rasterio.open(path+"Spatial/GPW/2020/gpw_v4_population_count_rev11_2020_15_min.tif")

In [None]:
nodes = [0, 0.33, 0.67, 1]  # positions for each color from 0-1
color_scheme = ['bisque', 'orange', 'red', 'purple']  # corresponds to nodes
custom_cmap = LinearSegmentedColormap.from_list(
    'WhiteYellowRed', list(zip(nodes, color_scheme)))
custom_cmap.set_under('gray')  # set values under vmin to gray
custom_cmap.set_over('blue')  # set values over vmax to blue

# plotting starts
fig, ax = plt.subplots(figsize=(10, 4))
im = ax.imshow(pop20.read(1),
               cmap=custom_cmap,
               extent=(-180, 180, -90, 90),
               vmin=0, vmax=500000)
fig.colorbar(im)
# label axes and title
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')
ax.set_title('Gridded Population of the World in 2020 (Source: CIESIN)')
plt.show()

### Monthly total precipitation from WorldClim

Weather-related data are available from a variety of sources. The first source I used was WorldClim. 

WorldClim has data on monthly total precipitation and maximum temperature available at a 2.5 arcminute resolution (around 0.04 degrees) globally for every month from 1985-2018. The files are provided as TIFs.

Let's look at precipitation in January 2018.

In [None]:
rainpath=path+"/Spatial/WorldClim/Precipitation/wc2.1_2.5m_prec_2010-2018/"
rain_01=rasterio.open(rainpath+"wc2.1_2.5m_prec_2018-01.tif")

In [None]:
nodes = [0, 0.5, 1]  # positions for each color from 0-1
color_scheme = ['gray', 'lightblue', 'blue']  # corresponds to nodes
custom_cmap = LinearSegmentedColormap.from_list(
    'WhiteYellowRed', list(zip(nodes, color_scheme)))
custom_cmap.set_under('white')  # set values under vmin to white
custom_cmap.set_over('purple')  # set values over vmax to purple

# plotting starts
fig, ax = plt.subplots(figsize=(10, 4))
im = ax.imshow(rain_01.read(1),
               cmap=custom_cmap,
               extent=(-180, 180, -90, 90),
               vmin=0, vmax=250)
fig.colorbar(im)
# label axes and title
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')
ax.set_title('Total Precipitation in January 2018 (Source: WorldClim)')
plt.show()

### Land cover from CIESIN

CIESIN has a global agricultural lands dataset that shows the share of land that is crop or pasture land in 2000, at the 0.0833 degree level. This dataset has no temporal variation, though there are several other land classification datasets with information on changes over time.

In [None]:
cropland=rasterio.open(path+"Spatial/Land Cover/CIESIN 2000/gl-croplands-geotif/cropland.tif")
pasture=rasterio.open(path+"Spatial/Land Cover/CIESIN 2000/gl-pastures-geotif/pasture.tif")

In [None]:
nodes = [0, 0.5, 1]  # positions for each color from 0-1
color_scheme = ['gray', 'gold', 'green']  # corresponds to nodes
custom_cmap = LinearSegmentedColormap.from_list(
    'WhiteYellowRed', list(zip(nodes, color_scheme)))
custom_cmap.set_under('white')  # set values under vmin to white
custom_cmap.set_over('darkgreen')  # set values over vmax to dark green

# plotting starts
fig, ax = plt.subplots(figsize=(10, 4))
im = ax.imshow(cropland.read(1),
               cmap=custom_cmap,
               extent=(-180, 180, -90, 90),
               vmin=0, vmax=1)
fig.colorbar(im)
# label axes and title
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')
ax.set_title('Share of cropland in pixel, 2000 (Source: CIESIN)')
plt.show()

### Global Administrative Boundaries

There are a variety of datasets on administrative boundaries around the world. The GADM dataset we've used before is probably the most useful for subnational boundaries. The national boundaries loaded below cover the whole world. Let's plot national boundaries in Africa over cropland shares.

In [None]:
world=gpd.read_file(path+"/Country boundaries/Country raw/" + 
                    "UIA_World_Countries_Boundaries/World_Countries__Generalized_.shp")

In [None]:
# plotting starts
fig, ax = plt.subplots(figsize=(10, 4))
im = ax.imshow(cropland.read(1),
               cmap=custom_cmap,
               extent=(-180, 180, -90, 90),
               vmin=0, vmax=1)
fig.colorbar(im)
world.plot(ax=ax,color='none', edgecolor='k', alpha=0.3)
# label axes and title
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')
ax.set_title('Share of cropland in pixel, Africa 2000 (Source: CIESIN)')
ax.set_xlim([-25,60])
ax.set_ylim([-40,40])
plt.show()
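
# Setting axis limits zooms the plot but still draws the full global array.
# An alternative is to slice out only the rows and columns inside the bounding
# box. The sketch below assumes a global grid whose top-left corner sits at
# longitude -180 and latitude 90 with square cells; the function name
# `bbox_to_window` is hypothetical, and 0.25 degrees is used as the cell size
# for illustration.

```python
def bbox_to_window(lon_min, lon_max, lat_min, lat_max, cell_deg):
    """Row/column slices for a bounding box on a global grid.

    Assumes the grid's top-left corner is (-180, 90) and cells are
    cell_deg degrees on each side.
    """
    col_start = int((lon_min + 180) / cell_deg)
    col_stop = int((lon_max + 180) / cell_deg)
    row_start = int((90 - lat_max) / cell_deg)   # rows count down from the north
    row_stop = int((90 - lat_min) / cell_deg)
    return slice(row_start, row_stop), slice(col_start, col_stop)

# Same bounding box as the Africa plot
rows, cols = bbox_to_window(-25, 60, -40, 40, cell_deg=0.25)
print(rows, cols)
```

# The resulting slices could be applied to an array read from a raster, with
# the plot's `extent` set to the bounding box rather than the full globe.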

### GAEZ

The Global Agro-Ecological Zones data portal has a huge amount of spatial data on land characteristics.

Let's look at a map of the share of agricultural land that is irrigated in 2000.

In [None]:
irr=rasterio.open(path+"/Spatial/GAEZ/LR/wat/faoirr00.tif")

In [None]:
irr.read(1).max()

In [None]:
nodes = [0, 0.5, 1]  # positions for each color from 0-1
color_scheme = ['lightblue', 'blue', 'darkblue']  # corresponds to nodes
custom_cmap = LinearSegmentedColormap.from_list(
    'WhiteYellowRed', list(zip(nodes, color_scheme)))
custom_cmap.set_under('white')  # set values under vmin to white
custom_cmap.set_over('darkblue')  # set values over vmax to dark blue

# plotting starts
fig, ax = plt.subplots(figsize=(10, 4))
im = ax.imshow(irr.read(1),
               cmap=custom_cmap,
               extent=(-180, 180, -90, 90),
               vmin=1, vmax=50)
fig.colorbar(im)
world.plot(ax=ax,color='none', edgecolor='k', alpha=0.3)
# label axes and title
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')
ax.set_xlim([-25,60])
ax.set_ylim([-40,40])
ax.set_title('Irrigated lands in Africa in 2000 (Source: GAEZ)')
plt.show()

Now France! See this [map of irrigation in the EU](https://www.researchgate.net/figure/Global-map-of-irrigation-areas-irrigation-intensity-in-the-EU-as-area-equipped-for_fig7_46488610) for comparison.

In [None]:
fig, ax = plt.subplots(figsize=(10, 4))
im = ax.imshow(irr.read(1),
               cmap=custom_cmap,
               extent=(-180, 180, -90, 90),
               vmin=1, vmax=50)
fig.colorbar(im)
world.plot(ax=ax,color='none', edgecolor='k', alpha=0.3)
# label axes and title
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')
ax.set_xlim([-10,10])
ax.set_ylim([40,55])
ax.set_title('Irrigated lands in France in 2000 (Source: GAEZ)')
plt.show()

# 4. Comparing Data Sources

There are often many ways to measure the same thing, and many different data sources available.

The [paper](https://arxiv.org/abs/2409.07506) by Josephson et al. (2024) shows that the choice of data source can affect economic analyses. They combine remotely sensed data on weather and measures of smallholder agricultural productivity in the LSMS-ISA and show that the results are not robust to the choice of weather data, with differences in both magnitude and sign!

Many [other](https://www.aeaweb.org/articles?id=10.1257/pandp.20191064) [papers](https://www.journals.uchicago.edu/doi/full/10.1093/reep/rez023) have reported on issues with using remotely sensed data in economic analyses. 
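
A simple first check when two products claim to measure the same thing is to compare them cell by cell. The sketch below does this on purely synthetic data: both "products" are noisy versions of a made-up rainfall field, so the numbers illustrate the diagnostics only and say nothing about any real dataset. All variable names and noise parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
truth = rng.gamma(shape=2.0, scale=50.0, size=1000)             # synthetic "true" rainfall (mm)
product_a = truth + rng.normal(0, 10, size=truth.size)          # low-noise product
product_b = 0.8 * truth + rng.normal(0, 40, size=truth.size)    # biased, noisier product

# How closely do the two products track each other?
corr = np.corrcoef(product_a, product_b)[0, 1]
mean_abs_diff = np.mean(np.abs(product_a - product_b))

# Share of cells where the products disagree on whether rainfall is above or below average
anom_a = product_a - product_a.mean()
anom_b = product_b - product_b.mean()
sign_disagree = np.mean(np.sign(anom_a) != np.sign(anom_b))

print(f"correlation: {corr:.2f}")
print(f"mean absolute difference: {mean_abs_diff:.1f} mm")
print(f"share of cells with opposite anomaly sign: {sign_disagree:.1%}")
```

Even highly correlated products can disagree on the sign of an anomaly in a meaningful share of cells, which is one channel through which the choice of data source can change regression results.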

Let's go through an ongoing project I am working on about mapping flood exposure.

GO TO FLOOD MAPPING SLIDES.