Merge pull request #35 from digitalearthafrica/week-3-in-progress
Session 3 content.
caitlinadams committed Sep 10, 2020
2 parents 84ed410 + 1f3e032 commit 7cc72cf
Showing 40 changed files with 708 additions and 7 deletions.
12 changes: 6 additions & 6 deletions docs/Frequently_asked_questions.ipynb
@@ -25,7 +25,7 @@
"\n",
"Visit the Sandbox homepage at [https://sandbox.digitalearth.africa/hub/login](https://sandbox.digitalearth.africa/hub/login) and click **Login or Sign up**. Click **Forgot your password?** and follow the instructions to reset your password.\n",
"\n",
"<img align=\"middle\" src=\"_static/sandbox-forgot-password.PNG\" alt=\"Forgot your password?\" width=\"300\">\n"
"<img align=\"middle\" src=\"_static/other_information/faq-sandbox-forgot-password.PNG\" alt=\"Forgot your password?\" width=\"300\">\n"
]
},
{
@@ -79,11 +79,11 @@
"1. Go to the Digital Earth Africa training homepage at [https://training.digitalearthafrica.org/en/latest/index.html](https://training.digitalearthafrica.org/en/latest/index.html).\n",
"2. Scroll to the bottom of the page and click **v: latest** to expand the menu.\n",
"\n",
" ![Latest](_static/faq-download-1.PNG){ width=\"400px\" }\n",
" ![Latest](_static/other_information/faq-download-1.PNG){ width=\"400px\" }\n",
"\n",
"3. Select **PDF**. This will generate a PDF of the training website. Click **Save** or **Download** to save a copy to your computer.\n",
"\n",
" ![PDF](_static/faq-download-2.PNG){ width=\"400px\" }\n",
" ![PDF](_static/other_information/faq-download-2.PNG){ width=\"400px\" }\n",
"\n",
"Note: Sandbox access requires an internet connection. You will not be able to complete Sandbox exercises offline.\n",
"\n",
@@ -103,7 +103,7 @@
"source": [
"The Digital Earth Africa Sandbox has a finite amount of processing power allocated to each user. This means if you attempt to load or calculate a lot of data, the kernel may crash. You will see a message like this:\n",
"\n",
"![Kernel error](_static/faq-kernel-restart.PNG){ width=\"300px\" }\n",
"![Kernel error](_static/other_information/faq-kernel-restart.PNG){ width=\"300px\" }\n",
"\n",
"You will not lose any files, but will have to rerun your notebook to continue working. However, if the notebook is still demanding too many resources, it will crash again.\n",
"\n",
@@ -133,9 +133,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
}
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
Binary file added docs/_static/session_3/02_cloud_masking_xrgeo.png
3 changes: 2 additions & 1 deletion docs/index.rst
@@ -29,8 +29,9 @@ Should you require help, see the :doc:`Frequently_asked_questions` and :doc:`Con

session_1/index
session_2/index
session_3/index

Sessions 3 -- 6 will be added shortly.
Sessions 4 -- 6 will be added shortly.

.. toctree::
:maxdepth: 1
269 changes: 269 additions & 0 deletions docs/session_3/01_intro_composites.ipynb
@@ -0,0 +1,269 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Introduction to cloud-free composites"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Overview\n",
"\n",
"This session is about creating representative datasets and images from multiple timesteps. This allows us to remove and replace unwanted data, such as clouds, and also form images that accurately represent the area of interest over a period of time. \n",
"\n",
"This is summarised in this week's video, **Time aggregation of data**. Watch the video to see how to use Earth observation data at different points in time."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Video: Time aggregation of data"
]
},
{
"cell_type": "raw",
"metadata": {
"raw_mimetype": "text/restructuredtext"
},
"source": [
".. only:: html\n",
"\n",
" .. youtube:: -V7X46wHTgM\n",
" :width: 100%"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the video, we saw that we can compensate for missing or cloudy data points by using data from different points in time to fill in the gaps. This is a two-step process:\n",
"\n",
"1. Identify and remove cloudy data &mdash; this is known as 'cloud masking'\n",
"2. Use data from a different time to fill in missing data &mdash; this can be done by calculating geomedians\n",
"\n",
"In this section, we focus on why cloud masking is an important step in preparing your dataset, and introduce the easiest way to do this in the Sandbox. We then briefly explain the significance of geomedians compared to other statistical values.\n",
"\n",
"The two pages following this introduction will involve walkthrough exercises, so you can try performing these steps yourself after reading about them in this section. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Recap: loading and plotting data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the [last session](../session_2/04_Load_data_exercise.ipynb), we plotted RGB images for Dar es Salaam in Tanzania. The image had clouds in both the Landsat 8 and Sentinel-2 versions.\n",
"\n",
"<img align=\"middle\" src=\"../_static/session_3/01_intro_composites_daressalaam_landsat_sentinel2_022018.PNG\" alt=\"dc.load images from previous session.\" width=\"600\">\n",
"\n",
"*Landsat 8 data from 16 February 2018 (left), and Sentinel-2 data from 15 February 2018 (right). Some cloud cover is visible in both images.*\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What if we want to know what is underneath those clouds? If you have data at only one point in time, this is not possible. However, if we have data for the same place at a different time, when the clouds are not present, we can use this data to 'fill' in areas of cloud. \n",
"\n",
"To do that, we must first identify which pixels are clouds. The process of determining and removing cloud data points is known as 'cloud masking'."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## `load_ard()` vs `dc.load()`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The easiest way to apply a cloud mask to your dataset is to load it into the Sandbox using the `load_ard()` function. `ard` stands for 'Analysis-Ready Data' and the `load_ard()` function automatically applies a cloud mask.\n",
"\n",
"We previously loaded data in the Sandbox using `dc.load()`. `dc.load()` is a universal function for loading data from the datacube and it is important to know how to use it. However, it does not have a built-in cloud masking capability. \n",
"\n",
"When we plotted the RGB images of Tanzania with data loaded using `dc.load()`, the clouds were part of the image. This makes it difficult to perform cloud masking, as the dataset does not distinguish between cloudy and clear pixels.\n",
"\n",
"In this exercise, we will instead use the `load_ard()` function to load our data. It takes similar parameters to `dc.load()`, but automatically identifies cloudy pixels and applies a cloud mask to the loaded data."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img align=\"middle\" src=\"../_static/session_3/01_intro_composites_daressalaam_sentinel2_022018_masked.PNG\" alt=\"dc.load vs load_ard.\" width=\"600\">\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"*Sentinel-2 data from 15 February 2018 loaded with* `dc.load()` *(left) and* `load_ard()` *(right). The* `dc.load` *image shows cloud cover, while the white pixels in the* `load_ard` *image are not clouds, but points where data has been removed by the cloud mask.*"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cloud masking algorithm on Sentinel-2 data is more aggressive than its Landsat 8 counterpart. This means it sometimes misinterprets white sand beaches or urban regions as cloud. This can reduce the amount of data available. "
]
},
{
"cell_type": "raw",
"metadata": {
"raw_mimetype": "text/restructuredtext"
},
"source": [
".. note::\n",
" ``load_ard()`` is compatible with both the Sentinel-2 dataset and the Landsat 8 dataset we used in the last session. However, it is not compatible with some other Digital Earth Africa products, such as Water Observations from Space, with which you will need to use ``dc.load()``."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Calculating a composite"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we know how to mask out clouds by using `load_ard()`, we can load multiple timesteps of our cloud-masked data. These need to be combined in a meaningful way to produce a single image. We do this by compositing our data.\n",
"\n",
"Compositing creates one value for each band for each pixel based on the time series data for that pixel.\n",
"\n",
"We will briefly compare median and geomedian composites, and explain why it is more reliable to use geomedians."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Median composites\n",
"\n",
"For each band in the image, median composites set the value of each pixel to the median value for that band. For a given pixel, each band's median is independent of the others. \n",
"\n",
"The benefit of a median composite is that it is very fast to compute, so it can be used to quickly create cloud-free images for an area.\n",
"\n",
"However, medians do not account for the fact that every pixel holds information for multiple bands. It is therefore better to use a statistic that is configured for multi-dimensional data, such as a geomedian."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Geomedian composites\n",
"\n",
"Geomedian &mdash; or 'geometric median' &mdash; composites are multi-band generalisations of median composites. Instead of finding a pixel's median value for each band **individually**, like a median composite does, a geomedian composite finds the median values of the bands for each pixel when considered **together**. \n",
"\n",
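"Because each composite pixel is calculated from all of its bands at once, the relationships between the bands are preserved. This means geomedians represent the data **better** than median composites. "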
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Comparing medians and geomedians"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The difference between medians and geomedians can often be subtle, especially if you are looking at the overall composite image. For example, the RGB images for these median and geomedian composites look almost identical."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img align=\"middle\" src=\"../_static/session_3/01_intro_composites_rgb.png\" alt=\"RGBs of median and geomedian\" width=\"600\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"However, on a pixel-by-pixel basis, it is possible to visualise the difference between median and geomedian.\n",
"\n",
"<img align=\"middle\" src=\"../_static/session_3/01_intro_composites_geomedian_median_line.png\" alt=\"Dataset scatter plot.\" width=\"850\">\n",
"\n",
"Inspect the above plot of reflectance for a single pixel. The values for median and geomedian are **not** the same &mdash; you can see the green and red lines do not quite overlap. Imagine this difference over millions of pixels. The composite results will certainly be affected. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Geomedians take more processing time to calculate than median composites. However, unless you are only doing a quick visualisation, you should use the geomedian method when creating composites. This is because the geomedian value is more scientifically rigorous as it accounts for all the bands in the dataset.\n",
"\n",
"To learn more about geomedians, you can read the following paper: [High-Dimensional Pixel Composites From Earth Observation Time Series](https://ieeexplore.ieee.org/abstract/document/8004469)."
]
},
{
"cell_type": "raw",
"metadata": {
"raw_mimetype": "text/restructuredtext"
},
"source": [
".. note::\n",
" The ``median`` and ``geomedian`` are plotted here as lines to better show their differences -- scatter points are harder to see. The lines are not indicative of any 'trends' over the different bands."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You now know we can perform cloud masking using the `load_ard()` function, and that we should combine different timesteps of data using a geomedian calculation.\n",
"\n",
"The exercise for this session is covered in the next two sections.\n",
"\n",
"1. We will walk through the process of using `load_ard()` to load data with a cloud mask. \n",
"2. We will then use the loaded data to make and plot geomedians.\n",
"\n",
"This technique will be useful in future sessions for conducting analysis on cloud-free images."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.4"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
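
The two composite types described in the notebook above can be illustrated on a single pixel with plain NumPy. This is only a minimal sketch under stated assumptions, not the Sandbox's implementation: the reflectance values, the `is_cloud` flags and the function names `median_composite` and `geomedian_composite` are all hypothetical, and the geomedian is approximated here with Weiszfeld's iteration, which may differ from the method used by Digital Earth Africa.

```python
import numpy as np


def median_composite(stack):
    """Per-band median over time, ignoring masked (NaN) observations.

    `stack` has shape (time, band); each band is treated independently.
    """
    return np.nanmedian(stack, axis=0)


def geomedian_composite(stack, max_iter=200, tol=1e-9):
    """Geometric median over time, approximated with Weiszfeld's iteration.

    All bands of one observation are treated together as a single point.
    Observations with any masked (NaN) band are dropped first.
    Assumes at least one clear observation remains.
    """
    points = stack[~np.isnan(stack).any(axis=1)]
    estimate = points.mean(axis=0)  # starting guess
    for _ in range(max_iter):
        distances = np.linalg.norm(points - estimate, axis=1)
        # Guard against division by zero if the estimate lands on a point.
        weights = 1.0 / np.maximum(distances, 1e-12)
        updated = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(updated - estimate) < tol:
            break
        estimate = updated
    return updated


# Hypothetical reflectance values for one pixel: rows are timesteps,
# columns are bands (red, green, blue). The third row is a cloudy observation.
pixel = np.array([
    [0.10, 0.12, 0.08],
    [0.11, 0.30, 0.09],
    [0.85, 0.88, 0.90],
    [0.30, 0.13, 0.07],
    [0.12, 0.11, 0.25],
])
is_cloud = np.array([False, False, True, False, False])

# Cloud masking: cloudy observations are replaced with NaN (no data).
masked = pixel.copy()
masked[is_cloud] = np.nan

median = median_composite(masked)        # one value per band, bands independent
geomedian = geomedian_composite(masked)  # one value per band, bands together

print("median:   ", median)
print("geomedian:", geomedian)
```

The per-band median picks each band's middle value separately, so the resulting combination of band values need not resemble any real observation. The geomedian instead minimises the total distance to all clear observations in band space, which is why the two results differ for the same pixel.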
