Skip to content

Commit

Permalink
Merge pull request #61 from digitalearthafrica/external_dataset
Browse files Browse the repository at this point in the history
Import external datasets tutorial
  • Loading branch information
eefaye committed Dec 14, 2020
2 parents 4a06110 + bcffc1d commit 1eed78b
Show file tree
Hide file tree
Showing 6 changed files with 240 additions and 0 deletions.
239 changes: 239 additions & 0 deletions docs/External_dataset.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,239 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Import external datasets\n",
"\n",
"The Digital Earth Africa Sandbox allows users to add external data such as shapefiles and .geojson files to their algorithms.\n",
"\n",
"This tutorial will take you through:\n",
"\n",
"1. The packages to import\n",
"2. Setting the path for the vector file\n",
"3. Loading the external dataset\n",
"4. Displaying the dataset on a basemap\n",
"5. Loading the satellite imagery by using the extent of the external dataset\n",
"6. Mask the area of interest from the satellite imagery using the extenal dataset\n",
"\n",
"For this tutorial, the example external dataset is in a shapefile format."
]
},
{
"cell_type": "raw",
"metadata": {
"raw_mimetype": "text/restructuredtext"
},
"source": [
".. note::\n",
" Before you proceed, ensure you have completed all lessons in the `DE Africa Six-Week Training Course <session_1/01_what_is_digital_earth_africa.ipynb>`_."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Set up notebook\n",
"\n",
"In your **Training** folder, create a new Python 3 notebook. Name it `external_dataset.ipynb`. For more instructions on creating a new notebook, see the [instructions from Session 2](./session_2/04_load_data_exercise.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load packages and functions\n",
"\n",
"In the first cell, type the following code and then run the cell to import necessary Python dependencies.\n",
"\n",
" import sys\n",
" import datacube\n",
" import numpy as np\n",
" import pandas as pd\n",
" import geopandas as gpd\n",
" \n",
" from datacube.utils import geometry\n",
"\n",
" sys.path.append('../Scripts')\n",
" from deafrica_datahandling import load_ard, mostcommon_crs\n",
" from deafrica_plotting import map_shapefile, rgb\n",
" from deafrica_spatialtools import xr_rasterize"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Take note of the packages below on how they were imported with other packages above.\n",
"\n",
"These packages are the packages you will need when you want to use external dataset.\n",
"\n",
" import geopandas as gpd\n",
" from datacube.utils import geometry\n",
" from deafrica_plotting import map_shapefile\n",
" from deafrica_spatialtools import xr_rasterize"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Connect to the datacube\n",
"\n",
"Enter the following code and run the cell to create our `dc` object, which provides access to the datacube.\n",
"\n",
" dc = datacube.Datacube(app='import_dataset')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a folder called **data** in the **Training** directory.\n",
"Download this [zip file](_static/external_dataset/reserve.zip) and extract on your local machine.\n",
"Upload the `reserve` shapefile (cpg, dbf, shp, shx) into the **data** folder.\n",
"\n",
"Create a variable called `shapefile_path`,to store the path of the shapefile as shown below.\n",
"\n",
" shapefile_path = \"data/reserve.shp\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Read the shapefile into a GeoDataFrame using the `gpd.read_file` function.\n",
"\n",
" gdf = gpd.read_file(shapefile_path)\n",
"\n",
"Convert all of the shapes into a datacube geometry using `geometry.Geometry`\n",
"\n",
" geom = geometry.Geometry(gdf.unary_union, gdf.crs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use the `map_shapefile` function to display the shapefile on a basemap.\n",
"\n",
" map_shapefile(gdf, attribute=gdf.columns[0], fillOpacity=0, weight=2)\n",
" \n",
"<img align=\"middle\" src=\"_static/external_dataset/1.PNG\" alt=\"The DE Africa\" width=\"400\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create a query object\n",
"\n",
"We will replace `x` and `y` with `geopolygon`, as shown below. We remove the `x`, `y` arguments and replace it with `geopolygon`.\n",
"\n",
" query = {\n",
" 'x' : x,\n",
" 'y' : y,\n",
" 'group_by': 'solar_day',\n",
" 'time' : ('2019-01-15'),\n",
" 'resolution': (-10, 10),\n",
" }\n",
"\n",
"Remove `x`, `y` from `query` and update with `geopolygon`:\n",
"\n",
" query = {\n",
" 'geopolygon' : geom,\n",
" 'group_by': 'solar_day',\n",
" 'time' : ('2019-01-15'),\n",
" 'resolution': (-10, 10),\n",
" }\n",
"\n",
"\n",
"We then identify the most common projection system in the input query, and load the dataset `ds`.\n",
" \n",
" output_crs = mostcommon_crs(dc=dc, product='s2_l2a', query=query)\n",
"\n",
" ds = load_ard(dc=dc,\n",
" products=['s2_l2a], \n",
" output_crs=output_crs,\n",
" measurements=[\"red\",\"green\",\"blue\"],\n",
" **query\n",
" )\n",
"\n",
"Print the `ds` result.\n",
"\n",
" ds"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Ploting of the result\n",
"We will dipslay the returned dataset using the `rgb` functions. \n",
" \n",
" rgb(ds)\n",
" \n",
"<img align=\"middle\" src=\"_static/external_dataset/2.PNG\" alt=\"The DE Africa\" width=\"400\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Rasterise the shapefile\n",
"\n",
"Before we can apply the shapefile data as a mask, we need to convert the shapefile to a raster using the `xr_rasterize` function.\n",
" \n",
" mask = xr_rasterize(gdf, ds)\n",
"\n",
"## Mask the dataset\n",
"\n",
"Mask the dataset using the `ds.where` and `mask` to set pixels outside the polygon to `NaN`.\n",
"\n",
" ds = ds.where(mask)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Plot the masked result of the dataset\n",
"\n",
" rgb(ds)\n",
" \n",
"<img align=\"middle\" src=\"_static/external_dataset/3.PNG\" alt=\"The DE Africa\" width=\"400\">"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"You can apply this method to already exisiting notebooks you are working with. It is useful for selecting specific areas of interest, and for transferring information between the Sandbox and GIS platorms."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
Binary file added docs/_static/external_dataset/1.PNG
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/_static/external_dataset/2.PNG
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/_static/external_dataset/3.PNG
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/_static/external_dataset/reserve.zip
Binary file not shown.
1 change: 1 addition & 0 deletions docs/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -63,4 +63,5 @@ Latest Updates
Geomedian_widget
Maps_help
OWS_tutorial
External_dataset
License

0 comments on commit 1eed78b

Please sign in to comment.