<link rel="stylesheet" href="https://use.typekit.net/dvn1law.css">
<style>        
@font-face {
font-family:"futura-pt-bold";
src:url("https://use.typekit.net/af/053fc9/00000000000000003b9af1e4/27/l?primer=7cdcb44be4a7db8877ffa5c0007b8dd865b3bbc383831fe2ea177f62257a9191&fvd=n7&v=3") format("woff2"),url("https://use.typekit.net/af/053fc9/00000000000000003b9af1e4/27/d?primer=7cdcb44be4a7db8877ffa5c0007b8dd865b3bbc383831fe2ea177f62257a9191&fvd=n7&v=3") format("woff"),url("https://use.typekit.net/af/053fc9/00000000000000003b9af1e4/27/a?primer=7cdcb44be4a7db8877ffa5c0007b8dd865b3bbc383831fe2ea177f62257a9191&fvd=n7&v=3") format("opentype");
font-display:auto;font-style:normal;font-weight:700;font-stretch:normal;
}
</style>
<div style="display: flex; margin: 0px; padding-top: 1.5rem; padding-bottom: 1.5rem; font-family: futura-pt, 'Tahoma', 'Segoe UI', Geneva, Verdana, sans-serif;">
    <span style="margin-right: 15px; padding-right: 2rem; background-color: #3b6d48;"></span>
    <div style="margin-bottom: auto; margin-top: auto; margin-right: auto; padding-right: 15px;">
        <div style="margin: 0; padding-top: 0.2rem; padding-bottom: 3.3rem; letter-spacing: 0.15rem; color: #a6ce37; font-weight: bold; font-size: 3rem; font-family: futura-pt-bold, futura-pt, 'Tahoma', 'Segoe UI', Geneva, Verdana, sans-serif;">CEOS Analytics Lab</div>
        <div style="margin: 0; color: #469ab9; font-weight: bold; font-size: 1.5rem;">Welcome to the CEOS Analytics Lab!</div>
        <div style="margin: 0; padding-bottom: 0.2rem; color: #474c38; font-size: 1.25rem;"><span>Tutorial </span><span>| </span><span style="color: #3b6d48; font-weight: bold;">Cleaning Data</span></div>
        <hr style="border: 1px solid #474c38;">
    </div>
    <div style="margin-top: auto; margin-bottom: auto; margin-left: auto; padding-left: 15px;">
        <div><img style="vertical-align: middle; padding: 0.5rem; width: 300px; height: auto;" src="https://ceos.org/document_management/Communications/CEOS-Logos/CEOS_logo_colour_no_text-small.png" /></div>
    </div>
</div>

# Introduction

In this tutorial we will clean Landsat 8 data and display a median mosaic of the cleaned dataset. 

# Tutorial

## Import Dependencies and Configure EASI

Begin by initializing CAL. Initializing CAL provides access to a large library of convenience utilities that greatly simplify data analysis. After running this cell, the output will show "Successfully found configuration for deployment 'eail'".

In [None]:
import sys, os
sys.path.append(os.path.expanduser('~/cal-notebooks/scripts'))
os.environ['USE_PYGEOS'] = '0'

### EASI tools
from easi_tools import EasiDefaults
from easi_tools import notebook_utils
easi = EasiDefaults() # Get the default parameters for this system

## Select Area of Interest

Next, define the area of interest and display it on a map. Showing a map is a good first step in any notebook as it helps both us (the writers) and the readers visualize the area of interest. Notice the use of CAL's default area of interest and the convenience utility for displaying the map.

In [None]:
latitude = easi.latitude
longitude = easi.longitude

from dea_tools.plotting import display_map
display_map(longitude, latitude)

## Load Data

Now load the data. This may take some time. When the data has finished loading, you will see a summary of the dataset.

In [None]:
from datacube.utils.aws import configure_s3_access
configure_s3_access(aws_unsigned=False, requester_pays=True)

import datacube
dc = datacube.Datacube()
landsat_dataset = dc.load(latitude = latitude,
                          longitude = longitude,
                          time = ('2021-01-01', '2021-01-31'),
                          product = 'landsat8_c2l2_sr',
                          measurements = ['red', 'green', 'blue', 'nir', 'swir1', 'swir2', 'pixel_qa'],
                          output_crs = 'EPSG:6933',
                          resolution = (-30,30),
                         ) 
landsat_dataset

## Clean the Data

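Next, clean the data. Here we use another convenience utility, ODC's `masking.make_mask`, to build a mask of clear pixels from the `pixel_qa` band, and then keep only the pixels that the mask marks as clear. Data must be cleaned prior to performing any analysis so that invalid or unneeded data, such as cloud-covered pixels, doesn't impact the results.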

In [None]:
from datacube.utils import masking
clean_mask = masking.make_mask(landsat_dataset['pixel_qa'], clear='clear')
cleaned_landsat_dataset = landsat_dataset.where(clean_mask)
cleaned_landsat_dataset
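
To illustrate what the mask does, here is a minimal pure-Python sketch. It assumes the standard Landsat Collection 2 QA_PIXEL layout, in which bit 6 is the "clear" flag. The helper names `is_clear` and `mask_invalid` and the sample values are hypothetical and are not part of ODC or CAL; `make_mask` and `where` do roughly the same thing on whole arrays at once.

```python
import math

CLEAR_BIT = 6  # Assumption: Landsat Collection 2 QA_PIXEL, bit 6 = "clear"


def is_clear(qa_value):
    """Return True when the clear bit is set in a QA_PIXEL value."""
    return ((qa_value >> CLEAR_BIT) & 1) == 1


def mask_invalid(values, qa_values):
    """Keep values whose QA flag is clear; replace the rest with NaN."""
    return [v if is_clear(q) else math.nan for v, q in zip(values, qa_values)]


# Hypothetical samples: clear land, clear water, cloud, fill
red = [8000, 9000, 30000, 0]
pixel_qa = [21824, 21952, 22280, 1]

cleaned = mask_invalid(red, pixel_qa)
print(cleaned)
```

Only the first two values survive; the cloud and fill pixels become NaN, which is why later steps need to ignore NaN values.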

## Create Mosaic

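After cleaning the data, create a median mosaic: for each pixel, take the median of its clear observations across the time range. The mosaic is then converted from scaled integers to surface reflectance using the Landsat Collection 2 scale factor and offset.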

In [None]:
from odc.algo import to_f32
from ceos_utils.data_cube_utilities.dc_mosaic import create_median_mosaic
# Landsat Collection 2 surface reflectance = DN * 0.0000275 - 0.2 (the offset is negative)
landsat_composite = to_f32(create_median_mosaic(cleaned_landsat_dataset, clean_mask), scale=0.0000275, offset=-0.2)
landsat_composite
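
The idea behind the median mosaic and the reflectance conversion can be sketched in plain Python. This is an illustration under stated assumptions, not the implementation of `create_median_mosaic` or `to_f32`: the helpers `nanmedian`, `median_mosaic`, and `to_reflectance` are hypothetical, and the scale (0.0000275) and offset (-0.2) are the published USGS values for Landsat Collection 2 surface reflectance.

```python
import math
import statistics


def nanmedian(values):
    """Median of the non-NaN values, or NaN if none are valid."""
    valid = [v for v in values if not math.isnan(v)]
    return statistics.median(valid) if valid else math.nan


def median_mosaic(series_per_pixel):
    """Collapse each pixel's time series to a single median value."""
    return [nanmedian(series) for series in series_per_pixel]


def to_reflectance(dn, scale=0.0000275, offset=-0.2):
    """Convert a scaled digital number to surface reflectance."""
    return dn * scale + offset


nan = math.nan
# Hypothetical cleaned values: one list per pixel, one entry per date
pixels = [
    [8000.0, nan, 10000.0],  # two clear observations
    [nan, nan, 12000.0],     # one clear observation
    [nan, nan, nan],         # never clear, so it stays NaN
]

composite = median_mosaic(pixels)
reflectance = [to_reflectance(v) for v in composite]
print(composite)
print(reflectance)
```

A pixel that was never clear during the time range stays NaN in the mosaic, so a longer time range generally leaves fewer gaps.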

## Display Results

Finally, display the mosaic. We have chosen a Landsat 8 band combination commonly used to visualize urban areas.

In [None]:
from dea_tools.plotting import rgb
rgb(landsat_composite, ['swir2', 'swir1', 'red'])
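
A false-colour image simply assigns three chosen bands to the red, green, and blue display channels. The sketch below shows that mapping for a single pixel, with a linear stretch to the 0 to 1 display range. The helpers `stretch` and `false_colour` are hypothetical, and the 0 to 0.4 reflectance range is an arbitrary assumption; `rgb` chooses its own display range.

```python
def stretch(value, low, high):
    """Linearly rescale value from [low, high] to [0, 1], clipping outliers."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(1.0, max(0.0, (value - low) / (high - low)))


def false_colour(pixel, bands=("swir2", "swir1", "red"), low=0.0, high=0.4):
    """Map three bands of one pixel to a display (R, G, B) triplet."""
    return tuple(stretch(pixel[band], low, high) for band in bands)


# Hypothetical reflectance values for one pixel
pixel = {"swir2": 0.1, "swir1": 0.2, "red": 0.6, "green": 0.05}

rgb_triplet = false_colour(pixel)
print(rgb_triplet)
```

Here `swir2` drives the red channel, `swir1` the green channel, and `red` the blue channel; the `red` value exceeds the assumed range and is clipped to full intensity.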

## Conclusion

You have successfully cleaned Landsat 8 data and used it to display a median mosaic.

# Learn More

For more information, see the following:

- The CAL Environment
- CAL Utilities
- EASI
- Loading data using ODC
- Data Cleaning Explained
- Pixel Compositors
- Landsat 8 Band Combinations
