<a target="_blank" href="https://colab.research.google.com/github/trchudley/GEOG2462/blob/main/Week_1_Search_and_Download/1_Download_Single_Scene.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Downloading a single scene from Google Earth Engine

## Logging in to Google Earth Engine

Now that we understand the basics of using Python and Jupyter Notebooks, we will search for and download a satellite scene from Google Earth Engine (GEE).

First, we will import the modules `ee` and `geemap`. Both are GEE-related Python modules that provide functions for interacting with GEE's data and processing tools:

In [1]:
import ee
import geemap

Next, we must 'log in' to Google Earth Engine. You must have already registered for Google Earth Engine for this to work. You should have done this based on the instructions I sent before the class started (!), but if you haven't, you can easily register at the [following link](https://code.earthengine.google.com/register) - just make sure to select `Unpaid Usage` > `Academia & Research`.

Having done that, you can run the next cell, changing the project name to your own default 'project ID' that you would have created upon registration with GEE. If you don't know what it is, go to [code.earthengine.google.com/](https://code.earthengine.google.com) - your default project ID should appear in the top right hand corner.

After running the cell, a pop-up will appear and you will be prompted to go through some approval screens - you can accept these. On certain setups, you may also be provided with a verification code (a long string of letters and numbers), which you can paste into the box that will appear below the cell.

In [2]:
ee.Authenticate()  # Trigger the authentication flow.
ee.Initialize(project='ee-trchudley')    # Change to your own default project name.

Unfortunately, we must do this every time we open a notebook, so get used to this process...

## Define an area of interest

We will now define a square region of the Earth to examine (hereafter our 'area of interest', or 'AOI'). Let's take a look at Durham!

In the following cell, we will define variables with a central latitude and longitude, and a length (the length of one side of our AOI, in metres). I've inserted the coordinates for Durham - however, if you're feeling adventurous, there's nothing stopping you changing your search region to your own region of interest now...

In [3]:
latitude = 54.77   # Degrees of latitude
longitude = -1.58  # Degrees of longitude
size = 10000  # Size of AOI, in metres

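Now, we can define this as an Earth Engine 'geometry' (the format in which GEE holds geospatial shapes). In the first line, we define a single point. In the second, we 'buffer' the point into a circle with a radius of half our desired size, and then take the bounding box of that circle, which gives us a square of the size we want.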

In [4]:
point = ee.Geometry.Point(longitude, latitude)  # Create a point
region = point.buffer(size/2).bounds()  # Buffer the point to a 2D shape
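
To get a feel for what the buffer-then-bounds step above produces, here is a rough sketch that estimates the corner coordinates of the same square using plain Python. It assumes a perfectly spherical Earth, and the function name `approximate_square_bounds` is made up for illustration. GEE does its own geodesic calculation, so its coordinates will differ slightly from these.

```python
import math

EARTH_RADIUS_M = 6371000  # Mean Earth radius in metres (spherical approximation)


def approximate_square_bounds(latitude, longitude, size_m):
    """Return (west, south, east, north) in degrees for a square AOI."""
    half = size_m / 2  # Distance from the centre to each edge, in metres

    # A metre of northing is the same fraction of a degree everywhere on a sphere
    dlat = math.degrees(half / EARTH_RADIUS_M)

    # Lines of longitude converge towards the poles, so a metre of easting
    # covers more degrees of longitude at higher latitudes
    dlon = math.degrees(half / (EARTH_RADIUS_M * math.cos(math.radians(latitude))))

    return (longitude - dlon, latitude - dlat, longitude + dlon, latitude + dlat)


west, south, east, north = approximate_square_bounds(54.77, -1.58, 10000)
print(f"West {west:.4f}, South {south:.4f}, East {east:.4f}, North {north:.4f}")
```

Notice that the box spans more degrees of longitude than latitude, even though it is square on the ground. This is why we define the AOI size in metres rather than degrees.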

The `geemap` module lets us visualise our data on a map. Let's plot our AOI and see whether we're happy with it:

In [5]:
Map = geemap.Map()  # Create an empty Map
Map.addLayer(region, {}, "Search Region")  # Add our AOI
Map.centerObject(region, zoom=12)  # Centre our map on the region of interest
Map

Map(center=[54.770000194972646, -1.5798674269496404], controls=(WidgetControl(options=['position', 'transparen…

Looking good!

## Search for data

We're now going to search for an image in GEE's archive of Landsat 8 data.

We've already defined the spatial area of interest - now let's also define our temporal bounds by setting a start and end date for our search.

In [6]:
date_start = '2023-05-01'
date_end = '2023-09-30'

Note we've defined the dates as strings of the format 'YYYY-MM-DD'. Scientists and programmers working with large quantities of date data often prefer to work in this format, the international standard format (ISO 8601). This is because (i) it avoids confusion when working with international colleagues who may have different default ways of writing dates (e.g. DD/MM/YYYY vs MM/DD/YYYY); and (ii) alphabetically sorting dates of this format will also sort them by time, which is useful if you have large numbers of files with dates included.
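
The sorting point is easy to check for yourself. The short sketch below uses a handful of made-up dates, written both in ISO 8601 and in DD/MM/YYYY, and sorts each list alphabetically:

```python
from datetime import date

# The same four (made-up) dates, written in two different formats
iso_dates = ['2023-09-04', '2023-05-17', '2022-12-31', '2023-05-02']
uk_dates = ['04/09/2023', '17/05/2023', '31/12/2022', '02/05/2023']

# sorted() on strings sorts alphabetically, character by character
iso_sorted = sorted(iso_dates)
uk_sorted = sorted(uk_dates)

# For comparison, sort the ISO strings by the actual dates they represent
chronological = sorted(iso_dates, key=date.fromisoformat)

print(iso_sorted)
print(uk_sorted)
print(iso_sorted == chronological)
```

The ISO list comes out in time order, while the DD/MM/YYYY list is ordered by day of the month, which puts the December 2022 date last.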

Now we can search GEE. Google Earth Engine has a number of different Landsat datasets, but for now, we are going to choose [Landsat 8, Collection 2, Tier 1 Top of Atmosphere Reflectance](https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_TOA) (follow the link for Google's description of the data). Collection 2 is the 'version 2' standard reprocessing of the Landsat archive. Tier 1 is the [highest-quality data](https://www.usgs.gov/landsat-missions/landsat-collection-2-level-1-data), and Top of Atmosphere means that the data has not been corrected to surface reflectance (surface reflectance datasets are suffixed 'SR' rather than 'TOA'). TOA data makes fewer assumptions about the atmosphere, so we will be using this for now.

Take a good look at the lines of code below - see how we select our desired 'Image Collection' using its dataset ID (listed on the catalogue page linked above), and then filter it using our region and dates.

In [7]:
# Get Landsat 8 image collection
landsat8_collection = ee.ImageCollection("LANDSAT/LC08/C02/T1_TOA")

# Filter to desired region and date bounds
landsat8_collection = landsat8_collection.filterBounds(region)
landsat8_collection = landsat8_collection.filterDate(date_start, date_end)
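
One detail worth knowing about the date filter: according to the Earth Engine documentation, the start date passed to `filterDate` is inclusive but the end date is exclusive. The plain-Python sketch below mimics that behaviour on a list of made-up acquisition dates. The function `filter_by_date` is hypothetical and is not part of the `ee` module.

```python
from datetime import date


def filter_by_date(acquisition_dates, start, end):
    """Keep dates where start <= date < end (the end date is excluded)."""
    start_d = date.fromisoformat(start)
    end_d = date.fromisoformat(end)
    return [d for d in acquisition_dates
            if start_d <= date.fromisoformat(d) < end_d]


# Made-up acquisition dates, for illustration only
example_dates = ['2023-04-30', '2023-05-01', '2023-07-15', '2023-09-30']

kept = filter_by_date(example_dates, '2023-05-01', '2023-09-30')
print(kept)
```

In this sketch the scene dated on the end date itself is dropped, so if you want to include the last day of a period, set the end date to the day after.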


This returns an 'Image Collection' of the scenes that fit the search parameters. By running the cell below and clicking the drop-down tabs, you can see we have found 13 'features' (images) that have 17 'bands' (e.g. red, blue, near infra-red...). There's also a lot of metadata 'properties' associated with each image, such as acquisition dates/times and geospatial data. You can explore what is included [here](https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_TOA).

Have an explore of the data structure:

In [8]:
landsat8_collection

Finally, we're going to select the _least cloudy_ scene from the image collection, by sorting the collection from least to most cloudy and selecting the first image. We will also clip this image to our search region, so that we don't download or process unnecessary data:

In [9]:
image = landsat8_collection.sort('CLOUD_COVER').first()
image = image.clip(region)

image
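
If the sort-then-first pattern above feels abstract, the sketch below does the same thing with an ordinary Python list. The scene IDs and cloud cover values are invented for illustration, and only the `CLOUD_COVER` property name comes from the real Landsat metadata.

```python
# Hypothetical scene metadata: the IDs and cloud cover values are made up
scenes = [
    {'id': 'scene_a', 'CLOUD_COVER': 62.1},
    {'id': 'scene_b', 'CLOUD_COVER': 3.4},
    {'id': 'scene_c', 'CLOUD_COVER': 27.8},
]

# Sort from least to most cloudy (ascending order is the default)...
sorted_scenes = sorted(scenes, key=lambda scene: scene['CLOUD_COVER'])

# ...and take the first item, which is therefore the least cloudy scene
least_cloudy = sorted_scenes[0]
print(least_cloudy['id'], least_cloudy['CLOUD_COVER'])
```

Bear in mind that `CLOUD_COVER` is a percentage for the whole Landsat scene, not for our clipped AOI, so it's still worth checking the image by eye.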

Let's visualise this image. This process is similar to our last map, but now we will include visualisation parameters which tell `geemap` that we want to visualise a colour image from bands 4, 3, and 2 (red, green, and blue respectively).

In [10]:

Map = geemap.Map()  # Create empty map

max_reflectance = 0.25 # Set the upper limit of reflectance to visualise.
                       # Play with this value (between 0-1) to see what it
                       # does. It will need to be higher for snowy/icy
                       # scenes.

visParams = {'bands': ['B4', 'B3', 'B2'], 'max': max_reflectance}
Map.addLayer(image, visParams, 'Colour Composite Image')

Map.centerObject(region, zoom=12)
Map

Map(center=[54.770000194972646, -1.5798674269496404], controls=(WidgetControl(options=['position', 'transparen…
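
To see why changing `max_reflectance` makes the image brighter or darker, it helps to think of the visualisation as a linear stretch: reflectance values between the minimum (0 here) and the maximum are scaled onto the 0-255 brightness range of the screen, and anything above the maximum is shown at full brightness. The sketch below is a simplified illustration of that idea in plain Python. The function `stretch_to_byte` is made up, and the exact rounding GEE uses internally is an assumption.

```python
def stretch_to_byte(reflectance, min_value=0.0, max_value=0.25):
    """Linearly scale a reflectance value onto the 0-255 display range."""
    # Rescale so that min_value maps to 0 and max_value maps to 1
    scaled = (reflectance - min_value) / (max_value - min_value)
    # Values outside the chosen range are clamped (saturated)
    clamped = min(max(scaled, 0.0), 1.0)
    return round(clamped * 255)


# A dark surface, a brighter surface, and a very bright (e.g. snowy) surface
for value in [0.05, 0.2, 0.6]:
    print(value, '->', stretch_to_byte(value))

# The same bright surface with a higher upper limit is no longer saturated
print(0.6, '->', stretch_to_byte(0.6, max_value=0.8))
```

This is why snowy or icy scenes need a higher upper limit: with a limit of 0.25, every bright pixel is clamped to the same white value and the detail is lost.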

## Download the image

We can export this image to our Google Drive.

I've set up a script below that will construct a suitable filename. See if you can figure out what's going on. You can set the folder and file description.


In [11]:

# You can edit these variables
folder = 'scires_project_2A'
region_name = 'durham'

# Get the acquisition date of the image from the metadata
date_string = image.get('DATE_ACQUIRED').getInfo()

# Now we will construct the filename automatically
filename = region_name + '_' + date_string + '_image'

# Visualise for testing
print("The image will be saved to your Google Drive at:\n" + folder + '/' + filename + '.tif')

The image will be saved to your Google Drive at:
scires_project_2A/durham_2023-09-04_image.tif


We will now save the image using the `Export.image.toDrive` function. We will also use a function from the `time` package so that we can print updates every five seconds, just so we know the program is still running / hasn't crashed if it's taking a while.

In [12]:
import time

# Export the image, specifying scale and region.
task = ee.batch.Export.image.toDrive(**{
    'image': image.select(['B4', 'B3', 'B2', 'B5', 'B6']),
    'description': filename,
    'folder': folder,
    'scale': 30,
    'region': region.getInfo()['coordinates']
})
task.start()

while task.active():
  print('Task processing ongoing... (id: {}).'.format(task.id))
  time.sleep(5)

print('Finished processing. Image is exported to your Drive.')

Task processing ongoing... (id: Q5KXCABJ7QJ4AJHYLJSJ27LS).
Task processing ongoing... (id: Q5KXCABJ7QJ4AJHYLJSJ27LS).
Task processing ongoing... (id: Q5KXCABJ7QJ4AJHYLJSJ27LS).
Task processing ongoing... (id: Q5KXCABJ7QJ4AJHYLJSJ27LS).
Finished processing. Image is exported to your Drive.
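
One limitation of the simple loop above is that it prints the 'Finished' message whenever the task stops, even if the export failed. Below is a hedged sketch of a slightly more careful version that gives up after a timeout and reports the final state. It assumes the task object has `active()` and `status()` methods, with `status()` returning a dictionary containing a `'state'` entry, as Earth Engine tasks do. The `wait_for_task` function and the `FakeTask` class are made up for illustration, so that the example runs without contacting GEE.

```python
import time


def wait_for_task(task, poll_seconds=5, timeout_seconds=600):
    """Poll a task until it stops, then return its final state."""
    waited = 0
    while task.active():
        if waited >= timeout_seconds:
            raise TimeoutError(
                'Task still running after {} seconds.'.format(timeout_seconds))
        print('Task processing ongoing...')
        time.sleep(poll_seconds)
        waited += poll_seconds
    return task.status()['state']


class FakeTask:
    """Stand-in for an Earth Engine task, for demonstration only."""

    def __init__(self, polls_until_done, final_state):
        self.polls_left = polls_until_done
        self.final_state = final_state

    def active(self):
        # Report 'still running' for the requested number of polls
        self.polls_left -= 1
        return self.polls_left >= 0

    def status(self):
        return {'state': self.final_state}


# Simulate a task that runs for two polls and then fails
state = wait_for_task(FakeTask(2, 'FAILED'), poll_seconds=0)
print('Final state:', state)
```

With a real export, you could call `wait_for_task(task)` after `task.start()` and only go looking in your Drive if the returned state is `'COMPLETED'`.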


Download the scene from your Google Drive to your local hard drive (in a sensible location, not just in the `Downloads` folder!), and move on to the next document - `2_Visualise_in_QGIS.md`.

![Image of the file successfully exported to Google Drive](https://github.com/trchudley/GEOG2462/blob/main/_images/1_googledrive.png?raw=true)