# Extracting Raster Data

## Overview

This example demonstrates how to use the [XEE](https://github.com/google/Xee) package to extract a raster directly from Google Earth Engine (GEE) and save it as a GeoTIFF file. This approach skips the Earth Engine export step and uses `rioxarray` to write the extracted Xarray Dataset to a GeoTIFF file.


## Setup and Data Download

The following blocks of code install the required packages, import the libraries, and create an output folder in your Colab environment.

In [1]:
%%capture
if 'google.colab' in str(get_ipython()):
    !pip install --upgrade xee
    !pip install rioxarray


In [2]:
import ee
import xarray
import rioxarray as rxr
import matplotlib.pyplot as plt
import pandas as pd
import os
import datetime
import numpy as np

In [3]:
output_folder = 'output'

if not os.path.exists(output_folder):
    os.mkdir(output_folder)
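
The folder check above can also be written as a single call. The sketch below is one possible variant, not part of the original notebook: it uses `os.makedirs` with `exist_ok=True`, which does nothing if the folder already exists, so the cell can be re-run safely. The temporary `base_dir` is a hypothetical location used only so the example does not touch your working directory.

```python
import os
import tempfile

# Hypothetical scratch location, used only for this illustration.
base_dir = tempfile.mkdtemp()
output_folder = os.path.join(base_dir, 'output')

# exist_ok=True makes the call safe to repeat, e.g. when re-running the cell.
os.makedirs(output_folder, exist_ok=True)
os.makedirs(output_folder, exist_ok=True)  # second call is a no-op

folder_exists = os.path.isdir(output_folder)
print(folder_exists)
```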

Initialize Earth Engine with the [high-volume endpoint](https://developers.google.com/earth-engine/guides/processing_environments#high-volume_endpoint), which is recommended for use with XEE.

Replace the value of `cloud_project` with your own project ID from the [Google Cloud Console](https://console.cloud.google.com/).

In [4]:
cloud_project = 'spatialthoughts'

try:
    ee.Initialize(project=cloud_project, opt_url='https://earthengine-highvolume.googleapis.com')
except Exception:
    # Initialization failed (for example, no stored credentials): authenticate and retry
    ee.Authenticate()
    ee.Initialize(project=cloud_project, opt_url='https://earthengine-highvolume.googleapis.com')

## Procedure

Here we will use the [LandScan Population Data Global 1km](https://developers.google.com/earth-engine/datasets/catalog/projects_sat-io_open-datasets_ORNL_LANDSCAN_GLOBAL) dataset from the Awesome GEE Community Catalog and extract a population raster for 2023. For the country boundary, we will use the [LSIB 2017: Large Scale International Boundary Polygons](https://developers.google.com/earth-engine/datasets/catalog/USDOS_LSIB_SIMPLE_2017) dataset.

Select a country and extract the geometry.

In [9]:
country = 'Kenya'

lsib = ee.FeatureCollection('USDOS/LSIB_SIMPLE/2017')
country_boundary = lsib.filter(ee.Filter.eq('country_na', country))
geometry = country_boundary.geometry()

Select the population raster from LandScan.

In [16]:
year = 2023
start_date = ee.Date.fromYMD(year, 1, 1)
end_date = ee.Date.fromYMD(year+1, 1, 1)
landscan = ee.ImageCollection('projects/sat-io/open-datasets/ORNL/LANDSCAN_GLOBAL')
filtered = landscan.filter(ee.Filter.date(start_date, end_date))
image = filtered.first()
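
The date filter above uses January 1 of the following year as its end date. `ee.Filter.date` treats the start as inclusive and the end as exclusive, so this range covers exactly one calendar year. The plain-Python sketch below mimics that half-open interval logic without calling Earth Engine. The helpers `year_range` and `in_range` are hypothetical names introduced only for illustration.

```python
import datetime


def year_range(year):
    """Return (start, end) for a half-open interval covering one calendar year."""
    start = datetime.date(year, 1, 1)
    end = datetime.date(year + 1, 1, 1)
    return start, end


def in_range(day, start, end):
    """Check membership with an inclusive start and an exclusive end."""
    return start <= day < end


start, end = year_range(2023)

# First and last day of 2023 fall inside the range; January 1, 2024 does not.
checks = {
    '2023-01-01': in_range(datetime.date(2023, 1, 1), start, end),
    '2023-12-31': in_range(datetime.date(2023, 12, 31), start, end),
    '2024-01-01': in_range(datetime.date(2024, 1, 1), start, end),
}
print(checks)
```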

We now clip the image to prepare it for download. XEE works only with ImageCollections so we convert the clipped image to an ImageCollection.

In [19]:
clipped = image.clip(geometry)
clipped_col = ee.ImageCollection([clipped])

We now have an ImageCollection that we want to read as an Xarray Dataset. We define the region of interest and open the ImageCollection using the `ee` engine. XEE needs `scale` to be in the same units as the CRS. Since EPSG:4326 uses degrees, we use `0.0083333333333333` (30 arc-seconds, i.e., approximately 1 km at the equator).

In [20]:
ds = xarray.open_dataset(
    clipped_col,
    engine='ee',
    crs='EPSG:4326',
    scale=0.0083333333333333,
    geometry=geometry,
)
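
To see where the `scale` value comes from, the sketch below converts 30 arc-seconds to degrees and estimates how many pixels a bounding box would span at that resolution. It is a rough illustration only: the kilometres-per-degree constant is an approximation valid at the equator, the bounding box is a hypothetical one loosely covering Kenya, and `grid_size` is a helper invented here. The grid that XEE actually returns may differ by a pixel or more depending on how it aligns the region.

```python
import math

ARC_SECONDS = 30                  # assumed nominal pixel size for a ~1 km grid
scale_deg = ARC_SECONDS / 3600    # degrees per pixel, matches the scale used above

# Approximate length of one degree at the equator, in km (assumption).
KM_PER_DEGREE_EQUATOR = 111.32
pixel_km_equator = scale_deg * KM_PER_DEGREE_EQUATOR


def grid_size(xmin, ymin, xmax, ymax, scale):
    """Estimate (width, height) in pixels for a bounding box at a given scale."""
    width = math.ceil((xmax - xmin) / scale)
    height = math.ceil((ymax - ymin) / scale)
    return width, height


# Hypothetical bounding box (lon/lat degrees), for illustration only.
bbox = (33.9, -4.7, 41.9, 5.5)
width, height = grid_size(*bbox, scale_deg)

print(f'scale: {scale_deg:.10f} degrees')
print(f'pixel size at the equator: about {pixel_km_equator:.2f} km')
print(f'estimated grid: {width} x {height} pixels')
```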

In [21]:
ds

The Dataset has only 1 time coordinate (the chosen year) and 1 variable (the band b1). Select it to get a DataArray.

In [22]:
da = ds.isel(time=0).b1
da

We can now save the result as a GeoTIFF file using `rioxarray`. The `COG` driver writes a Cloud Optimized GeoTIFF.

In [30]:
# Rename and reorder the dimensions to the (y, x) layout that rioxarray expects, then set the CRS
da_export = da \
  .rename({'lat': 'y', 'lon': 'x'}) \
  .transpose('y', 'x') \
  .rio.write_crs('EPSG:4326')

output_file = 'population.tif'
output_path = os.path.join(output_folder, output_file)
da_export.rio.to_raster(output_path, driver='COG')