# Clip Tiles to Boundary

This notebook walks through clipping a tile index (DEM, aerial, or point cloud) to an area of interest (AOI) boundary.
- Acquire the Tile Index Layer of interest
    - DEM
    - Aerial
    - Point Cloud
- Bring in your AOI
    - Shapefile
    - GeoJSON
    - REST endpoint
    - Other format
- Spatial Analysis
- Export
    - List to work with locally
    - Download files directly from S3

In [None]:
# # If necessary, install the needed libraries
# %pip install boto3
# %pip install botocore
# %pip install geopandas

## Get Tile Index

Get the tile index GeoPackage from the [GeoPackage links](https://github.com/ianhorn/kyfromabove-on-aws-examples/blob/main/README.md#kyfromabove-geopackage-links) in the README or from the [KyFromAbove Open Data Explorer](https://kyfromabove.s3.us-west-2.amazonaws.com/index.html).  In the explorer, navigate to the appropriate Aerial or Elevation folder.<br>
<br><center>
<img src="https://github.com/ianhorn/kyfromabove-on-aws-examples/blob/main/media/aws_explorer.jpg?raw=true" width="600" height="200">
</center><br>

Once you navigate to the GeoPackage folder, right-click the GeoPackage you need and copy the link.

In [1]:
# import modules
import boto3
from botocore import UNSIGNED
from botocore.client import Config
import geopandas as gpd
# import matplotlib
import os

## Read in the tile index

For GeoPandas GeoDataFrames, I use the prefix **gdf_**.

The only things you need to change are the URL of the tile index layer and your AOI URL or path.

In [2]:
# enter the url for tile_url
# you can copy and paste the link from the explorer above
tile_url = 'https://kyfromabove.s3-us-west-2.amazonaws.com/elevation/DEM/TileGrids/kyfromabove_phase2_5k_dem_grid.gpkg'

In [None]:
# read the tile_url into a geodataframe
gdf_tiles = gpd.read_file(tile_url)
gdf_tiles = gdf_tiles.to_crs(epsg=4326)
gdf_tiles.head(1)

In [None]:
# List the columns; uncomment the last line to plot the tiles (no basemap needed yet)
print(gdf_tiles.columns)
# gdf_tiles.plot()

___
## Add an area of interest

This can be done by reading in a local file or by making a web request.  For this example, I went to the US Forest Service [Data Hub](https://data-usfs.hub.arcgis.com/).

I then opened the [FS National Forest Dataset](https://data-usfs.hub.arcgis.com/datasets/3451bcca1dbc45168ed0b3f54c6098d3_0/explore?location=27.954318%2C-107.853700%2C3.81).

I clicked the "*I want to use this*" tab at the bottom, then selected `View Data Source`.  This opens the REST endpoint for the service.  Scroll to the bottom and click *query*.  

In the *Where* clause, use `FORESTNAME LIKE 'Daniel Boone%'`.  In *Out Fields*, enter `*`.  Switch the *Format* to `GeoJSON`.  Press the `Query (GET)` button.  Copy the resulting URL, which returns the GeoJSON.

If you use Chrome, the [Map-Services-Enhanced](https://github.com/raykendo/Map-Services-Enhanced) extension lets you copy a shortened URL of your query.

*_Note: If you have a geometry file locally (Shapefile, GeoJSON, GeoParquet, etc.), you can substitute its file path for the URL. GeoPandas can read many different formats._
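
The query URL can also be assembled in code instead of copied from the browser. The following is a minimal sketch using only the standard library. It uses the endpoint from this notebook and only the three form fields changed above, and it assumes the service's defaults for every other parameter are acceptable.

```python
from urllib.parse import urlencode

# Endpoint of the Forest Service boundary layer queried in this notebook
base_url = (
    "https://apps.fs.usda.gov/arcx/rest/services/EDW/"
    "EDW_ForestSystemBoundaries_01/MapServer/1/query"
)

# Only the fields changed in the query form; everything else is left
# to the service defaults (an assumption of this sketch)
params = {
    "where": "FORESTNAME LIKE 'Daniel Boone%'",
    "outFields": "*",
    "f": "geojson",
}

# urlencode percent-encodes the quotes and the % wildcard;
# safe="*" keeps the asterisk readable
query_url = f"{base_url}?{urlencode(params, safe='*')}"
print(query_url)
```

The resulting string can be passed to `gpd.read_file` in the same way as a URL copied from the query page.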

In [None]:
# paste in the url from the rest endpoint query
aoi_url = 'https://apps.fs.usda.gov/arcx/rest/services/EDW/EDW_ForestSystemBoundaries_01/MapServer/1/query?where=FORESTNAME+LIKE+%27Daniel+Boone%25%27&timeRelation=esriTimeRelationOverlaps&units=esriSRUnit_Foot&outFields=*&returnExtentOnly=false&sqlFormat=none&featureEncoding=esriDefault&f=geojson'

# use the gdf_ prefix for the GeoDataFrame
gdf_aoi = gpd.read_file(aoi_url)
gdf_aoi

In [None]:
# print columns
gdf_aoi.columns

In [None]:
# clean up the dataframe a little bit
cols_to_keep = ['FORESTNAME', 'geometry']
gdf_aoi = gdf_aoi[cols_to_keep]
gdf_aoi.plot(
    color='green'  # you know, for the forest
)

___

## Create a function to clip the tiles to the AOI

In [None]:
# Create a function to clip the tiles to the AOI
def clip_tiles_to_aoi(gdf_tiles, gdf_aoi):
    """Clip tile geometries to the AOI; tiles on the edge are cut at the boundary."""
    gdf_tiles_clipped = gpd.clip(gdf_tiles, gdf_aoi)
    return gdf_tiles_clipped

#### Clip the tiles

In [None]:
gdf_tiles_clipped = clip_tiles_to_aoi(gdf_tiles, gdf_aoi)

# print the number of tiles
print(f'The AOI contains {gdf_tiles_clipped.shape[0]} clipped tiles.')
gdf_tiles_clipped.head(1)

In [None]:
gdf_tiles_clipped.plot()

___
## Download tiles for AOI

Now that we've clipped the tiles to our area of interest, we want to download the files.  Here are the steps:
1. Set up boto3 for unsigned (anonymous) requests.
2. Choose an output location (local drive, your own AWS bucket, etc.).
3. Define a download function.
4. Download.

### Set up AWS

The bucket name is `kyfromabove`.  We just need to set that variable once.

When we set up the downloads, we need to use the `key` column from the clipped GeoDataFrame.

*Note: Refer to the Help Documentation to set up [File Transfer Configurations](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3.html)*

In [None]:
# set up the AWS Client
s3 = boto3.client('s3', config=Config(signature_version=UNSIGNED))

bucket = 'kyfromabove'

# print the key values 
print(gdf_tiles_clipped['key'].head(3))
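
The outline also mentions exporting a list to work with locally. One way to do that, sketched here with only the standard library, is to turn each key into a public HTTPS URL and write one URL per line. The helper names `keys_to_urls` and `write_url_list` and the example keys are hypothetical. The region `us-west-2` is taken from the explorer link above, and the sketch assumes the `key` column holds object keys relative to the bucket root.

```python
from pathlib import Path
from urllib.parse import quote


def keys_to_urls(keys, bucket="kyfromabove", region="us-west-2"):
    """Turn S3 object keys into public HTTPS URLs (virtual-hosted style)."""
    # quote() leaves "/" alone, so the key's folder structure is preserved
    return [f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}" for key in keys]


def write_url_list(urls, out_file):
    """Write one URL per line and return the path that was written."""
    out_path = Path(out_file)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text("\n".join(urls) + "\n", encoding="utf-8")
    return out_path


# Hypothetical keys for illustration only
example_keys = ["elevation/DEM/example_tile_1.tif", "elevation/DEM/example_tile_2.tif"]
example_urls = keys_to_urls(example_keys)
print(example_urls[0])
```

In the notebook, `list(gdf_tiles_clipped['key'])` would take the place of `example_keys`, and the written text file can then be handed to any download tool.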

### Create a function to download files

This uses the pandas method `.itertuples()` to iterate through the GeoDataFrame and download the files.

If you want to be prompted to enter your output folder path, uncomment the block below.  Otherwise, skip it.

In [12]:
# # enter a download path
# download_path = input('enter your output folder path: ')

# print(download_path)

If you would rather set the path directly in the variable, use the block below.  Otherwise, comment it out or skip it.

In [None]:
# enter a download path
download_path = '../downloads/tiles'  # enter your path in quotations
print(download_path)

This next block defines the download function with a `max_downloads` argument that limits the number of files downloaded.  Skip it or comment it out if you want to use the unlimited version further below.

In [14]:
# create a function to download the tiles
def download_tiles(s3, bucket, gdf_tiles_clipped, download_path, max_downloads=None):
    """Download tiles listed in the 'key' column; max_downloads=None downloads all of them."""
    downloaded_files = []  # names of files that are available locally
    os.makedirs(download_path, exist_ok=True)  # make sure the output folder exists
    try:
        for i, row in enumerate(gdf_tiles_clipped.itertuples()):
            if max_downloads is not None and i >= max_downloads:
                break  # stop after max_downloads tiles

            key = row.key
            # get the base file name of the key
            file_name = key.split('/')[-1]
            local_file_path = os.path.join(download_path, file_name)

            # download file
            if not os.path.exists(local_file_path):  # skip files that already exist locally
                s3.download_file(bucket, key, local_file_path)
                print(f'Downloaded {file_name}')
            # record the file whether it was just downloaded or already present
            downloaded_files.append(file_name)
    except Exception as e:
        print(f"Error occurred: {e}")

    return downloaded_files
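
Before downloading anything, it can help to preview which keys would be fetched and where they would land. The sketch below is a dry run that touches neither S3 nor the disk. `plan_downloads` and the example keys are hypothetical, and `itertools.islice` is one possible way to express an optional limit, not the approach used in the function above.

```python
import os
import posixpath
from itertools import islice


def plan_downloads(keys, download_path, max_downloads=None):
    """Return (key, local_path) pairs for at most max_downloads keys.

    max_downloads=None means no limit, because islice(iterable, None)
    yields every item.
    """
    plan = []
    for key in islice(keys, max_downloads):
        file_name = posixpath.basename(key)  # S3 keys use forward slashes
        plan.append((key, os.path.join(download_path, file_name)))
    return plan


# Hypothetical keys for illustration only
example_keys = [f"elevation/DEM/example_tile_{n}.tif" for n in range(1, 6)]
for key, local_path in plan_downloads(example_keys, "downloads", max_downloads=2):
    print(key, "->", local_path)
```

Passing `gdf_tiles_clipped['key']` and `download_path` instead of the example values gives a preview of the real download.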

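Uncomment this next block instead if you want a version that downloads all files.  This version takes no `max_downloads` argument, so call it without one.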

In [15]:
# def download_tiles(s3, bucket, gdf_tiles_clipped, download_path):
#     downloaded_files = []  # creating a list of downloaded files
#     try:
#         for row in gdf_tiles_clipped.itertuples():
#             key = row.key
#             # get the base file name of the key
#             file_name = key.split('/')[-1]
#             local_file_path = os.path.join(download_path, file_name)
#             # append to file list
#             downloaded_files.append(file_name)

#             # download file
#             if not os.path.exists(local_file_path):  # this will skip downloading duplicates
#                 s3.download_file(bucket, key, local_file_path)
#                 print(f'Downloaded {file_name}')
#     except Exception as e:
#         print(f"Error occurred: {e}")

#     return downloaded_files

#### Download files

In [None]:
# download 30 files
download = download_tiles(s3, bucket, gdf_tiles_clipped, download_path, max_downloads=30)

My test download.<br>
<br><img src='https://raw.githubusercontent.com/ianhorn/kyfromabove-on-aws-examples/refs/heads/main/media/downloaded_tiles.jpg' width="300" height="200">

In [17]:
# # download all files
# download_tiles(s3, bucket, gdf_tiles_clipped, download_path)

___
## Create a raster or mosaic dataset in your GIS software of choice.