# Data Download

This notebook downloads data from The National Map (TNM) through the TNM REST API. The API documentation is [available here](https://tnmaccess.nationalmap.gov/api/v1/docs). More comprehensive dataset documentation is [available here](../../docs/Dataset.md).

Data downloaded:
- **Watershed Boundary (WBD)**



In [1]:
# Import necessary modules
import requests
import pandas as pd
from pathlib import Path
import json
import sys
import os

In [2]:
# Base path
project_base_path = Path.cwd().parent.parent

In [3]:
# Add src to system path 
sys.path.append(str(project_base_path / 'src'))

# import modules
from dataDownload.download import download

## Watershed Boundary Dataset (WBD)

Let's start by fetching all available Watershed Boundary Dataset products and looking for the one whose resolution fits the objective of this study.

In [4]:
# Load the bounding box (bbox) of the New York State neighboring area
neighboring_ny_state_bbox_path = project_base_path / 'data' / 'other' / 'ny_neighboring_bbox.json'
with open(neighboring_ny_state_bbox_path, 'r') as f:
    neighboring_ny_state_bbox_dict = json.load(f)

corners = ['bottom_left', 'bottom_right', 'top_right', 'top_left']
pairs = [f"{neighboring_ny_state_bbox_dict[corner][0]} {neighboring_ny_state_bbox_dict[corner][1]}" for corner in corners]
bbox = ",".join(pairs)

# Define the base URL for the TNM API
base_url = "https://tnmaccess.nationalmap.gov/api/v1/"

# Define parameters for the API request to query available datasets
params = {
    "polygon": bbox,  # Specify the area to search
    "datasets": "National Watershed Boundary Dataset (WBD)",  # Specify Watershed Boundary Dataset
    "outputFormat": "JSON"  # Specify JSON output
}

# Send a GET request to the API
response = requests.get(base_url + "products", params=params)

# Check if the request was successful
if response.status_code == 200:
    # Parse the JSON response
    response_json = response.json()
    
    # Display the dataset information
    print("Available Watershed Boundary Datasets:")
    for dataset in response_json.get("items", []):
        print(f"- Name: {dataset['title']}")
        print(f"  Extent: {dataset['extent']}")
        print(f"  Description: {dataset['body']}")
        print(f"  Metadata URL: {dataset['metaUrl']}\n")
else:
    print(f"Failed to retrieve data. HTTP Status Code: {response.status_code}")

Available Watershed Boundary Datasets:
- Name: USGS Watershed Boundary Dataset (WBD) - National (published 20250107) FileGDB
  Extent: National
  Description: The Watershed Boundary Dataset (WBD) is a comprehensive aggregated collection of hydrologic unit data consistent with the national criteria for delineation and resolution. It defines the areal extent of surface water drainage to a point except in coastal or lake front areas where there could be multiple outlets as stated by the "Federal Standards and Procedures for the National Watershed Boundary Dataset (WBD)" "Standard" (https://pubs.usgs.gov/tm/11/a3/). Watershed boundaries are determined solely upon science-based hydrologic principles, not favoring any administrative boundaries or special projects, nor particular program or agency. This dataset represents the hydrologic unit boundaries to the 12-digit (6th level) for the entire United States. Some areas may also include additional subdivisions representing the 14- and 16-digi
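
The `polygon` value sent in the request above is a comma-separated list of space-separated coordinate pairs, one pair per corner of the bounding box. A minimal sketch of that construction follows. It assumes each corner in the JSON file is stored as an `[x, y]` pair (presumably longitude, latitude); the helper name `bbox_to_polygon` and the coordinates are made up for illustration and are not the contents of `ny_neighboring_bbox.json`.

```python
import json


def bbox_to_polygon(bbox_dict):
    """Join the four bbox corners into 'x y,x y,x y,x y' (hypothetical helper)."""
    corners = ["bottom_left", "bottom_right", "top_right", "top_left"]
    return ",".join(f"{bbox_dict[c][0]} {bbox_dict[c][1]}" for c in corners)


# Illustrative coordinates only (assumed [x, y] order), not the project's real bbox
example_bbox = json.loads(
    '{"bottom_left": [-80.0, 40.0], "bottom_right": [-71.0, 40.0],'
    ' "top_right": [-71.0, 45.5], "top_left": [-80.0, 45.5]}'
)

polygon = bbox_to_polygon(example_bbox)
print(polygon)
```

Listing the corners in a fixed order keeps the ring consistent regardless of the key order in the JSON file.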

We are interested in the HU-2 Region extent in shapefile format.

In [5]:
# Keep only the products in shapefile format with the HU-2 Region extent
filtered_response = [
    record for record in response_json.get('items', [])
    if record.get('format') == 'Shapefile' and record.get('extent') == 'HU-2 Region'
]
print(f'{len(filtered_response)} files were found.')

# Save the filtered response to a file
try:
    export_path = project_base_path / 'data' / 'other' / 'watershed_boundary_dataset.json'
    with open(export_path, 'w') as f:
        json.dump(filtered_response, f, indent=4)
    print(f'Watershed Boundary Dataset exported successfully to {export_path}.')
except Exception as err:
    print(f'Could not save the file. An error was encountered: {err}')

4 files were found.
Watershed Boundary Dataset exported successfully to c:\Users\esanttos\Documents\Alan temp\Unit-hydrograph-Model\data\other\watershed_boundary_dataset.json.
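
The filter in this cell can also be written as a small reusable function, which makes it easy to try other format and extent combinations. The sketch below is only an illustration: the function name `filter_products` and the sample records are invented, and only the `format` and `extent` keys are taken from the cell above.

```python
def filter_products(items, file_format, extent):
    """Return the records whose 'format' and 'extent' both match (hypothetical helper)."""
    return [
        record for record in items
        if record.get("format") == file_format and record.get("extent") == extent
    ]


# Invented records that mimic the shape of the API items used above
sample_items = [
    {"title": "A", "format": "Shapefile", "extent": "HU-2 Region"},
    {"title": "B", "format": "FileGDB", "extent": "National"},
    {"title": "C", "format": "Shapefile", "extent": "National"},
    {"title": "D", "extent": "HU-2 Region"},  # no 'format' key: skipped, no KeyError
]

matches = filter_products(sample_items, "Shapefile", "HU-2 Region")
print([record["title"] for record in matches])
```

Using `record.get(...)` rather than `record[...]` means records that lack one of the keys are silently excluded instead of raising an error.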


**Download**

In [None]:
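for item in filtered_response:
    url = item.get('downloadURL')
    if not url:
        # Without a URL there is nothing to download, so skip this record
        print(f"Skipping '{item.get('title')}': no downloadURL in the record.")
        continue

    # Name the local file after the last path segment of the URL
    item_base_name = os.path.basename(url)
    item_local_path = project_base_path / 'data' / 'shapefiles' / item_base_name

    try:
        download(url=url, filename=item_local_path, unzip=True)
        print(f'{item_base_name} downloaded and unzipped successfully.\n')
    except Exception as err:
        # Report the failure and move on to the next file
        print(f'Failed to download or unzip file {item_base_name}: {err}')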

Downloaded: c:\Users\esanttos\Documents\Alan temp\Unit-hydrograph-Model\data\shapefiles\WBD_01_HU2_Shape.zip
Extracted files to: c:\Users\esanttos\Documents\Alan temp\Unit-hydrograph-Model\data\shapefiles\WBD_01_HU2_Shape
WBD_01_HU2_Shape.zip downloaded and unzipped successfully.
Downloaded: c:\Users\esanttos\Documents\Alan temp\Unit-hydrograph-Model\data\shapefiles\WBD_02_HU2_Shape.zip
Extracted files to: c:\Users\esanttos\Documents\Alan temp\Unit-hydrograph-Model\data\shapefiles\WBD_02_HU2_Shape
WBD_02_HU2_Shape.zip downloaded and unzipped successfully.
Downloaded: c:\Users\esanttos\Documents\Alan temp\Unit-hydrograph-Model\data\shapefiles\WBD_04_HU2_Shape.zip
Extracted files to: c:\Users\esanttos\Documents\Alan temp\Unit-hydrograph-Model\data\shapefiles\WBD_04_HU2_Shape
WBD_04_HU2_Shape.zip downloaded and unzipped successfully.
Downloaded: c:\Users\esanttos\Documents\Alan temp\Unit-hydrograph-Model\data\shapefiles\WBD_05_HU2_Shape.zip
Extracted files to: c:\Users\esanttos\Documents\
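
The `download` helper imported from `dataDownload.download` is not shown in this notebook. As a rough idea of what such a helper could look like, here is a sketch built only on the standard library. The name `download_sketch` is hypothetical, and the extraction folder (the archive path without its `.zip` suffix) is an assumption based on the paths printed in the output above; the project's real implementation may differ.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path


def download_sketch(url, filename, unzip=False):
    """Hypothetical stand-in for the project's `download` helper."""
    filename = Path(filename)
    filename.parent.mkdir(parents=True, exist_ok=True)

    # Copy the response body to disk in chunks instead of loading it into memory
    with urllib.request.urlopen(url) as response, open(filename, "wb") as f:
        shutil.copyfileobj(response, f)

    if not unzip:
        return filename

    # Extract next to the archive, in a folder named after it (assumed layout)
    extract_dir = filename.with_suffix("")
    with zipfile.ZipFile(filename) as archive:
        archive.extractall(extract_dir)
    return extract_dir
```

A call such as `download_sketch(url, item_local_path, unzip=True)` would return the extraction folder. Since the notebook already imports `requests`, `requests.get(url, stream=True)` would be a natural alternative to `urllib.request` here.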