# Download and Organize Satellite Data

This notebook downloads NASA MODIS-Aqua (MODISA) Level 2 Ocean Color data and two sea surface temperature (SST) products, then moves the downloaded files into directories named by dataset, region, and time period.

### Imports and Authentication

Imports the required libraries and authenticates with NASA Earthdata Login through the earthaccess library.

In [1]:
import earthaccess
from pathlib import Path
import shutil

# Authenticate with NASA Earthdata Login; persist=True saves the credentials for later sessions
auth = earthaccess.login(persist=True)

### Setup: Directories, Regions, and Time Periods

Sets up the base directory and defines the regions and time periods of interest. The dataset is specified in the download cells that follow.

In [3]:
base_dir = Path("E:\\satdata")
base_dir.mkdir(parents=True, exist_ok=True)

# Define regions
regions = {
    "Texas Louisiana Shelf": {"lon_min": -94, "lon_max": -88, "lat_min": 27.5, "lat_max": 30.5}
}

# Define time periods as (start, end) date strings; uncomment an entry to include it
time_periods = [
    # ("2024-06-01", "2024-06-30"),
    # ("2012-08-24", "2012-09-02"),

    ("2005-06-01", "2005-11-30"),
    ("2006-06-01", "2006-11-30"),
    ("2007-06-01", "2007-11-30"),
    ("2008-06-01", "2008-11-30"),
    ("2009-06-01", "2009-11-30"),
    ("2010-06-01", "2010-11-30"),
    ("2011-06-01", "2011-11-30"),
]
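
The seven active periods all run from 1 June to 30 November, so the list could also be generated instead of typed out. A minimal sketch, assuming a hypothetical helper named `seasonal_periods` that is not part of the original notebook:

```python
from datetime import date


def seasonal_periods(first_year, last_year, start=(6, 1), end=(11, 30)):
    """Return one (start, end) ISO-date pair per year, both years inclusive.

    `start` and `end` are (month, day) tuples; the defaults mirror the
    1 June to 30 November window used in this notebook.
    """
    return [
        (date(year, *start).isoformat(), date(year, *end).isoformat())
        for year in range(first_year, last_year + 1)
    ]


time_periods = seasonal_periods(2005, 2011)
print(time_periods[0], time_periods[-1])
```

Building the strings through `datetime.date` also rejects impossible dates (for example 31 November) before any search is sent.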

### Data Download and Organization

Iterates over defined time periods and regions to search, download, and organize the data files accordingly.

In [4]:
# Dataset to use
dataset_short_name = "MODISA_L2_OC"

for time_period in time_periods:
    for region, coords in regions.items():
        print(f"\n🔎 Searching for {region} data from {time_period[0]} to {time_period[1]}")

        results = earthaccess.search_data(
            short_name=dataset_short_name,
            temporal=time_period,
            bounding_box=(
                coords["lon_min"],
                coords["lat_min"],
                coords["lon_max"],
                coords["lat_max"]
            )
        )

        if not results:
            print(f"No results for {region} during {time_period[0]} to {time_period[1]}")
            continue

        # Download to earthaccess's default location (no local_path is passed)
        downloaded_files = earthaccess.download(results)

        # Create output directory and move downloaded files
        out_dir = base_dir / f"{region}_{time_period[0]}_{time_period[1]}"
        out_dir.mkdir(parents=True, exist_ok=True)

        for file_path in downloaded_files:
            src = Path(file_path)
            dest = out_dir / src.name
            shutil.move(str(src), str(dest))  # Move file from the download location to the output directory

        print(f"✅ Downloaded and organized {len(downloaded_files)} files into {out_dir}")


🔎 Searching for Texas Louisiana Shelf data from 2005-06-01 to 2005-11-30


QUEUEING TASKS | :   0%|          | 0/285 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/285 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/285 [00:00<?, ?it/s]

✅ Downloaded and organized 285 files into E:\satdata\Texas Louisiana Shelf_2005-06-01_2005-11-30

🔎 Searching for Texas Louisiana Shelf data from 2006-06-01 to 2006-11-30


QUEUEING TASKS | :   0%|          | 0/291 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/291 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/291 [00:00<?, ?it/s]

✅ Downloaded and organized 291 files into E:\satdata\Texas Louisiana Shelf_2006-06-01_2006-11-30

🔎 Searching for Texas Louisiana Shelf data from 2007-06-01 to 2007-11-30


QUEUEING TASKS | :   0%|          | 0/292 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/292 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/292 [00:00<?, ?it/s]

✅ Downloaded and organized 292 files into E:\satdata\Texas Louisiana Shelf_2007-06-01_2007-11-30

🔎 Searching for Texas Louisiana Shelf data from 2008-06-01 to 2008-11-30


QUEUEING TASKS | :   0%|          | 0/289 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/289 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/289 [00:00<?, ?it/s]

✅ Downloaded and organized 289 files into E:\satdata\Texas Louisiana Shelf_2008-06-01_2008-11-30

🔎 Searching for Texas Louisiana Shelf data from 2009-06-01 to 2009-11-30


QUEUEING TASKS | :   0%|          | 0/284 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/284 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/284 [00:00<?, ?it/s]

✅ Downloaded and organized 284 files into E:\satdata\Texas Louisiana Shelf_2009-06-01_2009-11-30

🔎 Searching for Texas Louisiana Shelf data from 2010-06-01 to 2010-11-30


QUEUEING TASKS | :   0%|          | 0/290 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/290 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/290 [00:00<?, ?it/s]

✅ Downloaded and organized 290 files into E:\satdata\Texas Louisiana Shelf_2010-06-01_2010-11-30

🔎 Searching for Texas Louisiana Shelf data from 2011-06-01 to 2011-11-30


QUEUEING TASKS | :   0%|          | 0/296 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/296 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/296 [00:00<?, ?it/s]

✅ Downloaded and organized 296 files into E:\satdata\Texas Louisiana Shelf_2011-06-01_2011-11-30
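
The same search, download, and move steps recur in each download cell of this notebook, so the two pure-Python pieces could be factored into helpers. The sketch below is an illustration under stated assumptions: `bbox_from_region` and `move_into` are hypothetical names, the bounding-box check assumes regions do not cross the antimeridian, and the demo uses placeholder files so that nothing is downloaded.

```python
from pathlib import Path
import shutil
import tempfile


def bbox_from_region(coords):
    """Return (west, south, east, north), the order used in the cells above."""
    west, south = coords["lon_min"], coords["lat_min"]
    east, north = coords["lon_max"], coords["lat_max"]
    # Assumption: boxes do not cross the antimeridian, so west < east.
    if not (west < east and south < north):
        raise ValueError(f"Degenerate or inverted bounding box: {coords}")
    return (west, south, east, north)


def move_into(files, out_dir):
    """Move each file into out_dir (created if needed); return the new paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for file_path in files:
        src = Path(file_path)
        dest = out_dir / src.name
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved


# Demo with throwaway placeholder files instead of real granules.
with tempfile.TemporaryDirectory() as tmp:
    staging = Path(tmp) / "staging"
    staging.mkdir()
    fake = [staging / "a.nc", staging / "b.nc"]
    for f in fake:
        f.write_text("placeholder")
    moved = move_into(fake, Path(tmp) / "organized")
    moved_names = sorted(p.name for p in moved)
    all_moved = all(p.exists() for p in moved) and not any(f.exists() for f in fake)

shelf = {"lon_min": -94, "lon_max": -88, "lat_min": 27.5, "lat_max": 30.5}
print(bbox_from_region(shelf))
print(moved_names, all_moved)
```

In the loop, `bounding_box=bbox_from_region(coords)` would replace the inline tuple and `move_into(downloaded_files, out_dir)` would replace the inner `for` loop. `earthaccess.download` also accepts a `local_path` argument, which may make the move step unnecessary; that variant is not tested here.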


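### SST Download: MODIS Aqua L2P

Repeats the search, download, and organize loop for the MODIS_A-JPL-L2P-v2019.0 sea surface temperature (SST) dataset, writing to directories prefixed with the dataset name.

In [10]: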
# SST Dataset to use
sst_dataset_short_name = "MODIS_A-JPL-L2P-v2019.0"

for time_period in time_periods:
    for region, coords in regions.items():
        print(f"\n🔎 Searching for {sst_dataset_short_name} data in {region} from {time_period[0]} to {time_period[1]}")

        results = earthaccess.search_data(
            short_name=sst_dataset_short_name,
            temporal=time_period,
            bounding_box=(
                coords["lon_min"],
                coords["lat_min"],
                coords["lon_max"],
                coords["lat_max"]
            )
        )

        if not results:
            print(f"No results for {sst_dataset_short_name} in {region} during {time_period[0]} to {time_period[1]}")
            continue

        # Download to earthaccess's default location (no local_path is passed)
        downloaded_files = earthaccess.download(results)

        # Create an output directory prefixed with the dataset name, so SST files
        # stay separate from the ocean color files, then move the downloads into it
        out_dir = base_dir / f"{sst_dataset_short_name}_{region}_{time_period[0]}_{time_period[1]}"
        out_dir.mkdir(parents=True, exist_ok=True)

        for file_path in downloaded_files:
            src = Path(file_path)
            dest = out_dir / src.name
            shutil.move(str(src), str(dest))  # Move file from the download location to the output directory

        print(f"✅ Downloaded and organized {len(downloaded_files)} files for {sst_dataset_short_name} into {out_dir}")


🔎 Searching for MODIS_A-JPL-L2P-v2019.0 data in Texas Louisiana Shelf from 2024-06-01 to 2024-06-30


QUEUEING TASKS | :   0%|          | 0/86 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/86 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/86 [00:00<?, ?it/s]

✅ Downloaded and organized 86 files for MODIS_A-JPL-L2P-v2019.0 into E:\satdata\MODIS_A-JPL-L2P-v2019.0_Texas Louisiana Shelf_2024-06-01_2024-06-30
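
The execution counters (In [10] appears before In [5]) and the 2024-06 period in the output above, which is commented out in the `time_periods` list, suggest that the cells were run in a different order than they appear. A quick inventory of the base directory may help confirm what is actually on disk. A minimal sketch, assuming a hypothetical helper named `inventory`; it is demonstrated on a temporary directory with made-up folder names rather than on `E:\satdata`:

```python
from pathlib import Path
import tempfile


def inventory(base_dir):
    """Map each immediate subdirectory name to the number of files directly inside it."""
    base_dir = Path(base_dir)
    return {
        d.name: sum(1 for p in d.iterdir() if p.is_file())
        for d in sorted(base_dir.iterdir())
        if d.is_dir()
    }


# Demo on a throwaway directory; the folder and file names are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp)
    for folder, names in {"oc_2005": ["a.nc", "b.nc"], "sst_2005": ["c.nc"]}.items():
        (demo / folder).mkdir()
        for name in names:
            (demo / folder / name).write_text("placeholder")
    counts = inventory(demo)

print(counts)
```

Calling `inventory(base_dir)` after a run gives per-directory counts that can be compared against the "Downloaded and organized N files" messages.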


### SST Download: MUR Level 4

Repeats the same loop for the MUR-JPL-L4-GLOB-v4.1 SST dataset.

In [5]:
# SST dataset to use
# jplMURSST41anom1day
# MUR-JPL-L4-GLOB-v4.1
sst_dataset_short_name = "MUR-JPL-L4-GLOB-v4.1"

for time_period in time_periods:
    for region, coords in regions.items():
        print(f"\n🔎 Searching for {sst_dataset_short_name} data in {region} from {time_period[0]} to {time_period[1]}")

        results = earthaccess.search_data(
            short_name=sst_dataset_short_name,
            temporal=time_period,
            bounding_box=(
                coords["lon_min"],
                coords["lat_min"],
                coords["lon_max"],
                coords["lat_max"]
            )
        )

        if not results:
            print(f"No results for {sst_dataset_short_name} in {region} during {time_period[0]} to {time_period[1]}")
            continue

        # Download to earthaccess's default location (no local_path is passed)
        downloaded_files = earthaccess.download(results)

        # Create an output directory prefixed with the dataset name, so SST files
        # stay separate from the ocean color files, then move the downloads into it
        out_dir = base_dir / f"{sst_dataset_short_name}_{region}_{time_period[0]}_{time_period[1]}"
        out_dir.mkdir(parents=True, exist_ok=True)

        for file_path in downloaded_files:
            src = Path(file_path)
            dest = out_dir / src.name
            shutil.move(str(src), str(dest))  # Move file from the download location to the output directory

        print(f"✅ Downloaded and organized {len(downloaded_files)} files for {sst_dataset_short_name} into {out_dir}")


🔎 Searching for MUR-JPL-L4-GLOB-v4.1 data in Texas Louisiana Shelf from 2005-06-01 to 2005-11-30


QUEUEING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/184 [00:00<?, ?it/s]

✅ Downloaded and organized 184 files for MUR-JPL-L4-GLOB-v4.1 into E:\satdata\MUR-JPL-L4-GLOB-v4.1_Texas Louisiana Shelf_2005-06-01_2005-11-30

🔎 Searching for MUR-JPL-L4-GLOB-v4.1 data in Texas Louisiana Shelf from 2006-06-01 to 2006-11-30


QUEUEING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/184 [00:00<?, ?it/s]

✅ Downloaded and organized 184 files for MUR-JPL-L4-GLOB-v4.1 into E:\satdata\MUR-JPL-L4-GLOB-v4.1_Texas Louisiana Shelf_2006-06-01_2006-11-30

🔎 Searching for MUR-JPL-L4-GLOB-v4.1 data in Texas Louisiana Shelf from 2007-06-01 to 2007-11-30


QUEUEING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/184 [00:00<?, ?it/s]

✅ Downloaded and organized 184 files for MUR-JPL-L4-GLOB-v4.1 into E:\satdata\MUR-JPL-L4-GLOB-v4.1_Texas Louisiana Shelf_2007-06-01_2007-11-30

🔎 Searching for MUR-JPL-L4-GLOB-v4.1 data in Texas Louisiana Shelf from 2008-06-01 to 2008-11-30


QUEUEING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/184 [00:00<?, ?it/s]

✅ Downloaded and organized 184 files for MUR-JPL-L4-GLOB-v4.1 into E:\satdata\MUR-JPL-L4-GLOB-v4.1_Texas Louisiana Shelf_2008-06-01_2008-11-30

🔎 Searching for MUR-JPL-L4-GLOB-v4.1 data in Texas Louisiana Shelf from 2009-06-01 to 2009-11-30


QUEUEING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/184 [00:00<?, ?it/s]

✅ Downloaded and organized 184 files for MUR-JPL-L4-GLOB-v4.1 into E:\satdata\MUR-JPL-L4-GLOB-v4.1_Texas Louisiana Shelf_2009-06-01_2009-11-30

🔎 Searching for MUR-JPL-L4-GLOB-v4.1 data in Texas Louisiana Shelf from 2010-06-01 to 2010-11-30


QUEUEING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/184 [00:00<?, ?it/s]

✅ Downloaded and organized 184 files for MUR-JPL-L4-GLOB-v4.1 into E:\satdata\MUR-JPL-L4-GLOB-v4.1_Texas Louisiana Shelf_2010-06-01_2010-11-30

🔎 Searching for MUR-JPL-L4-GLOB-v4.1 data in Texas Louisiana Shelf from 2011-06-01 to 2011-11-30


QUEUEING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

PROCESSING TASKS | :   0%|          | 0/184 [00:00<?, ?it/s]

COLLECTING RESULTS | :   0%|          | 0/184 [00:00<?, ?it/s]

✅ Downloaded and organized 184 files for MUR-JPL-L4-GLOB-v4.1 into E:\satdata\MUR-JPL-L4-GLOB-v4.1_Texas Louisiana Shelf_2011-06-01_2011-11-30
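
MUR is a daily product, so the file count per period can be checked against the number of calendar days. The sketch below counts the days in one period, inclusive of both endpoints; `days_inclusive` is a hypothetical helper name.

```python
from datetime import date


def days_inclusive(start, end):
    """Number of calendar days from start to end (ISO strings), counting both endpoints."""
    return (date.fromisoformat(end) - date.fromisoformat(start)).days + 1


n_days = days_inclusive("2005-06-01", "2005-11-30")
print(n_days)  # 183
```

Each period reports 184 files, one more than the 183 calendar days. One plausible explanation, which is an assumption and not verified here, is that a granule whose time coverage overlaps a period boundary also matches the temporal search.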
