# Download time-series data from OC-CCI

**Last updated: 28/04/2024**

This script downloads L3 time-series data from the European Space Agency's [**Ocean Colour Climate Change Initiative (OC-CCI)**](https://www.oceancolour.org). It runs in the conda environment `bashenv` because it relies on the command-line tool `wget` to download data from the OC-CCI website.

The OC-CCI products downloaded in this script for my area of study are:
* **daily data at 1 km resolution**, chla and water classes
* **daily data at 4 km resolution**, chla, chla error
* **5-day data at 4 km resolution**, chla, chla error
* **8-day data at 4 km resolution**, chla, chla error

**This script serves as a template. Modify the code sections below to tailor it to your specific area of study and datasets.**

Before running the code below, obtain the URLs for your OC-CCI products of interest by following these steps:
1. Visit https://www.oceancolour.org and navigate to *OPeNDAP*.
2. Choose the dataset of interest and select the *NetcdfSubset* option under *Access*.
3. Specify the variables, coordinates and time period you need. If the time period spans several decades, split the request into two periods (e.g., 1997–2010 and 2010–2024) to keep each file at a manageable size. The subsetting tool does not cut the data at the exact lat/lon boundaries you provide (due to precision limits), so expand the bounding box slightly to ensure complete coverage. For example, my area of study, the Endurance site, spans longitude = [0.265, 1.801] and latitude = [53.758, 54.656], which I expanded to longitude = [0.20, 1.85] and latitude = [53.70, 54.70].
4. Copy the download URL provided at the bottom of the page and paste it into `URL_LIST` in the code below.
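
The URLs produced by the steps above all follow the same pattern, so it can help to see how the pieces fit together. The sketch below assembles one request URL from a dataset name, a bounding box, a time range and a list of variables. The function name `build_ncss_url` and its argument order are assumptions made for illustration; the query layout is simply copied from the URLs used in this script, and the URL generated by the *NetcdfSubset* page remains the authoritative version.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the OC-CCI tooling): assemble a subset request URL
# with the same query layout as the URLs pasted into URL_LIST.
# Arguments: dataset north west east south time_start time_end variable [variable ...]
build_ncss_url() {
    local dataset="$1" north="$2" west="$3" east="$4" south="$5"
    local t_start="$6" t_end="$7"
    shift 7

    # One "var=<name>&" entry per requested variable
    local query="" var
    for var in "$@"; do
        query+="var=${var}&"
    done

    # Colons in the timestamps are percent-encoded as %3A
    t_start="${t_start//:/%3A}"
    t_end="${t_end//:/%3A}"

    printf 'https://www.oceancolour.org/thredds/ncss/%s?%snorth=%s&west=%s&east=%s&south=%s&disableProjSubset=on&horizStride=1&time_start=%s&time_end=%s&timeStride=1&accept=netcdf\n' \
        "$dataset" "$query" "$north" "$west" "$east" "$south" "$t_start" "$t_end"
}

# Example: daily 1 km chla for the expanded Endurance bounding box, 1997-2010
build_ncss_url "CCI_ALL-v6.0-1km-DAILY" 54.7 0.20 1.85 53.70 \
    "1997-09-04T00:00:00Z" "2010-01-01T00:00:00Z" chlor_a total_nobs
```

The example call reproduces the first entry of `URL_LIST`. Changing only the time range or the variable names gives the other entries for the same dataset.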

In [None]:
%%bash

URL_LIST=(
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=chlor_a&var=total_nobs&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=1997-09-04T00%3A00%3A00Z&time_end=2010-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=chlor_a&var=total_nobs&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=2010-01-02T00%3A00%3A00Z&time_end=2024-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=water_class1&var=water_class2&var=water_class3&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=1997-09-04T00%3A00%3A00Z&time_end=2010-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=water_class4&var=water_class5&var=water_class6&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=1997-09-04T00%3A00%3A00Z&time_end=2010-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=water_class7&var=water_class8&var=water_class9&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=1997-09-04T00%3A00%3A00Z&time_end=2010-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=water_class10&var=water_class11&var=water_class12&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=1997-09-04T00%3A00%3A00Z&time_end=2010-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=water_class13&var=water_class14&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=1997-09-04T00%3A00%3A00Z&time_end=2010-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=water_class1&var=water_class2&var=water_class3&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=2010-01-02T00%3A00%3A00Z&time_end=2024-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=water_class4&var=water_class5&var=water_class6&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=2010-01-02T00%3A00%3A00Z&time_end=2024-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=water_class7&var=water_class8&var=water_class9&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=2010-01-02T00%3A00%3A00Z&time_end=2024-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=water_class10&var=water_class11&var=water_class12&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=2010-01-02T00%3A00%3A00Z&time_end=2024-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=water_class13&var=water_class14&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=2010-01-02T00%3A00%3A00Z&time_end=2024-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-DAILY?var=chlor_a&var=chlor_a_log10_bias&var=chlor_a_log10_rmsd&var=total_nobs&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=1997-09-04T00%3A00%3A00Z&time_end=2010-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-DAILY?var=chlor_a&var=chlor_a_log10_bias&var=chlor_a_log10_rmsd&var=total_nobs&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=2010-01-02T00%3A00%3A00Z&time_end=2024-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-5DAY?var=chlor_a&var=total_nobs_sum&var=chlor_a_log10_bias&var=chlor_a_log10_rmsd&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=1997-09-04T00%3A00%3A00Z&time_end=2010-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-5DAY?var=chlor_a&var=total_nobs_sum&var=chlor_a_log10_bias&var=chlor_a_log10_rmsd&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=2010-01-02T00%3A00%3A00Z&time_end=2024-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-8DAY?var=chlor_a&var=total_nobs_sum&var=chlor_a_log10_bias&var=chlor_a_log10_rmsd&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=1997-09-04T00%3A00%3A00Z&time_end=2010-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
    "https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-8DAY?var=chlor_a&var=total_nobs_sum&var=chlor_a_log10_bias&var=chlor_a_log10_rmsd&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=2010-01-02T00%3A00%3A00Z&time_end=2024-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf"
)

# Output filename list (one filename per URL, in the same order as URL_LIST above)
OUTPUT_FILENAME_LIST=(
    "occci_1km_1day_chl_9710.nc"
    "occci_1km_1day_chl_1024.nc"
    "occci_1km_1day_waterclass_01to03_9710.nc"
    "occci_1km_1day_waterclass_04to06_9710.nc"
    "occci_1km_1day_waterclass_07to09_9710.nc"
    "occci_1km_1day_waterclass_10to12_9710.nc"
    "occci_1km_1day_waterclass_13to14_9710.nc"
    "occci_1km_1day_waterclass_01to03_1024.nc"
    "occci_1km_1day_waterclass_04to06_1024.nc"
    "occci_1km_1day_waterclass_07to09_1024.nc"
    "occci_1km_1day_waterclass_10to12_1024.nc"
    "occci_1km_1day_waterclass_13to14_1024.nc"
    "occci_4km_1day_chl_9710.nc"
    "occci_4km_1day_chl_1024.nc"
    "occci_4km_5day_chl_9710.nc"
    "occci_4km_5day_chl_1024.nc"
    "occci_4km_8day_chl_9710.nc"
    "occci_4km_8day_chl_1024.nc"
)
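
Because the URLs and filenames are matched purely by position, a missing or extra entry in either list would pair every later URL with the wrong filename. A minimal sanity check, sketched below, compares the two array lengths before any download starts. The function name `check_same_length` and the toy arrays are assumptions for illustration; in this script the same call would be made with `"${#URL_LIST[@]}"` and `"${#OUTPUT_FILENAME_LIST[@]}"`.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeed only if the number of URLs equals the number of filenames.
check_same_length() {
    local n_urls="$1" n_files="$2"
    if [ "$n_urls" -ne "$n_files" ]; then
        echo "Mismatch: ${n_urls} URLs but ${n_files} output filenames" >&2
        return 1
    fi
    echo "OK: ${n_urls} URL/filename pairs"
}

# Toy arrays standing in for URL_LIST and OUTPUT_FILENAME_LIST
demo_urls=("https://example.org/a" "https://example.org/b")
demo_files=("a.nc" "b.nc")

check_same_length "${#demo_urls[@]}" "${#demo_files[@]}"
```

Running the check in the same cell as the lists, right after they are defined, catches a mismatch before any file is written.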

In [None]:
%%bash

# Parameters to define the data download directories
ROOT_DIR="../.."  # two directories up
DATA_DIR="data/raw/OCCCI_data"
DATA_SUBDIR="data_timeseries_areastudy_OCCCI_nc"

In [3]:
%%bash

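# Construct paths
# NOTE: with the IPython %%bash magic, each cell runs in its own shell, so URL_LIST,
# OUTPUT_FILENAME_LIST and the directory parameters must be defined in the same cell as
# this code (merge the cells above into this one if the variables come up empty).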

# Use printf to create a portable file path
data_subdir_path=$(printf '%s/%s/%s/' "$ROOT_DIR" "$DATA_DIR" "$DATA_SUBDIR")

# Create the directory at the specified path if it doesn't already exist
if [ ! -d "${data_subdir_path}" ]; then
    mkdir -p "${data_subdir_path}"
    echo "Directory created: ${data_subdir_path}"
fi

# Combine data_subdir_path with the OUTPUT_FILENAME_LIST and add to output_filename_list_path
output_filename_list_path=()
for filename in "${OUTPUT_FILENAME_LIST[@]}"; do 
    output_filename_list_path+=("${data_subdir_path}${filename}")
done

# Combine URL_LIST and output_filename_list_path line by line using paste with a space delimiter.
# xargs passes two arguments at a time (-n 2) to wget and runs at most 4 wget processes in
# parallel (-P 4). wget's -q option suppresses its output, and xargs returns only after the
# last spawned process has finished, with a non-zero status if any download failed.
if paste -d ' ' <(printf "%s\n" "${URL_LIST[@]}") <(printf "%s\n" "${output_filename_list_path[@]}") | \
   xargs -n 2 -P 4 bash -c 'echo "URL: $1"; echo "Output file path: $2"; wget -q -e robots=off -O "$2" "$1"' --; then
    echo "Download completed!"
else
    echo "One or more downloads failed." >&2
fi

URL: https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=chlor_a&var=total_nobs&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=1997-09-04T00%3A00%3A00Z&time_end=2010-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf
Output file path: ../../data/raw/OCCCI_data/data_timeseries_areastudy_OCCCI_nc/occci_1km_1day_chl_9710.nc
URL: https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=chlor_a&var=total_nobs&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=2010-01-02T00%3A00%3A00Z&time_end=2024-01-01T00%3A00%3A00Z&timeStride=1&accept=netcdf
Output file path: ../../data/raw/OCCCI_data/data_timeseries_areastudy_OCCCI_nc/occci_1km_1day_chl_1024.nc
URL: https://www.oceancolour.org/thredds/ncss/CCI_ALL-v6.0-1km-DAILY?var=water_class1&var=water_class2&var=water_class3&north=54.7&west=0.20&east=1.85&south=53.70&disableProjSubset=on&horizStride=1&time_start=1997-09-04T00%3A00%3A00Z&time_end=201