# Historical Climate Data Download

Logan Gall, gall0487
08 December 2023

This notebook downloads data from NOAA's NCEI NClimGrid-Daily dataset, which provides 5 km gridded climate data dating back to 1951.  
Source: https://www.ncei.noaa.gov/metadata/geoportal/rest/metadata/item/gov.noaa.ncdc:C01589/html#

The program uses the `requests` library to download the grid files one at a time, one file for each month of each requested year.
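
As a rough sketch of what "one file per month" means, the snippet below builds the twelve monthly file URLs for a single year. The base URL and the `ncdd-YYYYMM-grd-scaled.nc` naming pattern are taken from the download code in this notebook; the helper name `monthly_urls` is hypothetical and is not used by the cells that follow.

```python
# Base URL as used by the download code in this notebook (no trailing slash here)
BASE_URL = 'https://www.ncei.noaa.gov/data/nclimgrid-daily/access/grids'

def monthly_urls(year):
    """Return the 12 monthly grid-file URLs for one year (hypothetical helper)."""
    # {month:02d} zero-pads the month, e.g. 1 -> '01'
    return [f'{BASE_URL}/{year}/ncdd-{year}{month:02d}-grd-scaled.nc'
            for month in range(1, 13)]

urls = monthly_urls(2001)
print(urls[0])
print(urls[-1])
```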

In [3]:
import requests
import os

##DOWNLOAD_FILE: sends a GET request for the given URL and saves the file into target_folder.
def download_file(url, target_folder):
    # Extract filename from URL
    filename = url.split('/')[-1]

    # Make sure the target folder exists (no error if it is already there)
    os.makedirs(target_folder, exist_ok=True)

    # Full path for saving the file
    full_path = os.path.join(target_folder, filename)

    # Send a GET request to the URL; the with-block closes the connection when done
    with requests.get(url, stream=True) as response:
        # Raise an exception if the request was unsuccessful
        response.raise_for_status()

        # Open the target file in binary write mode
        with open(full_path, 'wb') as file:
            # Write the content of the response in chunks to the file
            for chunk in response.iter_content(chunk_size=8192):
                file.write(chunk)

    print(f"File downloaded: {full_path}")
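
Because the file is written in chunks, an interrupted download can leave a truncated `.nc` file behind. One possible safeguard, sketched here as an assumption and not as part of the original code, is to write to a temporary `.part` file and rename it only after every chunk has been written. The helper name `write_chunks_atomically` and the `.part` suffix are hypothetical; the demo uses in-memory byte chunks in place of `response.iter_content(...)`.

```python
import os
import tempfile

def write_chunks_atomically(chunks, full_path):
    """Write byte chunks to full_path via a temporary '.part' file (hypothetical helper)."""
    tmp_path = full_path + '.part'
    with open(tmp_path, 'wb') as file:
        for chunk in chunks:
            file.write(chunk)
    # Only reached if every chunk was written; replaces any existing file at full_path
    os.replace(tmp_path, full_path)

# Demo with in-memory chunks standing in for a streamed response
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'demo.nc')
write_chunks_atomically([b'ab', b'cd'], demo_path)
print(os.path.getsize(demo_path))
```

If the loop raises partway through, the final path is never created, so a later run can tell that the file is still missing.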

In [4]:
##FILE_DOWNLOADS: builds the NOAA URL for each of the 12 monthly files of the given year and downloads them one by one.
def file_downloads(year):
    web_dir = 'https://www.ncei.noaa.gov/data/nclimgrid-daily/access/grids/'
    year = str(year)
    output_dir = 'downloaded_files/' + year
    
    start = 'ncdd-'
    end = '-grd-scaled.nc'
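    for month in range(1, 13):
        # Zero-pad the month so the name matches ncdd-YYYYMM-grd-scaled.nc
        filepath = f'/{year}/{start}{year}{month:02d}{end}'
        print(filepath)
        # web_dir ends with '/' and filepath starts with '/', so the URL contains '//'
        filepath = web_dir + filepath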
        try:
            download_file(filepath, output_dir)
        
        except Exception as e:
            print('error')
            print(e)
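
The `try`/`except` above reports a failed month and moves on, so a transient server error (such as the 502 responses in the output of `main()`) leaves that month missing. A minimal sketch of one way to handle this, assuming the failures are temporary, is to retry with an increasing delay. The helper `retry` and its parameters are hypothetical; the demo uses a stand-in function that fails twice instead of making real requests.

```python
import time

def retry(action, attempts=3, base_delay=1.0):
    """Call action() until it succeeds or attempts run out (hypothetical helper)."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as e:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            # Double the wait after each failure: base_delay, 2x, 4x, ...
            delay = base_delay * 2 ** (attempt - 1)
            print(f'attempt {attempt} failed ({e}); retrying in {delay} s')
            time.sleep(delay)

# Stand-in for a download that fails twice before succeeding
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError('502 Server Error')
    return 'ok'

result = retry(flaky, attempts=3, base_delay=0.0)
print(result, len(calls))
```

In this notebook the call would look like `retry(lambda: download_file(filepath, output_dir))`, placed inside the existing `try` block so that a month that still fails after all attempts is reported as before.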

In [5]:
##MAIN: downloads every year from start to end (inclusive). Change start and end to choose which years to download.
#WARNING: downloading all the data from 1951 onward will take a lot of storage!
# Roughly 50 GB (~60 MB per file, 12 months per year, 72 years)
def main():
    start = 2001
    end = 2024
    
    for thisyear in range(start, end + 1):
        file_downloads(thisyear)
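
The storage figure in the warning is plain multiplication, and the same arithmetic gives an estimate for any year range. The sketch below assumes the approximate 60 MB per file quoted in the warning and decimal units (1 GB = 1000 MB); the function name `estimated_gb` is hypothetical.

```python
MB_PER_FILE = 60      # approximate file size quoted in the warning
FILES_PER_YEAR = 12   # one file per month

def estimated_gb(start, end):
    """Estimate download size in GB for years start..end inclusive (hypothetical helper)."""
    years = end - start + 1
    return MB_PER_FILE * FILES_PER_YEAR * years / 1000  # assumes 1 GB = 1000 MB

print(estimated_gb(2001, 2024))  # the 24 years requested in main()
print(estimated_gb(1951, 2022))  # a 72-year span, as in the warning
```

For the default range in `main()` this comes to about 17 GB, and a 72-year span comes to about 52 GB, in line with the rough 50 GB figure.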

In [6]:
main()

/2001/ncdd-200101-grd-scaled.nc
error
502 Server Error: Proxy Error for url: https://www.ncei.noaa.gov/data/nclimgrid-daily/access/grids//2001/ncdd-200101-grd-scaled.nc
/2001/ncdd-200102-grd-scaled.nc
File downloaded: downloaded_files/2001/ncdd-200102-grd-scaled.nc
/2001/ncdd-200103-grd-scaled.nc
File downloaded: downloaded_files/2001/ncdd-200103-grd-scaled.nc
/2001/ncdd-200104-grd-scaled.nc
File downloaded: downloaded_files/2001/ncdd-200104-grd-scaled.nc
/2001/ncdd-200105-grd-scaled.nc
File downloaded: downloaded_files/2001/ncdd-200105-grd-scaled.nc
/2001/ncdd-200106-grd-scaled.nc
File downloaded: downloaded_files/2001/ncdd-200106-grd-scaled.nc
/2001/ncdd-200107-grd-scaled.nc
File downloaded: downloaded_files/2001/ncdd-200107-grd-scaled.nc
/2001/ncdd-200108-grd-scaled.nc
error
502 Server Error: Proxy Error for url: https://www.ncei.noaa.gov/data/nclimgrid-daily/access/grids//2001/ncdd-200108-grd-scaled.nc
/2001/ncdd-200109-grd-scaled.nc
File downloaded: downloaded_files/2001/ncdd-2001