# How to Access GES DISC Data Using Python

<div style="border:1px solid #cccccc;padding:5px 10px;">Please be judicious when working with long time series that reside on a remote data server.<br />
Applying the approaches described here to more than a year of high-frequency data (hourly data, for example) at a time is very likely to place a heavy load on the remote data server. The consequences range from very slow performance for hundreds of other users to denial of service.</div>

### Overview

There are multiple ways to work with GES DISC data resources using Python. For example, the data can be accessed using [techniques that rely on native Python code](https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html).

Several third-party libraries can simplify access further. The sections below describe four techniques that use the Requests, Pydap, Xarray, and netCDF4-python libraries.

### Prerequisites

This notebook was written using Python 3.8, and requires these libraries and files:

- `netrc` file with valid Earthdata Login credentials
   - [How to Generate Earthdata Prerequisite Files](https://disc.gsfc.nasa.gov/information/howto?title=How%20to%20Generate%20Earthdata%20Prerequisite%20Files)
- [requests](https://docs.python-requests.org/en/latest/) (version 2.22.0 or later)
- [pydap](https://github.com/pydap/pydap) (we recommend using version 3.4.0 or later)
- [xarray](https://docs.xarray.dev/en/stable/)
- [netCDF4-python](https://github.com/Unidata/netcdf4-python) (we recommend using version 1.6.2)
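
Before running the cells below, it can help to confirm which of these libraries are installed and at what version. The following is a minimal sketch using the standard-library `importlib.metadata` module (available since Python 3.8). The helper name `installed_versions` and the `REQUIRED` list are illustrative, and the distribution names are assumed to match the names the packages were installed under.

```python
from importlib import metadata

# Distribution names as assumed to be published on PyPI; adjust if your
# environment installed them under different names.
REQUIRED = ["requests", "pydap", "xarray", "netCDF4"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Report what is available in the current environment.
for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'not installed'}")
```

Compare the reported versions against the minimums and recommendations listed above.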

### Python Using 'Requests'

'Requests' is a popular Python library that simplifies Python access to Internet-based resources. In the following code, we demonstrate how to use 'Requests' to access GES DISC data using cookies created by a host operating system.

Download GES DISC data using the following Python3 code:

In [None]:
import requests

# Set the URL string to point to a specific data URL. Some generic examples are:
#   https://data.gesdisc.earthdata.nasa.gov/data/MERRA2/path/to/granule.nc4

URL = 'your_URL_string_goes_here'

# Set the FILENAME string to the data file name, the LABEL keyword value, or any customized name.
FILENAME = 'your_file_string_goes_here'

result = requests.get(URL)
try:
    # Raise requests.exceptions.HTTPError for 4xx/5xx responses
    result.raise_for_status()
    # The with-block closes the file even if the write fails
    with open(FILENAME, 'wb') as f:
        f.write(result.content)
    print('contents of URL written to ' + FILENAME)
except requests.exceptions.HTTPError:
    print('requests.get() returned an error code ' + str(result.status_code))
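
The cell above holds the whole response in memory before writing it to disk. For large granules, a streamed download may be preferable. The sketch below is one possible variant, not the method used by this notebook: the function name `download_granule`, the 1 MiB chunk size, and the 60-second timeout are illustrative choices. It assumes a `requests.Session` can authenticate the same way as the plain `requests.get()` call above.

```python
def download_granule(url, filename, session=None, chunk_size=1024 * 1024):
    """Stream `url` to `filename` in chunks and return the number of bytes written."""
    if session is None:
        # Imported lazily so that a pre-configured session can be passed in instead.
        import requests
        session = requests.Session()

    written = 0
    # stream=True defers downloading the body until iter_content() is consumed.
    with session.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()
        with open(filename, "wb") as f:
            for chunk in response.iter_content(chunk_size=chunk_size):
                f.write(chunk)
                written += len(chunk)
    return written
```

A call such as `download_granule(URL, FILENAME)` would then replace the `requests.get()` and file-writing steps, raising `requests.exceptions.HTTPError` on a failed request.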

### Python Using 'Pydap'

'Pydap' is a Python library that provides an interface for Python programs to read from OPeNDAP servers, which makes it a convenient way to access GES DISC OPeNDAP resources. The netCDF4 Python module, covered in a later section, is an alternative that uses the netCDF-C library to access the data.

Use the code below to access data on OPeNDAP servers ([read more](https://pydap.readthedocs.io/en/latest/client.html#urs-nasa-earthdata)):

In [None]:
from pydap.client import open_url
from pydap.cas.urs import setup_session

dataset_url = 'https://servername/opendap/path/file[.format[?subset]]'

username = 'your_earthdata_username_goes_here'
password = 'your_earthdata_password_goes_here'

try:
    session = setup_session(username, password, check_url=dataset_url)
    dataset = open_url(dataset_url, session=session)
except AttributeError as e:
    print('Error:', e)
    print('Please verify that the dataset URL points to an OPeNDAP server, the OPeNDAP server is accessible, or that your username and password are correct.')
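
The optional `?subset` part of the dataset URL above is an OPeNDAP constraint expression. As a rough illustration, the sketch below builds one in DAP2 hyperslab syntax, where each `[start:stop]` pair selects an inclusive index range along one dimension. The helper `opendap_subset_url` and the variable names `T2M` and `lat` are hypothetical; use the variable names and dimension order of your own dataset.

```python
def opendap_subset_url(base_url, variables):
    """Append a DAP2 constraint expression to an OPeNDAP granule URL.

    `variables` maps a variable name to a list of inclusive (start, stop)
    index pairs, one per dimension. An empty list requests the whole variable.
    """
    parts = []
    for name, ranges in variables.items():
        hyperslab = "".join(f"[{start}:{stop}]" for start, stop in ranges)
        parts.append(name + hyperslab)
    return base_url + "?" + ",".join(parts)


# Hypothetical variables: 24 time steps of a 3-D field plus the matching latitudes.
subset_url = opendap_subset_url(
    "https://servername/opendap/path/file.nc4",
    {"T2M": [(0, 23), (100, 200), (300, 400)], "lat": [(100, 200)]},
)
print(subset_url)
```

Requesting only the variables and index ranges you need reduces the amount of data the server has to send.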

### Python Using 'Xarray'

If you wish to open datasets as Xarray data objects, you can simply pass in a dataset URL to the <code>open_dataset()</code> function. Depending on the subsetting service that you wish to access, different Earthdata authentication files may be required. Here, we demonstrate accessing a granule via OPeNDAP and THREDDS.

#### OPeNDAP in Xarray:


In [None]:
import xarray as xr

# Reading a single granule URL:
ds = xr.open_dataset('https://servername/opendap/path/file[.format[?subset]]')

#### THREDDS in Xarray:

Datasets that include <code>.ncml</code> aggregation, like some provided through THREDDS, may be useful for quickly subsetting multiple granules into a single data array.

This operation requires a <code>.dodsrc</code> file in your root and working directories, and a <code>.netrc</code> file in your root directory.

In [None]:
import xarray as xr

# Subsetting a .ncml file URL:
URL = 'https://servername/thredds/dodsC/path/dataset_Aggregation.ncml'

# lat1, lat2, lon1, lon2, time1, and time2 are placeholders: define them as your subset bounds before running this cell.
try:
    ds = xr.open_dataset(URL).sel(lat=slice(lat1, lat2), lon=slice(lon1, lon2), time=slice(time1, time2))
except OSError as e:
    print('Error:', e)
    print('Please check that your .dodsrc files are in their correct locations, or that your .netrc file has the correct username and password.')
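
In line with the caution at the top of this page, a long time range can be split into smaller windows that are requested one at a time. The sketch below shows one way to do that with calendar-month windows. The generator name `monthly_windows` and the window size are illustrative choices, not a GES DISC recommendation.

```python
from datetime import date, timedelta


def monthly_windows(start, end):
    """Yield (first_day, last_day) pairs covering start..end inclusive, one per calendar month."""
    if start > end:
        raise ValueError("start must not be after end")
    current = start
    while current <= end:
        # First day of the month that follows `current`.
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        # Clip the final window so it never runs past `end`.
        yield current, min(next_month - timedelta(days=1), end)
        current = next_month


for t0, t1 in monthly_windows(date(2020, 11, 15), date(2021, 1, 10)):
    # Each pair could be passed to the subset above, for example:
    #   ds.sel(time=slice(str(t0), str(t1)))
    print(t0, t1)
```

Processing each window and saving its result before requesting the next one keeps individual requests small.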

### Python Using 'netCDF4-python'

netCDF4-python is a Python library that uses the [netCDF-C](https://github.com/Unidata/netcdf-c) library to open and read netCDF4 files. It can be used to access OPeNDAP netCDF4 granules remotely, or to read netCDF4 granules that have been downloaded locally.

#### OPeNDAP in netCDF4-python:

This step requires a `.netrc` file in your root directory.

In [None]:
import netCDF4 as nc4

nc = nc4.Dataset('https://servername/opendap/path/file[.nc4[?subset]]')
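
Several steps on this page depend on a `.netrc` file. The sketch below checks that such a file contains an entry for the Earthdata Login host, using the standard-library `netrc` module. The helper name `has_earthdata_login` is illustrative. The sketch assumes the host name `urs.earthdata.nasa.gov` and a file named `.netrc` in your home directory; adjust the path if your file lives elsewhere.

```python
import netrc
import os

# Earthdata Login host that the .netrc entry is assumed to be keyed on.
EARTHDATA_HOST = "urs.earthdata.nasa.gov"


def has_earthdata_login(path=None):
    """Return True if the netrc file at `path` has credentials for Earthdata Login."""
    if path is None:
        # Assumed default location: ~/.netrc
        path = os.path.join(os.path.expanduser("~"), ".netrc")
    try:
        entry = netrc.netrc(path).authenticators(EARTHDATA_HOST)
    except (FileNotFoundError, netrc.NetrcParseError):
        return False
    return entry is not None


print("Earthdata entry found:", has_earthdata_login())
```

This confirms only that an entry exists. It does not verify that the username and password are valid.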