# Downloading data

To download data from EarthData, we can use the data-downloader package. We first import its `downloader` module.

In [2]:
from data_downloader import downloader

### Signing in

EarthData requires a username and password so that it can record who downloads data. We set up a Netrc object that stores these credentials, which allows the downloader to authenticate on our behalf.

In [None]:
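netrc = downloader.Netrc()
# Replace 'Username' and 'Password' with your own EarthData credentials.
netrc.add('urs.earthdata.nasa.gov','Username','Password')
# Print the stored hosts to confirm that the entry was added.
print(netrc.hosts)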
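
For reference, a netrc entry pairs a host with a login and a password. The sketch below uses Python's standard-library `netrc` module (not data-downloader itself) to parse a temporary file in that format. The credentials are placeholders, and it is an assumption that `downloader.Netrc` stores entries in exactly this layout.

```python
import netrc
import os
import tempfile

# Placeholder credentials in the standard netrc layout.
entry = "machine urs.earthdata.nasa.gov login Username password Password\n"

# Write the entry to a temporary file so any real ~/.netrc is left untouched.
with tempfile.NamedTemporaryFile("w", suffix=".netrc", delete=False) as handle:
    handle.write(entry)
    path = handle.name

try:
    parsed = netrc.netrc(path)
    # authenticators() returns a (login, account, password) tuple for the host.
    login, _account, password = parsed.authenticators("urs.earthdata.nasa.gov")
finally:
    os.remove(path)

print(login)
```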

Next, we set up the URL from which the data will be downloaded. Each data product's granules are retrieved from URLs with a different pattern, and some patterns are easier to understand than others.

### Examples of MODIS (fire) links:

The links below work for downloading a granule of MODIS fires data. 

If no browser is open, these links download information about a MODIS fires granule. If a browser is open and signed in to EarthData, they download the specified granule itself.

This link downloads an image with labels indicating where fires are recorded. 

### Examples of OMI/Aura (NO2) links:

This is an example of a link used to download NO2 data:

This link downloads a manual-style document for this product.

### Examples of Night Time Lights (power outages) links:

The following URLs can be used to download a granule from the NTL power outages data. You may have to download a granule manually first to convince your computer to recognize the text as a valid link.

### Examples of Lightning Imaging Sensor (flashes) links:

The giant links below can be used to download a specific granule from the LIS flash data. 

When pasted into the browser's address bar, the link below downloads the LIS file to the Downloads folder. Curiously, it does not work here.

Copy a URL from the cell above and paste it in the box below to try using one of these links.

In [None]:
url = "https://"

Finally, we use the downloader to download the data onto this computer. The code below saves the file in the same folder as this notebook.

In [None]:
downloader.download_data(url, authorize_from_browser=True)

Optionally, we can specify a few other parameters for this function:

#### `downloader.download_data(url, folder=None, authorize_from_browser=False, file_name=None, client=None, retry=0)`

- **url** - str - URL of the web file
- **folder** - str - the folder in which to store output files. Default is the current folder. Ex: `C:\Users\Me\University\GEODAC\downloading_files`
- **authorize_from_browser** - bool - whether to load the cookies used by your web browser for authorization. This means you can use Python to download data after logging in to the website via a browser (so far the following browsers are supported: Chrome, Firefox, Opera, Edge, Chromium). It is very useful when the website doesn't support "HTTP Basic Auth". Default is False.
- **file_name** - str - the file name. If None, it is parsed from the web response or the URL. file_name can be an absolute path if folder is None.
- **client** - httpx.Client() object - client maintaining the connection. Default is None.
- **retry** - int - number of reconnects when the status code is 503. Default is 0.
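
When `file_name` is None, the name has to come from the web response or the URL. One plausible way to derive it from the URL alone, using only the standard library, is sketched below. This illustrates the idea and is not necessarily how data-downloader does it; `name_from_url` is a hypothetical helper.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def name_from_url(url):
    """Return the last path segment of a URL, ignoring any query string."""
    return PurePosixPath(urlparse(url).path).name


url = ("https://data.ghrc.earthdata.nasa.gov/ghrcw-protected/"
       "isslis_v2_nqc__2/202206/ISS_LIS_SC_V2.1_20220616_201621_NQC.hdf")
print(name_from_url(url))  # ISS_LIS_SC_V2.1_20220616_201621_NQC.hdf
```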

Another potentially useful function for our purposes is `download_datas`. It downloads files from all the URLs we provide in a list.

#### `downloader.download_datas(urls, folder=None, authorize_from_browser=False, file_names=None)`

- **urls** - iterator - iterator containing the URLs
- **folder** - str - the folder in which to store output files. Default is the current folder.
- **authorize_from_browser** - bool - whether to load the cookies used by your web browser for authorization. This means you can use Python to download data after logging in to the website via a browser (so far the following browsers are supported: Chrome, Firefox, Opera, Edge, Chromium). It is very useful when the website doesn't support "HTTP Basic Auth". Default is False.
- **file_names** - iterator - iterator containing the names of the files. Leave it as None if you want the program to parse them from the website. file_names can contain absolute paths if folder is None.
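
A list of URLs for `download_datas` often starts out as text with one link per line. The sketch below cleans such text into a list and passes it on. `read_urls` and `download_all` are hypothetical helpers, the example links are placeholders rather than real granules, and the call uses only the parameters listed above.

```python
def read_urls(text):
    """Return the non-empty, non-comment lines of text, without duplicates."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and line not in urls:
            urls.append(line)
    return urls


def download_all(text, folder=None):
    """Download every URL found in text (requires data-downloader)."""
    # Imported here so that read_urls can be used without the package installed.
    from data_downloader import downloader
    downloader.download_datas(read_urls(text), folder=folder,
                              authorize_from_browser=True)


sample = """
# placeholder links, not real granules
https://example.com/granule_1.hdf
https://example.com/granule_2.hdf
https://example.com/granule_1.hdf
"""
print(read_urls(sample))
```

Calling `download_all(sample)` would then attempt the downloads into the current folder.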

## Getting around the LIS' weird links

To avoid the complicated links this method requires to access the LIS flash data, we can use another library called webbrowser. This library opens a URL in a web browser, so we can use much simpler URLs to access the LIS data.

This library is included in a standard Python installation, so we don't have to install it. We do still have to import it before using it:

In [4]:
import webbrowser

Next, we set up the URL as before, but now we can use the simpler LIS link.

In [5]:
url = "https://data.ghrc.earthdata.nasa.gov/ghrcw-protected/isslis_v2_nqc__2/202206/ISS_LIS_SC_V2.1_20220616_201621_NQC.hdf" 
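
The link above appears to encode the granule's date and time. The sketch below rebuilds it from a `datetime`, assuming the folder is the year and month and that the two numbers in the file name are the date and a start time in HHMMSS. That pattern is inferred from this single example and may not hold for other granules; `lis_url` is a hypothetical helper.

```python
from datetime import datetime

BASE = "https://data.ghrc.earthdata.nasa.gov/ghrcw-protected/isslis_v2_nqc__2"


def lis_url(start):
    """Build a LIS granule URL from a start time (the pattern is an assumption)."""
    return (f"{BASE}/{start:%Y%m}/"
            f"ISS_LIS_SC_V2.1_{start:%Y%m%d}_{start:%H%M%S}_NQC.hdf")


print(lis_url(datetime(2022, 6, 16, 20, 16, 21)))
```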

Now we can call `webbrowser.open(url)` to open the link in a compatible browser.

The library can recognize the following: mozilla, firefox, netscape, galeon, epiphany, skipstone, kfmclient, konqueror, kfm, mosaic, opera, grail, links, elinks, lynx, w3m, windows-default, macosx, safari, google-chrome, chrome, chromium, and chromium-browser. (Konqueror, kfm, windows-default, macosx, and safari are not recommended as they are platform specific.) 

In [None]:
webbrowser.open(url)

An added complication with this method is that we need to be signed in to EarthData in the browser; otherwise, it will prompt for a password.
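
To open the link in a particular browser from the list above instead of the default one, `webbrowser.get` can be given its name. The sketch below wraps this in a hypothetical helper, `open_in_browser`, which reports failure instead of raising an error when the named browser is not available.

```python
import webbrowser


def open_in_browser(url, name=None):
    """Open url in the named browser (or the default); return True on success."""
    try:
        browser = webbrowser.get(name)  # name=None selects the default browser
    except webbrowser.Error:
        return False  # no runnable browser with that name was found
    return browser.open(url)
```

For example, `open_in_browser(url, "firefox")` tries Firefox, while `open_in_browser(url)` falls back to the default browser.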