# Accessing Datasets under an Access Control List (ACL)

## <img src="https://logos-world.net/wp-content/uploads/2020/05/NASA-Logo-1959-present.png" width="100px" align="middle" /> NASA Earthdata API Client 🌍


> Note: Before we can use `earthaccess` we need an account with **[NASA EDL](https://urs.earthdata.nasa.gov/)** (Earthdata Login).


In [2]:
from earthaccess import Auth, DataCollections, DataGranules, Store

auth = Auth()

#### Auth()

`earthaccess`'s **Auth** class provides 3 different strategies to authenticate ourselves with NASA EDL.

* **netrc**: If we have a `.netrc` file with our EDL credentials, we can use it with `earthaccess`.
If we don't have one and want to create it, `earthaccess` lets us type our credentials and persist them to a `.netrc` file.
* **environment**: If we have our EDL credentials as environment variables:
  * `EARTHDATA_USERNAME`
  * `EARTHDATA_PASSWORD`
* **interactive**: We will be asked for our EDL credentials, with optional persistence to `.netrc`.
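
As a rough way to see which of these strategies could apply on a given machine, the sketch below checks for the two environment variables and for a `.netrc` entry using only the Python standard library. The helper name `available_strategies` and the `EDL_HOST` value are assumptions made for illustration; this is not how `earthaccess` itself decides.

```python
import netrc
import os
from pathlib import Path

# Assumption: the machine name under which EDL credentials are stored in .netrc.
EDL_HOST = "urs.earthdata.nasa.gov"


def available_strategies(env=None, netrc_path=None):
    """Hypothetical helper: list the strategies that look usable, in order."""
    env = os.environ if env is None else env
    netrc_path = Path(netrc_path) if netrc_path else Path.home() / ".netrc"
    found = []

    # environment: both variables must be set and non-empty.
    if env.get("EARTHDATA_USERNAME") and env.get("EARTHDATA_PASSWORD"):
        found.append("environment")

    # netrc: the file must exist and contain an entry for the EDL host.
    if netrc_path.is_file():
        try:
            if netrc.netrc(str(netrc_path)).authenticators(EDL_HOST):
                found.append("netrc")
        except (netrc.NetrcParseError, OSError):
            pass  # unreadable or malformed file: treat it as missing

    # interactive: typing credentials is always possible as a last resort.
    found.append("interactive")
    return found


print(available_strategies())
```

The printed list depends on the machine it runs on, so no expected output is shown here.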

To persist our credentials to a `.netrc` file we have to do the following:

```python
auth.login(strategy="interactive", persist=True)
```

In this notebook we'll try the `environment` strategy first and fall back to the `netrc` strategy. You can of course use the `interactive` strategy if you don't have a `.netrc` file.


In [3]:
auth.login(strategy="environment")
# are we authenticated?
if not auth.authenticated:
    auth.login(strategy="netrc")

You're now authenticated with NASA Earthdata Login
Using token with expiration date: 07/24/2022
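
The try-then-fall-back pattern in the cell above can be generalized to any ordered list of strategies. The following is a minimal sketch, assuming (as the cell implies) that `login()` does not raise on failure and that `authenticated` reflects the outcome. `login_with_fallback` and `FakeAuth` are hypothetical names; `FakeAuth` only exists so the example runs without credentials.

```python
def login_with_fallback(auth, strategies=("environment", "netrc")):
    """Try each strategy in order; return the one that worked, or None."""
    for strategy in strategies:
        auth.login(strategy=strategy)
        if auth.authenticated:
            return strategy
    return None


class FakeAuth:
    """Hypothetical stand-in for Auth: only one strategy succeeds."""

    def __init__(self, working_strategy):
        self.working_strategy = working_strategy
        self.authenticated = False

    def login(self, strategy):
        self.authenticated = strategy == self.working_strategy
        return self


print(login_with_fallback(FakeAuth("netrc")))  # prints "netrc"
```

With a real `Auth` instance the call would be `login_with_fallback(auth)`; a return value of `None` means none of the listed strategies succeeded.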


## Querying for restricted datasets

The `DataCollections` client can query CMR for any collection (dataset) using all of CMR's query parameters, and it has built-in functions to extract useful information from the response.

```python
auth.refresh_tokens()
```


If we belong to an early adopter group within NASA we can pass the Auth object to the other classes when we instantiate them.

```python
# An anonymous query to CMR
Query = DataCollections().keyword('elevation')
# An authenticated query to CMR
Query = DataCollections(auth).keyword('elevation')
```

The same applies to `DataGranules`:


```python
# An anonymous query to CMR
Query = DataGranules().keyword('elevation')
# An authenticated query to CMR
Query = DataGranules(auth).keyword('elevation')
```


> **Note**: Some collections under an access control list are flagged by CMR and are not counted in the number reported by `hits()`.


In [5]:
# The first step is to create a DataCollections query 
Query = DataCollections()

# Chain methods to customize our query
Query.short_name("ATL06").version("005")

print(f'Collections found: {Query.hits()}')

# Filter which UMM fields to print; to see the full record, omit the fields() filter.
# meta is always included
collections = Query.fields(['ShortName','Version']).get(5)
# Inspect some results, printing just the ShortName and Version
collections

Collections found: 1


[{
   "meta": {
     "concept-id": "C2144439155-NSIDC_ECS",
     "granule-count": 123484,
     "provider-id": "NSIDC_ECS"
   },
   "umm": {
     "ShortName": "ATL06",
     "Version": "005"
   }
 }]

In [6]:
if not auth.refresh_tokens():
    print("Something went wrong, we may need to regenerate our tokens manually")

earthaccess generated a token for CMR with expiration on: 07/24/2022


In [7]:
Query = DataCollections(auth)

# Chain methods to customize our query
Query.short_name("ATL06").version("005")

# This will say 1, even though we get 2 back.
print(f'Collections found: {Query.hits()}')

collections = Query.fields(['ShortName','Version']).get()
# Inspect some results, printing just the ShortName and Version
collections

Collections found: 1


[{
   "meta": {
     "concept-id": "C2144439155-NSIDC_ECS",
     "granule-count": 123484,
     "provider-id": "NSIDC_ECS"
   },
   "umm": {
     "ShortName": "ATL06",
     "Version": "005"
   }
 },
 {
   "meta": {
     "concept-id": "C2153572614-NSIDC_CPRD",
     "granule-count": 123484,
     "provider-id": "NSIDC_CPRD"
   },
   "umm": {
     "ShortName": "ATL06",
     "Version": "005"
   }
 }]


**Oh no! What!? Only 1 collection found even though we got 2 results back?!**

#### Interpreting the results

The `hits()` method above reports the number of query hits, but only for publicly available datasets.
Because cloud-hosted ICESat-2 data are not yet publicly available, CMR reports `1` hit here; if we filtered `DataCollections` by the provider `NSIDC_CPRD`, we would get `0` hits. For now we need another way to see how many cloud datasets are available at NSIDC. This is only temporary, until cloud-hosted ICESat-2 data become publicly available. We can create a collections object (we're going to want one soon anyway) and print its `len()` to see the true number of results.
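
To make the `len()` check concrete, here is a small sketch that counts the returned records and breaks them down by provider. The list `records` is a plain-dict stand-in built from the two results printed above (trimmed to the fields used); in the notebook itself the same logic would run on `collections`.

```python
from collections import Counter

# Plain dicts standing in for the two records returned by the authenticated query.
records = [
    {
        "meta": {"concept-id": "C2144439155-NSIDC_ECS", "provider-id": "NSIDC_ECS"},
        "umm": {"ShortName": "ATL06", "Version": "005"},
    },
    {
        "meta": {"concept-id": "C2153572614-NSIDC_CPRD", "provider-id": "NSIDC_CPRD"},
        "umm": {"ShortName": "ATL06", "Version": "005"},
    },
]

# Count how many records each provider contributed.
per_provider = Counter(record["meta"]["provider-id"] for record in records)

print(f"Records returned: {len(records)}")
for provider, count in per_provider.items():
    print(f"  {provider}: {count}")
```

Here the `NSIDC_CPRD` record is the one that `hits()` left out of its count.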

> **Note**: Since we cannot rely on `hits()`, we need to be aware that `get()` may return far more metadata records than we expect, depending on the dataset and how broad our query is.
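
One way to guard against an unexpectedly large result set is to pass a limit to `get()`, as the earlier `get(5)` call does, and check whether the limit was reached. The sketch below illustrates the idea; `get_capped` and `FakeQuery` are hypothetical names, and `FakeQuery` only mimics a query object so the example runs offline.

```python
def get_capped(query, cap=100):
    """Fetch at most `cap` records and report whether the cap was reached."""
    results = query.get(cap)
    # Reaching the cap suggests more records may exist than were fetched.
    return results, len(results) >= cap


class FakeQuery:
    """Hypothetical stand-in for a DataCollections or DataGranules query."""

    def __init__(self, total):
        self._records = [{"id": i} for i in range(total)]

    def get(self, limit):
        return self._records[:limit]


results, maybe_truncated = get_capped(FakeQuery(250), cap=100)
print(len(results), maybe_truncated)  # prints "100 True"
```

If `maybe_truncated` is true, narrowing the query (for example with a tighter temporal range or bounding box) is usually safer than raising the cap.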


In [8]:
Query = DataGranules(auth).concept_id("C2153572614-NSIDC_CPRD").bounding_box(-134.7,58.9,-133.9,59.2).temporal("2020-03-01", "2020-03-30")

# Unfortunately, the hits() method behaves the same way for granule queries
print(f"Granules found with hits(): {Query.hits()}")

cloud_granules = Query.get()

print(f"Actual number found: {len(cloud_granules)}")

Granules found with hits(): 0
Actual number found: 4


In [None]:
store = Store(auth)
files = store.get(cloud_granules, "./data/C2153572614-NSIDC_CPRD/")