Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
117 changes: 116 additions & 1 deletion vignettes/03-get-images-python.Rmd
Original file line number Diff line number Diff line change
@@ -1 +1,116 @@
# Images Vignette
---
title: "Get Source Image Files"
output: html_document
---

# Objective: To be able to demonstrate how to locate and retrieve RGB image files

This vignette shows how to locate and retrieve image files associated with growing Season 6
from the University of Arizona's [Maricopa Agricultural Center](http://cals-mac.arizona.edu/)
using Python. The files are stored online on the data management system Clowder,
which is accessed using an API. We will be working with the image files generated during the
month of May by limiting the requests to that time period.

After completing this vignette it should be possible to search for and retrieve other
files through the use of the API.

As an added bonus we've also included an exmple of how to retrieve the list of available
sensor names through the API. By using the sensor names returned, it's possible to retrieve
other files containing the data the sensors have collected.

**requirements**
* Python 3
* the terrautils library
* this can be installed from pypi by running `pip install terrautils` in the terminal
* an API key to access these data

The API key is a string that gets generated upon request through your Clowder account. Existing
API keys will work with this vignette. To get a new API key it is necessary to first register
with Clowder at "https://terraref.ncsa.illinois.edu/clowder/". First click the `Login` button and
wait for the login screen to appear. Then select the `Sign up` button and enter an email
address you have access to. An email is sent to the entered address with instructions for
completing the registration process. Once registration is complete, log
into Clowder and select the `View profile` menu option from the drop-down that is near the search
control. By clicking the `+ Add` button under "User API Keys" heading in the profile page, a new
key is gnerated.

## Locating the images

To begin looking for files, a sensor name and site name are needed. We will be using
'RGB GeoTIFFs Datasets' as the sensor name and '' as the site name. Later in this
vignette we show how to retrieve the list of available sensors.

As mentioned in the overview, the url string will point to the API to use. In this case
we'll be using "https://terraref.ncsa.illinois.edu/clowder/api" and the key will be the
one you created for your Clowder account.

```{python eval=FALSE}
from terrautils.products import get_file_listing

url = 'https://terraref.ncsa.illinois.edu/clowder/api'
key = 'YOUR_KEY_GOES_HERE'
sensor = 'RGB GeoTIFFs Datasets'
sitename = ''
files = get_file_listing(None, url, key, sensor, sitename,
since='2018-05-01', until='2018-05-31')
```

The `files` variable now contains an array of all the file in the datasets that match the
sensor in the plot for the month of May. When performing you own queries it's possible that there
are no matches found and the `files` array would be empty.

# Retrieving the images

Now that we have a list of files we can retrieve them one-by-one. We do this by creating a URL
that identifies the file to retrieve, making the API call to retrieve the file contents, and writing
the contents to disk.

To create the correct URL we start with the one defined before and attach the keyword '/files/'
followed by the ID of each file. Assuming we have a file ID of '111', the final URL for retrieving
the file would be:

``` {sh eval=FALSE}
https://terraref.ncsa.illinois.edu/clowder/api/files/111
```

By looping through each of the returned files from the previous example, and using their ID and
filename, we can retrieve the files from the server and store them locally.

We are streaming the data returned from our server request (`stream=True` in the code below) due to
the high probability of large file sizes. If the `stream=True` parameter was omitted the file's entire
contents would be in the `r` variable which could then be written to the local file.

```{python eval=FALSE}
# We are using the same `url` and `key` variables declared in the previous example above.
filesurl = url + '/files/'
params={ 'key': key }

for f in files:
r = requests.get(fileurl + f.id, params=params, stream=True)
with open(f.filename, 'wb') as o:
for chunk in r.iter_content(chunk_size=1024):
if chunk:
o.write(chunk)

```

The images are now stored on the local file system.

# Retrieving sensor names

In this section we retrieve the names of different sensor types that are available. This will
allow you to retrieve files other than those containing RBG image data.

```{python eval=FALSE}
# We are using the same `url` and `key` variables declared in the previous example above.
from terrautils.products import get_sensor_list, unique_sensor_names

sensors = get_sensor_list(None, url, key)
names = unique_sensor_names(sensors)
```

The variable `names` will now contain the list of all available sensors. Using these sensor
names it's possible to use the above search to locate and then retrieve additional data files.
Substitute the new sensor name for 'RGB GeoTIFFs Datasets' where the variable `sensor` is
assigned above.