Fleshed out images vignette #86
Changes from all commits
1541181
ad5b12c
ea0874f
413a538
f4f25d3
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1 +1,116 @@ | ||
| # Images Vignette | ||
| --- | ||
| title: "Get Source Image Files" | ||
| output: html_document | ||
| --- | ||
|
|
||
| # Objective: Demonstrate how to locate and retrieve RGB image files | ||
|
|
||
| This vignette shows how to locate and retrieve image files associated with growing Season 6 | ||
| from the University of Arizona's [Maricopa Agricultural Center](http://cals-mac.arizona.edu/) | ||
| using Python. The files are stored online on the data management system Clowder, | ||
| which is accessed using an API. We will be working with the image files generated during the | ||
| month of May by limiting the requests to that time period. | ||
|
|
||
| After completing this vignette it should be possible to search for and retrieve other | ||
| files through the use of the API. | ||
|
|
||
| As an added bonus we've also included an example of how to retrieve the list of available | ||
| sensor names through the API. By using the sensor names returned, it's possible to retrieve | ||
| other files containing the data the sensors have collected. | ||
|
|
||
| **requirements** | ||
| * Python 3 | ||
| * the terrautils library | ||
| * this can be installed from pypi by running `pip install terrautils` in the terminal | ||
| * an API key to access these data | ||
|
|
||
| The API key is a string that gets generated upon request through your Clowder account. Existing | ||
| API keys will work with this vignette. To get a new API key it is necessary to first register | ||
| with Clowder at "https://terraref.ncsa.illinois.edu/clowder/". First click the `Login` button and | ||
| wait for the login screen to appear. Then select the `Sign up` button and enter an email | ||
| address you have access to. An email is sent to the entered address with instructions for | ||
| completing the registration process. Once registration is complete, log | ||
| into Clowder and select the `View profile` menu option from the drop-down that is near the search | ||
| control. Clicking the `+ Add` button under the "User API Keys" heading on the profile page | ||
| generates a new key. | ||
|
|
||
| # Locating the images | ||
|
|
||
| To begin looking for files, a sensor name and site name are needed. We will be using | ||
| 'RGB GeoTIFFs Datasets' as the sensor name and an empty string ('') as the site name. Later in this | ||
| vignette we show how to retrieve the list of available sensors. | ||
|
|
||
| The `url` string points to the Clowder API mentioned in the overview. In this case | ||
| we'll be using "https://terraref.ncsa.illinois.edu/clowder/api", and the `key` string is the | ||
| API key you created for your Clowder account. | ||
|
|
||
| ```{python eval=FALSE} | ||
| from terrautils.products import get_file_listing | ||
|
|
||
| url = 'https://terraref.ncsa.illinois.edu/clowder/api' | ||
| key = 'YOUR_KEY_GOES_HERE' | ||
| sensor = 'RGB GeoTIFFs Datasets' | ||
| sitename = '' | ||
| files = get_file_listing(None, url, key, sensor, sitename, | ||
| since='2018-05-01', until='2018-05-31') | ||
|
Member: Even after adding my API key, I get the following error:
Contributor (Author): It looks like you have 'apiapi' in the URL. Is that a copy/paste error or is something not clear?
||
| ``` | ||
|
|
||
| The `files` variable now contains an array of all the files in the datasets that match the | ||
| sensor and site name for the month of May. When performing your own queries it's possible that | ||
| no matches are found, in which case the `files` array is empty. | ||
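Because an empty result is easy to overlook, it can help to check the listing before trying to download anything. The following is a minimal sketch of such a check; `describe_listing` is a hypothetical helper, not part of terrautils, and it makes no assumption about what each file record contains.

```python
def describe_listing(files):
    """Return a short message describing how many file records were found."""
    # An empty (or missing) listing means the query matched nothing.
    if not files:
        return "No files matched the query; try a wider date range or another sensor."
    return "Found {} file(s).".format(len(files))


# An empty list stands in for a query that returned no matches.
print(describe_listing([]))
```

Calling `describe_listing(files)` right after the query gives an early signal that the date range or sensor name may need adjusting.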
|
|
||
| # Retrieving the images | ||
|
|
||
| Now that we have a list of files we can retrieve them one-by-one. We do this by creating a URL | ||
| that identifies the file to retrieve, making the API call to retrieve the file contents, and writing | ||
| the contents to disk. | ||
|
|
||
| To create the correct URL we start with the one defined before and attach the keyword '/files/' | ||
| followed by the ID of each file. Assuming we have a file ID of '111', the final URL for retrieving | ||
| the file would be: | ||
|
|
||
| ``` {sh eval=FALSE} | ||
| https://terraref.ncsa.illinois.edu/clowder/api/files/111 | ||
| ``` | ||
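The URL construction described above can be sketched as a small function. `file_download_url` is a hypothetical helper introduced only for illustration; stripping a trailing slash from the base URL is an added safeguard, not something the vignette requires.

```python
def file_download_url(api_url, file_id):
    """Build the download URL for one file: <api_url>/files/<file_id>."""
    # Remove any trailing slash so the result never contains '//files/'.
    return api_url.rstrip('/') + '/files/' + str(file_id)


url = 'https://terraref.ncsa.illinois.edu/clowder/api'
print(file_download_url(url, '111'))
# prints https://terraref.ncsa.illinois.edu/clowder/api/files/111
```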
|
|
||
| By looping through each of the returned files from the previous example, and using their ID and | ||
| filename, we can retrieve the files from the server and store them locally. | ||
|
|
||
| We are streaming the data returned from our server request (`stream=True` in the code below) due to | ||
| the high probability of large file sizes. If the `stream=True` parameter were omitted, the file's entire | ||
| contents would be held in memory in the `r` variable, from which they could then be written to the local file. | ||
|
|
||
| ```{python eval=FALSE} | ||
| # We are using the same `url` and `key` variables declared in the previous example above. | ||
| import requests | ||
|
|
||
| fileurl = url + '/files/' | ||
| params = { 'key': key } | ||
|
|
||
| for f in files: | ||
|     # Stream each file so its contents are not held in memory all at once. | ||
|     r = requests.get(fileurl + f.id, params=params, stream=True) | ||
|     with open(f.filename, 'wb') as o: | ||
|         for chunk in r.iter_content(chunk_size=1024): | ||
|             if chunk: | ||
|                 o.write(chunk) | ||
|
|
||
| ``` | ||
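To see why writing in chunks keeps memory use low, here is a stand-alone sketch of the same pattern that needs no network access. It is an illustration only: in-memory byte streams stand in for the server response and the local file, and `write_in_chunks` is a hypothetical helper rather than part of requests or terrautils.

```python
import io


def write_in_chunks(source, destination, chunk_size=1024):
    """Copy bytes from source to destination one chunk at a time.

    Returns the total number of bytes written.
    """
    total = 0
    while True:
        # Only chunk_size bytes are held in memory at any moment.
        chunk = source.read(chunk_size)
        if not chunk:
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# 2500 bytes of stand-in data are copied as three chunks (1024 + 1024 + 452).
source = io.BytesIO(b'x' * 2500)
destination = io.BytesIO()
print(write_in_chunks(source, destination))  # prints 2500
```

A larger `chunk_size` means fewer write calls at the cost of more memory per chunk; 1024 matches the value used in the download loop.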
|
|
||
| The images are now stored on the local file system. | ||
|
|
||
| # Retrieving sensor names | ||
|
|
||
| In this section we retrieve the names of different sensor types that are available. This will | ||
| allow you to retrieve files other than those containing RGB image data. | ||
|
|
||
| ```{python eval=FALSE} | ||
| # We are using the same `url` and `key` variables declared in the previous example above. | ||
| from terrautils.products import get_sensor_list, unique_sensor_names | ||
|
|
||
| sensors = get_sensor_list(None, url, key) | ||
| names = unique_sensor_names(sensors) | ||
| ``` | ||
|
|
||
| The variable `names` will now contain the list of all available sensors. Using these sensor | ||
| names it's possible to use the above search to locate and then retrieve additional data files. | ||
| Substitute the new sensor name for 'RGB GeoTIFFs Datasets' where the variable `sensor` is | ||
| assigned above. | ||
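When the list of sensor names is long, a small filter can help pick the name to substitute. The sketch below assumes `names` is a list of strings; `matching_sensor_names` is a hypothetical helper, and every entry in `example_names` other than 'RGB GeoTIFFs Datasets' is a made-up placeholder rather than a real sensor name.

```python
def matching_sensor_names(names, keyword):
    """Return the sensor names that contain keyword, ignoring case."""
    keyword = keyword.lower()
    return [name for name in names if keyword in name.lower()]


# Placeholder data: only the first entry appears in this vignette.
example_names = ['RGB GeoTIFFs Datasets', 'Placeholder Sensor A', 'Placeholder Sensor B']
print(matching_sensor_names(example_names, 'rgb'))
# prints ['RGB GeoTIFFs Datasets']
```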
|
|
||