From 7b8751d684da8b3243d641d9fac7add42471110f Mon Sep 17 00:00:00 2001 From: kimberlyh66 <44081116+kimberlyh66@users.noreply.github.com> Date: Mon, 5 Nov 2018 13:58:46 -0800 Subject: [PATCH 01/83] fixed link to agronomic metadata tutorial --- traits/01-web-access.Rmd | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/traits/01-web-access.Rmd b/traits/01-web-access.Rmd index e0b4872..f3dc7cb 100644 --- a/traits/01-web-access.Rmd +++ b/traits/01-web-access.Rmd @@ -1,4 +1,4 @@ ---- +6--- title: "Accessing Trait Data Via the BETYdb Web Interface" author: "David LeBauer" date: "`r Sys.Date()`" @@ -32,4 +32,4 @@ On the Welcome page there is a search option for trait and yield data. This tool * if you want all of the data, including data that has not gone through QA/QC, make sure to check the 'include unchecked records' option * in the upper right, you will see a button that will allow you to download the search results as a CSV file. Click it. Open the file in a text editor or spreadsheet program and review its contents. -Note that the web interface only provides a core set of data and limited meta-data. To access all of the data within BETYdb, it is necessary to search and merge multiple tables. More complex queries, such as those in the [Agronomic metadata](../traits/04-agronomic-metadata.Rmd). +Note that the web interface only provides a core set of data and limited meta-data. To access all of the data within BETYdb, it is necessary to search and merge multiple tables. More complex queries, such as those in the [Agronomic metadata](../traits/06-agronomic-metadata.Rmd). From 6872cc78fb04b0f043187a1222345a4a5995face Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 6 Nov 2018 13:34:50 -0800 Subject: [PATCH 02/83] Updated tutorial. Added a summary showing the 3 ways to access API data. 
--- traits/02-betydb-api-access.Rmd | 219 ++++++++++++++++---------------- 1 file changed, 112 insertions(+), 107 deletions(-) diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd index 8d90266..c579098 100644 --- a/traits/02-betydb-api-access.Rmd +++ b/traits/02-betydb-api-access.Rmd @@ -1,107 +1,112 @@ ---- -title: "Accessing Trait Data Via the BETYdb API" -author: "David LeBauer" -date: "11/7/2017" -output: html_document ---- - - -## Using URLs to construct Queries - -The first step toward reproducible pipelines is to automate the process of searching the database and returning results. This is one of the key roles of an Application programming interface, or 'API'. You can learn to use the API in less than 20 minutes, starting now. - -### What is an API? - -An API is an 'Application Programming Interface'. An API is a way that you and your software can connect to and access data. - -All of our databases have web interfaces for humans to browse as well as APIs that are constructed as URLs. - - -### Using Your API key to Connect - -An API key is like a password. It allows you to access data, and should be kept private. -Therefore, we are not going to put it in code that we share. The one exception is the key 9999999999999999999999999999999999999999 that will allow you to access metadata tables (all tables except _traits_ and _yields_). It will also allow you to access all of the simulated data in the terraref.ncsa.illinois.edu/bety-test database. - -A common way of handling private API keys is to place it in a text file in your home directory. -Don't put it in a project directory where it might be inadvertently shared. - -Here is how to find and save your API key: - -* click file --> new --> text file -* copy the api key that was sent when you registered into the file -* file --> save as '~/.betykey' - -For the public key, you can call this file `~/.betykey_public`. 
- -### Components of a URL query - - -* base url: `terraref.ncsa.illinois.edu/bety` -* path to the api: `/api/beta` -* api endpoint: `/search` or `traits` or `sites`. For BETYdb, these are the names of database tables. -* Query parameters: `genus=Sorghum` -* Authentication: `key=9999999999999999999999999999999999999999` is the public key for the TERRA REF traits database. - - -### Constructing a URL query - -First, lets construct a query by putting together a URL. - -1. start with the database url: `terraref.ncsa.illinois.edu/bety` - * this url brings you to the home page -2. Add the path to the API, `/api/beta` - * now we have terraref.ncsa.illinois.edu/bety/api/beta, which points to the API documentation -3. Add the name of the table you want to query. Lets start with `variables` - * terraref.ncsa.illinois.edu/bety/api/beta/variables -4. add query terms by appending a `?` and combining with `&`, for example: - * `key=9999999999999999999999999999999999999999` - * `type=trait` where the variable type is 'trait' - * `name=~height` where the variable name contains 'height' -5. This is your complete query: - * `terraref.ncsa.illinois.edu/bety/api/beta/variables?type=trait&name=~height&key=9999999999999999999999999999999999999999` - * it will query all variables that are type trait and have 'height' in the name - * Does it return the expected values? - - -#### Your Turn - -> What will the URL https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=9999999999999999999999999999999999999999 return? - -> write a URL that will query the database for sites with "Field Scanner" in the name field. Hint: combine two terms with a `+` as in `Field+Scanner` - -What do you see? Do you think that this is all of the records? What happens if you add `&limit=none`? 
- -### Our first Query - -#### Shell - -```sh -wget -O sorghum.json \\ # -O names the output file - "https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=999999999999999999999999999999999999 -9999" -``` - -If you want to write the query without exposing the key in plain text, you can construct it thus: - -```sh -wget -O sorghum.json \\ - "https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=`cat ~/.betykey_public`" -``` - -> What does `cat ~/.betykey_public` do? - -> How can you look at the files? - - -#### R - using the jsonlite package - -```{r text-api} -sorghum.json <- readLines( - paste0("https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=", - readLines('~/.betykey'))) - -## print(sorghum.json) -## not a particularly useful format -## lets convert to a data frame -sorghum <- jsonlite::fromJSON(sorghum.json) -``` +--- +title: "Accessing Trait Data Via the BETYdb API" +author: "David LeBauer" +date: "11/7/2017" +output: html_document +--- + + +## Using URLs to construct Queries + +The first step toward reproducible pipelines is to automate the process of searching the database and returning results. This is one of the key roles of an Application programming interface, or 'API'. You can learn to use the API in less than 20 minutes, starting now. + +### What is an API? + +An API is an 'Application Programming Interface'. An API is a way that you and your software can connect to and access data. + +All of our databases have web interfaces for humans to browse as well as APIs that are constructed as URLs. + + +### Using Your API key to Connect + +An API key is like a password. It allows you to access data, and should be kept private. +Therefore, we are not going to put it in code that we share. The one exception is the key 9999999999999999999999999999999999999999 that will allow you to access metadata tables (all tables except _traits_ and _yields_). 
It will also allow you to access all of the simulated data in the terraref.ncsa.illinois.edu/bety-test database. + +A common way of handling private API keys is to place it in a text file in your home directory. +Don't put it in a project directory where it might be inadvertently shared. + +Here is how to find and save your API key: + +* click file --> new --> text file +* copy the api key that was sent when you registered into the file +* file --> save as '~/.betykey' + +For the public key, you can call this file `~/.betykey_public`. + +### URL query + +## Components of a URL query + +* base url: `terraref.ncsa.illinois.edu/bety` +* path to the api: `/api/beta` +* api endpoint: `/search` or `traits` or `sites`. For BETYdb, these are the names of database tables. +* Query parameters: `genus=Sorghum` +* Authentication: `key=9999999999999999999999999999999999999999` is the public key for the TERRA REF traits database. + +### Ways to access API data + +1. Through a URL query +2. Using the bash shell +3. Using the R jsonlite package + + +## Constructing a URL query + +First, lets construct a query by putting together a URL. + +1. start with the database url: `terraref.ncsa.illinois.edu/bety` + * this url brings you to the home page +2. Add the path to the API, `/api/beta` + * now we have terraref.ncsa.illinois.edu/bety/api/beta, which points to the API documentation +3. Add the name of the table you want to query. Lets start with `variables` + * terraref.ncsa.illinois.edu/bety/api/beta/variables +4. add query terms by appending a `?` and combining with `&`, for example: + * `key=9999999999999999999999999999999999999999` + * `type=trait` where the variable type is 'trait' + * `name=~height` where the variable name contains 'height' +5. 
This is your complete query: + * `terraref.ncsa.illinois.edu/bety/api/beta/variables?type=trait&name=~height&key=9999999999999999999999999999999999999999` + * it will query all variables that are type trait and have 'height' in the name + * Does it return the expected values? + + +#### Your Turn + +> What will the URL https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=9999999999999999999999999999999999999999 return? + +> Write a URL that will query the database for sites with "Field Scanner" in the name field. Hint: combine two terms with a `+` as in `Field+Scanner` + +What do you see? Do you think that this is all of the records? What happens if you add `&limit=none`? + + +#### Shell + +```sh +wget -O sorghum.json \\ # -O names the output file + "https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=9999999999999999999999999999999999999999" +``` + +If you want to write the query without exposing the key in plain text, you can construct it thus: + +```sh +wget -O sorghum.json \\ + "https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=`cat ~/.betykey_public`" +``` + +> What does `cat ~/.betykey_public` do? + +> How can you look at the files? + + +#### R - using the jsonlite package + +```{r text-api} +sorghum.json <- readLines( + paste0("https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=", + readLines('~/.betykey'))) + +## print(sorghum.json) +## not a particularly useful format +## lets convert to a data frame +sorghum <- jsonlite::fromJSON(sorghum.json) +``` From 2c80e2e05b400f01c8fb9b2b2c5c92b92d790ddd Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 6 Nov 2018 14:30:41 -0800 Subject: [PATCH 03/83] Further updated the tutorial on how to access API using URL query and bash shell. 
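The URL query that the tutorial assembles step by step (base URL, API path, table name, then query terms joined with `&`) can be sketched in shell as below. This is only an illustrative sketch and not part of the patch: `build_query_url` and the variable names are hypothetical, and the API path is `/api/beta` as the tutorial has it at this commit.

```sh
# Components of a BETYdb API query, as listed in the tutorial.
BASE_URL="https://terraref.ncsa.illinois.edu/bety"
API_PATH="/api/beta"   # later patches in this series switch to /api/v1
ENDPOINT="variables"   # the database table to query
PUBLIC_KEY="9999999999999999999999999999999999999999"

# build_query_url is a hypothetical helper, not a BETYdb tool.
# First argument: the endpoint; remaining arguments: name=value query terms.
build_query_url() {
  endpoint="$1"
  shift
  query=""
  for term in "$@"; do
    if [ -z "$query" ]; then
      query="$term"          # first term follows the "?"
    else
      query="$query&$term"   # later terms are joined with "&"
    fi
  done
  printf '%s%s/%s?%s\n' "$BASE_URL" "$API_PATH" "$endpoint" "$query"
}

# The terms are quoted so the shell does not interpret "~" or "&".
QUERY_URL=$(build_query_url "$ENDPOINT" "type=trait" "name=~height" "key=$PUBLIC_KEY")
echo "$QUERY_URL"
```

The printed URL can be pasted into a browser or passed, in quotes, to the download command used in the shell section of the tutorial.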
--- traits/02-betydb-api-access.Rmd | 34 +++++++++++++++++---------------- 1 file changed, 18 insertions(+), 16 deletions(-) diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd index c579098..d564018 100644 --- a/traits/02-betydb-api-access.Rmd +++ b/traits/02-betydb-api-access.Rmd @@ -33,7 +33,14 @@ Here is how to find and save your API key: For the public key, you can call this file `~/.betykey_public`. -### URL query +### Ways to access API data + +1. Through a URL query +2. Using the bash shell +3. Using the R jsonlite package + + +### Accessing data using a URL query ## Components of a URL query @@ -43,13 +50,6 @@ For the public key, you can call this file `~/.betykey_public`. * Query parameters: `genus=Sorghum` * Authentication: `key=9999999999999999999999999999999999999999` is the public key for the TERRA REF traits database. -### Ways to access API data - -1. Through a URL query -2. Using the bash shell -3. Using the R jsonlite package - - ## Constructing a URL query First, lets construct a query by putting together a URL. @@ -69,8 +69,7 @@ First, lets construct a query by putting together a URL. * it will query all variables that are type trait and have 'height' in the name * Does it return the expected values? - -#### Your Turn +## Your Turn > What will the URL https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=9999999999999999999999999999999999999999 return? @@ -78,25 +77,28 @@ First, lets construct a query by putting together a URL. What do you see? Do you think that this is all of the records? What happens if you add `&limit=none`? 
- -#### Shell + +#### Accessing data using the Shell + +Type the following command into a bash shell (the -O option names the output file): ```sh -wget -O sorghum.json \\ # -O names the output file +wget -O sorghum.json "https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=9999999999999999999999999999999999999999" ``` -If you want to write the query without exposing the key in plain text, you can construct it thus: +If you want to write the query without exposing the key in plain text, you can construct it like this: ```sh -wget -O sorghum.json \\ +wget -O sorghum.json "https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=`cat ~/.betykey_public`" ``` -> What does `cat ~/.betykey_public` do? +> Do you know what `cat ~/.betykey_public`. Run the command and see what it outputs in your shell. > How can you look at the files? +Hint: search what a json file is #### R - using the jsonlite package From 67665616699a6fc5a8f75e0890c4e7bd667a3d42 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 6 Nov 2018 14:54:30 -0800 Subject: [PATCH 04/83] Further updated tutorial. Added short comment on the use of the fromJSON function. --- traits/02-betydb-api-access.Rmd | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd index d564018..87db5fa 100644 --- a/traits/02-betydb-api-access.Rmd +++ b/traits/02-betydb-api-access.Rmd @@ -100,7 +100,9 @@ wget -O sorghum.json Hint: search what a json file is -#### R - using the jsonlite package +### Accessing API data using the R jsonlite package + +JSON content can be converted into a R data frame using the fromJSON function. 
```{r text-api} sorghum.json <- readLines( From b017aa2219493314f07984f445cc631ab6818664 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 6 Nov 2018 15:49:41 -0800 Subject: [PATCH 05/83] Further updated R jsonlite portion of tutorial --- traits/02-betydb-api-access.Rmd | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd index 87db5fa..e828afe 100644 --- a/traits/02-betydb-api-access.Rmd +++ b/traits/02-betydb-api-access.Rmd @@ -102,7 +102,8 @@ Hint: search what a json file is ### Accessing API data using the R jsonlite package -JSON content can be converted into a R data frame using the fromJSON function. +JSON content can be converted into a R object using the fromJSON function. +Access the 'data' element of the R object to get the data frame. ```{r text-api} sorghum.json <- readLines( @@ -113,4 +114,5 @@ sorghum.json <- readLines( ## not a particularly useful format ## lets convert to a data frame sorghum <- jsonlite::fromJSON(sorghum.json) +sorghum.dataframe <- sorghum$data ``` From bd4c8f426d4f49dc7975cf86b2bd705d63293a06 Mon Sep 17 00:00:00 2001 From: David LeBauer Date: Wed, 7 Nov 2018 10:01:45 -0800 Subject: [PATCH 06/83] Update traits/01-web-access.Rmd Co-Authored-By: kimberlyh66 <44081116+kimberlyh66@users.noreply.github.com> --- traits/01-web-access.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/traits/01-web-access.Rmd b/traits/01-web-access.Rmd index f3dc7cb..fb7d2dd 100644 --- a/traits/01-web-access.Rmd +++ b/traits/01-web-access.Rmd @@ -1,4 +1,4 @@ -6--- +--- title: "Accessing Trait Data Via the BETYdb Web Interface" author: "David LeBauer" date: "`r Sys.Date()`" From d24928c12e3dfdda6b8a2f707d6495307610447a Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Wed, 7 Nov 2018 11:56:36 -0800 Subject: [PATCH 07/83] added comment. link to traitvis webapp not working. 
--- traits/00-BETYdb-getting-started.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/traits/00-BETYdb-getting-started.Rmd b/traits/00-BETYdb-getting-started.Rmd index fb5d3c4..63e2d18 100644 --- a/traits/00-BETYdb-getting-started.Rmd +++ b/traits/00-BETYdb-getting-started.Rmd @@ -40,5 +40,5 @@ The traitvis webapp provides an interface for exploring available data that is u * website: https://traitvis.workbench.terraref.org ```{r} -knitr::include_app("https://traitvis.workbench.terraref.org/", height = "1400px") +knitr::include_app("https://traitvis.workbench.terraref.org/", height = "1400px") #fix link? no screenshot is displayed in html output ``` From 169ab1a8208d53fd05cf39a90317c4dcf7ca99ec Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Wed, 7 Nov 2018 12:27:23 -0800 Subject: [PATCH 08/83] fixed links to terraref and pecan sites, betydb schemas, and doi reference. also updated comment on embedded shiny app (link to traitvis app does work). --- traits/00-BETYdb-getting-started.Rmd | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/traits/00-BETYdb-getting-started.Rmd b/traits/00-BETYdb-getting-started.Rmd index 63e2d18..a8e5f13 100644 --- a/traits/00-BETYdb-getting-started.Rmd +++ b/traits/00-BETYdb-getting-started.Rmd @@ -14,15 +14,15 @@ It contains trait (phenotype) data at the plot or plant level as well as meta da ### Introduction to BETYdb The TERRA REF trait database (terraref.ncsa.illinois.edu/bety) uses the BETYdb data schema (structure) and web application. -The BETYdb software is actively used and developed by the [TERRA Reference](terraref.org) program as well as by the [PEcAn project](pecanproject.org). +The BETYdb software is actively used and developed by the [TERRA Reference](http://terraref.org) program as well as by the [PEcAn project](http://pecanproject.org). 
For more information about BETYdb, see the following: * BETYdb documentation (available via the web application under 'Docs') * _Data Access_: how to access data * _Data Entry Workflow:_ how to add data to the database - * _BETYdb Technical Documentation_ is written for advanced users and website and database administrators who may also be interested in the [full database schema](betydb.org/schemas) -* BETYdb: A Yield, Trait and Ecosystem Service Database Applied to Second Generation Bioenergy Feedstocks. ([LeBauer et al, 2017](dx.doi.org/10.1111/gcbb.12420)) + * _BETYdb Technical Documentation_ is written for advanced users and website and database administrators who may also be interested in the [full database schema](https://www.betydb.org/schemas) +* BETYdb: A Yield, Trait and Ecosystem Service Database Applied to Second Generation Bioenergy Feedstocks. ([LeBauer et al, 2017](https://onlinelibrary.wiley.com/doi/abs/10.1111/gcbb.12420)) Other than the TERRA REF trait database, there are a handful of other projects that use the BETYdb software, mostly with the PEcAn and TERRA programs. The content presented here is focused on the TERRA REF instance of BETYdb. Most of the information presented here is relevant to other databases, but the TERRA program has more emphasis on trait diversity among cultivars or genotypes within a crop whereas PEcAn focuses on the diversity of traits within ecosystems and plant functional types. In addition, the TERRA program is more focused on high throughput phenotyping - intensive monitoring of agricultural breeding trials whereas PEcAn focuses on assimilating heterogeneous data to forecast ecosystem functioning. Fortunately, both uses can use the shared ecosystem of software used for these tasks. 
For example, the PEcAn crop modeling infrastructure can be directly used to infer additional targets of breeding, and the diversity of traits observed in breeding trials can be a first step toward predicting the impacts of crop traits on productivity and ecosystem functioning. @@ -40,5 +40,5 @@ The traitvis webapp provides an interface for exploring available data that is u * website: https://traitvis.workbench.terraref.org ```{r} -knitr::include_app("https://traitvis.workbench.terraref.org/", height = "1400px") #fix link? no screenshot is displayed in html output +knitr::include_app("https://traitvis.workbench.terraref.org", height = "1400px") #not working; shiny app is not being displayed in html output ``` From 8d903a45f366faf8be0148345c10a2c538ea811f Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Wed, 7 Nov 2018 12:41:34 -0800 Subject: [PATCH 09/83] updated beta user program link, and fixed links to agronomic metadata tutorial and terraref bety home page --- traits/01-web-access.Rmd | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/traits/01-web-access.Rmd b/traits/01-web-access.Rmd index f3dc7cb..a6a5ec8 100644 --- a/traits/01-web-access.Rmd +++ b/traits/01-web-access.Rmd @@ -1,4 +1,4 @@ -6--- +--- title: "Accessing Trait Data Via the BETYdb Web Interface" author: "David LeBauer" date: "`r Sys.Date()`" @@ -12,7 +12,7 @@ output: html_document ### Web interface * Sign up for an account at https://terraref.ncsa.illinois.edu/bety -* Sign up for the TERRA REF [beta user program](https://docs.google.com/forms/d/e/1FAIpQLScBsD042RrRok70BCGCRwARTcm9etvVHqvQaz1c5X7c5y0H3w/viewform?c=0&w=1) +* Sign up for the TERRA REF [beta user program](https://docs.google.com/forms/d/e/1FAIpQLScIUJL_OSL9BvBOdlczErds3aOg5Lwz4NIdNQnUiXdsLsYdhw/viewform) * Wait for database access to be granted * Your API key will be sent in the email. 
It can also be found - and regenerated - by navigating to the Users page (data --> [users](https://terraref.ncsa.illinois.edu/bety/users)) in the web interface. @@ -25,11 +25,11 @@ On the Welcome page there is a search option for trait and yield data. This tool ### Download search results as as csv file from the web interface -* Point your browser to terraref.ncsa.illinois.edu/bety +* Point your browser to https://terraref.ncsa.illinois.edu/bety/ * login * enter "NDVI" in the search box * on the next page you will see the results of this search * if you want all of the data, including data that has not gone through QA/QC, make sure to check the 'include unchecked records' option * in the upper right, you will see a button that will allow you to download the search results as a CSV file. Click it. Open the file in a text editor or spreadsheet program and review its contents. -Note that the web interface only provides a core set of data and limited meta-data. To access all of the data within BETYdb, it is necessary to search and merge multiple tables. More complex queries, such as those in the [Agronomic metadata](../traits/06-agronomic-metadata.Rmd). +Note that the web interface only provides a core set of data and limited meta-data. To access all of the data within BETYdb, it is necessary to search and merge multiple tables. More complex queries, such as those in the [Agronomic metadata](06-agronomic-metadata.Rmd). 
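The API key mentioned above is kept out of shared code by storing it in a key file (`.betykey` in the working directory, or `~/.betykey` in earlier versions of these tutorials). A minimal sketch of reading it in shell follows. It assumes the key sits on the first line of the file. `read_bety_key` is a hypothetical helper, not something the tutorials or BETYdb provide. The fallback is the public key quoted in the tutorials.

```sh
# Public key quoted in the tutorials (metadata tables only).
PUBLIC_KEY="9999999999999999999999999999999999999999"

# read_bety_key is a hypothetical helper: print the key from the first
# readable file among its arguments, or the public key if none is readable.
read_bety_key() {
  for keyfile in "$@"; do
    if [ -r "$keyfile" ]; then
      # Keep only the first line and strip whitespace, including the newline.
      head -n 1 "$keyfile" | tr -d '[:space:]'
      return 0
    fi
  done
  printf '%s' "$PUBLIC_KEY"
}

KEY=$(read_bety_key ".betykey" "$HOME/.betykey")

# Report only the length so the key itself never reaches the terminal or logs.
echo "Using an API key of ${#KEY} characters"
```

Because the key lives only in `$KEY`, it can be appended to a query URL as `key=$KEY` without writing the private value into a script that might be shared.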
From dd392788318419eeadb436fc422dc8a33fa44592 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Wed, 7 Nov 2018 13:56:10 -0800 Subject: [PATCH 10/83] made changes according to previous comments --- traits/02-betydb-api-access.Rmd | 46 ++++++++++++++++----------------- 1 file changed, 22 insertions(+), 24 deletions(-) diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd index e828afe..d835965 100644 --- a/traits/02-betydb-api-access.Rmd +++ b/traits/02-betydb-api-access.Rmd @@ -20,18 +20,18 @@ All of our databases have web interfaces for humans to browse as well as APIs th ### Using Your API key to Connect An API key is like a password. It allows you to access data, and should be kept private. -Therefore, we are not going to put it in code that we share. The one exception is the key 9999999999999999999999999999999999999999 that will allow you to access metadata tables (all tables except _traits_ and _yields_). It will also allow you to access all of the simulated data in the terraref.ncsa.illinois.edu/bety-test database. +Therefore, we are not going to put it in code that we share. The one exception is the key 9999999999999999999999999999999999999999 that will allow you to access metadata tables (all tables except _traits_ and _yields_). It will also allow you to access all of the simulated data in the https://terraref.ncsa.illinois.edu/bety-test database. -A common way of handling private API keys is to place it in a text file in your home directory. +A common way of handling private API keys is to place it in a text file in your current directory. Don't put it in a project directory where it might be inadvertently shared. Here is how to find and save your API key: * click file --> new --> text file * copy the api key that was sent when you registered into the file -* file --> save as '~/.betykey' +* file --> save as '.betykey' -For the public key, you can call this file `~/.betykey_public`. 
+For the public key, you can call this file `.betykey_public`. ### Ways to access API data @@ -42,10 +42,11 @@ For the public key, you can call this file `~/.betykey_public`. ### Accessing data using a URL query + ## Components of a URL query * base url: `terraref.ncsa.illinois.edu/bety` -* path to the api: `/api/beta` +* path to the api: `/api/v1` * api endpoint: `/search` or `traits` or `sites`. For BETYdb, these are the names of database tables. * Query parameters: `genus=Sorghum` * Authentication: `key=9999999999999999999999999999999999999999` is the public key for the TERRA REF traits database. @@ -56,63 +57,60 @@ First, lets construct a query by putting together a URL. 1. start with the database url: `terraref.ncsa.illinois.edu/bety` * this url brings you to the home page -2. Add the path to the API, `/api/beta` - * now we have terraref.ncsa.illinois.edu/bety/api/beta, which points to the API documentation +2. Add the path to the API, `/api/v1` + * now we have terraref.ncsa.illinois.edu/bety/api/v1, which points to the API documentation 3. Add the name of the table you want to query. Lets start with `variables` - * terraref.ncsa.illinois.edu/bety/api/beta/variables + * terraref.ncsa.illinois.edu/bety/api/v1/variables 4. add query terms by appending a `?` and combining with `&`, for example: * `key=9999999999999999999999999999999999999999` * `type=trait` where the variable type is 'trait' * `name=~height` where the variable name contains 'height' 5. This is your complete query: - * `terraref.ncsa.illinois.edu/bety/api/beta/variables?type=trait&name=~height&key=9999999999999999999999999999999999999999` + * `terraref.ncsa.illinois.edu/bety/api/v1/variables?type=trait&name=~height&key=9999999999999999999999999999999999999999` * it will query all variables that are type trait and have 'height' in the name * Does it return the expected values? 
## Your Turn -> What will the URL https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=9999999999999999999999999999999999999999 return? +> What will the URL https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=9999999999999999999999999999999999999999 return? > Write a URL that will query the database for sites with "Field Scanner" in the name field. Hint: combine two terms with a `+` as in `Field+Scanner` What do you see? Do you think that this is all of the records? What happens if you add `&limit=none`? + #### Accessing data using the Shell -Type the following command into a bash shell (the -O option names the output file): +Type the following command into a bash shell (the `-o` option names the output file): ```sh -wget -O sorghum.json - "https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=9999999999999999999999999999999999999999" +curl -o sorghum.json \ + "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=9999999999999999999999999999999999999999" ``` If you want to write the query without exposing the key in plain text, you can construct it like this: ```sh -wget -O sorghum.json - "https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=`cat ~/.betykey_public`" +curl -o sorghum.json \ + "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=`cat .betykey_public`" ``` -> Do you know what `cat ~/.betykey_public`. Run the command and see what it outputs in your shell. +> What does `cat .betykey_public` do? > How can you look at the files? -Hint: search what a json file is - ### Accessing API data using the R jsonlite package -JSON content can be converted into a R object using the fromJSON function. -Access the 'data' element of the R object to get the data frame. 
- ```{r text-api} sorghum.json <- readLines( - paste0("https://terraref.ncsa.illinois.edu/bety/api/beta/species?genus=Sorghum&key=", - readLines('~/.betykey'))) + paste0("https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=", + readLines('.betykey'))) ## print(sorghum.json) ## not a particularly useful format ## lets convert to a data frame sorghum <- jsonlite::fromJSON(sorghum.json) -sorghum.dataframe <- sorghum$data ``` + +More on how to use the rOpenSci traits package coming up in the [next tutorial](03-access-r-traits.Rmd) From 63616dcc873f5766894b9e42f9b42a814395ab36 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Wed, 7 Nov 2018 15:24:54 -0800 Subject: [PATCH 11/83] made minor edits - assuming betykey in traits directory since ~ does not work on Windows --- index.Rmd | 4 +++- traits/02-betydb-api-access.Rmd | 4 ---- traits/03-access-r-traits.Rmd | 8 ++++---- traits/04-danforth-indoor-phenotyping-facility.Rmd | 5 +++-- 4 files changed, 10 insertions(+), 11 deletions(-) diff --git a/index.Rmd b/index.Rmd index dba5486..26484cc 100644 --- a/index.Rmd +++ b/index.Rmd @@ -13,4 +13,6 @@ output: ```{r} knitr::opts_chunk$set(echo = FALSE, cache = TRUE) -``` \ No newline at end of file +options(warn = -1) +``` + diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd index d835965..b1cd1b5 100644 --- a/traits/02-betydb-api-access.Rmd +++ b/traits/02-betydb-api-access.Rmd @@ -96,10 +96,6 @@ curl -o sorghum.json \ "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=`cat .betykey_public`" ``` -> What does `cat .betykey_public` do? - -> How can you look at the files? 
- ### Accessing API data using the R jsonlite package ```{r text-api} diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index 77adde3..f0eab25 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -33,11 +33,11 @@ library(dplyr) ```{r writing-key} # This should be done once with the key sent to you in your email # writeLines('abcdefg_rest_of_key_sent_in_email', -# con = '~/.betykey') +# con = '.betykey') # Example with the public key: writeLines('9999999999999999999999999999999999999999', - con = '~/.betykey_public') + con = '.betykey_public') ``` #### R - using the traits package @@ -55,7 +55,7 @@ sorghum_info <- betydb_query(table = 'species', api_version = 'beta', limit = 'none', betyurl = "https://terraref.ncsa.illinois.edu/bety/", - key = readLines('~/.betykey', warn = FALSE)) + key = readLines('.betykey', warn = FALSE)) ``` @@ -65,7 +65,7 @@ Notice all of the arguments that the `betydb_query` function requires? We can ch ```{r} -options(betydb_key = readLines('~/.betykey', warn = FALSE), +options(betydb_key = readLines('.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", betydb_api_version = 'beta') ``` diff --git a/traits/04-danforth-indoor-phenotyping-facility.Rmd b/traits/04-danforth-indoor-phenotyping-facility.Rmd index 394c162..96104ef 100644 --- a/traits/04-danforth-indoor-phenotyping-facility.Rmd +++ b/traits/04-danforth-indoor-phenotyping-facility.Rmd @@ -2,7 +2,8 @@ title: "Phenotype Analysis" author: "David LeBauer, Craig Willis" date: "`r Sys.Date()`" -output: md_document +output: + html_document: default --- ```{r 02-setup, include=FALSE} @@ -26,7 +27,7 @@ library(traits) Unlike the first two tutorials, now we will be querying real data from the public TERRA REF database. So we will use a new URL, https://terraref.ncsa.illinois.edu/bety/, and we will need to use our own private key. 
```{r terraref-connect-options} -options(betydb_key = readLines('~/.betykey', warn = FALSE), +options(betydb_key = readLines('.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", betydb_api_version = 'beta') ``` From f7b517c84bc92dfea002555daac34c078d55208d Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Wed, 7 Nov 2018 15:55:01 -0800 Subject: [PATCH 12/83] minor edit - removed comment --- traits/00-BETYdb-getting-started.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/traits/00-BETYdb-getting-started.Rmd b/traits/00-BETYdb-getting-started.Rmd index a8e5f13..1489430 100644 --- a/traits/00-BETYdb-getting-started.Rmd +++ b/traits/00-BETYdb-getting-started.Rmd @@ -40,5 +40,5 @@ The traitvis webapp provides an interface for exploring available data that is u * website: https://traitvis.workbench.terraref.org ```{r} -knitr::include_app("https://traitvis.workbench.terraref.org", height = "1400px") #not working; shiny app is not being displayed in html output +knitr::include_app("https://traitvis.workbench.terraref.org", height = "1400px") ``` From d2470a402a450cbbc8a173086b24374e90e9dacd Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 13 Nov 2018 11:35:35 -0800 Subject: [PATCH 13/83] -changed trait in sorghum height search from 'canopy_cover' to 'canopy_height' -removed geom_smooth layer from ggplot --- traits/03-access-r-traits.Rmd | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index f0eab25..1496f62 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -52,7 +52,7 @@ Lets start with the query of information about Sorghum from species table from a sorghum_info <- betydb_query(table = 'species', genus = "Sorghum", - api_version = 'beta', + api_version = 'v1', limit = 'none', betyurl = "https://terraref.ncsa.illinois.edu/bety/", key = readLines('.betykey', warn = 
FALSE)) @@ -67,32 +67,34 @@ Notice all of the arguments that the `betydb_query` function requires? We can ch ```{r} options(betydb_key = readLines('.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", - betydb_api_version = 'beta') + betydb_api_version = 'v1') ``` Now the same query can be reduced to: +```{r query-species-reduce} +sorghum_info <- betydb_query(table = 'species', + genus = "Sorghum", + limit = 'none') +``` + +### Time series of height + +Now let's query some trait data. ```{r sv_area} sorghum_height <- betydb_query(table = 'search', - trait = "canopy_cover", + trait = "canopy_height", site = "~Season 6", limit = 'none') ``` -### Time series of height - -Now we can take a look at the data that we have just queried. - ```{r} -ggplot(data = sorghum_height, - aes(x = lubridate::yday(lubridate::ymd_hms(raw_date)), y = mean, color = cultivar)) + - geom_smooth(se = FALSE, size = 0.5) + - geom_point(size = 0.5, position = position_jitter(width = 0.1)) + -# scale_x_datetime(date_breaks = '6 months', date_labels = "%b %Y") + -# ylim(c(0,6)) + - xlab("Day of Year") + ylab("Plant Height") + +ggplot(data = sorghum_height, + aes(x = lubridate::yday(lubridate::ymd_hms(raw_date)), y = mean)) + + geom_point(size = 0.5, position = position_jitter(width = 0.1)) + +# scale_x_datetime(date_breaks = '6 months') + + xlab("Day of Year") + ylab("Plant Height") + guides(color = guide_legend(title = 'Genotype')) + theme_bw() - ``` From 4459ae16d6fd7b4ef06cf21af1d4c635df9c913d Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 13 Nov 2018 12:02:39 -0800 Subject: [PATCH 14/83] made minor edits. 
changed api version from 'beta' to 'v1' and hid the results of the 'query-danforth' r chunk --- traits/04-danforth-indoor-phenotyping-facility.Rmd | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/traits/04-danforth-indoor-phenotyping-facility.Rmd b/traits/04-danforth-indoor-phenotyping-facility.Rmd index 96104ef..f457555 100644 --- a/traits/04-danforth-indoor-phenotyping-facility.Rmd +++ b/traits/04-danforth-indoor-phenotyping-facility.Rmd @@ -7,7 +7,7 @@ output: --- ```{r 02-setup, include=FALSE} -knitr::opts_chunk$set(echo = FALSE, cache = TRUE) +knitr::opts_chunk$set(echo = TRUE, cache = TRUE) library(jsonlite) library(dplyr) library(ggplot2) @@ -29,13 +29,13 @@ Unlike the first two tutorials, now we will be querying real data from the publi ```{r terraref-connect-options} options(betydb_key = readLines('.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", - betydb_api_version = 'beta') + betydb_api_version = 'v1') ``` ### Query data from the Danforth Phenotyping Facility First we will use the generic search to query the output from the Lemnatec indoor phenotyping system at the Danforth Center in St. Louis, MO. -```{r query-danforth} +```{r query-danforth, results='hide'} danforth_sorghum <- traits::betydb_query( # sitename = 'Danforth Plant Science Center Bellweather Phenotyping Facility', trait = 'sv_area', @@ -43,7 +43,7 @@ danforth_sorghum <- traits::betydb_query( ``` -To get the equivalent query via the web interface, you can construct the following URL. Once you learn how to write a query using the url API, you can use this to +To get the equivalent query via the web interface, you can construct the following URL. 
```{r api-query-in-browser} search_url <- paste0(options()$betydb_url, @@ -54,7 +54,7 @@ search_url <- paste0(options()$betydb_url, print(gsub(options()$betydb_key, 'secretkey', search_url)) ``` -you can open this in your browser to see (but you may need to grant permission) +Once you learn how to write a query using the url API, you can use it to open in your browser to see (but you may need to grant permission) ```{r open-api-url, eval=FALSE} From 82d34ee0fa89ef36fc95907107e713c4a7e014eb Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 13 Nov 2018 15:11:53 -0800 Subject: [PATCH 15/83] changed image name from 'betydb-postgis' to 'bety-postgis' --- traits/07-betydb-sql-access.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/traits/07-betydb-sql-access.Rmd b/traits/07-betydb-sql-access.Rmd index d9facfa..cfd5f10 100644 --- a/traits/07-betydb-sql-access.Rmd +++ b/traits/07-betydb-sql-access.Rmd @@ -43,7 +43,7 @@ DB: bety You can run the entire database locally, with daily imports: ```sh -docker run --name betydb -p 5432:5432 terraref/betydb-postgis +docker run --name betydb -p 5432:5432 terraref/bety-postgis ``` Now it will appear that you have the entire trait database running at localhost on port 5432 just like if it were installed on your system! \ No newline at end of file From fa664fe7a64e30366c3ea6ec0b01cedafafc1b71 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 3 Dec 2018 14:03:08 -0800 Subject: [PATCH 16/83] changed ~ (home directory) to . 
(current directory) --- traits/05-maricopa-field-scanner.Rmd | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/traits/05-maricopa-field-scanner.Rmd b/traits/05-maricopa-field-scanner.Rmd index a8ae123..8a7133d 100644 --- a/traits/05-maricopa-field-scanner.Rmd +++ b/traits/05-maricopa-field-scanner.Rmd @@ -16,9 +16,9 @@ library(sp) # for implicitly called rbind.SpatialPolygons method library(leaflet) -options(betydb_key = readLines('~/.betykey', warn = FALSE), +options(betydb_key = readLines('.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", - betydb_api_version = 'beta') + betydb_api_version = 'v1') ``` @@ -33,7 +33,7 @@ sites <- betydb_query( city = "Maricopa", sitename = "~Season 2 range", limit = "none") ``` -A more robust (but complicated way) would be to query the experiments and experimeents_sites tables. But we will leave that for later. +A more robust (but complicated way) would be to query the experiments and experiments_sites tables. But we will leave that for later. ### Plot Season 2 plots @@ -89,7 +89,8 @@ Exercise: Why are there two variables named canopy_height, and what database fie Now retrieve all available measurements for each variable. -```{r traits-05-get-variables} +```{r traits-05-get-variables} +#getting HTTP 504 -- Gateway Timeout when running this chunk vars_measures <- (variables %>% group_by(id, name) From 0f7a54525dc582c1d3a74965bb0f81ee086e0fab Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 3 Dec 2018 14:03:52 -0800 Subject: [PATCH 17/83] changed ~ (home directory) to . 
(current directory) --- traits/10-simulated-sorghum.Rmd | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/traits/10-simulated-sorghum.Rmd b/traits/10-simulated-sorghum.Rmd index 66f8c39..83371c6 100644 --- a/traits/10-simulated-sorghum.Rmd +++ b/traits/10-simulated-sorghum.Rmd @@ -35,7 +35,7 @@ The 'genotypes' are based on five-hundred quasi-random parameterizations of a bi _Note that these data sets contain numerical artifacts and scientific reinterpretations for illustrative purpose._ -All of these simulated datasets are released with an unrestrive [copyright](https://creativecommons.org/publicdomain/zero/1.0/). This means you can copy, modify, ans share the data. Please keep in mind that the data sets are not production quality - they have been developed solely to inspire and solicit feedback. +All of these simulated datasets are released with an unrestrive [copyright](https://creativecommons.org/publicdomain/zero/1.0/). This means you can copy, modify, and share the data. Please keep in mind that the data sets are not production quality - they have been developed solely to inspire and solicit feedback. ### Design of Simulation Experiment @@ -121,7 +121,7 @@ This dataset includes what a sensor might observe, daily for five years during t ```{r} -options(betydb_key = readLines('~/.betykey_public', warn = FALSE), +options(betydb_key = readLines('.betykey_public', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety-test/", betydb_api_version = 'beta') From f4594439f08c0cf8daf068245509c6de9d20045c Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 3 Dec 2018 14:08:50 -0800 Subject: [PATCH 18/83] Removed YAML metadata header. Added Chapter title. 
--- traits/00-BETYdb-getting-started.Rmd | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/traits/00-BETYdb-getting-started.Rmd b/traits/00-BETYdb-getting-started.Rmd index 1489430..3cf1201 100644 --- a/traits/00-BETYdb-getting-started.Rmd +++ b/traits/00-BETYdb-getting-started.Rmd @@ -1,9 +1,4 @@ ---- -title: "Getting Started with BETYdb" -author: "David LeBauer" -date: "`r Sys.Date()`" -output: html_document ---- +# Getting Started with BETYdb ## TERRA Ref Trait Database From de8cf71186ee99ad82a52fd1cc4a0a5311b53606 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 3 Dec 2018 14:10:40 -0800 Subject: [PATCH 19/83] Removed YAML header. Added chapter title. --- traits/01-web-access.Rmd | 9 +-------- traits/02-betydb-api-access.Rmd | 8 +------- 2 files changed, 2 insertions(+), 15 deletions(-) diff --git a/traits/01-web-access.Rmd b/traits/01-web-access.Rmd index a6a5ec8..c46ee5a 100644 --- a/traits/01-web-access.Rmd +++ b/traits/01-web-access.Rmd @@ -1,11 +1,4 @@ ---- -title: "Accessing Trait Data Via the BETYdb Web Interface" -author: "David LeBauer" -date: "`r Sys.Date()`" -output: html_document ---- - - +# Accessing Trait Data Via the BETYdb Web Interface ## Getting an account for the TERRA trait database diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd index b1cd1b5..b5bf60d 100644 --- a/traits/02-betydb-api-access.Rmd +++ b/traits/02-betydb-api-access.Rmd @@ -1,10 +1,4 @@ ---- -title: "Accessing Trait Data Via the BETYdb API" -author: "David LeBauer" -date: "11/7/2017" -output: html_document ---- - +# Accessing Trait Data Via the BETYdb API ## Using URLs to construct Queries From a28629d25274809f330e3da55116c1fa12c0a3ed Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 4 Dec 2018 10:13:10 -0800 Subject: [PATCH 20/83] Changed chapter title to be more specific. Chapter 6 title is also Phenotype Analysis. 
--- traits/03-access-r-traits.Rmd | 7 +------ traits/04-danforth-indoor-phenotyping-facility.Rmd | 8 +------- traits/05-maricopa-field-scanner.Rmd | 8 ++------ traits/06-agronomic-metadata.Rmd | 7 +------ traits/07-betydb-sql-access.Rmd | 9 +-------- 5 files changed, 6 insertions(+), 33 deletions(-) diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index 1496f62..1918f31 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -1,9 +1,4 @@ ---- -title: "Accessing Trait Data in R" -author: "David LeBauer" -output: html_document ---- - +# Accessing Trait Data in R ## Using the R traits package to query the database diff --git a/traits/04-danforth-indoor-phenotyping-facility.Rmd b/traits/04-danforth-indoor-phenotyping-facility.Rmd index f457555..720ef7f 100644 --- a/traits/04-danforth-indoor-phenotyping-facility.Rmd +++ b/traits/04-danforth-indoor-phenotyping-facility.Rmd @@ -1,10 +1,4 @@ ---- -title: "Phenotype Analysis" -author: "David LeBauer, Craig Willis" -date: "`r Sys.Date()`" -output: - html_document: default ---- +# Phenotype Analysis ```{r 02-setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE, cache = TRUE) diff --git a/traits/05-maricopa-field-scanner.Rmd b/traits/05-maricopa-field-scanner.Rmd index 8a7133d..41794c9 100644 --- a/traits/05-maricopa-field-scanner.Rmd +++ b/traits/05-maricopa-field-scanner.Rmd @@ -1,9 +1,5 @@ ---- -title: "Plot level data from the field scanner in Maricopa, AZ" -author: "David LeBauer, Chris Black" -date: "`r Sys.Date()`" -output: md_document ---- +# Plot level data from the field scanner in Maricopa, AZ + ```{r traits-05-mac-traits-setup, include=FALSE} knitr::opts_chunk$set(echo = FALSE, cache = TRUE) library(dplyr) diff --git a/traits/06-agronomic-metadata.Rmd b/traits/06-agronomic-metadata.Rmd index 678f3fa..f8c46a6 100644 --- a/traits/06-agronomic-metadata.Rmd +++ b/traits/06-agronomic-metadata.Rmd @@ -1,9 +1,4 @@ ---- -title: "Phenotype Analysis" -author: "David 
LeBauer, Craig Willis" -date: "`r Sys.Date()`" -output: md_document ---- +# Phenotype Analysis ## Joining database tables diff --git a/traits/07-betydb-sql-access.Rmd b/traits/07-betydb-sql-access.Rmd index cfd5f10..22cf4d0 100644 --- a/traits/07-betydb-sql-access.Rmd +++ b/traits/07-betydb-sql-access.Rmd @@ -1,11 +1,4 @@ ---- -title: "Accessing Traits w/ PostgreSQL" -author: "David LeBauer" -date: "`r Sys.Date()`" -output: html_document ---- - - +# Accessing Traits w/ PostgreSQL will be derived from https://github.com/pi4-uiuc/2017-bootcamp/blob/master/content/post/2017-05-30-databases-and-sql.Rmd#with From 034dc81318d7edb4c5eb916baef7a44a1ddf1359 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 4 Dec 2018 10:13:58 -0800 Subject: [PATCH 21/83] Removed text comment --- traits/10-simulated-sorghum.Rmd | 8 -------- 1 file changed, 8 deletions(-) diff --git a/traits/10-simulated-sorghum.Rmd b/traits/10-simulated-sorghum.Rmd index 83371c6..a940df4 100644 --- a/traits/10-simulated-sorghum.Rmd +++ b/traits/10-simulated-sorghum.Rmd @@ -1,11 +1,3 @@ - - # A Simulated Phenotype Dataset ```{r warnings=FALSE, echo=FALSE} From 229ad652750e9ae3ee2b5a2fd2b6b5eb7158cf5c Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 4 Dec 2018 10:15:27 -0800 Subject: [PATCH 22/83] updated chapter title - made more specific --- traits/04-danforth-indoor-phenotyping-facility.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/traits/04-danforth-indoor-phenotyping-facility.Rmd b/traits/04-danforth-indoor-phenotyping-facility.Rmd index 720ef7f..93607fc 100644 --- a/traits/04-danforth-indoor-phenotyping-facility.Rmd +++ b/traits/04-danforth-indoor-phenotyping-facility.Rmd @@ -1,4 +1,4 @@ -# Phenotype Analysis +# Danforth Indoor Phenotype Analysis ```{r 02-setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE, cache = TRUE) From 509d0e7952c5476d639c077e97d73681e43c7339 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 4 Dec 2018 10:19:16 -0800 Subject: 
[PATCH 23/83] commented out all chunks that were related to the traits-05-get-variables chunk (running this chunk results in a HTTP 504 Gateway Timeout) --- traits/05-maricopa-field-scanner.Rmd | 76 ++++++++++++++++------------ 1 file changed, 43 insertions(+), 33 deletions(-) diff --git a/traits/05-maricopa-field-scanner.Rmd b/traits/05-maricopa-field-scanner.Rmd index 41794c9..d345f65 100644 --- a/traits/05-maricopa-field-scanner.Rmd +++ b/traits/05-maricopa-field-scanner.Rmd @@ -83,51 +83,61 @@ variables %>% Exercise: Why are there two variables named canopy_height, and what database fields should you examine to decide which one you want? -Now retrieve all available measurements for each variable. +```{r traits-05-get-variables-comment} + +#Now retrieve all available measurements for each variable. + +``` ```{r traits-05-get-variables} #getting HTTP 504 -- Gateway Timeout when running this chunk +#comment out this chunk -vars_measures <- (variables - %>% group_by(id, name) - %>% do(traits = betydb_record( # Get full trait list by variable ID - id = .$id, table = "variables")$traits)) +#vars_measures <- (variables +# %>% group_by(id, name) +# %>% do(traits = betydb_record( # Get full trait list by variable ID +# id = .$id, table = "variables")$traits)) # Only needed if some variables may contain zero traits # If none are empty, can just do `vars_measures %>% unnest()` -traitdata <- left_join( # ensures we keep a blank row for any variables with no trait info - vars_measures %>% select(id, name), - vars_measures %>% filter(length(traits) > 0) %>% unnest()) +#traitdata <- left_join( # ensures we keep a blank row for any variables with no trait info +# vars_measures %>% select(id, name), +# vars_measures %>% filter(length(traits) > 0) %>% unnest()) ``` -Add cultivar information so we can plot by ecotype +```{r traits-05-cultivar-info-comment} + +#Add cultivar information so we can plot by ecotype + +``` ```{r traits-05-cultivar-info} -traitdata <- (traitdata - %>% 
rename( - variable_name = name, - cultivar_id = trait.cultivar_id, - site_id = trait.site_id, - mean = trait.mean) - %>% mutate(date = as.Date(trait.date)) - %>% left_join(cultivars, by = "cultivar_id")) +#must comment out since chunk refers to traitdata object from above (that could not be created) +#traitdata <- (traitdata +# %>% rename( +# variable_name = name, +# cultivar_id = trait.cultivar_id, +# site_id = trait.site_id, +# mean = trait.mean) +# %>% mutate(date = as.Date(trait.date)) +# %>% left_join(cultivars, by = "cultivar_id")) ``` ```{r traits-05-plots} -(ggplot( - traitdata %>% filter(variable_name == "canopy_height"), - aes(date, mean, group = site_id * cultivar_id)) - + geom_line() - + facet_wrap(~ecotype) - + xlab("Date") - + ylab("Canopy height, cm")) - -(ggplot( - (traitdata - %>% filter(variable_name == "NDVI") - %>% mutate(emphasize = (ecotype != "RIL"))), # to reduce overplotting - aes(date, mean, color = ecotype, group = site_id*cultivar_id, alpha = emphasize)) - + geom_line() - + scale_alpha_discrete(guide = FALSE) - + theme(legend.position = c(0.1, 0.9))) +#(ggplot( +# traitdata %>% filter(variable_name == "canopy_height"), +# aes(date, mean, group = site_id * cultivar_id)) +# + geom_line() +# + facet_wrap(~ecotype) +# + xlab("Date") +# + ylab("Canopy height, cm")) + +#(ggplot( +# (traitdata +# %>% filter(variable_name == "NDVI") +# %>% mutate(emphasize = (ecotype != "RIL"))), # to reduce overplotting +# aes(date, mean, color = ecotype, group = site_id*cultivar_id, alpha = emphasize)) +# + geom_line() +# + scale_alpha_discrete(guide = FALSE) +# + theme(legend.position = c(0.1, 0.9))) ``` From 7b1f88df096a203e4a2f878ac338e36e795d2580 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 4 Dec 2018 15:04:33 -0800 Subject: [PATCH 24/83] set betydb_query parameters using options --- traits/06-agronomic-metadata.Rmd | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/traits/06-agronomic-metadata.Rmd 
b/traits/06-agronomic-metadata.Rmd index f8c46a6..2a26fa8 100644 --- a/traits/06-agronomic-metadata.Rmd +++ b/traits/06-agronomic-metadata.Rmd @@ -73,16 +73,16 @@ library(rgeos) library(leaflet) year <- lubridate::year -betyurl <- "https://terraref.ncsa.illinois.edu/bety/" -betykey <- "9999999999999999999999999999999999999999" +options(betydb_key = readLines('.betykey', warn = FALSE), + betydb_url = "https://terraref.ncsa.illinois.edu/bety/", + betydb_api_version = 'v1') ## query and join tables -species <- (betydb_query(table = "species", limit = "none", betyurl = betyurl, key = betykey, api_version = "beta") +species <- (betydb_query(table = "species") %>% select(specie_id = id, scientificname, genus)) -sites <- (betydb_query(table = "sites", limit = "none", sitename = "~Season 2 range", - betyurl = betyurl, key = betykey, - api_version = "beta")) +sites <- betydb_query(table = "sites") +names(sites)[1] <- 'site_id' sites %>% group_by(city, state, country) %>% summarize(n()) @@ -100,10 +100,10 @@ sites_point <- do.call("rbind", filter(site_geom, geom_type == "SpatialPoints")$ leaflet() %>% addTiles() %>% addPolygons(data = sites_poly, color = "red") #%>% addMarkers(data = sites_point) # points removed by only querying Season 2 -citations <- (betydb_query(table = "citations", betyurl = betyurl, key = betykey, api_version = "beta") +citations <- (betydb_query(table = "citations")#, betyurl = betyurl, key = betykey_public, api_version = "v1") %>% select(citation_id = id, author, year, title)) -traits <- (betydb_query(table = "traits", betyurl = betyurl, key = betykey, api_version = "beta") +traits <- (betydb_query(table = "traits") %>% select( id, date, mean, n, statname, stat, @@ -120,11 +120,11 @@ Let's do the manual equivalent of a cross-table join. BETY actually does contain The key idea here is that each treatment is associated with some (possibly many) managements, but the treatments table only reports the number of associated managements. 
To see the management IDs themselves, we need to query an individual treatment ID. So, we retrieve one table, then iterate over each row extracting the foreign keys for the other table. This requires an API call for every treatment, so beware that it is likely to be slow! ```{r} -treatments <- (betydb_query(table = 'treatments', betyurl = betyurl, key = betykey, api_version = "beta") +treatments <- (betydb_query(table = 'treatments') %>% select(treatment_id = id , name, definition, control)) get_mgid <- function(trtid){ - betydb_record(id = trtid, table = "treatments", betyurl = betyurl, key = betykey, api_version = "beta")$managements$management.id + betydb_record(id = trtid, table = "treatments")$managements$management.id } managements_treatments <- (treatments @@ -132,7 +132,7 @@ managements_treatments <- (treatments %>% do(management_id = get_mgid(.$treatment_id)) %>% unnest()) -managements <- (betydb_query(table = 'managements', betyurl = betyurl, key = betykey, api_version = "beta") +managements <- (betydb_query(table = 'managements') %>% filter(mgmttype %in% c('Fertilization_N', 'Planting', 'Irrigation')) %>% select(management_id = id, date, mgmttype, level, units) %>% left_join(managements_treatments, by = 'management_id') From 920230600fd379d89743c874420b546c4bf9602e Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 4 Dec 2018 15:05:52 -0800 Subject: [PATCH 25/83] loaded in some needed packages; set betydb_query function parameters using options --- traits/10-simulated-sorghum.Rmd | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/traits/10-simulated-sorghum.Rmd b/traits/10-simulated-sorghum.Rmd index a940df4..71abeb4 100644 --- a/traits/10-simulated-sorghum.Rmd +++ b/traits/10-simulated-sorghum.Rmd @@ -9,6 +9,10 @@ library(GGally) theme_set(theme_bw()) library(dplyr) library(httr) +library(labeling) +library(highr) +library(lubridate) +library(tidyr) .libPaths('~/R/library') ``` @@ -111,11 +115,17 @@ This dataset 
includes what a sensor might observe, daily for five years during t ### Accessing the TERRA Simulated Data Database -```{r} +```{r set-options} options(betydb_key = readLines('.betykey_public', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety-test/", - betydb_api_version = 'beta') + betydb_api_version = 'v1') + + +``` + + +```{r} sorghum_sla <- betydb_query(table = 'search', trait = "SLA", @@ -137,11 +147,13 @@ ggplot(sorghum_sla) + ```{r query-traits} + trait_list <- c("Vcmax", "c2n_leaf", "cuticular_cond", "SLA", "quantum_efficiency", "leaf_respiration_rate_m2", "stomatal_slope.BB", "Jmax", "chi_leaf", "extinction_coefficient_diffuse") variables <- betydb_query(table = 'variables', limit = 'none') + knitr::kable(variables %>% filter(name %in% trait_list) %>% select(name, description, units)) From b20b78becae4e9413ce87aa0ac6fc9544142cbb3 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 4 Dec 2018 15:52:27 -0800 Subject: [PATCH 26/83] set options for chunks to improve look of output --- traits/03-access-r-traits.Rmd | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index 1918f31..fea8cca 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -8,13 +8,13 @@ First, make sure we have the latest version from the terraref fork of the reposi ### Install the package -```{r install_traits, echo=FALSE} +```{r install_traits, echo = FALSE, include = FALSE} devtools::install_github('terraref/traits') ``` Now, we can load the packages that we will need to get started. -```{r 00-setup} +```{r 00-setup, message = FALSE} library(traits) knitr::opts_chunk$set(echo = FALSE, cache = TRUE) library(ggplot2) @@ -43,7 +43,7 @@ The R traits package is an API 'client'. 
It does two important things: Lets start with the query of information about Sorghum from species table from above -```{r query-species} +```{r query-species, echo = TRUE} sorghum_info <- betydb_query(table = 'species', genus = "Sorghum", @@ -59,7 +59,7 @@ sorghum_info <- betydb_query(table = 'species', Notice all of the arguments that the `betydb_query` function requires? We can change this by setting the default connection options thus: -```{r} +```{r set-up, echo = TRUE} options(betydb_key = readLines('.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", betydb_api_version = 'v1') @@ -67,7 +67,7 @@ options(betydb_key = readLines('.betykey', warn = FALSE), Now the same query can be reduced to: -```{r query-species-reduce} +```{r query-species-reduce, echo = TRUE} sorghum_info <- betydb_query(table = 'species', genus = "Sorghum", limit = 'none') @@ -76,7 +76,7 @@ sorghum_info <- betydb_query(table = 'species', ### Time series of height Now let's query some trait data. -```{r sv_area} +```{r sv_area, echo = TRUE} sorghum_height <- betydb_query(table = 'search', trait = "canopy_height", site = "~Season 6", From fb583ac09e936cc73aad1a5baa8fd815b133ec1a Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 4 Dec 2018 16:00:44 -0800 Subject: [PATCH 27/83] changed headers to level 2 so not treated like a chapter title --- traits/07-betydb-sql-access.Rmd | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/traits/07-betydb-sql-access.Rmd b/traits/07-betydb-sql-access.Rmd index 22cf4d0..73eb97d 100644 --- a/traits/07-betydb-sql-access.Rmd +++ b/traits/07-betydb-sql-access.Rmd @@ -2,7 +2,7 @@ will be derived from https://github.com/pi4-uiuc/2017-bootcamp/blob/master/content/post/2017-05-30-databases-and-sql.Rmd#with -# On workbench +## On workbench On the TERRA REF Workbench, you have access to the database. These connnections will only work if you are on the workbench. 
@@ -30,7 +30,7 @@ Password: DelchevskoOro DB: bety ``` -# Installing the database locally +## Installing the database locally You can run the entire database locally, with daily imports: From c1d65fbf6660552ee6c0df78ed93ef8875944238 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 4 Dec 2018 16:16:21 -0800 Subject: [PATCH 28/83] set warning option to FALSE --- traits/02-betydb-api-access.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd index b5bf60d..714a5da 100644 --- a/traits/02-betydb-api-access.Rmd +++ b/traits/02-betydb-api-access.Rmd @@ -92,7 +92,7 @@ curl -o sorghum.json \ ### Accessing API data using the R jsonlite package -```{r text-api} +```{r text-api, warning = FALSE} sorghum.json <- readLines( paste0("https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=", readLines('.betykey'))) From 1857dc4879c14133440f7b47d4aee7ea8093e77f Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 4 Dec 2018 16:28:59 -0800 Subject: [PATCH 29/83] removed .Rmd files that are not ready to be built into book --- _bookdown.yml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/_bookdown.yml b/_bookdown.yml index d40e19f..ae6599e 100644 --- a/_bookdown.yml +++ b/_bookdown.yml @@ -5,6 +5,5 @@ language: chapter_name: "Chapter " rmd_files: ["index.Rmd", "traits/00-BETYdb-getting-started.Rmd", "traits/01-web-access.Rmd", "traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd", -"traits/04-danforth-indoor-phenotyping-facility.Rmd", "traits/05-maricopa-field-scanner.Rmd", -"traits/06-agronomic-metadata.Rmd"]#, "10-simulated-sorghum.Rmd" +"traits/04-danforth-indoor-phenotyping-facility.Rmd"]#, "10-simulated-sorghum.Rmd" From 023020c5050dc42dc59f19030ab4b2008ef8d392 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 4 Dec 2018 16:50:13 -0800 Subject: [PATCH 30/83] changed path to .betykey file --- traits/02-betydb-api-access.Rmd | 2 +- 
traits/03-access-r-traits.Rmd | 4 ++-- traits/04-danforth-indoor-phenotyping-facility.Rmd | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd index 714a5da..7e32519 100644 --- a/traits/02-betydb-api-access.Rmd +++ b/traits/02-betydb-api-access.Rmd @@ -95,7 +95,7 @@ curl -o sorghum.json \ ```{r text-api, warning = FALSE} sorghum.json <- readLines( paste0("https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=", - readLines('.betykey'))) + readLines('traits/.betykey'))) ## print(sorghum.json) ## not a particularly useful format diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index fea8cca..7f5b48c 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -50,7 +50,7 @@ sorghum_info <- betydb_query(table = 'species', api_version = 'v1', limit = 'none', betyurl = "https://terraref.ncsa.illinois.edu/bety/", - key = readLines('.betykey', warn = FALSE)) + key = readLines('traits/.betykey', warn = FALSE)) ``` @@ -60,7 +60,7 @@ Notice all of the arguments that the `betydb_query` function requires? We can ch ```{r set-up, echo = TRUE} -options(betydb_key = readLines('.betykey', warn = FALSE), +options(betydb_key = readLines('traits/.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", betydb_api_version = 'v1') ``` diff --git a/traits/04-danforth-indoor-phenotyping-facility.Rmd b/traits/04-danforth-indoor-phenotyping-facility.Rmd index 93607fc..e2c10b8 100644 --- a/traits/04-danforth-indoor-phenotyping-facility.Rmd +++ b/traits/04-danforth-indoor-phenotyping-facility.Rmd @@ -21,7 +21,7 @@ library(traits) Unlike the first two tutorials, now we will be querying real data from the public TERRA REF database. So we will use a new URL, https://terraref.ncsa.illinois.edu/bety/, and we will need to use our own private key. 
```{r terraref-connect-options} -options(betydb_key = readLines('.betykey', warn = FALSE), +options(betydb_key = readLines('traits/.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", betydb_api_version = 'v1') ``` From 56204cd9b065d7c96aa5c592d928caebe5f0c005 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 4 Dec 2018 16:53:16 -0800 Subject: [PATCH 31/83] added tutorial 07 to rmd_files --- _bookdown.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_bookdown.yml b/_bookdown.yml index ae6599e..16d3ee9 100644 --- a/_bookdown.yml +++ b/_bookdown.yml @@ -5,5 +5,5 @@ language: chapter_name: "Chapter " rmd_files: ["index.Rmd", "traits/00-BETYdb-getting-started.Rmd", "traits/01-web-access.Rmd", "traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd", -"traits/04-danforth-indoor-phenotyping-facility.Rmd"]#, "10-simulated-sorghum.Rmd" +"traits/04-danforth-indoor-phenotyping-facility.Rmd", "traits/07-betydb-sql-access.Rmd"]#, "10-simulated-sorghum.Rmd" From 81756bbac44c8d2eb5aac8832da4df4be5c2ee9e Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Wed, 5 Dec 2018 08:19:37 -0800 Subject: [PATCH 32/83] minor edit to rmd_files --- _bookdown.yml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/_bookdown.yml b/_bookdown.yml index 16d3ee9..d7c070f 100644 --- a/_bookdown.yml +++ b/_bookdown.yml @@ -3,7 +3,7 @@ output_dir: "docs" language: ui: chapter_name: "Chapter " -rmd_files: ["index.Rmd", "traits/00-BETYdb-getting-started.Rmd", -"traits/01-web-access.Rmd", "traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd", -"traits/04-danforth-indoor-phenotyping-facility.Rmd", "traits/07-betydb-sql-access.Rmd"]#, "10-simulated-sorghum.Rmd" +rmd_files: ["index.Rmd", "traits/00-BETYdb-getting-started.Rmd", "traits/01-web-access.Rmd", +"traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd", "traits/04-danforth-indoor-phenotyping-facility.Rmd", 
+"traits/07-betydb-sql-access.Rmd"]#, "10-simulated-sorghum.Rmd" From 7de8f6de877d91d29de99096c3db47b6e93e6130 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Wed, 5 Dec 2018 08:21:54 -0800 Subject: [PATCH 33/83] minor edits --- traits/04-danforth-indoor-phenotyping-facility.Rmd | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/traits/04-danforth-indoor-phenotyping-facility.Rmd b/traits/04-danforth-indoor-phenotyping-facility.Rmd index e2c10b8..742239c 100644 --- a/traits/04-danforth-indoor-phenotyping-facility.Rmd +++ b/traits/04-danforth-indoor-phenotyping-facility.Rmd @@ -90,7 +90,6 @@ ggplot(data = danforth_sorghum) + ``` - ### Growth rate over time ```{r danforth-phenotypes, fig.width=8, fig.height=4} @@ -101,12 +100,12 @@ ggplot(data = danforth_sorghum, aes(x = date, y = mean, color = cultivar)) + geom_point(alpha = 0.4, size = 0.1) + facet_wrap(~label, scales = 'free_y') + labs(color = 'Cultivar') + ggthemes::theme_few() + ``` ### Your turn - 1. Compute phenotypes for each cultivar 2. An 'entity' is a replicate. * How many entities are there? * How many entities per cultivar? - * Did they all make it through the entire growing season? \ No newline at end of file + * Did they all make it through the entire growing season? 
From bdab514f630cc04d294b18ac7f5d82110017586d Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 10 Dec 2018 10:47:25 -0800 Subject: [PATCH 34/83] removed extra parentheses; commented out portion of code that called the 'yields' object (object not ever created - and no rows returned for yields table query) --- traits/06-agronomic-metadata.Rmd | 105 ++++++++++++++++--------------- 1 file changed, 53 insertions(+), 52 deletions(-) diff --git a/traits/06-agronomic-metadata.Rmd b/traits/06-agronomic-metadata.Rmd index 2a26fa8..df42621 100644 --- a/traits/06-agronomic-metadata.Rmd +++ b/traits/06-agronomic-metadata.Rmd @@ -78,8 +78,8 @@ options(betydb_key = readLines('.betykey', warn = FALSE), betydb_api_version = 'v1') ## query and join tables -species <- (betydb_query(table = "species") - %>% select(specie_id = id, scientificname, genus)) +species <- betydb_query(table = "species") %>% + select(specie_id = id, scientificname, genus) sites <- betydb_query(table = "sites") names(sites)[1] <- 'site_id' @@ -89,29 +89,29 @@ sites %>% group_by(city, state, country) %>% summarize(n()) # A simple plot of all site coordinates. 
# Marker pins = sites with coords reported as a single point # Red polygons = sites reporting full boundaries -site_geom <- (sites - %>% filter(!is.na(geometry)) - %>% group_by(id) - %>% do(parsed_geometry = readWKT(text = .$geometry, id = .$id)) - %>% mutate(geom_type = class(parsed_geometry))) +site_geom <- sites %>% + filter(!is.na(geometry)) %>% + group_by(site_id) %>% + do(parsed_geometry = readWKT(text = .$geometry, id = .$site_id)) %>% + mutate(geom_type = class(parsed_geometry)) sites_poly <- do.call("rbind", filter(site_geom, geom_type == "SpatialPolygons")$parsed_geometry) sites_point <- do.call("rbind", filter(site_geom, geom_type == "SpatialPoints")$parsed_geometry) leaflet() %>% addTiles() %>% addPolygons(data = sites_poly, color = "red") #%>% addMarkers(data = sites_point) # points removed by only querying Season 2 -citations <- (betydb_query(table = "citations")#, betyurl = betyurl, key = betykey_public, api_version = "v1") - %>% select(citation_id = id, author, year, title)) +citations <- betydb_query(table = "citations") %>% + select(citation_id = id, author, year, title) -traits <- (betydb_query(table = "traits") - %>% select( +traits <- betydb_query(table = "traits") %>% + select( id, date, mean, n, statname, stat, site_id, specie_id, treatment_id, - citation_id, cultivar_id) - %>% left_join(species, by = 'specie_id') - %>% left_join(sites, by = 'site_id') - %>% left_join(citations, by = 'citation_id')) + citation_id, cultivar_id) %>% + left_join(species, by = 'specie_id') %>% + left_join(sites, by = 'site_id') %>% + left_join(citations, by = 'citation_id') ``` @@ -120,46 +120,47 @@ Let's do the manual equivalent of a cross-table join. BETY actually does contain The key idea here is that each treatment is associated with some (possibly many) managements, but the treatments table only reports the number of associated managements. To see the management IDs themselves, we need to query an individual treatment ID. 
So, we retrieve one table, then iterate over each row extracting the foreign keys for the other table. This requires an API call for every treatment, so beware that it is likely to be slow! ```{r} -treatments <- (betydb_query(table = 'treatments') - %>% select(treatment_id = id , name, definition, control)) +treatments <- betydb_query(table = 'treatments') %>% + select(treatment_id = id , name, definition, control) get_mgid <- function(trtid){ betydb_record(id = trtid, table = "treatments")$managements$management.id } -managements_treatments <- (treatments - %>% group_by(treatment_id) - %>% do(management_id = get_mgid(.$treatment_id)) - %>% unnest()) - -managements <- (betydb_query(table = 'managements') - %>% filter(mgmttype %in% c('Fertilization_N', 'Planting', 'Irrigation')) - %>% select(management_id = id, date, mgmttype, level, units) - %>% left_join(managements_treatments, by = 'management_id') - %>% left_join(treatments, by = 'treatment_id')) - -planting <- (managements - %>% filter(mgmttype == "Planting") - %>% select(treatment_id, planting_date = date, nrate = level)) - -grass_yields <- (yields - %>% filter(genus %in% c('Miscanthus', 'Panicum')) - %>% left_join(planting, by = 'treatment_id') - %>% collect - %>% replace_na(replace = list(nrate = 0)) - %>% mutate( - age = year(date) - year(planting_date), - SE = case_when( - .$statname == "SE" ~ .$stat, - .$statname == 'SD' ~ .$stat / sqrt(.$n), - TRUE ~ NA_real_), - continent = case_when( - .$lon < -30 ~ "united_states", - .$lon < 75 ~ "europe", - TRUE ~ "asia")) - %>% filter(!duplicated(.))) - -ggplot(data = grass_yields, aes(lon,lat)) + - geom_point(aes(color = genus, size = mean), - alpha = 0.1) +managements_treatments <- treatments %>% + group_by(treatment_id) %>% + do(management_id = get_mgid(.$treatment_id)) %>% + filter(!is.null(management_id)) %>% + unnest() + +managements <- betydb_query(table = 'managements') %>% + filter(mgmttype %in% c('Fertilization_N', 'Planting', 'Irrigation')) %>% + 
select(management_id = id, date, mgmttype, level, units) %>% + left_join(managements_treatments, by = 'management_id') %>% + left_join(treatments, by = 'treatment_id') + +planting <- managements %>% + filter(mgmttype == "Planting") %>% + select(treatment_id, planting_date = date, nrate = level) + +#grass_yields <- yields %>% +# filter(genus %in% c('Miscanthus', 'Panicum')) %>% +# left_join(planting, by = 'treatment_id') %>% +# collect %>% +# replace_na(replace = list(nrate = 0)) %>% +# mutate( +# age = year(date) - year(planting_date), +# SE = case_when( +# .$statname == "SE" ~ .$stat, +# .$statname == 'SD' ~ .$stat / sqrt(.$n), +# TRUE ~ NA_real_), +# continent = case_when( +# .$lon < -30 ~ "united_states", +# .$lon < 75 ~ "europe", +# TRUE ~ "asia")) %>% +# filter(!duplicated(.)) + +#ggplot(data = grass_yields, aes(lon,lat)) + +# geom_point(aes(color = genus, size = mean), +# alpha = 0.1) ``` From 29f05cc01b5d0d9df0aec7b922417c9d49003fe1 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 10 Dec 2018 12:45:11 -0800 Subject: [PATCH 35/83] rearranged chunks and added some chunk options to improve output --- traits/06-agronomic-metadata.Rmd | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/traits/06-agronomic-metadata.Rmd b/traits/06-agronomic-metadata.Rmd index df42621..14f802c 100644 --- a/traits/06-agronomic-metadata.Rmd +++ b/traits/06-agronomic-metadata.Rmd @@ -63,7 +63,7 @@ Here are some key tables and fields that we will look at: | management_id | managements.id | -```{r} +```{r set-up, include = FALSE} library(dplyr) library(tidyr) @@ -77,6 +77,11 @@ options(betydb_key = readLines('.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", betydb_api_version = 'v1') +``` + + + +```{r} ## query and join tables species <- betydb_query(table = "species") %>% select(specie_id = id, scientificname, genus) @@ -143,6 +148,9 @@ planting <- managements %>% filter(mgmttype == "Planting") %>% select(treatment_id, 
planting_date = date, nrate = level) +``` + +```{r yields-chunk, include = FALSE} #grass_yields <- yields %>% # filter(genus %in% c('Miscanthus', 'Panicum')) %>% # left_join(planting, by = 'treatment_id') %>% From 6c0a6add2e8a0d44c58a45dee33ef4769fd10b78 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 10 Dec 2018 12:52:25 -0800 Subject: [PATCH 36/83] added chunk option --- traits/03-access-r-traits.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index 7f5b48c..0e1e5a0 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -76,7 +76,7 @@ sorghum_info <- betydb_query(table = 'species', ### Time series of height Now let's query some trait data. -```{r sv_area, echo = TRUE} +```{r sv_area, echo = TRUE, results = FALSE} sorghum_height <- betydb_query(table = 'search', trait = "canopy_height", site = "~Season 6", From 11439e42990391b7cbdc93b86a23b6a904fa31a1 Mon Sep 17 00:00:00 2001 From: David LeBauer Date: Mon, 10 Dec 2018 15:25:33 -0700 Subject: [PATCH 37/83] Traits tutorials revisions (#41) * added some context to index * updated / added some stubs where we need more details --- index.Rmd | 56 +++++++++++++++++++++++++++++++-- traits/02-betydb-api-access.Rmd | 38 ++++++++++++---------- traits/03-access-r-traits.Rmd | 16 +++++----- 3 files changed, 83 insertions(+), 27 deletions(-) diff --git a/index.Rmd b/index.Rmd index 26484cc..e781f31 100644 --- a/index.Rmd +++ b/index.Rmd @@ -6,10 +6,62 @@ date: "`r Sys.Date()`" documentclass: book output: bookdown::gitbook: default - bookdown::pdf_book: default --- -# Preamble +# Overview + +This book is intended to introduce users to TERRA REF data as quickly as possible. + +It introduces to the wide range of phenomics datasets generated by the TERRA Reference program. Not only does TERRA REF have a large number of data sets, but many of the databases can be accessed in a number of different ways. 
While this makes it more complicated to learn, the goal is to provide users with the flexibility to access data in the most useful way. + +## User Accounts and permission to access TERRA REF data + +TODO: link to relevant parts of docs.terraref.org + +## Ways of Acessing Data + +* Web Interfaces +* Files +* Programming APIs +* API Clients + +## Other Resources + +The TERRA REF website: terraref.org +The TERRA REF Technical Documentation: docs.terraref.org + +## Contents + +Scope ... + +Audience ... + + +## Pre-requisites + +While we assume that readers will have some familiarity with the nature of the problem - remote sensing of crop plants - for the most part, these tutorials assume that the user will bring their own scientific questions and a sense of curiosity and are eager to learn. + +Some of the lessons only require a web browser; others will assume familarity with programming at the command line in (typically only one of) Python, R, and / or SQL. You should be willing to find help (see finding help, below). + +## Technical Requirements + +At a minimum, you should have: + +* An internet connection +* Web Browser +* A TERRA REF Beta User account + * If you have not done so, please sign up at terraref.org/beta +* Access to the data that you are using + * The tutorials will state which databases you will need access to +* Software: + * Software requirements vary with the tutorials, and may be complex + * + +## Finding help + +* Slack +* GitHub +* Google ```{r} knitr::opts_chunk$set(echo = FALSE, cache = TRUE) diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd index 4c0b33c..eafb8e0 100644 --- a/traits/02-betydb-api-access.Rmd +++ b/traits/02-betydb-api-access.Rmd @@ -1,15 +1,27 @@ # Accessing Trait Data Via the BETYdb API -## Using URLs to construct Queries - -The first step toward reproducible pipelines is to automate the process of searching the database and returning results. 
This is one of the key roles of an Application programming interface, or 'API'. You can learn to use the API in less than 20 minutes, starting now. +This will teach you how to query trait data using a browser as well as using the command line tool `curl`. This interface is the primary way in which you can access data from the command line. -### What is an API? +## What is an API? An API is an 'Application Programming Interface'. An API is a way that you and your software can connect to and access data. All of our databases have web interfaces for humans to browse as well as APIs that are constructed as URLs. +## Tutorial Contents + +In this tutorial, we will describe three ways to access data using: + +1. A URL typed into your browser +2. The command line, or terminal +3. The R jsonlite package + +We also have interfaces using R 'traits' package or the Python 'terrautils' package that return data in a more familiar and ready to analyze tabular format; these will be described later. You can skip ahead to those chapters, but this chapter will provide some insight into the methods that underlie those libraries. + +## Using URLs to construct Queries + +The first step toward reproducible pipelines is to automate the process of searching the database and returning results. This is one of the key roles of an Application programming interface, or 'API'. You can learn to use the API in less than 20 minutes, starting now. + ### Using Your API key to Connect @@ -27,17 +39,11 @@ Here is how to find and save your API key: For the public key, you can call this file `.betykey_public`. -### Ways to access API data - -1. Through a URL query -2. Using the bash shell -3. 
Using the R jsonlite package +## Accessing data using a URL query -### Accessing data using a URL query - -## Components of a URL query +### Components of a URL query * base url: `terraref.ncsa.illinois.edu/bety` * path to the api: `/api/v1` @@ -45,7 +51,7 @@ For the public key, you can call this file `.betykey_public`. * Query parameters: `genus=Sorghum` * Authentication: `key=9999999999999999999999999999999999999999` is the public key for the TERRA REF traits database. -## Constructing a URL query +### Constructing a URL query First, lets construct a query by putting together a URL. @@ -72,9 +78,7 @@ First, lets construct a query by putting together a URL. What do you see? Do you think that this is all of the records? What happens if you add `&limit=none`? - - -#### Accessing data using the Shell +## Accessing data using the Command Line Terminal Type the following command into a bash shell (the `-o` option names the output file): @@ -90,7 +94,7 @@ curl -o sorghum.json \ "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=`cat .betykey_public`" ``` -### Accessing API data using the R jsonlite package +## Using the R jsonlite package to access the API with a URL query ```{r text-api, warning = FALSE} sorghum.json <- readLines( diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index 0a4e713..9ed7289 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -1,20 +1,20 @@ # Accessing Trait Data in R -## Using the R traits package to query the database +The rOpenSci traits package makes it easier to query the TERRA REF trait database because 1) you can pass the query parameters in an R function, and the package takes care of putting the parameters into a valid URL and 2) because the package returns data in a tabular format that is ready to analyze. -The rOpenSci traits package makes it easier to query the TERRA REF trait database, or any database that uses BETYdb software. 
+## Using the R traits package to query the database -First, make sure we have the latest version from the terraref fork of the repository on github. (you can install using the standard `install.packages('traits')` but as of Jan 2018 this version times out on very large datasets). +## Setup -### Install the traits package +Install the traits package -This is for users working on their own computer - if you are using the TERRA REF Rstudio Docker container (including on workbench) you can skip this step. +The traits package is on CRAN, and can therefore be installed using the following command: ```{r install_traits, echo = FALSE, include = FALSE} -devtools::install_github('terraref/traits') +install.packages('traits') ``` -Now, we can load the packages that we will need to get started. +Load other packages that we will need to get started. ```{r 00-setup, message = FALSE} library(traits) @@ -24,7 +24,7 @@ theme_set(theme_bw()) library(dplyr) ``` - +Create a file that contains your API key. If you have signed up for access to the TERRA REF database, your API key will have been sent to you in an email. The public key will provide access to all metadata; you will need a personal key _and_ permissions to access the trait data. If you receive empty (NULL) datasets, it is likely that you do not have permissions. 
```{r writing-key} # This should be done once with the key sent to you in your email From 3577f0260092fb638ec87967f8e73df6b4bfd5e3 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 10 Dec 2018 16:42:38 -0800 Subject: [PATCH 38/83] updated rmd_files (added tutorials 5 and 6) --- _bookdown.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_bookdown.yml b/_bookdown.yml index d7c070f..c7087a1 100644 --- a/_bookdown.yml +++ b/_bookdown.yml @@ -4,6 +4,6 @@ language: ui: chapter_name: "Chapter " rmd_files: ["index.Rmd", "traits/00-BETYdb-getting-started.Rmd", "traits/01-web-access.Rmd", -"traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd", "traits/04-danforth-indoor-phenotyping-facility.Rmd", -"traits/07-betydb-sql-access.Rmd"]#, "10-simulated-sorghum.Rmd" +"traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd", "traits/04-danforth-indoor-phenotyping-facility.Rmd", +"traits/05-maricopa-field-scanner.Rmd", "traits/06-agronomic-metadata.Rmd", "traits/07-betydb-sql-access.Rmd"]#, "10-simulated-sorghum.Rmd" From 088e186c4f1a576b5a84b2d3d5da9ed61cc00506 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 10 Dec 2018 16:43:27 -0800 Subject: [PATCH 39/83] added section header; added pointer urls --- index.Rmd | 31 +++++++++++++++++++++---------- 1 file changed, 21 insertions(+), 10 deletions(-) diff --git a/index.Rmd b/index.Rmd index e781f31..884bd6e 100644 --- a/index.Rmd +++ b/index.Rmd @@ -8,6 +8,8 @@ output: bookdown::gitbook: default --- +# Section 1: Traits {-} + # Overview This book is intended to introduce users to TERRA REF data as quickly as possible. 
@@ -18,17 +20,26 @@ It introduces to the wide range of phenomics datasets generated by the TERRA Ref TODO: link to relevant parts of docs.terraref.org +* Info on how to [request access to data](https://docs.terraref.org/user-manual/how-to-access-data/using-betydb-trait-data-experimental-metadata) + ## Ways of Acessing Data * Web Interfaces + + [Clowder](https://docs.terraref.org/user-manual/how-to-access-data/using-clowder-sensor-and-genoomics-data) (sensor and genomic data) + + [Globus](https://docs.terraref.org/user-manual/how-to-access-data/using-globus-sensor-and-genomics-data) (sensor and genomic data) + + [BETYdb](https://docs.terraref.org/user-manual/how-to-access-data/using-betydb-trait-data-experimental-metadata) (trait data and experimental metadata) + + [CoGe](https://docs.terraref.org/user-manual/how-to-access-data/using-coge-genomics) (genomic data) * Files * Programming APIs + + [BETYdb API](https://pecan.gitbook.io/betydb-data-access/api-for-url-based-queries) * API Clients + + [rOpenSci traits package](https://pecan.gitbook.io/betydb-data-access/ropensci-traits-package) ## Other Resources -The TERRA REF website: terraref.org -The TERRA REF Technical Documentation: docs.terraref.org +The TERRA REF website: [terraref.org](http://terraref.org/) + +The TERRA REF Technical Documentation: [docs.terraref.org](docs.terraref.org) ## Contents @@ -50,20 +61,20 @@ At a minimum, you should have: * An internet connection * Web Browser * A TERRA REF Beta User account - * If you have not done so, please sign up at terraref.org/beta + + If you have not done so, please sign up at [terraref.org/beta](terraref.org/beta) * Access to the data that you are using - * The tutorials will state which databases you will need access to + + The tutorials will state which databases you will need access to * Software: - * Software requirements vary with the tutorials, and may be complex - * + + Software requirements vary with the tutorials, and may be complex + ## Finding help -* 
Slack -* GitHub -* Google +- [Slack](terra-ref.slack.com) +- [GitHub](https://github.com/terraref/tutorials) +- [Google](https://www.google.com/) -```{r} +```{r, include = FALSE} knitr::opts_chunk$set(echo = FALSE, cache = TRUE) options(warn = -1) ``` From 739a691823deb5669ab3313685e180e80ef6c7e7 Mon Sep 17 00:00:00 2001 From: David LeBauer Date: Tue, 11 Dec 2018 15:36:54 -0700 Subject: [PATCH 40/83] added some context to traits/06 --- traits/06-agronomic-metadata.Rmd | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/traits/06-agronomic-metadata.Rmd b/traits/06-agronomic-metadata.Rmd index 14f802c..0d4db17 100644 --- a/traits/06-agronomic-metadata.Rmd +++ b/traits/06-agronomic-metadata.Rmd @@ -1,4 +1,12 @@ -# Phenotype Analysis +# Querying Agronomic Meta-data + +In previous tutorials you have learned how to query trait data using a variety of different methods, including the web interface, an API, and the R traits package. Here you will continute to use the R traits package, and learn how to access meta-data from other tables in the database. + +While the basic search query that we have used in previous sections provides the key information that you may need for an analysis - the genotype name, the location, date, and method, there are other tables that contain more specific metadata. + +For example, the managements table provides information about planting and harvest dates, planting density, and rates of fertilizer, pesticide, and herbicide applications. + +While the main search results provide the latitude and longitude of the center of each plot, if you query the sites table directly you can also find the plot boundary - this can be useful for subsetting georeferenced images. 
## Joining database tables From fde14fad76e3cd9ea66686094763992864eb48fc Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 11 Dec 2018 15:16:50 -0800 Subject: [PATCH 41/83] changed files to include in rmd_files --- _bookdown.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/_bookdown.yml b/_bookdown.yml index c7087a1..9bfa094 100644 --- a/_bookdown.yml +++ b/_bookdown.yml @@ -5,5 +5,5 @@ language: chapter_name: "Chapter " rmd_files: ["index.Rmd", "traits/00-BETYdb-getting-started.Rmd", "traits/01-web-access.Rmd", "traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd", "traits/04-danforth-indoor-phenotyping-facility.Rmd", -"traits/05-maricopa-field-scanner.Rmd", "traits/06-agronomic-metadata.Rmd", "traits/07-betydb-sql-access.Rmd"]#, "10-simulated-sorghum.Rmd" +"traits/06-agronomic-metadata.Rmd", "traits/07-betydb-sql-access.Rmd", "traits/10-simulated-sorghum.Rmd"]#, "10-simulated-sorghum.Rmd" From df7096f508d22dcb1dae0269c00baf3e6285ae5a Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 11 Dec 2018 15:17:28 -0800 Subject: [PATCH 42/83] removed section header - not supposed to be in this file --- index.Rmd | 1 - 1 file changed, 1 deletion(-) diff --git a/index.Rmd b/index.Rmd index 884bd6e..8da0886 100644 --- a/index.Rmd +++ b/index.Rmd @@ -8,7 +8,6 @@ output: bookdown::gitbook: default --- -# Section 1: Traits {-} # Overview From 3ff3e25576e4c149cef49aedebb55dbb572f50d2 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 11 Dec 2018 15:17:44 -0800 Subject: [PATCH 43/83] created section header --- traits/00-BETYdb-getting-started.Rmd | 2 ++ 1 file changed, 2 insertions(+) diff --git a/traits/00-BETYdb-getting-started.Rmd b/traits/00-BETYdb-getting-started.Rmd index 3cf1201..5457247 100644 --- a/traits/00-BETYdb-getting-started.Rmd +++ b/traits/00-BETYdb-getting-started.Rmd @@ -1,3 +1,5 @@ +# (PART\*) Secton 1: Traits {-} + # Getting Started with BETYdb ## TERRA Ref Trait Database From 
b9dde017e399e30a86fc6c3e4324fea042096ca9 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 11 Dec 2018 15:18:58 -0800 Subject: [PATCH 44/83] changed some chunk parameters and added repository for traits installation --- traits/03-access-r-traits.Rmd | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index c78870f..97a1920 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -10,13 +10,13 @@ Install the traits package The traits package is on CRAN, and can therefore be installed using the following command: -```{r install_traits, echo = FALSE, include = FALSE} -install.packages('traits') +```{r install_traits, echo = TRUE, message = FALSE} +install.packages('traits', repos = 'http://cran.rstudio.com/') ``` Load other packages that we will need to get started. -```{r 00-setup, message = FALSE} +```{r 00-setup, message = FALSE, echo = TRUE} library(traits) library(ggplot2) library(ggthemes) @@ -26,14 +26,14 @@ library(dplyr) Create a file that contains your API key. If you have signed up for access to the TERRA REF database, your API key will have been sent to you in an email. The public key will provide access to all metadata; you will need a personal key _and_ permissions to access the trait data. If you receive empty (NULL) datasets, it is likely that you do not have permissions. 
-```{r writing-key} +```{r writing-key, echo = TRUE} # This should be done once with the key sent to you in your email # writeLines('abcdefg_rest_of_key_sent_in_email', # con = '.betykey') # Example with the public key: writeLines('9999999999999999999999999999999999999999', - con = '.betykey_public') + con = 'traits/.betykey_public') ``` #### R - using the traits package @@ -69,7 +69,7 @@ options(betydb_key = readLines('traits/.betykey', warn = FALSE), Now the same query can be reduced to: -```{r query-species-reduce, echo = TRUE} +```{r query-species-reduce, echo = TRUE, results = FALSE} sorghum_info <- betydb_query(table = 'species', genus = "Sorghum", limit = 'none') From a9542da90a033c1d36818c9be34b9acb1618b5b9 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 11 Dec 2018 15:19:13 -0800 Subject: [PATCH 45/83] changed chunk parameters --- traits/04-danforth-indoor-phenotyping-facility.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/traits/04-danforth-indoor-phenotyping-facility.Rmd b/traits/04-danforth-indoor-phenotyping-facility.Rmd index 742239c..b53f7ec 100644 --- a/traits/04-danforth-indoor-phenotyping-facility.Rmd +++ b/traits/04-danforth-indoor-phenotyping-facility.Rmd @@ -29,7 +29,7 @@ options(betydb_key = readLines('traits/.betykey', warn = FALSE), First we will use the generic search to query the output from the Lemnatec indoor phenotyping system at the Danforth Center in St. Louis, MO. 
-```{r query-danforth, results='hide'} +```{r query-danforth, message = FALSE} danforth_sorghum <- traits::betydb_query( # sitename = 'Danforth Plant Science Center Bellweather Phenotyping Facility', trait = 'sv_area', From 05155fb24a252e85299e8e5dea27d2a09f5fb954 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 11 Dec 2018 15:20:10 -0800 Subject: [PATCH 46/83] changed chunk parameters and commented out sections that were not running correctly (HTTP 504) --- traits/05-maricopa-field-scanner.Rmd | 51 +++++++++++++++++----------- 1 file changed, 31 insertions(+), 20 deletions(-) diff --git a/traits/05-maricopa-field-scanner.Rmd b/traits/05-maricopa-field-scanner.Rmd index d345f65..910228d 100644 --- a/traits/05-maricopa-field-scanner.Rmd +++ b/traits/05-maricopa-field-scanner.Rmd @@ -23,7 +23,7 @@ options(betydb_key = readLines('.betykey', warn = FALSE), First, query the plots for Season 2. The simple way to use this is based on the fact that the plot names at Maricopa contain the season. 
-```{r traits-05-query-mac-sites} +```{r traits-05-query-mac-sites, echo = TRUE} sites <- betydb_query( table = "sites", city = "Maricopa", sitename = "~Season 2 range", limit = "none") @@ -33,10 +33,10 @@ A more robust (but complicated way) would be to query the experiments and experi ### Plot Season 2 plots -```{r traits-05-map-mac-polygons} +```{r traits-05-map-mac-polygons, echo = TRUE} site_bounds <- (sites - %>% rowwise() - %>% do(boundaries = readWKT(text = .$geometry, id = .$id))) + %>% rowwise() + %>% do(boundaries = readWKT(text = .$geometry, id = .$id))) site_bounds <- do.call('rbind', site_bounds$boundaries) #names(site_bounds) <- sites$sitename @@ -53,35 +53,46 @@ leaflet() %>% addPolygons(data=site_bounds, popup = sites$sitename) ``` +```{r} ## Cultivars +``` ```{r traits-05-mac-cultivars} -cultivars <- betydb_query( - table = "cultivars", limit = "none") %>% - rename(cultivar_id = id) - -traits <- traits <- betydb_search( - Season = "~Season 4", - include_unchecked = 'true', - limit = "none") %>% - rename(trait_id = id) +#cultivars <- betydb_query( +# table = "cultivars", limit = "none") %>% +# rename(cultivar_id = id) + + +# getting HTTP 504 - Gateway timeout +# comment this chunk out +#traits <- traits <- betydb_search( +# Season = "~Season 4", +# include_unchecked = 'true', +# limit = "none") %>% +# rename(trait_id = id) ``` - +```{r} ## Time series of canopy cover, height, NDVI -First look up variables by name. Let's look for measurements related to canopy size: +#First look up variables by name. 
Let's look for measurements related to canopy size: + +``` ```{r traits-05-height-cover-ndvi} -variables <- betydb_query( - table = "variables", name = "~^(NDVI|canopy_height|canopy_cover|)$") +#variables <- betydb_query( +# table = "variables", name = "~^(NDVI|canopy_height|canopy_cover|)$") -variables %>% - select(id, name, units, n_records = `number of associated traits`) +#variables %>% +# select(id, name, units, n_records = `number of associated traits`) ``` -Exercise: Why are there two variables named canopy_height, and what database fields should you examine to decide which one you want? +```{r} + +#Exercise: Why are there two variables named canopy_height, and what database fields should you examine to decide which one you want? + +``` ```{r traits-05-get-variables-comment} From abd27c39a6a6d0e6e28fdba63ca1528f4032de06 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Tue, 11 Dec 2018 15:20:55 -0800 Subject: [PATCH 47/83] changed chunk parameters --- traits/06-agronomic-metadata.Rmd | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/traits/06-agronomic-metadata.Rmd b/traits/06-agronomic-metadata.Rmd index 14f802c..1f02fc5 100644 --- a/traits/06-agronomic-metadata.Rmd +++ b/traits/06-agronomic-metadata.Rmd @@ -63,7 +63,7 @@ Here are some key tables and fields that we will look at: | management_id | managements.id | -```{r set-up, include = FALSE} +```{r tutorial-06-set-up, include = FALSE} library(dplyr) library(tidyr) @@ -73,7 +73,7 @@ library(rgeos) library(leaflet) year <- lubridate::year -options(betydb_key = readLines('.betykey', warn = FALSE), +options(betydb_key = readLines('traits/.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", betydb_api_version = 'v1') @@ -81,7 +81,7 @@ options(betydb_key = readLines('.betykey', warn = FALSE), -```{r} +```{r 06_tibble, echo = TRUE, warning = FALSE} ## query and join tables species <- betydb_query(table = "species") %>% select(specie_id = id, 
scientificname, genus) @@ -124,7 +124,7 @@ Let's do the manual equivalent of a cross-table join. BETY actually does contain The key idea here is that each treatment is associated with some (possibly many) managements, but the treatments table only reports the number of associated managements. To see the management IDs themselves, we need to query an individual treatment ID. So, we retrieve one table, then iterate over each row extracting the foreign keys for the other table. This requires an API call for every treatment, so beware that it is likely to be slow! -```{r} +```{r 06_cross_join, echo = TRUE, results = 'hide'} treatments <- betydb_query(table = 'treatments') %>% select(treatment_id = id , name, definition, control) From dfce7fc8224b57cddc7a22a97b193a3a8de65118 Mon Sep 17 00:00:00 2001 From: David LeBauer Date: Tue, 11 Dec 2018 16:57:24 -0700 Subject: [PATCH 48/83] fix canopy_height query ... using sitename now. --- traits/03-access-r-traits.Rmd | 20 ++++++++++---------- traits/06-agronomic-metadata.Rmd | 2 +- 2 files changed, 11 insertions(+), 11 deletions(-) diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index 97a1920..30191ff 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -33,7 +33,7 @@ Create a file that contains your API key. If you have signed up for access to th # Example with the public key: writeLines('9999999999999999999999999999999999999999', - con = 'traits/.betykey_public') + con = '.betykey_public') ``` #### R - using the traits package @@ -51,7 +51,7 @@ sorghum_info <- betydb_query(table = 'species', api_version = 'v1', limit = 'none', betyurl = "https://terraref.ncsa.illinois.edu/bety/", - key = readLines('traits/.betykey', warn = FALSE)) + key = readLines('.betykey', warn = FALSE)) ``` @@ -62,7 +62,7 @@ Notice all of the arguments that the `betydb_query` function requires? 
We can ch ```{r set-up, echo = TRUE} -options(betydb_key = readLines('traits/.betykey', warn = FALSE), +options(betydb_key = readLines('.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", betydb_api_version = 'v1') ``` @@ -71,18 +71,18 @@ Now the same query can be reduced to: ```{r query-species-reduce, echo = TRUE, results = FALSE} sorghum_info <- betydb_query(table = 'species', - genus = "Sorghum", - limit = 'none') + genus = "Sorghum", + limit = 'none') ``` ### Time series of height Now let's query some trait data. -```{r sv_area, echo = TRUE, results = FALSE} -sorghum_height <- betydb_query(table = 'search', - trait = "canopy_height", - site = "~Season 6", - limit = 'none') +```{r canopy_height, echo = TRUE, results = FALSE, cache = TRUE} +sorghum_height <- betydb_query(table = 'search', + trait = "canopy_height", + sitename = "~Season 6", + limit = 'none') ``` ```{r} diff --git a/traits/06-agronomic-metadata.Rmd b/traits/06-agronomic-metadata.Rmd index 0d4db17..86a03b7 100644 --- a/traits/06-agronomic-metadata.Rmd +++ b/traits/06-agronomic-metadata.Rmd @@ -71,7 +71,7 @@ Here are some key tables and fields that we will look at: | management_id | managements.id | -```{r set-up, include = FALSE} +```{r 06-setup, include = FALSE} library(dplyr) library(tidyr) From 26ad84476dca951da3ef4bde067aa164e74f9f39 Mon Sep 17 00:00:00 2001 From: David LeBauer Date: Tue, 11 Dec 2018 17:02:49 -0700 Subject: [PATCH 49/83] remove cache from canopy_height chunk --- traits/03-access-r-traits.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index 30191ff..ea0e107 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -78,7 +78,7 @@ sorghum_info <- betydb_query(table = 'species', ### Time series of height Now let's query some trait data. 
-```{r canopy_height, echo = TRUE, results = FALSE, cache = TRUE} +```{r canopy_height, echo = TRUE, results = FALSE} sorghum_height <- betydb_query(table = 'search', trait = "canopy_height", sitename = "~Season 6", From c358865b27784f5527107ef57a8554270b956092 Mon Sep 17 00:00:00 2001 From: David LeBauer Date: Tue, 11 Dec 2018 17:02:49 -0700 Subject: [PATCH 50/83] remove cache from canopy_height chunk --- traits/03-access-r-traits.Rmd | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index 30191ff..bb42101 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -61,7 +61,7 @@ Notice all of the arguments that the `betydb_query` function requires? We can ch -```{r set-up, echo = TRUE} +```{r 03-set-up, echo = TRUE} options(betydb_key = readLines('.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", betydb_api_version = 'v1') @@ -78,14 +78,14 @@ sorghum_info <- betydb_query(table = 'species', ### Time series of height Now let's query some trait data. 
-```{r canopy_height, echo = TRUE, results = FALSE, cache = TRUE} +```{r canopy_height, echo = TRUE, results = FALSE} sorghum_height <- betydb_query(table = 'search', trait = "canopy_height", sitename = "~Season 6", limit = 'none') ``` -```{r} +```{r plot_height} ggplot(data = sorghum_height, aes(x = lubridate::yday(lubridate::ymd_hms(raw_date)), y = mean)) + geom_point(size = 0.5, position = position_jitter(width = 0.1)) + From 2b028c116881cb462cfa94b9af2725cde3db6324 Mon Sep 17 00:00:00 2001 From: David LeBauer Date: Mon, 17 Dec 2018 12:41:01 -0700 Subject: [PATCH 51/83] query canopy_height from season 2 instead of season 6 --- traits/03-access-r-traits.Rmd | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index bb42101..1ff6940 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -69,7 +69,7 @@ options(betydb_key = readLines('.betykey', warn = FALSE), Now the same query can be reduced to: -```{r query-species-reduce, echo = TRUE, results = FALSE} +```{r query-species-reduce, echo = TRUE} sorghum_info <- betydb_query(table = 'species', genus = "Sorghum", limit = 'none') @@ -81,7 +81,7 @@ Now let's query some trait data. 
```{r canopy_height, echo = TRUE, results = FALSE} sorghum_height <- betydb_query(table = 'search', trait = "canopy_height", - sitename = "~Season 6", + sitename = "~Season 2", limit = 'none') ``` From c46d99a5a57c2335016c493d2508d839ddddc970 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 17 Dec 2018 14:51:50 -0800 Subject: [PATCH 52/83] updated tutorials to include in rmd_files --- _bookdown.yml | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/_bookdown.yml b/_bookdown.yml index 9bfa094..08e5219 100644 --- a/_bookdown.yml +++ b/_bookdown.yml @@ -5,5 +5,4 @@ language: chapter_name: "Chapter " rmd_files: ["index.Rmd", "traits/00-BETYdb-getting-started.Rmd", "traits/01-web-access.Rmd", "traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd", "traits/04-danforth-indoor-phenotyping-facility.Rmd", -"traits/06-agronomic-metadata.Rmd", "traits/07-betydb-sql-access.Rmd", "traits/10-simulated-sorghum.Rmd"]#, "10-simulated-sorghum.Rmd" - +"traits/05-maricopa-field-scanner.Rmd", "traits/06-agronomic-metadata.Rmd", "traits/07-betydb-sql-access.Rmd", "traits/10-simulated-sorghum.Rmd"] From 7820ad561bb16d01933f03e370438efc924bfb5d Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 17 Dec 2018 14:53:24 -0800 Subject: [PATCH 53/83] removed cache = TRUE from chunk option --- index.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/index.Rmd b/index.Rmd index 8da0886..d228b44 100644 --- a/index.Rmd +++ b/index.Rmd @@ -74,7 +74,7 @@ At a minimum, you should have: - [Google](https://www.google.com/) ```{r, include = FALSE} -knitr::opts_chunk$set(echo = FALSE, cache = TRUE) +knitr::opts_chunk$set(echo = FALSE) options(warn = -1) ``` From 084912b5511a6dcd6de8fd8bd536a16241705185 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 17 Dec 2018 14:53:51 -0800 Subject: [PATCH 54/83] changed chunk options and path to .betykey --- traits/04-danforth-indoor-phenotyping-facility.Rmd | 6 +++--- 1 file changed, 3 insertions(+), 
3 deletions(-) diff --git a/traits/04-danforth-indoor-phenotyping-facility.Rmd b/traits/04-danforth-indoor-phenotyping-facility.Rmd index b53f7ec..238ecfa 100644 --- a/traits/04-danforth-indoor-phenotyping-facility.Rmd +++ b/traits/04-danforth-indoor-phenotyping-facility.Rmd @@ -1,7 +1,7 @@ # Danforth Indoor Phenotype Analysis ```{r 02-setup, include=FALSE} -knitr::opts_chunk$set(echo = TRUE, cache = TRUE) +knitr::opts_chunk$set(echo = TRUE, cache = FALSE) library(jsonlite) library(dplyr) library(ggplot2) @@ -21,7 +21,7 @@ library(traits) Unlike the first two tutorials, now we will be querying real data from the public TERRA REF database. So we will use a new URL, https://terraref.ncsa.illinois.edu/bety/, and we will need to use our own private key. ```{r terraref-connect-options} -options(betydb_key = readLines('traits/.betykey', warn = FALSE), +options(betydb_key = readLines('.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", betydb_api_version = 'v1') ``` @@ -92,7 +92,7 @@ ggplot(data = danforth_sorghum) + ### Growth rate over time -```{r danforth-phenotypes, fig.width=8, fig.height=4} +```{r danforth-phenotypes, fig.width=8, fig.height=4, message = FALSE} ggplot(data = danforth_sorghum, aes(x = date, y = mean, color = cultivar)) + # geom_line(aes(group = entity), size = 0.1) + From 66a0dd2369b388f477efaa6c2f0bc8dffe762952 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 17 Dec 2018 14:55:22 -0800 Subject: [PATCH 55/83] changed chunk options and path to .betykey --- traits/05-maricopa-field-scanner.Rmd | 12 +++++---- traits/06-agronomic-metadata.Rmd | 2 +- traits/10-simulated-sorghum.Rmd | 39 ++++++++++++++-------------- 3 files changed, 28 insertions(+), 25 deletions(-) diff --git a/traits/05-maricopa-field-scanner.Rmd b/traits/05-maricopa-field-scanner.Rmd index 910228d..e5a3be6 100644 --- a/traits/05-maricopa-field-scanner.Rmd +++ b/traits/05-maricopa-field-scanner.Rmd @@ -1,7 +1,7 @@ # Plot level data from the field 
scanner in Maricopa, AZ ```{r traits-05-mac-traits-setup, include=FALSE} -knitr::opts_chunk$set(echo = FALSE, cache = TRUE) +knitr::opts_chunk$set(echo = FALSE, cache = FALSE) library(dplyr) library(tidyr) library(ggplot2) @@ -23,7 +23,7 @@ options(betydb_key = readLines('.betykey', warn = FALSE), First, query the plots for Season 2. The simple way to use this is based on the fact that the plot names at Maricopa contain the season. -```{r traits-05-query-mac-sites, echo = TRUE} +```{r traits-05-query-mac-sites, echo = TRUE, message = FALSE} sites <- betydb_query( table = "sites", city = "Maricopa", sitename = "~Season 2 range", limit = "none") @@ -34,9 +34,10 @@ A more robust (but complicated way) would be to query the experiments and experi ### Plot Season 2 plots ```{r traits-05-map-mac-polygons, echo = TRUE} -site_bounds <- (sites - %>% rowwise() - %>% do(boundaries = readWKT(text = .$geometry, id = .$id))) + +site_bounds <- sites %>% + rowwise() %>% + do(boundaries = readWKT(text = .$geometry, id = .$id)) site_bounds <- do.call('rbind', site_bounds$boundaries) #names(site_bounds) <- sites$sitename @@ -54,6 +55,7 @@ leaflet() %>% ``` ```{r} + ## Cultivars ``` diff --git a/traits/06-agronomic-metadata.Rmd b/traits/06-agronomic-metadata.Rmd index 899d696..b701788 100644 --- a/traits/06-agronomic-metadata.Rmd +++ b/traits/06-agronomic-metadata.Rmd @@ -81,7 +81,7 @@ library(rgeos) library(leaflet) year <- lubridate::year -options(betydb_key = readLines('traits/.betykey', warn = FALSE), +options(betydb_key = readLines('.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", betydb_api_version = 'v1') diff --git a/traits/10-simulated-sorghum.Rmd b/traits/10-simulated-sorghum.Rmd index 71abeb4..03a8041 100644 --- a/traits/10-simulated-sorghum.Rmd +++ b/traits/10-simulated-sorghum.Rmd @@ -1,8 +1,8 @@ # A Simulated Phenotype Dataset -```{r warnings=FALSE, echo=FALSE} +```{r include = FALSE} library(traits) -knitr::opts_chunk$set(echo = FALSE, 
cache = TRUE) +knitr::opts_chunk$set(echo = FALSE, cache =FALSE) library(ggplot2) library(ggthemes) library(GGally) @@ -145,7 +145,7 @@ ggplot(sorghum_sla) + ## Your turn: query the list of available traits from the variables table -```{r query-traits} +```{r query-traits, message = FALSE} trait_list <- c("Vcmax", "c2n_leaf", "cuticular_cond", "SLA", "quantum_efficiency", "leaf_respiration_rate_m2", "stomatal_slope.BB", "Jmax", "chi_leaf", "extinction_coefficient_diffuse") @@ -164,7 +164,7 @@ knitr::kable(variables %>% These traits are not time series, each of the ~500 genotypes is associated with a single value for each trait. This is different from the time series of LAI that we saw in the previous exercise or the biomass data that we will look at below. -```{r} +```{r traits-sel, message = FALSE} traits_list <- list() for(trait in trait_list){ @@ -202,7 +202,7 @@ knitr::kable(variables %>% select(name, description, units)) ``` -```{r all_sorghum, cache=TRUE} +```{r all_sorghum, cache = TRUE, results = 'hide'} site_id <- betydb_query(table = 'sites', sitename = "Central IL Plot D")$id @@ -224,24 +224,24 @@ for(t in c('canopy_height', 'stem_biomass', 'LAI', 'NDVI')){ ``` -``` This is how you can query a time series of sorghum height data for the Northern IL site. -```{r query-sorghum-height} -sorghum_height <- betydb_query(table = 'search', - trait = 'canopy_height', - year = 1022, - site = "~Northern IL", - limit = 'none') -#save(sorghum_height, file = 'data/sorghum_height.RData') +```{r query-sorghum-height, echo = TRUE} +#sorghum_height <- betydb_query(table = 'search', +# trait = 'canopy_height', +# year = 1022, +# site = "~Northern IL", +# limit = 'none') + +#save(sorghum_height, file = 'traits/sorghum_height.RData') ``` However, with almost 200k rows it currently takes 40 minutes to query (this is a limitation of the API). For the purposes of this tutorial, we will use a cached copy of the dataset. 
-```{r} -#load('data/sorghum_height.RData') +```{r 10-sim-sorg-plot, message = FALSE} +load('traits/sorghum_height.RData') s <- sorghum_height %>% mutate(day = lubridate::yday(raw_date), @@ -267,7 +267,7 @@ Now lets look at a 'pairs' plot to see if there is any covariance among the trai First, lets rearrange the data from 'long' to 'wide' format. We will also take this chance to rename the 'cultivar' field to 'genotype'. -```{r} +```{r 10_traits_wide, echo = TRUE} traits_wide <- traits %>% select(genotype = cultivar, trait, mean) %>% @@ -277,7 +277,7 @@ traits_wide <- traits %>% Now, lets create a variable called `max_height` -```{r max_height} +```{r max_height, echo = TRUE} # create the variable max height max_height <- s %>% group_by(genotype) %>% @@ -287,13 +287,14 @@ max_height <- s %>% Now, join the traits data frame with the new max_height data frame trait data we will merge the two data frames on the `genotype` field. -```{r join_traits_height} +```{r join_traits_height, echo = TRUE, warning = FALSE} + traits_height <- traits_wide %>% left_join(max_height, by = 'genotype') ``` Which traits are related to height? 
We can discover this in a few way, for example, a pairs plot that shows correltations: -```{r trait_pairs, fig.height = 8, fig.width = 8} +```{r trait_pairs, fig.height = 8, fig.width = 8, warning = FALSE} ggpairs(traits_height %>% select(-genotype), lower = list(continuous = 'density'), upper = list(continuous = 'cor'), From 097b7a33707afdf017a0027692f74fe30ee98b31 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 17 Dec 2018 14:56:10 -0800 Subject: [PATCH 56/83] loaded in jsonlite package; changed path to .betykey --- traits/02-betydb-api-access.Rmd | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd index eafb8e0..19fd2b6 100644 --- a/traits/02-betydb-api-access.Rmd +++ b/traits/02-betydb-api-access.Rmd @@ -96,10 +96,16 @@ curl -o sorghum.json \ ## Using the R jsonlite package to access the API with a URL query +```{r 02-jsonlite-load, include = FALSE} + +library(jsonlite) + +``` + ```{r text-api, warning = FALSE} sorghum.json <- readLines( paste0("https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=", - readLines('traits/.betykey'))) + readLines('.betykey'))) ## print(sorghum.json) ## not a particularly useful format From ad70a5e24f40126da6f4f1406680090231f860e3 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 17 Dec 2018 14:57:57 -0800 Subject: [PATCH 57/83] changed chunk options and installation of traits package from CRAN to github --- traits/03-access-r-traits.Rmd | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index 1ff6940..4b6e579 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -8,10 +8,14 @@ The rOpenSci traits package makes it easier to query the TERRA REF trait databas Install the traits package -The traits package is on CRAN, and can therefore be installed using the following command: +The traits 
package can be installed through github using the following command: ```{r install_traits, echo = TRUE, message = FALSE} -install.packages('traits', repos = 'http://cran.rstudio.com/') + +if(packageVersion("traits") == '0.2.0'){ + devtools::install_github('ropensci/traits') +} + ``` Load other packages that we will need to get started. @@ -44,7 +48,7 @@ The R traits package is an API 'client'. It does two important things: Lets start with the query of information about Sorghum from species table from above -```{r query-species, echo = TRUE} +```{r query-species, results = 'hide', echo = TRUE} sorghum_info <- betydb_query(table = 'species', genus = "Sorghum", @@ -69,7 +73,7 @@ options(betydb_key = readLines('.betykey', warn = FALSE), Now the same query can be reduced to: -```{r query-species-reduce, echo = TRUE} +```{r query-species-reduce, message = FALSE, echo = TRUE} sorghum_info <- betydb_query(table = 'species', genus = "Sorghum", limit = 'none') @@ -78,7 +82,7 @@ sorghum_info <- betydb_query(table = 'species', ### Time series of height Now let's query some trait data. 
-```{r canopy_height, echo = TRUE, results = FALSE} +```{r canopy_height, echo = TRUE, message = FALSE} sorghum_height <- betydb_query(table = 'search', trait = "canopy_height", sitename = "~Season 2", From 40a32466f8fd91712c286e686906b30ca2a5b9aa Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Wed, 23 Jan 2019 13:58:10 -0800 Subject: [PATCH 58/83] updated file to include new vignettes section --- _bookdown.yml | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/_bookdown.yml b/_bookdown.yml index 08e5219..da3d252 100644 --- a/_bookdown.yml +++ b/_bookdown.yml @@ -3,6 +3,7 @@ output_dir: "docs" language: ui: chapter_name: "Chapter " -rmd_files: ["index.Rmd", "traits/00-BETYdb-getting-started.Rmd", "traits/01-web-access.Rmd", -"traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd", "traits/04-danforth-indoor-phenotyping-facility.Rmd", +rmd_files: ["index.Rmd", "vignettes/00-vignettes-introduction.Rmd", "vignettes/01-traits-vignette.Rmd", "vignettes/02-weather-vignette.Rmd", +"vignettes/02-weather-vignette.Rmd", "vignettes/03-images-vignette.Rmd", "vignettes/04-synthesis-vignette.Rmd", "traits/00-BETYdb-getting-started.Rmd", +"traits/01-web-access.Rmd", "traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd", "traits/04-danforth-indoor-phenotyping-facility.Rmd", "traits/05-maricopa-field-scanner.Rmd", "traits/06-agronomic-metadata.Rmd", "traits/07-betydb-sql-access.Rmd", "traits/10-simulated-sorghum.Rmd"] From 6ffe38bf334ab06622e1f293294ad3b76cfc29bd Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Wed, 23 Jan 2019 14:00:39 -0800 Subject: [PATCH 59/83] created vignette folder and files --- vignettes/00-vignettes-introduction.Rmd | 3 +++ vignettes/01-traits-vignette.Rmd | 1 + vignettes/02-weather-vignette.Rmd | 1 + vignettes/03-images-vignette.Rmd | 1 + vignettes/04-synthesis-vignette.Rmd | 1 + 5 files changed, 7 insertions(+) create mode 100644 vignettes/00-vignettes-introduction.Rmd create mode 100644 
vignettes/01-traits-vignette.Rmd create mode 100644 vignettes/02-weather-vignette.Rmd create mode 100644 vignettes/03-images-vignette.Rmd create mode 100644 vignettes/04-synthesis-vignette.Rmd diff --git a/vignettes/00-vignettes-introduction.Rmd b/vignettes/00-vignettes-introduction.Rmd new file mode 100644 index 0000000..2dd17c6 --- /dev/null +++ b/vignettes/00-vignettes-introduction.Rmd @@ -0,0 +1,3 @@ +# (PART\*) Secton 1: Vignettes {-} + +# Vignettes Introduction \ No newline at end of file diff --git a/vignettes/01-traits-vignette.Rmd b/vignettes/01-traits-vignette.Rmd new file mode 100644 index 0000000..9bdc468 --- /dev/null +++ b/vignettes/01-traits-vignette.Rmd @@ -0,0 +1 @@ +# Traits Vignette \ No newline at end of file diff --git a/vignettes/02-weather-vignette.Rmd b/vignettes/02-weather-vignette.Rmd new file mode 100644 index 0000000..157521e --- /dev/null +++ b/vignettes/02-weather-vignette.Rmd @@ -0,0 +1 @@ +# Weather Vignette \ No newline at end of file diff --git a/vignettes/03-images-vignette.Rmd b/vignettes/03-images-vignette.Rmd new file mode 100644 index 0000000..bc5fed0 --- /dev/null +++ b/vignettes/03-images-vignette.Rmd @@ -0,0 +1 @@ +# Images Vignette \ No newline at end of file diff --git a/vignettes/04-synthesis-vignette.Rmd b/vignettes/04-synthesis-vignette.Rmd new file mode 100644 index 0000000..8e8313f --- /dev/null +++ b/vignettes/04-synthesis-vignette.Rmd @@ -0,0 +1 @@ +# Synthesis Vignette \ No newline at end of file From 299feb03b79a34391b3cb79a3adfba7d22076498 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Wed, 23 Jan 2019 14:01:29 -0800 Subject: [PATCH 60/83] changed traits section number to 2 --- traits/00-BETYdb-getting-started.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/traits/00-BETYdb-getting-started.Rmd b/traits/00-BETYdb-getting-started.Rmd index 5457247..e4dc960 100644 --- a/traits/00-BETYdb-getting-started.Rmd +++ b/traits/00-BETYdb-getting-started.Rmd @@ -1,4 +1,4 @@ -# (PART\*) Secton 1: 
Traits {-} +# (PART\*) Secton 2: Traits {-} # Getting Started with BETYdb From 6c4adef3aaaeb36182f99f97fe87a586a6ea72b2 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Wed, 23 Jan 2019 14:01:59 -0800 Subject: [PATCH 61/83] minor edits to chunk options and variable names --- traits/03-access-r-traits.Rmd | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index 4b6e579..2b1b428 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -20,7 +20,7 @@ if(packageVersion("traits") == '0.2.0'){ Load other packages that we will need to get started. -```{r 00-setup, message = FALSE, echo = TRUE} +```{r 00-setup, message = FALSE, echo = TRUE, warning = FALSE} library(traits) library(ggplot2) library(ggthemes) @@ -83,14 +83,14 @@ sorghum_info <- betydb_query(table = 'species', Now let's query some trait data. ```{r canopy_height, echo = TRUE, message = FALSE} -sorghum_height <- betydb_query(table = 'search', +canopy_height <- betydb_query(table = 'search', trait = "canopy_height", sitename = "~Season 2", limit = 'none') ``` ```{r plot_height} -ggplot(data = sorghum_height, +ggplot(data = canopy_height, aes(x = lubridate::yday(lubridate::ymd_hms(raw_date)), y = mean)) + geom_point(size = 0.5, position = position_jitter(width = 0.1)) + # scale_x_datetime(date_breaks = '6 months') + From 09f8561903c45b46beee0bba3794d77f9e365c8e Mon Sep 17 00:00:00 2001 From: David LeBauer Date: Wed, 23 Jan 2019 16:37:11 -0700 Subject: [PATCH 62/83] updated introduction --- index.Rmd | 68 +++++++++++++++++++++++++++---------------------------- 1 file changed, 33 insertions(+), 35 deletions(-) diff --git a/index.Rmd b/index.Rmd index d228b44..041bed2 100644 --- a/index.Rmd +++ b/index.Rmd @@ -11,16 +11,37 @@ output: # Overview -This book is intended to introduce users to TERRA REF data as quickly as possible. 
+This book is intended to quickly introduce users to TERRA REF data through a series of tutorials. TERRA REF has many types of data, and most can be accessed in multiple ways. Although this makes it more complicated to learn (and teach!), the objective is to provide users with the flexibility to access data in the most useful way. -It introduces to the wide range of phenomics datasets generated by the TERRA Reference program. Not only does TERRA REF have a large number of data sets, but many of the databases can be accessed in a number of different ways. While this makes it more complicated to learn, the goal is to provide users with the flexibility to access data in the most useful way. -## User Accounts and permission to access TERRA REF data +## Contents + +The first section walks the user through the steps of downloading and combining three different types of data: plot level phenotypes, meteorological data, and images. Subesquent sections provide more detailed examples that show how to access a larger variety of data and meta-data. + +## Pre-requisites + +While we assume that readers will have some familiarity with the nature of the problem - remote sensing of crop plants - for the most part, these tutorials assume that the user will bring their own scientific questions and a sense of curiosity and are eager to learn. + +These tutorials are aimed at users who are familiar with or willing to learn programming languages including R (particularly for accessing plot level trait data) and Python (primarily for accessing environmental data and sensor data). In addition, there are examples of using SQL for more sophisticated database queries as well as the bash terminal. + +Some of the lessons only require a web browser; others will assume familarity with programming at the command line in (typically only one of) Python, R, and / or SQL. You should be willing to find help (see finding help, below). 
-TODO: link to relevant parts of docs.terraref.org +## Technical Requirements -* Info on how to [request access to data](https://docs.terraref.org/user-manual/how-to-access-data/using-betydb-trait-data-experimental-metadata) +At a minimum, you should have: +* An internet connection +* Web Browser +* Access to the data that you are using + + The tutorials will state which databases you will need access to +* Software: + + Software requirements vary with the tutorials, and may be complex + +## User Accounts and permission to access TERRA REF data + +We have tried to write these tutorials using open access sample data sets. However, access to much of the data will require you to 1) fill out the TERRA REF Beta user questionaire ([terraref.org/beta](terraref.org/beta)) and 2) request access to specific databases. + + + ## Other Resources The TERRA REF website: [terraref.org](http://terraref.org/) The TERRA REF Technical Documentation: [docs.terraref.org](docs.terraref.org) -## Contents - -Scope ... - -Audience ... - - -## Pre-requisites - -While we assume that readers will have some familiarity with the nature of the problem - remote sensing of crop plants - for the most part, these tutorials assume that the user will bring their own scientific questions and a sense of curiosity and are eager to learn. - -Some of the lessons only require a web browser; others will assume familarity with programming at the command line in (typically only one of) Python, R, and / or SQL. You should be willing to find help (see finding help, below). 
- -## Technical Requirements - -At a minimum, you should have: - -* An internet connection -* Web Browser -* A TERRA REF Beta User account - + If you have not done so, please sign up at [terraref.org/beta](terraref.org/beta) -* Access to the data that you are using - + The tutorials will state which databases you will need access to -* Software: - + Software requirements vary with the tutorials, and may be complex - - ## Finding help -- [Slack](terra-ref.slack.com) -- [GitHub](https://github.com/terraref/tutorials) -- [Google](https://www.google.com/) +- Slack at terra-ref.slack.com ([signup](https://terraref-slack-invite.herokuapp.com/)) +- Browse issues and repositories in GitHub: + - search the organization at github.com/terraref + - questions about the tutorials in the [tutorials repository](https://github.com/terraref/tutorials/issues) + - about the data in the [reference-data repository](https://github.com/terraref/reference-data/issues) ```{r, include = FALSE} knitr::opts_chunk$set(echo = FALSE) From 8486827d48417ca54e68471717fbd1bb59819245 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Thu, 24 Jan 2019 14:10:30 -0800 Subject: [PATCH 63/83] changed names of Rmd files - deleted old ones --- vignettes/00-vignettes-introduction.Rmd | 3 --- vignettes/01-traits-vignette.Rmd | 1 - vignettes/02-weather-vignette.Rmd | 1 - vignettes/03-images-vignette.Rmd | 1 - vignettes/04-synthesis-vignette.Rmd | 1 - 5 files changed, 7 deletions(-) delete mode 100644 vignettes/00-vignettes-introduction.Rmd delete mode 100644 vignettes/01-traits-vignette.Rmd delete mode 100644 vignettes/02-weather-vignette.Rmd delete mode 100644 vignettes/03-images-vignette.Rmd delete mode 100644 vignettes/04-synthesis-vignette.Rmd diff --git a/vignettes/00-vignettes-introduction.Rmd b/vignettes/00-vignettes-introduction.Rmd deleted file mode 100644 index 2dd17c6..0000000 --- a/vignettes/00-vignettes-introduction.Rmd +++ /dev/null @@ -1,3 +0,0 @@ -# (PART\*) Secton 1: Vignettes {-} - -# 
Vignettes Introduction \ No newline at end of file diff --git a/vignettes/01-traits-vignette.Rmd b/vignettes/01-traits-vignette.Rmd deleted file mode 100644 index 9bdc468..0000000 --- a/vignettes/01-traits-vignette.Rmd +++ /dev/null @@ -1 +0,0 @@ -# Traits Vignette \ No newline at end of file diff --git a/vignettes/02-weather-vignette.Rmd b/vignettes/02-weather-vignette.Rmd deleted file mode 100644 index 157521e..0000000 --- a/vignettes/02-weather-vignette.Rmd +++ /dev/null @@ -1 +0,0 @@ -# Weather Vignette \ No newline at end of file diff --git a/vignettes/03-images-vignette.Rmd b/vignettes/03-images-vignette.Rmd deleted file mode 100644 index bc5fed0..0000000 --- a/vignettes/03-images-vignette.Rmd +++ /dev/null @@ -1 +0,0 @@ -# Images Vignette \ No newline at end of file diff --git a/vignettes/04-synthesis-vignette.Rmd b/vignettes/04-synthesis-vignette.Rmd deleted file mode 100644 index 8e8313f..0000000 --- a/vignettes/04-synthesis-vignette.Rmd +++ /dev/null @@ -1 +0,0 @@ -# Synthesis Vignette \ No newline at end of file From 5a5b0975978535bc9f8640d85f89fb008e70e4f7 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Thu, 24 Jan 2019 14:13:13 -0800 Subject: [PATCH 64/83] created new Rmd files (changed file names) --- vignettes/00-introduction.Rmd | 3 +++ vignettes/01-get-trait-data-R.Rmd | 1 + vignettes/02-get-weather-data-R.Rmd | 1 + vignettes/03-get-images-python.Rmd | 1 + vignettes/04-synthesis-data.Rmd | 1 + 5 files changed, 7 insertions(+) create mode 100644 vignettes/00-introduction.Rmd create mode 100644 vignettes/01-get-trait-data-R.Rmd create mode 100644 vignettes/02-get-weather-data-R.Rmd create mode 100644 vignettes/03-get-images-python.Rmd create mode 100644 vignettes/04-synthesis-data.Rmd diff --git a/vignettes/00-introduction.Rmd b/vignettes/00-introduction.Rmd new file mode 100644 index 0000000..2dd17c6 --- /dev/null +++ b/vignettes/00-introduction.Rmd @@ -0,0 +1,3 @@ +# (PART\*) Secton 1: Vignettes {-} + +# Vignettes Introduction \ No newline at 
end of file diff --git a/vignettes/01-get-trait-data-R.Rmd b/vignettes/01-get-trait-data-R.Rmd new file mode 100644 index 0000000..9bdc468 --- /dev/null +++ b/vignettes/01-get-trait-data-R.Rmd @@ -0,0 +1 @@ +# Traits Vignette \ No newline at end of file diff --git a/vignettes/02-get-weather-data-R.Rmd b/vignettes/02-get-weather-data-R.Rmd new file mode 100644 index 0000000..157521e --- /dev/null +++ b/vignettes/02-get-weather-data-R.Rmd @@ -0,0 +1 @@ +# Weather Vignette \ No newline at end of file diff --git a/vignettes/03-get-images-python.Rmd b/vignettes/03-get-images-python.Rmd new file mode 100644 index 0000000..bc5fed0 --- /dev/null +++ b/vignettes/03-get-images-python.Rmd @@ -0,0 +1 @@ +# Images Vignette \ No newline at end of file diff --git a/vignettes/04-synthesis-data.Rmd b/vignettes/04-synthesis-data.Rmd new file mode 100644 index 0000000..8e8313f --- /dev/null +++ b/vignettes/04-synthesis-data.Rmd @@ -0,0 +1 @@ +# Synthesis Vignette \ No newline at end of file From ad51a8f0fbc7d67a0f91261649a95ce5c79f5f39 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Thu, 24 Jan 2019 14:34:21 -0800 Subject: [PATCH 65/83] removed tutorials that have not yet been revised and updated and added sensor tutorials on images and weather --- _bookdown.yml | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/_bookdown.yml b/_bookdown.yml index da3d252..bc18e03 100644 --- a/_bookdown.yml +++ b/_bookdown.yml @@ -4,6 +4,5 @@ language: ui: chapter_name: "Chapter " rmd_files: ["index.Rmd", "vignettes/00-vignettes-introduction.Rmd", "vignettes/01-traits-vignette.Rmd", "vignettes/02-weather-vignette.Rmd", -"vignettes/02-weather-vignette.Rmd", "vignettes/03-images-vignette.Rmd", "vignettes/04-synthesis-vignette.Rmd", "traits/00-BETYdb-getting-started.Rmd", -"traits/01-web-access.Rmd", "traits/02-betydb-api-access.Rmd", "traits/03-access-r-traits.Rmd", "traits/04-danforth-indoor-phenotyping-facility.Rmd", -"traits/05-maricopa-field-scanner.Rmd", 
"traits/06-agronomic-metadata.Rmd", "traits/07-betydb-sql-access.Rmd", "traits/10-simulated-sorghum.Rmd"] +"vignettes/02-weather-vignette.Rmd", "vignettes/03-images-vignette.Rmd", "vignettes/04-synthesis-vignette.Rmd", "traits/03-access-r-traits.Rmd", +"sensors/01-meteorological-data.Rmd", "sensors/06-list-datasets-by-plot.Rmd"] From 413fff1c81c31f7dc1193efad23a86ae03aca7f5 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Thu, 24 Jan 2019 15:00:09 -0800 Subject: [PATCH 66/83] removed all references to the public key --- traits/02-betydb-api-access.Rmd | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/traits/02-betydb-api-access.Rmd b/traits/02-betydb-api-access.Rmd index 19fd2b6..1405f39 100644 --- a/traits/02-betydb-api-access.Rmd +++ b/traits/02-betydb-api-access.Rmd @@ -26,10 +26,9 @@ The first step toward reproducible pipelines is to automate the process of searc ### Using Your API key to Connect An API key is like a password. It allows you to access data, and should be kept private. -Therefore, we are not going to put it in code that we share. The one exception is the key 9999999999999999999999999999999999999999 that will allow you to access metadata tables (all tables except _traits_ and _yields_). It will also allow you to access all of the simulated data in the https://terraref.ncsa.illinois.edu/bety-test database. +Therefore, we are not going to put it in code that we share. -A common way of handling private API keys is to place it in a text file in your current directory. -Don't put it in a project directory where it might be inadvertently shared. +A common way of handling private API keys is to place it in a text file in your current directory. Don't put it in a project directory where it might be inadvertently shared. 
Here is how to find and save your API key: @@ -37,7 +36,7 @@ Here is how to find and save your API key: * copy the api key that was sent when you registered into the file * file --> save as '.betykey' -For the public key, you can call this file `.betykey_public`. +An API key is not needed to access public data. This includes metadata tables and simulated data in the https://terraref.ncsa.illinois.edu/bety-test database. ## Accessing data using a URL query @@ -49,7 +48,7 @@ For the public key, you can call this file `.betykey_public`. * path to the api: `/api/v1` * api endpoint: `/search` or `traits` or `sites`. For BETYdb, these are the names of database tables. * Query parameters: `genus=Sorghum` -* Authentication: `key=9999999999999999999999999999999999999999` is the public key for the TERRA REF traits database. +* Authentication: `key=api_key` is your assigned API key. This will only be needed when querying trait data. No key is needed to access the public metadata tables. ### Constructing a URL query @@ -62,17 +61,16 @@ First, lets construct a query by putting together a URL. 3. Add the name of the table you want to query. Lets start with `variables` * terraref.ncsa.illinois.edu/bety/api/v1/variables 4. add query terms by appending a `?` and combining with `&`, for example: - * `key=9999999999999999999999999999999999999999` * `type=trait` where the variable type is 'trait' * `name=~height` where the variable name contains 'height' 5. This is your complete query: - * `terraref.ncsa.illinois.edu/bety/api/v1/variables?type=trait&name=~height&key=9999999999999999999999999999999999999999` + * `terraref.ncsa.illinois.edu/bety/api/v1/variables?type=trait&name=~height` * it will query all variables that are type trait and have 'height' in the name * Does it return the expected values? ## Your Turn -> What will the URL https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=9999999999999999999999999999999999999999 return? 
+> What will the URL https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum return? > Write a URL that will query the database for sites with "Field Scanner" in the name field. Hint: combine two terms with a `+` as in `Field+Scanner` @@ -84,14 +82,14 @@ Type the following command into a bash shell (the `-o` option names the output f ```sh curl -o sorghum.json \ - "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=9999999999999999999999999999999999999999" + "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum" ``` If you want to write the query without exposing the key in plain text, you can construct it like this: ```sh curl -o sorghum.json \ - "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum&key=`cat .betykey_public`" + "https://terraref.ncsa.illinois.edu/bety/api/v1/species?genus=Sorghum" ``` ## Using the R jsonlite package to access the API with a URL query From 1ff1006f075cd7ec4ad15551ef514de8ec981816 Mon Sep 17 00:00:00 2001 From: David LeBauer Date: Thu, 24 Jan 2019 15:12:21 -0800 Subject: [PATCH 67/83] Update traits/05-maricopa-field-scanner.Rmd Co-Authored-By: kimberlyh66 <44081116+kimberlyh66@users.noreply.github.com> --- traits/05-maricopa-field-scanner.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/traits/05-maricopa-field-scanner.Rmd b/traits/05-maricopa-field-scanner.Rmd index e5a3be6..7f2b373 100644 --- a/traits/05-maricopa-field-scanner.Rmd +++ b/traits/05-maricopa-field-scanner.Rmd @@ -29,7 +29,7 @@ sites <- betydb_query( city = "Maricopa", sitename = "~Season 2 range", limit = "none") ``` -A more robust (but complicated way) would be to query the experiments and experiments_sites tables. But we will leave that for later. +A more robust (but complicated way) would be to query the experiments and experiments_sites tables. But we will leave that as an exercise for the ambitious user. 
### Plot Season 2 plots From 5dcecd5f763da35ea896b7fcbfe5bec74f90c6a3 Mon Sep 17 00:00:00 2001 From: David LeBauer Date: Thu, 24 Jan 2019 15:14:11 -0800 Subject: [PATCH 68/83] Update traits/05-maricopa-field-scanner.Rmd Co-Authored-By: kimberlyh66 <44081116+kimberlyh66@users.noreply.github.com> --- traits/05-maricopa-field-scanner.Rmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/traits/05-maricopa-field-scanner.Rmd b/traits/05-maricopa-field-scanner.Rmd index 7f2b373..2c4852a 100644 --- a/traits/05-maricopa-field-scanner.Rmd +++ b/traits/05-maricopa-field-scanner.Rmd @@ -78,7 +78,7 @@ leaflet() %>% ```{r} ## Time series of canopy cover, height, NDVI -#First look up variables by name. Let's look for measurements related to canopy size: +First look up variables by name. Let's look for measurements related to canopy size: ``` From 3b7020dcc8cdce28d2d8410363a7e1776d7c2578 Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Thu, 24 Jan 2019 15:18:12 -0800 Subject: [PATCH 69/83] removed references to public key and made minor spacing edits --- traits/03-access-r-traits.Rmd | 23 +++++++++++------------ 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/traits/03-access-r-traits.Rmd b/traits/03-access-r-traits.Rmd index 2b1b428..b19c0f0 100644 --- a/traits/03-access-r-traits.Rmd +++ b/traits/03-access-r-traits.Rmd @@ -28,34 +28,34 @@ theme_set(theme_bw()) library(dplyr) ``` -Create a file that contains your API key. If you have signed up for access to the TERRA REF database, your API key will have been sent to you in an email. The public key will provide access to all metadata; you will need a personal key _and_ permissions to access the trait data. If you receive empty (NULL) datasets, it is likely that you do not have permissions. +Create a file that contains your API key. If you have signed up for access to the TERRA REF database, your API key will have been sent to you in an email. 
You will need this personal key _and_ permissions to access the trait data. If you receive empty (NULL) datasets, it is likely that you do not have permissions. ```{r writing-key, echo = TRUE} # This should be done once with the key sent to you in your email -# writeLines('abcdefg_rest_of_key_sent_in_email', + +# Example: +#writeLines('abcdefg_rest_of_key_sent_in_email', # con = '.betykey') -# Example with the public key: -writeLines('9999999999999999999999999999999999999999', - con = '.betykey_public') ``` + #### R - using the traits package The R traits package is an API 'client'. It does two important things: 1. It makes it easier to specify the query parameters without having to construct a URL 2. It returns the results as a data frame, which is easier to use within R -Lets start with the query of information about Sorghum from species table from above +Lets start with the query of information about Sorghum from the species table ```{r query-species, results = 'hide', echo = TRUE} sorghum_info <- betydb_query(table = 'species', - genus = "Sorghum", - api_version = 'v1', - limit = 'none', - betyurl = "https://terraref.ncsa.illinois.edu/bety/", - key = readLines('.betykey', warn = FALSE)) + genus = "Sorghum", + api_version = 'v1', + limit = 'none', + betyurl = "https://terraref.ncsa.illinois.edu/bety/", + key = readLines('.betykey', warn = FALSE)) ``` @@ -64,7 +64,6 @@ sorghum_info <- betydb_query(table = 'species', Notice all of the arguments that the `betydb_query` function requires? 
We can change this by setting the default connection options thus: - ```{r 03-set-up, echo = TRUE} options(betydb_key = readLines('.betykey', warn = FALSE), betydb_url = "https://terraref.ncsa.illinois.edu/bety/", From 658cac26d0e24201443e42621fbe6325d1e47d4c Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Thu, 24 Jan 2019 15:27:38 -0800 Subject: [PATCH 70/83] deleted file --- vignettes/02-get-weather-data-R.Rmd | 1 - 1 file changed, 1 deletion(-) delete mode 100644 vignettes/02-get-weather-data-R.Rmd diff --git a/vignettes/02-get-weather-data-R.Rmd b/vignettes/02-get-weather-data-R.Rmd deleted file mode 100644 index 157521e..0000000 --- a/vignettes/02-get-weather-data-R.Rmd +++ /dev/null @@ -1 +0,0 @@ -# Weather Vignette \ No newline at end of file From 0d16dc8778a3e14db98a0ff318ff2455a67b267b Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Fri, 25 Jan 2019 09:57:41 -0800 Subject: [PATCH 71/83] removed "installing database locally" section --- traits/07-betydb-sql-access.Rmd | 11 ----------- 1 file changed, 11 deletions(-) diff --git a/traits/07-betydb-sql-access.Rmd b/traits/07-betydb-sql-access.Rmd index 73eb97d..d52447a 100644 --- a/traits/07-betydb-sql-access.Rmd +++ b/traits/07-betydb-sql-access.Rmd @@ -29,14 +29,3 @@ User: viewer Password: DelchevskoOro DB: bety ``` - -## Installing the database locally - - -You can run the entire database locally, with daily imports: - -```sh -docker run --name betydb -p 5432:5432 terraref/bety-postgis -``` - -Now it will appear that you have the entire trait database running at localhost on port 5432 just like if it were installed on your system! 
\ No newline at end of file From 467d2743b15a8ab73b5c422a8f1ef6cade45fc9d Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Fri, 25 Jan 2019 10:19:16 -0800 Subject: [PATCH 72/83] changed title and added my draft of traits vignette --- vignettes/01-get-trait-data-R.Rmd | 172 +++++++++++++++++++++++++++++- 1 file changed, 171 insertions(+), 1 deletion(-) diff --git a/vignettes/01-get-trait-data-R.Rmd b/vignettes/01-get-trait-data-R.Rmd index 9bdc468..3e3b249 100644 --- a/vignettes/01-get-trait-data-R.Rmd +++ b/vignettes/01-get-trait-data-R.Rmd @@ -1 +1,171 @@ -# Traits Vignette \ No newline at end of file +# Accessing trait data in R + +```{r chunk-options-setup, echo = FALSE} + +options(width = 100) + +``` + +# Introduction + +The objective of this vignette is to demonstrate to users how to query TERRA REF trait data using the traits package. The traits package allows users to easily pass query parameters into a R function, and returns the data in a tabular format that can be analyzed. + +Through this vignette, users will learn how to query and visualize season 6 canopy height data for May 2018. In addition, users will also be shown how to find more information on a season, such as available traits and dates, when performing their own queries. + +\newline +\newline + +# Getting Started + +First, you will need to install and load the traits package from github. + +```{r traits-setup, message = FALSE, results = FALSE} + +devtools::install_github('terraref/traits', force = TRUE) +library(traits) + + +``` + +\newline +\newline + +# How to query trait data + +## Setting options + +The function that you will be using to perform your queries is `betydb_query`. Options can be set to reduce the number of arguments that need to be passed into the function. + +Note: the `betydb_key` option only needs to be set when accessing non-public data. We will be using public data, so this option does not need to be set. 
However, when needed, pass in the API key that you were assigned when you first registered for access to the TERRA REF database. The key should be kept private and saved to a file named `.betykey` in your current directory. If you are having trouble locating your API key, you can go to [https://terraref.ncsa.illinois.edu/bety/users](https://terraref.ncsa.illinois.edu/bety/users). + + +```{r options-setup} + +options(betydb_key = readLines('.betykey', warn = FALSE), #need to comment this out later + betydb_url = "https://terraref.ncsa.illinois.edu/bety/", + betydb_api_version = 'v1') + +``` + +## An example: Season 6 canopy height data + +The following is an example of how to query season 6 canopy height data for May 2018. + +```{r canopy_height_query, message = FALSE} + +canopy_height <- betydb_query(table = "search", + trait = "canopy_height", + sitename = "~Season 6", + date = "~2018 May", + limit = "none") + + +``` + +A breakdown of the above query: + +* `table = "search"` + + Specify a table to query with the `table` parameter. Trait data may be queried using the `search` table. + +* `trait = "canopy_height"` + + Specify the trait of interest with the `trait` parameter. + + Trait names must be expressed exactly as they are in the TERRA REF database. So passing in `Canopy height` instead of `canopy_height` would give NULL results. + + More information on how to determine available traits for a season can be found below under `How to query other seasons, traits, and dates`. + +* `sitename = "~Season 6"` + + Indicate the sites that you would like to query using the `sitename` parameter. + + A tilde `~` is used in this query to get all sitenames that contain `Season 6` + +* `date = "~2018 May"` + + Indicate the date of data collection using the `date` parameter.
+ + A tilde `~` is used in this query to get all records that have a collection date that contains `2018 May` + +* `limit = "none"` + + Indicate the maximum number of records you would like returned with the `limit` parameter. We want all records for this query, so we set limit to `none`. + +## Time series of canopy height + +Here is an example of how to visualize the data that we just queried. + +```{r canopy_height_plot, warning = FALSE, message = FALSE, results = FALSE} + +#load in necessary packages +library(ggplot2) +library(lubridate) + +#plot a time series of canopy height +ggplot(data = canopy_height, + aes(x = lubridate::yday(lubridate::ymd_hms(raw_date)), y = mean)) + + geom_point(size = 0.5, position = position_jitter(width = 0.1)) + + xlab("Day of Year") + ylab("Plant Height") + + guides(color = guide_legend(title = 'Genotype')) + + theme_bw() + +``` + +\newline +\newline + +# May 2018 Season 6 Summary + +The TERRA REF database contains other trait data for May 2018 of season 6. Each trait was measured using a specific method. Here is a summary of available traits and their corresponding methods of measurement. + +```{r season_6_query, message = FALSE, results = FALSE, echo = FALSE} + +#load in dplyr package +library(dplyr) + +#get all season 6 data for May 2018 +season_6 <- betydb_query(table = "search", + sitename = "~Season 6", + date = "~2018 May", + limit = "none") +#get summary +season_6_summary <- season_6 %>% group_by(trait, method_name) %>% summarise(number_of_observations = n()) + +``` + +```{r season_6_summary, echo = FALSE, comment = ""} + +print.data.frame(season_6_summary) + +``` + +\newline +\newline + +# How to query other seasons, traits, and dates + +You can query other seasons, traits, and dates by changing the season number, trait name, and date in the example query.
If you are unsure of what traits or dates are available for a season, you can use the following R code to get a subset of a season and figure out what specific dates and traits are available. + +To broaden your queries, remove specific parameters. For example, in order to get all of season 2's data for October 2016, remove the `trait` parameter. + +```{r season_2_query, results = FALSE, message = FALSE} + +#get all of season 2 data for October 2016 +season_2_sub <- betydb_query(table = "search", + sitename = "~Season 2", + date = "~2016 Oct", + limit = "none") + +``` + +```{r season_2_traits, comment = ""} + +#get traits available for the subset of season 2 data +traits <- unique(season_2_sub$trait) + +print(traits) + +``` + +```{r season_2_dates, comment = ""} + +#filter for NDVI trait records +ndvi <- dplyr::filter(season_2_sub, trait == 'NDVI') + +#get unique dates for NDVI records +ndvi_dates <- unique(ndvi$date) + +print(ndvi_dates) +``` From 6d839f3a5ab91f68d8fc61bce7f7201351c00dfc Mon Sep 17 00:00:00 2001 From: Chris Schnaufer Date: Fri, 25 Jan 2019 15:10:45 -0700 Subject: [PATCH 73/83] Added in vignette contents --- vignettes/03-get-images-python.Rmd | 101 ++++++++++++++++++++++++++++- 1 file changed, 100 insertions(+), 1 deletion(-) diff --git a/vignettes/03-get-images-python.Rmd b/vignettes/03-get-images-python.Rmd index bc5fed0..c326c53 100644 --- a/vignettes/03-get-images-python.Rmd +++ b/vignettes/03-get-images-python.Rmd @@ -1 +1,100 @@ -# Images Vignette \ No newline at end of file +--- +title: "Get Source Image Files" +output: html_document +--- + +# Objective: To be able to demonstrate how to locate and retrieve RGB image files + +This vignette shows how to locate and retrieve image files associated with growing Season 6 +from the University of Arizona's [Maricopa Agricultural Center](http://cals-mac.arizona.edu/) +using Python. The files are stored online on the data management system Clowder, +which is accessed using an API. 
We will be working with the image files generated during the +month of May by limiting the requests to that time period. + +After completing this vignette it should be possible to search for and retrieve other +files through the use of the API. + +As an added bonus we've also included an exmple of how to retrieve the list of available +sensor names through the API. By using the sensor names returned, it's possible to retrieve +other files containing the data the sensors have collected. + +## Locating the images + +To begin looking for files, a sensor name and site name are needed. We will be using +'RGB GeoTIFFs Datasets' as the sensor name and '' as the site name. Later in this +vignette we show how to retrieve the list of available sensors. + +As mentioned in the overview, the url string will point to the API to use. In this case +we'll be using "https://terraref.ncsa.illinois.edu/clowder/api" and the key will be the +one you received in an email. + +```{python eval=FALSE} +from terrautils.products import get_file_listing + +url = 'https://terraref.ncsa.illinois.edu/clowder/api' +key = 'YOUR_KEY_GOES_HERE' +sensor = 'RGB GeoTIFFs Datasets' +sitename = '' +files = get_file_listing(None, url, key, sensor, sitename, + since='2018-05-01', until='2018-05-31') +``` + +The `files` variable now contains an array of all the file in the datasets that match the +sensor in the plot for the month of May. When performing you own queries it's possible that there +are no matches found and the `files` array would be empty. + +# Retrieving the images + +Now that we have a list of files we can retrieve them one-by-one. We do this by creating a URL +that identifies the file to retrieve, making the API call to retrieve the file contents, and writing +the contents to disk. + +To create the correct URL we start with the one defined before and attach the keyword '/files/' +followed by the ID of each file. 
Assuming we have a file ID of '111', the final URL for retrieving +the file would be: + +``` {sh eval=FALSE} +https://terraref.ncsa.illinois.edu/clowder/api/files/111 +``` + +By looping through each of the returned files from the previous example, and using their ID and +filename, we can retrieve the files from the server and store them locally. + +We are streaming the data returned from our server request (`stream=True` in the code below) due to +the high probability of large file sizes. If the `stream=True` parameter was omitted the file's entire +contents would be in the `r` variable which could then be written to the local file. + +```{python eval=FALSE} +# We are using the same `url` and `key` variables declared in the previous example above. +filesurl = url + '/files/' +params={ 'key': key } + +for f in files: + r = requests.get(fileurl + f.id, params=params, stream=True) + with open(f.filename, 'wb') as o: + for chunk in r.iter_content(chunk_size=1024): + if chunk: + o.write(chunk) + +``` + +The images are now stored on the local file system. + +# Retrieving sensor names + +In this section we retrieve the names of different sensor types that are available. This will +allow you to retrieve files other than those containing RBG image data. + +```{python eval=FALSE} +# We are using the same `url` and `key` variables declared in the previous example above. +from terrautils.products import get_sensor_list, unique_sensor_names + +sensors = get_sensor_list(None, url, key) +names = unique_sensor_names(sensors) +``` + +The variable `names` will now contain the list of all available sensors. Using these sensor +names it's possible to use the above search to locate and then retrieve additional data files. +Substitute the new sensor name for 'RGB GeoTIFFs Datasets' where the variable `sensor` is +assigned above. 
+ From 1087eeed29dbe3eb96e7f485d1dac3e0f46591b8 Mon Sep 17 00:00:00 2001 From: Chris Schnaufer Date: Fri, 25 Jan 2019 15:17:08 -0700 Subject: [PATCH 74/83] Reverting my changes --- vignettes/03-get-images-python.Rmd | 101 +---------------------------- 1 file changed, 1 insertion(+), 100 deletions(-) diff --git a/vignettes/03-get-images-python.Rmd b/vignettes/03-get-images-python.Rmd index c326c53..bc5fed0 100644 --- a/vignettes/03-get-images-python.Rmd +++ b/vignettes/03-get-images-python.Rmd @@ -1,100 +1 @@ ---- -title: "Get Source Image Files" -output: html_document ---- - -# Objective: To be able to demonstrate how to locate and retrieve RGB image files - -This vignette shows how to locate and retrieve image files associated with growing Season 6 -from the University of Arizona's [Maricopa Agricultural Center](http://cals-mac.arizona.edu/) -using Python. The files are stored online on the data management system Clowder, -which is accessed using an API. We will be working with the image files generated during the -month of May by limiting the requests to that time period. - -After completing this vignette it should be possible to search for and retrieve other -files through the use of the API. - -As an added bonus we've also included an exmple of how to retrieve the list of available -sensor names through the API. By using the sensor names returned, it's possible to retrieve -other files containing the data the sensors have collected. - -## Locating the images - -To begin looking for files, a sensor name and site name are needed. We will be using -'RGB GeoTIFFs Datasets' as the sensor name and '' as the site name. Later in this -vignette we show how to retrieve the list of available sensors. - -As mentioned in the overview, the url string will point to the API to use. In this case -we'll be using "https://terraref.ncsa.illinois.edu/clowder/api" and the key will be the -one you received in an email. 
- -```{python eval=FALSE} -from terrautils.products import get_file_listing - -url = 'https://terraref.ncsa.illinois.edu/clowder/api' -key = 'YOUR_KEY_GOES_HERE' -sensor = 'RGB GeoTIFFs Datasets' -sitename = '' -files = get_file_listing(None, url, key, sensor, sitename, - since='2018-05-01', until='2018-05-31') -``` - -The `files` variable now contains an array of all the file in the datasets that match the -sensor in the plot for the month of May. When performing you own queries it's possible that there -are no matches found and the `files` array would be empty. - -# Retrieving the images - -Now that we have a list of files we can retrieve them one-by-one. We do this by creating a URL -that identifies the file to retrieve, making the API call to retrieve the file contents, and writing -the contents to disk. - -To create the correct URL we start with the one defined before and attach the keyword '/files/' -followed by the ID of each file. Assuming we have a file ID of '111', the final URL for retrieving -the file would be: - -``` {sh eval=FALSE} -https://terraref.ncsa.illinois.edu/clowder/api/files/111 -``` - -By looping through each of the returned files from the previous example, and using their ID and -filename, we can retrieve the files from the server and store them locally. - -We are streaming the data returned from our server request (`stream=True` in the code below) due to -the high probability of large file sizes. If the `stream=True` parameter was omitted the file's entire -contents would be in the `r` variable which could then be written to the local file. - -```{python eval=FALSE} -# We are using the same `url` and `key` variables declared in the previous example above. 
-filesurl = url + '/files/' -params={ 'key': key } - -for f in files: - r = requests.get(fileurl + f.id, params=params, stream=True) - with open(f.filename, 'wb') as o: - for chunk in r.iter_content(chunk_size=1024): - if chunk: - o.write(chunk) - -``` - -The images are now stored on the local file system. - -# Retrieving sensor names - -In this section we retrieve the names of different sensor types that are available. This will -allow you to retrieve files other than those containing RBG image data. - -```{python eval=FALSE} -# We are using the same `url` and `key` variables declared in the previous example above. -from terrautils.products import get_sensor_list, unique_sensor_names - -sensors = get_sensor_list(None, url, key) -names = unique_sensor_names(sensors) -``` - -The variable `names` will now contain the list of all available sensors. Using these sensor -names it's possible to use the above search to locate and then retrieve additional data files. -Substitute the new sensor name for 'RGB GeoTIFFs Datasets' where the variable `sensor` is -assigned above. 
+# Images Vignette \ No newline at end of file From 1541181648db6108230671ca422109d0e579dfef Mon Sep 17 00:00:00 2001 From: Chris Schnaufer Date: Fri, 25 Jan 2019 15:19:59 -0700 Subject: [PATCH 75/83] Fleshed out images vignette --- vignettes/03-get-images-python.Rmd | 101 ++++++++++++++++++++++++++++- 1 file changed, 100 insertions(+), 1 deletion(-) diff --git a/vignettes/03-get-images-python.Rmd b/vignettes/03-get-images-python.Rmd index bc5fed0..c326c53 100644 --- a/vignettes/03-get-images-python.Rmd +++ b/vignettes/03-get-images-python.Rmd @@ -1 +1,100 @@ -# Images Vignette \ No newline at end of file +--- +title: "Get Source Image Files" +output: html_document +--- + +# Objective: To be able to demonstrate how to locate and retrieve RGB image files + +This vignette shows how to locate and retrieve image files associated with growing Season 6 +from the University of Arizona's [Maricopa Agricultural Center](http://cals-mac.arizona.edu/) +using Python. The files are stored online on the data management system Clowder, +which is accessed using an API. We will be working with the image files generated during the +month of May by limiting the requests to that time period. + +After completing this vignette it should be possible to search for and retrieve other +files through the use of the API. + +As an added bonus we've also included an example of how to retrieve the list of available +sensor names through the API. By using the sensor names returned, it's possible to retrieve +other files containing the data the sensors have collected. + +## Locating the images + +To begin looking for files, a sensor name and site name are needed. We will be using +'RGB GeoTIFFs Datasets' as the sensor name and '' as the site name. Later in this +vignette we show how to retrieve the list of available sensors. + +As mentioned in the overview, the url string will point to the API to use.
In this case +we'll be using "https://terraref.ncsa.illinois.edu/clowder/api" and the key will be the +one you received in an email. + +```{python eval=FALSE} +from terrautils.products import get_file_listing + +url = 'https://terraref.ncsa.illinois.edu/clowder/api' +key = 'YOUR_KEY_GOES_HERE' +sensor = 'RGB GeoTIFFs Datasets' +sitename = '' +files = get_file_listing(None, url, key, sensor, sitename, + since='2018-05-01', until='2018-05-31') +``` + +The `files` variable now contains an array of all the files in the datasets that match the +sensor in the plot for the month of May. When performing your own queries it's possible that there +are no matches found and the `files` array would be empty. + +# Retrieving the images + +Now that we have a list of files we can retrieve them one-by-one. We do this by creating a URL +that identifies the file to retrieve, making the API call to retrieve the file contents, and writing +the contents to disk. + +To create the correct URL we start with the one defined before and attach the keyword '/files/' +followed by the ID of each file. Assuming we have a file ID of '111', the final URL for retrieving +the file would be: + +``` {sh eval=FALSE} +https://terraref.ncsa.illinois.edu/clowder/api/files/111 +``` + +By looping through each of the returned files from the previous example, and using their ID and +filename, we can retrieve the files from the server and store them locally. + +We are streaming the data returned from our server request (`stream=True` in the code below) due to +the high probability of large file sizes. If the `stream=True` parameter was omitted the file's entire +contents would be in the `r` variable which could then be written to the local file. + +```{python eval=FALSE} +import requests  # reuses the `url` and `key` variables declared in the previous example above
+filesurl = url + '/files/' +params={ 'key': key } + +for f in files: + r = requests.get(filesurl + f.id, params=params, stream=True) + with open(f.filename, 'wb') as o: + for chunk in r.iter_content(chunk_size=1024): + if chunk: + o.write(chunk) + +``` + +The images are now stored on the local file system. + +# Retrieving sensor names + +In this section we retrieve the names of different sensor types that are available. This will +allow you to retrieve files other than those containing RGB image data. + +```{python eval=FALSE} +# We are using the same `url` and `key` variables declared in the previous example above. +from terrautils.products import get_sensor_list, unique_sensor_names + +sensors = get_sensor_list(None, url, key) +names = unique_sensor_names(sensors) +``` + +The variable `names` will now contain the list of all available sensors. Using these sensor +names it's possible to use the above search to locate and then retrieve additional data files. +Substitute the new sensor name for 'RGB GeoTIFFs Datasets' where the variable `sensor` is +assigned above.
+ From 5c58e4a280d582b2f2d3c65ec75e1c8b33b0c60d Mon Sep 17 00:00:00 2001 From: Kimberly Huynh Date: Mon, 28 Jan 2019 12:19:07 -0800 Subject: [PATCH 76/83] updated file names to match those in the vignettes folder --- _bookdown.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/_bookdown.yml b/_bookdown.yml index bc18e03..e9b7aa5 100644 --- a/_bookdown.yml +++ b/_bookdown.yml @@ -3,6 +3,6 @@ output_dir: "docs" language: ui: chapter_name: "Chapter " -rmd_files: ["index.Rmd", "vignettes/00-vignettes-introduction.Rmd", "vignettes/01-traits-vignette.Rmd", "vignettes/02-weather-vignette.Rmd", -"vignettes/02-weather-vignette.Rmd", "vignettes/03-images-vignette.Rmd", "vignettes/04-synthesis-vignette.Rmd", "traits/03-access-r-traits.Rmd", +rmd_files: ["index.Rmd", "vignettes/00-introduction.Rmd", "vignettes/01-get-trait-data-R.Rmd", "vignettes/02-get-weather-data-R.Rmd", +"vignettes/03-get-images-python.Rmd", "vignettes/04-synthesis-data.Rmd", "traits/03-access-r-traits.Rmd", "sensors/01-meteorological-data.Rmd", "sensors/06-list-datasets-by-plot.Rmd"] From ad5b12c840fbc1b23b769142a2491e6565d44745 Mon Sep 17 00:00:00 2001 From: David LeBauer Date: Mon, 28 Jan 2019 14:31:06 -0700 Subject: [PATCH 77/83] Update vignettes/03-get-images-python.Rmd Co-Authored-By: Chris-Schnaufer --- vignettes/03-get-images-python.Rmd | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/vignettes/03-get-images-python.Rmd b/vignettes/03-get-images-python.Rmd index c326c53..2657d19 100644 --- a/vignettes/03-get-images-python.Rmd +++ b/vignettes/03-get-images-python.Rmd @@ -18,6 +18,12 @@ As an added bonus we've also included an exmple of how to retrieve the list of a sensor names through the API. By using the sensor names returned, it's possible to retrieve other files containing the data the sensors have collected. 
+**requirements**
+* Python 3
+* the terrautils library
+  * this can be installed from pypi by running `pip install terrautils` in the terminal
+* an API key to access these data
+
 ## Locating the images

 To begin looking for files, a sensor name and site name are needed. We will be using

From ea0874f07a922bedc7e68939d1c9b35d5bcb1236 Mon Sep 17 00:00:00 2001
From: Chris Schnaufer
Date: Mon, 28 Jan 2019 14:50:29 -0700
Subject: [PATCH 78/83] Added information on getting API keys

---
 vignettes/03-get-images-python.Rmd | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/vignettes/03-get-images-python.Rmd b/vignettes/03-get-images-python.Rmd
index 2657d19..d3cb1c8 100644
--- a/vignettes/03-get-images-python.Rmd
+++ b/vignettes/03-get-images-python.Rmd
@@ -24,6 +24,16 @@ other files containing the data the sensors have collected.
   * this can be installed from pypi by running `pip install terrautils` in the terminal
 * an API key to access these data

+The API key is a string that gets generated upon request through your Clowder account. Existing
+API keys will work with this vignette. To get a new API key it is necessary to first register
+with Clowder at "https://terraref.ncsa.illinois.edu/clowder/". First click the `Login` button and
+wait for the login screen to appear. Then select the `Sign up` button and enter an email
+address you have access to. An email is sent to the entered address with instructions for
+completing the registration process. Once registration is complete, log
+into Clowder and select the `View profile` menu option from the drop-down that is near the search
+control. By clicking the `+ Add` button under "User API Keys" heading in the profile page, a new
+key is gnerated that can be used.
+
 ## Locating the images

 To begin looking for files, a sensor name and site name are needed. We will be using

From 589f4c7a224fe9cde5148f8567b03f8f4c555e27 Mon Sep 17 00:00:00 2001
From: David LeBauer
Date: Mon, 28 Jan 2019 14:54:19 -0700
Subject: [PATCH 79/83] * added ending to comment in index.Rmd * set python engine to python3

---
 index.Rmd | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/index.Rmd b/index.Rmd
index ea26514..bb8043b 100644
--- a/index.Rmd
+++ b/index.Rmd
@@ -42,6 +42,7 @@ At a minimum, you should have:
 We have tried to write these tutorials using open access sample data sets. However, access to much of the data will require you to 1) fill out the TERRA REF Beta user questionaire ([terraref.org/beta](terraref.org/beta)) and 2) request access to specific databases.
 ## Ways of Acessing Data
@@ -72,7 +73,10 @@ The TERRA REF Technical Documentation: [docs.terraref.org](docs.terraref.org)
 - about the data in the [reference-data repository](https://github.com/terraref/reference-data/issues)

 ```{r, include = FALSE}
-knitr::opts_chunk$set(echo = FALSE)
+knitr::opts_chunk$set(echo = FALSE,
+                      engine.path = list(
+                        python = 'python3'
+                      ))
 options(warn = -1)
 ```

From 413a538ab2f322c825eb6da7692183d2eb95a37d Mon Sep 17 00:00:00 2001
From: Chris Schnaufer
Date: Mon, 28 Jan 2019 14:55:05 -0700
Subject: [PATCH 80/83] Cleaned up an API key related spot

---
 vignettes/03-get-images-python.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vignettes/03-get-images-python.Rmd b/vignettes/03-get-images-python.Rmd
index d3cb1c8..01ea82c 100644
--- a/vignettes/03-get-images-python.Rmd
+++ b/vignettes/03-get-images-python.Rmd
@@ -42,7 +42,7 @@ vignette we show how to retrieve the list of available sensors.

 As mentioned in the overview, the url string will point to the API to use. In this case
 we'll be using "https://terraref.ncsa.illinois.edu/clowder/api" and the key will be the
-one you received in an email.
+one you created for your Clowder account.
 ```{python eval=FALSE}
 from terrautils.products import get_file_listing

From f4f25d3e7582025eefb401e1d6a9619383597c46 Mon Sep 17 00:00:00 2001
From: Chris Schnaufer
Date: Mon, 28 Jan 2019 14:58:28 -0700
Subject: [PATCH 81/83] Removed some extra words

---
 vignettes/03-get-images-python.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vignettes/03-get-images-python.Rmd b/vignettes/03-get-images-python.Rmd
index 01ea82c..2fac28e 100644
--- a/vignettes/03-get-images-python.Rmd
+++ b/vignettes/03-get-images-python.Rmd
@@ -32,7 +32,7 @@ address you have access to. An email is sent to the entered address with instruc
 completing the registration process. Once registration is complete, log
 into Clowder and select the `View profile` menu option from the drop-down that is near the search
 control. By clicking the `+ Add` button under "User API Keys" heading in the profile page, a new
-key is gnerated that can be used.
+key is generated.

 ## Locating the images

From b65ce865866fffcbfd7cfd0364877586b1e2c1c3 Mon Sep 17 00:00:00 2001
From: Kimberly Huynh
Date: Tue, 29 Jan 2019 15:22:11 -0800
Subject: [PATCH 82/83] added a clarifying comment on how to use ~ for partial string matching in a query

---
 traits/05-maricopa-field-scanner.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/traits/05-maricopa-field-scanner.Rmd b/traits/05-maricopa-field-scanner.Rmd
index 78574fe..37d6da9 100644
--- a/traits/05-maricopa-field-scanner.Rmd
+++ b/traits/05-maricopa-field-scanner.Rmd
@@ -88,7 +88,7 @@ leaflet() %>%

 ```{r traits-05-height-cover-ndvi}
 #variables <- betydb_query(
-# table = "variables", name = "~^(NDVI|canopy_height|canopy_cover|)$")
+# table = "variables", name = "~^(NDVI|canopy_height|canopy_cover|)$") # a tilde ~ can be used to partially match a string #the tilde is used in this query to get variable names that contain either 'NDVI', 'canopy_height', or 'canopy_cover'
 #variables %>%
 # select(id, name, units, n_records = `number of associated traits`)

From 4f2e89df6270465b2845b3e0687ec0eb443cc103 Mon Sep 17 00:00:00 2001
From: David LeBauer
Date: Tue, 29 Jan 2019 14:21:02 -0800
Subject: [PATCH 83/83] Update vignettes/01-get-trait-data-R.Rmd

Co-Authored-By: kimberlyh66 <44081116+kimberlyh66@users.noreply.github.com>
---
 vignettes/01-get-trait-data-R.Rmd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vignettes/01-get-trait-data-R.Rmd b/vignettes/01-get-trait-data-R.Rmd
index 3e3b249..aedd75e 100644
--- a/vignettes/01-get-trait-data-R.Rmd
+++ b/vignettes/01-get-trait-data-R.Rmd
@@ -41,7 +41,7 @@ Note: the `betydb_key` option only needs to be set when accessing non-public dat
 ```{r options-setup}
-options(betydb_key = readLines('.betykey', warn = FALSE), #need to comment this out later
+options(# betydb_key = 'Your API Key', # to access non-public data
         betydb_url = "https://terraref.ncsa.illinois.edu/bety/",
         betydb_api_version = 'v1')