update data import vignette

addressing additional items from our ropensci reviews
ropensci · Dec 30, 2020 · 591dd50
1 parent 5696033
commit 591dd50

Showing 12 changed files with 775 additions and 75 deletions.
diff --git a/R/utility_functions.R b/R/utility_functions.R
@@ -957,12 +957,15 @@ Please use relabel_viewr_axes() to rename variables as necessary.")
 #' @param ... Additional arguments passed to/from other pathviewR functions
 #'
 #' @details The center point of the tunnel is estimated as the point between the
-#'   two landmarks. The angle between landmark_one, tunnel_center_point, and
-#'   arbitrary point along the length axis (tunnel_center_point - 1 on length)
-#'   is estimated. That angle is then used to rotate the data, again only in the
-#'   length and width dimensions. Height is standardized by average landmark
-#'   height; values greater than 0 are above the landmarks and values less than
-#'   0 are below the landmark level.
+#'   two landmarks. It is therefore recommended that \code{landmark_one} and
+#'   \code{landmark_two} be objects that are placed on opposite ends of the
+#'   tunnel (e.g. in an avian flight tunnel, these landmarks may be perches that
+#'   are placed at the extreme ends). The angle between landmark_one,
+#'   tunnel_center_point, and an arbitrary point along the length axis
+#'   (tunnel_center_point - 1 on length) is estimated. That angle is then used
+#'   to rotate the data, again only in the length and width dimensions. Height
+#'   is standardized by average landmark height; values greater than 0 are above
+#'   the landmarks and values less than 0 are below the landmark level.
 #'
 #' @section Warning:
 #' The \code{position_length} values of landmark_one MUST be less than

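The geometry described in the revised `@details` text can be sketched in base R. This is only an illustration under stated assumptions, not the code inside `standardize_tunnel()`: the helper name `rotate_to_length_axis()` and the landmark coordinates are hypothetical, and the rotation angle is computed with `atan2()` on the line joining the landmarks rather than with the three-point angle construction the documentation describes.

```r
## Hypothetical helper (not part of pathviewR): center and rotate points in the
## length/width plane so the line joining two landmarks lies on the length axis.
## Landmarks are given as c(length, width, height).
rotate_to_length_axis <- function(pos_length, pos_width, pos_height,
                                  landmark_one, landmark_two) {
  ## The center point is the midpoint between the two landmarks
  center <- (landmark_one + landmark_two) / 2
  ## Angle of the landmark_one -> landmark_two line relative to the length axis
  delta <- landmark_two - landmark_one
  theta <- atan2(delta[2], delta[1])
  ## Translate so the center sits at 0, then rotate by -theta
  l <- pos_length - center[1]
  w <- pos_width - center[2]
  data.frame(
    position_length = l * cos(theta) + w * sin(theta),
    position_width = -l * sin(theta) + w * cos(theta),
    ## Height is only shifted by the average landmark height, never rotated
    position_height = pos_height - center[3]
  )
}

## Two made-up landmarks (e.g. perches) at opposite ends of a skewed tunnel
landmark_one <- c(0, 0, 1)
landmark_two <- c(2, 2, 1)
rotated <- rotate_to_length_axis(
  pos_length = c(0, 2),
  pos_width = c(0, 2),
  pos_height = c(1, 1.5),
  landmark_one = landmark_one,
  landmark_two = landmark_two
)
rotated
```

After the transformation both landmarks have a width of 0, `landmark_one` has the smaller length value, and heights are expressed relative to the landmark level.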
diff --git a/docs/articles/data-import-cleaning.html b/docs/articles/data-import-cleaning.html
diff --git a/docs/articles/managing-frame-gaps.html b/docs/articles/managing-frame-gaps.html
diff --git a/docs/articles/visual-perception-functions.html b/docs/articles/visual-perception-functions.html
diff --git a/docs/index.html b/docs/index.html
diff --git a/docs/pkgdown.yml b/docs/pkgdown.yml
@@ -5,7 +5,7 @@ articles:
   data-import-cleaning: data-import-cleaning.html
   managing-frame-gaps: managing-frame-gaps.html
   visual-perception-functions: visual-perception-functions.html
-last_built: 2020-12-29T06:44Z
+last_built: 2020-12-30T07:06Z
 urls:
   reference: https://vbaliga.github.io/pathviewR//reference
   article: https://vbaliga.github.io/pathviewR//articles

diff --git a/docs/reference/read_motive_csv.html b/docs/reference/read_motive_csv.html
diff --git a/docs/reference/standardize_tunnel.html b/docs/reference/standardize_tunnel.html
diff --git a/example_scripts/ropensci_reviews_checklist.html b/example_scripts/ropensci_reviews_checklist.html
diff --git a/example_scripts/ropensci_reviews_checklist.md b/example_scripts/ropensci_reviews_checklist.md
@@ -67,7 +67,7 @@ also takes data in the correct format and already relabels the columns). Maybe
 this could be made a bit clearer in the vignette?
 
 Items for us:   
-- [ ] clarify language of this vignette to indicate that relabeling & gathering
+- [x] clarify language of this vignette to indicate that relabeling & gathering
 are only necessary in certain cases (e.g. using Motive data)
 
 > Maybe that was me not being very familiar with the kind of experiments the
@@ -82,9 +82,9 @@ why it doesn’t need to be rotated, or is that something arising from the Flydr
 software?
 
 Items for us:   
-- [ ] in this vignette: clarify the circumstances under which standardization is
+- [x] in this vignette: clarify the circumstances under which standardization is
 needed and what types of landmarks are appropriate (consider adding a figure?)  
-- [ ] in the Help file for `standardize_tunnel()`: clarify this function's use 
+- [x] in the Help file for `standardize_tunnel()`: clarify this function's use 
 cases and perhaps link to the vignette itself?  
 
 > Also, considering how the select_x_percent() function works (by selecting a
@@ -93,13 +93,13 @@ axis), shouldn’t it be more appropriate to say that the (0,0,0) must be at the
 centre of the region of interest, rather than at the centre of the tunnel?
 
 Items for us:   
-- [ ] revise language of this vignette on what (0,0,0) represents
+- [x] revise language of this vignette on what (0,0,0) represents
 
 > Minor point: the link to the vignette for managing frame gaps is missing in
 the text.
 
 Items for us:   
-- [ ] add link
+- [x] add link
 
 #### Managing frame gaps 
 
@@ -255,8 +255,8 @@ input data can look like. So, you need x,y,z ... but what more. And what defines
 Optitrack and flydra data.
 
 Items for us:  
-- [ ] update the language of the Data import and cleaning vignette OR consider
-cleaving off some of this stuff into its own vignette
+- [x] add a short walkthrough of what movement data look like, both generally
+and specifically in Motive and Flydra
 
 > I did not see any contribution guidelines, so it would be helpful to include
 those.

diff --git a/man/standardize_tunnel.Rd b/man/standardize_tunnel.Rd
diff --git a/vignettes/data-import-cleaning.Rmd b/vignettes/data-import-cleaning.Rmd
@@ -27,20 +27,38 @@ itself. Data may not be organized as “tidy” key-value pairs, the axes and
 overall orientation of the environment may not conform to a standard, and
 individual movement trajectories may be ill-defined. 
 
-`pathviewR` provides functions in R to deal with such problems. This vignette
-will cover the basics of how to import raw data and how to "clean" data to
-prepare it for statistical analyses.
+`pathviewR` provides functions in R to deal with such problems (i.e. "clean"
+them). This vignette will cover the basics of how to import raw data and how to
+clean data to prepare it for visualization and/or statistical analyses.
 
 
+## What do movement data sets look like?
+
+At minimum, movement data provide information on a subject or object's position
+over time. These data are typically supplied in three dimensions (e.g. x, y, z),
+with position in each dimension sampled at a particular rate (e.g. 100 Hz).
+Different recording software may provide additional features, such as the
+ability to track multiple subjects simultaneously, information on subjects'
+rotation, tracking of "rigid body" elements, or even the ability to apply Kalman
+filters.
+
+A central goal of `pathviewR` is to take data from different sources (so far:
+Motive and Flydra), re-organize them into a common format that can be wrangled
+in R, clean them up a bit, and get them ready for visualization and/or
+statistical analyses. We'll first cover what's included in Motive and in Flydra
+data and how `pathviewR` handles these. Should you have data from another
+source, our `as_viewr()` function will allow you to bring it into the
+`pathviewR` framework.
+
 ## Data import via `pathviewR`
 
-Data can be imported via one of three ways:  
+Data can be imported via one of three functions:  
 
-- `read_motive_csv()` imports data from `.csv` files exported by 
-[Optitrack's Motive](https://optitrack.com/software/motive/) software  
+- `read_motive_csv()` imports data from `.csv` files that have been exported
+from [Optitrack's Motive](https://optitrack.com/software/motive/) software
 
-- `read_flydra_mat()` imports data from  `.mat` files exported from 
-[Flydra](https://github.com/strawlab/flydra)  
+- `read_flydra_mat()` imports data from `.mat` files that have been exported
+from [Flydra](https://github.com/strawlab/flydra)
 
 - `as_viewr()` can be used to handle data from other sources  
 
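To make the description of movement data concrete, here is a small base R sketch of position sampled at 100 Hz for two subjects, first with one row per frame ("wide") and then reorganized so each row is one subject at one frame. All names and values are made up for illustration; this does not use or reproduce any `pathviewR` function or the actual column layout of Motive or Flydra exports.

```r
## Toy example: two hypothetical subjects tracked in x, y, z at 100 Hz
frame_rate <- 100
n_frames <- 5
time_sec <- (seq_len(n_frames) - 1) / frame_rate

## "Wide" layout: one row per frame, one set of columns per subject
wide <- data.frame(
  frame = seq_len(n_frames),
  time_sec = time_sec,
  subj01_x = seq(0, by = 0.02, length.out = n_frames),
  subj01_y = rep(0.5, n_frames),
  subj01_z = rep(1.0, n_frames),
  subj02_x = seq(1, by = -0.02, length.out = n_frames),
  subj02_y = rep(-0.5, n_frames),
  subj02_z = rep(1.2, n_frames)
)

## "Long" layout: one row per subject per frame, one column per dimension
subjects <- c("subj01", "subj02")
long <- do.call(rbind, lapply(subjects, function(s) {
  data.frame(
    frame = wide$frame,
    time_sec = wide$time_sec,
    subject = s,
    x = wide[[paste0(s, "_x")]],
    y = wide[[paste0(s, "_y")]],
    z = wide[[paste0(s, "_z")]]
  )
}))

dim(wide)
dim(long)
```

The wide table grows by three columns for every additional subject, whereas the long table keeps the same six columns and only gains rows.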
@@ -82,6 +100,15 @@ motive_data <-
 motive_data
 ``` 
 
+A key thing to note is that these data, as stored in Motive CSVs, are not
+"tidy". Each frame occupies one row, but what that also means is that the
+rotation and position values for the various subjects take up 24 columns! This
+format not only makes plotting data more difficult in base R, `ggplot2`, and
+`rgl`, but also makes other aspects of data wrangling more difficult. In a later
+step, we will 'gather' these data into key-value pairs so that e.g. all
+length-wise position values are in one column, all width-wise values are in
+another, and so on.
+
 Metadata are stored as attributes. We won't go through all of these, but here
 are a couple important ones.
 
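For readers unfamiliar with attributes, the general mechanism can be sketched in base R as follows. The toy data frame is made up, and `"units"` is a hypothetical attribute name chosen for illustration; only `"frame_rate"` is taken from the attribute names shown in this vignette.

```r
## Toy data frame standing in for imported movement data
toy <- data.frame(
  frame = 1:3,
  position_x = c(0.10, 0.12, 0.14)
)

## Attach metadata as attributes; the data columns themselves are unchanged
attr(toy, "frame_rate") <- 100
attr(toy, "units") <- "meters" # hypothetical attribute name

## Retrieve a single attribute, or list the names of all of them
attr(toy, "frame_rate")
names(attributes(toy))
```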
@@ -99,6 +126,11 @@ attr(motive_data, "data_types_simple")
 attr(motive_data, "frame_rate")
 ```
 
+Storing such metadata in the attributes is a key feature of `pathviewR`. These
+metadata may not be as immediately important as the time series of position
+or rotation, but they can provide important experimental information such as the
+date & time of capture and the units of the position data (here, meters).
+
 ### Flydra Matlab files
 
 `.mat` files exported from Flydra can be imported via `read_flydra_mat()`.
@@ -120,8 +152,14 @@ flydra_data <-
 ## Similarly, this produces a tibble with important 
 ## metadata as attributes
 flydra_data
+
+attr(flydra_data, "frame_rate")
 ```
 
+Note that unlike the example Motive data, the Flydra data are already organized
+into key-value pairs. Because rotation is not captured by Flydra, such data are
+also not included.
+
 ### Data from other sources
 
 Data from another format can be converted to a `viewr` object via
@@ -155,6 +193,9 @@ test <-
     position_width_col = 6,
     position_height_col = 4
   )
+
+## Some metadata are stored as attributes
+attr(test, "frame_rate")
 ```
 
 We also welcome you to request custom data import functions, especially if
@@ -165,9 +206,11 @@ via our Github Issues page.
 
 
 ## Data cleaning
-Data exported via either Motive or Flydra are not typically "tidy". Functions in
-`pathviewR` ultimately rely on having tidy data sets that are easily 
-interpreted.
+As noted above, raw data often suffer from the following problems:  
+- noise or artifacts from the recording session  
+- data not organized as “tidy” key-value pairs  
+- axes and overall orientation that may not conform to a standard  
+- individual movement trajectories that may be ill-defined
 
 Several functions to clean and wrangle data are available, and we have a
 suggested pipeline for how these steps should be handled. The rest of this
@@ -190,6 +233,9 @@ label it as the y axis instead.
 "tunnel_width", and "tunnel_height". **These axis labels will be expected by
 subsequent functions, so skipping this step is ill-advised.**
 
+Typically, axes from Motive data will need to be relabeled, but axes in data
+imported from Flydra will not.
+
 ```{r relabel_axes}
 motive_relabeled <-
   motive_data %>%
@@ -208,12 +254,18 @@ data from a given session and organize it so that all data of a given type are
 within one column, i.e. all position lengths are in `position_length`, as
 opposed to separate length columns for each rigid body. **These column names
 will be expected by subsequent functions, so skipping this step is also
-ill-advised.**
-
-Use `trim_tunnel_outliers()` to remove artifacts and other outlier data. This
-step is entirely optional, and should only be used when the user is confident
-that data outside certain ranges are artifacts or other bugs. Data outside these
-ranges are then filtered out. Best to plot data beforehand and check!!
+ill-advised if you are using data from Motive.** Should you have data from 
+Flydra, this step should be skippable.
+
+Use `trim_tunnel_outliers()` to remove extreme artifacts and other outlier data.
+This function creates a (virtual) bounding box according to user
+specifications, and any data outside that box are removed. For example,
+if you know your arena measures 10m x 10m x 10m and your data were calibrated to
+range from 0-10m in each dimension, you can be reasonably sure that extreme
+values such as 45m on a given axis are bogus. This step is entirely optional,
+and should only be used when the user is confident that data outside certain
+ranges are artifacts or other bugs. It is best to plot the data beforehand and
+check!
 
 ```{r gather_and_trim}
 ## First gather and show the new column names
@@ -246,24 +298,30 @@ in identical ways. Moreover, the user may want to redefine how the coordinate
 system itself is defined (i.e. change the location of `(0, 0, 0)` to another
 place within the tunnel).
 
-Note that having `(0, 0, 0)` set to the center of the tunnel is required for 
-all subsequent `pathviewR` functions to work.
+Note that having `(0, 0, 0)` set to the center of the region of interest
+(covered in the next section of this vignette) is required for all subsequent
+`pathviewR` functions to work.
 
 `pathviewR` offers three main choices for such standardization:  
 
 - `redefine_tunnel_center()`: Sets the location of 0 on any or all axes to a new
 location. See the Help page for this function to see the four different methods
-by which a user can specify this. No rotation of the tunnel is performed.  
+by which a user can specify this. No rotation of the tunnel is performed. This
+function can be used on both Motive and Flydra data.  
 
 - `standardize_tunnel()`: Use specified landmarks (`subjects` within the `viewr`
 object) to rotate and translate the location of a tunnel, setting `(0, 0, 0)` to
-the center of the tunnel (centering).
+the center of the tunnel (centering). For example, in an avian flight tunnel,
+perches may be set up on opposite ends of the tunnel and rigid body markers may
+be set to them. The positions of these perches can be used as landmarks to
+standardize tunnel position. Note that this is typically not possible for Flydra
+data, since Flydra data will be imported with only one `subject`.  
 
-- `rotate_tunnel`:  Rotate and center a tunnel based on user-defined coordinates
+- `rotate_tunnel()`: Rotate and center a tunnel based on user-defined coordinates
 (i.e. similar to `standardize_tunnel()` but for cases where specified landmarks
-are not in the data).  
+are not in the data). This function can be used on both Motive and Flydra data.  
 
-Two quick examples will follow, using our motive and Flydra data:
+Two quick examples will follow, using our Motive and Flydra data:
 
 ```{r rotate_example}
 ## Rotate and center the motive data set:
@@ -305,7 +363,7 @@ Differences due to rotation may be extremely subtle, but the redefining of
 axes of the plots.
 
 Flydra data typically do not need to be rotated, so we will instead use
-`redfine_tunnel_center()` to adjust the location of `(0, 0, 0)`:
+`redefine_tunnel_center()` to adjust the location of `(0, 0, 0)`:
 
 ```{r redefine_tunnel_example}
 ## Re-center the Flydra data set:
@@ -383,7 +441,7 @@ Isolating trajectories is handled via the `separate_trajectories()` function in
 Because cameras may occasionally drop frames, we allow the user to permit some
 relaxation of how stringent the "continuous movement" criterion is. This is
 handled via the `max_frame_gap` argument within `separate_trajectories()`. For
-more details, please see VIGNETTE XXX (LINK HERE).  
+more details, please see [the vignette Managing frame gaps with pathviewR](https://vbaliga.github.io/pathviewR/articles/managing-frame-gaps.html).  
 
 In our Motive example, we'll use the automated feature built into the function
 to guesstimate the best `max_frame_gap` allowed. When frame gaps larger than