Convenient Data Exploration with see() and browse()

Xplorer

Xplorer provides two functions for convenient exploration of datasets in R: see() and browse().

see()

see() prints the following details of an object to the console:

  • attributes (useful to inspect labels)
  • class, typeof, mode, storage.mode
  • table (frequency and percentage of values)
  • summary statistics

Several other packages attempt to do something similar, but none of them offers the same ease of use and thorough inspection of objects. For example, base R offers the functions summary(), str() and structure(). However, each of those is limited in what it displays and leaves other aspects untouched. see() combines a number of functions that would otherwise have to be invoked separately, and mirrors Stata’s ‘codebook’ command.
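To illustrate what "invoked separately" means, here is a minimal sketch that collects the same kinds of details using only base R. It is not Xplorer's implementation; the variable names `x`, `types`, `freq` and `pct` are chosen purely for illustration.

```r
# Example vector carrying a "label" attribute
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
attr(x, "label") <- "mylabel"

# Attributes (for example, variable labels)
attributes(x)

# Class, type, mode and storage mode each need their own call
types <- c(class = class(x), typeof = typeof(x),
           mode = mode(x), storage.mode = storage.mode(x))
types

# Frequencies and percentages
freq <- table(x)
pct <- 100 * prop.table(freq)
cbind(Frequency = freq, Percent = pct)

# Summary statistics
summary(x)
sd(x)
```

Each call covers only one aspect of the object, which is the gap see() is meant to close with a single call.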

see() is an S3 generic and currently comes with methods for the following data types:

  • character
  • factor
  • numeric
  • data.frame
  • labelled
  • default (all other types)
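For readers unfamiliar with S3, the sketch below shows how a generic dispatches to a type-specific method and falls back to a default method. The generic `describe()` and its methods are hypothetical and exist only for this illustration; they are not part of Xplorer and say nothing about how see() is implemented internally.

```r
# Hypothetical S3 generic: UseMethod() selects a method based on class(x)
describe <- function(x, ...) UseMethod("describe")

# Method for numeric vectors (doubles and integers)
describe.numeric <- function(x, ...) {
  cat("numeric, range:", min(x, na.rm = TRUE), "to", max(x, na.rm = TRUE), "\n")
  invisible("numeric")
}

# Method for factors
describe.factor <- function(x, ...) {
  cat("factor with", nlevels(x), "level(s)\n")
  invisible("factor")
}

# Fallback for every type without a dedicated method
describe.default <- function(x, ...) {
  cat("object of class", class(x)[1], "\n")
  invisible("default")
}

describe(c(1.5, 2))       # dispatches to describe.numeric
describe(factor("a"))     # dispatches to describe.factor
describe(TRUE)            # no logical method, so describe.default is used
```

Supporting a further data type then only requires adding another method named after its class.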

Example

library("Xplorer")

# Create example data
numeric.labelled <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
attributes(numeric.labelled)$label <- "mylabel"

# see the object
see(numeric.labelled)
#> $label
#> [1] "mylabel"
#> 
#> 
#> 
#>    Class Typeof    Mode Storage.mode
#>  numeric double numeric       double
#> 
#> 
#>   Frequency Percent
#> 1         1      10
#> 2         2      20
#> 3         3      30
#> 4         4      40
#> 
#> 
#>   N Missings Distinct.values Min Max Mean Median Std.Deviation Variance
#>  10        0               4   1   4    3      3      1.054093 1.111111

browse()

The aim of browse() is to create a data.frame that allows you to use the search field of the RStudio View() function to search for variables based on their names or values, something that is currently not possible when viewing the dataset itself. Particularly when a dataset is very large and no comprehensive codebook is available, quickly searching for variable names makes finding variables of interest much easier.

If you are not working in RStudio, View() may not work in the same way and may not offer a search option. Hence browse() does not call View() directly. Instead, store the result in a new object first, which can then be opened with View() in RStudio. browse() can also be called on a data.frame directly to print the output to the console.
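The idea of a one-row-per-variable overview can be sketched in base R as follows. This is a rough approximation under stated assumptions, not Xplorer's actual code: the helper `browse_sketch()` is hypothetical, it produces only a subset of the columns, it assumes labels are stored in a "label" attribute, and it counts N as the number of non-missing values.

```r
# Hypothetical helper: one row per variable of a data.frame
browse_sketch <- function(data) {
  stopifnot(is.data.frame(data))

  # Return the "label" attribute of a column, or "-" if there is none
  label_or_dash <- function(col) {
    lab <- attr(col, "label", exact = TRUE)
    if (is.null(lab)) "-" else as.character(lab)[1]
  }

  out <- data.frame(
    variable_name   = names(data),
    variable_label  = vapply(data, label_or_dash, character(1)),
    distinct_values = vapply(data, function(col) length(unique(col[!is.na(col)])),
                             integer(1)),
    class           = vapply(data, function(col) class(col)[1], character(1)),
    N               = vapply(data, function(col) sum(!is.na(col)), integer(1)),
    missings        = vapply(data, function(col) sum(is.na(col)), integer(1)),
    stringsAsFactors = FALSE
  )
  rownames(out) <- NULL  # use plain row numbers instead of variable names
  out
}

# Small example with one labelled column and one missing value
toy <- data.frame(a = c(1, 2, NA), b = c("x", "x", "y"),
                  stringsAsFactors = FALSE)
attr(toy$a, "label") <- "First variable"
overview <- browse_sketch(toy)
overview
```

Because the result is an ordinary data.frame whose rows are variables, any viewer that can search cell contents can then search variable names and labels.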

Example

library("Xplorer")

# Create an example data.frame and add some variable labels
data(mtcars)
attributes(mtcars$mpg)$label <- "Miles/(US) gallon"
attributes(mtcars$cyl)$label <- "Number of cylinders"
attributes(mtcars$disp)$label <- "Displacement (cu.in.)"
attributes(mtcars$hp)$label <- "Gross horsepower"
attributes(mtcars$drat)$label <- "Rear axle ratio"
attributes(mtcars$wt)$label <- "Weight (1000 lbs)"
attributes(mtcars$qsec)$label <- "1/4 mile time"
attributes(mtcars$vs)$label <- "V/S"

# Call browse() directly...
browse(mtcars)
#>    variable_name        variable_label        range distinct_values
#> 1            mpg     Miles/(US) gallon   10.4, 33.9              25
#> 2            cyl   Number of cylinders         4, 8               3
#> 3           disp Displacement (cu.in.)  71.1, 472.0              27
#> 4             hp      Gross horsepower      52, 335              22
#> 5           drat       Rear axle ratio   2.76, 4.93              22
#> 6             wt     Weight (1000 lbs) 1.513, 5.424              29
#> 7           qsec         1/4 mile time   14.5, 22.9              30
#> 8             vs                   V/S         0, 1               2
#> 9             am                     -         0, 1               2
#> 10          gear                     -         3, 5               3
#> 11          carb                     -         1, 8               6
#>      class typeof  N missings
#> 1  numeric double 32        0
#> 2  numeric double 32        0
#> 3  numeric double 32        0
#> 4  numeric double 32        0
#> 5  numeric double 32        0
#> 6  numeric double 32        0
#> 7  numeric double 32        0
#> 8  numeric double 32        0
#> 9  numeric double 32        0
#> 10 numeric double 32        0
#> 11 numeric double 32        0

# ...or assign to df and open with View()
df <- browse(mtcars)

Then call View(df) and use the search field to search variable names and labels. Or, even simpler, call View(browse(mtcars)) directly.

Installation

You can install Xplorer from GitHub with:

remotes::install_github("fschaffner/Xplorer")

Please report issues or requests for additional functionality to https://github.com/fschaffner/Xplorer/issues.
