❗ This is a read-only mirror of the CRAN R package repository.
labelmachine: Make Labeling of R Data Sets Easy.
Homepage: https://a-maldet.github.io/labelmachine and https://github.com/a-maldet/labelmachine
Report bugs for this package: https://github.com/a-maldet/labelmachine/issues

labelmachine

labelmachine is an R package that helps you assign meaningful labels to data sets. You can also manage your labels in so-called lama-dictionary files, which are YAML files. This makes it easy to reuse the same label translations across multiple projects that share a similar data structure.

Labeling your data can be easy!

Installation

# Install release version from CRAN
install.packages("labelmachine")

# Install development version from GitHub
devtools::install_github('a-maldet/labelmachine', build_vignettes = TRUE)

Concept

The label assignments are given in so-called translations (named character vectors), which work like recipes: they tell which original value is mapped onto which new label. The translations are collected in so-called lama_dictionary objects, which are then used to translate your data frame variables.
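
To make the idea of a translation concrete, here is a minimal base R sketch of what a single named character vector does conceptually. It does not use labelmachine and is not meant to describe how the package is implemented; the variable names are purely illustrative.

```r
# A translation in the sense described above: a named character vector.
# Names are the original values, elements are the new labels.
subjects <- c(en = "English", ma = "Mathematics")

# Illustrative input values (hypothetical data, not from the package).
subject <- c("en", "ma", "ma", "en", "en")

# Base R lookup by name; unname() drops the names carried over by the lookup.
labels <- unname(subjects[subject])

# Turn the labels into a factor whose levels follow the translation order.
subject_new <- factor(labels, levels = unname(subjects))
print(subject_new)
```

A lama_dictionary can be thought of as a named collection of such vectors, one per kind of variable.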

Usage

Let df be a data frame with marks and subjects that should be translated:

df <- data.frame(
  pupil_id = c(1, 1, 2, 2, 3),
  subject = c("en", "ma", "ma", "en", "en"),
  result = c(2, 1, 3, 2, NA),
  stringsAsFactors = FALSE
)
df
##   pupil_id subject result
## 1        1      en      2
## 2        1      ma      1
## 3        2      ma      3
## 4        2      en      2
## 5        3      en     NA

Create a lama_dictionary object holding the translations:

library(labelmachine)
dict <- new_lama_dictionary(
  subjects = c(en = "English", ma = "Mathematics", NA_ = "other subjects"),
  results = c("1" = "Excellent", "2" = "Satisfying", "3" = "Failed", NA_ = "Missed")
)
dict
## 
## --- lama_dictionary ---
## Variable 'subjects':
##               en               ma              NA_ 
##        "English"    "Mathematics" "other subjects" 
## 
## Variable 'results':
##            1            2            3          NA_ 
##  "Excellent" "Satisfying"     "Failed"     "Missed"

Translate the data frame variables:

df_new <- lama_translate(
  df,
  dict,
  subject_new = subjects(subject),
  result_new = results(result)
)
str(df_new)
## 'data.frame':    5 obs. of  5 variables:
##  $ pupil_id   : num  1 1 2 2 3
##  $ subject    : chr  "en" "ma" "ma" "en" ...
##  $ result     : num  2 1 3 2 NA
##  $ subject_new: Factor w/ 3 levels "English","Mathematics",..: 1 2 2 1 1
##  $ result_new : Factor w/ 4 levels "Excellent","Satisfying",..: 2 1 3 2 4
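
The str() output above abbreviates the factor levels. The following sketch repeats the same example in one self-contained piece and inspects the new columns directly. It uses only the calls already shown in this README plus base R; the expected values in the comments are read off the str() output above.

```r
library(labelmachine)

# Same data and dictionary as in the Usage example above.
df <- data.frame(
  pupil_id = c(1, 1, 2, 2, 3),
  subject = c("en", "ma", "ma", "en", "en"),
  result = c(2, 1, 3, 2, NA),
  stringsAsFactors = FALSE
)
dict <- new_lama_dictionary(
  subjects = c(en = "English", ma = "Mathematics", NA_ = "other subjects"),
  results = c("1" = "Excellent", "2" = "Satisfying", "3" = "Failed", NA_ = "Missed")
)
df_new <- lama_translate(
  df,
  dict,
  subject_new = subjects(subject),
  result_new = results(result)
)

# The levels follow the order in which the labels appear in the translation.
levels(df_new$result_new)
# "Excellent" "Satisfying" "Failed" "Missed"

# The missing result in row 5 received the label given for NA_.
as.character(df_new$result_new)
# "Satisfying" "Excellent" "Failed" "Satisfying" "Missed"
```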

Highlights

labelmachine offers the following features:

  • All types of variables can be translated: Logical, Numeric, Character, Factor
  • When translating your variables, you may choose between keeping the current ordering or applying a new factor ordering to your variable.
  • Assigning meaningful labels to missing values (NA) is no problem.
  • Assigning NA to existing values is no problem.
  • Merging two values into a single label is no problem.
  • Transforming a data frame holding label assignment lists into a lama_dictionary is no problem.
  • Manage your translations in YAML files so that the same translations can be used in different projects sharing similar data.
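
Two of the features listed above can be sketched briefly: merging two original values into a single label, and storing a dictionary in a YAML file. This is a hedged sketch rather than a reference: it assumes that lama_write() takes the dictionary followed by a file path and that lama_read() takes a file path, so please check ?lama_write and ?lama_read before relying on it. The labels and the temporary file path are illustrative.

```r
library(labelmachine)

# Merging: the original values 1 and 2 share the label "Passed".
dict <- new_lama_dictionary(
  results = c("1" = "Passed", "2" = "Passed", "3" = "Failed", NA_ = "Missed")
)

# Illustrative data frame with a single variable to translate.
df <- data.frame(result = c(2, 1, 3, 2, NA))
df_new <- lama_translate(df, dict, result_new = results(result))
df_new

# YAML round trip (assumed argument order: object first, then file path).
path <- tempfile(fileext = ".yaml")
lama_write(dict, path)
dict_from_file <- lama_read(path)
dict_from_file
```

Keeping the dictionary in a file like this is what allows several projects to share one set of translations.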

Further reading

A short introduction can be found here: Get started
