
Creating R package to code free text gender responses






CRAN status Lifecycle: stable Project Status: Active – The project has reached a stable, usable state and is being actively developed. R-CMD-check Codecov test coverage

The goal of gendercoder is to allow simple re-coding of free-text gender responses. This is intended to permit representation of gender diversity, while managing troll-responses and the workload implications of manual coding.


This package is not on CRAN. To use this package please run the following code:


Basic use

The gendercoder package permits the efficient re-coding of free-text gender responses within a tidyverse pipeline. It contains two built-in English output dictionaries: the default manylevels_en dictionary, which corrects spelling and standardises terms while maintaining the diversity of responses, and the fewlevels_en dictionary, which contains fewer gender categories: “man”, “woman”, “boy”, “girl”, and “sex and gender diverse”.

The core function, recode_gender(), takes three arguments:

  • gender the vector of free-text gender,

  • dictionary the preferred dictionary, and

  • retain_unmatched a logical indicating whether original values should be carried over if there is no match.

Basic usage is demonstrated below.


library(gendercoder)
library(dplyr)  # provides tibble(), mutate() and %>%

tibble(gender = c("male", "MALE", "mle", "I am male", "femail", "female", "enby")) %>% 
  mutate(manylevels_gender = recode_gender(gender, dictionary = manylevels_en, retain_unmatched = TRUE),
         fewlevels_gender = recode_gender(gender, dictionary = fewlevels_en, retain_unmatched = FALSE))
#> Results not matched from the dictionary have been filled with the user inputted values
#> # A tibble: 7 × 3
#>   gender    manylevels_gender fewlevels_gender      
#>   <chr>     <chr>             <chr>                 
#> 1 male      man               man                   
#> 2 MALE      man               man                   
#> 3 mle       man               man                   
#> 4 I am male I am male         <NA>                  
#> 5 femail    woman             woman                 
#> 6 female    woman             woman                 
#> 7 enby      non-binary        sex and gender diverse

The package does not need to be used as part of a tidyverse pipeline:

df <- tibble(gender = c("male", "MALE", "mle", "I am male", "femail", "female", "enby")) 

df$manylevels_gender <- recode_gender(df$gender, dictionary = manylevels_en)
df
#> # A tibble: 7 × 2
#>   gender    manylevels_gender
#>   <chr>     <chr>            
#> 1 male      man              
#> 2 MALE      man              
#> 3 mle       man              
#> 4 I am male <NA>             
#> 5 femail    woman            
#> 6 female    woman            
#> 7 enby      non-binary
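
The difference between the two outputs above (row 4 is kept as "I am male" in one case and becomes NA in the other) comes down to retain_unmatched. The sketch below imitates that behaviour in base R so it can be run without the package. It assumes a dictionary behaves like a named character vector keyed by lower-case responses; toy_dictionary and toy_recode() are hypothetical names made up for illustration and are not part of gendercoder, and the package's real matching may differ in detail.

```r
# Toy dictionary: names are lower-case raw responses, values are recoded labels.
# Hypothetical entries for illustration only; not the package's dictionaries.
toy_dictionary <- c(male = "man", mle = "man", female = "woman",
                    femail = "woman", enby = "non-binary")

# Hypothetical helper imitating the documented effect of retain_unmatched.
toy_recode <- function(gender, dictionary, retain_unmatched = FALSE) {
  key <- tolower(gender)               # make the lookup case-insensitive
  recoded <- unname(dictionary[key])   # NA wherever the key has no entry
  if (retain_unmatched) {
    unmatched <- is.na(recoded)
    recoded[unmatched] <- gender[unmatched]  # carry the original response over
  }
  recoded
}

responses <- c("male", "MALE", "mle", "I am male", "femail", "enby")

# Unmatched responses become NA
toy_recode(responses, toy_dictionary)

# Unmatched responses keep the participant's original text
toy_recode(responses, toy_dictionary, retain_unmatched = TRUE)
```

Retaining unmatched values is convenient when the remaining responses will be reviewed and coded by hand; leaving them as NA is safer when the recoded column feeds straight into an analysis.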

Contributing to this package

This package reflects the cultural context of its contributors. We acknowledge that understandings of gender are bound by both culture and time and are continually changing. As such, we welcome issues and pull requests that make the package more inclusive, more reflective of current understandings of gender-inclusive language, or suitable for a broader range of cultural contexts. We particularly welcome the addition of non-English dictionaries or of other gender-diverse responses to the manylevels_en and fewlevels_en dictionaries.

The “Adding to the dictionary” vignette includes information about how to make changes to the dictionary, either for your own use or when contributing to the gendercoder package.
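
As a rough illustration of what extending a dictionary for your own use might look like, the base R sketch below combines a stand-in dictionary with a few extra entries. It assumes dictionaries are named character vectors; base_dictionary, custom_entries and extended are hypothetical names with made-up entries, and the vignette remains the authoritative guide.

```r
# Stand-in for a built-in dictionary (hypothetical entries, not the real one).
base_dictionary <- c(male = "man", female = "woman")

# Hypothetical additions, here a few Spanish responses.
custom_entries <- c(hombre = "man", mujer = "woman")

# Name-based lookup returns the first match, so placing the custom entries
# first lets them take precedence if a key appears in both vectors.
extended <- c(custom_entries, base_dictionary)

# Check for keys defined more than once (0 means no duplicates).
anyDuplicated(names(extended))

# Look up a mix of old and new responses, ignoring case.
recoded <- unname(extended[tolower(c("Hombre", "female", "mujer"))])
recoded
```

With the package loaded, a combined vector of this kind would presumably be passed through the dictionary argument of recode_gender(); check the vignette for the supported approach before relying on it.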

Citation Information

Please cite this package as:

Jennifer Beaudry, Emily Kothe, Felix Singleton Thorn, Rhydwyn McGuire, Nicholas Tierney and Mathew Ling (2020). gendercoder: Recodes Sex/Gender Descriptions into a Standard Set. R package version

Acknowledgement of Country

We acknowledge the Wurundjeri people of the Kulin Nation as the custodians of the land on which this package was developed and pay respects to elders past, present and future.

