charlatan

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

charlatan makes fake data. It is inspired by, and borrows some code from, Python's faker (https://github.com/joke2k/faker).

Make fake data for:

  • person names
  • jobs
  • phone numbers
  • colors: names, hex, rgb
  • credit cards
  • DOIs
  • numbers in range and from distributions
  • gene sequences
  • geographic coordinates
  • emails
  • URIs, URLs, and their parts
  • IP addresses
  • more coming ...

Possible use cases for charlatan:

  • Students in a classroom setting learning any task that needs a dataset.
  • People doing simulations/modeling who need some fake data
  • Generate a fake dataset of users for a database before actual users exist
  • Fill in missing spots in a dataset
  • Generate fake data to replace sensitive real data before public release
  • Create a random set of colors for visualization
  • Generate random coordinates for a map
  • Get a set of randomly generated DOIs (Digital Object Identifiers) to assign to fake scholarly artifacts
  • Generate fake taxonomic names for a biological dataset
  • Get a set of fake sequences to use to test code/software that uses sequence data
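As a sketch of the "replace sensitive real data" use case above, the following overwrites the identifying columns of a small data frame with generated values. The `patients` data frame and its column names are hypothetical and made up for illustration; only `ch_name()` and `ch_phone_number()` come from charlatan.

```r
library(charlatan)

# Hypothetical data frame standing in for sensitive data (made up for illustration)
patients <- data.frame(
  name = c("Real Person A", "Real Person B", "Real Person C"),
  phone = c("000-0000", "000-0001", "000-0002"),
  visits = c(3, 1, 7),
  stringsAsFactors = FALSE
)

# Copy the data, then overwrite only the identifying columns with fake values
anonymized <- patients
anonymized$name <- ch_name(n = nrow(patients))
anonymized$phone <- ch_phone_number(n = nrow(patients))

# Non-identifying columns such as `visits` are left untouched
anonymized
```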

Reasons to use charlatan:

  • Lightweight, few dependencies
  • Relatively comprehensive types of data, and more being added
  • Comprehensive set of languages supported, more being added
  • Useful R features such as creating entire fake data.frames

Installation

CRAN version

install.packages("charlatan")

dev version

remotes::install_github("ropensci/charlatan")
library("charlatan")
set.seed(12345)

high level function

... for all fake data operations

x <- fraudster()
x$job()
#> [1] "Corporate investment banker"
x$name()
#> [1] "Dr. Garey Hamill"
x$color_name()
#> [1] "Ivory"
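The `set.seed()` call above suggests that the generators draw on R's random number generator. Assuming that holds for the methods used here (an assumption, not something this README states), resetting the seed before repeating the same calls should reproduce the same values. A minimal sketch:

```r
library(charlatan)

x <- fraudster()

# First run with a fixed seed
set.seed(42)
first <- c(x$job(), x$name(), x$color_name())

# Reset the seed and repeat the same calls in the same order
set.seed(42)
second <- c(x$job(), x$name(), x$color_name())

# Compare the two runs
identical(first, second)
```

The order of calls matters: each call consumes random numbers, so the same seed followed by a different sequence of calls would not be expected to give the same results.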

locale support

More locales are being added over time. For example:

Locale support for job data

ch_job(locale = "en_US", n = 3)
#> [1] "Therapeutic radiographer" "Teacher, primary school" 
#> [3] "Lobbyist"
ch_job(locale = "fr_FR", n = 3)
#> [1] "Contrôleur de gestion"    "Bactériologiste"         
#> [3] "Attaché d'administration"
ch_job(locale = "hr_HR", n = 3)
#> [1] "Dokumentarist savjetnik" "Maser – kupeljar"       
#> [3] "Voditelj projekta"
ch_job(locale = "uk_UA", n = 3)
#> [1] "Доцент"              "Дипломат"            "Головний меркшейдер"
ch_job(locale = "zh_TW", n = 3)
#> [1] "牙醫師"           "飛安人員"         "機電技師/工程師"
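To compare locales side by side, the calls above can be wrapped in a loop. A minimal sketch that uses only the locale codes shown above and collects the results in a named list (the variable names are illustrative):

```r
library(charlatan)

# Locale codes taken from the examples above
locales <- c("en_US", "fr_FR", "hr_HR", "uk_UA", "zh_TW")

# One character vector of 3 jobs per locale, named by locale code
jobs_by_locale <- lapply(
  setNames(locales, locales),
  function(loc) ch_job(n = 3, locale = loc)
)

str(jobs_by_locale)
```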

For colors:

ch_color_name(locale = "en_US", n = 3)
#> [1] "LightSeaGreen" "Brown"         "Aqua"
ch_color_name(locale = "uk_UA", n = 3)
#> [1] "Сиваво-зелений"      "Берлінська лазур"    "Сині яйця малинівки"

More coming soon ...

generate a dataset

ch_generate()
#> # A tibble: 10 × 3
#>    name                    job                              phone_number    
#>    <chr>                   <chr>                            <chr>           
#>  1 King Bartoletti         Trading standards officer        972.438.0296    
#>  2 Dr. Ike Gerhold         Surgeon                          (963)938-1790   
#>  3 Dr. Tatyanna Blanda DVM Estate agent                     856.021.4956x893
#>  4 Antione Grant           Fish farm manager                132.576.3127    
#>  5 Michal Gutmann          Scientist, research (maths)      837.134.4726x743
#>  6 Ross Cartwright PhD     Dealer                           773-448-3969    
#>  7 Michal Balistreri       Phytotherapist                   110-184-6140x699
#>  8 Mabelle Crist           Neurosurgeon                     275-104-0595    
#>  9 Infant Dicki            Armed forces operational officer 766-679-9103x791
#> 10 Karri Heaney            Psychiatric nurse                02278877787
ch_generate('job', 'phone_number', n = 30)
#> # A tibble: 30 × 2
#>    job                                 phone_number      
#>    <chr>                               <chr>             
#>  1 Interior and spatial designer       005-426-5468x0971 
#>  2 Geophysical data processor          459-522-7741      
#>  3 Ophthalmologist                     678.654.1098x445  
#>  4 Engineer, agricultural              373.769.5149      
#>  5 Dealer                              121.204.9799x098  
#>  6 Environmental health practitioner   1-222-568-8486    
#>  7 Surveyor, hydrographic              228.958.1370x0609 
#>  8 Lobbyist                            (976)726-0690x1803
#>  9 Cytogeneticist                      008.111.9486      
#> 10 Designer, blown glass/stained glass 387-870-5348      
#> # … with 20 more rows
#> # ℹ Use `print(n = ...)` to see more rows
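Because the result prints as a tibble, it can be handled like any other data frame. A small sketch, using only column names that appear in the output above, that inspects the result and writes it to a temporary CSV file (for instance, to seed a test database):

```r
library(charlatan)

# Request two of the columns shown above, with 5 rows
fake <- ch_generate("name", "job", n = 5)

dim(fake)    # number of rows and columns
names(fake)  # column names for the requested fields

# Write to a temporary file; base R's write.csv accepts any data frame
path <- tempfile(fileext = ".csv")
write.csv(fake, path, row.names = FALSE)
```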

person name

ch_name()
#> [1] "Susannah Batz-Mraz"
ch_name(10)
#>  [1] "Deondre Jerde"           "Harriett Goodwin"       
#>  [3] "Kaitlynn Dooley"         "Dr. Alannah Botsford"   
#>  [5] "Koby O'Hara-Goldner"     "Carlene Osinski"        
#>  [7] "Miss Alyson Ankunding"   "Dr. Sommer Schroeder MD"
#>  [9] "Sienna Cummerata"        "Ms. Celena Hermiston"

phone number

ch_phone_number()
#> [1] "872-976-6093x382"
ch_phone_number(10)
#>  [1] "+49(9)6373771353"   "1-055-870-8362x208" "+11(9)5635135534"  
#>  [4] "405.525.0245x20351" "(217)908-6461x9385" "(256)144-8907x242" 
#>  [7] "345-963-8208"       "01949102189"        "368.299.7724x532"  
#> [10] "193-445-5487x40228"

job

ch_job()
#> [1] "Warden/ranger"
ch_job(10)
#>  [1] "Engineer, biomedical"          "Librarian, public"            
#>  [3] "Designer, television/film set" "Orthoptist"                   
#>  [5] "Actuary"                       "Television floor manager"     
#>  [7] "Surgeon"                       "Programmer, applications"     
#>  [9] "Social researcher"             "Engineer, electrical"

credit cards

ch_credit_card_provider()
#> [1] "Voyager"
ch_credit_card_provider(n = 4)
#> [1] "American Express" "Mastercard"       "Voyager"          "VISA 16 digit"
ch_credit_card_number()
#> [1] "4149758795998363"
ch_credit_card_number(n = 10)
#>  [1] "3096280733755669659" "3528862994544207088" "55375315925243502"  
#>  [4] "675963691601916"     "4387854850341820"    "6011460885189949222"
#>  [7] "4755578842679336"    "210015419106563146"  "55222480023215177"  
#> [10] "4247284207922"
ch_credit_card_security_code()
#> [1] "301"
ch_credit_card_security_code(10)
#>  [1] "386" "978" "998" "267" "238" "036" "965" "356" "502" "786"

Usage in the wild

Contributors

similar art

Meta

  • Please report any issues or bugs.
  • License: MIT
  • Get citation information for charlatan in R by running citation(package = 'charlatan')
  • Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.