sweary

Sweary is an R package containing a database of swear words in several languages, hand-picked by native speakers.

Installation

The development version of this package can be installed using devtools:

devtools::install_github("pdrhlik/sweary")

Current swear word lists

Language	Language code	Number of swear words
Czech	cs	57
English	en	39
Polish	pl	41
Total	3 langs	137

Examples

All languages are stored in a swear_words data frame.

library(sweary)
head(swear_words)

## # A tibble: 6 x 2
##   word     language
##   <chr>    <chr>   
## 1 buzerant cs      
## 2 čubka    cs      
## 3 čurák    cs      
## 4 čůrák    cs      
## 5 debil    cs      
## 6 dement   cs

You can also extract only the language you are interested in.

en_swear_words <- get_swearwords("en")
head(en_swear_words)

## # A tibble: 6 x 2
##   word     language
##   <chr>    <chr>   
## 1 arse     en      
## 2 arsehole en      
## 3 ass      en      
## 4 asshole  en      
## 5 bitch    en      
## 6 bollocks en
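
The output above suggests that get_swearwords("en") returns the rows of swear_words whose language column equals "en". Below is a minimal base R sketch of that kind of filtering. It is an assumption about the behaviour, not the package's actual implementation. Both toy_swear_words and filter_language() are hypothetical names, and the stand-in data frame (a few words copied from the outputs above) is there only so the snippet runs without the package installed.

```r
# Hypothetical stand-in for the package's swear_words data frame,
# built from a few rows shown in the printed outputs above.
toy_swear_words <- data.frame(
  word = c("buzerant", "debil", "arse", "bitch"),
  language = c("cs", "cs", "en", "en"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only the rows for one language code.
filter_language <- function(words, lang_code) {
  words[words$language == lang_code, , drop = FALSE]
}

en_only <- filter_language(toy_swear_words, "en")
print(en_only$word)
```

With the package loaded, the same subsetting expression should also work on swear_words directly, for example swear_words[swear_words$language == "pl", ].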

Add (modify) a language

If you are not comfortable with git and pull requests, follow only steps 1-3. After you create the file, email it to me with the subject New sweary language: {LANG_CODE}. You will be acknowledged in the README once the changes are approved.

1. Choose a new language. Find its two-letter ISO 639-1 code.
2. Create a language file. Place the file in data-raw/swear-word-lists/{LANG_CODE}. Example for English: data-raw/swear-word-lists/en.
3. Fill in the file with swear words. The following rules must apply:
   - One swear word per line.
   - All words must be lowercase.
   - The list must only contain unique words.
   - The list must be sorted alphabetically.
4. Make sure all the tests pass. You can do that with a development function called build_sweary(). It becomes available after you git clone the repository and call devtools::load_all() (or press Ctrl+Shift+L in RStudio). Learn more about calling this function with ?build_sweary.
5. Update README.Rmd. Update the langs data frame in README.Rmd by adding a new row to it. More precise instructions are in the raw file itself.
6. Create a pull request.
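
Before running the package tests, the rules for a language file can be checked by hand. The sketch below is one possible way to do that in base R. check_word_list() is a hypothetical helper and not part of sweary, and the package's own tests may check the rules differently. Note that the alphabetical check depends on the collation order of your R session's locale.

```r
# Hypothetical helper (not part of sweary): checks a character vector,
# one element per line of a language file, against the list rules.
check_word_list <- function(words) {
  c(
    no_blank_lines = all(nzchar(trimws(words))),   # every line holds a word
    lowercase      = all(words == tolower(words)), # no uppercase letters
    unique         = anyDuplicated(words) == 0,    # no repeated words
    sorted         = !is.unsorted(words)           # locale-dependent order
  )
}

good <- c("arse", "arsehole", "ass")
bad  <- c("bitch", "Ass", "ass", "ass")

print(check_word_list(good))
print(check_word_list(bad))
```

For a real file, the input would come from something like readLines("data-raw/swear-word-lists/en", encoding = "UTF-8"). Every entry in the result should be TRUE before you move on to the tests.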

Origin

The idea first appeared after the South Park text analysis lightning talk at the Why R? 2018 conference in Wrocław. All the contributors will be acknowledged as the work progresses.
