clevercloudr

The goal of clevercloudr is to serve as a solution for creating meaningful word clouds. To create a meaningful word cloud, data scientists typically take multiple steps to clean the data, such as removing stopwords, removing punctuation and digits, converting letters to lower case, and stemming. This package helps data scientists clean text easily following these common practices and create more meaningful word clouds.

Functions

There are 4 functions in this package:

CleverClean A preprocessor that converts all letters to lower case and removes punctuation and digits.
CleverStemmer A preprocessor that performs stemming on the text.
CleverStopwords A comprehensive list of English stopwords that allows adding customized words.
CleverWordCloud A function that generates a visually appealing word cloud with a customized shape and stopwords.
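To make the cleaning step concrete, here is a minimal base-R sketch of lower-casing and stripping punctuation and digits. It is an illustration only: `sketch_clean` is a hypothetical helper, not part of clevercloudr, and it does not claim to reproduce how CleverClean is implemented.

```r
# Hypothetical helper for illustration; NOT the package's CleverClean().
sketch_clean <- function(words) {
  words <- tolower(unlist(words))                   # flatten a list and lower-case
  words <- gsub("[[:punct:][:digit:]]", "", words)  # drop punctuation and digits
  words[nzchar(words)]                              # discard strings that became empty
}

cleaned <- sketch_clean(list("Grounds!!!", "feet6", "feed$", "123"))
print(cleaned)
```

Dropping empty strings at the end is a design choice of this sketch, so that tokens made only of digits or punctuation do not show up as blank words later.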

Fitting into the R ecosystem

Packages that have similar functions:

wordcloud2: a word cloud generator for data visualization

What we do differently:

Our aim is to improve the pre-processing of strings before creating a word cloud, in order to make it more user-specific and efficient.
The package wordcloud does not allow customization of stopwords, but our package gives users the opportunity to add stopwords that cater to their analysis.
We follow standard data-cleaning practices to make the text suitable for a meaningful word cloud: we collapse duplicate variants of a word by setting all letters to lower case, removing punctuation and digits, and applying a stemmer.

Installation

You can install the development version of clevercloudr from GitHub with:

# install.packages("devtools")
devtools::install_github("UBC-MDS/clevercloudr")

Examples

Some quick examples of CleverClean(), CleverStemmer(), CleverStopwords() and CleverWordCloud().

library(clevercloudr)
text <- list("grounds!!!", "feet6", "running123", "feeding", "feed", "feed$", "grounding", "feet", "happiness")

# Clean raw text data 
clean_text <- CleverClean(text)
clean_text
#> [1] "grounds"   "feet"      "running"   "feeding"   "feed"      "feed"     
#> [7] "grounding" "feet"      "happiness"

# Perform stemming on the text 
stem_text <- CleverStemmer(clean_text)
stem_text
#> [1] "ground" "feet"   "run"    "feed"   "feed"   "feed"   "ground" "feet"  
#> [9] "happi"

# Customize the stopwords by adding stopwords specific to each project / task 
new_words <- list("happi")
new_stopwords <- CleverStopwords(new_words)

# Generate the word cloud
CleverWordCloud(stem_text, new_stopwords)
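A word cloud sizes each word by how often it occurs once stopwords are removed. The base-R sketch below shows that filter-and-count idea on the stemmed words from the example. It is a conceptual illustration under stated assumptions: `custom_stopwords` is a hypothetical stand-in for the stopword list, and this is not necessarily how CleverWordCloud works internally.

```r
# Stemmed words copied from the example output above
stem_text <- c("ground", "feet", "run", "feed", "feed",
               "feed", "ground", "feet", "happi")

# Hypothetical stand-in for a custom stopword list (assumption)
custom_stopwords <- c("happi")

# Keep only words that are not stopwords, then count occurrences
kept <- stem_text[!stem_text %in% custom_stopwords]
freq <- sort(table(kept), decreasing = TRUE)
print(freq)
```

The most frequent remaining words would be the ones drawn largest in the cloud.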

Contributing

Contributors of the project: Amelia Tang, Arushi Ahuja, Victor Francis, Adrianne Leung

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

clevercloudr was created by Amelia Tang, Arushi Ahuja, Victor Francis, Adrianne Leung. It is licensed under the terms of the MIT license.
