R Package for Analytics and Machine Learning
lares is a library designed to automate, improve, and speed up everyday Analysis and Machine Learning tasks. With a wide variety of function families covering Machine Learning, Data Wrangling, EDA, and Scrapers,
lares helps the analyst or data scientist get quick, reproducible, and robust results, without the need for repetitive coding or extensive programming skills.
You are most welcome to install, use, and/or comment on any of the code and functionalities. If you are colour blind as well, I'm glad to share my colour palettes! Feel free to contact me via LinkedIn, and please do let me know where you got my contact from.
    # install.packages('devtools')
    devtools::install_github("laresbernardo/lares")

    # User friendly update
    lares::updateLares()
CRAN NOTE: I currently have no plans to submit the library to CRAN, even though it passes all its quality tests (and I'm a huge fan). I think
lares is more of an everyday useful package than a "specialized for a specific task" library. It has many useful and varied kinds of functions, from NLP to querying APIs to plotting Machine Learning results to stock market and portfolio reports. I gladly share my code with the community and encourage you to use/comment/share it, but I strongly think that CRAN is not aiming for this kind of library in its repertoire.
See the library in action!
DataScience+: Visualizations for Classification Models Results
DataScience+: Visualizations for Regression Models Results
DataScience+: AutoML and DALEX for Dataset Understanding
DataScience+: Portfolio's Performance and Reporting
DataScience+: Plot Timelines with Gantt Charts
AutoML Simplified Map from
Insights While Understanding
To get insights and value out of your dataset, you first need to understand its structure, data types, empty values, and interactions between variables.
corr_cross() and freqs() are here to give you just that! They show a wide perspective of your dataset's content, correlations, and frequencies. Additionally, with the
missingness() function to detect all missing values and
df_str() to break down your data frame's structure, you will be ready to squeeze valuable insights out of your data.
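To make the idea of a missing-values and structure overview concrete, here is a minimal base-R sketch. It is not the lares implementation: the toy data frame and the names missing_summary and col_types are assumptions made up for illustration.

```r
# Toy data frame with a few missing values (made up for illustration)
df <- data.frame(
  age  = c(23, NA, 31, 45, NA),
  city = c("Lima", "Quito", NA, "Lima", "Bogota"),
  paid = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Count missing values per column and express them as a percentage of rows
na_count <- colSums(is.na(df))
missing_summary <- data.frame(
  variable    = names(na_count),
  missing     = as.integer(na_count),
  missing_pct = round(100 * as.numeric(na_count) / nrow(df), 1),
  stringsAsFactors = FALSE
)

# Keep only columns that have at least one NA, most affected first
missing_summary <- missing_summary[missing_summary$missing > 0, ]
missing_summary <- missing_summary[order(-missing_summary$missing), ]
print(missing_summary)

# Column types at a glance (first class of each column)
col_types <- vapply(df, function(x) class(x)[1], character(1))
print(col_types)
```

With lares installed, the functions named above would be called directly on the data frame, e.g. missingness(df) and df_str(df); check their help pages for the exact output.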
Kings of Data Mining
My favourite and most used functions are
freqs(), distr(), and corr_var(). In this RMarkdown you can see them in action. Basically, they group and count values within variables, show distributions of one variable vs another one (numerical or categorical), and calculate/plot correlations of one variable vs all others, no matter what type of data you insert.
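The two core ideas described above, grouping and counting values and correlating one variable against all others, can be sketched in base R as follows. This is only an illustration under assumptions: the toy data and the names freq_table, numeric_df, and cors are hypothetical, and the package's own functions do considerably more (plots, more data types).

```r
# Toy data (made up for illustration)
df <- data.frame(
  plan   = c("basic", "pro", "basic", "basic", "pro", "free"),
  visits = c(3, 10, 4, 2, 12, 1),
  spend  = c(30, 120, 35, 25, 140, 0),
  stringsAsFactors = FALSE
)

# Group and count: frequency table sorted by count, with percentages
counts <- sort(table(df$plan), decreasing = TRUE)
freq_table <- data.frame(
  plan = names(counts),
  n    = as.integer(counts),
  p    = round(100 * as.integer(counts) / sum(counts), 2),
  stringsAsFactors = FALSE
)
print(freq_table)

# One variable vs all others: expand categoricals into 0/1 columns first,
# then correlate every remaining column with the target
numeric_df <- as.data.frame(model.matrix(~ . - 1, data = df))
target <- "spend"
others <- setdiff(names(numeric_df), target)
cors <- vapply(
  others,
  function(v) cor(numeric_df[[v]], numeric_df[[target]]),
  numeric(1)
)

# Strongest relationships (by absolute value) first
cors <- cors[order(-abs(cors))]
print(round(cors, 3))
```

Expanding the categorical column before correlating is what lets a single call compare numeric and categorical variables on the same scale.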
If there is space for one more, I would add
ohse() (One Hot Smart Encoding), which has made my life much easier and my work much more valuable. It converts a whole data frame into numerical values by making dummy variables (categoricals turned into new columns with 1s and 0s, ordered by frequency, with the less frequent values grouped into a single column) and by turning dates into new features (such as month, year, week of the year, minutes if time is present, holidays given a country, currency exchange rates, etc.).
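As a rough illustration of the encoding idea (dummy columns ordered by frequency, rare values lumped together, and dates expanded into features), here is a base-R sketch. It is not how ohse() is implemented: the helper one_hot_top(), its keep argument, and the toy data are hypothetical names introduced for this example.

```r
# Toy data (made up for illustration)
df <- data.frame(
  city = c("Lima", "Lima", "Quito", "Lima", "Quito", "Bogota", "Cusco"),
  date = as.Date(c("2020-01-15", "2020-02-20", "2020-02-21",
                   "2020-03-01", "2020-03-15", "2020-04-02", "2020-04-30")),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the `keep` most frequent values as their own
# 0/1 columns (most frequent first) and lump the rest into "other"
one_hot_top <- function(x, prefix, keep = 2) {
  top <- head(names(sort(table(x), decreasing = TRUE)), keep)
  lumped <- ifelse(x %in% top, x, "other")
  levels_out <- c(top, if (any(lumped == "other")) "other")
  out <- lapply(levels_out, function(l) as.integer(lumped == l))
  names(out) <- paste(prefix, levels_out, sep = "_")
  as.data.frame(out)
}

# Combine the dummy columns with simple date-derived features
encoded <- cbind(
  one_hot_top(df$city, "city", keep = 2),
  date_year  = as.integer(format(df$date, "%Y")),
  date_month = as.integer(format(df$date, "%m")),
  date_week  = as.integer(format(df$date, "%U"))
)
print(encoded)
```

Lumping rare values into one column keeps the number of new columns bounded, which matters when a categorical variable has many distinct values.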
What else is there?
You can type
lares:: in RStudio and you will get a pop-up with all the functions that are currently available within the package. You might also want to check the whole documentation by running
help(package = "lares") locally, or by browsing it on the rdrr.io or rdocumentation.org websites. Remember to check the families and similar functions in the See Also sections too.
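Outside RStudio, the same list of functions can be obtained programmatically. The sketch below uses only base R; list_exports() is a hypothetical helper written for this example, and it returns an empty vector when the package is not installed.

```r
# Hypothetical helper: sorted names exported by an installed package,
# or an empty character vector if the package is not available
list_exports <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(character(0))
  }
  sort(getNamespaceExports(pkg))
}

# Works for any installed package; shown here for lares
lares_functions <- list_exports("lares")
print(head(lares_functions, 20))

# Open the package index in an interactive session:
# help(package = "lares")
```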
Getting further help
If you need help with any of the functions, use the
? operator (e.g.
?lares::function) and the Help tab will display a short explanation of the function and its parameters.