Standard Evaluation Interfaces for Common Dplyr Verbs
Various examples for different articles
Wrap R Functions for Debugging and Ease of Use
Fluid Use of Big Data in R
‘vtreat’ is an R data.frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner.
Concise formatting of significances in R (GPL3 license).
Viewable pages from WinVector LLC, available at: http://winvector.github.io
Pre-packaged plots in R
R wrappers for tidyr::gather() and tidyr::spread(), available from CRAN: https://CRAN.R-project.org/package=cdata
All material for "Modeling big data with R, sparklyr, and Apache Spark" Strata Hadoop 2017.
Example code for Lesson on Response Campaign planning
Example automatic differentiation code in Scala
Example R scripts and data for "Practical Data Science with R" by Nina Zumel and John Mount (Manning Publications)
Demonstration of parametric bootstrap to find k for kmeans
Quasi-observation-based survival package for R.
Support materials for Win-Vector blog article
Support materials for WinVector talk
Slides and code for "Validating Models in R" Strata 2016 RDay http://conferences.oreilly.com/strata/hadoop-big-data-ca/public/schedule/detail/48053
Materials for workshop on preparing data for modeling and analysis using R
Some examples of measuring classifier performance in R
Cross-validated PCA/PCR demonstration based on the work: http://www.win-vector.com/blog/2016/05/pcr_part2_yaware/
Iterate through database tables (via JDBC) and TSV (tab-separated values) or CSV (comma-separated values) files to load or dump data.
Java-based XML tool to help check Manning Agile Author XML for cross-reference problems (GPL3+ license)
Example library to accumulate data frame rows in R
Example code for articles on sessionizing data.
Code and data for "The Geometry of Classifiers"
Experimental logistic regression code supporting multiple result categories, many levels of categorical modeling variables, good optimization, L2 regularization and more.
Experimental pure Java revised simplex linear program solver (Apache 2.0 license)
Java code to build synthetic data sets that match reported summary totals. Helps explore the possible range of variation.
Trivial demonstration of a diverging Newton-Raphson step when solving a logistic regression