Helpful Java Spark stuff

License

MIT (LICENSE, LICENSE_CODE)

dafrenchyman/spark

spark

Helpful Java Spark library.

Pipelines:

  • ConcatColumns - When you need to concat a bunch of columns together
  • DropColumns - Pipelines do get messy
  • ExplodeVector - All those models sure like results in vectors
  • FloorCeiling - For putting caps on your variables
  • MultiStringIndexer - For the lazy
  • TopCategories - When you want to limit the categorical variables
  • XGBoostEstimator - Just using the Scala one, but added all the missing "sets" and "gets"
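
To make two of the stages above concrete, here is a minimal plain-Java sketch of the ideas behind FloorCeiling (capping a value to a range) and TopCategories (keeping only the most frequent categories). It does not use Spark and is not this library's API: the class name `PipelineIdeas`, the method names, and the `"other"` replacement label are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the library's classes or signatures.
class PipelineIdeas {

    // Cap a value so it never falls below `floor` or rises above `ceiling`.
    static double floorCeiling(double value, double floor, double ceiling) {
        return Math.max(floor, Math.min(ceiling, value));
    }

    // Keep the n most frequent categories and replace every other value
    // with `otherLabel`. Ties on frequency are broken alphabetically.
    static List<String> topCategories(List<String> values, int n, String otherLabel) {
        // Count how often each category occurs.
        Map<String, Long> counts = new HashMap<>();
        for (String v : values) {
            counts.merge(v, 1L, Long::sum);
        }

        // Sort categories by descending count, then by name.
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        // Remember the first n categories.
        Set<String> keep = new HashSet<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            keep.add(entries.get(i).getKey());
        }

        // Rewrite the column, collapsing everything else into otherLabel.
        List<String> result = new ArrayList<>();
        for (String v : values) {
            result.add(keep.contains(v) ? v : otherLabel);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(floorCeiling(150.0, 0.0, 100.0));
        System.out.println(topCategories(
            Arrays.asList("a", "a", "b", "c", "a", "b"), 2, "other"));
    }
}
```

In a real Spark pipeline the same logic would be applied to a Dataset column rather than to a Java list; the sketch only shows what each stage is meant to compute.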

Params:

  • MapParam - When you just need to use a Map

Misc:

  • FindContinuousColumns - Dataset helper
  • FindCategoricalColumns - Dataset helper
  • FindNullColumns - Annoyingly, I have some datasets where a column can be NULL in every row.
  • SmartUnion - For when your columns don't line up and the datasets don't always have the same ones.
  • JavaToScala helpers
  • SparkUnitTest helpers
  • SparkSettings sysouts
  • ...
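
As a rough picture of what a union over mismatched columns has to do (the SmartUnion case above), the plain-Java sketch below models each row as a map from column name to value and fills any column a row lacks with `null`. This is an assumption about the general technique, not the library's implementation: `SmartUnionSketch` and `smartUnion` are hypothetical names, and no Spark types are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; rows are maps instead of Spark Rows.
class SmartUnionSketch {

    // Union two row lists whose column sets may differ.
    // Every output row has the same columns; missing values become null.
    static List<Map<String, Object>> smartUnion(
            List<Map<String, Object>> left,
            List<Map<String, Object>> right) {

        // Collect every column name seen on either side, in first-seen order.
        Set<String> columns = new LinkedHashSet<>();
        for (Map<String, Object> row : left) {
            columns.addAll(row.keySet());
        }
        for (Map<String, Object> row : right) {
            columns.addAll(row.keySet());
        }

        // Re-emit each row against the full column list.
        List<Map<String, Object>> out = new ArrayList<>();
        for (List<Map<String, Object>> side : Arrays.asList(left, right)) {
            for (Map<String, Object> row : side) {
                Map<String, Object> aligned = new LinkedHashMap<>();
                for (String column : columns) {
                    aligned.put(column, row.get(column)); // null when absent
                }
                out.add(aligned);
            }
        }
        return out;
    }
}
```

Aligning both sides to one shared column list first is what lets the final union be a simple row-by-row append.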
