Big update prepping for Book and Tarball release. #8

Closed
wants to merge 182 commits into from

Conversation

jpatanooga
Owner

Working CLI
Fixed issues where the CLI would spit out error messages; it now prints usage information instead
New prefix ("canova.") for keys in the properties file, to allow for mixed-workflow property files
Full CLI support for Text, CSV, Image, and Custom InputFormats
Lots of new unit tests
New set of examples in the repo for out-of-the-box things to try
New TF-IDF vectorization pipeline: new tokenization, new TF-IDF formula, more unit tests
Normalization for all pipelines
Shuffle mode for the output of all pipelines
Added a basic set of transforms for post-processing vectors after the InputFormat has pulled the data
New TextInputFormat for the general text TF-IDF pipeline to fix some basic issues
Fixed an issue in SVMLightRecordWriter where zero values were being written out
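
For context on the SVMLightRecordWriter fix: SVMLight is a sparse text format, so a line carries the label followed by `index:value` pairs for the non-zero features only, with indices starting at 1. The sketch below illustrates that convention in isolation. It is not the code from this PR; the class and method names (`SvmLightLineSketch`, `toSvmLightLine`) are hypothetical and the number formatting is an assumption.

```java
import java.util.StringJoiner;

// Hypothetical illustration only; not Canova's SVMLightRecordWriter.
public class SvmLightLineSketch {

    /**
     * Formats one record as an SVMLight line: the label, then
     * space-separated index:value pairs for non-zero features.
     * Feature indices are 1-based, as the SVMLight format expects.
     */
    public static String toSvmLightLine(double label, double[] features) {
        StringJoiner line = new StringJoiner(" ");
        line.add(String.valueOf(label));
        for (int i = 0; i < features.length; i++) {
            // Zero-valued features are implicit in a sparse format, so skip them.
            if (features[i] == 0.0) {
                continue;
            }
            line.add((i + 1) + ":" + features[i]);
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Features at positions 1 and 3 are zero and are left out of the line.
        System.out.println(toSvmLightLine(1.0, new double[] {0.0, 2.5, 0.0, 1.0}));
    }
}
```

Writing zeros out explicitly still produces a parseable file, but it inflates the output and defeats the point of a sparse format, which is presumably why it was treated as a bug here.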

Comments

  • there are places this thing could be cleaner, but it works well enough for an alpha-RC for now
  • further refactoring is premature because we don't have the timeseries/audio/video pipelines done yet (and the parallelization work is not done yet either). It would be a pointless exercise at this point.

TODO for Next PR:

  1. Finish SVMLightInputFormat / RecordReader (parallelization / hdfs / split) changes
  2. Finish Maven Assembly that generates the release tarball

smarthi and others added 29 commits May 10, 2015 23:35
Added standard components to readme for installation, description and…
…ency

static initialization of ImageIO registry
jpatanooga closed this Jun 28, 2015
7 participants