Van-Duyet Le (me@duyetdev.com)
More information coming soon.
Install the PredictionIO server:
$ bash -c "$(curl -s https://install.prediction.io/install.sh)"
It takes 6 simple steps to deploy and use an engine:
- Install and Run PredictionIO
- Create an Engine by downloading an Engine Template
- Generate an App ID and Access Key, if you are integrating PredictionIO with a new application
- Collect Data
- Deploy the Engine as a Service
- Use the Engine
Install Scala Parallel Text Classification:
git clone https://github.com/duyetdev/scala-parallel-textclassification islab-scala-parallel-textclassification
cd islab-scala-parallel-textclassification
Start Prediction Server
pio-start-all
pio build: Build the engine at the current directory.
pio train: Kick off a training using an engine.
pio deploy: Deploy an engine as an engine server. If no instance ID is specified, it will deploy the latest instance.
Import the data
pio import --appid <app_id> --input data/vnexpress-1000-import-able.json
Build, Train and deploy server
pio build
pio train
pio deploy
List server
pio app list # list all apps
- Event server: http://localhost:7070/events.json?accessKey=xxxxxx
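As a rough illustration of how a client might talk to the two servers, the sketch below builds (but does not send) an HTTP request for the Event Server URL shown above and another for a deployed engine. The event name, entity type, property names, the `text` query field, and the engine port 8000 are assumptions made for illustration; check them against the engine's `engine.json` and data source before relying on them.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_event_request(access_key, entity_id, text, label,
                        base_url="http://localhost:7070"):
    """Build a POST request that would record one labelled document."""
    # The event name, entity type, and property names below are
    # hypothetical; the engine's data source decides what it reads.
    event = {
        "event": "documents",
        "entityType": "source",
        "entityId": entity_id,
        "properties": {"text": text, "label": label},
    }
    url = f"{base_url}/events.json?{urlencode({'accessKey': access_key})}"
    return urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_query_request(text, base_url="http://localhost:8000"):
    """Build a POST request that would ask a deployed engine for a label."""
    # Port 8000 and the "text" field are assumptions about this engine.
    return urllib.request.Request(
        f"{base_url}/queries.json",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the servers running, passing either request to `urllib.request.urlopen` would send it; nothing is transmitted until then.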
Look at the following tutorial for a Quick Start guide and implementation details.
Modified PreparedData to use the MLlib hashing and tf-idf implementations.
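For readers unfamiliar with the hashing and tf-idf step, here is a minimal pure-Python sketch of the idea. It is not the engine's Scala code: `zlib.crc32` stands in for MLlib's internal hash function, the function names are hypothetical, and the smoothed IDF formula log((m + 1) / (df + 1)) follows the one documented for MLlib.

```python
import math
import zlib
from collections import Counter


def hashing_tf(tokens, num_features=1 << 20):
    """Map tokens to a sparse term-frequency vector {index: count}."""
    # crc32 is only a stable stand-in for MLlib's own hash function.
    counts = Counter(
        zlib.crc32(t.encode("utf-8")) % num_features for t in tokens
    )
    return dict(counts)


def fit_idf(tf_vectors):
    """Compute smoothed IDF per index: log((m + 1) / (df + 1))."""
    m = len(tf_vectors)
    # Document frequency: number of vectors in which each index appears.
    df = Counter(i for vec in tf_vectors for i in vec)
    return {i: math.log((m + 1) / (d + 1)) for i, d in df.items()}


def tf_idf(tf_vector, idf):
    """Scale each term frequency by the IDF weight of its index."""
    return {i: tf * idf.get(i, 0.0) for i, tf in tf_vector.items()}
```

A term that occurs in every document gets an IDF of zero, so it contributes nothing to the final vector, which is the intended effect of the weighting.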
Fixed the dot product implementation in the predict methods so that it works with the batch predict method used for evaluation.
Included three different data sets: e-mail spam, 20 Newsgroups, and the Rotten Tomatoes sentiment analysis set. Includes a Multinomial Logistic Regression algorithm for text classification.
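To make the multinomial logistic regression step more concrete, the following sketch shows how a prediction can be computed from per-class weight vectors: one dot product per class, followed by a softmax. It is a simplified illustration, not the engine's implementation; the function names and the sparse `{index: value}` feature format are assumptions.

```python
import math


def dot(weights, features):
    """Dot product of a dense weight list and a sparse {index: value} vector."""
    return sum(weights[i] * v for i, v in features.items())


def predict_proba(class_weights, intercepts, features):
    """Return one probability per class using a softmax over linear scores."""
    scores = [dot(w, features) + b for w, b in zip(class_weights, intercepts)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_batch(class_weights, intercepts, batch):
    """Return the index of the most probable class for each feature vector."""
    labels = []
    for features in batch:
        probs = predict_proba(class_weights, intercepts, features)
        labels.append(max(range(len(probs)), key=probs.__getitem__))
    return labels
```

Writing the single-vector prediction first and then mapping it over a batch keeps both paths on the same dot product code, which is one way to avoid the two paths drifting apart.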
Fixed an import script bug occurring with Python 2.
Changed data import Python script to pull straight from the 20 newsgroups page.