Persistable SVC Model #27

Closed
LeoHSRodrigues opened this issue Jun 11, 2019 · 6 comments

@LeoHSRodrigues

Hi,

I'm currently trying to classify some text using the SVC classifier. I can predict using the original object that I trained, but when I try to load the saved model I get this error:

PHP Fatal error: Uncaught svmexception: No model available to classify with in /var/www/html/rubix/vendor/rubix/ml/src/Classifiers/SVC.php:194
Stack trace:
#0 /var/www/html/rubix/vendor/rubix/ml/src/Classifiers/SVC.php(194): svmmodel->predict(Array)
#1 /var/www/html/rubix/vendor/rubix/ml/src/Pipeline.php(178): Rubix\ML\Classifiers\SVC->predict(Object(Rubix\ML\Datasets\Unlabeled))
#2 /var/www/html/rubix/vendor/rubix/ml/src/PersistentModel.php(129): Rubix\ML\Pipeline->predict(Object(Rubix\ML\Datasets\Unlabeled))
#3 /var/www/html/rubix/teste2.php(26): Rubix\ML\PersistentModel->predict(Object(Rubix\ML\Datasets\Unlabeled))
#4 {main}
thrown in /var/www/html/rubix/vendor/rubix/ml/src/Classifiers/SVC.php on line 194

Is there something I missed?

@andrewdalpino andrewdalpino self-assigned this Jun 12, 2019
@andrewdalpino andrewdalpino added the bug Something isn't working label Jun 12, 2019
@andrewdalpino andrewdalpino added this to Backlog in Roadmap via automation Jun 12, 2019
@andrewdalpino
Member

Hi @leo23fla, thank you for the great bug report.

Rubix uses the svm (libsvm) extension to power the SVC, SVR, and One Class SVM estimators. It's entirely possible that the 'model' object created by the extension is not being preserved through serialization.

I have a couple of questions before I start working on this bug.

Are you using the Native Serializer or the Binary Serializer?

Also, are you using the Persistent Model meta-estimator as a wrapper around SVC, or are you using a Persister object directly?

@LeoHSRodrigues
Author

Hi, I'm currently using the Native Serializer with the Persistent Model wrapper.

The training code:

const MODEL_FILE = 'SVC.model';
const PROGRESS_FILE = 'progress.csv';

set_time_limit(0);
ini_set('memory_limit', '-1');

$training = Labeled::build($samples, $labels);

$estimator = new PersistentModel(
    new Pipeline([
        new HTMLStripper(),
        new TextNormalizer(),
        new WordCountVectorizer(1000, 3, new NGram(1, 3)),
        new TfIdfTransformer(),
        new ZScaleStandardizer(),
    ], new SVC(1.0, new Linear(), true, 1e-3, 100.)),
    new Filesystem(MODEL_FILE, true, new Native())
);

$estimator->setLogger(new Screen('sentiment'));

$estimator->train($training);

$estimator->save();

And my model-loading code:

const MODEL_FILE = 'SVC.model';

set_time_limit(0);
ini_set('memory_limit', '-1');

$estimator = PersistentModel::load(new Filesystem(MODEL_FILE));

$result = $estimator->predict($dataset);

var_dump($result);

@andrewdalpino
Member

andrewdalpino commented Jun 13, 2019

Thanks @leo23fla.

I was able to reproduce the error.

It looks like the svm extension has a separate API for saving the model to disk and loading it back.

I will reach out to the author, Ian, to see if there is a way to make the Rubix persistable subsystem work with the way the libsvm extension saves and loads the model.

There may be some things we can do with magic methods; however, I will have to check that they also work with the RedisDB persister, not just with disk.

In the worst case, libsvm-based learners won't be able to implement the Persistable interface. Instead, they would have separate save() and load() methods that just take a file path argument and use the svm extension's own mechanism under the hood.

I'll keep you posted. Feel free to respond with any of your thoughts.
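
For readers unfamiliar with the extension's own persistence API mentioned above, here is a minimal sketch of a file-path-based save and load round trip that calls the PECL svm extension directly, without Rubix. It assumes the extension is installed. The helper names trainAndSave() and loadAndPredict(), the toy data, and the file name are hypothetical and chosen only for illustration; this is not how Rubix implements it.

```php
<?php

declare(strict_types=1);

/**
 * Train a linear C-SVC with the svm extension and write the model to $path.
 * Each row of $problem is [label, featureIndex => value, ...].
 */
function trainAndSave(array $problem, string $path): void
{
    $svm = new SVM();
    $svm->setOptions([
        SVM::OPT_TYPE => SVM::C_SVC,
        SVM::OPT_KERNEL_TYPE => SVM::KERNEL_LINEAR,
    ]);

    $model = $svm->train($problem);

    // SVMModel::save() writes the model using the extension's own file format.
    if (!$model->save($path)) {
        throw new RuntimeException("Could not save model to $path");
    }
}

/**
 * Load a model file written by trainAndSave() and classify one sample.
 * $sample maps feature index => value.
 */
function loadAndPredict(string $path, array $sample): float
{
    // Passing a file name to the constructor loads the saved model.
    $model = new SVMModel($path);

    return $model->predict($sample);
}

// Usage, guarded so the script does nothing when the extension is missing.
if (extension_loaded('svm')) {
    $path = sys_get_temp_dir() . '/example-svc.model'; // hypothetical file name

    trainAndSave([
        [1, 1 => 1.0, 2 => 1.0],
        [-1, 1 => -1.0, 2 => -1.0],
    ], $path);

    var_dump(loadAndPredict($path, [1 => 0.9, 2 => 0.8]));
}
```

Because the model only travels through a file path here, this approach works across processes but does not plug into a persister that stores a serialized string, which is the compatibility problem discussed in this thread.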

@LeoHSRodrigues
Author

OK. I hope everything goes well.

@andrewdalpino
Member

andrewdalpino commented Jun 13, 2019

In the interim, I'd recommend trying a neural-network-based learner such as Multi Layer Perceptron or Softmax Classifier on your problem.

I've persisted and loaded those many times, and they work without fault.

@andrewdalpino
Member

andrewdalpino commented Jun 18, 2019

@leo23fla

As a preliminary fix for this bug, we've gone ahead and dropped the Persistable contract from all SVM-based learners.

Instead, we've put a save() and a load() method on each class, which let you save the model data to a file and subsequently load it in another process.

Documentation can be found here: https://github.com/RubixML/RubixML#svc

These operations are independent of the Rubix Persistable subsystem, because the php-svm extension is not compatible with it (the extension does not implement serialization of the model data itself).

I've reached out to the author to coordinate a fix for the issue; however, I have not had a response yet.

If warranted, I will open a separate issue to address getting the SVM-based learners back on the Persistable system.

One way this could work is to use the __sleep() and __wakeup() magic methods: save the model params to a file, read the file back, and store its contents in the learner prior to serialization. However, this sounds janky to me and I will probably not implement it (but it's an example of what can be done).
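
To make the __sleep() and __wakeup() idea concrete, here is a minimal, self-contained sketch. It is an illustration of the technique only, not Rubix code: FileOnlyModel is a hypothetical stand-in for the extension's model object (it can only be persisted through its own file-based save() and load()), and HypotheticalLearner is an invented wrapper name.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the extension's model: it can only be persisted
// through its own file-based save()/load() API.
final class FileOnlyModel
{
    private array $weights;

    public function __construct(array $weights = [])
    {
        $this->weights = $weights;
    }

    public function save(string $path): bool
    {
        return file_put_contents($path, json_encode($this->weights)) !== false;
    }

    public function load(string $path): bool
    {
        $data = json_decode((string) file_get_contents($path), true);
        if (!is_array($data)) {
            return false;
        }
        $this->weights = $data;
        return true;
    }

    // Toy linear decision rule: sign of the weighted sum.
    public function predict(array $sample): float
    {
        $sum = 0.0;
        foreach ($sample as $i => $value) {
            $sum += ($this->weights[$i] ?? 0.0) * $value;
        }
        return $sum >= 0.0 ? 1.0 : -1.0;
    }
}

// Hypothetical learner that routes the model through a temporary file so that
// only a plain string ends up in the serialized payload.
final class HypotheticalLearner
{
    private ?FileOnlyModel $model;
    private string $modelData = '';

    public function __construct(?FileOnlyModel $model = null)
    {
        $this->model = $model;
    }

    public function predict(array $sample): float
    {
        if ($this->model === null) {
            throw new RuntimeException('No model available to classify with.');
        }
        return $this->model->predict($sample);
    }

    public function __sleep(): array
    {
        $this->modelData = '';
        if ($this->model !== null) {
            $path = self::tempPath();
            $this->model->save($path);
            $this->modelData = (string) file_get_contents($path);
            unlink($path);
        }
        return ['modelData']; // the model object itself is left out
    }

    public function __wakeup(): void
    {
        $this->model = null;
        if ($this->modelData !== '') {
            $path = self::tempPath();
            file_put_contents($path, $this->modelData);
            $model = new FileOnlyModel();
            $this->model = $model->load($path) ? $model : null;
            unlink($path);
            $this->modelData = '';
        }
    }

    private static function tempPath(): string
    {
        $path = tempnam(sys_get_temp_dir(), 'mdl');
        if ($path === false) {
            throw new RuntimeException('Could not create a temporary file.');
        }
        return $path;
    }
}

// Usage: the learner survives a serialize()/unserialize() round trip.
$learner = new HypotheticalLearner(new FileOnlyModel([1.0, -2.0]));
$restored = unserialize(serialize($learner));
var_dump($restored->predict([3.0, 1.0]));
```

The cost of this design is two temporary-file writes per round trip, which is part of why the approach is described above as janky. In return, the serialized string is self-contained, so in principle it could be stored by any persister, not only on disk.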

I'm closing the issue, as the bug is technically fixed, but feel free to continue the conversation with your thoughts or questions.

Thank you again for the great bug report.

Roadmap automation moved this from In progress to Completed Jun 18, 2019