No apparent way to pickle a model from vowpalwabbit.sklearn_vw.VWClassifier #1040

dakami · 2016-07-16T03:33:13Z

Pickling a fitted classifier gives you:

RuntimeError: Pickling of "vowpalwabbit.pyvw.vw" instances is not enabled (http://www.boost.org/libs/python/doc/v2/pickle.html)

This means I basically need to rebuild the classifier at runtime every time I want to do actual prediction (or, I suppose, use CRIU).

pommedeterresautee · 2016-07-19T10:48:16Z

I am not using the scikit-learn wrapper, just the Python one.
I made the same mistake as you.
You need to provide the model path as an option (-f) in the constructor, like you would with the C++ client. Check the Python examples to see where to put it.
You perform your learning and, when it is finished, you call the finish() method of your vw instance.
At that point your model is written to disk.

For sure, pickle and similar tools don't work.
A lot of documentation is still missing for the Python part :(

Hope it helps.

gramhagen · 2016-07-25T13:04:32Z

#1052 adds save and load methods for the sklearn interface as an alternative to the -f option.

I was thinking it might be helpful to use Sphinx to autogenerate API docs for the Python build. Is that the kind of documentation that would be helpful?

JohnLangford · 2016-08-09T19:02:40Z

#1052 is in, so closing for now. Better documentation is very welcome from anyone who can contribute.

lfleck · 2020-07-01T14:17:32Z

Is there any update on using pickle?
I'm defining multiple contextual bandits and trying to train them in multiple Python processes like this:

from vowpalwabbit import pyvw
from concurrent.futures import ProcessPoolExecutor

def dummy_train(model):
    pass

vw1 = pyvw.vw(f"--cb_explore 5 --interactions UUA --quiet --cover 4 -f test4.model")
vw2 = pyvw.vw(f"--cb_explore 5 --interactions UUA --quiet --cover 5 -f test5.model")
models = [vw1, vw2]

with ProcessPoolExecutor(max_workers=2) as executor:
    executor.map(dummy_train, models)

However, even with the "-f" option stated above, the originally mentioned 'Pickling of "vowpalwabbit.pyvw.vw" instances is not enabled' error appears. The same holds for cloudpickle, which e.g. joblib uses for serialization.

Are there any suggested ways to run VowpalWabbit models in multiple Python processes?

jackgerrits · 2020-07-01T15:10:36Z

@gramhagen did you enable pickling in the PR you just submitted? (#2368)

gramhagen · 2020-07-01T18:25:14Z

Yes, you can pickle the sklearn model. Under the hood this is accomplished using:

pyvw.vw.save(file)
pyvw.vw(initial_regressor=file)

But the problem will persist in your code, since it passes the pyvw.vw object itself (which is not picklable) to each new process. So you need to encapsulate model creation in your dummy_train function and only pass the parameters you want to vary.

example:

from vowpalwabbit.pyvw import vw
from multiprocessing import Pool

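def dummy_train(params):
    # Build the model inside the worker, so only the plain dict of
    # parameters has to be pickled and sent to the child process.
    model = vw(**params)
    ec = model.example("1 | a b c")
    model.learn(ec)
    # finish() writes the model to the path given by the "f" option.
    model.finish()


if __name__ == "__main__":
    variants = (
        {"f": "model_0.vw", "quiet": True},
        {"f": "model_1.vw", "quiet": True},
    )
    # The context manager cleans up the worker processes on exit.
    with Pool(processes=2) as pool:
        pool.map(dummy_train, variants)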
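
For anyone curious how the save / initial_regressor round trip mentioned above can make a wrapper picklable, here is a minimal sketch using only the standard library. It is not the actual sklearn_vw code: NativeModel is a hypothetical stand-in for a native handle like pyvw.vw (it can save to a file and load from one, but refuses to be pickled), and PicklableModel is an illustrative wrapper name, not part of vowpalwabbit.

```python
import os
import pickle
import tempfile


class NativeModel:
    """Hypothetical stand-in for a native handle such as pyvw.vw.

    It can save itself to a file and be rebuilt from one, but it
    refuses to be pickled directly.
    """

    def __init__(self, initial_regressor=None):
        self.weights = {}
        if initial_regressor is not None:
            with open(initial_regressor, "rb") as fh:
                self.weights = pickle.load(fh)

    def learn(self, feature, value):
        self.weights[feature] = self.weights.get(feature, 0.0) + value

    def save(self, path):
        with open(path, "wb") as fh:
            pickle.dump(self.weights, fh)

    def __reduce_ex__(self, protocol):
        # Mimic the error raised for the real native object.
        raise RuntimeError("Pickling of NativeModel instances is not enabled")


class PicklableModel:
    """Illustrative wrapper that pickles via a save/load round trip."""

    def __init__(self):
        self.model = NativeModel()

    def __getstate__(self):
        # Save the native model to a temp file and pickle its raw bytes.
        fd, path = tempfile.mkstemp(suffix=".model")
        os.close(fd)
        try:
            self.model.save(path)
            with open(path, "rb") as fh:
                return {"model_bytes": fh.read()}
        finally:
            os.remove(path)

    def __setstate__(self, state):
        # Write the bytes back to a temp file and rebuild the native model.
        fd, path = tempfile.mkstemp(suffix=".model")
        try:
            with os.fdopen(fd, "wb") as fh:
                fh.write(state["model_bytes"])
            self.model = NativeModel(initial_regressor=path)
        finally:
            os.remove(path)


if __name__ == "__main__":
    wrapped = PicklableModel()
    wrapped.model.learn("a", 1.5)
    clone = pickle.loads(pickle.dumps(wrapped))
    print(clone.model.weights)
```

The wrapper survives pickle.dumps / pickle.loads because only the saved bytes cross the process boundary. Passing the bare native object to a worker still fails, which is why the multiprocessing example above builds the model inside the worker instead.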

JohnLangford closed this as completed Aug 9, 2016
