Greykite Forecaster Model is Unpickle-able #73

kurtejung · 2022-05-10T15:58:21Z

Even a basic implementation of Greykite (see below) does not pickle properly, due to some of the design choices within Greykite (e.g. nested functions and namedtuple definitions within function and class calls).

Was this a purposeful design choice? Is there another method to save a trained model state and reuse the model to create inferences downstream? Integrations with deployment tools become much more challenging if we need to retrain the model every time and can't save the model state. Looking for guidance here on best practice - thanks!
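To make the limitation concrete, here is a minimal sketch that does not use Greykite at all. `make_model`, `Params`, `predict`, and `can_pickle` are made-up names for illustration. It uses the standard `pickle` module; `dill` handles more cases, so this shows the general problem with locally defined objects rather than the exact failure inside Greykite.

```python
import pickle
from collections import namedtuple


def make_model():
    # A namedtuple class and a function defined inside another function are
    # local objects: pickle cannot find them again by a module-level name.
    Params = namedtuple("Params", ["slope"])

    def predict(x):
        return 2.0 * x

    return Params(slope=2.0), predict


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


params, predict = make_model()
print(can_pickle(predict))         # nested function
print(can_pickle(params))          # instance of a locally defined namedtuple
print(can_pickle({"slope": 2.0}))  # plain data
```

Only the plain dictionary should serialize; moving `Params` and `predict` to module level would make all three picklable.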

Here's code to reproduce the issue:

from greykite.framework.templates.autogen.forecast_config import ForecastConfig
from greykite.framework.templates.autogen.forecast_config import MetadataParam
from greykite.framework.templates.forecaster import Forecaster
from greykite.framework.templates.model_templates import ModelTemplateEnum

import pandas as pd
import numpy as np

date_list = pd.date_range(start='2020-01-01', end='2022-01-01', freq='W-FRI')
df_train = pd.DataFrame(
    {
        'week_end_date': date_list,
        'data': np.random.rand(len(date_list))
    }
)

metadata = MetadataParam(
    time_col="week_end_date",
    value_col=df_train.columns[-1],
    freq='W-FRI'
)

fc = Forecaster()
result = fc.run_forecast_config(
    df=df_train,
    config=ForecastConfig(
        model_template=ModelTemplateEnum.SILVERKITE.name,
        forecast_horizon=52,
        coverage=0.95,         # 95% prediction intervals
        metadata_param=metadata
    )
)

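import dill
with open("pickle_out.b", "wb") as fp:
    dill.dump(result.model, fp)  # this is the call that fails

# Reopen in read mode; a handle opened with "wb" cannot be read from.
with open("pickle_out.b", "rb") as fp:
    output_set = dill.load(fp)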


KaixuYang · 2022-05-11T22:25:34Z

Hi @kurtejung, some of the functions/classes are not directly pickleable. We have built-in functions to iteratively save and load the model. Once you have run the forecast, you can do

fc.dump_forecast_result(destination_dir="dir")

For loading, you can load a dumped directory with

fc = Forecaster()
fc.load_forecast_result(source_dir="dir")

Change the "dir" to your desired directory.

kurtejung · 2022-05-12T14:17:08Z

Thanks! - not sure how I missed this in the documentation.

I'm trying to implement a deepcopy function for this as well - I can use the save/load functionality but the I/O is time intensive. Is there an in-memory version of dump/load_forecast_result?

If not, would such a function be a welcome addition to the codebase?
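For reference, the kind of in-memory round trip I have in mind looks roughly like the sketch below. `roundtrip_in_memory` and `state` are hypothetical names, and `state` is only a stand-in for a model: a fitted Greykite model would hit the same pickling errors as above unless its unpicklable parts are handled separately.

```python
import io
import pickle


def roundtrip_in_memory(obj):
    """Serialize obj into an in-memory buffer and load a fresh copy back."""
    buffer = io.BytesIO()
    pickle.dump(obj, buffer)
    buffer.seek(0)  # rewind to the start before reading
    return pickle.load(buffer)


# Made-up stand-in for a trained model's state.
state = {"coefficients": [0.5, 1.5], "freq": "W-FRI"}
copy = roundtrip_in_memory(state)

print(copy == state)  # same content
print(copy is state)  # but a separate object
```

Nothing touches the disk here, which is the point: the copy is built entirely from bytes held in memory.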

KaixuYang · 2022-05-20T17:09:53Z

Hi @kurtejung, yeah, you are very welcome to help add the deepcopy version of the save/load functionality! Please feel free to open a PR if you would like to, thanks!

vincetran96 · 2022-07-29T22:39:52Z

I'm using Greykite in Miniconda on Windows. When I tried to dump a Forecaster object, I had the following error: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process. I tried switching both the dump_design_info and overwrite_exist_dir parameters between True and False but the error persisted.

I'm wondering what might be the cause of this problem.

sayanpatra · 2022-07-30T02:21:18Z

Hi @vincetran96, I am not sure that this issue is due to the Greykite package. Maybe these links would help you debug.

Feel free to post your code snippets; it helps us assist you.

vincetran96 · 2022-07-30T20:34:33Z

@sayanpatra Thank you for your suggestions. I have re-used the exact code from @kurtejung above except the pickling part at the end:

from greykite.framework.templates.autogen.forecast_config import ForecastConfig
from greykite.framework.templates.autogen.forecast_config import MetadataParam
from greykite.framework.templates.forecaster import Forecaster
from greykite.framework.templates.model_templates import ModelTemplateEnum

import pandas as pd
import numpy as np

date_list = pd.date_range(start='2020-01-01', end='2022-01-01', freq='W-FRI')
df_train = pd.DataFrame(
    {
        'week_end_date': date_list,
        'data': np.random.rand(len(date_list))
    }
)

metadata = MetadataParam(
    time_col="week_end_date",
    value_col=df_train.columns[-1],
    freq='W-FRI'
)

fc = Forecaster()
result = fc.run_forecast_config(
    df=df_train,
    config=ForecastConfig(
        model_template=ModelTemplateEnum.SILVERKITE.name,
        forecast_horizon=52,
        coverage=0.95,         # 95% prediction intervals
        metadata_param=metadata
    )
)

fc.dump_forecast_result("path", dump_design_info=False)

Below is the traceback:

Traceback (most recent call last):
  File "env_path\lib\site-packages\greykite\framework\templates\pickle_utils.py", line 177, in dump_obj
    dill.dump(
  File "env_path\lib\site-packages\dill\_dill.py", line 336, in dump
    Pickler(file, protocol, **_kwds).dump(obj)
  File "env_path\lib\site-packages\dill\_dill.py", line 620, in dump
    StockPickler.dump(self, obj)
  File "env_path\lib\pickle.py", line 487, in dump
    self.save(obj)
  File "env_path\lib\pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "env_path\lib\pickle.py", line 717, in save_reduce
    save(state)
  File "env_path\lib\pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "env_path\lib\site-packages\dill\_dill.py", line 1251, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "env_path\lib\pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "env_path\lib\pickle.py", line 997, in _batch_setitems
    save(v)
  File "env_path\lib\pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "env_path\lib\pickle.py", line 717, in save_reduce
    save(state)
  File "env_path\lib\pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "env_path\lib\site-packages\dill\_dill.py", line 1251, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "env_path\lib\pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "env_path\lib\pickle.py", line 997, in _batch_setitems
    save(v)
  File "env_path\lib\pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "env_path\lib\pickle.py", line 717, in save_reduce
    save(state)
  File "env_path\lib\pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "env_path\lib\site-packages\dill\_dill.py", line 1251, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "env_path\lib\pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "env_path\lib\pickle.py", line 997, in _batch_setitems
    save(v)
  File "env_path\lib\pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "env_path\lib\pickle.py", line 931, in save_list
    self._batch_appends(obj)
  File "env_path\lib\pickle.py", line 955, in _batch_appends
    save(x)
  File "env_path\lib\pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "env_path\lib\pickle.py", line 886, in save_tuple
    save(element)
  File "env_path\lib\pickle.py", line 603, in save
    self.save_reduce(obj=obj, *rv)
  File "env_path\lib\pickle.py", line 717, in save_reduce
    save(state)
  File "env_path\lib\pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "env_path\lib\site-packages\dill\_dill.py", line 1251, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "env_path\lib\pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "env_path\lib\pickle.py", line 997, in _batch_setitems
    save(v)
  File "env_path\lib\pickle.py", line 560, in save
    f(self, obj)  # Call unbound method with explicit self
  File "env_path\lib\site-packages\dill\_dill.py", line 1251, in save_module_dict
    StockPickler.save_dict(pickler, obj)
  File "env_path\lib\pickle.py", line 971, in save_dict
    self._batch_setitems(obj.items())
  File "env_path\lib\pickle.py", line 997, in _batch_setitems
    save(v)
  File "env_path\lib\pickle.py", line 578, in save
    rv = reduce(self.proto)
  File "env_path\lib\site-packages\patsy\util.py", line 723, in no_pickling
    raise NotImplementedError(
NotImplementedError: Sorry, pickling not yet supported. See https://github.com/pydata/patsy/issues/26 if you want to help.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "env_path\lib\site-packages\greykite\framework\templates\forecaster.py", line 442, in dump_forecast_result
    dump_obj(
  File "env_path\lib\site-packages\greykite\framework\templates\pickle_utils.py", line 184, in dump_obj
    os.remove(os.path.join(dir_name, f"{obj_name}.pkl"))
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'path\\object.pkl'

One seemingly noteworthy detail: before the PermissionError, a NotImplementedError was raised: Sorry, pickling not yet supported. See https://github.com/pydata/patsy/issues/26 if you want to help.
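One possible reading of the traceback, which I have not verified against the Greykite source, is that the cleanup step tries to remove the partially written .pkl file while it is still open, which Windows does not allow. The sketch below shows the general pattern of closing the file before removing it. `NoPickle` and `safe_dump` are hypothetical names, and `NoPickle` only imitates an object that refuses to be pickled.

```python
import os
import pickle
import tempfile


class NoPickle:
    """Hypothetical stand-in for an object that refuses to be pickled."""

    def __getstate__(self):
        raise NotImplementedError("pickling not supported")


def safe_dump(obj, path):
    """Pickle obj to path; on failure remove the partial file, return False."""
    try:
        with open(path, "wb") as f:
            pickle.dump(obj, f)
    except NotImplementedError:
        # The with-block has already exited, so the handle is closed and the
        # file can be removed on Windows as well as on POSIX systems.
        if os.path.exists(path):
            os.remove(path)
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "object.pkl")
    ok = safe_dump(NoPickle(), target)
    leftover = os.path.exists(target)
    ok_plain = safe_dump([1, 2, 3], os.path.join(tmp, "plain.pkl"))

print(ok, leftover, ok_plain)
```

If the `os.remove` call sat inside the `with` block instead, the same code would be expected to raise a PermissionError on Windows, because the file would still be held open at that point.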

harithzulfaizal · 2023-01-12T06:57:10Z

@vincetran96 Hi, wondering if you ever came across a solution to this. I'm encountering the same problem. Thanks!

al-bert closed this as completed Jul 21, 2022

samuelefiorini mentioned this issue Jun 29, 2023

MLFLOW support for the SilverKite Algorithm #124

Open
