Question: How to use sample_weight in grid searching a model #127

Open
msat59 opened this issue Aug 10, 2023 · 2 comments

Comments

msat59 commented Aug 10, 2023

Hi there.

I saw the sample_weight parameter in the docs here, but I can't work out how to use it.

I want to use sample_weight because my series has some missing values. I don't want to rely on backward- or forward-filled data, since the filled values may distort the model's performance. Instead, I'd like to use sample_weight so that those points have no effect on the grid search results.
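
To make the idea concrete, here is a minimal pandas sketch of the weight column I have in mind. The column names ts, y and sample_weight match my dataframe; the values are made up, and whether Greykite actually honours such a column during grid search is exactly what this question is about, so treat it as an illustration only.

```python
import numpy as np
import pandas as pd

# Toy daily series with two missing observations (values are made up).
df = pd.DataFrame({
    "ts": pd.date_range("2023-01-01", periods=6, freq="D"),
    "y": [10.0, 11.0, np.nan, 13.0, np.nan, 15.0],
})

# Weight 1.0 for observed rows and 0.0 for rows where y is missing.
df["sample_weight"] = df["y"].notna().astype(float)

# Assumption: the gaps are still filled so the series contains no NaN,
# but the filled rows keep weight 0.0 and should not influence the fit.
df["y"] = df["y"].ffill()

print(df)
```

The weights are computed before filling, so the filled rows can still be told apart from the observed ones afterwards.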

I would appreciate it if someone could advise on how to use sample weights in the model.

EDITED:
I found the regression_weight_col keyword in the code, for instance in the SilverkiteEstimator code, but I couldn't find out how to define and use it.


msat59 commented Aug 14, 2023

Has this feature been implemented yet?

According to simple_silverkite_template.py, regression_weight_col should be set in the custom dictionary of ModelComponentsParam:

custom={
    "feature_sets_enabled": self.constants.COMMON_MODELCOMPONENTPARAM_PARAMETERS["FEASET"][components[components.index("FEASET")+1]],
    "fit_algorithm_dict": self.constants.COMMON_MODELCOMPONENTPARAM_PARAMETERS["ALGO"][components[components.index("ALGO")+1]],
    "max_daily_seas_interaction_order": self.constants.COMMON_MODELCOMPONENTPARAM_PARAMETERS["DSI"][freq][components[components.index("DSI")+1]],
    "max_weekly_seas_interaction_order": self.constants.COMMON_MODELCOMPONENTPARAM_PARAMETERS["WSI"][freq][components[components.index("WSI")+1]],
    "extra_pred_cols": [],
    "drop_pred_cols": None,
    "explicit_pred_cols": None,
    "min_admissible_value": None,
    "max_admissible_value": None,
    "regression_weight_col": None,
    "normalize_method": "zero_to_one"
},

However, it seems this hasn't been implemented yet, because I get the error below when I set it there. Note that my dataframe has all three columns, ['ts', 'y', 'sample_weight'], but when I debugged the data, the internally created df had only the ts and y columns.

ValueError: 
All the 12 fits failed.
It is very likely that your model is misconfigured.
You can try to debug the error by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
12 fits failed with the following error:
Traceback (most recent call last):
  File "C:\Users\user\miniconda3\envs\py38\lib\site-packages\pandas\core\indexes\base.py", line 3081, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'sample_weight'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\user\miniconda3\envs\py38\lib\site-packages\sklearn\model_selection\_validation.py", line 686, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "C:\Users\user\miniconda3\envs\py38\lib\site-packages\sklearn\pipeline.py", line 405, in fit
    self._final_estimator.fit(Xt, y, **fit_params_last_step)
  File "C:\Users\user\miniconda3\envs\py38\lib\site-packages\greykite\sklearn\estimator\simple_silverkite_estimator.py", line 271, in fit
    self.model_dict = self.silverkite.forecast_simple(
  File "C:\Users\user\miniconda3\envs\py38\lib\site-packages\greykite\algo\forecast\silverkite\forecast_simple_silverkite.py", line 836, in forecast_simple
    trained_model = super().forecast(**parameters)
  File "C:\Users\user\miniconda3\envs\py38\lib\site-packages\greykite\algo\forecast\silverkite\forecast_silverkite.py", line 956, in forecast
    trained_model = fit_ml_model_with_evaluation(
  File "C:\Users\user\miniconda3\envs\py38\lib\site-packages\greykite\algo\common\ml_models.py", line 704, in fit_ml_model_with_evaluation
    trained_model = fit_ml_model(
  File "C:\Users\user\miniconda3\envs\py38\lib\site-packages\greykite\algo\common\ml_models.py", line 384, in fit_ml_model
    if df[regression_weight_col].min() < 0:
  File "C:\Users\user\miniconda3\envs\py38\lib\site-packages\pandas\core\frame.py", line 3024, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:\Users\user\miniconda3\envs\py38\lib\site-packages\pandas\core\indexes\base.py", line 3083, in get_loc
    raise KeyError(key) from err
KeyError: 'sample_weight'
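
The failing line in the traceback is `df[regression_weight_col].min()`, so the same KeyError can be reproduced with plain pandas on a frame that has lost the weight column. The sketch below only mimics that situation; `internal_df` stands in for the frame I saw while debugging, and `check_weight_col` is a hypothetical helper of my own, not part of Greykite.

```python
import pandas as pd

def check_weight_col(df, regression_weight_col):
    """Hypothetical pre-fit check (not a Greykite function)."""
    if regression_weight_col is None:
        return
    if regression_weight_col not in df.columns:
        raise ValueError(
            f"weight column {regression_weight_col!r} not found; "
            f"available columns: {list(df.columns)}"
        )
    if df[regression_weight_col].min() < 0:
        raise ValueError("weights must be non-negative")

# The frame I pass in has all three columns.
full_df = pd.DataFrame({
    "ts": pd.date_range("2023-01-01", periods=3, freq="D"),
    "y": [1.0, 2.0, 3.0],
    "sample_weight": [1.0, 0.0, 1.0],
})

# Stand-in for the internally created frame, which had only ts and y.
internal_df = full_df[["ts", "y"]]

# Same access pattern as the failing line in the traceback.
try:
    internal_df["sample_weight"].min()
    missing = None
except KeyError as err:
    missing = err.args[0]

print("missing column:", missing)
```

With a check like this, the failure would surface as a readable message before fitting instead of a KeyError deep inside the grid search.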


msat59 commented Aug 17, 2023

@al-bert, is this project dead?
