Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prophet Logistic Growth #21

Open
venk2k16 opened this issue Jun 11, 2021 · 9 comments
Open

Prophet Logistic Growth #21

venk2k16 opened this issue Jun 11, 2021 · 9 comments

Comments

@venk2k16
Copy link

Thanks for this library. This really makes my workflow a lot easier !!

I am trying to fit a 'logistic' growth model using the Prophet and I'm passing the dataframe containing 'cap' and 'floor' columns to the metadata param. But while fitting the model, Prophet throws an error saying that for a logistic model, it expectcs 'cap' and 'floor' columns. How do i specify which columns in the dataframe should be used for 'cap' and 'floor' ?

When I looked at the code for ProphetEstimator fit function, only the 'time' column and 'y' column gets passed on to Prophet code, no additional columns are passed on.

def fit(self, X, y=None, time_col=TIME_COL, value_col=VALUE_COL, **fit_params):
       
        super().fit(X, y=y, time_col=time_col, value_col=value_col, **fit_params)

        if self.add_regressor_dict is None:
            fit_columns = [time_col, value_col]
        else:
            reg_cols = list(self.add_regressor_dict.keys())
            fit_columns = [time_col, value_col] + reg_cols

Is this a bug or is there a way to pass the 'cap' and 'floor' columns that I'm missing ?
I couldn't find an example on how to do this in the documentation.

Thanks !!

@al-bert
Copy link
Contributor

al-bert commented Jun 11, 2021

Hi Venkat, thank for bringing this up. You might be able to work around this by pretending 'cap' and 'floor' are regressors in one of your grid search candidates:

  1. Add 'cap' and 'floor' columns to your dataframe, extending them into the future forecast period.
  2. To your grid search candidates, add a hyperparameter set where 'cap' and 'floor' are regressors. You don't care about the result of this model; it's there only to make the columns available.

Let us know if this works.

@venk2k16
Copy link
Author

Thanks for the response @al-bert.

I did try adding the cap and floor as regressor. But in that case, Prophet threw an error saying that the name 'cap' and 'floor' are reserved.

Below is what I tried initially.

df.columns

Index(['ts', 'y', 'cap', 'floor'], dtype='object')

model_components = ModelComponentsParam(
    seasonality = seasonality,
    growth={"growth_term": ["logistic"]},
)

when I run with the above configuration, I'm getting the below error:

ValueError: Capacities must be supplied for logistic growth in column "cap"

And then I added 'cap' and 'floor' as regressors, similar to the example given here. The cap and floor column values extend into the future as well.

regressors=dict(
    add_regressor_dict={ 
        "cap": {
            "prior_scale": 10.0, 
            "mode": 'additive'  
        },
         "floor": {
            "prior_scale": 10.0,  
            "mode": 'additive'  
        }
    }
)


model_components = ModelComponentsParam(
    seasonality = seasonality,
    growth={"growth_term": ["logistic"]},
    regressors=regressors,
)

and now this gives me the error that 'cap' is a reserved name.

ValueError: Name 'cap' is reserved.

Is this same as what you suggested, Albert ?

@al-bert
Copy link
Contributor

al-bert commented Jun 15, 2021

Hi Venkat, thanks for those details. Prophet recognizes 'cap' and 'floor' as special column names, but we do not. Perhaps you could try forking the code and add 'cap' and 'floor' to fit_columns?

@venk2k16
Copy link
Author

Sure Albert, I'll give it a shot !!

@venk2k16
Copy link
Author

I did manage to find a temporary work around for the cap/floor issue. I'm passing the 'cap' and 'floor' columns as regressors, same as how I have defined in the previous comment, but in the ProphetEstimator.fit function I'm filtering out 'cap' and 'floor' columns from where the regressors are added to the model.

def fit(self, X, y=None, time_col=TIME_COL, value_col=VALUE_COL, **fit_params):
        ..........
        ..........
        ..........
        # if extra regressors are given, we add them to temporal features data
        # This implementation assumes that the regressor(s) are provided in time series df, alongside target column.
        if self.add_regressor_dict is not None:
            for reg_col, reg_params in self.add_regressor_dict.items():
                if reg_col != 'cap' and reg_col != 'floor':      # ---> Remove 'cap' and 'floor' from the regressor list.    
                    self.model.add_regressor(name=reg_col, **reg_params)`

This fix seems to work when I tested on my dataset. But this seems more like a temporary fix. I guess this should be better handled, like passing 'cap' and 'floor' column names via the MetadataParam configuration and these parameters should flow from there.

BTW, I don't see any contribution guidelines or public pull requests yet. Are you accepting any pull requests for now ?

@al-bert
Copy link
Contributor

al-bert commented Jun 21, 2021

Great to hear! We do plan to accept contributions and will provide guidelines to explain the process.

@al-bert
Copy link
Contributor

al-bert commented Aug 2, 2021

Contribution guidelines are available here: https://github.com/linkedin/greykite/blob/master/CONTRIBUTING.rst

@venk2k16
Copy link
Author

Thanks Albert. Was bit busy with work, didn't get time to look into this.

When I look into fbprophet's forecaster.py, the reserved names are defined as below:

       reserved_names = [
            'trend', 'additive_terms', 'daily', 'weekly', 'yearly',
            'holidays', 'zeros', 'extra_regressors_additive', 'yhat',
            'extra_regressors_multiplicative', 'multiplicative_terms',
        ]
        rn_l = [n + '_lower' for n in reserved_names]
        rn_u = [n + '_upper' for n in reserved_names]
        reserved_names.extend(rn_l)
        reserved_names.extend(rn_u)
        reserved_names.extend([
            'ds', 'y', 'cap', 'floor', 'y_scaled', 'cap_scaled'])

So apart from the forecast output columns, [ 'ds', 'y', 'cap', 'floor', 'y_scaled', 'cap_scaled'] are the reserved names within fbprophet. Of these, 'ds' and 'y' are already taken care of.

If we have a solution, similar to the one that I had proposed earlier, would that work ?

I would pass the cap and floor as additional regressors in the model config and in the ProphetEstimator.fit function I filter out the columns that matches the reserved columns from being passed on as a regressor. This way the 'cap' and 'floor' columns gets passed on the to the model, but now prophet won't throw an error that the column name is reserved.

Model config

regressors=dict(
    add_regressor_dict={ 
        "cap": {
            "prior_scale": 10.0, 
            "mode": 'additive'  
        },
         "floor": {
            "prior_scale": 10.0,  
            "mode": 'additive'  
        }
    }
)

prophet_estimator.py

prophet_res_columns = ['cap', 'floor', 'y_scaled', 'cap_scaled']
....

def fit(self, X, y=None, time_col=TIME_COL, value_col=VALUE_COL, **fit_params):
        ..........
        ..........
        ..........
        # if extra regressors are given, we add them to temporal features data
        # This implementation assumes that the regressor(s) are provided in time series df, alongside target column.
        if self.add_regressor_dict is not None:
            for reg_col, reg_params in self.add_regressor_dict.items():
                 if reg_col not in prophet_res_columns :      # ---> Remove the reserved column names from the regressor list
                      self.model.add_regressor(name=reg_col, **reg_params)`

....

Let me know if this work-around makes sense ? If yes, then I can make the change and submit a pull request ?

Thanks !!

samuelefiorini added a commit to samuelefiorini/greykite that referenced this issue Jun 14, 2023
@samuelefiorini
Copy link

It's been a while now, but I tried the solution above and it worked for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants