Cannot set l1_ratio as a list when using Elastic Net #52

Closed
dromare opened this issue Sep 15, 2021 · 2 comments
Comments

@dromare
Contributor

dromare commented Sep 15, 2021

Hello there,

I get an error when running Greykite with the Elastic Net algorithm when the l1_ratio parameter is set to a list of floats, [.1, .5, .7, .9, .95, .99, 1], rather than to a single float:

[Screenshots attached: Capture2, Capture3]

The scikit-learn documentation for ElasticNetCV (https://scikit-learn.org/0.24/modules/generated/sklearn.linear_model.ElasticNetCV.html#sklearn.linear_model.ElasticNetCV) says the following about l1_ratio:
...This parameter can be a list, in which case the different values are tested by cross-validation and the one giving the best prediction score is used. Note that a good choice of list of values for l1_ratio is often to put more values close to 1 (i.e. Lasso) and less close to 0 (i.e. Ridge), as in [.1, .5, .7, .9, .95, .99, 1].
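For reference, the behaviour described in that quote can be seen with scikit-learn alone. The snippet below is a minimal sketch on synthetic data (the data, seed, and variable names are made up for illustration and are not from this issue). It suggests that after fitting, the selected value is exposed as the scalar `l1_ratio_`, while the constructor argument `l1_ratio` remains the original list.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

# Synthetic regression data, purely illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 5))
true_coef = np.array([2.0, 0.0, -1.0, 0.0, 0.5])
y = X @ true_coef + rng.normal(scale=0.1, size=120)

# Candidate values from the scikit-learn documentation quoted above.
l1_grid = [.1, .5, .7, .9, .95, .99, 1]
model = ElasticNetCV(l1_ratio=l1_grid, cv=5).fit(X, y)

# Fitted (selected) values are scalars and carry a trailing underscore.
print("selected l1_ratio:", model.l1_ratio_)
print("selected alpha:", model.alpha_)

# The constructor argument is left untouched, so it is still a list.
print("constructor l1_ratio:", model.l1_ratio)
```

Any code that needs a single number after fitting would therefore read `l1_ratio_` rather than `l1_ratio`.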

I would like to know if there is a workaround other than setting up a grid search with cross-validation outside the ElasticNetCV() framework, for example:

cv_max_splits = 5

# Grid search is possible
custom = dict(
    fit_algorithm_dict=[
        dict(
            fit_algorithm="elastic_net",
            fit_algorithm_params={
                "l1_ratio": 0.7
            }
        ),
        dict(
            fit_algorithm="elastic_net",
            fit_algorithm_params={
                "l1_ratio": 0.9
            }
        ),
    ]
)
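If the grid-search route is used, the list of dictionaries does not have to be written out by hand. The following is a sketch that builds the same structure for every candidate value with a list comprehension; `l1_ratios` is a name introduced here for illustration, and only plain Python dictionaries are constructed.

```python
# Candidate l1_ratio values, taken from the scikit-learn suggestion quoted above.
l1_ratios = [.1, .5, .7, .9, .95, .99, 1]

# One grid entry per candidate, each with a single float l1_ratio.
custom = dict(
    fit_algorithm_dict=[
        dict(
            fit_algorithm="elastic_net",
            fit_algorithm_params={"l1_ratio": ratio},
        )
        for ratio in l1_ratios
    ]
)

print(len(custom["fit_algorithm_dict"]), "grid entries")
```

Each entry passes a scalar l1_ratio, so this form avoids the list-valued parameter altogether, at the cost of one model fit per entry.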

Thank you for the good work!

Best regards,
Dario

@sayanpatra
Contributor

@dromare It ran fine with the list input of l1_ratio on dummy data. Here is my code:

import pandas as pd
import plotly

from greykite.common.data_loader import DataLoader
from greykite.framework.templates.autogen.forecast_config import ForecastConfig
from greykite.framework.templates.autogen.forecast_config import MetadataParam, EvaluationPeriodParam, EvaluationMetricParam, ModelComponentsParam
from greykite.framework.templates.forecaster import Forecaster
from greykite.framework.templates.model_templates import ModelTemplateEnum
from greykite.framework.utils.result_summary import summarize_grid_search_results
from greykite.common.evaluation import EvaluationMetricEnum

# Loads dataset into pandas DataFrame
dl = DataLoader()
df = dl.load_peyton_manning()

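# Subset for faster runtime (copy so the assignment below does not act on a view)
df = df.iloc[1:100].copy()
# Set the last rows of the value column to 0 (covers the end of the training and test data)
df.iloc[90:100, df.columns.get_loc("y")] = 0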

# Specify dataset information
metadata = MetadataParam(
    time_col="ts",  # name of the time column
    value_col="y",  # name of the value column
    freq="D"  # "H" for hourly, "D" for daily, "W" for weekly, etc.
              # Any format accepted by `pandas.date_range`
)

evaluation_period = EvaluationPeriodParam(
    cv_max_splits=0
)

evaluation_metric = EvaluationMetricParam(
    cv_selection_metric=EvaluationMetricEnum.RootMeanSquaredError.name
)

model_components = ModelComponentsParam(
    custom = dict(
        fit_algorithm_dict=dict(
            fit_algorithm="elastic_net",
            fit_algorithm_params = dict(
                l1_ratio=[.1, .5, .7, .9, .95, .99, 1]
            )
        )
    )
)

config = ForecastConfig(
    model_template=ModelTemplateEnum.SILVERKITE.name,
    forecast_horizon=7,  # forecasts 7 steps ahead
    coverage=0.95,         # 95% prediction intervals
    metadata_param=metadata,
    evaluation_period_param=evaluation_period,
    evaluation_metric_param=evaluation_metric,
    model_components_param=model_components
)

forecaster = Forecaster()
result = forecaster.run_forecast_config(  # result is also stored as `forecaster.forecast_result`.
    df=df,
    config=config
)

If you can post your code, I can take a look.

@dromare
Contributor Author

dromare commented Sep 29, 2021

Hi sayanpatra,

If I print the values of alpha and l1_ratio at lines 378-379 of C:\ProgramData\Anaconda3\envs\greykite-venv\lib\site-packages\greykite\algo\common\model_summary_utils.py, in add_model_df_lm(info_dict), I get the following:

alpha = 160.12536709617518
l1_ratio = [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1]

[Screenshot attached]

Then, at line 384, the calculation 1 - l1_ratio raises the following error because l1_ratio is a list:

TypeError: unsupported operand type(s) for -: 'int' and 'list'
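The error itself can be reproduced in plain Python, independent of Greykite. The sketch below also shows one possible guard, under the assumption that the fitted estimator exposes a scalar `l1_ratio_` as scikit-learn's ElasticNetCV does. `FittedModelStub` and `effective_l1_ratio` are hypothetical names used only for illustration; this is not Greykite's code and not necessarily how the library should be fixed.

```python
l1_ratio = [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1]

# Subtracting a list from an int is not defined, which gives the reported error.
try:
    l2_weight = 1 - l1_ratio
except TypeError as err:
    message = str(err)
    print("TypeError:", message)


class FittedModelStub:
    """Hypothetical stand-in for a fitted estimator (illustration only)."""
    l1_ratio = [0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1]  # constructor argument (list)
    l1_ratio_ = 0.7                                  # value selected by CV (scalar)


def effective_l1_ratio(model):
    """Prefer the fitted scalar `l1_ratio_`; fall back to the constructor value."""
    return getattr(model, "l1_ratio_", model.l1_ratio)


ratio = effective_l1_ratio(FittedModelStub())
print("1 - l1_ratio =", 1 - ratio)
```

With a scalar in hand, the subtraction at the failing line would be well defined.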
