version 0.4 holidays error: KeyError: '[] not found in axis' #821

Closed
tankle opened this issue Jan 23, 2019 · 14 comments

@tankle

tankle commented Jan 23, 2019

The code works correctly in fbprophet version 0.3, but version 0.4 throws the following exception.

pandas version 0.23.0, fbprophet version 0.4.post2, Python 3.6

Traceback (most recent call last):
  File "main.py", line 108, in <module>
    rlt = predict(vol, dataset)
  File "main.py", line 84, in predict
    predict_y = model.predict(forecast_df)
  File "D:\software\anaconda3\lib\site-packages\fbprophet\forecaster.py", line 1137, in predict
    seasonal_components = self.predict_seasonal_components(df)
  File "D:\software\anaconda3\lib\site-packages\fbprophet\forecaster.py", line 1252, in predict_seasonal_components
    self.make_all_seasonality_features(df)
  File "D:\software\anaconda3\lib\site-packages\fbprophet\forecaster.py", line 714, in make_all_seasonality_features
    holidays = self.construct_holiday_dataframe(df['ds'])
  File "D:\software\anaconda3\lib\site-packages\fbprophet\forecaster.py", line 447, in construct_holiday_dataframe
    all_holidays = all_holidays.drop(index_to_drop)
  File "D:\software\anaconda3\lib\site-packages\pandas\core\frame.py", line 3694, in drop
    errors=errors)
  File "D:\software\anaconda3\lib\site-packages\pandas\core\generic.py", line 3108, in drop
    obj = obj._drop_axis(labels, axis, level=level, errors=errors)
  File "D:\software\anaconda3\lib\site-packages\pandas\core\generic.py", line 3158, in _drop_axis
    raise KeyError('{} not found in axis'.format(labels))
KeyError: '[] not found in axis'

@bletham
Contributor

bletham commented Jan 25, 2019

I haven't run into this myself yet, could you post an example dataframe and the code that produces this error?

@bletham bletham changed the title from "version 0.4 holidays error" to "version 0.4 holidays error: KeyError: '[] not found in axis'" Jan 25, 2019
@lambdu

lambdu commented Feb 2, 2019

I ran into the same error (pandas 0.23.0, fbprophet 0.4.post2, Python 2.7). It seems to occur in predict() when the holidays parameter is set at model initialization and the add_country_holidays method is not called.

xmas = pd.DataFrame({
        'holiday': 'xmas',
        'ds': pd.to_datetime(['2019-12-25', '2018-12-25', '2017-12-25', '2016-12-25', '2015-12-25']),
        'upper_window': 1,
        "lower_window": -1
})

...

nye = pd.DataFrame({
        'holiday': 'nye',
        'ds': pd.to_datetime(['2019-12-31', '2018-12-31', '2017-12-31', '2016-12-31', '2015-12-31']),
        'upper_window': 1,
        'lower_window': 0
})

holiday_dataframe = pd.concat((thanksgiving, laborday, memorialday, xmas, nye))

### No error

m = Prophet(holidays=holiday_dataframe) \
        .add_country_holidays("US") \
        .fit(df)
future = m.make_future_dataframe(periods=look_forward, freq='D')
forecast = m.predict(future)

### KeyError: '[] not found in axis'

m = Prophet(holidays=holiday_dataframe) \
        .fit(df)
future = m.make_future_dataframe(periods=look_forward, freq='D')
forecast = m.predict(future)

The root cause is that, under these conditions, an empty list is passed to DataFrame.drop() in construct_holiday_dataframe: every holiday name in all_holidays is also in self.train_holiday_names, so index_to_drop is always empty.

### forecaster.py ln 428 - 447

all_holidays = pd.DataFrame()
if self.holidays is not None:
    all_holidays = self.holidays.copy()
if self.country_holidays is not None: ### Won't be reached since add_country_holidays isn't called

...

if self.train_holiday_names is not None: ### first set when fit() is called
    # Remove holiday names didn't show up in fit
    index_to_drop = all_holidays.index[
        np.logical_not(
            all_holidays.holiday.isin(self.train_holiday_names)
        )
    ]
    all_holidays = all_holidays.drop(index_to_drop) ### error thrown here

I've only tested with pandas 0.23.0, so I don't know whether drop() behaves differently in other versions. The easiest solution would be to check whether index_to_drop is empty before passing it to DataFrame.drop().
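
A rough sketch of how the empty case could be sidestepped. This is not the fix that shipped in fbprophet: instead of guarding the drop() call, it filters with a boolean mask so drop() is never called at all. The helper name `keep_trained_holidays` and the small holiday frames are made up for illustration.

```python
import pandas as pd


def keep_trained_holidays(all_holidays, train_holiday_names):
    """Hypothetical helper: keep only rows whose holiday was seen in fit()."""
    seen = all_holidays['holiday'].isin(train_holiday_names)
    # Boolean-mask selection never calls drop(), so "nothing to remove" is
    # simply a no-op, and repeated index labels are handled row by row.
    return all_holidays[seen]


# Duplicate index labels, as pd.concat produces without ignore_index=True.
xmas = pd.DataFrame({
    'holiday': 'xmas',
    'ds': pd.to_datetime(['2018-12-25', '2019-12-25']),
})
nye = pd.DataFrame({
    'holiday': 'nye',
    'ds': pd.to_datetime(['2018-12-31', '2019-12-31']),
})
holidays = pd.concat((xmas, nye))

all_seen = keep_trained_holidays(holidays, ['xmas', 'nye'])  # nothing removed
only_xmas = keep_trained_holidays(holidays, ['xmas'])        # nye rows removed
print(len(all_seen), len(only_xmas))
```

The mask approach is chosen here because dropping by label on a non-unique index would also remove other rows that happen to share the same label.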

@atifemreyuksel

I got the same error when running code in an environment with pandas 0.23.0, fbprophet 0.4.post2, Python 3.6.5.
The same code runs without any error in an environment with pandas 0.23.4, fbprophet 0.4.post2, Python 3.7, and also in an environment with pandas 0.18.1, fbprophet 0.2.1, Python 2.7.
The error occurs when predict is called on a model that was given a holidays parameter, as in @lambdu's report above.

@bletham
Contributor

bletham commented Feb 8, 2019

@lambdu thanks for the great repro. Here is what I have found.

In pandas 0.23.0, this code produces the error:

import pandas as pd
import numpy as np
from fbprophet import Prophet

xmas = pd.DataFrame({
        'holiday': 'xmas',
        'ds': pd.to_datetime(['2019-12-25', '2018-12-25', '2017-12-25', '2016-12-25', '2015-12-25']),
        'upper_window': 1,
        "lower_window": -1
})
nye = pd.DataFrame({
        'holiday': 'nye',
        'ds': pd.to_datetime(['2019-12-31', '2018-12-31', '2017-12-31', '2016-12-31', '2015-12-31']),
        'upper_window': 1,
        'lower_window': 0
})
holiday_dataframe = pd.concat((xmas, nye))

df = pd.DataFrame({
    'ds': pd.date_range(start='2015-01-01', periods=100, freq='M'),
    'y': np.random.rand(100),
})

m = Prophet(holidays=holiday_dataframe).fit(df)
future = m.make_future_dataframe(periods=10, freq='D')
forecast = m.predict(future)

It only produces the error if you include multiple holidays. Both of these work:

m = Prophet(holidays=xmas).fit(df)
future = m.make_future_dataframe(periods=10, freq='D')
forecast = m.predict(future)
m = Prophet(holidays=nye).fit(df)
future = m.make_future_dataframe(periods=10, freq='D')
forecast = m.predict(future)

None of these produce the error in pandas 0.24.1.

I'm not quite sure what is going on yet, but will try to get this figured out tomorrow now that I am able to produce it.

@bletham bletham added the bug label Feb 8, 2019
@bletham
Contributor

bletham commented Feb 9, 2019

OK, the issue comes from a pandas bug, pandas-dev/pandas#21494, which was fixed in 0.23.2.

Basically, df.drop([]) throws the error when df does not have a unique index.

For instance, in pd 0.23.0:

import pandas as pd
df1 = pd.DataFrame({'a': [1, 2, 3]})
df2 = pd.DataFrame({'a': [4, 5, 6]})

df1.drop([])  # works

df3 = pd.concat((df1, df2))
df3.drop([])  # errors

This is because pd.concat keeps the indices from the original dataframes, so the result no longer has a unique index.
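
To see the repeated labels directly, one can inspect the index after concatenation. This small check uses only pandas and the same toy frames as above; it runs on any pandas version, since it only looks at the index and does not call drop().

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3]})
df2 = pd.DataFrame({'a': [4, 5, 6]})
df3 = pd.concat((df1, df2))

# Each input keeps its own 0..2 labels, so they repeat in the result.
print(list(df3.index))      # [0, 1, 2, 0, 1, 2]
print(df1.index.is_unique)  # True
print(df3.index.is_unique)  # False
```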

We can mitigate this on the fbprophet side, but in the meantime the workaround is easy: reset the index on the holidays dataframe before passing it to fbprophet. This code works in pandas 0.23.0:

import pandas as pd
import numpy as np
from fbprophet import Prophet

xmas = pd.DataFrame({
        'holiday': 'xmas',
        'ds': pd.to_datetime(['2019-12-25', '2018-12-25', '2017-12-25', '2016-12-25', '2015-12-25']),
        'upper_window': 1,
        "lower_window": -1
})
nye = pd.DataFrame({
        'holiday': 'nye',
        'ds': pd.to_datetime(['2019-12-31', '2018-12-31', '2017-12-31', '2016-12-31', '2015-12-31']),
        'upper_window': 1,
        'lower_window': 0
})
holiday_dataframe = pd.concat((xmas, nye))

holiday_dataframe = holiday_dataframe.reset_index()  # THE FIX

df = pd.DataFrame({
    'ds': pd.date_range(start='2015-01-01', periods=100, freq='M'),
    'y': np.random.rand(100),
})

m = Prophet(holidays=holiday_dataframe).fit(df)
future = m.make_future_dataframe(periods=10, freq='D')
forecast = m.predict(future)
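
For reference, two ways of getting a unique index on the concatenated holidays frame, shown with pandas only. Note that a plain reset_index() keeps the old labels as an extra column named "index"; using drop=True or ignore_index=True to avoid that column is a suggestion made here, not something the thread prescribes.

```python
import pandas as pd

xmas = pd.DataFrame({
    'holiday': 'xmas',
    'ds': pd.to_datetime(['2018-12-25', '2019-12-25']),
})
nye = pd.DataFrame({
    'holiday': 'nye',
    'ds': pd.to_datetime(['2018-12-31', '2019-12-31']),
})

# Option 1: renumber after concatenating; drop=True discards the old labels
# instead of storing them in a new "index" column.
reset = pd.concat((xmas, nye)).reset_index(drop=True)

# Option 2: let concat assign a fresh 0..n-1 index directly.
fresh = pd.concat((xmas, nye), ignore_index=True)

print(reset.index.is_unique, fresh.index.is_unique)
print(list(reset.columns))
```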

@tankle
Author

tankle commented Feb 11, 2019

Thanks @bletham and @lambdu .

@tankle tankle closed this as completed Feb 11, 2019
@bletham
Contributor

bletham commented Feb 11, 2019

I'm going to leave this open until we push the fix (we should get it to work in 0.23.0).

@bletham bletham reopened this Feb 11, 2019
@bletham
Contributor

bletham commented May 1, 2019

The pandas requirement was updated to 0.23.4 to avoid another non-backwards-compatible pandas change, so that will resolve this once pushed to PyPI.

@bletham bletham added the ready label May 1, 2019
@bletham bletham added this to the v0.5 milestone May 1, 2019
@bletham
Contributor

bletham commented May 21, 2019

Pushed to PyPI

@bletham bletham closed this as completed May 21, 2019
@pcko1

pcko1 commented May 27, 2019

I am still getting "not found in axis" when using df.drop, even after updating to 0.24.2.

@bletham
Contributor

bletham commented May 28, 2019

Could you post the full traceback for when you get the error so I can try and see what is happening? Or even better would be code that produces the error?

If you could also verify that you're using the latest version of fbprophet (0.5) that'd be great.

@elva4012

Please, how can I fix this problem?

@bletham
Contributor

bletham commented Jun 14, 2019

@elva4012 can you post code that generates the issue? And check the versions of pandas and fbprophet that you're using:

import pandas
print(pandas.__version__)
import fbprophet
print(fbprophet.__version__)
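
A version string can also be checked against the range discussed in this thread. The sketch below is only illustrative: `version_tuple` and `has_empty_drop_bug` are hypothetical helper names, and the affected range (reproduced on 0.23.0, fixed in 0.23.2) is taken from the comments above. Treating 0.23.1 as affected and earlier releases as unaffected is an assumption.

```python
import re


def version_tuple(version):
    """Parse the leading 'X.Y' or 'X.Y.Z' of a version string into ints."""
    match = re.match(r'(\d+)\.(\d+)(?:\.(\d+))?', version)
    if match is None:
        raise ValueError('unrecognised version string: %r' % version)
    return tuple(int(part or 0) for part in match.groups())


def has_empty_drop_bug(pandas_version):
    # Assumed range: from 0.23.0 (reproduced) up to, not including,
    # 0.23.2 (reported as fixed).
    return (0, 23, 0) <= version_tuple(pandas_version) < (0, 23, 2)


for v in ['0.18.1', '0.23.0', '0.23.4', '0.24.1']:
    print(v, has_empty_drop_bug(v))
```

In practice one would pass `pandas.__version__` to `has_empty_drop_bug`.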

@Dougz00

Dougz00 commented Feb 23, 2020

I borrowed the example code from https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html
and ran it locally (Spyder 3.7, pandas 0.24.2) but still got KeyError: "['B' 'C'] not found in axis":

import pandas as pd
import numpy as np

df = pd.DataFrame(np.arange(12).reshape(3, 4),
                  columns=['A', 'B', 'C', 'D'])
print(df)
# Expected:
#    A  B   C   D
# 0  0  1   2   3
# 1  4  5   6   7
# 2  8  9  10  11

print('Drop columns B and C: \n')
df.drop(['B', 'C'], axis=1, inplace=True)
print(df)
# Should be:
#    A   D
# 0  0   3
# 1  4   7
# 2  8  11

NB: newbie been doing pandas a few months.
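
One possible explanation, which the thread does not confirm: with inplace=True the columns are already gone after the first run, so running the drop line again on the same df (for example by re-running a cell in Spyder) raises this KeyError. The sketch below reproduces that situation and shows the errors='ignore' option of DataFrame.drop, which skips labels that are not present.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=['A', 'B', 'C', 'D'])

# First run: B and C exist, so the drop succeeds.
df.drop(['B', 'C'], axis=1, inplace=True)
print(list(df.columns))  # ['A', 'D']

# Second run on the same df: B and C are already gone.
second_drop_failed = False
try:
    df.drop(['B', 'C'], axis=1, inplace=True)
except KeyError as exc:
    second_drop_failed = True
    print('KeyError:', exc)

# errors='ignore' silently skips labels that are no longer present.
df.drop(['B', 'C'], axis=1, inplace=True, errors='ignore')
print(list(df.columns))
```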
