New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Prophet Regressor #2242
Add Prophet Regressor #2242
Conversation
Codecov Report
@@ Coverage Diff @@
## main #2242 +/- ##
=======================================
+ Coverage 99.9% 99.9% +0.1%
=======================================
Files 293 295 +2
Lines 26894 27055 +161
=======================================
+ Hits 26851 27012 +161
Misses 43 43
Continue to review full report at Codecov.
|
A current issue I'm facing regarding the After installation of the backend The path to this folder has to be assigned to the environment variable CMDSTAN. It can either be assigned to it separately as Prophet includes
If the backend is specified as
It appears to be looking for a file, This has been filed with Prophet. |
0b57023
to
d348666
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ParthivNaresh This is looking good! I have some broad comments on the installation process and how we can make that easier for users. Component logic looks good but I'll also take a closer look later!
@@ -66,11 +66,16 @@ jobs: | |||
! pip freeze | grep -E "xgboost|catboost|lightgbm|plotly|ipywidgets|category_encoders" | |||
exit $? | |||
- if: ${{ !matrix.core_dependencies }} | |||
name: Installing Dependencies | |||
name: Installing Dependencies and Prophet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now we're installing our dependencies (which installs prophet since it's in our requirements file) and then we have to reinstall prophet with the right backend.
Although that works, I think we can simplify it a bit. I tried this out locally, and it seems to work (in the sense the unit tests pass)
pip install cmdstanpy==0.9.68
<compile cmdstan>
export CMDSTAN=<path-to-cmdstan>
export STAN_BACKEND=CMDSTANPY
pip install -r dev-requirements.txt
I think this also makes it easier to communicate to users how to properly install prophet with the right backend.
This reminds me, we should add a section to the install page in the docs about prophet installation! I'm hoping it can be this:
pip install cmdstanpy==0.9.68
<compile cmdstan>
export CMDSTAN=<path-to-cmdstan>
export STAN_BACKEND=CMDSTANPY
pip install evalml
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, hopefully we should be able to remove this once we can get the cmdstan-ext wheel up for an easy pip install
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update on this: we're not including Prophet in requirements because it's being installed as an extra-requirement in setup.py
. Therefore we'll be including the entire installation process for Prophet
for git-test-automl
and git-test-other
.
holidays_prior_scale=10, | ||
seasonality_mode="additive", | ||
random_seed=0, | ||
stan_backend="CMDSTANPY", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't meant to be changed right? I thought the plan was to only support prophet with the CMDSTANPY backend?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we need this to be specified so we can unit test it. Unfortunately the parameter doesn't get included in the attributes list unless it's passed into the Prophet component
make installdeps-test | ||
pip uninstall pystan -y | ||
pip freeze | ||
- if: ${{ !matrix.core_dependencies && (matrix.command == 'git-test-modelunderstanding' || matrix.command == 'git-test-dask') }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't install prophet for non-core dependencies and if the tests are git-test-modelunderstanding
and git-test-dask
because they don't need it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, Parthiv! I think that addressing @freddyaboulton 's concerns about the tests taking 15s each is very important to not bloating the CI time. If you can reduce those and ping me with the results, I'd love to follow up. Additionally, with respect to the suppression of stdout, I went through the ref and it seemed like some people had issues crop up with "running out of file descriptors" or something of the like. For the sake of getting the Prophet regressor in, I'm willing to accept the risk of some weirdness here, but I'd like an issue filed to try and address this differently post-merge as I think this has the potential of being tech debt with a high interest rate, ala the weirdness going on with the matplotlib pngs.
evalml/pipelines/components/estimators/regressors/prophet_regressor.py
Outdated
Show resolved
Hide resolved
setup.py
Outdated
from setuptools import find_packages, setup | ||
|
||
with open("README.md", "r") as fh: | ||
long_description = fh.read() | ||
|
||
extras_require = { | ||
'update_checker': ['alteryx-open-src-update-checker >= 2.0.0'], | ||
'prophet': ['cmdstan-builder == 0.0.3'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solid work @ParthivNaresh ! I think this looks great. I mainly have some questions about the install process that I want to resolve before merge.
@@ -1297,6 +1312,8 @@ def test_estimators_accept_all_kwargs( | |||
) | |||
if estimator_class.model_family == ModelFamily.ENSEMBLE: | |||
params = estimator.parameters | |||
elif estimator_class.model_family == ModelFamily.PROPHET: | |||
params = estimator.get_params() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would using estimator.parameters
work for prophet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that would work. I'm testing this in particular because Prophet has its own dictionary representation that covers a wide range of parameters being passed into the component.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Thanks for explaining how things work in the writeups/comments/calls! Glad we're able to add this to EvalML!
628fb6e
to
9ec1cf4
Compare
Fixes #1499
Thanks to a conversation with @dsherry , @kmax12 , and @tyler3991, we made the decision to include Prophet alongside the
cmdstanpy
backend. The reasoning behind this is that because we aren't distributing Prophet code (simply requiring the user to install it as a dependency) we are in the same position as Prophet is regarding its use ofpystan 2
and its GPLv3 license. As we aren't distributing the underlying code, we (and Prophet) aren't required to be under the same copyleft license.However, because we have to think downstream to Tempo and whether it will be installed on-premises (which would require packaging up dependencies and distributing them), we can't rely on
pystan 2
as a backend. Because Prophet doesn't supportpystan 3
yet (which uses a permissive license), we will have to rely oncmdstanpy
.For more information read here
This PR only deals with adding the estimator so we can get it into main. I'll be raising another PR to add it to AutoML alongside perf tests.