Enable Thresholding for Time Series Binary #3140

Merged
ParthivNaresh merged 22 commits into main from EnableThresholdingForBinaryTS on Dec 15, 2021
@ParthivNaresh (Contributor)

Fixes #3095

codecov bot commented Dec 9, 2021

Codecov Report

Merging #3140 (44dcab5) into main (80aa901) will decrease coverage by 0.1%.
The diff coverage is 98.2%.

@@           Coverage Diff           @@
##            main   #3140     +/-   ##
=======================================
- Coverage   99.7%   99.7%   -0.0%     
=======================================
  Files        318     318             
  Lines      30908   30948     +40     
=======================================
+ Hits       30804   30843     +39     
- Misses       104     105      +1     

| Impacted Files | Coverage Δ |
|---|---|
| evalml/pipelines/pipeline_base.py | 98.5% <ø> (ø) |
| evalml/pipelines/time_series_pipeline_base.py | 100.0% <ø> (ø) |
| .../tests/pipeline_tests/test_time_series_pipeline.py | 99.8% <ø> (ø) |
| evalml/tests/conftest.py | 96.2% <97.4%> (+0.1%) ⬆️ |
| evalml/automl/engine/engine_base.py | 100.0% <100.0%> (ø) |
| evalml/automl/utils.py | 100.0% <100.0%> (ø) |
| evalml/tests/automl_tests/test_automl_utils.py | 100.0% <100.0%> (ø) |
| evalml/tests/automl_tests/test_engine_base.py | 100.0% <100.0%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 80aa901...44dcab5.

# Conflicts:
#	evalml/pipelines/time_series_pipeline_base.py
#	evalml/tests/automl_tests/test_automl_utils.py
#	evalml/tests/pipeline_tests/test_time_series_pipeline.py
ParthivNaresh marked this pull request as ready for review on December 10, 2021, 18:46

@freddyaboulton (Contributor) left a comment

@ParthivNaresh Looks good to me! Thank you for making this change. The one thing I want to square away before merge is whether we should tune the threshold on forecast_horizon observations or use predict_proba_in_sample.

        automl_config.optimize_thresholds
        and pipeline.can_tune_threshold_with_objective(threshold_tuning_objective)
    ):
        test_size_ = (

I'm wondering if we should use predict_proba_in_sample rather than predict_proba.

My thoughts are:

  • The target is known during search, so we don't have to worry about the forecast horizon.
  • The forecast horizon is probably less than 20% of the data and will usually be small, I think. I wonder if that's enough data to find a good threshold.
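
To make the sample-size concern concrete, here is a minimal, self-contained sketch in plain Python. It is not EvalML code: `f1_at_threshold`, `best_threshold`, and `tuning_rows` are hypothetical helpers written for illustration, F1 is an arbitrary choice of tuning objective, and the 20% fraction is taken from the comment above rather than from the library. It shows a simple grid search for a decision threshold and how few rows that search would see if the tuning split is sized by the forecast horizon.

```python
from typing import Dict, Optional, Sequence


def f1_at_threshold(
    y_true: Sequence[int], proba: Sequence[float], threshold: float
) -> float:
    """F1 score when the positive class is predicted for proba >= threshold."""
    tp = fp = fn = 0
    for actual, p in zip(y_true, proba):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    # With no positives predicted or present, F1 is undefined; report 0.0.
    return 2 * tp / denom if denom else 0.0


def best_threshold(
    y_true: Sequence[int],
    proba: Sequence[float],
    candidates: Optional[Sequence[float]] = None,
) -> float:
    """Grid-search the threshold that maximizes F1 on the tuning rows."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    # max() keeps the first candidate among ties for the highest score.
    return max(candidates, key=lambda t: f1_at_threshold(y_true, proba, t))


def tuning_rows(
    n_rows: int, forecast_horizon: int, fixed_fraction: float = 0.2
) -> Dict[str, int]:
    """Rows available for tuning under the two sizing options (assumed)."""
    return {
        "forecast_horizon": forecast_horizon,
        "fixed_fraction": round(n_rows * fixed_fraction),
    }


if __name__ == "__main__":
    # Toy tuning set: four rows with known targets and predicted probabilities.
    y_true = [0, 0, 1, 1]
    proba = [0.1, 0.4, 0.35, 0.8]
    print(best_threshold(y_true, proba))
    # 1000 training rows with a forecast horizon of 10.
    print(tuning_rows(1000, 10))
```

With only a handful of tuning rows, the F1 curve is a coarse step function of the threshold, so many candidate thresholds tie. That is the risk the comment raises about tuning on just forecast_horizon observations.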


-    def __init__(self, parameters, random_seed=0):
+    def __init__(
+        self, parameters, custom_name=None, component_graph=None, random_seed=0

Just a cosmetic change, right?

@chukarsten (Contributor) left a comment

LGTM!

ParthivNaresh merged commit 93444ec into main on Dec 15, 2021
angela97lin mentioned this pull request on Dec 22, 2021
freddyaboulton deleted the EnableThresholdingForBinaryTS branch on May 13, 2022, 15:34
Development

Successfully merging this pull request may close these issues.

Enable Threshold Optimization for Binary Time Series Pipelines