Simplify graph_json output and fix bugs #3049

Merged: 11 commits merged into main on Nov 17, 2021

@eccabay eccabay commented Nov 15, 2021

Closes #3038

Also fixes two bugs that prevented some pipelines from getting JSON serialized:

  • Ensemble components store an entire estimator object as the "final_estimator" parameter. This PR serializes that estimator as "final_estimator_name" and "final_estimator_parameters" instead.
  • JSON cannot serialize the np.int64 datatype, which is the type returned by Integer ranges for hyperparameter searches. This could be changed within each Integer() call, but those are spread across too many places in the code base for this PR, so I left the fix within graph_json itself. In the future, however, it might be better practice to have Integer generate int values instead of np.int64.
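
A minimal sketch of the two fixes described above, not the code merged in this PR. `StubEstimator` and `to_json_safe` are hypothetical names introduced for illustration, and the sketch assumes the estimator object exposes `name` and `parameters` attributes.

```python
import json

import numpy as np


class StubEstimator:
    """Hypothetical stand-in for an estimator component (assumption)."""

    name = "Stub Estimator"

    def __init__(self, **parameters):
        self.parameters = parameters


def to_json_safe(parameters):
    """Return a copy of ``parameters`` that ``json.dumps`` accepts."""
    safe = {}
    for key, value in parameters.items():
        if key == "final_estimator" and value is not None:
            # Replace the estimator object with its name and parameters.
            safe["final_estimator_name"] = value.name
            safe["final_estimator_parameters"] = to_json_safe(value.parameters)
        elif isinstance(value, np.integer):
            # NumPy integers such as np.int64 are not JSON serializable.
            safe[key] = int(value)
        else:
            safe[key] = value
    return safe


params = {
    "n_jobs": np.int64(-1),
    "final_estimator": StubEstimator(max_depth=np.int64(6)),
}
encoded = json.dumps(to_json_safe(params))
```

Calling `json.dumps(params)` directly raises `TypeError` here, because the first value is an `np.int64`.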

codecov bot commented Nov 15, 2021

Codecov Report

Merging #3049 (a06865b) into main (a20dcd1) will decrease coverage by 0.8%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##            main   #3049     +/-   ##
=======================================
- Coverage   99.8%   99.0%   -0.7%     
=======================================
  Files        312     312             
  Lines      30232   30255     +23     
=======================================
- Hits       30144   29935    -209     
- Misses        88     320    +232     
| Impacted Files | Coverage Δ |
|---|---|
| evalml/tests/automl_tests/test_automl.py | 99.5% <ø> (-<0.1%) ⬇️ |
| evalml/pipelines/pipeline_base.py | 98.5% <100.0%> (+0.1%) ⬆️ |
| evalml/tests/pipeline_tests/test_graphs.py | 100.0% <100.0%> (ø) |
| evalml/automl/pipeline_search_plots.py | 17.9% <0.0%> (-82.1%) ⬇️ |
| ...l/tests/automl_tests/test_pipeline_search_plots.py | 25.6% <0.0%> (-74.4%) ⬇️ |
| ...ests/automl_tests/test_automl_search_regression.py | 84.6% <0.0%> (-15.4%) ⬇️ |
| .../automl_tests/test_automl_search_classification.py | 89.8% <0.0%> (-10.2%) ⬇️ |
| evalml/tests/automl_tests/test_automl_utils.py | 90.6% <0.0%> (-9.4%) ⬇️ |
| ...lml/tests/automl_tests/test_iterative_algorithm.py | 92.4% <0.0%> (-7.6%) ⬇️ |
| evalml/automl/utils.py | 98.4% <0.0%> (-1.6%) ⬇️ |

... and 2 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@ParthivNaresh ParthivNaresh left a comment

Looks great, thanks for addressing this!

@freddyaboulton freddyaboulton left a comment

@eccabay Thanks for adding this!

I still prefer letting users encode the output themselves as opposed to erroring out on some pipelines, but I'm OK with the implementation you have now. If we decide to always try to encode to JSON, then it may be good to incorporate @angela97lin's custom encoder so that we can catch tricky bools or other data types. That can happen in a follow-up PR though.
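
For readers unfamiliar with the idea, a custom encoder along these lines might look like the sketch below. This is not the encoder referenced in the comment, which is not shown in this thread; `NumpyFriendlyEncoder` is a hypothetical name, and the set of handled types is an assumption.

```python
import json

import numpy as np


class NumpyFriendlyEncoder(json.JSONEncoder):
    """Hypothetical encoder that converts NumPy scalars to built-in types."""

    def default(self, o):
        # json calls default() only for objects it cannot serialize itself.
        if isinstance(o, np.bool_):
            return bool(o)
        if isinstance(o, np.integer):
            return int(o)
        if isinstance(o, np.floating):
            return float(o)
        if isinstance(o, np.ndarray):
            return o.tolist()
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(o)


payload = {
    "shuffle": np.bool_(True),
    "n_estimators": np.int64(100),
    "alpha": np.float32(0.5),
}
text = json.dumps(payload, cls=NumpyFriendlyEncoder)
```

Passing the encoder through `cls=` keeps the type handling in one place, so callers do not need to pre-convert each parameter value.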

* Changes
* Updated the ``Pipeline.graph_json`` function to return a dictionary of "from" and "to" edges instead of tuples :pr:`3049`

This might technically be a breaking change?
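
To illustrate why the edge format change could affect existing callers, here is a rough sketch. It assumes each edge was previously a `(source, target)` tuple and is now a dictionary with "from" and "to" keys; the node names and the helper `edges_to_dicts` are hypothetical.

```python
def edges_to_dicts(edges):
    """Convert (source, target) tuples into {"from": ..., "to": ...} dicts."""
    return [{"from": source, "to": target} for source, target in edges]


# Hypothetical edges in the assumed old tuple format.
old_edges = [("X", "Imputer"), ("Imputer", "Random Forest Classifier")]
new_edges = edges_to_dicts(old_edges)

# Code written against tuples unpacks each edge positionally.
sources_old = [source for source, _ in old_edges]

# The same unpacking on a two-key dict yields its keys, not its values,
# so old caller code would fail silently rather than raise.
sources_broken = [source for source, _ in new_edges]

# Callers need to switch to key access for the dictionary format.
sources_new = [edge["from"] for edge in new_edges]
```

Here `sources_broken` ends up as `["from", "from"]`, which is the kind of silent behavior change that supports treating this as a breaking change.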

@eccabay eccabay merged commit 51c8914 into main Nov 17, 2021
@eccabay eccabay deleted the 3038_graph_json_simplify branch November 17, 2021 18:52
@chukarsten chukarsten mentioned this pull request Nov 29, 2021

Successfully merging this pull request may close these issues.

Simplify output of Pipeline.graph_json() function
4 participants