Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use sphinx-autoapi to generate api reference #2458

Merged
merged 25 commits into from Jul 20, 2021

Conversation

freddyaboulton
Copy link
Contributor

@freddyaboulton freddyaboulton commented Jun 29, 2021

Pull Request Description

Fixes #601

Fixes #1877

See this doc for the pros and cons of making this change.


After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request by adding :pr:123.

@codecov
Copy link

codecov bot commented Jun 29, 2021

Codecov Report

Merging #2458 (c2be105) into main (87df494) will increase coverage by 0.1%.
The diff coverage is 100.0%.

Impacted file tree graph

@@           Coverage Diff           @@
##            main   #2458     +/-   ##
=======================================
+ Coverage   99.7%   99.7%   +0.1%     
=======================================
  Files        283     283             
  Lines      25897   25903      +6     
=======================================
+ Hits       25796   25802      +6     
  Misses       101     101             
Impacted Files Coverage Δ
evalml/automl/automl_search.py 99.4% <ø> (-<0.1%) ⬇️
evalml/data_checks/sparsity_data_check.py 100.0% <ø> (ø)
evalml/model_understanding/force_plots.py 100.0% <ø> (ø)
...alml/objectives/binary_classification_objective.py 100.0% <ø> (ø)
evalml/pipelines/binary_classification_pipeline.py 100.0% <ø> (ø)
...lines/components/ensemble/stacked_ensemble_base.py 100.0% <ø> (ø)
...components/ensemble/stacked_ensemble_classifier.py 100.0% <ø> (ø)
.../components/ensemble/stacked_ensemble_regressor.py 100.0% <ø> (ø)
...ents/estimators/classifiers/baseline_classifier.py 100.0% <ø> (ø)
...ents/estimators/classifiers/catboost_classifier.py 100.0% <ø> (ø)
... and 54 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 87df494...c2be105. Read the comment docs.

@@ -0,0 +1,60 @@
{% if obj.display %}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to get the docstrings for hyperparameter_ranges etc to show up

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment is probably related to this :d

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch with the original comment. I just pushed up a change to fix this! This is how it looks locally now:

image

Made the same change for the linear regressor!

{% if obj.name.split('.')[-1][0] == '_' %}
{{ obj.name }}
{% elif obj.name.split('.')|length <= 2 %}
{{ (' '.join(obj.name.split('.')[-1].split('_'))|title) }}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This template is to modify the display names of the modules, e.g. evalml.model_understanding to Model Understanding

@@ -124,9 +124,6 @@ class AutoMLSearch:

_MAX_NAME_LEN = 40

# Necessary for "Plotting" documentation, since Sphinx does not work well with instance attributes.
plot = PipelineSearchPlots
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@angela97lin
Copy link
Contributor

angela97lin commented Jul 6, 2021

Just a quick comment but the "Show Value" dropdown is interesting yet quite confusing 😂
For example, for the EN Classifier I see this, which looks a little odd?:

(https://feature-labs-inc-evalml--2458.com.readthedocs.build/en/2458/autoapi/evalml/pipelines/components/index.html#evalml.pipelines.components.ElasticNetClassifier)

EDIT: Ah, I see others look fine so it must have been a formatting issue, but this could be something to keep in mind as we have to manually update parameters in our docs and may run into the risk of these funky looks without warning 😛

@bchen1116
Copy link
Contributor

Some of the formatting in the docs is weird, like here, here, here, and here where the description is pushed far to the side in comparison to the rest of the docstring.

Also I'm not sure if I like the look of the docs, like here. It feels a little overwhelming and kinda messy, even though it does include a lot of info, but if other people are fine with it, then it's fine!

@freddyaboulton
Copy link
Contributor Author

@bchen1116 Great points! Regarding the description being pushed to the side, I can make it so that the function arguments are not displayed in the api reference page but only when users click on them. That matches the behavior we say on latest/stable:

image

Regarding FraudCost looking messy, do you have any suggestions for improvement? I can remove the Bases: evalml.objectives.binary_classification_objective.BinaryClassificationObjective which isn't in latest or stable. Anything else?

@bchen1116
Copy link
Contributor

@freddyaboulton The fixes you suggested sound good!

For me, the main thing that makes it harder to follow is the fact that the attributes are the same size / bigger than the class:
image

For instance, above where the font size of name or greater_is_better is the same as FraudCost. Very nitpicky, but it doesn't feel as clean or easy to read. What do you think?

Copy link
Contributor

@chukarsten chukarsten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cool with this if we get the stated performance benefit. I think that makes it worth it. I do think it looks kind of weird and the documentation of the default values of the class variables is strange. It will take some getting used to.

~~~~~~~~~

.. autoapisummary::

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I legit don't know how to review this lol

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

relatable

@@ -57,7 +57,7 @@ def gen_force_plot(shap_values, training_data, expected_value, matplotlib):


def force_plot(pipeline, rows_to_explain, training_data, y):
"""Function to generate the data required to build a force plot.
r"""Function to generate the data required to build a force plot.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this 'r' is here? I have no idea how I caught this.

Comment on lines +31 to +63
"""[
ProblemTypes.BINARY,
ProblemTypes.MULTICLASS,
ProblemTypes.TIME_SERIES_BINARY,
ProblemTypes.TIME_SERIES_MULTICLASS,
]"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really weird and I'm not sure I like it. It adds a lot of bloat, it seems. Probably still worth the performance benefit, but will take some getting used to.

@angela97lin
Copy link
Contributor

@freddyaboulton Great work figuring out how to get this package to work, and putting up the doc of pros and cons! It makes this much more clear to review 😁

I think what I say might need to be taken with a grain of salt, because I personally don't like how this change looks 😅 As we talked about in OH, this feels very developer-focus, with focus on packages and modules as the way to organize things. Still, as you pointed out, there's really cool benefits to this. In particular, I really like the decrease in runtime (though at this point, what runs slower? Docs or windows tests? Are we still limited by the runtime of window tests?), that docs are generated automatically (since I know I'm not the only one guilty of not adding the most comprehensive docstrings all the time 😛), and the new component class attributes showing up as "code" (though I like less that it requires clicking the dropdown and "multiline value" is not as informative lol). I also like that the autogenerated docs specify what are properties, abstract methods, etc.

Here are some more specific details:

image

  • I agree that the dropdown looks nicer in your PR but arguably, what we have on the right side of our current docs is comparable :d
    image
  • I honestly find the dump of everything in a class overwhelming LOL as opposed to our current behavior, which simply lists each of the methods. I looked at scikit-learn's docs (ex: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html) and realized they also list everything, but for some reason it's less overwhelming... so my guess is, it's the formatting. Before, we take advantage of lines as visible borders to separate methods. Now, we use whitespace. Given the bold and larger font we use for method/property/attribute names, it feels like a lot is thrown on the page at once. Is there a way to change the formatting more? (I know I'm asking for a lot now lolol). Maybe if there's a way to list all of the methods and then allow clicking to bring you to the method details like scikit-learn does, that could also be helpful as a way to summarize methods?
  • I'm not a fan of how because everything is loaded, clicking on something just brings you to the index. For example, clicking on MulticlassClassificationObjective brings you to the top of the MulticlassClassificationObjective class, but since it's tiny, most of my screen actually shows the ObjectiveBase class. That's a bit confusing 😂 (https://feature-labs-inc-evalml--2458.com.readthedocs.build/en/2458/autoapi/evalml/objectives/index.html#evalml.objectives.MulticlassClassificationObjective)
    image

Other specifics:

Oddly enough, I actually like that our current docs mean that we have to think really specifically about how our code is structured, and what we want to display to users. I realized this while looking at https://feature-labs-inc-evalml--2458.com.readthedocs.build/en/2458/autoapi/evalml/tuners/index.html#evalml.tuners.Tuner: it lists the Exceptions we added to tuner_exception.py, I'm guessing, right smack in the middle of the page of all of the tuner classes because that's how our code is organized (tuner.py --> tuner_exception.py --> random...). This makes sense from a code perspective, sure, but would be a scavenger hunt for a user to find 😂

Again, take what I'm saying with a grain of salt--my dislike for the look might have me nitpicking at a lot of random things 😂

@freddyaboulton
Copy link
Contributor Author

freddyaboulton commented Jul 12, 2021

@angela97lin @bchen1116 @chukarsten Thanks for the detailed feedback! I think I've addressed most of your specific points, detailed below. Before reviewing the docs clear your browser cache because I updated the css file and if your browser cached it, you won't see the changes, namely method/function name code highlighting.

The drop-down menu for class attributes adds bloat/is non-itutitive

  • We're no longer using the drop down menu for class attributes.

Methods now display parameters, which means the description width may vary or not show up at all

  • Methods and functions no longer display parameters, which is what our current docs do!

Method/function order in classes or modules is messy

  • Methods and functions are now displayed in alphabetical order, which is what our current docs do! The contents in a module are also listed in alphabetical order now too.

Displaying all methods in a class looks messy. The font size of the class attributes is so big that things also look messy.

  • We're now listing all methods in a table with a link to the full method documentation. This is what sklearn does. We also display all the attributes in a table so that the font size is smaller. I also added code highlighting to the method and class names to break things up a little bit more. This is also what sklearn does.

image

I'm not a fan of how because everything is loaded, clicking on something just brings you to the index. For example, clicking on MulticlassClassificationObjective brings you to the top of the MulticlassClassificationObjective class, but since it's tiny, most of my screen actually shows the ObjectiveBase class.

  • That was because before we were not documenting inherited methods/attributes. Now we are. Clicking on the multiclassification objective link will show all the methods/attributes for the objective. It doesn't look as confusing as before! See the image I linked above.

Still tweaking some stuff but let me know if this is in the right direction!

@angela97lin
Copy link
Contributor

@freddyaboulton Only took a quick peek but I think it looks much cleaner already 🤩

@bchen1116
Copy link
Contributor

@freddyaboulton the docs look great! Thanks for making the changes, I think this flow is much cleaner and easier to follow! Will look more thoroughly tomorrow

Copy link
Contributor

@bchen1116 bchen1116 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't quite sure how to review the rst files, but the docs look great! Thanks for cleaning them up so much and putting in the work to make them much cleaner and easy to follow!

Left a couple comments about things we could delete, but otherwise 🚢

@@ -60,7 +59,13 @@
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]
#templates_path = ["_templates"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we delete this comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure! Thanks for pointing this out!

@@ -213,8 +221,8 @@
}

# autoclass_content = 'both'
autosummary_generate = ["api_reference.rst"]
templates_path = ["_templates"]
#autosummary_generate = ["api_reference.rst"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and these?

Copy link
Contributor

@ParthivNaresh ParthivNaresh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work on this, and after all the improvements based on the others' suggestions, I think this is solid.
I had a couple of questions about the formatting.
For hyperparameter_ranges for the transformers, I think the format is getting a little messed up:
https://feature-labs-inc-evalml--2458.com.readthedocs.build/en/2458/autoapi/evalml/pipelines/components/index.html#evalml.pipelines.components.SimpleImputer
https://feature-labs-inc-evalml--2458.com.readthedocs.build/en/2458/autoapi/evalml/pipelines/components/index.html#evalml.pipelines.components.TargetImputer
https://feature-labs-inc-evalml--2458.com.readthedocs.build/en/2458/autoapi/evalml/pipelines/components/index.html#evalml.pipelines.components.PolynomialDetrender

Additionally, Exceptions have a method called with_traceback that seems to be missing the second half of its docstring with_traceback Exception.with_traceback(tb) -
https://feature-labs-inc-evalml--2458.com.readthedocs.build/en/2458/autoapi/evalml/exceptions/index.html#evalml.exceptions.MissingComponentError.with_traceback
https://feature-labs-inc-evalml--2458.com.readthedocs.build/en/2458/autoapi/evalml/tuners/tuner_exceptions/index.html
I think it's because they're inherited from the BaseException class in builtins and the new line is confusing Sphinx:

"""
     Exception.with_traceback(tb) --
          set self.__traceback__ to tb and return self.
"""

@freddyaboulton freddyaboulton merged commit 501819c into main Jul 20, 2021
@freddyaboulton freddyaboulton deleted the 601-use-sphinx-autoapi branch July 20, 2021 19:51
@freddyaboulton
Copy link
Contributor Author

@ParthivNaresh Thanks for pointing out those issues! I fixed the formatting for the hyperparameter ranges in the components you listed. I also modified the template to not display with_traceback and args for exceptions. Those were being pulled in from the base class and don't add a lot of info imo.

This is what it looks like now! You'll need to clear your browser cache so that you pull in the new css:

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Rename Objective Functions to Objectives in docs Use sphinx-autoapi to build API modules statically
5 participants