Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH Add SciPy tutorial dataset to fairlearn.datasets #1086

Merged
merged 12 commits into from
Jul 12, 2022

Conversation

rensoostenbach
Copy link
Contributor

@rensoostenbach rensoostenbach commented May 6, 2022

This PR adds the SciPy tutorial dataset to fairlearn.datasets, as discussed in #1066

Completing this item requires:

  • implement a function fairlearn.datasets.fetch_... (we need to come up with a name)
  • adjust relevant unit tests in test.unit.datasets
  • descriptive API reference (directly in the docstring)
  • a user guide in docs.user_guide.datasets

Description

I have implemented the fetch function as fairlearn.datasets.fetch_diabetes_hospital, but ofcourse it would be easy to rename it to something else if prefered. I wasn't sure how detailed the User Guide should be, and also what parts should be included in the user guide. If there is anything that would be valuable to include that isn't in there yet, I would be happy to write that as well.

Tests

  • existing tests adjusted

Documentation

  • user guide added or updated
  • API docs added or updated

Screenshots

api docs screenshot
user guide

@rensoostenbach
Copy link
Contributor Author

I still need to write the User Guide for this PR, I will start next week with that.
I am not sure why flake8 is giving the following error at line 13:

D412 No blank lines allowed between a section header and its content

That line and the surrounding lines are done in the same way as in the other _fetch_dataset.py files.

fairlearn/datasets/_fetch_diabetes_hospital.py Outdated Show resolved Hide resolved
fairlearn/datasets/_fetch_diabetes_hospital.py Outdated Show resolved Hide resolved
fairlearn/datasets/_fetch_diabetes_hospital.py Outdated Show resolved Hide resolved
Copy link
Contributor

@hildeweerts hildeweerts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few small comments!

fairlearn/datasets/_fetch_diabetes_hospital.py Outdated Show resolved Hide resolved
fairlearn/datasets/_fetch_diabetes_hospital.py Outdated Show resolved Hide resolved
test/unit/datasets/test_datasets.py Outdated Show resolved Hide resolved
@hildeweerts
Copy link
Contributor

Tagging @LeJit and @MiroDudik who may have thoughts on what (else) should be in the user guide.

@hildeweerts hildeweerts requested a review from MiroDudik May 9, 2022 13:07
Co-authored-by: Hilde Weerts <24417440+hildeweerts@users.noreply.github.com>
@codecov
Copy link

codecov bot commented May 11, 2022

Codecov Report

❗ No coverage uploaded for pull request base (main@d5950c0). Click here to learn what that means.
The diff coverage is n/a.

❗ Current head 42d98f0 differs from pull request most recent head 971a349. Consider uploading reports for the commit 971a349 to get more accurate results

@@          Coverage Diff           @@
##             main   #1086   +/-   ##
======================================
  Coverage        ?   0.00%           
======================================
  Files           ?      51           
  Lines           ?    2180           
  Branches        ?       0           
======================================
  Hits            ?       0           
  Misses          ?    2180           
  Partials        ?       0           
Flag Coverage Δ
unittests 0.00% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d5950c0...971a349. Read the comment docs.

@LeJit
Copy link
Contributor

LeJit commented May 12, 2022

It may be worthing including a link to the full preprocessing.py script, in case any users are interested in all the changes we made to the original dataset.

@rensoostenbach
Copy link
Contributor Author

It may be worthing including a link to the full preprocessing.py script, in case any users are interested in all the changes we made to the original dataset.

It is actually there already in line 35 of diabetes_hospital_data.rst, or would you (also) like to have it included somewhere else? @LeJit

@LeJit
Copy link
Contributor

LeJit commented May 16, 2022

@rensoostenbach I was thinking more about referencing it in the docstring of fetch_diabetes_hospital in the paragraph where you talk about the SciPy tutorial.

@adrinjalali
Copy link
Member

Merged with the latest main to have your docs rendered in our CI. You should do a git pull to have this on your local copy before you apply more changes.

@adrinjalali
Copy link
Member

🤦🏼 OpenML's SSL certificate is expired, they're working on it.

Copy link
Contributor

@hildeweerts hildeweerts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR is ready to get merged!

@adrinjalali adrinjalali merged commit b77c72d into fairlearn:main Jul 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants