Add doc page and example for shift detection #244

Merged (13 commits) on May 24, 2023

Conversation

@lballes lballes commented May 23, 2023

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@lballes lballes requested a review from 610v4nn1 May 23, 2023 12:32
@lballes lballes linked an issue May 23, 2023 that may be closed by this pull request
@github-actions

Coverage report

The coverage rate went from 85.68% to 85.76% ⬆️

None of the new lines are part of the tested code. Therefore, there is no coverage data about them.

:py:class:`~renate.shift.detector.ShiftDetector`, which defines the main interface. Once a
:code:`detector` object has been initialized, one calls :code:`detector.fit(dataset_ref)` on a
reference dataset (a PyTorch dataset object). This reference dataset characterizes the expected
data distribution. It may, e.g., be the validation set used during the previous fitting of the

Contributor

nit: dataset used for training would be easier to understand for inexperienced readers

Contributor Author

I'm hesitant, because it would be dangerous to use the actual training set if the feature extractor has seen it during training.
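
To make the concern above concrete, here is a minimal sketch of reserving a reference set that the feature extractor never trains on. The helper `split_train_reference` and its parameters are hypothetical names used for illustration only; they are not part of Renate.

```python
import random


def split_train_reference(num_samples, reference_fraction=0.2, seed=0):
    """Split sample indices into disjoint training and reference subsets.

    Hypothetical helper: the reference indices are held out so that the
    feature extractor never sees them during training.
    """
    rng = random.Random(seed)
    indices = list(range(num_samples))
    rng.shuffle(indices)
    num_reference = int(num_samples * reference_fraction)
    # First chunk becomes the held-out reference set, the rest is for training.
    return indices[num_reference:], indices[:num_reference]


train_idx, reference_idx = split_train_reference(1000)
```

With a PyTorch dataset, index lists like these could be passed to `torch.utils.data.Subset` to build the two datasets; a validation set that already exists serves the same purpose.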

doc/getting_started/shift_detection.rst
extractor.

.. literalinclude:: ../../examples/shift_detection/image_shift_detection.py
   :caption: Example

Contributor

It would be good to explain this example a bit more. At least break it down into a few pieces capturing the main aspects: create the dataset, sample a reference set (why?), sample a query set, extract features, and perform the test.
It doesn't need to explain everything, but a couple of sentences on why certain things are done would help greatly.

Contributor Author

I think the comments in the example script explain those things. I can expand them a bit. Or would you prefer having these explanations separate from the code?
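
For readers of this thread, the steps listed in the review (sample a reference set, sample a query set, perform the test) can be sketched roughly as follows. This is a toy illustration and not Renate's implementation: `ToyKSShiftDetector`, its `score` and `detect` methods, and the `alpha` parameter are hypothetical names, and the code assumes features have already been extracted and reduced to one scalar per sample. It uses a two-sample Kolmogorov-Smirnov statistic with the standard large-sample critical value.

```python
import bisect
import math
import random


class ToyKSShiftDetector:
    """Hypothetical fit-then-test detector on 1-D features (not Renate's API)."""

    def __init__(self, alpha=0.05):
        self.alpha = alpha
        self._reference = None

    def fit(self, reference):
        # Store the sorted reference features that define the expected distribution.
        self._reference = sorted(reference)
        return self

    def score(self, query):
        """Two-sample KS statistic: largest gap between the two empirical CDFs."""
        if self._reference is None:
            raise RuntimeError("Call fit() on a reference set first.")
        ref, qry = self._reference, sorted(query)
        statistic = 0.0
        # The maximum gap is attained at one of the observed sample points.
        for x in ref + qry:
            cdf_ref = bisect.bisect_right(ref, x) / len(ref)
            cdf_qry = bisect.bisect_right(qry, x) / len(qry)
            statistic = max(statistic, abs(cdf_ref - cdf_qry))
        return statistic

    def detect(self, query):
        statistic = self.score(query)
        n, m = len(self._reference), len(query)
        # Large-sample critical value: c(alpha) * sqrt((n + m) / (n * m)).
        c_alpha = math.sqrt(-0.5 * math.log(self.alpha / 2))
        return statistic > c_alpha * math.sqrt((n + m) / (n * m))


rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(500)]      # expected distribution
query_same = [rng.gauss(0.0, 1.0) for _ in range(500)]     # no shift
query_shifted = [rng.gauss(2.0, 1.0) for _ in range(500)]  # mean shifted by 2

detector = ToyKSShiftDetector().fit(reference)
score_same = detector.score(query_same)
score_shifted = detector.score(query_shifted)
shift_detected = detector.detect(query_shifted)
```

The reference set is sampled separately so that the detector has a fixed picture of the expected distribution; the query set is then compared against it, and a large statistic suggests a shift.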

@lballes lballes requested a review from 610v4nn1 May 23, 2023 13:29
@lballes lballes marked this pull request as ready for review May 23, 2023 13:40
@lballes lballes merged commit f3da9e4 into dev May 24, 2023
19 checks passed
@lballes lballes deleted the lballes-doc-shift branch May 24, 2023 08:45
lballes added a commit that referenced this pull request May 24, 2023
* Add NLP Components to Benchmarking (#213)

* Robust Integration Tests (#214)

* Update Renate Config Example (#226)

* Make Wild Time Available in Benchmarking (#187)

* Fix `target_column` bug in `HuggingFaceTextDataModule` (#233)

* Add MMD covariate shift detector (#237)

* Add KS covariate shift detector (#242)

* Update dependabot.yml (#248)

* Update versions of some requirements (#247)

* Add doc page and example for shift detection (#244)

* Bump version (#252)

---------

Co-authored-by: Lukas Balles <lukas.balles@gmail.com>
Linked issue (may be closed by this pull request): Tools for detecting shifts in data distribution
2 participants