-
Notifications
You must be signed in to change notification settings - Fork 26.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DocTests Speech] Add doc tests for all speech models #15031
[DocTests Speech] Add doc tests for all speech models #15031
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Let's take the opportunity to have the doctests passing on this PR before merging! (5 files failing now, but for simple issues I believe)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for tackling those! Could you develop what issue you had with black exactly?
…into start_cleaning_doc_tests
diff = clean_code != code | ||
if diff: | ||
print(f"Overwriting content of {code_file}.") | ||
with open(code_file, "w", encoding="utf-8", newline="\n") as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to put them in some temp dir to run the tests and not overwrite the existing one (otherwise we will get modifications we don't want when running the doctest locally).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I made two commands now - one that adds the line - one that reverts it. Think this is more intuitive for local testing instead of copying the files and this way we also don't need to add a temp
path before every file to be tested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should document somewhere the instruction that has to be run before/after testing locally then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a long intro in the beginning of the file. @sgugger - do you think it could be a good idea to add a README.md to utils
so that community contributors have some docs on how to use the utils scripts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking more of making a section in the doc page for the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok that would be good for me as well!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, great to have those running again!
.github/workflows/doctests.yml
Outdated
- "tests/**" | ||
- ".github/**" | ||
- "templates/**" | ||
types: [assigned, opened, synchronize, reopened] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we decided to remove assigned from those @LysandreJik, can you confirm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah but Patrick mentioned that he'll remove that from the PR before merging as it only needs to run on schedule
... "hf-internal-testing/tiny-random-unispeech-sat" | ||
... ) | ||
>>> model = UniSpeechForPreTraining.from_pretrained("microsoft/unispeech-large-1500h-cv") | ||
>>> # TODO: Add full pretraining example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove the doctest syntax while waiting for a tested example, but I don't think it's good to remove the example from the doc entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the problem is that it was just copy-pasted and never worked.
>>> # for contrastive loss training model should be put into train mode | ||
>>> model.train() | ||
>>> loss = model(input_values, mask_time_indices=mask_time_indices).loss | ||
>>> # TODO: Add full pretraining example |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the problem is that it was just copy-pasted and never worked.
- name: Clean files after doctests | ||
run: | | ||
python utils/prepare_for_doc_test.py src docs --remove_new_line |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't be run if the previous instruction fails (but not sure we care)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but yeah I'm also not sure if it matters really as it's the only test run in this suite
diff = clean_code != code | ||
if diff: | ||
print(f"Overwriting content of {code_file}.") | ||
with open(code_file, "w", encoding="utf-8", newline="\n") as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should document somewhere the instruction that has to be run before/after testing locally then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, thank you for working on it @patrickvonplaten
.github/workflows/doctests.yml
Outdated
- "tests/**" | ||
- ".github/**" | ||
- "templates/**" | ||
types: [assigned, opened, synchronize, reopened] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah but Patrick mentioned that he'll remove that from the PR before merging as it only needs to run on schedule
@@ -35,8 +35,16 @@ jobs: | |||
run: | | |||
apt -y update && apt install -y libsndfile1-dev | |||
pip install --upgrade pip | |||
pip install .[dev] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will revert this once we have a stable dev
docker image
@@ -19,7 +19,7 @@ env: | |||
|
|||
jobs: | |||
run_doctests: | |||
runs-on: [self-hosted, docker-gpu, single-gpu] | |||
runs-on: [self-hosted, docker-gpu-test, single-gpu] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leaving on -test
for now
What does this PR do?
This PR revives the doc tests and adds doc tests for all speech models now that the new docs are finished.
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.