Use consistent filepath pattern matching across the board #39

Merged · 7 commits · Mar 30, 2022

Conversation

mchaaler
Contributor

Fix issue #36

@mchaaler
Contributor Author

mchaaler commented Mar 19, 2022

Sorry, I'm not familiar enough with Git/GitHub processes.
Although I created a new branch to specifically address issue #36, it appears that I did not first 'exclude' the commit addressing issue #37 (97a4516), which was already proposed in PR #38.

Anyway...
The build is now failing on Windows as well as on Linux.

@smarie
Owner

smarie commented Mar 21, 2022

> Although I created a new branch to specifically address issue #36, it appears that I did not first 'exclude' the commit addressing issue #37 (97a4516), which was already proposed in PR #38.

Since I have now merged #38, you can simply update this branch (pull from original/main) and those commits should disappear from the diff displayed here.

@smarie
Owner

smarie commented Mar 21, 2022

> The build is now failing on Windows as well as on Linux.

Nice! I'll modify the GitHub workflow to

  • add the test site generation to the tests session for all runners (not only 3.7)
  • enable Windows

I'll do this in another PR, but once it is done we should be able to see it fail correctly :)

@smarie
Owner

smarie commented Mar 21, 2022

Done. Can you please update this branch, so that we can see whether the Windows workers now fail too?

@mchaaler
Contributor Author

Success! All tests are now failing! 🤪

@smarie
Owner

smarie commented Mar 22, 2022

Cool! Thanks @mchaaler.
Now the easy part :) fixing the issue. Do you want to give it a try?

@mchaaler
Contributor Author

> Do you want to give it a try?

Sure!

@mchaaler
Contributor Author

Since filepaths are sometimes compared to filepaths stored in the mkdocs/mkdocs-gallery config dict, I had to go back to the source and override the Dir and File classes from mkdocs.config.config_options so that they return pathlib objects (mchaaler@2ff91ab).

Then fixing the not-failing-although-expected-to-fail bug (#34) was just a matter of adding the src_py_file path to the gallery_conf['failing_examples'] dict...
... and removing the minimum (but necessary) number of as_posix() calls so that the tests and build pass.

Let's check that and see if the CI builds are successful.
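
For illustration, here is a minimal sketch of the kind of override described above, assuming subclasses that simply wrap the value validated by mkdocs' Dir and File options in a pathlib.Path (the class names are made up; the actual code is in 2ff91ab):

from pathlib import Path

from mkdocs.config import config_options


class PathDir(config_options.Dir):
    # Hypothetical sketch: behaves like mkdocs' Dir option, but yields a pathlib.Path.
    def run_validation(self, value):
        # Let the parent class perform the usual validation/normalization,
        # then wrap the resulting string in a Path object.
        return Path(super().run_validation(value))


class PathFile(config_options.File):
    # Hypothetical sketch: same idea for the File option.
    def run_validation(self, value):
        return Path(super().run_validation(value))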

Comment on lines 977 to 982
# If expected to fail, let's assume it did when executed previously
if script.src_py_file in script.gallery_conf['expected_failing_examples']:
    script.gallery_conf['failing_examples'][script.src_py_file] = (
        "Due to MD5 check, script has not been actually executed - "
        "Assumed it failed as expected during previous execution."
    )

Owner

I would be curious to see how this is done in sphinx-gallery. Maybe there is an opportunity to suggest a PR to them?

Contributor Author

Indeed I did not check sphinx-gallery before implementing the proposed solution.

https://github.com/sphinx-gallery/sphinx-gallery/blob/26fc508bac5459c3d6daf5db64cf5d45cda53b7d/sphinx_gallery/gen_rst.py#L961-L969

It seems they handle this by using the gallery_conf['stale_examples'] list instead of the gallery_conf['failing_examples'] dict. I'm not familiar enough with the overall process to evaluate which approach is best. It seemed natural to me to use the 'failing_examples' entry, but maybe we should give their approach a try.

Owner

mkdocs-gallery is really a copy of sphinx-gallery: I just modified a few design patterns (pathlib, objects) and made a few steps more readable, but I tried not to change the process at all.

The equivalent lines are here:

if not script.has_changed_wrt_persisted_md5():
    # A priori we can...
    skip_and_return = True
    # ...however for executables (not shared modules) we might need to run anyway because of config
    if script.is_executable_example():
        if script.gallery_conf['run_stale_examples']:
            # Run anyway because config says so.
            skip_and_return = False
        else:
            # Add the example to the "stale examples" before returning
            script.gallery_conf['stale_examples'].append(script.dwnld_py_file.as_posix())
    if skip_and_return:
        # Return with 0 exec time and mem usage, and the existing thumbnail
        thumb_source_path = script.get_thumbnail_source(file_conf)
        thumb_file = create_thumb_from_image(script, thumb_source_path)
        return GalleryScriptResults(script=script, intro=intro, exec_time=0., memory=0., thumb=thumb_file)

So it is a bit strange that, in our case, the unmodified failing example is not added to the stale examples.

Owner

Oh-oh! It seems that there is an "as_posix" here! :)

@mchaaler
Contributor Author

> src_file is used in many lines in this function body, in particular _set_reset_logging_tee, os.chdir and compile. I cannot guarantee that these behave similarly when they receive a path object.

Let's check that:

  • _set_reset_logging_tee - I first added the requested type hints in _check_reset_logging_tee and the _LoggingTee constructor. Then the src_filename attribute is used in _LoggingTee.write(), where the logger.verbose call redirects to Logger.debug. This seems safe.
  • os.chdir - Accepts a path-like object since 3.6 (https://docs.python.org/3.7/library/os.html#os.chdir)
  • compile - The documentation (https://docs.python.org/3.7/library/functions.html#compile) is not very detailed about the use of the 'filename' parameter. It doesn't seem to be used to access the file, but rather to display the filename in error messages (the builtins.compile() docstring says: "The filename will be used for run-time error messages."). To be safe we could pass str(src_file) instead of the path object, but my understanding is that, whatever string formatting method is used (f-strings, str.format or the legacy % operator), converting path objects to strings works fine. A small runnable check of these last two points is sketched below.
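
Here is that check, relying only on documented stdlib behaviour (the file name and content are made up for the example):

import os
import tempfile
from pathlib import Path

# Create a throwaway script so the snippet is self-contained.
tmp_dir = Path(tempfile.mkdtemp())
src_file = tmp_dir / "plot_demo.py"  # hypothetical example script
src_file.write_text("print('hello from the example script')\n")

# os.chdir accepts any path-like object (pathlib.Path included) since Python 3.6.
os.chdir(src_file.parent)

# compile() only uses `filename` for run-time error messages; passing
# str(src_file) keeps us on the safe side, as discussed above.
code = compile(src_file.read_text(), filename=str(src_file), mode="exec")
exec(code)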

@smarie
Owner

smarie commented Mar 25, 2022

Nice analysis @mchaaler! I resolved the comment on src_file.
The only remaining comment is the one on "stale_examples".

@mchaaler
Contributor Author

mchaaler commented Mar 26, 2022

> The only remaining comment is the one on "stale_examples".

I'm currently analyzing how it is done in sphinx-gallery. It seems (still pending deeper understanding...) that the 'expected-to-fail' examples are executed anyway.

Here's why:

  1. The handle_exception() method in gen_rst.py reverts the script_vars['execute_script'] value to False, even though the script is both executable and was executed (link to sphinx-gallery).

  2. This is done just after storing the exception in the gallery_conf['failing_examples'] dict.

  3. Then, back in execute_script(), the part dedicated to writing the md5 checksum file is simply skipped.

... thus preventing the md5 file from ever being written for failing or non-executed scripts...
... thus bypassing the if md5sum_is_current check in generate_file_rst.
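
To make that flow concrete, here is a heavily simplified, runnable paraphrase of it (the function and variable names below are mine, not sphinx-gallery's actual code):

# Simplified paraphrase of the flow described above; all names are illustrative.
def handle_exception(exc, src_file, script_vars, gallery_conf):
    # 1. + 2. Store the exception for the summary, then flip execute_script
    # back to False even though the script did run.
    gallery_conf['failing_examples'][src_file] = exc
    script_vars['execute_script'] = False

def maybe_write_md5(src_file, script_vars, md5_store):
    # 3. The checksum write is guarded by execute_script, so it is skipped for
    # failing scripts, and the md5sum_is_current check can never short-circuit
    # them on later builds.
    if script_vars['execute_script']:
        md5_store[src_file] = "<md5 of the source>"

gallery_conf = {'failing_examples': {}}
script_vars = {'execute_script': True}
md5_store = {}

handle_exception(RuntimeError("expected failure"), "plot_demo.py", script_vars, gallery_conf)
maybe_write_md5("plot_demo.py", script_vars, md5_store)
assert md5_store == {}  # no checksum stored: the example will always be re-run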

> I would be curious to see how this is done in sphinx-gallery. Maybe there is an opportunity to suggest a PR to them?

Indeed. We might at least ask whether this approach was chosen because they did not find a better way to handle expected-to-fail scripts. Since they set up the whole md5-check process to avoid re-executing scripts that did not change, it seems weird not to rely on it for all unmodified files.

In the end, this means that my previous comment

> It seems they handle this by using the gallery_conf['stale_examples'] list instead of the gallery_conf['failing_examples'] dict.

is not accurate: the expected-to-fail scripts are not handled the way we thought in sphinx-gallery.

@smarie
Owner

smarie commented Mar 27, 2022

Thanks a lot for this analysis @mchaaler! Well, if this is correct, sphinx-gallery's current behaviour definitely seems sub-optimal to me.

> still pending deeper understanding...

Let me know how you want to proceed. We can define "the ideal fix" in this PR (probably relying on 'stale_examples' then?) so as to move on and release, and share it with them afterwards; or we can open a discussion with them and find a common solution before implementing. On my side I see no hurry, so I am fine with either way.

@mchaaler
Contributor Author

I just pushed another commit (c61bec2) proposing an alternative to the previous fix.

With 4d7baa9 we had:

mkdocs-gallery successfully executed 0 out of 2 files subselected by:

    gallery_conf["filename_pattern"] = '\\\\plot'
    gallery_conf["ignore_pattern"]   = '__init__\\.py'

after excluding 15 files that had previously been run (based on MD5).

And with c61bec2 we have:

mkdocs-gallery successfully executed 0 out of 0 files subselected by:

    gallery_conf["filename_pattern"] = '\\\\plot'
    gallery_conf["ignore_pattern"]   = '__init__\\.py'

after excluding 15 files that had previously been run (based on MD5).

Note the 0 out of 0 files instead of 0 out of 2.
This prevents the expected-to-fail scripts from being counted in the total number of scripts that should have been run in gen_gallery.summarize_failing_examples.
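
As an aside, a small illustration of why the pattern shown in the log only matches Windows-style separators (my own sketch; it assumes the log prints the pattern in repr form, and the file name is made up):

import re

# The pattern from the log above, written as a Python literal (repr form):
# a regex matching a literal backslash (the Windows separator) followed by "plot".
filename_pattern = '\\\\plot'

windows_style = 'examples\\plot_demo.py'  # hypothetical path with a Windows separator
posix_style = 'examples/plot_demo.py'     # the same path after as_posix()

print(bool(re.search(filename_pattern, windows_style)))  # True
print(bool(re.search(filename_pattern, posix_style)))    # False: the separator no longer matches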

@smarie Let me know if you would qualify this approach as the "ideal fix"...

@mchaaler
Contributor Author

mchaaler commented Mar 28, 2022

Regarding sphinx-gallery, I implemented something similar to handle expected-to-fail examples optimally, and I might dare to submit a PR at some point. I'm still struggling with the tests, though.
By the way, I'm quite surprised by the number of assert statements in their test functions. I thought that the adopted practice was "1 test function = 1 assert statement", so that one knows exactly, from the failing test function's name alone, which expected behavior was broken. But I definitely lack experience in terms of test implementation.

@smarie May I add a comment to sphinx-gallery/sphinx-gallery#895, which you created, to start the discussion about the choice they made to handle the expected-to-fail examples?
Or would it be preferable to open a dedicated issue?

@smarie
Owner

smarie commented Mar 30, 2022

> Let me know if you would qualify this approach as the "ideal fix"...

Yes, it looks nice to me! Let's merge this.

> I thought that the adopted practice was "1 test function = 1 assert statement"

No, this is really not the case. It is rather "as many assert statements as needed"; the main point is that the test should focus on a single situation.
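
For instance, a generic illustration (not taken from either project's test suite) of a single focused situation checked with several asserts, reusing the Path-to-string point from earlier in this thread:

import os
from pathlib import Path

def test_path_to_string_conversion():
    # One focused situation: converting a Path to a string, checked from
    # several angles with as many asserts as needed.
    p = Path("examples") / "plot_demo.py"  # hypothetical path
    assert str(p) == os.path.join("examples", "plot_demo.py")
    assert f"{p}" == str(p)
    assert "{}".format(p) == str(p)
    assert "%s" % p == str(p)

test_path_to_string_conversion()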

> would it be preferable to open a dedicated issue?

Indeed, it is preferable.
In the issue you may refer to sphinx-gallery/sphinx-gallery#895 for the general intent of keeping the two projects aligned, and also to this PR (#39) for the discussion.

@smarie
Owner

smarie commented Mar 30, 2022

0.7.5 is now released, with the fix. Thanks a lot @mchaaler for all your work to solve these issues!

@mchaaler
Contributor Author

> 0.7.5 is now released, with the fix. Thanks a lot @mchaaler for all your work to solve these issues!

You're welcome! Proud to provide useful contributions and happy for the constructive feedback you gave.
