Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] Check test files for if __name__... snippet #25322

Merged
merged 20 commits into from Jun 2, 2022

Conversation

Yard1
Copy link
Member

@Yard1 Yard1 commented May 31, 2022

Why are these changes needed?

Bazel operates by simply running the python scripts given to it in py_test. If the script doesn't invoke pytest on itself in the if _name__ == "__main__" snippet, no tests will be ran, and the script will pass. This has led to several tests (indeed, some are fixed in this PR) that, despite having been written, have never ran in CI. This PR adds a lint check to check all py_test sources for the presence of if _name__ == "__main__" snippet, and will fail CI if there are any detected without it. This system is only enabled for libraries right now (tune, train, air, rllib), but it could be trivially extended to other modules if approved.

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@Yard1
Copy link
Member Author

Yard1 commented May 31, 2022

@ericl let me know if something like that would be desired for core, data, serve, etc. tests. Can make followup PRs.

@simon-mo
Copy link
Contributor

Definitely desired for Serve!

Copy link
Contributor

@amogkam amogkam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks @Yard1!

@@ -335,3 +335,12 @@ py_library(
name = "ml_lib",
srcs = glob(["**/*.py"], exclude=["tests/*.py"]),
)

# Check if all tests have the pytest if __name__... snippet.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if there is an easy way to do this for all BUILD files. Otherwise, it's very easy to forget to add this for new BUILD files that are created.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am no bazel expert, but I couldn't really find anything that would allow that. All solutions either involved modifying the BUILD files a little (this PR) or completely

@simon-mo
Copy link
Contributor

Another approach can be:

  • Use bazel query to list all py_test targets
  • And check those *.py file all have this line.
  • Add this as a script in LINT step.

@Yard1
Copy link
Member Author

Yard1 commented May 31, 2022

@simon-mo I am a bit worried that may catch things we don't want (eg. the tests in docs don't need this snippet as they are all running in main). This seems to be more configurable, at the cost of having to modify BUILD files. Definetely a good idea, though! Happy to disscuss the approach here.

Copy link
Contributor

@krfricke krfricke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, the problem definitely has bit me in the past.

I would suggest to follow @simon-mo's approach though and add this as a linter step. The reason is that it's easy to miss tests that are either not captured by the glob or to forgert to include this in new BUILD files (especially as we may be moving to a more package-like BUILD structure in the future). Also, this test is arguably a linter check, so it would be great to include it there.

My proposal here would be:

  1. We add a tag no_main for all tests that don't expect the __name__ part
  2. We run a bazel query (something like bazel query 'kind(py_test.*, tests(python/...) except attr(tags, "\bno_main\b", python/...))') we can parse this to json with yq via --output xml | xq .
  3. Extract the files to check and run your test against it.

I don't think we lose any flexibility with the tag but we would make sure that everything declared as a py_test will be caught.

What do you think?

@Yard1
Copy link
Member Author

Yard1 commented Jun 1, 2022

Sure, that sounds good. Let me implement that

@Yard1 Yard1 requested a review from krfricke June 1, 2022 15:20
Copy link
Contributor

@krfricke krfricke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Minor question

@@ -624,7 +625,8 @@ def train_with_backend(
_, eval_preds, _ = model.evaluate(dataset=dataset)
assert backend.df_engine.compute(eval_preds) is not None

return model.model.get_weights()
ret = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we continue to use get_weights() here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API has changed and that method is no longer present. As the return value wasn't being used, I just switched to this

@krfricke krfricke merged commit 045c47f into ray-project:master Jun 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants