Add combinatorial tests to the CI #2991
Conversation
The remaining piece is to add the marker.
A handful of minor comments for now. Will review the meat in more detail in a bit.
@@ -188,6 +188,43 @@ jobs:
name: Unit Test Results (Python ${{ matrix.python-version }} ${{ matrix.test-markers }})
In the `Unit Tests`, `Integration Tests`, and `Regression Tests` jobs above, do we need to extend the `not slow or benchmark` clause to include the combinatorial marker?
The combinatorial tests will run in their own separate runner, and the other jobs can stay intact. I opted for this so as not to add to the current runtime of those jobs and to speed up development.
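For illustration, a separate runner could be sketched roughly like this GitHub Actions job (the job name, step layout, and the `combinatorial` pytest marker are assumptions for the sketch, not the actual workflow file in this PR):

```yaml
# Hypothetical sketch of a dedicated job; the existing jobs keep their
# current `-m "not slow or benchmark"` selection untouched.
combinatorial-tests:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v3
    - name: Run combinatorial tests
      run: pytest -m combinatorial tests/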
@abidwael, my comments are mainly around code structure, variable naming, nits, and questions. Solid work overall.
Higher-level comments:

- The code in `explore_schema` is pretty juicy. It would be great to add unit tests for this code.
- In terms of directory structure, I might suggest putting most of the code in a new directory `ludwig/config_sampling` and putting tests in `tests/ludwig/config_sampling`.
- Tests that actually do config sampling and run training for 1 step can live in `tests/robustness` (or `training_success` as you currently have it, though `robustness` seems more clear to me).
- Move the `explore_<atomic type>()` functions to their own file. `explore_schema.py` is really big already.
- Throughout your code, you use `key` and `item`, which aren't particularly descriptive. In one of my comments, I suggest using `jsonschema_property` instead of `item` and `parameter_name`/`path` instead of `key`, which I would take back if `key` and `item` are formal concepts in jsonschema.
- Prefer to use `f"{var}.{var}"` strings instead of `var + "." + var`.
- nit: Consider defining a dataclass for the `dq` instead of passing around tuple objects.
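The last two suggestions could be sketched like this (all names here are hypothetical illustrations, not the actual Ludwig code):

```python
# Sketch: an f-string for building dotted parameter paths, and a dataclass
# replacing the bare tuples passed through the exploration deque (`dq`).
from dataclasses import dataclass
from typing import Any


@dataclass
class ExplorationItem:
    """One entry of the exploration queue (formerly a bare tuple)."""

    parameter_path: str        # e.g. "trainer.optimizer.type"
    jsonschema_property: Any   # the schema node being explored


parent, child = "trainer", "learning_rate"
# Preferred: f-string instead of parent + "." + child
parameter_path = f"{parent}.{child}"

item = ExplorationItem(parameter_path, {"type": "number"})
```

The dataclass makes each queue entry self-documenting (`item.parameter_path` rather than `item[0]`), at no meaningful runtime cost.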
Force-pushed from `4f1cc0b` to `e518eab`.
Mostly just a few more nits! Thanks.
Thanks!
This PR adds combinatorial Ludwig config testing to the CI: it automatically generates valid configs and trains models with them to verify that training succeeds.
TODO: add a README to explain how it works and how to add more combinatorial tests.
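As a rough illustration of what "automatically generate valid configs" could mean, here is a minimal, self-contained sketch of schema-driven combinatorial sampling. The function name `explore_schema`, the parameter names, and the toy schema are taken from or modeled on the review discussion above; the real implementation in this PR may differ substantially.

```python
# Hypothetical sketch: walk a toy jsonschema-like dict and enumerate every
# combination of enum values, falling back to a default for other fields.
import itertools
from typing import Any, Dict, List


def explore_schema(schema: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one candidate config per combination of enum choices."""
    names: List[str] = []
    choices: List[List[Any]] = []
    for parameter_name, jsonschema_property in schema["properties"].items():
        names.append(parameter_name)
        if "enum" in jsonschema_property:
            choices.append(jsonschema_property["enum"])
        else:
            choices.append([jsonschema_property.get("default")])
    return [dict(zip(names, combo)) for combo in itertools.product(*choices)]


toy_schema = {
    "properties": {
        "type": {"enum": ["binary", "number"]},
        "loss": {"enum": ["mse", "mae"]},
        "epochs": {"default": 1},
    }
}
configs = explore_schema(toy_schema)
# 2 enum values x 2 enum values x 1 default = 4 candidate configs
```

Each generated config would then be fed to a one-step training run, and a failure points at a schema/config combination the framework does not actually support.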