Issue 4174: Improvements to field_to_include specifications in create_firefly_object #4175
pre-commit.ci autofix |
In general it's hard to know for sure this is working as expected without tests, could you add some ?
I've added some basic tests (d5be708), but I'm not super familiar with this testing system. Is this what you're looking for? |
Thank you, and nice catch ! I don't have enough time to do a detailed review right now, but here is some general key advice for writing tests:
Sorry, I just got back to working on this.
This was a combination of a bad function name and function creep, my bad.
Done. This also helped with the previous point.
I'm just blind apparently. I moved the addition to the already existing test_firefly.py
Done to simplify the @pytest.mark.parametrize stuff. I think I did this correctly, but I'd appreciate a double-check. I've never used some of these capabilities before. Two general questions: am I trying to do too much with one test function? And is there an equivalent to pytest.warns for the mylog.warning function? |
Thank you for continuing to poke at this !
There are some functional issues with how you refactored your tests, as can be seen in
https://tests.yt-project.org/job/yt_py38_git/6076/testReport/junit/yt.data_objects.tests/test_firefly/firefly_test_dataset/
(though I'm not sure I actually understand what went wrong here TBH; but let's work through it)
I want to add that, unfortunately, we're still slowly migrating from nose to pytest, so some of our infrastructure still uses nose and doesn't understand pytest. This means that any test file using pytest should be explicitly ignored in nose.
This requires updates to tests/tests.yaml (add a --ignore-file entry) and nose_unit.cfg (update the ignore-files regex).
Now to answer your questions
am I trying to do too much with one test function?
Probably, yeah. We have a bunch of very busy test functions already, so I feel like I need to be tolerant about those, but it's best to avoid them if the tests can also be expressed as smaller, more numerous and decoupled functions.
AFAIC, the DRY (Don't Repeat Yourself) principle doesn't apply to tests, and trying to never repeat test code actually hurts readability as well as maintainability IMO.
And is there an equivalent to pytest.warns for the mylog.warning function?
Sort of. There's caplog, but it's pretty hard to use within yt tests because logging is typically disabled in CI:
Line 25 in eda90ff
echo "suppress_stream_logging = true" >> $HOME/.config/yt/yt.toml |
There's a precedent for tests that still use it in yt/tests/test_load_sample.py, but it's not as straightforward as I'd like. Don't feel like you have to go through this, we have very few logging tests, but of course feel free to if you'd like !
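For reference, here is a minimal sketch of what a caplog-based check can look like in plain pytest. It is not yt code: the logger name "demo_firefly" and the pick_fields helper are hypothetical stand-ins for mylog and the function under test, and the sketch assumes logging is enabled and records propagate to the root logger, which is exactly what may not hold in yt's CI.

```python
import logging

# Hypothetical logger, standing in for yt's mylog.
log = logging.getLogger("demo_firefly")


def pick_fields(fields):
    """Hypothetical helper: warn when no fields are requested."""
    if not fields:
        log.warning("No fields specified")
    return list(fields or [])


def test_warns_on_empty_fields(caplog):
    # caplog is pytest's built-in fixture for capturing log records.
    with caplog.at_level(logging.WARNING, logger="demo_firefly"):
        pick_fields(None)
    assert "No fields specified" in caplog.text
```

The test function only runs under pytest, since caplog is injected by the test runner. Unlike pytest.warns, nothing here fails by itself if the message is never logged; the assertion on caplog.text does that job.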
I think I know what's wrong actually, I was passing the dataset and data region out of the fixture as a tuple, and then unpacking the tuple in the test function:
because I only need
Updated.
Hmm. I think I can split it into testing field names and field tuples, so I'll try that. That should get rid of some of the conditional logic and make it more obvious what's being tested.
I was mainly thinking of using it to test the |
I split up the test function into 3: field strings, field tuples, and mixed strings and tuples. I also noticed that the old firefly code tests ( |
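As an illustration of that kind of split (not the actual tests from this PR), here is a sketch with one small function per input shape. The normalize_field helper and the field names are hypothetical and only there to give the tests something to check.

```python
def normalize_field(spec, default_ptype="all"):
    """Hypothetical helper: turn a field spec into a (ptype, name) tuple."""
    if isinstance(spec, tuple):
        return spec
    return (default_ptype, spec)


def test_field_string_specification():
    # Bare strings only.
    assert normalize_field("Masses") == ("all", "Masses")


def test_field_tuple_specification():
    # (field_type, field_name) tuples only.
    assert normalize_field(("pt2", "field1")) == ("pt2", "field1")


def test_field_mixed_specification():
    # Strings and tuples in the same request.
    specs = ["Masses", ("pt2", "field1")]
    expected = [("all", "Masses"), ("pt2", "field1")]
    assert [normalize_field(spec) for spec in specs] == expected
```

Each function repeats a little setup, but a failure points directly at the input shape that broke, with no conditional logic inside the test body.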
... that's in the context of testing. I don't know what most users do with yt's logger, but by default it's always on and set to |
Thank you again for your effort. I'll do a comprehensive review in the next few days !
Sorry, I just meant in the testing context. I'll definitely leave those messages in! |
Here are important suggestions. I think if we agree on this I might go for one last, smaller review after this is settled
I'm having trouble with the test cases. I can't seem to get e.g. |
Can you produce a minimal example for this ? I do not understand why it would fail only within pytest, so I'm tempted to think you're actually running slightly different code. |
You're absolutely right. I was using different datasets outside of pytest (e.g. the gizmo_mhd_mwdisk) but hadn't actually tried creating the

import yt
from yt.utilities.exceptions import YTFieldNotFound
from yt.testing import fake_particle_ds

# create dataset
def test_fake_dataset():
    ds_fields = [
        ("pt1", "particle_position_x"),
        ("pt1", "particle_position_y"),
        ("pt1", "particle_position_z"),
        ("pt2", "particle_position_x"),
        ("pt2", "particle_position_y"),
        ("pt2", "particle_position_z"),
        ("pt2", "field1"),
    ]
    ds_field_units = ["code_length"] * 7
    ds_negative = [0] * 7
    ds = fake_particle_ds(
        fields=ds_fields,
        units=ds_field_units,
        negative=ds_negative,
    )
    return ds

ds = test_fake_dataset()
ad = ds.all_data()
try:
    print(ad["field1"])
except YTFieldNotFound:
    print("Expected error, looking for (all, field1) which DNE")
print(ad["pt2", "field1"])  # Setting gas1 as most recently requested -> works elsewhere
try:
    print(ad["field1"])
except YTFieldNotFound:
    print("ERROR: This should have produced the same result as (pt2, field1)")

Is there something I'm supposed to call when I'm creating the |
I see that your expectations don't match the observed behaviour, but that behaviour is intended: calling |
I must be misunderstanding something then, isn't this the error case for ambiguous fields? I wouldn't have thought that would apply. |
Yes, but ambiguity is not affected by access history ! Btw, some of the test failures that you can see in your most recent run are completely unrelated to your PR. I'm assuming we can expect them to be resolved upstream in a matter of hours to days. See #4224 for context. |
I think there's still a disconnect between what we're talking about, so I'll try to ask in a different way.
My question is then how should I construct a test dataset to test assumption 2? Of course, either one of those assumptions could be wrong, in which case I think I can figure out the path forward. I just can't seem to create a test dataset that works assuming they're both correct. I've updated the test file; it should be more obvious what is being tested and how. |
Sorry for the delay, had a busy week... It also seems to me that yt's code is a little bit contradictory, which of course doesn't help making things clear. yt/yt/data_objects/static_output.py Lines 728 to 733 in 45dc49b
I am not sure what would be the best approach to fix this, but it seems that it could easily go way out of scope for the present PR. Instead, I propose (contrary to what I previously said ! very sorry about that) that we keep the existing logic for |
To make very sure I'm following correctly 😅 .
def create_firefly_object(
    ...,
    *,
    match_any_particle_types=None,
):
    if match_any_particle_types is None:
        # not specified, switching to (temporary) default
        issue_deprecation_warning(
            "match_any_particle_types wasn't specified. Its current default value (True) will be changed to False in the future. "
            "Pass an actual value (True or False) to silence this warning. ",
            ...
        )
        match_any_particle_types = True

WDYT ? |
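A runnable version of the same pattern, for anyone who wants to try it outside yt, might look like the sketch below. It uses the standard library's warnings.warn as a stand-in for yt's issue_deprecation_warning, and create_object is a hypothetical placeholder for create_firefly_object, so only the None-sentinel logic should be read as matching the proposal.

```python
import warnings


def create_object(fields=None, *, match_any_particle_types=None):
    """Hypothetical stand-in for create_firefly_object."""
    if match_any_particle_types is None:
        # Not specified: warn, then fall back to the (temporary) default.
        warnings.warn(
            "match_any_particle_types wasn't specified. Its current default "
            "value (True) will be changed to False in the future. "
            "Pass an actual value (True or False) to silence this warning.",
            DeprecationWarning,
            stacklevel=2,
        )
        match_any_particle_types = True
    return match_any_particle_types
```

Using None as the default is what lets the function tell "the caller did not choose" apart from an explicit True, so callers who pass either value never see the warning.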
This reverts commit 3cfee26.
Co-authored-by: Clément Robert <cr52@protonmail.com>
Co-authored-by: Clément Robert <cr52@protonmail.com>
Moved testing None/empty inputs to separate test. Simplified testing single field string input (not currently working). Moved ambiguous single field to test_field_invalid_specification (and changed name to better match other tests). Switched from testing Masses (common) to Temperature (unique) for single string and vice versa for tuple in test_field_mixed_specification
Switched to generic field type and field names. Also made all field units identical to simplify tests and fixture construction.
data_containers.py/create_firefly_object - added new match_any_particle_types flag with documentation and deprecation warning. If flag=True, will now attempt to disambiguate single string fields with more than one candidate field tuple by simply including all of the candidates. If flag=False, will not allow ambiguous single string fields. test_firefly.py - added explicit specification of match_any_particle_types to all create_firefly_object object calls to avoid deprecation warning resulting in a test failure. Changed test_field_string_specification to test_field_unique_string_specification and test_field_common_string_specification and changed behavior of each to test that previous match-any behavior is retained. Added new invalid specifications: unique single string field->maps to ("all","pt2only_field") and fails; common, ambiguous single string field; and mixed string/tuple. All are invalid if match_any_particle_types=False
Co-authored-by: Clément Robert <cr52@protonmail.com>
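The disambiguation rule described in the commit message above can be illustrated with a toy resolver. This is a sketch of the stated rule only, not the code in data_containers.py: resolve_fields and the field names are hypothetical, and it accepts a bare string with exactly one candidate even when the flag is False, which the PR's own tests appear to treat as invalid.

```python
def resolve_fields(requested, available, match_any_particle_types):
    """Toy resolver: map requested field specs to (ptype, name) tuples."""
    resolved = []
    for spec in requested:
        if isinstance(spec, tuple):
            # Explicit (field_type, field_name): must exist as given.
            if spec not in available:
                raise KeyError(f"unknown field {spec!r}")
            resolved.append(spec)
            continue
        # Bare string: collect every particle type that carries this name.
        candidates = [field for field in available if field[1] == spec]
        if not candidates:
            raise KeyError(f"unknown field {spec!r}")
        if len(candidates) > 1 and not match_any_particle_types:
            raise ValueError(f"ambiguous field {spec!r}: {candidates}")
        resolved.extend(candidates)
    return resolved


# Hypothetical dataset fields, loosely modeled on the test names in this PR.
available = [
    ("pt1", "common_field"),
    ("pt2", "common_field"),
    ("pt2", "pt2only_field"),
]
```

With the flag set to True, resolve_fields(["common_field"], available, True) includes both candidates; with False, the same call raises ValueError.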
Now this has conflicts... I would solve them myself, but you seem to be comfortable with rebasing, so ... can you do it (again) please ? Sorry this is turning out so demanding, I will try to find another reviewer so hopefully we can put it to rest soon. |
I'm happy to serve as a reviewer! |
I went ahead and solved the conflicts |
This looks good to me, only some minor suggestions and questions.
Co-authored-by: Chris Havlin <chris.havlin@gmail.com>
…ightly Just rearranging to fit in 80 char line limit
Co-authored-by: Chris Havlin <chris.havlin@gmail.com>
@chrishavlin, feel free to merge if you feel it's ready. I'll be off for today |
@matthewturk did you still wanna take a look? |
PR Summary
Added code to allow fields that aren't in all particle types and/or are specified in (field_type, field_name) form.
Previously, fields_to_include could only consist of fields that were universal to all raw particle types. This was not specified in the documentation and didn't appear to be "natural" behavior.
Lines 892-901 in yt/data_objects/data_containers.py incidentally fix a bug when no fields are specified.
This is intended to fix #4174.
PR Checklist