
TST: yt/visualization/volume_rendering/tests/test_scene.py::test_annotations is failing in CI #4785

Closed
neutrinoceros opened this issue Jan 23, 2024 · 28 comments · Fixed by #4792
Labels: bug, tests: running tests

Comments

@neutrinoceros
Member

Bug report

The following test is currently failing in CI; however, I couldn't reproduce the error locally: yt/visualization/volume_rendering/tests/test_scene.py::test_annotations

see https://github.com/yt-project/yt/actions/runs/7626456177/job/20772882827

@neutrinoceros
Member Author

Notes:

  • the test still passes in minimal env tests
  • the fact that we haven't been able to reproduce it locally (by running just this test) may be indicative of test pollution, which could be checked by running only this test in CI too.

@chrishavlin
Contributor

the fact that we haven't been able to reproduce it locally (by running just this test) may be indicative of test pollution, which could be checked by running only this test in CI too.

I just confirmed this over in #4786:

  • running just pytest yt/visualization/volume_rendering/tests/test_scene.py::test_annotations passed in CI
  • expanding to run all the visualization tests, pytest yt/visualization/, passed in CI

@chrishavlin
Contributor

chrishavlin commented Jan 24, 2024

I was trying to re-run the failed test with pytest --lf as well to see if it'd fail on the second go, but I couldn't get it working in the GitHub Action...

might also be worth doing a run that randomizes test order?

@neutrinoceros
Member Author

might also be worth doing a run that randomizes test order?

Ideally yes, though I suspect that would be very unstable if we did it now because pollution in our test suite is possibly high, and it's likely that some tests are only passing because of other tests' side effects (I've put a lid on this for a while ...).

@chrishavlin
Contributor

OK, well I've narrowed down the subset of tests you need to run to get the failure: just run the tests in yt/data_objects/tests together with the failing test:

pytest --color=yes yt/data_objects/tests yt/visualization/volume_rendering/tests/test_scene.py::test_annotations

No indication of why yet. Also, when re-running the failed test by itself, it does pass on the second run.

@chrishavlin
Contributor

Very close to an answer! I used the https://github.com/asottile/detect-test-pollution pytest plugin to find that yt/data_objects/tests/test_firefly.py::test_firefly_JSON_string is the source of the issue. Running just these two tests, you can reproduce the failure:

        pytest --color=yes  \
        yt/data_objects/tests/test_firefly.py::test_firefly_JSON_string \
        yt/visualization/volume_rendering/tests/test_scene.py

When the order is reversed, the tests pass with no problem. Looking at test_firefly_JSON_string, there's nothing obvious... but those firefly reader objects seem to do a lot with writing to disk and symlinks and default data directories, so maybe there's something there?

@neutrinoceros
Member Author

Great job! I may be able to give those tests a second read as early as tomorrow. Meanwhile: friendly ping @agurvich

@chrishavlin
Contributor

chrishavlin commented Jan 25, 2024

OK. I think I'm done for the day, but one extra note: I still get the failure when I change test_firefly_JSON_string to not actually call writeToDisk, so it's something in the initial creation of the firefly object in ad.create_firefly_object (could be a yt thing, a firefly thing, a library shared between the two, or something else very strange? Really, still no good idea...)

@agurvich
Contributor

but those firefly reader objects seem to do a lot with writing to disk and symlinks and default data directories so maybe there's something there?

don't talk about my trash like that! hahaha. I don't have a working yt development environment set up, but it sounds like I might need to sniff something out?

The situation with the symlinks is that the firefly data needs to be written out to disk in order to be read in by the browser (unless it's passed through a websocket using flask). However, because of browser security features, websites (or webapps) can't read files just anywhere on disk; they have to be in subdirectories of the app itself. So when you're hosting this app locally on your computer, the data has to be visible from a path that's a subdirectory of the app files (which, in the case of being installed via pip, is hidden in your site-packages directory). My solution at the time was to let users specify where they wanted to write their data to and then symlink from there to Firefly's data directory. So what yt should be doing here is creating a Firefly object, passing some array data to it, using the Firefly object's built-in writeToDisk to write the data to a named temporary file, and then making a symlink from there to Firefly's data directory (a rough sketch of this flow is at the end of this comment). I don't totally remember how I set up the unit test and I'm not sure what is actually failing here (is there a failing test or is there an incomprehensible error?). Where can I see the CI output (without building and running locally myself)?

Or will I have to build yt?
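
Roughly, the flow looks like this (a sketch only; both directories below are hypothetical stand-ins, not Firefly's actual layout):

# a rough sketch of the write-then-symlink flow described above; both
# directories here are hypothetical stand-ins, not Firefly's actual layout
import os
import tempfile

user_datadir = tempfile.mkdtemp()  # where the user asked to write the data
# ... reader.writeToDisk() would drop its JSON files into user_datadir ...

# the browser can only read files served from under the app's own tree,
# so link the user's directory into the app's data directory
app_datadir = tempfile.mkdtemp()   # stand-in for Firefly's served data dir
link = os.path.join(app_datadir, "my_dataset")
os.symlink(user_datadir, link)
print(os.path.realpath(link) == os.path.realpath(user_datadir))  # True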

@chrishavlin
Contributor

don't talk about my trash like that! hahaha.

haha, many apologies :D I don't actually think that the symlinks are the culprit, I was just grasping at straws because I'm rather baffled by this failure.

Here's a quick summary of the issue:

  • a test unrelated to firefly started failing recently (yt/visualization/volume_rendering/tests/test_scene.py::test_annotations). Neither @neutrinoceros nor I have been able to reproduce this issue locally; it only fails in our CI pipeline on GitHub with Python 3.12.
  • we figured out it's a test pollution issue: re-running the failed test by itself passes just fine
  • I bisected down to the point where I can re-create the failure with just two tests: a yt test that creates a firefly object (yt/data_objects/tests/test_firefly.py::test_firefly_JSON_string) and our test_annotations test. It only fails when the firefly-related test is run before the test_annotations test; reversing the order passes. So some global state somewhere is somehow being changed? Maybe??

here's an example of the failing test that reads back in an image that was written to disk.

@agurvich
Contributor

E       DeprecationWarning: 
E       Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
E       (to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
E       but was not found to be installed on your system.
E       If this would cause problems for you,
E       please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466

../../../miniconda3/envs/oss/lib/python3.12/site-packages/pandas/__init__.py:221: DeprecationWarning
==================================== short test summary info ====================================
FAILED yt/data_objects/tests/test_firefly.py::test_firefly_JSON_string - DeprecationWarning: 

So here is what I get when I build locally, which seems consistent with ageller/Firefly#177.

The solution there was to remove the dependency on pandas. That was the objectively correct thing to do: using pandas to write a JSON string was silly, but it happened because we were already using pandas elsewhere and knew it would work. It looks like what's going on is that pandas is being imported in a different file, in this case firefly/data_reader/reader.py (where the "Firefly Object", aka the Reader class instance, is defined). There, we use it to read a CSV. We could swap to numpy and use genfromtxt or something, but I like how pandas is able to load tabular data in a pretty generic/bulletproof way (at least it doesn't die; it just returns nonsense, which depending on your philosophy is better or worse).
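
For illustration (a hedged sketch, not Firefly's actual reader code), here is what that swap would look like; both calls parse the same small CSV, and the numpy route never imports pandas at all:

import io

import numpy as np
import pandas as pd

csv_text = "x,y,z\n1.0,2.0,3.0\n4.0,5.0,6.0\n"

# pandas: very tolerant of messy tabular input, returns a DataFrame
df = pd.read_csv(io.StringIO(csv_text))

# numpy: returns a structured array with fields taken from the header row
arr = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)

print(df["x"].to_numpy(), arr["x"])  # both give [1. 4.]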

My hypothesis is that, depending on the order in which the tests are run, Pyarrow is being loaded into the environment after this test runs, and perhaps it's not happening locally for either of you because you have Pyarrow installed(?). I'll be honest, I don't totally understand the point of this deprecation warning: the current version of pandas doesn't require Pyarrow, and if I ever want to upgrade to the next version, whatever software distribution system I'm using (conda or pip) will install Pyarrow for me.

@agurvich
Contributor

@chrishavlin I was so close to finishing my message before you posted, haha! What do you think of my hypothesis? What is the best way to proceed?

@agurvich
Contributor

agurvich commented Jan 25, 2024

Ah... looking at the error you posted above, I would be very surprised if the Pyarrow thing were the actual culprit:

E       assert (614, 614, 4) == (512, 512, 4)
E         At index 0 diff: 614 != 512
E         Full diff:
E         - (512, 512, 4)
E         + (614, 614, 4)

/home/runner/work/yt/yt/yt/visualization/volume_rendering/tests/test_scene.py:104: AssertionError
=========================== short test summary info ============================
FAILED yt/visualization/volume_rendering/tests/test_scene.py::test_annotations - assert (614, 614, 4) == (512, 512, 4)
  At index 0 diff: 614 != 512
  Full diff:
  - (512, 512, 4)
  + (614, 614, 4)
========================= 1 failed, 2 passed in 6.63s ==========================
Error: Process completed with exit code 1.

How could something like this fail because of whether or not something is imported into the environment...? Maybe the Pyarrow thing is just a red herring and a totally separate failure in my local environment.

@chrishavlin
Contributor

and it doesn't look like Pyarrow is being installed in the test run either (but I liked the idea :) )

@agurvich
Contributor

agurvich commented Jan 25, 2024

Not sure if it's progress, but I'm getting the same error locally now that I've installed Pyarrow....

[screenshot of the local test failure, 2024-01-25]

However, running a second time does not make the test pass.

@chrishavlin
Contributor

ooOOoo, I'd say it is progress! Maybe pyarrow is being installed and I didn't search the logs properly.

@chrishavlin
Contributor

and I just reproduced it as well after installing PyArrow!

@agurvich
Contributor

Third time was not the charm; I'm still getting the above assertion error in test_annotations when I run

        pytest --color=yes \
        yt/data_objects/tests/test_firefly.py::test_firefly_JSON_string \
        yt/visualization/volume_rendering/tests/test_scene.py

seemingly no matter how many times I do it (which seems unlike the situation in the CI environment?).

@agurvich
Contributor

Hey!!! I'll call that progress!!!!

@chrishavlin
Contributor

Well, it passes if you change the order so that test_scene runs first:

pytest --color=yes  \
        yt/visualization/volume_rendering/tests/test_scene.py \
        yt/data_objects/tests/test_firefly.py::test_firefly_JSON_string 

@chrishavlin
Contributor

I gotta run now, thank you for your help so far!!

@agurvich
Contributor

Ahhh I see, yes, can confirm, that passed for me.

@agurvich
Contributor

NP. At this point I'm gonna leave it to you and @neutrinoceros now that you can get your hands dirty locally, but let me know if you want me in the trenches as well.

@neutrinoceros
Member Author

neutrinoceros commented Jan 26, 2024

Thank you both for working on this. There seems to be some confusion around pyarrow, so let me try to clarify the situation:

  • In yt's CI, we don't install pyarrow yet, but we promote warnings to errors
  • pandas 2.2.0 started emitting a DeprecationWarning to warn that pyarrow is to become a required dependency of pandas 3.0
  • there are three ways to get around this warning:
    1. tell pytest to ignore it (this is what I did in #4784, "TST: ignore deprecation warning from pandas (Pyarrow will become a required dependency)"; a sketch of the idea is at the end of this comment)
    2. install pyarrow (pandas assumes that you're fine with having to install pyarrow in the future if you already have it)
    3. downgrade pandas
  • this actually has nothing to do with ageller/Firefly#177 ("BUG: fix incompatibility with pandas>=2.2"): what happened there was that Firefly was using a private pandas API that was moved in pandas 2.2. I fixed it by using the public API instead. It so happens that said public API lives in the standard lib, but it wouldn't have been a problem to keep using pandas if it provided such a public API.

So, I don't think pyarrow is necessary to reproduce locally; it just helps dodge pandas' deprecation warning, and I think downgrading pandas would have the same effect. To help reduce confusion, I'd suggest merging #4784 😇
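
For illustration, here is roughly what option 1 amounts to, written as a plain Python warning filter (a sketch only; the actual change in #4784 goes through pytest's filterwarnings configuration rather than a call like this):

# a sketch only: ignore the specific pandas 2.2 notice so that a CI setup that
# promotes warnings to errors doesn't turn it into a failure; the leading \s*
# accounts for the newline pandas puts at the start of the message
import warnings

warnings.filterwarnings(
    "ignore",
    message=r"\s*Pyarrow will become a required dependency",
    category=DeprecationWarning,
)

import pandas as pd  # with pandas 2.2.x and no pyarrow, this would otherwise warn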

@chrishavlin
Contributor

OK, we have an answer! And it's not pyarrow related after all.

So it turns out that when you import firefly, it modifies the default matplotlib figure.dpi value:

import matplotlib as mpl
rcparams0 = mpl.rcParams.copy()  # matplotlib defaults before importing firefly
import firefly.data_reader
rcparams1 = mpl.rcParams.copy()  # defaults after importing firefly

print(rcparams0['figure.dpi'])
print(rcparams1['figure.dpi'])

prints

100
120

In yt, Scene._show_mpl() has a dpi=100 argument that is used to set the dimensions of the figure but is not explicitly passed to the matplotlib Figure on instantiation. So after firefly gets imported, the default dpi changes and the yt figure changes size when saved. The fix on the yt side is to just pass along the dpi argument to Figure as well (which maybe we should have been doing all along). I'm not sure why this was only exposed now...
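
To make that concrete, here is a standalone sketch of the failure mode and the fix (illustrative only, not yt's actual Scene._show_mpl): figsize is computed from dpi=100, so if the Figure itself falls back to rcParams["figure.dpi"] == 120, a 5.12 inch figure gets saved at 614 pixels instead of 512; forwarding dpi explicitly pins the pixel size:

# a standalone sketch of the bug and the fix, not yt's actual code
import numpy as np
from matplotlib.figure import Figure

def show_image(image, dpi=100):
    npix = image.shape[0]
    # without dpi=dpi here, the Figure falls back to rcParams["figure.dpi"];
    # after importing firefly that default is 120, so 5.12 inches -> 614 px
    fig = Figure(figsize=(npix / dpi, npix / dpi), dpi=dpi)
    ax = fig.add_axes([0, 0, 1, 1])
    ax.imshow(image)
    return fig

fig = show_image(np.random.random((512, 512, 3)))
fig.savefig("render.png")  # 512x512 px regardless of the global figure.dpi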

@chrishavlin
Contributor

And @agurvich, thanks for your help yesterday!

@neutrinoceros
Member Author

I think I found where this change happens:
https://github.com/agurvich/abg_python/blob/408f0daffb2f84130b4df096b72b77a5affdd733/src/abg_python/plot_utils.py#L26

abg_python is a dependency of Firefly, and it leaks Matplotlib configuration on import.
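
For reference, the general pattern looks like this (a hedged sketch, not abg_python's actual code): a module-level rcParams assignment runs as a side effect of the import and changes every figure created afterwards, whereas rc_context keeps the setting local:

# the problematic version does something like this at module level, so a bare
# import silently changes every later figure in the process:
#
#     import matplotlib as mpl
#     mpl.rcParams["figure.dpi"] = 120   # leaks to all downstream code
#
# a non-leaky alternative scopes the setting to the plots that need it:
import matplotlib as mpl
import matplotlib.pyplot as plt

def make_plot():
    with mpl.rc_context({"figure.dpi": 120}):  # reverted when the block exits
        fig, ax = plt.subplots()
        ax.plot([0, 1], [0, 1])
    return fig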

The reason we're only finding out now is that the culprit commit is from Nov 2022, but there was no release of abg_python between April 2022 and 3 days ago, when I came knocking on Alex's door (agurvich/abg_python#4)

so, at the risk of being called names: ping @agurvich 🙈

@neutrinoceros
Member Author

Moving the discussion upstream, as there's nothing left to be done on our side: agurvich/abg_python#5

@neutrinoceros added this to the 4.3.1 milestone on Jan 27, 2024