
[Bug]: tests/integration/test_diags.py is failing on local main branch #758

Closed
tomvothecoder opened this issue Dec 1, 2023 · 7 comments
Labels: bug (Bug fix, will increment patch version)

tomvothecoder (Collaborator) commented Dec 1, 2023

What happened?

6/18 tests in the image diff checker (test_diags.py) fail on a local copy of the main branch. This doesn't happen on GitHub Actions, though (example). @forsyth2 mentioned that 4 of those tests are also failing for him in his comment.

I've pasted the sections of the logs and image diffs below.

What did you expect to happen? Are there any possible answers you came across?

Tests should pass. Maybe the plots need to be updated on Chrysalis?

Minimal Complete Verifiable Example (MVCE)

No response

Relevant log output

============================== short test summary info ==============================
FAILED tests/integration/test_diags.py::TestAllSets::test_lat_lon_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_lat_lon_regional_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_polar_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_qbo_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_streamflow_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_zonal_mean_2d_plot_diffs - AssertionError: assert 1 == 0
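
For context on what "assert 1 == 0" means here: the image diff check counts how many plots differ from the expected baselines and asserts that the count is zero. Below is a minimal sketch of that kind of check, assuming Pillow and per-set actual/expected directories; it is not the actual test_diags.py code.

```python
# Minimal sketch of an image diff check, assuming Pillow and a directory of
# expected PNGs next to freshly generated ones. Not the actual test_diags.py code.
from pathlib import Path

from PIL import Image, ImageChops


def count_image_mismatches(actual_dir: Path, expected_dir: Path) -> int:
    """Count expected PNGs whose pixels differ from the freshly generated ones."""
    mismatches = 0
    for expected_png in sorted(expected_dir.glob("*.png")):
        actual = Image.open(actual_dir / expected_png.name).convert("RGB")
        expected = Image.open(expected_png).convert("RGB")
        # getbbox() is None only when every pixel matches exactly,
        # so any rendering noise at all registers as a mismatch.
        if ImageChops.difference(actual, expected).getbbox() is not None:
            mismatches += 1
    return mismatches


# A failing set then surfaces as "AssertionError: assert 1 == 0":
# assert count_image_mismatches(Path("actual"), Path("expected")) == 0
```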

Anything else we need to know?

No response

Environment

Latest main branch of e3sm_diags and dev.yml environment

Image Diffs

Difference in Metrics

First image is main, second image is LCRC, third image is the diff.

Lat Lon Regional

[Images: ERA-Interim-T-850-ANN-CONUS_RRM (actual, expected, diff)]

Lat Lon Global

[Images: ERA-Interim-T-850-ANN-global (actual, expected, diff)]

Polar

[Images: ERA-Interim-T-850-ANN-polar_S (actual, expected, diff)]

Zonal Mean 2D

[Images: ERA-Interim-T-ANN-global (actual, expected, diff)]

Difference in Test Name

qbo

[Images: qbo_diags (actual, expected, diff)]

streamflow

[Images: seasonality_map (actual, expected, diff)]

tomvothecoder added the bug label on Dec 1, 2023
tomvothecoder (Collaborator, Author) commented

Just an FYI @forsyth2 and @chengzhuzhang. I think the images on Chrysalis need to be updated. Any thoughts?

chengzhuzhang (Contributor) commented

Hi @tomvothecoder, yes, I think you are right. I haven't been diligent about updating the expected images on LCRC. As you pointed out, the differences are concerning: there are not only plot layout changes but also differences in the metrics numbers. For the plot pairs, which is the baseline (expected results) and which is from the current main branch (the upper or bottom plots)? Thanks.

tomvothecoder (Collaborator, Author) commented

@chengzhuzhang The first image is main (actual) and the second image is LCRC (expected). We can talk about it at our CDAT migration sprint meeting on Monday.

forsyth2 (Collaborator) commented Dec 1, 2023

Maybe the plots need to be updated on Chrysalis?

Almost certainly the case. That said, the ultimate solution is really #756.

Before the image diff test was introduced, we would just spot-check plots at random to see if they looked alright. That wasn't anywhere near thorough enough, so automating an image diff check seemed like a clever idea.

As we have seen though, the image diff test is extremely sensitive to noise -- thus requiring constant updates that aren't really necessary. Manual checking isn't sensitive enough and the automated check is too sensitive.

It seems like the metrics check will be both faster and exactly the level of sensitivity we want.

(Although as @chengzhuzhang pointed out, it does look like some metrics changed. That could very well be an expected change because of a pull request, one that didn't have the expected files updated after merging.)
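
To illustrate the sensitivity difference (a sketch only; the metrics check proposed in #756 may be implemented differently, and metrics_match is a hypothetical helper): comparing metric values within a tolerance ignores rendering noise that a pixel-exact image diff flags.

```python
# Hypothetical metrics comparison with a relative tolerance; the real check in
# #756 may differ. Noise-level differences pass, real metrics changes fail.
import math


def metrics_match(actual: dict, expected: dict, rel_tol: float = 1e-5) -> bool:
    """Return True when both dicts share keys and all values agree within rel_tol."""
    if actual.keys() != expected.keys():
        return False
    return all(
        math.isclose(actual[name], expected[name], rel_tol=rel_tol)
        for name in expected
    )


assert metrics_match({"RMSE": 2.34567}, {"RMSE": 2.345671})      # noise-level difference passes
assert not metrics_match({"RMSE": 2.34567}, {"RMSE": 2.40000})   # a real metrics change fails
```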

chengzhuzhang (Contributor) commented

@chengzhuzhang The first image is main (actual) and second image is LCRC (expected). We can talk about it at our CDAT migration sprint meeting on Monday.

This is actually good news. The first images look more reasonable. Yes, let's walk through it on Monday. Thanks!

tomvothecoder (Collaborator, Author) commented Dec 4, 2023

  • Look at the commit on Chrysalis used to generate the expected results
$ cat /lcrc/group/e3sm/public_html/e3sm_diags_test_data/integration/expected/README.md

E3SM Diags version: v2_8_0
Date these files were generated: 20230523
Hash of commit (first 7 characters): ab041b4
  • Run on main to see if it fails -- yes, it fails, 6/18 fail (see the re-run sketch after this list)
========== short test summary info ===========
FAILED tests/integration/test_diags.py::TestAllSets::test_lat_lon_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_lat_lon_regional_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_polar_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_qbo_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_streamflow_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_zonal_mean_2d_plot_diffs - AssertionError: assert 1 == 0
  • Run on v2.9.0 to see if it fails -- yes, 4/18 fail
========== short test summary info ===========
FAILED tests/integration/test_diags.py::TestAllSets::test_lat_lon_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_lat_lon_regional_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_polar_plot_diffs - AssertionError: assert 1 == 0
FAILED tests/integration/test_diags.py::TestAllSets::test_zonal_mean_2d_plot_diffs - AssertionError: assert 1 == 0
  • Run on commit v2.8.0 to see if it fails -- can't run due to dependency issues with the old version of cdms2 and NumPy types
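
A minimal way to re-run just the image-diff subset from a local checkout, as in the steps above. The -k expression assumes the diff tests all contain "plot_diffs" in their names, which matches the summary output.

```python
# Re-run only the image diff tests from a local checkout. The -k filter is an
# assumption based on the test names in the summary output above.
import pytest

exit_code = pytest.main(["tests/integration/test_diags.py", "-k", "plot_diffs", "-rA"])
print("all image diff tests passed" if exit_code == 0 else "some image diff tests failed")
```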

tomvothecoder (Collaborator, Author) commented

Based on the diff since the commit that produced the images on LCRC, there were no noticeable code changes that seem to have affected the metrics. We think there might have been an underlying dependency change that affected a few of the plots.

We decided to update the images on LCRC to reflect the latest results produced on main. If any issues crop up, we still have the backed-up images from May.
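
A rough sketch of the back-up-then-replace step described above; the backup and new-results directory names are illustrative assumptions, not the exact commands that were run.

```python
# Illustrative only: preserve the May baseline, then promote the latest main
# results to the expected directory on LCRC. Names below are assumptions.
import shutil
from pathlib import Path

EXPECTED = Path("/lcrc/group/e3sm/public_html/e3sm_diags_test_data/integration/expected")
BACKUP = EXPECTED.with_name("expected_backup_20230523")   # hypothetical backup dir
NEW_RESULTS = Path("all_sets_results")                    # hypothetical output from the latest main run

shutil.copytree(EXPECTED, BACKUP)        # keep the old baseline around
shutil.rmtree(EXPECTED)                  # remove the stale expected images
shutil.copytree(NEW_RESULTS, EXPECTED)   # new main results become the baseline
```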
