Skip to content

Extend preprocessor anomalies#2871

Merged
valeriupredoi merged 19 commits intomainfrom
extend_anomalies_preprocessor
Nov 7, 2025
Merged

Extend preprocessor anomalies#2871
valeriupredoi merged 19 commits intomainfrom
extend_anomalies_preprocessor

Conversation

@axel-lauer
Copy link
Contributor

@axel-lauer axel-lauer commented Oct 31, 2025

Description

This PR extends the existing preprocessor anomalies to allow for optionally calculating relative anomalies (in percent).

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.

@axel-lauer axel-lauer added enhancement New feature or request preprocessor Related to the preprocessor labels Oct 31, 2025
@codecov
Copy link

codecov bot commented Oct 31, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.49%. Comparing base (9998183) to head (1074f8e).
⚠️ Report is 65 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2871      +/-   ##
==========================================
+ Coverage   95.43%   95.49%   +0.05%     
==========================================
  Files         261      261              
  Lines       15543    15557      +14     
==========================================
+ Hits        14834    14856      +22     
+ Misses        709      701       -8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@axel-lauer axel-lauer marked this pull request as draft October 31, 2025 10:06
@axel-lauer
Copy link
Contributor Author

axel-lauer commented Nov 3, 2025

I am afraid I need some assistance from some more tech-savvy people to get the tests fixed for this seemingly simple extension of the preprocessor anomalies. If possible, could anyone from @ESMValGroup/technical-lead-development-team help? Please?

@valeriupredoi
Copy link
Contributor

I'll have a look at this later on today @axel-lauer 🍺

@valeriupredoi
Copy link
Contributor

@axel-lauer ruff is failing with rule COM812 - there's gotta be a missing trailing comma somewhere in the new code; alas, you can try committing once, see it fail, then either fix the issue, or commit again to let the formatter fix that for you (not encouraged, but practical 😁 )

@axel-lauer
Copy link
Contributor Author

Thanks @valeriupredoi ! I think I solved the ruff issue (I was looking at the wrong file). It looks like the only problem left is the codecov issue telling me that not all changes are covered. I have no idea how to solve this as I used the unit tests for "standardized anomalies" as a template for the newly implemented "relative anomalies". Would you have an idea what to do?

@bouweandela
Copy link
Member

It looks like the only problem left is the codecov issue telling me that not all changes are covered. I have no idea how to solve this as I used the unit tests for "standardized anomalies" as a template for the newly implemented "relative anomalies". Would you have an idea what to do?

There are two different code paths in anomalies preprocessor function, either period="full" for which you've added a test case, or period is something else, for which there are currently no tests. So you need to add a test for the latter.

@axel-lauer
Copy link
Contributor Author

axel-lauer commented Nov 5, 2025

Thanks @valeriupredoi and @bouweandela for your help. Following your suggestions, I extended the unit test for the newly implemented relative anomalies. Codecov forces to also provide tests for code that was previously not covered by the tests as I moved old code into a new function _apply_scaling to satisfy the "maximum complexity" rule. I therefore also rewrote the unit test for the standardized anomalies.

Unfortunately, Codecov is still complaining about an uncovered error message that is triggered in case of inconsistencies in the input data in _apply_scaling. Notably, this is also the case for similar error messages in extract_season and seasonal_statistics, except that these uncovered lines of code are ignored by Codecov.

I am now out of ideas and I would highly appreciate help from our experts.

@valeriupredoi
Copy link
Contributor

@axel-lauer codecov complains about two things:

  • the removed code that was covered by tests - that should be fine
  • actual new code that needs to be covered by tests:
    if reps % 1 != 0: 
        msg = ( 
            "Cannot safely apply preprocessor to this dataset, " 
            "since the full time period of this dataset is not " 
            f"a multiple of the period '{period}'" 
        ) 
        raise ValueError( 
            msg, 
        ) 

if you are unsure of what codecov spits out, you can always run the tests locally with eg pytest -n 2 --cov (remember the cov flag, so you can record the coverage in test-reports/coverage_html) 🍺

@axel-lauer
Copy link
Contributor Author

Thanks @valeriupredoi! I tried to cover this check with providing an incomplete reference dataset:

@pytest.mark.parametrize("period", ["month"])
def test_relative_anomalies_valerr(period):
    """Test ValueError in relative ``anomalies``."""
    cube = make_map_data(number_years=2)
    reference = {
        "start_year": 1950,
        "start_month": 1,
        "start_day": 1,
        "end_year": 1950,
        "end_month": 11,
        "end_day": 30,
    }

    with assert_raises(ValueError):
        anomalies(cube, period, reference, relative=True)

Unfortunately, no success so far. The preprocessor code in _time.py crashes already earlier, in function _compute_anomalies that I have not touched and that is called before my new function _apply_scaling. I am out of ideas and I am tempted to simply remove the code Codecov complains about that you listed above. I start doubting that it is possible to ever raise this ValueError as the code will crash before anyways.

@bouweandela
Copy link
Member

What if you pass in incomplete input data, like so?

@pytest.mark.parametrize("period", ["month"])
def test_relative_anomalies_valerr(period):
    """Test ValueError in relative ``anomalies``."""
    cube = make_map_data(number_years=2)[:15]
    reference = {
        "start_year": 1950,
        "start_month": 1,
        "start_day": 1,
        "end_year": 1950,
        "end_month": 12,
        "end_day": 31,
    }

    with assert_raises(ValueError):
        anomalies(cube, period, reference, relative=True)

?

@axel-lauer
Copy link
Contributor Author

Thank you @bouweandela! Your idea seems to work. Yay!

@axel-lauer axel-lauer marked this pull request as ready for review November 6, 2025 09:45
@valeriupredoi valeriupredoi reopened this Nov 7, 2025
@valeriupredoi valeriupredoi merged commit 2561fc0 into main Nov 7, 2025
4 checks passed
@valeriupredoi valeriupredoi deleted the extend_anomalies_preprocessor branch November 7, 2025 16:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request preprocessor Related to the preprocessor

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants