Recipe testing and comparison for release 2.7.0 #2881

Closed
valeriupredoi opened this issue Oct 25, 2022 · 84 comments

@valeriupredoi
Contributor

valeriupredoi commented Oct 25, 2022

Sister and logical evolution of #2852 - I am commencing testing and comparison of recipes and recipes results in order to release 2.7.0 at the end of this week (hopefully). System parameters below, work done on DKRZ/Levante: submit files in /home/b/b382109/submit, output in /scratch/b/b382109/esmvaltool_output

System and settings

conda/mamba

(base) mamba --version
mamba 0.27.0
conda 22.9.0

Git branch and state

Date: 25 October 2022 14:22 BST

(base) git status
On branch release_270stable
Your branch is up to date with 'origin/release_270stable'.

nothing to commit, working tree clean

Environment

On Levante:

mamba env create -n tool270Test -f environment.yml
conda activate tool270Test

Environment file

ToolEnv270Test.yml

Extraneous file movements

I moved the autoassess-specific files to /home/b/b382109/autoassess_files - the run was then successful for the AA recipes 👍

Ad-hoc hacks (code changes)

Mods to config user file

Added DKRZ downloaded data pool as:

  CMIP6:
    - /work/bd0854/DATA/ESMValTool2/CMIP6_DKRZ
    - /work/bd0854/DATA/ESMValTool2/download/CMIP6
  CMIP5:
    - /work/bd0854/DATA/ESMValTool2/CMIP5_DKRZ
    - /work/bd0854/b309141/additional_CMIP5
    - /work/bd0854/DATA/ESMValTool2/download/cmip5/output1
    - /work/bd0854/DATA/ESMValTool2/download/cmip5

as @schlunma and @remi-kazeroni have suggested 🍺

Recipe runs

Recipe run results (final as of 27 October 2022) are listed in #2881 (comment) (with very many thanks to @remi-kazeroni for running the impossible-to-run ones!) and are as follows:

  • 122(121)*/127 successfully run recipes
  • 0(1)*/127 failed with a DiagnosticError; that one has been fixed and rerun, but the fix is not yet PR-ed
  • 2/127 that are missing data (for reals)
  • 3/127 that have various issues (not missing data and not DiagnosticError)

(*) the number in parentheses treats the one recipe that had a DiagnosticError (fixed, but not yet PR-ed) as failed; the number outside counts it as successful

Running the comparison

Login and access to the DKRZ esmvaltool VM

Results from recipe runs are stored on the VM; login with:

ssh youraccount@esmvaltool.dkrz.de

Get and install miniconda on VM

E.g. scp Miniconda3-py39_4.12.0-Linux-x86_64.sh b382109@esmvaltool.dkrz.de:~ from a file already on Levante.

Setting up the input files

If you wrote the recipe run output to the Levante /scratch partition, be aware that
the data will be removed after two weeks, so you will have to copy the output data
to the /work partition, via e.g. a nohup job:

nohup cp -r /scratch/b/b382109/esmvaltool_output/* /work/bd0854/b382109/v270

/work is visible from the VM, so you can run the compare tool straight on the VM.

NOTE: do not store final release results that include the /preproc/ dirs on the VM; the total
size of all the recipes' output, including /preproc/ dirs, is in the 4.5TB ballpark,
much too large for the VM storage capacity.
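
A minimal sketch of one way to stage results without the preproc/ directories, assuming each run directory contains a subdirectory literally named preproc (as the note above suggests). The function name stage_without_preproc is hypothetical and not part of ESMValTool.

```python
import shutil
from pathlib import Path


def stage_without_preproc(src, dst):
    """Copy the tree under src to dst, skipping every entry named 'preproc'."""
    src, dst = Path(src), Path(dst)
    # ignore_patterns is applied at every level of the tree, so nested
    # <recipe_run>/preproc directories are skipped as well.
    shutil.copytree(
        src,
        dst,
        ignore=shutil.ignore_patterns("preproc"),
        dirs_exist_ok=True,  # requires Python >= 3.8
    )
    return dst
```

Something like stage_without_preproc("/work/bd0854/b382109/v270", "<target dir on the VM>") would then copy plots, work files and logs only; the target directory is left as a placeholder here because the thread does not name it.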

Running compare tool at VM

  • run date: 28 October 2022 (1st run)
  • conda env: tool270Compare
  • ESMValTool branch: release_270stable
  • prerequisite: pip install imagehash
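
For context on the imagehash prerequisite: imagehash is a third-party package for perceptual image hashing. The pure-Python toy below only illustrates the general idea of an average hash on tiny grayscale grids. It is not the code in compare.py, and which hash compare.py actually uses is not stated in this issue.

```python
def average_hash(pixels):
    """Return a bit string: '1' where a grayscale pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Two made-up 2x2 "images": the second is the first with slight brightness noise.
reference = [[10, 200], [220, 15]]
current = [[12, 198], [221, 14]]
distance = hamming_distance(average_hash(reference), average_hash(current))
```

A small distance means the two images have the same coarse structure even when individual pixel values (and hence file sizes or checksums) differ slightly.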

Input/output/run

  • current: /work/bd0854/b382109/v270 (contains preproc/ dirs too, 122 recipes)
  • reference: /mnt/esmvaltool_disk2/shared/esmvaltool/v2.6.0rc4 (does not contain preproc/ dirs)
  • cmd: nohup python ESMValTool/esmvaltool/utils/testing/regression/compare.py /mnt/esmvaltool_disk2/shared/esmvaltool/v2.6.0rc4 /work/bd0854/b382109/v270 > compare270output.txt

Sanity check, as outputted by compare.py

Comparing recipe run(s) in:
/work/bd0854/b382109/v270
to reference in /mnt/esmvaltool_disk2/shared/esmvaltool/v2.6.0rc4

First pass result

Running compare.py flags a few recipes as not-OK (NOK) because their plots differ from the previous release v2.6.0; summary in #2881 (comment)

Detailed plots inspection

Inspection of the differing plots for the 34 NOK recipes is happening in #2881 (comment)

@valeriupredoi
Contributor Author

@sloosvel I am in dire pain after realizing blithering DKRZ's SLURM emails me for every recipe 😵‍💫

@valeriupredoi
Contributor Author

@sloosvel what's these jobs up to?

(tool270Test) squeue -u b382109
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           2378977   compute recipe_z  b382109 PD       0:00      1 (AssocMaxJobsLimit)
           2378976   compute recipe_w  b382109 PD       0:00      1 (AssocMaxJobsLimit)
           2378975   compute recipe_w  b382109 PD       0:00      1 (AssocMaxJobsLimit)
           2378974   compute recipe_w  b382109 PD       0:00      1 (AssocMaxJobsLimit)

@sloosvel
Contributor

@sloosvel I am in dire pain after realizing blithering DKRZ's SLURM emails me for every recipe 😵‍💫

You can comment that if it's not useful to you, to me it was!

@sloosvel what's these jobs up to?

I think there is a limit on the number of jobs an account can run simultaneously on Levante. They will be pending until other jobs finish I guess

@remi-kazeroni
Contributor

@sloosvel what's these jobs up to?

On Levante, a user can't have more than 20 Slurm jobs running at a time. As soon as a job is finished, the next one should start
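
To see at a glance how many jobs are held back by that limit, the squeue listing can be tallied. This is a rough sketch that assumes the default eight-column squeue layout shown earlier in the thread and job names without spaces; count_job_states is a hypothetical helper and the sample text is copied from the output above.

```python
from collections import Counter

SQUEUE_OUTPUT = """\
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
           2378977   compute recipe_z  b382109 PD       0:00      1 (AssocMaxJobsLimit)
           2378976   compute recipe_w  b382109 PD       0:00      1 (AssocMaxJobsLimit)
"""


def count_job_states(squeue_text):
    """Count jobs per (state, nodelist-or-reason) pair in squeue text output."""
    counts = Counter()
    for line in squeue_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 8:
            continue  # ignore blank or truncated lines
        state, last_column = fields[4], fields[7]
        counts[(state, last_column)] += 1
    return counts


pending = count_job_states(SQUEUE_OUTPUT)[("PD", "(AssocMaxJobsLimit)")]
```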

@valeriupredoi
Contributor Author

They will be pending until other jobs finish I guess

Cheers! More emails then 🤦‍♂️ 🤣

@valeriupredoi
Contributor Author

valeriupredoi commented Oct 26, 2022

OK guys - first (and only) sbatch session over on Levante (I have one stray recipe still running, it's a zombie though) and this is how it looks:

Recipe running session 2022-10-26 13:13:41.568698

Successfully run recipes

122 out of 127 final

  • recipe_anav13jclim.yml by @remi-kazeroni
  • recipe_albedolandcover.yml
  • recipe_arctic_ocean.yml
  • recipe_autoassess_landsurface_permafrost.yml
  • recipe_autoassess_landsurface_soilmoisture.yml
  • recipe_autoassess_landsurface_surfrad.yml
  • recipe_autoassess_radiation_rms_Amon_all.yml
  • recipe_autoassess_radiation_rms_Amon_obs.yml
  • recipe_autoassess_stratosphere.yml
  • recipe_bock20jgr_fig_1-4.yml by @remi-kazeroni
  • recipe_bock20jgr_fig_6-7.yml
  • recipe_bock20jgr_fig_8-10.yml
  • recipe_capacity_factor.yml
  • recipe_carvalhais14nat.yml
  • recipe_climwip_brunner2019_med.yml by @remi-kazeroni
  • recipe_climwip_brunner20esd.yml
  • recipe_climwip_test_basic.yml
  • recipe_climwip_test_performance_sigma.yml
  • recipe_clouds_bias.yml
  • recipe_clouds_ipcc.yml
  • recipe_cmug_h2o.yml
  • recipe_collins13ipcc.yml by @remi-kazeroni
  • recipe_combined_indices.yml
  • recipe_concatenate_exps.yml
  • recipe_consecdrydays.yml
  • recipe_correlation.yml
  • recipe_cox18nature.yml
  • recipe_cvdp.yml
  • recipe_daily_era5.yml
  • recipe_deangelis15nat.yml
  • recipe_deangelis15nat_fig1_fast.yml
  • recipe_decadal.yml
  • recipe_diurnal_temperature_index.yml
  • recipe_eady_growth_rate.yml
  • recipe_ecs.yml
  • recipe_ecs_constraints.yml
  • recipe_ecs_scatter.yml
  • recipe_ensclus.yml
  • recipe_era5-land.yml
  • recipe_esacci_lst.yml
  • recipe_esacci_oc.yml
  • recipe_extract_shape.yml
  • recipe_extreme_events.yml
  • recipe_extreme_index.yml
  • recipe_eyring06jgr.yml
  • recipe_eyring13jgr_12.yml
  • recipe_gier2020bg.yml
  • recipe_globwat.yml
  • recipe_heatwaves_coldwaves.yml
  • recipe_hydro_forcing.yml
  • recipe_hyint.yml
  • recipe_hyint_extreme_events.yml
  • recipe_hype.yml
  • recipe_impact.yml by @remi-kazeroni
  • recipe_ipccwg1ar6ch3_atmosphere.yml
  • recipe_julia.yml
  • recipe_kcs.yml
  • recipe_landcover.yml
  • recipe_lauer13jclim.yml
  • recipe_li17natcc.yml
  • recipe_lisflood.yml
  • recipe_marrmot.yml
  • recipe_martin18grl.yml
  • recipe_meehl20sciadv.yml
  • recipe_miles_block.yml
  • recipe_miles_eof.yml
  • recipe_miles_regimes.yml
  • recipe_modes_of_variability.yml
  • recipe_monitor.yml
  • recipe_monitor_with_refs.yml
  • recipe_mpqb_xch4.yml
  • recipe_multimodel_products.yml
  • recipe_my_personal_diagnostic.yml
  • recipe_ncl.yml
  • recipe_ocean_Landschuetzer2016.yml
  • recipe_ocean_amoc.yml
  • recipe_ocean_bgc.yml
  • recipe_ocean_example.yml
  • recipe_ocean_ice_extent.yml
  • recipe_ocean_multimap.yml
  • recipe_ocean_quadmap.yml
  • recipe_ocean_scalar_fields.yml
  • recipe_pcrglobwb.yml
  • recipe_preprocessor_derive_test.yml
  • recipe_preprocessor_test.yml
  • recipe_psyplot.yml
  • recipe_pv_capacity_factor.yml
  • recipe_python.yml
  • recipe_quantilebias.yml
  • recipe_perfmetrics_CMIP5.yml by @remi-kazeroni
  • recipe_perfmetrics_CMIP5_4cds.yml by @remi-kazeroni
  • recipe_r.yml
  • recipe_radiation_budget.yml
  • recipe_rainfarm.yml
  • recipe_runoff_et.yml
  • recipe_russell18jgr.yml
  • recipe_schlund20esd.yml
  • recipe_schlund20jgr_gpp_abs_rcp85.yml
  • recipe_schlund20jgr_gpp_change_1pct.yml
  • recipe_schlund20jgr_gpp_change_rcp85.yml
  • recipe_sea_surface_salinity.yml
  • recipe_seaice.yml by @remi-kazeroni
  • recipe_seaice_drift.yml
  • recipe_seaice_feedback.yml
  • recipe_shapeselect.yml
  • recipe_smpi.yml
  • recipe_smpi_4cds.yml
  • recipe_snowalbedo.yml
  • recipe_spei.yml
  • recipe_tcr.yml
  • recipe_tebaldi21esd.yml
  • recipe_thermodyn_diagtool.yml
  • recipe_toymodel.yml
  • recipe_validation.yml
  • recipe_validation_CMIP6.yml
  • recipe_variable_groups.yml
  • recipe_wenzel14jgr.yml
  • recipe_wenzel16jclim.yml
  • recipe_wenzel16nat.yml
  • recipe_wflow.yml
  • recipe_williams09climdyn_CREM.yml
  • recipe_zmnam.yml

Recipes that failed with DiagnosticError

0 out of 127 (1 fixed, not PR-ed yet)

Recipes that failed because of Missing Data

2 out of 127 final

Recipes that failed for other reasons

3 out of 127 final

Obsolete/resolved issues comment:

The Julia ones are totally my bad - I forgot to install Julia after installing esmvaltool. The autoassess ones either hit the old bug that @alistairsellar is fixing now, or they need aux data that is only on JASMIN. The Missing Data ones are bothering me badly, since I have turned on auto downloads but they are still missing data - what do you guys recommend doing about those, @sloosvel @remi-kazeroni @bouweandela? I will post detailed postmortems for the ones that have failed for odd reasons below 👍

@valeriupredoi
Contributor Author

valeriupredoi commented Oct 26, 2022

Postmortem of failed recipes OTHER THAN Missing Data

Recipes that failed with DiagnosticError

0 out of 127 (1 fixed, not yet PR-ed)

Recipes that failed for other reasons or are still running

1 out of 127

@remi-kazeroni
Contributor

Hi @valeriupredoi, great job with the testing! I forgot to mention but we have a central pool of downloaded data on Levante at /work/bd0854/DATA/ESMValTool2/download/CMIP6, /work/bd0854/DATA/ESMValTool2/download/cmip5/output1, and /work/bd0854/DATA/ESMValTool2/download/cmip5/output1. Maybe you could add those to your path on top of your download directory? This should help solve the time limit issues (lots of fx files searched on ESGF and/or downloaded I guess).

@remi-kazeroni
Contributor

recipe_smpi.yml - too slow Elapsed time : 04:00:19 (Timelimit=04:00:00)

For this one, I would recommend using:

#SBATCH --partition=compute
#SBATCH --time=08:00:00
#SBATCH --constraint=512G
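
If per-recipe settings like these end up in a script that generates the submit files, they could be kept as overrides on top of a default. The sketch below is hypothetical: sbatch_header, header_for, OVERRIDES and DEFAULTS are illustrative names, the DEFAULTS values are placeholders (only the 04:00:00 time limit appears in the thread), and only the recipe_smpi.yml values come from the recommendation above.

```python
def sbatch_header(partition, time_limit, constraint=None):
    """Render #SBATCH directive lines for the top of a batch script."""
    lines = [
        f"#SBATCH --partition={partition}",
        f"#SBATCH --time={time_limit}",
    ]
    if constraint is not None:
        lines.append(f"#SBATCH --constraint={constraint}")
    return "\n".join(lines)


# Placeholder defaults (assumption); adjust to the real submit template.
DEFAULTS = {"partition": "compute", "time_limit": "04:00:00"}

# Per-recipe overrides; the values for recipe_smpi.yml are the ones recommended above.
OVERRIDES = {
    "recipe_smpi.yml": {
        "partition": "compute",
        "time_limit": "08:00:00",
        "constraint": "512G",
    },
}


def header_for(recipe):
    """Merge the defaults with any override for this recipe and render the header."""
    settings = {**DEFAULTS, **OVERRIDES.get(recipe, {})}
    return sbatch_header(**settings)
```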

@valeriupredoi
Contributor Author

Indeed, cheers @remi-kazeroni - smpi is a memory gobbler - I restarted it on SLURM and promptly got kicked out coz mem limit (this time around I think all data has been downloaded, hence it went to intensive processing). I'll resubmit with mem reqs. What do you recommend about those that really-really are missing data?

@valeriupredoi
Contributor Author

recipe_smpi.yml - too slow Elapsed time : 04:00:19 (Timelimit=04:00:00)

For this one, I would recommend using:

#SBATCH --partition=compute
#SBATCH --time=08:00:00
#SBATCH --constraint=512G

even with 512G still fails out of MEM 😮

@valeriupredoi
Contributor Author

oh crap, forgot to change the partition 😶‍🌫️

@remi-kazeroni
Contributor

recipe_smpi.yml - too slow Elapsed time : 04:00:19 (Timelimit=04:00:00)

For this one, I would recommend using:

#SBATCH --partition=compute
#SBATCH --time=08:00:00
#SBATCH --constraint=512G

even with 512G still fails out of MEM 😮

You can try with 1024G then! But that's the highest available

@valeriupredoi
Contributor Author

recipe_smpi.yml - too slow Elapsed time : 04:00:19 (Timelimit=04:00:00)

For this one, I would recommend using:

#SBATCH --partition=compute
#SBATCH --time=08:00:00
#SBATCH --constraint=512G

even with 512G still fails out of MEM 😮

You can try with 1024G then! But that's the highest available

totally user-side - forgot to change the partition to compute - cheers, dude! 🍺

@sloosvel
Contributor

I never managed to run the smpi recipes, @remi-kazeroni did it for me in the last release. Maybe the batch script settings for this recipe can be changed in #2883

@valeriupredoi
Contributor Author

with correct SLURM settings as recommended by @remi-kazeroni (🍺) those smpi monsters are happily plodding along now - yes, we should change the settings for sure. @sloosvel how did you fix the runs for those recipes that really-really don't have data, like I found in #2881 (comment)?

@remi-kazeroni
Contributor

I don't have a definitive answer for the really-really missing data cases. As said in this comment, you could try to rerun the recipes adding these paths to your config file. But that data pool is 2 releases old. One could argue that we should delete it and re-download everything as /work/bd0854/DATA/ESMValTool2/download/ may contain data retracted from ESGF...

Taking a closer look at some of these (currently) 13 cases:

@sloosvel
Contributor

I think for recipe_climate_change_hotspot.yml, I ended up running it on jasmin

@valeriupredoi
Contributor Author

Hi @remi-kazeroni @sloosvel awesome, thanks a lot! Here's the thing(s):

  • recipe_anav13jclim.yml - this is not optimal if "special" cmip5 data is needed, that is not available on ESGF - I would add this recipe to the list of those we have to see what to do about it wrt obsolete data
  • recipe_climate_change_hotspot.yml - same as above, unless there is a serious reason why it's not working, having to have preferred sites where recipes run is against our core principle of reproducibility of results

I'll have a closer look at the meehl and schlund ones, and will ping @schlunma asap

@katjaweigel
Contributor

katjaweigel commented Oct 26, 2022

Yes, the version of recipe_flato13ipcc.yml currently in #2156 is running. The cost is removing/commenting out data sets which do not work on Levante (and fixing a wrong time period for one model). There was already some discussion on how to deal with such cases, and if I remember right @axel-lauer, who is the maintainer of the original recipe_flato13ipcc.yml, did not agree on removing data sets? It should also be noted that the option --skip_nonexistent does not work for all diagnostics in recipe_flato13ipcc.yml, because several of them need data sets from e.g. two different experiments and do not work if only one is there. Therefore I was going to ask in this issue which version of recipe_flato13ipcc.yml should end up in #2156. (Unfortunately I'm also not completely done with some issues in recipe_flato13ipcc_figures_938_941.yml; I hope to finish them soon.)

@schlunma
Contributor

V, can you adapt the permissions of /scratch/b/b382109/esmvaltool_output so I can have a look at the logs?

@valeriupredoi
Contributor Author

/scratch/b/b382109/esmvaltool_output

@schlunma Manu, they are here /home/b/b382109/manu_logs

@valeriupredoi
Contributor Author

valeriupredoi commented Oct 26, 2022

Yes, the version of recipe_flato13ipcc.yml currently in #2156 is running. The cost is removing/commenting out data sets which do not work on Levante (and fixing a wrong time period for one model). There was already some discussion on how to deal with such cases, and if I remember right @axel-lauer, who is the maintainer of the original recipe_flato13ipcc.yml, did not agree on removing data sets? It should also be noted that the option --skip_nonexistent does not work for all diagnostics in recipe_flato13ipcc.yml, because several of them need data sets from e.g. two different experiments and do not work if only one is there. Therefore I was going to ask in this issue which version of recipe_flato13ipcc.yml should end up in #2156. (Unfortunately I'm also not completely done with some issues in recipe_flato13ipcc_figures_938_941.yml; I hope to finish them soon.)

@katjaweigel many thanks for your clarification! I will consider this recipe at-risk for now, and will not faff about it until you guys fix it - not the first and not the last time we include not really fully working recipes in a release 😁

@schlunma
Contributor

cd: permission denied: /home/b/b382109/manu_logs 😢

@sloosvel
Contributor

Hi @valeriupredoi please let me know if you want to schedule a call, I have to say that I am quite confused by all your issues. I did not run into any of that.

@valeriupredoi
Contributor Author

Hi @sloosvel - many thanks, am back on track now, no need for a call just yet, maybe if you could keep an eye on this issue if I ask for some help, that'd be awesome 🍺 👍

@valeriupredoi
Contributor Author

OK comparison tool is now plodding along nicely - I have also added the instructions in the issue description - we can use that description to hatch us a nice doc entry - the next RM should not go through the Gates of VM Purgatory like I did yesterday 👍

@valeriupredoi
Contributor Author

valeriupredoi commented Oct 28, 2022

Comparison results

Run command and output stored

  • location: DKRZ VM
  • current: /work/bd0854/b382109/v270 (contains preproc/ dirs too, 122 recipes)
  • reference: /mnt/esmvaltool_disk2/shared/esmvaltool/v2.6.0rc4 (does not contain preproc/ dirs)
  • cmd: nohup python ESMValTool/esmvaltool/utils/testing/regression/compare.py /mnt/esmvaltool_disk2/shared/esmvaltool/v2.6.0rc4 /work/bd0854/b382109/v270 > compare270output.txt
  • examination from /home/b/b382109/compare270output.txt

Per recipe result

Legend:

  • OK: plots identical (even if some work nc are different)
  • NOK: plots differ

122 out of 127 final

  • recipe_anav13jclim.yml by @remi-kazeroni NOK
  • recipe_albedolandcover.yml NOK
  • recipe_arctic_ocean.yml OK
  • recipe_autoassess_landsurface_permafrost.yml OK
  • recipe_autoassess_landsurface_soilmoisture.yml NOK
  • recipe_autoassess_landsurface_surfrad.yml OK
  • recipe_autoassess_radiation_rms_Amon_all.yml OK
  • recipe_autoassess_radiation_rms_Amon_obs.yml OK
  • recipe_autoassess_stratosphere.yml OK
  • recipe_bock20jgr_fig_1-4.yml by @remi-kazeroni NOK
  • recipe_bock20jgr_fig_6-7.yml NOK
  • recipe_bock20jgr_fig_8-10.yml OK
  • recipe_capacity_factor.yml NOK
  • recipe_carvalhais14nat.yml all plot files are now pdf (from png) - need look NOK
  • recipe_climwip_brunner2019_med.yml by @remi-kazeroni OK
  • recipe_climwip_brunner20esd.yml OK
  • recipe_climwip_test_basic.yml OK
  • recipe_climwip_test_performance_sigma.yml OK
  • recipe_clouds_bias.yml OK
  • recipe_clouds_ipcc.yml OK
  • recipe_cmug_h2o.yml OK
  • recipe_collins13ipcc.yml by @remi-kazeroni NOK
  • recipe_combined_indices.yml OK
  • recipe_concatenate_exps.yml OK
  • recipe_consecdrydays.yml OK
  • recipe_correlation.yml OK
  • recipe_cox18nature.yml OK
  • recipe_cvdp.yml OK
  • recipe_daily_era5.yml OK
  • recipe_deangelis15nat.yml OK
  • recipe_deangelis15nat_fig1_fast.yml OK
  • recipe_decadal.yml OK
  • recipe_diurnal_temperature_index.yml OK
  • recipe_eady_growth_rate.yml OK
  • recipe_ecs.yml OK
  • recipe_ecs_constraints.yml NOK
  • recipe_ecs_scatter.yml OK
  • recipe_ensclus.yml OK
  • recipe_era5-land.yml OK
  • recipe_esacci_lst.yml OK
  • recipe_esacci_oc.yml OK
  • recipe_extract_shape.yml OK
  • recipe_extreme_events.yml missing 2 plots, plots differ too NOK
  • recipe_extreme_index.yml missing 1 plots, plots differ too NOK
  • recipe_eyring06jgr.yml OK
  • recipe_eyring13jgr_12.yml OK
  • recipe_gier2020bg.yml 1 plots differ NOK
  • recipe_globwat.yml OK
  • recipe_heatwaves_coldwaves.yml OK
  • recipe_hydro_forcing.yml OK
  • recipe_hyint.yml NOK
  • recipe_hyint_extreme_events.yml NOK
  • recipe_hype.yml OK
  • recipe_impact.yml by @remi-kazeroni no reference found, unable to check
  • recipe_ipccwg1ar6ch3_atmosphere.yml no reference run found, unable to check
  • recipe_julia.yml OK
  • recipe_kcs.yml NOK
  • recipe_landcover.yml OK
  • recipe_lauer13jclim.yml OK
  • recipe_li17natcc.yml NOK
  • recipe_lisflood.yml OK
  • recipe_marrmot.yml OK
  • recipe_martin18grl.yml NOK
  • recipe_meehl20sciadv.yml all plots went from png to pdf NOK
  • recipe_miles_block.yml NOK
  • recipe_miles_eof.yml NOK
  • recipe_miles_regimes.yml OK
  • recipe_modes_of_variability.yml NOK
  • recipe_monitor.yml NOK
  • recipe_monitor_with_refs.yml NOK
  • recipe_mpqb_xch4.yml OK
  • recipe_multimodel_products.yml NOK
  • recipe_my_personal_diagnostic.yml OK
  • recipe_ncl.yml OK
  • recipe_ocean_Landschuetzer2016.yml OK
  • recipe_ocean_amoc.yml OK
  • recipe_ocean_bgc.yml NOK
  • recipe_ocean_example.yml NOK
  • recipe_ocean_ice_extent.yml NOK
  • recipe_ocean_multimap.yml OK
  • recipe_ocean_quadmap.yml OK
  • recipe_ocean_scalar_fields.yml OK
  • recipe_pcrglobwb.yml OK
  • recipe_preprocessor_derive_test.yml OK
  • recipe_preprocessor_test.yml OK
  • recipe_psyplot.yml OK
  • recipe_pv_capacity_factor.yml OK
  • recipe_python.yml OK
  • recipe_quantilebias.yml OK
  • recipe_perfmetrics_CMIP5.yml by @remi-kazeroni OK
  • recipe_perfmetrics_CMIP5_4cds.yml by @remi-kazeroni NOK
  • recipe_r.yml OK
  • recipe_radiation_budget.yml OK
  • recipe_rainfarm.yml OK
  • recipe_runoff_et.yml OK
  • recipe_russell18jgr.yml OK
  • recipe_schlund20esd.yml all plots changed from png to pdf NOK
  • recipe_schlund20jgr_gpp_abs_rcp85.yml NOK
  • recipe_schlund20jgr_gpp_change_1pct.yml OK
  • recipe_schlund20jgr_gpp_change_rcp85.yml NOK
  • recipe_sea_surface_salinity.yml OK
  • recipe_seaice.yml by @remi-kazeroni OK
  • recipe_seaice_drift.yml OK
  • recipe_seaice_feedback.yml OK
  • recipe_shapeselect.yml OK
  • recipe_smpi.yml OK
  • recipe_smpi_4cds.yml OK
  • recipe_snowalbedo.yml OK
  • recipe_spei.yml NOK
  • recipe_tcr.yml OK
  • recipe_tebaldi21esd.yml no reference run found, unable to check
  • recipe_thermodyn_diagtool.yml OK
  • recipe_toymodel.yml NOK
  • recipe_validation.yml OK
  • recipe_validation_CMIP6.yml OK
  • recipe_variable_groups.yml OK
  • recipe_wenzel14jgr.yml OK
  • recipe_wenzel16jclim.yml OK
  • recipe_wenzel16nat.yml NOK
  • recipe_wflow.yml OK
  • recipe_williams09climdyn_CREM.yml OK
  • recipe_zmnam.yml OK

Result

We need to look at plots for 34 recipes; we're good to go for 85 recipes; 3 have no reference in 2.6.0
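
The totals above can be cross-checked by tallying the trailing status of each entry in the per-recipe list. A small sketch, assuming every entry ends in OK, NOK or an "unable to check" note as in the list above; classify is a hypothetical helper and SAMPLE holds four entries copied from that list.

```python
from collections import Counter


def classify(line):
    """Map one entry of the per-recipe list to 'OK', 'NOK' or 'no reference'."""
    text = line.strip()
    if text.endswith("NOK"):  # must be tested before 'OK', which it also ends with
        return "NOK"
    if text.endswith("OK"):
        return "OK"
    if "unable to check" in text:
        return "no reference"
    raise ValueError(f"unrecognised status in: {line!r}")


SAMPLE = [
    "recipe_albedolandcover.yml NOK",
    "recipe_arctic_ocean.yml OK",
    "recipe_extreme_events.yml missing 2 plots, plots differ too NOK",
    "recipe_impact.yml by @remi-kazeroni no reference found, unable to check",
]
tally = Counter(classify(line) for line in SAMPLE)
```

Applied to the full list, the three counts should add up to the 122 compared recipes (34 + 85 + 3).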

@bouweandela
Member

bouweandela commented Oct 28, 2022

And as mentioned in my previous comment, anav13 is a special case since data from output2 cannot be read with the default DRS (our fault, not CMIPs!). You could also try:

CMIP5: /work/bd0854/DATA/ESMValTool2/download/cmip5/output2

@schlunma Why are the project and product facets hardcoded in the path? This should work fine if the DRS starts with {project.lower}/{product}/.. (i.e. the one called ESGF in config-developer.yml) and the rootpath is set to /work/bd0854/DATA/ESMValTool2/download.

@bouweandela
Member

Oh wait, is that because you're trying to combine data downloaded with the ESGF DRS with the DKRZ DRS and we don't support per path DRS settings? ESMValGroup/ESMValCore#129

@schlunma
Contributor

Oh wait, is that because you're trying to combine data downloaded with the ESGF DRS with the DKRZ DRS and we don't support per path DRS settings? ESMValGroup/ESMValCore#129

Exactly. We found that using the full paths with project and output for the downloaded data is currently the cleanest way to include the DKRZ and ESGF rootpaths.

@bouweandela
Member

Actually, the cleanest way to make it work is just to set download_dir: /work/bd0854/DATA/ESMValTool2/download/ (provided you have write access there) and run with offline: false and only use the 'official' rootpaths for the CMIP projects. This works because the tool will always check if it already has a file before downloading it. In this case you're lucky that the ESGF DRS is similar to the DKRZ one, so your approach works too. Anyway, I'll see if I can do something about ESMValGroup/ESMValCore#129 for the next release.
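
The "check if it already has a file before downloading it" behaviour can be pictured with a toy lookup like the one below. This is only an illustration of the idea, not ESMValCore code: locate and its fetch callback are hypothetical, and the real tool resolves paths through its DRS settings rather than a plain relative path.

```python
from pathlib import Path


def locate(relative_path, rootpaths, download_dir, fetch):
    """Return a local copy of relative_path, fetching it only if none exists."""
    for root in [*rootpaths, download_dir]:
        candidate = Path(root) / relative_path
        if candidate.exists():
            return candidate  # already on disk: no download needed
    target = Path(download_dir) / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    fetch(relative_path, target)  # caller-supplied download function
    return target
```

With download_dir pointing at a shared, writable pool, a second run asking for the same file finds it on disk and never calls fetch.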

@schlunma
Contributor

We tried that too, but if I remember correctly there was a problem with this. Could be that it was an issue because at the beginning downloading was very slow (so there was no chance that slow recipes would run), which should be fixed now. I will give it another try sometime in the future.

It would be really great if you could do something about ESMValGroup/ESMValCore#129 🚀

@valeriupredoi
Contributor Author

valeriupredoi commented Oct 28, 2022

alright folks, as we have seen in #2881 (comment) we need to look at some recipes that have not had the same plots as in v2.6.0, these are 34 party poopers:

  • recipe_anav13jclim
  • recipe_albedolandcover
  • recipe_autoassess_landsurface_soilmoisture (this time it was properly run on Levante!)
  • recipe_bock20jgr_fig_1-4 - plot legends are nicer in 2.7.0 🍺
  • recipe_bock20jgr_fig_6-7 png to pdf format, plots look the same (my eyes hurt now)
  • recipe_capacity_factor - stared at it like an idiot, can't see any difference, img filesize varies by <1.5%
  • recipe_carvalhais14nat - all fine, plots PDF now, not png, see Recipe recipe_carvalhais14nat.yml fails at plotting in diagnostic #2886
  • recipe_collins13ipcc
  • recipe_ecs_constraints (@schlunma: only auxiliary plots change - no idea what though, they look fine - important results are unchanged)
  • recipe_extreme_events - plots OK (identical); missing these though:
    Missing files:
    • plots/extreme_events/main/.ipynb_checkpoints/Gleckler_CMIP_24-models_11-idx_4-obs_1981-2000-checkpoint.png
    • plots/extreme_events/main/.ipynb_checkpoints/cddETCCDI_yr_4-obs_ensmean_timeseriesplot-checkpoint.png
      Not sure what that .ipynb_checkpoints (python notebook) dir is doing there - the actual plots with those names (bar checkpoint) are there and fine, am calling it OK
  • recipe_extreme_index - same as for the other extreme, ipython guff snuck in 2.6.0 w/o need
  • recipe_gier2020bg
  • recipe_hyint
  • recipe_hyint_extreme_events
  • recipe_kcs - blue lines differ a swoosh 2.7 vs 2.6 - diags use sampling, and a rand num generator, so I'd expect variations
  • recipe_li17natcc - they actually fixed buggy plot axes (nice!) see 2.7 vs 2.6
  • recipe_martin18grl - same here, cleaned up buggy plot axes, see eg 2.7 vs 2.6
  • recipe_meehl20sciadv (@schlunma: v2.7.0 uses PDF, v2.6.0 used PNG, plots look fine)
  • recipe_miles_block
  • recipe_miles_eof
  • recipe_modes_of_variability - plots and table differ significantly; files affected are:
    Differing files:
  • recipe_monitor (@schlunma: plots look slightly different but still reasonable, no idea what changed)
  • recipe_monitor_with_refs (@schlunma: plot now shows basic stats: Added option to show basic statistics in plots of monitor/multi_datasets.py #2790)
  • recipe_multimodel_products - bunch of tiny x's change from one version to another 2.7 vs 2.6 - not anything broken, but would be worth looking into it
  • recipe_ocean_bgc
  • recipe_ocean_example - see Recipe testing and comparison for release 2.7.0 #2881 (comment)
  • recipe_ocean_ice_extent - coastline line contours much more pronounced in 2.7, see Recipe testing and comparison for release 2.7.0 #2881 (comment)
  • recipe_perfmetrics_CMIP5_4cds - figure legend files, but they are fine and the same
  • recipe_schlund20esd (@schlunma: v2.7.0 uses PDF, v2.6.0 used PNG, plots look fine)
  • recipe_schlund20jgr_gpp_abs_rcp85 (@schlunma: uses a probabilistic algorithm, plots are expected to change but look reasonable; will see what I can do for the next release -> Recipes schlund20jgr_*.yml give non-deterministic results  #2889)
  • recipe_schlund20jgr_gpp_change_rcp85 (@schlunma: uses a probabilistic algorithm, plots are expected to change but look reasonable; will see what I can do for the next release -> Recipes schlund20jgr_*.yml give non-deterministic results  #2889)
  • recipe_spei - changed hist colours from a baby pink to ugly brown see 2.7 vs 2.6 - @katjaweigel please yell if you think there are other differences, they look idem to me
  • recipe_toymodel plots differ 2.7 vs 2.6 but this is again synthetic rand data generation so would expect it to differ; also R diagnostic, so prob related to see above
  • recipe_wenzel16nat (@schlunma: looks fine)
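
Several entries above (kcs, toymodel, the schlund20jgr recipes) differ because the diagnostics draw random numbers. A generic way to make such output comparable between runs is a dedicated, seeded generator; the sketch below is a plain-Python illustration with a hypothetical resample_mean function and is not taken from any of those diagnostics.

```python
import random


def resample_mean(values, n_draws, seed):
    """Mean of a resample (with replacement) drawn from a seeded generator."""
    rng = random.Random(seed)  # private generator: global random state is untouched
    draws = [rng.choice(values) for _ in range(n_draws)]
    return sum(draws) / n_draws
```

Calling it twice with the same seed returns the same number, so plots built from such results would no longer change from run to run.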

To quickly identify differing plots please have a look at this log https://esmvaltool.dkrz.de/shared/esmvaltool/compare270output_trimmed.txt

We can have a look at them in the run list for v2.7.0 https://esmvaltool.dkrz.de/shared/esmvaltool/v2.7.0/debug.html vs the v2.6.0 one https://esmvaltool.dkrz.de/shared/esmvaltool/v2.6.0/debug.html - I will start having a look myself but by all means, @ESMValGroup/esmvaltool-developmentteam I could really use a hand here, especially since you (as recipe maintainers/developers) know these things well, they're all beetles and bugs on coloured paper to me 😁

@bettina-gier
Contributor

Is there a log file where we can see the differences? My recipe has a lot of plots and if just one of them differs as the list says it'd be easier to just look at that one =D

@valeriupredoi
Contributor Author

Is there a log file where we can see the differences? My recipe has a lot of plots and if just one of them differs as the list says it'd be easier to just look at that one =D

logfile coming right away - I'll post it in the comment above 🍺

@valeriupredoi
Contributor Author

before I post the log (currently curating it) to not lose you, Tina, here's the only bitty plot that differs for your recipe:
recipe_gier2020bg.yml: results differ from reference run
Reference run: /mnt/esmvaltool_disk2/shared/esmvaltool/v2.6.0rc4/recipe_gier2020bg_20220712_100159
Current run: /work/bd0854/b382109/v270/recipe_gier2020bg_20221025_142445
Differing files:

  • plots/cmip6_ensemble_analysis/main_ensemble/xco2_esm-hist_global_2003-2014_barplot_SA_obs.png
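
For recipes with many plots, an extract like the one above can be pulled out of the comparison log automatically. A rough sketch, assuming the log uses exactly the layout shown in this comment (a "<recipe>: results differ from reference run" line, then labelled lines, then a "Differing files:" list); differing_files is a hypothetical helper and SAMPLE_LOG is copied from the extract above.

```python
import re

HEADER = re.compile(r"^(recipe_\S+\.yml): results differ from reference run")

SAMPLE_LOG = """\
recipe_gier2020bg.yml: results differ from reference run
Reference run: /mnt/esmvaltool_disk2/shared/esmvaltool/v2.6.0rc4/recipe_gier2020bg_20220712_100159
Current run: /work/bd0854/b382109/v270/recipe_gier2020bg_20221025_142445
Differing files:

  • plots/cmip6_ensemble_analysis/main_ensemble/xco2_esm-hist_global_2003-2014_barplot_SA_obs.png
"""


def differing_files(log_text):
    """Map each recipe named in the log to the files listed as differing."""
    result = {}
    recipe, collecting = None, False
    for raw in log_text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            recipe, collecting = match.group(1), False
            result[recipe] = []
        elif line == "Differing files:":
            collecting = recipe is not None
        elif line.endswith(":") or ": " in line:
            collecting = False  # any other labelled line ends the file list
        elif collecting and line:
            result[recipe].append(line.lstrip("•-* "))  # drop list bullets
    return result


per_recipe = differing_files(SAMPLE_LOG)
```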

@bettina-gier
Contributor

You can pass that recipe, that's just a different sorting for the diff ensemble members in the histogram and looks diff cause they're not labeled. Cheers for the special extract for me ;)

@valeriupredoi
Contributor Author

You can pass that recipe, that's just a different sorting for the diff ensemble members in the histogram and looks diff cause they're not labeled. Cheers for the special extract for me ;)

@bettina-gier - legend, many thanks! 🍺

@sloosvel
Contributor

@valeriupredoi do you mind if I run recipe_climate_change_hotspot in jasmin, so that I can at least upload the results for this version?

@valeriupredoi
Contributor Author

@valeriupredoi do you mind if I run recipe_climate_change_hotspot in jasmin, so that I can at least upload the results for this version?

@sloosvel go for it! Cheers 🍺 - but I am planning on releasing tonight, it looks promising. If you run it and then upload the results to v2.7.0 that'd be awesome, and the release does not affect that 👍 It'd be great if you were around to approve the last PR, the one changing the version number, in an hour or so; no probs if not, I will ask @bouweandela

@TomasTorsvik
Contributor

@valeriupredoi for recipe_ocean_example, the only obvious difference seems to be the transect plots, Diag_Transect_1, Diag_Transect_2 and Diag_Transect_3. These plots are empty in v2.6.0 and non-empty in v2.7.0 (see bugfix #2858).

Diag_Transect_1 picks up the mask data 1.e20, but this is probably a separate issue.

@valeriupredoi
Contributor Author

valeriupredoi commented Oct 28, 2022

@TomasTorsvik brilliant, many thanks for looking! And a positive difference too, thanks to your PR 🍺

Diag_Transect_1 picks up the mask data 1.e20, but this is probably a separate issue.

Would you be OK to open an issue about this, please? And tag @ledm so we can fix that in 2.8. Many thanks! 🍺

@TomasTorsvik
Contributor

@valeriupredoi the same applies for recipe_ocean_bgc: the v2.7.0 run has plots for Diag_Transect_No_Data and Diag_Transect_vs_Woa that are empty in v2.6.0. The other plots look OK to me.

@valeriupredoi
Contributor Author

Fantastic, cheers @TomasTorsvik 🍺

@valeriupredoi
Contributor Author

OK this concludes the release testing marathon! Good news is there are not many bad apples among the recipes, bad news is there are a couple - see #2881 (comment) - we found a couple MAGICs project R recipes that look dubious and opened at least one issue about them (#2890). But since these recipes are unmaintained, and the developers who wrote them have in the meantime left the institutes they're listed under, I am not going to hold the release for some Da Vinci Code-style tracking down; we need to think about what we do with such recipes.

Oh and the ocean recipes by @ledm need some TLC but he's told me this for a while now, we should get together one time and fix them, no major bugs, but old crap that needs updating.

I declare this Tool ready for release! Many thanks to all who helped during this testing process @sloosvel @remi-kazeroni @schlunma @bettina-gier @TomasTorsvik and @bouweandela of course 😁 🍻

@valeriupredoi
Contributor Author

it's out and about! 🍺 https://pypi.org/project/ESMValTool/2.7.0/

@bouweandela
Member

bouweandela commented Nov 1, 2022

You can pass that recipe, that's just a different sorting for the diff ensemble members in the histogram and looks diff cause they're not labeled. Cheers for the special extract for me ;)

@bettina-gier Would it be possible to sort the ensemble members in a way that is stable between runs? With the upcoming more regular recipe testing that @ehogan et al are working on, as described in #2723, issues like this will keep popping up.
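
One way to get an ordering that is stable between runs is to sort the member labels with a numeric-aware key before plotting. The sketch below assumes labels in the CMIP6 variant style (e.g. r1i1p1f1); how the diagnostic in question actually labels its ensemble members is not shown in this thread, and member_key is a hypothetical helper.

```python
import re


def member_key(label):
    """Sort key that orders labels such as 'r10i1p1f1' numerically, not lexically."""
    # Split into runs of digits and non-digits; digit runs compare as integers.
    parts = re.findall(r"\d+|\D+", label)
    return [(0, int(part)) if part.isdigit() else (1, part) for part in parts]


members = ["r10i1p1f1", "r2i1p1f1", "r1i1p1f1"]
ordered = sorted(members, key=member_key)
```

Because the key depends only on the labels, the result is the same regardless of the order in which the files happened to be found on disk.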


8 participants