Realization marked as failed, but all fm steps completed (2024.04) #7715

Closed
larsevj opened this issue Apr 19, 2024 · 4 comments · Fixed by #7927
@larsevj
Contributor

larsevj commented Apr 19, 2024

Ran a Drogon case. On iteration 3, four realizations were marked as failed in the GUI, even though all forward model steps were marked as successful and the OK file was written. The following error message was found in the logs:

status from done callback: Error reading GEN_DATA: R_A2_SIM, errors: ['Missing output file: /scratch/fmu/levje/01_drogon_ahm_test_license/realization-19/iter-3/RFT_R_A2_1']
Error reading GEN_DATA: R_A3_SIM, errors: ['Missing output file: /scratch/fmu/levje/01_drogon_ahm_test_license/realization-19/iter-3/RFT_R_A3_1']
Error reading GEN_DATA: R_A4_SIM, errors: ['Missing output file: /scratch/fmu/levje/01_drogon_ahm_test_license/realization-19/iter-3/RFT_R_A4_1']
Error reading GEN_DATA: R_A5_SIM, errors: ['Missing output file: /scratch/fmu/levje/01_drogon_ahm_test_license/realization-19/iter-3/RFT_R_A5_1']
Error reading GEN_DATA: R_A6_SIM, errors: ['Missing output file: /scratch/fmu/levje/01_drogon_ahm_test_license/realization-19/iter-3/RFT_R_A6_1']
Error reading GEN_DATA: TRACER_SIM, errors: ['Missing output file: /scratch/fmu/levje/01_drogon_ahm_test_license/realization-19/iter-3/tracer/drogon_tracer_sim_1.txt']
Error reading GEN_DATA: AMP_2020_2018_TOP, errors: ['Missing output file: /scratch/fmu/levje/01_drogon_ahm_test_license/realization-19/iter-3/share/results/points/topvolantis_amplitude_mean_20200701_20180101_1.txt']
Error reading GEN_DATA: AMP_2020_2018_BASE, errors: ['Missing output file: /scratch/fmu/levje/01_drogon_ahm_test_license/realization-19/iter-3/share/results/points/basevolantis_amplitude_mean_20200701_20180101_1.txt']
ERROR    Realization: 60 failed after reaching max submit (1):

To reproduce
Steps to reproduce the behaviour:

  1. pip install ert
  2. ert gui my_config.ert
  3. Run experiment (IES/Smoother/ESMDA/Test)

Environment

  • OS: [RHEL7]
  • ERT/Komodo release: [2024.04]
  • Python version
  • Remote/HPC execution involved: [yes]
@larsevj larsevj added the bug label Apr 19, 2024
@larsevj larsevj changed the title from "Realization marked as failed, an errors in the logs, but all jobs completed (2024.04)" to "Realization marked as failed, an errors in the logs, but all fm steps completed (2024.04)" Apr 19, 2024
@larsevj larsevj changed the title from "Realization marked as failed, an errors in the logs, but all fm steps completed (2024.04)" to "Realization marked as failed, but all fm steps completed (2024.04)" Apr 19, 2024
@larsevj
Contributor Author

larsevj commented Apr 20, 2024

The same error also shows up in ert-internal-examples when building the 2024.04.04 release:
https://github.com/equinor/komodo-releases/actions/runs/8755056479/job/24055171964

@sondreso
Collaborator

We need to check whether the file is actually on disk. If it is, we should reconsider whether the callback needs a short wait to allow for disk synchronisation.
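For illustration, a short wait like this could be sketched as a polling helper. This is only a sketch under assumptions, not ERT's actual callback code: the name `wait_for_file` and the `timeout` and `poll_interval` parameters are hypothetical, and the right values would depend on how slowly the shared file system synchronises.

```python
import time
from pathlib import Path


def wait_for_file(path, timeout=5.0, poll_interval=0.1):
    """Return True once `path` exists, or False if `timeout` seconds pass first.

    Hypothetical helper: polls the file system instead of failing on the
    first missing-file check, to tolerate delayed disk synchronisation.
    """
    target = Path(path)
    deadline = time.monotonic() + timeout
    while True:
        if target.exists():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A caller would only report "Missing output file" when `wait_for_file(...)` returns False. Note that an existence check alone does not guarantee the file has been fully written, so this would reduce spurious failures rather than rule them out.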

@larsevj
Contributor Author

larsevj commented Apr 22, 2024

In the case of ert-internal-examples the file does seem to be on disk:

cat RFT_RWI_3_1
271.8949890136719
268.4920349121094
275.8153991699219

@eivindjahren
Contributor

Would be solved by #7788
