SatDiagn problems in gcclassic CO2 run (v14.1) #1805

Closed
lfeng89089 opened this issue May 24, 2023 · 10 comments · Fixed by #1808
Labels
category: Bug Something isn't working topic: Diagnostics Related to output diagnostic data

Comments

@lfeng89089

lfeng89089 commented May 24, 2023

In my tests of the v14.1 CO2 simulation, all runs go smoothly when SatDiagn is off. However, when I turn on SatDiagn, various problems appear. Here is one example setting in HISTORY.rc:

  SatDiagn.template:          '%y4%m2%d2_%h2%n2z.nc4',
  SatDiagn.format:            'CFIO',
  SatDiagn.frequency:        00000001 000000
  SatDiagn.duration:          00000001 000000
  SatDiagn.hrrange:           11.98 15.02
  SatDiagn.mode:              'time-averaged'
  SatDiagn.fields:            'SatDiagnConc_CO2      '

The simulation crashes at the end of the first day with a segmentation fault.

If I also include the pressure edge field SatDiagnPEdge in the collection, I get an error message complaining that there is 'no OH field included in the simulation'.

I am not sure whether other people have run into similar problems.

@yantosca yantosca self-assigned this May 24, 2023
@yantosca yantosca added category: Bug Something isn't working topic: Diagnostics Related to output diagnostic data labels May 24, 2023
@yantosca
Contributor

Thanks for writing @lfeng89089. I think the problem is that you cannot mix level-centered and level-edged diagnostics in the same diagnostic collection, as the netCDF COARDS convention requires that there be only one vertical dimension per file.

We would recommend that you create a new collection just for the SatDiagnPEdge field:

.....
SatDiagnEdge.template: '%y4%m2%d2_%h2%n2z.nc4',
SatDiagnEdge.format: 'CFIO',
SatDiagnEdge.frequency: 00000001 000000
SatDiagnEdge.duration: 00000001 000000
SatDiagnEdge.hrrange: 11.98 15.02
SatDiagnEdge.mode: 'time-averaged'
SatDiagnEdge.fields: 'SatDiagnPEdge ',

and this should fix the issue.

@lfeng89089
Author

lfeng89089 commented May 25, 2023 via email

@yantosca
Contributor

yantosca commented May 25, 2023

Thanks for the error message. The error is happening at line 858 of History/history_netcdf_mod.F90:

          !------------------------------------------------------------------
          ! 3-D data
          !------------------------------------------------------------------
          CASE( 3 )

             ! Get dimensions of data
             Dim1 = SIZE( Item%Data_3d, 1 )
             Dim2 = SIZE( Item%Data_3d, 2 )
             Dim3 = SIZE( Item%Data_3d, 3 )

             ! Get average for satellite diagnostic:
             IF ( Container%name == 'SatDiagn' ) THEN
                Item%Data_3d = Item%Data_3d / State_Diag%SatDiagnCount   ! <=== line where error happens
                Item%nUpdates = 1.0
             ENDIF

Did you add SatDiagnCount to the SatDiagn collection in HISTORY.rc? That might be the root of the problem.

Probably what is happening is that State_Diag%SatDiagnCount is 0 and that is causing a div-by-zero condition. The quick fix is to add SatDiagnCount to your HISTORY.rc:

  SatDiagn.template:          '%y4%m2%d2_%h2%n2z.nc4',
  SatDiagn.format:            'CFIO',
  SatDiagn.frequency:        00000001 000000
  SatDiagn.duration:          00000001 000000
  SatDiagn.hrrange:           11.98 15.02
  SatDiagn.mode:              'time-averaged'
  SatDiagn.fields:            'SatDiagnConc_CO2      '
                              'SatDiagnCount         '

We can put in a bug fix that halts the run if any of the SatDiagn diagnostics are selected in HISTORY.rc but SatDiagnCount is not. That would stop the run right away, instead of letting a long simulation proceed and then die when it is time to write out diagnostics. This can be added to either 14.2.1 or 14.3.0, which are the next versions in line.
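
For illustration only, the kind of error trap described above could look roughly like the sketch below. This is not the actual GEOS-Chem fix: the subroutine check_satdiagn and its arguments any_satdiagn and has_count are hypothetical stand-ins for the State_Diag%Archive_* flags, and a real trap would return a failure code instead of just printing a message.

```fortran
program satdiagn_trap_sketch
  implicit none

  ! Case from the original report: a SatDiagn field without SatDiagnCount
  call check_satdiagn( any_satdiagn=.true., has_count=.false. )

  ! Case after the quick fix: SatDiagnCount added to the collection
  call check_satdiagn( any_satdiagn=.true., has_count=.true. )

contains

  ! Hypothetical check, meant to run once at startup rather than
  ! at the first diagnostic write
  subroutine check_satdiagn( any_satdiagn, has_count )
    logical, intent(in) :: any_satdiagn   ! any SatDiagn field requested?
    logical, intent(in) :: has_count      ! SatDiagnCount requested?

    if ( any_satdiagn .and. .not. has_count ) then
       write( *, '(a)' ) 'ERROR: SatDiagn fields requested without SatDiagnCount'
       ! A real trap would set RC to a failure value and return here
    else
       write( *, '(a)' ) 'OK: SatDiagn configuration is consistent'
    endif
  end subroutine check_satdiagn

end program satdiagn_trap_sketch
```

Checking at initialization means a misconfigured HISTORY.rc is reported within seconds, not after a full simulated day.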

@yantosca
Contributor

Also @lfeng89089, I realize the documentation can be improved as well. We can work on that too.

@yantosca
Contributor

yantosca commented May 25, 2023

@lfeng89089, I think I've found the root cause. In Headers/state_diag_mod.F90, we have these statements:

    !------------------------------------------------------------------------
    ! Set a single logical for SatDiagn output
    !------------------------------------------------------------------------
    State_Diag%Archive_SatDiagn = (                                          &
         State_Diag%Archive_SatDiagnColEmis                             .or. &
         State_Diag%Archive_SatDiagnSurfFlux                            .or. &
         State_Diag%Archive_SatDiagnOH                                  .or. &
         State_Diag%Archive_SatDiagnRH                                  .or. &
         State_Diag%Archive_SatDiagnAirDen                              .or. &
         State_Diag%Archive_SatDiagnBoxHeight                           .or. &
         State_Diag%Archive_SatDiagnPEdge                               .or. &
         State_Diag%Archive_SatDiagnTROPP                               .or. &
         State_Diag%Archive_SatDiagnPBLHeight                           .or. &
         State_Diag%Archive_SatDiagnPBLTop                              .or. &
         State_Diag%Archive_SatDiagnTAir                                .or. &
         State_Diag%Archive_SatDiagnGWETROOT                            .or. &
         State_Diag%Archive_SatDiagnGWETTOP                             .or. &
         State_Diag%Archive_SatDiagnPARDR                               .or. &
         State_Diag%Archive_SatDiagnPARDF                               .or. &
         State_Diag%Archive_SatDiagnPRECTOT                             .or. &
         State_Diag%Archive_SatDiagnSLP                                 .or. &
         State_Diag%Archive_SatDiagnSPHU                                .or. &
         State_Diag%Archive_SatDiagnTS                                  .or. &
         State_Diag%Archive_SatDiagnPBLTOPL                             .or. &
         State_Diag%Archive_SatDiagnMODISLAI                            .or. &
         State_Diag%Archive_SatDiagnWetLossLS                           .or. &
         State_Diag%Archive_SatDiagnWetLossConv                         .or. &
         State_Diag%Archive_SatDiagnJval                                .or. &
         State_Diag%Archive_SatDiagnJvalO3O1D                           .or. &
         State_Diag%Archive_SatDiagnJvalO3O3P                           .or. &
         State_Diag%Archive_SatDiagnDryDep                              .or. &
         State_Diag%Archive_SatDiagnDryDepVel                           .or. &
         State_Diag%Archive_SatDiagnOHreactivity                            )

    !------------------------------------------------------------------------
    ! Satellite diagnostic: Counter
    !------------------------------------------------------------------------
    IF ( State_Diag%Archive_SatDiagn ) THEN 

       ! Array to contain the satellite diagnostic weights
       ALLOCATE( State_Diag%SatDiagnCount( State_Grid%NX,                    &
                                           State_Grid%NY,                    &
                                           State_Grid%NZ ), STAT=RC         )
       CALL GC_CheckVar( 'State_Diag%DiagnCount', 0, RC )
       IF ( RC /= GC_SUCCESS ) RETURN
       State_Diag%SatDiagnCount = 0.0_f4
       State_Diag%Archive_SatDiagnCount = .TRUE.
    ENDIF

This should ensure that State_Diag%SatDiagnCount is allocated whenever any of the other SatDiagn diagnostics is turned on. However, if you look closely, you will notice that State_Diag%Archive_SatDiagnConc is missing from the logical expression that sets State_Diag%Archive_SatDiagn. Because the only SatDiagn field you archived was SatDiagnConc_CO2, the State_Diag%SatDiagnCount array was never allocated, which caused the seg fault that you encountered.

Try adding

    State_Diag%Archive_SatDiagn = (                                          &
         State_Diag%ArchiveSatDiagnConc                                 .or. &    <=== add this line here
         State_Diag%Archive_SatDiagnColEmis                             .or. &
         ... etc ...

and then see if you can run without getting the error.

We will try to fix this (and maybe add an error trap for this condition) in an upcoming version.
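
As a rough, self-contained illustration of the failure mode (not the actual GEOS-Chem code, and not necessarily what the eventual fix does), the sketch below leaves one flag out of the OR chain, so the counter array is never allocated. It then guards the averaging step with ALLOCATED and a non-zero mask. The names archive_conc, archive_pedge, archive_any, count3d and data3d are hypothetical stand-ins for the State_Diag and Item fields.

```fortran
program satdiagn_alloc_sketch
  implicit none

  ! Hypothetical, cut-down stand-ins for the State_Diag fields
  logical           :: archive_conc, archive_pedge, archive_any
  real, allocatable :: count3d(:,:,:)   ! plays the role of SatDiagnCount
  real              :: data3d(2,2,2)    ! plays the role of Item%Data_3d

  archive_conc  = .true.    ! user requested SatDiagnConc_CO2
  archive_pedge = .false.   ! no other SatDiagn field requested
  data3d        = 6.0

  ! Buggy OR chain: archive_conc is left out, as in the code above
  archive_any = archive_pedge

  ! The counter is only allocated when the combined flag is true
  if ( archive_any ) then
     allocate( count3d(2,2,2) )
     count3d = 0.0
  endif

  ! Guard: divide only if the counter exists, and only where it is non-zero
  if ( allocated( count3d ) ) then
     where ( count3d > 0.0 ) data3d = data3d / count3d
     write( *, '(a)' ) 'Averaged with the counter'
  else
     write( *, '(a)' ) 'Counter not allocated; skipping the average'
  endif

end program satdiagn_alloc_sketch
```

With archive_conc added to the OR chain, the allocation branch would run and the guarded division would be used instead.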

@yantosca
Contributor

We can now close out this issue as PR #1808 has been merged into the GEOS-Chem 14.2.1 development stream.

@Kexin828

Hi Bob,
Many thanks!
In my tests of the v14.1 fullchemistry simulation, I ran into the same problem, as follows:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
gcclassic 0000000000D0C47A Unknown Unknown Unknown
libpthread-2.17.s 00002B68FACD3630 Unknown Unknown Unknown
gcclassic 000000000075F576 history_netcdf_mo 858 history_netcdf_mod.F90
gcclassic 0000000000743339 history_mod_mp_hi 3064 history_mod.F90
gcclassic 0000000000408621 MAIN__ 2006 main.F90
gcclassic 0000000000407262 Unknown Unknown Unknown
libc-2.17.so 00002B68FAF02555 __libc_start_main Unknown Unknown
gcclassic 0000000000407169 Unknown Unknown Unknown

But after modifying the two files according to your instructions, the run still fails. Below is the new error message:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
gcclassic 0000000000D0C47A Unknown Unknown Unknown
libpthread-2.17.s 00002B68FACD3630 Unknown Unknown Unknown
gcclassic 000000000075F576 history_netcdf_mo 858 history_netcdf_mod.F90
gcclassic 0000000000743339 history_mod_mp_hi 3064 history_mod.F90
gcclassic 0000000000408621 MAIN__ 2006 main.F90
gcclassic 0000000000407262 Unknown Unknown Unknown
libc-2.17.so 00002B68FAF02555 __libc_start_main Unknown Unknown
gcclassic 0000000000407169 Unknown Unknown Unknown

Looking forward to your reply. Many thanks!

Contributor

Thanks for writing @Kexin828. If you could please open a new issue, this will help us to track the problem more efficiently.

@Kexin828

Thank you for your reply.
I opened a new issue, #2017.

@PowderL

PowderL commented Nov 10, 2023

Although this issue has been closed, there is one more point to be careful about. I followed the steps in @yantosca's comment above to solve the same problem. After I added the new line State_Diag%ArchiveSatDiagnConc .or. & to Headers/state_diag_mod.F90, compiling the code with make -j failed. The build succeeded once I changed the added line to State_Diag%Archive_SatDiagnConc .or. & . I don't know the exact reason, but I guess the added line has to follow the same naming pattern as the existing lines such as Archive_SatDiagnColEmis, which all have a _ between Archive and SatDiagn.
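
A likely explanation, sketched under assumptions: in Fortran, a derived-type component can only be referenced by the exact name it was declared with, so a spelling without the underscore refers to a component that does not exist and the compiler rejects it. The type diag_flags below is a hypothetical, one-field stand-in for the real State_Diag type; only the component name Archive_SatDiagnConc is taken from this thread.

```fortran
program component_name_sketch
  implicit none

  ! Hypothetical stand-in for the State_Diag derived type
  type :: diag_flags
     logical :: Archive_SatDiagnConc = .true.
  end type diag_flags

  type(diag_flags) :: State_Diag
  logical          :: archive_any

  ! Compiles: the component name matches the declaration exactly
  archive_any = State_Diag%Archive_SatDiagnConc

  ! Would NOT compile, because no component has this name:
  ! archive_any = State_Diag%ArchiveSatDiagnConc

  write( *, * ) 'Archive flag is set: ', archive_any

end program component_name_sketch
```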
