[BUG/ISSUE] Prod/loss species are contributing to the Error Norm when they should be ignored #66

msl3v · 2022-11-18T21:44:02Z

We realized today that when putting Prod/Loss terms for families into the mechanism, KPP uses the errors of these terms in the error norm calculation. In one example, the CH4 loss term, LCH4, has a high relative error, suggesting that it imparts a significant impact on KPP's behavior (step acceptance/rejection, internal step size). The prod/loss terms should be entirely passive. They should not be considered in the convergence evaluation. So I would like to implement a fix that allows ros_ErrorNorm() to ignore them in computing Err.

Can y'all think of a best way to do this?

yantosca · 2022-11-18T23:24:09Z

Hi @msl3v @RolfSander @jimmielin: Am wondering if the best thing is to pass a vector of logicals where the prod/loss species would be true and everything else false. We could initialize that outside of the loop over grid boxes.

!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  KPP_REAL FUNCTION ros_ErrorNorm ( Y, Ynew, Yerr, IsDummySpc, &
                               AbsTol, RelTol, VectorTol )
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!~~~> Computes the "scaled norm" of the error vector Yerr
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   IMPLICIT NONE

! Input arguments
   KPP_REAL, INTENT(IN) :: Y(N), Ynew(N), &
          Yerr(N), IsDummySpc(N), AbsTol(N), RelTol(N)
   LOGICAL, INTENT(IN) ::  VectorTol
! Local variables
   KPP_REAL :: Err, Scale, Ymax
   INTEGER  :: i
   KPP_REAL, PARAMETER :: ZERO = 0.0_dp

   Err = ZERO
   DO i=1,N
     IF ( isDummySpc(N) ) CYCLE
     Ymax = MAX(ABS(Y(i)),ABS(Ynew(i)))
     IF (VectorTol) THEN
       Scale = AbsTol(i)+RelTol(i)*Ymax
     ELSE
       Scale = AbsTol(1)+RelTol(1)*Ymax
     END IF
     Err = Err+(Yerr(i)/Scale)**2
   END DO
   Err  = SQRT(Err/N)

   ros_ErrorNorm = MAX(Err,1.0d-10)

  END FUNCTION ros_ErrorNorm

msl3v · 2022-11-18T23:27:26Z

I think we'd need to preprocess an index of actual species. Note at the end of the ros_ErrorNorm() function, that Err is divided by N

yantosca · 2022-11-18T23:42:26Z

Thanks @msl3v. I think the number of prod/loss species are given by NFam, right? Then it would be divided by N-Nfam

msl3v · 2022-11-19T02:51:08Z

@yantosca I think you're correct! Alright. This should be an easy fix. Where should it go? Dev? Or would it be better to create a new branch and create a PR?

RolfSander · 2022-11-19T09:26:15Z

I think a new branch would be best. Then you have time to test it
independently while other things may happen in the dev branch.

Maybe you can use a Fortran90 mask and a WHERE block for this. I haven't
tested it, but maybe it could look like this:

LOGICAL :: notfam(N)
KPP_REAL :: Err(N)

WHERE (notfam)
  Err = (Yerr/Scale)**2
END WHERE
Err = SUM(Err, notfam) / (N-Nfam)

Here, the mask notfam is the vector of logicals where the prod/loss
species would be false.

msl3v · 2022-11-19T17:29:38Z

So, whether it's a logical vector or integer vector, it makes sense to pre-process it into the code rather than have KPP build it each call. My inclination is to pre-process a separate vector of species index values and have the function loop over that. It will avoid unnecessary branching. There would be no need for an IF() statement. While I'm not really sure that this would be more efficient, ErrorNorm gets called every internal time step.
e.g. declare it it INDX(N-NFam)

DO i=1,N-NFam
     Ymax = MAX(ABS(Y(INDX(i))),ABS(Ynew(INDX(i))))
     IF (VectorTol) THEN
       Scale = AbsTol(INDX(i))+RelTol(INDX(i))*Ymax
     ELSE
       Scale = AbsTol(1)+RelTol(1)*Ymax
     END IF
     Err = Err+(Yerr(INDX(i))/Scale)**2
END DO

Or is this unnecessary complexity for marginal improvement?
Thoughts appreciated.

msl3v · 2022-11-22T22:53:22Z

Created a new branch, FamilyIndexTesting. As usual, bad with naming things.
Made a commit, 7d9c537
What's completed is the generation of the vector of non-prod/loss species index values, its declaration and inclusion in *_Parameters.F90

Please have a look @yantosca @RolfSander @jimmielin

is the code OK? Anything glaring, in bad form, or whatnot
is it reasonable to place this in the Parameters file? I thought to use Util, but Util isn't linked in the integrator files.

I tested this by manually changing the ros_ErrorNorm() function in a boxmodel I use, and the number of internal steps decreased from 19 to 17 for the given GEOS-Chem scenario I tested. So, an improvement.
Some of the species (mainly the super-stiff iodine radicals) changed some, but that's not surprising. Nothing glaring. Still converged. Results still made sense.

msl3v · 2022-11-23T15:50:59Z

There's an alternative solution to this issue here: that family terms are weighing down the error norm.

We were looking at the error in some detail, and realized that the reason the relative error for the family terms is large is because they are initialized to zero.

A kludgy fix would/could simply be to initialize all P/L terms to a large value, and just use finite difference to get their true value after the integration is complete.

The benefit would be that we minimize additional complexity within KPP. <-- I feel this is not unimportant. As KPP is a community project now, additional complexity could hinder development and/or increase the likelihood of bugs.

It also sets a precedent that says "keep piling things on" when the benefit may be marginal and even localized to only a few users' cases.

I also don't like making changes that touch all of the integrator template files. Though in the case here, this could be resolved if, say, the ErrorNorm() functions were put in a pre-processed module file like Util. or LinearAlgebra.

This is more of a philosophical comment. But I hope for some feedback.

RolfSander · 2022-11-28T12:10:23Z

I'm not very familiar^ with the family code, so I restrict myself to a
philosophical comment.

I'm a big fan of piling lots of new features onto KPP, as long as they
are not switched on by default.

Often, it is fine to add a few IF (newfeature) THEN blocks to the
code. If there are too many IF blocks that slow down the code, it may
be necessary to deactivate the new features via CPP preprocessor
directives. Then the new code can be excluded from the compiled
executable.

^ no pun intended :-)

msl3v · 2022-12-01T15:43:41Z

Using CPP to toggle the error norm calculation makes sense.

OK. If piling-it-on is the path, here's three approaches to this issue:

Do as done already. Create a new vector and modify the error norm calcs in the various integrator files. Possibly bounded by a CPP tag. The vector could be logical or integer.
Create a "#PAD" toggle, that pads any prod/loss or dummy species by a sufficiently large value to minimize the relative error for that species.
Simply set a high ABSTOL for the prod/loss or dummy species that minimizes the relative error for that species, and use vector tolerances. Similar to (2), uses a toggle in the *.kpp file.

yantosca · 2022-12-05T14:48:08Z

Hi all. I think it may be cleaner to use the toggle in the *.kpp file, as then you wouldn't have to burden users with having to compile in a C-preprocessor switch setting.

msl3v · 2023-02-27T17:23:11Z

Wrote a simple program to test IF() statement overhead vs. the cost of FLOPs in computation of the error norm (see below). It executes two do-loops: in the first, a conditional is applied. If .true., then perform the error norm calc. Otherwise, nothing. The second loop simply computes the full vector regardless.

The IF() loop tests a random vector between 0 and 1.

IF (arr(j) .gt. X) then ... compute.

With gfortran (no optimization, i.e. no "-O" flag), it suggests that above X = 0.6, there is a speed advantage in using an IF() statement. Otherwise, it is considerably faster to just compute all terms.

With the "-O" flag, there is no difference at any of the range of tested X values (0.05 thru 0.95). So compiler optimization is really pretty effective.

Anyway - carry on.

  implicit none

  integer :: itr, i, j, nitr
  integer, parameter :: arrsize = 500
  real    :: tstart, tend, iftime, str8time 
  real    :: arr(arrsize), atol(arrsize), rtol(arrsize), c(arrsize), cnew(arrsize)

  call RANDOM_NUMBER(arr)
  call RANDOM_NUMBER(atol)
  call RANDOM_NUMBER(rtol)
  call RANDOM_NUMBER(c)

  ! Number of iterations to perform. Larger value increases the runtime                                                                                                                                                                                                         
  nitr = 100000

  ! Go ahead and run a loop just to prime the memory                                                                                                                                                                                                                            
  do i=1,nitr
     do j=1,size(arr)
        if (arr(j) .gt. 0.25) then
           cnew(j) = atol(j)+rtol(j)*c(j)
        endif
     enddo
  enddo

  ! Run the loop with the IF statement                                                                                                                                                                                                                                          
  call cpu_time(tstart)
  do i=1,nitr
     do j=1,size(arr)
        if (arr(j) .gt. 0.5) then
           cnew(j) = atol(j)+rtol(j)*c(j)
        endif
     enddo
  enddo
  call cpu_time(tend)

  iftime = tend-tstart ! Time spent in the IF loop                                                                                                                                                                                                                              

  write(*,*) 'IF time = ', iftime

  ! Run the loop just str8 throught the vector ops                                                                                                                                                                                                                              
  call cpu_time(tstart)
  do i=1,nitr
     do j=1,size(arr)
        cnew(j) = atol(j)+rtol(j)*c(j)
     enddo
  enddo
  call cpu_time(tend)

  str8time = tend-tstart ! Time spent just plowing straight through the vectors                                                                                                                                                                                                 

  write(*,*) 'STR8 time = ', str8time

  write(*,*) 'STR8/IF = ', str8time/iftime

end program main

RolfSander · 2023-02-28T08:59:50Z

Hello Mike,

Thanks for your code development activities! I'm sorry that I'm
currently busy with other projects and haven't been able to contribute
much here. However, if you have any questions specifically for me, let
me know!

It's nice to see that IF statements don't slow down the optimized code.
That's quite reassuring, especially since we have added a couple of
IF statements deep inside the loops for Do_Update*

   IF ( Do_Update_SUN    ) CALL Update_SUN()
   IF ( Do_Update_RCONST ) CALL Update_RCONST()

yantosca · 2023-02-28T14:57:31Z

@RolfSander @msl3v IF statements are pretty cheap, if they don't have an ELSE. For this reason I've started refactoring IF/ELSE blocks in my other projects to

A = .FALSE.
IF ( B ) A =.TRUE.

as the assignment an IF are likely cheaper than an IF/ELSE. The issue is that the code has to "guess" which branch is being taken (the IF or the ELSE) at the chip level and if it guesses wrong it has to dump and reload cache instructions and data.

RolfSander · 2023-02-28T15:04:51Z

Interesting, thanks for the explanation, @yantosca!

Every day is a school day...

yantosca · 2023-04-14T14:29:46Z

@msl3v @RolfSander @obin1: Are there any updates on this feature? Would this be ready to include in the dev branch?

msl3v · 2023-04-14T14:33:48Z

I only developed it as far as a successful test of concept. Not sure the best way to proceed. Using the auto-reduce boxmodel wouldn't work since the C() and RCONST() vals are hard-coded by reaction index. Is this something you/GCST/GCSC want to see done?

…

On 4/14/23 10:29, Bob Yantosca wrote: @msl3v <https://github.com/msl3v> @RolfSander <https://github.com/RolfSander> @obin1 <https://github.com/obin1>: Are there any updates on this feature? Would this be ready to include in the dev branch? — Reply to this email directly, view it on GitHub <#66 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABVE237TTEPSXPCZNHJG7ELXBFNOLANCNFSM6AAAAAASE5AO3Y>. You are receiving this because you were mentioned.Message ID: ***@***.***>

yantosca · 2023-04-14T14:44:14Z

@msl3v: only as much as we don't want family species messing around with the integration. Not really a priority though.

obin1 · 2023-04-14T15:12:52Z

Here's a visualization from a project @msl3v and I've been working on: giving prod/loss species arbitrarily large absolute tolerances of 1e25 (effectively removing them from the error norm) reduces the number of integration steps. This can also slightly alter the concentrations of species whose tolerances we don't modify, by changing the internal timesteps.

yantosca · 2023-04-14T16:56:38Z

Interesting find @obin1! That might be the best workaround. When I have time later I might open a feature request on the GEOS-Chem repo so that we initialize P/L species to 1e25.

jimmielin · 2023-09-08T20:12:18Z

I just noticed that geoschem/geos-chem#1334 is put on pause because of the P/L species being considered in the error norm. I'm linking to it here so we can see these issues are related on GitHub.

yantosca · 2024-06-14T14:37:07Z

@obin1 @RolfSander @jimmielin @msl3v This has been identified as a high-priority item to fix at the IGC11 meeting.

obin1 · 2024-06-14T15:21:39Z

Not the most elegant fix for KPP in general but the quickest fix for GEOS-Chem regarding this issue could be in the scope of GeosCore/fullchem_mod.F90, e.g. this commit from December 2023 in an older dev version of GEOS-CF. We were thinking about just making the ATOL of this small set of P/L species arbitrarily large, e.g.

LOGICAL :: relax_rtol, relax_atol_PL ! relax KPP tols
...

    ! Absolute tolerance
    ATOL      = 1e-2_dp
    relax_atol_PL = .TRUE.
    IF (relax_atol_PL) THEN
       IF (Input_Opt%amIRoot) write(*,*) "Setting ATOL of P/L dummy species to 1e25"
       ATOL(ind_LBRO2H)    = 1e25_dp
       ATOL(ind_LBRO2N)    = 1e25_dp
       ATOL(ind_LCH4)      = 1e25_dp
       ATOL(ind_LCO)       = 1e25_dp
       ATOL(ind_LISOPNO3)  = 1e25_dp
       ATOL(ind_LISOPOH)   = 1e25_dp
       ATOL(ind_LNRO2H)    = 1e25_dp
       ATOL(ind_LNRO2N)    = 1e25_dp
       ATOL(ind_LOx)       = 1e25_dp
       ATOL(ind_LTRO2H)    = 1e25_dp
       ATOL(ind_LTRO2N)    = 1e25_dp
       ATOL(ind_LXRO2H)    = 1e25_dp
       ATOL(ind_LXRO2N)    = 1e25_dp
       ATOL(ind_PCO)       = 1e25_dp
       ATOL(ind_PH2O2)     = 1e25_dp
       ATOL(ind_POx)       = 1e25_dp
       ATOL(ind_PSO4)      = 1e25_dp
    END IF

msl3v · 2024-06-14T15:30:53Z

This is trivial to resolve. - Set the tols arbitrarily high. - use a logical vectorI like option (1) for simplicity. Option (2) adds a new data structure. Tho maybe all the overhead of (2) could be done at preprocessing. Add a tag to a species that says keep it out of the error norm. Store the info in the stack. We should discuss the best option. I’m on vacation thru June 23 but will monitor email as I can.

yantosca · 2024-06-14T16:08:55Z

Thanks @obin1 @msl3v. Another suggestion is to add entries for ATOL to the GEOS-Chem species_database.yml file, which could be read into the State_Chm%SpcData object (and then have a default ATOL if one is not specified). That would also be easy to implement. I will work on this when I get back. I am also on vacation until June 24th.

I agree, it would be nice to have a data structure internal to KPP that would flag if it could be added to the error norm or not. That will take some more thought but might be worth it in the long run. Open to your thoughts.

yantosca · 2024-06-14T16:20:23Z

Also see related GEOS-Chem issue: geoschem/geos-chem#1753

yantosca · 2024-07-09T15:48:22Z

Update: This issue has been fixed in GEOS-Chem 14.5.0 by externally setting the ATOL values. But we can leave this issue open as it would be nice to eventually have an internal fix for this in KPP. I think we'd need to have some kind of mask array set to 1's or 0's that we can multiply the rate constants by in order to exclude the dummy species (I think @msl3v suggested that before).

msl3v added the bug Something isn't working label Nov 18, 2022

msl3v self-assigned this Nov 18, 2022

msl3v added question Further information is requested integrators Related to numerical integrators labels Nov 18, 2022

msl3v mentioned this issue Nov 23, 2022

[BUG/ISSUE] (minor issue) Initialize CO2 to a relevant value in fullchem_mod.F90 geoschem/geos-chem#1529

Closed

yantosca mentioned this issue Dec 19, 2022

Sulfur in-cloud chemistry disconnected from TOMAS in V14 [BUG/ISSUE] geoschem/geos-chem#1511

Closed

yantosca assigned yantosca and unassigned msl3v Jun 14, 2024

yantosca added this to the 3.2.0 milestone Jun 14, 2024

yantosca removed the question Further information is requested label Jun 14, 2024

yantosca mentioned this issue Jul 2, 2024

Allow per-species definition of KPP absolute and relative solver tolerances; Set absolute tolerances of dummy species to large value geoschem/geos-chem#2359

Merged

yantosca mentioned this issue Jul 9, 2024

Error with CMIP6 file geoschem/GCHP#414

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG/ISSUE] Prod/loss species are contributing to the Error Norm when they should be ignored #66

[BUG/ISSUE] Prod/loss species are contributing to the Error Norm when they should be ignored #66

msl3v commented Nov 18, 2022

yantosca commented Nov 18, 2022 •

edited

Loading

msl3v commented Nov 18, 2022

yantosca commented Nov 18, 2022

msl3v commented Nov 19, 2022

RolfSander commented Nov 19, 2022

msl3v commented Nov 19, 2022

msl3v commented Nov 22, 2022

msl3v commented Nov 23, 2022

RolfSander commented Nov 28, 2022

msl3v commented Dec 1, 2022

yantosca commented Dec 5, 2022

msl3v commented Feb 27, 2023 •

edited by yantosca

Loading

RolfSander commented Feb 28, 2023

yantosca commented Feb 28, 2023

RolfSander commented Feb 28, 2023

yantosca commented Apr 14, 2023

msl3v commented Apr 14, 2023 via email

yantosca commented Apr 14, 2023

obin1 commented Apr 14, 2023

yantosca commented Apr 14, 2023

jimmielin commented Sep 8, 2023

yantosca commented Jun 14, 2024

obin1 commented Jun 14, 2024

msl3v commented Jun 14, 2024 via email •

edited by yantosca

Loading

yantosca commented Jun 14, 2024

yantosca commented Jun 14, 2024

yantosca commented Jul 9, 2024

[BUG/ISSUE] Prod/loss species are contributing to the Error Norm when they should be ignored #66

[BUG/ISSUE] Prod/loss species are contributing to the Error Norm when they should be ignored #66

Comments

msl3v commented Nov 18, 2022

yantosca commented Nov 18, 2022 • edited Loading

msl3v commented Nov 18, 2022

yantosca commented Nov 18, 2022

msl3v commented Nov 19, 2022

RolfSander commented Nov 19, 2022

msl3v commented Nov 19, 2022

msl3v commented Nov 22, 2022

msl3v commented Nov 23, 2022

RolfSander commented Nov 28, 2022

msl3v commented Dec 1, 2022

yantosca commented Dec 5, 2022

msl3v commented Feb 27, 2023 • edited by yantosca Loading

RolfSander commented Feb 28, 2023

yantosca commented Feb 28, 2023

RolfSander commented Feb 28, 2023

yantosca commented Apr 14, 2023

msl3v commented Apr 14, 2023 via email

yantosca commented Apr 14, 2023

obin1 commented Apr 14, 2023

yantosca commented Apr 14, 2023

jimmielin commented Sep 8, 2023

yantosca commented Jun 14, 2024

obin1 commented Jun 14, 2024

msl3v commented Jun 14, 2024 via email • edited by yantosca Loading

yantosca commented Jun 14, 2024

yantosca commented Jun 14, 2024

yantosca commented Jul 9, 2024

yantosca commented Nov 18, 2022 •

edited

Loading

msl3v commented Feb 27, 2023 •

edited by yantosca

Loading

msl3v commented Jun 14, 2024 via email •

edited by yantosca

Loading