Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG/ISSUE] Prod/loss species are contributing to the Error Norm when they should be ignored #66

Open
msl3v opened this issue Nov 18, 2022 · 27 comments
Assignees
Labels
bug Something isn't working integrators Related to numerical integrators
Milestone

Comments

@msl3v
Copy link
Contributor

msl3v commented Nov 18, 2022

We realized today that when putting Prod/Loss terms for families into the mechanism, KPP uses the errors of these terms in the error norm calculation. In one example, the CH4 loss term, LCH4, has a high relative error, suggesting that it imparts a significant impact on KPP's behavior (step acceptance/rejection, internal step size). The prod/loss terms should be entirely passive. They should not be considered in the convergence evaluation. So I would like to implement a fix that allows ros_ErrorNorm() to ignore them in computing Err.

Can y'all think of a best way to do this?

@msl3v msl3v added the bug Something isn't working label Nov 18, 2022
@msl3v msl3v self-assigned this Nov 18, 2022
@msl3v msl3v added question Further information is requested integrators Related to numerical integrators labels Nov 18, 2022
@yantosca
Copy link
Contributor

yantosca commented Nov 18, 2022

Hi @msl3v @RolfSander @jimmielin: Am wondering if the best thing is to pass a vector of logicals where the prod/loss species would be true and everything else false. We could initialize that outside of the loop over grid boxes.

!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  KPP_REAL FUNCTION ros_ErrorNorm ( Y, Ynew, Yerr, IsDummySpc, &
                               AbsTol, RelTol, VectorTol )
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!~~~> Computes the "scaled norm" of the error vector Yerr
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   IMPLICIT NONE

! Input arguments
   KPP_REAL, INTENT(IN) :: Y(N), Ynew(N), &
          Yerr(N), IsDummySpc(N), AbsTol(N), RelTol(N)
   LOGICAL, INTENT(IN) ::  VectorTol
! Local variables
   KPP_REAL :: Err, Scale, Ymax
   INTEGER  :: i
   KPP_REAL, PARAMETER :: ZERO = 0.0_dp

   Err = ZERO
   DO i=1,N
     IF ( isDummySpc(N) ) CYCLE
     Ymax = MAX(ABS(Y(i)),ABS(Ynew(i)))
     IF (VectorTol) THEN
       Scale = AbsTol(i)+RelTol(i)*Ymax
     ELSE
       Scale = AbsTol(1)+RelTol(1)*Ymax
     END IF
     Err = Err+(Yerr(i)/Scale)**2
   END DO
   Err  = SQRT(Err/N)

   ros_ErrorNorm = MAX(Err,1.0d-10)

  END FUNCTION ros_ErrorNorm

@msl3v
Copy link
Contributor Author

msl3v commented Nov 18, 2022

I think we'd need to preprocess an index of actual species. Note at the end of the ros_ErrorNorm() function, that Err is divided by N

@yantosca
Copy link
Contributor

Thanks @msl3v. I think the number of prod/loss species are given by NFam, right? Then it would be divided by N-Nfam

@msl3v
Copy link
Contributor Author

msl3v commented Nov 19, 2022

@yantosca I think you're correct! Alright. This should be an easy fix. Where should it go? Dev? Or would it be better to create a new branch and create a PR?

@RolfSander
Copy link
Contributor

I think a new branch would be best. Then you have time to test it
independently while other things may happen in the dev branch.

Maybe you can use a Fortran90 mask and a WHERE block for this. I haven't
tested it, but maybe it could look like this:

LOGICAL :: notfam(N)
KPP_REAL :: Err(N)

WHERE (notfam)
  Err = (Yerr/Scale)**2
END WHERE
Err = SUM(Err, notfam) / (N-Nfam)

Here, the mask notfam is the vector of logicals where the prod/loss
species would be false.

@msl3v
Copy link
Contributor Author

msl3v commented Nov 19, 2022

So, whether it's a logical vector or integer vector, it makes sense to pre-process it into the code rather than have KPP build it each call. My inclination is to pre-process a separate vector of species index values and have the function loop over that. It will avoid unnecessary branching. There would be no need for an IF() statement. While I'm not really sure that this would be more efficient, ErrorNorm gets called every internal time step.
e.g. declare it it INDX(N-NFam)

DO i=1,N-NFam
     Ymax = MAX(ABS(Y(INDX(i))),ABS(Ynew(INDX(i))))
     IF (VectorTol) THEN
       Scale = AbsTol(INDX(i))+RelTol(INDX(i))*Ymax
     ELSE
       Scale = AbsTol(1)+RelTol(1)*Ymax
     END IF
     Err = Err+(Yerr(INDX(i))/Scale)**2
END DO

Or is this unnecessary complexity for marginal improvement?
Thoughts appreciated.

@msl3v
Copy link
Contributor Author

msl3v commented Nov 22, 2022

Created a new branch, FamilyIndexTesting. As usual, bad with naming things.
Made a commit, 7d9c537
What's completed is the generation of the vector of non-prod/loss species index values, its declaration and inclusion in *_Parameters.F90

Please have a look @yantosca @RolfSander @jimmielin

  • is the code OK? Anything glaring, in bad form, or whatnot
  • is it reasonable to place this in the Parameters file? I thought to use Util, but Util isn't linked in the integrator files.

I tested this by manually changing the ros_ErrorNorm() function in a boxmodel I use, and the number of internal steps decreased from 19 to 17 for the given GEOS-Chem scenario I tested. So, an improvement.
Some of the species (mainly the super-stiff iodine radicals) changed some, but that's not surprising. Nothing glaring. Still converged. Results still made sense.

@msl3v
Copy link
Contributor Author

msl3v commented Nov 23, 2022

There's an alternative solution to this issue here: that family terms are weighing down the error norm.

We were looking at the error in some detail, and realized that the reason the relative error for the family terms is large is because they are initialized to zero.

A kludgy fix would/could simply be to initialize all P/L terms to a large value, and just use finite difference to get their true value after the integration is complete.

The benefit would be that we minimize additional complexity within KPP. <-- I feel this is not unimportant. As KPP is a community project now, additional complexity could hinder development and/or increase the likelihood of bugs.

It also sets a precedent that says "keep piling things on" when the benefit may be marginal and even localized to only a few users' cases.

I also don't like making changes that touch all of the integrator template files. Though in the case here, this could be resolved if, say, the ErrorNorm() functions were put in a pre-processed module file like Util. or LinearAlgebra.

This is more of a philosophical comment. But I hope for some feedback.

@RolfSander
Copy link
Contributor

I'm not very familiar^ with the family code, so I restrict myself to a
philosophical comment.

I'm a big fan of piling lots of new features onto KPP, as long as they
are not switched on by default.

Often, it is fine to add a few IF (newfeature) THEN blocks to the
code. If there are too many IF blocks that slow down the code, it may
be necessary to deactivate the new features via CPP preprocessor
directives. Then the new code can be excluded from the compiled
executable.

^ no pun intended :-)

@msl3v
Copy link
Contributor Author

msl3v commented Dec 1, 2022

Using CPP to toggle the error norm calculation makes sense.

OK. If piling-it-on is the path, here's three approaches to this issue:

  1. Do as done already. Create a new vector and modify the error norm calcs in the various integrator files. Possibly bounded by a CPP tag. The vector could be logical or integer.
  2. Create a "#PAD" toggle, that pads any prod/loss or dummy species by a sufficiently large value to minimize the relative error for that species.
  3. Simply set a high ABSTOL for the prod/loss or dummy species that minimizes the relative error for that species, and use vector tolerances. Similar to (2), uses a toggle in the *.kpp file.

@yantosca
Copy link
Contributor

yantosca commented Dec 5, 2022

Hi all. I think it may be cleaner to use the toggle in the *.kpp file, as then you wouldn't have to burden users with having to compile in a C-preprocessor switch setting.

@msl3v
Copy link
Contributor Author

msl3v commented Feb 27, 2023

Wrote a simple program to test IF() statement overhead vs. the cost of FLOPs in computation of the error norm (see below). It executes two do-loops: in the first, a conditional is applied. If .true., then perform the error norm calc. Otherwise, nothing. The second loop simply computes the full vector regardless.

The IF() loop tests a random vector between 0 and 1.

IF (arr(j) .gt. X) then ... compute.

With gfortran (no optimization, i.e. no "-O" flag), it suggests that above X = 0.6, there is a speed advantage in using an IF() statement. Otherwise, it is considerably faster to just compute all terms.

With the "-O" flag, there is no difference at any of the range of tested X values (0.05 thru 0.95). So compiler optimization is really pretty effective.

Anyway - carry on.

  implicit none

  integer :: itr, i, j, nitr
  integer, parameter :: arrsize = 500
  real    :: tstart, tend, iftime, str8time 
  real    :: arr(arrsize), atol(arrsize), rtol(arrsize), c(arrsize), cnew(arrsize)

  call RANDOM_NUMBER(arr)
  call RANDOM_NUMBER(atol)
  call RANDOM_NUMBER(rtol)
  call RANDOM_NUMBER(c)

  ! Number of iterations to perform. Larger value increases the runtime                                                                                                                                                                                                         
  nitr = 100000

  ! Go ahead and run a loop just to prime the memory                                                                                                                                                                                                                            
  do i=1,nitr
     do j=1,size(arr)
        if (arr(j) .gt. 0.25) then
           cnew(j) = atol(j)+rtol(j)*c(j)
        endif
     enddo
  enddo

  ! Run the loop with the IF statement                                                                                                                                                                                                                                          
  call cpu_time(tstart)
  do i=1,nitr
     do j=1,size(arr)
        if (arr(j) .gt. 0.5) then
           cnew(j) = atol(j)+rtol(j)*c(j)
        endif
     enddo
  enddo
  call cpu_time(tend)

  iftime = tend-tstart ! Time spent in the IF loop                                                                                                                                                                                                                              

  write(*,*) 'IF time = ', iftime

  ! Run the loop just str8 throught the vector ops                                                                                                                                                                                                                              
  call cpu_time(tstart)
  do i=1,nitr
     do j=1,size(arr)
        cnew(j) = atol(j)+rtol(j)*c(j)
     enddo
  enddo
  call cpu_time(tend)

  str8time = tend-tstart ! Time spent just plowing straight through the vectors                                                                                                                                                                                                 

  write(*,*) 'STR8 time = ', str8time

  write(*,*) 'STR8/IF = ', str8time/iftime

end program main

@RolfSander
Copy link
Contributor

Hello Mike,

Thanks for your code development activities! I'm sorry that I'm
currently busy with other projects and haven't been able to contribute
much here. However, if you have any questions specifically for me, let
me know!

It's nice to see that IF statements don't slow down the optimized code.
That's quite reassuring, especially since we have added a couple of
IF statements deep inside the loops for Do_Update*

   IF ( Do_Update_SUN    ) CALL Update_SUN()
   IF ( Do_Update_RCONST ) CALL Update_RCONST()

Copy link
Contributor

@RolfSander @msl3v IF statements are pretty cheap, if they don't have an ELSE. For this reason I've started refactoring IF/ELSE blocks in my other projects to

A = .FALSE.
IF ( B ) A =.TRUE.

as the assignment an IF are likely cheaper than an IF/ELSE. The issue is that the code has to "guess" which branch is being taken (the IF or the ELSE) at the chip level and if it guesses wrong it has to dump and reload cache instructions and data.

@RolfSander
Copy link
Contributor

Interesting, thanks for the explanation, @yantosca!

Every day is a school day...

@yantosca
Copy link
Contributor

@msl3v @RolfSander @obin1: Are there any updates on this feature? Would this be ready to include in the dev branch?

@msl3v
Copy link
Contributor Author

msl3v commented Apr 14, 2023 via email

@yantosca
Copy link
Contributor

@msl3v: only as much as we don't want family species messing around with the integration. Not really a priority though.

@obin1
Copy link
Member

obin1 commented Apr 14, 2023

Here's a visualization from a project @msl3v and I've been working on: giving prod/loss species arbitrarily large absolute tolerances of 1e25 (effectively removing them from the error norm) reduces the number of integration steps. This can also slightly alter the concentrations of species whose tolerances we don't modify, by changing the internal timesteps.
visualizePL

@yantosca
Copy link
Contributor

Interesting find @obin1! That might be the best workaround. When I have time later I might open a feature request on the GEOS-Chem repo so that we initialize P/L species to 1e25.

@jimmielin
Copy link
Member

I just noticed that geoschem/geos-chem#1334 is put on pause because of the P/L species being considered in the error norm. I'm linking to it here so we can see these issues are related on GitHub.

@yantosca
Copy link
Contributor

@obin1 @RolfSander @jimmielin @msl3v This has been identified as a high-priority item to fix at the IGC11 meeting.

@yantosca yantosca assigned yantosca and unassigned msl3v Jun 14, 2024
@yantosca yantosca added this to the 3.2.0 milestone Jun 14, 2024
@yantosca yantosca removed the question Further information is requested label Jun 14, 2024
@obin1
Copy link
Member

obin1 commented Jun 14, 2024

Not the most elegant fix for KPP in general but the quickest fix for GEOS-Chem regarding this issue could be in the scope of GeosCore/fullchem_mod.F90, e.g. this commit from December 2023 in an older dev version of GEOS-CF. We were thinking about just making the ATOL of this small set of P/L species arbitrarily large, e.g.

LOGICAL :: relax_rtol, relax_atol_PL ! relax KPP tols
...

    ! Absolute tolerance
    ATOL      = 1e-2_dp
    relax_atol_PL = .TRUE.
    IF (relax_atol_PL) THEN
       IF (Input_Opt%amIRoot) write(*,*) "Setting ATOL of P/L dummy species to 1e25"
       ATOL(ind_LBRO2H)    = 1e25_dp
       ATOL(ind_LBRO2N)    = 1e25_dp
       ATOL(ind_LCH4)      = 1e25_dp
       ATOL(ind_LCO)       = 1e25_dp
       ATOL(ind_LISOPNO3)  = 1e25_dp
       ATOL(ind_LISOPOH)   = 1e25_dp
       ATOL(ind_LNRO2H)    = 1e25_dp
       ATOL(ind_LNRO2N)    = 1e25_dp
       ATOL(ind_LOx)       = 1e25_dp
       ATOL(ind_LTRO2H)    = 1e25_dp
       ATOL(ind_LTRO2N)    = 1e25_dp
       ATOL(ind_LXRO2H)    = 1e25_dp
       ATOL(ind_LXRO2N)    = 1e25_dp
       ATOL(ind_PCO)       = 1e25_dp
       ATOL(ind_PH2O2)     = 1e25_dp
       ATOL(ind_POx)       = 1e25_dp
       ATOL(ind_PSO4)      = 1e25_dp
    END IF

@msl3v
Copy link
Contributor Author

msl3v commented Jun 14, 2024 via email

@yantosca
Copy link
Contributor

Thanks @obin1 @msl3v. Another suggestion is to add entries for ATOL to the GEOS-Chem species_database.yml file, which could be read into the State_Chm%SpcData object (and then have a default ATOL if one is not specified). That would also be easy to implement. I will work on this when I get back. I am also on vacation until June 24th.

I agree, it would be nice to have a data structure internal to KPP that would flag if it could be added to the error norm or not. That will take some more thought but might be worth it in the long run. Open to your thoughts.

@yantosca
Copy link
Contributor

Also see related GEOS-Chem issue: geoschem/geos-chem#1753

@yantosca
Copy link
Contributor

yantosca commented Jul 9, 2024

Update: This issue has been fixed in GEOS-Chem 14.5.0 by externally setting the ATOL values. But we can leave this issue open as it would be nice to eventually have an internal fix for this in KPP. I think we'd need to have some kind of mask array set to 1's or 0's that we can multiply the rate constants by in order to exclude the dummy species (I think @msl3v suggested that before).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working integrators Related to numerical integrators
Projects
None yet
Development

No branches or pull requests

5 participants