-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG/ISSUE] Prod/loss species are contributing to the Error Norm when they should be ignored #66
Comments
Hi @msl3v @RolfSander @jimmielin: Am wondering if the best thing is to pass a vector of logicals where the prod/loss species would be true and everything else false. We could initialize that outside of the loop over grid boxes. !~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
KPP_REAL FUNCTION ros_ErrorNorm ( Y, Ynew, Yerr, IsDummySpc, &
AbsTol, RelTol, VectorTol )
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
!~~~> Computes the "scaled norm" of the error vector Yerr
!~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IMPLICIT NONE
! Input arguments
KPP_REAL, INTENT(IN) :: Y(N), Ynew(N), &
Yerr(N), IsDummySpc(N), AbsTol(N), RelTol(N)
LOGICAL, INTENT(IN) :: VectorTol
! Local variables
KPP_REAL :: Err, Scale, Ymax
INTEGER :: i
KPP_REAL, PARAMETER :: ZERO = 0.0_dp
Err = ZERO
DO i=1,N
IF ( isDummySpc(N) ) CYCLE
Ymax = MAX(ABS(Y(i)),ABS(Ynew(i)))
IF (VectorTol) THEN
Scale = AbsTol(i)+RelTol(i)*Ymax
ELSE
Scale = AbsTol(1)+RelTol(1)*Ymax
END IF
Err = Err+(Yerr(i)/Scale)**2
END DO
Err = SQRT(Err/N)
ros_ErrorNorm = MAX(Err,1.0d-10)
END FUNCTION ros_ErrorNorm |
I think we'd need to preprocess an index of actual species. Note at the end of the ros_ErrorNorm() function, that Err is divided by N |
Thanks @msl3v. I think the number of prod/loss species are given by NFam, right? Then it would be divided by N-Nfam |
@yantosca I think you're correct! Alright. This should be an easy fix. Where should it go? Dev? Or would it be better to create a new branch and create a PR? |
I think a new branch would be best. Then you have time to test it Maybe you can use a Fortran90 mask and a WHERE block for this. I haven't
Here, the mask |
So, whether it's a logical vector or integer vector, it makes sense to pre-process it into the code rather than have KPP build it each call. My inclination is to pre-process a separate vector of species index values and have the function loop over that. It will avoid unnecessary branching. There would be no need for an IF() statement. While I'm not really sure that this would be more efficient, ErrorNorm gets called every internal time step.
Or is this unnecessary complexity for marginal improvement? |
Created a new branch, FamilyIndexTesting. As usual, bad with naming things. Please have a look @yantosca @RolfSander @jimmielin
I tested this by manually changing the ros_ErrorNorm() function in a boxmodel I use, and the number of internal steps decreased from 19 to 17 for the given GEOS-Chem scenario I tested. So, an improvement. |
There's an alternative solution to this issue here: that family terms are weighing down the error norm. We were looking at the error in some detail, and realized that the reason the relative error for the family terms is large is because they are initialized to zero. A kludgy fix would/could simply be to initialize all P/L terms to a large value, and just use finite difference to get their true value after the integration is complete. The benefit would be that we minimize additional complexity within KPP. <-- I feel this is not unimportant. As KPP is a community project now, additional complexity could hinder development and/or increase the likelihood of bugs. It also sets a precedent that says "keep piling things on" when the benefit may be marginal and even localized to only a few users' cases. I also don't like making changes that touch all of the integrator template files. Though in the case here, this could be resolved if, say, the ErrorNorm() functions were put in a pre-processed module file like Util. or LinearAlgebra. This is more of a philosophical comment. But I hope for some feedback. |
I'm not very familiar^ with the family code, so I restrict myself to a I'm a big fan of piling lots of new features onto KPP, as long as they Often, it is fine to add a few ^ no pun intended :-) |
Using CPP to toggle the error norm calculation makes sense. OK. If piling-it-on is the path, here's three approaches to this issue:
|
Hi all. I think it may be cleaner to use the toggle in the *.kpp file, as then you wouldn't have to burden users with having to compile in a C-preprocessor switch setting. |
Wrote a simple program to test IF() statement overhead vs. the cost of FLOPs in computation of the error norm (see below). It executes two do-loops: in the first, a conditional is applied. If .true., then perform the error norm calc. Otherwise, nothing. The second loop simply computes the full vector regardless. The IF() loop tests a random vector between 0 and 1. IF (arr(j) .gt. X) then ... compute. With gfortran (no optimization, i.e. no "-O" flag), it suggests that above X = 0.6, there is a speed advantage in using an IF() statement. Otherwise, it is considerably faster to just compute all terms. With the "-O" flag, there is no difference at any of the range of tested X values (0.05 thru 0.95). So compiler optimization is really pretty effective. Anyway - carry on. implicit none
integer :: itr, i, j, nitr
integer, parameter :: arrsize = 500
real :: tstart, tend, iftime, str8time
real :: arr(arrsize), atol(arrsize), rtol(arrsize), c(arrsize), cnew(arrsize)
call RANDOM_NUMBER(arr)
call RANDOM_NUMBER(atol)
call RANDOM_NUMBER(rtol)
call RANDOM_NUMBER(c)
! Number of iterations to perform. Larger value increases the runtime
nitr = 100000
! Go ahead and run a loop just to prime the memory
do i=1,nitr
do j=1,size(arr)
if (arr(j) .gt. 0.25) then
cnew(j) = atol(j)+rtol(j)*c(j)
endif
enddo
enddo
! Run the loop with the IF statement
call cpu_time(tstart)
do i=1,nitr
do j=1,size(arr)
if (arr(j) .gt. 0.5) then
cnew(j) = atol(j)+rtol(j)*c(j)
endif
enddo
enddo
call cpu_time(tend)
iftime = tend-tstart ! Time spent in the IF loop
write(*,*) 'IF time = ', iftime
! Run the loop just str8 throught the vector ops
call cpu_time(tstart)
do i=1,nitr
do j=1,size(arr)
cnew(j) = atol(j)+rtol(j)*c(j)
enddo
enddo
call cpu_time(tend)
str8time = tend-tstart ! Time spent just plowing straight through the vectors
write(*,*) 'STR8 time = ', str8time
write(*,*) 'STR8/IF = ', str8time/iftime
end program main |
Hello Mike, Thanks for your code development activities! I'm sorry that I'm It's nice to see that
|
@RolfSander @msl3v IF statements are pretty cheap, if they don't have an ELSE. For this reason I've started refactoring IF/ELSE blocks in my other projects to A = .FALSE.
IF ( B ) A =.TRUE. as the assignment an IF are likely cheaper than an IF/ELSE. The issue is that the code has to "guess" which branch is being taken (the IF or the ELSE) at the chip level and if it guesses wrong it has to dump and reload cache instructions and data. |
Interesting, thanks for the explanation, @yantosca! Every day is a school day... |
@msl3v @RolfSander @obin1: Are there any updates on this feature? Would this be ready to include in the dev branch? |
I only developed it as far as a successful test of concept. Not sure the
best way to proceed.
Using the auto-reduce boxmodel wouldn't work since the C() and RCONST()
vals are hard-coded by reaction index.
Is this something you/GCST/GCSC want to see done?
…On 4/14/23 10:29, Bob Yantosca wrote:
@msl3v <https://github.com/msl3v> @RolfSander
<https://github.com/RolfSander> @obin1 <https://github.com/obin1>: Are
there any updates on this feature? Would this be ready to include in
the dev branch?
—
Reply to this email directly, view it on GitHub
<#66 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABVE237TTEPSXPCZNHJG7ELXBFNOLANCNFSM6AAAAAASE5AO3Y>.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@msl3v: only as much as we don't want family species messing around with the integration. Not really a priority though. |
Here's a visualization from a project @msl3v and I've been working on: giving prod/loss species arbitrarily large absolute tolerances of 1e25 (effectively removing them from the error norm) reduces the number of integration steps. This can also slightly alter the concentrations of species whose tolerances we don't modify, by changing the internal timesteps. |
Interesting find @obin1! That might be the best workaround. When I have time later I might open a feature request on the GEOS-Chem repo so that we initialize P/L species to 1e25. |
I just noticed that geoschem/geos-chem#1334 is put on pause because of the P/L species being considered in the error norm. I'm linking to it here so we can see these issues are related on GitHub. |
@obin1 @RolfSander @jimmielin @msl3v This has been identified as a high-priority item to fix at the IGC11 meeting. |
Not the most elegant fix for KPP in general but the quickest fix for GEOS-Chem regarding this issue could be in the scope of
|
This is trivial to resolve. - Set the tols arbitrarily high. - use a logical vectorI like option (1) for simplicity. Option (2) adds a new data structure. Tho maybe all the overhead of (2) could be done at preprocessing. Add a tag to a species that says keep it out of the error norm. Store the info in the stack. We should discuss the best option. I’m on vacation thru June 23 but will monitor email as I can.
|
Thanks @obin1 @msl3v. Another suggestion is to add entries for I agree, it would be nice to have a data structure internal to KPP that would flag if it could be added to the error norm or not. That will take some more thought but might be worth it in the long run. Open to your thoughts. |
Also see related GEOS-Chem issue: geoschem/geos-chem#1753 |
Update: This issue has been fixed in GEOS-Chem 14.5.0 by externally setting the ATOL values. But we can leave this issue open as it would be nice to eventually have an internal fix for this in KPP. I think we'd need to have some kind of mask array set to 1's or 0's that we can multiply the rate constants by in order to exclude the dummy species (I think @msl3v suggested that before). |
We realized today that when putting Prod/Loss terms for families into the mechanism, KPP uses the errors of these terms in the error norm calculation. In one example, the CH4 loss term, LCH4, has a high relative error, suggesting that it imparts a significant impact on KPP's behavior (step acceptance/rejection, internal step size). The prod/loss terms should be entirely passive. They should not be considered in the convergence evaluation. So I would like to implement a fix that allows ros_ErrorNorm() to ignore them in computing Err.
Can y'all think of a best way to do this?
The text was updated successfully, but these errors were encountered: