Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cleaning tamc.h : collect previous comments #608

Closed
jm-c opened this issue Feb 22, 2022 · 6 comments · Fixed by #650
Closed

cleaning tamc.h : collect previous comments #608

jm-c opened this issue Feb 22, 2022 · 6 comments · Fixed by #650

Comments

@jm-c
Copy link
Member

jm-c commented Feb 22, 2022

In PR #579 discussion, removing some un-used parameters within "tamc.h" was mentionned but postponed until "nthreads_chkpt" could also be removed (after removing it from tape initialisations).
Raise this issue here to give this task more visibility (+ also as a reminder).

The relevant part in the discussion were:
#579 (comment)
and
#579 (comment)

@mjlosch
Copy link
Member

mjlosch commented Sep 8, 2022

@jm-c could you please explain, for reference, why nthreads_chkpt should never be different from 1? Is It because the number of threads (i.e. nTx*nTy) always needs to be < nSx*nSy, and since nSx and nSy are already considered in the size of the tapes?

@mjlosch
Copy link
Member

mjlosch commented Sep 8, 2022

I suggest a (draft) PR where we remove nthreads_chkpt from the_main_loop.F (the only place where it is used) and agree on the form of the reference tamc.h and then adjust all 19 verification-tamc.h (and the verification_other ones). I can start this PR if desired.

@jm-c
Copy link
Member Author

jm-c commented Sep 13, 2022

Regarding nthreads_chkpt , you are right: each time it appears in the code (in: the_main_loop.F and in pkg/smooth/smooth_diff2/3d.F) it's multiplied by nSx*nSy. But whether we run single thread or mutlple threads, the size of multi-tile arrays don't change, always nSx*nSy tiles per proc. So there is no need to reserve more space than the array size.
And in the case where we would need to store a common block array such as the ones in pkg/aim_v23/ (e.g., in AIM_GRID.h, com_forcing.h or com_physvar.h) with the last dimension function of the maximum number of threads "MAX_NO_THREADS" instead of the number of tiles "nSx*nSy", we would not use the product of both but instead either one or the other (MAX_NO_THREADS or nSx*nSy).

@jm-c
Copy link
Member Author

jm-c commented Sep 13, 2022

@mjlosch I agree with your plan for a draft PR.

@mjlosch mjlosch mentioned this issue Sep 15, 2022
@jm-c
Copy link
Member Author

jm-c commented Nov 8, 2022

An other simplification we could do when computing "key" to store a single tile array:
The key setting involves the tile indices bi,bj (as it should), but in a complicated way, for instance, see

ikey = (act1 + 1) + act2*max1

grouping the tiles by thread.
This ordering is not necessary (if we were running multi-threads, they would not run sequentially anyway) and we could just use simply the tile indices bi,bj:
ikey = bi + (bj - 1)*nSx + (ikey_dynamics - 1)*nSx*nSy

@mjlosch
Copy link
Member

mjlosch commented Nov 9, 2022

Thanks for this suggestion! The key computation was always hard for my to understand, but I never dared to touch this in fear of breaking something that was well tested.
There are a few places where I introduced new key computations, and I already did them in the way similar to your suggestions, e.g. here:

ipasskey = ipass + (ticekey-1) * maxcube

So I am all for it!

@jm-c jm-c closed this as completed in #650 Nov 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants