Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not drop indexes when computing rmax #40

Merged
merged 8 commits into from
Jun 27, 2021
Merged

Conversation

malmans2
Copy link
Member

@malmans2 malmans2 commented Jun 21, 2021

@malmans2 malmans2 added the enhancement New feature or request label Jun 21, 2021
@malmans2 malmans2 requested a review from jdha June 21, 2021 09:17
@malmans2 malmans2 added this to In progress in domzgr.F90 -> pyDOMCFG via automation Jun 21, 2021
@codecov
Copy link

codecov bot commented Jun 21, 2021

Codecov Report

Merging #40 (6e8ed5e) into main (f68612d) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##             main      #40   +/-   ##
=======================================
  Coverage   94.85%   94.85%           
=======================================
  Files           5        5           
  Lines         214      214           
=======================================
  Hits          203      203           
  Misses         11       11           
Flag Coverage Δ
unittests 94.85% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pydomcfg/domzgr/zgr.py 100.00% <ø> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f68612d...6e8ed5e. Read the comment docs.

@malmans2
Copy link
Member Author

Ready for review!

@oceandie oceandie mentioned this pull request Jun 21, 2021
6 tasks
@malmans2
Copy link
Member Author

malmans2 commented Jun 21, 2021

@oceandie @jdha moving the discussion here.
What is you preference for land points (0/0):

  1. Return inf
  2. Return NaN
  3. Return 0

Also what about the rolling average. I.e., what should be the output of coastal points. Say val0=0.1, val1=0/0, the rolling average should return:

  1. 0.1
  2. 0.1 / 2
  3. Others? (0, or NaN, or inf, ....)?

@jdha
Copy link
Contributor

jdha commented Jun 21, 2021

Sorry, I'm probably a little slow on the uptake but re land points: a DS is passed (and therefore a mask) so the land isn't used in the calc, is that correct?

Re: coastal point. If we're adopting @malmans2 solution, then the output at coastal points is just the maximum rmax of the surrounding u/v points, no? (u/v being zero adjacent to land)

@oceandie
Copy link
Contributor

Sorry, I'm probably a little slow on the uptake but re land points: a DS is passed (and therefore a mask) so the land isn't used in the calc, is that correct?

Re: coastal point. If we're adopting @malmans2 solution, then the output at coastal points is just the maximum rmax of the surrounding u/v points, no? (u/v being zero adjacent to land)

Right now the input is a DataArray (DA) and we don't specify anywhere that has to be masked. Also, creating the mask inside the function, doesn't really work, since the input DA not necesserely correspond to the original model bathymetry (i.e. original land-sea mask) : for example NEMO set first land point adjacent to a wet cell to min_dep before the smoothing as this needs to be included in smoothing (see here)

I think that even implementing #39 we still have division by zero on the land, which will give nans and then will create a problem since np.maximum propagates nans, isn't it?

@malmans2
Copy link
Member Author

I think @jdha is right, but currently we are not expecting users to pass a mask along with Bathymetry (I wasn't sure if we should expect mask from users, so the Bathymetry adds a mask but it is never used).

So I think we just have to add depth = depth.where(depth > 0, 0) at the beginning, which infers sea points and replaces all land points with zeros.

These are the steps for each dimensions of the current implementation:

  1. Move to U/V points: rmax_vel = abs(H[1] - H[0]) / (H[0] + H[1]) -> 0/0 gives NaN
  2. Move back to T points: rmax = mean(rmax_vel[0], rmax_vel[1]) -> By default, skipna=True, so mean([1, NaN]) == 1
  3. replace NaN with 0 (NaN are either land points or first/last row/column)

Finally, rmax = max(rmax_x, rmax_y)

My question was mainly about how to handle 2 in this comment, when we move back to T points. I think there are cases where it would make a difference if we replace NaN with 0. So mean(1, NaN) != mean(1, 0), which could affect the final result near land. It makes more sense to me to use NaN, but I'm not 100% sure. Does it make sense?

@malmans2
Copy link
Member Author

malmans2 commented Jun 21, 2021

These are the steps for each dimensions of the current implementation:

With current implementation I mean this PR... I think I see now the difference in this implementation that fixes issues with land:

rmax = rmax.shift({dim: -1}).fillna(0)

We are now filling all nan with zeros (i.e., land and first/last row/column), before we were just padding the boundaries with zeros.

@jdha
Copy link
Contributor

jdha commented Jun 21, 2021

I think @jdha is right, but currently we are not expecting users to pass a mask along with Bathymetry (I wasn't sure if we should expect mask from users, so the Bathymetry adds a mask but it is never used).

So I think we just have to add depth = depth.where(depth > 0, 0) at the beginning, which infers sea points and replaces all land points with zeros.

These are the steps for each dimensions of the current implementation:

  1. Move to U/V points: rmax_vel = abs(H[1] - H[0]) / (H[0] + H[1]) -> 0/0 gives NaN
  2. Move back to T points: rmax = mean(rmax_vel[0], rmax_vel[1]) -> By default, skipna=True, so mean([1, NaN]) == 1
  3. replace NaN with 0 (NaN are either land points or first/last row/column)

Finally, rmax = max(rmax_x, rmax_y)

My question was mainly about how to handle 2 in this comment, when we move back to T points. I think there are cases where it would make a difference if we replace NaN with 0. So mean(1, NaN) != mean(1, 0), which could affect the final result near land. It makes more sense to me to use NaN, but I'm not 100% sure. Does it make sense?

@malmans2 I would definitely use NaN, as that value/point is never used to calculate pressure gradients in NEMO, so is of no interest.

@oceandie
Copy link
Contributor

Since we are not dropping nans anymore an given the default behaviour of mean (which solves the problem with maximum I was mentioning before), yes I would agree to leave nan to avoind weird results with zeros is the way to go.

@oceandie
Copy link
Contributor

However, I still think the input should be a DA, not a DS., what do you think?

@malmans2
Copy link
Member Author

malmans2 commented Jun 21, 2021

However, I still think the input should be a DA, not a DS., what do you think?

Yes, the bathymetry is the only DataArray needed to compute rmax. The mask is inferred from positive values, and anything that is land (not positive) is replaced with zeros.

I also added skipna=True to make explicit what's going on.

@malmans2
Copy link
Member Author

pre-commit.ci autofix

@@ -155,6 +155,9 @@ def _calc_rmax(depth):
Slope steepness value (units: None)
"""

# Replace land with zeros
depth = depth.where(depth > 0, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this here? I think when bathy or envelope are passed to this function they should have already zeros for land

Copy link
Member Author

@malmans2 malmans2 Jun 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happen if the user bathymetry has negative values or NaN on land?
Should we remove it and clarify in the documentation that Bathymetry must be >= 0?

It's not a bad idea to get rid of this because it is potentially a very expensive but also useless operation...

Copy link
Contributor

@oceandie oceandie Jun 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say yes, since I think we should write a function that put all land values to zero when interpolating the input bathymetry onto the model grid. So the model bathymetry will be positive defined by definition, with zeros for land (also I think NEMO will fail if this conditions are not met, but not sure about this last point)

Copy link
Contributor

@jdha jdha Jun 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I've been dipping in and out of this today (so no continuity in my thought process). @oceandie re NEMO will fail, NEMO as far as I understand doesn't use bathymetry field, just scale factors and wet cells [i.e. if top_level is equal to zero it's land]. So in that sense we can define land values in bathymetry as we see fit. I'll look over the python again tomorrow (and you may have already included some of this), but the key user defined vars are rn_sbot_min (for ln_sco) and rn_hmin (for ln_zco, ln_zps; which take different meanings depending on whether they are -ve or +ve: min number of levels or min depth). So when calculating rmax this user choices will affect the outcome (although, not in the case of tests).

from the Fortran (where the bathy is updated):

IF ( .not. ln_sco ) THEN                                !==  set a minimum depth  ==!
         IF( rn_hmin < 0._wp ) THEN    ;   ik = - INT( rn_hmin )                                      ! from a nb of level
         ELSE                          ;   ik = MINLOC( gdepw_1d, mask = gdepw_1d > rn_hmin, dim = 1 )  ! from a depth
         ENDIF
         zhmin = gdepw_1d(ik+1)                                                        ! minimum depth = ik+1 w-levels 
         WHERE( bathy(:,:) <= 0._wp )   ;   bathy(:,:) = 0._wp                         ! min=0     over the lands
         ELSE WHERE ( risfdep == 0._wp );   bathy(:,:) = MAX(  zhmin , bathy(:,:)  )   ! min=zhmin over the oceans
         END WHERE
         IF(lwp) write(numout,*) 'Minimum ocean depth: ', zhmin, ' minimum number of ocean levels : ', ik
ENDIF

and

bathy(:,:) = MIN( rn_sbot_max, bathy(:,:) )

         DO jj = 1, jpj
            DO ji = 1, jpi
              IF( bathy(ji,jj) > 0._wp )   bathy(ji,jj) = MAX( rn_sbot_min, bathy(ji,jj) )
            END DO
         END DO

note there is also a rn_sbot_max

Apologies if I'm going over old ground...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HI @jdha, yes sorry, I was still thinking about NEMO3.6 or DOMCFG, that we are actually trying to replicate :)
Regarding all the params and bits of code you mentioned, yes I started to consider them - soon I will push where I am now in sco_dev , then any feedback is more than welcome ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this is helpful! (maybe open an issue about this as this comment will get kind of lost when we merge this PR?)

This PR should be just the starting point, and _calc_rmax will become the general function to compute rmax. Once we merge this PR, I think @oceandie will merge main into #33 and will move the whole _calc_rmax function into utils. I guess vcoord-specific parameters such as rn_sbot_min will be handled internally by our domzgr classes (but all classes will make use of the same _calc_rmax function).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @malmans2 and @jdha , how can I see the original code for calc_rmax by @jdha (the one using numpy) ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR started from the xarray version. You'd have to look in James' PR (already merged). Should be this one: #17

I'd go there and look at the appropriate commit (e.g., click on commits and select the first commit).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Member Author

@malmans2 malmans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good to go now... I changed stuff back and forward quite a bit, so it's probably better if you compare the changes from all commits (button at the top of this page if you didn't use it before).

pydomcfg/tests/bathymetry.py Show resolved Hide resolved
pydomcfg/tests/bathymetry.py Show resolved Hide resolved
@malmans2 malmans2 mentioned this pull request Jun 26, 2021
@malmans2
Copy link
Member Author

malmans2 commented Jun 26, 2021

This is now using the same implementation in pyroms

domzgr.F90 -> pyDOMCFG automation moved this from In progress to Reviewer approved Jun 27, 2021
@oceandie oceandie merged commit 9002bb0 into pyNEMO:main Jun 27, 2021
domzgr.F90 -> pyDOMCFG automation moved this from Reviewer approved to Done Jun 27, 2021
@malmans2 malmans2 deleted the rmax_indexes branch June 30, 2021 07:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

testing calc_rmax calc_rmax: Should we compute the rolling average using absolute values?
3 participants