-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove for loops #16
remove for loops #16
Conversation
Codecov Report
@@ Coverage Diff @@
## zco_dev #16 +/- ##
==========================================
Coverage ? 92.30%
==========================================
Files ? 2
Lines ? 39
Branches ? 0
==========================================
Hits ? 36
Misses ? 3
Partials ? 0
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report at Codecov.
|
range(1, self._jpk + 1), coords={"z": range(self._jpk)}, dims="z" | ||
).astype(float) | ||
kindx = DataArray(range(self._jpk), coords={"z": range(self._jpk)}, dims="z") | ||
kindx = kindx + 1.0 # to deal with python convention |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to be consistent with ln 60:
kindx += 1.0 # Fortran indexing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't do it because int += float
is not allowed with DataArrays.
I don't like it either, so many options that I can't decide what would be the best to clarify what's going on:
kindx = DataArray(np.arange(1., self._jpk+1), coords={"z": range(self._jpk)}, dims="z")
# or
kindx = DataArray(range(1, self._jpk+1), coords={"z": range(self._jpk)}, dims="z").astype("float")
# or
kindx = DataArray(range(self._jpk), coords={"z": range(self._jpk)}, dims="z").astype("float")
kindx += 1.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My vote is for: kindx = DataArray(np.arange(1., self._jpk+1), coords={"z": range(self._jpk)}, dims="z")
Hi @pyNEMO/pydomcfg guys, That's very interesting, thanks a lot. Getting rid of loops and writing a more efficient pythonic code is surely our ultimate goal! While in z-coordinates z3 is effectively only 1D, i.e. z3(k), in s- and MEs-coordinates z3 is completely 3D z3(x,y,z). In addition, in s- and MEs-coordinates the coordinate transformation will have some parameters that will vary with the 2D bathymetry (e.g., the transitioning from sigma to stretched s-coordinates in sh94 or sf12). I personally think that using loop structures will help me writing faster and a more cleaner code that I could debug and check more easily. Of course the price would be the less efficiency , but I think we could worry about this at the end of this first phase, when we will a more comprehensive picture of all the cases we will have to deal with. So my suggestion would be to leave all these developments in a branch, continuing with the looping code and once we will have a base version of the code that we will be sure is able to correctly replicate the Fortran90 version then we could massively invest our energies in optimise the code and make it more efficient. What do you think? |
Hi @oceandie, sorry I was busy with other things yesterday - I'll try and catch up with this thread a little later. I would agree with you that to get the base code down is a priority. @malmans2 is introducing me to quite a few new aspects to python - which will take a while to permeate the 'old grey matter'. I'm not too sure it's necessary to keep these devs in a branch. The task at hand is to lay down the base code. In my mind once that is done, tested and approved it can be merged. We can open an 'efficiency' branch later. I was also thinking that Friday coffee/tea meet-ups are great way to review and discuss pythonic approaches... I think that way some of your/@malmans2 ideas/practices will slowly creep into the way I code too ;) |
I defenitely agree with you on everything @jdha. If you agree I will setup a recursive 30 min python-tea/coffee-chat for Fridays: what time do you generally prefer guys? |
This PR is not really meant to get merged but just to showcase how the final version could look like (lemme move it to draft). Would have not been easy to do it with inline comments. We can introduce these kind of more aggressive changes once you have a test that makes sure we are getting consistent results, but I think it's better sooner rather than later. So 1. conversion from Fortran -> 2. test -> 3. refactor The point I wanted to make is that your code is pretty much already written in a way that with a little change it would work with both for loops or multidimensional variables. The only thing you have to get rid of is basically the few times you use |
Fridays work for me! My afternoons are usually clear... |
me too - preferably between 14h00 and 15h00. I usually have to be out the door at 15h15 for the school run. |
.... oh, just realised that this is a second PR - sorry I had it in my head we were talking about this one. Happy to keep it as a PR as I try and digest the message ;) |
I think I like what you are saying (not sure if I am getting correctly :) ): basically you are saying that all methods returning float could be converted to return Dataset and then doesn't matter if you call them to return just one scalar (so behaving effectively as a float(var)) or to return an N-dim array. In this way we could keep the loops for the moment but having a code which is almost ready for a more massive optimisation in the future? ... Am I getting right? ... because if it is like this, if you agree I think I will implement your suggestions for modifying sigma, compute_z3 etc from float to Dataarray ... |
I think it might be even easier than that... Most of the time you wrap variables in
As you can't do |
@oceandie
There's a couple of things a think xarray can simplify:
xarray
will automatically broadcast whenever it's needed.for
loop you had in__call__