-
Notifications
You must be signed in to change notification settings - Fork 53
Add moc script (postprocessing) and plot_vertical_section #139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@xylar, @pwolfram: this is not ready to merge, but it is a big PR (for me, at least), so it would be great if you could provide some feedback as soon as it is possible for you. Things that I tried to address and things that I still would like to address:
|
Yep, happy to take a look as soon as I can. I agree, it's a big PR. It looks like a lot of hard work and a very nice job, based on what I've seen so far.
I would have to look more closely at the
Yep, breaking things into smaller functions can often be a good move. Happy to advise where I can.
Let me know if you need help or advise here, or if it's really just a question of time. |
ok, I can see to add a check |
|
@xylar: one other thing that perhaps this could include, since plotting.py is being updated, is to add the setup_colorbar function in plotting, rather than in ocean_modelvsobs. What do you think? Also, quick dumb question: why some functions start with an underscore? (sorry..) |
Sure, that sounds good. You could make a function to parse the colorbar info from the config given a section name, right? Then, different plot routines would call this same function to set up the colorbar?
No worries, glad you asked. The convention is that functions starting with underscores are "non-public" meaning other python files aren't meant to use them. In practice, you can use whatever you want, but functions with underscores get less documentation and they're allowed to change without warning and don't have to be backwards compatible. It won't get mentioned in the list of functions available at the top of the module. I typically try to use underscores at the beginning of function names to indicate that it's kind of a helper function. You don't have to do this if you don't want. |
yes, that's exactly what I was thinking.
ah! great. Thanks. |
|
@milenaveneziani, could you rebase onto |
config.default
Outdated
| # Mask file for post-processing regional MOC computation | ||
| regionMaskFiles = /path/to/MOCregional/mapping/file | ||
|
|
||
| # colormap for model/observations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since there are no observations, maybe change the comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
@milenaveneziani, could you post instructions for creating the regional mask files for a given grid? That way, I could create them for some other grids we want to support. |
22d1258 to
2bf7570
Compare
|
@milenaveneziani, I can only run the MOC analysis on wolf right now because of not having read permission for the mask on Edison (see above) and not having a mask file for the G-UQ240 run (the only one I can test on my laptop). I ran successfully with the default settings for the which would be settings I would typically use to plot the whole time series, I get an error: I will make some comments around the code relating to this error. |
|
good catch. I imagine why it gives an error. About the masks: @mark-petersen made them, you can ask him for instructions on how to do it. Or we can try to address the problem with extracting the southern boundary in the MPAS-Tool that you created a while back (right after this), so that we can create the region masks and transect masks with this newer workflow. And I will change permissions on the edison mask files, sorry about that. |
pwolfram
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big picture comments:
- The code does a nice job of providing two options for plotting of the MOC (pre- and post-processed options) within the current MPAS-Analysis framework.
- Documentation / overall PEP8 compliance is great.
- The script is long-- most functions should be 50-100 lines long at most, optimally. This one is around 400. It would be good to break this up into subfunctions if possible, especially related to regions of logic, e.g., A) plot precomputed MOC, B) plot post-process computed MOC
- I think it is ok to violate PEP8 for the greater interest of clarity in reading the code, as noted below.
| # referenceRunName is the name of a reference run to compare against (or None | ||
| # to turn off comparison with a reference, e.g. if no reference case is | ||
| # available) | ||
| referenceRunName = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need the above lines because referenceRunName = None is specified in the defaults. If it is needed here I recommend a more specific comment than the generic one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| oceanNamelistFileName = mpas-o_in | ||
| oceanStreamsFileName = streams.ocean | ||
| seaIceNamelistFileName = mpas-cice_in | ||
| seaIceStreamsFileName = streams.cice |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, these appear to be default values so I don't think we need these lines above in this script. They obviously don't hurt, but they do provide additional clutter that a user has to navigate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| baseDirectory = /lustre/atlas/proj-shared/cli115/observations | ||
| sstSubdirectory = SST | ||
| sssSubdirectory = SSS | ||
| mldSubdirectory = MLD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, these subdirectories are not unique relative to the defaults and including them here appears redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your other similar comments, but this one, because we have slightly different directory structures on different machines, so it may be clearer to specify subdirectories.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense. Thank you for clarifying.
| ## compared | ||
|
|
||
| # directory where sea ice reference simulation results are stored | ||
| baseDirectory = /lustre/atlas/proj-shared/cli115/milena/ACMEv0_lowres/B1850C5_ne30_v0.4/ice/postprocessing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! This makes the input file much cleaner. 👍
| raise RuntimeError('*** MPAS-Analysis relies on the existence of ' | ||
| 'monthly\n*** averaged files: make sure to ' | ||
| 'enable the\n*** timeSeriesStatsMonthly ' | ||
| 'analysis member!') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer this be written as
raise RuntimeError('*** MPAS-Analysis relies on the existence of monthly \n'
'*** averaged files: make sure to enable the \n'
'*** timeSeriesStatsMonthly analysis member!')to that the error message is easier to read in the code, even if this means we violate PEP8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The message above is hard to read in its current form, e.g., due to PEP8 compliance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with violating PEP8 in this case.
| colorbarLevels = config.getExpression( | ||
| sectionNameClimo, | ||
| '{}ColorbarLevels'.format(regionName)) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is where the colormapName code, etc, as highlighted above, should go.
| time[:] = ds.Time[ncount-12:ncount].values | ||
| fldAtlantic26[:] = mocRegional26[ncount-12:ncount] | ||
| ncFile.close() | ||
| else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Large if/else loops are difficult to understand. This can be clarified by using more functions as possible to keep code in if/else sections to some small number of lines, e.g., <<50.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, move the simpler section up front to the if/else to make this clearer.
| movingAveragePoints, title, | ||
| xLabel, yLabel, figureName, | ||
| lineStyles=['k-'], lineWidths=[1.5], | ||
| calendar=calendar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to do all the plotting at the end in designated sections for each particular plot? As it is now plotting occurs throughout the script, which is somewhat confusing.
| """ | ||
| monthlyClimatology = compute_monthly_climatology(ds, calendar) | ||
| annualClimatology = monthlyClimatology.mean('month') | ||
| return annualClimatology |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really beautiful and is demonstrating the power of having modular, easy to understand code!
run_analysis.py
Outdated
| # print "Plotting Meridional Overturning Circulation (MOC)..." | ||
| # from mpas_analysis.ocean.meridional_overturning_circulation import moc_postprocess | ||
| # moc_postprocess(config, streamMap=oceanStreamMap, | ||
| # variableMap=oceanVariableMap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this commented code above? My preference is to remove it from the final merged code.
xylar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@milenaveneziani, I think this is some good progress. Like @pwolfram, I see some places for improvement in terms of organizing into smaller helper functions wherever possible. I think it would be helpful for the future to give some more thought to making code a bit more generic and reusable, like not having variables specific to the Atlantic but rather for whichever region you might be analyzing, and looping over regions (even though there's only one for now).
run_analysis.py
Outdated
| variableMap=oceanVariableMap) | ||
|
|
||
| if checkGenerate(config, analysisName='streamfunctionMOC', mpasCore='ocean', | ||
| analysisCategory='regriddedHorizontal'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest leaving analysisCategory=None for this or inventing a new category (I thought of 'transect' or 'verticalSection', but that's not really correct for the streamfunction...), because I don't think this analysis task really fits in regriddedHorizontal.
As a separate PR later on, I will probably change the name of regriddedHorziontal and the associated tasks to something like climatologyMap. I think the MOC streamfunction still doesn't fit in that category.
| print "" | ||
| print "Plotting streamfunction of Meridional Overturning Circulation (MOC)..." | ||
| from mpas_analysis.ocean.meridional_overturning_circulation import moc_streamfunction | ||
| moc_streamfunction(config, streamMap=oceanStreamMap, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Several of these lines are too long to pass PEP8. If you use spyder to edit your code, make sure to go to Preferences > Editor > Code Inspection/Analysis and turn on 'Code analysis'. That way, you will get warnings in orange on the side anytime there's a PEP8 problem. If you're using vim or something else to edit the code, I'm afraid you'll have to use flake8 afterward to check for issues.
| import matplotlib.colors as cols | ||
|
|
||
| import xarray as xr | ||
| import datetime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
datetime is not used and can be removed
| import open_multifile_dataset | ||
|
|
||
| from ..shared.timekeeping.utility import get_simulation_start_time, \ | ||
| days_to_datetime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
days_to_datetime is not used and can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have reintroduced it because we do use it now.
| refTopDepth[1:nVertLevels+1] = refBottomDepth[0:nVertLevels] | ||
| refLayerThickness = np.zeros(nVertLevels) | ||
| refLayerThickness[0] = refBottomDepth[0] | ||
| refLayerThickness[1:nVertLevels] = refBottomDepth[1:nVertLevels] - \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Various PEP8 issues starting here. See my other comment about setting up spyder to detect these as you write code. Or use flake8 to find them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was one specific case when flake8 didn't like what I did, no matter what. which is why I decided to choose the best indent for readability.
I think this was the only option that flake8 liked, from what I remember:
refLayerThickness[1:nVertLevels] = refBottomDepth[1:nVertLevels] - \
refBottomDepth[0:nVertLevels-1]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's complaining that there are 2 spaces after the equal sign. The formatting is fine. Here's how I would handle this one, though:
refLayerThickness[1:nVertLevels] = (refBottomDepth[1:nVertLevels] -
refBottomDepth[0:nVertLevels-1])
| print '\n Compute and/or plot post-processed MOC climatological '\ | ||
| 'streamfunction...' | ||
| if not os.path.exists(outputFileClimo): | ||
| print ' Load data...' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part inside the "if" section could also make sense as a separate helper function.
| ncFile.close() | ||
| else: | ||
| # Read from file | ||
| print ' Read previously computed MOC streamfunction from file...' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is small enough that it doesn't necessarily need to be a helper function, but it could be if you wanted to keep things kind of symmetric with the helper function you would add above to compute and write the MOC streamfunction.
| ## circulation (MOC) | ||
|
|
||
| # Mask file for ocean basin regional computation | ||
| regionMaskFiles = /global/project/projectdirs/acme/mapping/grids/EC60to30v1_SingleRegionAtlanticWTransportTransects_masks.nc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@milenaveneziani, could you please change the permissions on this file? The easiest would be changing the group to acme.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| latBinGlobal.description = \ | ||
| 'latitude bins for MOC Global streamfunction' | ||
| latBinGlobal.units = 'degrees (-90 to 90)' | ||
| latBinAtlantic = ncFile.createVariable( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@milenaveneziani, for now, I think this is fine but I wonder if it would make sense to rewrite everything explicitly referencing the Atlantic so it doesn't have to know which region it's referring to in anticipation of having a list of regions. If you want, we could discuss this by phone.
| outputFileTseries = '{}/mocTimeSeries_year{:04d}.nc'.format( | ||
| outputDirectory, year) | ||
| if not os.path.exists(outputFileTseries): | ||
| for nm in range(12): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this outer looping over nyears and inner looping over months is problematic because the time series may not start and end t the beginning and end of a year, as I found when I set endYear=9999 on wolf.
I also don't see the value of saving a netcdf file for each year as opposed to processing all years into a single file. If the idea would be to add data to your existing file as the run continues, that would be possible with an unlimited dimension for time. The existing approach is problematic if a given year is incomplete because the file might exist from the previous run of analysis but it might be incomplete (because the output for the full final year didn't yet exist) so breaking the output into yearly files doesn't really change the need for an unlimited time dimension and checking to see if you need to append to an existing file or create a new one.
Here is an example of a script where I use an unlimited dimension for time and either create a new file or append to and existing one, depending on whether the file exists already:
https://github.com/MPAS-Dev/MPAS/blob/ocean/develop/testing_and_setup/compass/ocean/isomip_plus/viz/computeOverturningStreamfunction.py#L132
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should discuss this by phone.
|
Thanks so much for the prompt review!
I will address or reply to your comments one by one now. |
Okay, well I guess I can go along with this. I've been trying to be very strict about the PEP8 standard. My preference would be to try very hard to maintain the standard but to have long lines in rare instances (e.g. very long strings, particularly when it puts \n commands in the logical place). The 79 character limit is frankly quite inconvenient and there are lots of cases where I think readability would be better without it but then one might as well throw out the standard...
Okay, I'm fine with generalizing later. If I were writing this, I'd want to save myself work down the road by writing the generalized version now. But since it's already written this way, I think it's fine to leave it for now. |
I know, I feel the same way. I guess what I could do is revisit the MPAS-Tool you created a few months ago, and see what I get with the script that I have now. We could try to reiterate a bit there to see if we can solve any issue relatively quickly, and then go from there. What do you think? |
|
|
||
| ax = plt.gca() | ||
|
|
||
| # Add a x=0 line if y ranges between positive and negative values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you mean y = 0, not x = 0, right?
This all seems great! I'm not bothered by the redundancy because it allows flexibility we might want in the future and makes things very clear. The longer
Looking at the code, I think you mean |
|
Looking at the code, there are still a number of PEP8 formatting issues that should preferably be addressed before this gets merged. I can point out individual ones if you need. spyder should be able to help with indentation issues. |
|
Tests on my laptop seem to suggest that For now, I am just flagging this as a potential issue to possibly be addressed in a follow-up PR. |
| print '\n Compute and/or plot post-processed MOC climatological '\ | ||
| 'streamfunction...' | ||
| simulationStartTime = get_simulation_start_time(streams) | ||
| outputDirectory = buildConfigFullPath(config, 'output', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This output directory is appropriate for the streamfunction but maybe it's kind of confusing to write files for the time series here? Should we make a separate directory for writing out time series data? We could also cache extractions of time series from other analysis tasks.
|
@milenaveneziani, I went ahead an made the changes I suggested above because I needed them to test the code. I also did a little bit of clean up. If you want, you can just do a hard reset (or a cherry-pick) to 7370db9. |
|
@milenaveneziani, if you are okay with my last fix, I think we're in good shape. I have tested with these changes using the I also tested the beta1 run ( |
|
Note: I'm not able to run on an Maybe that's okay but we should make a decision if we're not supporting backward compatibility to older runs. |
|
@milenaveneziani, I'm looking forward to having these available for a doing actual analysis. The One suggestion might be changing the colormap and/or removing the tick at 0 Sv in the Global MOC plot because the transition just look like noise in all the plots and it's rather distracting. |
|
ah, that's because the am was not called timeseriesstatsmonthly back then.. That means it has to go through variable map? Not too worried about alpha8, but maybe we can make it more robust to future changes. |
Sounds good. Let me know when you're happy. I'll do one last test on Edison (on the theory that the longer tests across multiple runs will catch more problems) and then I'm happy to merge. I'm assuming @mark-petersen isn't going to do a review? |
|
@milenaveneziani, just remove the "don't merge" tag and I'll take that as the sign it's ready to merge. There are a lot of commits in this branch. I think that's likely appropriate, but if you decide you want to rebase and squash some of the commits, that is also fine. |
3ba094d to
f27166d
Compare
|
@xylar: I addressed all your remaining comments, except for the one related to the failing for alpha8. |
|
Sounds good. I don't think alpha8 is a problem but if we find we should be supporting it, we can do a fix later on. I like the change to the color levels. It looks much less noisy. |
|
@milenaveneziani, I tried running a full test of the I was able to successfully run |
|
Thanks @xylar. Sorry about the permissions for the mapping file. So glad this is integrated now. |
|
Very nice work! |
|
one last thing about this one, about the slow performance of |
|
@milenaveneziani, please see #164 for the removal of the loop if you are interested. |







This PR is intended to accomplish the following:
add a general meridional_overturning_circulation script, which will eventually plot both the results of the MOC analysis member (if enabled) and/or the postprocessed MOC, for the global ocean and for a selected number of oceanic basins. Presently, it supports the computation and plotting of the postprocessed MOC only.
add a plot_vertical_section routine in the plotting module, meant to plot an MPAS vertical slice (as lat vs z, lon vs z, or spherical distance vs z) using the native MPAS grid (no vertical or horizontal interpolation).
modify plotting to make it PEP8 compliant.