Add autoclose argument to xarray.open_mfdataset call #151
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,18 +1,18 @@ | ||
| ## This file contains the default values of all possible configuration options | ||
| ## used to run analysis tasks. Do not modify options in this file directly. | ||
| ## This file contains the default values of all possible configuration options | ||
| ## used to run analysis tasks. Do not modify options in this file directly. | ||
| ## Instead, follow this procedure: | ||
| ## 1. Create an empty config file (say config.myrun) or copy one of the | ||
| ## example files in the configs directory. | ||
| ## 2. Copy and modify any config options you want to change from this file into | ||
| ## your new config file. Make sure they have the right section name | ||
| ## (e.g. [run] or [output]). If nothing else, you will need to set | ||
| ## your new config file. Make sure they have the right section name | ||
| ## (e.g. [run] or [output]). If nothing else, you will need to set | ||
| ## baseDirectory under [output] to the folder where output should be stored. | ||
| ## 3. run: ./run_analysis.py config.myrun. This will read the configuration | ||
| ## first from this file and then replace that configuration with any | ||
| ## changes from config.myrun | ||
| ## 4. If you want to run a subset of the analysis, you can either set the | ||
| ## generate option under [output] in your config file or use the | ||
| ## --generate flag on the command line. See the comments for 'generate' | ||
| ## 4. If you want to run a subset of the analysis, you can either set the | ||
| ## generate option under [output] in your config file or use the | ||
| ## --generate flag on the command line. See the comments for 'generate' | ||
| ## in the '[output]' section below for more details on this option. | ||
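The procedure in the header comment above (read the defaults first, then let the user's file override them) can be sketched with Python's standard `configparser`. This is only an illustration under stated assumptions: the project uses its own `MpasConfigParser`, whose behavior is not shown in this diff, and the helper name `load_layered_config`, the inline config strings, and the placeholder paths are all hypothetical.

```python
import configparser

# Hypothetical stand-in for the defaults file; the values are placeholders.
DEFAULT_TEXT = """
[output]
baseDirectory = /dir/for/analysis/output

[indexNino34]
regionIndicesToPlot = 5
"""

# Hypothetical stand-in for config.myrun: only the options being changed.
USER_TEXT = """
[output]
baseDirectory = /tmp/myrun
"""


def load_layered_config(default_text, user_text):
    """Read defaults first, then overlay the user's options on top."""
    config = configparser.ConfigParser()
    config.read_string(default_text)
    # Options read later replace earlier values in the same section;
    # options the user did not mention keep their defaults.
    config.read_string(user_text)
    return config


config = load_layered_config(DEFAULT_TEXT, USER_TEXT)
```

With this layering, the user file only needs the options that differ from the defaults, which is why the header asks users to leave the defaults file untouched.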
|
|
||
|
|
||
|
|
@@ -47,6 +47,15 @@ seaIceStreamsFileName = streams.cice | |
| # names of ocean and sea ice meshes (e.g. EC60to30, QU240, RRS30to10, etc.) | ||
| mpasMeshName = mesh | ||
|
|
||
| # The system has a limit to how many files can be open at one time. By | ||
| # default, xarray attempts to open all files in a data set simultaneously. | ||
| # A new option allows files to be automatically closed as a data set is being | ||
| # read to prevent hitting this limit. Here, you can set what fraction of the | ||
| # system limit of open files an analysis task is allowed to use. Note: In the | ||
|
Collaborator
Is there a typo somewhere on the line above? I do not understand the sentence starting with
Collaborator
Author
Looks fine to me, but I'm happy to clarify. autocloseFileLimitFraction is multiplied by the system limit on the number of files that can be opened. The result is the maximum number of files that we allow xarray to open before we use autoclose. Would the following be clearer? "Here, you can set the fraction of the system limit that xarray is allowed to use to open data sets at any given time." If not, please suggest an alternative wording.
Collaborator
Thanks for clarifying. Yes, the new wording is clearer.
||
| # future when multiple tasks can run simultaneously, the system file limit will | ||
| # first be divided among the tasks before applying this fraction. | ||
| autocloseFileLimitFraction = 0.5 | ||
|
|
||
| [output] | ||
| ## options related to writing out plots, intermediate cached data sets, logs, | ||
| ## etc. | ||
|
|
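The review thread in this hunk describes the rule behind `autocloseFileLimitFraction`: the fraction is multiplied by the system limit on open files, and autoclose is used once a data set has more files than that. A minimal sketch of that rule follows, assuming the system limit is the soft `RLIMIT_NOFILE` value reported by Python's Unix-only `resource` module; the function name `should_autoclose` and the `soft_limit` override are hypothetical and not taken from this PR.

```python
import resource


def should_autoclose(file_count, limit_fraction, soft_limit=None):
    """Return True if opening file_count files at once would exceed the
    allowed share of the system's open-file limit."""
    if soft_limit is None:
        # Assumption: the "system limit" is the soft RLIMIT_NOFILE value.
        soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit == resource.RLIM_INFINITY:
        # No limit configured, so there is nothing to protect against.
        return False
    max_open_files = int(soft_limit * limit_fraction)
    return file_count > max_open_files
```

Going by the PR title, the resulting boolean would be passed as the `autoclose` argument of `xarray.open_mfdataset`. With a soft limit of 1024 and the default fraction of 0.5, data sets of up to 512 files would be opened normally and larger ones would use autoclose.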
@@ -140,7 +149,7 @@ endYear = 9999 | |
| # like start_year = 1 and end_year = 9999 will be clipped to the valid range | ||
| # of years, and is a good way of ensuring that all values are used. | ||
| # For valid statistics, index times should include at least 30 years | ||
| startYear = 1 | ||
| startYear = 1 | ||
| endYear = 9999 | ||
|
|
||
| [oceanObservations] | ||
|
|
@@ -293,7 +302,7 @@ movingAveragePoints = 12 | |
| ## options related to plotting time series of the El Nino 3.4 index | ||
|
|
||
| # Specified region for the Nino Index, 5 = Nino34, 3 = Nino3, 4 = Nino4 | ||
| # The indexNino34 routine only accepts one value at a time, | ||
| # The indexNino34 routine only accepts one value at a time, | ||
| # regionIndicesToPlot should be an integer | ||
| regionIndicesToPlot = 5 | ||
|
|
||
|
|
@@ -312,7 +321,7 @@ regionIndicesToPlot = [6] | |
| movingAveragePoints = 12 | ||
|
|
||
| [streamfunctionMOC] | ||
| ## options related to plotting the streamfunction of the meridional overturning | ||
| ## options related to plotting the streamfunction of the meridional overturning | ||
| ## circulation (MOC) | ||
|
|
||
| # Region names for basin MOC calculation. | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -22,8 +22,7 @@ | |
| from ..shared.generalized_reader.generalized_reader \ | ||
| import open_multifile_dataset | ||
|
|
||
| from ..shared.timekeeping.utility import get_simulation_start_time, \ | ||
| date_to_days, days_to_datetime | ||
| from ..shared.timekeeping.utility import get_simulation_start_time | ||
|
Collaborator
Author
unused functions
||
|
|
||
|
|
||
| def nino34_index(config, streamMap=None, variableMap=None): | ||
|
|
@@ -80,14 +79,14 @@ def nino34_index(config, streamMap=None, variableMap=None): | |
| mainRunName = config.get('runs', 'mainRunName') | ||
| plotsDirectory = buildConfigFullPath(config, 'output', 'plotsSubdirectory') | ||
|
|
||
| plotTitles = config.getExpression('regions', 'plotTitles') | ||
|
Collaborator
Author
unused variable
||
| # regionIndex should correspond to NINO34 in surface weighted Average AM | ||
| regionIndex = config.getint('indexNino34', 'regionIndicesToPlot') | ||
|
|
||
| # Load data: | ||
| varList = ['avgSurfaceTemperature'] | ||
| ds = open_multifile_dataset(fileNames=fileNames, | ||
| calendar=calendar, | ||
| config=config, | ||
| simulationStartTime=simulationStartTime, | ||
| timeVariableName='Time', | ||
| variableList=varList, | ||
|
|
@@ -100,10 +99,8 @@ def nino34_index(config, streamMap=None, variableMap=None): | |
| nino34 = compute_nino34_index(SSTregions[:, regionIndex], config) | ||
|
|
||
| print ' Computing NINO3.4 power spectra...' | ||
| f, spectra, conf99, conf95, redNoise = compute_nino34_spectra(nino34, config) | ||
|
|
||
| start = days_to_datetime(np.amin(ds.Time.min()), calendar=calendar) | ||
| end = days_to_datetime(np.amax(ds.Time.max()), calendar=calendar) | ||
| f, spectra, conf99, conf95, redNoise = compute_nino34_spectra(nino34, | ||
| config) | ||
|
|
||
|
Collaborator
Author
unused variables and long line
||
| # Convert frequencies to period in years | ||
| f = 1.0 / (constants.eps + f*constants.sec_per_year) | ||
|
|
@@ -122,11 +119,12 @@ def nino34_index(config, streamMap=None, variableMap=None): | |
| def compute_nino34_index(regionSST, config): | ||
| # {{{ | ||
| """ | ||
| Computes nino34 index time series. It follows the standard nino34 algorithm, i.e., | ||
| Computes nino34 index time series. It follows the standard nino34 | ||
| algorithm, i.e., | ||
|
Collaborator
Author
long line
||
|
|
||
| 1) Compute monthly average SST in the region | ||
| 2) Computes anomalous SST | ||
| 3) Performs a 5 month running mean over the anomalies | ||
| 1. Compute monthly average SST in the region | ||
| 2. Computes anomalous SST | ||
| 3. Performs a 5 month running mean over the anomalies | ||
|
Collaborator
Author
better output formatting
||
|
|
||
| This routine requires regionSST to be the SSTs in the nino3.4 region ONLY. | ||
| It is defined as lat > -5S and lat < 5N and lon > 190E and lon < 240E. | ||
|
|
@@ -136,7 +134,8 @@ def compute_nino34_index(regionSST, config): | |
| regionSST : xarray.dataArray object | ||
| values of SST in the nino region | ||
|
|
||
| config : instance of the MPAS configParser | ||
| config : MpasConfigParser object | ||
| the config options | ||
|
Collaborator
Author
better comment
||
|
|
||
| Returns | ||
| ------- | ||
|
|
@@ -159,7 +158,8 @@ def compute_nino34_index(regionSST, config): | |
| calendar = namelist.get('config_calendar_type') | ||
|
|
||
| # Compute monthly average and anomaly of climatology of SST | ||
| monthlyClimatology = climatology.compute_monthly_climatology(regionSST, calendar) | ||
| monthlyClimatology = climatology.compute_monthly_climatology(regionSST, | ||
| calendar) | ||
| anomalySST = regionSST.groupby('month') - monthlyClimatology | ||
|
Collaborator
Author
long line
||
|
|
||
| return _running_mean(anomalySST.to_pandas()) | ||
|
|
||
Here and elsewhere in this file, it is just automatically deleting trailing whitespace. No other changes were made.
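For readers following the `compute_nino34_index` docstring in the diff, the three steps (monthly average SST, anomalies, 5-month running mean) can be sketched in plain Python. This is an illustration only: the real routine works on xarray and pandas objects through `climatology.compute_monthly_climatology` and `_running_mean`, neither of which appears in this diff. The function name `sketch_nino34_index`, the assumption that the input starts in January, and the centered window with undefined edges are all assumptions made for the example.

```python
def sketch_nino34_index(monthly_sst, window=5):
    """Toy NINO3.4 index from a list of regional-mean SSTs, one value per
    month, assumed to start in January and to cover at least one year."""
    n = len(monthly_sst)
    if n < 12:
        raise ValueError("need at least 12 monthly values")

    # Step 1: monthly climatology (mean of all Januaries, all Februaries, ...)
    climatology = []
    for month in range(12):
        values = monthly_sst[month::12]
        climatology.append(sum(values) / float(len(values)))

    # Step 2: anomaly of each month relative to its climatological mean
    anomalies = [sst - climatology[i % 12]
                 for i, sst in enumerate(monthly_sst)]

    # Step 3: centered running mean over the anomalies; edges where the
    # window does not fit are left undefined (assumption, marked as None)
    half = window // 2
    index = []
    for i in range(n):
        if i < half or i + half >= n:
            index.append(None)
        else:
            chunk = anomalies[i - half:i + half + 1]
            index.append(sum(chunk) / float(window))
    return index
```

For example, two years of data with the same seasonal cycle, where the first year is 1 degree colder and the second 1 degree warmer than the mean, give an index that moves from -1 to +1 across the year boundary, with the seasonal cycle itself removed.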