Revamp main script functionality and associated examples #155
Conversation
`Calc.region` had been a dict of `{region.name: region}` items. But there is no need for this; we were never using the key, and we can always just grab the name from the object itself. Doing so made the main script overhaul easier.
Delete find_obj.py
+1 for both of these
we can come back to [...] bigger picture stuff later. Sound good?
Agreed; we could easily get caught in a lengthy discussion here, but for now let's hold off.
Still working through the PR; I need to head out now, but will return to this later today and tomorrow.
aospy/automate.py
Outdated
```python
elif name == 'default':
    return getattr(parent, defaults_attr_name).values()
else:
    return getattr(parent, attr_name)[name]
```
Not necessarily opposed, just want to confirm: does this mean that we need to switch from using lists to dictionaries when we specify the `models` attribute of a `Proj` object or the `runs` attribute of a `Model` object?
E.g. go from this:
```python
example_model = Model(
    name='example_model',
    grid_file_paths=('/path/to/files'),
    runs=[runs.control, runs.conv_off],
    default_runs=[]
)
```
to this:
```python
example_model = Model(
    name='example_model',
    grid_file_paths=('/path/to/files'),
    runs=dict(control=runs.control, conv_off=runs.conv_off),
    default_runs=dict()
)
```
I'm assuming this was a decision made to eliminate the need for the `find_obj.py` module? (I'm all for that decision, by the way.)
This is actually already taken care of via the `utils.io.dict_name_keys` function, so the list syntax still works.
On that point, the main script and example object library from these commits should work; try running the main script (from inside the example directory so that you don't have to install the object library).
Ah, I see this gets addressed automatically; should have noticed that, sorry. Should we add some checking to `utils.io.dict_name_keys` to make sure that there are no duplicate names in the list of provided core objects (at least warn, maybe raise an exception)?
For example, right now if you had two Runs with the same name attribute in the list provided, the first one would get overwritten by the second without the user knowing.
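To make the suggestion concrete, here is a minimal sketch of such a check. The helper name is hypothetical, and it assumes `dict_name_keys` receives an iterable of objects with a `name` attribute; the real signature in `utils.io` may differ:

```python
def dict_name_keys_checked(objs):
    """Map each object's `name` attribute to the object itself,
    raising if two objects share a name (sketch of the proposed check)."""
    names = [obj.name for obj in objs]
    duplicates = sorted({n for n in names if names.count(n) > 1})
    if duplicates:
        raise ValueError("Duplicate names found: {}".format(duplicates))
    return dict(zip(names, objs))
```

Raising (rather than warning) guarantees the silent-overwrite case described above can't happen unnoticed.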
On that point, the main script and example object library from these commits should work; try running the main script (from inside the example directory so that you don't have to install the object library).
Indeed, I can confirm this works!
Thanks a lot. Sorry again for the large scope and putting us into a bit of a time crunch.
@spencerahill I've finished making a first pass here -- overall I think it looks pretty good! Thanks for taking this on.
A few high-level questions came up (related to #3), but we might want to put those off until later.
examples/example_obj_lib.py
Outdated
```python
projects = {example_proj.name: example_proj}
variables = {var.name: var for var in [precip_largescale, precip_convective,
```
This is somewhat big picture, but I'll just note here that this requirement (creating a dictionary mapping Var name attributes to Var objects) breaks my current object library, because I have some Var objects that have the same name attribute, but are different Var objects (to accommodate cases where a variable is native in one model, but needs to be computed from other variables for output from another model, see #3 (comment)).
E.g. these two variables for OLR:
```python
olr = Var(
    name='olr',
    alt_names=('rlut',),
    units=units.W_m2,
    domain='atmos',
    description='All-sky outgoing longwave radiation at TOA.',
    def_time=True,
    def_vert=False,
    def_lat=True,
    def_lon=True
)

olr_imr = Var(
    name='olr',
    domain='atmos',
    description=('Outgoing longwave radiation'),
    variables=(swdn_sfc, vert_int_tdtsw_rad_imr, netrad_toa_imr),
    def_time=True,
    def_vert=False,
    def_lat=True,
    def_lon=True,
    func=calcs.idealized_moist_rad.energy.olr,
    units=units.W_m2
)
```
This will take some effort/thinking to address, and I don't think it's necessarily a blocker here. There's nothing that precludes me (or anyone else) from rolling their own solutions to a main script or object library (so until we have a better solution for #3 I may need to just continue using `find_obj.py` and my old main script to get things done).
On a more general note, this practice of requiring the user to create module-level dictionaries to set up their object libraries feels a little non-ideal; again though, without more thought I don't have a good alternative solution. For small object libraries it feels OK, but for larger ones (that get spread out across multiple submodules, like what we use in practice), these would need to be placed (somewhat hidden) in an `aospy_user/__init__.py` file.

For variables in particular, if one has a lot defined in a module (e.g. in `aospy_user/variables/__init__.py`), is there a straightforward pythonic way to create this dictionary without much typing?
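One possibility (purely illustrative; the helper name is hypothetical, not aospy API) is to scan a module's namespace with `vars()` and keep only instances of the relevant class:

```python
def objs_of_type_by_name(module, cls):
    """Collect every instance of `cls` defined at the top level of
    `module` into a {obj.name: obj} dict."""
    return {obj.name: obj for obj in vars(module).values()
            if isinstance(obj, cls)}
```

Usage might then look like `variables = objs_of_type_by_name(aospy_user.variables, Var)` in the package `__init__.py` (assuming each `Var` carries a `name` attribute).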
this requirement (creating a dictionary mapping Var name attributes to Var objects) breaks my current object library, because I have some Var objects that have the same name attribute, but are different Var objects
this practice of requiring the user to create module level dictionaries to set up their object libraries feels a little non-ideal
These are both good points and I definitely want to avoid breaking your workflow. Will have a separate comment and/or Issue on these...I have some ideas but need to clarify them.
aospy/automate.py
Outdated
```python
specs = self._specs_in.copy()
[specs.pop(core) for core in self._CORE_SPEC_NAMES]
# TODO: don't just pull from first project.
regions = _get_mult_objs_by_names(specs['regions'], self._projects[0],
```
Would this prevent a user from using a region that was not explicitly listed as a Region in a particular Proj object?
Yes that's a good point...I need to think about this
A related issue (although I don't see an immediate use-case) is if the same name was used by Region objects in different projects but with different region definitions. In that case this logic would incorrectly use the region definition for the first project in all cases.
same name was used by Region objects in different projects but with different region definitions
This would remain a problem even if e.g. we formed the union of regions defined across all projects. In that case the problem reduces to the same issue you raised above re: conflicting names of Var objects.
aospy/automate.py
Outdated
```python
def create_calcs(self):
    return _create_calcs(self._combine_core_aux_specs())

def _print_specs(self):
```
I think you'll need to update this function to use the `self._specs_in` dictionary, rather than the (now removed) `projects`, `models`, `runs`, etc. attributes, in order for it to run (though I'm guessing you're aware of this, since you've commented it out below).
Yes definitely
aospy/automate.py
Outdated
```python
             prompt_verify=False, verbose=True):
    """Generate and execute all specified computations."""
    calc_suite = CalcSuite(calc_suite_specs, obj_lib)
    # calc_suite._print_specs()
```
Uncomment this once `_print_specs()` is updated.
aospy/automate.py
Outdated
```python
    return _permute_aux_specs(specs)

def _combine_core_aux_specs(self):
    return _combine_core_aux_specs(self._permute_core_specs(),
```
It's kind of confusing to have two functions with the same name that are defined differently. Is there a good reason to define the module-level version of `_combine_core_aux_specs`? Could you simply implement what it does within this class-level method?
Maybe at the module level create a function called `_merge_dicts`, and then here you can write a one-liner using `itertools.product`? Would that work?
E.g. at the module level:
```python
def _merge_dicts(a, b):
    merged = a.copy()
    merged.update(b)
    return merged
```
and at the instance-level (here):
```python
def _combine_core_aux_specs(self):
    return [_merge_dicts(a, b) for a, b in itertools.product(
        self._permute_core_specs(), self._permute_aux_specs())]
```
I agree, I got carried away here with the module-level functions. The original motivation was for ease of testing and potential re-use of functionality elsewhere, but this logic seems pretty specific to CalcSuite.
aospy/automate.py
Outdated
```python
        defaults_attr_name) for name in names]


def _permute_aux_specs(specs):
```
I think you could write this function pretty compactly as a list-comprehension:
```python
def _permute_aux_specs(specs):
    return [dict(zip(specs.keys(), perm))
            for perm in itertools.product(*specs.values())]
```
Indeed, I can confirm this works!
Great!
For example, right now if you had two Runs with the same name attribute in the list provided, the first one would get overwritten by the second without the user knowing.
Good catch; will address.
I think you could write this function pretty compactly as a list-comprehension
Yes, will do, and likewise for the other list comprehensions you suggest.
aospy/automate.py
Outdated
```python
    return all_specs


def _create_calcs(specs):
```
I think this could also be written more compactly as a list comprehension.
aospy/automate.py
Outdated
```python
        return pool.map(lambda calc: calc.compute(), calcs)
    out = []
    for calc in calcs:
        try:
```
Could there be a way to encapsulate this try-except logic such that it could also be used above when we parallelize the calculations? That way it would ensure the same warning behavior through each pathway.
Good idea, fixing this makes that logic much cleaner
@spencerkclark thanks a lot for the review. Your concerns largely coincide with the parts I was most uncomfortable about, so at least we're on the same page about what needs improving. In terms of long-term solutions, I opened #156 to discuss the aspects re: how to register objects with minimal input by the users. The other big issue you note that I still need to think about is what the appropriate "parent" is of Vars and Regions. Will follow up with my thinking on that.
@spencerkclark I think I have managed to relax the prior onerous requirement of defining `{name: obj}` dicts for the projects and variables. CalcSuite now simply searches the object. However, now I'm realizing this still doesn't work with either of our existing setups, because we are importing a […]

Perhaps a good solution is a try/except case, wherein if `variables` is defined it is used, and if not then this new method of finding all the […] is used.

I added some initial tests; will add more. C.f. our convo offline, you're welcome to add more, but more important for me is your feedback on the overall approach.
Edit: typo in the commit message; should be 'variables'.

Just pushed this; let me know if it works for you.
Thanks for the updates; yes, I think this is a very good compromise. I will have a closer look at your changes tomorrow afternoon.

Great, thanks! I'll try to do some more testing in the meantime.
Just added some more tests. This is my first time using pytest; I immediately like it a lot, although I'm not certain I'm using it correctly. I feel like maybe I'm overdoing it with all of the […].

@spencerkclark since I know you've been using pytest recently, any comments on these aspects would be appreciated.
I have encountered another problem. When I execute aospy_main the first time, it works fine. But the second time, I'm getting an error related to the tar output: […]

If I delete the tar output and re-submit, it works fine. @spencerkclark does this happen for you? It's possibly a macOS thing, given the OSError and that we've not had this problem in the past with the tar output (I'm running from my MacBook Pro).
Thanks for adding some more tests! I'm still getting the hang of pytest as well, but I also think it's pretty awesome. I think your use of fixtures is actually what's causing your trouble with […]. Try something like:

```python
@pytest.mark.parametrize(
    ('type_', 'obj_lib', 'expected'),
    [(Var, examples, {condensation_rain, convection_rain, precip, ps, sphum}),
     (Proj, examples, {example_proj})])
def test_get_all_objs_of_type(type_, obj_lib, expected):
    actual = _get_all_objs_of_type(type_, obj_lib)
    assert expected == actual
```

and I think things should work. I'm taking a look at the rest of your changes now, and will see if I can fill in some gaps in the test coverage after.
aospy/automate.py
Outdated
```python
# Drop the "core" specifications, which are handled separately.
specs = self._specs_in.copy()
[specs.pop(core) for core in self._CORE_SPEC_NAMES]
specs['regions'] = self._get_regions()
```
I think this will always use every region defined by the user in their object library.
Also does this work if there is no region specified? (In other words if someone doesn't want to do a spatial reduction on result?)
Wow, both very good points, and same with `_get_variables` below.
aospy/automate.py
Outdated
```python
specs = self._specs_in.copy()
[specs.pop(core) for core in self._CORE_SPEC_NAMES]
specs['regions'] = self._get_regions()
specs['variables'] = self._get_variables()
```
(Similar to above) I think this will always use every variable defined by the user in their object library. Is that correct?
aospy/automate.py
Outdated
```python
def _get_variables(self):
    if hasattr(self._obj_lib, 'variables'):
        return self._obj_lib.variables
```
I've gone ahead and tried this in my object library; this line does get called, but it causes a problem because this function needs to return an iterable to be used in line 53.
The obvious change (which I tried) would be to return the result of `_get_all_objs_of_type(Var, self._obj_lib.variables)`, but this results in `automate` attempting to create calculations involving all the variables defined in my object library (ignoring the one I specified in the specifications), prompting my comments below. Is there a simple way to address this?
I've actually had something similar happen (prior to this PR), and not for tar output but for standard netCDF output. It only seems to occur for regional calculations. Here's a minimal working example:

```python
In [1]: import xarray as xr
In [2]: ds = xr.Dataset()
In [3]: ds['a'] = 1
In [4]: ds['b'] = 2
In [5]: ds.to_netcdf('test.nc')
In [6]: ds.to_netcdf('test.nc')
```

For some reason I can call […]

(This is on a GFDL workstation by the way, so not Mac OS specific.)
I'm not sure if this issue in xarray is related: pydata/xarray#1215
aospy/automate.py
Outdated
```python
def _get_regions(self):
    return [_get_all_objs_of_type(Region, self._obj_lib)]

def _get_variables(self):
```
Would rewriting this function to look like this be acceptable? This would basically revert the behavior back to using the python variable name for each Var (not its name attribute).
```python
def _get_variables(self):
    objs = getattr(self._obj_lib, 'variables', self._obj_lib).__dict__
    variables = {
        name: var for name, var in objs.items() if isinstance(var, Var)}
    names = self._specs_in['variables']
    return set([var for name, var in variables.items() if name in names])
```
This is pretty confusing though, since the name attribute is used for every other core object type (though that's also the way things are handled in the main script prior to this PR).
To be honest, I think for my particular situation/use-case, writing a similar but simpler kind of main script to handle permutation, one that asks the user to specify lists of actual aospy core objects (rather than strings), would be the way to go. It's up to you what you want to do here; I'm not suggesting we switch to that in the current PR. I'm more considering, in the medium term, writing that type of main script on my own, and potentially thinking about adding it to the aospy core later.
You're right, we should stop dinking around with this string silliness and use the actual objects. I have started in on this.
@spencerahill cool! Please let me know if you need any help.
@spencerkclark big thanks for this batch of comments. Sorry I wasn't able to get to them tonight; I will tomorrow for sure. Once again you've caught some important things.
(GitHub isn't letting me respond to this in-line for some reason.)

Darn. I kind of assumed that was the case but was hoping there was some trickery to avoid it. Oh well. Thanks for the commit fixing it.
Among other things, need to (re-) implement converting from 'default' date_ranges value to each Run's actual default range
@spencerkclark I've taken a stab at switching from names to objects and have managed to get back to roughly the same spot I was at before otherwise. I also added a "library" kwarg to the […]

Note that the […]

FYI I'll be away for the next hour or so, but this is my sole task otherwise tonight. Will try my best to get all outstanding problems addressed before tomorrow.
Yes, will do and will ping you |
@spencerkclark I'm still working through these; I'll ping you when I'm done, so don't worry about it in the meantime (I was going to push that same fix shortly).
- pretty print summary of requested calcs
- getattr rather than try/except
- user doesn't have to add extra list around 'default' for date_ranges, or around output_time_regional_reductions or regions

NOTE: I'm getting weird failures on test_permute_aux_specs; sometimes it fails, sometimes it doesn't, depending on the context (i.e. whether I call pytest on the whole package or just that test module). When it fails, basically the order of the two elements has been swapped somehow.
…/spencerahill/aospy into docs-update-install-instructions
@spencerkclark ok, I think I've addressed everything at this point, less some further tests of automate to bring the coverage up. However, see my note in the dd25efb commit re: the weird failing test. On Travis it's only failing for 2.7, but it's not simply a Python version issue; it happens for me on 3.5 sometimes, but not always.

In this test I want to say we are implicitly assuming that the order is consistent across the sets that go into producing the dictionaries (which is not guaranteed). Using sets is nice to automatically trim duplicates, but I wonder if we should just switch to lists for test simplicity, which always have the same order?
aospy/utils/io.py
Outdated
```python
        raise AttributeError(e)
    return objs


def to_dup_list(x, n, single_to_list=True):
```
I just realized we no longer use this (untested) function, so I think we can remove it
Good call, will delete
FYI, I recently learned of the vulture package, which finds unused code in a package; it could be useful to run periodically on aospy.
Yes, this is probably the right move. I was motivated by a vague desire to think more carefully about what data structures we are using in each place, but I guess in this case a list's order preservation is more useful than a set's prevention of duplicates (there is no built-in ordered set). Will do this shortly.
We actually do need sets/unordered containers after all, because the _get_all_objs_of_type list comprehension pulls items from __dict__, which depending on the Python version may or may not be ordered. So since we can't rely on an order there, we can't rely on an order in anything downstream of it or tests thereof.
aospy/test/test_automate.py
Outdated
```python
        return True
    assert len(actual) == len(expected)
    for act in actual:
        assert act in expected
```
But of course! Much cleaner solution here. Sorry for leading you down the wrong path.
Wow, this ended up being a headache! @spencerkclark automate is only at 78% coverage, but if it's alright with you I'd like to let it slide. Partly because I see a clear path forward for an improved long-term solution (#158, with more details to come). Partly because I'm burnt out on this module! Self-inflicted 😄
@spencerkclark good catch on the docstring. Just updated the what's new. Since all CI passed on the previous non-doc commits, we're green in my view. Any final concerns?
@spencerahill thanks again! Bear with me, let me see if I can write a simple test for […].
@spencerahill let me know if you're good with the test I added; if that checks out, and the AppVeyor build succeeds, I'm good to merge!

@spencerkclark done! Once again, huge thanks for catching my myriad mistakes on this and for all your other effort on it. Ironic that we went for a "quick fix", since it ended up being such an ordeal! Just how it goes sometimes, I guess 😄
Closes #151 and #152.
@spencerkclark I would really appreciate a review on this within the next few days if at all possible. (Sorry for the time crunch, and that I accidentally did this on the same branch as my last PR #153, so the diffs from that are also showing up below.) Thanks in advance!
This is a major overhaul of the functionality relating to the main script, and it also includes one minor change to the `Calc` API:

- `Calc.region` had been a dict of `{region.name: region}` items. But there is no need for this: we were never using the key, and we can always just grab the name from the object itself. Doing so made the main script overhaul easier.
- Allow multiple `projs` to be iterated over in the same multi-calc-generation call.
- Rename `main.py` to `aospy_main.py` and the `main()` function to `submit_mult_calcs()`.
- Make […] `submit_mult_calcs`, and thus the main script, more descriptive.
- Delete `find_obj.py` and most of the logic from `main.py`; move what remained of `main.py` to the new module `automate.py`.

I am not completely satisfied with my new approach in `automate.py`, which involves separating the specifications for `Calc` into "core" and "aux" categories. This whole business would be trivial (just call `itertools.product` on everything) were it not for our support of `'default'` and `'all'` for some of the specs.

Basically I think this is well suited to a more OOP approach: a `CalcSpec` abstract base class, with one implementation for each spec that gets passed to `Calc`. Each implementation would specify how it is to be handled, e.g. where to find its `'default'` or `'all'` values, or what data type to expect (e.g. `dtype_out_time` is a tuple that gets permuted over within a Calc, not across Calcs). This would make it easier to, for example, support `'default'` and `'all'` values for every spec.

BUT we are a bit time-crunched in terms of the pending blog post, and I've already ended up down the rabbit hole as it is. So, barring fundamental objections to the current approach, I'm looking for critiques there, and we can come back to this bigger-picture stuff later. Sound good?
This bigger-picture stuff notwithstanding, there are still some outstanding items: […]
Once this is locked in I will address #150, which I've already started on.
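To make the envisioned OOP idea concrete, here is a rough, purely illustrative sketch of what a `CalcSpec` abstract base class might look like. All names and behaviors here are hypothetical; nothing like this exists in aospy yet:

```python
from abc import ABC, abstractmethod

class CalcSpec(ABC):
    """One specification that gets passed to Calc (e.g. models, runs)."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def resolve(self, value, parent):
        """Expand shorthands such as 'default' or 'all' into the
        concrete objects this spec refers to on `parent`."""

class AuxSpec(CalcSpec):
    """A spec with no shorthands: values pass through unchanged."""

    def resolve(self, value, parent):
        return list(value)

class DefaultableSpec(CalcSpec):
    """A spec supporting 'default', looked up on the parent object."""

    def __init__(self, name, defaults_attr):
        super().__init__(name)
        self.defaults_attr = defaults_attr

    def resolve(self, value, parent):
        if value == 'default':
            return list(getattr(parent, self.defaults_attr))
        return list(value)
```

Each subclass would own the knowledge of where its `'default'`/`'all'` values live, so the permutation machinery could treat all specs uniformly.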