Add config filter parameter to simulate_factors() #14

Merged
merged 8 commits into westerndigitalcorporation:master from bgmerrell:master
May 5, 2017

Conversation


@bgmerrell bgmerrell commented May 1, 2017

A new optional config_filter parameter is added to the simulate_factors()
method. The config filter is a function which, when not None, is applied to
each config from a multi-factor simulation. Only the simulations for which
the config filter function returns true are run.

This helps with orchestrating the execution of multiple per-factor
simulations on separate machines (e.g., in the cloud) from a single
multi-factor command. For example, snowbird could pass the following
config_filter argument to simulate_factors():

def only_factor_filter_fn(cfg):
    return cfg['meta.sim.index'] == args.only_factor

In the case that args.only_factor = 1, only the simulation at factor index 1
would be executed, and the user would see the following artifacts (assuming
a workspace of ~/locker/test):

~/locker/test/1/config.yaml
~/locker/test/1/result.yaml
~/locker/test/1/sim.log
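
For illustration, a call using this filter might look like the following
(a minimal sketch; TopLevel, base_config, and the factor list are
hypothetical placeholders, not part of this change):

    from desmod.simulation import simulate_factors

    base_config = {'sim.duration': '10 ms', 'sim.workspace': '~/locker/test'}
    factors = [(['host.queue_depth'], [[4], [16]])]

    # Only configs for which the filter returns true are simulated.
    results = simulate_factors(base_config, factors, TopLevel,
                               config_filter=only_factor_filter_fn)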

Add a new optional `only_factor` parameter for the simulate_factors()
method.

If the only_factor parameter is specified, only a single simulation will
be run. The simulation that runs is the one for the factor at the index
specified by the value of the only_factor argument.

For example, if the snowbird command-line arguments look like this:

	meta.cmd_line: sb -f host.queue_depth 4,16 -f fm.fim.xfer_rate_MBps 400,800 -f fm.ase.t_wait_rd 4,5,6 -s duration 10 ms --only-factor 4

Then snowbird will pass in only_factor=4 as the argument to desmod and
the meta.sim.index and meta.sim.workspace will be set to 4 and
workspace/4 respectively.

Additionally, meta.sim.special will look like this:

    meta.sim.special:
    - [host.queue_depth, 4]
    - [fm.fim.xfer_rate_MBps, 800]
    - [fm.ase.t_wait_rd, 5]

That is, only a single simulation, for the factor indicated by
meta.sim.special, was run.

One use for this new feature is orchestrating the execution of multiple,
per-factor, simulations on separate machines (e.g., in the cloud) with a
single command (with multiple factors).
Some new config options have been added:
 - sim.workspace.s3_sync: When true, syncs any workspaces to s3
 - sim.workspace.s3_bucket: The s3 bucket to sync to
 - sim.workspace.s3_sync_prefix: A string to act as the root path in the
   s3 bucket.  By default, this is set to the user's username.

Here's an example of running snowbird with these new options:

$ sb -s sim.workspace ~/locker/test -s sim.workspace.s3_sync_prefix bgmerrell \
     -n test-rd-rnd -s sim.duration 10ms -s sim.workspace.s3_sync True \
     -f host.queue_depth 4,16

This would result in the following being synced to the default bucket in
s3:

/bgmerrell/test/0/config.yaml
/bgmerrell/test/0/result.yaml
/bgmerrell/test/0/sim.log
/bgmerrell/test/1/config.yaml
/bgmerrell/test/1/result.yaml
/bgmerrell/test/1/sim.log

The method to do the sync work is called when the _Workspace context
manager is exiting.

The uploading to s3 is done using the concurrent.futures package.  This
package isn't in the Python 2 standard library, so the requirements.txt
file has been updated to install the backport package for Python 2 only.
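
A rough sketch of the upload approach described above (the helper shape is
illustrative; only the config key names come from this change):

    from concurrent.futures import ThreadPoolExecutor

    import boto3


    def upload_artifacts(config, artifacts):
        """Concurrently upload workspace artifacts to S3 (illustrative sketch)."""
        client = boto3.client('s3')
        bucket = config['sim.workspace.s3_bucket']
        prefix = config['sim.workspace.s3_sync_prefix']

        def upload(artifact):
            # artifact paths from os.walk(os.curdir) start with './'; strip it
            dest = '/'.join([prefix, artifact[2:]])
            client.upload_file(artifact, bucket, dest)

        with ThreadPoolExecutor(max_workers=4) as executor:
            # list() forces iteration so any upload exception is raised here
            list(executor.map(upload, artifacts))
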
@bgmerrell bgmerrell closed this May 1, 2017
@bgmerrell bgmerrell reopened this May 1, 2017
coveralls commented May 1, 2017

Coverage decreased (-1.1%) to 94.747% when pulling 006567d on bgmerrell:master into 84edba4 on SanDisk-Open-Source:master.

@bgmerrell bgmerrell closed this May 1, 2017
@bgmerrell bgmerrell reopened this May 1, 2017
coveralls commented May 1, 2017

Coverage decreased (-1.1%) to 94.747% when pulling 006567d on bgmerrell:master into 84edba4 on SanDisk-Open-Source:master.

@jpgrayson jpgrayson (Collaborator) left a comment

I like this capability, but there are several wrinkles to smooth out.

os.chdir(self.orig_dir)

def sync(self):
if self.config['sim.workspace.s3_sync']:
Collaborator:

Within desmod, we want to avoid fatal effects if the user provides an under-specified config. And when the user under-specifies and desmod uses a default, we want the config dict to be updated to reflect the default that desmod used--i.e. desmod's implicit defaults become explicit.

To that end, we should use self.config.setdefault() to interrogate the config dict here.

W.r.t. the config key name, I also advocate that it be sim.workspace.s3_sync.enable. The ".enable" suffix has a lot of precedent in other config keys in the "sim" namespace. We could probably make a good case for some other naming strategies:
self.config.setdefault('sim.workspace.s3_sync.enable', False)
self.config.setdefault('sim.s3_sync.enable', False)
self.config.setdefault('sim.sync.s3.enable', False)

If we are going to have other workspace synchronization methods, then the last option creates a clean namespace.

N.B. that the two other new config items should fit into this scheme. E.g.:

sim.sync.s3.bucket
sim.sync.s3.prefix

Contributor Author:

self.config.setdefault('sim.sync.s3.enable', False)

Cool. My original approach was to use sim.workspace.sync.s3.enable, but I changed my mind because I thought it might match existing patterns better to be able to simply specify -s s3_sync True on the snowbird command-line. I'm happy to be wrong on this one.

One question: do you think I should include "workspace" in sim.workspace.sync.s3.enable now that the syncing takes place in _Workspace, or is that an unnecessary coupling?

Collaborator:

My bias is to use sim.sync.s3. as the base namespace. In addition to being decoupled from sim.workspace, it also ends up being less typing and a shallower hierarchy--both positives.

@@ -0,0 +1,59 @@
from concurrent.futures import ThreadPoolExecutor

import boto3
Collaborator:

We should defer importing boto3 until we know that S3 synchronization is enabled. That allows boto3 to be an optional dependency as opposed to a required dependency. The same holds for concurrent.futures.

My bias is that most people most of the time will not be using the S3 sync feature and thus keeping the requirements list shorter outweighs the additional burden this would put on users who do want to use S3 sync.

Note that we do deferred import (non import-time import) of the progressbar library as well.
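
As a minimal sketch of what the deferred import might look like (the helper
name and the sim.sync.s3.* keys follow the naming discussed in this review
and are otherwise hypothetical):

    def sync_to_s3(config, artifacts):
        if not config.setdefault('sim.sync.s3.enable', False):
            return
        # Deferred (non import-time) imports: boto3 and the concurrent.futures
        # backport only need to be installed when S3 sync is actually enabled.
        import boto3
        from concurrent.futures import ThreadPoolExecutor

        client = boto3.client('s3')
        bucket = config['sim.sync.s3.bucket']
        with ThreadPoolExecutor(max_workers=4) as executor:
            for artifact in artifacts:
                # artifact paths from os.walk(os.curdir) start with './'
                executor.submit(client.upload_file, artifact, bucket, artifact[2:])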

artifacts = []
for root, _, files in os.walk(os.curdir, topdown=False):
for filename in files:
artifacts.append(os.path.join(root, filename))
Collaborator:

It is a little curious that _Workspace has responsibility for walking the artifacts versus S3Sync taking that on itself. Is the idea that if we have multiple synchronization backends, we could reuse the same artifacts list?

Contributor Author:

I originally had this logic in workspacesync/s3.py and moved it into _Workspace for exactly the reason you asked about. Having the sync backends use the same artifacts input is nice for consistency, code reuse, and saving computational cycles. That said, I'm solving a problem that doesn't exist yet, so I'm happy to move it back to s3.py for now if you think that is better.

Collaborator:

<shrug> I think it's fine as-is.

username = getpass.getuser()
except:
username = "unknown"
config['sim.workspace.s3_sync_prefix'] = username
Collaborator:

I'm not sure this is the right policy. While it seems like certain desmod users might have a policy that requires organizing synchronized artifacts by username, I don't think it is universal enough to drive a default in desmod. Instead, I think that the default prefix should simply be "/" and users should have to compose their own prefix that incorporates usernames, timestamps, or other such metadata.

import os


class S3Sync(object):
Collaborator:

It seems like this could have been implemented as a regular function instead of as a class with a single method. I.e.

sync_to_s3(...)

versus:

S3Sync(...).sync()

I don't see it as imperative that we change this. It is not a user-facing API and we could thus revisit later if desired.

Contributor Author:

The main reason I went with this approach was to perhaps have a common set of methods for multiple backend synchronizers, which could register themselves and perhaps one day be called in a loop like so:

for cls in sync_registry:
    cls(config, artifacts).sync()

That said, that's a lot of hand waving, and we only have one sync backend now, so I probably shouldn't be solving problems that don't exist yet.

Collaborator:

Even without using classes, each synchronization backend module could register its sync() function. Lots of ways to skin this. As long as these synchronization backends are all driven by configuration and not by user-facing API, we retain lots of license to revisit this detail.

I do think there is at least one compelling case for having a class: extract and check the configuration before simulation.

os.path.split(self.workspace)[1],
(artifact[2:]))
self.client.upload_file(
artifact, self.config['sim.workspace.s3_sync_bucket'], dest)
Collaborator:

We should extract the required configuration prior to submitting to the thread pool. The config is user input and thus may have all manner of garbage in it.

A good case for retaining the class-oriented strategy here would be so that we can extract configuration in __init__() before simulation and then have confidence that the user provided valid configuration well before end of simulation.

class _Workspace(...):
    def __init__(self, ...):
        ...
        if config.setdefault('sim.sync.s3.enable', False):
            self.s3_sync = S3Sync(...)
        else:
            self.s3_sync = None
        ...
    def __exit__(self, ...):
        ...
        if self.s3_sync:
            self.s3_sync.sync()

class S3Sync(...):
    def __init__(self, config, ...):
        self.bucket = config['sim.sync.s3.bucket']  # user must provide bucket
        self.prefix = config.setdefault('sim.sync.s3.prefix', '/')
        ...

if len(configs) == 0:
raise ValueError('No factor at index {}'.format(only_factor))
assert len(configs) == 1, 'exactly one config expected when ' \
'--only_factor is used'
Collaborator:

The string associated with this assert presumes a --only_factor command line option, but that's not a valid assumption.

desmod/config.py Outdated
:param int only_factor:
When specified, only a single config will be yielded; the config will
contain only the single factor at the specified index of the list of
the cartesian product of the unrolled factors.
Collaborator:

I remain against plumbing this only_factor kwarg into factorial_config(). We can isolate this capability to simulate_factors(). I know that imposes a somewhat larger runtime cost, but I see that cost as marginal and the conceptual complexity of spreading this feature across two modules as the more important concern.

@@ -205,7 +220,7 @@ def simulate(config, top_type, env_type=SimEnvironment, reraise=True,


def simulate_factors(base_config, factors, top_type,
env_type=SimEnvironment, jobs=None):
env_type=SimEnvironment, jobs=None, only_factor=None):
Collaborator:

What if instead of having this new only_factor argument, we instead had a filter function? I.e. the user would pass in a function that evaluated whether each config was either included or not. E.g.:

simulate_factors(..., config_filter=lambda config: config['meta.sim.index'] == 42)

This would make this feature more general. I could, for example, envision this being useful for at least two other applications:

  • Selecting more than one index: config_filter=lambda c: c['meta.sim.index'] in [42, 142].
  • Sampling configs from a very large configuration space: config_filter=lambda c: random.random() < 0.1. I.e. pick a random 10% of the factorial configs.

I think it would be more reasonable to push this more general capability into factorial_config() than the specific only_factor capability. Unfortunately, I don't think it would afford the same performance optimization that plumbing only_factor affords.

N.B. if we pushed this filter function capability down to factorial_config(), that function would have to take responsibility for assigning meta.sim.index. We currently do that assignment in simulate_factors().
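
A rough sketch of how simulate_factors() could apply such a filter without
touching factorial_config() (the special key and workspace handling shown
here are assumptions based on the existing behavior):

    configs = list(factorial_config(base_config, factors, 'meta.sim.special'))
    for index, config in enumerate(configs):
        config['meta.sim.index'] = index
        config['meta.sim.workspace'] = os.path.join(
            config['sim.workspace'], str(index))
    if config_filter is not None:
        configs = [config for config in configs if config_filter(config)]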

* only_factor -> config_filter
* use setdefault() for some workspace sync config keys
* default s3 sync prefix is now ""
* don't import boto3 or concurrent.futures until needed.
* sim.workspace.s3_sync -> sim.sync.s3.enable (and similar for other s3
  sync-related keys)
@@ -5,6 +5,12 @@
from multiprocessing import cpu_count, Process, Queue
from pprint import pprint
from threading import Thread
try:
from future_builtins import filter
Collaborator:

We use six for Python 2/3 compatibility. To get filter, we can unconditionally:

from six.moves import filter

desmod/config.py Outdated
@@ -271,7 +271,7 @@ def factorial_config(base_config, factors, special_key=None):
for keys, values_list in factors:
unrolled_factors.append([(keys, values) for values in values_list])

for keys_values_lists in product(*unrolled_factors):
for idx, keys_values_lists in enumerate(product(*unrolled_factors)):
Collaborator:

Remove enumerate() to completely unwind the changes in this function.

os.chdir(self.config['sim.workspace'])
self.sync()
finally:
os.chdir(self.orig_dir)
Collaborator:

When __exit__() is called, we have a strong expectation that the current directory will be meta.sim.workspace. In multi-simulation cases, that would be something like ws/42. The new code here changes to the sim.workspace directory prior to calling sync(); i.e. just ws, not ws/42 where this simulation's artifacts reside.

I question whether it is right for sync() to operate outside of the individual simulation's workspace.

It seems like this paradigm of changing to the multi-sim session level directory (sim.workspace) can only work correctly if the new config_filter is used to limit the multi-simulation session to exactly one simulation. Otherwise, each of many simulations would call sync() such that they would all try to sync ws/0, ws/1, ..., ws/N.

To put a fine point on it, what I'm suggesting is that the control-flow here should be:

def __exit__(...):
    self.sync()
    os.chdir(self.orig_dir)

Sorry I didn't pick up on this in my first review.

@@ -124,11 +131,12 @@ def schedule(self, delay=0):
class _Workspace(object):
"""Context manager for workspace directory management."""
def __init__(self, config):
self.config = config
Collaborator:

I maintain that it is important to fully extract all configuration from config here in _Workspace.__init__(). Doing so would obviate the need to capture config with the self.config instance attribute. It would also demand that we instead instantiate an S3Sync instance. I.e.:

def __init__(self, config):
    self.s3_sync = S3Sync(config)
    ...
def sync(self):
    if self.s3_sync.is_enabled:
        ...
        self.s3_sync.sync(artifacts)

We also want S3Sync.__init__() to follow suit and extract all needed config as instance attributes such that the config dict is not retained as an instance attribute and is thus not referenced in S3Sync.sync().

artifact, self.config['sim.sync.s3.bucket'], dest)

def sync(self):
"""Concurrently upload the artifacts to s3."""
Collaborator:

A potential problem with sync() is that it does not sync--it only uploads new files.

I assume client.upload_file() will happily overwrite files without error.

What is missing is any concept of removing files that exist in S3, but are not in the list of artifacts to synchronize. We would ostensibly need to interrogate S3 with list_objects() to figure out which extra files (objects) need to be deleted.

However, doing that alone would leave extra per-simulation directories unaccounted for. E.g. using the same workspace, if I first run a multi-factor set of 10 simulations and then run again with only 5 simulations, even if we synchronized per-simulation directories 0..4 correctly, S3 would still contain bogus/stale directories 5..9.
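
For reference, a true sync would need something along these lines to prune
stale objects (an illustrative sketch using boto3's paginator; not part of
this PR):

    def delete_stale_objects(client, bucket, prefix, keep_keys):
        """Delete objects under prefix that are not in keep_keys."""
        stale = []
        paginator = client.get_paginator('list_objects_v2')
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            for obj in page.get('Contents', []):
                if obj['Key'] not in keep_keys:
                    stale.append({'Key': obj['Key']})
        # delete_objects accepts at most 1000 keys per request
        for start in range(0, len(stale), 1000):
            client.delete_objects(
                Bucket=bucket, Delete={'Objects': stale[start:start + 1000]})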

@@ -37,6 +37,7 @@ def config():
'sim.vcd.start_time': '',
'sim.vcd.stop_time': '',
'sim.workspace': 'workspace',
'sim.sync.s3.enable': False,
Collaborator:

Seems like since sim.sync.s3.enable defaults to False, we should not have had to add this here. Is it really needed?

The sync code will not live in desmod.  We've decided to take a
different approach.
coveralls commented May 5, 2017

Coverage increased (+0.03%) to 95.914% when pulling aa35322 on bgmerrell:master into 84edba4 on SanDisk-Open-Source:master.

@jpgrayson jpgrayson merged commit 3bd2ea2 into westerndigitalcorporation:master May 5, 2017
@bgmerrell bgmerrell changed the title Optionally sync workspaces to s3 Add config filter parameter to simulate_factors() May 5, 2017