support new bucketing logic #186

Kaitaan · 2018-06-22T21:00:39Z

add support for:

single variant experiments (one control, one variant)
multi variant experiments (multiple controls/variants)
feature rollout (single variant)

add support for: - single variant experiments (one control, one variant) - multi variant experiments (multiple controls/variants) - feature rollout (single variant)

mattknox · 2018-06-22T21:49:40Z

probably should include @pacejackson, given his previous work and r2 knowledge.

mattknox

Bunch of questions, possible documentation needed in some cases.

mattknox · 2018-06-25T18:11:50Z

baseplate/experiments/providers/__init__.py


    enabled = config.get("enabled", True)
-    if not enabled:
+    if now < start_ts or now > stop_ts:


either or both of start_ts and stop_ts could be None, I think, so this might raise.

if start_ts or stop_ts is None, there's an error that would be thrown a couple lines above (line 83).

Though, now that you mention it, if they're explicitly passed as None, that'll cause an issue. Fix inbound.

mattknox · 2018-06-25T18:42:28Z

baseplate/experiments/providers/__init__.py


    enabled = config.get("enabled", True)
-    if not enabled:
+    if now < start_ts or now > stop_ts:


I'd check for enabled first-a disabled experiment might lack one or both of start/stop.

I think start_ts and stop_ts should be required fields. Even if an experiment is disabled, it should still have configured start and end times.

Thoughts?

I believe that the initial intention was that it was always required.

mattknox · 2018-06-25T18:43:46Z

baseplate/experiments/providers/__init__.py

@@ -119,6 +127,35 @@ def parse_experiment(config):
            version=version,
            config=experiment_config,
        )
+    elif experiment_type == "multi_variant":
+        return MultiVariantExperiment.from_dict(


I'd probably assign the class to something like experiment_type and then return experiment_type.from_dict(...) to DRY this up

mattknox · 2018-06-25T18:44:29Z

baseplate/experiments/providers/feature_flag.py

@@ -7,7 +7,9 @@


 class FeatureFlag(R2Experiment):
-    """An experiment with a single variant "active".
+    """An experiment with a single variant "active". This type of experiment


might be useful to link to the design doc rationale for why. (For example, I don't know)

This repo is OS, so can't really link internal docs.

Reason for deprecation? Just because it's redundant. I'd like to stop supporting the legacy bucketing altogether for simplicity if possible.

I'm not opposed to deprecating this to try to simplify the config format/keep things more consistent. The "legacy" bucketing doesn't really affect anything here since there's only one variant, it can be any size.

My thinking is that it's basically a feature rollout, with 100%, but with specific targeting. That's going to be replaced with the feature_rollout option once targeting is in place (work in progress)

mattknox · 2018-06-25T18:46:28Z

baseplate/experiments/providers/multi_variant.py

+    """Basic experiment, handling more than two variants (typically two
+    controls, and multiple treatments).
+    This type of experiment does not guarantee that changing the variant size
+    will not rebucketing existing users.


will not rebucket.

mattknox · 2018-06-25T18:50:59Z

baseplate/experiments/providers/simple_experiment.py

+    def make_seed(self, seed_data):
+        id = seed_data.get('id')
+        name = seed_data.get('name')
+        version = seed_data.get('version')


version should NOT be used in the randomization. Shuffle_version should be, although it's fine to exclude it for now if you want.

oh, hmm. Since this is legacy, maybe that's OK? Still think it should not be, for consistency.

note that the "version" used in this case is whatever version is desired to be used for seed generation. I actually pass the "shuffle_version" value in here, since that's the specific type of version we want to use for making the seed. So it's actually doing the right thing, but I agree that it's actually a bit convoluted. Will clean this up a bit.

mattknox · 2018-06-25T18:52:38Z

baseplate/experiments/providers/single_variant.py

+
+        super(SingleVariantExperiment, self)._validate_variants(variants)
+
+        if len(variants) != 2:


probably want to verify that the sizes work out too, right?

what do you mean by "sizes work out"?

See comment below about super call, if you're referring to sum(sizes) < 1.0

mattknox · 2018-06-25T18:52:57Z

baseplate/experiments/providers/multi_variant.py

+
+        super(MultiVariantExperiment, self)._validate_variants(variants)
+
+        if len(variants) < 3:


sum(sizes) < 1.0 . ?

done by the super call

mattknox · 2018-06-25T18:54:51Z

baseplate/experiments/providers/simple_experiment.py

+        return True
+
+    def _is_enabled(self, **kwargs):
+        return self.start_ts


also the enabled field ,right?

couple things missing here. Fixed.

mattknox · 2018-06-25T18:55:36Z

baseplate/experiments/providers/single_variant.py

+                          any of the variants
+        """
+
+        if bucket < (self.variants[0]["size"] * self.num_buckets):


again, why * num_buckets?

pacejackson · 2018-06-25T22:15:45Z

It looks like your .vagrant stuff snuck into this PR, maybe we should add that to .gitignore?

Kaitaan · 2018-06-25T22:25:45Z

ah damn. Will remove, and add to my .gitignore.

pacejackson · 2018-06-25T23:35:36Z

baseplate/experiments/providers/__init__.py

-            version=version,
+            start_ts=start_ts,
+            stop_ts=stop_ts,
+            enabled=enabled,


Why pass this stuff to the experiment? This stuff should be handled by ForcedVariantExperiment-s rather than the other providers so you don't have to worry about it.

Setting aside how it was done in the old types, I disagree for a few reasons:

an experiment being disabled (or not started, or ended) doesn't make it a fundamentally different experiment. It just makes it disabled so that it doesn't bucket users.

if we don't pass this information in, then we can't use it in the experiment. I can see cases where we want to enforce different restrictions based on the type of experiment. For instance, multivariant may be required to run for 2 weeks, whereas a multi-armed bandit may be required to run longer.

If we don't pass this in, it makes it more complex to allow us to start an experiment in the future (or end one). Assume the following flow:
a) I create an experiment, set to start in an hour
b) a bucketing call is made, but the experiment hasn't started yet, so it gets created as a ForcedVariantExperiment. This experiment is then added to the experiment_cache
c) No further config changes happen for a day

Now it's going to be treated as a forced variant until the config has been re-parsed for some reason.

I still think that the current way this is handled is the way to go, not implementing it in the experiment provider:

While true, now you have to implement this logic in all of your experiment providers, the reason for putting this in the parsing logic and using ForcedVariantExperiments is so you don't have to re-implement the same "disable" logic every time.

But we don't need to use this right now, if we have some new type of experiment where we do need to use it in the future, we can pass it in/implement it there.

I'm not sure where an experiment would be cached for an hour, the only cache is local to the request, it's not shared between requests.

but that's fundamentally a part of the experiment. I'd be willing to agree that it could be in the base experiment class (I put it in the base class for a simple experiment for that reason), but the status, start, and end times of an experiment are features of an experiment, not features of an experiment parser.

You're right, in that the hypothetical situations are just that: hypothetical. I still think that the properties of an experiment (whether it's enabled, when it's meant to start and end) should be available to the experiment, not used to infer the type of experiment (that's what the experiment_type is for)

I may be showing my ignorance on some things here, but is there anything inherent in the baseplate.Experiments class that restricts its scope to request level? When you request bucketing on an experiment that hasn't been seen yet, its config is parsed, and the experiment object is added to this dict: https://github.com/reddit/baseplate/blob/master/baseplate/experiments/__init__.py#L67
That's the cache I'm referring to.

Generally speaking though, I think using an entirely new type of object for the sake of abstracting away the enabled/disabled/started/ended has more downsides than upsides.

I'm not sure which chain to respond to here but I wanted to hop in. I'm seeing a lot of discussion of "override in subclasses" and "implement in base classes". In general, I'd really strongly like to avoid 1) implementation inheritance, and 2) deep hierarchies of objects. This ends up being quite byzantine to follow (jumping back and forth between class definitions to follow a code path) and very brittle. Rather, it'd be good to encapsulate specific pieces of logic in functions or discrete objects we bring together with composition. Duplication by putting into multiple classes is definitely a bummer, but so is putting it into a base class instead. Would it make sense to put this kind of shared logic into an object that only handles expiration logic and then it can be included in any experiment type that cares?

ah! I see @pacejackson had a much better suggested structure for this here: https://github.com/reddit/baseplate/pull/186/files#r198285273 :)

As far as (3) goes, the Experiments objects are created by the ExperimentsContextFactory when you get a new request/context so you get a new cache on each request.

pacejackson · 2018-06-25T23:37:46Z

baseplate/experiments/providers/simple_experiment.py

+        self._validate_variants(self.variants)
+
+    def _get_seed(self):
+        return self.seed


This seems like a weird pattern, the thing you are returning is "public" but this is "private", I think the way to do this would be something like:

@property def seed(self): return self._seed

pacejackson · 2018-06-25T23:39:06Z

baseplate/experiments/providers/simple_experiment.py

+        self.variants = config.get("variants", [])
+
+        seed_data = {"id": id, "name": name, "shuffle_version": self.shuffle_version}
+        self.seed = config.get("bucket_seed", self.make_seed(seed_data))


I think the general preference is to do this parsing in the factory methods (i.e. from_dict) and pass everything into the constructor explicitly rather than having it done in the constructor.

Is that a python thing, a Reddit thing, or a "this is how it was done in here previously, so please stay consistent" thing? I realize that currently my from_dict method is pretty useless, but I'm just trying to determine the reasoning for needing that method at all is.

It's more of 3 thing, although that's not always consistent. You can see other examples of this in other parts of baseplate such as https://github.com/reddit/baseplate/blob/69eba73ad7493e8c4822677b391346b5a8ee7c38/baseplate/queue_consumer.py#L113

IMO, my reason for preferring this is I like the constructor to make it explicit what the dependencies of that object are. When you can just pass in a couple of things and build off of that, you have to dive into the constructor to try to figure that out.

pacejackson · 2018-06-25T23:40:19Z

baseplate/experiments/providers/simple_experiment.py

+        """Check if any of the kwargs override the variant. Functionality
+        to be built in the future. For now, overrides are not supported.
+        """
+        return None


IMO, we should leave code like this out if it's not being implemented, we can implement it when we need it rather than having code that doesn't actually do anything lying around until we do.

Normally, I'd agree. In this case, however, I think there's value in including it for 2 reasons:

to indicate that this code is coming shortly (it's currently in-progress)

by including the non-functionality here, anyone using this can easily override targeting and overrides in subclasses.

"coming shortly" is weird in fixed-release libraries though.

2 to 1. Removed.

pacejackson · 2018-06-25T23:40:28Z

baseplate/experiments/providers/simple_experiment.py

+        Advanced targeting functionality to be supported in the future.
+        For now, return True.
+        """
+        return True


Same comment as about _check_overrides.

See reply above

pacejackson · 2018-06-25T23:41:01Z

baseplate/experiments/providers/simple_experiment.py

+    def _is_enabled(self, **kwargs):
+        current_ts = time.time()
+
+        return (self.enabled and current_ts >= self.start_ts


I don't think the experiments should be checking this, this would be handled by the parser who would return a ForcedVariantExperiment if any of this is true.

See comment above

pacejackson · 2018-06-25T23:41:59Z

baseplate/experiments/providers/single_variant.py

+            stop_ts=stop_ts,
+            enabled=enabled,
+            config=config,
+        )


This seems like it's the same across the subclasses of SimpleExperiment, maybe we should just implement this there?

Yeah, could do. Fixed

pacejackson · 2018-06-25T23:48:01Z

baseplate/experiments/providers/single_variant.py

+        with the same bucket and varaints will result in the same answer
+
+        :param bucket -- an integer bucket representation
+        :param variants -- a dictionary of


This isn't a parameter.

pacejackson · 2018-06-25T23:48:27Z

baseplate/experiments/providers/single_variant.py

+            return self.variants[0]["name"]
+        elif bucket >= int(self.num_buckets
+                - (self.variants[1]["size"] * self.num_buckets)):
+            return self.variants[1]["name"]


I don't think we really need to implement this in every one of these subclasses, it seems like it's all the same logic, the only difference is how many variants are allowed.

The logic is actually a bit different, at least for the single_variant. Single variant buckets one variant at the high end of the bucketing range, and one at the low end. This is to allow those variants to change sizes without rebucketing.

Something to consider, most of what these different experiment types are doing is the same, the main differences are 1) the number of allowed variants and 2) exactly how variants are assigned to buckets. It might be cleaner to separate out that concern into it's own object VariantCollection or something like that. You can have one that allows multiple variants and assigns buckets the way that MultiVariantExperiment does and another that assigns buckets the way that SingleVariantExperiment does. This lets you decouple that from the rest of the Experiment and lets you build things more by composition rather than by subclassing.

pacejackson · 2018-06-25T23:49:25Z

baseplate/experiments/providers/__init__.py

+    'feature_flag': FeatureFlag,
+    'single_variant': SingleVariantExperiment,
+    'multi_variant': MultiVariantExperiment,
+    'feature_rollout': FeatureRollout,


maybe this type should be feature_flag_v2? Establish a versioning pattern rather than finding new names for the same thing?

The implication with a feature flag is "on/off (for a targeted group)" which isn't the case here. Even if targeted, there's no guarantee that a user will be bucketed in one or the other (unless the rollout percentage is 0 or 100)

That's fair, although that's not how the current FeatureFlags work in practice. I'm not opposed to changing the name if it fits better though.

Kaitaan · 2018-06-27T17:46:28Z

💇

Kaitaan · 2018-06-27T17:59:09Z

baseplate/experiments/providers/simple_experiment.py

+        """
+
+        if variants is None:
+            raise ValueError('Sum of all variants is greater than 100%')


This error message is wrong. Will fix.

spladug

thanks for taking this on and sorry for delay in reviewing! i think the biggest thing for me here is the heavy use of implementation inheritance. many of the other discussion points in here might sort themselves out if that's different.

spladug · 2018-06-28T14:40:06Z

baseplate/experiments/providers/__init__.py

-        )
+    start_ts = config.get("start_ts")
+    stop_ts = config.get("stop_ts")
+    if (start_ts is None or stop_ts is None):


nit: no parens around the if condition in python please!

fixed. This used to be multi-line.

this doesn't appear fixed in the current diff?

spladug · 2018-06-28T14:41:15Z

baseplate/experiments/providers/__init__.py

+    experiment_type = config.get("type")
+    if experiment_type:
+        experiment_type = experiment_type.lower()
+    experiment_id = config.get("id")


why change from ["id"] to .get("id")? with the assert on the next line all we've done is change the error from a fairly understandable KeyError: id to AssertionError in the case where the ID wasn't defined

Fair. Altered it to still raise a KeyError (let me know if you think there's a better exception type), but made the error message a bit more explicit.

spladug · 2018-06-28T14:44:07Z

baseplate/experiments/providers/__init__.py

-        return R2Experiment.from_dict(
+    experiment_class = type_class_map.get(experiment_type, ForcedVariantExperiment)
+
+    if experiment_type in ['r2', 'feature_flag']:


it feels really weird to have the type map and then still branch on types. it'd be cleaner to move this branching into factory functions so this part of the code doesn't continue to need to be updated any time new variants are added.

I've changed this a bit, so let me know if you still have objections here. I'm not sure I'm entirely following what you're suggesting though; this, for all intents and purposes is the factory function.

Keep in mind that the legacy types (r2, feature_flag, forced_variant) are slated for deprecation soon.

spladug · 2018-06-28T14:48:11Z

baseplate/experiments/providers/feature_flag.py

@@ -7,7 +7,9 @@


 class FeatureFlag(R2Experiment):
-    """An experiment with a single variant "active".
+    """An experiment with a single variant "active". This type of experiment


rather than overembiggening the summary line of the docstring, you can add a sphinx deprecation notice to the docs instead.

http://www.sphinx-doc.org/en/stable/markup/para.html#directive-deprecated

spladug · 2018-06-28T14:52:37Z

baseplate/experiments/providers/__init__.py

-            version=version,
+            start_ts=start_ts,
+            stop_ts=stop_ts,
+            enabled=enabled,


I'm not sure which chain to respond to here but I wanted to hop in. I'm seeing a lot of discussion of "override in subclasses" and "implement in base classes". In general, I'd really strongly like to avoid 1) implementation inheritance, and 2) deep hierarchies of objects. This ends up being quite byzantine to follow (jumping back and forth between class definitions to follow a code path) and very brittle. Rather, it'd be good to encapsulate specific pieces of logic in functions or discrete objects we bring together with composition. Duplication by putting into multiple classes is definitely a bummer, but so is putting it into a base class instead. Would it make sense to put this kind of shared logic into an object that only handles expiration logic and then it can be included in any experiment type that cares?

spladug · 2018-06-28T14:56:57Z

baseplate/experiments/providers/r2.py

@@ -14,7 +14,9 @@


 class R2Experiment(Experiment):
-    """A "legacy", r2-style experiment.
+    """A "legacy", r2-style experiment. This experiment type is deprecated,


same note about sphinx deprecation thing

spladug · 2018-06-28T14:57:28Z

baseplate/experiments/providers/simple_experiment.py

+
+
+class SimpleExperiment(Experiment):
+    """Base class for simple experiments. Note that this class shares a lot of


the stuff about sharing code would be better for a comment since it's irrelvant to someone using baseplate and just reading the docs but is important to someone working inside baseplate.

spladug · 2018-06-28T14:58:08Z

baseplate/experiments/providers/simple_experiment.py

+                self.bucket_val,
+                self.name,
+            )
+            return None


nit: can you put a blank between these ifs please? when jammed together it reads from bird's eye view as an if/elif

spladug · 2018-06-28T14:59:15Z

baseplate/experiments/providers/simple_experiment.py

+        """Check if any of the kwargs override the variant. Functionality
+        to be built in the future. For now, overrides are not supported.
+        """
+        return None


"coming shortly" is weird in fixed-release libraries though.

spladug · 2018-06-28T15:00:36Z

baseplate/experiments/providers/__init__.py

-            version=version,
+            start_ts=start_ts,
+            stop_ts=stop_ts,
+            enabled=enabled,


ah! I see @pacejackson had a much better suggested structure for this here: https://github.com/reddit/baseplate/pull/186/files#r198285273 :)

Kaitaan · 2018-06-29T21:23:23Z

CI probably failed on the tests that don't run locally. Will fix them up.

Kaitaan · 2018-06-29T22:47:45Z

that should do it.

💇‍♂️

Kaitaan · 2018-07-02T23:37:28Z

💇‍♂️
ping

pacejackson · 2018-07-02T23:19:57Z

baseplate/experiments/providers/variant_sets/base.py

@@ -0,0 +1,16 @@
+
+
+class VariantSet:


You'll need to add (object) to this:

class VariantSet(object)

You can leave it off in Python3 but in 2, not including it means you are an "old-style" class.

pacejackson · 2018-07-03T00:13:18Z

baseplate/experiments/providers/simple_experiment.py

+from .base import Experiment
+from ..._compat import long, iteritems
+
+from .variant_sets.single_variant_set import SingleVariantSet


🔕 I don't feel strongly about this, so I would like to see what others think as well. I feel like we can put variant_sets under experiments/ rather than experiments/providers/ and just leave experiments/providers/ for the provider classes.

I also don't have any strong opinions here. I just put it here because a variantset is a core piece of a provider.

seems reasonable to pop it up a level if it itself is not a provider

pacejackson · 2018-07-03T00:29:46Z

baseplate/experiments/providers/__init__.py

@@ -103,21 +123,25 @@ def parse_experiment(config):
        override = config.get("global_override")
        return ForcedVariantExperiment(override)

-    if experiment_type == "r2":
-        return R2Experiment.from_dict(
+    if experiment_type in legacy_type_class_map.keys():


You don't need the .keys() for this, in on a dict checks if the value is a key.

pacejackson · 2018-07-03T00:30:18Z

baseplate/experiments/providers/__init__.py

-    if experiment_type == "r2":
-        return R2Experiment.from_dict(
+    if experiment_type in legacy_type_class_map.keys():
+        experiment_class = legacy_type_class_map.get(experiment_type)


I would just use the [] syntax to access this rather than get since you already know that it's in the dict and it's a little bit more concise.

pacejackson · 2018-07-03T01:00:36Z

baseplate/experiments/providers/simple_experiment.py

+
+    def should_log_bucketing(self):
+        """ Default to true. Override if logging of eligibility events not required. """
+        return True


Should this change depending on variant_type?

good catch. Given that it's not inherent to a variant though, I'm going to punt the responsibility to the config to determine whether it wants to log them or not, defaulting to true. Added a param on instantiation

pacejackson · 2018-07-03T01:03:43Z

baseplate/experiments/providers/variant_sets/base.py

+        raise NotImplementedError
+
+    def validate_variants(self):
+        raise NotImplementedError


I'm not sure this needs to be part of the interface, it doesn't look like you're technically implementing it.

pacejackson · 2018-07-03T15:23:28Z

baseplate/experiments/providers/variant_sets/multi_variant_set.py

+        self.variants = variants
+        self.num_buckets = num_buckets
+
+    def __contains__(self, item):


🔕 If you check this often, it might be a good idea to just store a set of all of the variant names and return item in self._variant_names.

pacejackson · 2018-07-03T15:24:04Z

baseplate/experiments/providers/variant_sets/multi_variant_set.py

+            raise ValueError('No variants provided')
+
+        if len(variants) < 3:
+            raise ValueError("MultiVariant experiments expect two controls "


Should this also check that at least 2 of the variants are named control_X?

I don't think the system should enforce variant naming.

Makes sense, should the error message not reference "control" then? I feel like that implies that it is looking for variants called "control".

Makes sense. Changed.

Kaitaan · 2018-07-05T20:53:28Z

baseplate/experiments/providers/variant_sets/multi_variant_set.py

+from .base import VariantSet
+
+
+class MultiVariantSet(VariantSet):


pacejackson · 2018-07-03T15:27:29Z

tests/unit/experiments/providers/variant_sets/multi_variant_set_tests.py

+    def test_contains(self):
+        variant_set = create_multi_variant_set()
+
+        self.assertTrue("variant_2" in variant_set)


🔕 self.assertIn/self.assertNotIn work as well for this.

Kaitaan · 2018-07-05T21:50:20Z

💇‍♂️

pacejackson · 2018-07-05T21:09:58Z

baseplate/experiments/providers/__init__.py

+}
+
+
+simple_type_class_list = [


Can you make this a frozenset? I don't think we care about the ordering and it should make checking the type quicker.

pacejackson · 2018-07-05T21:24:47Z

baseplate/experiments/providers/simple_experiment.py

+
+class SimpleExperiment(Experiment):
+
+    def make_seed(self, seed_data):


This should probably be a @classmethod or just a function in this module.

also, maybe we should just pass in the different values as arguments to the function rather than as a dict?

Removed method altogether.

pacejackson · 2018-07-05T21:31:05Z

baseplate/experiments/providers/simple_experiment.py

+                 bucket_seed, bucket_val, enabled=True, log_bucketing=True,
+                 num_buckets=1000, **kwargs):
+        """
+        :param int id -- the experiment id. This should be unique.


I think these should be formatted as :param type name: rather than :param type name --

Also, we want to capitalize the first letter in each description.

pacejackson · 2018-07-05T21:33:51Z

baseplate/experiments/providers/simple_experiment.py

+          as a floating point value between 0 and 1.
+        :param bucket_seed -- if provided, this provides the seed for determining
+            which bucket a variant request lands in. Providing a consistent
+            bucket_seed will ensure a user is bucketed consistently.


Can you flesh this out a little more? I feel like the way this is currently written, it implies that the user will not be bucketed consistently which is not the case, it's still consistent, it's just based on a generated seed rather than a static one that you provide in the config.

pacejackson · 2018-07-05T21:34:11Z

baseplate/experiments/providers/simple_experiment.py

+        :param int num_buckets -- how many available buckets there are for
+            bucketing requests. This should match the num_buckets in the
+            provided VariantSet. The default value is 1000, which provides
+            a potential variant granularity of 0.1%


Some of these don't have a period at the end.

pacejackson · 2018-07-05T22:25:12Z

tests/unit/experiments/providers/variant_sets/multi_variant_set_tests.py

+logger = logging.getLogger(__name__)
+
+
+NUM_BUCKETS_DEF = 1000


If DEF supposed to be DEFAULT? Can you call it that so it's more clear?

pacejackson · 2018-07-05T22:34:05Z

tests/unit/experiments/providers/variant_sets/multi_variant_set_tests.py

+
+def create_multi_variant_set():
+    cfg = generate_variant_config()
+    return MultiVariantSet(cfg)


IIRC you can pass in the number of buckets, you should probably pass it in here, maybe making it an argument to create_multi_variant_set with a default value of NUM_BUCKETS_DEF so you don't have to worry about keeping those in sync with the default values.

pacejackson · 2018-07-05T22:36:13Z

tests/unit/experiments/providers/variant_sets/multi_variant_set_tests.py

+        self.assertEqual(len(variant_counts), 4)
+
+        for variant_count in variant_counts.values():
+            self.assertEqual(variant_count, 250)


Can you also add a test for the other extreme end of this, where you only have a single bucket i.e.:

[ {'name': 'active', 'size': 0.001}, {'name': 'control_1', 'size': 0}, {'name': 'control_2', 'size': 0}, ]

pacejackson · 2018-07-05T22:38:18Z

tests/unit/experiments/providers/variant_sets/rollout_variant_set_tests.py

+logger = logging.getLogger(__name__)
+
+
+NUM_BUCKETS_DEF = 1000


The comments in multi_variant_set_tests.py also apply here.

pacejackson · 2018-07-05T22:38:25Z

tests/unit/experiments/providers/variant_sets/single_variant_set_tests.py

+logger = logging.getLogger(__name__)
+
+
+NUM_BUCKETS_DEF = 1000


The comments in multi_variant_set_tests.py also apply here.

Kaitaan · 2018-07-06T00:14:32Z

💇‍♂️

pacejackson

Mostly just small nits over formatting, looks good overall.

pacejackson · 2018-07-06T01:56:17Z

baseplate/experiments/providers/__init__.py

+    if now < start_ts or now > stop_ts:
+        enabled = False
+
+    if not enabled and experiment_type in ["r2", "feature_flag"]:


You can write this as if not enabled and experiment_type in legacy_type_class_map:.

pacejackson · 2018-07-06T02:02:10Z

baseplate/experiments/providers/feature_flag.py

@@ -7,7 +7,10 @@


 class FeatureFlag(R2Experiment):
-    """An experiment with a single variant "active".
+    """.. deprecated:: 0.27
+    Use FeatureRollout instead.


Maybe these comments should say Use SimpleExperiment with X VariantStore?

pacejackson · 2018-07-06T02:09:53Z

baseplate/experiments/providers/simple_experiment.py

+    def _calculate_bucket(self, bucket_val):
+        """Sort something into one of self.num_buckets buckets.
+
+        :param bucket_val -- a string used for shifting the deterministic bucketing


This is missing the type and still has the -- rather than :.

pacejackson · 2018-07-06T02:10:06Z

baseplate/experiments/providers/simple_experiment.py

+        :return int -- a bucket, 0 <= bucket < self.num_buckets
+        """
+        # Mix the experiment seed with the bucket_val so the same users don't
+        # get bucketed into the same bucket for each experiment.


placed into the same bucket?

pacejackson · 2018-07-06T02:10:30Z

baseplate/experiments/providers/variant_sets/base.py

+
+    A VariantSet contains a set of experimental variants, as well as
+    their distributions. It is used by experiments to track which
+    variant a check is meant to return.


to track which bucket a variant is assigned to?

pacejackson · 2018-07-06T02:18:37Z

baseplate/experiments/providers/variant_sets/rollout_variant_set.py

+    """
+
+    def __init__(self, variants, num_buckets=1000):
+        """ :param variants -- array of dicts, each containing the keys 'name'


Formatting and param type.

pacejackson · 2018-07-06T02:18:59Z

baseplate/experiments/providers/variant_sets/rollout_variant_set.py

+        """Deterministically choose a percentage-based variant. Every call
+        with the same bucket and varaints will result in the same answer.
+
+        :param bucket -- an integer bucket representation


Formatting and param type.

pacejackson · 2018-07-06T02:19:09Z

baseplate/experiments/providers/variant_sets/single_variant_set.py

+    """
+
+    def __init__(self, variants, num_buckets=1000):
+        """ :param variants -- array of dicts, each containing the keys 'name'


Formatting and param type.

pacejackson · 2018-07-06T02:19:29Z

baseplate/experiments/providers/variant_sets/single_variant_set.py

+        """Deterministically choose a variant. Every call with the same bucket
+        on one instance will result in the same answer
+
+        :param bucket -- an integer bucket representation


Formatting and param type.

tests/unit/experiments/providers/simple_experiment_tests.py

+                bucketing_changed = True
+                break
+
+        self.assertTrue(bucketing_changed)


pacejackson

Just a few small comments/spelling fixes. LGTM.

pacejackson · 2018-07-07T00:03:29Z

baseplate/experiments/providers/__init__.py

+        experiment_type = experiment_type.lower()
+    experiment_id = config.get("id")
+    if not isinstance(experiment_id, int):
+        raise KeyError("Integer id must be provided for experiment.")


It seems weird to raise a KeyError if experiment_id happens to be a string or something like that, KeyError only really makes sense if "id" isn't in the config so experiment_id would be None.

TypeError would be the more appropriate built-in exception

pacejackson · 2018-07-07T00:22:29Z

baseplate/experiments/providers/variant_sets/multi_variant_set.py

+            raise ValueError("MultiVariant experiments expect three or "
+                "more variants.")
+
+        total_size = 0.0


I think you want to make this an total_size = 0 so it doesn't get turned into a float.

pacejackson · 2018-07-07T00:33:07Z

baseplate/experiments/providers/variant_sets/rollout_variant_set.py

+
+    def choose_variant(self, bucket):
+        """Deterministically choose a percentage-based variant. Every call
+        with the same bucket and varaints will result in the same answer.


spladug

awesome, this looks great. thanks for splitting up the variant stuff from the providers, that's super clean now.

just a few remaining nits and andrew's comments.

spladug · 2018-07-09T17:08:43Z

baseplate/experiments/providers/__init__.py

-        )
+    start_ts = config.get("start_ts")
+    stop_ts = config.get("stop_ts")
+    if (start_ts is None or stop_ts is None):


this doesn't appear fixed in the current diff?

spladug · 2018-07-09T17:09:22Z

baseplate/experiments/providers/__init__.py

+    if (start_ts is None or stop_ts is None):
+        if "expires" in config:
+            warn_deprecated(
+                "The 'expires' field is in experiment %s deprecated, you should "


I think that "is" needs to move after the %s

spladug · 2018-07-09T17:11:17Z

baseplate/experiments/providers/feature_flag.py

@@ -7,7 +7,10 @@


 class FeatureFlag(R2Experiment):
-    """An experiment with a single variant "active".
+    """.. deprecated:: 0.27


the deprecated thing should go near the bottom of the docstring, not replace the summary line. you can run make docs in the vm and navigate to http://baseplate.local/html to see what the output of sphinx looks like to make sure it all makes sense.

Would it make sense to have it at the top of the docstring, but not replace the summary line? Seems to me the first thing you'd want to know when reading a block of docs is that the class is deprecated?

spladug · 2018-07-09T17:12:57Z

baseplate/experiments/providers/simple_experiment.py

+from .base import Experiment
+from ..._compat import long, iteritems
+
+from .variant_sets.single_variant_set import SingleVariantSet


seems reasonable to pop it up a level if it itself is not a provider

Kaitaan · 2018-07-10T20:53:13Z

💇‍♂️

Kaitaan added 2 commits June 22, 2018 13:17

support new bucketing logic

d43d8d7

add support for: - single variant experiments (one control, one variant) - multi variant experiments (multiple controls/variants) - feature rollout (single variant)

FIXUP: formatting cleanup

dad3d46

Kaitaan requested review from mattknox and spladug June 22, 2018 21:00

Kaitaan requested a review from pacejackson June 22, 2018 21:56

mattknox requested changes Jun 25, 2018

View reviewed changes

FIXUP

6ad30e3

Kaitaan added 2 commits June 25, 2018 15:26

FIXUP: remove vagrant stuff

d09c3b9

add .vagrant to gitignore

cacdb07

pacejackson requested changes Jun 25, 2018

View reviewed changes

Kaitaan added 3 commits June 26, 2018 11:40

FIXUP: Address some PR feedback

1ec70b5

FIXUP: test fix

d65475f

FIXUP: PR feedback

20c3bdd

Kaitaan commented Jun 27, 2018

View reviewed changes

spladug suggested changes Jun 28, 2018

View reviewed changes

FIXUP: restructure for composition variants

04571ec

FIXUP: fix failing test

7179c0d

FIXUP: typo in test class name

dc57193

FIXUP: remove autogen integration tests

e194adb

pacejackson requested changes Jul 3, 2018

View reviewed changes

Kaitaan added 2 commits July 5, 2018 13:45

FIXUP: PR feedback

2cfeb67

FIXUP: change error message

45e78f6

pacejackson requested changes Jul 5, 2018

View reviewed changes

FIXUP: Address PR feedback

9abc921

pacejackson requested changes Jul 6, 2018

View reviewed changes

FIXUP: PR feedback

031e8e8

pacejackson approved these changes Jul 7, 2018

View reviewed changes

spladug suggested changes Jul 9, 2018

View reviewed changes

FIXUP: PR feedback

e64735f

fix typo

bb1da7a

pacejackson approved these changes Jul 10, 2018

View reviewed changes

spladug approved these changes Jul 11, 2018

View reviewed changes

Kaitaan merged commit ebdef7d into master Jul 16, 2018

Kaitaan deleted the new_experiment_types branch July 18, 2018 21:54


		super(SingleVariantExperiment, self)._validate_variants(variants)

		if len(variants) != 2:


		super(MultiVariantExperiment, self)._validate_variants(variants)

		if len(variants) < 3:



		class SimpleExperiment(Experiment):
		"""Base class for simple experiments. Note that this class shares a lot of

		from .base import VariantSet


		class MultiVariantSet(VariantSet):


		class SimpleExperiment(Experiment):

		def make_seed(self, seed_data):

support new bucketing logic #186

support new bucketing logic #186

Conversation

Kaitaan commented Jun 22, 2018

mattknox commented Jun 22, 2018

mattknox left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pacejackson commented Jun 25, 2018

Kaitaan commented Jun 25, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Kaitaan commented Jun 27, 2018

Choose a reason for hiding this comment

spladug left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment