perf: Speed up local-evaluation hot path for large environments #199
Draft
Addresses #198: for a 262-feature environment, local evaluation is 1000x faster for the common case (no variants / no overrides) and 8-60% faster overall depending on scenario.

Changes:
- Cache the output of `get_environment_flags()`. The evaluation context is immutable between environment refreshes, so we rebuild it once on update instead of once per call. Invalidated via a property setter on `_evaluation_context` so existing direct-assignment call sites (tests, offline mode) keep working transparently.
- Short-circuit `get_identity_flags()` to the cached environment Flags when the environment has no features with variants and no segments with overrides. In that case identity flags are guaranteed to equal environment flags. A fresh `Flags` wrapper is allocated around the cached flag dict to preserve identity metadata for pipeline analytics.
- Precompute `identity.key` in `map_context_and_identity_data_to_context` so flag-engine's `get_enriched_context` becomes a no-op instead of performing a shallow context copy on every call.
- Pre-sort multivariate variants once at env-load time; Timsort on an already-sorted list is a fast path compared to resorting per call.
- Add `__slots__` to `Flag` / `DefaultFlag` / `BaseFlag` on Python 3.10+ and inline the per-flag construction inside `Flags.from_evaluation_result` to skip the redundant helper call and truthiness check.

Also adds a `benchmarks/` harness (issue-#198 scenario, variant-density knobs, cProfile mode) so future regressions can be caught locally before release.

Two tests that asserted on the engine's internal call shape were rewritten to assert on actual flag values instead.

beep boop
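The caching and short-circuit described in this commit message can be sketched roughly as follows. This is a toy model, not the SDK's code: the `Client` and `Flags` classes, the `builds` counter, and the dict-shaped context with `features` / `overrides` keys are all assumptions made for illustration. The only part taken from the description is the idea itself: rebuild once per context assignment via a property setter, and hand out a fresh wrapper around the shared dict when no overrides exist.

```python
from __future__ import annotations

from typing import Any, Optional


class Flags:
    """Thin per-call wrapper around a (possibly shared) flag dict."""

    def __init__(self, flags: dict[str, Any], identity: Optional[str] = None) -> None:
        self.flags = flags
        self.identity = identity


class Client:
    """Hypothetical client; only the cache/invalidation shape matters here."""

    def __init__(self) -> None:
        self.builds = 0  # how many times the flag dict was rebuilt
        self._context: Optional[dict[str, Any]] = None
        self._cached_flags: Optional[dict[str, Any]] = None

    @property
    def _evaluation_context(self) -> Optional[dict[str, Any]]:
        return self._context

    @_evaluation_context.setter
    def _evaluation_context(self, value: Optional[dict[str, Any]]) -> None:
        # Any assignment (refresh, test, offline mode) drops the cache.
        self._context = value
        self._cached_flags = None

    def _build_flags(self) -> dict[str, Any]:
        self.builds += 1
        return dict((self._context or {}).get("features", {}))

    def get_environment_flags(self) -> Flags:
        if self._cached_flags is None:
            self._cached_flags = self._build_flags()
        return Flags(self._cached_flags)

    def get_identity_flags(self, identifier: str) -> Flags:
        env_flags = self.get_environment_flags().flags
        overrides = (self._context or {}).get("overrides", {})
        if not overrides:
            # Fast path: share the cached dict, allocate only the wrapper.
            return Flags(env_flags, identity=identifier)
        # Slow path (toy stand-in for a full engine evaluation).
        merged = {**env_flags, **overrides.get(identifier, {})}
        return Flags(merged, identity=identifier)
```

In this sketch, two calls to `get_identity_flags` on an override-free context return distinct `Flags` objects whose `.flags` attribute is the same dict, and assigning a new context through the property forces exactly one rebuild on the next call.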
CI runs ``mypy --strict .`` which covers the new ``benchmarks/`` dir and is stricter than the default invocation.

Fixes:
- ``flagsmith/mappers.py::_variant_priority``: annotate the dict lookup so it doesn't leak ``Any`` through the ``int`` return type.
- ``benchmarks/env.py::build_environment``: annotate the ``json.load`` result so the ``dict[str, Any]`` return type is satisfied.
- ``benchmarks/bench.py::_make_client``: drop the redundant ``_Flagsmith__evaluation_context`` pre-seed (the property setter sets the backing field itself) and cast the synthetic env dict to ``EnvironmentModel`` at the engine boundary.

beep boop
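As an illustration of the first fix, the pattern looks roughly like this. The actual body of ``_variant_priority`` is not shown here, so the function name ``variant_priority`` and the ``"priority"`` key below are hypothetical; only the typing technique is the point.

```python
from typing import Any


def variant_priority(variant: dict[str, Any]) -> int:
    # `mypy --strict` enables --warn-return-any, so `return variant["priority"]`
    # would be reported: the lookup has type Any but the function promises int.
    # Binding the lookup to an annotated local gives the return a concrete type.
    priority: int = variant["priority"]
    return priority


# Hypothetical usage: sort variants once, using the helper as the sort key.
variants = [{"name": "b", "priority": 2}, {"name": "a", "priority": 1}]
ordered = sorted(variants, key=variant_priority)
```

The annotation is a static-typing aid only; it does not validate the value at runtime.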
Fixes #198. For a 262-feature environment with local evaluation, the SDK is now ~1000x faster in the common case (no variants, no segment overrides) and 8–60% faster when the engine still has to run, depending on variant density.
Paired with Flagsmith/flagsmith-engine#296, which removes the per-feature `_get_identity_key` call and adds a two-key fast path for the hot `get_hashed_percentage_for_object_ids` case. This PR stands alone (works with the current stable engine); the engine PR unlocks additional wins on the variants path.

What changed
- Cache `get_environment_flags()`. The evaluation context is immutable between environment refreshes, so we rebuild flags once on update instead of once per call. Invalidation is wired through a property setter on `_evaluation_context`, so existing direct-assignment call sites (tests, offline mode) keep working transparently.
- Short-circuit `get_identity_flags()` to the cached environment flag dict when the environment has no features with variants and no segments with overrides. In that case identity flags are guaranteed to equal environment flags. A fresh `Flags` wrapper is allocated per call around the shared flag dict so pipeline analytics still see per-identity events.
- Precompute `identity.key` in `map_context_and_identity_data_to_context` so the engine's `get_enriched_context` becomes a no-op on the hot path instead of performing a shallow context copy on every call.
- Add `__slots__` to `Flag` / `DefaultFlag` / `BaseFlag` on Python 3.10+ (we still support 3.9) and inline the per-flag construction in `Flags.from_evaluation_result`, skipping a redundant helper call and truthiness check per feature.

Test changes
Two existing tests were rewritten from "assert on engine's internal call shape" to "assert on actual flag values":
- `test_get_identity_flags_uses_local_environment_when_available`: now verifies the returned flag values against the fixture environment instead of mocking the engine.
- `test_get_identity_flags_includes_segments_in_evaluation_context` → replaced by `test_get_identity_flags_applies_identity_overrides`, which exercises the segments-with-overrides path via the fixture's existing identity override and therefore actually verifies that segments flow into evaluation.

Behavioural contract is unchanged; both tests now fail more usefully if the SDK starts returning wrong values, and they are no longer coupled to the engine's internal entry points, so engine-side refactors no longer ripple here.
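To make the difference between the two test styles concrete, here is a small self-contained sketch. `toy_engine`, `ToyClient`, and both check functions are invented for illustration and do not correspond to the SDK's real classes or tests. It shows how a call-shape assertion stops holding once a fast path skips the engine, while a value assertion is unaffected.

```python
from typing import Any, Callable
from unittest import mock

Context = dict[str, Any]


def toy_engine(context: Context) -> dict[str, Any]:
    """Stand-in engine: identity overrides win over environment values."""
    return {**context["features"], **context.get("overrides", {})}


class ToyClient:
    def __init__(
        self,
        features: dict[str, Any],
        engine: Callable[[Context], dict[str, Any]] = toy_engine,
    ) -> None:
        self.features = features
        self.engine = engine

    def get_identity_flags(self, overrides: dict[str, Any]) -> dict[str, Any]:
        if not overrides:
            # Fast path: the engine is never called.
            return dict(self.features)
        return self.engine({"features": self.features, "overrides": overrides})


def engine_was_called_without_overrides() -> bool:
    """Call-shape style: inspects whether the engine entry point was hit."""
    engine = mock.Mock(return_value={})
    ToyClient({"banner": "off"}, engine=engine).get_identity_flags({})
    return engine.called


def values_are_correct() -> bool:
    """Value style: checks only what the caller actually receives."""
    client = ToyClient({"banner": "off"})
    return (
        client.get_identity_flags({}) == {"banner": "off"}
        and client.get_identity_flags({"banner": "on"}) == {"banner": "on"}
    )
```

A test asserting that the engine was called would fail against `ToyClient` purely because of the fast path, even though every returned value is correct; the value-based check passes on both paths.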
Benchmark harness
New `benchmarks/` directory with a local `perf_counter`-based harness that mirrors issue #198 (262 features, configurable multivariate density, cProfile mode). Not CI-integrated: CodSpeed would be the right place for that and would live in flag-engine. This is for fast local iteration.

Results (perf_counter, M-series Mac, 262 features)
Stock PyPI engine (10.0.3):
- `get_environment_flags` (any)
- `get_identity_flags`, vanilla
- `get_identity_flags`, 10% multivariate
- `get_identity_flags`, 100% multivariate

* These paths still go through the engine fully, so their improvement is bounded by engine-side work. With the companion engine PR applied, the 10%-multivariate case drops to ~137 µs and the 100% case drops from ~370 µs to ~330 µs. The shortcut / caching wins above are achieved entirely in the SDK and apply today.

Test plan
- `pytest tests/` (87 passing, incl. new identity-override behaviour test)
- `mypy flagsmith/` clean
- `benchmarks/bench.py` numbers reproduce above figures
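
For anyone wanting to sanity-check the order of magnitude without the harness, a `perf_counter` timing loop along these lines may be enough. It is a rough sketch, not the code in `benchmarks/bench.py`: `time_call`, the synthetic 262-entry `features` dict, and the two workloads are assumptions chosen only to contrast "rebuild per call" with "return a cached object".

```python
import statistics
import time
from typing import Any, Callable


def time_call(fn: Callable[[], object], repeats: int = 5, loops: int = 1000) -> float:
    """Return the median per-call wall time in microseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(loops):
            fn()
        elapsed = time.perf_counter() - start
        samples.append(elapsed / loops * 1e6)
    return statistics.median(samples)


# Synthetic stand-in for a 262-feature environment (not real SDK data).
features: dict[str, dict[str, Any]] = {
    f"feature_{i}": {"enabled": True, "value": i} for i in range(262)
}


def rebuild() -> dict[str, dict[str, Any]]:
    # Simulates constructing every flag object on each call.
    return {name: dict(state) for name, state in features.items()}


cached = rebuild()


def from_cache() -> dict[str, dict[str, Any]]:
    # Simulates returning the prebuilt result.
    return cached


if __name__ == "__main__":
    print(f"rebuild per call: {time_call(rebuild):.2f} µs")
    print(f"cached:           {time_call(from_cache):.2f} µs")
```

Taking the median over several repeats reduces the influence of one-off scheduler or GC pauses; absolute numbers will differ by machine and Python version.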