perf: Speed up local-evaluation hot path for large environments by khvn26 · Pull Request #199 · Flagsmith/flagsmith-python-client

khvn26 · 2026-04-24T12:09:38Z

Fixes #198. For a 262-feature environment with local evaluation, the SDK is now ~1000x faster in the common case (no variants, no segment overrides) and 8–60% faster when the engine still has to run, depending on variant density.

Paired with Flagsmith/flagsmith-engine#296, which removes the per-feature _get_identity_key call and adds a two-key fast path for the hot get_hashed_percentage_for_object_ids case. This PR stands alone (works with the current stable engine); the engine PR unlocks additional wins on the variants path.

What changed

- Cache the output of get_environment_flags(). The evaluation context is immutable between environment refreshes, so we rebuild flags once on update instead of once per call. Invalidation is wired through a property setter on _evaluation_context, so existing direct-assignment call sites (tests, offline mode) keep working transparently.
- Short-circuit get_identity_flags() to the cached environment flag dict when the environment has no features with variants and no segments with overrides. In that case identity flags are guaranteed to equal environment flags. A fresh Flags wrapper is allocated per call around the shared flag dict so pipeline analytics still see per-identity events.
- Precompute identity.key in map_context_and_identity_data_to_context so the engine's get_enriched_context becomes a no-op on the hot path instead of performing a shallow context copy on every call.
- Pre-sort multivariate variants once at env-load time. Timsort on already-sorted input is cheap; re-sorting 262 lists on every call is not.
- Give Flag / DefaultFlag / BaseFlag __slots__ on Python 3.10+ (we still support 3.9) and inline the per-flag construction in Flags.from_evaluation_result, skipping a redundant helper call and truthiness check per feature.
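
The caching and short-circuit items above can be sketched roughly as follows. This is a minimal illustration, not the SDK's code: the `Client` class, the context layout (`features`, `segments`, `variants`, `overrides` keys), the `_evaluate` placeholder and the `rebuilds` counter are all assumptions made for the example.

```python
from __future__ import annotations

from typing import Any


class Flags:
    """Cheap per-call wrapper; the underlying flag dict may be shared."""

    def __init__(self, flags: dict[str, Any], identifier: str | None = None) -> None:
        self.flags = flags
        self.identifier = identifier  # hypothetical per-identity metadata


class Client:
    """Hypothetical stand-in for the SDK client, not the real Flagsmith class."""

    def __init__(self) -> None:
        self._context: dict[str, Any] | None = None
        self._environment_flags: dict[str, Any] = {}
        self._can_short_circuit = False
        self.rebuilds = 0  # instrumentation for this example only

    @property
    def _evaluation_context(self) -> dict[str, Any] | None:
        return self._context

    @_evaluation_context.setter
    def _evaluation_context(self, context: dict[str, Any] | None) -> None:
        # Every assignment, including direct ones from tests or offline mode,
        # goes through this setter, so derived state is rebuilt exactly once
        # per environment update.
        self._context = context
        features = (context or {}).get("features", {})
        segments = (context or {}).get("segments", {})
        self._environment_flags = {name: f["value"] for name, f in features.items()}
        self._can_short_circuit = not any(
            f.get("variants") for f in features.values()
        ) and not any(s.get("overrides") for s in segments.values())
        self.rebuilds += 1

    def get_environment_flags(self) -> Flags:
        # No per-call rebuild: wrap the cached dict.
        return Flags(self._environment_flags)

    def get_identity_flags(self, identifier: str) -> Flags:
        if self._can_short_circuit:
            # Fresh wrapper, shared dict: identity flags equal environment flags.
            return Flags(self._environment_flags, identifier)
        return Flags(self._evaluate(identifier), identifier)

    def _evaluate(self, identifier: str) -> dict[str, Any]:
        # Placeholder for the full engine run (variants, segment overrides).
        return dict(self._environment_flags)
```

Allocating a new `Flags` wrapper per call keeps per-identity metadata separate while the expensive part, the flag dict itself, is built only when the context changes.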

Test changes

Two existing tests were rewritten from "assert on engine's internal call shape" to "assert on actual flag values":

- test_get_identity_flags_uses_local_environment_when_available — now verifies the returned flag values against the fixture environment instead of mocking the engine.
- test_get_identity_flags_includes_segments_in_evaluation_context → replaced by test_get_identity_flags_applies_identity_overrides, which exercises the segments-with-overrides path via the fixture's existing identity override and therefore actually verifies that segments flow into evaluation.

Behavioural contract is unchanged; both tests now fail more usefully if the SDK starts returning wrong values, and they stop being coupled to the engine's internal entry points (so e.g. engine-side refactors no longer ripple here).

Benchmark harness

New benchmarks/ directory with a local perf_counter-based harness that mirrors issue #198 (262 features, configurable multivariate density, cProfile mode). Not CI-integrated — CodSpeed would be the right place for that and would live in flag-engine. This is for fast local iteration.
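
For readers unfamiliar with this style of harness, a perf_counter-based timing loop might look like the sketch below. It is not the code in benchmarks/bench.py; the `bench` function, its parameters and the choice of the median are assumptions for illustration.

```python
import time
from statistics import median
from typing import Callable


def bench(fn: Callable[[], object], *, repeats: int = 5, iterations: int = 1000) -> float:
    """Return the median per-call time of fn in microseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(iterations):
            fn()
        elapsed = time.perf_counter() - start
        # Average over the inner loop so timer overhead is amortised.
        samples.append(elapsed / iterations * 1e6)
    # The median is less sensitive to a single noisy repeat than the mean.
    return median(samples)


if __name__ == "__main__":
    # Stand-in workload: sorting 262 already-sorted items.
    print(f"{bench(lambda: sorted(range(262))):.2f} µs per call")
```

Sub-microsecond results such as the cached paths are only meaningful when averaged over many iterations like this, since a single call is close to the timer's own overhead.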

Results (perf_counter, M-series Mac, 262 features)

Stock PyPI engine (10.0.3):

| Scenario | main | This PR |
| --- | --- | --- |
| `get_environment_flags` (any) | 149 µs | 0.07 µs |
| `get_identity_flags` vanilla | 156 µs | 0.6 µs |
| `get_identity_flags` 10% multivariate | 156 µs | 172 µs* |
| `get_identity_flags` 100% multivariate | ~475 µs | 370 µs* |

\* These paths still go through the engine fully, so their improvement is bounded by engine-side work. With the companion engine PR applied, the 10%-multivariate case drops to ~137 µs and the 100%-multivariate case drops from ~370 µs to ~330 µs. The shortcut / caching wins above are achieved entirely in the SDK and apply today.

Test plan

- pytest tests/ (87 passing, incl. new identity-override behaviour test)
- mypy flagsmith/ clean
- benchmarks/bench.py reproduces the figures above
- Offline mode still populates derived state correctly (exercised by fixture tests)

Addresses #198: for a 262-feature environment, local evaluation is 1000x faster for the common case (no variants / no overrides) and 8-60% faster overall depending on scenario.

Changes:

- Cache the output of `get_environment_flags()`: the evaluation context is immutable between environment refreshes, so we rebuild it once on update instead of once per call. Invalidated via a property setter on `_evaluation_context` so existing direct-assignment call sites (tests, offline mode) keep working transparently.
- Short-circuit `get_identity_flags()` to the cached environment Flags when the environment has no features with variants and no segments with overrides; in that case identity flags are guaranteed to equal environment flags. A fresh `Flags` wrapper is allocated around the cached flag dict to preserve identity metadata for pipeline analytics.
- Precompute `identity.key` in `map_context_and_identity_data_to_context` so flag-engine's `get_enriched_context` becomes a no-op instead of performing a shallow context copy on every call.
- Pre-sort multivariate variants once at env-load time; Timsort on an already-sorted list is a fast path compared to resorting per call.
- Add `__slots__` to `Flag` / `DefaultFlag` / `BaseFlag` on Python 3.10+ and inline the per-flag construction inside `Flags.from_evaluation_result` to skip the redundant helper call and truthiness check.

Also adds a `benchmarks/` harness (issue-#198 scenario, variant-density knobs, cProfile mode) so future regressions can be caught locally before release.

Two tests that asserted on the engine's internal call shape were rewritten to assert on actual flag values instead.

beep boop
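
One way to apply `__slots__` only on Python 3.10+ while still supporting 3.9 is sketched below. It assumes the flag classes are dataclasses and uses `dataclass(slots=True)`, which exists from 3.10 onwards; the field names and the `_DATACLASS_KWARGS` helper are illustrative, not taken from the SDK.

```python
import sys
from dataclasses import dataclass
from typing import Any

# dataclass(slots=True) was added in Python 3.10; on 3.9 fall back to a
# regular dataclass with a per-instance __dict__.
_DATACLASS_KWARGS = {"slots": True} if sys.version_info >= (3, 10) else {}


@dataclass(**_DATACLASS_KWARGS)
class BaseFlag:
    enabled: bool
    value: Any


@dataclass(**_DATACLASS_KWARGS)
class Flag(BaseFlag):
    feature_name: str  # illustrative field, not the SDK's full field list


flag = Flag(enabled=True, value="blue", feature_name="banner_colour")
# On 3.10+ the instance has no __dict__, which saves memory and speeds up
# construction when hundreds of flags are built per call.
print(hasattr(flag, "__dict__"))
```

Static type checkers may not follow the `**` expansion in the decorator, so a real implementation might prefer an explicit version check around two decorator calls.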

CI runs ``mypy --strict .`` which covers the new ``benchmarks/`` dir and is stricter than the default invocation.

Fixes:

- ``flagsmith/mappers.py::_variant_priority``: annotate the dict lookup so it doesn't leak ``Any`` through the ``int`` return type.
- ``benchmarks/env.py::build_environment``: annotate the ``json.load`` result so the ``dict[str, Any]`` return type is satisfied.
- ``benchmarks/bench.py::_make_client``: drop the redundant ``_Flagsmith__evaluation_context`` pre-seed (the property setter sets the backing field itself) and cast the synthetic env dict to ``EnvironmentModel`` at the engine boundary.

beep boop
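
The first two fixes follow a common pattern under ``mypy --strict``: returning an ``Any`` expression directly from a typed function is reported, while assigning it to an annotated local first is accepted. The sketch below shows that pattern; the function bodies and signatures are assumptions, not the code in this PR.

```python
import io
import json
from typing import IO, Any


def build_environment(fp: IO[str]) -> dict[str, Any]:
    # json.load returns Any; annotating the local satisfies the declared
    # return type instead of returning Any directly.
    data: dict[str, Any] = json.load(fp)
    return data


def variant_priority(variant: dict[str, Any]) -> int:
    # Same idea for a dict lookup: pin the type at the point of access.
    # The "priority" key is a hypothetical field name.
    priority: int = variant["priority"]
    return priority


env = build_environment(io.StringIO('{"features": []}'))
print(env, variant_priority({"priority": 3}))
```

The annotation is a static promise only; it does not validate the JSON at runtime.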

khvn26 requested a review from a team as a code owner April 24, 2026 12:09

khvn26 requested review from Zaimwa9 and removed request for a team April 24, 2026 12:09

khvn26 marked this pull request as draft April 24, 2026 12:11

khvn26 wants to merge 2 commits into main from perf/local-eval-hot-path
