
fix: correctly count metrics when a flag with dependencies has disabled parents #304

Merged

Conversation

@thomasheartman (Contributor) commented Mar 7, 2024

Description

This PR fixes a bug where we didn't count metrics correctly: if the dependencies were not satisfied during a get_variant call, we wouldn't track any metrics for the flag at all. During an is_enabled call, we'd track the result of the child flag's evaluation without taking into account the dependencies.

Note that the behavior was still correct. This is only about metrics.

The root cause is related to how we check dependencies and act on that result: the Unleash client checks dependencies as part of its is_enabled and get_variant calls, but the metrics are tracked in the individual Feature objects.

The current implementation was the quickest way I could fix the issue. I think we should probably restructure it (either now or later) to avoid calling internal methods of the Feature object. However, it serves as a nice way to show what's wrong and how we can fix it. (Maybe just make that _count_variant method non-internal?)

Due to the way the code is already organized, it might make sense to move the dependencies check into the Feature class instead of performing it in the Unleash client. However, that would require us to have some way to access those dependencies from the dependent feature, which probably means more restructuring. I'm not sure that's worth it. Maybe it's better to leave it as is? I'll take any input on it 🙋🏼
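To illustrate the shape of the fix described above, here is a minimal, self-contained sketch. The `Feature` class and `is_enabled_with_dependencies` function are toy stand-ins with hypothetical names, not the SDK's actual classes; they only model the part that matters here: metrics are counted inside the feature object, while the dependency check happens in the client.

```python
class Feature:
    """Toy stand-in for the SDK's Feature object, tracking yes/no metrics."""

    def __init__(self, enabled=True):
        self.enabled = enabled
        self.yes_count = 0
        self.no_count = 0

    def increment_stats(self, result):
        # Record one evaluation result for metrics reporting.
        if result:
            self.yes_count += 1
        else:
            self.no_count += 1

    def is_enabled(self, context=None):
        # Evaluating the feature also counts the result.
        self.increment_stats(self.enabled)
        return self.enabled


def is_enabled_with_dependencies(feature, dependency_check, context=None):
    # Mirrors the fix: only let the feature evaluate (and count) itself
    # when its dependencies are satisfied; otherwise count a "no".
    if dependency_check:
        return feature.is_enabled(context)
    feature.increment_stats(False)
    return False


# An enabled child behind a disabled parent now counts as a "no":
child = Feature(enabled=True)
assert is_enabled_with_dependencies(child, dependency_check=False) is False
assert (child.yes_count, child.no_count) == (0, 1)
```

Before the fix, the toy equivalent would have called `feature.is_enabled` unconditionally, counting a "yes" for an enabled child even when its parent was disabled.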

Type of change


  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?


  • Unit tests

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

tests/unit_tests/test_client.py (thread resolved)
Comment on lines +458 to +459
feature.increment_stats(False)
feature._count_variant("disabled")
@thomasheartman (author):
We never counted stats for variants at all if dependencies were not satisfied.
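A toy illustration of what the two added lines above achieve on the get_variant path (hypothetical names again; `_count_variant` is the internal method the PR description mentions, and the rest is a simplified stand-in, not the SDK's real API):

```python
from collections import Counter

DISABLED_VARIANT = {"name": "disabled", "enabled": False}


class Feature:
    """Toy stand-in: tracks evaluation results and served variants."""

    def __init__(self, variant_name="green"):
        self.variant_name = variant_name
        self.no_count = 0
        self.variant_counts = Counter()

    def increment_stats(self, result):
        if not result:
            self.no_count += 1

    def _count_variant(self, name):
        self.variant_counts[name] += 1

    def get_variant(self, context=None):
        self._count_variant(self.variant_name)
        return {"name": self.variant_name, "enabled": True}


def get_variant_with_dependencies(feature, dependency_check, context=None):
    # Before the fix, an unsatisfied dependency returned the disabled
    # variant without touching metrics at all; now both counters move.
    if not dependency_check:
        feature.increment_stats(False)
        feature._count_variant("disabled")
        return DISABLED_VARIANT
    return feature.get_variant(context)


child = Feature()
variant = get_variant_with_dependencies(child, dependency_check=False)
assert variant["name"] == "disabled"
assert child.no_count == 1
assert child.variant_counts["disabled"] == 1
```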

Comment on lines +371 to +375
if dependency_check:
feature_check = feature.is_enabled(context)
else:
feature.increment_stats(False)
feature_check = False
@thomasheartman (author):

If the flag is enabled itself but its dependencies are not satisfied, we'd previously count it as a "yes" (by checking is_enabled first).

@coveralls commented Mar 7, 2024

Coverage Status: 97.031% (+0.1%) from 96.905%, when pulling 2f4a2ff on fix/metrics-counting-for-flags-with-disabled-parents into 31c33eb on main.

@sighphyre (Member) left a comment:

It's quick and dirty and I'm okay with that. This code needs some refactor love, but that, I think, is too big to girl scout in this PR.

Would really appreciate it if we could spend 30 minutes to an hour seeing whether we can reduce the sleep in the test, in the spirit of not making the test suite worse, but the fix looks cool.

unleash_client.get_variant(child)

# Verify that the parent doesn't have any metrics registered
time.sleep(12)
@sighphyre (Member):

I think this is my only objection here. These tests are very painful to run since they all trigger long sleeps; this one adds a full 13 seconds.

The client is being newed up with a 10-second metrics send interval. Can we see what happens to the test suite if we reduce that to 1 and trim the sleep down to, say, 2 seconds?

@thomasheartman (author):

Yeah, sure. I'll have a look. Think I might have had a look before, but can't remember why I didn't follow through in that case. Will check.

@thomasheartman (author):

So, there's a suggestion in 1abe74e. It worked when I last ran it locally, but I'm suspecting it might be a little flaky. Not sure that's a cost we wanna pay. Have a look.

@thomasheartman (author):

And of course, we can move it out into a separate PR too. But just wanted some input on it first.

@sighphyre (Member):

> Yeah, sure. I'll have a look. Think I might have had a look before, but can't remember why I didn't follow through in that case. Will check.

I think I have too and failed to improve this a few times. Looks like the build server hates that change. I think these are accidentally doing more than just pausing execution.

Okay, let's not do that then; this looks to be bigger than just this PR. I just wanted to make sure it wasn't an easy fix we were missing.

Let's remove the commit. I'm okay for this to go in without the timing change.

@thomasheartman (author):

Summary of in-person chat: I think I have a way to fix this. Will take those changes into a new PR (#305 tentatively)

@thomasheartman force-pushed the fix/metrics-counting-for-flags-with-disabled-parents branch from 410c5f8 to 2f4a2ff on March 7, 2024 10:30
@thomasheartman enabled auto-merge (squash) on March 7, 2024 10:36
@thomasheartman merged commit 232da69 into main on March 7, 2024
70 checks passed
@thomasheartman deleted the fix/metrics-counting-for-flags-with-disabled-parents branch on March 7, 2024 10:42