Bug 1407016 - Remove calculate_stats() from Translation.save() #1140

Merged
merged 14 commits into from Dec 14, 2018

Conversation

@jotes jotes commented Dec 5, 2018

Howdy,

So, first of all -> I'm not good at naming things. Any suggestions are always welcome!

Main changes:
We no longer do a full recalculation of stats every time a translation is saved/approved/rejected and so on.
Instead, we create a small map with the states of an entity before and after a translation.
Thanks to that, we're able to generate a diff and apply it to all AggregatedStats objects.
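
For readers skimming the thread, the idea above can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: the status names, the AggregatedStats stand-in, and the helpers stats_diff and apply_diff are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical status names; the real counters in the patch may differ.
STATUSES = ("approved", "fuzzy", "unreviewed")


@dataclass
class AggregatedStats:
    """Stand-in for any object that holds per-status string counters."""
    approved: int = 0
    fuzzy: int = 0
    unreviewed: int = 0


def stats_diff(before, after):
    """Return {status: delta} for one entity moving from `before` to `after`.

    Each argument is a status name, or None when the entity counts toward
    no status (for example, when its only translation is rejected).
    """
    diff = {status: 0 for status in STATUSES}
    if before is not None:
        diff[before] -= 1
    if after is not None:
        diff[after] += 1
    return diff


def apply_diff(stats_objects, diff):
    """Add the deltas to every stats object instead of recounting everything."""
    for stats in stats_objects:
        for status, delta in diff.items():
            setattr(stats, status, getattr(stats, status) + delta)


# Approving an unreviewed translation touches two counters per stats object.
resource = AggregatedStats(approved=10, unreviewed=3)
project = AggregatedStats(approved=40, unreviewed=7)
apply_diff([resource, project], stats_diff("unreviewed", "approved"))
```

The cost of a save then depends on the number of stats objects to update, not on the number of translations that would otherwise have to be recounted.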

To give you more perspective, I've made some benchmarks: https://gist.github.com/jotes/c040baa84faab7ce68bbc7875590e9f4

If you have any additional questions/suggestions/metrics that could be useful -> feel free to suggest them.

Misc changes:

  • removed missing variable from calculate_stats because it was unused.

TODO:

  • check total_strings
  • check if the calculations work the same way
  • move save_stats_diff to base.utils and use it in calculate_stats
  • tests


@mathjazz mathjazz left a comment

Hey, just added some comments, mostly nits and name change suggestions.

I still need to look into stats_states_map, because it doesn't seem to always work as expected (I managed to break stats locally).

jotes commented Dec 6, 2018

@mathjazz Hey, I moved around some lines of code that could produce those strange results. Could you check whether the problem still occurs?


@mathjazz mathjazz left a comment

Well done, I can no longer reproduce the issue.

Some more nits added.

Fix that and we're good to go I think.

@jotes jotes force-pushed the bug-1407016-remove-calculate-stats branch 2 times, most recently from 6265905 to ab1eb78 Compare December 7, 2018 00:29
@mathjazz mathjazz force-pushed the bug-1407016-remove-calculate-stats branch from 352ed16 to 80aea3d Compare December 7, 2018 23:00
jotes commented Dec 11, 2018

@mathjazz Hey,
I wrote a script that should let us gather some numbers in a more automated manner.
You can find it here: https://gist.github.com/jotes/c8f61fceae51ed0440cc3dced72c5591
In my case, numbers look like this:

Before patch:

unreject avg 0.124710986614 min 0.113973855972 max 0.15446805954
reject avg 0.125182232857 min 0.11275100708 max 0.164891004562

After patch:

unreject avg 0.0380410385132 min 0.0324711799622 max 0.0587360858917
reject avg 0.0382667922974 min 0.0318470001221 max 0.0838429927826

Could you run this script on your local machine and then (if that makes sense) on staging to gather some numbers?
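
For context, a timing harness that prints numbers in the same "avg / min / max" shape could look roughly like the sketch below. It is not the linked gist: the helper names benchmark and report are made up here, and the action is a placeholder for whatever actually triggers a reject or unreject.

```python
import time


def benchmark(action, runs=20):
    """Call `action` `runs` times and return (avg, min, max) in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings), min(timings), max(timings)


def report(name, stats):
    """Format one result line in the style used in this thread."""
    avg, low, high = stats
    return "{} avg {} min {} max {}".format(name, avg, low, high)


def placeholder_action():
    """Placeholder: replace with the real reject/unreject call being measured."""
    sum(range(1000))


print(report("reject", benchmark(placeholder_action, runs=5)))
```

Running the same harness before and after the patch, on the same data, is what makes the two sets of numbers comparable.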

@mathjazz

Thanks! I ran the test case twice before and twice after the patch and in both cases the numbers are consistent:

Before:

unreject avg 0.749211573601 min 0.495611190796 max 1.5232770443
reject avg 0.771279101372 min 0.508476018906 max 2.43999290466

After:

unreject avg 0.0389637184143 min 0.0276548862457 max 0.104743003845
reject avg 0.038744904995 min 0.0286099910736 max 0.11382484436

@jotes Did you want to make any further changes to the patch? I suggest we deploy it to prod, observe numbers in New Relic for a few days and then make further decisions.

jotes commented Dec 11, 2018 via email

mathjazz commented Dec 11, 2018

Good point about the test case!

I've run the command on Stage and Prod - without the patch applied. I'll run it on Stage again after the patch is deployed.


Stage - BEFORE:

reject avg 0.606796922684 min 0.458231925964 max 1.31855106354
unreject avg 0.58060685873 min 0.453376054764 max 0.881937026978

Stage - AFTER:

unreject avg 0.194613735676 min 0.116050958633 max 0.445156097412
reject avg 0.189559895992 min 0.10894203186 max 0.305606126785

Prod - BEFORE:

unreject avg 0.334202797413 min 0.289776086807 max 0.583406925201
reject avg 0.33278989315 min 0.285384893417 max 0.55864906311

Prod - AFTER:

reject avg 0.146877138615 min 0.0803010463715 max 0.413851976395
unreject avg 0.142880239487 min 0.0798790454865 max 0.409416913986

jotes commented Dec 12, 2018

@mathjazz Hey,
I've started working on the test suite and committed the first test, which checks both implementations (calculate_stats and get_stats). I think the two implementations need to produce consistent results.
I hope to finish this in the next 2 days.
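
A consistency check of this kind can be sketched in isolation as below. It is an illustration under assumptions, not the test from the patch: the status names and the helpers full_recount and check_consistency are hypothetical, and plain dictionaries stand in for the database.

```python
import random
from collections import Counter

# Hypothetical statuses; None means the entity counts toward no status.
STATUSES = ("approved", "fuzzy", "unreviewed", None)


def full_recount(entity_states):
    """Recount every entity from scratch (the slow reference implementation)."""
    return Counter(s for s in entity_states.values() if s is not None)


def check_consistency(num_entities=50, num_changes=500, seed=0):
    """Apply random state changes and compare running counters to a recount."""
    rng = random.Random(seed)
    states = {i: None for i in range(num_entities)}
    running = Counter()
    for _ in range(num_changes):
        entity = rng.randrange(num_entities)
        before, after = states[entity], rng.choice(STATUSES)
        # Incremental update: one decrement and one increment at most.
        if before is not None:
            running[before] -= 1
        if after is not None:
            running[after] += 1
        states[entity] = after
        # Unary plus drops zero counts so both sides compare cleanly.
        if +running != full_recount(states):
            return False
    return True


print(check_consistency())
```

Comparing after every change, not only at the end, helps catch errors that would otherwise cancel each other out.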

@mathjazz

@jotes Sounds good. However, note that the main idea of pushing this patch to prod at this stage is to gather performance numbers. And if they aren't great, we might need to revert this patch (including the tests).

I'm not saying that we shouldn't write tests, but we also don't need 100% coverage, given that the code that is in prod right now also doesn't have it. Besides, if something goes south with stats, we can always fix it with calculate_stats.

Also, I don't know how thoroughly you plan to test this, and we might even be on the same page, I just wanted to give you a heads-up. :)

jotes commented Dec 12, 2018 via email

@@ -2864,7 +2949,8 @@ def reject(self, user):
         self.approved_user = None
         self.approved_date = None
         self.fuzzy = False
-        self.save()
+        if save:
Collaborator Author

The purpose is to change the state of translation and not trigger stats recalc.

Collaborator

Note that this approach doesn't even "change the state of translation", it prevents everything that happens in Translation.save(). But why do we even need it? It doesn't seem like we use these update_stats/save params anywhere in the codebase. If the sole purpose of that is to help with tests, I'd simply do translation.approved/reject = True/False.

The fact that it's slightly different in each of the functions (reject(), unapprove(), Translation.save()) also doesn't help.

@jotes jotes force-pushed the bug-1407016-remove-calculate-stats branch from 9018ecc to 5d623e9 Compare December 13, 2018 00:03
jotes commented Dec 13, 2018

@mathjazz Can you look again? A couple of tests have been added (only for singulars).


@mathjazz mathjazz left a comment

Nice work with tests! Left a few comments.

@mathjazz mathjazz left a comment

Well done! Please add that comment and we can ship this!

@mathjazz mathjazz merged commit a18fdd7 into mozilla:master Dec 14, 2018
@mathjazz mathjazz deleted the bug-1407016-remove-calculate-stats branch December 14, 2018 11:31
mathjazz commented Dec 19, 2018

Comparing performance 1 day before and 1 day after deployment of the patch to prod:

Note that Counts for anything but update_translation are too low for any serious conclusions. We should wait at least two more days.

                  Apdex  Count  Avg(ms)  SD(ms)  Min(ms)  Max(ms)  Total(s)  Total(%)  Dissat(%)
update: before    0.78   1,959  581      568     5.82     5,610    1,140     05.0      08.0
update: after     0.84   2,869  452      411     6.33     8,630    1,300     04.8      06.8

reject: before    0.92   12     296      162     156      684      355       00.0      00.0
reject: after     0.76   163    521      251     159      1,420    85        00.3      00.6

unreject: before  0.90   21     226      288     5.31     1,100    4.75      00.0      00.0
unreject: after   0.75   4      473      149     322      618      1.89      00.0      00.0

unapprove: before 1.00   1      173      0       173      173      0.173     00.0      00.0
unapprove: after  0.91   11     330      205     199      848      3.63      00.0      00.0

@mathjazz

Comparing performance 3 days before and 3 days after deployment of the patch to prod:

                  Apdex  Count  Avg(ms)  SD(ms)  Min(ms)  Max(ms)  Total(s)  Total(%)  Dissat(%)
update: before    0.67   6,191  723      577     5.82     6,740    4,480     05.7      08.8
update: after     0.84   5,818  463      416     6.33     8,630    2,690     03.8      04.9

reject: before    0.65   185    847      694     156      7,240    157       00.2      00.3
reject: after     0.79   211    499      387     159      4,740    105       00.1      00.2

unreject: before  0.76   44     461      360     5.31     1,170    20.3      00.0      00.0
unreject: after   0.81   8      417      151     235      618      3.34      00.0      00.0

unapprove: before 0.56   17     832      471     173      2,410    14.1      00.0      00.0
unapprove: after  0.92   24     336      240     150      1,140    8         00.0      00.0

@mathjazz

Comparing performance 7 days before and 7 days after deployment of the patch to prod:

                  Apdex  Count  Avg(ms)  SD(ms)  Min(ms)  Max(ms)  Total(s)  Total(%)  Dissat(%)
update: before    0.68   13,004 715      596     5.82     9,530    9,300     05.3      08.1
update: after     0.88   9,999  425      409     9.61     6,340    4,250     03.3      03.8

reject: before    0.63   348    935      794     156      7,240    325       00.2      00.3
reject: after     0.92   160    354      397     145      4,740    56.6      00.0      00.0

unreject: before  0.74   60     518      402     5.31     1,880    31.1      00.0      00.0
unreject: after   0.92   19     265      188     10.7     734      5.04      00.0      00.0

unapprove: before 0.57   28     831      407     173      2,410    23.1      00.0      00.0
unapprove: after  0.94   33     331      251     150      1,230    10.9      00.0      00.0
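
To put the 7-day averages in relative terms, the small calculation below derives the percentage drop in Avg(ms) for each transaction. The input values are copied from the table above; the helper name reduction_percent is only for illustration.

```python
# (before, after) average response times in ms, copied from the 7-day table.
AVG_MS = {
    "update": (715, 425),
    "reject": (935, 354),
    "unreject": (518, 265),
    "unapprove": (831, 331),
}


def reduction_percent(before, after):
    """Relative drop from `before` to `after`, in percent, to one decimal."""
    return round(100.0 * (before - after) / before, 1)


for name, (before, after) in AVG_MS.items():
    print("{}: {}% lower average".format(name, reduction_percent(before, after)))

# Prints:
# update: 40.6% lower average
# reject: 62.1% lower average
# unreject: 48.8% lower average
# unapprove: 60.2% lower average
```

As noted earlier in the thread, the counts for everything but update are small, so those percentages should be read with caution.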
