Bug 1407016 - Remove calculate_stats() from Translation.save() #1140
Conversation
Hey, just added some comments, mostly nits and name change suggestions.
I still need to look into stats_states_map, because it doesn't seem to always work as expected (I managed to break stats locally).

@mathjazz Hey, I moved around some lines of code that could produce those strange results. Could you check if the problem still occurs?
Well done, I can no longer reproduce the issue.
Some more nits added.
Fix that and we're good to go I think.
Force-pushed 6265905 to ab1eb78.
Force-pushed 352ed16 to 80aea3d.
@mathjazz Hey,
Before patch:
After patch:
Could you run this script on your local machine and then (if that makes sense) on staging to gather some numbers?
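The benchmark script itself isn't attached to the thread, so the exact measurement code is unknown. A minimal timing harness along these lines (all names hypothetical) could produce avg/min/max numbers of the kind exchanged later in this conversation:

```python
import time

def benchmark(operation, runs=10):
    """Time an operation several times and report avg/min/max in seconds.

    `operation` is any zero-argument callable; in this PR's context it
    would stand in for e.g. rejecting or unrejecting a Translation.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return {
        "avg": sum(timings) / len(timings),
        "min": min(timings),
        "max": max(timings),
    }
```

This is only a sketch of the measurement shape, not the actual script used on stage/prod.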
Thanks! I ran the test case twice before and twice after the patch and in both cases the numbers are consistent:

Before:
unreject avg 0.749211573601 min 0.495611190796 max 1.5232770443
reject avg 0.771279101372 min 0.508476018906 max 2.43999290466

After:
unreject avg 0.0389637184143 min 0.0276548862457 max 0.104743003845
reject avg 0.038744904995 min 0.0286099910736 max 0.11382484436

@jotes Did you want to make any further changes to the patch? I suggest we deploy it to prod, observe numbers in New Relic for a few days and then make further decisions.

Hey,
If the result is consistent, I think it wouldn't hurt to add tests before this lands on prod.
Could you run this test on staging, just to have some baseline numbers for future reference.
Good point about the test case! I ran the command on Stage and Prod, without the patch applied. I'll run it on Stage again after the patch is deployed.
Stage - BEFORE:
Stage - AFTER:
Prod - BEFORE:
Prod - AFTER:

@mathjazz Hey,
@jotes Sounds good. However, note that the main idea of pushing this patch to prod at this stage is to gather performance numbers. And if they aren't great, we might need to revert this patch (including the tests).

I'm not saying that we shouldn't write tests, but we also don't need 100% coverage, given that the code that is in prod right now also doesn't have it. Besides, if something goes south with stats, we can always fix it with calculate_stats.

Also, I don't know how thoroughly you plan to test this, and we might even be on the same page, I just wanted to give you a heads-up. :)

Okay, I'll reduce the number of test cases (especially plurals) and I'll ping you when it's done.
pontoon/base/models.py
Outdated
@@ -2864,7 +2949,8 @@ def reject(self, user):
self.approved_user = None
self.approved_date = None
self.fuzzy = False
self.save()
if save:
The purpose is to change the state of the translation without triggering a stats recalc.
Note that this approach doesn't even "change the state of the translation", it prevents everything that happens in Translation.save(). But why do we even need it? It doesn't seem like we use these update_stats/save params anywhere in the codebase. If the sole purpose of that is to help with tests, I'd simply do translation.approved/reject = True/False.
The fact that it's slightly different in each of the functions (reject(), unapprove(), Translation.save()) also doesn't help.
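For context, the pattern under discussion (an opt-out `save` keyword so callers can flip flags without triggering the side effects of Translation.save()) might look roughly like this. The class below is a simplified stand-in, not Pontoon's actual model:

```python
class Translation:
    """Simplified stand-in for the model under review; names are illustrative."""

    def __init__(self):
        self.approved = False
        self.approved_user = None
        self.approved_date = None
        self.fuzzy = False
        self.saved = False  # tracks whether save() ran, for illustration only

    def save(self, update_stats=True):
        # The real model would persist the row here; with update_stats=True
        # it would also update aggregated stats.
        self.saved = True

    def reject(self, user, save=True):
        # Clear approval flags; only persist (and touch stats) when asked.
        self.approved = False
        self.approved_user = None
        self.approved_date = None
        self.fuzzy = False
        if save:
            self.save()
```

The reviewer's point stands either way: if the only consumer of `save=False` is the test suite, tests could simply assign the flags directly instead of growing the API.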
Force-pushed 9018ecc to 5d623e9.
@mathjazz Can you look again? A couple of tests have been added (only for singulars).
Nice work with tests! Left a few comments.
Well done! Please add that comment and we can ship this!
Comparing performance 1 day before and 1 day after deployment of the patch to prod:
Note that counts for anything but

Comparing performance 3 days before and 3 days after deployment of the patch to prod:

Comparing performance 7 days before and 7 days after deployment of the patch to prod:
Howdy,
So, first of all -> I'm not good at naming things. Any suggestions are always welcome!
Main changes:
We don't do a full recalc of stats every time a translation is saved/approved/rejected and so on.
Instead, we create a small map with the states of an entity before and after a translation.
Thanks to that, we're able to generate a diff and apply it to all AggregatedStats objects.
To give you more perspective, I've made some benchmarks: https://gist.github.com/jotes/c040baa84faab7ce68bbc7875590e9f4
If you have any additional questions/suggestions/metrics that could be useful -> feel free to suggest them.
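The before/after state map described above can be sketched like this. This is a hypothetical illustration of the idea, not Pontoon's actual implementation; all names are made up:

```python
from collections import Counter

def entity_state(translations):
    """Classify an entity by its translations: approved, fuzzy, or missing."""
    if any(t["approved"] for t in translations):
        return "approved"
    if any(t["fuzzy"] for t in translations):
        return "fuzzy"
    return "missing"

def stats_diff(state_before, state_after):
    """Return the per-state delta produced by one translation change."""
    diff = Counter()
    if state_before != state_after:
        diff[state_before] -= 1
        diff[state_after] += 1
    return diff

def apply_diff(aggregated, diff):
    """Apply the delta to an aggregated stats dict (e.g. per locale/project),
    instead of recounting every entity from scratch."""
    for state, delta in diff.items():
        aggregated[state] = aggregated.get(state, 0) + delta
    return aggregated
```

Applying only a small delta per save is what turns the full recalc into the much cheaper operation reflected in the benchmark numbers above.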
Misc changes:
Removed calculate_stats because it was unused.

TODO: