Calculation of Awario social scores can easily be gamed #199
@njordhov do you have any examples?
@dantrevino In the November app mining spreadsheet, there are 70 apps with a zero Awario social score out of 170 apps with a score, leaving 100 apps with a non-zero score. Now let's boost the latter to all have the max Awario score of five. Calculating the z-score for an app with a social score of 5:
Let's take the Awario scores of this month's front-runner (pDrive) as an example and recompute its z-score under this boosted scenario. Subtracting this from the original final Awario score of 1.246894123 shows the drop. This demonstrates that the final score of an app can be depressed by strategically boosting the Awario social score of selected apps.
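The mechanics behind this can be sketched with a toy example (hypothetical scores for illustration, not the actual November spreadsheet data):

```python
import statistics

def z_score(x, scores):
    # Standard z-score: how far x sits from the mean of all apps' scores,
    # measured in standard deviations. It depends on the whole population,
    # so changing *other* apps' scores changes this app's z-score.
    return (x - statistics.mean(scores)) / statistics.stdev(scores)

# Hypothetical Awario social scores: several zeros plus a spread of non-zero values
original = [0] * 7 + [1, 2, 3, 4, 5]
# Attack: boost every app with a non-zero score up to the maximum of 5
boosted = [0] * 7 + [5] * 5

# The top app's own score is untouched, yet its z-score drops
print(z_score(5, original))  # ~2.07
print(z_score(5, boosted))   # ~1.13
```

The top app contributed nothing to the attack, yet its normalized score falls because the population mean rose and the spread among scored apps collapsed.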
Indeed, the reviewers provide scores that reflect the relation between an app and the "average app". If the average app changes (by boosting the social score of some apps), then indeed the score of all apps changes, and that is a good thing. #119 is relevant here, describing the same thing for NIL. So I don't see anything wrong in the Awario social Z score. I agree that third parties can and do boost the social score by promoting the app on the social networks, but that is exactly what the Awario score is about. The score can only give a vague indication of whether the team is active or not. For that we would need a better activity reviewer (#169).
If the Awario social scores of all apps are boosted to the max, they will indeed all have an Awario social Z score of zero. As a result, using the November app mining data, high ranking apps stand to lose more than 0.1 points of their final score, while debuting apps keep their scores. The top debuting app would move from 13th to 4th place. That's considerable motivation to manipulate the scores by spending an hour posting all app names to social channels.
This is a naive assumption. Participants with multiple apps have motivation and opportunity to let some of their apps deliberately drop in score to boost the score of their high ranking app(s).
That one app can gain in rank by boosting the scores of other apps is a perversion of what the Awario score is about.
Although I certainly follow the logic behind what you're describing, I would rather wait until this actually becomes a problem before making big changes to the scoring method. Removing the Z score for social would introduce a ton of other problems, and vastly boost existing apps vs. new apps. If we ended up in a world where 100% of apps received a 5/5, then maybe we could just drop the social score. I'd also prefer to remove the social score if we suspect that someone is engaging in this 'attack'.
What is the problem you are seeing? Please describe.
The Awario Z scores can be gamed to affect current and future ranking.
For example, using the November app mining data, by systematically maximizing the Awario social score of all apps with an original score above zero, the max social Z score drops from 1.74 to 0.84. As a consequence, high ranking apps would lose around 0.05 from their final scores, which would lift top debuting apps several positions in rank (they have no Awario score) and give future debuts an edge, as the lower final scores would affect the "memory function" for incumbents.
Alternatively, somebody could boost all apps to have a minimum Awario social score of four. This would increase the score of high ranking apps by around 0.02 points, while others would experience a loss of up to 0.3 points. Debuting apps would keep the same score and thus stand to benefit substantially.
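The floor-at-four scenario can also be illustrated with a toy population (hypothetical scores, not the real data; the magnitudes differ, but the direction of the effect matches):

```python
import statistics

def z(x, scores):
    # z-score relative to the population mean and sample standard deviation
    return (x - statistics.mean(scores)) / statistics.stdev(scores)

original = [0, 0, 1, 2, 3, 4, 5]
# Attack: raise every app's Awario social score to a minimum of 4
floored = [max(s, 4) for s in original]

# The top-scoring app's z-score rises, because the population mean moves
# up less than the spread collapses...
print(z(5, original), z(5, floored))
# ...while a formerly mid-ranked app (boosted from 3 to 4) ends up worse off
print(z(3, original), z(4, floored))
```

Compressing everyone toward the top shrinks the standard deviation, so the one remaining outlier is rewarded and the middle of the field is punished.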
How is this problem misaligned with goals of app mining?
It opens the door to disruptive manipulation of the scores and rankings. Also, one purpose of the Awario social score is to require a minimal level of activity to remain at a higher rank, which would be defeated by a third party boosting the scores of inactive teams.
What is the explicit recommendation you’re looking to propose?
Use the mean social score of an app in place of the Z-score, making it independent of the scoring of the other apps.
The loophole is caused by the Awario Z scores depending on the average score of all apps and their standard deviation, which can easily be manipulated by boosting the count of social channels of the apps towards the extremes or towards the average.
Describe your long term considerations in proposing this change. Please include the ways you can predict this recommendation could go wrong and possible ways to mitigate.
The purpose of using the standard deviation is to "normalize" the scores so they are comparable across the different reviewers. The proposal normalizes the scores to be in range 0 to 1.
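A minimal sketch of the proposed normalization (assuming the Awario score is an app's own mean on a 0–5 scale, divided by the maximum to land in [0, 1]; the function name is illustrative):

```python
def normalized_social(mean_score, max_score=5.0):
    # Proposed replacement for the z-score: scale the app's own mean
    # Awario score into [0, 1], independent of every other app's score
    return mean_score / max_score

# Boosting or depressing other apps cannot change this app's component
print(normalized_social(5))    # 1.0
print(normalized_social(2.5))  # 0.5
```

Since the denominator is a fixed constant rather than a population statistic, no amount of manipulation of other apps' social channels can move a given app's social component.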
Additional context
#30 #135 #154 #127