Calculation of Awario social scores can easily be gamed #199
@njordhov do you have any examples?
@dantrevino In the November app mining spreadsheet, there are 70 apps with a zero Awario social score out of 170 apps with a score, leaving 100 apps with a non-zero score. Now let's boost the latter to all have the max Awario score of five. Calculating the z-score for an app with a social score of 5:
Let's take the Awario scores of this month's front-runner (pDrive) as an example and recompute its z-score under this boosted scenario. Subtracting this from the original final Awario score of 1.246894123 shows the drop. This demonstrates that the final score of an app can be depressed by strategically boosting the Awario social score of selected apps.
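The mechanics behind this can be sketched with a toy example (hypothetical scores for illustration, not the actual November spreadsheet data):

```python
import statistics

def z_score(x, scores):
    # Standard z-score: how far x sits from the mean of all apps' scores,
    # measured in standard deviations. It depends on the whole population,
    # so changing *other* apps' scores changes this app's z-score.
    return (x - statistics.mean(scores)) / statistics.stdev(scores)

# Hypothetical Awario social scores: several zeros plus a spread of non-zero values
original = [0] * 7 + [1, 2, 3, 4, 5]
# Attack: boost every app with a non-zero score up to the maximum of 5
boosted = [0] * 7 + [5] * 5

# The top app's own score is untouched, yet its z-score drops
print(z_score(5, original))  # ~2.07
print(z_score(5, boosted))   # ~1.13
```

The top app contributed nothing to the attack, yet its normalized score falls because the population mean rose and the spread among scored apps collapsed.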
Indeed, the reviewers provide scores that reflect the relation between an app and the "average app". If the average app changes (by boosting the social score of some apps), then indeed the score of all apps changes, and that is a good thing. #119 is relevant here, describing the same thing for NIL. So I don't see anything wrong in the Awario social Z score. I agree that third parties can and do boost the social score by promoting the app on the social networks, but that is exactly what the Awario score is about. The score can only give a vague indication of whether the team is active or not. For that we would need a better activity reviewer (#169).
If the Awario social scores of all apps are boosted to the max, they will indeed all have an Awario social Z score of zero. As a result, using the November app mining data, high ranking apps stand to lose more than 0.1 points of their final score, while debuting apps keep their scores. The top debuting app would move from 13th to 4th place. That's considerable motivation to manipulate the scores by spending an hour posting all app names to social channels.
This is a naive assumption. Participants with multiple apps have motivation and opportunity to let some of their apps deliberately drop in score to boost the score of their high ranking app(s).
That one app can gain in rank by boosting the scores of other apps is a perversion of what the Awario score is about.
Although I certainly follow the logic behind what you're describing, I would rather wait until this actually becomes a problem before making big changes to the scoring method. Removing the Z score for social would introduce a ton of other problems, and vastly boost existing apps vs. new apps. If we ended up in a world where 100% of apps received a 5/5, then maybe we could just drop the social score. I'd also prefer to remove the social score if we suspect that someone is engaging in this 'attack'.
What is the problem you are seeing? Please describe.
The Awario Z scores can be gamed to affect current and future ranking.
For example, using the November app mining data, by systematically maximizing the Awario social score of all apps with an original score above zero, the max social Z score drops from 1.74 to 0.84. As a consequence, high ranking apps would lose around 0.05 from their final scores, which would lift top debuting apps several positions in rank (they have no Awario score) and give future debuts an edge, as the lower final scores would affect the "memory function" for incumbents.
Alternatively, somebody could boost all apps to have a minimum Awario social score of four. This would increase the score of high ranking apps by around 0.02 points, while others would experience a loss of up to 0.3 points. Debuting apps would keep the same score and thus stand to benefit substantially.
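The floor-at-four scenario can also be illustrated with a toy population (hypothetical scores, not the real data; the magnitudes differ, but the direction of the effect matches):

```python
import statistics

def z(x, scores):
    # z-score relative to the population mean and sample standard deviation
    return (x - statistics.mean(scores)) / statistics.stdev(scores)

original = [0, 0, 1, 2, 3, 4, 5]
# Attack: raise every app's Awario social score to a minimum of 4
floored = [max(s, 4) for s in original]

# The top-scoring app's z-score rises, because the population mean moves
# up less than the spread collapses...
print(z(5, original), z(5, floored))
# ...while a formerly mid-ranked app (boosted from 3 to 4) ends up worse off
print(z(3, original), z(4, floored))
```

Compressing everyone toward the top shrinks the standard deviation, so the one remaining outlier is rewarded and the middle of the field is punished.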
How is this problem misaligned with goals of app mining?
It opens the door to disruptive manipulation of the scores and rankings. Also, one purpose of the Awario social score is to require a minimal level of activity to remain at a higher rank, which would be defeated by a third party boosting the scores of inactive teams.
What is the explicit recommendation you’re looking to propose?
Use the mean social score of an app in place of the Z-score, making it independent of the scoring of the other apps.
The loophole is caused by the Awario Z scores depending on the average score of all apps and their standard deviation, which can easily be manipulated by boosting the count of social channels of the apps towards the extremes or towards the average.
Describe your long term considerations in proposing this change. Please include the ways you can predict this recommendation could go wrong and possible ways to mitigate.
The purpose of using the standard deviation is to "normalize" the scores so they are comparable across the different reviewers. The proposal normalizes the scores to be in range 0 to 1.
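A minimal sketch of the proposed normalization (assuming the Awario score is an app's own mean on a 0–5 scale, divided by the maximum to land in [0, 1]; the function name is illustrative):

```python
def normalized_social(mean_score, max_score=5.0):
    # Proposed replacement for the z-score: scale the app's own mean
    # Awario score into [0, 1], independent of every other app's score
    return mean_score / max_score

# Boosting or depressing other apps cannot change this app's component
print(normalized_social(5))    # 1.0
print(normalized_social(2.5))  # 0.5
```

Since the denominator is a fixed constant rather than a population statistic, no amount of manipulation of other apps' social channels can move a given app's social component.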
Additional context
#30 #135 #154 #127