
Fix "hot" algorithm to treat negative score sanely #583

Closed

Conversation

iangreenleaf

The hot algorithm applies the "sign" of a post's score weirdly. This has been noticed before.

The time-based value is a large value that grows over time, so applying the sign to that yields weird results for posts with negative score. Not only do posts with a negative score rank lower than all positive posts, but older negative posts actually rank higher than newer negative posts, all else being equal.

Here's what I mean:

_hot(1, 0, 1262304000) #=> 2850.0
_hot(0, 1, 1262304000) #=> -2851.0
_hot(0, 1, 1353107345) #=> -4869.0

With this patch, it returns much more expected values:

_hot(1, 0, 1262304000) #=> 2850.0
_hot(0, 1, 1262304000) #=> 2850.0
_hot(0, 1, 1353107345) #=> 4868.0
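For readers skimming the thread, here is a minimal, self-contained sketch of the two variants (not the exact patch: the shape of the return line and the 45000 divisor are quoted later in this thread, while the 1134028003 epoch offset is assumed from the public reddit source, so the exact decimals it produces may differ slightly from the figures above):

# Sketch only: the current formula applies the sign to the time term,
# the proposed one applies it to the vote-magnitude term.
from math import log10

def _sign(s):
    return 1 if s > 0 else -1 if s < 0 else 0

def hot_current(ups, downs, date):
    s = ups - downs
    order = log10(max(abs(s), 1))
    seconds = date - 1134028003  # epoch offset assumed from the public source
    return round(order + _sign(s) * seconds / 45000, 7)  # sign flips the time term

def hot_proposed(ups, downs, date):
    s = ups - downs
    order = log10(max(abs(s), 1))
    seconds = date - 1134028003
    return round(_sign(s) * order + seconds / 45000, 7)  # sign flips only the magnitude

# Current formula: a newer negative post ranks below an older one.
print(hot_current(0, 1, 1262304000) > hot_current(0, 1, 1353107345))  # True
# Proposed formula: recency still helps, whatever the sign of the score.
print(hot_proposed(0, 1, 1353107345) > hot_proposed(0, 1, 1262304000))  # True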

@andreazevedo

+1
It makes absolutely no sense for it to be the way it is right now.

@andreazevedo

@iangreenleaf your commit isn't the best option either.
What you have to do is multiply sign by both seconds and order.

@iangreenleaf
Author

@andreazevedo I disagree. Negative votes mean that order is a bad thing, and it should be counted against you. However, newer is still better than older, so a larger seconds value should still count towards a better score.

@andreazevedo

@iangreenleaf It's kind of odd. In my opinion, if a piece of content receives 10,000 negative votes in one second, it is worse than one that receives the same 10,000 negative votes over the course of a month.
Don't you think?

@iangreenleaf
Author

Interesting, I hadn't looked at it that way before. They don't really account for that with positively-scored entries, because content with higher Votes-Per-Second is naturally newer than similarly-scored content with lower VPS, and so still gets scored higher. But that relationship gets inverted with negatively-scored content.

Still, I don't think applying the sign to both is a good solution. That will result in all negative content (even a score of -1 from 5 seconds ago) being ranked lower than all positive content (even a score of 1 from 3 years ago).
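To put rough numbers on that, here is a sketch of what multiplying the sign into both terms would do (same caveats as above: the 45000 divisor is quoted later in the thread, and the epoch offset is an assumption):

# Sketch: applying the sign to both terms drops every negatively-scored post
# below every positively-scored one, no matter how old the positive post is.
from math import log10
import time

def hot_sign_both(ups, downs, date):
    s = ups - downs
    order = log10(max(abs(s), 1))
    sign = 1 if s > 0 else -1 if s < 0 else 0
    return round(sign * (order + (date - 1134028003) / 45000), 7)

now = time.time()
print(hot_sign_both(0, 1, now - 5))                # -1 from five seconds ago: hugely negative
print(hot_sign_both(1, 0, now - 3 * 365 * 86400))  # +1 from three years ago: still positive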

@bsimpson63
Contributor

The hot sort might seem a bit weird, but it's intentionally set up that way.

@bsimpson63 bsimpson63 closed this Mar 6, 2013
@iangreenleaf
Author

@bsimpson63 I'm genuinely curious -- why is it set up that way? I can't think of a reason for this to be desirable. In fact, I investigated some controversial posts present in the hot score lists, and it seems like there's another mechanism present to keep those from being buried if they momentarily tip negative (I had a hard time saying for certain, because there appeared to be some caching or load balancing happening that made the results a little jittery). If that mechanism exists, though, it's been implemented somewhere else in the code.

@bsimpson63
Contributor

The intent was to have 3 groups of posts: positive, zero, and negative score. Anything without a positive score is safe to throw away. I agree that it seems odd for the function to be discontinuous, but it's that way by design.

@iangreenleaf
Author

It does seem odd. Aren't you worried about gaming? Someone motivated can completely bury a post from the Hot page, as long as they strike quickly and have as many votes available to them as a post is going to pick up from the All page (the only place it will appear once it goes negative). In lower-traffic Reddits, I suspect the required number of votes is ~1.

@pa-zz-yco

I don't see how the current math even accomplishes the stated intention. If the goal is to find good content, why would you penalize brand new content with two negative votes more than three-day old content with the same two negative votes? Can you really declare that newer piece more fit for deletion than the older post? A discontinuous function is one thing, but to decide new bad content is worse than old bad content just seems strange, especially because there's a good chance that old content was banished to the "bad" pile quickly as well.

Some subreddits have well over three million viewers; why should two fast trolls get to trigger garbage collection on a piece immediately when it's common for a given post to end up receiving on the order of 10,000 votes before it ages out? Sure, /new's raw time-ordering helps a little on the highly trafficked subs, but that just means all the power rests with those who specifically seek out /new. That also doesn't help smaller subs, where people are much more likely to only browse the main ordering.

So who exactly is this math helping at this point? Reddit is big enough that it shouldn't have to worry about throwing away a post two minutes in; /new readers wield disproportionate control of a sub in the best case and can virtually instaban posts in the worst case. What does this accomplish now that the site has eight years and millions of users worth of storage scaling under its belt?

@jamonholmgren

There's a post by the PR author that presents this in more detail. It seems pretty odd to me too.

@ketralnis
Contributor

This comes up all of the time; see my comments on http://www.reddit.com/r/programming/comments/td4tz/reddits_actual_story_ranking_algorithm_explained/

I feel like every 6 months someone has an epiphany about how wrong it is, convinces the front-page of /r/programming, and one of us has to explain it again.

@ketralnis
Contributor

I propose this fix:

diff --git a/r2/r2/lib/db/_sorts.pyx b/r2/r2/lib/db/_sorts.pyx
index 0c3b4c3..48e6b7e 100644
--- a/r2/r2/lib/db/_sorts.pyx
+++ b/r2/r2/lib/db/_sorts.pyx
@@ -44,6 +44,9 @@ cpdef double hot(long ups, long downs, date):

 cpdef double _hot(long ups, long downs, double date):
     """The hot formula. Should match the equivalent function in postgres."""
+
+    # not a typo. seriously.
+
     s = score(ups, downs)
     order = log10(max(abs(s), 1))
     if s > 0:

@dls

dls commented Dec 10, 2013

Maybe put "not a typo" right above return round(order + sign * seconds / 45000, 7), and link to here or one of the reddit discussions?

@rhiever

rhiever commented Dec 10, 2013

This is the most clarifying reddit discussion: http://www.reddit.com/r/programming/comments/td4tz/reddits_actual_story_ranking_algorithm_explained/c4m18tb

Although I'm still puzzled why this strange behavior when a post goes negative is considered "correct." I disagree with the statement that "what happens below 0 is pretty moot"; with this algorithm, if someone's post sinks below 0, their post is dead and won't recover. This means that someone could potentially abuse the new queue: http://technotes.iangreenleaf.com/posts/2013-12-09-reddits-empire-is-built-on-a-flawed-algorithm.html

@pa-zz-yco

I've read your comments there and on the new post. You've explained that you don't care about negative scores, but I've yet to see a reasonable explanation for why. Your math fits the assumption that a negative score should never be shown, but I contend that said assumption doesn't jibe with the actual goal of reaching a consensus on whether a post is good as postAge approaches infinity. You're assuming that browsers of /r/foo/new behave in the same way as the population of /r/foo as a whole would behave, and you're assuming that all posts to /r/foo/new receive enough actual views to reach a remotely useful consensus before effectively being pushed out of the queue.

@ketralnis
Contributor

Scores below 0 are a non-issue. We've been over this a lot of times. Here are a few from my paste-buffer:

But seriously this happens every 6 months, and the situation hasn't changed since the first "discovery" about five years ago.

@bharrisau

The really confusing part is that there is an equation in there dealing with the sign in a weird way. Based on all the comments, you are saying "when up <= down we don't care". If you had:

if up <= down:
    return 0

People might still not agree, but there would be much less confusion. You might not care, but the function is returning values for those cases and the values behave in a confusing way. Because transposing two symbols causes the function to return sane values for all inputs, good Samaritans are always going to think it is a bug.
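A hypothetical sketch of how that explicit early return might read in the full function (constants per the return line quoted earlier, with an assumed epoch offset):

# Hypothetical: state "non-positive scores are ignored" directly instead of
# encoding it through the sign arithmetic.
from math import log10

def _hot(ups, downs, date):
    s = ups - downs
    if s <= 0:
        return 0  # intentionally not ranked: only positive-score posts matter here
    order = log10(s)
    seconds = date - 1134028003  # epoch offset assumed from the public source
    return round(order + seconds / 45000, 7)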

@passthefist

Unless there's something else in place, doesn't this open up a way to manipulate votes?

The quickmeme guy did something similar to manipulate non-quickmeme posts. I'm sure reddit has something to detect this (that guy got caught, but it was people sleuthing, not automatic detection), but suppose I have some bots, and I want to game the system to kill posts with some criteria.

If a post matches my criteria, then some but not all of the bots downvote with, say, 60% probability; otherwise they vote 50/50 up-down. That'd look fairly normal to most people looking over the voting pattern, other than the bots only voting in /new, but because even a small negative difference kills things quickly, it would let me selectively prevent content from bubbling up to a front page.

There's stuff in place to look for vote manipulation, but would a scheme like this be caught? Otherwise /u/gtw08 might still be gaming advice animals.

@taywrobel

If this "happens every 6 months" that likely means that it is not intended behavior, or at least not intuitive.

From what I can see, every previous discussion on this topic that you link to, @ketralnis, ends up with you saying either "You're just incorrect." or "Yes, that's accurate", without addressing the actual concerns people express and have ample evidence to support.

The point of open source software is to attempt to improve programs using a community of reviewers. If every half year a new reviewer gives the same feedback, and gets a large community of programmers to back them, I find it very difficult to believe that they are all mistaken, especially when all the justification ever given is "it's intentionally done that way". It's intentional for a single downvote on a new post to nuke its chances of being seen? It's intentional for newer posts with a negative score to be less "hot" than older posts with the same score? I find that unlikely, and can't see any justification for these behaviors in any prior discussion.

You fail to even be internally consistent with your stated ranking logic, and giving no actual reasoning as to why is what I feel frustrates people most about this issue.

I'd say the real question is, since this change would make no difference to net positive posts, and will have little to no impact on larger subreddits, why not just merge the code, appease the masses, and not have to justify this every few months? ;)

@pa-zz-yco

Yes. I get you know about it. I get you think it's good. Your math matches your assumptions. You've yet to say why those assumptions actually help reddit's overarching goal. Why should non-positive scores be thrown away (even zero scores are penalized over 5500 points compared to +1 scores) across the board?

In the second link, /u/allenth says, "Very small subreddits are the main area where things like this can be a problem. In those cases, things that aren't on the hot listing are much less likely to ever get seen."

Why is this a good thing?

Reddit's default sorting is known, even to non-technical users, to not really do what it says on the tin if you're unlucky with the first few voters, or if someone is intentionally sniping posts early. As you say, every six months a programmer redditor gets bothered by the funky behavior, walks up to this code, and says "yup, this code has the same funk as the behavior nobody likes". He then brings it up to other programming-minded people, who agree there's a funk. And you just keep repeating that the funk is supposed to be there because it matches your assumptions about how this behavior should smell. How many other people--mathematically minded people who are familiar with reddit's goal of delivering good content on the order of hours or days--need to say this algorithm (not the code, the algorithm) has a nasty funk before you consider that it maybe shouldn't smell that way?

@ketralnis You yourself say, "Smoothness around the real life dates and scores on the site is more important than smoothness around 0, where we don't really have listings that will display it anyway." If you care about smoothness over human-scale time, why do week-old -1 posts get ranked higher than day-old -1 posts? Why should either be harder to stumble across than a +1 post on that same board 5 years ago? If the time spent on /r/foo/new is long enough for a post to reach an idea of how well it will fare over its lifetime, why do you need to change the math to push it further out of sight? If you care about smoothness of scores on the population scale, why let a handful of fast voters hinder the population from judging the content? We already know people are more likely to downvote something that already has a negative score; that seems like a more honest way to let content die than sending it back to 1997.

Your quote is implying that you find an hour-old -1 post next to a week-old 200 post unintuitive. Fine. But in order to escape that unintuitive case (which would only happen to the small minority of redditors who venture from the top/default subs anyway), you produce a more unintuitive system where the first handful of votes has a vastly disproportionate effect on the final karma score of a post. Users on larger subs complain about getting two downvotes and nothing else ever happening; people with an axe to grind and meme site owners spin up bots to exploit the magic around the [-1,1] range, and you get to write ever more sophisticated detection measures. 11000 upvotes on a brand new post are worth 4.04 ranking points, or a little over 2 days of priority. 2 downvotes are worth 11000 ranking points, or -16 years. Cuz smoothness. And even if everybody does their civic duty and browses /new, a highly read, highly controversial post that interleaves 1000 upvotes and 1002 downvotes by the time it falls out of /new won't be considered worth as much as that one bad joke that two people weeded out last month. Clearly this is a smooth and intuitive means of generating consensus.
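For anyone who wants to check those figures, the arithmetic works out roughly like this (a back-of-the-envelope sketch: the 45000 divisor comes from the return line quoted earlier, while the epoch offset and the "December 2013" timestamp are assumptions):

# Back-of-the-envelope check of the "4.04 points", "11000 points" and "-16 years" figures.
from math import log10

SECONDS_PER_POINT = 45000
EPOCH_OFFSET = 1134028003   # assumed from the public reddit source
NOW = 1386633600            # roughly December 2013, when this comment was written

# 11,000 net upvotes contribute log10(11000) ranking points:
print(log10(11000))                              # ~4.04 points
print(log10(11000) * SECONDS_PER_POINT / 86400)  # ~2.1 days of recency

# Going from a +1 to a -1 net score flips the sign on the time term, swinging
# the score by twice the (age of reddit / 45000) value:
swing = 2 * (NOW - EPOCH_OFFSET) / SECONDS_PER_POINT
print(swing)                                         # ~11,000 ranking points
print(swing * SECONDS_PER_POINT / (365.25 * 86400))  # ~16 years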

@NullEntity

@ketralnis But if one person downvotes something that was just posted, why should it immediately be thrown away from ever being on the Hot list, even if it gets 1000 upvotes in an hour?

@bharrisau

Guys, the devs have pointed out that they are happy with "hot articles must have positive score". This isn't a democracy. Looks like the code originally had to match the SQL. Maybe things will change, maybe they won't; walls of text on a 12-month-old pull request aren't going to change anything.

@NullEntity

They should. It's a bug. Just because it hasn't been addressed yet doesn't mean it should never be.

@taywrobel

If they don't want feedback they shouldn't open source their code.

@iangreenleaf
Author

Hey people, I appreciate all the attention, and it's nice that I'm not the only one who cares about this. That said, this is effectively a closed ticket and the devs don't have to fix it if they don't want to. Let's not blow this up into a never-ending comment thread / flame war (that's what the Reddit comments are for ;)).

@taywrobel

Not meaning to make this a flame war at all, I just wish they were more open to change/responsive to the opinions of their users. Their approach now sort of undermines the whole open source idea...

@vadi2

vadi2 commented Dec 10, 2013

It doesn't. Just because they open-source the code, it does not mean their project is now free to be taken over by other random people on the internet. They still decide where they want to go.

You are confusing open source with open project management (or something of that sort - there is a clear distinction).

@rubensayshi

maybe write it in a way that at least shows you intended the code to work like this and it wasn't just a small fuck up with + * / signs...

return round(order + ((sign * seconds) / 45000), 7)

@svivian

svivian commented Dec 10, 2013

Wouldn't a better fix be to only make sign negative when a post has several more downvotes than upvotes? In other words:

if s > 0:
    sign = 1
elif s < -3:
    sign = -1
else:
    sign = 0

This means that (a) posts with many negative votes are banished as they should be, and (b) a single user cannot banish a post from /hot by themselves.

@guilhermesimoes

Why not make this an option for smaller subreddits?

@rhiever

rhiever commented Dec 10, 2013

Okay y'all, put your pitchforks away, I finally figured out their reasoning. Unfortunately their reasoning was scattered across multiple responses over the months, so let me summarize my understanding of why the hotness score for negative-score posts does not matter in their eyes:

The Hotness function, as implemented, works like this (http://www.reddit.com/r/programming/comments/td4tz/reddits_actual_story_ranking_algorithm_explained/c4m18tb):

So that means all posts in all subreddits (when browsing 'hot') are sorted this way:

  1. all posts with more upvotes than downvotes with the order determined by age (newer posts are preferred) and popularity
  2. all posts with the same number of up- and downvotes in whatever order the database returns them
  3. all posts with fewer upvotes than downvotes with the order determined by age (older posts are preferred) and popularity (posts with a lot more downvotes are preferred)

And that behavior is what folks are confused about. In particular, this opens up the possibility of gaming the system by downvoting a new post into oblivion with just a few early downvotes. (http://technotes.iangreenleaf.com/posts/2013-12-09-reddits-empire-is-built-on-a-flawed-algorithm.html)
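A quick sketch of that three-group ordering, using the return line quoted earlier in the thread with an assumed epoch offset and made-up post data:

# Demonstrates the three groups produced by the original formula:
# positive posts (newer first), zero-score posts, negative posts (older first).
from math import log10

def hot(ups, downs, date):
    s = ups - downs
    order = log10(max(abs(s), 1))
    sign = 1 if s > 0 else -1 if s < 0 else 0
    return round(order + sign * (date - 1134028003) / 45000, 7)

posts = [
    ("new, +5", 6, 1, 1386600000),
    ("old, +5", 6, 1, 1380000000),
    ("new, 0",  3, 3, 1386600000),
    ("new, -2", 1, 3, 1386600000),
    ("old, -2", 1, 3, 1380000000),
]
for name, ups, downs, date in sorted(posts, key=lambda p: hot(*p[1:]), reverse=True):
    print(name, hot(ups, downs, date))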

HOWEVER...

In practice, the Hotness algorithm is only ever applied to posts with a score > 0. Posts remain in the /r/foo/new queue, sorted only by time, until they typically reach a score of 9-10, at which point they move into the Hot queue; the Hotness ranking is not applied to them before then. This is why posts with a score < 1 don't matter for the Hotness algorithm.

If this explanation is correct, please resolve this issue by placing this explanation or a similar explanation in the comments about the confusing code. That is the whole point of comments, after all -- to explain non-intuitive portions of code.

@iangreenleaf
Author

Looks like as of 50d35de, this bug is officially fixed. Thank you, reddit devs, for making this change.

@Quest79

Quest79 commented Feb 20, 2014

Such a small, simple change having such a complex and widespread effect on the quality of a site with millions of viewers. Fascinating.
