This repository was archived by the owner on Oct 11, 2022. It is now read-only.

@mxstbr
Contributor

@mxstbr mxstbr commented Aug 18, 2018

Status

  • WIP
  • Ready for review
  • Needs testing

Deploy after merge (delete what needn't be deployed)

  • mercury
  • api
  • hyperion (frontend)

Deploy in that order!!

Related issues (delete if you don't know of any)
Supersedes and thus closes #3727
Supersedes and thus closes #3726
Closes #3718

Based on the conversations in #3727 here is a new approach to frecency thread feeds.

TODO

  • Add new calculateThreadScore queue
  • Process jobs in mercury, assigning a score and scoreUpdatedAt for each thread
  • Add jobs to the queue from mutations changing properties of a thread related to the score
  • Figure out what to do with all the existing threads
  • Add sort argument to Community.threadConnection GraphQL type
  • Add sort argument to Channel.threadConnection GraphQL type
  • Look at results with recent db backup and tweak algorithm if necessary
  • After merging, first deploy mercury to production, then the API and the frontend to alpha, then check alpha to see if everything looks fine
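
The queue-related items in the TODO list above could be sketched roughly as follows. This is a minimal in-memory stand-in, not the code from this PR: the array-based queue, the `threads` map, the sample thread, and the placeholder `calculateScore` formula are all assumptions made for illustration. Only the names `calculateThreadScore`, `score`, and `scoreUpdatedAt` come from the list itself.

```javascript
// Hypothetical in-memory stand-ins for the database and the job queue.
// A real setup would use a persistent queue that mercury consumes.
const threads = new Map([
  ['thread-1', { id: 'thread-1', messageCount: 12, participantCount: 4 }],
]);
const calculateThreadScoreQueue = [];

// Called from any mutation that changes a score-relevant property of a thread.
function enqueueCalculateThreadScore(threadId) {
  // Skip duplicates so a burst of changes leads to a single recalculation.
  if (!calculateThreadScoreQueue.includes(threadId)) {
    calculateThreadScoreQueue.push(threadId);
  }
}

// Placeholder formula (assumption); the real scoring is not shown in this thread.
function calculateScore(thread) {
  return thread.messageCount + thread.participantCount;
}

// Worker side: drain the queue and store score and scoreUpdatedAt per thread.
function processQueue(now = new Date()) {
  while (calculateThreadScoreQueue.length > 0) {
    const threadId = calculateThreadScoreQueue.shift();
    const thread = threads.get(threadId);
    if (!thread) continue; // the thread was deleted in the meantime
    thread.score = calculateScore(thread);
    thread.scoreUpdatedAt = now;
  }
}
```

Storing `scoreUpdatedAt` next to `score` makes it possible to tell stale scores apart from fresh ones when tweaking the algorithm later.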

@mxstbr mxstbr mentioned this pull request Aug 18, 2018
10 tasks
@spectrum-bot

spectrum-bot bot commented Aug 18, 2018

Warnings
⚠️

These modified files do not have Flow enabled:

  • mercury/index.js

Generated by 🚫 dangerJS

@mxstbr mxstbr changed the title from "Frecency thread feeds #2" to "Frecency thread feeds (second try)" Aug 18, 2018
@mxstbr
Contributor Author

mxstbr commented Aug 18, 2018

These are the results on a not-very-recent prod backup.

Before:

screen shot 2018-08-18 at 19 21 57

After:

screen shot 2018-08-18 at 19 23 52

I don't really like these results much yet. The current algo indexes heavily on recency (i.e. last activity), but that means that a thread with 150 participants that's half a year old will shoot up to the top if a single person writes another message. 😕

I think we need to do time boxing when considering the raw content, and count messages that were sent a while ago a lot less than messages that were sent just now. We can do that now! Will dig into that.
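
As a rough illustration of the time-boxing idea, each message could be weighted by the age bucket it falls into, so that old activity barely counts. The bucket boundaries, the weights, and the function names below are made up for this sketch and are not taken from the PR.

```javascript
const HOUR = 60 * 60 * 1000;
const DAY = 24 * HOUR;

// Hypothetical time boxes: [maximum age in ms, weight per message].
const TIME_BOXES = [
  [DAY, 1.0],
  [7 * DAY, 0.5],
  [30 * DAY, 0.1],
];
// Weight for anything older than the last box (assumption).
const OLDER_WEIGHT = 0.01;

// Returns the weight of a single message based on its age.
function weightForAge(ageMs) {
  for (const [maxAge, weight] of TIME_BOXES) {
    if (ageMs <= maxAge) return weight;
  }
  return OLDER_WEIGHT;
}

// Sums the weights of all messages; timestamps are in ms since the epoch.
function timeBoxedScore(messageTimestamps, now) {
  return messageTimestamps.reduce(
    (sum, sentAt) => sum + weightForAge(now - sentAt),
    0
  );
}
```

With these made-up weights, a thread with 150 messages from half a year ago plus one brand-new message scores about 2.5, while a thread with five messages from the last hour scores 5, so a single bump no longer lifts the old thread to the top.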

@mxstbr
Contributor Author

mxstbr commented Aug 20, 2018

I edited the algorithm to exponentially bias towards recency, based on what Mozilla does. Here's what the results look like on a 15-minute-old db backup. The first image is sorted by latest (the old version), the second is trending (the new version). Note that being roughly as good as the old version is good enough for the algorithm, since we really just don't want people bumping shitty threads themselves.
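
One way to read "exponentially bias towards recency" is exponential decay with a half-life, where a message loses half its weight every fixed number of days. The sketch below follows that reading only; the 7-day half-life and all names are assumptions, and the actual formula used in this PR is not shown in the conversation.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Assumed half-life: a message counts half as much after this many days.
const HALF_LIFE_DAYS = 7;
const DECAY_RATE = Math.LN2 / HALF_LIFE_DAYS;

// Weight of one message: 1 when brand new, halving every HALF_LIFE_DAYS.
function decayedWeight(ageMs) {
  const ageDays = Math.max(0, ageMs) / DAY_MS;
  return Math.exp(-DECAY_RATE * ageDays);
}

// Thread score as the sum of decayed message weights.
function trendingScore(messageTimestamps, now) {
  return messageTimestamps.reduce(
    (sum, sentAt) => sum + decayedWeight(now - sentAt),
    0
  );
}
```

Compared with fixed time boxes, the decay is smooth: there is no cliff at a bucket boundary, and the contribution of half-year-old messages is effectively zero.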

Figma

screen shot 2018-08-20 at 09 43 26

screen shot 2018-08-20 at 09 50 34

Verdict: Not really a difference in terms of quality at this snapshot in time.

Codepen

screen shot 2018-08-20 at 09 43 42

screen shot 2018-08-20 at 09 50 29

Verdict: Algorithmic version seems a little bit better.

React

screen shot 2018-08-20 at 09 43 54

screen shot 2018-08-20 at 09 50 24

Verdict: Algorithmic version looks better.

styled-components

screen shot 2018-08-20 at 09 59 25

screen shot 2018-08-20 at 09 59 06

Verdict: Algorithmic version looks better.

Based on those four hand-picked examples (I didn't check anything else) I say this algorithm looks good for now, let's ship it if the code looks good?

@mxstbr
Contributor Author

mxstbr commented Aug 20, 2018

Just realized my previous migration won't work in production—not sure what to do with all the existing threads just yet...

type GetThreadsByChannelPaginationOptions = {
first: number,
after: number,
sort: 'new' | 'trending'
Contributor

How do we want these options to be defined and expand over time? Is trending the same as popular? Can we ever sort by date? e.g. trending this week/month/year

Contributor Author

We probably won't be able to do trending this week/month/year with the current setup, since we only store the score right now. Other than that, I think we probably want "TOP", but that's a different calculation altogether.

Not sure, any thoughts/ideas?

Contributor

Not sure either, just trying to think ahead a bit about what other constraints might enter the mix. But I also don't want to get so caught up that it prevents this from moving forward :)

export type CommunityThreadConnectionPaginationOptions = {
after: string,
first: number,
sort: 'NEW' | 'TRENDING',
Contributor

Above we had 'new' and 'trending' but here they are all caps. Should we be consistent with these?

Contributor Author

No, I don't think so? One is the GraphQL side (which has to be all caps) and the other is the internal argument for the db query (which we never write in all caps).

- export default async (root: DBCommunity, args: PaginationOptions, ctx: GraphQLContext) => {
-   const { first = 10, after } = args
+ export default async (root: DBCommunity, args: CommunityThreadConnectionPaginationOptions, ctx: GraphQLContext) => {
+   const { first = 10, after, sort = 'NEW' } = args
Contributor

Ditto to caps vs lowercase

Contributor Author

GraphQL stuff

const threads = await getThreadsByChannels(channels, {
first,
after: lastThreadIndex,
sort: sort === 'NEW' ? 'new' : 'trending',
Contributor

Caps vs lowercase - then we could just say: { sort } if the sort argument is either new or trending

Contributor Author

@mxstbr mxstbr Aug 20, 2018

This translates the external GraphQL stuff to the internal db query options.
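
The translation layer described in this comment could be sketched as a small lookup that also rejects unknown values. Only the values 'NEW', 'TRENDING', 'new', and 'trending' appear in the diff; the helper names, the in-memory sort, and the `lastActive` field are hypothetical.

```javascript
// External GraphQL enum value -> internal db query option.
const INTERNAL_SORT_BY_GRAPHQL_VALUE = new Map([
  ['NEW', 'new'],
  ['TRENDING', 'trending'],
]);

// Translates the GraphQL argument, defaulting to 'NEW' like the resolver does.
function toInternalSort(graphqlSort = 'NEW') {
  const internal = INTERNAL_SORT_BY_GRAPHQL_VALUE.get(graphqlSort);
  if (internal === undefined) {
    throw new Error(`Unknown sort value: ${graphqlSort}`);
  }
  return internal;
}

// In-memory stand-in for the db query: newest activity first, or highest score first.
// `lastActive` is an assumed numeric timestamp field.
function sortThreads(threads, sort) {
  const key = sort === 'trending' ? 'score' : 'lastActive';
  return [...threads].sort((a, b) => b[key] - a[key]);
}
```

If both sides used the same spelling, the lookup would collapse to passing `{ sort }` straight through, which is the simplification suggested in the review comment above.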

}
enum CommunityThreadConnectionSort {
NEW
Contributor

caps vs lowercase?

Contributor Author

GraphQL stuff

threadConnection(
first: Int = 10
after: String
sort: CommunityThreadConnectionSort = NEW
Contributor

caps vs lowercase?

Contributor Author

GraphQL stuff

@mxstbr
Contributor Author

mxstbr commented Aug 21, 2018

Oh damn I thought all our GraphQL enums were purely uppercase, that's why I did the external stuff uppercase and internal stuff lowercase—but I just now realized GraphQL enums can also be lower case 🤦‍♂️

Thanks for calling me out on that, you're totally right those should be lower case. On it!

@mxstbr
Contributor Author

mxstbr commented Aug 21, 2018

This should be ready to go, assuming e2e tests pass. The only thing I'm not sure about is how we're going to handle assigning a score to all existing threads—should we just manually add a calculateThreadScore job for every thread in the db?

@brianlovin
Contributor

> should we just manually add a calculateThreadScore job for every thread in the db?

Seems fine to do?

  1. Ship mercury
  2. Run migration
  3. Let mercury process all existing threads
  4. Ship the rest
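
The migration step in the list above (one calculateThreadScore job per existing thread) could be sketched as a batched backfill. The `fetchThreadIds` and `addJob` helpers are hypothetical and injected as parameters, so nothing here reflects the project's real db or queue API; the batch size is also an assumption.

```javascript
// Enqueues one calculateThreadScore job per existing thread, batch by batch.
// `fetchThreadIds(offset, limit)` and `addJob(name, data)` are hypothetical
// async helpers supplied by the caller.
async function backfillThreadScores({ fetchThreadIds, addJob, batchSize = 100 }) {
  let offset = 0;
  let enqueued = 0;
  while (true) {
    const ids = await fetchThreadIds(offset, batchSize);
    if (ids.length === 0) break; // no more threads to process
    // Enqueue the whole batch before fetching the next one.
    await Promise.all(
      ids.map(threadId => addJob('calculateThreadScore', { threadId }))
    );
    enqueued += ids.length;
    offset += ids.length;
  }
  return enqueued;
}
```

Working in batches keeps memory use flat even with many threads, and because mercury is shipped first, the jobs start being processed as soon as they are added.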

@mxstbr
Contributor Author

mxstbr commented Aug 22, 2018

Shipping mercury and the API!

@mxstbr mxstbr merged commit 080caed into alpha Aug 22, 2018
@mxstbr mxstbr deleted the new-frecency-feeds branch August 22, 2018 07:37