Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Give karma to coauthors #5250

Merged
merged 16 commits into from Jul 15, 2022
Merged

Give karma to coauthors #5250

merged 16 commits into from Jul 15, 2022

Conversation

oetherington
Copy link
Collaborator

@oetherington oetherington commented Jul 12, 2022

This PR allows users to receive karma from posts that they co-author.

I've tested the following:

  • Co-authors receive karma when upvoting/lose karma for downvoting (and this can be undone as well)
  • The karma is shown in their "karma changes" star
  • The karma is updated correctly when merging accounts

Deploying should require the following process (times are approximate based on local testing and extrapolated for the production DB size):

  1. Run allowMultipleVoteAuthors migration (~10-15 minues)
  2. Deploy code changes (~8 mins)
  3. Run allowMultipleVoteAuthors migration again (<1 min)
  4. Run removeVoteAuthorId migration (~10-15 mins)
  5. Run updateCollectionScores migration (<5 mins)

┆Issue is synchronized with this Asana task by Unito

@oetherington oetherington marked this pull request as ready for review July 12, 2022 10:26
@oetherington oetherington requested a review from a team as a code owner July 12, 2022 10:26
@oetherington oetherington requested review from Xodarap and jpaddison3 and removed request for a team and Xodarap July 12, 2022 10:26
@oetherington oetherington assigned jpaddison3 and unassigned Xodarap Jul 12, 2022
Copy link
Collaborator

@jpaddison3 jpaddison3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ASTLAC

Test out the user karma migration as mentioned, read through the comments, then I think you've got it, really nice work.

Comment on lines +277 to +280
if (!hasCoauthorPermission) {
coauthorStatuses = coauthorStatuses.filter(({ confirmed }) => confirmed);
}
return coauthorStatuses.map(({ userId }) => userId);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a cute way to write this

@@ -418,22 +418,6 @@ export const useUserLocation = (currentUser: UsersCurrent|DbUser|null, dontAsk?:
return {...locationData, setLocationData}
}

// utility function for checking how much karma a user is supposed to have
export const userGetAggregateKarma = async (user: DbUser): Promise<number> => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for onlookers: this was unused.

type: String,

// The IDs of the authors of the document that was voted on
authorIds: {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I told you I didn't think this was getting exposed to the GraphQL API. I was wrong. It's conceivable that someone (namely GreaterWrong, see also) is depending on authorId. It seems unlikely, but maybe like 15%? Worth maintaining Author as a resolver-only field, I'd say.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel a bit uncomfortable at removing this. It will definitely break some analytics stuff for us, which is probably okay, but I have a feeling it will cause us headaches and that our general paradigm assumes documents have one canonical owner. One example, I assume revisions are still tied to one userId and am not yet sure how this PR will interact with that. (But only partway through reading it)

(I do separately think there's an anti-pattern around having both thing and thing-plural and that the existence of both email and emails has hurt us. (My ideal there is to get rid of the latter).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point that I hadn't thought of. Whereabouts is it exposed in the API - I still can't find it? (Also, FWIW, a case-insensitive search for authorid in the GreaterWrong codebase returns 0 results).

For now, I'll go ahead and add a resolver-only authorId as this seems to fix the analytics issue (albeit, not beautifully). I feel that this field should generally be considered deprecated though. If there's any reason why you think this won't be enough then let me know and I'll give it a deeper think.

RE documents generally have only one canonical owner: I think this doesn't really matter here as semantically the "owner" of a vote is the userId who cast the vote, not the authors receiving the karma. I definitely could be missing something here though so feel free to object.

{
$group: {
_id: '$authorId',
karmaTotal: {$sum: "$adjustedPower"},
_id: '$authorIds',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this makes sense anymore.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm missing something, I think it's correct. After the $unwind, $authorIds has been exploded into a separate, single string for each user so even though it's still plural it's really only a single ID. I did check the docs to see if there's some way to rename it to $authorId to make it clearer but it looks like that's only possible by adding an extra $project after the $unwind, which doesn't seem worth it performance-wise.

adjustedPower: {$cond: [{$ne: ['$userId', '$authorId']}, '$power', 0]},
adjustedAfPower: {$cond: [{$and: [{$ne: ['$userId', '$authorId']}, {$eq: ['$documentIsAf', true]}]}, '$afPower', 0 ]}
authorIds: 1,
power: 1,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good refactor

action: async () => {
await forEachDocumentBatchInCollection({
collection: Votes,
batchSize: 100,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
batchSize: 100,
batchSize: 1000,

@@ -233,7 +233,7 @@ async function buildContributorsList(tag: DbTag, version: string|null): Promise<
// Filtered by: is a self-vote
{ $match: {
$expr: {
$eq: ["$userId", "$authorId"]
$eq: ["$userId", "$authorIds"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this fine? I think this is not fine? I think you need to do $in?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm right, careful with your find and replace's

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right - it wasn't find and replace though, just a misunderstanding. It looks like find will happily search the array for a match (like in the rate limiting query you commented on), but the same doesn't work in an aggregation. I've made the change and added a unit test. Mongo's kinda crazy.

@@ -111,7 +111,7 @@ Vulcan.mergeAccounts = async (sourceUserId: string, targetUserId: string) => {
// Transfer votes that target content from source user (authorId)
// eslint-disable-next-line no-console
console.log("Transferring votes that target source user")
await Votes.rawUpdateMany({authorId: sourceUserId}, {$set: {authorId: targetUserId}}, {multi: true})
await Votes.rawUpdateMany({authorIds: sourceUserId}, {$set: {"authorIds.$": targetUserId}}, {multi: true})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to look this up, let's throw in a doc reference:

Suggested change
await Votes.rawUpdateMany({authorIds: sourceUserId}, {$set: {"authorIds.$": targetUserId}}, {multi: true})
// https://www.mongodb.com/docs/manual/reference/operator/update/positional/
await Votes.rawUpdateMany({authorIds: sourceUserId}, {$set: {"authorIds.$": targetUserId}}, {multi: true})

@@ -267,7 +271,7 @@ const checkRateLimit = async ({ document, collection, voteType, user }: {
const oneDayAgo = moment().subtract(1, 'days').toDate();
const votesInLastDay = await Votes.find({
userId: user._id,
authorId: {$ne: user._id}, // Self-votes don't count
authorIds: {$ne: user._id}, // Self-votes don't count
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will work, based on my reading of mongo docs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it out and it seems to work. For instance, in the EA dev db db.votes.find({ _id: "5be4dc3ce61b803757b01324", authorIds: { $eq: "Yx4AN5v8gJBCtrN67" } }) will return 1 result and db.votes.find({ _id: "5be4dc3ce61b803757b01324", authorIds: { $ne: "Yx4AN5v8gJBCtrN67" } }) will return 0.

I guess the alternative would be db.votes.find({ _id: "5be4dc3ce61b803757b01324", $expr: { $in: ["Yx4AN5v8gJBCtrN67", "$authorIds"] } }) and db.votes.find({ _id: "5be4dc3ce61b803757b01324", $expr: { $not: { $in: ["Yx4AN5v8gJBCtrN67", "$authorIds"] } } }), but everything I've read (without having actually measured it) suggests this will be much worse for performance.

I've added another couple of tests to cover rate limiting since it's pretty impractical to test manually (it doesn't cover everything, but it does cover this query - we can expand the test in the future if necessary).

@@ -222,6 +244,48 @@ describe('Voting', function() {
await waitUntilCallbacksFinished();
clock.uninstall();
});
it('includes co-authored posts in the selected date range', async () => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yaaaaaay amazing

Copy link
Collaborator

@darkruby501 darkruby501 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few edge cases I don't think are necessarily covered by this. They don't necessarily matter, but that should be a deliberate choice:

  • Are coauthors prevented from upvoting documents they're authors on?
  • Coauthors miss out on karma if they're added after a post was published and first got votes
  • As above, coauthors risk getting negative karma if they were added after an upvote but before it was removed

Otherwise I'm a little uncertain about the direction of removing canonical "userId/author" on things. I have vague hesitations about this either requiring us to change a lot of code to handle multiple document owners (with ensuing complexity) or ending up in a mixed state where some things assume documents have a single primary owner and others expecting an array. My instinct is towards preserving the "documents, etc have canonical owners" and adding coauthors as an extension that doesn't need to be handled everywhere.

type: String,

// The IDs of the authors of the document that was voted on
authorIds: {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel a bit uncomfortable at removing this. It will definitely break some analytics stuff for us, which is probably okay, but I have a feeling it will cause us headaches and that our general paradigm assumes documents have one canonical owner. One example, I assume revisions are still tied to one userId and am not yet sure how this PR will interact with that. (But only partway through reading it)

(I do separately think there's an anti-pattern around having both thing and thing-plural and that the existence of both email and emails has hurt us. (My ideal there is to get rid of the latter).

*/
const getDocumentOwners = ({newDocument, vote}: VoteDocTuple): string[] => {
const coauthors = vote.collectionName === "Posts"
? getConfirmedCoauthorIds(newDocument as DbPost)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: what permissions do coauthors have on documents? If they're not owners in the full sense of permissions (delete, share, draft, etc), this language could be confusing to call them owners here

}
});

voteCallbacks.cancelAsync.add(function cancelVoteKarma({newDocument, vote}: VoteDocTuple, collection: CollectionBase<DbVoteableType>, user: DbUser) {
// only update karma is the operation isn't done by the item's author
if (newDocument.userId !== vote.userId && collectionsThatAffectKarma.includes(vote.collectionName)) {
void Users.rawUpdateOne({_id: newDocument.userId}, {$inc: {"karma": -vote.power}});
const owners = getDocumentOwners({newDocument, vote});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this implementation, I think, if a post is upvoted, then a co-author is added, and then someone removes their upvote, the co-author will end up with net-negative karma.

Co-authors being added after initial publication is not uncommon in my experience since often people publish before getting our help on adding coauthors.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

_id: '$authorId',
karmaTotal: {$sum: "$adjustedPower"},
_id: '$authorIds',
karmaTotal: {$sum: "$power"},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Def want to see the migrations tested thoroughly

@sync-by-unito sync-by-unito bot assigned oetherington and unassigned jpaddison3 Jul 13, 2022
@oetherington
Copy link
Collaborator Author

@darkruby501 Thanks for the review!

  • Are coauthors prevented from upvoting documents they're authors on? No, they can vote just like the main author, but they won't receive karma for their own votes (just like the main author).
  • Coauthors miss out on karma if they're added after a post was published and first got votes. Yes - JP and I thought this made the most sense, otherwise co-authors will suddenly receive an inexplicable chunk of karma once this goes live. It also simplifies the handling of the next point...
  • As above, coauthors risk getting negative karma if they were added after an upvote but before it was removed. Yes - I didn't think of this case. I've pushed a commit which prevents this, along with a unit test for it.

@oetherington
Copy link
Collaborator Author

@jpaddison3 I know you've already approved, but I'm assigning back to you as a sinity check because I added 7 commits since you reviewed.

@sync-by-unito sync-by-unito bot assigned jpaddison3 and unassigned oetherington Jul 13, 2022
@oetherington oetherington merged commit dd4d579 into master Jul 15, 2022
@sync-by-unito sync-by-unito bot assigned oetherington and unassigned jpaddison3 Jul 15, 2022
@oetherington
Copy link
Collaborator Author

@jpaddison3 @darkruby501 This has now been merged to master and deployed successfully. Here's the timings of the migrations:

  1. allowMultipleVoteAuthors migration (~840000 votes in 14.5 minutes)
  2. deploy to production (~5 minutes)
  3. allowMultipleVoteAuthors migration (10 votes in <1 second)
  4. removeVoteAuthorId migration (~840000 votes in 14 minutes)
  5. updateCollectionScores migration (14820 posts, 85655 comments, 26687 tagrels and 4747 users in 1 minute)

For LessWrong, it's very important that you run allowMultipleVoteAuthors before and after your next deploy. The other two migrations can be left until later if you don't have time, but allowMultipleVoteAuthors is absolutely required.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants