Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a node for each Discourse like #1587

Merged
merged 1 commit into from Jan 29, 2020
Merged

Create a node for each Discourse like #1587

merged 1 commit into from Jan 29, 2020

Conversation

@decentralion
Copy link
Member

decentralion commented Jan 22, 2020

Let's use the syntax (node) to represent some node, and > edge > to
represent some edge.

In the past, for every like, we would create the following graph
structure:

(user) > likes > (post)

As of this commit, we instead create:

(user) > createsLike > (like) > likes > (post)

We make this change because we want to mint cred for likes. Arguably,
this is more robust than minting cred for activity: something being
liked signals that at least one person in the community found a post
valuable, so you can think of moving cred minting away from raw activity
and towards likes as a sort of implicit "cred review".

Create a node for each like is a somewhat hacky way to do it--in
principle, we should have a heuristic which increases the cred weight of
a post based on the number of likes it has received--but it is expedient
so we can prototype this quickly.

Obviously, this is not robust to Sibyll attacks. If we decide to adopt
this, in the medium term we can add some filtering logic so that e.g. a
user must be whitelisted for their likes to mint cred. (And, in a nice
recursive step, the whitelist can be auto-generated from the last week's
cred scores, so that e.g. every user with at least 50 cred can mint more
cred.) I think it's OK to put in a Sibyll-vulnerable mechanism here
because SourceCred is still being designed for high trust-level
communities, and the existing system of minting cred for raw activity is
also vulnerable to Sibyll and spam attacks.

Test plan: Unit tests updated; also @s-ben can report back on whether
this is useful to him in demo-ing SourceCred on MakerDAO.

If we merge this, we should simultaneously explicitly set the weight to
like nodes to 0 in our cred instance, so that we can separate merging
this feature from actually changing our own cred (which should go
through a separate review).

@decentralion decentralion requested review from Beanow and wchargin Jan 22, 2020
@decentralion decentralion force-pushed the discourse-likes branch from 20b3f7e to 41f4d3d Jan 22, 2020
@Beanow

This comment has been minimized.

Copy link
Member

Beanow commented Jan 22, 2020

If we merge this, we should simultaneously explicitly set the weight to
like nodes to 0 in our cred instance, so that we can separate merging
this feature from actually changing our own cred (which should go
through a separate review).

I'm not sure this:
(user) > likes > (post)
Is fully equivalent to this:
(user) > createsLike > (like: 0 weight) > likes > (post)
I'm expecting parameters like alpha to change the impact of a like this way.

And doing a node contraction instead to be sure they are equivalent seems like needless extra work.

So I would have a strong preference for bringing the community consensus forward. Discussing the proposed changes before merging. Perhaps including a prototype to show the new behavior with example weights.

Copy link
Member

Beanow left a comment

Obviously, this is not robust to Sibyll attacks. If we decide to adopt
this, in the medium term we can add some filtering logic so that e.g. a
user must be whitelisted for their likes to mint cred.

Create a node for each like is a somewhat hacky way to do it--in
principle, we should have a heuristic which increases the cred weight of
a post based on the number of likes it has received--but it is expedient
so we can prototype this quickly.

Gave this some thought.
Perhaps that's a reason not to implement this.

Suppose we increase the like edge weights instead to make likes more important. The main difference I can think of is, it matters a lot who gave you a like. It's better to be liked by a "Cred whale" than someone fairly new. This could be the sibyll proofing we need, because existing cred works as the filtering logic here. But the linear scaling might make it biased in comparison towards circlejerking and favoring the whales.

Perhaps this "heuristic which increases the cred weight of a post" should be like quadratic voting.
Although without stake, it normalizes the cred of the person liking, through a quadratic voting-like function to gauge the "quality of the signal".


Given that we're interested in allowing plugins to set weights with their own logic (WeightedGraph).
And we're interested in more versatile ways to mint cred (supernodes). It seems like a very temporary solution.

That said. I think it may be worth having a discussion about temporary solutions for experimentation.
Going to propose an approach for this on the forums.

@vsoch

This comment has been minimized.

Copy link
Contributor

vsoch commented Jan 22, 2020

For those that need a refresher on Sybil Attacks.

I'd want to ask what is the current standard for the entities that make up nodes. Intuitively it would be (somewhat) tangible things like people, topics, pull requests, etc. Adding a like as a full fledged entity first makes me think about how this fits into a long term graph. If we think of many different kinds of reactions to posts, there are many different ones, and so representing just a "like" doesn't do the future landscape of reactions full justice. An alternative could be to have a "reaction" node that then has an attribute to describe the kind (does the graph support nodes with attributes?).

My gut reaction is that it would be more appropriate to adjust the weight on the link, or some attribute that determines the weight of the link. The links are relationships, the nodes are entities, and to model a relationship as a node seems off.

@Beanow

This comment has been minimized.

Copy link
Member

Beanow commented Jan 22, 2020

I'd want to ask what is the current standard for the entities that make up nodes. Intuitively it would be (somewhat) tangible things like people, topics, pull requests, etc.

Don't know how much sense this makes for SC nodes, but here's a way of thinking about it from DDD.
https://en.wikipedia.org/wiki/Domain-driven_design#Building_blocks
And the difference it makes between a value object and entities.

Entities are not defined by their attributes.
A Topic title is changed, it's still the same Topic.

Value objects are defined by their attributes.
Change the zipcode of an address, it's not the same address.

The commit may be in a grey area, but all other nodes so far are entities.
And all the edges so far are value objects.

I would argue a like is a value object.
Likes only have (same as edges):

  • a source
  • a destination

If you change the source or destination.

Source:
- Beanow likes Post-123
+ Dandelion likes Post-123

Destination:
- Beanow likes Post-123
+ Beanow likes Post-555

It's not the same like.

@decentralion

This comment has been minimized.

Copy link
Member Author

decentralion commented Jan 23, 2020

I'm not sure this:
(user) > likes > (post)
Is fully equivalent to this:
(user) > createsLike > (like: 0 weight) > likes > (post)
I'm expecting parameters like alpha to change the impact of a like this way.

This is a good point. If we set the node weight of the like nodes to 0, then this PR would actually decrease the cred-significance of likes, because alpha fraction of the cred that would flow through the like will instead "evaporate" as it passes through the like node.

Given that we're interested in allowing plugins to set weights with their own logic (WeightedGraph).
And we're interested in more versatile ways to mint cred (supernodes). It seems like a very temporary solution.

Yeah, it definitely is a temporary solution. It's actually motivated less by SC's own use case than by something we saw on another project's Discourse instance. Basically, there was a user who had made a very large number of relatively low-effort (and low-like) posts, and was receiving a disproportionate share of the cred, while other higher-signal users had low cred. On a hunch, I explored the forum data, and found that the higher signal posters had fewer posts, but far more likes received.

This gave me the idea that we could mint cred based on likes (which represent implicit review of activity) rather than on raw activity. Since we're in active dialogue with that other project, I wanted to prototype something super quickly to show them. The feedback from that project was that this version seems better than the original.

I sent a PR for this branch mostly to prompt discussion -- I don't feel it's super important that it merges in it's current form, esp. since we are pretty close (via WeightedGraph) to being able to implement it much more cleanly.

Perhaps including a prototype to show the new behavior with example weights.

I've done an analysis of the new behavior. I'm going to post it on Discourse instead of here, so that everyone in the community can take a look and chime in.

I'd want to ask what is the current standard for the entities that make up nodes. Intuitively it would be (somewhat) tangible things like people, topics, pull requests, etc. Adding a like as a full fledged entity first makes me think about how this fits into a long term graph.

I would go with a minimalist description of a node, and say: "a node is anything which should accumulate cred". (One will note that this is somewhat subjective, which is fine.) I think it's clear that contributors, pull requests, etc, should accumulate cred. Likes probably should not. So I agree that likes should not be nodes.

If we think of many different kinds of reactions to posts, there are many different ones, and so representing just a "like" doesn't do the future landscape of reactions full justice.

On the GitHub side, we have distinct edge types for each positive reaction (we ignore the 👎 reaction for now)

(does the graph support nodes with attributes?).

Not really. You can see the types here:

/**
* Represents a node in the graph.
*/
export type Node = {|
+address: NodeAddressT,
// Brief (ideally oneline) description for the node.
// Markdown is supported.
+description: string,
// When this node was created.
// Should be null for a "timeless" node, where we don't
// want to model that node as having been created at any particular
// point in time. User nodes are a good example of this.
+timestampMs: number | null,
|};
/**
* An edge between two nodes.
*/
export type Edge = {|
+address: EdgeAddressT,
+src: NodeAddressT,
+dst: NodeAddressT,
+timestampMs: number,
|};

@decentralion

This comment has been minimized.

Copy link
Member Author

decentralion commented Jan 23, 2020

Update: I've posted an analysis of how this changes the Discourse cred here: https://discourse.sourcecred.io/t/minting-discourse-cred-on-likes-not-posts/603

@decentralion decentralion removed the request for review from wchargin Jan 26, 2020
@wchargin wchargin self-requested a review Jan 26, 2020
@decentralion

This comment has been minimized.

Copy link
Member Author

decentralion commented Jan 26, 2020

I've come to believe we should merge this change as-is. By having likes and posts be separate nodes (which both mint cred), we can express cred minting formulae like nLikes * likesConstant + nPosts * postConstant

If we don't make likes correspond to nodes, we could still set custom weights on each post based on the number of likes... but we're constrained to expressing things like nPosts * postConstant + nLikes * nPosts * likeConstant

Because for any individual post, right now we can mint it postConstant cred directly, and nLikeForThisPost * likeConstant implicitly, if we conceptualize each post as "owning" its likes.

However, without this change, then the minted cred would be postConstant * (nLikes * likeConstant + 1), because postConstant, applied at the type level, would be multiplicative with the specific node weight.

Feel free to ask questions if this is unclear; hopefully as we start to document the algorithm more cleanly it will be easier to communicate this.

As for why this makes sense conceptually: it basically means that Likes are contributions. It's counter-intuitive, but it makes sense if you think about it. People are doing some work by reading posts, and expressing which posts they found valuable. That information is itself valuable (which is why we are consuming it in the algorithm). Right now the amount of cred we mint per like is definitely in excess of the like's intrinsic value, which is a (temporary, acceptable) hack since we're still in an alpha state.

(thanks to an offline convo with @wchargin)

@decentralion decentralion force-pushed the discourse-likes branch from 41f4d3d to e615e9b Jan 26, 2020
Copy link
Member

wchargin left a comment

@decentralion and I discussed this a bit, and we actually think that
“the set of things that mint cred is the set of contributions” is
reasonable and that likes are contributions. They’re something of
value that people create and curate. That this additionally enables the
cred minting dynamics that we want without introducing new mechanics
into the system—mechanics that are (imho) significantly more confusing
than this graph structure.

@decentralion

This comment has been minimized.

Copy link
Member Author

decentralion commented Jan 26, 2020

Note: If we were using semver to signify cred-changing changes, we would need to bump version numbers here. We're not doing that right now, but it's interesting to think about.

(Even if the client set the weights on likes to an explicit 0, this would still be a breaking change, due to the alpha decay matter brought up by @Beanow.)

@wchargin

This comment has been minimized.

Copy link
Member

wchargin commented Jan 26, 2020

(Semver for cred-changing changes seems a long way off to me, for the
same reason that Prettier takes a semver exemption for code formatting
changes.)

@Beanow
Beanow approved these changes Jan 26, 2020
Copy link
Member

Beanow left a comment

Based on https://discourse.sourcecred.io/t/minting-discourse-cred-on-likes-not-posts/603
I think minting cred for likes has nice properties. So semantics aside, think this is an improvement.

addPost(post: Post): string {
const topicTitle =
this.topicIdToTitle.get(post.topicId) || "[unknown topic]";
Comment on lines 217 to 219

This comment has been minimized.

Copy link
@Beanow

Beanow Jan 26, 2020

Member

Nit: returning just the post description here seems weird.
Prefer to factor out a postDescription(post: Post): string method, which both addPost() use, and the constructor's loop uses.

This comment has been minimized.

Copy link
@decentralion

decentralion Jan 29, 2020

Author Member

I refactored it to be consistent with topicIdToTitle; the weird return is removed.

@@ -173,19 +199,26 @@ class _GraphCreator {
this.graph.addEdge(authorsTopicEdge(serverUrl, topic));
}

const postDescriptions: Map<PostId, string> = new Map();

This comment has been minimized.

Copy link
@Beanow

Beanow Jan 26, 2020

Member

Nit: inconsistent with topicIdToTitle which sets the map as a class property.

This comment has been minimized.

Copy link
@decentralion

decentralion Jan 29, 2020

Author Member

Refactored for consistency.

const postDescription =
postDescriptions.get(like.postId) || "[unknown post]";
this.graph.addNode(likeNode(serverUrl, like, postDescription));
this.graph.addEdge(likesEdge(serverUrl, like));
this.graph.addEdge(createsLikeEdge(serverUrl, like));
Comment on lines 209 to 213

This comment has been minimized.

Copy link
@Beanow

Beanow Jan 26, 2020

Member

Nit: inconsistent with other node types, which use an addLike style method.

This comment has been minimized.

Copy link
@decentralion

decentralion Jan 29, 2020

Author Member

Refactored for consistency.

*Let's use the syntax `(node)` to represent some node, and `> edge >` to
represent some edge.*

In the past, for every like, we would create the following graph
structure:

`(user) > likes > (post)`

As of this commit, we instead create:

`(user) > createsLike > (like) > likes > (post)`

We make this change because we want to mint cred for likes. Arguably,
this is more robust than minting cred for activity: something being
liked signals that at least one person in the community found a post
valuable, so you can think of moving cred minting away from raw activity
and towards likes as a sort of implicit "cred review".

Create a node for each like is a somewhat hacky way to do it--in
principle, we should have a heuristic which increases the cred weight of
a post based on the number of likes it has received--but it is expedient
so we can prototype this quickly.

Obviously, this is not robust to Sibyll attacks. If we decide to adopt
this, in the medium term we can add some filtering logic so that e.g. a
user must be whitelisted for their likes to mint cred. (And, in a nice
recursive step, the whitelist can be auto-generated from the last week's
cred scores, so that e.g. every user with at least 50 cred can mint more
cred.) I think it's OK to put in a Sibyll-vulnerable mechanism here
because SourceCred is still being designed for high trust-level
communities, and the existing system of minting cred for raw activity is
also vulnerable to Sibyll and spam attacks.

Test plan: Unit tests updated; also @s-ben can report back on whether
this is useful to him in demo-ing SourceCred [on MakerDAO][1].

If we merge this, we should simultaneously explicitly set the weight to
like nodes to 0 in our cred instance, so that we can separate merging
this feature from actually changing our own cred (which should go
through a separate review).

[1]: https://forum.makerdao.com/t/possible-data-source-for-determining-compensation
@decentralion decentralion force-pushed the discourse-likes branch from e615e9b to d0f3f82 Jan 29, 2020
@decentralion decentralion merged commit 3f42d72 into master Jan 29, 2020
4 checks passed
4 checks passed
ci/circleci: publish-1 Your tests passed on CircleCI!
Details
ci/circleci: test Your tests passed on CircleCI!
Details
ci/circleci: test_full Your tests passed on CircleCI!
Details
ci/circleci: test_full_10 Your tests passed on CircleCI!
Details
@decentralion decentralion deleted the discourse-likes branch Jan 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.