Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify default Discourse weights #1681

Merged
merged 1 commit into from Mar 4, 2020
Merged

Modify default Discourse weights #1681

merged 1 commit into from Mar 4, 2020

Conversation

@decentralion
Copy link
Member

decentralion commented Feb 29, 2020

This commit modifies the default weights in the Discourse plugin. The
overall theme is to make the plugin flow less cred to "raw activity", in
favor of only flowing cred to posts where there is some explicit signal
that they were valuable.

Most significantly, we move over to fully like-minted cred, instead
of minting cred directly to posts and topics. Also, we remove the edges
that tend to flow cred to posts indiscriminately. For example, topics no
longer flow cred to every post within the topic.

Based on local testing on a few forums I'm familiar with, I feel
confident that these cred scores are an improvement over the current
defaults, as we now have a few "real life test cases" of high-noise,
high-activity users, and these weights reduce the amount of cred that
accrues to such "stress testers". With these changes, I feel that we can
start cautiously using the Discourse plugin in Trust Level 2
communities.

Test plan: yarn flow and code inspection are sufficient to verify that
the new weights are technically valid. Because my calibration process
for validating these changes involves subjective judgements about
contributions from real people, I'm declining to publicly post any
specifics; reviewers are welcome to reach out to my offline for further
discussion.

This commit modifies the default weights in the Discourse plugin. The
overall theme is to make the plugin flow less cred to "raw activity", in
favor of only flowing cred to posts where there is some explicit signal
that they were valuable.

Most significantly, we move over to fully [like-minted cred][1], instead
of minting cred directly to posts and topics. Also, we remove the edges
that tend to flow cred to posts indiscriminately. For example, topics no
longer flow cred to every post within the topic.

Based on local testing on a few forums I'm familiar with, I feel
confident that these cred scores are an improvement over the current
defaults, as we now have a few "real life test cases" of high-noise,
high-activity users, and these weights reduce the amount of cred that
accrues to such "stress testers". With these changes, I feel that we can
start cautiously using the Discourse plugin in [Trust Level 2][2]
communities.

Test plan: `yarn flow` and code inspection are sufficient to verify that
the new weights are technically valid. Because my calibration process
for validating these changes involves subjective judgements about
contributions from real people, I'm declining to publicly post any
specifics; reviewers are welcome to reach out to my offline for further
discussion.

[1]: https://discourse.sourcecred.io/t/minting-discourse-cred-on-likes-not-posts/603
[2]: sourcecred/docs#24
@decentralion decentralion requested review from Beanow and wchargin Feb 29, 2020
Copy link
Member

wchargin left a comment

These aren’t exactly the same as the weights used for SourceCred itself
(we set node weight for likes to 16 rather than 4), but that’s okay with
me.

@decentralion

This comment has been minimized.

Copy link
Member Author

decentralion commented Mar 1, 2020

These aren’t exactly the same as the weights used for SourceCred itself
(we set node weight for likes to 16 rather than 4), but that’s okay with
me.

Yeah, my main goal here was to remove all the activity-minted cred, and make the edge weights more spam-resilient. If Discourse is used in isolation, then the absolute cred minting per like is irrelevant; and I plan to come up with "cross plugin relative weight norms" later (after #1682).

Copy link
Member

Beanow left a comment

We've debated why minting cred on explicit value signals are an improvement. So 👍 for the node weights.

I would definitely like to see a better explanation of the edge weights, or to split this to a second PR.

Significantly changing the cred flow topology by removing all of these backward edges, seems like an entire separate discussion and I'm not comfortable approving or blocking, as I don't have enough insight into what this will do.

Copy link
Member

wchargin left a comment

Most of the edge weight changes mitigate posts accruing “un-reviewed”
cred by linking to lots of high-cred things. It sounds enough reasonable
to me that I don’t mind trying it out, though I do like the philosophy
of the backward-edge behavior. Would be interested in thinking about how
to get the best of all worlds.

@Beanow

This comment has been minimized.

Copy link
Member

Beanow commented Mar 3, 2020

Most of the edge weight changes mitigate posts accruing “un-reviewed”
cred by linking to lots of high-cred things. It sounds enough reasonable
to me that I don’t mind trying it out, though I do like the philosophy
of the backward-edge behavior. Would be interested in thinking about how
to get the best of all worlds.

TL;DR
I understood the intention, but to review I would need to have some indication of whether it would do what it's intended to do. It's too complex for me to judge the implications as-is. Too much uncertainty for me to even estimate whether "it looks like an improvement to me" from the author is causal or coincidental.

If I was asked to review: should we run this as an experiment? I could have given an answer.
(It would be: an experiment would make sense to me in our own instance rather than master.)

Since that's not the question, I'm going with the PR is under-explained in this regard.


For example:

  • A lot of learning algorithms require sufficient entropy to avoid local maxima/minima. Does SourceCred benefit from a similar notion, esp. when taking a holistic view, with the contributors' behavior as part of the algorithm? And would flowing cred back to a larger context (like post -> topic -> forum -> project) provide this entropy?
  • Does the lack of back-edges give more control to contributors to shape their graph, opening up to new gaming options? Perhaps making it actually less robust for TL2?
  • Would having a lot more one-way edges increase the average amount of "hops" needed for cred to arrive at valuable nodes? And would the alpha parameter dilute those valuable nodes? Or would it simply take more iterations to arrive at a similar stable distribution?
  • Does this change include a trade-off on the axis of [social validation <-> being "discovered" by the algorithm]? And have we got at least a theory where the sweet spot would be?
  • Would this change be extra detrimental to "leaf nodes" and cause new contributors to be stuck in a cred-quarantine state until their first social validation?

etc. etc. Worst of all, a lot of this seems like it can't be considered in isolation.

@decentralion

This comment has been minimized.

Copy link
Member Author

decentralion commented Mar 4, 2020

@Beanow, I appreciate your pushback and think you raise reasonable points. The underlying issue here is that right now, we don't have tools or processes for assessing the quality of cred scores, and thus we're flying blind when making changes to them.

This means that we should invest in cred analysis tooling, in repeatable experiments we can run on cred quality, etc. However, I don't think we should block changes to weights on developing this tooling.

The current discourse weights were chosen with very little forethought (the PR that added them (#1292) had 0 discussion of weight choices, and was just focused on unblocking the CredSperiment. These updated weights are more grounded, since they came about from a practical need to mitigate certain issues we're seeing in practice. We should avoid status quo bias, especially in cases where the status quo came about from a mostly random/arbitrary process.

Therefore, I'm merging this change. Thanks @wchargin and @Beanow for reviews.

@decentralion decentralion merged commit 711805d into master Mar 4, 2020
4 checks passed
4 checks passed
ci/circleci: publish-1 Your tests passed on CircleCI!
Details
ci/circleci: test Your tests passed on CircleCI!
Details
ci/circleci: test_full Your tests passed on CircleCI!
Details
ci/circleci: test_full_10 Your tests passed on CircleCI!
Details
@decentralion decentralion deleted the change-default-weights branch Mar 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants
You can’t perform that action at this time.