[Research] Add toxicity detection pipeline #32826
Open
Description (*)
Have you ever considered adding a toxicity detection bot to this project?
We are Nathan Cassee and Alexander Serebrenik (Eindhoven University of Technology, The Netherlands), Nicole Novielli (University of Bari, Italy), Christian Kastner (Carnegie Mellon University, USA), and Bogdan Vasilescu (Carnegie Mellon University, USA), and we are conducting research to understand the effectiveness of toxicity detection bots on GitHub. As part of our research, we want to understand the impact of a state-of-the-art toxicity detection bot in your project. We hope that a better understanding of how these bots operate can be used to further improve the health of open-source projects.
To participate in this experiment, we ask you to adopt a toxicity bot in magento/magento2. You can adopt the bot by merging this pull request. The bot will monitor issues and pull requests for toxic comments and will post a comment whenever it detects toxicity. Additionally, the bot will securely store comments and any edits or deletions made to those comments. This will allow us to study how frequently toxic comments occur, and whether and how they are edited or deleted.
We expect the toxicity bot to reduce toxicity in issues and pull requests. However, there may be cases where the bot responds in an issue or pull request that contains no toxicity (a false positive), which might distract from on-topic discussion.
Practicalities
Your participation in this study is completely voluntary, and you can withdraw your project at any time. This can be done by disabling the toxicity bot or by sending us an email. Additionally, if you do not want us to use the results from your project in our analysis, you can always email us to withdraw your data from the study.
The study itself will run for roughly three months. At the end of this period, we will open a PR in your project to remove the toxicity bot. If you want to keep using the bot after the experiment, we can also provide a version that does not record telemetry.
The data collected for this study will be stored securely on a private server, and the raw data will only be available to the researchers involved in this study. When we release or publish results of the study, they will be anonymized or aggregated.
This study has been approved by the Ethical Review Board of the Eindhoven University of Technology.
Closing and Survey
If you have any questions about the bot, feel free to ask them here or email them to n.w.cassee@tue.nl.
If you are not interested in participating, we would really appreciate it if you would let us know why.
It would also be really helpful if everyone involved in the project could respond to the following survey on their expectations of the bot, especially if you are not interested in adopting it: https://docs.google.com/forms/d/e/1FAIpQLSdaioKzNeYjeYqbo2MpAvCGBgClo4zeSqDlA2Lx4o5KJKJ24A/viewform
Questions or comments
Note: We are not sure how best to invite projects to participate in this study. If you thought this PR was spammy or unhelpful, please let us know so we can change how we invite projects!
Contribution checklist (*)