UPDATE: this discussion has moved to its own repo now: https://github.com/npscience/equitable-preprints
A place for notes on an idea to improve authors' access to literature contributions & peer review, and to help readers find otherwise low-visibility content (before it has cause to be low visibility). The initial proposal was written for the eLife Sprint 2019 application: here and copied below in italics.
It is useful to curate content so that valid, high-quality work is more visible amid the large volume of content shared on the internet with minimal filtering. Peer review adds value by checking/amending assertions before work enters the 'body of scientific work'. See Bill Gates' comment about this: https://inews.co.uk/news/terry-pratchett-fake-news-internet-prediction-bill-gates/
Note curation channels are social, subject to same biases as other social processes:
- we pay attention to those we know
- we don't actively seek out those we don't know or have no connection/commonality with -- North American/European 'western world' publishers run 'international' journals, but are they global? They are based in, designed by, and editorially gatekept by NA/EU communities
- we make judgements based on what we already know about people we know -- previous performance as the best indicator of future performance -- hence we resort to author names to prejudge the quality of work and whether it's worth reading further
Preprint discovery is dominated by twitter
- follow the money: not public infrastructure, vulnerable to shareholder decisions and profit margins (none yet? what is future plan to recoup investment? or is investment being recouped via secondary methods eg value of twitter for advertising and brand awareness)
- size of social network --> degree of visibility of what you share (including your own work) -- Matthew Effect, rich get richer
- not all scientists on or comfortable with twitter (prefer linkedin, researchgate, for professionalism?) -- the new reputation skill frontier
- how to break beyond the twitter bubble?
Other discovery mechanisms:
- sign up for alerts and feeds, direct from servers & from third-party tools, e.g. PLOS monthly email, PreLights & PreLists, rxivist (trending preprints), twitter bots (@PromPreprints = promising preprints according to early social attention; SBotLite = RTs female first author preprints from bioRxiv)
Third-party infrastructure design decisions:
- work with specific preprint servers (e.g. bioRxiv, which is largest atm --> platform dominance)
New data: US authors (first and last) are over-represented among journal articles that were previously deposited to bioRxiv (Nov 2013 to Dec 2017), while female authors are under-represented: https://www.biorxiv.org/content/10.1101/673665v1.full
Opportunities for a new tool:
- discovery of preprints beyond the twitter network
- identify preprints by authors who are less visible in traditional Global North publishing venues --> top of the pile for review, filtering?
- identify preprints regardless of where posted? (so long as posted somewhere with ethical screening, archiving, withdrawal processes?)
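The "top of the pile" idea above could be sketched as a ranking heuristic: given preprint records from any server, order them so that work by less-visible authors surfaces first for review. Everything in this sketch is a hypothetical illustration -- the field names (`author_followers`, `tweet_count`), the visibility proxy, and the sample data are assumptions, not an existing API or an agreed definition of visibility.

```python
# Illustrative sketch: rank preprint records so low-visibility work surfaces
# first for review. All field names and the scoring formula are hypothetical;
# a real tool would pull records from server APIs and define visibility
# carefully (and guard against gaming of the proxy).

def visibility_score(preprint):
    """Lower score = less visible = higher review priority (toy proxy)."""
    followers = preprint.get("author_followers", 0)  # social reach of posting author
    attention = preprint.get("tweet_count", 0)       # early social attention
    return followers + 10 * attention

def review_queue(preprints):
    """Order preprints for review, least visible first, regardless of server."""
    return sorted(preprints, key=visibility_score)

# Hypothetical records from different servers:
preprints = [
    {"doi": "10.1101/aaa", "server": "bioRxiv",    "author_followers": 9000, "tweet_count": 40},
    {"doi": "10.1101/bbb", "server": "AfricArxiv", "author_followers": 120,  "tweet_count": 1},
    {"doi": "10.1101/ccc", "server": "bioRxiv",    "author_followers": 800,  "tweet_count": 3},
]

for p in review_queue(preprints):
    print(p["doi"], p["server"])
```

Note the design choice: ranking by *lack* of existing attention inverts the logic of tools like PromisingPreprint and rxivist, which amplify what is already visible.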
Existing tools / initiatives
- Open Knowledge Maps cites increasing the visibility of African research as a reason for partnering with AfricArxiv: https://info.africarxiv.org/strategic-partnership-with-open-knowledge-maps/
- PREreview wants to review preprints in a way that is more inclusive than traditional peer review ??
- EuropePMC indexes preprints -- what could they do to display indicators of which preprints could be given attention?
- PreLists is a new initiative by CoB to curate lists of preprints -- what are the assumptions and biases of the curators? How might they mitigate them?
eLife Sprint project proposal
The paucity of research evaluation processes that directly and fairly reward research integrity, reproducibility and excellence underpins many of the challenges we face in science today. Journal prestige/Impact Factor continue to dominate as time-efficient surrogate metrics despite well-known insufficiencies. With preprints, science could be curated and evaluated in a more dynamic, transparent and equitable way than traditional journal gatekeeping mechanisms (known to be subject to multiple biases), allowing the incentives of reputation and funding to align with open science principles. With 3000 new biology preprints posted every month and no magic extra time available, the challenge is in deciding which to read, review, curate and recognise. The visibility of a preprint today is largely determined by the size of the authors’ Twitter network and pre-existing name recognition (in the absence of journal name): with preprints in biology, we risk perpetuating the same research evaluation challenges that stymie a more open and collaborative research enterprise. And to avoid widening existing inequalities, we must ensure systems that encourage attention on preprints are designed to recognise and support the less-visible, including smaller subject areas and work by authors negatively affected by bias (whether career stage, gender, ethnicity, geography and/or institutional affiliation). Lists of female and BAME scientists have helped promote more diverse conference speaker representation: following suit, there is already a twitter bot that retweets preprints posted by female first authors (https://sbotlite.github.io/). However, there are also bots/tools highlighting preprints that have already attracted the most social attention (https://github.com/wdecoster/PromisingPreprint; rxivist.org). 
I propose we prototype an easy, time-efficient way to identify and suggest for review preprints that may otherwise go unnoticed; I seek technological and design contributions from developers, data scientists, editors and prospective users (e.g. developers/contributors to PREreview, PreLists/PreLights, europePMC).
UPDATE: an alternative may be to rationalise preprint views according to the social network reach of the authors (what is really being seen more than would be expected from the Matthew Effect?)
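That "rationalise views by network reach" idea could be expressed as an observed-vs-expected ratio: divide a preprint's observed attention by a baseline predicted from the authors' follower count, so that work seen far more than its network alone predicts stands out. This is a toy model under assumed numbers, not an established metric; the baseline formula and parameters are illustrative only.

```python
# Toy sketch of normalising preprint attention by author network reach --
# the "what is really being seen more than expected?" question above.
# The linear baseline and its parameters are illustrative assumptions.

def expected_views(followers, base=50, rate=0.2):
    """Naive baseline: a floor of `base` views plus a fraction of followers."""
    return base + rate * followers

def attention_ratio(observed_views, followers):
    """> 1 means more attention than the authors' network alone predicts."""
    return observed_views / expected_views(followers)

# Two preprints with identical raw view counts but very different author reach:
print(round(attention_ratio(500, followers=10000), 2))  # 0.24 -- less than expected
print(round(attention_ratio(500, followers=100), 2))    # 7.14 -- far more than expected
```

Under this framing, the second preprint -- same raw views, tiny network -- is the one that merits curator attention, which is the opposite of what raw view counts suggest.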