Proposal: Category "fake science" / malicious journals #720

Open
pascalwhoop opened this issue Jul 19, 2018 · 7 comments
pascalwhoop commented Jul 19, 2018

This project mirrors a list that was taken down recently. The idea is to help researchers steer clear of fake journals. It could be very useful for academic institutions that want to make sure their researchers aren't tricked into publishing in such fake journals.
Obviously this requires good crowdsourcing to ensure the listed domains are actually fake and not legitimate but small journals.
Kicking off a discussion to see what others think.

I think this repo is a great place to host this. You have the reputation, experience, and toolchain to manage such a list efficiently and publicly. A recent study by my university found that 5% of all German researchers have been tricked at least once, and that several thousand researchers worldwide have been fooled by these journals.

I'd be happy to turn that linked list into an initial hosts file, but I'd like to make sure somehow that these are actually all fake, and I'm not yet sure how that could easily be achieved.

welcome bot commented Jul 19, 2018

Hello! Thank you for opening your first issue in this repo. It’s people like you who make these host files better!

StevenBlack (Owner) commented:

Hi @pascalwhoop, that's a very interesting idea. Thanks!

katrinleinweber commented:

Nice :-) Should the list be maintained here then, or over at @stop-predatory-journals?

I wonder whether it would be possible to generate the journals & publishers lists as machine-readable hosts files (one or two), and auto-generate the website from them?

pascalwhoop (Author) commented:

@katrinleinweber you can definitely generate the lists (hosts -> csv -> website) automatically, as long as the base list follows some strict pattern. The rest can be done with grep and sed. I wrote a script that did most of the work from the csv files to hosts files, but the csv files are a bit messy, so I didn't continue.
More importantly, how do we ensure that these lists are "true"? I imagine there is a gradient between predatory journals and just really unpopular / unimportant ones. What would be a good "in or out" criterion? Alternatively, we could have 3 categories with increasing levels of "probably evil". Universities could then manage these and handle them differently: a yellow warning, a red warning, and finally a complete block of the host from within their network.
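As an illustration of the csv -> hosts step, here is a minimal Python sketch. The column name "url" and the 0.0.0.0 sink address are assumptions for illustration; the actual column layout of the stop-predatory-journals CSV files may differ.

```python
#!/usr/bin/env python3
"""Sketch: turn a journals CSV into hosts-file lines.

Assumes one column (here called "url") holds the journal's address;
the real CSV may name its columns differently.
"""
import csv
from urllib.parse import urlparse


def csv_to_hosts(csv_path, url_column="url"):
    hosts = set()
    with open(csv_path, newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            raw = (row.get(url_column) or "").strip()
            if not raw:
                continue
            # urlparse only finds a hostname when a scheme is present
            if "//" not in raw:
                raw = "http://" + raw
            host = urlparse(raw).hostname
            if host:
                hosts.add(host)
    return sorted("0.0.0.0 " + h for h in hosts)
```

Using a real CSV parser instead of grep/sed sidesteps most of the messiness, since quoted fields and stray separators are handled for you.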

I will contact my university's network administrators and see what they have set up in terms of infrastructure. Hosts files are a good start for plain DNS blocking, but there may be other approaches that are a bit more complex yet gentler, like the "this is malware, continue anyway?" page that Chrome sometimes displays. One could have an internally hosted application that says "this is a bad journal known to trick people, continue anyway?" and, if the researcher chooses to continue, forwards them to the actual website.

katrinleinweber commented:

More importantly, how do we ensure that these lists are "true"?

In whatever way @stop-predatory-journals is currently using. See stop-predatory-journals/stop-predatory-journals.github.io#1 (comment) for example. That's also why I think a hosts file should either be maintained there, or auto-generated from their source.

What exactly is wrong with their CSV files? I imagine they can be cleaned up so that they lend themselves to being processed automatically with sed.


pascalwhoop commented Aug 7, 2018

I was having trouble handling this line, for example:
https://github.com/stop-predatory-journals/stop-predatory-journals.github.io/blob/master/_data/journals.csv#L361

also lines 404 and 477
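Rows like these often defeat line-oriented grep/sed pipelines when a field contains a quoted comma; a proper CSV parser copes fine. A tiny sketch (the actual contents of those lines are not shown here, so the quoted-comma case is just an illustrative guess):

```python
import csv
import io

# A quoted field containing a comma breaks naive line splitting on ",",
# but csv.reader parses it back into a single field.
sample = 'name,url\n"Journal of A, B and C",http://fake.example\n'
rows = list(csv.reader(io.StringIO(sample)))
```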

spirillen (Contributor) commented Nov 20, 2018

After reading up on this project's goal, I must admit it's a good idea, but how would you tell the merely greedy sites apart from the hoax sites?

As I understand this repo, it's not directed against greedy actors, as github.com (Microsoft) would otherwise have been added to the hosts file.
