Extend list of known Referrer Spammers #5099
Comments
I'll start: How about an antispam plugin that adds a 'Spam' button next to each visit? Not sure about how this would work on the backend, whether it is a straight filter, a learning feature (SpamAssassin) for each installation or a database managed some place (akismet). |
Well, I don't have the technical know-how to discuss how to do this. But a button next to each referer which can send it to a block list, sounds like a good idea to me. Mostly I'm visiting to report what I think is another referer spammer: web.mail.comcast.net (http://web.mail.comcast.net/zimbra/mail?app=mail) |
I can't say if it's useful in this case but in the past I was very happy with Bad Behavior |
Re: Bad Behavior I don't think it's the right tool for this job. I had a longer response that I spent ~20 minutes writing, but it was destroyed by the Trac spam checker for a pattern that I don't recall having in the message. I think if we are going to have a discussion on spam, then we need to disable spam control on this task, have the discussion some place else entirely, or post all messages to pastebin, because it seems like every message I try to post here is getting caught by the spam checker. |
I highly recommend to use this, it's saved me hours / days of frustration: https://addons.mozilla.org/en-US/firefox/addon/lazarus-form-recovery/ |
Replying to tassoman: I have Bad Behavior installed (on my SMF/Tiny Portal site), yet I am still "visited" by semalt almost daily (at least until the Referer Spam blacklist comes out in a stable release). And I have another question. Would it be a 2-way button, like a toggle, so that if you accident-ally clicked it, you could click again to unblock? If undoing the block will be hard to do, perhaps it should not be quite so convenient. Or maybe have a confirmation "are you sure you want to ....." or "OK" button? Idk what might be wrong with this message, but I also got the spam error. It said it has a "dental" pattern. What the heck??! (I'm removing the link from the quote, maybe that's the problem.) (No, still "dental"..... I'll try removing the whole quote.) (Maybe it means a text string? I'll try breaking up accident-ally.) Edit -- Bingo! Seems to be the text string. |
I think the list of referrers to be excluded from Piwik reports is a good idea, and Matt has done a good job in getting this started. However, the current implementation requires manual editing of the configuration file, which is generally not recommended for the average user. It would be better if there were a user interface for this feature, much like the way in which the "Settings --> Manage Websites --> Global websites settings --> Global list of user agents to exclude" list can be managed by a Superuser. Site spammers to be excluded should be listed one per line sorted alphabetically. Adding or removing a referrer would be as simple as adding or removing an entry from the displayed list. The out-of-the-box list should be pre-loaded with semalt.com and possibly one or two other known site spammers. If somebody then really wants to count such referrers, they can remove them from the predefined list. This simple user interface would be much easier to implement than having a button somewhere to add a displayed referrer to the Exclude list. |
Whatever solution is used, I hope that the future of this nuisance is considered. When I first started looking at access logs, maybe 12 years ago, I never saw any such thing as referral spam. Now, as seen in this pastebin, I get 54 spammers from one week. I don't recall when they started doing this, but I suspect it is only going to grow. I don't know if it will reach the epic proportions of comment spam, but whatever tool is created I hope this is taken into account to reduce workload of future devs. |
Ok, looks like I'm official now :-p Hope I'm doing this right.... I got a new referer the other day, and when I followed the link, my system security (ESET) blocked it. Here's the URL: http+://youtube-downloader.savetubevideo.com/youtube-downloader.php?u=http+://mydomain. And I realize there's such a thing as false positives, etc. But that's the 1st time I've gotten an apparently dangerous site, as a referrer. And that is a suspicious (looking) URL. So far, just that 1 incidence. I suppose I could find more info about it, from ESET, if it would be helpful. All best :-) PS - Hhmm....getting "an error" when I try to post. (What helpful error msg!) I seem to recall someplace where putting a URL in a comment would cause an error. So I'll mess up URL above with http+://etc. Cross fingers.... Nope -- still "There was an error"..... |
fyi, it wasn't the link -- it was some 1 of my Firefox extensions....which I'll have to figure out which one to post this (or just disable all of them again). Anyway, I think I have another referrer spammer. Twice in 2 days, and the link goes to a nearly blank page (3 dots and nothing else). The URL is: http://musicas.kambasoft.com/2.php?u=http://mydomain. Thanks, PS -- AdBlock Plus is the culprit ;-) |
Same top list of referrer spams for me as for brynnd: |
And new version of the kambasoft: |
Maybe should just make it *.kambasoft.com. (New one today: http://9.kambasoft.com/2.php?u=mydomain) |
@brynnd, setting the known Referrer Spammers list in piwik config file [Edit Feb 19, 2015: clarification that the configuration shall be made in the local config file]
Hope it will do the same for you. :-) |
Thank you jlj. For the most part, I'm content to wait for next stable release, for newly reported spammers to be added. But it's good to know how to do it manually :-) |
Another: getpocket.com |
musicas.baixar-musicas-gratis.com |
urlopener.blogspot.com I'm not sure if I would exactly call this one spam. But it's not the kind of referrer which I think Piwik intends to provide. |
herahair.com |
I guess this is a new trend, and perhaps it needs to be considered in a permanent feature. Now it looks like these guys are changing their domain names just slightly, to bypass our blocks! It leads me to wonder (AGAIN!) what benefit these guys gain in doing this. "Spam" is "spam" and I guess no one really knows or understands its purpose. But I still do wonder why they waste time on this. What benefit could there possibly be? For anyone? Note that there are also buttons-for-website.com and buttons-for-your-website.com ggrrr! Also, another new one: best-seo-offer.com |
+1 best-seo-offer.com |
+1 for regex support, would be very useful |
+1 best-seo-offer.com |
I agree this is getting more and more of a problem. By the way it seems that Google Analytics is not tackling this problem yet (I see all those spammers in GA), which gives Piwik a nice little plus. Anyway I believe too that we need to find a more robust solution for this: as a user I don't like to have to wait for new Piwik releases to exclude new spammers (my data keeps being polluted in the meantime). I also think the current way of reporting spammers (GitHub issue) is not the best. We need:
@mattab I'd like to open a separate issue to discuss a solution (I have a few ideas) so that we keep this issue for reporting spammers, what do you think? Should we tackle this soon or let it be for now? |
Something we can try with low effort is to put the list in a separate file with one spammer per line, and then ask the community to issue pull requests. Because it's so easy to make a PR from the GitHub web interface alone, it could be a quick and efficient solution. As for the auto-update feature: +1 to discussing it in a separate new issue to make sure it is followed up. Edit: as it looks like referrer spamming is here to stay, I guess many people managing websites will have issues with this. There is value in Piwik sharing our list with the world! |
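For illustration only, here is a minimal sketch of how a one-spammer-per-line file could be read. The function name `parse_spammer_list` and the choice to skip blank lines and `#` comment lines are assumptions made for this example; nothing in this thread says how Piwik itself parses the file.

```python
def parse_spammer_list(text):
    """Return the set of domains found in a one-domain-per-line list.

    Assumptions for this sketch (not confirmed by the thread):
    blank lines and lines starting with '#' are ignored, and
    domains are compared case-insensitively.
    """
    domains = set()
    for line in text.splitlines():
        entry = line.strip().lower()
        if entry and not entry.startswith("#"):
            domains.add(entry)
    return domains


# Hypothetical file content using domains reported in this issue.
sample = "semalt.com\nbuttons-for-website.com\n\nBest-SEO-Offer.com\n"
spammers = parse_spammer_list(sample)
print(sorted(spammers))
```

Keeping one entry per line means a pull request that adds a spammer touches exactly one line, which keeps reviews and merges simple.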
I've opened #7674 to continue the discussion. |
It seems we are not the only ones trying to build such a list, see http://www.reddit.com/r/Wordpress/comments/2qteln/i_want_to_build_a_list_of_referrer_spam_links_to/ (seems to be updated regularly) |
While there is no better way to report spammers at the moment, I'll continue here :)
|
Entire list alphabetically: 4webmasters.org,7makemoneyonline.com,adcash.com,anticrawler.org,best-seo-offer.com,best-seo-solution.com,bestwebsitesawards.com,blackhatworth.com,buttons-for-website.com,buttons-for-your-website.com,cenokos.ru,cenoval.ru,cityadspix.com,darodar.com,econom.co,edakgfvwql.ru,forum.smailik.org,Get-Free-Traffic-Now.com,gobongo.info,googlsucks.com,hulfingtonpost.com,humanorightswatch.org,ilovevitaly.co,ilovevitaly.com,ilovevitaly.ru,iskalko.ru,kambasoft.com,luxup.ru,make-money-online.7makemoneyonline.com,myftpupload.com,o-o-6-o-o.ru,o-o-8-o-o.ru,priceg.com,prlog.ru,ranksonic.info,ranksonic.org,savetubevideo.com,screentoolkit.com,semalt.com,semalt.semalt.com,seoexperimenty.ru,simple-share-buttons.com,slftsdybbg.ru,social-buttons.com,socialseet.ru,superiends.org,theguardlan.com,vodkoved.ru,websocial.me,ykecwqlixx.ru |
I think this can easily be countered if someone builds a thingy in Piwik to download a new list every so many hours, say once every 24 hours, plus a serving system where people can submit links. If 10 (?) people submit the same link, it's added to the downloadable list, which each Piwik setup can fetch every 24 hours. This could employ a simple filter to catch double submissions like www.semalt.com vs. semalt.com so the list stays somewhat compact and clean. This submission system can be a thing in GitHub (similar to torrent blocklists - https://gist.github.com/johntyree/3331662) or something more fancy. I'm not overly familiar with GitHub and new to Piwik. Just came across this because I have a referrer spam problem also. |
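The submission idea above could be sketched roughly like this. Everything here is hypothetical: the names `normalize` and `accepted_domains`, the `(submitter, domain)` input shape, and counting distinct submitters are assumptions for the example; only the threshold of 10 and the www.semalt.com vs. semalt.com case come from the comment.

```python
from collections import defaultdict


def normalize(domain):
    """Fold trivial variants together: case, whitespace, trailing dot, 'www.' prefix."""
    d = domain.strip().lower().rstrip(".")
    if d.startswith("www."):
        d = d[len("www."):]
    return d


def accepted_domains(submissions, threshold=10):
    """Return domains reported by at least `threshold` distinct submitters.

    `submissions` is an iterable of (submitter_id, domain) pairs.
    Counting distinct submitters (an assumption of this sketch) stops
    one person from pushing a domain onto the list by resubmitting it.
    """
    voters = defaultdict(set)
    for submitter, domain in submissions:
        voters[normalize(domain)].add(submitter)
    return sorted(d for d, who in voters.items() if len(who) >= threshold)


# Ten different people report semalt, half of them with the 'www.' prefix.
reports = [(f"user{i}", "www.semalt.com" if i % 2 else "semalt.com") for i in range(10)]
print(accepted_domains(reports))
```

A real service would also need to guard against spammers submitting legitimate sites, which a plain vote count does not address.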
The list has been moved to https://github.com/piwik/referrer-spam-blacklist in order to make it more visible and more practical to maintain. Please open issues and pull requests in that new repository :) If you are interested to know how this list will be handled, read this issue: #7674 @Fensterbank thanks I confirm those 2, I have added them to the new list. |
How do I contribute to the Referrer spammer Piwik list? To add a new referrer spammer to the list, click here to edit the spammers.txt file and create a pull request. Alternatively, you can create a new issue. Looking forward to maintaining this spammer list together as a community 👍 |
Since the list is now maintained via file on github, any chance that the admin site can check for newer versions, and ideally go download it automatically? |
Well, it's exciting to see some new energy and seeing this project move forwards! Unfortunately, I'm not very familiar with programming or how this site works. When I click the link in mattab's last msg, to edit the spammers.txt file, it says: -- You need to fork this repository to propose changes. I don't know what that means to "fork this repository". So I used the other option and reported a new spammer (another semalt variety). But I'm still not sure how to proceed with my piwik installation. I did read #7674, but unfortunately, I don't understand much of it. I also read https://github.com/piwik/referrer-spam-blacklist, but again, don't understand much. If I have the current version of Piwik, am I getting all the spammers blocked? Or do I need to continue adding new spammers to my config.ini.php file? Thank you very much :-) |
@brynnd it's fine if you weren't able to edit the file directly, opening an issue is good too. When it is added to the list it will be included in the new Piwik version, so that's why it's important to keep Piwik up to date. In the future we want to auto-update the list so that you get the latest spammers blocked even before the new Piwik release is available (issue #7674). |
Thanks mnapoli! Where can I check the current list, so I don't accidentally add duplicates to it? Especially these new semalt-related ones, where they're just changing the domain by a character or 2. |
@brynnd latest version is at: https://github.com/piwik/referrer-spam-blacklist/ Please note: you don't need to add the semalt variations if they are sub-domains of semalt.com (or of any other listed spammer). But if they are new domain names (not sub-domains), then please suggest the new spammers on this project: https://github.com/piwik/referrer-spam-blacklist/ |
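To make the sub-domain point concrete, here is a rough sketch of a check where listing a domain also covers its sub-domains. This is not Piwik's actual matching code; the function name `is_spam_referrer` is made up for the example, and it assumes the referrer is a full URL including the scheme.

```python
from urllib.parse import urlsplit


def is_spam_referrer(referrer_url, spammers):
    """True if the referrer's host is a listed domain or a sub-domain of one.

    Illustrative only: assumes `referrer_url` includes a scheme such as
    'http://' and that `spammers` is a set of lowercase domain names.
    """
    host = (urlsplit(referrer_url).hostname or "").lower()
    labels = host.split(".")
    # Test the full host, then each parent domain: a.b.com, b.com, com.
    return any(".".join(labels[i:]) in spammers for i in range(len(labels)))


spammers = {"semalt.com", "kambasoft.com"}
print(is_spam_referrer("http://9.kambasoft.com/2.php?u=mydomain", spammers))
print(is_spam_referrer("http://semalt.semalt.com/", spammers))
print(is_spam_referrer("http://buttons-for-your-website.com/", spammers))
```

Matching on whole labels rather than on substrings means a sub-domain like 9.kambasoft.com is caught by the kambasoft.com entry, while a different domain that merely ends in the same letters is not. A genuinely new domain, such as buttons-for-your-website.com next to buttons-for-website.com, still needs its own entry.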
I've added several domains to piwik/config/config.ini.php as described above. A week has passed and every day I see new entries from these same domains. Am I missing something basic, or is this worth opening a new issue? This is at the end of my config.ini.php, and I've restarted Apache (and later the server) but modifying this configuration file has had no effect at all:
|
@tombrossman with the latest Piwik versions this INI config option isn't used anymore. There will be a new Piwik release very soon (probably tomorrow), else you can update to the latest beta and those spammers will be blocked. With Piwik 2.14 the spammers list will be updated automatically. |
Ah, thanks - that makes sense now. I thought it was me doing something stupid again... |
Is anti-referral spamming included in piwik 2.14 and enabled by default? Is there any way to retroactively apply it, like with the geoip location dbs? I couldn't find any documentation on this. Thanks. |
Yes
No, it doesn't apply retroactively. |
In #2268 we have implemented a Referrer spam list, initially seeded with the worst of all spammers: semalt.
However it turns out there are thousands of other spammers that attack Piwik users with their lame websites. It will be hard to keep track of them all and write the list in the config file here.
List of spammers:
What is our best way to move forward?