Extend list of known Referrer Spammers #5099

Closed
mattab opened this Issue May 6, 2014 · 128 comments

@mattab
Member
mattab commented May 6, 2014

In #2268 we have implemented a Referrer spam list, initially seeded with the worst of all spammers: semalt.

However it turns out there are thousands of other spammers that attack Piwik users with their lame websites. It will be hard to keep track of them all and write the list in the config file here.

List of spammers:

What is our best way to move forward?

@anonymous-piwik-user

I'll start:

How about an antispam plugin that adds a 'Spam' button next to each visit? Not sure about how this would work on the backend, whether it is a straight filter, a learning feature (SpamAssassin) for each installation or a database managed some place (akismet).

@anonymous-piwik-user

Well, I don't have the technical know-how to discuss how to do this. But a button next to each referer which can send it to a block list, sounds like a good idea to me.

Mostly I'm visiting to report what I think is another referer spammer:

web.mail.comcast.net (http://web.mail.comcast.net/zimbra/mail?app=mail)

@tassoman
Contributor
tassoman commented May 7, 2014

I can't say if it's useful in this case but in the past I was very happy with Bad Behavior

@anonymous-piwik-user

Re: Bad Behavior

I don't think it's the right tool for this job. I had a longer response that I spent ~20 minutes writing, but it was destroyed by the Trac spam checker for a pattern that I don't recall having in the message.

I think if we are going to have a discussion on spam, then we need to disable spam control on this task, have the discussion some place else entirely, or post all messages to pastebin, because it seems like every message I try to post here is getting caught by the spam checker.

@mattab
Member
mattab commented May 8, 2014

it was destroyed by the Trac spam checker for a pattern that I don't recall having in the message.

I highly recommend using this; it's saved me hours / days of frustration: https://addons.mozilla.org/en-US/firefox/addon/lazarus-form-recovery/

@anonymous-piwik-user

Replying to tassoman:
I can't say if it's useful in this case but in the past I was very happy with Bad Behavior (link removed)

I have Bad Behavior installed (on my SMF/Tiny Portal site), yet still am "visited" by semalt almost daily (at least until the Referer Spam blacklist comes out in a stable release).

And I have another question. Would it be a 2-way button, like a toggle, so that if you accident-ally clicked it, you could click again to unblock? If undoing the block will be hard to do, perhaps it should not be quite so convenient. Or maybe have a confirmation "are you sure you want to ....." or "OK" button?


Idk what might be wrong with this message, but I also got the spam error. It said it has a "dental" pattern. What the heck??!

(I'm removing the link from the quote, maybe that's the problem.)(No, still "dental"..... I'll try removing the whole quote.) (Maybe it means a text string? I'll try breaking up accident-ally.)

Edit -- Bingo! Seems to be the text string.

@mattab
Member
mattab commented May 11, 2014

In acb1bc2: Actually call the Referrer Spam check.
Fixes #2268 Refs #5099

@canajun2eh

I think the list of referrers to be excluded from Piwik reports is a good idea, and Matt has done a good job in getting this started.

However, the current implementation requires manual editing of the configuration file, which is generally not recommended for the average user. It would be better if there were a user interface for this feature, much like the way in which the "Settings --> Manage Websites --> Global websites settings --> Global list of user agents to exclude" list can be managed by a Superuser. Site spammers to be excluded should be listed one per line sorted alphabetically. Adding or removing a referrer would be as simple as adding or removing an entry from the displayed list.

The out-of-the-box list should be pre-loaded with semalt.com and possibly one or two other known site spammers. If somebody then really wants to count such referrers, they can remove them from the predefined list.

This simple user interface would be much easier to implement than having a button somewhere to add a displayed referrer to the Exclude list.

@anonymous-piwik-user

Whatever solution is used, I hope that the future of this nuisance is considered. When I first started looking at access logs, maybe 12 years ago, I never saw any such thing as referral spam. Now, as seen in this pastebin, I get 54 spammers from one week. I don't recall when they started doing this, but I suspect it is only going to grow. I don't know if it will reach the epic proportions of comment spam, but whatever tool is created I hope this is taken into account to reduce workload of future devs.

@brynnd
brynnd commented Jul 20, 2014

Ok, looks like I'm official now :-p Hope I'm doing this right....

I got a new referer the other day, and when I followed the link, my system security (ESET) blocked it. Here's the URL: http+://youtube-downloader.savetubevideo.com/youtube-downloader.php?u=http+://mydomain.

And I realize there's such a thing as false positives, etc. But that's the 1st time I've gotten an apparently dangerous site, as a referrer. And that is a suspicious (looking) URL. So far, just that 1 incidence. I suppose I could find more info about it, from ESET, if it would be helpful.

All best :-)

PS - Hhmm....getting "an error" when I try to post. (What helpful error msg!) I seem to recall someplace where putting a URL in a comment would cause an error. So I'll mess up URL above with http+://etc. Cross fingers....

Nope -- still "There was an error".....
Oh wait -- there's another url within an url....cross fingers again....

@brynnd
brynnd commented Jul 22, 2014

fyi, it wasn't the link -- it was some 1 of my Firefox extensions....which I'll have to figure out which one to post this (or just disable all of them again).

Anyway, I think I have another referrer spammer. Twice in 2 days, and the link goes to a nearly blank page (3 dots and nothing else). The URL is: http://musicas.kambasoft.com/2.php?u=http://mydomain.

Thanks,
brynn

PS -- AdBlock Plus is the culprit ;-)

@jlj
jlj commented Jul 26, 2014

; If you find new spam entries in Referrers>Websites, please report them here: #5099

Same top list of referrer spams for me as for brynnd:

  1. http://semalt.semalt.com/crawler.php?u=http://mydomain.com (already in global.ini.php)
  2. http://youtube-downloader.savetubevideo.com/youtube-downloader.php?u=http://mydomain.com
  3. http://musicas.kambasoft.com/2.php?u=http://mydomain.com
@brynnd
brynnd commented Jul 27, 2014

And new version of the kambasoft:

http://5.kambasoft.com/2.php?u=http://mydomain.com

@brynnd
brynnd commented Jul 28, 2014

Maybe should just make it *.kambasoft.com. (New one today: http://9.kambasoft.com/2.php?u=mydomain)

@jlj
jlj commented Jul 29, 2014

@brynnd, setting the known Referrer Spammers list in piwik config file piwik/config/global.ini.php with these 3 domains had the expected effect of removing all known referrer spam visits from my site's statistics in the last 3 days.

[Edit Feb 19, 2015: clarification that the configuration shall be made in the local config file]
To do this, add the lines below in the [Tracker] section of your local config file piwik/config/config.ini.php (create the section if it does not exist):
[/Edit]

; Comma separated list of known Referrer Spammers, ie. bot visits that set a fake Referrer field.
; All Visits with a Referrer URL host set to one of these will be excluded.
; If you find new spam entries in Referrers>Websites, please report them here: https://github.com/piwik/piwik/issues/5099
referrer_urls_spam = "semalt.com,savetubevideo.com,kambasoft.com"

Hope it will do the same for you. :-)
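The config comment says visits are excluded when the Referrer URL host is set to one of the listed entries, and listing kambasoft.com was enough to catch musicas.kambasoft.com and 5.kambasoft.com. A minimal sketch of that kind of check, assuming an entry matches the host itself or any subdomain of it (`is_spam_referrer` is a hypothetical helper for illustration, not Piwik's actual code):

```python
from urllib.parse import urlsplit

# Hypothetical helper, not Piwik's implementation: assumes an entry matches
# the referrer host itself or any subdomain of it.
def is_spam_referrer(referrer_url: str, spam_hosts: list[str]) -> bool:
    host = (urlsplit(referrer_url).hostname or "").lower()
    return any(host == entry or host.endswith("." + entry) for entry in spam_hosts)

# Same value as the referrer_urls_spam setting above, split on commas.
spam_hosts = "semalt.com,savetubevideo.com,kambasoft.com".split(",")

print(is_spam_referrer("http://5.kambasoft.com/2.php?u=http://mydomain.com", spam_hosts))  # True
print(is_spam_referrer("http://l.facebook.com/l.php", spam_hosts))  # False
```

Only the host part of the referrer is compared here, so the `?u=http://mydomain.com` query string plays no role in the match.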

@brynnd
brynnd commented Jul 30, 2014

Thank you jlj. For the most part, I'm content to wait for next stable release, for newly reported spammers to be added. But it's good to know how to do it manually :-)

@brynnd
brynnd commented Aug 17, 2014

Another:

getpocket.com

@brynnd
brynnd commented Sep 2, 2014

urlopener.blogspot.com

I'm not sure if I would exactly call this one spam. But it's not the kind of referrer which I think Piwik intends to provide.

@sabl0r sabl0r pushed a commit to sabl0r/piwik that referenced this issue Sep 23, 2014
@mattab mattab Actually call the Referrer Spam check.
Fixes #2268 Refs #5099
acb1bc2
@brynnd
brynnd commented Oct 10, 2014

herahair.com

@mattab mattab modified the milestone: Mid term, Long term Oct 11, 2014
@brynnd
brynnd commented Oct 11, 2014

In the Referrer Websites widget, it says: http://16782868.website-errors-scanner.com/

But it redirects to: livefixer.com

@classaxe

I too am seeing new spam entries every day, and I'll bet many operators are seeing the same useless data all the time, and it changes on a daily basis

It seems to be the result of companies who promise to increase visitor stats for customers who are not sophisticated enough to distinguish quantity from quality and will pay per click for what is in effect simply meaningless white noise from people who not only have no interest in what they have to offer but are actually made hostile to the brands they represent because of the trickery involved in getting us there.

Is there any interest in setting up a common repository that can be automatically shared with all Piwik installations, so that hundreds of thousands of us don't end up having to make the same reactive modifications to our config file entries each day, and this of course after any PDFs sent to the operators of the sites we support have already gone out and the damage has already been done?

It seems to me that once the spam is detected it's already too late and a shared repository would help protect at least some of us from this scourge, especially where multiple referrer fakes originate from a small range of IP addresses, often note in South America.

@brynnd
brynnd commented Oct 18, 2014

New one for me: http://maranbrinfo.com.br/

classaxe,
It's my understanding that's the purpose for this project (to set up some way to control this long term). I don't have the technical know-how to discuss exactly how would be the best way to do that. But at least I can report when I find new ones.

I don't think it's only "....the result of companies who promise to increase visitor stats....". From what I've seen, it's just another way to proliferate spam -- any kind of spam.

I don't understand why this project hasn't gained more attention. I guess most people must either be using piwik for some purpose where this kind of spam doesn't mess up their stats -- or it just doesn't bother them. Or maybe it's like you suggested, that many website owners are just happy to get the traffic, and don't really care where it comes from. But it annoys me! If it's not a person visiting my site, I don't want Piwik to report it. It's frustrating, because I know there must be so many more than just the few that visit my tiny site. I wish more people would report.

I agree. By the time they're in your site stats, it's too late. So a solution like you're suggesting would be awesome!

@brynnd
brynnd commented Oct 18, 2014

And a 2nd new one for today!

http://speechtotextservice.com/

(this is not a blind visitor, but a service that converts speech to text) (they have no link to my site)

@mattab
Member
mattab commented Oct 19, 2014

Maybe we could build a new plugin that when enabled, would download the latest list of known spammers website URLs and automatically exclude them from the Referrer Websites. This plugin could be released on the Marketplace. what do you think?

For now also please keep pasting the spam links here as it is already a useful start!

@classaxe

Hi Matthieu,

That would be a great solution in my opinion.

@MathijsV

Another new spammer: buttons-for-website.com

@evll
evll commented Nov 14, 2014

referrer_urls_spam option does not work for me.
buttons-for-website.com is added like this:
referrer_urls_spam = "semalt.com,buttons-for-website.com" in config.ini.php
but this referrer keeps appearing in stats every day anyway.

@mattab
Member
mattab commented Nov 15, 2014

@evll did you add it below the [Tracker] section heading?

@evll
evll commented Nov 15, 2014

@mattab, thanks so much, I didn't notice the section heading was missing
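A quick way to catch this mistake is to check which section the setting actually ended up in. The sketch below uses Python's `configparser` as an approximation of how the INI file is read (Piwik itself parses the file in PHP, so edge cases may differ); `read_spam_list` and the two sample strings are hypothetical:

```python
import configparser

def read_spam_list(ini_text: str):
    """Return the referrer_urls_spam entries found under [Tracker], or None."""
    # strict=False tolerates repeated keys; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_option("Tracker", "referrer_urls_spam"):
        return None  # key is missing, or sits under a different section
    raw = parser.get("Tracker", "referrer_urls_spam").strip().strip('"')
    return [entry.strip() for entry in raw.split(",") if entry.strip()]

# Assumed examples: the same line under the wrong and the right section.
misplaced = '[General]\nreferrer_urls_spam = "semalt.com,buttons-for-website.com"\n'
correct = '[Tracker]\nreferrer_urls_spam = "semalt.com,buttons-for-website.com"\n'

print(read_spam_list(misplaced))  # None
print(read_spam_list(correct))    # ['semalt.com', 'buttons-for-website.com']
```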

@brynnd
brynnd commented Nov 24, 2014

seo.my-api.com

edit -
Actually the visitor reg'd in my forum, but was ID'd as a spammer, and was rejected (by SMF's Stop Spammer mod)! It's the first time I've had referrer spam from an actual (would-be) forum spammer!

@scheinercc

Hi there,

Just wondering if this feature/plugin and the list has already been shipped?
Found it!

Just for me as reminder:

  • in piwik config file piwik/config/config.ini.php
  • add below [Tracker] section:
; Comma separated list of known Referrer Spammers, ie. bot visits that set a fake Referrer field.
; All Visits with a Referrer URL host set to one of these will be excluded.
; If you find new spam entries in Referrers>Websites, please report them here: https://github.com/piwik/piwik/issues/5099
referrer_urls_spam = "5.kambasoft.com/2.php,9.kambasoft.com/2.php,16782868.website-errors-scanner.com,39946554.semalt.com/crawler.php,bodywisesupplements.com,buttons-for-website.com,caldercarrentals.com,coastalstickers.com.au,herahair.com,icompbusiness.com,livefixer.com,maranbrinfo.com.br,mountainstream.ms,musica.descargar-musica-gratis.net/descargar-musica-gratis.php,musicas.baixar-musicas-gratis.com/baixar-musicas-gratis.php,musicas.kambasoft.com/2.php,religionenserio.com,semalt.com,semalt.com/crawler.php,semalt.semalt.com/crawler.php,seo.my-api.com,speechtotextservice.com,textelle.ee,ticimax.com,waterfallscanopy.com,www.semalt.com,www.semalt.com/crawler.php,www.star61.de,www.windstream.net,youtube-downloader.savetubevideo.com/youtube-downloader.php"
  1. Current List (2014-12-07) in alphabetical order:
    5.kambasoft.com/2.php
    9.kambasoft.com/2.php
    16782868.website-errors-scanner.com
    39946554.semalt.com/crawler.php
    bodywisesupplements.com
    buttons-for-website.com
    caldercarrentals.com
    coastalstickers.com.au
    herahair.com
    icompbusiness.com
    livefixer.com
    maranbrinfo.com.br
    mountainstream.ms
    musica.descargar-musica-gratis.net/descargar-musica-gratis.php
    musicas.baixar-musicas-gratis.com/baixar-musicas-gratis.php
    musicas.kambasoft.com/2.php
    religionenserio.com
    semalt.com
    semalt.com/crawler.php
    semalt.semalt.com/crawler.php
    seo.my-api.com
    speechtotextservice.com
    textelle.ee
    ticimax.com
    waterfallscanopy.com
    www.semalt.com
    www.semalt.com/crawler.php
    www.star61.de
    www.windstream.net
    youtube-downloader.savetubevideo.com/youtube-downloader.php
@mattab
Member
mattab commented Nov 25, 2014

@scheinercc thanks for suggestions. btw you should be editing config.ini.php rather than global.ini.php

@scheinercc

@mattab thanks for the note. Should it still be under [Tracker]? I only see [database], [General] and [Plugins]. Do I simply have to add it last?

@brynnd
brynnd commented Nov 29, 2014

l.facebook.com

@brynnd
brynnd commented Nov 30, 2014

Now that there are a few people reporting, I wonder if it might be a good idea to keep a full list somewhere convenient, like on this page, of all the reported referrer spammers. This would be just to help prevent duplicate reports. Because I just noticed that I already reported this domain, somewhere around the half-page mark, but didn't remember it. And the page is getting long enough that a lot of people would not be motivated to read the whole thing.

Just a thought :-)

@scheinercc

@brynnd @mattab I have updated my comment above from the 24.11. including all the links I could find in this issue and in #2268. I have removed all protocols ('http://') and query strings. The list is included in the code example as comma separated values and underneath in alphabetical order to make it easier to compare.

It would be good if someone could double check that I

  • haven't forgotten any URL
  • added any URL wrongfully
  • and that my explanation above is generally correct now

;)
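The clean-up described above (strip protocols and query strings, de-duplicate, sort alphabetically, join with commas) can be sketched as follows. This version goes one step further and also drops the path, on the assumption, taken from the config comment, that only the Referrer URL host is compared; `normalize_report` and `build_spam_setting` are hypothetical names:

```python
from urllib.parse import urlsplit

def normalize_report(reported: str) -> str:
    """Reduce a reported URL or bare domain to a lower-case host name."""
    text = reported.strip()
    if not text.lower().startswith(("http://", "https://")):
        text = "http://" + text  # give urlsplit a scheme so it finds the host
    return (urlsplit(text).hostname or "").lower()

def build_spam_setting(reports: list[str]) -> str:
    """Build the config line from raw reports: unique hosts, sorted, comma separated."""
    hosts = sorted({normalize_report(r) for r in reports} - {""})
    return 'referrer_urls_spam = "' + ",".join(hosts) + '"'

reports = [
    "http://musicas.kambasoft.com/2.php?u=http://mydomain.com",
    "http://5.kambasoft.com/2.php?u=http://mydomain.com",
    "herahair.com",
    "http://speechtotextservice.com/",
    "herahair.com",  # duplicate report, kept once
]
print(build_spam_setting(reports))
# referrer_urls_spam = "5.kambasoft.com,herahair.com,musicas.kambasoft.com,speechtotextservice.com"
```

Whether a report belongs on the list at all still needs a human decision; the sketch only tidies the format.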

@mattab mattab modified the milestone: Short term, Long term Dec 1, 2014
@fvdm
fvdm commented Dec 4, 2014

A few more:

waterfallscanopy.com
www.star61.de

@brynnd
brynnd commented Dec 6, 2014

I'm not quite sure about this one. The visitor landed at my site from a search, starting from this site. But I'm not sure it's a true search engine. In any case, it's not a true referrer!
http://www.reliancenetconnect.co.in/3g/search.html?q=CLICK%20AND%20DRAG%20WITH%20THE%20ERASER%20TO&channel=ui_new

@fvdm
fvdm commented Dec 6, 2014

@scheinercc, @brynnd l.facebook.com is not spam, just a redirector from Facebook mentions. When you blacklist this you block a lot of Facebook referrals; many of their outlinks go through l.facebook.com.

Also getpocket.com is not spam either, it is a popular bookmark service.

@scheinercc

@fvdm Thanks for checking. Updated the list(s) above accordingly.
@brynnd are you sure? It looks like a valid search result to me.

@brynnd
brynnd commented Dec 7, 2014

fvdm and scheinercc,
A referring website, as far as I understand it, is a website that has a link to my website. A search engine or a website's search service, doesn't count as a referring website, to my understanding. For reporting statistics about my website, I want searches to be counted in the Search Engine widget, not in Referrer Websites widgets.

If someone has a link to my website on their facebook account, then it needs to look like it comes from an individual facebook account - not a generic sub-domain of facebook. That doesn't help me much, if I don't know whose or which facebook account has made the referral.

The term "referrer spam" doesn't mean that we think these are spamming or malicious websites. It means someone has faked having a link to my website, when they really don't have a link. It messes up our statistics (for whatever reason we might have for wanting statistics).

Blocking these sites, in this context, does not mean we are blocking them from visiting our website. It just blocks them from showing up in our referring website statistics. At least that's my understanding.

scheinercc, I appreciate you making that list! It makes it easier to avoid double postings.

But I do still want getpocket and l.facebook listed, because they are not true referring websites, in my opinion. Actually, I don't know what a bookmarking service is. But if they somehow provide a way for people to easily find websites, just like bookmarks in a browser, that is not a referring website.

@scheinercc

@brynnd fair enough, but tbo ...

  1. for me it's just about the "spamming or malicious websites". I just started using piwik and have it on a couple of smaller domains where most of the traffic was spam, which I wanted to get rid of. And just as one example where FB might send a generic domain as referrer is if the link comes from a non-public profile ... maybe!?
  2. You can simply copy "my" list or make a second (small!?) one of these additional domains and combine them for yourself.
  3. I am not enough into the topic "analytics best practice" to argue with you if yours is the correct approach or not, but maybe @mattab can clear things up, or knows someone who can!?
@scheinercc

@brynnd I just checked @mattab first entry on top again and for me it is actually clear, that it is only about spam.

spammers that attack Piwik users with their lame websites

and also

A referring website, as far as I understand it, is a website that has a link to my website.

that's my understanding as well - https://en.wikipedia.org/wiki/HTTP_referer

A search engine or a website's search service, doesn't count as a referring website, to my understanding.

I would say that's wrong. Coming from search is a normal referrer, but it can be filtered for all (well) known search engines/pages, removed from the list of "referrer websites" and applied to its own list of referrers "search engines". I doubt that generic websites which are using a search engine's API fall under this category. Even if they would, it wouldn't be for the list generated here, but for the list that splits up incoming referrers into "search engines" and "other websites".

someone has faked having a link to my website, when they really don't have a link

I rather think that for the examples above of facebook and the bookmark service there are links, but those sites hide the real referrer for (account-) security reasons (https://en.wikipedia.org/wiki/HTTP_referer#Referer_hiding), or they have internal services they link to first before redirecting the request to your website, which leaves you with just a generic domain.

@mattab
Member
mattab commented Dec 8, 2014

Hi guys,

I created this issue so we can discuss the matter and decide together what is the best course of action. It's good to have this discussion, but I'm still not sure what is a good next step. In particular, I'm not sure how we can confirm that some websites are Spam or not.

For sure we should not mark l.facebook.com as "referrer spam" since FB is not spam. Marking as spam is pretty big deal: in Piwik if a user is detected as coming from one of the "spam websites" then the user action with a spam referrer will not be tracked in Piwik.

@fvdm
fvdm commented Dec 9, 2014

I'm with @mattab, blacklisting prevents the user from being tracked.

I think a smart approach would be for Piwik to release a referrer blocklist plugin so everyone can easily maintain their own list without manually editing the config file. It can have an option to (auto) upload the custom list to Piwik.org to count the most blocked domains. These can then be moderated and published on a synchronization feed back to the plugin.
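The "count the most blocked domains" step could look roughly like the sketch below. Everything in it is hypothetical (no such upload feed exists in this thread): each installation submits its custom list, each installation counts once per domain, and anything reported by at least `min_reports` installations becomes a candidate for moderation.

```python
from collections import Counter

def domains_to_review(uploaded_lists: list[list[str]], min_reports: int = 2) -> list[str]:
    """Count how many installations block each domain; return the frequent ones."""
    counts = Counter()
    for blocklist in uploaded_lists:
        # A set gives one vote per installation, even if a domain is listed twice.
        counts.update({domain.lower() for domain in blocklist})
    return sorted(domain for domain, n in counts.items() if n >= min_reports)

# Assumed sample uploads from three installations.
uploads = [
    ["semalt.com", "buttons-for-website.com"],
    ["semalt.com", "l.facebook.com"],
    ["buttons-for-website.com", "semalt.com", "semalt.com"],
]
print(domains_to_review(uploads))  # ['buttons-for-website.com', 'semalt.com']
```

The threshold only produces candidates. A domain such as l.facebook.com could still cross it if enough people list it, which is why the moderation step before publishing matters.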

Regarding search engines, I use the ReferrersManager plugin from @sgiehl for this.

@brynnd
brynnd commented Dec 9, 2014

mattab said
"For sure we should not mark l.facebook.com as "referrer spam" since FB is not spam. Marking as spam is pretty big deal: in Piwik if a user is detected as coming from one of the "spam websites" then the user action with a spam referrer will not be tracked in Piwik."

Maybe we should be clear about the term "referrer spam". There are all sorts of various apps and mods and plugins, which are designed and used to identify (and prevent) spam on websites. Because Piwik is not one of those, and instead is more of an analytics program, I don't think identifying spamming websites should be its goal.

Up until now, I thought "referrer spam" meant sites that have tricked Piwik into identifying it as a referring website (a site which has a link to my site), or that are inappropriately identified as a referrer. In only one case that I've reported, the referrer spam site was actually a true purveyor of spam! What I've understood (up until now) is that totally legitimate and respectable websites (such as Facebook) could be ID'd as referrer spam, because they found a way to trick Piwik into thinking they have a link to my site. What I've surmised by observation is that Piwik sometimes erroneously sends search engines or search service visits to Referrer Websites, instead of Search Engines.

mattab, can you confirm if that is a correct understanding (or not)? According to your comments that I quoted above, it sounds like Piwik is now in the business of identifying spam sites! I thought the term "spam" in this case, was relative to the Referrer Websites reporting category. If I don't want the site reported as a referrer, it's spam.

If Piwik wants to ID spam sites, they should work out a deal with StopForumSpam or similar!

When I report referrer spam, it means that I've checked the website and confirmed that it does not have a link to my site. That's all.

And the other thing I need to be clear on is how Piwik handles these sites. Until now, my understanding is that blocking the referrer spam meant blocking it from being counted as a referring website. I didn't think it meant that we are not tracking any visits from that site!

If blocking referrer spam means not tracking those visits at all, then I certainly would want the choice whether to use any sort of Piwik code or plugin that does so! I wouldn't want those visits not to be counted at all. I just don't want them counted as a referrer!

I'm content to let all my other spam apps and mods and plugins handle spam on my website. What I want from Piwik is not to count these sites as referrers. I want them counted or tracked, but not as referrer.

Another one that I get a lot is translate.googleusercontent.com. Sometimes it's from people translating one of my own webpages, and sometimes it's when someone translated a page from a referring website. To my understanding, it should only be in Referrer Websites if it's from a referring website. If it's my own site, that blows my statistics as much as referrer spam does.

@mattab
Member
mattab commented Dec 9, 2014

Hi @brynnd

If blocking referrer spam means not tracking those visits at all, then I certainly would want the choice whether to use any sort of Piwik code or plugin that does so! I wouldn't want those visits not to be counted at all. I just don't want them counted as a referrer!

This is not what the feature does. As explained, if you specify a website as Referrer Spam in the config, then all requests with this referrer will be excluded from Piwik (because they are from bots). What you want is likely something else and not this feature.

@brynnd
brynnd commented Dec 13, 2014

Well, no, I do want this feature. I just misunderstood it.

@mattab
Member
mattab commented Dec 16, 2014

FYI: New Referrer Spammer: buttons-for-website.com #6858

@Globulopolis
Contributor

New spam from make-money-online.7makemoneyonline.com
Need confirmation from users.

@fvdm
fvdm commented Dec 17, 2014

New spam from make-money-online.7makemoneyonline.com
Need confirmation from users.

@Globulopolis I noticed 7makemoneyonline.com in my stats too.

@ways2web

@Globulopolis

New spam from make-money-online.7makemoneyonline.com
Need confirmation from users.

confirm 7makemoneyonline.com with:
make-money-online.7makemoneyonline.com/money.php?u=myurl

@mnapoli
Member
mnapoli commented Dec 17, 2014

+1 for 7makemoneyonline.com

@llocally

+1 for 7makemoneyonline.com

@mattab
Member
mattab commented Dec 17, 2014

Thanks for the report, it's been added in #6872

@garfieldairlines

I have spam from :
make-money-online.7makemoneyonline.com
buttons-for-website.com

@ranzeflitze

I know this probably isn't related to modifying the config.ini.php file but after adding the new [Tracker] section to the end of the config.ini.php file, I get an error (below) and was wondering if you could point me in the right direction. I think the error stems from performing automatic upgrades but any assistance is greatly appreciated. Forgive me if this isn't the proper place to be asking this.

There is an error. Please report the message (Piwik 2.9.1) and full backtrace in the Piwik forums (please do a Search first as it might have been reported already!).

Warning: syntax error, unexpected $end, expecting TC_DOLLAR_CURLY or TC_QUOTED_STRING or '"' in /home/my_domain/public_html/piwik_install_location/config/config.ini.php on line 128 in /home/my_domain/public_html/piwik_install_location/libs/upgradephp/upgrade.php on line 134

Backtrace -->

#0 Piwik\Error::errorHandler(...) called at [:]
#1 parse_ini_file(...) called at [/home/my_domain/public_html/piwik_install_location/libs/upgradephp/upgrade.php:134]
#2 _parse_ini_file(...) called at [/home/my_domain/public_html/piwik_install_location/core/Config.php:331]
#3 Piwik\Config->init(...) called at [/home/my_domain/public_html/piwik_install_location/core/Config.php:407]
#4 Piwik\Config->__get(...) called at [/home/my_domain/public_html/piwik_install_location/core/SettingsPiwik.php:202]
#5 Piwik\SettingsPiwik::isPiwikInstalled(...) called at [/home/my_domain/public_html/piwik_install_location/core/SettingsPiwik.php:401]
#6 Piwik\SettingsPiwik::getPiwikInstanceId(...) called at [/home/my_domain/public_html/piwik_install_location/core/SettingsPiwik.php:378]
#7 Piwik\SettingsPiwik::rewritePathAppendPiwikInstanceId(...) called at [/home/my_domain/public_html/piwik_install_location/core/SettingsPiwik.php:276]
#8 Piwik\SettingsPiwik::rewriteTmpPathWithInstanceId(...) called at [/home/my_domain/public_html/piwik_install_location/core/Filechecks.php:49]
#9 Piwik\Filechecks::checkDirectoriesWritable(...) called at [/home/my_domain/public_html/piwik_install_location/core/Filechecks.php:73]
#10 Piwik\Filechecks::dieIfDirectoriesNotWritable(...) called at [/home/my_domain/public_html/piwik_install_location/core/FrontController.php:324]
#11 Piwik\FrontController->init(...) called at [/home/my_domain/public_html/piwik_install_location/core/dispatch.php:35]
#12 require_once(...) called at [/home/my_domain/public_html/piwik_install_location/index.php:46]

I did restore a backup of the original config file so at least I just have to deal with those sites for now.
Thanks again and have a great day!

@mattab
Member
mattab commented Dec 19, 2014

could you paste what you copied in your config file?

@Globulopolis
Contributor

@ranzeflitze try to quote string. E.g.

[Tracker]
somevar=true
somevar1=0
somevar2="string"

@mattab mattab referenced this issue in piwik/piwik-package Dec 19, 2014
Closed

System check says: File integrity check failed #16

@akriesch

buttons-for-website.com,make-money-online.7makemoneyonline.com,7makemoneyonline.com

@ranzeflitze

@mattab This is what I pasted into the config file (at the end of the file after all the PluginsInstalled)

[Tracker]
; Comma separated list of known Referrer Spammers, ie. bot visits that set a fake Referrer field.
; All Visits with a Referrer URL host set to one of these will be excluded.
; If you find new spam entries in Referrers>Websites, please report them here: #5099
referrer_urls_spam = "semalt.com,buttons-for-website.com,make-money-online.7makemoneyonline.com,7makemoneyonline.com”

Thank you.
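One detail in the pasted line above: the value opens with a straight `"` but closes with a typographic `”`. A parser would then see a string that never ends, which would be consistent with the "unexpected $end, expecting ... '"'" warning, though nothing in the thread confirms this as the cause. A small sketch for spotting such characters before saving the file (`find_curly_quotes` and the sample text are hypothetical):

```python
CURLY_QUOTES = "\u201c\u201d\u2018\u2019"  # typographic double and single quotes

def find_curly_quotes(ini_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines containing typographic quotes."""
    return [
        (number, line)
        for number, line in enumerate(ini_text.splitlines(), start=1)
        if any(ch in CURLY_QUOTES for ch in line)
    ]

# Shortened stand-in for the pasted setting: straight quote to open, curly to close.
pasted = '[Tracker]\nreferrer_urls_spam = "semalt.com,7makemoneyonline.com\u201d\n'
for number, line in find_curly_quotes(pasted):
    print(f"line {number}: {line}")
```

Typographic quotes often appear when a snippet passes through a word processor or an email client, so retyping the closing quote in a plain-text editor is a cheap thing to try.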

@Trance-Man

streamcyclone.com
perkinslawgroup.net

@fvdm
fvdm commented Jan 16, 2015

anticrawler.org

Interesting, they are to prevent bots or so they say... probably going to serve malware.

@PeterTheOne

+1 anticrawler.org

@ways2web

abcd4.de
findsimilarsites.de

@cdtoad
cdtoad commented Feb 2, 2015

Just got hit with about 50 buttons-for-website.com referrer in my dashboard. Upgraded yesterday to 2.10.0 and do see referrer_urls_spam = "semalt.com,buttons-for-website.com,7makemoneyonline.com" in global.ini.php. Am I missing a toggle somewhere in settings?

@fvdm
fvdm commented Feb 2, 2015

@cdtoad there is no toggle, would be nice though..

@brynnd
brynnd commented Feb 2, 2015

buttons-for-website.com redirects to http://sharebutton.net/. I'm not sure if both need to be blocked, but fwiw :-)

@brynnd
brynnd commented Feb 2, 2015

netvibes.com

@brynnd
brynnd commented Feb 7, 2015

ranksonic.info: not sure if this is spam or not, but it looks like it to me.

@samwaterston

+1 for ranksonic.info

@brynnd
brynnd commented Feb 19, 2015

I wonder if there has been any decision made about how to handle this issue. I've been putting off adding new spammers to the block list myself, and instead, wait for the next upgrade. But buttons-for-website and ranksonic have become prolific, and waiting for the next upgrade will not be possible for me. Now I'm going to follow the instructions that are posted above, and add them myself. But a nice button click would be very nice! Thanks :-)

@brynnd
brynnd commented Feb 19, 2015

Well, it turns out I'm going to need a little help, to add them myself. Up near the top of this thread, a comment by jlj says to edit this file: piwik/config/global.ini.php. And then it shows where to add the spam urls.

(Comma separated list of known Referrer Spammers, ie. bot visits that set a fake Referrer field.
All Visits with a Referrer URL host set to one of these will be excluded.
If you find new spam entries in Referrers>Websites, please report them here: #5099
referrer_urls_spam = "semalt.com,savetubevideo.com,kambasoft.com")

But when I find the global.ini.php file, it says near the top "; WARNING - YOU SHOULD NOT EDIT THIS FILE DIRECTLY - Edit config.ini.php instead." However, when I open config.ini.php, it doesn't look anything like global.ini.php. It doesn't even have the code from global.ini.php, where jlj says I need to add the spam urls.

Later in this thread (about halfway down the page) scheinercc gives some instructions with a list of all the urls to add. His instructions say:
"
in piwik config file piwik/config/config.ini.php
add below [Tracker] section:
"
and then he gives the same code that jlj indicated from the global.ini.php file. However, [Tracker] does not appear in config.ini.php.

So I am pretty thoroughly confused. Could someone clear this up for me? Where do I add the referrer spam urls?

Thank you very much :-)

@fvdm
fvdm commented Feb 19, 2015

@brynnd don't directly change global.ini.php, it's only for the default settings and descriptions and may be reset when upgrading Piwik. Instead copy the parts you wish to change to config.ini.php. When the relevant section like [Tracker] is missing you simply add it yourself:

[Tracker]
referrer_urls_spam = "semalt.com,savetubevideo.com,kambasoft.com"
@jlj
jlj commented Feb 19, 2015

@brynnd I just edited the comment above to clarify which config file shall be edited, for future readers of this thread. ;-)

@brynnd
brynnd commented Feb 20, 2015

So I don't need to add the part that starts with ";Comma separated list...."? I just need to put

[Tracker]
referrer_urls_spam = "semalt.com,savetubevideo.com,kambasoft.com"

in config.ini.php? Do I just put it at the end of config.ini.php?

@fvdm
fvdm commented Feb 20, 2015

@brynnd The lines that start with a ; are comments and therefore ignored by Piwik. You can add them for your own (future) reference, but they are not required.

If the [Tracker] section does not exist yet it's fine to add it to the end of the config.ini.php file.
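A minimal sketch for double-checking such an edit, assuming only the `[Tracker]` fragment shown in this thread: it reads the INI text with Python's standard `configparser` and returns the entries of `referrer_urls_spam`. The function name `read_spam_list` is hypothetical, this is not how Piwik itself reads its config, and a real config.ini.php may contain constructs that `configparser` does not accept.

```python
import configparser


def read_spam_list(ini_text):
    """Return the referrer_urls_spam entries found in INI text, as a list."""
    # Treat only ';' as a comment marker, as in the config quoted in this thread.
    parser = configparser.ConfigParser(comment_prefixes=(";",), interpolation=None)
    parser.read_string(ini_text)
    # An empty fallback covers a missing [Tracker] section or a missing key.
    raw = parser.get("Tracker", "referrer_urls_spam", fallback="")
    # configparser keeps the surrounding double quotes, so strip them by hand.
    raw = raw.strip().strip('"')
    return [entry.strip() for entry in raw.split(",") if entry.strip()]


# Sample text: the [Tracker] lines are the ones quoted in this thread.
sample = """
[Tracker]
; Comma separated list of known Referrer Spammers
referrer_urls_spam = "semalt.com,savetubevideo.com,kambasoft.com"
"""
print(read_spam_list(sample))
```

If the returned list is empty or an entry still carries a stray quote character, the section header or the quoting is the first thing to look at.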

@brynnd
brynnd commented Feb 21, 2015

Ok, thanks fvdm :-)

I've edited the file, and by tomorrow, I should know if it's working. The buttons-for-website and ranksonic have been coming every day lately. Although buttons-for-website first came up long enough ago that it should be in my version of piwik. So since these "manual install" instructions were posted, maybe they haven't been keeping the list up to date in new versions?

Here's an updated list (comma-separated), since the list provided by scheinercc hasn't been updated lately. Maybe we could keep it updated, with the addition of each new domain/URL?

Maybe every time someone reports a new domain/URL, they can copy this list, paste it in their message, and add their new item to it. If everyone does that, we can have an always current and complete list?

Just a thought :-)

5.kambasoft.com/2.php,9.kambasoft.com/2.php,16782868.website-errors-scanner.com,39946554.semalt.com/crawler.php,7makemoneyonline.com,abcd4.de,anticrawler.org,bodywisesupplements.com,buttons-for-website.com,caldercarrentals.com,coastalstickers.com.au,findsimilarsites.de,herahair.com,icompbusiness.com,livefixer.com,make-money-online.7makemoneyonline.com,make-money-online.7makemoneyonline.com/money.php,maranbrinfo.com.br,mountainstream.ms,musica.descargar-musica-gratis.net/descargar-musica-gratis.php,musicas.baixar-musicas-gratis.com/baixar-musicas-gratis.php,musicas.kambasoft.com/2.php,netvibes.com,perkinslawgroup.net,ranksonic.info,religionenserio.com,semalt.com,semalt.com/crawler.php,semalt.semalt.com/crawler.php,seo.my-api.com,sharebutton.net,speechtotextservice.com,streamcyclone.com,textelle.ee,ticimax.com,waterfallscanopy.com,www.semalt.com,www.semalt.com/crawler.php,www.star61.de,www.windstream.net,youtube-downloader.savetubevideo.com/youtube-downloader.php

@fvdm
fvdm commented Feb 21, 2015

You don't have to write full urls. Piwik checks if the referrer contains the string, so only including 7makemoneyonline.com is enough for all urls that mention this part somewhere.
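A rough illustration of the "contains" check described in this comment. It is a sketch of the idea only, not Piwik's actual code, and the function name `is_referrer_spam` is hypothetical.

```python
def is_referrer_spam(referrer_url, spam_entries):
    """Return True if any spam entry occurs anywhere in the referrer URL."""
    url = referrer_url.lower()
    # Plain substring test: a bare domain also matches its subdomains and paths.
    return any(entry.lower() in url for entry in spam_entries if entry)


spam_entries = ["7makemoneyonline.com", "semalt.com"]
for url in (
    "http://make-money-online.7makemoneyonline.com/money.php",
    "http://39946554.semalt.com/crawler.php",
    "http://example.org/",
):
    print(url, is_referrer_spam(url, spam_entries))
```

Under this kind of check, entries such as make-money-online.7makemoneyonline.com/money.php add nothing once 7makemoneyonline.com is listed.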

@mattab
Member
mattab commented Feb 23, 2015

Hi guys, I'm adding the following to the list of referrer spammers: anticrawler.org,ranksonic.info,savetubevideo.com,kambasoft.com. If there are more, please let me know.

@larsactionhero

Hi there,

I've created an alphabetically ordered list (just to make it more comfortable). Maybe I forgot some entries, and maybe some of them are not necessary, I'm not sure.
Feel free to edit and repost :)

7makemoneyonline.com,abcd4.de,anticrawler.org,bodywisesupplements.com,buttons-for-website.com,caldercarrentals.com,coastalstickers.com.au,descargar-musica-gratis.net,findsimilarsites.de,herahair.com,icompbusiness.com,kambasoft.com,livefixer.com,maranbrinfo.com.br,mountainstream.ms,musicas.baixar-musicas-gratis.com,netvibes.com,perkinslawgroup.net,ranksonic.info,religionenserio.com,savetubevideo.com,semalt.com,seo.my-api.com,sharebutton.net,speechtotextservice.com,star61.de,streamcyclone.com,textelle.ee,ticimax.com,urlopener.blogspot.com,waterfallscanopy.com,website-errors-scanner.com,windstream.net,youtube-downloader.savetubevideo.com

@cohan
cohan commented Mar 4, 2015

Switching from Google Analytics to piwik for this exact reason. Couple new ones I've started seeing that don't seem to be listed here already

o-o-6-o-o.com,bestwebsitesawards.com,*.darodar.com,ranksonic.org,ranksonic.info,delta-search.com,sr.searchfunmoods.com

@mattab
Member
mattab commented Mar 4, 2015

Hi guys,

Thanks for the feedback. I'm adding the following websites that I've confirmed to be referrer spammers:

  • ilovevitaly.com, priceg.com, blackhatworth.com, hulfingtonpost.com
  • darodar.com, econom.co, o-o-6-o-o.com, bestwebsitesawards.com, darodar.com, ranksonic.org, ranksonic.info

If you have others confirmed, please report them here!

@mattab
Member
mattab commented Mar 4, 2015

Switching from Google Analytics to piwik for this exact reason

That's the way to go @cohan :-)

@gfive
gfive commented Mar 7, 2015

What about http://www.projecthoneypot.org ? I use it for my forum, email forms and others. Works like a charm...

@mattab
Member
mattab commented Mar 8, 2015

@gfive does it provide a list of referrer spammers as well?

@gfive
gfive commented Mar 9, 2015

I think so. But its main function is to catch all spammers by IP. I used to get lots of visits from semalt.com as well, but after installing my honeypot they completely stopped.
An account is free at http://www.projecthoneypot.org so you could register and check if it offers what you are looking for...

@brynnd
brynnd commented Mar 11, 2015

I have a honey pot on my site, but it doesn't seem to stop the referrer spam. I guess these referrer spammers aren't traditional spammers.

If you installed the honey pot around the time you upgraded Piwik, it might have been a coincidence.

@samwaterston

Hi,

New spam referrer (I guess): 4webmasters.org referral traffic from Russia started showing 4 days ago on my website stats.

@brynnd
brynnd commented Apr 2, 2015

best-seo-solutions.com
It's a semalt-related domain.

@garfieldairlines

Same here : best-seo-solution.com all from Brazil from Thursday 2 April 2015 - 21:00 to Friday 3 April 2015 - 02:00 :
189.58.107.27
186.225.151.50
201.42.91.49
189.106.183.216

@Joey3000
Contributor
Joey3000 commented Apr 4, 2015

+1 for best-seo-solution.com

Additional note: referrer_urls_spam currently has the following entries twice:
darodar.com
ranksonic.info

The entries could be sorted alphabetically, to prevent duplication in future.
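A small sketch of that clean-up, assuming the value is the comma-separated string used for `referrer_urls_spam`; the helper name `normalize_spam_list` is made up for illustration.

```python
def normalize_spam_list(raw):
    """Lower-case, de-duplicate and alphabetically sort a comma-separated list."""
    entries = {entry.strip().lower() for entry in raw.split(",") if entry.strip()}
    return ",".join(sorted(entries))


raw = "darodar.com, econom.co,ranksonic.info,darodar.com,ranksonic.info"
print(normalize_spam_list(raw))
```

Sorting does not remove duplicates by itself, which is why the entries go through a set first; the sorted output then makes any later duplicate easy to spot in a diff.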

@mattab
Member
mattab commented Apr 7, 2015

best-seo-solution.com was added to the list, and the list was sorted in alphabetical order. We'll keep listening to your feedback in this issue 👍

@AgentGod
AgentGod commented Apr 7, 2015

Hello to all,
@mattab maybe the best way to maintain the spammer list would be to keep it in an external file instead of in global.ini.php.
There could be an option in the administration panel to enable/disable the spammer list, plus a button next to it to update the list, so that it would be possible to update only the list.
Another option is again an enable/disable switch for the spammer list, plus a text box next to it to easily add a spam URL.

@AgentGod
AgentGod commented Apr 7, 2015

Also, one spammer from me: humanorightswatch.org
In Google Analytics I have site13.simple-share-buttons.com and site11.simple-share-buttons.com.
Will the entry for simple-share-buttons.com also block site13.simple-share-buttons.com? If not, is it possible to add something like *.simple-share-buttons.com?

@jazzcrack

+1 best-seo-solutions.com

@Globulopolis
Contributor

Totally agree with @AgentGod. And I think regex support for URLs would help to reduce the list of similar URLs.

@brynnd
brynnd commented Apr 10, 2015

Note that there's a best-seo-solutions.com, and best-seo-solution.com! Gggrrr!!

@gallolu
gallolu commented Apr 10, 2015

Please add also buttons-for-your-website.com and best-seo-solution.com .. :)

@brynnd
brynnd commented Apr 10, 2015

I guess this is a new trend, and perhaps needs to be considered in a permanent feature. Now it looks like these guys are changing their domain names just slightly, to bypass our blocks!

It leads me to wonder (AGAIN!) what benefit these guys gain in doing this. "Spam" is "spam" and I guess no one really knows or understands its purpose. But I still do wonder why they waste time on this. What benefit could there possibly be? For anyone?

Note that also there is buttons-for-website.com and buttons-for-your-website.com ggrrr!

Also, another new one: best-seo-offer.com

@AgentGod

+1 best-seo-offer.com
+1 buttons-for-your-website.com

@fvdm
fvdm commented Apr 11, 2015

+1 for regex support, would be very useful

@campino2k

+1 best-seo-offer.com

@mnapoli
Member
mnapoli commented Apr 13, 2015

I guess this is a new trend, and perhaps needs to be considered in a permanent feature. Now it looks like these guys are changing their domain names just slightly, to bypass our blocks!

I agree this is getting more and more of a problem. By the way it seems that Google Analytics is not tackling this problem yet (I see all those spammers in GA), which gives Piwik a nice little plus.

Anyway I believe too that we need to find a more robust solution for this: as a user I don't like to have to wait for new Piwik releases to exclude new spammers (my data keeps being polluted in the meantime). I also think the current way of reporting spammers (GitHub issue) is not the best.

We need:

  • to make it easier for users to report new spammers
  • Piwik to auto-update the spammers list
  • while still keeping the list up to date in new releases (for the Piwik installs that are set up to avoid any external network call)

@mattab I'd like to open a separate issue to discuss a solution (I have a few ideas) so that we keep this issue for reporting spammers, what do you think? Should we tackle this soon or let it be for now?

@mattab
Member
mattab commented Apr 14, 2015

@mattab I'd like to open a separate issue to discuss a solution (I have a few ideas) so that we keep this issue for reporting spammers, what do you think? Should we tackle this soon or let it be for now?

Something we can try with low effort is to put the list in a separate file with one spammer per line, and then ask the community to submit pull requests. Because it's so easy to make a PR from the GitHub web interface alone, it could be a quick and efficient solution.

+1 to discussing the auto-update feature in a separate new issue, to make sure it is followed up.

Edit: as it looks like referrer spamming is here to stay, I guess many people managing websites will have issues with this. There is value in Piwik sharing our list with the world!
-> maybe we create a separate repo with just the spammer list?
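A minimal sketch of reading the one-spammer-per-line file suggested above. The file layout (one domain per line, blank lines ignored) is an assumption taken from this comment, and `parse_spammer_file` is a hypothetical helper name.

```python
def parse_spammer_file(text):
    """Return the non-empty lines of a one-domain-per-line list, trimmed."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# Example file contents, using domains reported in this thread.
file_text = "anticrawler.org\nbest-seo-offer.com\n\nsemalt.com\n"
print(parse_spammer_file(file_text))
```

With one entry per line, a pull request that adds a spammer shows up as a single added line, which keeps reviews simple.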

@mnapoli
Member
mnapoli commented Apr 14, 2015

I've opened #7674 to continue the discussion.

@mnapoli
Member
mnapoli commented Apr 17, 2015

It seems we are not the only ones trying to build such a list, see http://www.reddit.com/r/Wordpress/comments/2qteln/i_want_to_build_a_list_of_referrer_spam_links_to/ (seems to be updated regularly)

@Fensterbank
Contributor

While there is no better way to report spammers at the moment, I'll continue here :)

  • buttons-for-your-website.com (which is not the same as buttons-for-website.com, which is already on the list)
  • best-seo-offer.com
@AgentGod

Entire list alphabetically:

4webmasters.org,7makemoneyonline.com,adcash.com,anticrawler.org,best-seo-offer.com,best-seo-solution.com,bestwebsitesawards.com,blackhatworth.com,buttons-for-website.com,buttons-for-your-website.com,cenokos.ru,cenoval.ru,cityadspix.com,darodar.com,econom.co,iskalko.ru,edakgfvwql.ru,forum.smailik.org,Get-Free-Traffic-Now.com,gobongo.info,googlsucks.com,hulfingtonpost.com,humanorightswatch.org,ilovevitaly.co,ilovevitaly.com,ilovevitaly.ru,kambasoft.com,luxup.ru,make-money-online.7makemoneyonline.com,myftpupload.com,o-o-6-o-o.ru,o-o-8-o-o.ru,priceg.com,prlog.ru,ranksonic.info,ranksonic.org,savetubevideo.com,screentoolkit.com,semalt.com,semalt.semalt.com,seoexperimenty.ru,simple-share-buttons.com,slftsdybbg.ru,social-buttons.com,socialseet.ru,superiends.org,theguardlan.com,vodkoved.ru,websocial.me,ykecwqlixx.ru

@adegans
adegans commented Apr 18, 2015

I think this can easily be countered if someone builds a thingy in Piwik to download a new list every so many hours, say once every 24 hours.

And a serving system where people can submit links: if 10 (?) people submit the same link, it's added to the downloadable list, which each Piwik setup can fetch every 24 hours. This could employ a simple filter to catch double submissions like www.semalt.com vs. semalt.com, so the list stays somewhat compact and clean.

This submission system can be a thing in Github (Similar to torrent blocklists - https://gist.github.com/johntyree/3331662) or something more fancy.
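A sketch of the submission idea described above, under stated assumptions: submissions arrive as (reporter, domain) pairs, a leading "www." is dropped so that www.semalt.com and semalt.com count as the same entry, and a domain is accepted once 10 distinct reporters have submitted it. All names here are hypothetical and no such service is described in this thread.

```python
from collections import Counter


def normalize_domain(submission):
    """Lower-case a submitted domain and drop a leading 'www.'."""
    domain = submission.strip().lower()
    if domain.startswith("www."):
        domain = domain[len("www."):]
    return domain


def accepted_domains(submissions, threshold=10):
    """Return domains reported by at least `threshold` distinct reporters."""
    # A set removes repeat submissions of the same domain by the same reporter.
    unique_reports = {(reporter, normalize_domain(domain)) for reporter, domain in submissions}
    counts = Counter(domain for _, domain in unique_reports)
    return sorted(domain for domain, n in counts.items() if n >= threshold)


reports = [("user%d" % i, "www.semalt.com" if i % 2 else "semalt.com") for i in range(10)]
reports.append(("user0", "semalt.com"))   # repeat by the same reporter, not counted twice
reports.append(("user1", "example.org"))  # a single report, below the threshold
print(accepted_domains(reports))
```

Counting distinct reporters rather than raw submissions is a design choice made here so that one person cannot push a domain onto the list alone.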

I'm not overly familiar with Github and new to Piwik. Just came across this because I have a referrer spam problem also.

@mnapoli
Member
mnapoli commented Apr 19, 2015

The list has been moved to https://github.com/piwik/referrer-spam-blacklist in order to make it more visible and more practical to maintain.

Please open issues and pull requests in that new repository :)

If you are interested to know how this list will be handled, read this issue: #7674

@Fensterbank thanks I confirm those 2, I have added them to the new list.

@mnapoli mnapoli closed this Apr 19, 2015
@mattab
Member
mattab commented Apr 21, 2015

How do I contribute to the Referrer spammer Piwik list?

To add a new referrer spammer to the list, click here to edit the spammers.txt file and create a pull request. Alternatively you can create a new issue.

Looking forward in the future to maintaining this spammer list together as a community 👍

@sbrickey
sbrickey commented May 5, 2015

Since the list is now maintained via file on github, any chance that the admin site can check for newer versions, and ideally go download it automatically?

@mnapoli
Member
mnapoli commented May 6, 2015
@brynnd
brynnd commented Jun 6, 2015

Well, it's exciting to see some new energy and seeing this project move forwards!

Unfortunately, I'm not very familiar with programming or how this site works. When I click the link in mattab's last msg, to edit the spammers.txt file, it says:

-- You need to fork this repository to propose changes.
-- Sorry, you’re not able to edit this repository directly— you need to fork it and propose your changes from there instead.

I don't know what it means to "fork this repository". So I used the other option and reported a new spammer (another semalt variety). But I'm still not sure how to proceed with my piwik installation.

I did read #7674, but unfortunately, I don't understand much of it. I also read https://github.com/piwik/referrer-spam-blacklist, but again, don't understand much.

If I have the current version of Piwik, am I getting all the spammers blocked? Or do I need to continue adding new spammers to my config.ini.php file?

Thank you very much :-)

@mnapoli
Member
mnapoli commented Jun 6, 2015

@brynnd it's fine if you weren't able to edit the file directly, opening an issue is good too. When it is added to the list it will be included in the new Piwik version, so that's why it's important to keep Piwik up to date.

In the future we want to auto-update the list so that you get the latest spammers blocked even before the new Piwik release is available (issue #7674).

@brynnd
brynnd commented Jun 9, 2015

Thanks mnapoli!

Where can I check the current list, so I don't accidentally add duplicates to the list? Especially these new semalt-related ones, where they're just changing the domain by a character or 2.

@mattab
Member
mattab commented Jun 9, 2015

@brynnd latest version is at: https://github.com/piwik/referrer-spam-blacklist/

Please note: you don't need to add the semalt variations if they are sub-domains of semalt.com (or of any other listed spammer). But if they are new domain names (not sub-domains), then please suggest the new spammers on this project: https://github.com/piwik/referrer-spam-blacklist/
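To tell a sub-domain from a new domain name before reporting, a check along these lines may help. It is only a sketch: the function name `is_covered` is hypothetical, and comparing the referrer host against listed domains this way is an assumption, not a description of how Piwik matches entries.

```python
from urllib.parse import urlparse


def is_covered(referrer_url, listed_domains):
    """Return True if the URL's host is a listed domain or one of its sub-domains."""
    # urlparse only fills in hostname when the URL includes a scheme such as http://
    host = (urlparse(referrer_url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in listed_domains)


listed = ["semalt.com"]
print(is_covered("http://39946554.semalt.com/crawler.php", listed))  # sub-domain
print(is_covered("http://semaltmedia.com/", listed))                 # different domain
```

A host like semaltmedia.com is a separate domain name under this check, so it would be worth reporting on the new project.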

@ghost
ghost commented Jun 25, 2015

I've added several domains to piwik/config/config.ini.php as described above. A week has passed and every day I see new entries from these same domains. Am I missing something basic, or is this worth opening a new issue?

This is at the end of my config.ini.php, and I've restarted Apache (and later the server) but modifying this configuration file has had no effect at all:

[Tracker]
referrer_urls_spam = "100dollars-seo.com,semaltmedia.com"
@mnapoli
Member
mnapoli commented Jun 25, 2015

@tombrossman with the latest Piwik versions this INI config option isn't used anymore. There will be a new Piwik release very soon (probably tomorrow), else you can update to the latest beta and those spammers will be blocked.

With Piwik 2.14 the spammers list will be updated automatically.

@ghost
ghost commented Jun 25, 2015

Ah, thanks - that makes sense now. I thought it was me doing something stupid again...

@jpjp
jpjp commented Jul 13, 2015

Is anti-referral spamming included in piwik 2.14 and enabled by default? Is there any way to retroactively apply it, like with the geoip location dbs? I couldn't find any documentation on this. Thanks.

@mnapoli
Member
mnapoli commented Jul 19, 2015

Is anti-referral spamming included in piwik 2.14 and enabled by default?

Yes

Is there any way to retroactively apply it, like with the geoip location dbs?

No, it doesn't apply retroactively.

@mattab mattab added the answered label Oct 5, 2016