Include search consistently
It's possible to go to, say, en.wikipedia.org/?search=foobarbazqux
or even en.wikipedia.org/wiki/an_article?search=foobarbazqux, because
MediaWiki is trying to kill me. Search pages are manually-accessed
HTML pages, and it's looking a lot like the iOS search
includes this structure. So, this patch includes them explicitly.

Change-Id: I9312cbfcc3987c6a70c3c95a288ce4cefc91dc31
ironholds committed Mar 2, 2015
1 parent 287da77 commit 05e5da9
Showing 3 changed files with 3 additions and 1 deletion.
1 change: 1 addition & 0 deletions changelog.md
@@ -2,6 +2,7 @@
* Stop counting edit attempts as pageviews
* Start counting www.wikidata.org hits
* Start counting www.mediawiki.org hits
+* Consistently count search attempts

## v0.0.7
* Add Referer classifier
@@ -47,7 +47,7 @@ public class Pageview {
);

private static final Pattern uriQueryPattern = Pattern.compile(
-"\\?((cur|old)id|title)="
+"\\?((cur|old)id|title|search)="
);

private static final Pattern uriPathUnwantedSpecialPagesPattern = Pattern.compile(
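
To illustrate the effect of the pattern change, here is a minimal sketch that compares the old and new regular expressions against the query string used in the new test row. The class name `SearchQueryDemo`, the constant names, and the `matches` helper are hypothetical, and the use of `find()` is an assumption: the diff does not show how `Pageview` applies `uriQueryPattern`.

```java
import java.util.regex.Pattern;

public class SearchQueryDemo {
    // Patterns copied from the diff: before and after this commit.
    static final Pattern OLD_QUERY = Pattern.compile("\\?((cur|old)id|title)=");
    static final Pattern NEW_QUERY = Pattern.compile("\\?((cur|old)id|title|search)=");

    // Hypothetical helper: true when the query string contains a recognised key.
    // Assumes find() semantics; the real class may apply the pattern differently.
    static boolean matches(Pattern pattern, String uriQuery) {
        return pattern.matcher(uriQuery).find();
    }

    public static void main(String[] args) {
        String query = "?search=afdfsdfsd"; // query string from the new test row
        System.out.println(matches(OLD_QUERY, query));        // false
        System.out.println(matches(NEW_QUERY, query));        // true
        System.out.println(matches(NEW_QUERY, "?title=Foo")); // true
    }
}
```

Because the pattern requires a literal `?` immediately before the key, it only recognises `search` as the first query parameter; a query such as `?foo=1&search=x` does not match either version.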
1 change: 1 addition & 0 deletions refinery-core/src/test/resources/pageview_test_data.csv
@@ -15,6 +15,7 @@ Is Pageview – Desktop - Chinese zh-sg, true,false,174.62.175.93,-,zh.wikipedia
Is Pageview – Desktop - Chinese zh-tw, true,false,174.62.175.94,-,zh.wikipedia.org,/zh-tw/Wikipedia:首页,-,200,text/html,Five-test plan
Is Pageview – Wikidata, true, true,174.62.175.94,-,www.wikidata.org,/wiki/Q5651758,-,200,text/html,Five-test plan
Is Pageview – MediaWiki, true, true,174.62.175.94,-,www.mediawiki.org,/wiki/Gerrit/git-review,-,200,text/html,Five-test plan
+Is Pageview – iOS search, true,false,174.62.175.94,-,en.wikipedia.org,/,?search=afdfsdfsd,200,text/html,Five-test plan
Is Not Pageview - http_status != 200, false,true,174.62.175.95,-, en.wikipedia.org, /wiki/Noppperrrrs,-,400,text/html ,turnip
Is Not Pageview - content_type does not match, false,true,174.62.175.96,-, en.wikipedia.org, /wiki/Noppperrrrs,-,200, image/png, turnip
Is Not Pageview - API stupidity: it outputs a 200 status code and text/html as a MIME type on certain classes of error., false, false,174.62.175.97,-, en.wikipedia.org, /w/api.php,-,200, text/html, turnip