A sequence of user actions that is missed by Visitor Log #7773

Closed
BaruchYoussin opened this Issue Apr 27, 2015 · 3 comments

BaruchYoussin commented Apr 27, 2015

I have seen the following sequence of actions go unreported in the Visitor Log. This happened twice, with two different visitors.

In both cases the user first opens Java incorrect time zone bug on Windows, then Step by step Java Native Interface (JNI) guide and tutorial (which Piwik does not report, apparently because the user closes the page immediately), and then opens Time zones in Java: design discussion.

The Piwik script reports the first and the third of the viewed pages, as seen in the server log, but in both cases the visit is not listed in the Visitor Log at all.

The first visit is logged as follows:

87.249.Visitor.IP - - [01/Apr/2015:08:05:49 -0500] "GET /java-incorrect-time-zone-bug-windows.html HTTP/1.1" 200 9251 "http://stackoverflow.com/questions/2106525/java-incorrect-timezone" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0"
87.249.Visitor.IP - - [01/Apr/2015:08:05:49 -0500] "GET /wp-content/cache/autoptimize/js/autoptimize_2af39afd414802b92688b956476aa38e.js?b7a424 HTTP/1.1" 200 46999 "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0"
87.249.Visitor.IP - - [01/Apr/2015:08:05:50 -0500] "GET /piwik-analytics/piwik.js HTTP/1.1" 200 15094 "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0"
87.249.Visitor.IP - - [01/Apr/2015:08:05:50 -0500] "GET /wp-content/cache/autoptimize/css/autoptimize_8de2408e83a7c9dd6137e726b3faba75.css HTTP/1.1" 200 17565 "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0"
50.22.11.17 - - [01/Apr/2015:08:05:57 -0500] "GET /piwik-analytics/index.php?module=API&method=API.getDefaultMetricTranslations&format=original&serialize=1&trigger=archivephp HTTP/1.1" 200 4085 "-" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0"
50.22.11.17 - - [01/Apr/2015:08:06:01 -0500] "GET /piwik-analytics/index.php?module=API&method=CoreAdminHome.runScheduledTasks&format=csv&convertToUnicode=0&token_auth=bdb038e526e3e7e663498d6cbfb59ff9&trigger=archivephp HTTP/1.1" 200 73 "-" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0"
87.249.Visitor.IP - - [01/Apr/2015:08:05:51 -0500] "GET /piwik-analytics/piwik.php?action_name=Java%20incorrect%20time%20zone%20bug%20on%20Windows&idsite=1&rec=1&r=850876&h=17&m=5&s=51&url=http%3A%2F%2Fbaruchyoussin.com%2Fjava-incorrect-time-zone-bug-windows.html&urlref=http%3A%2F%2Fstackoverflow.com%2Fquestions%2F2106525%2Fjava-incorrect-timezone&_id=248d9efc2d0e8449&_idts=1427893551&_idvc=1&_idn=0&_refts=1427893551&_viewts=1427893551&_ref=http%3A%2F%2Fstackoverflow.com%2Fquestions%2F2106525%2Fjava-incorrect-timezone&send_image=0&pdf=0&qt=0&realp=0&wma=0&dir=0&fla=1&java=1&gears=0&ag=1&cookie=1&res=1280x1024&gt_ms=500 HTTP/1.1" 204 - "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0"
87.249.Visitor.IP - - [01/Apr/2015:08:06:02 -0500] "GET /java-native-interface-jni-programmer-guide-tutorial-javah.html HTTP/1.1" 200 8545 "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0"
87.249.Visitor.IP - - [01/Apr/2015:08:06:02 -0500] "GET /favicon.ico HTTP/1.1" 200 - "-" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0"
87.249.Visitor.IP - - [01/Apr/2015:08:06:20 -0500] "GET /time-zone-design-in-java-olson-time-zone-database.html HTTP/1.1" 200 7131 "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0"
87.249.Visitor.IP - - [01/Apr/2015:08:06:20 -0500] "GET /piwik-analytics/piwik.php?action_name=Time%20zone%20design%20in%20Java%20%7C%20Olson%20time%20zone%20database%20in%20Java&idsite=1&rec=1&r=128961&h=17&m=6&s=20&url=http%3A%2F%2Fbaruchyoussin.com%2Ftime-zone-design-in-java-olson-time-zone-database.html&urlref=http%3A%2F%2Fbaruchyoussin.com%2Fjava-incorrect-time-zone-bug-windows.html&_id=248d9efc2d0e8449&_idts=1427893551&_idvc=1&_idn=0&_refts=1427893551&_viewts=1427893551&_ref=http%3A%2F%2Fstackoverflow.com%2Fquestions%2F2106525%2Fjava-incorrect-timezone&send_image=0&pdf=0&qt=0&realp=0&wma=0&dir=0&fla=1&java=1&gears=0&ag=1&cookie=1&res=1280x1024&gt_ms=484 HTTP/1.1" 204 - "http://baruchyoussin.com/time-zone-design-in-java-olson-time-zone-database.html" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0"

Here Visitor.IP stands for the trailing part of the actual visitor's IP and 50.22.11.17 is the IP of my server.
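
For anyone repeating this comparison, the piwik.php hits can be pulled out of a raw access log with a short script. The following is a minimal Python sketch, not anything Piwik ships: it assumes the Apache combined log format shown above, and the names `LOG_RE` and `tracking_hits`, as well as the default tracker path, are chosen here for illustration.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Combined Log Format:
# host ident user [time] "METHOD target HTTP/x.y" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>.*?) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def tracking_hits(lines, tracker_path="/piwik-analytics/piwik.php"):
    """Yield (host, time, params) for every request to the tracker endpoint.

    tracker_path is an assumption taken from the logs in this issue;
    adjust it to wherever piwik.php lives on your site.
    """
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match is None:
            continue  # not a combined-format line
        parts = urlsplit(match.group("target"))
        if parts.path != tracker_path:
            continue  # page, asset, or archive request; not a tracking hit
        # parse_qs decodes percent-escapes and returns a list per key;
        # keep the first value of each parameter.
        params = {key: values[0] for key, values in parse_qs(parts.query).items()}
        yield match.group("host"), match.group("time"), params
```

Grouping the yielded hits by the `_id` parameter gives one entry per visitor ID; in the first visit above, both tracking requests carry `_id=248d9efc2d0e8449`, so they would be expected to appear as a single visit with two pageviews.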

The second visit is logged as follows:
31.61.Visitor.IP - - [01/Apr/2015:09:38:59 -0500] "GET /java-incorrect-time-zone-bug-windows.html HTTP/1.1" 200 9251 "http://www.google.pl/url?sa=t&rct=j&q=&esrc=s&source=web&cd=10&cad=rja&uact=8&ved=0CFwQFjAJ&url=http%3A%2F%2Fbaruchyoussin.com%2Fjava-incorrect-time-zone-bug-windows.html&ei=aAEcVYbbLeyP7Ab3hYCoBg&usg=AFQjCNE5Vhj-qGMY_9FrcQjLEUAw38ge5w&bvm=bv.89744112,d.ZGU" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"
31.61.Visitor.IP - - [01/Apr/2015:09:38:59 -0500] "GET /wp-content/cache/autoptimize/js/autoptimize_2af39afd414802b92688b956476aa38e.js?b7a424 HTTP/1.1" 200 46999 "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"
31.61.Visitor.IP - - [01/Apr/2015:09:39:00 -0500] "GET /piwik-analytics/piwik.js HTTP/1.1" 200 15094 "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"
31.61.Visitor.IP - - [01/Apr/2015:09:39:00 -0500] "GET /wp-content/cache/autoptimize/css/autoptimize_8de2408e83a7c9dd6137e726b3faba75.css HTTP/1.1" 200 17565 "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"
50.22.11.17 - - [01/Apr/2015:09:39:01 -0500] "GET /piwik-analytics/index.php?module=API&method=API.getDefaultMetricTranslations&format=original&serialize=1&trigger=archivephp HTTP/1.1" 200 4085 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"
50.22.11.17 - - [01/Apr/2015:09:39:03 -0500] "GET /piwik-analytics/index.php?module=API&method=CoreAdminHome.runScheduledTasks&format=csv&convertToUnicode=0&token_auth=bdb038e526e3e7e663498d6cbfb59ff9&trigger=archivephp HTTP/1.1" 200 73 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"
31.61.Visitor.IP - - [01/Apr/2015:09:39:00 -0500] "GET /piwik-analytics/piwik.php?action_name=Java%20incorrect%20time%20zone%20bug%20on%20Windows&idsite=1&rec=1&r=833922&h=16&m=39&s=0&url=http%3A%2F%2Fbaruchyoussin.com%2Fjava-incorrect-time-zone-bug-windows.html&urlref=http%3A%2F%2Fwww.google.pl%2Furl%3Fsa%3Dt%26rct%3Dj%26q%3D%26esrc%3Ds%26source%3Dweb%26cd%3D10%26cad%3Drja%26uact%3D8%26ved%3D0CFwQFjAJ%26url%3Dhttp%3A%2F%2Fbaruchyoussin.com%2Fjava-incorrect-time-zone-bug-windows.html%26ei%3DaAEcVYbbLeyP7Ab3hYCoBg%26usg%3DAFQjCNE5Vhj-qGMY_9FrcQjLEUAw38ge5w%26bvm%3Dbv.89744112%2Cd.ZGU&_id=042fdff7fec856fa&_idts=1427899140&_idvc=1&_idn=0&_refts=1427899140&_viewts=1427899140&_ref=http%3A%2F%2Fwww.google.pl%2Furl%3Fsa%3Dt%26rct%3Dj%26q%3D%26esrc%3Ds%26source%3Dweb%26cd%3D10%26cad%3Drja%26uact%3D8%26ved%3D0CFwQFjAJ%26url%3Dhttp%3A%2F%2Fbaruchyoussin.com%2Fjava-incorrect-time-zone-bug-windows.html%26ei%3DaAEcVYbbLeyP7Ab3hYCoBg%26usg%3DAFQjCNE5Vhj-qGMY_9FrcQjLEUAw38ge5w%26bvm%3Dbv.89744112%2Cd.ZGU&send_image=0&pdf=0&qt=1&realp=0&wma=1&dir=0&fla=1&java=1&gears=0&ag=0&cookie=1&res=1920x1080&gt_ms=220 HTTP/1.1" 204 - "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"
31.61.Visitor.IP - - [01/Apr/2015:09:39:03 -0500] "GET /favicon.ico HTTP/1.1" 200 - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"
31.61.Visitor.IP - - [01/Apr/2015:09:39:03 -0500] "GET /java-native-interface-jni-programmer-guide-tutorial-javah.html HTTP/1.1" 200 8545 "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"
31.61.Visitor.IP - - [01/Apr/2015:09:39:04 -0500] "GET /favicon.ico HTTP/1.1" 200 - "-" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"
31.61.Visitor.IP - - [01/Apr/2015:09:39:18 -0500] "GET /time-zone-design-in-java-olson-time-zone-database.html HTTP/1.1" 200 7131 "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"
31.61.Visitor.IP - - [01/Apr/2015:09:39:18 -0500] "GET /piwik-analytics/piwik.php?action_name=Time%20zone%20design%20in%20Java%20%7C%20Olson%20time%20zone%20database%20in%20Java&idsite=1&rec=1&r=552871&h=16&m=39&s=18&url=http%3A%2F%2Fbaruchyoussin.com%2Ftime-zone-design-in-java-olson-time-zone-database.html&urlref=http%3A%2F%2Fbaruchyoussin.com%2Fjava-incorrect-time-zone-bug-windows.html&_id=042fdff7fec856fa&_idts=1427899140&_idvc=1&_idn=0&_refts=1427899140&_viewts=1427899140&_ref=http%3A%2F%2Fwww.google.pl%2Furl%3Fsa%3Dt%26rct%3Dj%26q%3D%26esrc%3Ds%26source%3Dweb%26cd%3D10%26cad%3Drja%26uact%3D8%26ved%3D0CFwQFjAJ%26url%3Dhttp%3A%2F%2Fbaruchyoussin.com%2Fjava-incorrect-time-zone-bug-windows.html%26ei%3DaAEcVYbbLeyP7Ab3hYCoBg%26usg%3DAFQjCNE5Vhj-qGMY_9FrcQjLEUAw38ge5w%26bvm%3Dbv.89744112%2Cd.ZGU&send_image=0&pdf=0&qt=1&realp=0&wma=1&dir=0&fla=1&java=1&gears=0&ag=0&cookie=1&res=1600x900&gt_ms=204 HTTP/1.1" 204 - "http://baruchyoussin.com/time-zone-design-in-java-olson-time-zone-database.html" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"

Neither visit was listed in the Piwik Visitor Log.

I have compared Piwik with Google Analytics and reported the results here; this bug accounts for the four pages missed by Piwik according to that report.

BaruchYoussin Apr 27, 2015

One more thing puzzles me, although it is probably irrelevant.

These logs indicate that for the second page, which was viewed and immediately closed without Piwik reporting it, the referrer was in both cases http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html :

 31.61.Visitor.IP - - [01/Apr/2015:09:39:03 -0500] "GET /java-native-interface-jni-programmer-guide-tutorial-javah.html HTTP/1.1" 200 8545 "http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html" "Mozilla/5.0 (X11; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0"

However, this referrer page http://baruchyoussin.com/java-incorrect-time-zone-bug-windows.html does not contain any visible link to the viewed page http://baruchyoussin.com/java-native-interface-jni-programmer-guide-tutorial-javah.html . It does contain a link to it in the head, as a link element with rel='next'.
It is not clear to me how a Firefox 35 or 39 visitor (according to the same logs) could have used such a link.

mattab Apr 29, 2015

Member

Hi @BaruchYoussin

It's not clear to me that there is a bug. Do you mind asking in the forums first? http://forum.piwik.org/

It will help if you could post the command you used to import logs and the output you got from this command.

@mattab mattab closed this Apr 29, 2015

@mattab mattab added the answered label Apr 29, 2015

BaruchYoussin Apr 30, 2015

Hi @mattab

it will help if you could post the command you used to import logs and the output you got from this command

--I did not import logs. All I did was install and activate Piwik on my site, let it run for a month, and open the Visitor Log. Separately, I downloaded the raw access logs from my server for the same period and compared the two.
The comparison showed a pattern in the server logs (see above) that appeared twice, with the same sequence of pageviews (some downloaded with all files and some closed immediately): visits recorded in the server logs but not reported in the Piwik Visitor Log. These visits included calls to piwik.php reporting the pageviews, with the exception of those that were immediately closed.
I believe that the absence of these piwik.php calls from the Piwik Visitor Log is a malfunction of the Piwik system (somewhere between piwik.php and the Visitor Log), unless you tell me that these were actually bot visits, which Piwik knows how to identify.
In the latter case it would be great if you could clarify this, as people on the Piwik forums are interested in such comparisons:
http://forum.piwik.org/read.php?15,125863
http://forum.piwik.org/read.php?15,126295
What is at stake here is the reliability of Piwik reports.
To reproduce, you can try to simulate the calls to piwik.php that appear in these logs and see how they appear in the Visitor Log.
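
A minimal Python sketch of such a simulation follows. It is only an illustration under stated assumptions: `build_replay_request` and `replay` are hypothetical helper names, `http://example.com` is a placeholder for the site running Piwik, and the request target, user agent, and referrer are meant to be copied from a logged piwik.php line. A request replayed this way arrives from the sender's IP at the current time, so it will not be an exact copy of the original visit.

```python
import urllib.request


def build_replay_request(base_url, target, user_agent, referer):
    """Build a GET request that repeats a logged tracker call.

    target is the path and query string exactly as logged, for example
    "/piwik-analytics/piwik.php?action_name=...&idsite=1&rec=1".
    """
    request = urllib.request.Request(base_url.rstrip("/") + target, method="GET")
    # Send the same headers the visitor's browser sent, since the tracker
    # may take them into account when recording the visit.
    request.add_header("User-Agent", user_agent)
    request.add_header("Referer", referer)
    return request


def replay(request, timeout=10):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example (placeholder host; not executed here):
# req = build_replay_request(
#     "http://example.com",
#     "/piwik-analytics/piwik.php?idsite=1&rec=1&action_name=Test",
#     "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0",
#     "http://example.com/some-page.html",
# )
# print(replay(req))
```

In the logs above the tracker answers these calls with status 204, so a replayed request returning 204 while no visit shows up in the Visitor Log would be consistent with the behaviour described in this issue.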
As for asking this on the Piwik forums, I have asked simpler questions there and ended up answering them myself:
http://forum.piwik.org/read.php?15,126161
I have no personal interest in this, as I have deactivated (but not yet removed) Piwik on my site for performance reasons. However, I found Piwik to be a great tool that I might use in the future, and I wanted to contribute to its reliability.
I would regret it if you decided to stand by your decision not to investigate this matter further.
