Be forgiving of extra space after date in logs
With the recent nginx update, we changed the access log format slightly:

* 2 spaces after the date instead of 1
* append the SNI host to the end of the line

The log parser ignores the SNI host, but was strict about the spacing
after the date, so stat generation found no downloads. This relaxes the
regex to accept one or more whitespace characters after the date, so we
can re-process logs that were generated with the new format, and lets us
go back to one space in the format if we decide to do so.
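
As a rough illustration of the relaxation described above, the sketch below compares a strict single-space separator with `\s+` on just the date and request portion of a log line. It uses plain Clojure regex literals rather than the `re/regex` form in `process_stats.clj`; the names `strict-date-sep`, `relaxed-date-sep`, `one-space`, `two-spaces`, and `matches?` are hypothetical and exist only for this example.

```clojure
;; Simplified stand-ins for the parser's pattern (an assumption for
;; illustration, not the actual pattern built in process_stats.clj).
(def strict-date-sep  #"\[([^\]]+)\] \"([^\"]*)\"")    ; exactly one space after the date
(def relaxed-date-sep #"\[([^\]]+)\]\s+\"([^\"]*)\"")  ; one or more whitespace characters

;; Date and request portion of a log line, in the old and new spacing.
(def one-space
  "[14/Apr/2012:06:40:59 +0000] \"GET /repo/captain/archibald/haddock/0.1.0/haddock-0.1.0.jar HTTP/1.1\"")
(def two-spaces
  "[14/Apr/2012:06:40:59 +0000]  \"GET /repo/captain/archibald/haddock/0.1.0/haddock-0.1.0.jar HTTP/1.1\"")

(defn matches?
  "True when pattern is found anywhere in line."
  [pattern line]
  (some? (re-find pattern line)))

(println (matches? strict-date-sep one-space))   ;; => true
(println (matches? strict-date-sep two-spaces))  ;; => false
(println (matches? relaxed-date-sep one-space))  ;; => true
(println (matches? relaxed-date-sep two-spaces)) ;; => true
```

Because `\s+` still requires at least one whitespace character, a line with no separator after the date would continue to be rejected.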

Related to #554
tobias committed Aug 13, 2016
1 parent 41201ea commit 05363c7
Showing 3 changed files with 16 additions and 13 deletions.
2 changes: 1 addition & 1 deletion dev-resources/fake.access.log
@@ -10,4 +10,4 @@ blistering barnacles
127.0.0.4 - - [14/May/2012:06:45:40 +0000] "GET /repo/snowy/snowy/0.3.0/snowy-0.3.0.jar HTTP/1.1" 200 2377 "-" "Java/1.6.0_30"
127.0.0.2 - - [14/May/2012:06:43:40 +0000] "GET /repo/snowy/snowy/0.2.0/snowy-0.2.0.jar HTTP/1.1" 200 2377 "-" "Java/1.6.0_30"
billions of bilious [blue blistering] "barnacles in ten" thousand thundering "typhoons" "!"
-127.0.0.4 - - [14/May/2012:06:45:40 +0000] "GET /repo/snowy/snowy/0.3.0/snowy-0.3.0.jar HTTP/1.1" 200 2377 "-" "Java/1.6.0_30"
+127.0.0.4 - - [14/May/2012:06:45:40 +0000]  "GET /repo/snowy/snowy/0.3.0/snowy-0.3.0.jar HTTP/1.1" 200 2377 "-" "Java/1.6.0_30" "clojars.org"
2 changes: 1 addition & 1 deletion src/clojars/tools/process_stats.clj
@@ -30,7 +30,7 @@
(re/regex [field :as :host] \space
[field :as :ident] \space
[field :as :authuser] \space
-\[ [nonbracket :as :time] \] \space
+\[ [nonbracket :as :time] \] #"\s+"
\" reqline \" \space
[field :as :status] \space
[field :as :size]
25 changes: 14 additions & 11 deletions test/clojars/test/unit/tools/process_stats.clj
@@ -11,19 +11,22 @@
:ext "jar"}
(stats/parse-path "/repo/captain/archibald/haddock/0.1.0/haddock-0.1.0.jar"))))

-(def sample-line "127.0.0.1 - - [14/Apr/2012:06:40:59 +0000] \"GET /repo/captain/archibald/haddock/0.1.0/haddock-0.1.0.jar HTTP/1.1\" 200 2377 \"-\" \"Java/1.6.0_30\"")
+(def old-format "::ffff:127.0.0.1 - - [14/Apr/2012:06:40:59 +0000] \"GET /repo/captain/archibald/haddock/0.1.0/haddock-0.1.0.jar HTTP/1.1\" 200 2377 \"-\" \"Java/1.6.0_30\"")
+
+(def new-format "::ffff:127.0.0.1 - - [14/Apr/2012:06:40:59 +0000]  \"GET /repo/captain/archibald/haddock/0.1.0/haddock-0.1.0.jar HTTP/1.1\" 200 2377 \"-\" \"Java/1.6.0_30\" \"clojars.org\"")

(deftest parse-clf
-  (let [m (stats/parse-clf sample-line)]
-    (is (= 200 (:status m)))
-    (is (= 2377 (:size m)))
-    (is (= "GET" (:method m)))
-    (is (= 14 (time/day (:time m))))
-    (is (= 40 (time/minute (:time m))))
-    (is (= "haddock" (:name m)))
-    (is (= "captain.archibald" (:group m)))
-    (is (= "0.1.0" (:version m)))
-    (is (= "jar" (:ext m)))))
+  (doseq [sample-line [old-format new-format]]
+    (let [m (stats/parse-clf sample-line)]
+      (is (= 200 (:status m)))
+      (is (= 2377 (:size m)))
+      (is (= "GET" (:method m)))
+      (is (= 14 (time/day (:time m))))
+      (is (= 40 (time/minute (:time m))))
+      (is (= "haddock" (:name m)))
+      (is (= "captain.archibald" (:group m)))
+      (is (= "0.1.0" (:version m)))
+      (is (= "jar" (:ext m))))))

(deftest compute-stats
(let [stats (stats/process-log (io/resource "fake.access.log"))]