Add readme, total count for a bucket
courtenay committed Jan 12, 2010
1 parent b51b602 commit b31bcb5
Showing 4 changed files with 123 additions and 5 deletions.
95 changes: 95 additions & 0 deletions README
@@ -0,0 +1,95 @@
Weed-3 \../ .`'`^.
A Sinatra Analytics API \ \ ( @ )
\ `./ (__/,
.,____~~~~~-`

(Picture is unrelated)
=================================

Gathering real-time analytics for arbitrary factors is simple enough to
do at small scale. However, once you are recording millions of items and
storing them in SQL, the aggregate queries become too slow for any
practical data analysis.

This library aims to solve this problem well enough for the average web
application to use in daily analytics activities by utilizing some data
warehousing concepts and smart caching.

Installation
------------
Weed3 is built on Sinatra, a simple Ruby web framework, the installation
of which is beyond the scope of this file.

For Rubyists, it should behave fairly well as a Rack middleware, or even
as a plugin, since the entire application is namespaced under the Weed module.

For everyone else, you should be able to start this application as a daemon
and proxy to it, or start it under your webserver (Apache 2 or Nginx) as
a Passenger application.

Your zeroth step is to create and load the database (later, modify the database
settings in environment.rb). The application should create its own sqlite3
database in db/ but you'll want to load the schema.

$ rake db:migrate

Now, check to see if the tests all pass on your system.

$ rake test

/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby -I"lib:test" "/Users/courtenay/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake/rake_test_loader.rb" "test/application_test.rb" "test/stats_test.rb"
Loaded suite /Users/courtenay/.gem/ruby/1.8/gems/rake-0.8.7/lib/rake/rake_test_loader
Started
..............
Finished in 10.277229 seconds.

Start the server like this:

$ ruby -r weed.rb -e 'Weed::Application.run!'

Access it in your browser at http://localhost:4567/ or test it with curl:

$ curl http://localhost:4567
Weed!

$ curl http://localhost:4567/stats
{"count":7}

Weed attempts to be a good RESTful citizen, so send a hit with a POST:

$ curl http://localhost:4567 -d"q[bucket_id]=10"
$ curl http://localhost:4567/stats
{"count":8}

$ curl http://localhost:4567/stats/10
{"count":1}

$ curl http://localhost:4567/stats/10/month/2010/1
{"count":1}

$ curl http://localhost:4567/stats/10/week/2010/1/14/daily
[0,0,0,1,0,0,0]

For more information (there are more things you can do) see the 'weed.rb' file.
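
The same curl calls can be sketched from Ruby with the standard library.
This is only an illustration: the helper names (hit_request, stats_uri,
count_from) and the BASE constant are made up for this example and are not
part of Weed. It assumes the server is listening on localhost:4567 and
accepts the POST at the root path, as in the curl example above.

```ruby
require 'net/http'
require 'uri'
require 'json'

# Assumed address of a locally running Weed server.
BASE = URI('http://localhost:4567/')

# Build (but do not send) the POST that records one hit for a bucket.
def hit_request(bucket_id, base = BASE)
  req = Net::HTTP::Post.new(base)
  req.set_form_data('q[bucket_id]' => bucket_id.to_s)
  req
end

# URI for the overall count, or for one bucket's count when an id is given.
def stats_uri(bucket_id = nil, base = BASE)
  URI.join(base.to_s, bucket_id ? "/stats/#{bucket_id}" : '/stats')
end

# Pull the integer out of a response body such as {"count":8}.
def count_from(body)
  JSON.parse(body).fetch('count')
end
```

With the server running, a hit could then be sent with
Net::HTTP.start(BASE.host, BASE.port) { |http| http.request(hit_request(10)) }
and read back with count_from(Net::HTTP.get(stats_uri(10))).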

How it works
------------

Weed3 records stats in the Stats table with a datestamp.

When you request an aggregate (hits this month), it creates an entry in the
Weed::CachedStats table with the period scale ('month'), the date (2009, 12),
and the sum of daily counts, so that you never have to run that aggregate again.
(Note that the daily counts are themselves aggregates: a count of all items
recorded that day.)

In addition, because it knows about the hierarchy of dates (year - month - day),
when you ask for a year's total it sums the cached monthly totals
(only 12 items scanned in the aggregate) rather than the daily counts (365 items).

This part of the app is transparent to users.
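
The roll-up described above can be sketched in plain Ruby. This is a
minimal in-memory illustration, not Weed's implementation: RollupSketch,
its hashes and the day_rows_scanned counter are hypothetical stand-ins for
the Stats and Weed::CachedStats tables, and it assumes cached sums never
need to be refreshed.

```ruby
require 'date'

# Hypothetical in-memory stand-in for the Stats / CachedStats tables.
class RollupSketch
  attr_reader :day_rows_scanned

  def initialize
    @daily = Hash.new(0)   # Date => hits recorded that day
    @cache = {}            # [:month, y, m] or [:year, y] => cached sum
    @day_rows_scanned = 0  # how many daily rows were read to answer queries
  end

  def record(date)
    @daily[date] += 1
  end

  # Sum the daily counts for one month, then remember the result.
  def by_month(year, month)
    key = [:month, year, month]
    return @cache[key] if @cache.key?(key)

    days = @daily.select { |date, _| date.year == year && date.month == month }
    @day_rows_scanned += days.size
    @cache[key] = days.values.sum
  end

  # Sum the twelve monthly totals (cached where possible), not the days.
  def by_year(year)
    key = [:year, year]
    return @cache[key] if @cache.key?(key)

    @cache[key] = (1..12).sum { |month| by_month(year, month) }
  end
end

stats = RollupSketch.new
3.times { stats.record(Date.new(2010, 1, 12)) }
stats.record(Date.new(2010, 1, 13))

puts stats.by_month(2010, 1)   # reads the two daily rows for January
puts stats.by_year(2010)       # reuses the cached January total
puts stats.day_rows_scanned
```

The sketch does not invalidate a cached sum when a new hit arrives for a
period that has already been aggregated; a real implementation has to
decide how to handle that case.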





Copyright ©2010 Courtenay Gasking
15 changes: 10 additions & 5 deletions lib/weed/stats.rb
@@ -68,7 +68,7 @@ def self.by_year(year, conditions)
         # Weed::CachedStats.sum('counter', :conditions => ['period = ? AND year = ?', 'year', year])
         sum
       else
-        raise "not implemented!"
+        # raise "not implemented!"
         cached.counter
       end
     end
@@ -82,11 +82,16 @@ def self.by_total(conditions)
         oldest = Weed::Stats.first(:conditions => conditions)
         newest = Weed::Stats.first(:conditions => conditions, :order => "cdate desc")
         sum = 0
-        (oldest.cdate.year..newest.cdate.year).each do |year|
-          sum += by_year(year, conditions)
+        if oldest && newest
+          (oldest.cdate.year..newest.cdate.year).each do |year|
+            sum += by_year(year, conditions)
+          end
+          Weed::CachedStats.override(conditions.merge({:period => 'total', :counter => sum}))
+          # Weed::CachedStats.sum('counter', :conditions => ['period = ?', 'total'])
+          sum
         end
-        Weed::CachedStats.override(conditions.merge({:period => 'total', :counter => sum}))
-        Weed::CachedStats.sum('counter', :conditions => ['period = ?', 'total'])
+      else
+        cached.counter
       end
     end
   end
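
One detail of the guarded branch in by_total may be worth checking: in Ruby
an `if` with no `else` evaluates to nil when its condition is false, so a
bucket with no rows would appear to yield nil rather than 0. The sketch
below is a simplified stand-in, not the library's code: total_for is a
hypothetical name, and a plain array of years replaces the Weed::Stats rows
and the by_year call.

```ruby
require 'json'

# Hypothetical stand-in for the uncached branch of by_total: `years` plays
# the role of the cdate years of the matching Weed::Stats rows.
def total_for(years)
  oldest = years.min
  newest = years.max
  sum = 0
  if oldest && newest
    (oldest..newest).each do |year|
      sum += years.count(year) # stands in for by_year(year, conditions)
    end
    sum
  end # no else branch: the method returns nil when there are no rows
end

puts({ :count => total_for([2009, 2010, 2010]) }.to_json)
puts({ :count => total_for([]) }.to_json)
```

The first call prints {"count":3} and the second prints {"count":null}.
If clients expect 0 for an empty bucket, returning `sum` after the inner
`end` would be one way to get it; that is a suggestion, not something this
commit does.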
10 changes: 10 additions & 0 deletions test/application_test.rb
@@ -61,6 +61,16 @@ def app
     get "/stats/3/month/#{Date.today.year}/#{Date.today.month}"
     assert_equal({ :count => 1 }.to_json, last_response.body)
   end
+
+  it "shows total count with conditions" do
+    Weed::CachedStats.delete_all # wtf
+    Weed::Stats.delete_all # wtf
+
+    post '/record', { :q => { "bucket_id" => 4 }, :user => 'jimmy-5' }
+    post '/record', { :q => { "bucket_id" => 3 }, :user => 'jimmy-5' }
+    get "/stats/3"
+    assert_equal({ :count => 1 }.to_json, last_response.body)
+  end
 
   it "shows stats per day for a week" do
     Weed::CachedStats.delete_all # wtf
8 changes: 8 additions & 0 deletions weed.rb
@@ -45,6 +45,14 @@ class Application < Sinatra::Base
       { :count => Stats.count(:conditions => params[:q]) }.to_json
     end
 
+    get "/stats/all" do
+      Stats.all.to_json
+    end
+
+    get "/stats/:bucket_id" do
+      { :count => Stats.by_total({ :bucket_id => params[:bucket_id] }) }.to_json
+    end
+
     get "/stats/:bucket_id/day/:date" do # hmm. year/month/day?
       { :count => Stats.by_day(params[:date], { :bucket_id => params[:bucket_id] }) }.to_json
     end
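
Sinatra matches routes in the order they are defined and invokes the first
one that fits, which is why "/stats/all" has to be declared before
"/stats/:bucket_id". The sketch below imitates that first-match rule in
plain Ruby. ROUTES and dispatch are hypothetical names for this
illustration, and the regular expressions only approximate Sinatra's
pattern matching.

```ruby
# Ordered route table: [pattern, handler]. The literal route comes first.
ROUTES = [
  [%r{\A/stats/all\z},     ->(_match) { [:all, nil] }],
  [%r{\A/stats/([^/]+)\z}, ->(match)  { [:bucket_total, match[1]] }]
].freeze

# Return the result of the first handler whose pattern matches the path.
def dispatch(path, routes = ROUTES)
  routes.each do |pattern, handler|
    match = pattern.match(path)
    return handler.call(match) if match
  end
  nil
end

p dispatch('/stats/all')
p dispatch('/stats/10')
# With the order reversed, "all" is captured as if it were a bucket id.
p dispatch('/stats/all', ROUTES.reverse)
```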
