Recommended infrastructure #23

Closed
Lazmonster opened this issue Feb 19, 2014 · 5 comments

@Lazmonster

Hi, I'd welcome some advice. We are new to Boomerang but are considering using it to monitor a client site with roughly 82 million page views per month.
Do we need to supply our own infrastructure for data storage and so on, or is there shared infrastructure available for general use? If the latter, how is it paid for? If the former, I'd be grateful if somebody could recommend a spec for an environment to support this.
In other words, can we just deploy the tag and capture the results, or do we need to set up an infrastructure to do this?
By all means point me to any docs on the subject.
Many thanks

@bluesmoon
Member

You have a few options.

  1. Use something like Google Analytics or Piwik. There are howtos on the web about using boomerang with both of these. For Piwik you'll need your own hardware, but the software is already built for you. I'm not sure whether you'll get histograms and percentiles with these, though.
  2. Use a commercial service. My company (SOASTA) has a commercial service built around boomerang (mPulse), and many web/performance shops use our service for their clients. If you're interested in this option, send me an offline message, since I'd rather not push a commercial service on the open source forum. Other companies offer commercial services around boomerang as well, like Neustar, Keynote, and more.
  3. Build your own. It could be as easy as post-hoc log processing (I think Howto-0 covers some of this), or writing a PHP or JSP script, or some other web endpoint, to receive the data and insert it into a database. 82 million beacons per month averages out to under 2,000 beacons per minute. This is very easy to handle on a single web server, but I'd suggest using two for redundancy.
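
For anyone who wants to check the arithmetic in option 3, here is a quick sketch. It assumes one beacon per page view and a 30-day month, neither of which is stated in the question, and it only gives the monthly average; peak traffic will be higher.

```python
# Average beacon rate for 82 million page views per month.
# Assumptions: one beacon per page view, 30-day month.
PAGE_VIEWS_PER_MONTH = 82_000_000
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200 minutes


def average_rates(page_views_per_month: int) -> tuple[float, float]:
    """Return (beacons per minute, beacons per second), averaged over the month."""
    per_minute = page_views_per_month / MINUTES_PER_MONTH
    return per_minute, per_minute / 60


per_minute, per_second = average_rates(PAGE_VIEWS_PER_MONTH)
print(f"{per_minute:.0f} beacons/minute, {per_second:.1f} beacons/second")
# prints: 1898 beacons/minute, 31.6 beacons/second
```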

Will write more as I think of it.
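
To make the "web endpoint that inserts into a database" idea concrete, here is a minimal sketch in Python rather than PHP or JSP, using only the standard library and SQLite. The table layout, the `/beacon` path, the port, and the function names are all made up for illustration. It assumes the beacon arrives as a GET request with the page URL in `u` and the load time in `t_done`, and it ignores every other parameter.

```python
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the beacon table. The schema is a hypothetical minimum."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS beacons ("
        "received_at TEXT DEFAULT CURRENT_TIMESTAMP, "
        "page_url TEXT, t_done_ms INTEGER)"
    )
    return conn


def store_beacon(conn: sqlite3.Connection, request_path: str) -> bool:
    """Parse a beacon query string and insert one row. Returns False if unusable."""
    params = parse_qs(urlsplit(request_path).query)
    try:
        t_done = int(params["t_done"][0])
    except (KeyError, ValueError):
        return False  # no load time, or not a number: skip this beacon
    page_url = params.get("u", [""])[0]
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO beacons (page_url, t_done_ms) VALUES (?, ?)",
            (page_url, t_done),
        )
    return True


class BeaconHandler(BaseHTTPRequestHandler):
    conn = None  # assigned in serve() before the server starts

    def do_GET(self):
        store_beacon(self.conn, self.path)
        self.send_response(204)  # the browser needs no response body
        self.end_headers()


def serve(port: int = 8080, db_path: str = "beacons.db") -> None:
    """Run a single-threaded server; fine for a demo, not for production."""
    BeaconHandler.conn = open_store(db_path)
    HTTPServer(("", port), BeaconHandler).serve_forever()
```

Calling `serve()` starts it, and boomerang's beacon URL would then point at that host and port. A real deployment would sit behind a proper web server and probably batch the inserts, but the shape is the same.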

@Lazmonster
Author

Thanks!

@bluesmoon
Member

Just wanted to add another open source project I found, called boomcatch, that handles the backend for boomerang: https://github.com/nature/boomcatch/ and http://cruft.io/posts/introducing-boomcatch/

@andreas-marschke

Btw, I'm currently writing, and planning the rollout of, my own boomerang backend server, boomerang-express.

Most of the ruleset is pretty solid by now. It's designed to scale up to a multi-tenant system, but it can also scale down to a single user collecting data for their own sites.

It can serialize everything that might come with a beacon of any kind, from headers to cookies.

It also has a "pluggable"/"replaceable" backend. I plan to integrate it first with a locally running NeDB (developer setup) and later scale it to multiple MongoDB instances. Most of this is configured using the beacon URL in the frontend and the datastore in the backend.
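
As a rough illustration of what a replaceable backend can look like, here is a sketch in Python. It is not boomerang-express code: the `BeaconStore` interface, both store classes, and the `/beacon/<collection>` URL layout are assumptions invented for this example, and neither NeDB nor MongoDB is actually used.

```python
import json
from typing import Protocol


class BeaconStore(Protocol):
    """Hypothetical interface that every backend has to satisfy."""

    def save(self, collection: str, beacon: dict) -> None: ...


class MemoryStore:
    """Stand-in for a local developer setup: keeps beacons in a dict of lists."""

    def __init__(self) -> None:
        self.collections: dict[str, list[dict]] = {}

    def save(self, collection: str, beacon: dict) -> None:
        self.collections.setdefault(collection, []).append(dict(beacon))


class JsonLinesStore:
    """Stand-in for a remote store: writes one JSON document per line."""

    def __init__(self, stream) -> None:
        self.stream = stream

    def save(self, collection: str, beacon: dict) -> None:
        self.stream.write(json.dumps({"collection": collection, **beacon}) + "\n")


def collection_from_path(path: str) -> str:
    """Assumed routing: '/beacon/<collection>' selects the tenant's collection."""
    parts = [p for p in path.split("?")[0].split("/") if p]
    return parts[1] if len(parts) > 1 and parts[0] == "beacon" else "default"


def handle_beacon(store: BeaconStore, path: str, beacon: dict) -> None:
    """The handler only sees the interface, so backends can be swapped freely."""
    store.save(collection_from_path(path), beacon)
```

Swapping `MemoryStore()` for `JsonLinesStore(open("beacons.jsonl", "a"))` changes where the data lands without touching `handle_beacon`.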

It also actively works to prevent the beacon from being abused, by scoring incoming beacons (or any request to a beacon URL) on the referrer URL and the URL parameters.

The current working tree can run with NeDB and, with all logging deferred, scales to approx. 3K req/s on a single CPU and a single node, measured at a concurrency of 100 requests in 2 threads over 60s.

I used wrk for these measurements. They aren't necessarily complete, but they're a viable start for benchmarking.

@nicjansma

Since there aren't any open questions in this Issue, I'm going to close it for now.
