Carbon vs Megacarbon and Roadmap ? #235

Closed
toni-moreno opened this issue Apr 7, 2014 · 18 comments

@toni-moreno

Hi guys.

I'm planning a big Graphite installation and I have to decide on the final infrastructure and the releases to install.

I have tested Graphite 0.9.x and 0.10-alpha, Whisper and Ceres, and also Carbon (master branch and megacarbon branch). I'm planning a multi-node Graphite cluster.

My only documentation sources for clustering Graphite are :

http://graphite.readthedocs.org/en/latest/carbon-daemons.html
http://bitprophet.org/blog/2013/03/07/graphite/
http://anatolijd.blogspot.com.es/2013/06/graphitemegacarbonceres-multi-node.html

I'm worried about the lack of information on carbon vs megacarbon features. I'm also watching the updates, and master seems to have been updated more recently than the megacarbon branch.

Which is best for a big platform now?
What's the roadmap for carbon and megacarbon over the next few months?

@meteozond

+1

@nickchappell

+1

@nickchappell

Any more info or updates?

@psych0d0g

+1

@payamsabz

+1

@proover

proover commented Jul 7, 2014

+1

@esc
Contributor

esc commented Jul 10, 2014

I don't think anyone knows, TBH. Anyone have an idea what could/should be done?

@zstyblik

Anyone have an idea what could/should be done?

It's simple:

  1. fork it
  2. document the hell out of it
  3. attract developers (the previous step should do the trick)
  4. secure some backing (= money to pay for dev time)

Please note, we're talking about 3-4 separate projects. That is:

  • graphite-web
  • carbon
  • whisper
  • ceres

@esc
Contributor

esc commented Jul 12, 2014

I was asking more about carbon vs. megacarbon, rather than the state of the project itself.

If you want to help out with releases, feel free to look at:

graphite-project/graphite-web#677

@zstyblik

EDIT: mysterious trigger of post button.

I was asking more about carbon vs. megacarbon, rather than the state of the project itself.

I see. Now, my own opinion and experience. I think megacarbon is a step in the right direction, given that a Python process can't run on more than one CPU. And I find the game with relays and caches and all that stuff rather appalling. But hey, it apparently works for some people.
And since PRs for megacarbon aren't being accepted for whatever reason, we've ended up with megacarbon from a "3rd party" git. And so will anybody else, should they run megacarbon.
The same goes for Ceres and its maintenance tools.

And, actually, if anybody, or even you, could say which one it is going to be in the future, I mean megacarbon, and Whisper or Ceres, it would save me from a lot of questions at my current job. I'm being asked about that a lot.

If you want to help out with releases, feel free to look at:

I have read it now, and thank you for bringing it to my attention. However, that thread is 95% about whether the next release is going to be labeled as 0.9.x or 0.10.x and ~3% about combing things up.
I don't know what you want or expect me to say.

@steve-dave
Member

FWIW, I haven't used megacarbon. I made that choice because it didn't yet seem to be considered "stable". I haven't hit any performance problems with carbon, but I also haven't yet hit a single relay with more than 2M metrics/min. Perhaps if you're planning to go significantly higher than that, megacarbon may be your only option.

@SEJeff
Member

SEJeff commented Jul 14, 2014

@steve-dave Honestly, if you're going > 1M metrics every 10 seconds, carbon-relay doesn't work at all. For super scalable and large installations, cyanite with the graphite storage backend is a better option. It uses Cassandra as the backing store.

@steve-dave
Member

@SEJeff are you saying that there's no real use for megacarbon then, because by the time you've maxed out carbon-relay, megacarbon won't help?

@SEJeff
Member

SEJeff commented Jul 15, 2014

Not exactly. I was throwing (at peak) 3.5 million metrics through collectd to carbon-relay every 10 seconds from many hosts. The caches weren't idle, but we maxed out the SSDs in our cache nodes, so IO was not the limiting factor; the relay was. We tuned the kernel, we put it on a 10G network, we tuned everything a large and senior group of sysadmins would.

megacarbon allows you to distribute things with Ceres for the caches, but it doesn't do much at all for the relays. Megacarbon just allows carbon's backend to be pluggable, and as it stands it supports both Ceres and Whisper. This is a massive improvement for scaling carbon horizontally, but again, nothing for the relay.

Now the relay falling over is something we investigated a LOT. We tried swapping out the hashing algo in the relay (hoping it would lower CPU usage), and we were tasksetting each relay process onto its own dedicated CPU core (we were using the isolcpus functionality to totally isolate the relay CPU cores from the general scheduler). That improved performance by a few hundred thousand metrics every 10 seconds, but even with some serious tuning, we didn't get the performance we wanted. We tried "striping" the load from collectd aggregators to different relay processes with the same config set via config management, and that sort of worked, but was clunky at best. Again, carbon-cache was able to keep up, but the relays simply fell over. The relay was and stayed CPU bound under heavy load. Once you reach the "max throughput" that it can handle, it just starts queuing and eventually will start dropping metrics on the floor.

We wanted high resolution metrics from large clusters of computers and were out to build a reliable and stable platform to store them all in. I used megacarbon "in production" and it worked perfectly fine. For the graphite-web and carbon side of things, megacarbon works perfectly well.
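
For readers unfamiliar with why the relay spends CPU on hashing at all: with consistent-hash routing, every incoming metric name is hashed to pick a destination cache. The following is only a rough sketch of that idea, assuming an MD5-based ring. The class name, node names and replica count are made up for illustration, and this is not carbon's actual implementation.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping metric names to cache nodes (illustrative only)."""

    def __init__(self, nodes, replicas=100):
        # Each node gets several positions on the ring to even out the distribution.
        self._ring = []
        for node in nodes:
            for i in range(replicas):
                self._ring.append((self._position(f"{node}:{i}"), node))
        self._ring.sort()
        self._positions = [pos for pos, _ in self._ring]

    @staticmethod
    def _position(key):
        # The first 4 hex digits of the MD5 digest give a position in 0..65535.
        return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:4], 16)

    def get_node(self, metric):
        # Pick the first ring entry at or after the metric's position, wrapping around.
        idx = bisect.bisect_left(self._positions, self._position(metric))
        return self._ring[idx % len(self._ring)][1]


# Hypothetical cache node names.
nodes = ["cache-a", "cache-b", "cache-c"]
ring = HashRing(nodes)
for name in ("servers.web01.cpu.user", "servers.web01.load.shortterm"):
    print(name, "->", ring.get_node(name))
```

The point of the sketch is that this lookup runs once per datapoint, in a single Python process, so at millions of datapoints per interval the routing step itself can plausibly become the bottleneck.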

Now @pcn has done some interesting things with saving metrics to disk and then having another process send them, but having looked heavily at it, it is just a band-aid, though one that does indeed help scale the relays to higher throughput. One solution was to rewrite a large chunk of the relay in C, but haproxy works just as well. Then, so long as your datastore is multi-master, you can just evenly round robin requests over the nodes using a battle-tested codebase such as haproxy or LVS.
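
To make the round-robin idea concrete, here is a minimal sketch of the distribution policy only. It assumes any backend can accept any metric (the multi-master case described above); the backend names and sample datapoints are hypothetical, and a real load balancer such as haproxy balances connections rather than Python lists.

```python
from itertools import cycle


def round_robin(datapoints, backends):
    """Assign each datapoint to the next backend in turn."""
    targets = cycle(backends)
    assignment = {backend: [] for backend in backends}
    for point in datapoints:
        assignment[next(targets)].append(point)
    return assignment


# Hypothetical backend names and sample datapoints.
backends = ["store-1", "store-2", "store-3"]
points = [f"metric.{i} {i} 1405400000" for i in range(7)]
result = round_robin(points, backends)
for backend, assigned in result.items():
    print(backend, len(assigned))
```

Unlike hash-based routing, nothing here inspects the metric name, which is why this only works when every backend can store every metric.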

Keep in mind that most Graphite installs won't come remotely close to 3.5+ million metrics every 10 seconds, and 25 million metrics per minute is a pretty large number of incoming datapoints. We found that carbon struggled massively while opentsdb along with Cassandra were fine (with appropriate tuning, obviously). I know the hostedgraphite.com guys use Riak to do the exact same thing and their experience mirrors mine, hence them building their own backend on top of Riak. Carbon is a great idea, but fundamentally, Twisted doesn't do what carbon-relay or carbon-aggregator were built to do when hit with sustained and heavy throughput. Much to my chagrin, concurrency isn't one of Python's core competencies.
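
For a sense of scale, the per-interval rates quoted in this thread can be converted to per-minute and per-second rates with simple arithmetic. A quick sketch (the helper name is made up for illustration):

```python
def per_minute(metrics_per_interval, interval_seconds):
    """Convert a per-interval metric count into metrics per minute."""
    return metrics_per_interval * 60 / interval_seconds


peak = 3_500_000  # metrics per 10-second interval, as quoted above
print(per_minute(peak, 10))  # 21000000.0 metrics per minute
print(peak / 10)             # 350000.0 metrics per second
```

So exactly 3.5 million every 10 seconds is 21 million per minute; a figure of 25 million per minute would correspond to roughly 4.2 million per 10-second interval, which is consistent with the "3.5+" wording.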

Sorry for the long response, but for a LOT of scale, carbon is just fundamentally not a wonderful piece of code.

@ccope

ccope commented Jul 15, 2014

@SEJeff what about using round-robin DNS (or haproxy) with a scalable tier of just relay nodes? I suppose you might have to stop your relays when adding new cache nodes until their configs are updated, but that seems like a minor annoyance (collectd on the hosts should buffer data until the relays are available again...)

@steve-dave
Member

@SEJeff Thanks for sharing your experience above!

@obfuscurity
Member

With regard to "what is the future of Carbon", the megacarbon branch was carefully merged into graphite-web some time ago but was never merged into master here in the Carbon project. That work remains and it won't be easy.

Per a recent conversation with @mleinart, the process will look something like this:

$ git log --pretty=oneline 5a286f5..master | wc -l
      77
$ git merge ~77
  ... fix conflicts ...
$ git merge ~76
  ... fix conflicts ...

Insert joke about 99 bottles of beer on the wall.

@toni-moreno
Author

Closed due to lack of activity.
