Carbon vs Megacarbon and Roadmap? #235
Comments
+1
1 similar comment
+1
Any more info or updates?
+1
2 similar comments
+1
+1
I don't think anyone knows, TBH. Anyone have an idea what could/should be done?
It's simple:
Please note, we're talking about 3-4 separate projects. That is:
I was asking more about carbon vs. megacarbon, rather than the state of the project itself. If you want to help out with releases, feel free to look at:
EDIT: mysterious trigger of post button.
I see. Now, my own opinion and experience. I think megacarbon is a step in the right direction, given that a Python process can't run on more than one CPU. And I find the game with relays and caches and all that stuff rather appalling. But hey, it apparently works for some people. And, actually, if anybody, even you, could say which one it is going to be in the future, I mean megacarbon and Whisper or Ceres, it would save me from a lot of questions at my current job. I'm being asked about that a lot.
I have read it now, and thank you for bringing it to my attention. However, that thread is 95% about whether the next release is going to be labeled 0.9.x or 0.10.x and only ~3% about combining things.
FWIW, I haven't used megacarbon. I made the choice based on the fact that it didn't yet seem to be considered "stable". I haven't hit any performance issues with carbon, but I also haven't pushed a single relay past ~2M metrics/min. Perhaps if you're planning to go significantly higher than that, megacarbon may be your only option.
@steve-dave Honestly, if you're going > 1M metrics every 10 seconds, carbon-relay doesn't work at all. For super scalable and large installations, cyanite with the graphite storage backend is a better option. It uses Cassandra as the backing store.
@SEJeff are you saying that there's no real use for megacarbon then, i.e. that by the time you've maxed out carbon-relay, megacarbon won't help?
Not exactly. I was throwing (at peak) 3.5 million metrics through collectd to carbon-relay every 10 seconds from many hosts. The caches weren't idle, but we hadn't maxed out the SSDs in our cache nodes, so IO was not the limiting factor; the relay was. We tuned the kernel, we put it on a 10g network, we tuned everything a large and senior group of sysadmins would.

megacarbon allows you to distribute things with ceres for the caches, but it doesn't do much at all for the relays. Megacarbon just makes carbon's backend pluggable and, as it stands, supports both ceres and whisper. This is a massive improvement for scaling carbon horizontally, but again, it does nothing for the relay.

Now, the relay falling over is something we investigated a LOT. We tried swapping out the hashing algo in the relay (hoping it would lower CPU usage), and we were tasksetting each relay process onto its own dedicated CPU core (using the isolcpus functionality to totally isolate the relay cores from the general scheduler). That improved performance by a few hundred thousand metrics every 10 seconds, but even with some serious tuning, we didn't get the performance we wanted. We tried "striping" the load from collectd aggregators to different relay processes with the same config set via config management, and that sort of worked, but was clunky at best. Again, carbon-cache was able to keep up, but the relays simply fell over. The relay was and stayed CPU-bound under heavy load. Once you reach the max throughput it can handle, it just starts queuing and eventually starts dropping metrics on the floor.

We wanted high-resolution metrics from large clusters of computers and were out to build a reliable and stable platform to store them all in. I used megacarbon "in production" and it worked perfectly fine. For the graphite-web and carbon side of things, megacarbon works perfectly well.
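For context on why the relay is CPU-bound: carbon-relay (in consistent-hashing mode) picks a destination cache for every single datapoint by hashing the metric name onto a ring. A minimal sketch of the general technique follows; this is not carbon's exact implementation, and the node names are made-up examples. Each incoming metric costs a hash plus a ring lookup in pure Python, which adds up quickly at millions of datapoints per interval.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Short md5-derived integer hash, similar in spirit to what
    # consistent-hash relays use (illustrative, not carbon's keyfunc).
    return int(hashlib.md5(key.encode()).hexdigest()[:8], 16)

class ConsistentHashRing:
    """Map metric names to destination cache nodes via a hash ring."""

    def __init__(self, nodes, replicas=100):
        # Place each node at many positions on the ring so load
        # spreads evenly and adding/removing a node remaps few keys.
        self.ring = sorted(
            (_hash(f"{node}:{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self.keys = [h for h, _ in self.ring]

    def get_node(self, metric: str) -> str:
        # Binary-search for the first ring position >= hash(metric),
        # wrapping around to the start of the ring.
        i = bisect.bisect(self.keys, _hash(metric)) % len(self.ring)
        return self.ring[i][1]

# Hypothetical cache nodes; every datapoint triggers one lookup.
ring = ConsistentHashRing(["cache-a:2004", "cache-b:2004", "cache-c:2004"])
dest = ring.get_node("servers.web01.cpu.idle")
```

The routing itself is cheap per call, but doing it in the Python interpreter for every datapoint is a large part of the per-metric CPU cost described above.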
Now @pcn has done some interesting things with saving metrics to disk and having another process send them, but having looked at it closely, it's a bandaid that does indeed help scale the relays to higher throughput. One solution was to rewrite a large chunk of the relay in C, but haproxy works just as well. Then, so long as your datastore is multi-master, you can evenly round-robin requests over the nodes using a battle-tested codebase such as haproxy or LVS.

Keep in mind that most graphite installs won't remotely come close to 3.5+ million metrics every 10 seconds, and 25 million metrics per minute is a pretty large number of incoming datapoints. We found that carbon struggled massively while opentsdb along with cassandra were fine (with appropriate tuning, obviously). I know the hostedgraphite.com guys use Riak to do the exact same thing, and their experience mimics mine, hence them building their own backend on top of Riak.

Carbon is a great idea, but fundamentally, twisted doesn't do what carbon-relay or carbon-aggregator were built to do when hit with sustained, heavy throughput. Much to my chagrin, concurrency isn't one of python's core competencies. Sorry for the long response, but at serious scale, carbon is just fundamentally not a wonderful piece of code.
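The haproxy approach mentioned above can be sketched as a plain TCP round-robin over the plaintext line-protocol port. This is only a hedged illustration: the hostnames, ports, and backend names are made up, and it assumes the backing store is multi-master (so any node can accept any datapoint).

```
# Illustrative haproxy.cfg fragment (hostnames/ports are examples).
# Round-robin the carbon plaintext protocol over multi-master stores.
listen carbon_lineproto
    bind *:2003
    mode tcp
    balance roundrobin
    server store1 store1.example.com:2003 check
    server store2 store2.example.com:2003 check
```

With plain whisper-backed caches this would not be safe, since round-robin would scatter points for a single metric across hosts; that is why the comment above stresses the multi-master requirement.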
@SEJeff what about using round-robin DNS (or haproxy) with a scalable tier of just relay nodes? I suppose you might have to stop your relays when adding new cache nodes until their configs are updated, but that seems like a minor annoyance (collectd on the hosts should buffer data until the relays are available again...)
@SEJeff Thanks for sharing your experience above!
With regards to "what is the future of Carbon": per a recent conversation with @mleinart, the process will look something like this:
Insert joke about 99 bottles of beer on the wall.
Closed due to lack of activity.
Hi guys.
I'm planning a big graphite installation and I have to decide on the final infrastructure and releases to install.
I have tested Graphite 0.9.x and 0.10-alpha, Whisper and Ceres, and also Carbon (master branch and megacarbon branch). I'm planning a multi-node Graphite cluster.
My only documentation sources for clustering Graphite are:
http://graphite.readthedocs.org/en/latest/carbon-daemons.html
http://bitprophet.org/blog/2013/03/07/graphite/
http://anatolijd.blogspot.com.es/2013/06/graphitemegacarbonceres-multi-node.html
I'm worried because of the lack of information on carbon vs megacarbon features. I'm also watching updates, and the master branch seems to be updated more recently than the megacarbon branch.
Which is best for a big platform now?
What's the roadmap for carbon and megacarbon in the next few months?