Drive adoption via specific projects #3254

sandstrom · 2014-12-25T12:57:56Z

I think the idea behind Rubinius (implementing most of Ruby in Ruby itself) is great, so I want to see this project succeed.

One thing that may be helpful is to single out particular software projects and make them run really well on Rubinius. This will mostly entail work in the project itself, not in Rubinius. But by making Rubinius the primary runtime for the project, it'll increase participation and interest in Rubinius itself.

Some properties of such a project that'll probably be useful:

Should ideally be self-contained, not a library or framework. That way, fewer things can break, since no additional end-user code is running on top. Optimizations are also easier.
Should use aspects of Ruby that Rubinius is extra good at, e.g. speed or introspection.
Should be popular in itself, since the purpose is to increase interest in and adoption of Rubinius.

You'll probably have better thoughts on candidates, but some that come to mind are:

Logstash: popular, performance sensitive, memory introspection may be useful
Chef: self-contained and popular
Rubocop: self-contained, ubiquitous, performance may be important
Sidekiq: performance is important

Things to do could include improving documentation on running the project with Rubinius, adjusting the project to make use of specific Rubinius features (where relevant), and making enhancements to Rubinius tailored specifically to these projects (similar to the Rails-MRI symbols garbage collection improvement).

What do you think? Good idea, bad?

[1] https://github.com/elasticsearch/logstash
[3] https://github.com/opscode/chef
[4] https://github.com/bbatsov/rubocop
[5] https://github.com/mperham/sidekiq

brixen · 2014-12-29T21:07:30Z

@sandstrom

One thing that may be helpful is to single out particular software projects and make them run really well on Rubinius. This will mostly entail work in the project itself, not in Rubinius. But by making Rubinius the primary runtime for the project, it'll increase participation and interest in Rubinius itself.

This is a great idea. We have definitely put more focus on broad feature compatibility over showing particular success in one area. However, people are more interested in the one thing they care about than the ten things that might work.

In my opinion, Sidekiq is the most interesting of these. Rubinius supports true parallelism and that benefits Sidekiq a lot. We are using a single Rubinius process on each node of a 10-node cluster of 8-core VMs with Sidekiq to churn through processing while saturating the CPU. By contrast, an MRI process can only utilize a single core, requiring 8x the memory usage to effectively use the 8 cores. Multiplied out, that's 80x the memory pressure under MRI simply because MRI has a global interpreter lock (GIL).
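The process arithmetic in the paragraph above can be sketched as follows. This is only a back-of-the-envelope illustration, assuming one Rubinius process can keep every core on a node busy while MRI needs one process per core; the method and constant names are hypothetical, and it ignores effects such as copy-on-write memory sharing between forked processes.

```ruby
# Hypothetical sizing sketch using the cluster numbers from the comment above.
NODES = 10
CORES_PER_NODE = 8

# Number of Ruby processes needed to keep every core busy.
# parallel_threads: true  -> threads run in parallel, one process per node.
# parallel_threads: false -> a global lock limits each process to one core,
#                            so one process per core is assumed.
def processes_needed(nodes:, cores_per_node:, parallel_threads:)
  per_node = parallel_threads ? 1 : cores_per_node
  nodes * per_node
end

rubinius_processes = processes_needed(nodes: NODES, cores_per_node: CORES_PER_NODE, parallel_threads: true)
mri_processes = processes_needed(nodes: NODES, cores_per_node: CORES_PER_NODE, parallel_threads: false)

puts "Rubinius processes across the cluster: #{rubinius_processes}"
puts "MRI processes across the cluster:      #{mri_processes}"
puts "MRI process count relative to Rubinius: #{mri_processes / rubinius_processes}x"
```

Under these assumptions the cluster runs 80 MRI processes where 10 Rubinius processes would do, so if each process has a similar footprint, total memory use is roughly 8x higher per node and across the cluster.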

If you think you'd like to help with Sidekiq, perhaps a good start would be making a list of issues here that you know need to be addressed. We may then split those into separate issues.

What do you think?

chuckremes · 2014-12-29T21:14:23Z

If Sidekiq is the choice, then the Actor library “celluloid” will also need some TLC. It provides all of the parallelism magic, so to make Sidekiq shine we would need to make celluloid shine first.


sandstrom · 2014-12-29T21:51:19Z

I'm glad you like it!

Perhaps we should loop in @mperham and see if he has any wishes for Sidekiq and Rubinius?

(Unfortunately I'm working ~70h weeks with a startup and don't have any time to offer. My modest contribution must sadly end with the creation of this issue. Feel free to use or discard this idea as it pleases you).

mperham · 2014-12-29T22:25:16Z

If you want to make Rubinius outshine MRI, make easy-to-use memory and CPU profiling tools and make them work in a multithreaded environment. Lots of Sidekiq users want to profile a job (running on a single thread) in production but can't get anything useful out of MRI because its profiling APIs watch all threads, so there's too much noise in the output. Many also complain about memory bloat, usually due to ActiveRecord, but there are no tools that can be used to track down the source of the bloat.
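To illustrate what "per-thread" measurement means here, below is a minimal sketch that records the CPU time consumed by each job on its own thread. It is not a profiler and not anything Sidekiq or Rubinius provides; it assumes a platform where Ruby defines `Process::CLOCK_THREAD_CPUTIME_ID` (for example Linux), and the two jobs are hypothetical stand-ins.

```ruby
# Returns the CPU seconds consumed by the calling thread while the block runs.
# Assumes the platform defines Process::CLOCK_THREAD_CPUTIME_ID.
def thread_cpu_seconds
  start = Process.clock_gettime(Process::CLOCK_THREAD_CPUTIME_ID)
  yield
  Process.clock_gettime(Process::CLOCK_THREAD_CPUTIME_ID) - start
end

# Hypothetical stand-ins for a CPU-heavy job and a mostly idle job.
jobs = {
  busy: -> { 2_000_000.times { |i| i * i } },
  idle: -> { sleep 0.05 }
}

# Run each job on its own thread; the measurement happens inside the thread,
# so one job's CPU time is not mixed with the other's.
threads = jobs.map do |name, job|
  [name, Thread.new { thread_cpu_seconds(&job) }]
end

# Thread#value joins the thread and returns the block's result.
timings = threads.map { |name, thread| [name, thread.value] }.to_h

timings.each { |name, secs| puts format("%-5s %.4f s CPU", name, secs) }
```

Because each thread reads its own CPU clock, the busy job's cost shows up separately from the idle one, which is the kind of signal that is hard to get when a tool samples all threads at once.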

brixen · 2014-12-30T04:35:43Z

@mperham awesome suggestions, thank you! Those are on the way and I'll prioritize them as high as I can.

brixen · 2014-12-30T04:37:19Z

@mperham btw, if you know of any users who may have open-source apps or apps they may be willing to share to look into this, that would be fantastic.

ileitch · 2014-12-30T07:09:18Z

I hope you don't mind me chiming in here. I'm an old contributor and long-time user, and it bothers me that Rubinius hasn't seen the adoption it deserves.

My 2 cents on why this is. Rubinius needs to outshine MRI on the basic, day-to-day use cases of the majority of Ruby developers. I see that as local, development mode Rails app development. People often deploy the same Ruby platform to production as they use in development; it's more familiar and reduces risk. If Rubinius can outshine MRI here, production deployments will follow.

I see only 1 major hurdle remaining for this to be the case: Rails dev mode performance. I've tried many times over the years to make Rubinius my local Ruby of choice, but I always switch back to MRI due to the agonizing page load times. I'm sure 90% of this boils down to code reloading and the asset pipeline, places where the JIT can't shine.

I realize it sounds like a rather pathetic argument when you compare it to all the other advantages Rubinius provides, but the majority of developers will follow the path of least resistance. I look forward to seeing if the "CodeDB" in Rubinius 3.0 can bring us on par with MRI.

❤️

brixen · 2014-12-30T07:18:26Z

@ileitch you're totally correct as well, we need to have the Rails boot times on par with MRI. This is super high on my list and I swear I'll work on it once the latest round of keyword argument parsing misery is addressed. We're using a hybrid approach where developers can use MRI or Rubinius for development and deploy on Rubinius. Travis CI makes this nearly painless. We need to document this better as well.

lucasmartins · 2015-02-25T12:34:41Z

@brixen Nice to read this discussion about leveraging Sidekiq on Rubinius. MailCannon is an open-source project that has been running in production for a year. I no longer work at the company I created it for, but I'm sure I can get updated production metrics from the people who do.

We faced all the issues @mperham mentioned. At that time, performance profiling was done mainly with Librato. I also used OS X "Instruments" to watch the memory, but it's barely useful.

brixen · 2015-02-25T21:37:07Z

@lucasmartins it would be awesome to have a dialogue with folks supporting MailCannon in production. The new(-ish) Rubinius Metrics stuff was created exactly to address the problem of understanding performance without resorting to artificial (and nearly useless) benchmarking. The next step is to make it even easier to get the results of Rubinius Metrics. If you can put me in contact with someone, that would be awesome!

lucasmartins · 2015-02-26T10:38:34Z

@joaohornburg John, can you catch up on this thread? @brixen runs the Rubinius project and it would be awesome if you could give him some metrics about MailCannon running in production. Optimizing Sidekiq on RBX would help drive its adoption. It still runs on RBX, right?

lucasmartins · 2015-02-26T12:56:17Z

@pcasaretto Hey Paulo, can you catch up on this thread? @brixen runs the Rubinius project and it would be awesome if you could give him some metrics about MailCannon running in production. Optimizing Sidekiq on RBX would help drive its adoption. It still runs on RBX, right?

pcasaretto · 2015-03-03T13:22:54Z

@andrehjr is the guy running MailCannon these days.

sandstrom · 2015-05-11T19:52:05Z

@brixen just curious if you've found any good targets?

brixen · 2015-05-23T16:31:15Z

@sandstrom mostly focused on production apps at this time, but I think Discourse could be a good app, especially to demonstrate concurrency, concurrent garbage collection, and the built-in Metrics with StatsD output.

Any interest in looking into setting up a Discourse test app and environment?

sandstrom · 2015-05-24T16:39:16Z

No time, sorry (#3254 (comment))

digitalextremist · 2015-08-14T03:15:52Z

Right now Sidekiq will segfault with Rubinius (sidekiq/sidekiq#2489).

But otherwise, I have been doing all my latest Celluloid-related work using Rubinius and would like to continue doing so. I will be doing deeper comparisons to JRuby across different types of operations, system configurations, etc. For development it is much faster, but production is unclear.

brixen · 2020-01-04T21:14:38Z

Rubinius encourages people to experiment with existing Ruby projects, but migrating from MRI is not an activity that has been very successful in the past.

The focus for Rubinius in the near term is on the following capabilities:

Instruction set
Debugger
Profiler
Just-in-time compiler
Concurrency
Garbage collector

Contributions in the form of PRs for any of the areas of focus above are appreciated. Once those core capabilities are more robust, it will be possible to better support people interested in trying to run their existing projects on Rubinius.

yorickpeterse added the Community label Jan 1, 2015

brixen closed this as completed Jan 4, 2020
