Drive adoption via specific projects #3254

Closed
sandstrom opened this issue Dec 25, 2014 · 18 comments

Comments

@sandstrom
Contributor

I think the idea behind Rubinius (implementing most of Ruby in Ruby itself) is great, so I want to see this project succeed.

One thing that may be helpful is to single out particular software projects and make them run really well on Rubinius. This will mostly entail work in the project itself, not in Rubinius. But by making Rubinius the primary runtime for the project, it'll increase participation and interest in Rubinius itself.

Some properties of such a project that'll probably be useful:

  1. Should ideally be self-contained, not a library or framework. That way, fewer things can break, since no additional, end-user developed code is running on top. Optimizations are also easier.
  2. Should use aspects of Ruby that Rubinius is extra good at, e.g. speed or introspection.
  3. Should be popular in itself, since the purpose is to increase interest in and adoption of Rubinius.

You'll probably have better thoughts on candidates, but some that come to mind are:

  1. Logstash: popular, performance sensitive, memory introspection may be useful
  2. Chef: self-contained and popular
  3. Rubocop: self-contained, ubiquitous, performance may be important
  4. Sidekiq: performance is important

Things to do could include improving documentation on running the project with Rubinius, adjusting the project to make use of specific features in Rubinius (where relevant), and making enhancements to Rubinius tailored specifically to these projects (similar to the Rails-motivated symbol garbage collection improvement in MRI).

What do you think? Good idea, bad?

[1] https://github.com/elasticsearch/logstash
[2] https://github.com/opscode/chef
[3] https://github.com/bbatsov/rubocop
[4] https://github.com/mperham/sidekiq

@brixen
Member

brixen commented Dec 29, 2014

@sandstrom

> One thing that may be helpful is to single out particular software projects and make them run really well on Rubinius. This will mostly entail work in the project itself, not in Rubinius. But by making Rubinius the primary runtime for the project, it'll increase participation and interest in Rubinius itself.

This is a great idea. We have definitely put more focus on broad feature compatibility than on showing particular success in one area. However, people are more interested in the one thing they care about than the ten things that might work.

In my opinion, Sidekiq is the most interesting of these. Rubinius supports true parallelism and that benefits Sidekiq a lot. We are using a single Rubinius process on each node of a 10-node cluster of 8-core VMs with Sidekiq to churn through processing while saturating the CPU. By contrast, an MRI process can only utilize a single core for Ruby code, so it takes 8 processes, and roughly 8x the memory, to effectively use the 8 cores. Multiplied out across the cluster, that's 80 MRI processes where 10 Rubinius processes suffice, simply because MRI has a global interpreter lock (GIL).

If you think you'd like to help with Sidekiq, perhaps a good start would be making a list of issues here that you know need to be addressed. We may then split those into separate issues.

What do you think?
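
The cluster arithmetic in the comment above can be sketched in a few lines of Ruby. This is only a back-of-the-envelope illustration: the helper names and the per-process memory figure are assumptions made up for the example, and it treats a multi-threaded process as costing the same memory as a single-threaded one, which is a simplification.

```ruby
# Processes needed to keep every core busy across the cluster.
# parallel_threads: true  -> one multi-threaded process per node (Rubinius)
# parallel_threads: false -> one process per core (MRI, because of the GIL)
def processes_needed(nodes:, cores_per_node:, parallel_threads:)
  per_node = parallel_threads ? 1 : cores_per_node
  nodes * per_node
end

# Hypothetical per-process footprint, chosen only for illustration.
ASSUMED_MB_PER_PROCESS = 500

def cluster_memory_mb(processes, mb_per_process = ASSUMED_MB_PER_PROCESS)
  processes * mb_per_process
end

rubinius = processes_needed(nodes: 10, cores_per_node: 8, parallel_threads: true)
mri      = processes_needed(nodes: 10, cores_per_node: 8, parallel_threads: false)

puts "Rubinius: #{rubinius} processes, ~#{cluster_memory_mb(rubinius)} MB"
puts "MRI:      #{mri} processes, ~#{cluster_memory_mb(mri)} MB"
puts "MRI needs #{mri / rubinius}x as many processes"
```

Under these assumptions the cluster runs 10 Rubinius processes against 80 MRI processes, which is where the 8x per-node figure comes from.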

@chuckremes
Member

If Sidekiq is the choice, then the actor library Celluloid will also need some TLC. It provides all of the parallelism magic, so to make Sidekiq shine we would first need to make Celluloid shine.

@sandstrom
Contributor Author

I'm glad you like it!

Perhaps we should loop in @mperham and see if he has any wishes for Sidekiq and Rubinius?

(Unfortunately I'm working ~70h weeks with a startup and don't have any time to offer. My modest contribution must sadly end with the creation of this issue. Feel free to use or discard this idea as it pleases you).

@mperham
Contributor

mperham commented Dec 29, 2014

If you want to make Rubinius outshine MRI, make easy-to-use memory and CPU profiling tools and make them work in a multithreaded environment. Lots of Sidekiq users want to profile a job (running on a single thread) in production but can't get anything useful out of MRI because its profiling APIs watch all threads, so there's too much noise in the output. Many also complain about memory bloat, usually due to ActiveRecord, but there are no tools that can be used to track down the source of the bloat.
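
To make the "one job, one thread" idea concrete, here is a minimal sketch of measuring only the CPU time consumed by the thread running a job, ignoring whatever other threads are doing. It is not a profiler and not how Sidekiq or Rubinius do it; `measure_job` and `THREAD_CLOCK` are hypothetical names, and the sketch assumes the platform exposes a per-thread CPU clock (it falls back to wall-clock time otherwise).

```ruby
# Assumption: Process::CLOCK_THREAD_CPUTIME_ID exists on this platform.
# Where it does not, fall back to a monotonic wall clock.
THREAD_CLOCK =
  if Process.const_defined?(:CLOCK_THREAD_CPUTIME_ID)
    Process::CLOCK_THREAD_CPUTIME_ID
  else
    Process::CLOCK_MONOTONIC
  end

# Hypothetical helper: runs the block on the calling thread and returns
# [block_result, seconds_elapsed_on_the_chosen_clock].
def measure_job
  started = Process.clock_gettime(THREAD_CLOCK)
  result = yield
  elapsed = Process.clock_gettime(THREAD_CLOCK) - started
  [result, elapsed]
end

# A noisy neighbour thread, standing in for other jobs in the same process.
noise = Thread.new { 200_000.times { |i| Math.sqrt(i) } }

# The job we care about, measured from inside its own thread.
job = Thread.new { measure_job { (1..50_000).reduce(:+) } }

result, seconds = job.value
noise.join
puts "job result=#{result} seconds=#{seconds.round(6)}"
```

Because the clock is read inside the job's own thread, work done by the neighbour thread is not attributed to the job when the per-thread CPU clock is available.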

@brixen
Member

brixen commented Dec 30, 2014

@mperham awesome suggestions, thank you! Those are on the way and I'll prioritize them as high as I can.

@brixen
Member

brixen commented Dec 30, 2014

@mperham btw, if you know of any users who may have open-source apps or apps they may be willing to share to look into this, that would be fantastic.

@ileitch
Member

ileitch commented Dec 30, 2014

I hope you don't mind me chiming in here. I'm an old contributor and long-time user. It bothers me that Rubinius hasn't seen the adoption it deserves.

My 2 cents on why this is: Rubinius needs to outshine MRI on the basic, day-to-day use cases of the majority of Ruby developers. I see that as local, development-mode Rails work. People often deploy the same Ruby platform to production as they use in development; it's more familiar and reduces risk. If Rubinius can outshine MRI here, production deployments will follow.

I see only 1 major hurdle remaining for this to be the case: Rails dev mode performance. I've tried many times over the years to make Rubinius my local Ruby of choice, but I always switch back to MRI due to the agonizing page load times. I'm sure 90% of this boils down to code reloading and the asset pipeline, places where the JIT can't shine.

I realize it sounds like a rather pathetic argument when you compare it to all the other advantages Rubinius provides, but the majority of developers will follow the path of least resistance. I look forward to seeing if the "CodeDB" in Rubinius 3.0 can bring us on par with MRI.

❤️

@brixen
Member

brixen commented Dec 30, 2014

@ileitch you're totally correct as well, we need to have the Rails boot times on par with MRI. This is super high on my list and I swear I'll work on it once the latest round of keyword argument parsing misery is addressed. We're using a hybrid approach where developers can use MRI or Rubinius for development and deploy on Rubinius. Travis CI makes this nearly painless. We need to document this better as well.

@lucasmartins

@brixen Nice to read this discussion. On leveraging Sidekiq on Rubinius: MailCannon is an open source project that has been running in production for a year. I no longer work at the company I created it for, but I'm sure I can get updated production metrics from the people who do.

We faced all the issues @mperham mentioned. Performance profiling was mainly done with Librato at the time. I also used the OS X "Instruments" tool to watch memory, but it was barely useful.

@brixen
Member

brixen commented Feb 25, 2015

@lucasmartins it would be awesome to have a dialogue with folks supporting MailCannon in production. The new(-ish) Rubinius Metrics stuff was created exactly to address the problem of understanding performance without resorting to artificial (and nearly useless) benchmarking. The next step is to make it even easier to get the results of Rubinius Metrics. If you can put me in contact with someone, that would be awesome!

@lucasmartins

@joaohornburg John, can you catch up on this thread? @brixen runs the Rubinius project and it would be awesome if you could give him some metrics about MailCannon running in production. Optimizing Sidekiq on RBX would help drive its adoption. It still runs on RBX, right?

@lucasmartins

@pcasaretto Hey Paulo, can you catch up on this thread? @brixen runs the Rubinius project and it would be awesome if you could give him some metrics about MailCannon running in production. Optimizing Sidekiq on RBX would help drive its adoption. It still runs on RBX, right?

@pcasaretto

@andrehjr is the guy running MailCannon these days.

@sandstrom
Contributor Author

@brixen just curious if you've found any good targets?

@brixen
Member

brixen commented May 23, 2015

@sandstrom mostly focused on production apps at this time, but I think Discourse could be a good app, especially to demonstrate concurrency, concurrent garbage collection, and the built-in Metrics with StatsD output.

Any interest in looking into setting up a Discourse test app and environment?
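
For readers unfamiliar with StatsD output, here is a minimal sketch of what emitting a metric in the plain-text StatsD line format over UDP can look like. It uses only Ruby's standard library and does not show the Rubinius Metrics API; the helper names, the metric name `app.gc.count`, and the collector address `127.0.0.1:8125` are assumptions for illustration.

```ruby
require "socket"

# Format a gauge in the StatsD line protocol: "<name>:<value>|g".
def statsd_gauge(name, value)
  "#{name}:#{value}|g"
end

# Send one datagram to an assumed local collector. UDP is fire-and-forget,
# so no collector has to be listening for the send itself to succeed.
def send_metric(line, host: "127.0.0.1", port: 8125)
  socket = UDPSocket.new
  socket.send(line, 0, host, port)
rescue SystemCallError => e
  warn "metric not sent: #{e.message}"
  nil
ensure
  socket&.close
end

# Report how many GC runs this process has performed so far.
line = statsd_gauge("app.gc.count", GC.count)
send_metric(line)
puts line
```

A test app such as the Discourse setup suggested above could point a StatsD-compatible collector at that port and graph the values over time.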

@sandstrom
Contributor Author

No time, sorry (#3254 (comment))

@digitalextremist
Member

Right now Sidekiq segfaults on Rubinius (sidekiq/sidekiq#2489).

But otherwise, I have been doing all my latest Celluloid-related work using Rubinius and would like to continue doing so. I will be doing deeper comparisons to JRuby in different types of operations, system configurations, etc. For development it is much faster, but production is unclear.

@brixen
Member

brixen commented Jan 4, 2020

Rubinius encourages people to experiment with existing Ruby projects, but migrating from MRI is not an activity that has been very successful in the past.

The focus for Rubinius in the near term is on the following capabilities:

  1. Instruction set
  2. Debugger
  3. Profiler
  4. Just-in-time compiler
  5. Concurrency
  6. Garbage collector

Contributions in the form of PRs for any of the areas of focus above are appreciated. Once those core capabilities are more robust, it will be possible to better support people interested in trying to run their existing projects on Rubinius.

@brixen brixen closed this as completed Jan 4, 2020