Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to enable ir writing on-the-fly #3791

Merged
merged 1 commit into from Apr 15, 2016

Conversation

@chyh1990
Copy link

@chyh1990 chyh1990 commented Apr 10, 2016

Hi,

I'm working on ruby (JRuby only) binding for Apache Spark. I have finished a working prototype which enables REAL closure serialization (IRClosure and variables in dynamic scopes) on JRuby:

https://github.com/chyh1990/jruby-spark/blob/master/src/main/java/com/sensetime/utils/ProcToBytesService.java

It's actually a java extension, which bring marshalling to Proc.
But I found JRuby do not matain children static scopes by default.
Set jruby.ir.writing to true can solve this problem, but it will write out a lot of IR files. So I add
an option to force children static scopes registration, which enable IR writing on the fly.

I think it will be amazing to enable closure serialization on JRuby, which makes it possible to write
a fully functional Ruby-Spark binding.

@kares
Copy link
Member

@kares kares commented Apr 10, 2016

the option and its desc are a little confusing - understand the need for a short desc but still unable to wrap my head around the option without reading the PR desc.

@chyh1990 chyh1990 force-pushed the chyh1990:add_ir_writing_dyn branch from 11812b3 to f2646ec Apr 11, 2016
@chyh1990
Copy link
Author

@chyh1990 chyh1990 commented Apr 11, 2016

Option name changed.

@enebo
Copy link
Member

@enebo enebo commented Apr 11, 2016

@chyh1990 I still don't like this option name. This option is really just recording or tracking lexical containment of scopes. This information can be used by more than dumping/persistence (for example, we use it when dry run is specified). Names are tough :)

The essence is that it is recording/tracking lexical/static child scopes. Hmmm: ir.record_lexical_scopes. This is a little longer but it is already a JRUBY_OPTS sort of flag. @subbuss you have an opinion on this?

@chyh1990 As for your extension, I would like to try and minimize how much persistence code you need to maintain on this. So far no one has depended on IR persistence outside of the core project so we have been much more readily breaking stuff. The actual interfaces for IR persistence has not changed as much but our instruction set is evolving and I fear us breaking you.

In looking at your provided extension source one glaring issue is that our IRWriter is not extendable. Had we made those classes with non-static methods then you could have used a lot more of our original code. OR we could have made all these static methods public. I think the former is better than the later. Since you had to wrestle with how we organized this do you have any thoughts on improvement? I could fairly quickly make an instance-based version of IRWriter/IRReader (largely just make it an instance but still keep the public static methods for backwards compat. Then I think you will largely only need to override one method to add your static scope saving code. What do you think?

@subbuss
Copy link
Contributor

@subbuss subbuss commented Apr 12, 2016

Yes, dump_scopes is the not right flag. record_lexical_scopes is better but it doesn't quite capture that you are recording lexical nesting of scopes ... so, maybe record_lexical_hierarchy?

@chyh1990
Copy link
Author

@chyh1990 chyh1990 commented Apr 12, 2016

@enebo Thanks for your reply! As you say, all the trouble comes from the non-extendable IRWriter and IRAnalyzier. There is several hacks (including code copy&paste) in the closure serializer extension to workaround limitations in JRuby cores:

do not handle EVAL_SCRIPT correctly, causing NullPointerException where serializing closures in REPL.

  • JRuby does not support _load callback so I have to register a dummy NEW_PROC_ALLOCATOR

I definitely want to see improvement of closure serialization feature in Ruby. This feature is available in scala/python and even R but hardly exists in ruby world.

@enebo is record_lexical_hierarchy ok?

@chyh1990 chyh1990 force-pushed the chyh1990:add_ir_writing_dyn branch from f2646ec to 29fbd9e Apr 12, 2016
@chyh1990
Copy link
Author

@chyh1990 chyh1990 commented Apr 14, 2016

change the option name to record_lexical_hierarchy

@enebo enebo added this to the JRuby 9.1.0.0 milestone Apr 15, 2016
@enebo enebo merged commit fcf3451 into jruby:master Apr 15, 2016
1 of 2 checks passed
1 of 2 checks passed
continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@enebo
Copy link
Member

@enebo enebo commented May 3, 2016

@chyh1990 I opened #3844. If you look at that branch I am trying to move IRPersistence to something where you can map most of your code to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants
You can’t perform that action at this time.