New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to enable ir writing on-the-fly #3791

Merged
merged 1 commit into from Apr 15, 2016

Conversation

Projects
None yet
5 participants
@chyh1990

Hi,

I'm working on ruby (JRuby only) binding for Apache Spark. I have finished a working prototype which enables REAL closure serialization (IRClosure and variables in dynamic scopes) on JRuby:

https://github.com/chyh1990/jruby-spark/blob/master/src/main/java/com/sensetime/utils/ProcToBytesService.java

It's actually a java extension, which bring marshalling to Proc.
But I found JRuby do not matain children static scopes by default.
Set jruby.ir.writing to true can solve this problem, but it will write out a lot of IR files. So I add
an option to force children static scopes registration, which enable IR writing on the fly.

I think it will be amazing to enable closure serialization on JRuby, which makes it possible to write
a fully functional Ruby-Spark binding.

@kares

This comment has been minimized.

Show comment
Hide comment
@kares

kares Apr 10, 2016

Member

the option and its desc are a little confusing - understand the need for a short desc but still unable to wrap my head around the option without reading the PR desc.

Member

kares commented Apr 10, 2016

the option and its desc are a little confusing - understand the need for a short desc but still unable to wrap my head around the option without reading the PR desc.

@chyh1990

This comment has been minimized.

Show comment
Hide comment
@chyh1990

chyh1990 Apr 11, 2016

Option name changed.

Option name changed.

@enebo

This comment has been minimized.

Show comment
Hide comment
@enebo

enebo Apr 11, 2016

Member

@chyh1990 I still don't like this option name. This option is really just recording or tracking lexical containment of scopes. This information can be used by more than dumping/persistence (for example, we use it when dry run is specified). Names are tough :)

The essence is that it is recording/tracking lexical/static child scopes. Hmmm: ir.record_lexical_scopes. This is a little longer but it is already a JRUBY_OPTS sort of flag. @subbuss you have an opinion on this?

@chyh1990 As for your extension, I would like to try and minimize how much persistence code you need to maintain on this. So far no one has depended on IR persistence outside of the core project so we have been much more readily breaking stuff. The actual interfaces for IR persistence has not changed as much but our instruction set is evolving and I fear us breaking you.

In looking at your provided extension source one glaring issue is that our IRWriter is not extendable. Had we made those classes with non-static methods then you could have used a lot more of our original code. OR we could have made all these static methods public. I think the former is better than the later. Since you had to wrestle with how we organized this do you have any thoughts on improvement? I could fairly quickly make an instance-based version of IRWriter/IRReader (largely just make it an instance but still keep the public static methods for backwards compat. Then I think you will largely only need to override one method to add your static scope saving code. What do you think?

Member

enebo commented Apr 11, 2016

@chyh1990 I still don't like this option name. This option is really just recording or tracking lexical containment of scopes. This information can be used by more than dumping/persistence (for example, we use it when dry run is specified). Names are tough :)

The essence is that it is recording/tracking lexical/static child scopes. Hmmm: ir.record_lexical_scopes. This is a little longer but it is already a JRUBY_OPTS sort of flag. @subbuss you have an opinion on this?

@chyh1990 As for your extension, I would like to try and minimize how much persistence code you need to maintain on this. So far no one has depended on IR persistence outside of the core project so we have been much more readily breaking stuff. The actual interfaces for IR persistence has not changed as much but our instruction set is evolving and I fear us breaking you.

In looking at your provided extension source one glaring issue is that our IRWriter is not extendable. Had we made those classes with non-static methods then you could have used a lot more of our original code. OR we could have made all these static methods public. I think the former is better than the later. Since you had to wrestle with how we organized this do you have any thoughts on improvement? I could fairly quickly make an instance-based version of IRWriter/IRReader (largely just make it an instance but still keep the public static methods for backwards compat. Then I think you will largely only need to override one method to add your static scope saving code. What do you think?

@subbuss

This comment has been minimized.

Show comment
Hide comment
@subbuss

subbuss Apr 12, 2016

Contributor

Yes, dump_scopes is the not right flag. record_lexical_scopes is better but it doesn't quite capture that you are recording lexical nesting of scopes ... so, maybe record_lexical_hierarchy?

Contributor

subbuss commented Apr 12, 2016

Yes, dump_scopes is the not right flag. record_lexical_scopes is better but it doesn't quite capture that you are recording lexical nesting of scopes ... so, maybe record_lexical_hierarchy?

@chyh1990

This comment has been minimized.

Show comment
Hide comment
@chyh1990

chyh1990 Apr 12, 2016

@enebo Thanks for your reply! As you say, all the trouble comes from the non-extendable IRWriter and IRAnalyzier. There is several hacks (including code copy&paste) in the closure serializer extension to workaround limitations in JRuby cores:

do not handle EVAL_SCRIPT correctly, causing NullPointerException where serializing closures in REPL.

  • JRuby does not support _load callback so I have to register a dummy NEW_PROC_ALLOCATOR

I definitely want to see improvement of closure serialization feature in Ruby. This feature is available in scala/python and even R but hardly exists in ruby world.

@enebo is record_lexical_hierarchy ok?

@enebo Thanks for your reply! As you say, all the trouble comes from the non-extendable IRWriter and IRAnalyzier. There is several hacks (including code copy&paste) in the closure serializer extension to workaround limitations in JRuby cores:

do not handle EVAL_SCRIPT correctly, causing NullPointerException where serializing closures in REPL.

  • JRuby does not support _load callback so I have to register a dummy NEW_PROC_ALLOCATOR

I definitely want to see improvement of closure serialization feature in Ruby. This feature is available in scala/python and even R but hardly exists in ruby world.

@enebo is record_lexical_hierarchy ok?

@chyh1990

This comment has been minimized.

Show comment
Hide comment
@chyh1990

chyh1990 Apr 14, 2016

change the option name to record_lexical_hierarchy

change the option name to record_lexical_hierarchy

@enebo enebo added this to the JRuby 9.1.0.0 milestone Apr 15, 2016

@enebo enebo merged commit fcf3451 into jruby:master Apr 15, 2016

1 of 2 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@enebo

This comment has been minimized.

Show comment
Hide comment
@enebo

enebo May 3, 2016

Member

@chyh1990 I opened #3844. If you look at that branch I am trying to move IRPersistence to something where you can map most of your code to.

Member

enebo commented May 3, 2016

@chyh1990 I opened #3844. If you look at that branch I am trying to move IRPersistence to something where you can map most of your code to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment