Cleanup Array and add Reflector #2077

Closed

@evanphx
Member
evanphx commented Dec 5, 2012

Rather than expose the guts of Array, use a simple Reflector class to access data that isn't otherwise accessible.

The exposure of Array's internal data was only necessary because Ruby doesn't provide mechanisms to otherwise access it. The Reflector works around that, and because it is a separate class, the JIT can make access as fast as a normal ivar access.
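
For readers without the diff in front of them, the idea can be sketched roughly as follows. This is not the code from this PR: the `get`/`set` method names and the `Buffer` class are hypothetical, and the sketch falls back to `instance_variable_get`/`instance_variable_set` so that it runs on any Ruby, which is exactly the slow path the Rubinius version is meant to avoid.

```ruby
# Hypothetical sketch of a mirror: it reads and writes another object's
# instance variables so that object does not need public accessors.
class Reflector
  def initialize(object)
    @object = object
  end

  # Read an instance variable of the reflected object by name.
  def get(name)
    @object.instance_variable_get(:"@#{name}")
  end

  # Write an instance variable of the reflected object by name.
  def set(name, value)
    @object.instance_variable_set(:"@#{name}", value)
  end
end

# Stand-in for a core class whose state is not part of its public API.
class Buffer
  def initialize
    @total = 0
  end
end

buffer = Buffer.new
mirror = Reflector.new(buffer)
mirror.set(:total, 3)

total = mirror.get(:total)                # => 3
public_total = buffer.respond_to?(:total) # => false
```

The point of the separate class is that `Buffer` gains no new public methods, while code that holds a `Reflector` can still reach the state it needs.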

evanphx added some commits Dec 4, 2012
@evanphx evanphx Remove Array.attr_accessor :total 2dd95c3
@evanphx evanphx Clean Array by using a Reflector (simple mirror)
Array left its guts exposed to deal with the way that data has to be
exposed in Ruby. To clean things up, I've introduced a Reflector class,
which is a simple mirror. This can easily access the internal data of an
object. Additionally, rather than using instance_variable_[get|set], the
JIT can make access quite fast because it doesn't have to deal with any
edge cases.
0b96773
@dbussink
Member
dbussink commented Feb 5, 2013

@evanphx Do you want to change this to use the Mirror setup we have for String now as well?

@brixen
Member
brixen commented Feb 5, 2013

@dbussink I'm going to do that.

@brixen brixen added a commit that closed this pull request May 15, 2014
@brixen brixen Introduced mirrors for some of Array. Closes #2077.
The original ticket (#2015) that led to the PR in #2077 contains a fair
amount of bullshit, so some clarification is warranted.

Mirrors are an elegant and useful architectural tool for separating meta-level
and base-level functionality. The problem here is only partially of this
nature, so use of mirrors provides only a partial solution. The meta/base
separation permits re-classifying some methods as "meta" when they are more
accurately just a set of primitive operations omitted from the object's API.

Ruby's core classes are generally poorly designed, being more a random bag of
operations than a well-designed interface. Two concepts are essential to a
well-designed interface: stratification and composition. Stratification is the
clear separation of methods into primitives and compositions. Composition is
the combining of two or more primitive operations to provide a more complex
operation. All compositions should clearly state what they compose and the
constraints they impose on the composition. This permits subclasses to
accurately know how refining methods can impact other behaviors of the class.

Nothing even remotely like this is specified for Ruby core classes. The Ruby
methods of a class are a thin veneer of names over a jumble of C functions.
Primitive operations and their constraints are unspecified and may not even be
exposed to Ruby.

Given this situation, attempting to implement the core Ruby classes like Array
in Ruby, and attempting to use good OO design principles, is extremely hard.
There are essentially three options:

1. Duplicate primitive operations in literal code in numerous methods that are
   the exposed Ruby API.
2. Factor these methods into "helper" methods that pollute the class namespace
   relative to the MRI namespace, where the helper functions are in C and
   invisible to Ruby.
3. Use an abstraction like mirrors to segregate the implementation of
   primitive operations or more complex arrangements of operations to
   implement some aspect of the exposed Ruby API.

A Ruby object's method table is logically flat and has a single namespace. In
other words, you can build a simple list of all methods that an object
responds to. There is no segmentation in this list.
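
A small illustration of the flat namespace, using hypothetical classes (Stack and LoudStack are not Rubinius code). Even a private helper shares the one method table, so a subclass that reuses the name silently changes the behavior of the public API:

```ruby
class Stack
  def push(item)
    (@items ||= []) << normalize(item)
    self
  end

  def items
    @items.dup
  end

  private

  # Helper that exists only to support push. It is private, but it still
  # lives in the same method table as push and items.
  def normalize(item)
    item
  end
end

class LoudStack < Stack
  # Unaware of Stack's helper, this subclass happens to reuse the name.
  def normalize(item)
    item.to_s.upcase
  end
end

plain = Stack.new.push("a").items     # => ["a"]
loud  = LoudStack.new.push("a").items # => ["A"]
```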

The use of mirrors, as done here, provides a second, separate namespace of
methods that the exposed API methods can access but users of the object won't
access (however, nothing prohibits a user from accessing them at this point).
This solves the problem of polluting the object's namespace while still
permitting the implementation of the exposed API to factor out common
functionality.
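
As a rough sketch of that arrangement (Ring, Ring::Mirror, and wrap are hypothetical names, and instance_variable_get stands in for whatever fast access the runtime provides), the helper lives on the mirror and never appears in the object's own method table:

```ruby
class Ring
  # Hypothetical mirror: holds helper operations for Ring outside of
  # Ring's own method table.
  class Mirror
    def initialize(ring)
      @ring = ring
    end

    # Helper used by the public API: wrap an index into the valid range.
    def wrap(index)
      index % @ring.instance_variable_get(:@slots).size
    end
  end

  def initialize(size)
    @slots = Array.new(size)
  end

  def [](index)
    @slots[Mirror.new(self).wrap(index)]
  end

  def []=(index, value)
    @slots[Mirror.new(self).wrap(index)] = value
  end
end

ring = Ring.new(3)
ring[4] = :x

value      = ring[1]                         # => :x
api        = Ring.instance_methods(false).sort # => [:[], :[]=]
still_open = Ring::Mirror.new(ring).wrap(7)  # => 1
```

The last line shows the caveat from the paragraph above: the mirror keeps the helper out of Ring's namespace, but a determined user can still construct a mirror and call it.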

However, that's not the full story. The state of the object (i.e. what
instance variables it has) is also a flat namespace. Nothing prevents a user
from overriding that state. In the case at issue here, Array has the following
attributes: @start, @total, and @tuple. There is no way to implement Array in
normal Ruby without some state. So the only options are: 1. have state that a
user can interfere with (as here), or 2. use some implementation method that
interposes a rigid boundary (like MRI does by implementing things in C). The
use of mirrors here does not prevent code like the following:

  class A < Array
    attr_accessor :state

    # Some method sets state to values incompatible with the expectations of
    # the methods implementing Array
  end

None of the suggested solutions in #2015 actually solve these problems. Naming
methods with a __some_name__ convention does not prevent them from being
overridden, even if it *may* prevent them from being inadvertently overridden
(we have dealt with cases in DataMapper and BlankSlate where these
implementations *did not* properly avoid modifying under-under methods). So
that is an ineffective hack that significantly reduces the quality of the core
class code. Nor does accessing the instance variables only through reflection
prevent them from being overwritten or misused.
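
Both failure modes are easy to reproduce with a toy class (Counter, BrokenCounter, and __bump__ are hypothetical and only illustrate the point):

```ruby
class Counter
  def initialize
    @count = 0
  end

  def increment
    __bump__
    @count
  end

  # "Protected" only by naming convention.
  def __bump__
    @count += 1
  end
end

class BrokenCounter < Counter
  # Nothing stops a subclass from redefining the under-under name.
  def __bump__
    @count = nil
  end
end

overridden = BrokenCounter.new.increment # => nil

# Reflection reaches the state regardless of how Counter itself accesses it.
counter = Counter.new
counter.instance_variable_set(:@count, "oops")
error = begin
  counter.increment
rescue TypeError => e
  e
end
```

After this runs, `error` holds the TypeError raised when `__bump__` tries to add 1 to a String.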

The only solutions that completely address the situation are frozen core
classes (i.e. not permitting code to modify core classes once they are loaded)
or implementing Ruby as a thin veneer of names on top of some language like C
or Java.

Both of these options are unacceptable for Rubinius. We are attempting to
build a more useful Ruby system, one that permits good use of OO principles to
compose programs. We also want to make the system easier to develop and
maintain by Ruby developers through use of Ruby code instead of some foreign
language. Just as important, there are significant benefits to writing the
core classes in Ruby:

1. The JIT compiler can inline core class code into application code, or
   application code into core class code, to produce very fast machine code.
   By using Ruby, arbitrary barriers in the system that defeat optimization
   are removed.
2. Writing the core classes in Ruby enables tools for comprehending program
   operation without an arbitrary opaque barrier between the application code
   and the core class code.
3. The core classes are available for inspection, learning, and improvement by
   ordinary Ruby programmers.
4. Improvements to the system that make Ruby execute faster improve the core
   classes and the application code.
5. Debugging application code is not inhibited by an opaque boundary at the
   core class level where a debugger would need to be able to step through
   Ruby code and then into C code.

These and other benefits weigh heavily in favor of the architecture Rubinius
is pursuing and against the idea of a system that is a bag of names over some
opaque and foreign language. This is a decision that Rubinius is free to make.

Finally, Rubinius is not "reimplementing someone's library". Rubinius would
not exist if MRI wasn't a technical disaster. Rubinius is both a competing
project and specifically guided by the goal of improving Ruby as a language.

Rubinius is the fundamental system underlying and providing services to a Ruby
application.  Rubinius gets priority. The core classes are completely visible
in Ruby. No programmer has to wonder if their code is using some
"undocumented" feature. They can, *and should*, read the code that implements
the APIs they use. It's their choice to write code that does not run on
Rubinius. It's not Rubinius' responsibility to make any random code work. If
someone doesn't like that, they are welcome to and encouraged to use MRI
instead.
06d6f25
@brixen brixen closed this in 06d6f25 May 15, 2014
@brixen brixen deleted the array+reflector branch Oct 29, 2014