Associations Refactor

Sean Cribbs edited this page Apr 19, 2012 · 4 revisions

This is an analysis of the state of associations in Ripple as of April 2012, and a strawman proposal for how to fix them. After discussing the proposal, the committers will decide whether to include this work in the 1.0 release, defer it to a later release, or roll it out in stages. See also #284.

Current problems

Bad defaults

Links are not sustainable for associations with more than a few dozen related documents; furthermore, links are the default for purely historical reasons. Other association types have to be requested with the :using option, or are inferred from the fact that the associated class is an EmbeddedDocument. This is a place where we violate the mantra of "expose the tradeoffs to the developer".

Unclear save/validation behavior

For associations between Document classes, it is unclear whether the assigned target will be saved or validated by default. It is also unclear what the default should be. Furthermore, the associated validator is added lazily, leading to surprising behavior in some edge cases.

#286, #145

Unclear dependency behavior

People used to ActiveRecord are surprised by the lack of the :dependent => [:destroy | :delete] option. While I would rather not imply that things can be deterministically deleted in transitive ways (there is no equivalent of ON DELETE CASCADE), we need something.

In addition, we have no automatic inverse detection, which makes certain types of association traversals tricky.

Too much proxying

Proxies are currently used for singular (one) associations, which causes confusion when the target is nil: the proxy that wraps it is not falsey. Other frameworks define reader/writer methods on the owner class that differ by association type, whereas Ripple defines a single type of reader/writer that is used regardless of whether the association is singular or plural.
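The "nil but not falsey" confusion can be reproduced in plain Ruby. The sketch below is a minimal illustration, not Ripple's actual proxy: NaiveProxy is a hypothetical class that forwards every call to its target.

```ruby
# Hypothetical stand-in for a catch-all association proxy (not Ripple's class).
class NaiveProxy < BasicObject
  def initialize(target)
    @target = target
  end

  # Forward every unknown call, including nil?, to the wrapped target.
  def method_missing(name, *args, &block)
    @target.__send__(name, *args, &block)
  end
end

author = NaiveProxy.new(nil)

author.nil?             # => true, because nil? is answered by the wrapped nil
author ? "set" : "none" # => "set", because the proxy object itself is truthy
```

Code such as `if post.author` therefore takes the wrong branch even though `post.author.nil?` reports true.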

#197

SRP violations, too highly factored

There was an impression that we could achieve a lot of code reuse in the association logic by keeping things in separate modules, e.g. One, Linked, etc. This has actually led to too much confusion and complexity; a simpler model would be more maintainable and easier to understand.

Strawman Proposal

What follows is a step-by-step "strawman" proposal for how to fix our current associations problems.

1 - Create association macros with explicit types

Instead of simply many and one, we add macros (class methods) like embed_one and link_to_many (full list below; renaming suggestions are welcome). These can be added before any other refactoring is done, with deprecation warnings added to the old generic macros. A release may be possible once this step is complete, deferring the rest of the refactor until the following major release.

Proposed Macros:

  • Embedded: embed_one, embed_many
  • Link: link_to_one, link_to_many
  • Key Correspondence: key_owner_of, key_belongs_to
  • Stored key: store_keys_of_many, store_key_of_one
  • Reference (search): referenced_by_many
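As a rough sketch of how the explicit macros and the deprecation path might fit together, the plain-Ruby module below registers association metadata without depending on Ripple. The module name, the metadata hash, the cardinality assigned to the key-correspondence macros, and the mapping of the old macros onto links are all assumptions made for illustration.

```ruby
# Hypothetical sketch; only the macro names come from the proposal.
module AssociationMacros
  # Macro name => [storage strategy, cardinality]
  MACROS = {
    embed_one:          [:embedded, :one],
    embed_many:         [:embedded, :many],
    link_to_one:        [:linked, :one],
    link_to_many:       [:linked, :many],
    key_owner_of:       [:key_correspondence, :one], # assumed singular
    key_belongs_to:     [:key_correspondence, :one], # assumed singular
    store_key_of_one:   [:stored_key, :one],
    store_keys_of_many: [:stored_key, :many],
    referenced_by_many: [:reference, :many]
  }.freeze

  # Per-class registry of association metadata.
  def associations
    @associations ||= {}
  end

  MACROS.each do |macro, (strategy, cardinality)|
    define_method(macro) do |name, options = {}|
      associations[name] = options.merge(strategy: strategy, cardinality: cardinality)
    end
  end

  # Old generic macros keep working but warn. Mapping them to links mirrors
  # the current default only; a real shim would also honor :using.
  def one(name, options = {})
    warn "[DEPRECATION] `one` is deprecated; use an explicit macro such as link_to_one"
    link_to_one(name, options)
  end

  def many(name, options = {})
    warn "[DEPRECATION] `many` is deprecated; use an explicit macro such as link_to_many"
    link_to_many(name, options)
  end
end

class Invoice
  extend AssociationMacros

  embed_many  :line_items
  link_to_one :customer
end
```

With this shape, the storage tradeoff is visible at the call site instead of being hidden behind a default.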

2A - Reimplement Association as sub-types

Association has too many responsibilities. Instead of continuing to add complexity to it, we reimplement its responsibilities in macro-specific metadata and proxies (where appropriate). When common features are discovered, they can be pulled back up into the parent class of Association or Proxy as appropriate.

2B - Fix Proxy method delegation

Instead of deferring all accessors to the Proxy as we do currently (which leads to the non-nil nil problem), we define methods on the owner class as appropriate to each association type. For singular associations, this means direct access to the target on read. For plural associations, the proxy will be used for read/write.
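One possible shape for type-specific accessors is sketched below in plain Ruby. AccessorDefinitions, CollectionProxy, and the define_singular/define_plural helpers are hypothetical names; the point is only that the singular reader hands back the target (or a real nil), while the plural reader hands back a proxy.

```ruby
# Hypothetical collection proxy used only by plural associations.
class CollectionProxy
  include Enumerable

  def initialize(owner)
    @owner = owner
    @targets = []
  end

  def <<(doc)
    @targets << doc
    self
  end

  def replace(docs)
    @targets = docs.to_a.dup
    self
  end

  def each(&block)
    @targets.each(&block)
  end
end

# Hypothetical helpers that define accessors directly on the owner class.
module AccessorDefinitions
  # Singular: the reader returns the target itself, so an unset target is nil.
  def define_singular(name)
    define_method(name) { instance_variable_get("@#{name}") }
    define_method("#{name}=") { |doc| instance_variable_set("@#{name}", doc) }
  end

  # Plural: the reader returns a lazily created proxy; the writer replaces its contents.
  def define_plural(name)
    define_method(name) do
      instance_variable_get("@#{name}") ||
        instance_variable_set("@#{name}", CollectionProxy.new(self))
    end
    define_method("#{name}=") { |docs| public_send(name).replace(docs) }
  end
end

class Post
  extend AccessorDefinitions

  define_singular :author
  define_plural   :comments
end
```

Here `Post.new.author` is a genuine nil, so `if post.author` behaves as expected, while `post.comments << comment` still goes through the proxy.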

3 - Fix and document auto-save behavior

Each association type should have a default behavior when adding to or replacing its target with an unsaved document. Furthermore, this behavior should be configurable with a macro option where appropriate.
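A per-type default with a per-association override could be as small as the sketch below. The :autosave option name and the default values are placeholders chosen for illustration; the proposal leaves the actual defaults open.

```ruby
# Placeholder defaults per storage strategy; the real choices are undecided.
AUTOSAVE_DEFAULTS = {
  embedded:   true,  # embedded documents are stored inside the owner
  linked:     false,
  stored_key: false
}.freeze

# Returns the explicit :autosave option if given, else the strategy default.
def autosave?(strategy, options = {})
  options.fetch(:autosave) { AUTOSAVE_DEFAULTS.fetch(strategy) }
end

autosave?(:linked)                 # uses the strategy default
autosave?(:linked, autosave: true) # the macro option wins
```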

4 - Add secondary-index association type

The current reference association type uses Riak Search and returns surprising results, including sorting-order oddities and limits on the number of records returned. A more reliable and less surprising implementation would use secondary indexes: the dependent side uses store_key_of_one, and the independent side uses a secondary index query to find associated keys. To accomplish this:

  1. store_key_of_one should automatically define a document property, with an index on that property. store_keys_of_many should do this as well, removing the need to predefine the property.
  2. referenced_by_many should discover or guess the inverse automatically (see below) and determine the index key to find associated documents.
  3. The search-based referenced_by_many should be renamed to something else, perhaps search_many. Optionally, it may add parameters to its definition that further constrain the search.
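The data flow behind the index-based approach can be sketched with an in-memory stand-in for a bucket. FakeBucket, the Order and Customer structs, and the index name customer_key_bin are assumptions for illustration; this is not the Riak client or Ripple API.

```ruby
# In-memory stand-in for a bucket with secondary indexes (illustration only).
class FakeBucket
  def initialize
    @entries = {} # key => [document, index entries]
  end

  def store(key, doc, indexes = {})
    @entries[key] = [doc, indexes]
    doc
  end

  # Exact-match index query: returns the keys whose index entry equals value.
  def get_index(index, value)
    @entries.select { |_key, (_doc, indexes)| indexes[index] == value }.keys
  end

  def fetch(key)
    @entries.fetch(key).first
  end
end

Customer = Struct.new(:key)
Order    = Struct.new(:key, :customer_key)

# Dependent side (store_key_of_one): write the owner's key as an indexed property.
def save_order(bucket, order)
  bucket.store(order.key, order, "customer_key_bin" => order.customer_key)
end

# Independent side (referenced_by_many): query the index, then fetch each key.
def orders_for(bucket, customer)
  bucket.get_index("customer_key_bin", customer.key).map { |key| bucket.fetch(key) }
end

orders = FakeBucket.new
alice  = Customer.new("alice")
save_order(orders, Order.new("o1", "alice"))
save_order(orders, Order.new("o2", "bob"))
save_order(orders, Order.new("o3", "alice"))
orders_for(orders, alice).map(&:key) # the keys of alice's two orders
```

The lookup needs only the owner's key and the index name, which is why item 2 asks referenced_by_many to discover the inverse.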

5 - Implement/fix inverses

In some association types, inverses can be easily known or detected. In all types, the user should be able to specify what the name of the inverse is on the target class. The inverse can then be used to determine how to behave in various life-cycle events or to allow chaining to remotely-associated documents.
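A guessing strategy with an explicit override might look like the sketch below. The :inverse_of option name is borrowed from ActiveRecord and is an assumption here, as are the helper names and the naive pluralization.

```ruby
# Converts a class name such as "BlogPost" to "blog_post".
def underscore(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Returns the inverse association name on the target class, or nil if unknown.
# target_association_names lists the associations declared on the target class.
def inverse_name(owner_class_name, target_association_names, options = {})
  return options[:inverse_of] if options.key?(:inverse_of)

  singular = underscore(owner_class_name).to_sym
  plural   = :"#{singular}s" # naive pluralization, enough for a sketch
  ([singular, plural] & target_association_names).first
end

inverse_name("BlogPost", [:author, :blog_post])        # guessed from the class name
inverse_name("Post", [:writer], inverse_of: :writer)   # explicit option wins
inverse_name("Post", [:writer])                        # no guess possible, returns nil
```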