New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make repositories to work with data #291

Closed
jodosha opened this Issue Feb 2, 2016 · 10 comments

Comments

Projects
None yet
6 participants
@jodosha
Member

jodosha commented Feb 2, 2016

Actual

Right now we only accept entities as input for repositories default CRUD operations. This often leads to awkward and unnecessary objects allocations.

book = BookRepository.create(
  Book.new(params[:book])
)

That intermediate line is unnecessary. We should adopt the "data in, entities out" principle.

Proposal

Default CRUD operations should accept any object that respond to #to_hash.

book = BookRepository.create(params[:book])

To maintain backward compatibility we should implement Hanami::Entity#to_hash. In this way, the syntax above will still work. Please notice that Hanami::Entity already has #to_h, so it's just a matter of alias that method.

All the other objects will work just fine:

➜ pry
[1] pry(main)> require 'rubygems'
=> false
[2] pry(main)> require 'hanami/controller'
=> true
[3] pry(main)> Hanami::Action::Params.new({}).respond_to?(:to_hash)
=> true
[4] pry(main)> {}.respond_to?(:to_hash)
=> true

The decision to use #to_hash over #to_h is demanded by the dangerous implementation that some objects like Array have.

➜ pry
[1] pry(main)> [].to_h
=> {}
[2] pry(main)> [1].to_h
TypeError: wrong element type Fixnum at 0 (expected array)
from (pry):2:in `to_h'
[3] pry(main)> [['a', 1]].to_h
=> {"a"=>1}

There is the risk that a TypeError could be raised if an array is passed. Better to rely on #to_hash, and keep Array out because:

➜ pry
[1] pry(main)> [].respond_to?(:to_hash)
=> false

@jodosha jodosha added this to the v0.7.0 milestone Feb 2, 2016

@booch

This comment has been minimized.

booch commented Feb 2, 2016

I hadn't considered the "data in, entities out" idea, but it looks like a good idea. Given that we can still use an entity as input, I don't see the harm in the proposed change.

@joneslee85

This comment has been minimized.

Member

joneslee85 commented Feb 3, 2016

Seems a safe change to me.

What's data?

What type of hash should to_hash return? A string vs symbol key hash?

@Bounga

This comment has been minimized.

Contributor

Bounga commented Feb 3, 2016

It looks really good to me 👍

It's a huge step forward in term of flexibility.

@joneslee85 I'd like to be able to use both. I think it's simple to handle.

@jodosha

This comment has been minimized.

Member

jodosha commented Feb 11, 2016

@joneslee85

What type of hash should to_hash return? A string vs symbol key hash?

This is an excellent question that opens a new, huge topic. Please keep in mind that most of the times data means a Hash that is coming from a Rack env.

Here's the status of symbol vs string per MRI version.

MRI 2.0, 2.1

Symbol

  • Immutable: VM efficient
  • Non GC'd: security threats

String

  • Mutable: less efficient
  • GC'd: no security threats

Until we supported these versions, we had to offer "indifferent access". Internally, data was stored with string keys, because of security reasons.

MRI 2.2, 2.3

Symbol

  • Immutable: VM efficient
  • Non GC'd: security threats
  • GC'd: no security threats

String

  • Mutable: less efficient
  • GC'd: no security issues

We can safely go for "symbol only" approach, because it's efficient and because we define all the other attributes with symbols: param validations, entity attributes, database mapping, column names, etc..

This "symbol only" solution has a cost of deeply symbolize the Rack params for each request.

MRI 3.0

Symbol

  • Immutable: VM efficient
  • Non GC'd: security threats

String

  • Immutable (?): VM efficient
  • GC'd: no security issues

Probably this version will ship with immutable strings. That can allow us to use a "string only" approach and save us from the perf cost of deep symbolize incoming data. But it's too early to speculate about this change.


With this background in mind, we can:

Leave it as it is: "indifferent access"

PROs

  • It's well known pattern, and it feels "natural" for many developers
  • It's already implemented
  • It allows us to defer the choice, according to the future of MRI

CONs

  • We can't simplify our internals

Go for "Symbol only"

PROs

  • Consistent with other components: params definition, entity attributes, database mapping, etc..
  • "Future proof" (?) - Symbol shouldn't change in next major Ruby release
  • It allows to simplify our internals

CONs

  • Perf overhead due to deep symbolize

Go for "String only"

PROs

  • No perf overhead (keys are already strings)
  • It allows to simplify our internals

CONs

  • Inconsistent with other components: params definition, entity attributes, database mapping, etc..
  • We don't know if strings will be efficient as symbols in the future. It will depend on if Ruby 3.0 will include immutable strings.
@joneslee85

This comment has been minimized.

Member

joneslee85 commented Feb 12, 2016

@jodosha 👍 on detailed explanation. If I were to cast a vote, I would go for Symbol only for the sake of consistency with other components. It helps transition to String (if we decided to adopt it when Ruby 2.x are all dead).

@jodosha

This comment has been minimized.

Member

jodosha commented Feb 12, 2016

@joneslee85 Me too.

@AlfonsoUceda AlfonsoUceda self-assigned this Feb 13, 2016

@booch

This comment has been minimized.

booch commented Feb 19, 2016

Small correction to @jodosha's description of symbols versus strings: MRI 2.2 introduced garbage collection of symbols. See http://www.infoq.com/news/2014/12/ruby-2.2.0-released for more details.

@booch

This comment has been minimized.

booch commented Feb 19, 2016

I vote for symbol-only hash keys. That seems to be the preference in most situations in Ruby, especially given the Ruby 1.9+ JavaScript-style key: value syntax.

@jodosha

This comment has been minimized.

Member

jodosha commented Feb 19, 2016

@booch Thanks for let me know. It was a copy and paste mistake 😄 Thanks for expressing your opinion, we appreciate it 👍

@solnic

This comment has been minimized.

Member

solnic commented Apr 7, 2016

This will be supported in the next release of rom-repository. Here's a preview: https://gist.github.com/solnic/b369420c32b1432f6d239db470dbd2d3

Re symbol-vs-strings - of course symbols-only. Indifferent access is such a smell...it's not even funny. Input from http requests should be properly coerced, lower layers shouldn't even know that there's such a thing as params hash with string keys.

@jodosha jodosha assigned jodosha and unassigned AlfonsoUceda May 23, 2016

@jodosha jodosha closed this in #334 Nov 10, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment