Specs for association options (and bug fixes) and Model.find(multiple pks) #132

Closed
wants to merge 2 commits into
from

Conversation

Projects
None yet
2 participants
@jh125486

Sorry both these commits are in one pull request, it's my first time with git in a team environment.

I was converting over a Rails app to use rethinkdb and the belongs_to/has_many association options were not working as I expected. I think I fixed where the problems were and wrote tests to support.

After that I tried to pass an array to Model.find() and tracked down where that failed. I used the AR find method as a example for all the cases.

Also it's my first time using rspec, so if the syntax isn't right or something please send me on the right path.

@nviennot

This comment has been minimized.

Show comment
Hide comment
@nviennot

nviennot Mar 30, 2015

Owner

Hi Jacob :)

Congrats on your first pull request (PR)!!
So in terms of managing the PRs, I would have preferred to have two PR, one of the multi find, one for the custom primary key association bug fix. To do this, you need to use branches (you added your two commits on your master branch -- this is fine when you do a single PR).
The good thing is that as you add more commits to a branch, the PR gets automatically updated, and comments related to a piece of code get hidden.

Anyway, this PR will be just fine to comments on both commits.

Multi Find Feature

In terms of API, we do something like:

  • for single id queries: find(id)
  • for multi id queries: find([id1, id2, id3]).
    (I don't think we want to use find(id1, id2, id3) is because it's not clear whether you'd want to return an array or not. For example, a dev might use find(*ids), and always expect to see an array returned. Except that if ids has a single element, we wouldn't return the same thing and so things will start to break.)
  1. There's some feature where one can declare a primary key of type Array e.g.:
class User
  include NoBrainer::Document
  field :id, :primary_key => true, :type => Array
end

User.create(:id => [1,2])
User.find([1,2])

It's a little hard to know without looking at the type of the primary key (which we may know nothing about if the type is set to be Object), to know if we are operating in the single pk case, or multi pks case.

So I think that the multi find feature should go in a separate method, like find_all or find_many, which accepts a single argument, a list of ids, and returns a list of instances, in the same order as the ids.
User.find_all([1,2,3]) would get back[User(1), User(2), User(3)]`.

Does this sound good to you?
Also, what is the usecase of such find_all() API? do you have a real example, in which User.where(:id.in => [1,2,3]) is not sufficient.

btw, in your implementation, you should add to_a in the function find_some, otherwise find() returns some criteria (and so the logic to raise if a document is missing in find() becomes racy, docs.count is actually a DB operation, not an array).

Associations Bugs

Good catch!
I have to think about it more, so I'll come back to you on this.
I think there might be some other bugs (the default for the primary key name doesn't seem right either).

Regarding the rspec stuff:
I'm not an expert in rspec, but:

  • I find the .should notation easier to use. I would write expect(docs).to contain_exactly(doc, doc2, doc3) as docs.should == [doc, doc2, doc3]. I use =~ if the ordering does not matter.
  • I don't test if methods exist with respond_to when I'm going to test its value.
    However, I'm not religious about styles. I'm happy with what you did.
Owner

nviennot commented Mar 30, 2015

Hi Jacob :)

Congrats on your first pull request (PR)!!
So in terms of managing the PRs, I would have preferred to have two PR, one of the multi find, one for the custom primary key association bug fix. To do this, you need to use branches (you added your two commits on your master branch -- this is fine when you do a single PR).
The good thing is that as you add more commits to a branch, the PR gets automatically updated, and comments related to a piece of code get hidden.

Anyway, this PR will be just fine to comments on both commits.

Multi Find Feature

In terms of API, we do something like:

  • for single id queries: find(id)
  • for multi id queries: find([id1, id2, id3]).
    (I don't think we want to use find(id1, id2, id3) is because it's not clear whether you'd want to return an array or not. For example, a dev might use find(*ids), and always expect to see an array returned. Except that if ids has a single element, we wouldn't return the same thing and so things will start to break.)
  1. There's some feature where one can declare a primary key of type Array e.g.:
class User
  include NoBrainer::Document
  field :id, :primary_key => true, :type => Array
end

User.create(:id => [1,2])
User.find([1,2])

It's a little hard to know without looking at the type of the primary key (which we may know nothing about if the type is set to be Object), to know if we are operating in the single pk case, or multi pks case.

So I think that the multi find feature should go in a separate method, like find_all or find_many, which accepts a single argument, a list of ids, and returns a list of instances, in the same order as the ids.
User.find_all([1,2,3]) would get back[User(1), User(2), User(3)]`.

Does this sound good to you?
Also, what is the usecase of such find_all() API? do you have a real example, in which User.where(:id.in => [1,2,3]) is not sufficient.

btw, in your implementation, you should add to_a in the function find_some, otherwise find() returns some criteria (and so the logic to raise if a document is missing in find() becomes racy, docs.count is actually a DB operation, not an array).

Associations Bugs

Good catch!
I have to think about it more, so I'll come back to you on this.
I think there might be some other bugs (the default for the primary key name doesn't seem right either).

Regarding the rspec stuff:
I'm not an expert in rspec, but:

  • I find the .should notation easier to use. I would write expect(docs).to contain_exactly(doc, doc2, doc3) as docs.should == [doc, doc2, doc3]. I use =~ if the ordering does not matter.
  • I don't test if methods exist with respond_to when I'm going to test its value.
    However, I'm not religious about styles. I'm happy with what you did.
@jh125486

This comment has been minimized.

Show comment
Hide comment
@jh125486

jh125486 Apr 1, 2015

  1. I essentially have a list of items where a user would check off one or more of them, and they are passed to find() as an array of ids. I'm using a before_action on the controller to feed an ajax datatable from that criteria, and in keeping it DRY, I'm using the same find() in the index method to grab all the items for that user. I'm essentially just trying to mimic what AR does.
  2. I agree that if there is no reflection on the primary key object type, find([]) will get messy. Was having an array as a primary key a request by a user?
  3. I'll clean up the rspec stuff within the next few days.

All-in-all, what I am doing with rethinkdb is probably not the best use case, but I coming from clustering other DBs, it's seems pretty awesome. So I may not eventually use it (my coworker is testing out Cassandra, and I'll be testing Couchbase soon), but I'm happy to contribute.

jh125486 commented Apr 1, 2015

  1. I essentially have a list of items where a user would check off one or more of them, and they are passed to find() as an array of ids. I'm using a before_action on the controller to feed an ajax datatable from that criteria, and in keeping it DRY, I'm using the same find() in the index method to grab all the items for that user. I'm essentially just trying to mimic what AR does.
  2. I agree that if there is no reflection on the primary key object type, find([]) will get messy. Was having an array as a primary key a request by a user?
  3. I'll clean up the rspec stuff within the next few days.

All-in-all, what I am doing with rethinkdb is probably not the best use case, but I coming from clustering other DBs, it's seems pretty awesome. So I may not eventually use it (my coworker is testing out Cassandra, and I'll be testing Couchbase soon), but I'm happy to contribute.

@nviennot

This comment has been minimized.

Show comment
Hide comment
@nviennot

nviennot Apr 2, 2015

Owner

let me quickly reply:

  1. wouldn't Items.where(:ids.in => ids).each(&:check) to check them all be better? (you don't really care about the ordering, and you probably don't want to raise if one of the items wasn't found as find_all() would do). To grab all the items for that user, wouldn't an association or a named scope be more appropriate? e.g. on the User model with has_many :items or on the Item model with :scope :of_user, ->(user) { where(...) }.

I've been grepping on gitlabhq, diaspora, and discourse, 3 major rails apps. Turns out git grep '\.find\(' app/**/*.rb show only single id find. Why isn't this abstraction more used?

  1. it hasn't been requested, but having arrays as a primary key is currently supported. This may be useful when doing associations with a compound id. (e.g. you can imagine a Comment model that belongs to some Article model, and its primary key would be [article_id, some_comment_id]. A bit like you would model it on Cassandra.
  2. I will deal with the association bug fix commit this weekend

Regarding using rethinkdb, it's not an easy decision. But clustering postgresql properly is not really something you do easily. And rethinkdb has interesting features of its own.

Using nobrainer to give rethinkdb a feel of a relational DB can be seen as a bit weird.
and nobrainer's implementation of certain feature can be seen as a hack (e.g. implementation of a distributed lock through a secondary table to acquire locks when doing uniqueness validations).
But still, for usecases where using mongodb/mongoid is a valid option, the combo rethinkdb/nobrainer offers a superior solution IMHO.

Thank you for your contributions, very much appreciated!! 👍

Owner

nviennot commented Apr 2, 2015

let me quickly reply:

  1. wouldn't Items.where(:ids.in => ids).each(&:check) to check them all be better? (you don't really care about the ordering, and you probably don't want to raise if one of the items wasn't found as find_all() would do). To grab all the items for that user, wouldn't an association or a named scope be more appropriate? e.g. on the User model with has_many :items or on the Item model with :scope :of_user, ->(user) { where(...) }.

I've been grepping on gitlabhq, diaspora, and discourse, 3 major rails apps. Turns out git grep '\.find\(' app/**/*.rb show only single id find. Why isn't this abstraction more used?

  1. it hasn't been requested, but having arrays as a primary key is currently supported. This may be useful when doing associations with a compound id. (e.g. you can imagine a Comment model that belongs to some Article model, and its primary key would be [article_id, some_comment_id]. A bit like you would model it on Cassandra.
  2. I will deal with the association bug fix commit this weekend

Regarding using rethinkdb, it's not an easy decision. But clustering postgresql properly is not really something you do easily. And rethinkdb has interesting features of its own.

Using nobrainer to give rethinkdb a feel of a relational DB can be seen as a bit weird.
and nobrainer's implementation of certain feature can be seen as a hack (e.g. implementation of a distributed lock through a secondary table to acquire locks when doing uniqueness validations).
But still, for usecases where using mongodb/mongoid is a valid option, the combo rethinkdb/nobrainer offers a superior solution IMHO.

Thank you for your contributions, very much appreciated!! 👍

nviennot added a commit that referenced this pull request Apr 2, 2015

@nviennot

This comment has been minimized.

Show comment
Hide comment
@nviennot

nviennot Apr 2, 2015

Owner

I've pushed your associations fix on master, with an additional fix: the default primary_key for the has_many should not be the target's primary key, but the owner's primary key.

Owner

nviennot commented Apr 2, 2015

I've pushed your associations fix on master, with an additional fix: the default primary_key for the has_many should not be the target's primary key, but the owner's primary key.

nviennot added a commit that referenced this pull request Apr 2, 2015

@nviennot

This comment has been minimized.

Show comment
Hide comment
@nviennot

nviennot May 23, 2015

Owner

Hey @jh125486

There's a bit of a problem with the belongs_to association:
The foreign_key is composed of the association name + the primary key of the target class.
That mean that we have to lookup the target class at the time of the declaration of the belongs_to declaration. This is a problem if the target class source file is not yet declared. The order in which the models are getting loaded can be an issue, especially if there are some circular dependencies between models.

What's your take on the issue?

Owner

nviennot commented May 23, 2015

Hey @jh125486

There's a bit of a problem with the belongs_to association:
The foreign_key is composed of the association name + the primary key of the target class.
That mean that we have to lookup the target class at the time of the declaration of the belongs_to declaration. This is a problem if the target class source file is not yet declared. The order in which the models are getting loaded can be an issue, especially if there are some circular dependencies between models.

What's your take on the issue?

@jh125486

This comment has been minimized.

Show comment
Hide comment
@jh125486

jh125486 May 28, 2015

Looking through active record source, the rails core is creating a
singleton reflection instance which holds both sides of the relationship.
Instainiating one side would pull in the other side of the singleton.
Does that make sense?
I have been pulled away to work on another project unfortunately and
haven't been able to use rethinkdb very much.

V/r
Jacob

On Friday, May 22, 2015, Nicolas Viennot notifications@github.com wrote:

Hey @jh125486 https://github.com/jh125486

There's a bit of a problem with the belongs_to association:
The foreign_key is composed of the association name + the primary key of
the target class.
That mean that we have to lookup the target class at the time of the
declaration of the belongs_to declaration. This is a problem if the target
class source file is not yet declared. The order in which the models are
getting loaded can be an issue, especially if there are some circular
dependencies between models.

What's your take on the issue?


Reply to this email directly or view it on GitHub
#132 (comment).

Sent from my iPhone

Looking through active record source, the rails core is creating a
singleton reflection instance which holds both sides of the relationship.
Instainiating one side would pull in the other side of the singleton.
Does that make sense?
I have been pulled away to work on another project unfortunately and
haven't been able to use rethinkdb very much.

V/r
Jacob

On Friday, May 22, 2015, Nicolas Viennot notifications@github.com wrote:

Hey @jh125486 https://github.com/jh125486

There's a bit of a problem with the belongs_to association:
The foreign_key is composed of the association name + the primary key of
the target class.
That mean that we have to lookup the target class at the time of the
declaration of the belongs_to declaration. This is a problem if the target
class source file is not yet declared. The order in which the models are
getting loaded can be an issue, especially if there are some circular
dependencies between models.

What's your take on the issue?


Reply to this email directly or view it on GitHub
#132 (comment).

Sent from my iPhone

@nviennot

This comment has been minimized.

Show comment
Hide comment
@nviennot

nviennot May 28, 2015

Owner

With active record, you can declare a belongs_to :post association in a Comment model, without having the Post model ever defined. The code will function well, until you try to access the post variable of some comment. At this point, AR yells NameError: uninitialized constant Comment::Post.

With NoBrainer, we raise much sooner because we need to know the primary key name of the Post class to construct our foreign key. So that's why I think we might want to have the foreign keys to be suffixed with _id and not _#{target.primary_key} to avoid this problem.

Owner

nviennot commented May 28, 2015

With active record, you can declare a belongs_to :post association in a Comment model, without having the Post model ever defined. The code will function well, until you try to access the post variable of some comment. At this point, AR yells NameError: uninitialized constant Comment::Post.

With NoBrainer, we raise much sooner because we need to know the primary key name of the Post class to construct our foreign key. So that's why I think we might want to have the foreign keys to be suffixed with _id and not _#{target.primary_key} to avoid this problem.

@nviennot nviennot closed this in 10387d9 Jul 3, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment