Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Discussion surrounding stringified BSON::ObjectIds for associated data #2368

Closed
joshuaclayton opened this Issue · 14 comments

2 participants

@joshuaclayton

Hi all,

I've recently started working on a project that uses Mongoid and we're interacting with data that's also accessed by other projects (primarily written in Java and Python). After some discussion (before my involvement), it was determined that 'foreign keys' to associate this data would be used but that they'd be stored as strings instead of native ObjectIds. I fully understand the implications including storage space, lookup speeds, etc. However, this is the path that's been chosen and I have to continue down this path.

Currently, there's "all-or-nothing" settings for Mongoid to use strings vs ObjectIds for their data; however, I was looking to customize behavior for belongs_to and has_many relationships (and only certain ones at that, not all) so that only certain ones would use strings (and behave correctly from both directions. I have two questions: 1) is this possible currently with minimal monkey-patching? I've found that adding a field :association_id after the belongs_to works fine for that direction but the has_many still doesn't work (assuming it's looking at the document's _id, recognizing it's an ObjectId, and querying with that type instead of casting to a string) 2) if I create a pull request, would this be accepted (assuming there are tests, etc.)?

Again, I understand that this is not the ideal way to use mongo (or mongoid) but I've a problem to solve and I'd like to continue using mongoid in a non-hacky manner. Thanks for any insight!

UPDATE: Declaring field :association_id after a belongs_to does, in fact, cast properly when coming from the opposite direction (a has_many).

@durran
Owner

Mongoid should be casting to the appropriate type if you override either the association_id field on the relation, or on the actual _id field on the model in question. So for example...

This would properly cast the band_id field on Album to a String:

class Band
  include Mongoid::Document
  field :_id, type: String
  has_many :albums
end

class Album
  include Mongoid::Document
  belongs_to :band
end

This would be the preferred way, since there are places where Mongoid would query Band#using_object_ids? which would correctly return false in this case above, but in your scenario would actually return true since you only override the foreign key field on the other side of the relation.

Does this help clear this up?

@joshuaclayton

It does but it's not what I'm trying to do; I want all "PK"s to be ObjectIds but the associations stored as string versions of ObjectIds. Band and Album documents would look like this:

// band
{
  _id: ObjectId('5051d3faa0b7f949d9000001'),
  name: 'Rage Against the Machine'
}

// album
{
  _id: ObjectId('5051d431a0b7f949d9000002'),
  band_id: '5051d3faa0b7f949d9000001',
  title: 'Evil Empire',
  year_released: 1996
}

So, the documents themselves would still use ObjectId for _id but the associations would use strings. I hacked this together:

module Mongoid
  module Relations
    module Macros
      module ClassMethods
        alias_method :__belongs_to, :belongs_to

        def belongs_to(name, options = {}, &block)
          options = options.dup
          foreign_key_type = options.delete(:foreign_key_type)

          __belongs_to(name, options, &block).tap do |meta|
            if foreign_key_type
              field meta.foreign_key, type: foreign_key_type
            end
          end
        end
      end
    end
  end
end

The downfall is that it doesn't recognize the field as a ForeignKey field; it's just a Standard field. The big implications I noticed were the behavior of export changed, and there was some removal of an ivar conditionally on #foreign_key?, but both directions of the association seemed to work fine.

My concern is that I'm missing out on other behaviors due to my unfamiliarity with the codebase and that I'll be bit later on by the monkey patch. I've locked the gem currently to help prevent that, but again, I think I'd feel better if I knew the extent to which this affected things and could whip up a proper pull request instead.

If this is better off outside the codebase, I understand; most people wouldn't want to specify the manner in which data is associated so this is probably an edge-case, but let me know! Thanks @durran

@durran
Owner

Ok now I see - strange indeed. :) how about this... I will change one of the highly used models in the test suite to behave in this manner and see what breaks, if anything. If I can fix it without getting too hacky then I'll do just that. If not I'll take a deeper look at your proposal and see how we can make that work with some guarantees of no side effects outside of core. I'll be able to get into this on the weekend if that's ok with you.

@joshuaclayton

Sounds perfect, thank you sir!

@durran durran was assigned
@durran
Owner

@joshuaclayton Here's your definition to use and you are good to go:

field :association_id, type: String, identity: true, metadata: reflect_on_association(:association)
@durran durran closed this
@joshuaclayton

That's awesome, thanks @durran!

@joshuaclayton

@durran Just tried it to no avail; when I try and query from the model that has_many, neither querying with string nor ObjectId works. I'll try and fire up a repo that demonstrates the issue later today.

@durran
Owner

@joshuaclayton Ok sorry about that - I switched a model to use that code and everything worked in the suite after I changed the expectations to expect strings... But I think a concrete example will help. :)

@durran durran reopened this
@joshuaclayton

@durran I recreated the issue here: https://github.com/joshuaclayton/mongo-string-object-ids

Running rake will demonstrate the failure - in spec/models/post_spec.rb, I look at the user_id value as it's stored in the DB and assert that it's a string and not an ObjectId. Removing identity: true on the user_id field of Post seems to fix it, but only when running DROP_DB=true rake. I'm seeing some really weird shit where I think Mongo is handling attributes in an odd manner if they're set to ObjectIds.

In any case, this demonstrates the failure and what I'm trying to do. Let me know if you have any questions and I'll be happy to try and help!

@durran
Owner

@joshuaclayton Sweet thanks, I'll have a look tonight or in the morning.

@durran
Owner

@joshuaclayton Just a heads up I've got your repo cloned, just trying to find a good way to solve your problem without getting too hacky.

@joshuaclayton

@durran awesome; I do think that some issues are more related to problems in Mongo, so don't spend TOO much time on it. I'll rope in some of the Mongo guys next week to see if it's desired behavior, a known bug, and if there's anything I can do about it (re: documents with ObjectIds and string ids in the same collection being queryable).

@durran durran was assigned
@durran
Owner

After going through this for a while, it really seems from a Mongoid "out of the box" side there's not much I can do in this case without destroying our serialization code. The change on this side to handle this sort of case I think is just too great for something that is not common. I think in your case you're just going to have to do the relation by hand, like the following:

class Post
  include Mongoid::Document

  field :user_id, type: String

  def user
    @user ||= User.where(_id: Moped::BSON::ObjectId.from_string(user_id)).first
  end

  def user=(user)
    update_attribute(:user_id, user.id.to_s)
    @user = user
  end
end

class User
  include Mongoid::Document

  def posts
    @posts ||= Post.where(user_id: id.to_s)
  end
end
@durran durran closed this
@joshuaclayton

@durran awesome, thanks for looking into this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.