Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change Mongoid::Shardable::ClassMethods#shard_key params from array t… #4650

Closed
wants to merge 8 commits into from
Closed

Conversation

xuxiangyang
Copy link
Contributor

Change Mongoid::Shardable::ClassMethods#shard_key params from array to hash. So we could shard collection with rake

…o hash. So we could shard collection with rake
xuxiangyang added a commit to xuxiangyang/capistrano-mongoid that referenced this pull request Jul 23, 2019
@p-mongo
Copy link
Contributor

p-mongo commented Jul 24, 2019

Thank you for the patch. Can you please describe the problem that this PR is solving (i.e. what you attempted to do that was not possible with existing Mongoid functionality)?

@xuxiangyang
Copy link
Contributor Author

xuxiangyang commented Jul 28, 2019

I want to shard collection with rake,Here is an example:

First I would define two models like this

class ChatLog
  include Mongoid::Document
  include Mongoid::Timestamps
  
  shard_key created_at: "hashed" # just for example
  index created_at: "hashed"
  
  field :user_id, type: String # use uuid
  field :content, type: String
end

class UserOrder
  include Mongoid::Document
  include Mongoid::Timestamps
  
  shard_key user_id: 1 # just for example
  index user_id: 1
  
  field :user_id, type: String # use uuid
  field :product_id, type: String
end

Then I clould use rake to auto shard collection if the mongo server is running with sharding model.

rake db:mongoid:create_indexes # should create index before shard
rake db:mongoid:shard_collections

will result in

chat_logs will use created_at as hashed shard key.

user_orders will use user_id as ranged based shard.

Forthemore, it cloud be combined with capistrano, which realized with capistrano-mongoid gem, to auto work with deploy.

@p-mongo
Copy link
Contributor

p-mongo commented Jul 29, 2019

Why does this functionality need to be in Mongoid? It seems to me that it would be more appropriate for a library that exists on top of Mongoid.

@xuxiangyang
Copy link
Contributor Author

All database administration operations were implement in Mongoid, such as create_indexesremove_undefined_indexesremove_indexes‘、droppurge. I think sharding operation should be the same place as others.

And I think Mongoid maintain the database by declaration, It would be consistent to declare how to shard with this collection as to show the indexes

@@ -11,7 +11,7 @@ module Shardable

included do
cattr_accessor :shard_key_fields
self.shard_key_fields = []
self.shard_key_fields = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an API change. Why was it made?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if there is no one using shard_key, it may be better to change api. I think it is ugly to write like this

self.shard_key_fields = []
self.shard_key_hash = {}

and there will be two method to declare a shard_key

class User
  include Mongoid::Document
  include Mongoid::Timestamps
  
  shard_key :created_at
  shard_key_config created_at: "hashed"
end

shard_key_hash and shard_key_config may have a better name, but it will be confused to have two ways to declare shard_key

@p-mongo
Copy link
Contributor

p-mongo commented Jul 30, 2019

OK, since Mongoid does have support for defining shard keys in models it seems reasonable to also have rake tasks to assist in managing them. However, unlike indexes, it does not appear that shard key definition and usage is documented. Would you be able to create a page similar to https://docs.mongodb.com/mongoid/master/tutorials/mongoid-indexes/ (source: https://github.com/mongodb/mongoid/blob/master/docs/tutorials/mongoid-indexes.txt) that explains how to define shard keys in models and use the rake tasks proposed in this PR?

@xuxiangyang
Copy link
Contributor Author

xuxiangyang commented Aug 1, 2019

I am glad to create the doc page, but what's the grammar and how to preview the doc? I write the doc similar to index without preview...

@johnnyshields
Copy link
Contributor

@p-mongo
Copy link
Contributor

p-mongo commented Aug 6, 2019

@xuxiangyang The markup is restructured text (RST). Any tool converting RST to HTML should be usable for preview.

@johnnyshields What exactly is your proposal?

@xuxiangyang
Copy link
Contributor Author

May be @johnnyshields mean we should use shard_key like this

with default unique and options

shard_key user_id: 1

and we can change the unique and options like this

shard_key { user_id: 1 }, { unique: true, options: { numInitialChunks: 10 } }

@johnnyshields
Copy link
Contributor

My vote would be to merge unique in with the other options like this.

shard_key { user_id: 1 }, { unique: true, numInitialChunks: 10 }

But either way is fine.

@xuxiangyang
Copy link
Contributor Author

I enhance shard_key with @johnnyshields, how about it? @p-mongo

@p-mongo
Copy link
Contributor

p-mongo commented Aug 13, 2019

I like the new API but the existing API needs to still work. I also think that combining unique with the other options is preferable over copying the server's API in this case.

Thus, the changes needed to be made are:

  1. It should remain possible to use the old syntax:
    shard_key :first_name, :last_name

This should be converted by the implementation to the new syntax:

shard_key :first_name => 1, :last_name => 1

Both syntaxes need to be documented.

  1. Existing tests should remain unchanged, new tests need to be added for the new syntax.

  2. It would also be great to have some test coverage for shard_collections method, unless testing it is very difficult for some reason.

@xuxiangyang
Copy link
Contributor Author

To make all under syntaxes work

shard_key :first_name, :last_name
shard_key [:first_name, :last_name]
shard_key :first_name
shard_key first_name: 1, last_name: 1
shard_key{{ first_name: 1, last_name: 1}, unique: true) 

I have to write like this

def shard_key(name_or_key, *names, unique: false, options: {})
  key = if name_or_key.is_a?(Hash)
          name_or_key
        else
          Hash[([name_or_key] + names).flatten.map { |f| [f, 1] }]
        end
  key = Hash[key.map { |k, v| [self.database_field_name(k).to_sym, v] }]
  self.shard_key_fields = key.keys
  self.shard_config = {
    key: key,
    unique: unique,
    options: options,
  }
end

it's very ugly, any idea to make it better?

@johnnyshields
Copy link
Contributor

Guys, can we please decide on one syntax here? I think the only one which should be supported is:

shard_key{{ first_name: 1, last_name: 1}, { unique: true, option1: true, option2: true }) 

As it is very unlikely many people are using this, and moreover it will fail at class load time if the wrong/old syntax is used, we can simply force users to upgrade to the new syntax (or if we want to be super cautious we can put in a deprecation warning and remove in next major version).

@p-mongo
Copy link
Contributor

p-mongo commented Aug 15, 2019

Only two syntaxes need to be supported, the existing one:

shard_key :first_name, :last_name

... and the proposed new one:

shard_key{{ first_name: 1, last_name: 1}, options) 

Since the shardCollection command (https://docs.mongodb.com/manual/reference/command/shardCollection/) does not have any special handling of unique option (it's just a regular option there), Mongoid also does not need to have any special handling for unique.

Something like this should work (untested):

      def shard_key(*args, **options)
        if args.length == 1 && args.first.is_a?(Hash)
          # New syntax
          shard_key = args.first
        else
          # Old syntax, args must be an array of symbols
          shard_key = Hash[args.map { |sym| [sym, 1] }]
        end
        resolved_shard_key = Hash[shard_key.map do |k, v|
          [self.database_field_name(k).to_sym, v]
        end]
        self.shard_key_fields = resolved_shard_key.keys
        self.shard_config = {
          key: resolved_shard_key,
          options: options,
        }
      end

@xuxiangyang
Copy link
Contributor Author

I agree with @johnnyshields may we should only support new api , I think shard key must be decided very carefully, as shard key can not change after created, bad shard strategy may has poor performance, bad shard strategy may cause indivisible chunks.

Maybe shard need be declare much clear and cautious

@p-mongo
Copy link
Contributor

p-mongo commented Aug 16, 2019

The effort to support the old API is reasonable, therefore the old API needs to remain supported.

@xuxiangyang
Copy link
Contributor Author

as #4650 (comment) mentioned,this syntax not work. here is example

class Test
  def test(*args, **options)
   puts args.inspect
   puts options.inspect
  end
end

Test.new.test(user_id: 1) will make args=[] and options={user_id: 1}

@p-mongo
Copy link
Contributor

p-mongo commented Aug 19, 2019

If you like I can clean that up at the end. Can you add the tests for the new syntax, restore tests of the original syntax and add documentation of the original syntax?

@xuxiangyang
Copy link
Contributor Author

done with support original syntax and add documentation of the original syntax.

I need help to clean up the code ~ thx

@johnnyshields
Copy link
Contributor

I think the old syntax should give a deprecation warning. Should be removed in the next major version.

@p-mongo
Copy link
Contributor

p-mongo commented Feb 6, 2020

Merged as #4715.

@p-mongo p-mongo closed this Feb 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants