Different Partitioning Schemes in the same database #84

Closed
johnkattenhorn opened this issue Dec 23, 2015 · 14 comments

Comments

@johnkattenhorn

We are currently setting the number of partitions in a single DocumentDb database on a per-entity basis by overwriting the PartitionResolver dictionary entry for each entity.

Although we've proved that we have different instances of the DocumentDbClient, and therefore different PartitionResolver dictionaries, it doesn't seem to work if the collection counts are different.

We are wondering whether variable partition counts in a single database are supported. Since the DocumentDbClient dictionary of partition resolvers is keyed by database, should we be separating our entities with different partition requirements into separate databases?

@arramac
Contributor

arramac commented Dec 23, 2015

Can you please provide additional details or a repro on what doesn't work?

Do I understand correctly that you have, say, collections 1, 2 and 3 storing type 1 and collections 4, 5 and 6 storing type 2, and then register/deregister partition resolvers based on context?

Separating into databases is probably a good idea, though I would like to understand this better.

You can email me at arramac@microsoft.com in case you want to discuss over email/phone.

@pavelbaykov89

Some update about this issue:
After decompiling the code of HashPartitionResolver and ConsistentHashRing I found that ConsistentHashRing has this line:
byte[] hash = hashGenerator.ComputeHash(BitConverter.GetBytes(node.GetHashCode()));
but in another part of the code it has this line:
byte[] hash = hashGenerator.ComputeHash(Encoding.UTF8.GetBytes(node));
so when I change the first one to
byte[] hash = hashGenerator.ComputeHash(Encoding.UTF8.GetBytes(node));
everything works fine. I think this is because of hardware dependent BitConverter or GetHashCode method.
Is the source code available somewhere so that we can fork and fix it?
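
To illustrate the difference between the two lines: string.GetHashCode() is not guaranteed to return the same value across .NET versions, 32-bit vs 64-bit processes, or app domains, whereas hashing the UTF-8 bytes of the string depends only on its characters. Below is a minimal sketch, assuming MD5 as the hash generator (the thread does not say which algorithm hashGenerator uses). StableHash and Compute are hypothetical names, not part of the SDK.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of the DocumentDB SDK.
public static class StableHash
{
    // Hashes the UTF-8 bytes of the string itself, so the result depends
    // only on the characters and not on string.GetHashCode().
    public static uint Compute(string value)
    {
        using (MD5 md5 = MD5.Create()) // MD5 is an assumption for this sketch
        {
            byte[] digest = md5.ComputeHash(Encoding.UTF8.GetBytes(value));
            // Assemble the first four bytes explicitly so the value does not
            // depend on the machine's byte order (unlike BitConverter).
            return (uint)digest[0]
                 | (uint)digest[1] << 8
                 | (uint)digest[2] << 16
                 | (uint)digest[3] << 24;
        }
    }
}
```

Any two processes that call StableHash.Compute with the same string get the same value, which is the property a ring shared between several client instances needs.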

@rnagpal
Contributor

rnagpal commented Dec 28, 2015

@johnkattenhorn: Let me know if I got your question correctly.

I'm assuming that you are instantiating an object of HashPartitionResolver class for your partition resolver on the documentdb database.

Since you mentioned that you have different instances of the DocumentClient, you will have different instances of the partition resolver, and if you have registered those resolvers with different collection sets, they may not resolve a document to the same collection. The current .NET implementation of hash-based partitioning registers the partition resolver per database with a fixed set of collections. If the collections increase and you re-register the partition resolver with the updated collection set, the distribution of the documents will follow a different hash scheme. This means that if you inserted a document using the former partition resolver scheme and it went to collection1, then querying for the same document using the new partition resolver may resolve to a different collection, which might not have the document you are looking for.

If you want to support a variable number of partitions, you will have to implement your own partition resolver (by implementing the IPartitionResolver interface) that takes care of your requirement. You may be able to add more collections and re-register the partition resolver and still keep the old behavior (documents inserted using the former partition scheme resolve correctly under the new scheme, while new documents can resolve to the new collection you added).

We need to understand your use case better to suggest what will work for you. If you are already working with arramac, I'll sync up with him for further details. You shouldn't need to split your data into a different database, and with a custom partition resolver I hope you will be able to resolve your case.
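
The remapping described above can be sketched without the SDK. This is a hedged illustration, not the HashPartitionResolver implementation: HashRing, Resolve, CountMoved, the virtual-node count and the use of MD5 are all assumptions made for the example. It places each collection link at several points on a ring and resolves a key to the first point at or after the key's hash. In a real resolver this lookup would sit behind the IPartitionResolver interface.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// Hypothetical consistent-hash ring; not the SDK's ConsistentHashRing.
public sealed class HashRing
{
    private readonly List<KeyValuePair<uint, string>> points =
        new List<KeyValuePair<uint, string>>();

    public HashRing(IEnumerable<string> collectionLinks, int virtualNodes = 16)
    {
        foreach (string link in collectionLinks)
        {
            for (int i = 0; i < virtualNodes; i++)
            {
                points.Add(new KeyValuePair<uint, string>(Hash(link + "#" + i), link));
            }
        }
        if (points.Count == 0)
        {
            throw new ArgumentException("At least one collection is required.");
        }
        // Sort by ring position; break ties by link so the order is deterministic.
        points.Sort((a, b) => a.Key != b.Key
            ? a.Key.CompareTo(b.Key)
            : string.CompareOrdinal(a.Value, b.Value));
    }

    // Returns the collection that owns the first ring point at or after the key's hash.
    public string Resolve(string partitionKey)
    {
        uint h = Hash(partitionKey);
        foreach (KeyValuePair<uint, string> point in points)
        {
            if (point.Key >= h) return point.Value;
        }
        return points[0].Value; // wrap around to the start of the ring
    }

    // Counts the keys that resolve to a different collection under the second ring.
    public static int CountMoved(HashRing before, HashRing after, IEnumerable<string> keys)
    {
        int moved = 0;
        foreach (string key in keys)
        {
            if (before.Resolve(key) != after.Resolve(key)) moved++;
        }
        return moved;
    }

    // Hashes the UTF-8 bytes of the string (MD5 is an assumption), byte-order independent.
    private static uint Hash(string value)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] d = md5.ComputeHash(Encoding.UTF8.GetBytes(value));
            return (uint)d[0] | (uint)d[1] << 8 | (uint)d[2] << 16 | (uint)d[3] << 24;
        }
    }
}
```

For example, calling HashRing.CountMoved with a ring built from "coll1", "coll2", "coll3" and a second ring that also includes "coll4" reports how many existing keys a reader using the four-collection ring would look for in a different collection. In this sketch a key that moves only ever moves to the newly added collection, so the documents that would need migrating are limited to those keys.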

Hope this helps.

Regards,
Rajesh

@rnagpal
Contributor

rnagpal commented Dec 28, 2015

@pavelbaykov89: As you guessed correctly, the following line of code is hardware/architecture dependent:
byte[] hash = hashGenerator.ComputeHash(BitConverter.GetBytes(node.GetHashCode()));

so it results in different partitioning behavior if the DocumentDb client is running on a machine with a different hardware/architecture. We learned about this issue quite late and unfortunately cannot change it now, as that would be a breaking change for anyone upgrading to the new version of the client: their partitioning scheme would start behaving differently. So while we cannot fix this in the default implementation, you should be able to work around the limitation by implementing your own partition resolver (by implementing the IPartitionResolver interface).

Please let us know if this is something you can consider to resolve your issue, or if you need help setting up this custom partition resolver.

Regarding the code, unfortunately the .NET SDK for DocumentDB is not open source yet, so you cannot access the code and modify it.

I have a question for you: you mentioned that after changing the code "everything" worked fine. Did you mean the issue that johnkattenhorn was referring to? Does that get fixed with this change? I'd be surprised if it does and would like to look into it in more detail.

Regards,
Rajesh

@johnkattenhorn
Author

@rnagpal: I can confirm that this was our problem and @pavelbaykov89's fix has resolved this.

Whilst I understand the reason for not fixing this due to a breaking change (and there will be a lot of partitions out there now), we found that partitioning was broken unless we addressed this issue with our fix inside a custom provider.

This is worrying because maybe there's some timeline / release of the DocumentDbClient that we happen to get caught up in which means there are going to be potentially more people in the same situation as us.

@rnagpal
Contributor

rnagpal commented Dec 29, 2015

In that case, the underlying issue is that the hashes are not consistent across machines with different architectures (due to the GetHashCode method). The fix you made will make it work, but as I mentioned, we cannot change this now (to avoid a breaking change), so we will have this limitation: client-side partitioning will not work if the DocumentClient instances are running on machines with different architectures.

Aravind may contact you offline to give more info on our alternate plan to address this issue in future for partitioning scenarios.

Regards,
Rajesh

@johnkattenhorn
Author

I still don't get it though: all of the components of our application are running in Azure, so surely there can't be different architectures in play between worker roles?

@pavelbaykov89 will correct me if I'm wrong, but I think even the examples do not work unless we use our custom resolver. Is this correct?

Thanks

John

@ghost

ghost commented Jan 20, 2016

@pavelbaykov89 do you have an update on this? I just ran the published samples and they do work from a single machine. I have not tried deploying this to a multi-machine environment and running the samples there.

@johnkattenhorn
Author

@ryancrawcour - @pavelbaykov89 and I needed to move on so we used a custom resolver with a consistent way of computing the hash.

I'm still not convinced the issue was understood fully by any of us but luckily we caught this problem before we went live as we had to delete all of our data and start over.

@ghost ghost closed this as completed Jan 22, 2016
@czielin

czielin commented Feb 16, 2016

@johnkattenhorn @pavelbaykov89 would you mind sharing the custom resolver you created to handle this? I believe I am running into the same issue. The same partition key seems to be resolving to different partitions even when running on the same local machine.

@ghost

ghost commented Feb 16, 2016

@czielin what version of the SDK are you using?

@czielin

czielin commented Feb 16, 2016

Microsoft.Azure.DocumentDB 1.5.2

Is this something that should have been fixed?

@ghost

ghost commented Feb 16, 2016

Yes, this shouldn't be happening anymore. Do you want to email me (email on profile) so we can dig into this with you?

@czielin

czielin commented Feb 16, 2016

Please disregard for now. I may have been too quick to attribute it to this issue. I will reach out again if I can come up with more solid evidence. Thanks for the quick reply.

This issue was closed.