Skip to content

Conversation

@4word
Copy link
Contributor

@4word 4word commented May 3, 2016

Requiring an external-to-the-db id generator is not an option for some use cases. This new field and related migration utilities allow for shard-safe (up to 8191 shards) inserts with guaranteed unique IDs (up to 1024 inserts per millisecond on an individual shard). The ID will return like a standard Django auto field ID. In the future, a given ID could be deconstructed to get the time it was inserted and the shard id (useful for the router to potentially know which shard to grab something from without needing help from the using() statement).

Titus Peterson added 15 commits May 1, 2016 10:09
… django updates to the base method rather than returning on their own
… argument that stores the field as an attribute and sets the default manager for the model to ShardManager
…del to get the correct shard based on the arguments passed to filter,create,get,etc. The default behavior is unchanged so existing models do not need any changes
…does not know what shard it should be on but there in a sharded by field defined
…ty on each primary shard. Added and updated unit tests to check for valid shard ids.
…-safe IDs. Added utils for migrations. Updated model_config decorator to enforce an ID strategy if the postgres shard id field does not exist on the model
@4word
Copy link
Contributor Author

4word commented May 3, 2016

Note that much of this code was ported over from a Django 1.4 and python 2.7 deployment and I am not exactly expecting unit tests to pass the first time (although I updated as much as I could think of and find). More commits will come to fix any failures.

@4word
Copy link
Contributor Author

4word commented May 3, 2016

This is ready for more scrutinizing review. Docs need to be updated but functionality is available and tested for both database and django level usage.

Titus Peterson added 5 commits May 3, 2016 00:59
… the base model does is set the manager. Unfortunately, this cannot be done in the decorator for some reason (I cannot figure out why, it simply doesnt work correctly)
…re instantiating the object with them, just in case the parent class messes with the kwargs later
if not db_alias:
raise EnvironmentError("A pre-migration receiver did not receive a database alias. "
"Perhaps your app is not registered correctly?")
if settings.DATABASES[db_alias]['ENGINE'] == Backends.POSTGRES:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this only supports postgres, somewhere (maybe here?) we should raise an exception if you try to use it and you're not using postgres.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. The problem is that the library allows arbitrary backends within the same shard group. Should I just loop through the entire shard group and make sure all the backends are postgres?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, you can do it either for all sharded databases or make it a shard
group level setting?

On Tue, May 3, 2016, 8:59 PM Titus Peterson notifications@github.com
wrote:

In django_sharding_library/fields.py
#26 (comment):

  •    if not hasattr(settings, 'SHARD_EPOCH'):
    
  •        raise ValueError("PostgresShardGeneratedIDField requires a SHARD_EPOCH to be defined in your settings file.")
    
  •    if connection.vendor == PostgresDatabaseWrapper.vendor:
    
  •        return "bigint DEFAULT next_sharded_id()"
    
  •    else:
    
  •        return super(PostgresShardGeneratedIDField, self).db_type(connection)
    
  • @staticmethod
  • def migration_receiver(_args, *_kwargs):
  •    sequence_name = "global_id_sequence"
    
  •    db_alias = kwargs.get('using')
    
  •    if not db_alias:
    
  •        raise EnvironmentError("A pre-migration receiver did not receive a database alias. "
    
  •                               "Perhaps your app is not registered correctly?")
    
  •    if settings.DATABASES[db_alias]['ENGINE'] == Backends.POSTGRES:
    

Hmm. The problem is that the library allows arbitrary backends within the
same shard group. Should I just loop through the entire shard group and
make sure all the backends are postgres?


You are receiving this because you commented.

Reply to this email directly or view it on GitHub
https://github.com/JBKahn/django-sharding/pull/26/files/e534b5ffa93bd79fab7bcb8f90fb315a344f3661#r61980198

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to raise an exception

Titus Peterson added 2 commits May 3, 2016 19:00
…ferences in code to check "in" rather than equality
…gres field without postgres being the engine. Updated test model and fields to reflect the change
Titus Peterson added 6 commits June 6, 2016 08:21
- New Field for Postgres ID, along with migrations
- Updated decorators.py to account for new postgres field
- Updated settings helpers and constants with new utils, and fixed
  referencs to the same
- Check that field works as intended
- Ensure stored procedure and sequence are created
- Added migration to support field tests
@4word 4word force-pushed the postgres-auto-shard-id branch from 838eb4e to 10ed281 Compare June 6, 2016 17:12
@4word
Copy link
Contributor Author

4word commented Jun 6, 2016

I rebased my commits to squash down the sheer volume of them that there were. I updated the doc to mention the shard limit.

This is ready to merge in, I am planning on doing the management commands in a new PR. I would rather wait for you to merge this in until after you release the manager and router changes on pypi, if you are planning on doing that soon, since its useful for most people. If not, then just merge it whenever and I will plan on working on the management commands later this week.

@coveralls
Copy link

coveralls commented Jun 6, 2016

Coverage Status

Coverage decreased (-0.7%) to 93.174% when pulling 10ed281 on 4word:postgres-auto-shard-id into e92cff9 on JBKahn:master.

In the above example, it takes an optional database. However you will find that you can choose to provide additional arguments later on when you make use of the generator. The only real requirements is that it be a class with a `get_next_id` function.

The two included in the package are:

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it say

The three included in the package are

Also, I wonder if one could clarify what is enumerated here, since all these are mentioned above: strategies, additional arguments, or requirements. (<-- this is me being very pedantic)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, it should say three. I will update.

What do you mean clarify what is enumerated? There are no additional requirements other than what is listed, which basically includes Postgres and defining a shard epoch time. All of the exact requirements are in the Using section for this feature, which is in the ShardingAModel.md file. If thats not what you mean, let me know!

Titus Peterson added 4 commits July 14, 2016 11:11
@coveralls
Copy link

coveralls commented Jul 14, 2016

Coverage Status

Coverage decreased (-1.9%) to 92.027% when pulling c76760d on 4word:postgres-auto-shard-id into e92cff9 on JBKahn:master.

@coveralls
Copy link

coveralls commented Jul 15, 2016

Coverage Status

Coverage decreased (-2.03%) to 91.887% when pulling 43b465f on 4word:postgres-auto-shard-id into e92cff9 on JBKahn:master.

@JBKahn
Copy link
Owner

JBKahn commented Aug 17, 2016

I'm going to try looking at this tonight. I'll probably take a stab at writing the management commands I'd like to include and then we can see about releasing this ASAP since I've been really lazy about this.

@JBKahn
Copy link
Owner

JBKahn commented Aug 17, 2016

@4word if I write some commands, and this is reviewed, are any pieces missing or is this good to go (both from the point of view of the pull request and for a new release) from your point of view?

@4word
Copy link
Contributor Author

4word commented Aug 17, 2016

All the pieces are there, should be good to be merged in.

@coveralls
Copy link

Coverage Status

Coverage decreased (-2.03%) to 91.887% when pulling e5b5101 on 4word:postgres-auto-shard-id into e92cff9 on JBKahn:master.

1 similar comment
@coveralls
Copy link

coveralls commented Aug 17, 2016

Coverage Status

Coverage decreased (-2.03%) to 91.887% when pulling e5b5101 on 4word:postgres-auto-shard-id into e92cff9 on JBKahn:master.

# If the database needs similar types for key fields however, the only
# thing we can do is making AutoField an IntegerField.
rel_field = self.target_field
if rel_field.get_internal_type() is "BigIntegerField":
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to put is and not == for the string comparison?

@JBKahn
Copy link
Owner

JBKahn commented Aug 18, 2016

@4word those are my last two questions, then I should be good to merge this in and write the commands.

@JBKahn
Copy link
Owner

JBKahn commented Aug 20, 2016

I'm going to merge and make the change and let me know if I should revert it. That way I can continue working on it before you get back to me.

@JBKahn JBKahn merged commit feb6693 into JBKahn:master Aug 20, 2016
@4word 4word deleted the postgres-auto-shard-id branch September 12, 2016 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants