
PartitionedModel.db returns different value than db_for_read_write #21393

Closed
millerdev opened this issue Jul 30, 2018 · 13 comments
@millerdev (Contributor)

@snopoke I have a question about partitioned db routing. It looks like PartitionedModel.db,

```python
@property
def db(self):
    """The partitioned database for this object"""
    assert self.partition_value, 'Partitioned model must have a partition value'
    return RequireDBManager.get_db(self.partition_value)
```

which calls get_db_alias_for_partitioned_doc,

```python
def get_db_alias_for_partitioned_doc(partition_value):
    if settings.USE_PARTITIONED_DATABASE:
        from corehq.form_processor.backends.sql.dbaccessors import ShardAccessor
        db_name = ShardAccessor.get_database_for_doc(partition_value)
    else:
        db_name = 'default'
    return db_name
```

does not use the same codepath as db_for_read_write,

```python
def db_for_read_write(model, write=True):
    """
    :param model: Django model being queried
    :param write: Default to True since the DB for writes can also handle reads
    :return: Django DB alias to use for query
    """
    app_label = model._meta.app_label
    if app_label == WAREHOUSE_APP:
        return settings.WAREHOUSE_DATABASE_ALIAS
    elif app_label == SYNCLOGS_APP:
        return settings.SYNCLOGS_SQL_DB_ALIAS
    if not settings.USE_PARTITIONED_DATABASE:
        return 'default'
    if app_label == FORM_PROCESSOR_APP:
        return partition_config.get_proxy_db()
    elif app_label in (ICDS_MODEL, ICDS_REPORTS_APP):
        engine_id = ICDS_UCR_ENGINE_ID
        if not write:
            engine_id = connection_manager.get_load_balanced_read_db_alais(ICDS_UCR_ENGINE_ID)
        return connection_manager.get_django_db_alias(engine_id)
    else:
        default_db = partition_config.get_main_db()
        if not write:
            return connection_manager.get_load_balanced_read_db_alais(app_label, default_db)
        return default_db
```

and in some cases will return a different value. Specifically, it may return `'default'` when `db_for_read_write` would return something else (when `USE_PARTITIONED_DATABASE == False`). This is just one example of a few functions in `corehq.sql_db.util` that behave this way. Is that a feature or a bug?

Would it be acceptable to have get_db_alias_for_partitioned_doc (as well as other similar functions in that module that have hard-coded 'default') call db_for_read_write instead of returning 'default' when USE_PARTITIONED_DATABASE == False?
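To make the divergence concrete, here is a minimal, self-contained sketch (not the real corehq code; settings and app labels are simplified stand-ins) showing how the two routing functions can disagree for a non-partitioned app when `USE_PARTITIONED_DATABASE` is `False`:

```python
# Simplified stand-ins for the settings involved (illustrative only)
USE_PARTITIONED_DATABASE = False
SYNCLOGS_SQL_DB_ALIAS = 'synclogs'
SYNCLOGS_APP = 'phone'


def get_db_alias_for_partitioned_doc(partition_value):
    # Hard-codes 'default' in non-partitioned environments,
    # like the real function quoted above
    if USE_PARTITIONED_DATABASE:
        raise NotImplementedError('shard lookup omitted in this sketch')
    return 'default'


def db_for_read_write(app_label):
    # Consults app-specific routing *before* the partitioning check,
    # so synclogs never falls through to 'default'
    if app_label == SYNCLOGS_APP:
        return SYNCLOGS_SQL_DB_ALIAS
    if not USE_PARTITIONED_DATABASE:
        return 'default'
    raise NotImplementedError('partitioned routing omitted in this sketch')


# For a synclog-like model the two functions disagree:
print(get_db_alias_for_partitioned_doc('some-doc-id'))  # default
print(db_for_read_write(SYNCLOGS_APP))                  # synclogs
```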

@snopoke (Contributor) commented Jul 31, 2018

It's not a bug though I agree it is confusing. The reason it's not a bug is that get_db_alias_for_partitioned_doc is only intended for use when we are doing direct queries with partitioned models while db_for_read_write will direct partitioned model queries to the proxy DB for routing.

Note that if `USE_PARTITIONED_DATABASE == False` then `db_for_read_write` will also return `'default'` (though the logic there is rather suspect).

I think the only reason to call db_for_read_write from get_db_alias_for_partitioned_doc would be if we expect the partitioned tables to be in a database other than default (which we don't).

Happy to have this logic cleaned up, though, if you can see a good way to do that.

@millerdev (Contributor, Author) commented Jul 31, 2018

> get_db_alias_for_partitioned_doc is only intended for use when we are doing direct queries with partitioned models while db_for_read_write will direct partitioned model queries to the proxy DB for routing.

This seems like it could become a source of hard-to-find bugs, since the db router (`MultiDBRouter`) uses `db_for_read_write` for some queries while other places use `get_db_alias_for_partitioned_doc` and friends. It has been hard for me to understand which is used where and how that impacts model loading and saving.

I suppose we're not using PartitionedModel or get_db_alias_for_partitioned_doc, etc. on warehouse or synclog models? Those would return default when db_for_read_write would return some other db name unrelated to partitioning. Does that sound right to you?

> I think the only reason to call db_for_read_write from get_db_alias_for_partitioned_doc would be if we expect the partitioned tables to be in a database other than default (which we don't).

This is what I'm working on. I'm looking into using a separate database connection for saving blob metadata so it is not lost if an exception triggers a transaction rollback on the default connection.

I think transaction rollback does not happen on partitioned dbs since those connections are in autocommit mode (maybe that's only the proxy?), but it can happen when running in non-sharded environments. Maybe I should just forget this idea since it adds a fair bit of complexity, and we probably don't care about lost blobs in non-sharded environments since they are usually small scale. Edit: I really do want to know if we're accumulating orphaned blobs, and I'm not sure how else to do that.

@snopoke (Contributor) commented Jul 31, 2018

> I suppose we're not using PartitionedModel or get_db_alias_for_partitioned_doc, etc. on warehouse or synclog models? Those would return default when db_for_read_write would return some other db name unrelated to partitioning. Does that sound right to you?

Yes. If you called get_db_alias_for_partitioned_doc with the ID of a synclog you would certainly get the wrong database.

I'm a little confused about what you were saying about using separate connections for the blob metadata. Are you saying you want to store them in a new database? Or do you just want an isolated connection pool?

I thought they would be stored in the shard databases.

@millerdev (Contributor, Author)

I just want an isolated connection pool. They will be stored in the shard databases.

@millerdev (Contributor, Author)

The way I've found to add an isolated connection pool is to add a `blobdb` database to `settings.DATABASES` with the same connection info as the `default` database in non-partitioned environments. In partitioned environments it just uses the shard db connections (although I still need to research whether those are in AUTOCOMMIT mode or use per-request transactions).
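A hypothetical sketch of that settings change (the alias name `blobdb` and the database name are illustrative, not what corehq actually ships): a second alias pointing at the same server as `default`, giving Django a separate connection whose commits are independent of the default connection's transaction.

```python
# Hypothetical Django settings fragment: 'blobdb' reuses the same
# underlying database as 'default' but gets its own connection, so a
# rollback on 'default' does not undo writes made through 'blobdb'.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'commcarehq',  # illustrative name
        # HOST/USER/PASSWORD omitted
    },
    'blobdb': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'commcarehq',  # same database, separate connection pool
    },
}

# Metadata saved via the isolated alias would then survive a rollback on
# 'default', e.g.: BlobMeta.objects.using('blobdb').create(...)
```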

@snopoke (Contributor) commented Jul 31, 2018

Pretty sure none of the connections are in autocommit mode.

I don't think it's worth the effort to go down this route, though. Why do you think it's necessary in the first place? Is it because you don't want to end up with data in Riak that's not represented in SQL?

@millerdev (Contributor, Author)

> Is it because you don't want to end up with data in riak that's not represented in SQL?

Yes. Anywhere a blobdb.put(...) operation occurs inside a transaction.atomic() block where the transaction gets rolled back will result in an orphaned blob in the blob db.
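A toy model of that failure mode (a fake blob db and an in-memory stand-in for the SQL transaction; all names here are illustrative, not the real corehq API): the external blob write commits immediately, so when the surrounding transaction rolls back, the metadata row vanishes but the blob stays behind.

```python
# Fake external blob store: writes here are NOT transactional
class FakeBlobDB:
    def __init__(self):
        self.blobs = {}

    def put(self, key, content):
        self.blobs[key] = content


sql_rows = []          # stand-in for rows in the SQL database
blobdb = FakeBlobDB()


def save_with_rollback():
    # Stand-in for transaction.atomic(): an error undoes the SQL writes,
    # but the blob db write has already happened and is not undone.
    snapshot = list(sql_rows)
    try:
        blobdb.put('blob-1', b'content')    # external system, commits now
        sql_rows.append({'key': 'blob-1'})  # metadata row
        raise RuntimeError('something failed later in the transaction')
    except RuntimeError:
        sql_rows[:] = snapshot  # "rollback": metadata row is gone


save_with_rollback()
print('blob-1' in blobdb.blobs)                      # blob still exists
print(any(r['key'] == 'blob-1' for r in sql_rows))   # metadata is gone -> orphan
```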

@millerdev (Contributor, Author)

> Pretty sure none of the connections are in autocommit mode.

Interesting. Django's transaction documentation says "Django’s default behavior is to run in autocommit mode." As far as I can tell we are not setting AUTOCOMMIT or ATOMIC_REQUESTS connection parameters anywhere in our codebase, so I would assume we're using the default. Are we wrapping each request in a transaction somewhere?
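The distinction in question can be illustrated with the stdlib `sqlite3` driver, which exposes the same two modes Django's docs describe: with `isolation_level=None` every statement commits immediately (true autocommit), while the default legacy mode implicitly opens a transaction before DML statements. This is only an analogy for the Django behavior, not corehq code.

```python
import sqlite3

# Autocommit: each statement commits immediately, no transaction is open
auto = sqlite3.connect(':memory:', isolation_level=None)
auto.execute('CREATE TABLE t (x)')
auto.execute('INSERT INTO t VALUES (1)')
print(auto.in_transaction)  # False: nothing left open

# Legacy/implicit mode: the INSERT opens a transaction that stays open
# until commit() or rollback() is called
manual = sqlite3.connect(':memory:')
manual.execute('CREATE TABLE t (x)')
manual.execute('INSERT INTO t VALUES (1)')
print(manual.in_transaction)  # True: an uncommitted transaction is open
```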

@snopoke (Contributor) commented Jul 31, 2018 via email

@millerdev (Contributor, Author)

> What if you did something like what we do during form submission, where you create the metadata first and then delete it if there is an error (though in form processing we do the reverse - we delete if there was no error)?

I think the critical difference there is SubmissionProcessTracker deletes if there was no error, where I want to do the opposite, but I don't have control of the transaction context inside blobdb.put(...). In other words, I want to keep metadata even if there is an error, but it's automatically deleted on error (transaction rollback), and I can't control that.

Maybe you meant to suggest that we delete the blob if there is an error? This would be great, but sounds hard. I'd need to somehow detect when a transaction rollback occurs involving blob metadata. Django has an on_commit() hook, but not an on_rollback() hook.
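One conceivable workaround, sketched here with illustrative names (there is no real `on_rollback()` hook, and `FakeBlobDB` is a stand-in, not the corehq blob db API): track the blob keys written inside the block and delete them yourself if the block raises, i.e. compensate for the rollback manually.

```python
# Fake blob store standing in for the real blob db (illustrative only)
class FakeBlobDB:
    def __init__(self):
        self.blobs = {}

    def put(self, key, content):
        self.blobs[key] = content

    def delete(self, key):
        self.blobs.pop(key, None)


blobdb = FakeBlobDB()


def save_form(fail):
    written = []  # keys written during this "transaction"
    try:
        # ... inside transaction.atomic() in the real code ...
        blobdb.put('att-1', b'image bytes')
        written.append('att-1')
        if fail:
            raise RuntimeError('form processing failed')
    except Exception:
        # Rollback path: compensate by deleting the blobs we just wrote,
        # since the metadata rows referencing them are about to vanish
        for key in written:
            blobdb.delete(key)
        raise


try:
    save_form(fail=True)
except RuntimeError:
    pass
print(blobdb.blobs)  # empty: no orphaned blob left behind
```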

Maybe we should take this discussion to voice? I think for now I'm just going to punt and forget about it. Hopefully we don't orphan many blobs.

@snopoke (Contributor) commented Aug 1, 2018 via email

@millerdev (Contributor, Author)

@snopoke do you know if `xform.unsaved_attachments` are always saved eventually?

```python
xform_attachment.write_content(attachment.content)
if xform_attachment.is_image:
    try:
        img_size = Image.open(attachment.content_as_file()).size
        xform_attachment.properties = dict(width=img_size[0], height=img_size[1])
    except IOError:
        xform_attachment.content_type = 'application/octet-stream'
xform_attachments.append(xform_attachment)
xform.unsaved_attachments = xform_attachments
```

This looks like a place where blob content is written to the blob db and could later be orphaned if `unsaved_attachments` are never persisted to the database (for example, if an error occurs during form processing).

@snopoke (Contributor) commented Aug 2, 2018

Depending on where the error happened that's definitely a possibility.

@orangejenny orangejenny linked a pull request Apr 20, 2020 that will close this issue
@orangejenny orangejenny removed a link to a pull request Apr 20, 2020
@snopoke snopoke closed this as completed Nov 15, 2021