
PartitionedModel.db returns different value than db_for_read_write #21393

Closed
millerdev opened this issue Jul 30, 2018 · 13 comments
@millerdev (Contributor)

@snopoke I have a question about partitioned db routing. It looks like PartitionedModel.db,

```python
@property
def db(self):
    """The partitioned database for this object"""
    assert self.partition_value, 'Partitioned model must have a partition value'
    return RequireDBManager.get_db(self.partition_value)
```

which calls get_db_alias_for_partitioned_doc,

```python
def get_db_alias_for_partitioned_doc(partition_value):
    if settings.USE_PARTITIONED_DATABASE:
        from corehq.form_processor.backends.sql.dbaccessors import ShardAccessor
        db_name = ShardAccessor.get_database_for_doc(partition_value)
    else:
        db_name = 'default'
    return db_name
```

does not use the same codepath as db_for_read_write,

```python
def db_for_read_write(model, write=True):
    """
    :param model: Django model being queried
    :param write: Default to True since the DB for writes can also handle reads
    :return: Django DB alias to use for query
    """
    app_label = model._meta.app_label
    if app_label == WAREHOUSE_APP:
        return settings.WAREHOUSE_DATABASE_ALIAS
    elif app_label == SYNCLOGS_APP:
        return settings.SYNCLOGS_SQL_DB_ALIAS
    if not settings.USE_PARTITIONED_DATABASE:
        return 'default'
    if app_label == FORM_PROCESSOR_APP:
        return partition_config.get_proxy_db()
    elif app_label in (ICDS_MODEL, ICDS_REPORTS_APP):
        engine_id = ICDS_UCR_ENGINE_ID
        if not write:
            engine_id = connection_manager.get_load_balanced_read_db_alais(ICDS_UCR_ENGINE_ID)
        return connection_manager.get_django_db_alias(engine_id)
    else:
        default_db = partition_config.get_main_db()
        if not write:
            return connection_manager.get_load_balanced_read_db_alais(app_label, default_db)
        return default_db
```

and in some cases will return a different value. Specifically, it may return `'default'` when `db_for_read_write` would return something else (when `USE_PARTITIONED_DATABASE == False`). This is just one example of a few functions in `corehq.sql_db.util` that behave this way. Is that a feature or a bug?

Would it be acceptable to have get_db_alias_for_partitioned_doc (as well as other similar functions in that module that have hard-coded 'default') call db_for_read_write instead of returning 'default' when USE_PARTITIONED_DATABASE == False?
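To make the divergence concrete, here is a minimal, self-contained sketch (not the real corehq code; settings and app labels are simplified stand-ins) showing how the two routing functions can disagree for a non-partitioned app when `USE_PARTITIONED_DATABASE` is `False`:

```python
# Simplified stand-ins for the settings involved (illustrative only)
USE_PARTITIONED_DATABASE = False
SYNCLOGS_SQL_DB_ALIAS = 'synclogs'
SYNCLOGS_APP = 'phone'


def get_db_alias_for_partitioned_doc(partition_value):
    # Hard-codes 'default' in non-partitioned environments,
    # like the real function quoted above
    if USE_PARTITIONED_DATABASE:
        raise NotImplementedError('shard lookup omitted in this sketch')
    return 'default'


def db_for_read_write(app_label):
    # Consults app-specific routing *before* the partitioning check,
    # so synclogs never falls through to 'default'
    if app_label == SYNCLOGS_APP:
        return SYNCLOGS_SQL_DB_ALIAS
    if not USE_PARTITIONED_DATABASE:
        return 'default'
    raise NotImplementedError('partitioned routing omitted in this sketch')


# For a synclog-like model the two functions disagree:
print(get_db_alias_for_partitioned_doc('some-doc-id'))  # default
print(db_for_read_write(SYNCLOGS_APP))                  # synclogs
```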

@snopoke (Contributor) commented Jul 31, 2018

It's not a bug though I agree it is confusing. The reason it's not a bug is that get_db_alias_for_partitioned_doc is only intended for use when we are doing direct queries with partitioned models while db_for_read_write will direct partitioned model queries to the proxy DB for routing.

Note that if `USE_PARTITIONED_DATABASE == False` then `db_for_read_write` will also return `'default'` (though the logic there is rather suspect).

I think the only reason to call db_for_read_write from get_db_alias_for_partitioned_doc would be if we expect the partitioned tables to be in a database other than default (which we don't).

Happy to have this logic cleaned up, though, if you can see a good way to do that.

@millerdev (Contributor, Author) commented Jul 31, 2018

> get_db_alias_for_partitioned_doc is only intended for use when we are doing direct queries with partitioned models while db_for_read_write will direct partitioned model queries to the proxy DB for routing.

This seems like it could become a source of hard-to-find bugs, since the db router (`MultiDBRouter`) uses `db_for_read_write` for some queries while other places use `get_db_alias_for_partitioned_doc` and friends. It has been hard for me to understand which is used where and how that impacts model loading and saving.

I suppose we're not using PartitionedModel or get_db_alias_for_partitioned_doc, etc. on warehouse or synclog models? Those would return default when db_for_read_write would return some other db name unrelated to partitioning. Does that sound right to you?

> I think the only reason to call db_for_read_write from get_db_alias_for_partitioned_doc would be if we expect the partitioned tables to be in a database other than default (which we don't).

This is what I'm working on. I'm looking into using a separate database connection for saving blob metadata so it is not lost if an exception triggers a transaction rollback on the default connection.

I think transaction rollback does not happen on partitioned dbs since those connections are in autocommit mode (maybe that's only the proxy?), but it can happen when running in non-sharded environments. Maybe I should just forget this idea since it adds a fair bit of complexity, and we probably don't care about lost blobs in non-sharded environments since they are usually small scale. Edit: I really do want to know if we're accumulating orphaned blobs, and I'm not sure how else to do that.

@snopoke (Contributor) commented Jul 31, 2018

> I suppose we're not using PartitionedModel or get_db_alias_for_partitioned_doc, etc. on warehouse or synclog models? Those would return default when db_for_read_write would return some other db name unrelated to partitioning. Does that sound right to you?

Yes. If you called get_db_alias_for_partitioned_doc with the ID of a synclog you would certainly get the wrong database.

I'm a little confused about what you were saying about using separate connections for the blob metadata. Are you saying you want to store them in a new database? Or do you just want an isolated connection pool?

I thought they would be stored in the shard databases.

@millerdev (Contributor, Author)

I just want an isolated connection pool. They will be stored in the shard databases.

@millerdev (Contributor, Author)

The way I've found to add an isolated connection pool is to add a `blobdb` database to `settings.DATABASES` with the same connection info as the `default` database in non-partitioned environments. In partitioned environments it just uses the shard db connections (although I still need to research whether those are in AUTOCOMMIT mode or use per-request transactions).
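A hypothetical sketch of that settings change (the alias name `blobdb` and the database name are illustrative, not what corehq actually ships): a second alias pointing at the same server as `default`, giving Django a separate connection whose commits are independent of the default connection's transaction.

```python
# Hypothetical Django settings fragment: 'blobdb' reuses the same
# underlying database as 'default' but gets its own connection, so a
# rollback on 'default' does not undo writes made through 'blobdb'.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'commcarehq',  # illustrative name
        # HOST/USER/PASSWORD omitted
    },
    'blobdb': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'commcarehq',  # same database, separate connection pool
    },
}

# Metadata saved via the isolated alias would then survive a rollback on
# 'default', e.g.: BlobMeta.objects.using('blobdb').create(...)
```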

@snopoke (Contributor) commented Jul 31, 2018

Pretty sure none of the connections are in autocommit mode.

I don't think it's worth the effort to go down this route, though. Why do you think it's necessary in the first place? Is it because you don't want to end up with data in Riak that's not represented in SQL?

@millerdev (Contributor, Author)

> Is it because you don't want to end up with data in riak that's not represented in SQL?

Yes. Anywhere a blobdb.put(...) operation occurs inside a transaction.atomic() block where the transaction gets rolled back will result in an orphaned blob in the blob db.
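A toy model of that failure mode (a fake blob db and an in-memory stand-in for the SQL transaction; all names here are illustrative, not the real corehq API): the external blob write commits immediately, so when the surrounding transaction rolls back, the metadata row vanishes but the blob stays behind.

```python
# Fake external blob store: writes here are NOT transactional
class FakeBlobDB:
    def __init__(self):
        self.blobs = {}

    def put(self, key, content):
        self.blobs[key] = content


sql_rows = []          # stand-in for rows in the SQL database
blobdb = FakeBlobDB()


def save_with_rollback():
    # Stand-in for transaction.atomic(): an error undoes the SQL writes,
    # but the blob db write has already happened and is not undone.
    snapshot = list(sql_rows)
    try:
        blobdb.put('blob-1', b'content')    # external system, commits now
        sql_rows.append({'key': 'blob-1'})  # metadata row
        raise RuntimeError('something failed later in the transaction')
    except RuntimeError:
        sql_rows[:] = snapshot  # "rollback": metadata row is gone


save_with_rollback()
print('blob-1' in blobdb.blobs)                      # blob still exists
print(any(r['key'] == 'blob-1' for r in sql_rows))   # metadata is gone -> orphan
```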

@millerdev (Contributor, Author)

> Pretty sure none of the connections are in autocommit mode.

Interesting. Django's transaction documentation says "Django’s default behavior is to run in autocommit mode." As far as I can tell we are not setting AUTOCOMMIT or ATOMIC_REQUESTS connection parameters anywhere in our codebase, so I would assume we're using the default. Are we wrapping each request in a transaction somewhere?
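The distinction in question can be illustrated with the stdlib `sqlite3` driver, which exposes the same two modes Django's docs describe: with `isolation_level=None` every statement commits immediately (true autocommit), while the default legacy mode implicitly opens a transaction before DML statements. This is only an analogy for the Django behavior, not corehq code.

```python
import sqlite3

# Autocommit: each statement commits immediately, no transaction is open
auto = sqlite3.connect(':memory:', isolation_level=None)
auto.execute('CREATE TABLE t (x)')
auto.execute('INSERT INTO t VALUES (1)')
print(auto.in_transaction)  # False: nothing left open

# Legacy/implicit mode: the INSERT opens a transaction that stays open
# until commit() or rollback() is called
manual = sqlite3.connect(':memory:')
manual.execute('CREATE TABLE t (x)')
manual.execute('INSERT INTO t VALUES (1)')
print(manual.in_transaction)  # True: an uncommitted transaction is open
```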

@snopoke (Contributor) commented Jul 31, 2018 via email

@millerdev (Contributor, Author)

> What if you did something like what we do during form submission, where you create the metadata first and then delete it if there is an error (though in form processing we do the reverse - we delete if there was no error)?

I think the critical difference there is SubmissionProcessTracker deletes if there was no error, where I want to do the opposite, but I don't have control of the transaction context inside blobdb.put(...). In other words, I want to keep metadata even if there is an error, but it's automatically deleted on error (transaction rollback), and I can't control that.

Maybe you meant to suggest that we delete the blob if there is an error? This would be great, but sounds hard. I'd need to somehow detect when a transaction rollback occurs involving blob metadata. Django has an on_commit() hook, but not an on_rollback() hook.
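One conceivable workaround, sketched here with illustrative names (there is no real `on_rollback()` hook, and `FakeBlobDB` is a stand-in, not the corehq blob db API): track the blob keys written inside the block and delete them yourself if the block raises, i.e. compensate for the rollback manually.

```python
# Fake blob store standing in for the real blob db (illustrative only)
class FakeBlobDB:
    def __init__(self):
        self.blobs = {}

    def put(self, key, content):
        self.blobs[key] = content

    def delete(self, key):
        self.blobs.pop(key, None)


blobdb = FakeBlobDB()


def save_form(fail):
    written = []  # keys written during this "transaction"
    try:
        # ... inside transaction.atomic() in the real code ...
        blobdb.put('att-1', b'image bytes')
        written.append('att-1')
        if fail:
            raise RuntimeError('form processing failed')
    except Exception:
        # Rollback path: compensate by deleting the blobs we just wrote,
        # since the metadata rows referencing them are about to vanish
        for key in written:
            blobdb.delete(key)
        raise


try:
    save_form(fail=True)
except RuntimeError:
    pass
print(blobdb.blobs)  # empty: no orphaned blob left behind
```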

Maybe we should take this discussion to voice? I think for now I'm just going to punt and forget about it. Hopefully we don't orphan many blobs.

@snopoke (Contributor) commented Aug 1, 2018 via email

@millerdev (Contributor, Author)

@snopoke do you know if `xform.unsaved_attachments` are always saved eventually?

```python
xform_attachment.write_content(attachment.content)
if xform_attachment.is_image:
    try:
        img_size = Image.open(attachment.content_as_file()).size
        xform_attachment.properties = dict(width=img_size[0], height=img_size[1])
    except IOError:
        xform_attachment.content_type = 'application/octet-stream'
xform_attachments.append(xform_attachment)
xform.unsaved_attachments = xform_attachments
```

This looks like a place where blob content is written to the blob db and could later be orphaned if `unsaved_attachments` are never persisted to the database (for example, if an error occurs during form processing).

@snopoke (Contributor) commented Aug 2, 2018

Depending on where the error happened that's definitely a possibility.

@orangejenny orangejenny linked a pull request Apr 20, 2020 that will close this issue
@orangejenny orangejenny removed a link to a pull request Apr 20, 2020
@snopoke snopoke closed this as completed Nov 15, 2021