PartitionedModel.db returns different value than db_for_read_write #21393
Comments
It's not a bug though I agree it is confusing. The reason it's not a bug is that Note that if I think the only reason to call Happy to have this logic clean up though if you can see a good way to do that. |
This seems like it could become the source of hard-to-find bugs since the db router ( I suppose we're not using
This is what I'm working on. I'm looking into using a separate database connection for saving blob metadata so it is not lost if an exception is thrown resulting in transaction rollback on the I think on partitioned dbs transaction rollback does not happen since those connections are in autocommit mode (maybe that's only the proxy?), but when running in non-sharded environments it can happen. |
Yes. If you called I'm a little confused about what you were saying about using separate connections for the blob metadata. Are you saying you want to store the in a new database? Or you just want an isolated connection pool? I thought they would be stored in the shard databases. |
I just want an isolated connection pool. They will be stored in the shard databases. |
The way that I've found to add an isolated connection pool is to add a |
Pretty sure none of the connections are in autocommit mode. I agree that I don't think it's worth the effort to go down this route. Why do you think it's necessary in the first place? Is it because you don't want to end up with data in riak that's not represented in SQL? |
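Yes. Anywhere a blobdb.put(...) operation occurs inside a transaction.atomic() block where the transaction gets rolled back will result in an orphaned blob in the blob db |
Interesting. Django's transaction documentation says "Django’s default behavior is to run in autocommit mode." As far as I can tell we are not setting AUTOCOMMIT or ATOMIC_REQUESTS connection parameters anywhere in our codebase, so I would assume we're using the default. Are we wrapping each request in a transaction somewhere? |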
> Pretty sure none of the connections are in autocommit mode.
>
> Interesting. Django's transaction documentation
> <https://docs.djangoproject.com/en/1.11/topics/db/transactions/#django-s-default-transaction-behavior>
> says "Django’s default behavior is to run in autocommit mode." As far as
> I can tell we are not setting AUTOCOMMIT or ATOMIC_REQUESTS connection
> parameters anywhere in our codebase, so I would assume we're using the
> default. Are we wrapping each request in a transaction somewhere?
Sorry I was thinking of implicit transactions.
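For reference, the two connection parameters mentioned above are set per database in Django settings. This is a minimal sketch, assuming a PostgreSQL backend and a hypothetical database name; it is not taken from the commcare-hq settings. The values shown are Django's documented defaults, so leaving them out gives the same behavior.

```python
# Sketch of a Django DATABASES entry. NAME is a hypothetical placeholder.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "example_db",  # hypothetical
        # Default: each query is committed immediately unless it runs
        # inside an explicit transaction such as transaction.atomic().
        "AUTOCOMMIT": True,
        # Default: views are NOT wrapped in a per-request transaction.
        "ATOMIC_REQUESTS": False,
    }
}
```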
> Is it because you don't want to end up with data in riak that's not
> represented in SQL?
>
> Yes. Anywhere a blobdb.put(...) operation occurs inside a
> transaction.atomic() block where the transaction gets rolled back will
> result in an orphaned blob in the blob db
What if you did something like what we do during form submission, where you
create the metadata first and then delete it if there is an error (though
in form processing we do the reverse: we delete if there was no error):
https://github.com/dimagi/commcare-hq/blob/422ad1988804801a724ae113082f54e002ebe927/corehq/form_processor/submission_post.py#L532
That might fail to delete, but having a meta that points to a missing blob
is better than the reverse. You could also create it with a flag and then
update the flag on success.
|
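The "create it with a flag" idea above could look roughly like the following. This is a minimal sketch under stated assumptions, not the commcare-hq implementation: it uses the standard-library sqlite3 module in place of the real metadata database, and BlobStore, make_meta_db, put_blob, pending_keys and the blob_meta table are all hypothetical names invented for illustration.

```python
import sqlite3


class BlobStore:
    """Hypothetical in-memory stand-in for the external blob db."""

    def __init__(self):
        self.blobs = {}

    def put(self, key, content):
        self.blobs[key] = content


def make_meta_db():
    """Create an in-memory metadata database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    with conn:
        conn.execute(
            "CREATE TABLE blob_meta (key TEXT PRIMARY KEY, pending INTEGER NOT NULL)"
        )
    return conn


def put_blob(conn, store, key, content):
    # 1. Commit the metadata first, flagged as pending.
    with conn:
        conn.execute("INSERT INTO blob_meta (key, pending) VALUES (?, 1)", (key,))
    # 2. Write the blob. If this raises, the pending row stays behind.
    store.put(key, content)
    # 3. Clear the flag once the blob is known to exist.
    with conn:
        conn.execute("UPDATE blob_meta SET pending = 0 WHERE key = ?", (key,))


def pending_keys(conn):
    """Keys whose blob may be missing; candidates for a cleanup job."""
    rows = conn.execute("SELECT key FROM blob_meta WHERE pending = 1 ORDER BY key")
    return [row[0] for row in rows]
```

With this ordering a failure can leave metadata that points at a missing blob, but never a blob without metadata, which is the trade-off argued for above.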
I think the critical difference there is Maybe you meant to suggest that we delete the blob if there is an error? This would be great, but sounds hard. I'd need to somehow detect when a transaction rollback occurs involving blob metadata. Django has an Maybe we should take this discussion to voice? I think for now I'm just going to punt and forget about it. Hopefully we don't orphan many blobs. |
I guess this is why Write Ahead Logs exist.
I agree that leaving this for now is fine. It doesn't put us in any worse
position than we already are.
|
@snopoke do you know if commcare-hq/corehq/form_processor/backends/sql/processor.py Lines 43 to 52 in 118d88d
This looks like a place where blob content is written to the blob db and then later could be orphaned if |
Depending on where the error happened that's definitely a possibility. |
@snopoke I have a question about partitioned db routing. It looks like PartitionedModel.db (commcare-hq/corehq/sql_db/models.py, lines 106 to 110 in 92167cc), which calls get_db_alias_for_partitioned_doc (commcare-hq/corehq/sql_db/util.py, lines 77 to 83 in 92167cc), does not use the same codepath as db_for_read_write (commcare-hq/corehq/sql_db/routers.py, lines 78 to 105 in 92167cc) and in some cases will return a different value. Specifically, it may return 'default' when db_for_read_write would return something else (when USE_PARTITIONED_DATABASE == False). This is just one example of a few functions in corehq.sql_db.util that behave this way. Is that a feature or a bug?

Would it be acceptable to have get_db_alias_for_partitioned_doc (as well as other similar functions in that module that have hard-coded 'default') call db_for_read_write instead of returning 'default' when USE_PARTITIONED_DATABASE == False?
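The discrepancy described in this question can be sketched as follows. This is only an illustration under stated assumptions, not the code in corehq.sql_db: the ROUTES table, the shard aliases, shard_alias and both get_alias_* functions are hypothetical, and db_for_read_write here is a simplified stand-in with an invented signature.

```python
import zlib

# Hypothetical router configuration: app label -> database alias.
ROUTES = {"blobs": "blob_meta_db"}
# Hypothetical shard aliases used when partitioning is enabled.
SHARDS = ("p1", "p2")


def db_for_read_write(app_label):
    """Simplified stand-in for the router's decision."""
    return ROUTES.get(app_label, "default")


def shard_alias(doc_id):
    """Pick a shard from the document id (illustrative hashing only)."""
    return SHARDS[zlib.crc32(doc_id.encode("utf-8")) % len(SHARDS)]


def get_alias_hard_coded(app_label, doc_id, partitioned):
    # Behavior described in the question: ignores the router entirely.
    if not partitioned:
        return "default"
    return shard_alias(doc_id)


def get_alias_delegating(app_label, doc_id, partitioned):
    # Proposed behavior: defer to the router when not partitioned.
    if not partitioned:
        return db_for_read_write(app_label)
    return shard_alias(doc_id)
```

In this sketch the two functions agree whenever partitioning is on, and differ only for apps that the router sends somewhere other than 'default' when partitioning is off.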