Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(db_api): add an ability to set ReadOnly/ReadWrite connection mode #475

Merged
merged 37 commits into from Oct 5, 2021

Conversation

@IlyaFaer
Copy link
Member

@IlyaFaer IlyaFaer commented Aug 2, 2021

Proposal for the user's question: googleapis/python-spanner-sqlalchemy#97
Will require small changes in SQLAlchemy dialect as well.

@IlyaFaer IlyaFaer marked this pull request as ready for review Aug 3, 2021
@IlyaFaer IlyaFaer requested a review from as a code owner Aug 3, 2021
@IlyaFaer IlyaFaer requested a review from larkee Aug 3, 2021
google/cloud/spanner_v1/transaction.py Outdated Show resolved Hide resolved
tests/system/test_system_dbapi.py Outdated Show resolved Hide resolved
tests/system/test_system_dbapi.py Outdated Show resolved Hide resolved
google/cloud/spanner_dbapi/connection.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/transaction.py Outdated Show resolved Hide resolved
google/cloud/spanner_v1/transaction.py Outdated Show resolved Hide resolved
google/cloud/spanner_dbapi/connection.py Outdated Show resolved Hide resolved
google/cloud/spanner_dbapi/cursor.py Show resolved Hide resolved
Copy link
Contributor

@larkee larkee left a comment

Read-only transaction in the Python client are represented by the Snapshot class.

Please do not modify transaction.py. Use Snapshot instead.

@IlyaFaer
Copy link
Member Author

@IlyaFaer IlyaFaer commented Aug 10, 2021

@larkee, hm, okay. If we want to use the Snapshot() class, we probably could re-use the existing method for single-use reads. Let me know if you think it's better to use a snapshot as a transaction for several sequential reads.

@larkee
Copy link
Contributor

@larkee larkee commented Aug 11, 2021

@larkee, hm, okay. If we want to use the Snapshot() class, we probably could re-use the existing method for single-use reads. Let me know if you think it's better to use a snapshot as a transaction for several sequential reads.

By default, a snapshot is a single-use read-only transaction so it would only be valid for one read operation. However, there is an option make a multi-use snapshot in which all reads are from the same snapshot of data i.e. data is consistent.

Multi-use snapshot example: https://cloud.google.com/spanner/docs/transactions#ro_transaction_example
Single-use snapshot example: https://cloud.google.com/spanner/docs/reads#single_read_methods

With this in mind, I think using multi_use snapshots should only be used when read_only=True and AUTOCOMMIT=False. WDYT?

@olavloite
Copy link

@olavloite olavloite commented Aug 11, 2021

With this in mind, I think using multi_use snapshots should only be used when read_only=True and AUTOCOMMIT=False. WDYT?

Yes, that is the intention here. So the implementation should roughly be:

  1. read_only=True and auto_commit=False: Use multi_use snapshots and only allow queries/reads. Calling commit() or rollback() should close the current multi-use snapshot, and the next statement that is executed should create a new multi-use snapshot.
  2. auto_commit=True (and regardless of the state of read_only): Always use single-use snapshots for queries/reads.

@skuruppu
Copy link
Contributor

@skuruppu skuruppu commented Sep 14, 2021

I would be a bit worried about doing the extra performance cost of keeping track of checksums for read-write transactions though. I wouldn't be comfortable releasing that. Is it hard to fix @IlyaFaer?

@IlyaFaer
Copy link
Member Author

@IlyaFaer IlyaFaer commented Sep 14, 2021

@skuruppu, of course, it's not. I don't want to sound boring, but I can't see why you both think it's that performance influencing. We don't calculate checksums in read_only - the ifs are already added (8332d49). As for retrying read_only, well, read_only transactions don't fail with Aborted, so they're not gonna be retried anyway. Thus, I don't see why these changes should be urgent and delay other fixes...

Anyway it's pushed.

@olavloite
Copy link

@olavloite olavloite commented Sep 14, 2021

... but I can't see why you both think it's that performance influencing. We don't calculate checksums in read_only - the ifs are already added (8332d49).

Sorry, that was my fault. I didn't realize that you had already created that change, as the comment that you referenced above didn't say anything about that.

... Thus, I don't see why these changes should be urgent and delay other fixes...

In my opinion, the current implementation would have been slightly confusing for a potential new developer. The retry mechanism is not needed for read-only transactions. I think that it is better that it is completely clear in the code to prevent anyone from accidentally re-enabling it in the future. (This last commit for example has also removed an unnecessary test case that tested the retry mechanism for a query in autocommit mode; a situation that could not happen, but still was part of the code base.)

Anyways, I'm happy enough with this as-is, but we should add a system test ASAP to actually show that it is working as intended. If @skuruppu and/or @larkee are OK with that, then we could do that in a separate PR.

@skuruppu
Copy link
Contributor

@skuruppu skuruppu commented Sep 15, 2021

Ultimately @larkee needs to approve this as the code owner so please do add any tests that they ask for before merging.

@larkee
Copy link
Contributor

@larkee larkee commented Sep 15, 2021

@IlyaFaer I second what @olavloite said. The refactor is less about performance (since it should be roughly equivalent) but is more about readability and reducing confusion for new developers.

However, if you create a tracking issue for the refactor work, I will approve and merge this PR 👍

@IlyaFaer
Copy link
Member Author

@IlyaFaer IlyaFaer commented Sep 15, 2021

I think all comments are already processed.

I've pushed changes with adding if statements into fetch methods to avoid retrying read_only transactions, etc. (see q-logic@ac8c4b2). So, I don't think any refactoring left here.

I also returned back that deleted-by-merge-of-another-commit system test. It checks that UPDATE operation will cause an error, while SELECT will work fine.

I think it's worth adding a unit test, which'll check that read_only transactions are not retried on Aborted exception. Otherwise, I think we don't have anything for a separate issue.

larkee
larkee approved these changes Sep 16, 2021
@gcf-merge-on-green
Copy link
Contributor

@gcf-merge-on-green gcf-merge-on-green bot commented Oct 5, 2021

Merge-on-green attempted to merge your PR for 6 hours, but it was not mergeable because either one of your required status checks failed, one of your required reviews was not approved, or there is a do not merge label. Learn more about your required status checks here: https://help.github.com/en/github/administering-a-repository/enabling-required-status-checks. You can remove and reapply the label to re-run the bot.

@IlyaFaer
Copy link
Member Author

@IlyaFaer IlyaFaer commented Oct 5, 2021

Looking at automerge label set by googler, I suppose it's not a problem if I'll merge it by myself (while tests are green and there are no changes in the main branch).

@IlyaFaer IlyaFaer merged commit cd3b950 into googleapis:main Oct 5, 2021
10 of 11 checks passed
@IlyaFaer IlyaFaer deleted the read_only_transactions branch Oct 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants