New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update SQLAlchemy example(s) to use txn retries #4142

Merged
merged 1 commit into from Jan 8, 2019

Conversation

Projects
None yet
6 participants
@rmloveland
Copy link
Contributor

rmloveland commented Dec 10, 2018

Addresses #3315.

Summary of changes:

  • Update SQLAlchemy example code to use run_transaction function to
    perform txn retries automagically.

  • Also addresses the issue that the previous code sample could not be
    run more than once.

  • Add new 'Best Practices' section that describes what to do (and not
    do) when driving CockroachDB from SQLAlchemy, including:

    • Noting that one should not munge txn state inside run_transaction.

    • Showing some example code to address "transaction too large errors"
      by breaking updates into chunks.

    • Recommendation to use IMPORT for bulk work, not ORMs.

    • Recommendation to prefer query builder over ORM, if possible.

@cockroach-teamcity

This comment has been minimized.

Copy link
Member

cockroach-teamcity commented Dec 10, 2018

This change is Reviewable

@rmloveland

This comment has been minimized.

@rmloveland rmloveland requested a review from bdarnell Dec 10, 2018

@knz

knz approved these changes Dec 12, 2018

@rmloveland rmloveland force-pushed the sqlalchemy-txn-retries branch from ac17866 to 11053b9 Dec 17, 2018

@rmloveland

This comment has been minimized.

Copy link
Contributor

rmloveland commented Dec 17, 2018

Thank you for the reviews @bdarnell and @knz !

Ben, I have addressed all but two of your comments. I spent some time trying to rewrite the code to follow your suggestions but I got stuck. I know in principal that I should use more query building (via Engine.execute I think) but I was not able to get something working after about 90 minutes of trying, so I figured I should call for help.

I see a few options:

  1. If either of you have a few minutes to patch the example code to address the remaining comments I would be thrilled to have the help.
  2. If the deficiencies are not urgent enough that they should distract you from other matters, we can merge the PR as is and have room to make improvements later. (Sorry if this option was implied by your LGTMs, I wanted to explicitly address your review comments Ben since you took the time to make them.)

What do you think?

@bdarnell

This comment has been minimized.

Copy link
Member

bdarnell commented Dec 17, 2018

I'm fine with this getting merged with the bulk loading file unchanged. I think it's worth getting rid of the retry loop for duplicates in the account creation step. For demo purposes I think you can just choose random numbers out of a billion and then just delete the logic to check for duplicates. (using UUIDs would be even better but might be a more far-reaching change)

@rmloveland rmloveland force-pushed the sqlalchemy-txn-retries branch from 11053b9 to ac40b05 Dec 18, 2018

@rmloveland

This comment has been minimized.

Copy link
Contributor

rmloveland commented Dec 18, 2018

I'm fine with this getting merged with the bulk loading file unchanged.

Cool, leaving it as is for now.

I think it's worth getting rid of the retry loop for duplicates in the account creation step. For demo purposes I think you can just choose random numbers out of a billion and then just delete the logic to check for duplicates. (using UUIDs would be even better but might be a more far-reaching change)

Fixed in latest update - now the code (1) generates random numbers from a billion and (2) uses a SQL query when it needs to fetch random account IDs (instead of the now-deleted dictionary). Thanks again for the review and feedback.

@rmloveland rmloveland requested a review from lhirata Dec 18, 2018

@lhirata
Copy link
Contributor

lhirata left a comment

Reviewed 1 of 4 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


_includes/v2.2/app/sqlalchemy-basic-sample.py, line 47 at r2 (raw file):

# The code below generates random IDs for new accounts.  Since this is

Might not need the second sentence since you have the same note below.


_includes/v2.2/app/sqlalchemy-basic-sample.py, line 85 at r2 (raw file):

def transfer_funds_randomly(session):
    """Transfer money randomly between accounts (during SESSION).

Should this note be consistent with the other and use ''' instead of """?


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 26 at r2 (raw file):

{{site.data.alerts.callout_info}}
Note that the example code on this page uses Python 3.

Remove "Note that" since you already have this info in a callout.


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 61 at r2 (raw file):

You can run this script as many times as you want; on each run, it will create some new accounts and shuffle money around between randomly selected accounts.

Specifically, it:

For better scanability, I'd recommend changing "it" to "the script".


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 162 at r2 (raw file):

You can run this script as many times as you want; on each run, it will create some new accounts and shuffle money around between randomly selected accounts.

Specifically, it:

Same as above-- recommend changing "it" to "the script"


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 267 at r2 (raw file):

### Avoid mutations of session and/or transaction state inside `run_transaction()`

In general, this is line with the recommendations of the [SQLAlchemy FAQs](https://docs.sqlalchemy.org/en/latest/orm/session_basics.html#session-frequently-asked-questions), which state (with emphasis added by the original author) that

"In general, this is line with..." -> "In general, this is in line with..."

@lhirata
Copy link
Contributor

lhirata left a comment

:lgtm: with some nits. Also, replace all double spaces after periods with a single space.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

@rmloveland rmloveland force-pushed the sqlalchemy-txn-retries branch from ac40b05 to dc9ed94 Dec 19, 2018

@rmloveland rmloveland force-pushed the sqlalchemy-txn-retries branch from dc9ed94 to fe24b14 Dec 19, 2018

@rmloveland
Copy link
Contributor

rmloveland left a comment

Thanks for the review Lauren. In addition to your inline comments, I also fixed all the double spaces after periods. Sorry - think I'm gonna add a linter to my editor for that.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


_includes/v2.2/app/sqlalchemy-basic-sample.py, line 47 at r2 (raw file):

Previously, lhirata wrote…

Might not need the second sentence since you have the same note below.

Fixed.


_includes/v2.2/app/sqlalchemy-basic-sample.py, line 85 at r2 (raw file):

Previously, lhirata wrote…

Should this note be consistent with the other and use ''' instead of """?

Fixed by having them both use 3 double quotes, i.e.""".

I was using the double quotes here because the enclosed text contains a single-quote "apostrophe" character. However, your comment made me look at the other function doc and change it to use the double-quote characters (""") as well, since it also includes an apostrophe. So they are now consistent.


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 26 at r2 (raw file):

Previously, lhirata wrote…

Remove "Note that" since you already have this info in a callout.

Fixed.


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 61 at r2 (raw file):

Previously, lhirata wrote…

For better scanability, I'd recommend changing "it" to "the script".

Done.


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 162 at r2 (raw file):

Previously, lhirata wrote…

Same as above-- recommend changing "it" to "the script"

Done.


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 267 at r2 (raw file):

Previously, lhirata wrote…

"In general, this is line with..." -> "In general, this is in line with..."

Done. Great catch!

@rmloveland rmloveland requested a review from jseldess Dec 19, 2018

@jseldess
Copy link
Contributor

jseldess left a comment

:lgtm_strong:

Awesome work, Rich! Just a few minor nits and then you can merge.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


v2.1/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 261 at r3 (raw file):

- Because it must be passed a [sqlalchemy.orm.session.sessionmaker](https://docs.sqlalchemy.org/en/latest/orm/session_api.html#session-and-sessionmaker) object (*not* a [session][session]), it ensures that a new session is created exclusively for use by the callback, which protects you from accidentally reusing objects via any sessions created outside the transaction.
- It abstracts away the [client-side transaction retry logic](transactions.html#client-side-intervention) from your application, which keeps your application code portable across different databases. For example, the sample code given on this page works identically when run against Postgres (modulo changes to the prefix and port number in the connection string).
- It will run your transactions to completion much faster than a naive implementation that created a fresh transaction after every retry error. Because of the way the CockroachDB dialect's driver structures the transaction attempts (using a [SAVEPOINT](savepoint.html) statement under the hood, which has a slightly different meaning in CockroachDB than in some other databases), the server is able to preserve some information about the previously attempted transactions to allow subsequent retries to complete more easily.

Put backticks around this statement name.


v2.1/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 293 at r3 (raw file):

### Use `IMPORT` to read in large data sets

If you are trying to get a large data set into CockroachDB all at once (a bulk import), avoid writing client-side code that uses an ORM and use the [`IMPORT`](import.html) statement instead. It is much faster and more efficient than making a series of [`INSERT`s](insert.html) and [`UPDATE`s](update.html) such as are generated by calls to [session.bulk_save_objects()](https://docs.sqlalchemy.org/en/latest/orm/session_api.html?highlight=bulk_save_object#sqlalchemy.orm.session.Session.bulk_save_objects).

Does this need backticks?


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 22 at r3 (raw file):

{{site.data.alerts.callout_danger}}
**Upgrading from CockroachDB 2.0 to 2.1?** If you used SQLAlchemy with your 2.0 cluster, you must [upgrade the adapter to the latest release](https://github.com/cockroachdb/cockroachdb-python) before upgrading to CockroachDB 2.1.

This is the 2.2 version. Not sure what that means about the upgrade note...


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 261 at r3 (raw file):

- Because it must be passed a [sqlalchemy.orm.session.sessionmaker](https://docs.sqlalchemy.org/en/latest/orm/session_api.html#session-and-sessionmaker) object (*not* a [session][session]), it ensures that a new session is created exclusively for use by the callback, which protects you from accidentally reusing objects via any sessions created outside the transaction.
- It abstracts away the [client-side transaction retry logic](transactions.html#client-side-intervention) from your application, which keeps your application code portable across different databases. For example, the sample code given on this page works identically when run against Postgres (modulo changes to the prefix and port number in the connection string).
- It will run your transactions to completion much faster than a naive implementation that created a fresh transaction after every retry error. Because of the way the CockroachDB dialect's driver structures the transaction attempts (using a [SAVEPOINT](savepoint.html) statement under the hood, which has a slightly different meaning in CockroachDB than in some other databases), the server is able to preserve some information about the previously attempted transactions to allow subsequent retries to complete more easily.

Put backticks around this statement name.


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 293 at r3 (raw file):

### Use `IMPORT` to read in large data sets

If you are trying to get a large data set into CockroachDB all at once (a bulk import), avoid writing client-side code that uses an ORM and use the [`IMPORT`](import.html) statement instead. It is much faster and more efficient than making a series of [`INSERT`s](insert.html) and [`UPDATE`s](update.html) such as are generated by calls to [session.bulk_save_objects()](https://docs.sqlalchemy.org/en/latest/orm/session_api.html?highlight=bulk_save_object#sqlalchemy.orm.session.Session.bulk_save_objects).

Does this need backticks?

Update SQLAlchemy example(s) to use txn retries
Addresses #3315.

Summary of changes:

- Update SQLAlchemy example code to use `run_transaction` function to
  perform txn retries automagically.

- Address the issue that the previous code sample could not be run more
  than once.

- Add note that Python 3 is used for the example code

- Add new 'Best Practices' section that describes what to do (and not
  do) when driving CockroachDB from SQLAlchemy, including:

  - Noting that one should not munge txn state inside `run_transaction`.

  - Showing some example code to address "transaction too large errors"
    by breaking updates into chunks.

  - Recommendation to use IMPORT for bulk work, not ORMs.

  - Recommendation to prefer query builder over ORM, if possible.

- Note: All of the changes listed above apply to v2.1 and v2.2
  docs.

@rmloveland rmloveland force-pushed the sqlalchemy-txn-retries branch from fe24b14 to f0decbc Jan 8, 2019

@rmloveland
Copy link
Contributor

rmloveland left a comment

Thank you Jesse!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)


v2.1/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 261 at r3 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

Put backticks around this statement name.

Done.


v2.1/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 293 at r3 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

Does this need backticks?

Do you mean bulk_save_objects()? I think yes. Done.


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 22 at r3 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

This is the 2.2 version. Not sure what that means about the upgrade note...

Good catch. I deleted this note. If you are on 2.2 you will have already crossed this particular bridge, I think.


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 261 at r3 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

Put backticks around this statement name.

Done.


v2.2/build-a-python-app-with-cockroachdb-sqlalchemy.md, line 293 at r3 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

Does this need backticks?

Yes - thanks!

Done.

@jseldess

This comment has been minimized.

Copy link
Contributor

jseldess commented Jan 8, 2019

LGTM!

@rmloveland rmloveland merged commit d11eb61 into master Jan 8, 2019

2 checks passed

GitHub CI (Docs) TeamCity build finished
Details
code-review/reviewable Review complete: 1 of 0 LGTMs obtained (and 1 stale)
Details

@rmloveland rmloveland deleted the sqlalchemy-txn-retries branch Jan 8, 2019

@rmloveland rmloveland removed the in progress label Jan 8, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment