Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reimplement metadata queries with odbc metadata functions for pyodbc to avoid deadlocks #90

Conversation

tkilias
Copy link
Collaborator

@tkilias tkilias commented May 18, 2020

Situation:

  • Currently, the metadata functions of the Exasol dialect use SQL Queries on the metadata tables to retrieve the metadata of a DB.
  • Modification on objects (e.g. ALTER TABLE, INSERT INTO, DELETE INTO) in the database lock also the metadata tables.
  • This can lead to deadlocks between transaction which modify the database objects and metadata only transactions.
  • Exasol provides to possibilities to avoid this:
    1. Magic Statement to enable snapshot execution for metadata queries
    2. ODBC Metadata functions
  • The Magic Statement unfortunately only works with non-prepared statements, as such we would need to deal with SQL Injections on the client side. The ODBC Metadata functions are implemented differently and are safe to use with unknown input and use snapshot execution.

Solution:

  • For the time being, we used the ODBC Metadata functions where possible to fetch the metadata or resolve names, because they are safe to use. The previous implementation based on SQL Queries was renamed and kept as a fallback when the ODBC functions can't be used. For example, TurboODBC doesn't seem to provide them.
  • We added also tests for the metadata functions to ensure both implementations show the same behavior and the new implementation doesn't cause deadlock anymore.

I would appreciate your feedback.

@tkilias
Copy link
Collaborator Author

tkilias commented May 19, 2020

I checked the Travis builds and it seems they failed, because the secret environment variables were not available for the pull request.

@jank
Copy link

jank commented May 19, 2020

@tkilias thank you for your contribution. I will review this PR with @nndo1991 next week. Travis secrets are not readable for outside collaborators, as this would allow leakage of the EXASOL test DB credentials (could be fixed if we figure out how to run EXASOL via GitHub actions #85).
Comments and feedback on this PR will follow here.

@jank
Copy link

jank commented May 19, 2020

I had a quick glance at the changes. A review of base.py will take some time as besides the code changes a lot of formatting changes came in.
I am very happy to see the addition of two new test cases. Unfortunately, we will not be able to run these test cases as part of the Travis run as our user on the EXASOL hosted test DB does not have permissions to create or drop schemas.
@tkilias can you please change the test cases and use the pre-existing schema TEST_SCHEMA. See https://github.com/blue-yonder/sqlalchemy_exasol/blob/c4467b546aa010b2197e61c952ec7e6b1e59c21d/INTEGRATION_TEST.md#L13 for details.

@tkilias
Copy link
Collaborator Author

tkilias commented May 19, 2020

@BY-jk Do you mind, if I first try to use docker-db? I tested integration-test-docker-environment with Direct_IO=False and would add a switch to activate this.
Regarding formatting, sorry for that, I recognized it too late.

@jank
Copy link

jank commented May 19, 2020

@tkilias switching to GitHub Actions and using EXASOL in a docker image would be a huge step forward. It will remove the dependency on the EXASOL hosted test DB, allow this project to test against a preferred (or even multiple) EXASOL versions, potentially allow for more parallelism in testing (hence reducing build times).
If possible, please do this on a separate PR. I have a bit of a history with very large PRs on this project and prefer to have PRs that add smaller increments.
On the formatting, no worries. @nndo1991 and I will manage to separate real changes from formatting changes.

@tkilias
Copy link
Collaborator Author

tkilias commented May 19, 2020

Will do another pull request for the docker-db stuff, after that we can rebase this one. One interesting thing I found during my testing is, that I could start the docker-db without direct_io=false in Github actions. https://github.com/tkilias/integration-test-docker-environment/runs/688785706

@tkilias tkilias force-pushed the reimplement_metadata_queries_with_odbc_metadata_functions branch from a492bf7 to 10ff30a Compare May 20, 2020 15:32
@tkilias tkilias merged commit 10ff30a into exasol:master May 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid DB locks when performing metadata queries on SYS tables
2 participants