Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing bug #18746. Adding a new kwargs parameter to DbClient #18873

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

EtienneT
Copy link

Summary & Motivation

Fixing issue #18746.

DuckDb can only have one write connection at a time, but multiple read_only connection. Right now we always open duckdb in write mode which can cause exceptions when multiple ops try to access the file at the same time (specially on a slow server).

The goal is to be able to open duckdb in read_only whenever possible and only open it in write mode when we need to.

I added a new kwargs parameter to DbClient connect method and a new way to pass a lambda to DbIOManager to calculate this kwargs dynamically based in the context (either reading or writing data).

How I Tested These Changes

This is my first PR to dagster and the contributing guidelines does not seems to be made for a windows dev, so unfortunately I did not test the code. Sorry in advance.

@@ -279,13 +280,16 @@ def get_select_statement(table_slice: TableSlice) -> str:

@staticmethod
@contextmanager
def connect(context, _):
def connect(context, _, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you consider determining whether to do a read connection by running isinstance(context, InputContext) in this function rather than doing the kwargs method?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used kwargs because you had mentionned it in the issue:

We can make this a kwarg with defaults so that any custom DBClients out there won't see breaking changes.

But I don't have a strong opinion about it. I mainly do software engineering with C#, so I don't have as much experience with the best practices in python yet + I am not certain if this follows your conventions elsewhere.

Also please note I was not able to test this code since I am in windows and it seemed quite complicated to setup.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants