feat: prepare_table() makes O(1) calls to prepare_column(), instead of O(n) (n number of columns) #2353

Open
raulbonet opened this issue Mar 30, 2024 · 0 comments
Labels
kind/Feature New feature or request valuestream/SDK

raulbonet commented Mar 30, 2024

Feature scope

Taps (catalog, state, stream maps, tests, etc.)

Description

SQLConnector.prepare_table() calls prepare_column() once for each column in a table.
In turn, prepare_column() queries the database to determine whether the column exists, so preparing a table with n columns makes O(n) calls to the database.
However, a single call can retrieve all of the table's columns, and that result can be reused for every column check.
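
To illustrate the difference in query count, here is a minimal sketch using an in-memory SQLite database. It is not SDK code: `CountingDB`, `missing_columns_per_column()` and `missing_columns_single_query()` are hypothetical names made up for this example, and SQLite's `PRAGMA table_info` stands in for whatever metadata lookup the real connector performs.

```python
import sqlite3


class CountingDB:
    """Hypothetical wrapper around sqlite3 that counts metadata queries."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.metadata_queries = 0

    def get_column_names(self, table):
        # One round trip returns every column of the table.
        self.metadata_queries += 1
        rows = self.conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        return {row[1] for row in rows}  # row[1] is the column name

    def column_exists(self, table, column):
        # Per-column check: each call costs one metadata query.
        return column in self.get_column_names(table)


def missing_columns_per_column(db, table, wanted):
    # O(n) metadata queries: one existence check per wanted column.
    return [c for c in wanted if not db.column_exists(table, c)]


def missing_columns_single_query(db, table, wanted):
    # O(1) metadata queries: fetch all columns once, then check in memory.
    existing = db.get_column_names(table)
    return [c for c in wanted if c not in existing]


db = CountingDB()
db.conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
wanted = ["id", "name", "email"]

per_column_result = missing_columns_per_column(db, "users", wanted)
per_column_queries = db.metadata_queries

db.metadata_queries = 0
single_query_result = missing_columns_single_query(db, "users", wanted)
single_query_queries = db.metadata_queries
```

Both functions report the same missing columns; only the number of metadata queries differs (one per column versus one per table).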

target-postgres already implements this optimisation by overriding the default prepare_table() and prepare_column() methods.

The target-postgres implementation required changing the signatures of these functions; my implementation takes a different approach.
I move the responsibility for checking whether a column exists out of prepare_column() and encapsulate it in a new function called prepare_table_columns().
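
A rough sketch of how this split could look is below. It is only an illustration of the idea, not the actual change: `SketchConnector` is a toy stand-in for the connector that keeps tables in a dict, and `get_table_columns()`, `create_column()` and `adapt_column_type()` are hypothetical helper names. Only `prepare_table()` and `prepare_table_columns()` are names taken from this issue.

```python
class SketchConnector:
    """Toy stand-in for a SQL connector (hypothetical, not the SDK class)."""

    def __init__(self, tables):
        self.tables = tables  # table name -> {column name: SQL type}
        self.lookups = 0      # number of simulated metadata queries
        self.log = []         # record of (action, column) pairs

    def get_table_columns(self, table):
        # Simulates one database round trip returning all columns.
        self.lookups += 1
        return dict(self.tables[table])

    def create_column(self, table, name, sql_type):
        self.tables[table][name] = sql_type
        self.log.append(("create", name))

    def adapt_column_type(self, table, name, sql_type):
        # Only touch the column if the type actually changed.
        if self.tables[table][name] != sql_type:
            self.tables[table][name] = sql_type
            self.log.append(("adapt", name))

    def prepare_table_columns(self, table, columns):
        # The existence check lives here: one lookup for the whole table.
        existing = self.get_table_columns(table)
        for name, sql_type in columns.items():
            if name not in existing:
                self.create_column(table, name, sql_type)
            else:
                self.adapt_column_type(table, name, sql_type)

    def prepare_table(self, table, columns):
        if table not in self.tables:
            # New table: create it with all columns in one step.
            self.tables[table] = dict(columns)
            return
        self.prepare_table_columns(table, columns)


connector = SketchConnector({"users": {"id": "INTEGER"}})
connector.prepare_table("users", {"id": "INTEGER", "email": "TEXT"})
```

In this sketch the per-column helpers no longer query for existence themselves, so preparing an existing table costs a single lookup regardless of the number of columns.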

One further thought: I think the underlying issue is that functions like prepare_table() and the new prepare_table_columns() belong in the SQLSink class rather than in the connector, since a new sink is created every time there is a schema change. Perhaps the class that knows the current schema, and can therefore make these optimisations, is the sink rather than the connector?
