Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vdk-core: add new managed db_connection_execute_operation hook #805

Merged
merged 1 commit into from
Jun 4, 2022

Conversation

antoniivanov
Copy link
Collaborator

Currently we cannot track the full execution of a query (before and
after it) . This is necessary in order to be able to generate proper
lineage events (query start and end) or to track its duration (we had to
currently had to add this to vdk-core
(#804) but that's
really better implemented as plugin (think aspect-oriented programming).

In the future we should consider adding hook for fetch (fetchMany,
fetchAll) as well as it's there where the data is returned (plugins can
take stats - count number of rows, validate sensitive columns, do result
validation tests, etc.)

Testing Done: unit tests, functional test.

Signed-off-by: Antoni Ivanov aivanov@vmware.com

@antoniivanov antoniivanov force-pushed the person/aivanov/vdk-core-cursor branch 7 times, most recently from ea4bd00 to 82cced3 Compare May 4, 2022 14:32
@antoniivanov antoniivanov force-pushed the person/aivanov/vdk-core-cursor branch from 82cced3 to 44bb447 Compare May 4, 2022 19:09
@antoniivanov antoniivanov force-pushed the person/aivanov/vdk-core-cursor branch 2 times, most recently from 79dd795 to 2101a41 Compare June 3, 2022 08:08
Copy link
Contributor

@ivakoleva ivakoleva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be simpler for API usage, yet local context with yielding is possible this way,
we agreed to merge

@antoniivanov
Copy link
Collaborator Author

antoniivanov commented Jun 3, 2022

Could be simpler for API usage

I don't think we have an idea for simpler API usage than this?

The other possible API could be to have two hooks

db_connection_execute_before_operation(cursor, query)
db_connection_execute_after_operation(cursor, query)

But in that case, a simple use case of tracking the duration of query becomes very hard to implement
Maybe like this

class QueryDuratinPlugin:
     queries: Dict # keep state of queries

     def db_connection_execute_before_operation(self, cursor) 
            # we do not have id because query is not yet executed so we have to generate it
            id = generate_id(query) 
            query = update_query_comments_with_id(query)
            self.queries[id] = time.time() 
   
    def db_connection_execute_after_operation(self, cursor) 
            id = parse_id_from_comments(query) # we do not have id because query is not yet executed 
            start = self.queries[id]
            end = time.time()
            log.info("Duration: {end - start}")
   

while now with this hook it is:

            @hookimpl(hookwrapper=True)
            db_connection_execute_operation(execution_cursor: ExecutionCursor) -> Optional[int]: 
                start = time.time()
                outcome = yield # we yield the execution to other implementations (including default one) 
                end = time.time()
                log.info(f" duration: {end - start}. ")

@antoniivanov antoniivanov force-pushed the person/aivanov/vdk-core-cursor branch 4 times, most recently from e0a7bb4 to e9c8de0 Compare June 4, 2022 17:40
Currently we cannot track the full execution of a query (before and
after it) . This is necessary in order to be able to generate proper
lineage events (query start and end) or to track its duration (we had to
currently had to add this to vdk-core
(#804) but that's
really better implemented as plugin (think aspect-oriented programming).

In the future we should consider adding hook for fetch (fetchMany,
fetchAll) as well as it's there where the data is returned (plugins can
take stats - count number of rows, validate sensitive columns, do result
validation tests, etc.)

Testing Done: unit tests, functional test.

Signed-off-by: Antoni Ivanov <aivanov@vmware.com>
@antoniivanov antoniivanov force-pushed the person/aivanov/vdk-core-cursor branch from e9c8de0 to c161f33 Compare June 4, 2022 21:42
@antoniivanov antoniivanov merged commit f2baeb7 into main Jun 4, 2022
@antoniivanov antoniivanov deleted the person/aivanov/vdk-core-cursor branch June 4, 2022 23:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants