Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix python profile html generator to support updated JSON format. #2880

Merged

Conversation

ChristianMurphy
Copy link
Contributor

when a profile is generated using

-- enable profiling in json format
PRAGMA enable_profiling='json';
-- write the profiling output to a specific file on disk
PRAGMA profile_output='/path/to/file.json';

and an HTML document is generated using

import duckdb_query_graph
duckdb_query_graph.generate('/path/to/file.json', '/path/to/out.html')

as documented at https://duckdb.org/docs/dev/profiling

Two issues occur.

  1. graph_data is now the root node of the graph, it is no longer nested
    under graph_data.tree
  2. extra_info is now option, it is present on some, but all nodes

this addresses both issues, but starting the parsing from the root node
directly, and adding an empty fallback for extra_info so split() does
not crash.

when a profile is generated using

```sql
-- enable profiling in json format
PRAGMA enable_profiling='json';
-- write the profiling output to a specific file on disk
PRAGMA profile_output='/path/to/file.json';
```

and an HTML document is generated using

```python
import duckdb_query_graph
duckdb_query_graph.generate('/path/to/file.json', '/path/to/out.html')
```

as documented at https://duckdb.org/docs/dev/profiling

Two issues occur.

1. graph_data is now the root node of the graph, it is no longer nested
under graph_data.tree
2. extra_info is now option, it is present on some, but all nodes

this addresses both issues, but starting the parsing from the root node
directly, and adding an empty fallback for extra_info so `split()` does
not crash.
@Mytherin
Copy link
Collaborator

Mytherin commented Jan 8, 2022

Thanks for the PR!

It was not entirely clear to me, the problem here is that the graph generation fails without these changes, correct?

Could you perhaps also add a test case to the CI that tests this workflow so that this does not break again? For example this could be added to the Python client tests, or as a shell script to e.g. the LinuxRelease CI run.

@ChristianMurphy
Copy link
Contributor Author

ChristianMurphy commented Jan 8, 2022

It was not entirely clear to me, the problem here is that the graph generation fails without these changes, correct?

Correct, more specifically.
The HTML and JavaScript generated by Python is valid (parse-able).
But when it is run in a browser, the graph generation function on the JavaScript side fails with various undefined exceptions, resulting in a blank page in browser.

For example:

undefined error

For example this could be added to the Python client tests

I'm not sure a Python client test would be able to catch this.
Unless, a browser was installed as part of CI, and the Python tests use a driver to instruct the browser to open the generated HTML+JS document.

Which is possible with libraries like https://playwright.dev/python/docs/intro and https://selenium-python.readthedocs.io/, but would add the overhead of needing to download browser(s) in the CI task and spin them up for this one test.

@Mytherin Mytherin merged commit e77e3ab into duckdb:master Jan 8, 2022
@Mytherin
Copy link
Collaborator

Mytherin commented Jan 8, 2022

Fair enough, I thought the problem would manifest already in the actual Python generation, but if the problem is in the generated javascript this is indeed much more difficult to test. I will just merge the PR as-is then.

Cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants