Slow package import times in debug #292
Comments
I think the culprit is somewhere else in your environment or code. I used your reproduction steps and I get sub-second results in VSCode debug mode. I develop this connector in VSCode as well and have never seen such delays.
If you can share a complete code sample, we can try to dig deeper into this. It should not take more than a few seconds to connect to a Databricks compute resource. But ttypes alone isn't the issue.
Hi! Thank you for the quick answer. Upon further research, the issue lies with the Docker image used. Pulling the image in the Dockerfile using For now the team will use an Ubuntu image in development and a minimal python3.12 image for the production app. Thanks!
Thanks for following up with your findings!
Hi, this is happening only when I run my application's test suite. I need to do more analysis...
After some experimentation I've found the root cause of this and written about it here: #369 (comment). The only workaround I can think of at the moment is to run your tests without the
Hi,
Environment:
Python 3.12.0 (also tried 3.8, 3.10)
databricks-sql-connector==3.0.0
SQLAlchemy==2.0.23
IDE: VS Code
When connecting to Databricks with SQLAlchemy, we noticed huge delays in debug mode: connecting to the endpoint took multiple minutes. Digging through the code, we found that the hang happened while importing the databricks.sqlalchemy package.
import databricks.sqlalchemy
Digging further to locate the problem, we found that it was ttypes.py, which contains ~111k lines and takes a long time to parse in debug mode.
Timing the import with a simple script gives:
Run without debugging:
Time to load: from databricks.sql.thrift_api.TCLIService.ttypes import TSparkArrowResultLink 0.02857661247253418
Run with debugging:
Time to load: from databricks.sql.thrift_api.TCLIService.ttypes import TSparkArrowResultLink 203.38593983650208
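For anyone who wants to reproduce a measurement like the one above, here is a minimal sketch of an import timer. It is not the script used for the numbers in this issue; the helper name time_import is hypothetical, and the demo call uses a standard-library module so that it runs without the connector installed.

```python
import importlib
import sys
import time


def time_import(module_name: str) -> float:
    """Return the seconds taken to import `module_name`.

    `time_import` is a hypothetical helper for illustration only.
    """
    # Drop the cached module object so the import statement does real work.
    # Parent packages and submodules that are already cached stay cached,
    # so run this in a fresh interpreter for a true cold-start number.
    sys.modules.pop(module_name, None)
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


# Demo with a standard-library module. For the case in this issue, the
# module name would be "databricks.sql.thrift_api.TCLIService.ttypes"
# (this requires databricks-sql-connector to be installed).
print(f"Time to load: json {time_import('json'):.6f}")
```

Running the same file once with "Run Without Debugging" and once under the debugger should show whether the slowdown comes from the debug session or from the environment (for example, the container image).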
Would there be a way to reduce the loading time of the package in debug mode?
Thanks in advance!