
Lazy load snowflake imports to improve performance #62365

Open
dwreeves wants to merge 2 commits into apache:main from dwreeves:62362-lazy-load-snowflake

@dwreeves
Contributor

This PR lazy-loads Snowflake modules to improve performance when importing from airflow.providers.snowflake.

Because SnowflakeSqlApiHook is a subclass of SnowflakeHook, and the operators module imports the SnowflakeSqlApiHook globally, this improves the load time of effectively every import from airflow.providers.snowflake and puts significantly less stress on the scheduler instance when parsing the DAG.
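As a rough illustration of the pattern (not the actual diff in this PR), the heavy import moves from module scope into the method that needs it. In this sketch, `ExampleHook` and `is_loaded` are hypothetical names, and the stdlib `sqlite3` module stands in for the Snowflake connector so the snippet runs without extra dependencies:

```python
import sys


class ExampleHook:
    """Hypothetical hook; `sqlite3` stands in for a heavy third-party dependency."""

    def get_conn(self, database: str = ":memory:"):
        # Deferred import: the module is loaded the first time a connection
        # is requested, not when the file defining this class is imported.
        import sqlite3

        return sqlite3.connect(database)


def is_loaded(module_name: str) -> bool:
    """Return True if the named module is already present in sys.modules."""
    return module_name in sys.modules
```

Importing the file that defines `ExampleHook` does not pull in `sqlite3` by itself; the cost is paid on the first `get_conn()` call, and later calls reuse the cached module from `sys.modules`.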

See the linked issue for the full benchmarking code. TL;DR: the cost of importing Snowflake modules is fairly significant:

─────────────────────────── Snowflake Hook — Import Cost Benchmark ───────────────────────────
  Python: 3.13.1 (main, Jan 14 2025, 23:48:54) [Clang 19.1.6 ]
  Runs per scenario: 10  |  also benchmark w/o .pyc: no

            avg ms (±stdev)  |  avg RSS MB  —  10 runs each             
╭───────────────────────────────┬───────────────────┬──────────────────╮
│ Scenario                      │ Time w/ .pyc (ms) │ RSS w/ .pyc (MB) │
├───────────────────────────────┼───────────────────┼──────────────────┤
│ No Snowflake, no SQLAlchemy   │             60 ±2 │             17.2 │
│ No Snowflake, yes SQLAlchemy  │     121 ±1  (+61) │    31.0  (+13.9) │
│ Yes Snowflake, yes SQLAlchemy │    291 ±5  (+231) │    59.8  (+42.6) │
╰───────────────────────────────┴───────────────────┴──────────────────╯
  (+N) = delta vs first scenario  |  ±N = stdev across runs
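For anyone who wants a quick sanity check without the full benchmark, a minimal sketch along these lines times an import statement in a fresh interpreter. This is not the benchmarking code from the linked issue; `time_import` is a hypothetical helper, and the numbers include interpreter startup, so only differences between scenarios are meaningful:

```python
import statistics
import subprocess
import sys
import time


def time_import(statement: str, runs: int = 5) -> tuple[float, float]:
    """Run `statement` in fresh interpreters; return (mean ms, stdev ms).

    `runs` must be at least 2 so that a standard deviation can be computed.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # A new process per run guarantees the import is never served
        # from an already-populated sys.modules.
        subprocess.run([sys.executable, "-c", statement], check=True)
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.mean(samples), statistics.stdev(samples)
```

Comparing `time_import("pass")` against `time_import("import some_module")` gives an approximate delta for that module. For a per-module breakdown, `python -X importtime -c "import some_module"` is another option.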

The one thing I am unclear about is how to cover this with tests. I have previously written code that asserts a module is not imported, and I also see this in some cases in Airflow (see [1], [2]); however, I'm not sure whether that's done generally or whether it makes sense here.
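One possible shape for such a test, sketched under assumptions: run the import in a fresh interpreter and inspect `sys.modules` afterwards. The helper name `modules_loaded_after` is hypothetical, and this is not necessarily how the existing Airflow tests referenced above do it:

```python
import subprocess
import sys


def modules_loaded_after(statement: str, prefix: str) -> list[str]:
    """Run `statement` in a fresh interpreter and list loaded modules
    that are `prefix` itself or live under the `prefix.` namespace."""
    probe = "\n".join(
        [
            "import sys",
            statement,
            f"prefix = {prefix!r}",
            "hits = sorted(m for m in sys.modules if m == prefix or m.startswith(prefix + '.'))",
            "print('\\n'.join(hits))",
        ]
    )
    result = subprocess.run(
        [sys.executable, "-c", probe], capture_output=True, text=True, check=True
    )
    # An empty result prints a blank line, which split() turns into [].
    return result.stdout.split()
```

A test for this PR could then assert that the list is empty after importing from `airflow.providers.snowflake` with the connector's top-level package as the prefix. The subprocess matters because another test in the same session may already have imported the module.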


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

(No AI code is present in this commit. However, I did use code generated by Claude to help benchmark the performance improvement.)

@dwreeves dwreeves requested a review from potiuk as a code owner February 23, 2026 15:57
@boring-cyborg boring-cyborg bot added area:providers provider:snowflake Issues related to Snowflake provider labels Feb 23, 2026
@dwreeves dwreeves force-pushed the 62362-lazy-load-snowflake branch 2 times, most recently from 53fc532 to bc41432 Compare February 23, 2026 17:43
@dwreeves dwreeves force-pushed the 62362-lazy-load-snowflake branch 2 times, most recently from 558a677 to ff6f899 Compare February 24, 2026 17:27

Development

Successfully merging this pull request may close these issues.

[Feat] [Snowflake Provider] Lazy import of snowflake modules in snowflake/hooks/snowflake.py
