
adbc.snowflake.statement.ingest_copy_concurrency = "0" blocks copy into #2793

Closed
@davlee1972

Description


What happened?

According to the user docs:

adbc.snowflake.statement.ingest_copy_concurrency

Maximum number of COPY operations to run concurrently. Bulk ingestion performance is optimized by executing COPY queries as files are still being uploaded. Snowflake COPY speed scales with warehouse size, so smaller warehouses may benefit from setting this value higher to ensure long-running COPY queries do not block newly uploaded files from being loaded. Default is 4. If set to 0, only a single COPY query will be executed as part of ingestion, once all files have finished uploading. Cannot be negative.

When set to "0", the COPY INTO operation never kicks off and adbc_ingest() appears to hang in wait().

This is a bug with 1.5 and 1.6. I've tested this on Windows and Linux.

With the '1.5.0dev' driver I downloaded, this wasn't a problem. Downgrading to 1.4 also works.

The current workaround is to set adbc.snowflake.statement.ingest_copy_concurrency = "1".
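A minimal sketch of applying that workaround from Python, assuming a DBAPI cursor from the ADBC driver manager (for example one obtained via `adbc_driver_snowflake.dbapi.connect(...)`). The helper names `copy_concurrency_options` and `ingest_with_workaround` are hypothetical and not part of ADBC; only the option key, `cursor.adbc_statement.set_options(...)`, and `cursor.adbc_ingest(...)` are existing APIs.

```python
OPTION_COPY_CONCURRENCY = "adbc.snowflake.statement.ingest_copy_concurrency"


def copy_concurrency_options(requested: int) -> dict:
    """Build statement options, replacing the hanging value 0 with 1.

    Hypothetical helper: it only encodes the workaround reported in this issue.
    """
    if requested < 0:
        # The user docs state that the option cannot be negative.
        raise ValueError("ingest_copy_concurrency cannot be negative")
    # "0" was reported to hang in 1.5 and 1.6; "1" was reported to work.
    effective = 1 if requested == 0 else requested
    # ADBC option values are passed as strings here.
    return {OPTION_COPY_CONCURRENCY: str(effective)}


def ingest_with_workaround(cursor, table_name, data, requested=0, mode="create"):
    """Set the option on the cursor's underlying statement, then bulk ingest."""
    cursor.adbc_statement.set_options(**copy_concurrency_options(requested))
    return cursor.adbc_ingest(table_name, data, mode=mode)
```

Note that "1" is not equivalent to the documented behavior of "0": as reported later in this thread, it issues several COPY INTO operations one after another rather than a single COPY once all files have been uploaded.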

Stack Trace

No response

How can we reproduce the bug?

No response

Environment/Setup

No response

Activity

davlee1972 (Author) commented on May 7, 2025

The record counts are also off.

Setting it to "1" spawns more than one COPY INTO operation, but they appear to run one at a time. I'm not sure whether this will run into the existing "select count(*)" COPY INTO termination problem.

With the ADBC 1.5.0 driver and adbc.snowflake.statement.ingest_copy_concurrency = "1":

Image

Image

Image

The total record counts are correct, but they don't match the row counts reported by COPY INTO.

COPY INTO reports 98,848,888 rows (17,352,214 + 63,692,116 + 17,059,885 + 744,673) from the 3 screenshots above, versus the correct total of 98,752,339.
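The arithmetic above can be checked with a few lines of Python. All numbers are copied from this comment; the variable names are only illustrative. The COPY INTO figures come out 96,549 rows higher than the correct total.

```python
# Row counts reported by the individual COPY INTO operations (from this comment).
copy_reported = [17_352_214, 63_692_116, 17_059_885, 744_673]

# Correct total row count in the target table (from this comment).
actual_total = 98_752_339

reported_total = sum(copy_reported)
surplus = reported_total - actual_total

print(f"reported by COPY INTO: {reported_total:,}")  # 98,848,888
print(f"actually loaded:       {actual_total:,}")    # 98,752,339
print(f"over-reported rows:    {surplus:,}")         # 96,549
```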

Image

With the ADBC 1.5.0 driver and adbc.snowflake.statement.ingest_copy_concurrency = "0", COPY INTO never executes:

Image

With the ADBC 1.5.0dev driver and adbc.snowflake.statement.ingest_copy_concurrency = "0":

Image

davlee1972 (Author) commented on May 7, 2025

The difference is coming from here:

Image

Image

Image

zeroshade (Member) commented on May 7, 2025

The difference between what COPY INTO reports and what is actually loaded is something we should bring up with Snowflake. I'm not sure what we can do about it in the ADBC driver, since we can only report what COPY INTO returns.

As for the hang, I'll take a look. I'm not sure what changed between that dev build and the 1.5.0 release. Since we just released 1.6.0, can you quickly double-check whether the issue persists in 1.6.0?

davlee1972 (Author) commented on May 7, 2025

I have already opened a support ticket with Snowflake.

This bug is happening in 1.6 too.

zeroshade (Member) commented on May 7, 2025

Okay, thanks. I'm pretty sure I see the cause of the hang. I'll test and put up a fix tomorrow.

added a commit that references this issue on Jul 1, 2025
4fab33e

Labels: Type: bug

Participants: @zeroshade, @davlee1972