Fetching large dataset using ResultSet pauses when hitting FETCH_SIZE #75

@surjikal

Hello! First of all, thanks for making this awesome driver.

I'm grabbing 150k rows (each row with 200+ columns) from the database to generate CSV reports and dumps. I know it's a lot of data, but it is required for a particular use case.

I'm using ResultSet to stream the rows through. It works great until I hit the FETCH_SIZE row count (2^15 = 32768 rows). Then the stream pauses for a minute, after which streaming resumes until the next FETCH_SIZE boundary.

I would expect to see a continuous stream of rows, but I assume the driver buffers FETCH_SIZE rows before it pushes them through the stream. Maybe you can explain this mechanism further? It might give me hints about what I can optimize.
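To make my assumption concrete, here is a rough sketch of where I would expect the pauses to fall. This is not the driver's code: `fetchPlan` is a hypothetical helper I made up, and it assumes the driver requests exactly FETCH_SIZE rows per round trip and only emits rows once a whole batch has arrived.

```javascript
// Hypothetical helper, not part of the driver: lists how many rows each
// fetch would return if the driver asks for fetchSize rows per round trip.
function fetchPlan(totalRows, fetchSize) {
  const batches = [];
  for (let remaining = totalRows; remaining > 0; remaining -= fetchSize) {
    batches.push(Math.min(remaining, fetchSize));
  }
  return batches;
}

const FETCH_SIZE = Math.pow(2, 15); // 32768, the row count where the stream stalls
const plan = fetchPlan(150000, FETCH_SIZE);
console.log(plan.length, plan);
```

If that assumption holds, my 150k rows take five fetches (four full batches of 32768 rows and a final one of 18928), with a pause after rows 32768, 65536, 98304 and 131072. That matches what I am seeing.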

In any case, what can I do to shorten this stream pause? Is there something I can tweak at the driver level?

I tried increasing FETCH_SIZE and MAX_FETCH_SIZE. Unfortunately, once I exceed a certain size I get the error "error while parsing protocol: wrong fetch size".

I also poked around the source and saw averageRowLength. Is this something I should explore? I'm not sure what unit this value is in (bytes, or number of columns?). The default seems low for the number of columns I have.

In addition, is there something I could change in my HANA configuration?
