Shouldn't API client pass stream = True to the requests when downloading datasets? #10

@lcorcodilos

Linking to the issue of the same name in kaggle-api:

Kaggle/kaggle-api#754

I'm raising it here since the problematic code is now maintained in kagglesdk (KaggleHttpClient.call is where I think this could be fixed).

TL;DR: Not passing stream=True to requests causes the entire dataset to be materialized in memory, which makes it impossible to download anything of even a modest size.
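For illustration, here is a minimal sketch of what a streamed download looks like with requests. It is not the kagglesdk code: the function name `download_to_file`, its parameters, and the 1 MiB chunk size are all hypothetical, and how this would fit into KaggleHttpClient.call is an open question.

```python
import requests


def download_to_file(url, dest_path, chunk_size=1024 * 1024, session=None):
    """Stream the body at `url` into `dest_path` and return the bytes written.

    Hypothetical helper for illustration only; not part of kagglesdk.
    """
    http = session if session is not None else requests
    written = 0
    # stream=True defers reading the body until it is iterated, so only
    # about one chunk is held in memory at a time.
    with http.get(url, stream=True, timeout=30) as resp:
        resp.raise_for_status()
        with open(dest_path, "wb") as out:
            for chunk in resp.iter_content(chunk_size=chunk_size):
                if chunk:  # skip empty keep-alive chunks
                    out.write(chunk)
                    written += len(chunk)
    return written
```

Without stream=True, requests reads the whole body into memory before returning the response, which is the behavior described above. With it, peak memory is bounded by roughly `chunk_size` rather than by the dataset size.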

See the linked issue for more details; I'm happy to expand the conversation here.

Tagging @leoauri and @i-aki-y for visibility.
