kaggle datasets list results in TypeError: '>=' not supported between instances of 'NoneType' and 'int' #239

@mobileben

I installed kaggle via

pip3 install kaggle

I've verified the version is 1.5.6.

When running kaggle datasets list, I get the following traceback.

Traceback (most recent call last):
  File "/usr/local/bin/kaggle", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/kaggle/cli.py", line 51, in main
    out = args.func(**command_args)
  File "/usr/local/lib/python3.7/site-packages/kaggle/api/kaggle_api_extended.py", line 940, in dataset_list_cli
    max_size, min_size)
  File "/usr/local/lib/python3.7/site-packages/kaggle/api/kaggle_api_extended.py", line 905, in dataset_list
    return [Dataset(d) for d in datasets_list_result]
  File "/usr/local/lib/python3.7/site-packages/kaggle/api/kaggle_api_extended.py", line 905, in <listcomp>
    return [Dataset(d) for d in datasets_list_result]
  File "/usr/local/lib/python3.7/site-packages/kaggle/models/kaggle_models_extended.py", line 69, in __init__
    self.size = File.get_size(self.totalBytes)
  File "/usr/local/lib/python3.7/site-packages/kaggle/models/kaggle_models_extended.py", line 109, in get_size
    while size >= 1024 and suffix_index < 4:
TypeError: '>=' not supported between instances of 'NoneType' and 'int'

The problem stems from a few refs that don't appear to have any data files, for example bigquery/crypto-litecoin.

I've traced the problem to

self.size = File.get_size(self.totalBytes)

The issue seems to be that, in these cases, self.totalBytes is None because it isn't in the parsed_dict.
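The failure can be reproduced in isolation. The sketch below is not code from the kaggle package; compare_size is a hypothetical helper that only repeats the `size >= 1024` comparison from the traceback, to show that Python 3 refuses to order None against an int.

```python
def compare_size(size):
    """Repeat the comparison from the traceback for a given size value.

    Hypothetical helper for illustration; not part of the kaggle package.
    Returns the comparison result, or the error message if it fails.
    """
    try:
        return size >= 1024
    except TypeError as exc:
        # Python 3 does not define an ordering between None and int.
        return str(exc)


print(compare_size(2048))  # a normal byte count compares fine
print(compare_size(None))  # a missing byte count triggers the TypeError
```

Any dataset whose byte count comes back as None would hit this comparison as soon as the listing tries to format its size.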

I've created a PR for this. I've made get_size treat a size of None as 0.
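A minimal sketch of that idea, assuming a standalone function rather than the actual File.get_size from the package: format_size and its suffix list are illustrative names, and the code in the PR may differ. Only the None guard is the point here.

```python
def format_size(size, precision=0):
    """Format a byte count as a short human-readable string.

    Illustrative stand-in for get_size; not the code from the PR.
    A size of None is treated as 0 instead of raising TypeError.
    """
    if size is None:
        size = 0  # datasets without data files report no byte count
    suffixes = ['B', 'KB', 'MB', 'GB', 'TB']
    suffix_index = 0
    # Divide down by 1024 until the value fits the current suffix.
    while size >= 1024 and suffix_index < len(suffixes) - 1:
        suffix_index += 1
        size /= 1024.0
    return '%.*f%s' % (precision, size, suffixes[suffix_index])


print(format_size(None))
print(format_size(3 * 1024 * 1024))
```

Guarding inside the formatting function keeps every caller safe, at the cost of showing 0B for datasets whose size is simply unknown.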

This results in output looking like:

bigquery/crypto-litecoin                 Litecoin Crypto Blockchain                   0B   2019-02-14 20:22:27  0      25    0.6764706
bigquery/crypto-bitcoin-cash             Bitcoin Cash Blockchain                      0B   2019-02-14 18:08:11  0      54    0.6764706
bigquery/crypto-dogecoin                 Dogecoin Crypto Blockchain                   0B   2019-02-14 20:22:27  0      23    0.6764706
rajeevw/ufcdata                          UFC-Fight historical data from 1993 to 2019  3MB  2019-07-05 09:58:02  13989  530   0.9705882
bigquery/crypto-ethereum-classic         Ethereum Classic Blockchain                  0B   2019-03-20 23:21:25  0      112   0.7058824
dgomonov/new-york-city-airbnb-open-data  New York City Airbnb Open Data               2MB  2019-08-12 16:24:45  49718  1163  1.0

Note that others have hit this issue as well: https://stackoverflow.com/questions/59445928/kaggle-api-in-colab-datasets-kaggle-datasets-list-error/59559533#59559533
