Fix remaining tfds datasets bugs on windows #1914

Merged: 1 commit merged into tensorflow:master on Apr 30, 2020

Contributor

@vijayphoenix vijayphoenix commented Apr 19, 2020

Fixed the Windows test failures in tfds\image, tfds\obj_dec (object_detection), tfds\structured, tfds\text, and tfds\translate.
Fixes #1901.

See comments of #1911 for old results.

@googlebot googlebot added the cla: yes Author has signed CLA label Apr 19, 2020
@tfds-bot tfds-bot added the community:please_review Community - We need your help to review this PR. label Apr 19, 2020
Contributor Author

vijayphoenix commented Apr 19, 2020

New results tfds/image

See pytest results

============================= test session starts =============================
platform win32 -- Python 3.7.6, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: C:\Users\VIJAY\Desktop\GitHub_Repos\datasets
plugins: forked-1.1.3, xdist-1.31.0
collected 120 items

tensorflow_datasets\image\abstract_reasoning_test.py ....s               [  4%]
tensorflow_datasets\image\aflw2k3d_test.py ....s                         [  8%]
tensorflow_datasets\image\arc_test.py ....s                              [ 12%]
tensorflow_datasets\image\binarized_mnist_test.py ....s                  [ 16%]
tensorflow_datasets\image\celeba_test.py ....s                           [ 20%]
tensorflow_datasets\image\celebahq_test.py ....s                         [ 25%]
tensorflow_datasets\image\cityscapes_test.py ....s....s....s....s        [ 41%]
tensorflow_datasets\image\clevr_test.py ....s                            [ 45%]
tensorflow_datasets\image\coil100_test.py ....s                          [ 50%]
tensorflow_datasets\image\div2k_test.py ....s....s                       [ 58%]
tensorflow_datasets\image\downsampled_imagenet_test.py ....s             [ 62%]
tensorflow_datasets\image\dsprites_test.py ....s                         [ 66%]
tensorflow_datasets\image\duke_ultrasound_test.py .s..s                  [ 70%]
tensorflow_datasets\image\flic_test.py ....s....s                        [ 79%]
tensorflow_datasets\image\lost_and_found_test.py ....s                   [ 83%]
tensorflow_datasets\image\lsun_test.py .F..s                             [ 87%]
tensorflow_datasets\image\scene_parse_150_test.py ....s                  [ 91%]
tensorflow_datasets\image\shapes3d_test.py ....s                         [ 95%]
tensorflow_datasets\image\the300w_lp_test.py ....s                       [100%]

================================== FAILURES ===================================
C:\ProgramData\Miniconda3\lib\site-packages\tensorflow_io\core\python\ops\__init__.py:67: NotImplementedError: unable to open file: libtensorflow_io.so, from paths: ['C:\\ProgramData\\Miniconda3\\lib\\site-packages\\tensorflow_io\\core\\python\\ops\\libtensorflow_io.so']
=========================== short test summary info ===========================
FAILED tensorflow_datasets/image/lsun_test.py::LsunTest::test_download_and_prepare_as_dataset
===== 1 failed, 95 passed, 24 skipped, 969 warnings in 244.49s (0:04:04) ======

Contributor Author

vijayphoenix commented Apr 19, 2020

New results tfds/text

See pytest results

============================= test session starts =============================
platform win32 -- Python 3.7.6, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: C:\Users\VIJAY\Desktop\GitHub_Repos\datasets
plugins: forked-1.1.3, xdist-1.31.0
collected 260 items

tensorflow_datasets\text\blimp_test.py ....s                             [  1%]
tensorflow_datasets\text\c4_test.py ....s....s                           [  5%]
tensorflow_datasets\text\c4_utils_test.py ........s.                     [  9%]
tensorflow_datasets\text\cfq_test.py ....s                               [ 11%]
tensorflow_datasets\text\civil_comments_test.py ....s                    [ 13%]
tensorflow_datasets\text\cos_e_test.py ....s                             [ 15%]
tensorflow_datasets\text\definite_pronoun_resolution_test.py ....s       [ 17%]
tensorflow_datasets\text\eraser_multi_rc_test.py ....s                   [ 19%]
tensorflow_datasets\text\esnli_test.py ....s                             [ 21%]
tensorflow_datasets\text\gap_test.py ....s                               [ 23%]
tensorflow_datasets\text\glue_test.py ....s....s....s....s....s....s.... [ 36%]
s....s....s....s                                                         [ 42%]
tensorflow_datasets\text\imdb_test.py ....s                              [ 44%]
tensorflow_datasets\text\librispeech_lm_test.py ....s                    [ 46%]
tensorflow_datasets\text\lm1b_test.py ....s                              [ 48%]
tensorflow_datasets\text\math_dataset_test.py ....s                      [ 50%]
tensorflow_datasets\text\movie_rationales_test.py ....s                  [ 51%]
tensorflow_datasets\text\multi_nli_mismatch_test.py ....s                [ 53%]
tensorflow_datasets\text\multi_nli_test.py ....s                         [ 55%]
tensorflow_datasets\text\natural_questions_test.py ....s                 [ 57%]
tensorflow_datasets\text\qa4mre_test.py ....s                            [ 59%]
tensorflow_datasets\text\scan_test.py ....s                              [ 61%]
tensorflow_datasets\text\scicite_test.py ....s                           [ 63%]
tensorflow_datasets\text\snli_test.py ....s                              [ 65%]
tensorflow_datasets\text\squad_test.py ....s                             [ 67%]
tensorflow_datasets\text\super_glue_test.py ....s....s....s....s....s... [ 78%]
.s....s....s....s....s                                                   [ 86%]
tensorflow_datasets\text\tiny_shakespeare_test.py ....s                  [ 88%]
tensorflow_datasets\text\triviaqa_test.py ....s                          [ 90%]
tensorflow_datasets\text\web_questions_test.py ....s                     [ 92%]
tensorflow_datasets\text\wiki40b_test.py ....s                           [ 94%]
tensorflow_datasets\text\wikipedia_test.py ....s                         [ 96%]
tensorflow_datasets\text\xnli_test.py ....s                              [ 98%]
tensorflow_datasets\text\yelp_polarity_test.py ....s                     [100%]

========== 209 passed, 51 skipped, 204 warnings in 401.06s (0:06:41) ==========

@tfds-bot tfds-bot added community:is_reviewing This PR is being reviewed by community member. and removed community:please_review Community - We need your help to review this PR. labels Apr 19, 2020
Contributor Author

vijayphoenix commented Apr 19, 2020

New results tfds/translate

See pytest results

============================= test session starts =============================
platform win32 -- Python 3.7.6, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: C:\Users\VIJAY\Desktop\GitHub_Repos\datasets
plugins: forked-1.1.3, xdist-1.31.0
collected 31 items

tensorflow_datasets\translate\para_crawl_test.py ....s                   [ 16%]
tensorflow_datasets\translate\ted_hrlr_test.py ....s                     [ 32%]
tensorflow_datasets\translate\ted_multi_test.py ....s                    [ 48%]
tensorflow_datasets\translate\wmt19_test.py ....s....s                   [ 80%]
tensorflow_datasets\translate\wmt_test.py .....s                         [100%]

================= 25 passed, 6 skipped, 7 warnings in 25.88s ==================

Contributor Author

vijayphoenix commented Apr 19, 2020

New results tfds/obj_dec

See pytest results

============================= test session starts =============================
platform win32 -- Python 3.7.6, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: C:\Users\VIJAY\Desktop\GitHub_Repos\datasets
plugins: forked-1.1.3, xdist-1.31.0
collected 50 items

tensorflow_datasets\object_detection\coco_test.py ....s....s....s        [ 30%]
tensorflow_datasets\object_detection\kitti_test.py ....s                 [ 40%]
tensorflow_datasets\object_detection\open_images_challenge2019_test.py . [ 42%]
...s                                                                     [ 50%]
tensorflow_datasets\object_detection\open_images_test.py ....s           [ 60%]
tensorflow_datasets\object_detection\voc_test.py ....s....s              [ 80%]
tensorflow_datasets\object_detection\waymo_open_dataset_test.py ....s    [ 90%]
tensorflow_datasets\object_detection\wider_face_test.py ....s            [100%]

========== 40 passed, 10 skipped, 2308 warnings in 163.68s (0:02:43) ==========

Contributor Author

vijayphoenix commented Apr 19, 2020

New results tfds/structured

See pytest results

============================= test session starts =============================
platform win32 -- Python 3.7.6, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: C:\Users\VIJAY\Desktop\GitHub_Repos\datasets
plugins: forked-1.1.3, xdist-1.31.0
collected 35 items

tensorflow_datasets\structured\amazon_us_reviews_test.py ....s           [ 14%]
tensorflow_datasets\structured\forest_fires_test.py ....s                [ 28%]
tensorflow_datasets\structured\german_credit_numeric_test.py ....s       [ 42%]
tensorflow_datasets\structured\higgs_test.py ....s                       [ 57%]
tensorflow_datasets\structured\iris_test.py ....s                        [ 71%]
tensorflow_datasets\structured\rock_you_test.py ....s                    [ 85%]
tensorflow_datasets\structured\titanic_test.py ....s                     [100%]

================= 28 passed, 7 skipped, 7 warnings in 13.77s ==================

Member

@Conchylicultor Conchylicultor left a comment

Thank you!

@@ -106,7 +106,8 @@ def _info(self):
'probe': tfds.features.Tensor(shape=(), dtype=tf.string),
'scanner': tfds.features.Tensor(shape=(), dtype=tf.string),
'target': tfds.features.Tensor(shape=(), dtype=tf.string),
'timestamp_id': tfds.features.Tensor(shape=(), dtype=tf.uint32),
# Use tf.uint64 to prevent possible overflow on windows `sys.maxsize`
Member

Could you provide more context on this one? uint32 should be system-independent.

Contributor Author

I got the following stack trace:

See pytest results

ERROR: test_download_and_prepare_as_dataset (__main__.DukeUltrasoundTest)
test_download_and_prepare_as_dataset (__main__.DukeUltrasoundTest)
Run the decorated test method.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\testing\test_utils.py", line 198, in decorated
    f(self, *args, **kwargs)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\testing\dataset_builder_testing.py", line 298, in test_download_and_prepare_as_dataset
    self._download_and_prepare_as_dataset(self.builder)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\testing\dataset_builder_testing.py", line 359, in _download_and_prepare_as_dataset
    builder.download_and_prepare(download_config=download_config)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\api_utils.py", line 69, in disallow_positional_args_dec
    return fn(*args, **kwargs)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\dataset_builder.py", line 363, in download_and_prepare
    download_config=download_config)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\dataset_builder.py", line 996, in _download_and_prepare
    max_examples_per_split=download_config.max_examples_per_split,
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\dataset_builder.py", line 928, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\dataset_builder.py", line 1012, in _prepare_split
    example = self.info.features.encode_example(record)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\features\features_dict.py", line 170, in encode_example
    in utils.zip_dict(self._feature_dict, example_dict)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\features\features_dict.py", line 169, in <dictcomp>
    for k, (feature, example_value)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\features\feature.py", line 541, in encode_example
    example_data = np.array(example_data, dtype=np_dtype)
OverflowError: Python int too large to convert to C long

----------------------------------------------------------------------
Ran 5 tests in 1.021s

I am using 64-bit Windows 10 with Python 3.7.7.

The dataset was prepared successfully once I replaced tf.uint32 with tf.uint64.

Member

This doesn't seem to be the right place to fix this.

The error is raised in feature.py, line 541, in encode_example, so it might be an issue with np_dtype or similar. What is the int value?
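
One way to answer the question about the int value is to print the offending integer next to the two ranges that matter: the declared feature width (32 bits here) and the platform's C `long`. The sketch below is only a diagnostic aid. `describe_int` is a hypothetical helper, not part of tensorflow_datasets, and the sample values are illustrative because the thread does not state the real `timestamp_id`.

```python
import ctypes


def describe_int(value, declared_bits=32):
    """Report whether `value` fits an unsigned field of `declared_bits` bits
    and whether it fits the platform's signed C long.

    Hypothetical debugging helper; not part of tensorflow_datasets.
    """
    # C long is 32 bits on 64-bit Windows, but 64 bits on most 64-bit
    # Linux and macOS builds, so this number differs between platforms.
    c_long_bits = ctypes.sizeof(ctypes.c_long) * 8
    return {
        "value": value,
        "bit_length": value.bit_length(),
        "fits_unsigned_declared": 0 <= value < 2 ** declared_bits,
        "c_long_bits": c_long_bits,
        "fits_signed_c_long": (
            -(2 ** (c_long_bits - 1)) <= value < 2 ** (c_long_bits - 1)
        ),
    }


# Illustrative values only; the real timestamp_id is not given in the thread.
for candidate in (2 ** 31 - 1, 2 ** 31, 2 ** 32):
    print(describe_int(candidate))
```

If the value fits the declared unsigned width but not the signed C `long`, that would point at the conversion step rather than at the feature definition; if it does not fit 32 bits at all, the declared dtype itself is too small.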

@tfds-bot tfds-bot added author:please_respond Author - please respond to the recent comments. and removed community:is_reviewing This PR is being reviewed by community member. labels Apr 20, 2020
@vijayphoenix vijayphoenix changed the title Fix tfds datasets bugs on windows Fix remaining tfds datasets bugs on windows Apr 20, 2020
@tfds-bot tfds-bot added tfds:is_reviewing TFDS team: PTAL and removed author:please_respond Author - please respond to the recent comments. labels Apr 20, 2020
Member

@Conchylicultor Conchylicultor left a comment

Thanks for fixing this. I'm going to do a partial merge of this PR to fix the OS issues. The uint overflow seems to be a different problem, so I would prefer to better understand it before fixing it.
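
The diff for the OS fixes is not shown in this conversation, so the following is only a hypothetical illustration of the kind of path-separator bug that can make a dataset yield no examples on Windows. It assumes a generator that derives a label from the parent directory of a file path; `label_from_path` and the sample paths are invented for this sketch.

```python
import posixpath


def label_from_path(path):
    """Return the parent directory name of `path`, whichever separator it uses.

    Hypothetical helper: normalizing to forward slashes first avoids
    depending on the separator of the OS the code happens to run on.
    """
    normalized = path.replace("\\", "/")
    return posixpath.basename(posixpath.dirname(normalized))


# The same logical file, written with POSIX and Windows separators.
print(label_from_path("images/class_a/img_001.jpg"))
print(label_from_path("images\\class_a\\img_001.jpg"))
```

Splitting on a hard-coded `/` alone would miss the second form, and splitting on `os.sep` alone would miss the first form when running on Windows.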

tensorflow_datasets/image/the300w_lp.py (review thread resolved)
@tfds-bot tfds-bot added kokoro:run Run Kokoro tests and removed tfds:is_reviewing TFDS team: PTAL labels Apr 29, 2020
@kokoro-team kokoro-team removed the kokoro:run Run Kokoro tests label Apr 29, 2020
@tfds-bot tfds-bot added the tfds:ready_to_merge This PR is ready to be merged. label Apr 29, 2020
@tfds-copybara tfds-copybara merged commit fcfa6a3 into tensorflow:master Apr 30, 2020
Labels: cla: yes, tfds:ready_to_merge
Issues this pull request may close: caltech_birds yields no examples on windows