Fix remaining tfds datasets bugs on windows #1914

vijayphoenix · 2020-04-19T20:48:20Z

Fixed tfds\image, tfds\obj_dec, tfds\structured, tfds\text, tfds\translate.
Fix #1901

See comments of #1911 for old results.

vijayphoenix · 2020-04-19T21:49:29Z

New results tfds/image

See pytest results

============================= test session starts =============================
platform win32 -- Python 3.7.6, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: C:\Users\VIJAY\Desktop\GitHub_Repos\datasets
plugins: forked-1.1.3, xdist-1.31.0
collected 120 items

tensorflow_datasets\image\abstract_reasoning_test.py ....s               [  4%]
tensorflow_datasets\image\aflw2k3d_test.py ....s                         [  8%]
tensorflow_datasets\image\arc_test.py ....s                              [ 12%]
tensorflow_datasets\image\binarized_mnist_test.py ....s                  [ 16%]
tensorflow_datasets\image\celeba_test.py ....s                           [ 20%]
tensorflow_datasets\image\celebahq_test.py ....s                         [ 25%]
tensorflow_datasets\image\cityscapes_test.py ....s....s....s....s        [ 41%]
tensorflow_datasets\image\clevr_test.py ....s                            [ 45%]
tensorflow_datasets\image\coil100_test.py ....s                          [ 50%]
tensorflow_datasets\image\div2k_test.py ....s....s                       [ 58%]
tensorflow_datasets\image\downsampled_imagenet_test.py ....s             [ 62%]
tensorflow_datasets\image\dsprites_test.py ....s                         [ 66%]
tensorflow_datasets\image\duke_ultrasound_test.py .s..s                  [ 70%]
tensorflow_datasets\image\flic_test.py ....s....s                        [ 79%]
tensorflow_datasets\image\lost_and_found_test.py ....s                   [ 83%]
tensorflow_datasets\image\lsun_test.py .F..s                             [ 87%]
tensorflow_datasets\image\scene_parse_150_test.py ....s                  [ 91%]
tensorflow_datasets\image\shapes3d_test.py ....s                         [ 95%]
tensorflow_datasets\image\the300w_lp_test.py ....s                       [100%]

================================== FAILURES ===================================
C:\ProgramData\Miniconda3\lib\site-packages\tensorflow_io\core\python\ops\__init__.py:67: NotImplementedError: unable to open file: libtensorflow_io.so, from paths: ['C:\\ProgramData\\Miniconda3\\lib\\site-packages\\tensorflow_io\\core\\python\\ops\\libtensorflow_io.so']
=========================== short test summary info ===========================
FAILED tensorflow_datasets/image/lsun_test.py::LsunTest::test_download_and_prepare_as_dataset
===== 1 failed, 95 passed, 24 skipped, 969 warnings in 244.49s (0:04:04) ======

vijayphoenix · 2020-04-19T21:50:01Z

New results tfds/text

See pytest results

============================= test session starts =============================
platform win32 -- Python 3.7.6, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: C:\Users\VIJAY\Desktop\GitHub_Repos\datasets
plugins: forked-1.1.3, xdist-1.31.0
collected 260 items

tensorflow_datasets\text\blimp_test.py ....s                             [  1%]
tensorflow_datasets\text\c4_test.py ....s....s                           [  5%]
tensorflow_datasets\text\c4_utils_test.py ........s.                     [  9%]
tensorflow_datasets\text\cfq_test.py ....s                               [ 11%]
tensorflow_datasets\text\civil_comments_test.py ....s                    [ 13%]
tensorflow_datasets\text\cos_e_test.py ....s                             [ 15%]
tensorflow_datasets\text\definite_pronoun_resolution_test.py ....s       [ 17%]
tensorflow_datasets\text\eraser_multi_rc_test.py ....s                   [ 19%]
tensorflow_datasets\text\esnli_test.py ....s                             [ 21%]
tensorflow_datasets\text\gap_test.py ....s                               [ 23%]
tensorflow_datasets\text\glue_test.py ....s....s....s....s....s....s.... [ 36%]
s....s....s....s                                                         [ 42%]
tensorflow_datasets\text\imdb_test.py ....s                              [ 44%]
tensorflow_datasets\text\librispeech_lm_test.py ....s                    [ 46%]
tensorflow_datasets\text\lm1b_test.py ....s                              [ 48%]
tensorflow_datasets\text\math_dataset_test.py ....s                      [ 50%]
tensorflow_datasets\text\movie_rationales_test.py ....s                  [ 51%]
tensorflow_datasets\text\multi_nli_mismatch_test.py ....s                [ 53%]
tensorflow_datasets\text\multi_nli_test.py ....s                         [ 55%]
tensorflow_datasets\text\natural_questions_test.py ....s                 [ 57%]
tensorflow_datasets\text\qa4mre_test.py ....s                            [ 59%]
tensorflow_datasets\text\scan_test.py ....s                              [ 61%]
tensorflow_datasets\text\scicite_test.py ....s                           [ 63%]
tensorflow_datasets\text\snli_test.py ....s                              [ 65%]
tensorflow_datasets\text\squad_test.py ....s                             [ 67%]
tensorflow_datasets\text\super_glue_test.py ....s....s....s....s....s... [ 78%]
.s....s....s....s....s                                                   [ 86%]
tensorflow_datasets\text\tiny_shakespeare_test.py ....s                  [ 88%]
tensorflow_datasets\text\triviaqa_test.py ....s                          [ 90%]
tensorflow_datasets\text\web_questions_test.py ....s                     [ 92%]
tensorflow_datasets\text\wiki40b_test.py ....s                           [ 94%]
tensorflow_datasets\text\wikipedia_test.py ....s                         [ 96%]
tensorflow_datasets\text\xnli_test.py ....s                              [ 98%]
tensorflow_datasets\text\yelp_polarity_test.py ....s                     [100%]

========== 209 passed, 51 skipped, 204 warnings in 401.06s (0:06:41) ==========

vijayphoenix · 2020-04-19T22:00:21Z

New results tfds/translate

See pytest results

============================= test session starts =============================
platform win32 -- Python 3.7.6, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: C:\Users\VIJAY\Desktop\GitHub_Repos\datasets
plugins: forked-1.1.3, xdist-1.31.0
collected 31 items

tensorflow_datasets\translate\para_crawl_test.py ....s                   [ 16%]
tensorflow_datasets\translate\ted_hrlr_test.py ....s                     [ 32%]
tensorflow_datasets\translate\ted_multi_test.py ....s                    [ 48%]
tensorflow_datasets\translate\wmt19_test.py ....s....s                   [ 80%]
tensorflow_datasets\translate\wmt_test.py .....s                         [100%]

================= 25 passed, 6 skipped, 7 warnings in 25.88s ==================

vijayphoenix · 2020-04-19T22:00:56Z

New results tfds/obj_dec

See pytest results

============================= test session starts =============================
platform win32 -- Python 3.7.6, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: C:\Users\VIJAY\Desktop\GitHub_Repos\datasets
plugins: forked-1.1.3, xdist-1.31.0
collected 50 items

tensorflow_datasets\object_detection\coco_test.py ....s....s....s        [ 30%]
tensorflow_datasets\object_detection\kitti_test.py ....s                 [ 40%]
tensorflow_datasets\object_detection\open_images_challenge2019_test.py . [ 42%]
...s                                                                     [ 50%]
tensorflow_datasets\object_detection\open_images_test.py ....s           [ 60%]
tensorflow_datasets\object_detection\voc_test.py ....s....s              [ 80%]
tensorflow_datasets\object_detection\waymo_open_dataset_test.py ....s    [ 90%]
tensorflow_datasets\object_detection\wider_face_test.py ....s            [100%]

========== 40 passed, 10 skipped, 2308 warnings in 163.68s (0:02:43) ==========

vijayphoenix · 2020-04-19T22:01:49Z

New results tfds/structured

See pytest results

============================= test session starts =============================
platform win32 -- Python 3.7.6, pytest-5.4.1, py-1.8.1, pluggy-0.13.1
rootdir: C:\Users\VIJAY\Desktop\GitHub_Repos\datasets
plugins: forked-1.1.3, xdist-1.31.0
collected 35 items

tensorflow_datasets\structured\amazon_us_reviews_test.py ....s           [ 14%]
tensorflow_datasets\structured\forest_fires_test.py ....s                [ 28%]
tensorflow_datasets\structured\german_credit_numeric_test.py ....s       [ 42%]
tensorflow_datasets\structured\higgs_test.py ....s                       [ 57%]
tensorflow_datasets\structured\iris_test.py ....s                        [ 71%]
tensorflow_datasets\structured\rock_you_test.py ....s                    [ 85%]
tensorflow_datasets\structured\titanic_test.py ....s                     [100%]

================= 28 passed, 7 skipped, 7 warnings in 13.77s ==================

Conchylicultor

Thank you!

Conchylicultor · 2020-04-20T01:50:26Z

tensorflow_datasets/image/duke_ultrasound.py

@@ -106,7 +106,8 @@ def _info(self):
            'probe': tfds.features.Tensor(shape=(), dtype=tf.string),
            'scanner': tfds.features.Tensor(shape=(), dtype=tf.string),
            'target': tfds.features.Tensor(shape=(), dtype=tf.string),
-            'timestamp_id': tfds.features.Tensor(shape=(), dtype=tf.uint32),
+            # Use tf.uint64 to prevent possible overflow on windows `sys.maxsize`


Could you provide more context on this one? uint32 should be system-independent.

vijayphoenix · 2020-04-20

I got the following stack trace:

See pytest results

ERROR: test_download_and_prepare_as_dataset (__main__.DukeUltrasoundTest)
test_download_and_prepare_as_dataset (__main__.DukeUltrasoundTest)
Run the decorated test method.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\testing\test_utils.py", line 198, in decorated
    f(self, *args, **kwargs)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\testing\dataset_builder_testing.py", line 298, in test_download_and_prepare_as_dataset
    self._download_and_prepare_as_dataset(self.builder)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\testing\dataset_builder_testing.py", line 359, in _download_and_prepare_as_dataset
    builder.download_and_prepare(download_config=download_config)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\api_utils.py", line 69, in disallow_positional_args_dec
    return fn(*args, **kwargs)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\dataset_builder.py", line 363, in download_and_prepare
    download_config=download_config)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\dataset_builder.py", line 996, in _download_and_prepare
    max_examples_per_split=download_config.max_examples_per_split,
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\dataset_builder.py", line 928, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\dataset_builder.py", line 1012, in _prepare_split
    example = self.info.features.encode_example(record)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\features\features_dict.py", line 170, in encode_example
    in utils.zip_dict(self._feature_dict, example_dict)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\features\features_dict.py", line 169, in <dictcomp>
    for k, (feature, example_value)
  File "C:\Users\VIJAY\Desktop\GitHub_Repos\datasets\tensorflow_datasets\core\features\feature.py", line 541, in encode_example
    example_data = np.array(example_data, dtype=np_dtype)
OverflowError: Python int too large to convert to C long
----------------------------------------------------------------------
Ran 5 tests in 1.021s

I am using a 64-bit Windows 10 operating system with Python 3.7.7.

The dataset was prepared successfully when I replaced tf.uint32 with tf.uint64.

Conchylicultor · 2020-04-29T19:12:40Z

This doesn't seem to be the right place to fix this.

The error is raised in feature.py, line 541, in encode_example, so it might be an issue with np_dtype or similar. What is the int value?
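A note on the error message in the trace above: "C long" refers to the C `long` type, which is 32 bits wide on 64-bit Windows but 64 bits wide on most 64-bit Linux and macOS builds, whereas `sys.maxsize` is 2**63 - 1 on any 64-bit Python. Whether this difference is what triggers the failure here is an assumption, not something confirmed in this thread. A minimal sketch for inspecting these limits on a given machine (the helper names `c_long_bits` and `c_long_max` are hypothetical and not part of tfds):

```python
import ctypes
import sys


def c_long_bits() -> int:
    """Width in bits of the platform's C `long` type."""
    return ctypes.sizeof(ctypes.c_long) * 8


def c_long_max() -> int:
    """Largest value a signed C `long` can hold on this platform."""
    return 2 ** (c_long_bits() - 1) - 1


# Print the limits so they can be compared with the failing value.
print("platform:   ", sys.platform)
print("C long bits:", c_long_bits())
print("C long max: ", c_long_max())
print("sys.maxsize:", sys.maxsize)
```

Running this on both the Windows machine and a Linux machine would show whether the two platforms disagree on the C `long` limit even though `sys.maxsize` is the same.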

Conchylicultor

Thanks for fixing this. I'm going to do a partial merge of this PR to fix the OS issues. The uint overflow seems to be a different problem, so I would prefer to understand it better before fixing it.

Conchylicultor · 2020-04-29T19:12:40Z

tensorflow_datasets/image/duke_ultrasound.py

@@ -106,7 +106,8 @@ def _info(self):
            'probe': tfds.features.Tensor(shape=(), dtype=tf.string),
            'scanner': tfds.features.Tensor(shape=(), dtype=tf.string),
            'target': tfds.features.Tensor(shape=(), dtype=tf.string),
-            'timestamp_id': tfds.features.Tensor(shape=(), dtype=tf.uint32),
+            # Use tf.uint64 to prevent possible overflow on windows `sys.maxsize`


This doesn't seem to be the right place to fix this.

The error is raised in feature.py, line 541, in encode_example, so it might be an issue with np_dtype or similar. What is the int value?
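One way to answer the "What is the int value?" question is to check the raw value against the declared dtype's range before it reaches np.array. A minimal sketch, assuming the value is a plain Python int; `fits_in_dtype` is a hypothetical helper, not a tfds or NumPy function, and the sample value is made up rather than taken from the dataset:

```python
import numpy as np


def fits_in_dtype(value: int, dtype) -> bool:
    """Return True if `value` lies within the range of the given integer dtype."""
    info = np.iinfo(dtype)
    return int(info.min) <= value <= int(info.max)


# Made-up example value (not from the Duke ultrasound data):
# one past the largest value uint32 can represent.
sample_timestamp_id = 2**32

print(fits_in_dtype(sample_timestamp_id, np.uint32))  # False
print(fits_in_dtype(sample_timestamp_id, np.uint64))  # True
```

If the real `timestamp_id` values fail this check for uint32, the declared dtype is too narrow on every platform, and the Windows error would only be the place where the problem becomes visible.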

googlebot added the "cla: yes" label (Author has signed CLA) Apr 19, 2020

tfds-bot added the "community:please_review" label (Community - We need your help to review this PR.) Apr 19, 2020

Fix tfds all other datasets bugs.

f51636a

vijayphoenix force-pushed the image_w branch from 2c6251f to f51636a April 19, 2020 21:15

Eshan-Agarwal reviewed Apr 19, 2020 (tensorflow_datasets/image/the300w_lp.py, resolved)

vijayphoenix commented Apr 19, 2020 (tensorflow_datasets/image/the300w_lp.py, resolved)

tfds-bot added the "community:is_reviewing" label (This PR is being reviewed by community member.) and removed the "community:please_review" label Apr 19, 2020

Conchylicultor requested changes Apr 20, 2020

tfds-bot added the "author:please_respond" label (Author - please respond to the recent comments.) and removed the "community:is_reviewing" label Apr 20, 2020

vijayphoenix changed the title from "Fix tfds datasets bugs on windows" to "Fix remaining tfds datasets bugs on windows" Apr 20, 2020

tfds-bot added the "tfds:is_reviewing" label (TFDS team: PTAL) and removed the "author:please_respond" label Apr 20, 2020

Conchylicultor approved these changes Apr 29, 2020

tfds-bot added the "kokoro:run" label (Run Kokoro tests) and removed the "tfds:is_reviewing" label Apr 29, 2020

kokoro-team removed the "kokoro:run" label Apr 29, 2020

tfds-bot added the "tfds:ready_to_merge" label (This PR is ready to be merged.) Apr 29, 2020

tfds-copybara merged commit fcfa6a3 into tensorflow:master Apr 30, 2020
