test_unicodedata: test_normalization() fails randomly with IncompleteRead on PPC64LE Fedora buildbots #99892

Closed
vstinner opened this issue Nov 30, 2022 · 5 comments
Labels
type-bug An unexpected behavior, bug, or error

Comments

@vstinner
Member

vstinner commented Nov 30, 2022

For a few weeks, test_unicodedata.test_normalization() has been failing randomly with IncompleteRead on the PPC64LE Fedora buildbots.

IMO the test should be skipped on download error, rather than treating a download error as a test failure.

Example: https://buildbot.python.org/all/#/builders/33/builds/3031

FAIL: test_normalization (test.test_unicodedata.NormalizationTest.test_normalization)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang/build/Lib/http/client.py", line 591, in _read_chunked
    value.append(self._safe_read(chunk_left))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang/build/Lib/http/client.py", line 632, in _safe_read
    raise IncompleteRead(data, amt-len(data))
http.client.IncompleteRead: IncompleteRead(8113 bytes read, 2337 more expected)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang/build/Lib/test/test_unicodedata.py", line 362, in test_normalization
    testdata = open_urlresource(TESTDATAURL, encoding="utf-8",
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang/build/Lib/test/support/__init__.py", line 671, in open_urlresource
    s = f.read()
        ^^^^^^^^
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang/build/Lib/gzip.py", line 295, in read
    return self._buffer.read(size)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang/build/Lib/_compression.py", line 118, in readall
    while data := self.read(sys.maxsize):
                  ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang/build/Lib/gzip.py", line 500, in read
    buf = self._fp.read(READ_BUFFER_SIZE)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang/build/Lib/gzip.py", line 90, in read
    return self.file.read(size)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang/build/Lib/http/client.py", line 459, in read
    return self._read_chunked(amt)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang/build/Lib/http/client.py", line 597, in _read_chunked
    raise IncompleteRead(b''.join(value)) from exc
http.client.IncompleteRead: IncompleteRead(106486 bytes read)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/buildbot/buildarea/3.x.cstratak-fedora-rawhide-ppc64le.clang/build/Lib/test/test_unicodedata.py", line 368, in test_normalization
    self.fail(f"Could not retrieve {TESTDATAURL}")
AssertionError: Could not retrieve http://www.pythontest.net/unicode/15.0.0/NormalizationTest.txt
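A minimal sketch of the proposed behavior (skip instead of fail), not the actual patch. It assumes a hypothetical fetch_resource() helper that stands in for the real download call and always raises IncompleteRead, as on the buildbot. Note that IncompleteRead derives from http.client.HTTPException rather than OSError, so catching OSError alone would not cover it.

```python
import unittest
from http.client import HTTPException, IncompleteRead


def fetch_resource(url):
    """Hypothetical downloader that always fails the way the buildbot did."""
    raise IncompleteRead(b"partial data", 2337)


class NormalizationSketch(unittest.TestCase):
    def test_normalization(self):
        url = "http://www.pythontest.net/unicode/15.0.0/NormalizationTest.txt"
        try:
            data = fetch_resource(url)
        except (OSError, HTTPException) as exc:
            # A download error says nothing about unicodedata itself,
            # so report the test as skipped rather than failed.
            self.skipTest(f"Could not retrieve {url}: {exc!r}")
        self.assertTrue(data)


def run_sketch():
    """Run the sketch test case and return the collected TestResult."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(NormalizationSketch)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With this shape, run_sketch() records the test under result.skipped, and result.failures stays empty.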

vstinner added the type-bug label Nov 30, 2022
@vstinner
Member Author

vstinner commented Dec 5, 2022

Local patch to trigger the bug:

diff --git a/Lib/http/client.py b/Lib/http/client.py
index 15c5cf634c..08a0c3b39a 100644
--- a/Lib/http/client.py
+++ b/Lib/http/client.py
@@ -561,6 +561,7 @@ def _get_chunk_left(self):
             if chunk_left is not None:
                 # We are at the end of chunk, discard chunk end
                 self._safe_read(2)  # toss the CRLF at the end of the chunk
+            raise IncompleteRead(b'')
             try:
                 chunk_left = self._read_next_chunk_size()
             except ValueError:
@@ -632,8 +633,7 @@ def _safe_readinto(self, b):
         """Same as _safe_read, but for reading into a buffer."""
         amt = len(b)
         n = self.fp.readinto(b)
-        if n < amt:
-            raise IncompleteRead(bytes(b[:n]), amt-n)
+        raise IncompleteRead(bytes(b[:n]), amt-n)
         return n
 
     def read1(self, n=-1):
@@ -687,8 +687,7 @@ def _read1_chunked(self, n):
             n = chunk_left # if n is negative or larger than chunk_left
         read = self.fp.read1(n)
         self.chunk_left -= len(read)
-        if not read:
-            raise IncompleteRead(b"")
+        raise IncompleteRead(b"")
         return read
 
     def _peek_chunked(self, n):

Command to reproduce the bug using the patch:

rm -f ./Lib/test/data/NormalizationTest.txt && ./python -m test -u all test_unicodedata -v
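For code that calls urllib.request.urlopen() directly, a similar failure can be simulated without editing Lib/http/client.py. The following is only a sketch under that assumption: TruncatedBody, download() and simulate_incomplete_read() are hypothetical names, and the mock does not exercise the chunked-read path in http.client that the patch above targets, nor is it claimed to intercept test.support.open_urlresource().

```python
import io
import urllib.request
from http.client import IncompleteRead
from unittest import mock


class TruncatedBody(io.BytesIO):
    """Hypothetical stand-in for a response whose body is cut short."""

    def read(self, size=-1):
        # Pretend the connection dropped before any data arrived.
        raise IncompleteRead(b"", 2337)


def download(url):
    # Looks up urlopen at call time, so mock.patch below takes effect.
    with urllib.request.urlopen(url) as response:
        return response.read()


def simulate_incomplete_read(url):
    """Return the IncompleteRead raised by download(), or None."""
    with mock.patch("urllib.request.urlopen", return_value=TruncatedBody()):
        try:
            download(url)
        except IncompleteRead as exc:
            return exc
    return None
```

No network access happens here, because urlopen() is replaced for the duration of the with block.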

vstinner added a commit that referenced this issue Dec 5, 2022
Skip test_normalization() of test_unicodedata if it fails to download
NormalizationTest.txt file from pythontest.net.
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Dec 5, 2022
…onGH-100011)

Skip test_normalization() of test_unicodedata if it fails to download
NormalizationTest.txt file from pythontest.net.
(cherry picked from commit 2488c1e)

Co-authored-by: Victor Stinner <vstinner@python.org>
miss-islington added a commit that referenced this issue Dec 5, 2022
Skip test_normalization() of test_unicodedata if it fails to download
NormalizationTest.txt file from pythontest.net.
(cherry picked from commit 2488c1e)

Co-authored-by: Victor Stinner <vstinner@python.org>
@zware
Member

zware commented Dec 5, 2022

I've started wondering lately if we'd be better served to migrate those test files into the repository.

@vstinner
Member Author

vstinner commented Dec 5, 2022

@zware wrote:

I've started wondering lately if we'd be better served to migrate those test files into the repository.

How many files are we talking about? NormalizationTest.txt takes 2.6 MB.

My notes on these files: https://pythondev.readthedocs.io/infra.html#services-used-by-unit-tests

Another example: Lib/test/test_codecmaps_cn.py downloads http://www.pythontest.net/unicode/EUC-CN.TXT which takes 104 KB.

@vstinner
Member Author

vstinner commented Dec 5, 2022

Files containing the pythontest.net/unicode/ pattern:

  • Lib/test/test_codecmaps_cn.py
  • Lib/test/test_codecmaps_hk.py
  • Lib/test/test_codecmaps_jp.py
  • Lib/test/test_codecmaps_kr.py
  • Lib/test/test_codecmaps_tw.py
  • Lib/test/test_ucn.py
  • Lib/test/test_unicodedata.py
Lib/test/test_codecmaps_cn.py:12:    mapfileurl = 'http://www.pythontest.net/unicode/EUC-CN.TXT'
Lib/test/test_codecmaps_cn.py:17:    mapfileurl = 'http://www.pythontest.net/unicode/CP936.TXT'
Lib/test/test_codecmaps_cn.py:22:    mapfileurl = 'http://www.pythontest.net/unicode/gb-18030-2000.xml'
Lib/test/test_codecmaps_hk.py:12:    mapfileurl = 'http://www.pythontest.net/unicode/BIG5HKSCS-2004.TXT'
Lib/test/test_codecmaps_jp.py:12:    mapfileurl = 'http://www.pythontest.net/unicode/CP932.TXT'
Lib/test/test_codecmaps_jp.py:28:    mapfileurl = 'http://www.pythontest.net/unicode/EUC-JP.TXT'
Lib/test/test_codecmaps_jp.py:35:    mapfileurl = 'http://www.pythontest.net/unicode/SHIFTJIS.TXT'
Lib/test/test_codecmaps_jp.py:49:    mapfileurl = 'http://www.pythontest.net/unicode/EUC-JISX0213.TXT'
Lib/test/test_codecmaps_jp.py:56:    mapfileurl = 'http://www.pythontest.net/unicode/SHIFT_JISX0213.TXT'
Lib/test/test_codecmaps_kr.py:12:    mapfileurl = 'http://www.pythontest.net/unicode/CP949.TXT'
Lib/test/test_codecmaps_kr.py:18:    mapfileurl = 'http://www.pythontest.net/unicode/EUC-KR.TXT'
Lib/test/test_codecmaps_kr.py:28:    mapfileurl = 'http://www.pythontest.net/unicode/JOHAB.TXT'
Lib/test/test_codecmaps_tw.py:12:    mapfileurl = 'http://www.pythontest.net/unicode/BIG5.TXT'
Lib/test/test_codecmaps_tw.py:17:    mapfileurl = 'http://www.pythontest.net/unicode/CP950.TXT'
Lib/test/test_ucn.py:179:        url = ("http://www.pythontest.net/unicode/%s/NamedSequences.txt" %
Lib/test/test_unicodedata.py:359:        TESTDATAURL = f"http://www.pythontest.net/unicode/{unicodedata.unidata_version}/{TESTDATAFILE}"
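A listing like the one above can be regenerated with a short script. This is a sketch, not the command that was actually used: find_references() is a hypothetical helper, and it assumes it is pointed at the Lib/test directory of a CPython checkout.

```python
import re
from pathlib import Path

# Same pattern as in the listing above.
PATTERN = re.compile(r"pythontest\.net/unicode/")


def find_references(root):
    """Return (path, line number, stripped line) for each match under root."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((path, lineno, line.strip()))
    return hits
```

Calling find_references("Lib/test") from the root of a checkout and printing each tuple as path:lineno: line gives output in the same shape as the grep-style lines above.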

I don't know who manages http://www.pythontest.net/unicode/ or how it is updated for a new Unicode version.

@benjaminp wrote:

Basically, I update the version in the script (Tools/unicode/makeunicodedata.py), run it, and fix what fails.

@vstinner
Member Author

vstinner commented Dec 6, 2022

Issue fixed by: 2488c1e

Please open a new issue if you want to include these Unicode TXT files directly in the Python Git repository.

vstinner closed this as completed Dec 6, 2022