Tests can flake during WPT imports #29398

mrobinson · 2023-02-22T07:28:31Z

The daily import of WPT tests includes a step that generates new results. During this step, a test can flake for reasons unrelated to the test itself. When that happens, the flaky result ends up in the results generated for that day's import, and CI will typically keep failing afterwards.

mrobinson · 2023-02-22T07:29:30Z

Here's an example of this happening: #29370 (comment)

CYBAI · 2023-04-25T17:21:42Z

I've recently noticed that some intermittent issues (e.g. #26997, #23623, #29309 and more) have their intermittent status recorded as the expected status, so they are frequently mentioned in many PRs.

For example, #26997 is an intermittent TIMEOUT for /FileAPI/url/url-charset.window.html. The status of the test should be OK with PASS tests, but it sometimes gets TIMEOUT instead.

If we look at the git history of the test expectation file for it (https://github.com/servo/servo/commits/master/tests/wpt/metadata/FileAPI/url/url-charset.window.js.ini), we can see that it is added and removed frequently.
I wonder if this could be caused by the lack of filter-intermittents in the import job.

servo/.github/workflows/linux.yml

Lines 137 to 157 in 2ae158d

    - name: Run tests
      if: ${{ inputs.wpt != 'sync' }}
      run: |
        python3 ./mach test-wpt --with-${{ env.LAYOUT }} \
          --release --processes $(nproc) --timeout-multiplier 2 \
          --total-chunks ${{ env.max_chunk_id }} --this-chunk ${{ matrix.chunk_id }} \
          --log-raw test-wpt.${{ matrix.chunk_id }}.log \
          --log-raw-unexpected unexpected-test-wpt.${{ matrix.chunk_id }}.log \
          --filter-intermittents filtered-test-wpt.${{ matrix.chunk_id }}.json
      env:
        GITHUB_CONTEXT: ${{ toJson(github) }}
        INTERMITTENT_TRACKER_DASHBOARD_SECRET: ${{ secrets.INTERMITTENT_TRACKER_DASHBOARD_SECRET }}
    - name: Run tests (sync)
      if: ${{ inputs.wpt == 'sync' }}
      run: |
        python3 ./mach test-wpt --with-${{ env.LAYOUT }} \
          --release --processes $(nproc) --timeout-multiplier 2 \
          --total-chunks ${{ env.max_chunk_id }} --this-chunk ${{ matrix.chunk_id }} \
          --log-raw test-wpt.${{ matrix.chunk_id }}.log \
          --log-servojson wpt-jsonsummary.${{ matrix.chunk_id }}.log \
          --always-succeed

Does it make sense to filter intermittents during the import phase? If so, I wonder whether we could avoid recording the intermittent status as the expected status for these tests 🤔

mrobinson · 2023-05-07T13:34:40Z

I have been pondering this problem as well, since it's quite common. The issue with filter-intermittents is that it doesn't know which of the two runs to choose. We could go for best out of three, but we also don't know which of the tests to run twice.

If the import script knew how to run try jobs, perhaps it could try to regenerate results for failing tests.
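
To make the "best out of three" idea concrete, here is a minimal sketch, not the actual import script. It assumes each run's results are already available as a plain dict mapping test path to status; the function names `pick_stable_status` and `resolve_expectations` are hypothetical, and the test paths other than the url-charset one are made up. A status is accepted only if it appears in a strict majority of runs; anything else is reported as unresolved rather than written into the expectations.

```python
from collections import Counter


def pick_stable_status(statuses):
    """Return the status seen in a strict majority of runs, or None."""
    if not statuses:
        return None
    status, count = Counter(statuses).most_common(1)[0]
    return status if count * 2 > len(statuses) else None


def resolve_expectations(runs):
    """Combine several runs into (resolved, unresolved).

    `runs` is a list of dicts, one per run, mapping test path -> status.
    This input shape is an assumption made for the sketch.
    """
    tests = set().union(*runs)  # all test paths seen in any run
    resolved, unresolved = {}, []
    for test in sorted(tests):
        statuses = [run[test] for run in runs if test in run]
        winner = pick_stable_status(statuses)
        if winner is None:
            unresolved.append(test)  # no majority: do not record a result
        else:
            resolved[test] = winner
    return resolved, unresolved


# Hypothetical data: three runs of the same chunk.
runs = [
    {"/FileAPI/url/url-charset.window.html": "OK", "/example/b.html": "PASS"},
    {"/FileAPI/url/url-charset.window.html": "TIMEOUT", "/example/b.html": "FAIL"},
    {"/FileAPI/url/url-charset.window.html": "OK"},
]
resolved, unresolved = resolve_expectations(runs)
```

This only covers the "which run to choose" half of the problem. It still assumes every test is run three times, so it does not answer the question of which tests are worth rerunning.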
