
Remove/Skip unreliable/non-deterministic tests #9235

Closed
anisse opened this issue Apr 18, 2016 · 2 comments

@anisse (Contributor) commented Apr 18, 2016

As part of the work in #8496, I'm trying to make sure we don't run into regressions. The problem is that some tests are so unreliable that it's very difficult to run them twice and get the same result.

I understand that, due to the nature of the project, the tests depend on a third party (the website being extracted from), and that website may well be flaky. But I want to at least eliminate the most unreliable tests.

For example, I ran the full test suite (with regression detection) 75 times. Out of those runs, the following tests had 2 or more detected regressions (false positives):

      2 test.test_download:TestDownload.test_WeiqiTV ERROR
      2 test.test_download:TestDownload.test_YesJapan ERROR
      2 test.test_subtitles:TestYoutubeSubtitles.test_youtube_translated_subtitles ERROR
      3 test.test_download:TestDownload.test_DouyuTV_2 ERROR
      3 test.test_download:TestDownload.test_KuwoMv FAIL
      3 test.test_download:TestDownload.test_NRKTV ERROR
      3 test.test_download:TestDownload.test_Sohu_3 ERROR
      3 test.test_download:TestDownload.test_Vodlocker ERROR
      3 test.test_subtitles:TestNPOSubtitles.test_allsubtitles ERROR
      4 test.test_download:TestDownload.test_Chaturbate ERROR
      4 test.test_download:TestDownload.test_ElPais ERROR
      4 test.test_download:TestDownload.test_Jpopsuki ERROR
      4 test.test_download:TestDownload.test_Telegraaf ERROR
      4 test.test_download:TestDownload.test_TudouPlaylist FAIL
      4 test.test_download:TestDownload.test_Viddler FAIL
      5 test.test_download:TestDownload.test_NPO_4 ERROR
      5 test.test_download:TestDownload.test_NRKTV_1 ERROR
      6 test.test_download:TestDownload.test_C56_1 ERROR
      6 test.test_download:TestDownload.test_Smotri_5 ERROR
      7 test.test_download:TestDownload.test_AdobeTVVideo ERROR
      7 test.test_download:TestDownload.test_LePlaylist FAIL
      7 test.test_download:TestDownload.test_MiTele FAIL
      7 test.test_download:TestDownload.test_NPO_5 ERROR
      9 test.test_download:TestDownload.test_Yahoo_7 FAIL
     14 test.test_download:TestDownload.test_GodTube FAIL
     15 test.test_download:TestDownload.test_Mwave FAIL
     16 test.test_download:TestDownload.test_LivestreamOriginal_1 ERROR
     16 test.test_download:TestDownload.test_Sexu ERROR
     19 test.test_download:TestDownload.test_ACast ERROR
     19 test.test_download:TestDownload.test_StreetVoice ERROR

Some have as many as 19(!).
ERROR means there was an error while running the test; maybe we should ignore those. FAIL means that a test passed, then failed, on the same code revision. You can see how this translates into regression detection here:
StreetVoice: https://travis-ci.org/anisse/youtube-dl/jobs/123862153 https://travis-ci.org/anisse/youtube-dl/jobs/123289890
MWave: https://travis-ci.org/anisse/youtube-dl/jobs/123289900
GodTube: https://travis-ci.org/anisse/youtube-dl/jobs/123289914

It even generates user issues, for example: Streetvoice #9219 .
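For reference, the tallying behind the list above can be sketched like this. The log-line format ("<test id> <status>", nosetests-style) is an assumption, and this is not the actual regression-detection tooling from #8496, just an illustration of the counting:

```python
from collections import Counter

def count_flaky(run_logs, threshold=2):
    """Tally, per test, how many of the repeated runs ended in ERROR/FAIL.

    `run_logs` holds one string per full test-suite run; each relevant
    line is assumed to look like
    "test.test_download:TestDownload.test_Foo ERROR".
    """
    counts = Counter()
    for log in run_logs:
        for line in log.splitlines():
            parts = line.split()
            if len(parts) == 2 and parts[1] in ('ERROR', 'FAIL'):
                counts[(parts[0], parts[1])] += 1
    # Keep only tests that misbehaved in at least `threshold` runs,
    # sorted by count as in the list above.
    return sorted(
        (n, test, status)
        for (test, status), n in counts.items()
        if n >= threshold
    )
```

The key point is that a test is only interesting here if it gives different results on the same code revision across runs, which is why the counts are aggregated over many identical invocations.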

@yan12125 (Collaborator)

Some non-deterministic tests really indicate problems in youtube-dl. For a concrete example, see my comment at #9219. Another example is test_Bloomberg: sometimes m3u8 gives better-quality streams and sometimes f4m does, and currently both cases trigger an error. In this case we should force a specific format in the test. Above all, non-deterministic tests should be examined one by one; they shouldn't be removed/skipped without a reason.

Anyway, this list is quite useful. Many thanks for such a detailed investigation of the tests. Could you paste it to #8496? To make regression tests possible, I may try to attack these tests first.

@anisse (Contributor, Author) commented Apr 18, 2016

Sure, I copied the message to #8496, let's move the discussion there.
