Make linked test cases point to the Internet Archive #6854

Closed
91 tasks done
timvandermeij opened this issue Jan 11, 2016 · 2 comments · Fixed by #6895
@timvandermeij
Contributor

We have quite a number of linked test cases in the https://github.com/mozilla/pdf.js/tree/master/test/pdfs folder. Some of them point to sources other than the Internet Archive or Bugzilla and might become unavailable, which is to be avoided.

It is better to fetch the files from the Internet Archive to be sure that they remain available. This should always be possible if the current source exists because once you input the link on the Internet Archive website it will archive the file automatically.
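
For illustration, here is a minimal sketch of how the two kinds of Wayback Machine URLs can be built for a given source link. The helper names are hypothetical, and the `/save/` and `/web/<timestamp>/` URL patterns are assumed from how the Internet Archive website exposes captures; nothing here is part of the pdf.js tooling, and no network request is made.

```python
WAYBACK_PREFIX = "https://web.archive.org"  # assumed base URL of the Wayback Machine


def wayback_save_url(original_url: str) -> str:
    """Return the URL that asks the Wayback Machine to capture original_url."""
    return f"{WAYBACK_PREFIX}/save/{original_url}"


def wayback_snapshot_url(original_url: str, timestamp: str) -> str:
    """Return the URL of a capture; timestamp uses the YYYYMMDDhhmmss form."""
    if not (timestamp.isdigit() and len(timestamp) == 14):
        raise ValueError("timestamp must be 14 digits (YYYYMMDDhhmmss)")
    return f"{WAYBACK_PREFIX}/web/{timestamp}/{original_url}"
```

The snapshot URL is what would go into a `.pdf.link` file once the capture exists; the timestamp has to be taken from the actual capture, not guessed.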

Below is a list of linked test cases that point to external sources and should be updated (obtained with grep -L "bugzilla.mozilla.org\|archive.org" *.pdf.link). Make sure that the hash of the Internet Archive file is the same as the hash in the test manifest.

  • 20130226130259.pdf.link
  • JST2007-5.pdf.link
  • P020121130574743273239.pdf.link
  • SFAA_Japanese.pdf.link
  • TaroUTR50SortedList112.pdf.link
  • aboutstacks.pdf.link
  • artofwar.pdf.link
  • bpl13210.pdf.link
  • bug766138.pdf.link
  • bug808084.pdf.link
  • bug887152.pdf.link
  • cable.pdf.link
  • ecma262.pdf.link
  • fips197.pdf.link
  • fit11-talk.pdf.link
  • geothermal.pdf.link
  • hmm.pdf.link
  • html5checker.pdf.link
  • hudsonsurvey.pdf.link
  • ichiji.pdf.link
  • issue1010.pdf.link
  • issue1015.pdf.link
  • issue1049.pdf.link
  • issue1096.pdf.link
  • issue1127.pdf.link
  • issue1133.pdf.link
  • issue1155.pdf.link
  • issue1169.pdf.link
  • issue1233.pdf.link
  • issue1257.pdf.link
  • issue1309.pdf.link
  • issue1317.pdf.link
  • issue1419.pdf.link
  • issue1466.pdf.link
  • issue1597.pdf.link
  • issue1629.pdf.link
  • issue1658.pdf.link
  • issue1685.pdf.link
  • issue1687.pdf.link
  • issue1709.pdf.link
  • issue1721.pdf.link
  • issue1729.pdf.link
  • issue1796.pdf.link
  • issue1810.pdf.link
  • issue1878.pdf.link
  • issue1912.pdf.link
  • issue1936.pdf.link
  • issue1998.pdf.link
  • issue2006.pdf.link
  • issue2129.pdf.link
  • issue2139.pdf.link
  • issue2386.pdf.link
  • issue2442.pdf.link
  • issue2531.pdf.link
  • issue2537.pdf.link
  • issue2627.pdf.link
  • issue2799.pdf.link
  • issue2829.pdf.link
  • issue2853.pdf.link
  • issue2881.pdf.link
  • issue3062.pdf.link
  • issue3384.pdf.link
  • issue3666.pdf.link
  • issue3848.pdf.link
  • issue3903.pdf.link
  • issue3925.pdf.link
  • issue3999.pdf.link
  • issue4926.pdf.link
  • issue5592.pdf.link
  • issue5726.pdf.link
  • issue818.pdf.link
  • issue919.pdf.link
  • jai.pdf.link
  • kdchart.pdf.link
  • liveprogramming.pdf.link
  • mao.pdf.link
  • ocs.pdf.link
  • ohkubo-SS04.pdf.link
  • pal-o47.pdf.link
  • pdf.pdf.link
  • piperine.pdf.link
  • preistabelle.pdf.link
  • protectip.pdf.link
  • tcpdf_033.pdf.link
  • tutorial.pdf.link
  • txt2pdf.pdf.link
  • unix01.pdf.link
  • usmanm-bad.pdf.link
  • vesta.pdf.link
  • yo01.pdf.link
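
For anyone who wants to regenerate the list above without grep, the same check can be sketched in Python. This is only an approximation of the grep command (plain substring matching on the file contents), and the function name is hypothetical.

```python
from pathlib import Path

# Hosts that are considered stable sources, mirroring the grep pattern.
TRUSTED_HOSTS = ("bugzilla.mozilla.org", "archive.org")


def external_link_files(pdf_dir):
    """Return the names of *.pdf.link files that mention no trusted host."""
    result = []
    for link_file in sorted(Path(pdf_dir).glob("*.pdf.link")):
        content = link_file.read_text(encoding="utf-8", errors="replace")
        # Like `grep -L`, keep only the files in which nothing matches.
        if not any(host in content for host in TRUSTED_HOSTS):
            result.append(link_file.name)
    return result
```

Calling `external_link_files("test/pdfs")` from the repository root should print the remaining candidates as they get fixed.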

Additionally:

  • Investigate hmm.pdf. The MD5 hash in the manifest matches the MD5 hash of the file from the link, but the bots complain about a mismatch.
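
The hash comparison mentioned above can be sketched as follows. This assumes the manifest entries have already been loaded as a list of dictionaries with `file` and `md5` keys, which is my reading of the test manifest layout; the function names are hypothetical and this is not the code the bots run.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_md5_mismatches(manifest_entries, base_dir):
    """Return (file, expected, actual) for every entry whose hash differs.

    Entries without a "file" or "md5" key are skipped (assumed layout).
    """
    mismatches = []
    for entry in manifest_entries:
        expected = entry.get("md5")
        relative_path = entry.get("file")
        if not expected or not relative_path:
            continue
        actual = md5_of_file(os.path.join(base_dir, relative_path))
        if actual != expected.lower():
            mismatches.append((relative_path, expected, actual))
    return mismatches
```

Running this on the file downloaded from the original link and on the file downloaded from the Internet Archive would show whether the two copies really are identical.
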
@timvandermeij changed the title from "Make more linked test cases point to the Web Archive" to "Make more linked test cases point to the Internet Archive" on Jan 13, 2016
@timvandermeij changed the title from "Make more linked test cases point to the Internet Archive" to "Make linked test cases point to the Internet Archive" on Jan 13, 2016
@Snuffleupagus
Collaborator

Regarding issue1155: I'm not having any luck in obtaining a working link from the Internet Archive. However, looking at the issue/PR (and quickly browsing through the structure of the PDF file), it seems that it shouldn't be too hard to hand-craft a reduced test-case, so I'll look into that later. Update: fixed.

@Snuffleupagus
Collaborator

It appears that we've now reached the point where every file that can be obtained through the Internet Archive, perhaps with one or two exceptions, has been fixed.

Below is a summary of the ones that are left to fix, so we probably need to start looking into the possibility of replacing them with reduced test-cases, if at all possible.

A number of these files are font-related, which in theory should make it reasonably easy (albeit time-consuming) to create/verify reduced test-cases. (It might be simplest to see if Brendan manages to create the tool mentioned in #6650 (comment), before undertaking more work on that.)

test/pdfs/issue2799.pdf.link
test/pdfs/issue5726.pdf.link
test/pdfs/yo01.pdf.link
