Provenance of tests/resources/enron*.pdf #21

Closed
spwhitton opened this issue Jan 8, 2019 · 2 comments

Comments

@spwhitton
Contributor

Hello,

I can't find where the files matching tests/resources/enron*.pdf came from. The file debian/copyright (in this repository; not Debian's actual copyright file) points to enrondata.org, but the download link on that site 404s. Further, that site is about a collection of e-mails, but neither of the enron*.pdf files is an e-mail message.

Unless you've more information, it is probably best for me to filter out these files for Debian and disable the tests. Let me know.

Thanks.

@jbarlow83
Member

You're thorough ;)

I pushed an update that adds the original location for one (it was an attachment to an Enron email) and replaces the Latin one (the wrong file anyway, not from Enron) with a synthetic version that exhibits the same problem.

@spwhitton
Contributor Author

Thanks. After thinking about it more, I remain unconvinced that the Enron file is freely licensed. Although enrondata.org claims a Creative Commons license, enrondata.org is not the copyright holder of the original e-mails, so I don't see how they can release the files under that license. I dug out the original release of the files from web.archive.org and there is no indication that they were released under a free license.

So I'm going to filter out the remaining Enron file and disable the corresponding test. Thanks for minimising the amount of filtering I have to do by deleting one of the files :)
