Copyright status of test data #138

Closed
stefanor opened this issue Nov 14, 2021 · 4 comments · Fixed by #139
Labels
help wanted Extra attention is needed

Comments

@stefanor

Can you clarify the copyright status of the test data files in data? There are some things there that look like commercial TV subtitles. I wouldn't assume that you can legally redistribute those.

This came up when reviewing the package contents in Debian.

Ideally, anything that is public domain or freely licensed should be documented with the copyright holder and license, and everything else should be removed from the package. That's a fairly tedious bureaucratic job, but it needs to be done for us to be able to distribute the test content and run the test suite.

@stefanor stefanor added bug Something isn't working help wanted Extra attention is needed labels Nov 14, 2021
@stefanor
Author

FYI: @Natureshadow

@Ousret
Owner

Ousret commented Nov 17, 2021

I can clarify. First of all, I am not a legal expert.
Those freely available subtitles are indeed for commercial TV shows. The SRTs are not attached to any licenses that I am aware of. Nor are they extracted from copyrighted material; my understanding is that they are made from scratch by the respective authors mentioned in them.

I have done some research but never found anything definitive about this use case.
On a related note, you may find the ongoing issue chardet/chardet#231 interesting:

Regarding CC-BY-SA: in a first reaction, our licensing expert agreed that he also wouldn't see chardet code as derived work of a CC-BY-SA testsuite file. He however pointed out a future risk when specific content in the testsuite (think of a very rare combination of encodings in a test sample?) triggers a special feature or bugfix in chardet code. No idea whether such a scenario is realistic...

Ideally, I would get rid of them, but not in the short term.
Legally speaking, I don't see any hard blocker. Can you seek legal advice from an expert? Having a good explanation would help.

@Ousret Ousret added question Further information is requested and removed bug Something isn't working labels Nov 17, 2021
@stefanor
Author

Those freely available subtitles are indeed for commercial TV shows. The SRTs are not attached to any licenses that I am aware of. Nor are they extracted from copyrighted material; my understanding is that they are made from scratch by the respective authors mentioned in them.

Copyright law of course varies from jurisdiction to jurisdiction, but my understanding would be that subtitles are a derivative of the TV shows. Both the translator and the producers of the series would have a copyright interest. They may be freely available on the Internet, but that doesn't mean that they are being legally distributed on the Internet. It's more of a grey area that isn't being policed.

Here's an article confirming this interpretation in court in the Netherlands: https://fossbytes.com/are-subtitles-illegal/

Ideally, I would get rid of them, but not in the short term.

It seems that in the short term, we're just shipping them, too. But it's definitely not ideal.

Can you seek legal advice from an expert?

I'm afraid I don't have any counsel I can reach out to on this, but I could ask on the debian-legal mailing list, where I expect to hear many people saying the same thing.

@Ousret
Owner

Ousret commented Nov 18, 2021

I had the chance to reach someone with a legal background. The conclusion is that some countries have clearly declared such subtitles unauthorized, either in their laws or in their courts. In light of that, I will prioritize the removal or replacement of the affected assets.
