*.utf-8 fields in created torrent files violate the BEP standard #1274
Comments
You have a good point about torrent creation: the main file name/path fields should be written in UTF-8. I'll check that out to make sure we are doing that. The torrent creation code is terribly old and many cobwebs are probably covering it.
You can ignore the rest of my comment if you don't want to read me ranting :P :)
The ".utf-8" keys may not be in the spec, but even uTorrent uses them (and even prioritizes them, last I checked) over the non-".utf-8" ones. In addition, the spec doesn't disallow additional keys, such as ".utf-8" or the common "md5sum" one.
We also can't rely on BEP 3 fully, since BT-Inc has been known to change the spec and break backwards compatibility, specifically regarding UTF-8. The original spec never required filenames to be UTF-8, and a lot of early clients created torrents using the user's current locale. When BT-Inc changed the spec, it broke all those torrents, so most clients still have to deal with non-UTF-8 names.
Honestly, I can't remember who came up with the ".utf-8" field in the first place, but at the time it was a great way to ensure one could store the names in UTF-8 while being backwards compatible.
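To make the torrent-creation point concrete, here is a minimal sketch of writing the name as UTF-8 in the regular field while still emitting the extra ".utf-8" key for older readers. It is only an illustration, not BiglyBT's actual code: `bencode` and `make_info_names` are hypothetical helpers, and a real info dictionary also needs keys such as the piece length and pieces.

```python
def bencode(value):
    """Minimal bencoder for ints, bytes, str, lists and dicts (str is written as UTF-8)."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must be written in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_info_names(name):
    """Hypothetical helper: UTF-8 in the regular field, plus the legacy extra key."""
    utf8 = name.encode("utf-8")
    return {b"name": utf8, b"name.utf-8": utf8}


encoded = bencode(make_info_names("Привет.txt"))
```

With both keys holding the same UTF-8 bytes, a reader that only knows "name" and a reader that prefers "name.utf-8" end up with the same string.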
@parg what triggered this being completed? Is there a relevant commit?
Nope, cleaning house
Adding random shit (such as .utf entries) is not in violation of the spec; any client that can't handle additional keys is broken.
Hi,
There is an old hack in the BiglyBT code that originally comes from Vuze and, in turn, from Azureus.
None of them use UTF-8 for filenames (or for comments and file paths) in the regular fields of generated torrent files; instead, they add a non-standard "*.utf-8" field for this:
BiglyBT/core/src/com/biglybt/core/torrent/impl/TOTorrentImpl.java
Line 55 in bc8bbd2
BiglyBT/core/src/com/biglybt/core/torrent/impl/TOTorrentFileImpl.java
Line 381 in a278d35
That's a violation of the BEP 3 standard. No *.utf-8 fields are allowed, and the regular fields are expected to be in UTF-8 instead.
In practice this means that if you create a torrent file with such a BitTorrent client and the name of the file you seed is in Russian (for example), it will put two name/path entries in the torrent file: name and name.utf-8, plus path and path.utf-8.
When another user opens such a torrent file with a different BitTorrent client, their client ignores the *.utf-8 fields (because they are not in BEP 3) and tries to decode the regular path/name fields as UTF-8.
This results in broken characters in the names of the files you try to download.
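The failure mode can be reproduced in a few lines. This is a sketch under assumptions: the creating machine is assumed to use the Windows-1251 locale for the regular field, the info dictionary is shown already decoded from bencoding, and both reader functions are hypothetical stand-ins for real clients.

```python
filename = "Привет.txt"

# Hypothetical info dictionary as a legacy creator might write it:
# locale bytes in "name" (cp1251 is an assumption), UTF-8 in "name.utf-8".
info = {
    b"name": filename.encode("cp1251"),
    b"name.utf-8": filename.encode("utf-8"),
}


def strict_bep3_name(info):
    """Reader that only knows the regular field and assumes it is UTF-8."""
    return info[b"name"].decode("utf-8", errors="replace")


def utf8_aware_name(info):
    """Reader with a special case: prefer "name.utf-8" when it is present."""
    raw = info.get(b"name.utf-8", info[b"name"])
    return raw.decode("utf-8", errors="replace")


broken = strict_bep3_name(info)    # Cyrillic part becomes replacement characters
correct = utf8_aware_name(info)    # round-trips to the original filename
```

The first reader models clients without the special case; the second models clients that added one.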
A few BitTorrent clients added a special case for Azureus's *.utf-8 fields long ago (qBittorrent, for example), but most others (KTorrent, for example) have not and still have issues with torrent files generated by Azureus/Vuze.
It seems (not tested, but judging by your code) that this client reproduces the problem too.
Could you please fix this behavior in your client?