Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(Update) use array_is_list when bencode #3297

Merged
merged 3 commits into from
Dec 17, 2023
Merged

Conversation

Rhilip
Copy link
Contributor

@Rhilip Rhilip commented Dec 16, 2023

PHP 8.1 add a new function array_is_list and since UNIT3d 7.x.x min Required PHP Version is 8.2.
We can safely simple the list or dict Checks by this new array_is_list function.

And I completely delete the isset($d['isDct']) check, since no 'isDct' key is created in the bdecode step and whole project.

@Rhilip Rhilip changed the title (Fix) use array_is_list when bencode (Update) use array_is_list when bencode Dec 17, 2023
Copy link
Collaborator

@Roardom Roardom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked all locations bencode is called and there were no cases where an explicit list was passed in. Implicitly, the only variables that were passed in were those decoded using the bdecode function, which after checking, there are no cases where lists might have integer keys that are out of order. So I think we're safe to assume a list is only a list when the keys are in order compared to the previous implementation where only the type was checked.

Thanks!

@HDVinnie HDVinnie changed the base branch from master to 7.x.x December 17, 2023 15:12
@HDVinnie HDVinnie merged commit 696720e into HDInnovations:7.x.x Dec 17, 2023
4 checks passed
@Rhilip Rhilip deleted the patch-1 branch December 18, 2023 04:26
@Rhilip
Copy link
Contributor Author

Rhilip commented Dec 18, 2023

I checked all locations bencode is called and there were no cases where an explicit list was passed in. Implicitly, the only variables that were passed in were those decoded using the bdecode function, which after checking, there are no cases where lists might have integer keys that are out of order. So I think we're safe to assume a list is only a list when the keys are in order compared to the previous implementation where only the type was checked.

Thanks!

Yes, I agree with you. the isDict key seems very old method to check a PHP Array is list or dict. It create when bdecode and will removed in bencode, just in PHP runtime. So I think it's safe and removed in this commit.

One more thing, Is UNIT3D team willing to use my library https://github.com/Rhilip/Bencode for Bencode and Torrent ? If this lirary meets your requirements, I'd be more than willing to make a pull request.

@Roardom
Copy link
Collaborator

Roardom commented Dec 18, 2023

One more thing, Is UNIT3D team willing to use my library https://github.com/Rhilip/Bencode for Bencode and Torrent ? If this lirary meets your requirements, I'd be more than willing to make a pull request.

After this PR, I actually started doing a big refactor of the bencode helper and adding a bunch of tests. I was going based off the tests in https://github.com/arvidn/libtorrent/blob/RC_2_0/test/test_bdecode.cpp as a baseline. Does your library have an equivalent for each test in there? I think it might be missing a few such as defining a recursion limit which could prove useful.

@Rhilip
Copy link
Contributor Author

Rhilip commented Dec 18, 2023

One more thing, Is UNIT3D team willing to use my library Rhilip/Bencode for Bencode and Torrent ? If this lirary meets your requirements, I'd be more than willing to make a pull request.

After this PR, I actually started doing a big refactor of the bencode helper and adding a bunch of tests. I was going based off the tests in arvidn/libtorrent@RC_2_0/test/test_bdecode.cpp as a baseline. Does your library have an equivalent for each test in there? I think it might be missing a few such as defining a recursion limit which could prove useful.

the Rhilip/Bencode library has used for NexusPHP and other tracker, I already add some tests in my library.
And many thanks to you point out the test file in libtorrent. If you say "have an equivalent for each test in there", some may not coverage at yet, but I will try to add ASAP.

@Roardom
Copy link
Collaborator

Roardom commented Dec 18, 2023

If you're willing to add those tests, definitely take a stab at it! This code in UNIT3D hasn't been updated for years and definitely need some major improvement with edge case handling. Although one thing to worry about is since we save uploaded .torrents, we've saved previous torrents with less strict handling, and when we go to parse these torrents again we may run into issues that I haven't fully thought through yet. In UNIT3D, we would probably need to wrap the library so that we can handle all the exception cases in a single spot and decide which errors should be allowed in decoding previously saved torrents, but prevent any new uploads of out-of-spec torrent files.

@JoshyPHP
Copy link

If you're looking into replacing your decoder, I'll suggest using my library at https://github.com/s9e/Bencode if that's something that you can work into your project.

It's only a decoder/encoder, it's not meant to manipulate files. It uses ArrayObject instances for dictionaries to unequivocally differentiate them from lists, including empty dictionaries. ArrayObject allows dictionaries be accessed via the array notation or the object notation, but since it's not an array it can't be passed to a function that expects one unless you cast it to (array). The decoder throws exceptions on malformed/invalid input, but it has a secondary decoder that can be used to salvage non-compliant input such as unordered dictionaries and things like that. It's designed to handle malformed/malicious input gracefully and there's a fuzzer in the repository.

@Rhilip
Copy link
Contributor Author

Rhilip commented Dec 19, 2023

I don't think the ArrayObject instances or other BencodeSerializableInterface is needed in tracker. Since in most cases tracker only need to receive torrent file, check the torrent is vaild or not , change torrent metadata. For a Torrent, We can easily assert the $root['info'] is a dict and $root['info']['files'] is a list. The PHP Array is strong and powerful enough.


For test, I total review https://github.com/arvidn/libtorrent/blob/RC_2_0/test/test_bdecode.cpp?rgh-link-date=2023-12-18T07%3A09%3A09Z , 4 tests and 10 assertions add in commit Rhilip/Bencode@6ff306f . And I think it's enough. Some tests in arvidn/libtorrent may overdesign in PHP language since it build for btclient and torrent maker, for example, https://github.com/arvidn/libtorrent/blob/9c1897645265c6a450930e766ab46c02a240891f/test/test_bdecode.cpp#L482-L512 .Or just converage by other exception, mostly by Rhilip\Bencode\ParseException("Invalid integer format or integer overflow")

@JoshyPHP
Copy link

I mentioned ArrayObject for context and in case it's relevant to this project. The reason to use s9e/bencode is because it's solid, it's efficient, and it's designed to handle malformed and malicious input. If a website ever accepts bencoded data from users, that's the most important criterion for a decoder.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants