-
Notifications
You must be signed in to change notification settings - Fork 429
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Grow buffer based on current size #771
Conversation
This passes locally for me. I don't see how And this is correct, isn't it?
Looks like it's with the |
The buffer has somehow grown to 0x60000000 bytes. That's not a power of two! But it's suspicously I'm guessing that we've somehow gotten a buffer of 6 bytes, and gone doubling from there. And that's exactly what happens in this special path in So, my patch is incomplete -- we can still get arbitrary (non-power-of-two) buffer sizes through this special path. I'll see what I can do about it. |
bfa615e
to
72dfb88
Compare
My nice little one-liner just grew a bit in complexity -- but now it passes with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Snild-Sony, thanks for the new version :+1 Please see the new comments below:
72dfb88
to
2214374
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Snild-Sony thanks for the replies adjustments! For some reason on MinGW the test on empty string ("") is failing: https://github.com/libexpat/libexpat/actions/runs/6578518007/job/17879007163?pr=771#step:6:490
I am unable to reproduce that failure locally. I took your https://github.com/libexpat/libexpat/tree/buffer-limit-mingw-experiments and added some extra stuff at https://github.com/SonyMobile/libexpat-contrib/tree/buffer-limit-mingw-experiments -- let's see if the logs show anything interesting. |
Looks like malloc just won't give us the required 1 GiB allocation. This surprises me, since the runners are supposed to have 7GiB of RAM. Since it passes on the "normal" builds, it must be somehow wine- or mingw-specific, but I have the same versions of those packages on one of my machines, and it passes there. I could make the test check that malloc+free of 1GiB is possible before the actual test, or make it use a custom malloc function that won't fail... but I'd prefer understanding what the actual problem is, if possible. Any guesses, @hartwork? |
@Snild-Sony great to have some first insights and some new ideas 🎉 👍 536870912 is precisely 2^29, interesting! (Oh wait, your loop is halving on integer level every time...) While it should not (have to) affect MinGW on Linux I remember that true Windows malloc is committing to that memory and committing to that much memory in the 32bit world is maybe considered too much and rejected artificially? I'll need to think over this more, maybe even ifdef-ing out the test for MinGW is an option, not anywhere near sure yet. |
I was thinking along the same lines, but then it should fail with 32-bit wine locally as well -- and it doesn't. There must be some other difference, but I don't know what. |
2214374
to
804d03d
Compare
This new upload adds a workaround for the MinGW failure in CI. On 32-bit, the test now tries to allocate 1GiB, and if it fails, goes down to 512MiB (but no further). This (still) passes locally, and now hopefully here on GitHub Actions too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Snild-Sony thanks for the adjustments, very nice! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Snild-Sony it turns out that clang-format has an issue still:
--- a/expat/tests/basic_tests.c
+++ b/expat/tests/basic_tests.c
@@ -2872,7 +2872,7 @@ START_TEST(test_buffer_can_grow_to_max) {
if (sizeof(void *) < 8) {
// Looks like we have a 32-bit system. Can we make a big allocation?
void *big = malloc(maxbuf);
- if (!big) {
+ if (! big) {
// The big allocation failed. Let's be a little lenient.
maxbuf = maxbuf / 2;
}
Until now, the buffer size to grow to has been calculated based on the distance from the current parse position to the end of the buffer. This means that the size of any already-parsed data was not considered, leading to inconsistent buffer growth. There was also a special case in XML_Parse() when XML_CONTEXT_BYTES was zero, where the buffer size would be set to twice the incoming string length. This patch replaces this with an XML_GetBuffer() call. Growing the buffer based on its total size makes its growth consistent. The commit includes a test that checks that we can reach the max buffer size (usually INT_MAX/2 + 1) regardless of previously parsed content. GitHub CI couldn't allocate the full 1GiB with MinGW/wine32, though it works locally with the same compiler and wine version. As a workaround, the test tries to malloc 1GiB, and reduces `maxbuf` to 512MiB in case of failure.
804d03d
to
119ae27
Compare
Järnspikars! My Ubuntu 22.04 distribution only provides clang up to 15, so #773 has made it hard for me to check locally (clang15 would've caught this one, but has different opinions than clang17 about some other lines). I've now installed clang 17 directly from the llvm project, so it should be fine going forward. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Snild-Sony thanks for the adjustment!
Järnspikars!
Google translate first thinks it's Estonian but when switching to Swedish explicitly it says "Iron Nails!" Probably not too different from "Verdammte Axt!" in German 😃
My Ubuntu 22.04 distribution only provides clang up to 15, so #773 has made it hard for me to check locally (clang15 would've caught this one, but has different opinions than clang17 about some other lines).
I've now installed clang 17 directly from the llvm project, so it should be fine going forward.
Excellent!
Until now, the buffer size to grow to has been calculated based on the distance from the current parse position to the end of the buffer. This means that the size of any already-parsed data was not considered, leading to inconsistent buffer growth.
Growing the buffer based on its total size makes its growth consistent.
The commit includes a test that checks that we can reach the max buffer size (usually INT_MAX/2 + 1) regardless of previously parsed content.