Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix GH-13203 file_put_contents fail on strings over 4GB on Windows #13205

Closed
wants to merge 18 commits into from

Conversation

divinity76
Copy link
Contributor

@divinity76 divinity76 commented Jan 20, 2024

fix GH-13203

  • used 10GB in SKIPIF to account for people running run-tests.php in parallel; i want there to be a comfortable amount of system resources, don't want to run if it's just barely enough.

actually exist a macro to do the cast

(not implemented exactly how I would have preferred, but it's fine. is UINT_MAX actually guaranteed to be unsigned int? does specifications actually dictate that, or did the compiler developers just chose to implement it like that?)
@SakiTakamachi
Copy link
Member

I'm not very familiar with that, but did this solve the problem?
I was concerned because the behavior does not seem to change depending on PLAIN_WRAP_BUF_SIZE on the 64-bit version of Windows.

@divinity76
Copy link
Contributor Author

@SakiTakamachi don't actually know. Was hoping WINDOWS_X64_ZTS CI tester would tell me, but I can't actually see if the test passed or was skipped due to system resource constraints,
i can only see that it did not fail.

I tried getting MSVC to work, but that's seemingly non-trivial: MSVC2022 is not yet supported, and I would need to download MSVC2017 or something to actually compile PHP on MSVC? I gave up. Hopefully a developer actually equipped with MSVC can test it

@SakiTakamachi
Copy link
Member

I see, if I can look it this week, I'll check this out.

@nielsdos
Copy link
Member

I'm familiar with doing Windows debugs and I have a dedicated VM to do so, so I can also check tonight

@bukka
Copy link
Member

bukka commented Jan 23, 2024

It might be potentially better to enable chunking for big writes so we don't need to write it all in one go.

@nielsdos
Copy link
Member

I was concerned because the behavior does not seem to change depending on PLAIN_WRAP_BUF_SIZE on the 64-bit version of Windows.

PHP_WIN32 is also set on 64-bit Windows. This is because Win32 is actually the name of the subsystem, not the bitness.
So the macro actually does change something.
_WIN64 is the macro only set on 64-bit Windows.

@nielsdos
Copy link
Member

Just checked, CI skipped your test.

@nielsdos
Copy link
Member

I tried testing this, but my VM just dies when trying to do large I/O, so someone else will have to ¯_(ツ)_/¯

@bukka
Copy link
Member

bukka commented Jan 23, 2024

I have a real machine so I can take a look later this week.

@SakiTakamachi
Copy link
Member

@nielsdos
Thanks, but if you look a little above the changed line, it appears to be doing almost the same thing as the added macro, so I feel like the behavior is still the same.

@divinity76
Copy link
Contributor Author

divinity76 commented Jan 23, 2024

good point.. @nielsdos could you check if your VM hits the

		if (ZEND_SIZE_T_UINT_OVFL(count)) {
			count = UINT_MAX;
		}

condition on line 359? maybe the real issue is in ZEND_SIZE_T_UINT_OVFL() failing? or maybe i'm looking in the wrong place completely

@nielsdos
Copy link
Member

nielsdos commented Jan 23, 2024

Yes that condition is hit and the body is executed.
EDIT: and indeed this patch does not work.

@divinity76
Copy link
Contributor Author

divinity76 commented Jan 23, 2024

@nielsdos try again now.

  • can actually partially reproduce the issue with just
#include <iostream>
#include <stdio.h>
#include <stdint.h>
#include <BaseTsd.h>
#include <io.h>
#include<thread>
#include<chrono>
int main()
{
    FILE* h;
    tmpfile_s(&h);
    int h_int = _fileno(h);
    std::string gb5(size_t(5) * 1024 * 1024 * 1024, 'a');
    while (gb5.length() > 0) {
        int written_now = _write(h_int, gb5.data(), gb5.length() > UINT_MAX ? UINT_MAX-5 : gb5.length());   
        std::cout << "written_now" << written_now << std::endl;
        gb5.erase(0, written_now);
        std::this_thread::sleep_for(std::chrono::seconds(3));
    }
}
  • basically microsoft's _write function becomes unstable somewhere above INT_MAX

@nielsdos
Copy link
Member

The patch currently works.
It's not that it becomes unstable, it's peak idiotic API design actually: the _write function returns an int. So when the size becomes larger than INT_MAX, the return value becomes negative which is confused for an error that happened because error codes are also negative...

@divinity76
Copy link
Contributor Author

The patch currently works. It's not that it becomes unstable, it's peak idiotic API design actually: the _write function returns an int. So when the size becomes larger than INT_MAX, the return value becomes negative which is confused for an error that happened because error codes are also negative...

so basically, if you try to use _write() above INT_MAX, it becomes impossible to differentiate between error and success 😵‍

@nielsdos
Copy link
Member

nielsdos commented Jan 23, 2024

The following works too (tested), and do less system calls probably:

$ git diff main\streams\plain_wrapper.c
diff --git a/main/streams/plain_wrapper.c b/main/streams/plain_wrapper.c
index e9a30f3334..c5aa807c4f 100644
--- a/main/streams/plain_wrapper.c
+++ b/main/streams/plain_wrapper.c
@@ -358,10 +358,11 @@ static ssize_t php_stdiop_write(php_stream *stream, const char *buf, size_t coun
			count = UINT_MAX;
		}
		bytes_written = _write(data->fd, buf, (unsigned int)count);
+		if (errno != 0) {
 #else
		ssize_t bytes_written = write(data->fd, buf, count);
-#endif
		if (bytes_written < 0) {
+#endif
			if (PHP_IS_TRANSIENT_ERROR(errno)) {
				return 0;
			}
 

@nielsdos
Copy link
Member

so basically, if you try to use _write() above INT_MAX, it becomes impossible to differentiate between error and success 😵‍

Indeed, a real footgun. It's possible to check errno instead but yeah...

@divinity76
Copy link
Contributor Author

divinity76 commented Jan 23, 2024

bytes_written = _write(data->fd, buf, (unsigned int)count);
+		if (errno != 0) {

makes me uneasy, bytes_written may be negative and then return bytes_written; may return negative rather than positive,
and then the whole php_stdiop_write function may return negative values for a successful write, that will probably break something further up 🤔

@nielsdos
Copy link
Member

Yeah that's true. And would also cause problems in 32-bit where sizeof(ssize_t) == sizeof(int)...
I think your patch as-is is fine. Not sure about the heavy test case in CI, but deferring the final say to Jakub.

@SakiTakamachi
Copy link
Member

It seemed like there was a lot of discussion going on while I was getting ready in the morning.
I'm not very familiar with it so my opinion doesn't really matter, but it looks good to me.

Copy link
Member

@bukka bukka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable, just test needs to not use tmpfile and one NIT possibly.

main/streams/plain_wrapper.c Outdated Show resolved Hide resolved
ext/standard/tests/file/file_put_contents_5gb.phpt Outdated Show resolved Hide resolved
@divinity76 divinity76 requested a review from bukka January 29, 2024 19:58
@divinity76
Copy link
Contributor Author

time to merge this?

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this change and it works for me.
Code looks good. Waiting if bukka has anything left to say on this as he was the one requesting changes 🙂

main/streams/plain_wrapper.c Outdated Show resolved Hide resolved
ext/standard/tests/file/file_put_contents_5gb.phpt Outdated Show resolved Hide resolved
ext/standard/tests/file/file_put_contents_5gb.phpt Outdated Show resolved Hide resolved
Copy link
Member

@bukka bukka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly good, just one minor thing. I will also try to find some time to try the test on my local Windows to see if that actually works.

ext/standard/tests/file/file_put_contents_5gb.phpt Outdated Show resolved Hide resolved
@divinity76 divinity76 requested a review from bukka February 9, 2024 15:58
Copy link
Member

@bukka bukka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tested it on Windows and after modifying that skip check as suggested, it seems to work fine. Fails without the change and pass with the change...

Co-authored-by: Jakub Zelenka <bukka@php.net>
Copy link
Member

@SakiTakamachi SakiTakamachi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me except for one nits about code style.

(edit)
But I'm not as familiar with this as other members, so you should wait for other members' opinions.

ext/standard/tests/file/file_put_contents_5gb.phpt Outdated Show resolved Hide resolved
Co-authored-by: Saki Takamachi <34942839+SakiTakamachi@users.noreply.github.com>
@bukka bukka closed this in 2343791 Mar 9, 2024
devnexen added a commit to devnexen/php-src that referenced this pull request Apr 20, 2024
MSDN pages mention the buffer size upper limit is INT_MAX not UINT_MAX.
inspired by phpGH-13205.
devnexen added a commit that referenced this pull request Apr 20, 2024
MSDN pages mention the buffer size upper limit is INT_MAX not UINT_MAX.
inspired by GH-13205.

Close GH-14017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

file_put_contents fail on strings over 4GB on Windows 10
4 participants