Skip to content

Conversation

divinity76
Copy link
Contributor

Ultimately i wish to make LOCK_SH usable in file_get_contents. With baby steps, i think the first step is to make sure FILE_USE_INCLUDE_PATH does not collide with LOCK_SH (which, prior to this patch, it does)

it's annoying to have to write

$fp = fopen("file","rb");
flock($fp, LOCK_SH);
$content = stream_get_contents($fp);
flock($fp, LOCK_UN);
fclose($fp);

when all i want to write is
$content = file_get_contents("file", LOCK_SH);

also it's really weird that file_put_contents support LOCK_EX but file_get_contents does not support LOCK_SH.

Ultimately i wish to make LOCK_SH usable in file_get_contents. With baby steps, i think the first step is to make sure FILE_USE_INCLUDE_PATH does not collide with LOCK_SH (which, prior to this patch, it does)

it's annoying to have to write
```
$fp = fopen("file","rb");
flock($fp, LOCK_SH);
$content = stream_get_contents($fp);
flock($fp, LOCK_UN);
fclose($fp);
```
when all i want to write is
$content = file_get_contents("file", LOCK_SH);

also it's really weird that file_put_contents support LOCK_EX but file_get_contents does not support LOCK_SH.
@divinity76 divinity76 marked this pull request as draft May 17, 2023 21:50
@divinity76 divinity76 marked this pull request as ready for review May 17, 2023 23:11
The userland lock constants are different than the C-land constants.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is wild. Let's totally just iterate over a range of ints as flag values and test the behavior 🤦🏻

But, that's not your fault. "Looks good to me."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hehe yeah, and thanks for checking! btw the old check was actually flawed and would miss some incorrect flags, for example flags = 10 aka flags= (1 << 1) | (1 << 3) is, and always was, incorrect for file() because file() doesn't support PHP_FILE_APPEND, but the old $flags error-check code would miss it. the new check doesn't. As an added bonus, seems the new flags check is about 7% faster too: https://quick-bench.com/q/j3k_unXcE91gVdTRH9Dm0nxpDUA
(probably because it doesn't do the flags < 0 check)

@bukka
Copy link
Member

bukka commented Jun 15, 2023

So file_get_contents("file", LOCK_SH) is surely not going to work unless we change the second arg from bool to int and introduce unnecessary BC break: So I don't think this is a good idea. We could maybe introduce another arguments for locking and then we don't need this redefinition of flags potentially.

I'm not to sure if file constants value redefinition is a good idea in general as you could see in one of the tests, some people might actually use the actual int value. This can be even saved in database or some serialized data so it becomes very hard to catch such break. I'm not saying that this is something we can't break but I think there should be a really good reason to do so which I currently don't see here.

It would be great if you could propose the API changes to file_get_contents on internals mailing list as this is not likely something we can resolve here and might potentially require RFC as it will likely result in some bikeshedding...

@divinity76
Copy link
Contributor Author

@bukka for people using documented arguments, i'm planning 0 BC break: 3 arguments are documented to be valid for file_get_contents()'s 2nd argument: bool(false), bool(true), and FILE_USE_INCLUDE_PATH, and I'm planning to propose changing the argument to bool|int $flags, where bool(false) is equivalent to $flags=0 and bool(true) is equivalent to $flags=FILE_USE_INCLUDE_PATH, and $flags accepting both FILE_USE_INCLUDE_PATH and LOCK_SH (and the bools)

ofc, that leaves the edge-cases of people using non-documented arguments, like 1 and people saving the arguments in a database (don't think anyone actually does the latter; i've surely never encountered it),
... but IMO making FILE_ compatible with LOCK_ is still justified: they're related concepts and it makes sense to use them together, and it's a shame that compatibility was not considered way back when the FILE_ constants were created.

forget file_get_contents for a second, imagine having to write:

$fp = fopen("filename","rb");
flock($fp, LOCK_SH);
$lines = file("filename", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
flock($fp,LOCK_UN);
fclose($fp);

when all you want to write is:

$lines = file("filename", LOCK_SH | FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

or imagine having to write:

$fp = fopen("filename","rb");
flock($fp, LOCK_SH);
readfile("filename");
flock($fp,LOCK_UN);
fclose($fp);

when all you want to write is:

readfile("filename", LOCK_SH);

there's several places that would benefit from LOCK_ and FILE_ being compatible.

It would be great if you could propose the API changes to file_get_contents on internals mailing list as this is not likely something we can resolve here and might potentially require RFC as it will likely result in some bikeshedding...

They're not interested: I tried brining this up 13 months ago (2022-04-23): https://marc.info/?l=php-internals&r=1&w=2&b=202204 - nobody wanted to discuss it.

I also tried brining it up 10 days ago (2023-06-06), still nobody wants to discuss it: https://marc.info/?l=php-internals&m=168604887630519&w=2

@bukka
Copy link
Member

bukka commented Jun 16, 2023

for people using documented arguments, i'm planning 0 BC break: 3 arguments are documented to be valid for file_get_contents()'s 2nd argument: bool(false), bool(true), and FILE_USE_INCLUDE_PATH, and I'm planning to propose changing the argument to bool|int $flags, where bool(false) is equivalent to $flags=0 and bool(true) is equivalent to $flags=FILE_USE_INCLUDE_PATH, and $flags accepting both FILE_USE_INCLUDE_PATH and LOCK_SH (and the bools)

Honestly I really don't like this conversion magic. Personally I don't like anything bool|int because it seems like a hack to me. What would be actually worse is keeping the name use_include_path also for flags as it is really confusing especially in combination with named params. Changing the name is of course a BC break as the name is part of the API since 8.0 (introduction of named params). You could argue that people should use positional argument as it is a second argument but we allow named param here and some people might be already using that as it is more explanatory. So we need to keep that name IMO.

I don't really understand why all of this cannot be implemented using extra parameter and why we would need flags here? It should not be an issue if it's the last parameter as it can easily be called by name (using named params). I would actually argue that file_put_contents should use separate parameter for locking which might be useful to add. Would be probably more just for consistency if this is introduced as we cannot remove LOCK_EX from flags for BC reasons.

Also note that the flags like FILE_SKIP_EMPTY_LINES and FILE_IGNORE_NEW_LINES are really meant just for file function and are not supported by stream layer so there is not much point to support them in file_get_contents. I think it makes sense really in the array conversion but using that for reading the whole file is questionable at least.

In terms of mailing list, it is quite usual to not get any reactions for technical topics and one really need to look to the details of the implementation to see the impact. This particular change has got my objection so it is not going to be merged without RFC in the current form. So to move forward you can choose between those two options:

  • Make RFC for the current implementation detailing all changes for all functions impacted.
  • Change it to use separate parameter for locking and if there are no objections here, write message to the internals detailing all changes and if there is no response, we could merge it without RFC. If there are objections, it would of course need RFC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants