Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add missing requirements for internal file pointers #1134

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

mindplay-dk
Copy link
Contributor

@mindplay-dk mindplay-dk commented Jan 11, 2019

I am proposing the following amendment to PSR-17.

I am aware of the amendments clause of the bylaws:

Following the rules of the workflow bylaw, once a PSR has been “Accepted” the PSR meaning cannot change, backwards compatibility must remain at 100%, and any confusion that arises from original wording can be clarified through errata.

The rules for errata are covered in the workflow bylaw, and only allow non-backwards compatible clarification to be added to the meta document. Sometimes, modifications will be necessary in PSR document itself, and this document outlines those cases.

The question here is what exactly is implied by "backwards compatibility".

In the usual sense of "backwards compatibility" with regards to a class, for example, the addition of a new feature (such as a new method or optional argument) maintains "backwards compatibility", in the sense that it doesn't change something that already exists.

In the same sense, the omission of this important detail in the specification doesn't change something that already exists, but rather adds something that was missing and thereby addresses a problem with the existing specification.

While this will render some existing implementations "buggy", these were already "buggy", in the sense that they can't actually be used with existing emitters or with predictable results, as per the circumstances described in proposed section 5.7 of PSR-17-http-factory-meta.md.


The issue was debated on the forum here:

https://groups.google.com/forum/#!topic/php-fig/S5YIw-Pu1yM

https://groups.google.com/forum/#!topic/php-fig/mv9-MWDVVBM

@mindplay-dk
Copy link
Contributor Author

@Crell I've added an Errata section per your feedback on the forum.

I also added erratum references in the spec itself.

accepted/PSR-17-http-factory-meta.md Outdated Show resolved Hide resolved
@@ -290,6 +290,12 @@ responsible for the creation of the PHP resource to use as the basis for the
stream, and therefore also assumes responsibility for the read/write pointer
state of the resource and resulting stream.

And finally, the instances created by the `RequestFactoryInterface()`,
`ResponseFactoryInterface()` and `ServerRequestFactoryInterface` factories
must generate instances that return an empty body-stream in read and write
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the wording could be improved here:

must generate instances that contain an empty StreamInterface, in read/write mode, as the body

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same goes for other places where this wording is used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementations of PSR-7/17 should defer the StreamInterface instance creation until getBody() is called for the first time, since you may otherwise run out of OS file handles. (Linux has a limit of 1000 by default.)

So it doesn't have to "contain" a stream instance, and probably shouldn't - it just has to "return" one.

I had initially written "read/write" as you suggest, but changed it to "read and write" for clarity, since the "/" could be interpreted as "or", which would be incorrect.

But I will change the wording from "body-stream" to StreamInterface everywhere.

@@ -51,6 +51,8 @@ interface RequestFactoryInterface
}
```

The created instance MUST return an empty body-stream in read and write mode.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed elsewhere, the spec itself cannot be changed. We can only update the meta document with errata.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least as far as adding MUST statements go.

Perhaps it is acceptable to add SHOULD statements... @php-fig/secretaries ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the bylaws say so?

I think you should vote on an exemption here - the spec is considerably less useful without these amendments being visible where they should be. If an implementer has to dig portions of the spec out of the meta section (which isn't even in the same file) that's quite a serious problem.

If what you're saying is, this isn't an "amendment" and can't be considered "errata", then it sounds like we're back to the idea of deprecating PSR-17.

Given the fact that the current spec does not guarantee interoperable implementations, in my opinion, this is errata. The definition of errata in the bylaws is somewhat ambiguous.

As I've argued before, this isn't a "breaking change" in the sense that it's going back on something that was already specified - we aren't making changes versus the previous specification, we're adding something that was unspecified and is currently inconsistent among existing implementations.

I sincerely hope we can amend rather than putting the entire community through the painful process of replacing the entire standard with a new one?

@mindplay-dk
Copy link
Contributor Author

Let me know if there's anything else I should do here?

Copy link
Contributor

@shadowhand shadowhand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussing this with the other FIG working group members, I think that the beginning the wrong place to put the pointer. Instead, the pointer should be placed at the end of the stream. This is consistent with how resource handling works natively in PHP.

From the standpoint of PSR-17 it makes more sense to be able to add onto a given stream than it does to unintentionally perform a partial overwrite by forgetting to manually seek to the end of the stream. Despite the assertion that "rewinding the stream could have unacceptable performance implications" there is no proof given for that reasoning. Secondly the assertion that "an emitter ... can't simply rewind() the stream" may be technically correct but emitters can use the the isSeekable() method to determine if rewind() can be called.

This is also consistent with PSR-7, where the StreamInterface::__toString method declares that:

This method MUST attempt to seek to the beginning of the stream before reading data and read the stream until the end is reached.

Forcing the stream pointer to be at the beginning is awkward for existing resources because the only way to append to the stream would be to manually seek to the end of the stream, due to the fact that there is no counterpart to StreamInterface::rewind() that fast-forwards to the end of the stream.

For the reasons outlined above, please update this PR to document that streams created with StreamFactoryInterface SHOULD always place the pointer at the beginning of the stream. (We cannot use MUST because that would constitute a breaking change in the specification.)

@mindplay-dk
Copy link
Contributor Author

the pointer should be placed at the end of the stream. This is consistent with how resource handling works natively in PHP.

No, it isn't.

When you open a file, the file pointer is at the beginning of the stream.

This is true of PHP and any other language I can think of, and probably for the same reasons I cited in the additions to the meta: unpredictable performance, most likely use-case, and the fact that some types of streams can't be rewinded and/or aren't repeatable.

From the standpoint of PSR-17 it makes more sense to be able to add onto a given stream...

Onto a given stream, yes - not onto existing content being put into a new stream.

...than it does to unintentionally perform a partial overwrite by forgetting to manually seek to the end of the stream

Yes, that is an unlikely use-case.

It's also an extremely unlikely use-case to create a stream with existing content with the intent to write more content to the same stream.

There are basically two likely use-cases for createStream():

  1. Create a stream with existing content with the intent to read/emit that content.
  2. Create an empty stream with the intent to write to the stream.

A third use-case where you put some content into a stream and then continue writing to it is first of all unlikely - if given a stream with some existing content, you will almost certainly corrupt the stream by appending to it; there are almost no data formats that would permit it. If creating your own stream with existing content, it's more likely you will start with an empty stream and then write everything in increments - which you can just do, so there isn't any need for a third use either.

emitters can use the the isSeekable() method to determine if rewind() can be called.

And then? Throw an exception? ... And then what's the alternative?

Despite the assertion that "rewinding the stream could have unacceptable performance implications" there is no proof given for that reasoning

Suppose I'm reading from an S3 stream over the network - reading the entire stream twice definitely has unacceptable performance implications. (As it happens, this is a real use-case for us.)

This is also consistent with PSR-7

It is not.

PSR-7 does not specify anything about creation of streams.

I'm not sure what your point is with regards to __toString() - is is worded as "MUST attempt", implying what we know, that it may not be possible. PSR-7 also has getContents() which does not attempt to rewind - I'm not sure why you think consistency with __toString() is more important, they're just two different methods for two different use-cases.

(Note that the usefulness of __toString() is quite limited in the first place - it works only for smaller streams; an emitter can't safely rely on it, since it will have unpredictable memory usage and/or will simply crash for longer streams as you hit PHP's memory limit. An emitter will generally read and pass on the stream content in chunks, so __toString() is probably mostly useful for e.g. debugging use-cases.)

Forcing the stream pointer to be at the beginning is awkward for existing resources because the only way to append to the stream would be to manually seek to the end of the stream

But that is not what the proposed amendment says - for existing resources, it says literally the opposite.

Yes, for createStream() and createStreamFromFile(), it says:

The stream MUST be created with the current position of the file read/write pointer at the beginning of the stream.

This is consistent with opening a file with fopen() - this is useful for existing content, as well as for creating an empty stream to be populated with new content.

For createStreamFromResource(), it says:

The current read/write position of the given PHP resource MUST NOT be modified.

This is useful for existing resources.

It is only wrapping an existing PHP stream handle, and therefore doesn't make any assumptions about what state you may or may not want that stream to be in.

please update this PR to document that streams created with StreamFactoryInterface SHOULD always place the pointer at the beginning of the stream

The current meta wouldn't make any sense, and I wouldn't know what to put there instead.

We cannot use MUST because that would constitute a breaking change in the specification.

It constitutes a breaking change only in implementations that get this wrong in the first place.

We currently have a 50/50 split among implementations on this point, which is problematic only because we can't count on streams created with existing content being positioned so they're ready to read, and because streams aren't consistently rewindable.

You can seek to the end of any stream, if that's what you want - but you can't necessarily seek back to the beginning, and if you have to try, this can have performance implications, or simply might not work at all.

The language opens streams with the cursor at zero for those reasons.

That said, the note regarding openStreamFromFile() probably needs to change, as this is really a wrapper around fopen(), which lets you specify with the $mode argument how the file pointer will be positioned - e.g. a will position the file pointer at the end of the file, while r or w will position the file pointer at the start.

I'm not sure how to word that exactly, but it probably shouldn't prescribe that this factory function do anything other than pass the arguments to fopen() and leave the stream as-is?

And createStreamFromResource() is basically a given, so I think maybe at this point we're only really debating createStream(), which doesn't let you specify a mode? I'd assume rw, you seem to assume a, which as explained would be problematic.

Wanting to start with createStream("some content") and then write("more content") simply isn't necessary - you can just createStream("") and then write() everything. This can't come at the expense of predictable performance, and emitters have to be able to assume streams are ready to read and emit, since they can't necessarily rewind or repeat all streams.

@mindplay-dk
Copy link
Contributor Author

I've updated the spec and meta with regards to createStreamFromFile() behaving like fopen().

Regarding MUST vs SHOULD, please give this further consideration.

Adding an option to harmonize this behavior across implementations doesn't really solve the problem - it's almost as bad as being unspecified: since the behavior remains unpredictable, an emitter would still need to try to rewind the stream, which is the problem we're trying to address.

Again, whether this is a "breaking change" is a matter of interpretation. It was previously unspecified, so it's not a breaking change in the sense that it's reverting a previous decision - no decision was made, and this is an amendment to address that problem and achieve consistency.

Alternatively, I think we should consider creating a new PSR that does provide consistency.

The PSR isn't useful as it is, and adding optional consistency doesn't solve the problem.

@vdelau
Copy link
Member

vdelau commented Jan 25, 2022

@shadowhand as Editor/Maintainer for this PSR, can you indicate if this PR can be merged, closed, or needs more work?

@shadowhand
Copy link
Contributor

Returning to this (after nearly 3 years), I agree with the proposed changes and with the argument made by @mindplay-dk as to why.

@vdelau
Copy link
Member

vdelau commented Jan 26, 2022

In its current form, I believe this PR is not acceptable as an amendment. It introduces backward incompatible requirements to the specification, by using the key word 'MUST'. In doing so, all current implementations that do not meet these requirements suddenly would be non-compliant.

From the PSR Workflow bylaw:

Errata clarifications may only be added to the meta document, not to the spec itself, as items in their own section. The spec itself may only be minimally edited to point readers toward a relevant errata if appropriate.

The PSR Amendments bylaw offers the option to add annotations to the original spec:

If Errata is added which is deemed important enough by whoever is initiating the errata vote, annotations may be placed in or near the offending line so that readers know to view the errata for more information, with a link containing an anchor to that specific piece of errata.

As such, I believe one of two things should happen:

  • Amend the PR to be in line with the bylaws: Create an errata in the meta document and add annotations to the spec.
  • Start a new PSR to supersede PSR-17.

I would propose to bring this as a discussion to the list.

@vdelau vdelau requested review from vdelau and removed request for michaelcullum, iansltx and mstaples January 26, 2022 10:04
Comment on lines +126 to +128
*
* The stream MUST be created with the current position of the file read/write
* pointer at the beginning of the stream. (Errata 9.1)
Copy link
Contributor

@shadowhand shadowhand Jan 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mindplay-dk on closer inspection, this one is not correct. Given that this is an "open + write" operation, it would be natural for the pointer to be at the end of the stream and there is no guarantee that the stream can be rewound, making it impossible to place the pointer at the beginning of the stream.

Instead, I believe this can be combined with the previous line and read:

The stream SHOULD be created with a temporary resource, and the file read/write pointer SHOULD be placed consistently with an fwrite() operation. (Errata 9.1)

@@ -130,6 +139,10 @@ interface StreamFactoryInterface
*
* The `$filename` MAY be any string supported by `fopen()`.
*
* The stream MUST be created with the current position of the file read/write
* pointer positioned consistently with that of `fopen()` with regards to
* the `$mode` flag. (Errata 9.2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/MUST/SHOULD/

@@ -143,6 +156,8 @@ interface StreamFactoryInterface
*
* The stream MUST be readable and may be writable.
*
* The current read/write position of the given PHP resource MUST NOT be modified. (Errata 9.3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/MUST/SHOULD/

Comment on lines +351 to +352
The state of the created temporary resource was unspecified - per section 5.7, the
requirement was added to position the temporary resource at the beginning of the stream.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per my other comment, this should read:

Suggested change
The state of the created temporary resource was unspecified - per section 5.7, the
requirement was added to position the temporary resource at the beginning of the stream.
The state of the created temporary resource was unspecified - per section 5.7,
the recommendation was added to position the pointer consistent with an
`fwrite()` operation.

Comment on lines +356 to +357
The state of the created temporary resource was unspecified - per section 5.7, the
requirement was added to position the resource consistently with `fopen()`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The state of the created temporary resource was unspecified - per section 5.7, the
requirement was added to position the resource consistently with `fopen()`.
The state of the created temporary resource was unspecified - per section 5.7,
the recommendation was added to position the pointer of the resource consistently
with `fopen()`.

Comment on lines +361 to +363
The requirements did not specify whether this method might affect the state of the given
resource - per section 5.7, the requirement was added to clarify that this method MUST NOT
modify the position of the given resource.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The requirements did not specify whether this method might affect the state of the given
resource - per section 5.7, the requirement was added to clarify that this method MUST NOT
modify the position of the given resource.
The requirements did not specify whether this method might affect the state of the given
resource - per section 5.7, the recommendation was added to not modify the pointer position
of the given resource.

@mindplay-dk
Copy link
Contributor Author

Sorry, this was 3 years ago - I have no recollection of anything and I'm not currently using PHP.

Do what you want. 🙂

@vdelau
Copy link
Member

vdelau commented Jan 28, 2022

Sorry, this was 3 years ago - I have no recollection of anything and I'm not currently using PHP.

Thanks for letting us know!

Do what you want. 🙂

@shadowhand How do you want to proceed?

@shadowhand
Copy link
Contributor

@vdelau given that this sat for years and no one pushed for this to be resolved, I don't think this needs to happen.

That said, I do agree that not defining this in the spec was an oversight. If there is a way forward that involves no vote, or a simple CC acceptance vote, I think we should complete this.

@vdelau
Copy link
Member

vdelau commented Jan 28, 2022

I think the least intrusive way is to add an clarifying errata in the meta document and reference it from the main spec. I expect that the errata would be considered substantial and thus require a CC approval vote.

In any case, any change made has to be 100% backward compatible. That means that we can not guarantee that conforming implementation will implement these changes and users can not rely on the interface doing The Right Thing. Any errata would amount to a recommendation or a 'SHOULD' level requirement.

Comment on lines 125 to +128
* The stream SHOULD be created with a temporary resource.
*
* The stream MUST be created with the current position of the file read/write
* pointer at the beginning of the stream. (Errata 9.1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* The stream SHOULD be created with a temporary resource.
*
* The stream MUST be created with the current position of the file read/write
* pointer at the beginning of the stream. (Errata 9.1)
* The stream SHOULD be created with a temporary resource, and the file pointer SHOULD
* be placed consistently with an fwrite() operation. (Errata 9.1)

@shadowhand
Copy link
Contributor

@vdelau do you think this PR, with the suggestions I have made, are a reasonable change for the errata?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants