Language evolution overview proposal #2

Open
wants to merge 7 commits into
base: master

nikic
Member

@nikic nikic commented Feb 18, 2020

This is an overview proposal for different approaches to handle opt-in backwards incompatible changes. This is not a specific proposal: It is intended as a discussion starting point, so we can decide on the general direction we want to pursue.

Rendered Proposal


We may or may not want to support old editions indefinitely. Even if an edition only has a finite life-time, it is still a useful tool for managing version migrations, because it eliminates dependencies between projects. Each project can update to a new edition independently, without requiring updates to its dependencies first, or forcing an upgrade on its reverse-dependencies.

A related question is how often new editions are released. With the fine-grained declare approach it would probably be fine to add them in any minor version. With editions, it's unclear if we would want to also create a new edition for each minor version, even if there are few changes. Should these only be created for each major version instead? Or should they be created on demand, as we accumulate relevant changes?
Member

@bwoebi bwoebi Feb 18, 2020

I think editions should not be permanent. I do view editions as a migration path to opt-in when you want to. But ultimately after enough versions (I suggest each edition must exist for at least one major version, so probably 4-7 years), the editions of the previous major version should be merged into the main line and become default.
So the edition released with 8.2 should remain until 10.0. An edition released with 9.0 should also remain until 10.0. An edition released with 9.1 should remain until 11.0.

I think we should create an edition for each minor version (if there is something at all to add). So that people who want to gradually migrate can do it easily.
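
The retention rule suggested in this comment can be sketched as a small calculation. This is only an illustration of the three examples given (8.2, 9.0, 9.1); the function name `editionMergedIn` is hypothetical, and the rule itself is one commenter's suggestion, not a decided policy.

```php
<?php

/**
 * Hypothetical helper: given the PHP version that introduced an edition,
 * return the first version in which it would be merged into the main line.
 */
function editionMergedIn(string $introducedIn): string
{
    [$major, $minor] = array_map('intval', explode('.', $introducedIn));

    // An edition introduced in X.0 has lived through all of major X once
    // (X+1).0 ships. One introduced mid-cycle waits one more major so that
    // it exists for at least one full major version.
    $majorsToWait = ($minor === 0) ? 1 : 2;

    return ($major + $majorsToWait) . '.0';
}

foreach (['8.2', '9.0', '9.1'] as $version) {
    echo $version, ' -> ', editionMergedIn($version), PHP_EOL;
}
// 8.2 -> 10.0
// 9.0 -> 10.0
// 9.1 -> 11.0
```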

This. The common issue I have with P++ and Editions is that they both seem to assume an indefinite period of language variance. I'd much rather see them viewed as migration periods. (And in that view, P++ doesn't really work, while editions do.)

I would agree. The longest I could see an edition being maintained is "next major". So if there's an 8.0.0 edition, then 8.x supports both "Edition 7" and "Edition 8". PHP 9 would drop support for Edition 7 and only support Edition 8 and Edition 9. Etc. A mid-major edition would probably also get dropped at the next major.

Anything longer leads to the same kind of proliferation that individual declares do, and complicates matters for both engine devs and userspace devs.

Could not agree more with what @bwoebi wrote.

@bwoebi
Member

bwoebi commented Feb 18, 2020

If we move forward with editions bound to minor versions, they should probably be named edition=8.0 etc. instead of by year.


Editions are specified at the package (in Rust: crate) level, and opt-in to a bundled set of backwards-incompatible changes. Different packages using different editions remain compatible. Editions are intended to be supported forever, and there are some limitations on what kind of changes are permitted in editions (one of the significant limitations is that no standard library changes are allowed).

Editions were also intended to serve as a rallying point from a marketing perspective, though I believe that this coupling between a purely technical mechanism and the marketing angle was found to be confusing and detrimental in hindsight.

This is an excellent document @nikic, thank you for taking the effort to write it!

Re this point, do we (the PHP community) have links to people in the Rust community who can confirm this point, and how they found it? It's very helpful if we can learn from other language communities.

Member Author

See rust-lang/rfcs#2857 (comment) for one thread on this topic.


"Editions" are a concept popularized by the Rust programming language. See the [edition guide](https://doc.rust-lang.org/edition-guide/editions/index.html) and the [epoch RFC](https://github.com/rust-lang/rfcs/blob/master/text/2052-epochs.md) for more information.

Editions are specified at the package (in Rust: crate) level, and opt-in to a bundled set of backwards-incompatible changes. Different packages using different editions remain compatible. Editions are intended to be supported forever, and there are some limitations on what kind of changes are permitted in editions (one of the significant limitations is that no standard library changes are allowed).

I think it might be useful to split what is described in this document into two different things: syntax changes and standard library changes.

While syntax changes seem to get easily fixed by the editions mechanism, standard library changes don't.

I wonder how feasible it would be to detach ext/standard from php source and version it separately. Tools like composer allow extension version requirements and could support avoiding breaking changes among packages and maybe a smoother transition between ext/standard versions.

Smells weird even to me, but might be worth it taking this into consideration.

As I understand it (I haven't read this document fully yet), the key advantage of editions is that different libraries can be installed within the same project and use different editions.

To achieve the same with module versions, we'd need to fundamentally change name resolution to work more like node.js, so that e.g. you can have multiple implementations of strlen available, and a package defines which it wants to import.

With editions, you would instead have one implementation of strlen, but a way for packages to set flags for how it should behave, like a kind of environment variable. Note that this is already true of strict_types: strlen(123) will return 3 under strict_types=0 but throw an error under strict_types=1.
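
The strict_types behaviour mentioned here can be checked with a short script. This is a minimal sketch; the helper name `describeStrlen` is made up for the example, and the declare must be the first statement in the file.

```php
<?php
declare(strict_types=1);

// The declare above only affects calls made from this file. The same
// built-in strlen() exists everywhere, but here it is called in strict mode.
function describeStrlen($value): string
{
    try {
        return 'length ' . strlen($value);
    } catch (TypeError $e) {
        // In strict mode an int argument is rejected instead of coerced.
        return 'TypeError';
    }
}

echo describeStrlen('123'), PHP_EOL; // length 3
echo describeStrlen(123), PHP_EOL;   // TypeError
```

Removing the declare line makes the second call print `length 3` as well, since the integer is then coerced to the string "123".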

Let's put something more concrete on the table here. Parameter ordering, that old chestnut. Would we teach in_array() to take its arguments in a different order depending on the caller's edition? I would say "No", but I expect that to be an ask once such functionality is available, and from a technical standpoint it is possible, albeit ugly.

I don't think parameter ordering is a good example at all. Among all the changes PHP could benefit from, parameter ordering is the least important one IMO.

Versioning the standard library could open doors for easier deprecations/polyfills. Being clearer:
Say ext/standard:1.0.0 has strlen() compatible with PHP <=7.4. Later on, ext/standard:2.0.0 might present a String class that already handles multibyte strings and contains a length() method and therefore deprecating strlen() completely.

If this can be managed via composer (ext-string >= ^2.0.0), one can easily polyfill stuff they need and the Standard library can still move on without having to care about older functions.

Does it make sense?

@nawarian If all you're doing is introducing new functions and deprecating old ones, you don't need any changes to versioning, just official polyfills to keep using them after they've been removed. At which point, you've lost the main benefit of removing them at all, because you've still got to maintain them in some form.

If you're changing the definition of strlen itself, then versioning doesn't help, because the whole program has to decide which version to use, so might as well just set a min/max PHP version. That's why I mentioned node.js: there, one module can include "strings-1.0" and another simultaneously include "strings-2.0". That flat-out doesn't work in PHP, because strings-1.0 will declare its version of strlen, then strings-2.0 will try to redeclare it, and throw an error.

@IMSoP What if editions had separate symbol tables for functions, constants and classes?
If, for example, strlen were changed in the next edition, it would be looked up in that edition's symbol table, and the redeclaration issue would no longer exist. If an edition doesn't change strlen, the same implementation would simply be linked into multiple editions' symbol tables.
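
A userland sketch of that idea, purely for illustration: the class `EditionFunctionTable`, the edition names "legacy" and "next", and the code-point-counting variant of strlen are all hypothetical, and none of this reflects how the engine would actually implement it.

```php
<?php

final class EditionFunctionTable
{
    /** @var array<string, array<string, callable>> edition => name => implementation */
    private $tables = [];

    public function define(string $edition, string $name, callable $impl): void
    {
        $this->tables[$edition][$name] = $impl;
    }

    // Link an unchanged function from one edition's table into another.
    public function link(string $from, string $to, string $name): void
    {
        $this->tables[$to][$name] = $this->tables[$from][$name];
    }

    // Resolve the function in the caller's edition and invoke it.
    public function call(string $edition, string $name, ...$args)
    {
        return ($this->tables[$edition][$name])(...$args);
    }
}

$functions = new EditionFunctionTable();

// "legacy" keeps the byte-counting built-in.
$functions->define('legacy', 'strlen', 'strlen');
$functions->define('legacy', 'strtoupper', 'strtoupper');

// "next" redefines strlen to count UTF-8 code points (assumed change).
$functions->define('next', 'strlen', function (string $s): int {
    return (int) preg_match_all('/./us', $s);
});
// strtoupper is unchanged, so the same implementation is linked.
$functions->link('legacy', 'next', 'strtoupper');

$word = "h\u{e9}llo"; // "héllo": 5 code points, 6 bytes in UTF-8
echo $functions->call('legacy', 'strlen', $word), PHP_EOL; // 6
echo $functions->call('next', 'strlen', $word), PHP_EOL;   // 5
```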

Does that make sense?

It seems to me we already have a perfectly workable versioning system in place for library functions: Namespaces.

We leave the existing functions alone — and possibly deprecate them eventually — and we create new functions in a small group of well-known namespaces.

@sgolemon's point about parameter ordering could be addressed by creating a new namespace and "fixing" the parameter order there.

All we have to do is embrace this concept of using namespaces for newer functions in core.
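
As a rough sketch of what this could look like, the namespace `Hypothetical\Arr` and the function `contains()` below are invented for the example. Putting the haystack first and comparing strictly are assumptions about what "fixing" would mean.

```php
<?php

namespace Hypothetical\Arr;

// Hypothetical replacement for in_array(): haystack first, strict
// comparison by default. The global in_array() is left untouched.
function contains(array $haystack, $needle): bool
{
    return \in_array($needle, $haystack, true);
}

var_dump(contains([1, 2, 3], 2));    // bool(true)
var_dump(contains([1, 2, 3], '2'));  // bool(false)
var_dump(\in_array('2', [1, 2, 3])); // bool(true), the old function still works
```

Existing code calling in_array() keeps working unchanged, while new code opts in per call site by importing the namespaced function.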

@rask

rask commented Feb 18, 2020

Editions and PHP version lifecycles

You have not touched on the subject of interaction between release lifecycles and editions from the looks of it.

Let's say PHP 7.4 has editions 1, 2, and 3 available. PHP 8.0 has editions 1, 2, 3, and 4 available. All features in edition 1 are deprecated in PHP 8. Next we get PHP 9.0, which removes the deprecated features and supports editions 2, 3, and 4, rendering edition 1 unusable. How will the edition be phased out?

Opting in to editions or similar

Introducing some package config/syntax seems a bit much. Adding a new opening tag would itself be a huge compatibility break in terms of tooling that operates on PHP files. A new declare for editions is the best fit if editions are to be introduced, I think.

Catering to Composer, while it sounds neat, seems to me a little outside the scope of evolving PHP core.

@dkarlovi

How will the edition be phased out?

As I understood it: it doesn't get phased out?

A Rust compiler will support all editions that existed prior to the compiler's release...


Each new edition/declare adds additional maintenance burden, because it requires supporting two different behaviors in the same implementation. In Rust the concept of editions includes a commitment to support them indefinitely (though in practice, backwards-compatibility breaks do sometimes get backported to earlier editions, just later).

We may or may not want to support old editions indefinitely. Even if an edition only has a finite life-time, it is still a useful tool for managing version migrations, because it eliminates dependencies between projects. Each project can update to a new edition independently, without requiring updates to its dependencies first, or forcing an upgrade on its reverse-dependencies.

What does this mean for the absence of a declare(edition)? If that's equivalent to declare(edition='7.4'), then this should note whether an edition would become mandatory if support for old editions is dropped.

While I don't generally want to support all editions indefinitely, I think there's an argument in favor of supporting "php1995" indefinitely as the default fallback for not including an edition declare. (Or call it php2020 if you'd like, whatever equates to the current status quo.)

We could codename "legacy" pre-edition PHP "hindsight". Because then hindsight will always be php2020.

...I'll show myself out now.

Sigh...


While having fine-grained declares has its advantages, and is the option I advocated for initially, I think that concerns about the proliferation of declares, and exponential explosion in language variants are very much real. Editions, even if new ones are introduced regularly, significantly reduce the number of language dialects, thus reducing mental burden for developers and maintenance burden on our side.

Once we go with editions, there is not a lot of benefit to investing into a new mechanism for package-scoped declares, all of which have their own issues. The per-file declare does have its own advantages, in particular that it keeps things self-contained, and that it allows partial upgrades of a codebase. The ergonomics of per-file declares are of course worse, but as people would essentially be replacing `declare(strict_types=1)` with `declare(edition=2020)`, it's not worse than the current situation. (Furthermore some kind of package-scoped declare mechanism can still be introduced at a later time.)

I'd definitely agree that editions would make the code easier to read, but bundling changes that are hard for a given codebase to be migrated to and validated is a disadvantage (e.g. strict_types, in projects using strings from HTTP requests, db results, etc. as int in some places).

  • The arguments against (e.g. your initial reasons) could be included in the RFC

It's inconvenient but not insurmountable, though; tooling could automatically convert return $unknownType; to return weak_coerce_to_int($unknownType); if support for the edition with strict_types=0 was dropped. (in cases where the real type was uninferable).

(or $unknownType->methodName($a, $b) to weak_call_user_func([$unknownType, 'methodName'], $a, $b))
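
One way such a helper might be written in userland is sketched below. `weak_coerce_to_int()` is the hypothetical name from this comment, not an existing PHP function, and the sketch relies on the file it lives in not declaring strict_types=1.

```php
<?php
// No declare(strict_types=1) in this file, so the return type below is
// enforced with weak-mode coercion rules.

// Hypothetical helper that migration tooling could insert.
function weak_coerce_to_int($value): int
{
    return $value;
}

var_dump(weak_coerce_to_int('42')); // int(42)

try {
    weak_coerce_to_int('abc');
} catch (TypeError $e) {
    // Non-numeric strings are still rejected, exactly as in weak mode today.
    echo 'TypeError', PHP_EOL;
}
```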

It may be useful to add migratory editions, which would behave like an older edition and emit notices that provably didn't affect the behavior of the program (e.g. syslog() to the syslog handler or stderr, but don't call the user-defined error handler)

@sgolemon

Overall +1 on the work you're doing here. Clearly we need to have several conversations to narrow in on a roadmap, but this is a good starting point to summarize the many discussions we've had in recent years.

@MichelJonkman

Personally I like the opening tag solution. It's very easy to edit the PHP file templates using a decent editor and keeps our files as clean as they've always been.

@mallardduck

mallardduck commented Feb 19, 2020

Editions and PHP version lifecycles

You have not touched on the subject of interaction between release lifecycles and editions from the looks of it.

Let's say PHP 7.4 has editions 1, 2, and 3 available. PHP 8.0 has editions 1, 2, 3, and 4 available. All features in edition 1 are deprecated in PHP 8. Next we get PHP 9.0, which removes the deprecated features and supports editions 2, 3, and 4, rendering edition 1 unusable. How will the edition be phased out?

This is really important IMO because while mimicking Rust's concept of editions isn't a bad idea, we can't implement it too directly. Rust is a compiled language which provides different benefits/challenges to implementing eternally supported editions.

In a similar train of thought, how would PHP versions and PHP editions affect distributing PHP for production environments? Currently PHP seems to follow a SemVer-style versioning scheme, and Linux distros/hosting panels release these based on Major.Minor versions. So while I like the concept overall, it raises a lot of questions from a logistics perspective.

Let's assume that PHP 8.0 is the first release with editions built in and supports edition 1 (and 0 as default) on release. When distros start shipping PHP 8.0, would the default mode be to operate identically to PHP 7.4, with any new features accessible only with edition=1?

In the past PHP (again following SemVer) has added great features in minor releases. Things like namespaces, traits, ... array unpacking were added in 5.x minors; and the object type, trailing commas, typed properties, FFI, etc. were all added in 7.x minors. How would editions affect the out-of-the-box behavior as minor versions are released with new features like this?

@deleugpn

Let's assume that PHP 8.0 is the first release with editions built in and supports edition 1 (and 0 as default) on release. When distros start shipping PHP 8.0, would the default mode be to operate identically to PHP 7.4, with any new features accessible only with edition=1?

I think this is covered in the document in the following statement.

It should be emphasized that even if we introduce a mechanism like "editions", that does not imply that all backwards-incompatible changes will be handled through editions, and it also doesn't imply that any kind of backwards-compatibility break is fine as long as it's based on editions.

PHP stays PHP as it is, and features are added with the best effort for BC. A BC break, if small enough, may happen in the main line as it has for ages. Things that would supposedly be extremely beneficial to the language (such as strict types) and that have no other way of landing in PHP except via opt-in would require the edition declare at the top of the file. This would have little or no effect on distribution and the release process, as the main line continues to be the language we all know and love. The current state of the PHP ecosystem demonstrates that even though it might be a little annoying, people are creating PHP files with a declare at the top. Replacing one declare with an edition declare seems feasible. People might still dislike the fact that they have to do it, but they got used to it, and the proposal relies on that by simply not making it worse (several declares).

@rask

rask commented Feb 19, 2020

How will the edition be phased out?

As I understood it: it doesn't get phased out?

A Rust compiler will support all editions that existed prior to the compiler's release...

Yes. Rust does Rust editions, but will PHP be able to adopt that promise of editions never being removed? Or would we have to have a security maintenance period of forever for all major PHP versions from 8 onwards?

@nikic
Member Author

nikic commented Feb 19, 2020

Going by the current discussion, it seems like "editions" is the favored approach, with the big open questions revolving about what the support timeline for editions looks like. There are two approaches, which each have their advantages:

Old editions supported forever: This is the Rust model.

  • Pro (user-perspective): Can upgrade PHP versions indefinitely without touching old code (much).
  • Con (user-perspective): The language fragments over time. The number of language dialects always increases, and users need to be aware of them.
  • Con (dev-perspective): We have to expend resources to maintain old editions indefinitely.
  • Pro (dev-perspective): The bar for including changes in editions is lower, because there is no requirement that all code eventually updates to them. This allows including changes that legacy programmers are categorically opposed to.

Editions have limited life-time: It should be noted that this still solves the primary motivation of decoupling the upgrade process (libraries/apps can update completely independently, there are no dependencies).

  • Con (user-perspective): You still need to update your code for new editions at some point. You have more time to do so, and you can avoid dependency issues, but you still need to do the work eventually.
  • Pro (user-perspective): The number of language dialects is limited and we don't fragment the language over time. There's a clear direction in sight.
  • Pro (dev-perspective): We don't have to maintain old editions forever. We can get the internal benefits of changes we introduce in new editions at some point.
  • Con (dev-perspective): Changes introduced in an edition will still apply to everyone at some point. This may cause more resistance to introducing stricter typing and similar.

@bwoebi
Member

bwoebi commented Feb 19, 2020

I disagree with the "Pro (dev-perspective):" on old editions being supported forever - legacy users want their legacy features, but also the shiny new things except those they don't like.

@nikic
Member Author

nikic commented Feb 19, 2020

To expand on @bwoebi's comment, after clarifying otr: Legacy code may still want to update to new editions, and its maintainers don't want to stay on an old edition because of a single undesirable change that was made (while everything else is fine). This is a fundamental tension when using "editions" rather than "fine-grained declares", because you can't mix and match as you like; you get things as a bundle.

In this context, one possibility we discussed is to have both the edition and the strict_types declares. On newer editions strict_types will grow to include strict typing in additional areas (e.g. for operators and not just arguments). People commonly suggest adding some extra feature to strict_types, but we always reject this due to backwards-compatibility. With editions it would be possible.

So we would end up with a support matrix that looks something like this:

  • declare(strict_types=0):
    • Baseline.
  • declare(strict_types=1):
    • Strict types in function arguments and returns.
  • declare(edition=2020, strict_types=0):
    • Explicit pass-by-reference required.
    • Arbitrary string interpolation supported.
    • Maybe: Dynamic object properties forbidden.
  • declare(edition=2020, strict_types=1):
    • All the above, and...
    • Strict types in operators.
    • Maybe: Dynamic object properties forbidden. (Unclear whether this should be strict_types only or not.)

This would separate out changes related to stricter typing from other changes.
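
The matrix above can also be written down as a small lookup. This is a sketch only: the function name `behavioursFor` and the behaviour labels are illustrative, `null` stands for "no edition declared", and the "Maybe" items are left out because their placement is undecided.

```php
<?php

// Hypothetical lookup mirroring the support matrix.
function behavioursFor(?int $edition, bool $strictTypes): array
{
    $behaviours = [];

    if ($edition === 2020) {
        $behaviours[] = 'explicit pass-by-reference required';
        $behaviours[] = 'arbitrary string interpolation supported';
    }

    if ($strictTypes) {
        $behaviours[] = 'strict types in function arguments and returns';
        if ($edition === 2020) {
            // Only the combination of both declares extends strictness.
            $behaviours[] = 'strict types in operators';
        }
    }

    return $behaviours;
}

print_r(behavioursFor(2020, true));
```

With one edition and one flag this gives four combinations; every further independent flag doubles that number, which is the proliferation concern raised for fine-grained declares.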

Not sure how much this would help though, overall. For example one of the recent points of contention was making undefined variables throw. Would that also go under "strict_types" or not? Seems fuzzy.

@deleugpn

deleugpn commented Feb 19, 2020

@nikic your last statement seems to present a mix between granular declare and editions. I was under the impression that strict_types would be categorized as a successful test run that led to editions and would get phased out into editions eventually. If instead what you're saying becomes the outcome, there's nothing fundamentally preventing new granular declares to be proposed, increasing the matrix and leading to more mix of editions and granular declares. I know in practice this takes a great hit in the voting stage and it is very unlikely to actually happen.

This part of the discussion is interesting to highlight and align the expectations. With editions taking the lead, will strict types always be supported and be part of the language? Is the door partially opened to more specific declares? Wouldn't it be better to have edition=strict be equivalent to strict_types=1, consider strict_types a phased out project and focus primarily on editions-only without causing a matrix?

@IMSoP

IMSoP commented Feb 19, 2020

@deleugpn Nikita's comment explicitly says that having both declares available is one possibility, and discusses the pros and cons of that. We're not at the point where there's a singular proposal for how this should work, we're investigating lots of different variations.

I think there's a fundamental tension whether the different behaviours represent "old vs new", or just "red bikeshed vs blue bikeshed". As I understand it, strict_types was not added as a migration mechanism, but as a compromise between two competing designs of the same new feature. If we tie it into editions, we are effectively saying that strict_types=0 is deprecated, because the only way to use it is to also turn off all other new behaviours.

Rather than expanding the meaning of strict_types, perhaps there could be a concept of "flavour" alongside "edition", with every block of code being in exactly one flavour of one edition. So the current behaviour under declare(strict_types=1) would become declare(edition=2020,flavour=strict). It still leaves the question of how to group behaviours into those flavours, though.

We might still need granular declares anyway. In the "locked classes" discussion, a couple of scenarios were put forward where mostly-strict code would want to dynamically define or unset an object property. Having a granular declare that can be block scoped allows an "escape hatch" for that code, but an edition or flavour switch would be awkward to use.

@Pierstoval

Pierstoval commented Feb 19, 2020

I also like the idea of a strict flavour, this is similar to the good'ol "use strict"; in javascript, and allows opt-in per-file strictness in many shapes, that could start with strict typing, but also prevent dynamic object properties, strict arguments for native functions, etc., and more shapes that would appear in the future editions. Even though I think that declare(edition=2020,strict=1); would be clearer and more straightforward to me, but that's a small detail.

@Jean85

Jean85 commented Feb 19, 2020

IMHO adding flavours to editions just mixes the edition and fine-grained declares approaches, reducing the overall benefit. Making the editions linear should reduce the php-src code complexity, because the language would be in a single known state at a time, not in one of 2^n.

@deleugpn

deleugpn commented Feb 20, 2020

If we assume what Sara said about PHP1995 is a valid statement, then wouldn't it be a fair plan to establish strict_types as the first edition and move on from there with every new edition being an increment on top of the previous one?

I'm under the impression that most people using strict_types want faster evolution at the expense of some BC breaks. It leads to two options in the language: stick with the PHP 1995 edition or adopt a faster BC break pace. People who opt into editions know that there will be a time when they need to bump to the new edition, and going back to editionless might bring more breaks than the upgrade. The end result is a commitment to either always take some BC breaks and keep using editions, or use no editions at all. As Crell said, we're not giving a "make your own language" option. There isn't much room to make everyone super happy without a magic wand.

For me the question to the community would be: If you did opt into strict_types, is it fair for you to automatically be part of the community that will be forced into a bit more BC breaks even though you were not aware of that at the time of declaring strict_types? That way, strict_types becomes an alias of edition=1 and internals can start working on edition=2, which would be edition=1 plus more strictness-like features and/or some breaks, such as making errors throw instead of returning false.

The PHP1995 community would be aware that they can miss out on some historical fixes if they choose not to use editions. The Editions community would be aware that if they don't keep up, they will be left in a broken state (a BC break to go back to editionless, a BC break to move forward). It's not a perfect solution for anyone, but it doesn't seem bad for anyone either.
From my employer's perspective, I get to upgrade legacy projects written 15 years ago to the latest version with minimum effort for security reasons, while anything new I write can be part of editions.

@bwoebi
Member

bwoebi commented Feb 20, 2020

Note that the premise of strict_types was not being a migration path, but being an actual separate, forever supported behavior.
That was what I voted for. (And I wish it to remain as is. But that's not relevant here.)

I would in any case suggest that we do not mix both concerns:

  • indefinitely supported alternate behaviors
  • opt-in behaviors to provide a migration path

The former should have a really high bar. Maybe we sometime will add another, but that has to be very carefully evaluated on a case by case basis, nothing this RFC should address.
Strict types are not a failure, but weak typing is not a failure either.

Whatever shall happen to strict types transcends the scope of this RFC and should be handled separately. (At least if we go with editions, which seems to sort of be the consensus here.)

```php
<?php

package "nikic/php-parser";
```

An interesting aspect of declaring a package inline is that it gives tools an unambiguous signal that the file is not self-contained, and that the configuration for that package should be found. It could even be an error to have a package name with no configuration, perhaps using an autoload callback to load the configuration on demand.

Additionally, if the package configuration can only be set once, and not amended, a tool that has found the configuration for that package can assume it is processing the file correctly. A similar advantage holds for OpCache: unlike a namespace or directory-based approach, there is exactly one point of invalidation to track, not a hierarchy.

In general, I think package configuration works better than editions for those options which are truly optional, not merely a migration path for old code. Rather than a sliding scale of "newness" or even "strictness", they let a package author get the advantage of checks they want without the overhead of checks they disagree with, or consider disproportionately burdensome.

My thoughts on package definition, were we to ever go that way, would be to assume that user-space standards like FIG can help. For instance, if a "package" were a namespaced definition, just like classes and traits, then you could say "this is part of package My\Stuff", and then the autoloader would kick in to look for

namespace My;

package Stuff { ... }

And FIG can via PSR say that packages follow the same autoloading as classes. And then what goes into a package is... well, whatever we decide it is. Technically it wouldn't even have to correspond to a Composer package, although PSRs could of course make that a recommendation.

That said, if not using packages for behavior flags then there's a limited set of things I would see a package doing at all; Visibility perhaps, but that's it. So I'm not sure if it would do enough to be worth it.

@IMSoP

IMSoP commented Feb 20, 2020

@deleugpn

> For me the question to the community would be: If you did opt into strict_types, is it fair for you to automatically be part of the community that will be forced into a bit more BC breaks even though you were not aware of that at the time of declaring strict_types?

We cannot ask this question without also asking the opposite: if you opted out of strict_types, is it fair for other new features to be unavailable to you?

@Jean85

Jean85 commented Feb 21, 2020

> We cannot ask this question without also asking the opposite: if you opted out of strict_types, is it fair for other new features to be unavailable to you?

I don't think that's the right question to ask. IMHO the point should be how complicated it is to NOT bind strict_types to an edition, as I said before. The deciding factor should be maintainability of the core code; we cannot place too hard a burden on the core maintainers. We can just hope that, since strict_types is already implemented, it isn't so hard; otherwise I agree with @Crell:

> There will always be language changes you don't like, and don't want to deal with, but have to anyway. That will happen forever, no matter what approach we end up taking. People still complain about using backslashes for namespaces, but sorry, that's the way it is, deal.

@IMSoP

IMSoP commented Feb 21, 2020

@Jean85

> I don't think that's the right question to ask. IMHO the point should be how complicated it is to NOT bind strict_types to an edition, as I said before.

It doesn't make any sense to say that just because option A has a cost we don't need to even measure the cost of option B. The community made the decision to add strict_types=0 as an option that any user could choose; labelling it as part of a legacy mode should be considered an explicit change of policy, which would need to be justified in terms of costs and benefits like any other change.

@shulard

shulard commented Feb 21, 2020

Regarding strict types, it was not introduced as an upgrade path and this flag must not be removed in the future. However, it was decided that the default value was 0.

Today if you don't specify the declare, your code is processed like you wrote declare(strict_types=0);.

We can understand this as edition 0 if we want, but that's the current behavior. Then in a new major version we can decide that the flag will have a default value of 1, so you must explicitly declare that you want 0. This will not go against the strict_types definition but will still move the language forward.

Since the strict_types flag is currently the only one which has this behavior, we can decide to mix it with editions when required. It'll only have the power of the first to have been introduced.

Then we'll have :

```php
declare(strict_types=0); // I'm using the edition 0
declare(strict_types=1); // I'm just using strict types on my code base
declare(edition=2020); // I'm using the edition 2020 so strict_types is now 1
declare(edition=2020,strict_types=0); // I'm using the edition 2020 with disabled strict_types
```

This of course means that we, as a community, decide that strict_types=1 must be added in the first real edition.
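
The four cases above boil down to one rule: an explicit strict_types value always wins, otherwise the edition picks the default. A minimal sketch of that rule follows. The function name `effectiveStrictTypes()` and the constant `FIRST_STRICT_EDITION` are invented for the example, and the threshold of 2020 is simply taken from the declares listed above; `null` stands for "not declared".

```php
<?php
// Hypothetical helper; the edition declare does not exist in PHP today.
const FIRST_STRICT_EDITION = 2020; // assumption: first edition whose default is strict_types=1

function effectiveStrictTypes(?int $edition, ?int $strictTypes): int
{
    if ($strictTypes !== null) {
        // An explicit declare(strict_types=...) always overrides the edition default.
        return $strictTypes;
    }
    if ($edition !== null && $edition >= FIRST_STRICT_EDITION) {
        // Opting into the edition flips the default to strict.
        return 1;
    }
    // No declares at all: today's behavior, i.e. "edition 0".
    return 0;
}

$cases = [
    'strict_types=0'               => effectiveStrictTypes(null, 0),
    'strict_types=1'               => effectiveStrictTypes(null, 1),
    'edition=2020'                 => effectiveStrictTypes(2020, null),
    'edition=2020, strict_types=0' => effectiveStrictTypes(2020, 0),
];
```

Written this way, the edition only changes a default and never removes the strict_types=0 choice, which seems to be the point of the comment.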

@Crell

Crell commented Feb 21, 2020

strict_types is an interesting problem, I agree. I don't know that I have much to add to what others have said in terms of analysis. I'm also in the camp that would be happy to see weak mode go away eventually, but I realize that's unlikely to pass a vote.

That said, if I understand @nikic's proposal correctly it's also slightly a red herring; as he noted, strict types code is 100% compatible with weak type mode. So if you write to strict, you're always safe.

The harder problem is changes where there is no "safe" compatible option; a given piece of code will work properly in one mode/edition/thing but not the other, and vice versa, at least not without some funky extra logic.

So... actually I think there's 4 categories of changes to consider:

  1. New features that have no BC impact beyond maybe a new reserved word or function name. This is most things. Generics, property accessors, comprehensions, read-only properties, etc.
  2. New features where we explicitly want to have it be user-enableable or not. Strict types is the example here. I... frankly think this category should be as small as possible, perhaps never used again beyond strict types, to avoid the k^n problem.
  3. Behavior changes where we explicitly are providing a BC grace period, where code written new-style will always run in old-style but we want to let people start using stuff from category 1 without needing to rewrite everything. This is where we've usually used E_DEPRECATED. An edition flag would be a per-file/bundle alternative. Example here would be throwing an E_NOTICE on an undefined variable, or writing to an undefined object property.
  4. Behavior changes where code cannot realistically run in both new and old modes, so you have to pick one mode or the other because code won't work in both.

Type 1 is already covered.
Type 2, again, I think we should avoid as it creates an ever-increasing maintenance burden.
Type 4 is super dangerous, because it risks creating an ecosystem break. Per-file/bundle editions/flags are probably the only way to support it, and even then it should be rare.
Type 3... I think this is the ideal use case for editions/flags/what's being discussed. It shouldn't be used much, but that would be the way to do it.

@deleugpn

deleugpn commented Feb 21, 2020

> Then we'll have :
>
> ```php
> declare(strict_types=0); // I'm using the edition 0
> declare(strict_types=1); // I'm just using strict types on my code base
> declare(edition=2020); // I'm using the edition 2020 so strict_types is now 1
> declare(edition=2020,strict_types=0); // I'm using the edition 2020 with disabled strict_types
> ```

My suggestion was intended precisely to avoid the four-way combination you described. I don't think we need to change the default behavior to 1 at all. I just think that strict_types=1 can be aliased to edition=1 and the core team can work on edition=2, which would provide more behavior changes that could not be unbundled from strict types. The end result is:

edition=0 is the same as strict_types=0 and forever supported and forever default
edition=1 is the same as strict_types=1 and will someday reach end of life
edition=2 is the same as edition=1 with more changes.
edition=3 is the same as edition=2 with more changes.

The only version forever supported is edition=0. The release cycle can be similar to the current major version scheme:

Support for 0, 1 and 2.
Support for 0, 2 and 3.
Support for 0, 3 and 4 and so forth.
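
For concreteness, the support window listed above (edition 0 forever, plus the two most recent editions) can be written as a tiny function. This is only a sketch of the suggestion as stated in this comment; `supportedEditions()` is a made-up name and nothing like it exists in PHP.

```php
<?php
// Hypothetical: which editions would be supported once $latest is the newest edition.
function supportedEditions(int $latest): array
{
    if ($latest < 0) {
        throw new InvalidArgumentException('Edition numbers start at 0');
    }
    // Edition 0 is always kept; the window then covers the previous and the latest edition.
    $editions = [0, max(0, $latest - 1), $latest];

    // Early on (latest = 0 or 1) the entries overlap, so drop duplicates and reindex.
    return array_values(array_unique($editions));
}

$afterEdition2 = supportedEditions(2);
$afterEdition3 = supportedEditions(3);
$afterEdition4 = supportedEditions(4);
```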

@nikic
Member Author

nikic commented Feb 22, 2020

@deleugpn I don't think that support model makes a lot of sense. We can either phase out editions over time, or support them forever, but supporting only the first edition forever doesn't really buy us anything from a maintenance perspective. As editions are cumulative, supporting the first edition is basically the same as supporting all editions, as we have to keep around code for all the old behavior in both cases.

@andreasschroth

andreasschroth commented Feb 22, 2020

@nikic How realistic is it really to support the first edition "forever" (let's say 20 years+)? I assume that will be a huge maintainability burden. My fear / assumption is that it will block the future development of the language a lot. A reasonable support of the last few editions (e.g. the last 5) would still be quite a long support (longer than at the moment) and would already be an improvement compared to now (assuming new editions are released once a year). But still there would be an end of support for older editions in sight.

Also, it's probably better to put a clear process for phasing out old editions in place when introducing editions, rather than saying, e.g., in 8 years "we'll remove support for all editions <= 5". A clear process and strategy will also help businesses plan and maintain their code bases.

@shulard

shulard commented Feb 22, 2020

Talking about an edition which will be maintained forever implies that this edition is the strict_types=1 one, which doesn't seem to be a correct assumption.

As we said, editions seem a very interesting feature for providing a better upgrade path for breaking changes that require code updates to work. It'll be more like a period to adapt the code and test the new behaviour while the old behaviour (which will be removed in a future release) is still available as a first-class citizen. Then, when the release which marks the new behaviour as the default arrives, the corresponding edition must be removed from the maintenance roadmap with the last version it's useful on.

We need to make a clear separation between strict_types and the editions described here. Maybe some day we'll have an edition which enforces strict_types, but that's not the question here.

IMHO, choosing an edition with logic similar to what we have for strict_types seems the way to go.

@IMSoP

IMSoP commented Feb 22, 2020

@deleugpn

> I just think that strict_types=1 can be aliased to edition=1

You're ignoring the fundamental issue which has been pointed out multiple times now, that strict_types was introduced as equal-but-different, not as old-but-supported vs newer-and-better. It simply doesn't belong in the same category as the kind of things editions are for.

@deleugpn

deleugpn commented Feb 22, 2020

@nikic reading your reply I find two things need clarification. One is my poor use of "cumulative changes". Edition 7 may break something from edition 2; you don't have to keep every decision of every edition alive forever, but at least give us time to adjust. Continuing from my previous reply:

Edition=2 adds bad feature X (2020)
Edition=3 can deprecate feature X that was introduced on 2 (2021)
Edition=4 can completely remove feature X. (2022)

If editions last for 2 years, that would have given people feature X only through 2021 and 2022. You're not supporting all editions forever, but you're using editions in parallel with versioning for fast-paced, per-file changes.

Now the bigger elephant in the room, if your intentions are to deprecate and remove edition=0 (which is current strict types = 0), that wasn't made clear in the document and I feel like it brings back discussion that has been tirelessly debated in internals, which is forcing onto one side of the community (dynamic type) the wishes of the other side (strict types).

Personally, I feel like it's better to have strict types be an edition than to create a matrix of editions with strict types 1 and editions with strict types 0. The matrix wouldn't allow you to deprecate and remove strict types 0 either, so that's why I feel like it's easier to think linearly.

The end result is: versioning allows PHP to move forward with general-purpose features and security releases.
Editions allow people to have a less forgiving PHP branch with more breaking changes and more features that are only made possible by breaking the current PHP.

@IMSoP

> You're ignoring the fundamental issue which has been pointed out multiple times now, that strict_types was introduced as equal-but-different, not as old-but-supported vs newer-and-better. It simply doesn't belong in the same category as the kind of things editions are for.

If that is the case, I imagine it will be harder for the core team to maintain a codebase where every decision has to cater for more than 1 supported edition and the permutation of strict types at all times.

@mikeschinkel

mikeschinkel commented Feb 23, 2020

> In this context, one possibility we discussed is to have both the edition and the strict_types declares. On newer editions strict_types will grow to include strict typing in additional areas (e.g. for operators and not just arguments). People commonly suggest adding some extra feature to strict_types, but we always reject this due to backwards-compatibility. With editions it would be possible.

@nikic, I came here first to give huge kudos for the work you have done on this, and to say that after painful analysis I agree with your preferred solution, but most importantly to propose exactly what you wrote above: that editions should be a grouping of individual declare items, and that a developer should be able to select an edition and then opt out of one or more specifics of that edition.

A key argument for this approach is that it will likely reduce contention among internals developers over what gets included in or excluded from an edition. Clearly there are at least two minds on every proposed feature, so I expect there would be a battle royal over every feature going into an edition, with those who want it on one side and those who hate it on the other. If individual features could be opted out of, then the stakes for inclusion/exclusion would be much lower, and hence the debates are likely to be less ferocious and pernicious.

Another potential benefit would be that the edition's features would be clearly obvious to everyone as it will be a collection of declarations vs. a potentially amorphous blob of changes.

Finally, I would like to propose that there be two tracks for each edition. Clearly there has been non-stop fighting between the stricters and the non-stricters, so I propose we consider two editions on an ongoing basis (UPDATE: I see others proposed this as flavors, which works too):

```php
declare( "2020");
// or
declare( "2020-strict");
```

Without two tracks, we will continue to have knock-down, drag-out fights between the stricters and the non-stricters and I would argue that the energy drained by those fights could be far better invested in moving PHP forward with two tracks instead of just one since — with the above — an edition is just a name applied to a collection of declares.

Having two edition names where each specifies a collection of declares and their related settings is really no additional complexity in the source code since all the settings would have to be implemented anyway.
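
To make "an edition is just a name applied to a collection of declares" concrete, here is a small sketch under stated assumptions. The flag names `strict_operators` and `undefined_variable_error`, the contents of each edition, and the `resolveDeclares()` function are all invented for illustration; only the edition names `2020` and `2020-strict` and the strict_types flag come from the discussion.

```php
<?php
// Hypothetical edition table: each edition name expands to a set of individual declares.
const EDITIONS = [
    '2020'        => ['strict_types' => 0, 'strict_operators' => 0, 'undefined_variable_error' => 1],
    '2020-strict' => ['strict_types' => 1, 'strict_operators' => 1, 'undefined_variable_error' => 1],
];

/**
 * Expand an edition into its declares, then apply per-file opt-outs on top.
 */
function resolveDeclares(string $edition, array $overrides = []): array
{
    if (!array_key_exists($edition, EDITIONS)) {
        throw new InvalidArgumentException("Unknown edition $edition");
    }
    // Only flags that belong to the edition may be overridden.
    $unknown = array_diff_key($overrides, EDITIONS[$edition]);
    if ($unknown !== []) {
        throw new InvalidArgumentException('Unknown declare: ' . implode(', ', array_keys($unknown)));
    }
    // Overrides replace the edition defaults key by key.
    return array_replace(EDITIONS[$edition], $overrides);
}

// "Select an edition and then opt out of one specific":
$declares = resolveDeclares('2020-strict', ['strict_types' => 0]);
```

In this model the second track costs one extra table row, although whether the engine-side cost is equally small is exactly what the following comments dispute.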

@mikeschinkel

A quick follow up to my last reply as I read some of the comments where many are assuming that a cumulative of editions would create a huge maintenance burden.

I will defer to @nikic to say I am incorrect here, if I am, but it would seem that if we implement editions as simply a collection of binary features such as the strict_types feature, then the maintenance required would only be as large as each of those binary choices, and not some cumulative exponential cost that I think others are assuming.

Sure some binary choices would affect other binary choices and make 4 states instead of 2, but I doubt most choices would affect all the others. If in the future we come across a feature that does result in an exponential explosion of maintenance required we deal with that one explicitly rather than pre-judge all binary choices and preemptively limit support for legacy code.

@andreasschroth

@mikeschinkel Total disagreement from my side. Of course it is a huge maintainability burden. It will affect the development of the language in a negative way if we keep supporting old editions "forever" (what does "forever" even mean? I guess we all agree that someday there needs to be a cut and some old editions need to be phased out). We need a clear process in place beforehand. Otherwise discussions and arguments about the same topics will come up all the time.

@mikeschinkel

mikeschinkel commented Feb 23, 2020

@andreasschroth

"Of course it is a huge maintainability burden. It will affect the development of the language in a negative way if we keep supporting old editions "forever" "

You are making assumptions you cannot objectively validate.

I am also making some assumptions, but my assumptions are prefaced by an objective assertion: that we consider the issues when they become a real problem instead of a problem that we are just worried about.

"We need a clear process in place beforehand."

Yes. We evaluate the problems when they are known to be real problems, not before.

"Otherwise discussions and arguments will come all the time about the same topics."

Are you saying discussions and arguments are not happening right now? I present your response to my post as evidence.

@drealecs

I believe that some people here mix up major/minor "versions" with the proposed "editions".
To me, it looks like editions will work like the strict_types directive and, so far, it seems they can be implemented as a directive called edition.

I'm not sure if it helps, but maybe I can share what I have understood so far.

In the past, regarding the strict_types directive:

  • when adding a feature that has no relation to type strictness and no backward incompatible changes, there is no mention of the directive, and the feature is enabled by default in both parsing modes.
  • when adding a feature that is related to strict types but still there are no backward incompatible changes, there is mention of the directive and some extra checks (and optimizations) are enabled when directive is on.
  • when there were backward incompatible changes, unrelated with strict typing, there was no way to accommodate them and the topic ended up in the never ending conflict between the two worlds: of classic fast and messy ninja PHP developers and of new age safe and clean janitor PHP developers.

So let's say we start with editions and the directive can have a value: aligator, beaver, cheetah or whatever. Declaring no edition could default to the first edition (aligator in this case).

When deciding on a backward incompatible change, we should mention how it affects each combination:

  1. edition='aligator', strict_types=0
  2. edition='aligator', strict_types=1
  3. edition='beaver', strict_types=0
  4. edition='beaver', strict_types=1

What we don't know now is how many persons will care about case number 3.
And I'm afraid that we might not know that until we will have some rounds of RFCs.
What we can do, as was suggested here, is have edition beaver default to strict_types=1.
Later, if we find that combination number 3 is not used much, maybe for edition cheetah there would be no way to disable strict types.

Mainly, as strict types added a directive to control parsing of the files related to type strictness that enable backward incompatible changes, we need a new directive to control other backward incompatible changes that are not related to type strictness. We can call that edition. If most of these changes are in the direction of having a more strict language, it will probably not mix well with strict_types=0.
New editions can be added in whatever version but probably we will want to have new edition in a major or minor version so that it will be easy to tell what editions are supported.

@mikeschinkel

@drealecs,

> Later, if we find that combination number 3 is not used much, maybe for edition cheetah there would be no way to disable strict types.

How would we determine that? Most userland PHP developers don't spend time on the list.

And per @IMSoP strict_types was originally intended to be a choice, not a plan for deprecating. Having no way to disable strict types does not sound very much like a choice to me.

@markrandall

If we're going to evolve the language, I think strict_types being enabled by default has to be a part of that. For the benefits of ease, PHP juggling things behind the scenes is a curse that, in the long term, we will all benefit by exorcising.

@mikeschinkel

@marandall:

That is the beauty of strict_types. It gives you the choice to exercise your curses but also allows newer programmers to choose a flavor of PHP that is easier to achieve initial success experiences with.

With the option of setting strict_types, everyone gets to win.

@markrandall

> That is the beauty of strict_types. It gives you the choice to exercise your curses but also allows newer programmers to choose a flavor of PHP that is easier to achieve initial success experiences with.

I have to disagree with the premise.

strict_types=0 is easier, in the same way that it's easier to not bother looking behind you before reversing your car. It's a false economy.

When I am teaching someone, the fewer obscure rules to learn, the better. PHP's internal type juggling introduces a whole shipping container full of obscure rules that are hard to explain, hard to understand, and even harder to debug, especially for someone who is just starting off.

strict_types=1 goes a long way in making sure that those obscure rules are curtailed, and replaced with clearly identified error messages making it significantly more understandable (and identifiable) to new and experienced programmers alike.

For that reason, I am 100% in favour of strict types becoming the way PHP works. I understand the costs, but from a language perspective, it's clearly the superior way to go.

@IMSoP

IMSoP commented Feb 25, 2020

@marandall

I think strict_types is poorly understood, and doesn't achieve nearly as much improvement to programs as people think. Which of the following code blocks is more forgiving?

```php
declare(strict_types=0);
$foo = 'hello world';
sleep($foo);
```

or:

```php
declare(strict_types=1);
$foo = 'hello world';
sleep((int)$foo);
```

It turns out, the first, where the programmer "lazily" allowed PHP to perform the cast as part of the function call, gives a warning; whereas the second, where the programmer "clearly" asked for an explicit cast, does not.

I am all for finding ways to tighten up the edge cases of implicit (and explicit) casts, and raising more warnings to errors; I would also like to see easier-to-use functions for use cases like checking whether a string can safely be cast to int. But forcing code into strict_types=1 mode right now would lead to examples like the above, which are clearly worse, not better, at telling the user they've made a mistake.
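
The same contrast can be reproduced without actually sleeping. The sketch below swaps the internal `sleep()` for a userland stand-in, `pause()`, which is a made-up function; with a userland `int` parameter the implicit coercion of a non-numeric string fails with a `TypeError` rather than a warning, and the exact diagnostic for internal functions depends on the PHP version. `toIntOrNull()` is also a hypothetical helper, built on the existing `filter_var()` with `FILTER_VALIDATE_INT`, to show what a "can this string safely be cast to int" check might look like.

```php
<?php
// Runs in the default mode (strict_types=0); there is deliberately no declare here.

// Hypothetical stand-in for sleep(): takes an int and just returns it.
function pause(int $seconds): int
{
    return $seconds;
}

// 1. Implicit coercion: the non-numeric string is rejected when the call is made.
try {
    pause('hello world');
    $implicit = 'accepted';
} catch (TypeError $e) {
    $implicit = 'rejected';
}

// 2. Explicit cast: (int) 'hello world' silently becomes 0, so the call goes through.
$explicit = pause((int) 'hello world');

// 3. A validating conversion (hypothetical helper): null when the string is not an integer.
function toIntOrNull(string $value): ?int
{
    $result = filter_var($value, FILTER_VALIDATE_INT);
    return $result === false ? null : $result;
}

$valid   = toIntOrNull('42');
$invalid = toIntOrNull('hello world');
```

The explicit cast is the only path of the three that hides the mistake, which is the concern raised above about pushing code toward casts.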

@drealecs

And the fight continues :) You have to agree sometimes that it looks funny.
What we can do is to understand both sides and find a solution that fits all.

> How would we determine that? Most userland PHP developers don't spend time on the list.

I agree that's a hard thing to figure out.
But if the language wants to evolve, it will probably do this in steps. And if 1% of the userland developers happen to really like a specific step and decide to remain on it... tough luck.
That's because each combination of directives must be maintained and the core developers don't have infinite time. They will probably choose to stop that maintenance effort and develop new features instead.

> And per @IMSoP strict_types was originally intended to be a choice, not a plan for deprecating. Having no way to disable strict types does not sound very much like a choice to me.

So far, I agree with this too.
I never said that combination 1. edition='aligator', strict_types=0 will be dropped. That will even still be the default if one doesn't mention any directive.

I just argue that the people who want/use strict_types=0 are probably almost the same people who don't want an "undefined variable" error, an RFC change that might go in a later edition.

Just to be clear, in my head, a feature that does not involve any backward incompatible changes will probably go in all editions/combinations.

@IMSoP

IMSoP commented Feb 25, 2020

> I just argue that the people who want/use strict_types=0 are probably almost the same people who don't want an "undefined variable" error, an RFC change that might go in a later edition.

I think this is the big misconception: that people who prefer strict_types=0 would not like a stricter language in other ways. The name itself probably contributes to this; if it was called scalar_types=automatic, the general association with "stricter code" would be less tempting.

There is one thing in common, though, which is that stricter checks should ideally come hand in hand with better ways to handle the relevant use cases. For scalar types, that means better primitives for the kind of "safe casting" that "automatic" mode (strict_types=0) currently does. For undefined variables, that would mean things like marking of "out" parameters and pre-defining multi-dimensional arrays.

The new features should be available in all editions, unless they unavoidably reuse old syntax, and opting into an edition could mean "I've made sure all my code is using the new feature, error if I forget it somewhere". Right now, automatic scalar type casting is a feature with real use cases, which can only be used by switching to strict_types=0 mode; there's no drop-in replacement that a new edition could make mandatory.

@itskevinsam

itskevinsam commented Mar 2, 2020

automatic scalar type casting is a feature with real use cases, which can only be used by switching to
@IMSoP

register_globals was also a feature with real use cases when it was implemented back in the day.


Unlike the two package-based approaches discussed in the following, namespace-scoped declares are based on the existing, well-established and well-understood "namespace" feature. This is both an advantage (it does not introduce any new concepts or need for additional code) and a disadvantage:

While namespaces commonly map directly to a package, they don't always do so. For example, the main amphp package uses the `Amp\` namespace, while other amphp packages use `Amp\FooBar\`. Here, the `Amp\` namespace cannot really be treated as a single package. There are additional issues (e.g. due to the ability to have multiple namespaces in a single file), but these can be resolved, see the linked RFC for more detailed discussion.

@dinamic dinamic May 12, 2020

Allowing to override on a granular level would solve this. Like, setting it for Amp\ could set the default for it, but one should be able to override it for Amp\FooBar\.
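
One way to read this suggestion is as a prefix lookup where more specific namespaces override more general ones. The sketch below assumes that reading. `declaresForNamespace()` and the `edition` key are hypothetical, and the linked RFC may resolve scoping differently.

```php
<?php
/**
 * Hypothetical resolver: collect the declares of every configured namespace prefix
 * that contains $namespace, letting longer (more specific) prefixes win.
 *
 * @param array<string, array<string, mixed>> $scoped namespace prefix => declares
 */
function declaresForNamespace(string $namespace, array $scoped): array
{
    // Normalize to a trailing backslash so that "Amp" does not match "Ampersand".
    $namespace = trim($namespace, '\\') . '\\';

    $matches = [];
    foreach ($scoped as $prefix => $declares) {
        $normalized = trim($prefix, '\\') . '\\';
        if (strncmp($namespace, $normalized, strlen($normalized)) === 0) {
            $matches[$normalized] = $declares;
        }
    }

    // Apply general prefixes first so that specific ones override them.
    uksort($matches, function (string $a, string $b): int {
        return strlen($a) <=> strlen($b);
    });

    $result = [];
    foreach ($matches as $declares) {
        $result = array_replace($result, $declares);
    }
    return $result;
}

$scoped = [
    'Amp'         => ['strict_types' => 1, 'edition' => 2020], // default for everything under Amp\
    'Amp\\FooBar' => ['strict_types' => 0],                    // override for one sub-package
];

$forFooBar = declaresForNamespace('Amp\\FooBar\\Client', $scoped);
$forCore   = declaresForNamespace('Amp\\Loop', $scoped);
```

Note that this brings back the hierarchy of invalidation points that the package-based approaches avoid, so it trades simplicity for granularity.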

@shushenghong

If PHP does not embrace microservices and high-performance features like coroutines, it will die within a few years. Please support Swoole officially, like php-apache and php-fpm!
