This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

Revisit load vs. store #90

Open
gvanrossum opened this issue Jun 24, 2020 · 84 comments
Labels
accepted (Discussion leading to a final decision to include in the PEP), fully pepped (Issues that have been fully documented in the PEP)

Comments

@gvanrossum
Owner

A bunch of folks on python-dev brought up that it's confusing to have case foo: mean "assign to foo" and have case .foo: to alter the meaning to "compare to the value of foo".

I think we're going to need another round of this discussion.

@gvanrossum gvanrossum added the open Still under discussion label Jun 24, 2020
@brandtbucher
Collaborator

Yeah, this will likely be the biggest sticking-point. I'm not personally even 80% satisfied with any proposed option, which is... frustrating.

@Tobias-Kohn
Collaborator

This is probably one of the most obvious cases where the two different heritages of pattern matching clash. Hence, whatever solution we come up with, half of the people won't like it because it contradicts their approach to this.

One of the major problems I see here is that we often discuss this without context in an abstract space of possibilities. A single case foo: by itself does indeed look like it should have load semantics. It seems to be even stronger with something like case number_of_dots:, say. But what about case x: or case n:? As humans, we look at everything within a given context and picture such a context ourselves, when it is not given. The parser, however, has no such context (at least in Python), but needs to figure things out locally. Once again: whatever we do, there are always cases where we feel the compiler should be smart enough to understand our intent...

We should for the sake of this discussion probably rather stick to examples like the case Point(x, y):, as it is much more realistic to use pattern matching for something like this, rather than the trivial case case point:. Just as we would not want to discuss the details of tuple unpacking on an example like a, = foo().

I think we quite agree that all dotted names must have load semantics. The rationale being that we only want to assign to local variables. Ideally, pattern matching has no side effects beyond these local variables (we cannot strictly enforce this in Python, of course) and assigning to dotted names would basically mean that we write to attributes and global variables.

Flipping the semantics of .foo and foo so that .foo has store semantics seems like a bad idea given the previous thoughts: it would introduce an exception or counter-rule and thus make our lives harder. So, if we had to make both .foo and foo have load semantics, this would leave us with two options as far as I can see, neither of which looks very appealing to me:

  • Distinguish the load/store semantics based on something like upper/lower case, or require that all "store" names must end in an underscore (I think, e.g., Basic used to have type markers in variable names).

  • Introduce another beloved symbol like $ (or whatever) for store semantics.

Given that we are moving inside a network of interdependent rules, customs, and use cases, there are realistically much fewer feasible options than what you would naively expect.

@gvanrossum
Owner Author

@JelleZijlstra brought up a good argument for marking extraction variables:

So instead of:

y = 3
case Point(x, .y): ...  # x is assigned to, y is looked up

we'd have

y = 3
case Point($x, y): ...  # x is assigned to, y is looked up

The trouble with the current syntax is that if you forget the ".", you always get a hard-to-detect bug: your pattern unexpectedly matches and "y" suddenly has a different value. Even if you find the bug, it's hard to find out where exactly the mistake happened.

But with the proposed "$" syntax, if you forget the "$", you probably will just immediately get a NameError that will tell you exactly where your bug is. (Except of course if you happen to already have the name "x" in scope, but that's hopefully not very common, and it's already what happens if you typo a local variable name.)

(Though I think this would be an even stronger argument with "?" instead of "$". :-)

@ambientnuance

Hello Guido, Brandt and Tobias :)
This PEP is the first dev topic I've commented on, so apologies if I'm stepping on any toes in my approach.

I would like to bring to attention an alternative for extraction variables that I posted in the Python-Dev mailing list, which is in the same vein as a proposal from @dmoisset (feel free to correct/elaborate on anything Daniel). Posting it here is motivated by a sense that it may have flown under the radar while the discussion comes to a head between established options (primarily '. / ? / $' as a prefix) - see quotes below from #92:

@gvanrossum

Still, the only two alternatives available seem to be the current proposal or something using ?. :-(

@brandtbucher

If Python-Dev responds positively to a simple yes/no on ?, I'd say we pull the trigger.

In short, the idea is to use the scope of a match block to define upfront variables which can be used for extraction purposes. Here are some examples that span across @dmoisset's post and my own, where 'x' and 'y' are extraction variables and the body of case statements is omitted:

match point bind c:       # 'c' is a 'capture' object used to express any extraction variable as an attribute
    case (c.x, c.y): ...

match point bind x, y:    # 'x' and 'y' are individual extraction variables
    case: ...

match point:
    bind x, y
    case: ...

The nomenclature/syntax is certainly open to debate. Some alternatives to kick off with:

    match _ [bind] _:  ->  [into, in, as]

    match _:
        [bind] _       ->  [proxy/placeholder, sub/substitute]

    match(c) point:
    match(x, y) point:

My post: https://mail.python.org/archives/list/python-dev@python.org/message/FRC5ZNXQPWWA4D2SJM4TYWMN5VALD3O6/

@dmoisset's post (proposal 1.B): https://mail.python.org/archives/list/python-dev@python.org/message/43YOZUKP3GJ66Z2V2NKSJARL4CGKISEH/
My reply: https://mail.python.org/archives/list/python-dev@python.org/message/C6EP2L66LBJKT5RHDE6OIKG7KWM2NMLV/

Some (somewhat biased) pros and cons compared to special prefix characters...

PROS:

  • Variables are not typed out any differently, similar to lambda and comprehension expressions which do a similar type of binding (avoids '$number' being unrelated to 'number'). Also consistent with all other variable usage and naming (unique names, case not enforced).
  • Being explicit about the use of extraction variables upfront can 'prime' a reader for their use in the body of case statements.
    • Particularly useful if people 'default' to using match/case as a switch/case construct.
    • Very helpful for a first-pass understanding of a code block, with or without experience with the match/case construct.
  • All special-case syntax of the match/case construct remains prose-like, which is consistent with other control flow blocks.

CONS:

  • Variables are not marked for extraction at their point of use.
  • Additional soft keyword (although this seems similar to the 'case' keyword).
  • Verbosity (subjective, since I personally consider the repeated use of a special character verbose)

P.S. English naturally lends itself to duplicitous terminology, as seen by the many different terms already used to refer to what the PEP calls a 'Name Pattern': [extraction/capture] variable, assignment [target/cue], placeholder. I'd like to get thoughts on whether some clarity is helpful in these scenarios, or if it's better to let language free range? (I am guilty of this with the 'placeholder' term.)

@viridia
Collaborator

viridia commented Jun 25, 2020

I want to bring up one subtle point that I think has been overlooked: If we do define a prefix operator for named pattern variables, this does not affect the walrus operator. In other words, you can still use := to assign a value to a variable, without having to prefix that variable with ? or whatever symbol we choose.

This means that ?name is, essentially, a shortcut for name := ?. I am actually fine with this - having a succinct syntax for the (vastly) more common case beats OOWTDI in this case.

@viridia
Collaborator

viridia commented Jun 25, 2020

I created an experimental branch https://github.com/gvanrossum/patma/blob/expr-qmark/examples/expr.py showing what my expr.py example would look like with named variables explicitly declared with a prefix character. My reaction is:

PRO: It is very obvious which names are pattern bindings (stores) and which are loads.
PRO: It's very easy to explain the rules to a beginner.
PRO: This solves the problem of accidentally forgetting the . and overwriting an important symbol - in other words, it fails fast rather than being a potential time bomb.
CON: The extra character adds a substantial amount of visual clutter. Named pattern variables are very common.

@stereobutter

stereobutter commented Jun 25, 2020

Another variant to differentiate variable binding (store) from comparing to the values of a (global) variable that I have not seen discussed elsewhere is:

example 1

x = 'hello world'
# ...
match example:
    case global x: print('how boring...')  # load a global variable
    case x: print(f'How ingenious: {x}')  # store x

example 2

def check_example(example):
    default = 'hello world'
    match example:
        case nonlocal default: print('how boring...')  # load a variable from the outer scope
        case x: print(f'How ingenious: {x}')  # store x

I personally find this much clearer in intent than the .x or ?x variants I've seen floating around for differentiating store from load; but that is just me. Objective advantages are:

  • no new magic syntax/symbol
  • less visual clutter compared to ?x, x?, .x
  • less likely to be overlooked/omitted by accident (compared to the currently proposed .x from the pep)
  • no collision with PEP 505's use of ? for None-aware operators
  • people from all kinds of programming languages already are familiar with modifier keywords and with async def there is somewhat of a precedent in python for using another keyword to modify a statement
  • keywords like global and nonlocal are easier to google for than symbols like ?

@thautwarm

@SaschaSchlemmer This is clean but a bit verbose, especially when the patterns are nested.

Also, in your proposal, I wonder if we're supposed to mark it global at each occurrence of x within a case clause?

  • If not, another problem would arise: you may mark a few names global in different positions, which is a little bit messy.

  • If so, wouldn't this be verbose? case (global C1)(global C2(...), ...).

@stereobutter

stereobutter commented Jun 25, 2020

@thautwarm

  • I intended that every occurrence that should load something must be modified with either global or nonlocal.
match example:
    case Point(global x, global x): ...
    ....
  • I've read somewhere (I think that was in one of @gvanrossum posts on the mailing list) that one main use case for loading was comparing variables to globals similar to a switch statement in c. For these use cases I don't expect a lot of nesting; is that a valid line of thinking?
match log_level:
    case global DEBUG: ....
    case global WARN: ...
    case global INFO: ...

I must admit I am fully in the FP camp where I will use match mostly for unpacking data structures and stuff like that and don't see the appeal of using match as a glorified C-style switch statement. This is probably the greatest issue with the PEP as a whole: What is the main use case (the one that gets the nice, clutter-free syntax): loading (aka C-style switch statement) or binding to names (FP-style pattern matching)?

@gvanrossum
Owner Author

What is the main use case (the one that gets the nice, clutter-free syntax): loading (aka C-style switch statement) or binding to names (FP-style pattern matching)?

You hit the nail on the head here. The main use case is definitely FP style structure unpacking. An early proposal didn’t even have constant value patterns. But named constants and enums are very much part of Python’s culture and we felt we had to support them. And now this has caused a dilemma.

@thautwarm

@gvanrossum
I wonder if using a prefix ! operator to indicate Store has been rejected? I don't see a discussion about this here or on the python-dev mailing list. Instead of ? or *.

Semantically I think ! is better, though it's still not very good to follow Python's traditional style?

@gvanrossum
Owner Author

! and ? are about equally ugly.

@thautwarm

They're both ugly but ! has a closer meaning for storing variables: #92 (comment)

@gvanrossum
Owner Author

Given typical usage I still want to use unadorned names for capture variables. Maybe we can introduce some notation that allows any expression to be used as a value? What about ‘+x‘ or ‘+(x+y)‘ ?

@Tobias-Kohn
Collaborator

Yes, ! and ? are very similar, apart from one important detail. ! is often used with the meaning of not. Even in Python you find it used like this as in, e.g., !=. Hence, ! would probably trip up a lot of people.

More generally, however, I would certainly welcome an operator to evaluate expressions, rather than a store marker. As I indicated before, there are other languages that actually use + in a very similar vein, although $ is used as a general evaluation operator (mind, I am not saying $ is a nice choice, but rather advocating the idea behind it).

Using nonlocal and global as load markers seems a very bad idea, as this would actually reverse their current meaning in the new context! We place a global pseudo-statement in our function to say that the global variable is modified. And then, of course, there is the problem with being way too verbose.

@ambientnuance We actually considered and briefly discussed the idea of explicitly listing the names of either the loads or stores, as evidenced here. Actually, this variant reminds me a lot of Pascal, where you had to declare all local variables upfront and it therefore feels very backward to me. But my main concern is that it does not scale well. When patterns get larger and more complex, it gets difficult to keep track of which names have load and which have store semantics, respectively. While the compiler could certainly handle it, it becomes hard for us humans to read code when the actual meaning of a symbol can no longer be determined by local cues. I would therefore very much prefer a solution that determines a name's semantics locally.

@ambientnuance

Rightio, thanks for your follow up @Tobias-Kohn. A good self-reminder to search through all GitHub issues, not just recent ones.

Could this be added to the ‘Rejected Ideas’ in the PEP (presumably under ‘Alternatives for constant value pattern’)? I can understand if this may be deferred, given the active discussion of some options currently in that section.

Regarding poor scaling in large match blocks (or indeed as you note, complex patterns), this is definitely the weak spot for a ‘declarative’ approach. There is certainly a strong appeal in being locally unambiguous. Nonetheless, this hadn’t weighed heavily in my mind, as I had mentally assigned the task of distinguishing store and local variables to a syntax highlighter - with unique names enforced. My personal choice would be to highlight store variables in some manner, since they are distinct from all other variables. But, I was reminded today that many people prefer to have a muted theme, and also learnt that some choose to forgo syntax modification altogether.

@ambientnuance

In any case, @aeros made a worthwhile comment in the dev mailing list that I think is relevant here. They emphasised the importance of easy readability for a special character modifier, particularly for those with any visual impairments (size being the dominant factor). They also made a good counter-argument to my syntax highlighter crutch, albeit directed at the use of smaller special characters such as '.' :

However, I don't think it should be at all necessary for people to rely on syntax highlighting to be able to clearly see something that's part of a core Python language feature.

Their full comment:
https://mail.python.org/archives/list/python-dev@python.org/thread/RFW56R7LTSC3QSNIZPNZ26FZ3ZEUCZ3C/

Thanks for your time amidst what seems to be a hot topic.

@gvanrossum gvanrossum mentioned this issue Jun 26, 2020
@viridia
Collaborator

viridia commented Jun 26, 2020

The way I would characterize the current dilemma is that explicitly marking stores has a number of compelling advantages - such as avoiding the "foot gun" problem; the major sticking point is aesthetics.

@stereobutter

stereobutter commented Jun 26, 2020

@gvanrossum

The main use case is definitely FP style structure unpacking. An early proposal didn’t even have constant value patterns. But named constants and enums are very much part of Python’s culture and we felt we had to support them. And now this has caused a dilemma.
[emphasis added]

Maybe taking a step back and seeing whether there is an actual need to support constant value patterns (like .x as proposed in the PEP) is an avenue away from bikeshedding all the (less than ideal) syntactical variants of the store vs load dilemma to death:

if log_level == DEBUG:
    # do this 
elif log_level == WARN:
    # do that
else:
    # do whatever
  • if one wants to use match with constant values there is Enum (which one might consider a better choice than global variables DEBUG, WARN etc. anyway) or using something like
match log_level:
    case logging.DEBUG: ...
    case logging.WARN: ...
    case _: ...
  • (Not only but especially for beginners) the subtlety of store vs load in match is avoided completely (regardless of what the syntax for store vs. load were). Again the message is clear:
    • for control flow use if ... elif ... else

    • for FP-style destructuring of objects use match (advanced topic, not for beginners).

      These are your father's match statements, elegant control flow statements for a more ... civilized age.
      — python grandmaster in a galaxy far far away

  • Without load semantics on variables there is no need for special syntax like .x, x? or `x`
  • No bugs due to misunderstanding load vs. store or mistyping (I am looking at you .x) for variables
  • might simplify the implementation on the parser side?

@ambientnuance

ambientnuance commented Jun 26, 2020

@SaschaSchlemmer
The introduction of a core construct with one programming style front of mind feels like going against the grain of Python's flexibility. I can see myself using this tool with generic variables (not just constants) whilst doing structure checks. An outlier use-case most likely, but blending paradigms helps me free up how I iterate on code.

If the construct is intended for FP-like structure checks as a first-class use-case, then one way of making that clear might be going back to the ‘as’ or ‘case’ discussion. The former makes it much harder to confuse with switch/case usage while still coherent: “match an object as having this pattern”.

Happily marrying the two worlds seems like something worthwhile for the significant expansion in functionality. I’ve mentioned this in another thread already, but I think @brandtbucher’s update to expr.py in #105 provides a low-friction avenue to do so.

EDIT: To be clear, you do make a strong argument. The distinction between pattern matching and control flow is advantageous.

@stereobutter

stereobutter commented Jun 26, 2020

Given typical usage I still want to use unadorned names for capture variables. Maybe we can introduce some notation that allows any expression to be used as a value? What about ‘+x‘ or ‘+(x+y)‘ ?

match log_level:
    case `DEBUG`: ...  # load
    case `WARN`: ...  # load
    case level: print(f'{level} is not a valid log level')  # store

doesn't read too bad and leaves the unadorned form for capturing variables. I think though that arbitrary expressions (like `(x+y)`) should be explicitly forbidden.

one caveat: `WARN` does look deceptively similar to 'WARN' without any syntax highlighting (but is fine with some highlighting)

[Image: screenshot, 2020-06-26 10:58:56]

@JelleZijlstra

@SaschaSchlemmer that has the same problem I identified above: it's very easy to just write case DEBUG: and have a hard-to-detect bug.

@stereobutter

stereobutter commented Jun 26, 2020

@SaschaSchlemmer that has the same problem I identified above: it's very easy to just write case DEBUG: and have a hard-to-detect bug.

That case can be made for any solution, regardless of the actual syntax. Mixing load and store semantics is maybe just not a good idea for these cases where the meaning is ambiguous. This (perceived) gain in functionality might just not be worth the potential for bugs and confusing users.
I propose that load semantics should be constrained to dotted names (like foo.bar) in the first release. This decision is also somewhat of a two-way door in that a later release could allow simple names with a new adornment syntax (like `x` or x!) if deemed a useful feature by the community.

@natelust
Contributor

I do think that ? is clearer in the case where you have a store on an attribute lookup in a match case, i.e.

@dataclass
class Record:
    a: int
    b: int
    c: int
r = Record(1, 2, 3)
match r:
    case Record(b=var?):
        print(var)

Without it someone might not expect a store; it would look like a keyword arg using a variable, when really it is just matching against the next sub-pattern.

@natelust
Contributor

@SaschaSchlemmer I'm not sure which part you are giving the thumbs down to; do you prefer the spelling in the current proposal and implementation? I personally find it more surprising on first read, since we don't normally expect an assignment to happen on the right side of an equals sign.

from dataclasses import dataclass

@dataclass
class Record:
    a: int
    b: int
    c: int

r = Record(1, 2, 3)
match r:
    case Record(b=var):  # binds var to r.b; nothing is looked up
        print(var)

Or are you saying this sort of thing should be taken out of the current proposal?

@gvanrossum
Owner Author

I find it unacceptable that load vs. store applies to the whole clause. Also when I first read Sascha’s first example I didn’t understand it because I didn’t notice the ‘as’.

Maybe we could debate the UPPERCASE rule, and decide first two preferences. In Scala the UPPER rule seems to work well. Does Rust have it?

@stereobutter

@Tobias-Kohn to be honest I have not seen a convincing example where I'd prefer load semantics (except for literal values) over using store semantics and an appropriate guard.

@brandtbucher
Collaborator

brandtbucher commented Jul 2, 2020

I still dislike the uppercase rule, because the language is now enforcing a convention (and one that may not always be "correct" in this context). It also only enforces it in this very narrow use case. Both of these points make it feel more "bolted-on" than the other options.

There are also cases that aren't obvious to me:

  • What about names that start with underscores? I often have private module-level names like _PATH or _START_DATE. Do we just skip over all underscores first? I believe Scala treats them as lowercase.
  • More generally, what about names using characters with no concept of upper- or lower-case?

It's also worth considering how easy it is to correct unintentional stores when they're found. I've recently added a syntax warning for some trivial cases (prompted by a recent mailing list discussion, and not pushed yet):

>>> match 42:
...     case foo: pass
...     case bar: pass
...     case _: pass
... 
<stdin>:2: SyntaxWarning: unguarded name capture pattern makes remaining cases unreachable; did you forget a leading dot?

This simple action of adding a . in one place becomes more complicated for some of the alternatives:

  • pragmatic: "... consider renaming to UPPERCASE?"
  • purist: "... consider refactoring to use a qualified (dotted) name?"

I'll need to think about this more. Right now I pretty strongly prefer "PEP" and "purist", and pretty strongly dislike "pragmatic" and "compromise" ("pragmatic" and "purist" are pretty loaded names when discussing Python language design, by the way... 😉).

Strong strong strong dislike of `load`, though, because they look like strings and are a total pain to discuss/document in markup environments (this sentence alone has 11 ` characters in it).
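
For reference, a small sketch of how the unreachable-case check from the REPL session above can be probed programmatically. It assumes Python 3.10 or later, where (to the best of my knowledge) the released compiler rejects such code with a SyntaxError instead of emitting a warning; the exact message wording is not relied on here:

```python
source = """
match 42:
    case foo: pass
    case bar: pass
    case _: pass
"""

try:
    # compile only; the match statement is never executed
    compile(source, "<example>", "exec")
    outcome = "compiled"
except SyntaxError as exc:
    outcome = "rejected"
    print(exc.msg)  # wording depends on the interpreter version

print(outcome)
```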

@brandtbucher
Collaborator

Either way, I think it's important to constantly emphasize (especially when discussing name patterns) that we are creating this feature specifically for destructuring, not switching. That should help reduce pushback from people who want to adorn stores rather than loads (or feel that rules like "purist" aren't powerful enough).

@natelust
Contributor

natelust commented Jul 2, 2020

@gvanrossum I don't believe it does. To my knowledge, in Rust you must either use a match guard or their binding operator in cases like these (I am not a Rust expert though). The binding operator does a load, compares, then stores; an example can be found here (it shows matching a pattern, but it may be a variable as well in a limited sense). In Python (using their same at symbol) that might be spelled

number_of_doors = 4
match random_new_car():
    case Car(color=color, doors=doors@number_of_doors):
        print(f"This is a {color} car guaranteed to have {doors} doors")

where the variable is stored in doors, and I guess could be left as _ in the case you only wanted a constraint.

Edit:
Fixed a typo. And I want to highlight that I think syntax like this has been discussed and was not favored; I only wanted to compare to what is in Rust.

@Tobias-Kohn
Collaborator

Big +1 from me for @brandtbucher pointing out the difficulties with the uppercase rule! I hadn't thought of that, but I think these two issues (leading underscores and non-latin names) are quite valid. Of course, a firm rule will answer these questions, but it shows nonetheless that it might not be quite as straight-forward as at least I had thought.

I also like the SyntaxWarning! Very nice indeed!

While I am certainly not too eager to go for the backticks rule, I am not entirely sure whether the use of the language in markdown can be a strong concern. After all, it would be intended as a rather 'obscure' feature to be used sparingly.

@brandtbucher
Collaborator

I also like the SyntaxWarning! Very nice indeed!

I knew you would like that.

I am not entirely sure whether the use of the language in markdown can be a strong concern.

Alright, you're in charge of writing the RST docs if we go this route. 😉

@dmoisset
Collaborator

dmoisset commented Jul 2, 2020

Just to double check, is there anyone here that is still against default bind(store) semantics and prefers evaluate(load) ? I know I mentioned some misgivings at some point but I'm generally onboard with binding by default (I'm asking because of brandt's comment about «help reduce pushback from people who want to adorn stores rather than loads »)

@Tobias-Kohn
Collaborator

The uppercase rule builds on the convention and idea that constants are written in uppercase letters. In Scala (where the uppercase rule is applied), there is also the Java convention of writing all classes with an uppercase letter. This means that, e.g., the load semantics of Point in Point(x, y) is already established by the name Point itself.

In Python, we face several difficulties with this rule:

  • Many classes and types are written entirely in lowercase. The load semantics of class names must therefore be established by the following parenthesis rather than the name itself.
  • There is no such thing as a constant in the strict sense, nor is there anything that would prevent local variables and parameters from using uppercase letters. Since I would rather not have too much 'load and compare' semantics in the patterns, I see the doors wide open for misuse (but there might be differing opinions on that).
  • As Brandt has just pointed out: there are many possible names that are neither clearly uppercase, nor clearly not.
  • There is quite some resistance from people who are uncomfortable with the introduction of such a rule that has no precedent in Python so far.

In favour of the uppercase rule, we find that it is quite simple, and solves the load/store problem without additional syntactic clutter. It thus has the potential to be a viable compromise between the two groups. On the other hand, having load and compare semantics for dotted names seems to cover enough cases as far as I am concerned.

@gvanrossum
Owner Author

We seem to have agreement that dot-in-middle (a.b) is in and leading-dot (.b) is out. Also that stores don't need sigils, and that we'd rather not use sigils or other markers for loads.

Which leaves the choice: Do we use some form of the UPPERCASE rule or not? Let's have a vote among the authors.

If it's accepted, I'd solve one minor issue by ruling that _Foo and __Foo are UPPERCASE. I don't know what to do for alphabets without lower/upper distinction but I don't see that as a show-stopper. (@thautwarm, can you help here?)
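
One way to read the underscore ruling is sketched below. This is only an illustration of the rule as described in this comment, not an implementation from the PEP; looks_uppercase is a hypothetical helper name:

```python
def looks_uppercase(name: str) -> bool:
    """Hypothetical check: the first character after any leading underscores is uppercase."""
    stripped = name.lstrip("_")
    return bool(stripped) and stripped[0].isupper()


for candidate in ("_Foo", "__Foo", "foo", "_path", "名字", "_"):
    print(candidate, looks_uppercase(candidate))
```

Names in scripts without case, such as 名字, come out as "not uppercase" under this reading, which is the open question raised in the comment.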

@brandtbucher
Collaborator

I vote no uppercase.

I still like the leading dot, actually... though I recognize that adding it back later is painless.

@Tobias-Kohn
Collaborator

I also vote no uppercase.

However: as I understand the unicode standard seems to have an "UPPERCASE" flag for each character that specifies whether it is uppercase or not. Ignoring leading underscores also seems reasonable enough. But since my reservations are primarily on other aspects than whether we can determine if something is uppercase or not, I am still in favour of not implementing this rule.

@viridia
Collaborator

viridia commented Jul 2, 2020

Note that even with the purist approach, there are ways to match unqualified names, using either guards or custom matchers. I recognize that using it this way is quite ugly and verbose - it might make sense if you had a match statement with a bunch of qualified names, and needed one special case for an unqualified name - but if you had a bunch of unqualified names the burden would be great enough as to force most people not to use a match statement at all.

@gvanrossum
Owner Author

@viridia Could you vote? If you're against UPPER it's decided. If you're for we'll have to ask Ivan.

Regarding uppercase for non-Latin alphabets, Unicode has several letter categories, and "Lo" (Letter, other) is neither lowercase nor uppercase -- and there are 127,004 of those. I'm not sure but it looks like that includes (almost) all CJK "letters".

If we're still interested after the vote I can ask around.
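
The category claim can be checked with the standard unicodedata module. A small sketch; the sample characters are arbitrary choices for illustration:

```python
import unicodedata

samples = ("A", "a", "名", "あ")

# general category per character: Lu = uppercase letter, Ll = lowercase letter, Lo = other letter
categories = {ch: unicodedata.category(ch) for ch in samples}

for ch in samples:
    print(repr(ch), categories[ch], ch.isupper(), ch.islower())
```

For the "Lo" characters both isupper() and islower() are false, so any case-based rule has to pick a default for them.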

@viridia
Collaborator

viridia commented Jul 3, 2020

I am -0.5 on using uppercase.

One other variant I would propose is, rather than "Starts with a capital letter", instead "Contains no lowercase letters". I am not sure if that is better or whether it addresses the unicode issues.

Alternatively, the rule could be "Conforms to the Python code formatting standard for an enumeration constant", in which case I would be +0.

@dmoisset
Collaborator

dmoisset commented Jul 3, 2020

One other variant I would propose is, rather than "Starts with a capital letter", instead "Contains no lowercase letters". I am not sure if that is better or whether it addresses the unicode issues.

That won't work; that would mean that variable names with no latin letters (which may be all variables for code written in some non-English languages) would default to load semantics rather than store.
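
The objection can be seen with a literal reading of the "contains no lowercase letters" variant. This is a hypothetical sketch of that rule, not code from the PEP or its implementation:

```python
def contains_no_lowercase(name: str) -> bool:
    """Hypothetical rule: a name has load semantics if none of its characters is lowercase."""
    return not any(ch.islower() for ch in name)


for candidate in ("DEBUG", "Debug", "debug", "名字"):
    print(candidate, contains_no_lowercase(candidate))
```

Under this reading a caseless name such as 名字 is treated like DEBUG, i.e. as a load, even when the author meant it as an ordinary capture variable.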

@jimbaker

jimbaker commented Jul 3, 2020 via email

@gvanrossum
Owner Author

We could keep the door ajar for some variant of the uppercase rule by stipulating that capture variable names shouldn't start with a capital letter (after stripping leading underscores).

This shouldn't affect users whose alphabets have no case distinction (though a future decision to use a leading uppercase letter to mark dot-free loads would). It also shouldn't affect any serious use of match/case -- PEP 8 is quite clear that locals should use all-lowercase, and while I've seen plenty of code that violates the recommendation of using UPPERCASE for named constants or CapWords for class names, I don't think I've seen much code using anything but lowercase for local variables except on a whim. I doubt that anyone would even notice if we snuck this into the implementation without telling anybody. :-)
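The stipulation described above (no leading capital after stripping underscores) could be sketched as follows. The helper `starts_with_capital` is a hypothetical name used only for illustration; nothing here is claimed to match what the implementation does.

```python
def starts_with_capital(name: str) -> bool:
    """Hypothetical check: does the name begin with an uppercase letter
    once leading underscores are stripped?"""
    stripped = name.lstrip("_")
    return bool(stripped) and stripped[0].isupper()


for name in ["Foo", "_Foo", "__Foo", "foo", "_foo", "名字", "_"]:
    # Names for which this prints True would be disallowed as capture targets.
    print(name, starts_with_capital(name))
```

Names in caseless alphabets never trip the check, because `str.isupper()` is False for their letters, which is consistent with the remark that such users would be unaffected.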

@thautwarm

Glad to see the authors voting for no uppercase.

Many, including me, believe using uppercase this way is bad for Python.

If it's accepted, I'd solve one minor issue by ruling that _Foo and __Foo are UPPERCASE. I don't know what to do for alphabets without lower/upper distinction but I don't see that as a show-stopper. (@thautwarm, can you help here?)

Of course.

It seems that "no uppercase" will be adopted, so there may be no need to consider alphabets without lowercase/uppercase concepts.

In case you still want some information about CJK languages: usually a CJK language user will expect all characters to be either uppercase or lowercase (because this is a routine way). However, instead of using cases, CJK language users might prefer using 「名字」 as the load of variable 名字.

@gvanrossum
Owner Author

Looks like we're going with just the dot-in-the-middle rule ("purist"). People can create a dummy class and put values in there:

def foo(pt, context):
    class c:
        ctx = context
    match pt:
        case Point(0, 0, context=c.ctx): ...

@dmoisset
Collaborator

dmoisset commented Jul 5, 2020

I imagine most people will not use the dummy class and instead go for:

def foo(pt, context):
    match pt:
        case Point(0, 0, context=c) if c==context: ...

And it's what I've seen in Haskell and Rust (see for example Listing 18-27 here)

@brandtbucher brandtbucher added accepted Discussion leading to a final decision to include in the PEP needs more pep An issue which needs to be documented in the PEP and removed open Still under discussion labels Jul 5, 2020
@brandtbucher
Collaborator

This has been implemented. Still needs PEP though.

@gvanrossum
Owner Author

I imagine most people will not use the dummy class and instead go for

def foo(pt, context):
    match pt:
        case Point(0, 0, context=c) if c==context: ...

I'm not so sure. That idiom looks backwards: We want to compare a value, but instead we extract it and then add a guard -- but for the human reader, a guard is much more expensive to understand, because there are many other things you could test for in a guard. Plus now the reader is wondering, is c only used in the guard, or also in the block (the ... in your example), or in the code past the end of the whole match statement.

Also it would be repetitive if several case clauses need it.

@brandtbucher brandtbucher added fully pepped Issues that have been fully documented in the PEP and removed needs more pep An issue which needs to be documented in the PEP labels Jul 7, 2020
@adamwehmann

adamwehmann commented Sep 23, 2020

With the adoption of the dot-in-the-middle rule, it seems that the use of "." as a sigil, if one is ever needed in the future, could carry a little extra weight by analogy. Surprisingly, I haven't seen* it explicitly stated or discussed anywhere, although it seems likely it was the intention, that the two rules combined would have compared nicely with the traversal rules of relative imports (and the current working directory concept on the Linux file system). Essentially, in this context, if I understand correctly, just as foo.bar matches the constant bar loaded from the foo namespace, .bar could be said to match the constant loaded from the current one. Realizing this while reflecting on the PEP updates supplied a missing motivation that took the original PEP's treatment of constant patterns from arbitrary to making more sense to me personally. I don't know if other readers might have missed this connection as well; of course, other reasons for not keeping it in the PEP may still exist.

Apologies if this is obvious and/or unwanted noise.

*searching this repo, python-dev, python-ideas, the PEP versions
