PEP 622: Proposed rewrite of Abstract and Overview (mostly) by @willingc #1573

Merged
merged 8 commits into from
Aug 28, 2020
204 changes: 129 additions & 75 deletions pep-0622.rst
@@ -22,88 +22,141 @@ Resolution:
Abstract
========

This PEP proposes to add a pattern matching statement to Python,
inspired by similar syntax found in Scala and many other languages.

The pattern syntax builds on Python’s existing syntax for sequence
unpacking (e.g., ``a, b = value``), but is wrapped in a ``match``
statement which compares its subject to several different “shapes”
until one is found that fits. In addition to specifying the shape of a
sequence to be unpacked, patterns can also specify the shape to be a
mapping with specific keys, an instance of a given class with (optionally) specific
attributes, a specific value, or a wildcard. Patterns can be composed
in several ways.

Syntactically, a ``match`` statement contains a *subject* expression
and one or more ``case`` clauses, where each case clause specifies a
pattern (the overall shape to be matched), an optional “guard” (a
condition to be checked if the pattern matches), and a code block to
be executed if the case clause is selected.

The rest of the PEP motivates why we believe pattern matching makes a
good addition to Python, explains our design choices, and contains a
precise syntactic and runtime specification. We also give guidance for
static type checkers (and one small addition to the ``typing`` module)
and discuss the main objections and alternatives that have been
brought up during extensive discussion of the proposal, both within
the group of authors and in the python-dev community. Finally, we
discuss some possible extensions that might be considered in the
future, once the community has ample experience with the currently
proposed syntax and semantics.
This PEP proposes to add a **pattern matching statement** to Python,
inspired by similar syntax found in Scala, Erlang, and other languages.

Patterns and shapes
-------------------

The **pattern syntax** builds on Python’s existing syntax for sequence
unpacking (e.g., ``a, b = value``).

A ``match`` statement compares a value (the **subject**)
to several different **shapes** (patterns) until a shape fits.
Each pattern describes the type and structure of the values it accepts,
as well as the variables in which to capture their contents.

Patterns can specify the shape to be:

- a sequence to be unpacked, as already mentioned
- a mapping with specific keys
- an instance of a given class with (optionally) specific attributes
- a specific value
- a wildcard

Patterns can be composed in several ways.

Syntax
------

Syntactically, a ``match`` statement contains:

- a *subject* expression
- one or more ``case`` clauses

Each ``case`` clause specifies:

- a pattern (the overall shape to be matched)
- an optional “guard” (a condition to be checked if the pattern matches)
- a code block to be executed if the case clause is selected

Motivation
----------

The rest of the PEP:

- motivates why we believe pattern matching makes a good addition to Python
- explains our design choices
- contains a precise syntactic and runtime specification
- gives guidance for static type checkers (and one small addition to the ``typing`` module)
- discusses the main objections and alternatives that have been
  brought up during extensive discussion of the proposal, both within
  the group of authors and in the python-dev community

Finally, we discuss some possible extensions that might be considered
in the future, once the community has ample experience with the
currently proposed syntax and semantics.

.. _overview:

Overview
========

Since patterns are a new syntactic category with their own rules
and special cases, and since they mix input (given values) and output
(captured variables) in novel ways, they require a bit of getting used
to. It is the experience of the authors that this happens quickly when
a brief introduction to the basic concepts such as the following is
presented. Note that this section is not intended to be complete or
perfectly accurate.
Patterns are a new syntactic category with their own rules
and special cases. Patterns mix input (given values) and output
(captured variables) in novel ways, so they may take a little time
to use effectively. This section gives a brief introduction to the
basic concepts. Note that it is not intended to be complete or
entirely accurate.

A new syntactic construct called *pattern* is
introduced. Syntactically, patterns look like a subset of expressions;
the following are patterns:
Pattern, a new syntactic construct, and destructuring
-----------------------------------------------------

A new syntactic construct called **pattern** is introduced in this
PEP. Syntactically, patterns look like a subset of expressions.
The following are examples of patterns:

- ``[first, second, *rest]``
- ``Point2d(x, 0)``
- ``{"name": "Bruce", "age": age}``
- ``42``

The above look like examples of object construction. A constructor
takes some values as parameters and builds an object from those
components. But as a pattern the above mean the inverse operation of
construction, which we call *destructuring*: it takes a subject value
and extracts its components. The syntactic similarity between
construction and destructuring is intentional and follows the existing
Pythonic style which makes assignment targets (write contexts) look
like expressions (read contexts). Pattern matching never creates
objects, in the same way that ``[a, b] = my_list`` doesn't create a
The above may look like examples of object construction, where a
constructor takes some values as parameters and builds an object
from those components.

When read as patterns, however, they denote the inverse operation of
construction, which we call **destructuring**. Destructuring takes a
subject value and extracts its components.

The syntactic similarity between object construction and destructuring is
intentional. It follows the existing Pythonic style, which makes
assignment targets (write contexts) look like expressions (read contexts).

Pattern matching never creates objects, in the same way that
``[a, b] = my_list`` doesn't create a
new ``[a, b]`` list, nor does it read the values of ``a`` and ``b``.

The intuition we are trying to build in users as they learn this is
that matching a pattern to a subject binds the free variables (if any)
to subject components in a way that reflects the original
subject when read as an expression. During this process,
the structure of the pattern may not fit the subject, in which case
the matching *fails*. For example, matching the pattern ``Point2d(x,
0)`` to the subject ``Point2d(3, 0)`` successfully matches and binds
``x`` to ``3``. However, if the subject is ``[3, 0]`` the match fails
because a ``list`` is not a ``Point2d``. And if the subject is
``Point2D(3, 3)`` the match fails because its second coordinate is not
``0``.

The ``match`` statement tries to match each of the
patterns in its ``case`` clauses with a single subject. At the first
successful match, the variables in the pattern are assigned and a
corresponding block is executed. Each of the multiple branches of this
conditional statement can also have a boolean condition as a *guard*.

Here's an example of a match statement, used to define a function
building 3D points that can accept as input either tuples of size 2 or
3, or existing (2D or 3D) points::

Matching process
----------------

.. **Reword**
The intuition we are trying to build in users as they learn this is
that matching a pattern to a subject binds the free variables (if any)
to subject components in a way that reflects the original
subject when read as an expression.

During this matching process,
the structure of the pattern may not fit the subject, in which case matching *fails*.

For example, matching the pattern ``Point2d(x, 0)`` to the subject
``Point2d(3, 0)`` successfully matches. The match also **binds**
the pattern's free variable ``x`` to the subject's value ``3``.

As another example, if the subject is ``[3, 0]``, the match fails
because the subject's type ``list`` is not the pattern's ``Point2d``.

As a third example, if the subject is
``Point2d(3, 7)``, the match fails because the
subject's second coordinate ``7`` is not the same as the pattern's ``0``.

The ``match`` statement tries to match a single subject to each of the
patterns in its ``case`` clauses. At the first
successful match to a pattern in a ``case`` clause:

- the variables in the pattern are assigned, and
- a corresponding block is executed.

Each ``case`` clause can also specify an optional boolean condition,
known as a **guard**.

Let's look at a more detailed example of a ``match`` statement. Here the
``match`` statement is used in a function that builds 3D points. The
function accepts any of the following as input: a tuple with 2 elements,
a tuple with 3 elements, an existing ``Point2d`` object, or an existing
``Point3d`` object::

    def make_point_3d(pt):
        match pt:
@@ -118,11 +171,12 @@ building 3D points that can accept as input either tuples of size 2 or
            case _:
                raise TypeError("not a point we support")

Writing this function in the traditional fashion would require several
Without pattern matching, this function's implementation would require several
``isinstance()`` checks, one or two ``len()`` calls, and a more
convoluted control flow. While the ``match`` version translates into
similar code under the hood, to a reader familiar with patterns it is
much clearer.
convoluted control flow. The ``match`` version and the traditional
version translate into similar code under the hood. To a reader familiar
with pattern matching, however, the ``match`` version will likely be the
clearer of the two.
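
For comparison, a traditional version might look like the sketch below. It is not taken from the PEP: the dataclass definitions, the name ``make_point_3d_traditional``, the choice of ``0`` as the missing third coordinate, and returning an existing ``Point3d`` unchanged are all assumptions based on the prose description of the accepted inputs.

```python
from dataclasses import dataclass


@dataclass
class Point2d:
    x: float
    y: float


@dataclass
class Point3d:
    x: float
    y: float
    z: float


def make_point_3d_traditional(pt):
    # Assumption: 2D inputs are lifted to 3D with z = 0.
    if isinstance(pt, tuple):
        if len(pt) == 2:
            x, y = pt
            return Point3d(x, y, 0)
        if len(pt) == 3:
            x, y, z = pt
            return Point3d(x, y, z)
    elif isinstance(pt, Point2d):
        return Point3d(pt.x, pt.y, 0)
    elif isinstance(pt, Point3d):
        # Assumption: an existing 3D point is returned as-is.
        return pt
    # Tuples of any other length and all other types end up here.
    raise TypeError("not a point we support")
```

The type checks, length checks, and unpacking are spread over separate statements here, which is the bookkeeping that a pattern expresses in a single line.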


Rationale and Goals
@@ -859,8 +913,8 @@ The procedure is as following:
* If there are match-by-position items and the class has a
``__match_args__`` attribute, the item at position ``i``
is matched against the value looked up by attribute
``__match_args__[i]``. For example, a pattern ``Point2D(5, 8)``,
where ``Point2D.__match_args__ == ["x", "y"]``, is translated
``__match_args__[i]``. For example, a pattern ``Point2d(5, 8)``,
where ``Point2d.__match_args__ == ["x", "y"]``, is translated
(approximately) into ``obj.x == 5 and obj.y == 8``.

* If there are more positional items than the length of
@@ -901,7 +955,7 @@ runtime and will raise exceptions. In addition to basic checks
described in the previous subsection:

* The interpreter will check that two match items are not targeting the same
attribute, for example ``Point2D(1, 2, y=3)`` is an error.
attribute, for example ``Point2d(1, 2, y=3)`` is an error.

* It will also check that a mapping pattern does not attempt to match
the same key more than once.