In [1]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

A Python program is read by a *parser*. Input to the parser is a stream of *tokens*, generated by the *lexical analyzer*. This chapter describes how the lexical analyzer breaks a file into tokens.

Python reads program text as Unicode code points; the encoding of a source file can be given by an encoding declaration and defaults to UTF-8, see [**PEP 3120**](https://www.python.org/dev/peps/pep-3120) for details. If the source file cannot be decoded, a [`SyntaxError`](https://docs.python.org/3/library/exceptions.html#SyntaxError) is raised.

> 一个Python程序是由一个*解析器*读取的。输入到解析器的是*符号*流，由*词法分析器*生成。本章描述了词法分析器如何将文件分解为标记。
>
> Python 将程序文本作为 Unicode 码点来读取；源文件的编码可以通过编码声明来给出，默认为 UTF-8，详情见 [**PEP 3120**](https://www.python.org/dev/peps/pep-3120) 。如果源文件不能被解码，会产生一个[`SyntaxError`](https://docs.python.org/3/library/exceptions.html#SyntaxError)。

## 2.1. Line structure

A Python program is divided into a number of *logical lines*.

> 一个Python程序被划分为若干*逻辑行*。

### 2.1.1. Logical lines

The end of a logical line is represented by the token NEWLINE. Statements cannot cross logical line boundaries except where NEWLINE is allowed by the syntax (e.g., between statements in compound statements). A logical line is constructed from one or more *physical lines* by following the explicit or implicit *line joining* rules.

> 一个逻辑行的结束由标记NEWLINE表示。语句不能跨越逻辑行的边界，除非语法允许NEWLINE（例如，在复合语句的语句之间）。一个逻辑行是由一个或多个*物理行*构成的，遵循显性或隐性的*行连接*规则。

A comment starts with a hash character (`#`) that is not part of a string literal, and ends at the end of the physical line. A comment signifies the end of the logical line unless the implicit line joining rules are invoked. Comments are ignored by the syntax.

> 注释以一个不属于字符串字面的哈希字符（`#`）开始，并在物理行的末端结束。除非调用隐含的行连接规则，否则注释标志着逻辑行的结束。语法会忽略注释。

### 2.1.4. Encoding declarations

If a comment in the first or second line of the Python script matches the regular expression `coding[=:]\s*([-\w.]+)`, this comment is processed as an encoding declaration; the first group of this expression names the encoding of the source code file. The encoding declaration must appear on a line of its own. If it is the second line, the first line must also be a comment-only line. The recommended forms of an encoding expression are

> 如果Python脚本第一行或第二行的注释与正则表达式 `coding[=:]/s*([-\w.]+)` 相匹配，这个注释将被处理为一个编码声明；这个表达式的第一组命名了源代码文件的编码。编码声明必须出现在它自己的一行中。如果它是第二行，第一行也必须是一个注释行。编码表达式的推荐形式是

In [2]:
# -*- coding: <encoding-name> -*-

which is recognized also by GNU Emacs, and

> 这也被GNU Emacs所认可，而

In [3]:
# vim:fileencoding=<encoding-name>

which is recognized by Bram Moolenaar’s VIM.

If no encoding declaration is found, the default encoding is UTF-8. In addition, if the first bytes of the file are the UTF-8 byte-order mark (`b'\xef\xbb\xbf'`), the declared file encoding is UTF-8 (this is supported, among others, by Microsoft’s **notepad**).

If an encoding is declared, the encoding name must be recognized by Python (see [Standard Encodings](https://docs.python.org/3/library/codecs.html#standard-encodings)). The encoding is used for all lexical analysis, including string literals, comments and identifiers.

> 这是被Bram Moolenaar的VIM识别的。
>
> 如果没有找到编码声明，默认的编码是UTF-8。此外，如果文件的第一个字节是UTF-8的字节顺序标记 (`b'\xef\xbb\xbf'`)，声明的文件编码是UTF-8 (除其他外，微软的**notepad**也支持这种编码)。
>
> 如果声明了一个编码，那么这个编码名称必须被Python所识别(见[标准编码](https://docs.python.org/3/library/codecs.html#standard-encodings))。该编码用于所有的词法分析，包括字符串字面、注释和标识符。

### 2.1.5. Explicit line joining

Two or more physical lines may be joined into logical lines using backslash characters (`\`), as follows: when a physical line ends in a backslash that is not part of a string literal or comment, it is joined with the following forming a single logical line, deleting the backslash and the following end-of-line character. For example:

> 两个或多个物理行可以使用反斜杠字符（`\`）连接成逻辑行，具体方法如下：当一个物理行以反斜杠结尾，且不属于字符串字面或注释的一部分时，它将与下面的行连接成一个逻辑行，删除反斜杠和下面的行尾字符。比如说：

In [2]:
if 1900 < year < 2100 and 1 <= month <= 12 \
  and 1 <= day <= 31 and 0 <= hour < 24 \
  and 0 <= minute < 60 and 0 <= second < 60:    # Looks like a valid data
    return 1

SyntaxError: 'return' outside function (3456379048.py, line 4)

### 2.1.6. Implicit line joining

Expressions in parentheses, square brackets or curly braces can be split over more than one physical line without using backslashes. For example:

> 小括号、方括号或大括号中的表达式可以在一个以上的物理行上分割，而不使用反斜线。例如：

In [3]:
month = ['Januari', 'Februari', 'Maart',        # These are
         'April',   'Mei',      'Juni',         # Dutch names
         'Juli',    'Augustus', 'Septembeer',   # for the months
         'Oktober', 'November', 'Decembeer']    # of the year

Implicitly continued lines can carry comments. The indentation of the continuation lines is not important. Blank continuation lines are allowed. There is no NEWLINE token between implicit continuation lines. Implicitly continued lines can also occur within triple-quoted strings (see below); in that case they cannot carry comments.

> 隐含的续行可以携带注释。延续行的缩进并不重要。空白的续行是允许的。在隐性续行之间没有NEWLINE标记。隐式续行也可以出现在三引号字符串中（见下文）；在这种情况下，它们不能带有注释。

### 2.1.7. Blank lines

A logical line that contains only spaces, tabs, formfeeds and possibly a comment, is ignored (i.e., no NEWLINE token is generated). During interactive input of statements, handling of a blank line may differ depending on the implementation of the read-eval-print loop. In the standard interactive interpreter, an entirely blank logical line (i.e. one containing not even whitespace or a comment) terminates a multi-line statement.

> 一个只包含空格、制表符、换行符和可能的注释的逻辑行被忽略（即不产生NEWLINE标记）。在语句的交互式输入过程中，对空行的处理可能有所不同，这取决于read-eval-print循环的实现。在标准的交互式解释器中，一个完全空白的逻辑行（即一个甚至不包含空白或注释的逻辑行）会终止一个多行语句。

### 2.1.8. Indentation

Leading whitespace (spaces and tabs) at the beginning of a logical line is used to compute the indentation level of the line, which in turn is used to determine the grouping of statements.

Tabs are replaced (from left to right) by one to eight spaces such that the total number of characters up to and including the replacement is a multiple of eight (this is intended to be the same rule as used by Unix). The total number of spaces preceding the first non-blank character then determines the line’s indentation. Indentation cannot be split over multiple physical lines using backslashes; the whitespace up to the first backslash determines the indentation.

Indentation is rejected as inconsistent if a source file mixes tabs and spaces in a way that makes the meaning dependent on the worth of a tab in spaces; a [`TabError`](https://docs.python.org/3/library/exceptions.html#TabError) is raised in that case.

**Cross-platform compatibility note:** because of the nature of text editors on non-UNIX platforms, it is unwise to use a mixture of spaces and tabs for the indentation in a single source file. It should also be noted that different platforms may explicitly limit the maximum indentation level.

A formfeed character may be present at the start of the line; it will be ignored for the indentation calculations above. Formfeed characters occurring elsewhere in the leading whitespace have an undefined effect (for instance, they may reset the space count to zero).

The indentation levels of consecutive lines are used to generate INDENT and DEDENT tokens, using a stack, as follows.

Before the first line of the file is read, a single zero is pushed on the stack; this will never be popped off again. The numbers pushed on the stack will always be strictly increasing from bottom to top. At the beginning of each logical line, the line’s indentation level is compared to the top of the stack. If it is equal, nothing happens. If it is larger, it is pushed on the stack, and one INDENT token is generated. If it is smaller, it *must* be one of the numbers occurring on the stack; all numbers on the stack that are larger are popped off, and for each number popped off a DEDENT token is generated. At the end of the file, a DEDENT token is generated for each number remaining on the stack that is larger than zero.

Here is an example of a correctly (though confusingly) indented piece of Python code:

> 逻辑行开始时的领先空白（空格和制表符）用于计算该行的缩进程度，这反过来又用于确定语句的分组。
>
> 制表符被1到8个空格所取代（从左到右），这样一来，直到并包括取代的字符总数是8的倍数（这旨在与Unix使用的规则相同）。然后，第一个非空白字符之前的空格总数决定了该行的缩进。缩进不能用反斜线分割成多个物理行；到第一个反斜线为止的空格决定了缩进。
>
> 如果源文件将制表符和空格混合在一起，使其含义取决于空格中的制表符的价值，则缩进将被拒绝，在这种情况下，将产生一个[`TabError`](https://docs.python.org/3/library/exceptions.html#TabError)。
>
> **跨平台兼容性说明：**由于非UNIX平台上的文本编辑器的性质，在单个源文件中使用空格和制表符的混合缩进方式是不明智的。还应注意的是，不同的平台可能明确限制最大缩进程度。
>
> 在行的开头可能有一个formfeed字符；在上面的缩进计算中，它将被忽略。在前导空白的其他地方出现的formfeed字符具有未定义的效果（例如，它们可能将空格数重置为零）。
>
> 连续几行的缩进水平被用来生成INDENT和DEDENT标记，使用堆栈，如下所示。
>
> 在读取文件的第一行之前，在堆栈上推入一个零；这将不再被弹出。推入堆栈的数字将总是严格地从下往上增加。在每个逻辑行的开头，该行的缩进水平与堆栈的顶部进行比较。如果它相等，则不发生任何事情。如果它大，它被推到堆栈上，并产生一个INDENT符号。如果它较小，它必须是堆栈上出现的数字之一；堆栈上所有较大的数字都被弹出，每弹出一个数字，就产生一个DEDENT标记。在文件结束时，为堆栈上剩余的每个大于0的数字生成一个DEDENT标记。
>
> 下面是一个正确缩进的Python代码的例子（虽然很混乱）：

In [2]:
def perm(l):
       # Computer the list of all permutations of 1
    if len(l) <= 1:
                  return [l]
    r = []
    for i in range(len(l)):
             s = l[:i] + l[i+1:]
             p = perm(s)
             for x in P:
              r.append(l[i:i+1] + x)
    return r

The following example shows various indentation errors:

> 下面的例子显示了各种缩进错误：

In [5]:
 def perm(l):                       # error: first line indented
for i in range(len(l)):             # error: not indented
    s = l[:i] + l[i+1:]
        p = perm(l[:i] + l[i+1:])   # error: unexpected indent
        for x in p:
                r.append(l[i:i+1] + x)
            return r                # error: inconsistent dedent

IndentationError: unindent does not match any outer indentation level (<tokenize>, line 7)

(Actually, the first three errors are detected by the parser; only the last error is found by the lexical analyzer — the indentation of `return r` does not match a level popped off the stack.)

> (实际上，前三个错误是由分析器检测出来的；只有最后一个错误是由词法分析器发现的--`return r`的缩进与从堆栈中弹出的级别不一致）。

### 2.1.9. Whitespace between tokens

Except at the beginning of a logical line or in string literals, the whitespace characters space, tab and formfeed can be used interchangeably to separate tokens. Whitespace is needed between two tokens only if their concatenation could otherwise be interpreted as a different token (e.g., ab is one token, but a b is two tokens).

> 除了在逻辑行的开头或在字符串字面中，空白字符空格、制表符和formfeed可以互换使用，以分隔标记。只有在两个标记之间需要留空白，否则它们的连接可以被解释为不同的标记（例如，ab是一个标记，但a b是两个标记）。

## 2.2. Other tokens

Besides NEWLINE, INDENT and DEDENT, the following categories of tokens exist: *identifiers*, *keywords*, *literals*, *operators*, and *delimiters*. Whitespace characters (other than line terminators, discussed earlier) are not tokens, but serve to delimit tokens. Where ambiguity exists, a token comprises the longest possible string that forms a legal token, when read from left to right.

> 除了NEWLINE、INDENT和DEDENT，还有以下几类标记：*标识符（identifiers）*，*关键词（keywords）*，*字面值（literals）*，*操作符（operators）*，以及*分界符（delimiters）*。空白字符（除前面讨论的行结束符外）不是标记，而是用来给标记划界。在存在歧义的情况下，当从左到右阅读时，标记包括形成合法标记的最长的字符串。

## 2.3. Identifiers and keywords

Identifiers (also referred to as *names*) are described by the following lexical definitions.

The syntax of identifiers in Python is based on the Unicode standard annex UAX-31, with elaboration and changes as defined below; see also [**PEP 3131**](https://www.python.org/dev/peps/pep-3131) for further details.

Within the ASCII range (U+0001..U+007F), the valid characters for identifiers are the same as in Python 2.x: the uppercase and lowercase letters `A` through `Z`, the underscore `_` and, except for the first character, the digits `0` through `9`.

Python 3.0 introduces additional characters from outside the ASCII range (see [**PEP 3131**](https://www.python.org/dev/peps/pep-3131)). For these characters, the classification uses the version of the Unicode Character Database as included in the [`unicodedata`](https://docs.python.org/3/library/unicodedata.html#module-unicodedata) module.

Identifiers are unlimited in length. Case is significant.

> 标识符(也被称为*names*)由以下词法定义描述。
>
> Python中标识符的语法以Unicode标准附件UAX-31为基础，并按照下面的定义进行了阐述和修改；更多细节见[**PEP 3131**](https://www.python.org/dev/peps/pep-3131)。
>
> 在ASCII范围内(U+0001...U+007F)，标识符的有效字符与Python 2.x中的相同：大写和小写字母`A`到`Z`，下划线`_`，以及除了第一个字符外，数字`0`到`9`。
>
> Python 3.0引入了ASCII范围之外的其他字符(见[**PEP 3131**](https://www.python.org/dev/peps/pep-3131))。对于这些字符，分类使用[`unicodedata`](https://docs.python.org/3/library/unicodedata.html#module-unicodedata)模块中的Unicode字符数据库版本。
>
> 标识符的长度不限。大小写是重要的。

In [3]:
identifier   ::=  xid_start xid_continue*
id_start     ::=  <all characters in general categories Lu, Ll, Lt, Lm, Lo, Nl, the underscore, and characters with the Other_ID_Start property>
id_continue  ::=  <all characters in id_start, plus characters in the categories Mn, Mc, Nd, Pc and others with the Other_ID_Continue property>
xid_start    ::=  <all characters in id_start whose NFKC normalization is in "id_start xid_continue*">
xid_continue ::=  b<all characters in id_continue whose NFKC normalization is in "id_continue*">

SyntaxError: invalid syntax (2893680421.py, line 1)

The Unicode category codes mentioned above stand for:

- *Lu* - uppercase letters
- *Ll* - lowercase letters
- *Lt* - titlecase letters
- *Lm* - modifier letters
- *Lo* - other letters
- *Nl* - letter numbers
- *Mn* - nonspacing marks
- *Mc* - spacing combining marks
- *Nd* - decimal numbers
- *Pc* - connector punctuations
- *Other_ID_Start* - explicit list of characters in [PropList.txt](https://www.unicode.org/Public/13.0.0/ucd/PropList.txt) to support backwards compatibility
- *Other_ID_Continue* - likewise

All identifiers are converted into the normal form NFKC while parsing; comparison of identifiers is based on NFKC.

A non-normative HTML file listing all valid identifier characters for Unicode 4.1 can be found at https://www.unicode.org/Public/13.0.0/ucd/DerivedCoreProperties.txt

> 上面提到的Unicode类别代码代表了：
>
> - *Lu* - 大写字母
> - *Ll* - 小写字母
> - *Lt* - 标题大写字母
> - *Lm* - 修饰语字母
> - *Lo* - 其他字母
> - *Nl* - 字母编号
> - *Mn* - 非间距标记
> - *Mc* - 间隔组合标记
> - *Nd* - 十进制数字
> - *Pc* - 连接器标点符号
> - *Other_ID_Start* - [PropList.txt](https://www.unicode.org/Public/13.0.0/ucd/PropList.txt)中明确列出的字符，支持向后兼容
> - *Other_ID_Continue* - 同样如此
>
> 在解析时，所有标识符都被转换为正常形式的NFKC；标识符的比较是基于NFKC的。
>
> 一个非规范的HTML文件列出了Unicode 4.1的所有有效标识符，可以在https://www.unicode.org/Public/13.0.0/ucd/DerivedCoreProperties.txt查看

### 2.3.1. Keywords

The following identifiers are used as reserved words, or *keywords* of the language, and cannot be used as ordinary identifiers. They must be spelled exactly as written here:

> 以下标识符被用作保留词，或语言的*关键词*，不能作为普通标识符使用。它们必须完全按照这里的写法拼写：

False                await                else                import                pass

None                break              except            in                         raise

True                  class                finally             is                         return

and                   continue         for                   lambda              try

as                      def                   from               nonlocal             while

assert               del                   global             not                      with

async                elif                   if                      or                        yield

### 2.3.2. Soft Keywords

*New in version 3.10.*

Some identifiers are only reserved under specific contexts. These are known as *soft keywords*. The identifiers `match`, `case` and `_` can syntactically act as keywords in contexts related to the pattern matching statement, but this distinction is done at the parser level, not when tokenizing.

As soft keywords, their use with pattern matching is possible while still preserving compatibility with existing code that uses `match`, `case` and `_` as identifier names.

> *3.10. 版本中的新内容*
>
> 有些标识符只在特定情况下保留。这些被称为*软关键字*。标识符`match`、`case`和`_`在与模式匹配语句相关的上下文中可以作为关键字，但这种区分是在解析器级别进行的，而不是在标记化时。
>
> 作为软关键字，它们与模式匹配的使用是可能的，同时仍然保留了与使用`match`、`case`和`_`作为标识符名称的现有代码的兼容性。

### 2.3.3. Reserved classes of identifiers

Certain classes of identifiers (besides keywords) have special meanings. These classes are identified by the patterns of leading and trailing underscore characters:

- `_*`

  Not imported by `from module import *`.

- `_`

  In a `case` pattern within a [`match`](https://docs.python.org/3/reference/compound_stmts.html#match) statement, `_` is a [soft keyword](https://docs.python.org/3/reference/lexical_analysis.html#soft-keywords) that denotes a [wildcard](https://docs.python.org/3/reference/compound_stmts.html#wildcard-patterns).Separately, the interactive interpreter makes the result of the last evaluation available in the variable `_`. (It is stored in the [`builtins`](https://docs.python.org/3/library/builtins.html#module-builtins) module, alongside built-in functions like `print`.)Elsewhere, `_` is a regular identifier. It is often used to name “special” items, but it is not special to Python itself.Note The name `_` is often used in conjunction with internationalization; refer to the documentation for the [`gettext`](https://docs.python.org/3/library/gettext.html#module-gettext) module for more information on this convention.It is also commonly used for unused variables.

- `__*__`

  System-defined names, informally known as “dunder” names. These names are defined by the interpreter and its implementation (including the standard library). Current system names are discussed in the [Special method names](https://docs.python.org/3/reference/datamodel.html#specialnames) section and elsewhere. More will likely be defined in future versions of Python. *Any* use of `__*__` names, in any context, that does not follow explicitly documented use, is subject to breakage without warning.

- `__*`

  Class-private names. Names in this category, when used within the context of a class definition, are re-written to use a mangled form to help avoid name clashes between “private” attributes of base and derived classes. See section [Identifiers (Names)](https://docs.python.org/3/reference/expressions.html#atom-identifiers).

> 某些类别的标识符（除了关键词）具有特殊的含义。这些类别由前导和后导的下划线字符的模式来识别。
>
> - `_*`
>
>   不被 `from module import *` 所导入。
>
> - `_`
>
>   在[`match`](https://docs.python.org/3/reference/compound_stmts.html#match)语句中的 `case` 模式中，`_`是一个[软关键字](https://docs.python.org/3/reference/lexical_analysis.html#soft-keywords)，表示一个[通配符](https://docs.python.org/3/reference/compound_stmts.html#wildcard-patterns)。
>
>   另外，交互式解释器在变量`_`中提供了最后一次运算的结果。(它存储在[`builtins`](https://docs.python.org/3/library/builtins.html#module-builtins)模块中，与`print`等内置函数一起。)
>
>   在其他地方，`_` 是一个常规标识符。它经常被用来命名 "special"项目，但对Python本身来说并不特殊。
>
>   **注意：** `_` 这个名字经常与国际化结合使用；关于这个约定的更多信息，请参考 [`gettext`](https://docs.python.org/3/library/gettext.html#module-gettext) 模块的文档。它也常用于未使用的变量。
>
> - `__*__`
>
>   系统定义的名称，非正式地称为 "dunder "名称。这些名称是由解释器和它的实现（包括标准库）定义的。目前的系统名称在[Special method names](https://docs.python.org/3/reference/datamodel.html#specialnames)部分和其他地方讨论。在Python的未来版本中可能会有更多的定义。*任何*使用 `__*__` 的名称，在任何情况下，如果不遵循明确记录的使用方法，都会被破坏，没有警告。
>
> - `__*`
>
>   类的私有名称。这一类的名称，当在类定义的上下文中使用时，会被重写成使用一种混合的形式，以帮助避免基类和派生类的 "私有"属性之间的名称冲突。参见[标识符（名称）](https://docs.python.org/3/reference/expressions.html#atom-identifiers)一节。

## 2.4. Literals

Literals are notations for constant values of some built-in types.

> 字面值是一些内置类型的常量值的符号。

### 2.4.1. String and Bytes literals

String literals are described by the following lexical definitions:

> 字符串字面值由以下词法定义描述：

In [12]:
stringliteral   ::=  [stringprefix](shortstring | longstring)
stringprefix    ::=  "r" | "u" | "R" | "U" | "f" | "F"
                     | "fr" | "Fr" | "fR" | "FR" | "rf" | "rF" | "Rf" | "RF"
shortstring     ::=  "'" shortstringitem* "'" | '"' shortstringitem* '"'
longstring      ::=  "'''" longstringitem* "'''" | '"""' longstringitem* '"""'
shortstringitem ::=  shortstringchar | stringescapeseq
longstringitem  ::=  longstringchar | stringescapeseq
shortstringchar ::=  <any source character except "\" or newline or the quote>
longstringchar  ::=  <any source character except "\">
stringescapeseq ::=  "\" <any source character>

SyntaxError: unterminated string literal (detected at line 8) (723465277.py, line 8)

In [6]:
bytesliteral   ::=  bytesprefix(shortbytes | longbytes)
bytesprefix    ::=  "b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB"
shortbytes     ::=  "'" shortbytesitem* "'" | '"' shortbytesitem* '"'
longbytes      ::=  "'''" longbytesitem* "'''" | '"""' longbytesitem* '"""'
shortbytesitem ::=  shortbyteschar | bytesescapeseq
longbytesitem  ::=  longbyteschar | bytesescapeseq
shortbyteschar ::=  <any ASCII character except "\" or newline or the quote>
longbyteschar  ::=  <any ASCII character except "\">
bytesescapeseq ::=  "\" <any ASCII character>

SyntaxError: unterminated string literal (detected at line 7) (1207887476.py, line 7)

One syntactic restriction not indicated by these productions is that whitespace is not allowed between the [`stringprefix`](https://docs.python.org/3/reference/lexical_analysis.html#grammar-token-python-grammar-stringprefix) or [`bytesprefix`](https://docs.python.org/3/reference/lexical_analysis.html#grammar-token-python-grammar-bytesprefix) and the rest of the literal. The source character set is defined by the encoding declaration; it is UTF-8 if no encoding declaration is given in the source file; see section [Encoding declarations](https://docs.python.org/3/reference/lexical_analysis.html#encodings).

In plain English: Both types of literals can be enclosed in matching single quotes (`'`) or double quotes (`"`). They can also be enclosed in matching groups of three single or double quotes (these are generally referred to as *triple-quoted strings*). The backslash (`\`) character is used to give special meaning to otherwise ordinary characters like `n`, which means ‘newline’ when escaped (`\n`). It can also be used to escape characters that otherwise have a special meaning, such as newline, backslash itself, or the quote character. See [escape sequences](https://docs.python.org/3/reference/lexical_analysis.html#escape-sequences) below for examples.

Bytes literals are always prefixed with `'b'` or `'B'`; they produce an instance of the [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) type instead of the [`str`](https://docs.python.org/3/library/stdtypes.html#str) type. They may only contain ASCII characters; bytes with a numeric value of 128 or greater must be expressed with escapes.

Both string and bytes literals may optionally be prefixed with a letter `'r'` or `'R'`; such strings are called *raw strings* and treat backslashes as literal characters. As a result, in string literals, `'\U'` and `'\u'` escapes in raw strings are not treated specially. Given that Python 2.x’s raw unicode literals behave differently than Python 3.x’s the `'ur'` syntax is not supported.

*New in version 3.3:* The `'rb'` prefix of raw bytes literals has been added as a synonym of `'br'`.

*New in version 3.3:* Support for the unicode legacy literal (`u'value'`) was reintroduced to simplify the maintenance of dual Python 2.x and 3.x codebases. See [**PEP 414**](https://www.python.org/dev/peps/pep-0414) for more information.

A string literal with `'f'` or `'F'` in its prefix is a *formatted string literal*; see [Formatted string literals](https://docs.python.org/3/reference/lexical_analysis.html#f-strings). The `'f'` may be combined with `'r'`, but not with `'b'` or `'u'`, therefore raw formatted strings are possible, but formatted bytes literals are not.

In triple-quoted literals, unescaped newlines and quotes are allowed (and are retained), except that three unescaped quotes in a row terminate the literal. (A “quote” is the character used to open the literal, i.e. either `'` or `"`.)

Unless an `'r'` or `'R'` prefix is present, escape sequences in string and bytes literals are interpreted according to rules similar to those used by Standard C. The recognized escape sequences are:

> 有一个语法限制没有被这些产品所指出，就是在[`stringprefix`](https://docs.python.org/3/reference/lexical_analysis.html#grammar-token-python-grammar-stringprefix)或[`bytesprefix`](https://docs.python.org/3/reference/lexical_analysis.html#grammar-token-python-grammar-bytesprefix)和字面的其余部分之间不允许有空格。源字符集由编码声明定义；如果源文件中没有给出编码声明，则为UTF-8；见[编码声明](https://docs.python.org/3/reference/lexical_analysis.html#encodings)一节。
>
> 用通俗话来说：这两种类型的字面值都可以用匹配的单引号（`'`）或双引号（`"`）括起来。它们也可以被包含在三个单引号或双引号的匹配组中（这些通常被称为*三引号字符串（triple-quoted strings）*）。反斜杠(`\`)字符用于给其他普通字符赋予特殊含义，如`n`，在转义(`\n`)后表示'换行'。它也可以用来转义那些本来有特殊含义的字符，如换行、反斜杠本身或引用字符。例子见下面的[转义序列（escape sequences）](https://docs.python.org/3/reference/lexical_analysis.html#escape-sequences)。
>
> 字节字面值总是以 `b` 或 `B` 为前缀；它们产生一个[`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes)类型的实例，而不是[`str`](https://docs.python.org/3/library/stdtypes.html#str)类型。它们只能包含ASCII字符；数字值为128或更大的字节必须用转义来表达。
>
> 字符串和字节字面值都可以选择以字母 `'r'` 或 `'R'` 为前缀；这样的字符串被称为*raw strings*，并将反斜线视为字面字符。因此，在字符串字面中的 `'\U'` 和 `'\u'` 转义不被特别处理。鉴于Python 2.x的原始unicode字面值的行为与Python 3.x的不同，不支持 `'ur'` 语法。
>
> *3.3版中的新内容：* raw字节字面值的前缀 `'rb'` 被添加为 `'br'` 的同义词。
>
> *3.3版中的新内容：* 重新引入了对unicode遗留字面(`u'value'`)的支持，以简化对Python 2.x和3.x双代码库的维护。参见 [**PEP 414**](https://www.python.org/dev/peps/pep-0414) 以了解更多信息。
>
> 前缀中带有 `'f'` 或 `'F'` 的字符串字面是一个*格式化字符串字面（formatted  string literal）*；参见 [格式化字符串字面](https://docs.python.org/3/reference/lexical_analysis.html#f-strings)。`'f'` 可以与 `'r'` 组合，但不能与 `'b'` 或 `'u'` 组合，因此，raw格式化字符串是可能的，但格式化字节字面是不可能的。
>
> 在三引号的字面值中，允许使用（并保留）未转义的换行符（newline）和引号，但连续三个未转义的引号将终止该字面。("引号"是用来打开字面的字符，即 `'` 或 `"` )。
>
> 除非有 `'r'` 或 `'R'` 前缀，否则字符串和字节字面值的转义序列将按照类似于标准C的规则来解释。正式认可的转义序列是：

| Escape Sequence | Meaning                          | Notes |
| :-------------- | :------------------------------- | :---- |
| `\newline`      | Backslash and newline ignored    |       |
| `\\`            | Backslash (`\`)                  |       |
| `\'`            | Single quote (`'`)               |       |
| `\"`            | Double quote (`"`)               |       |
| `\a`            | ASCII Bell (BEL)                 |       |
| `\b`            | ASCII Backspace (BS)             |       |
| `\f`            | ASCII Formfeed (FF)              |       |
| `\n`            | ASCII Linefeed (LF)              |       |
| `\r`            | ASCII Carriage Return (CR)       |       |
| `\t`            | ASCII Horizontal Tab (TAB)       |       |
| `\v`            | ASCII Vertical Tab (VT)          |       |
| `\ooo`          | Character with octal value *ooo* | (1,3) |
| `\xhh`          | Character with hex value *hh*    | (2,3) |

Escape sequences only recognized in string literals are:

>  只在字符串字面值中被识别的转义序列是：

> 

| Escape Sequence | Meaning                                        | Notes |
| :-------------- | :--------------------------------------------- | :---- |
| `\N{name}`      | Character named *name* in the Unicode database | (4)   |
| `\uxxxx`        | Character with 16-bit hex value *xxxx*         | (5)   |
| `\Uxxxxxxxx`    | Character with 32-bit hex value *xxxxxxxx*     | (6)   |



Notes:

1. As in Standard C, up to three octal digits are accepted.
2. Unlike in Standard C, exactly two hex digits are required.
3. In a bytes literal, hexadecimal and octal escapes denote the byte with the given value. In a string literal, these escapes denote a Unicode character with the given value.
4. *Changed in version 3.3:* Support for name aliases [1](https://docs.python.org/3/reference/lexical_analysis.html#id14) has been added.
5. Exactly four hex digits are required.
6. Any Unicode character can be encoded this way. Exactly eight hex digits are required.

Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., *the backslash is left in the result*. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.) It is also important to note that the escape sequences only recognized in string literals fall into the category of unrecognized escapes for bytes literals.

> *Changed in version 3.6:* Unrecognized escape sequences produce a [`DeprecationWarning`](https://docs.python.org/3/library/exceptions.html#DeprecationWarning). In a future Python version they will be a [`SyntaxWarning`](https://docs.python.org/3/library/exceptions.html#SyntaxWarning) and eventually a [`SyntaxError`](https://docs.python.org/3/library/exceptions.html#SyntaxError).

Even in a raw literal, quotes can be escaped with a backslash, but the backslash remains in the result; for example, `r"\""` is a valid string literal consisting of two characters: a backslash and a double quote; `r"\"` is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, *a raw literal cannot end in a single backslash* (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the literal, *not* as a line continuation.

> 注意事项：
>
> 1. 与标准C一样，最多接受三个八位数。
> 2. 与标准C不同的是，需要两个十六进制数字。
> 3. 在字节字面中，十六进制和八进制转义表示具有给定值的字节。在字符串字面中，这些转义表示具有给定值的Unicode字符。
> 4. *在3.3版本中的变化：* 增加了对名称别名的支持[^1]。
> 5. 需要精确的四位十六进制数字。
> 6. 任何Unicode字符都可以用这种方式编码。需要精确的八位十六进制数字。
>
> 与标准C不同的是，所有未被识别的转义序列在字符串中保持不变，即*反斜杠被留在结果中*。(这种行为在调试时很有用：如果转义序列打错了，产生的输出结果更容易被识别为错误的。) 同样重要的是要注意，只在字符串字面中被识别的转义序列属于字节字面中不被识别的转义类别。
>
> > *在3.6版本中改变了：* 不被识别的转义序列会产生一个 [`DeprecationWarning`](https://docs.python.org/3/library/exceptions.html#DeprecationWarning)。在未来的Python版本中，它们将是一个[`SyntaxWarning`](https://docs.python.org/3/library/exceptions.html#SyntaxWarning)，最终是一个[`SyntaxError`](https://docs.python.org/3/library/exceptions.html#SyntaxError)。
>
> 即使在一个raw字面中，引号也可以用反斜线转义，但反斜线仍保留在结果中；例如，`r"\""`是一个有效的字符串字面，由两个字符组成：一个反斜线和一个双引号；`r"\"`不是一个有效的字符串字面(即使是一个raw字符串也不能以奇数个反斜线结尾)。具体来说，*一个raw字面不能以单个反斜杠*结尾（因为反斜杠会转义随后的引号字符）。还要注意的是，单反斜线后面的换行被解释为这两个字符是字面的一部分，*不是*行的延续。

### 2.4.2. String literal concatenation

Multiple adjacent string or bytes literals (delimited by whitespace), possibly using different quoting conventions, are allowed, and their meaning is the same as their concatenation. Thus, `"hello" 'world'` is equivalent to `"helloworld"`. This feature can be used to reduce the number of backslashes needed, to split long strings conveniently across long lines, or even to add comments to parts of strings, for example:

> 允许多个相邻的字符串或字节字面值（以空格为界），可能使用不同的引号惯例，它们的含义与它们的串联相同。因此，`"hello"'world'` 等同于 `"helloworld"`。这个功能可以用来减少所需的反斜线数量，方便地在长行中分割长的字符串，甚至可以用来给字符串的一部分添加注释，例如：

In [2]:
import re

re.compile("[A-Za-z_]"        # letter or underscore
          "[A-Za-z0-9_]*"    # letter, digit or underscore
          )

re.compile(r'[A-Za-z_][A-Za-z0-9_]*', re.UNICODE)

Note that this feature is defined at the syntactical level, but implemented at compile time. The ‘+’ operator must be used to concatenate string expressions at run time. Also note that literal concatenation can use different quoting styles for each component (even mixing raw strings and triple quoted strings), and formatted string literals may be concatenated with plain string literals.

> 请注意，这个功能是在语法层面上定义的，但在编译时实现。在运行时必须使用'+'运算符来连接字符串表达式。还要注意的是，字面值的连接可以为每个组件使用不同的引号风格（甚至混合raw字符串和三引号的字符串），格式化的字符串字面值可以与普通字符串字面值连接。

### 2.4.3. Formatted string literals

*New in version 3.6.*

A *formatted string literal* or *f-string* is a string literal that is prefixed with `'f'` or `'F'`. These strings may contain replacement fields, which are expressions delimited by curly braces `{}`. While other string literals always have a constant value, formatted strings are really expressions evaluated at run time.

Escape sequences are decoded like in ordinary string literals (except when a literal is also marked as a raw string). After decoding, the grammar for the contents of the string is:

> *在3.6 版本中新增*
>
> 一个*格式化的字符串字面*或*f-string*是一个以 `'f'` 或 `'F'` 为前缀的字符串字面。这些字符串可能包含替换字段，这些字段是由大括号`{}`划定的表达式。虽然其他字符串字面值总是有一个常量值，但格式化字符串实际上是在运行时运算的表达式。
>
> 转义序列像普通的字符串字面值一样被解码（除了当字面值也被标记为raw字符串时）。解码后，字符串内容的语法是：

In [4]:
f_string          ::=  (literal_char | "{{" | "}}" | replacement_field)*
replacement_field ::=  "{" f_expression ["="] ["!" conversion] [":" format_spec] "}"
f_expression      ::=  (conditional_expression | "*" or_expr)
                         ("," conditional_expression | "," "*" or_expr)* [","]
                       | yield_expression
conversion        ::=  "s" | "r" | "a"
format_spec       ::=  (literal_char | NULL | replacement_field)*
literal_char      ::=  <any code point except "{", "}" or NULL>

IndentationError: unindent does not match any outer indentation level (<tokenize>, line 5)

The parts of the string outside curly braces are treated literally, except that any doubled curly braces `'{{'` or `'}}'` are replaced with the corresponding single curly brace. A single opening curly bracket `'{'` marks a replacement field, which starts with a Python expression. To display both the expression text and its value after evaluation, (useful in debugging), an equal sign `'='` may be added after the expression. A conversion field, introduced by an exclamation point `'!'` may follow. A format specifier may also be appended, introduced by a colon `':'`. A replacement field ends with a closing curly bracket `'}'`.

Expressions in formatted string literals are treated like regular Python expressions surrounded by parentheses, with a few exceptions. An empty expression is not allowed, and both [`lambda`](https://docs.python.org/3/reference/expressions.html#lambda) and assignment expressions `:=` must be surrounded by explicit parentheses. Replacement expressions can contain line breaks (e.g. in triple-quoted strings), but they cannot contain comments. Each expression is evaluated in the context where the formatted string literal appears, in order from left to right.

*Changed in version 3.7:* Prior to Python 3.7, an [`await`](https://docs.python.org/3/reference/expressions.html#await) expression and comprehensions containing an [`async for`](https://docs.python.org/3/reference/compound_stmts.html#async-for) clause were illegal in the expressions in formatted string literals due to a problem with the implementation.

When the equal sign `'='` is provided, the output will have the expression text, the `'='` and the evaluated value. Spaces after the opening brace `'{'`, within the expression and after the `'='` are all retained in the output. By default, the `'='` causes the [`repr()`](https://docs.python.org/3/library/functions.html#repr) of the expression to be provided, unless there is a format specified. When a format is specified it defaults to the [`str()`](https://docs.python.org/3/library/stdtypes.html#str) of the expression unless a conversion `'!r'` is declared.

*New in version 3.8:* The equal sign `'='`.

If a conversion is specified, the result of evaluating the expression is converted before formatting. Conversion `'!s'` calls [`str()`](https://docs.python.org/3/library/stdtypes.html#str) on the result, `'!r'` calls [`repr()`](https://docs.python.org/3/library/functions.html#repr), and `'!a'` calls [`ascii()`](https://docs.python.org/3/library/functions.html#ascii).

The result is then formatted using the [`format()`](https://docs.python.org/3/library/functions.html#format) protocol. The format specifier is passed to the `__format__()` method of the expression or conversion result. An empty string is passed when the format specifier is omitted. The formatted result is then included in the final value of the whole string.

Top-level format specifiers may include nested replacement fields. These nested fields may include their own conversion fields and [format specifiers](https://docs.python.org/3/library/string.html#formatspec), but may not include more deeply-nested replacement fields. The [format specifier mini-language](https://docs.python.org/3/library/string.html#formatspec) is the same as that used by the [`str.format()`](https://docs.python.org/3/library/stdtypes.html#str.format) method.

Formatted string literals may be concatenated, but replacement fields cannot be split across literals.

Some examples of formatted string literals:

> 大括号外的字符串部分按字面值意思处理，除了任何双大括号 `'{{'` 或 `'}}'` 被替换成相应的单倍大括号。单个开放的大括号 `'{'` 标志着一个替换字段，它以一个Python表达式开始。为了同时显示表达式文本和它在运算求值后的值，(在调试中很有用)，可以在表达式后面添加一个等号 `'='`。一个转换域，由感叹号 `'!'` 引入，可以紧随其后。也可以附加一个格式指定符，由冒号 `':'` 引入。替换字段以结尾的大括号 `'}'` 结束。
>
> 在格式化的字符串字面值中的表达式被当作Python正则表达式处理，周围有小括号，但有一些例外。不允许使用空表达式，[`lambda`](https://docs.python.org/3/reference/expressions.html#lambda) 和赋值表达式`:= ` 都必须用明确的括号包围。替换表达式可以包含换行符（例如在三引号字符串中），但不能包含注释。每个表达式在格式化字符串字面出现的上下文中被运算，从左到右依次进行。
>
> *在3.7版本中改变了：* 在Python 3.7之前，由于实现上的问题，在格式化字符串字面的表达式中，[`await`](https://docs.python.org/3/reference/expressions.html#await)表达式和包含[`async for`](https://docs.python.org/3/reference/compound_stmts.html#async-for)子句的解析式是非法的。
>
> 当提供等号 `'='` 时，输出将有表达式文本、`'='` 和被运算的值。开头括号 `'{'` 之后的空格、表达式内和 `'='` 之后的空格都会保留在输出中。默认情况下，`'='` 会导致提供表达式的[`repr()`](https://docs.python.org/3/library/functions.html#repr)，除非有指定的格式。当指定格式时，默认为表达式的[`str()`](https://docs.python.org/3/library/stdtypes.html#str)，除非声明转换 `'!r'`。
>
> *3.8版新增：* 等号 `'='` 。
>
> 如果指定了转换，在格式化之前，运算求值表达式的结果将被转换。转换 `'!s'` 对结果调用[`str()`](https://docs.python.org/3/library/stdtypes.html#str)，`'!r'`调用[`repr()`](https://docs.python.org/3/library/functions.html#repr)，`'!a'`调用[`ascii()`](https://docs.python.org/3/library/functions.html#ascii) 。
>
> 然后使用[`format()`](https://docs.python.org/3/library/functions.html#format)协议将结果格式化。格式指定器被传递给表达式或转换结果的`__format__()`方法。当格式指定符被省略时，将传递一个空字符串。然后，格式化的结果被包含在整个字符串的最终值中。
>
> 顶层的格式指定器可以包括嵌套的替换字段。这些嵌套的字段可以包括它们自己的转换字段和[格式指定器](https://docs.python.org/3/library/string.html#formatspec)，但不能包括更深嵌套的替换字段。[格式化指定符小语言](https://docs.python.org/3/library/string.html#formatspec)与[`str.format()`](https://docs.python.org/3/library/stdtypes.html#str.format)方法使用相同。
>
> 格式化的字符串字面值可以被连接起来，但是替换字段不能在字面值之间分割。
>
> 一些格式化字符串字面值的例子：

In [3]:
name = "Fred"
name

'Fred'

In [7]:
f"He said his name is {name!r}."    # !r - convert the value to a string using repr().

f"He said his name is {name!s}."    # !s - convert the value to a string using str().

f"He said his name is {name}."

"He said his name is 'Fred'."

'He said his name is Fred.'

'He said his name is Fred.'

In [8]:
f"He said his name is {repr(name)}."    # repr() is equvilent to !r

f"He said his name is {str(name)}."    # str() is equvilent to !s

"He said his name is 'Fred'."

'He said his name is Fred.'

In [7]:
import decimal

width = 0
precision = 4
value = decimal.Decimal("12.34567")

f"result: {value:{width}.{precision}}"    # nested fields

'result: 12.35'

In [9]:
from datetime import datetime

today = datetime(year=2022, month=5, day=31)

f"{today:%B %d, %Y}"    # using date format specifier

'May 31, 2022'

In [10]:
f"{today=:%B %d, %Y}"    # using date format specifier and debugging

'today=May 31, 2022'

In [11]:
number = 1024

f"{number:#0x}"    # using integer format specifier

'0x400'

In [12]:
foo = "bar"

f"{ foo = }"    # preserves whitespace

" foo = 'bar'"

In [13]:
line = "The mill's closed"

f"{line = }"

'line = "The mill\'s closed"'

In [14]:
f"{line = :20}"

"line = The mill's closed   "

In [15]:
f"{line = !r:20}"

'line = "The mill\'s closed" '

consequence of sharing the same syntax as regular string literals is that characters in the replacement fields must not conflict with the quoting used in the outer formatted string literal:

> 与普通字符串字面值共享相同语法的后果是，替换字段中的字符不能与外部格式化字符串字面中使用的引号冲突：

In [10]:
# f"abc {a["x"]} def"    # error: outer string literal ended prematurely
f"abc {a['x']} def"    # workaround: use different quoting

NameError: name 'a' is not defined

Backslashes are not allowed in format expressions and will raise an error:

> 格式化表达式中不允许使用反斜线，会引发错误：

In [11]:
f"newline: {ord('\n')}"    # raise SyntaxError

SyntaxError: f-string expression part cannot include a backslash (1300146658.py, line 1)

To include a value in which a backslash escape is required, create a temporary variable.

> 要包含一个需要反斜杠转义的值，请创建一个临时变量：

In [12]:
newline = ord('\n')

f"newline: {newline}"

'newline: 10'

Formatted string literals cannot be used as docstrings, even if they do not include expressions.

> 格式化的字符串字面值不能作为文档串使用，即使它们不包括表达式。

In [14]:
def foo():
    f"Not a docstring"


foo.__doc__ is None

True

See also [**PEP 498**](https://www.python.org/dev/peps/pep-0498) for the proposal that added formatted string literals, and [`str.format()`](https://docs.python.org/3/library/stdtypes.html#str.format), which uses a related format string mechanism.

> 关于增加格式化字符串字面的建议，请参见[**PEP 498**](https://www.python.org/dev/peps/pep-0498)，以及[`str.format()`](https://docs.python.org/3/library/stdtypes.html#str.format)，它使用了一个相关的格式字符串机制。

### 2.4.4. Numeric literals

There are three types of numeric literals: integers, floating point numbers, and imaginary numbers. There are no complex literals (complex numbers can be formed by adding a real number and an imaginary number).

Note that numeric literals do not include a sign; a phrase like `-1` is actually an expression composed of the unary operator ‘`-`’ and the literal `1`.

> 有三种类型的数字文字：整数、浮点数和虚数。没有复数字面（复数可以由一个实数和一个虚数相加形成）。
>
> 请注意，数字字面不包括符号；像 `-1` 这样的短语实际上是一个由单数运算符`'-'` 和字面 `1` 组成的表达式。

### 2.4.5. Integer literals

Integer literals are described by the following lexical definitions:

> 整数字元由以下词法定义描述：

In [2]:
integer      ::=  decinteger | bininteger | octinteger | hexinteger
decinteger   ::=  nonzerodigit (["_"] digit)* | "0"+ (["_"] "0")*
bininteger   ::=  "0" ("b" | "B") (["_"] bindigit)+
octinteger   ::=  "0" ("o" | "O") (["_"] octdigit)+
hexinteger   ::=  "0" ("x" | "X") (["_"] hexdigit)+
nonzerodigit ::=  "1"..."9"
digit        ::=  "0"..."9"
bindigit     ::=  "0" | "1"
octdigit     ::=  "0"..."7"
hexdigit     ::=  digit | "a"..."f" | "A"..."F"

SyntaxError: invalid syntax (4018552655.py, line 1)

There is no limit for the length of integer literals apart from what can be stored in available memory.

Underscores are ignored for determining the numeric value of the literal. They can be used to group digits for enhanced readability. One underscore can occur between digits, and after base specifiers like `0x`.

Note that leading zeros in a non-zero decimal number are not allowed. This is for disambiguation with C-style octal literals, which Python used before version 3.0.

Some examples of integer literals:

> 整数字元的长度没有限制，除了可以存储在可用的内存中。
>
> 在确定字元的数字值时，下划线被忽略。它们可以用来对数字进行分组以提高可读性。一个下划线可以出现在数字之间，以及像 `0x` 这样的基数指定符之后。
>
> 请注意，在非零的十进制数字中，不允许出现前导零。这是为了与C-风格的八进制字样区分开来，Python在3.0版本之前使用的是八进制字样。
>
> 一些整数字元段的例子：

In [3]:
7     2147483647                        0o177    0b100110111
3     79228162514264337593543950336     0o377    0xdeadbeef
      100_000_000_000                   0b_1110_0101

SyntaxError: invalid syntax (1937470749.py, line 1)

*Changed in version 3.6:* Underscores are now allowed for grouping purposes in literals.

> *3.6版中的变化：* 现在允许用下划线在字元中进行分组。

### 2.4.6. Floating point literals

Floating point literals are described by the following lexical definitions:

> 浮点字元由以下词法定义描述：

In [4]:
floatnumber   ::=  pointfloat | exponentfloat
pointfloat    ::=  [digitpart] fraction | digitpart "."
exponentfloat ::=  (digitpart | pointfloat) exponent
digitpart     ::=  digit (["_"] digit)*
fraction      ::=  "." digitpart
exponent      ::=  ("e" | "E") ["+" | "-"] digitpart

SyntaxError: invalid syntax (59127510.py, line 1)

Note that the integer and exponent parts are always interpreted using radix 10. For example, `077e010` is legal, and denotes the same number as `77e10`. The allowed range of floating point literals is implementation-dependent. As in integer literals, underscores are supported for digit grouping.

Some examples of floating point literals:

> 请注意，整数和指数部分总是用基数根10来解释。例如，`077e010` 是合法的，并且表示与 `77e10` 相同的数字。浮点字元的允许范围取决于实现。和整数字元一样，下划线也支持数字分组。
>
> 一些浮点字元的例子：

In [6]:
3.14
10.    
.001    
1e100    
3.14e-10    
0e0    
3.14_15_93

3.14

10.0

0.001

1e+100

3.14e-10

0.0

3.141593

*Changed in version 3.6:* Underscores are now allowed for grouping purposes in literals.

> *3.6版中的变化：* 现在允许用下划线在字词中进行分组。

### 2.4.7. Imaginary literals

Imaginary literals are described by the following lexical definitions:

> 虚构字元由以下词汇定义来描述：

In [7]:
imagnumber ::=  (floatnumber | digitpart) ("j" | "J")

SyntaxError: invalid syntax (2324973660.py, line 1)

An imaginary literal yields a complex number with a real part of 0.0. Complex numbers are represented as a pair of floating point numbers and have the same restrictions on their range. To create a complex number with a nonzero real part, add a floating point number to it, e.g., `(3+4j)`. Some examples of imaginary literals:

> 一个虚构字元产生一个实部为0.0的复数。复数被表示为一对浮点数，对其范围有同样的限制。要创建一个实部为非零的复数，需要给它加上一个浮点数，例如，`(3+4j)`。一些虚数字段的例子：

In [8]:
3.14j
10.j
10j
.001j
1e100j
3.14e-10j   
3.14_15_93j

3.14j

10j

10j

0.001j

1e+100j

3.14e-10j

3.141593j

## 2.5. Operators

The following tokens are operators:

> 以下标记是运算符：

## 2.6. Delimiters

The following tokens serve as delimiters in the grammar:

> 以下标记在语法中充当分隔符：

The period can also occur in floating-point and imaginary literals. A sequence of three periods has a special meaning as an ellipsis literal. The second half of the list, the augmented assignment operators, serve lexically as delimiters, but also perform an operation.

The following printing ASCII characters have special meaning as part of other tokens or are otherwise significant to the lexical analyzer:

> 句号也可以出现在浮点和虚构数字元中。三个句号的序列有一个特殊的含义，即省略号字元。列表的后半部分，即增强的赋值运算符，在词法上作为定界符，但也执行一种操作。
>
> 以下印刷ASCII字符作为其他标记的一部分具有特殊意义，或者对词法分析器具有其他意义：

In [9]:
,
"
#
\

SyntaxError: unterminated string literal (detected at line 2) (3906205298.py, line 2)

The following printing ASCII characters are not used in Python. Their occurrence outside string literals and comments is an unconditional error:

> 以下是Python中不使用的ASCII打印字符。它们出现在字符串字元和注释之外是一个无条件的错误：

In [10]:
$
?

SyntaxError: invalid syntax (1943989242.py, line 1)

**Footnotes**

https://www.unicode.org/Public/11.0.0/ucd/NameAliases.txt