In [3]:
# Mount Google Drive
from google.colab import drive # import drive from google colab

ROOT = "/content/drive"     # default location for the drive
drive.mount(ROOT)           # we mount the google drive at /content/drive
# change to the notes directory
%cd "/content/drive/My Drive/Colab Notebooks/the-python3-standard-libary-by-example-notes"

%mkdir ch1
!touch ch1/__init__.py
%cd ch1

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
/content/drive/My Drive/Colab Notebooks/the-python3-standard-libary-by-example-notes


## 1.1 `string`: Text Constants and Templates

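Many of the functions once provided by the `string` module have been moved to methods of `str` objects, but the module still retains a number of useful constants and classes for working with `str` objects.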
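
As a quick illustration of the constants mentioned above, the following sketch prints a few of them and uses them for a membership test. The helper `is_identifier_like` is a hypothetical name introduced only for this example; it is not part of the `string` module.

```python
import string

# Constants defined by the string module (all plain str objects).
print(string.ascii_lowercase)   # abcdefghijklmnopqrstuvwxyz
print(string.ascii_uppercase)   # ABCDEFGHIJKLMNOPQRSTUVWXYZ
print(string.digits)            # 0123456789
print(string.punctuation)


def is_identifier_like(text):
    """Hypothetical helper: True if text uses only ASCII letters, digits, or '_'."""
    allowed = set(string.ascii_letters + string.digits + "_")
    return bool(text) and all(ch in allowed for ch in text)


print(is_identifier_like("with_underscore"))  # True
print(is_identifier_like("not valid!"))       # False
```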

### 1.1.1 Functions

The `capwords()` function capitalizes the first letter of every word in a string.

In [6]:
%%writefile string_capwords.py
import string

s = 'The quick brown fox jumped over the lazy dog.'

print(s)
print(string.capwords(s))

Writing string_capwords.py


In [7]:
!python3 string_capwords.py

The quick brown fox jumped over the lazy dog.
The Quick Brown Fox Jumped Over The Lazy Dog.


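The result is equivalent to splitting the string, capitalizing each word, and joining the words back together with single spaces: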

In [16]:
s = 'The quick brown fox jumped over the lazy dog.'
print(" ".join([ word.capitalize() for word in s.split()]))

The Quick Brown Fox Jumped Over The Lazy Dog.
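
The equivalence has a side effect worth seeing in practice: with the default separator, runs of whitespace collapse into single spaces. The sketch below checks this and also shows the optional `sep` argument of `capwords()`. The function `capwords_sketch` is a hypothetical re-implementation written for this note, not the library's source.

```python
import string


def capwords_sketch(s, sep=None):
    """Hypothetical re-implementation: split, capitalize each word, join."""
    return (sep or " ").join(word.capitalize() for word in s.split(sep))


messy = "  the   quick\tbrown fox  "

# With the default sep, leading/trailing whitespace is dropped and
# internal runs of whitespace become a single space.
print(repr(string.capwords(messy)))                      # 'The Quick Brown Fox'
print(capwords_sketch(messy) == string.capwords(messy))  # True

# With an explicit sep, the string is split and re-joined on that separator.
print(string.capwords("one-two-three", sep="-"))         # One-Two-Three
```

Note that `str.capitalize()` also lowercases the rest of each word, so `capwords('hello WORLD')` yields `Hello World`.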


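### 1.1.2 Templates

- String templates were added by [PEP 292](https://www.python.org/dev/peps/pep-0292/) as an alternative to the built-in interpolation syntax.
- With `string.Template` interpolation, a variable is identified by prefixing its name with `$` (e.g., `$var`). The name can also be wrapped in curly braces (e.g., `${var}`) to set it off from the surrounding text.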

In [36]:
%%writefile string_template.py
import string

values = {'var': 'foo'}

t = string.Template("""
Variable        : $var
Escape          : $$
Variable in text: ${var}iable
""")  # doubling $ escapes it

print('TEMPLATE:', t.substitute(values))

s = """
Variable        : %(var)s
Escape          : %%
Variable in text: %(var)siable
"""  # doubling % escapes it

print('INTERPOLATION:', s % values)

s = """
Variable        : {var}
Escape          : {{}}
Variable in text: {var}iable
"""  # doubling { and } escapes them

print('FORMAT:', s.format(**values))

Overwriting string_template.py


In [37]:
!python3 string_template.py

TEMPLATE: 
Variable        : foo
Escape          : $
Variable in text: fooiable

Traceback (most recent call last):
  File "string_template.py", line 19, in <module>
    print('INTERPOLATION:', s % values)
TypeError: not enough arguments for format string


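- Using the `safe_substitute()` method avoids the exception that a missing argument would otherwise raise: placeholders without a value are left in the output unchanged.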

In [20]:
%%writefile string_template_missing.py
import string

values = {'var': 'foo'}

t = string.Template("$var is here but $missing is not provided")

try:
    print('substitute()     :', t.substitute(values))
except KeyError as err:
    print('ERROR:', str(err))

print('safe_substitute():', t.safe_substitute(values))


Overwriting string_template_missing.py


In [21]:
!python3 string_template_missing.py

ERROR: 'missing'
safe_substitute(): foo is here but $missing is not provided
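
Both `substitute()` and `safe_substitute()` accept a mapping, keyword arguments, or both; when the same name appears in both, the keyword argument takes precedence. A minimal sketch, with the template text and the `defaults` dictionary made up for this example:

```python
import string

t = string.Template("$greeting, $name!")
defaults = {"greeting": "Hello", "name": "world"}

# Mapping only.
print(t.substitute(defaults))                  # Hello, world!

# Mapping plus a keyword argument: the keyword overrides the mapping.
print(t.substitute(defaults, name="Python"))   # Hello, Python!

# safe_substitute() leaves the unknown placeholder in place.
print(t.safe_substitute(greeting="Hi"))        # Hi, $name!
```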


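### 1.1.3 Advanced Templates

- The default syntax of `string.Template` can be changed by adjusting the regular expression patterns it uses to find variable names in the template body.
- A simple way to do that is to change the `delimiter` and `idpattern` class attributes:
  - `delimiter` is the delimiter that introduces a variable
  - `idpattern` is the regular expression for a variable name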

In [38]:
%%writefile string_template_advanced.py
import string


class MyTemplate(string.Template):
    delimiter = '%'
    idpattern = '[a-z]+_[a-z]+'


template_text = '''
  Delimiter : %%
  Replaced  : %with_underscore
  Ignored   : %notunderscored
'''

d = {
    'with_underscore': 'replaced',
    'notunderscored': 'not replaced',
}

t = MyTemplate(template_text)
print('Modified ID pattern:')
print(t.safe_substitute(d))


Writing string_template_advanced.py


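- In this example, the delimiter is changed to `%`, and a variable name must contain an underscore somewhere in the middle, so `%notunderscored` is not replaced by anything.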

In [39]:
!python3 string_template_advanced.py

Modified ID pattern:

  Delimiter : %
  Replaced  : replaced
  Ignored   : %notunderscored



- The value of `t.pattern` is a compiled regular expression; the original pattern string is available through its `pattern` attribute.

In [40]:
%%writefile string_template_defaultpattern.py
import string

t = string.Template('$var')
print(t.pattern.pattern)

Writing string_template_defaultpattern.py


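- The pattern contains four named groups, which capture the escaped delimiter, a named variable, a braced variable, and an invalid delimiter pattern, respectively.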

In [41]:
!python3 string_template_defaultpattern.py


    \$(?:
      (?P<escaped>\$) |   # Escape sequence of two delimiters
      (?P<named>(?a:[_a-z][_a-z0-9]*))      |   # delimiter and a Python identifier
      {(?P<braced>(?a:[_a-z][_a-z0-9]*))}  |   # delimiter and a braced identifier
      (?P<invalid>)              # Other ill-formed delimiter exprs
    )
    


> Here `(?:...)` denotes a non-capturing group: it groups the alternatives without saving the matched text.
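
The difference between a capturing and a non-capturing group can be seen with a small `re` sketch. The two patterns here are simplified illustrations, not the ones `string.Template` uses.

```python
import re

text = "$var"

# Both parenthesized parts are capturing groups, so both are saved.
with_capture = re.match(r"(\$)([a-z]+)", text)

# (?:...) still groups the delimiter, but does not save it.
without_capture = re.match(r"(?:\$)([a-z]+)", text)

print(with_capture.groups())     # ('$', 'var')
print(without_capture.groups())  # ('var',)
```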

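- For more complex changes, override the `pattern` attribute and define an entirely new regular expression. The new pattern must still provide the same four named groups.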

In [42]:
%%writefile string_template_newsyntax.py
import re
import string


class MyTemplate(string.Template):
    delimiter = '{{'
    pattern = r'''
    \{\{(?:
    (?P<escaped>\{\{)|
    (?P<named>[_a-z][_a-z0-9]*)\}\}|
    (?P<braced>[_a-z][_a-z0-9]*)\}\}|
    (?P<invalid>)
    )
    '''


t = MyTemplate('''
{{{{
{{var}}
''')

print('MATCHES:', t.pattern.findall(t.template))
print('SUBSTITUTED:', t.safe_substitute(var='replacement'))


Writing string_template_newsyntax.py


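- The `named` and `braced` patterns are identical here, but both must still be provided. (In other words, this syntax has no separate braced form for variable names.)
- Repeating the delimiter, `{{{{`, acts as the escape sequence and produces a literal `{{`.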

In [43]:
!python3 string_template_newsyntax.py

MATCHES: [('{{', '', '', ''), ('', 'var', '', '')]
SUBSTITUTED: 
{{
replacement
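
To see how the four named groups are used in practice, the following sketch scans a template with the default `string.Template` pattern and reports which group matched each delimiter. The function `classify_placeholders` and the sample template text are hypothetical, introduced only for this illustration.

```python
import string


def classify_placeholders(template):
    """Hypothetical helper: list (group name, matched text) for each delimiter."""
    results = []
    for match in template.pattern.finditer(template.template):
        # Exactly one of escaped/named/braced/invalid is not None per match.
        for name, value in match.groupdict().items():
            if value is not None:
                results.append((name, match.group(0)))
    return results


t = string.Template("Cost: $$5 for $item (${item}s), stray $ sign")

for kind, text in classify_placeholders(t):
    print(f"{kind:8} {text!r}")
```

The stray `$` followed by a space falls into the `invalid` group, which is the case that makes `substitute()` raise `ValueError` while `safe_substitute()` leaves the text untouched.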

