In [5]:
s = 'dive into python'

In [6]:
len(s)

16

In [7]:
s[0]

'd'

In [8]:
s + '3'

'dive into python3'

> As with lists, you can use `+` to concatenate strings

## Formatting Strings

**Definition**

In [4]:
'hello world'

'hello world'

In [5]:
"hello world"

'hello world'

In [6]:
'''
hello
world
'''

'\nhello\nworld\n'

**Formatting: basics**

In [2]:
username = 'ffeenon'
password = '12345'
"{0}'s password is {1}".format(username, password)

"ffeenon's password is 12345"

**Formatting: compound field names**

In [17]:
SUFFIXES = {1000: ['KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'],
            1024: ['KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB']}
"1{0[1000][0]} is not 1{0[1024][0]}".format(SUFFIXES)

'1KB is not 1KiB'

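> Pass a list and access one of its items by index

> Pass a dictionary and access its values by key

> Pass a module and access its variables and functions with `.`

> Pass a class instance and access its attributes and methods by name

> Any combination of the above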
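The access patterns listed above can be sketched in one place. This is a minimal illustration, not part of the original notebook: `units`, `limits`, and `point` are hypothetical values invented for the example, and `SimpleNamespace` merely stands in for an arbitrary class instance.

```python
import math
from types import SimpleNamespace

units = ['KB', 'MB', 'GB']                # list: access an item by index
limits = {'max': 1024, 1000: 'decimal'}   # dict: access a value by key
point = SimpleNamespace(x=3, y=4)         # stand-in for a class instance

by_index = '{0[1]}'.format(units)
# Inside a field name, string keys are written without quotes,
# and an all-digit key is looked up as an integer.
by_key = '{0[max]} {0[1000]}'.format(limits)
by_module = '{0.pi:.2f}'.format(math)     # module attribute via `.`
by_attr = '({0.x}, {0.y})'.format(point)  # instance attributes by name
combined = '{0[origin].x}'.format({'origin': point})  # key, then attribute

print(by_index, by_key, by_module, by_attr, combined)
```

Note that a field name can only look things up; it cannot call a function or method.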

**Format specifiers**

In [23]:
'{0:.2f} {1}'.format(698.253, 'GB')

'698.25 GB'

> `f` means fixed-point notation, as opposed to exponential notation or some other decimal representation
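To make the contrast concrete, here is a small sketch that formats the same number with the fixed-point (`f`), exponential (`e`), and general (`g`) presentation types. The variable names are chosen only for this example.

```python
size = 698.253

fixed = '{0:.2f}'.format(size)      # fixed-point, two digits after the point
exponent = '{0:.2e}'.format(size)   # exponential notation, two digits
general = '{0:g}'.format(size)      # general format, up to six significant digits

print(fixed, exponent, general)     # 698.25 6.98e+02 698.253
```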

**Other common string methods**

In [3]:
s = '''Finished files are the re-
sult of years of scientif-
ic study combined with the
experience of years.'''

In [4]:
s.splitlines()

['Finished files are the re-',
 'sult of years of scientif-',
 'ic study combined with the',
 'experience of years.']

In [5]:
s.lower()

'finished files are the re-\nsult of years of scientif-\nic study combined with the\nexperience of years.'

In [6]:
s.upper()

'FINISHED FILES ARE THE RE-\nSULT OF YEARS OF SCIENTIF-\nIC STUDY COMBINED WITH THE\nEXPERIENCE OF YEARS.'

In [8]:
s.lower().count('f')

6

In [9]:
s

'Finished files are the re-\nsult of years of scientif-\nic study combined with the\nexperience of years.'

------------------

In [14]:
query = 'user=pilgrim&database=master&password=PapayaWhip'
dict([v.split('=') for v in query.split('&')])

{'user': 'pilgrim', 'database': 'master', 'password': 'PapayaWhip'}

In [15]:
a_string = 'My alphabet starts where your alphabet ends.'
a_string[3:11]

'alphabet'

In [16]:
a_string[3:-3]

'alphabet starts where your alphabet en'

In [17]:
a_string[0:2]

'My'

In [18]:
a_string[:18] 

'My alphabet starts'

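## Strings and Bytes

> Characters are an abstraction. Characters are characters, and bytes are bytes.

> An immutable sequence of Unicode characters is called a string (`str`)

> An immutable sequence of numbers between 0 and 255 is called a bytes object (`bytes`)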

In [8]:
by = b'abcd\x65'
by

b'abcde'

In [4]:
type(by)

bytes

In [5]:
len(by)

5

In [9]:
by += b'\xff'
by

b'abcde\xff'

In [10]:
len(by)

6

In [12]:
by[0]

97

> The items of a string are strings; the items of a bytes object are integers
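A short side-by-side sketch of that difference, using illustrative variable names. It also shows that slicing a `bytes` object, unlike indexing it, returns `bytes` again.

```python
text = 'abc'
data = b'abc'

text_item = text[0]      # a one-character str
data_item = data[0]      # an int in range(0, 256)
data_slice = data[0:1]   # a slice is still a bytes object

print(repr(text_item), data_item, data_slice)   # 'a' 97 b'a'

# An integer item can be turned back into a one-byte bytes object.
rebuilt = bytes([data_item])
```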

In [13]:
by[0] = 102

TypeError: 'bytes' object does not support item assignment

> To change a single byte, either use slicing and concatenation, or convert the `bytes` object into a `bytearray` object
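The slicing-and-concatenation route might look like the sketch below. `replace_byte` is a hypothetical helper written for this example, not a built-in; it builds a new object and leaves the original untouched.

```python
def replace_byte(data: bytes, index: int, value: int) -> bytes:
    """Return a copy of `data` with the byte at `index` set to `value`."""
    # bytes([value]) raises ValueError if value is outside range(0, 256).
    return data[:index] + bytes([value]) + data[index + 1:]

original = b'abcde'
changed = replace_byte(original, 0, 102)   # 102 is the code for 'f'

print(changed)    # b'fbcde'
print(original)   # b'abcde', the original object is unchanged
```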

In [18]:
by = b'abcd\x65'
by

b'abcde'

In [20]:
barr = bytearray(by)
barr

bytearray(b'abcde')

In [21]:
len(barr)

5

In [22]:
barr[0] = 100
barr

bytearray(b'dbcde')

In [23]:
barr[0] = 256
barr

ValueError: byte must be in range(0, 256)

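> A `bytearray` supports all the methods and operations of `bytes`

> With a `bytearray` you can assign individual bytes using index notation; the assigned value must be an integer between 0 and 255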

In [24]:
s = 'abcd'
by = b'e'
s + by

TypeError: can only concatenate str (not "bytes") to str

In [29]:
s = 'abcd'
by = b'a'
s.count(by)

TypeError: must be str, not bytes

> You cannot mix `bytes` and strings

In [30]:
s = 'abcd'
by = b'a'
s.count(by.decode('ascii'))

1

> This counts the occurrences of the string obtained by decoding the byte sequence with a particular encoding

**`bytes` & `string`**

In [33]:
a_string = '深入Python'

In [41]:
by = a_string.encode('utf-8')
by

b'\xe6\xb7\xb1\xe5\x85\xa5Python'

In [42]:
len(by)

12

In [43]:
by = a_string.encode('big5')
len(by)

10

In [44]:
by.decode('big5')

'深入Python'

In [45]:
by.decode('utf-8')

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb2 in position 0: invalid start byte

> A string can be encoded into `bytes`, and `bytes` can be decoded into a string
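The round trip can be sketched as follows. It assumes the `big5` codec is available, as it is in a standard CPython build; the `errors='replace'` argument is an addition for illustration and does not appear in the cells above.

```python
text = '深入Python'

# Encoding and decoding with the same codec returns the original string.
round_trip = text.encode('utf-8').decode('utf-8')

# Decoding with a different codec than the one used for encoding fails.
big5_bytes = text.encode('big5')
try:
    big5_bytes.decode('utf-8')
    mismatch_failed = False
except UnicodeDecodeError:
    mismatch_failed = True

# errors='replace' substitutes U+FFFD for undecodable bytes instead of raising.
lossy = big5_bytes.decode('utf-8', errors='replace')

print(round_trip == text, mismatch_failed)
```

The lossy result still ends in `Python`, because those bytes are plain ASCII in both encodings.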

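## Character Encoding in Python Source Code

> Python 3 assumes that source code is encoded in *UTF-8*

**To use a different encoding, put a declaration on the first line of each file**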

In [46]:
# -*- coding: windows-1252 -*-

**If the first line is a Unix-style hash-bang command, the declaration can go on the second line instead**

In [47]:
#!/usr/bin/python3
# -*- coding: windows-1252 -*-
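One way to check which encoding Python would pick for a source file is the standard library's `tokenize.detect_encoding`, which reads at most the first two lines. The sketch below feeds it in-memory bytes instead of a real file; the sample sources are made up for the example.

```python
import codecs
import io
import tokenize

with_declaration = (
    b'#!/usr/bin/python3\n'
    b'# -*- coding: windows-1252 -*-\n'
    b'print("hi")\n'
)
without_declaration = b'print("hi")\n'

# detect_encoding takes a readline callable and returns (encoding, lines_read).
declared, _ = tokenize.detect_encoding(io.BytesIO(with_declaration).readline)
default, _ = tokenize.detect_encoding(io.BytesIO(without_declaration).readline)

# Resolve the declared name to the codec's canonical name.
canonical = codecs.lookup(declared).name

print(canonical, default)
```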