# Chapter 7 String Fundamentals
*string*: an ordered collection of characters used to store and represent text- and bytes-based information.
## This Chapter's Scope
### Unicode: The Short Story
- In Python 3.X there are three string types: `str` is used for Unicode text (including ASCII), `bytes` is used for binary data (including encoded text), and `bytearray` is a mutable variant of `bytes`.
- Files work in two modes: *text*, which represents content as `str` and implements Unicode encodings, and *binary*, which deals in raw `bytes` and does no data translation.
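
The three string types and the two file modes can be seen side by side in a small sketch. The scratch file name `demo.txt` and the sample text are assumptions made only for illustration.

```python
import os
import tempfile

text = 'caf\xe9'                      # a str holding four Unicode characters
data = text.encode('utf-8')           # str -> bytes (encoded text)
buf = bytearray(data)                 # mutable copy of those bytes
buf[0] = ord('C')                     # in-place change; a bytes object would raise TypeError

# Hypothetical scratch file, used only to contrast the two file modes.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w', encoding='utf-8') as f:
    f.write(text)                     # text mode encodes the str on the way out
with open(path, 'rb') as f:
    raw = f.read()                    # binary mode returns the raw bytes
with open(path, 'r', encoding='utf-8') as f:
    decoded = f.read()                # text mode decodes back to str

print(raw)                            # b'caf\xc3\xa9'
print(decoded == text)                # True
print(bytes(buf))                     # b'Caf\xc3\xa9'
```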

## String Basics
Python strings are categorized as *immutable sequences*: their characters have a left-to-right positional order, and they cannot be changed in place.
## String Literals
- Single quotes: `'spa"m'`
- Double quotes: `"spa'm"`
- Triple quotes: `'''... spam ...'''`, `"""... spam ..."""`
- Escape sequences: `"s\tp\na\0m"`
- Raw strings: `r"C:\new\test.spm"`
- Byte literals: `b'sp\x01am'`
- Unicode literals: `u'eggs\u0020spam'`

### Single- and Double-Quoted Strings Are the Same


In [1]:
'knight"s', "knight's"

('knight"s', "knight's")

Python *automatically concatenates* adjacent string literals in any expression.

In [3]:
title = "Meaning " 'of ' "life"
title

'Meaning of life'
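
Automatic concatenation of adjacent literals also has a common pitfall: a forgotten comma in a list silently merges two items. A minimal sketch, with a made-up `colors` list:

```python
# The comma between 'green' and 'blue' is missing, so the two
# adjacent literals are concatenated into a single list item.
colors = ['red', 'green' 'blue']

print(colors)        # ['red', 'greenblue']
print(len(colors))   # 2
```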

You can also embed quote characters by escaping them with backslashes:

In [4]:
'knight\'s', "knight\"s"

("knight's", 'knight"s')

### Escape Sequences Represent Special Characters
Backslashes are used to introduce special character codings known as *escape sequences*.

In [7]:
s = 'a\nb\tc'
print(s)

a
b	c


In [8]:
len(s)

5

Python 3.X formally defines `str` strings as *sequences of Unicode code points*, not bytes.
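
The difference between code points and bytes shows up as soon as a string holds a non-ASCII character. This sketch assumes UTF-8 as the encoding; other encodings would give different byte counts.

```python
s = 'sp\xe4m'                  # four code points; \xe4 is a non-ASCII character
encoded = s.encode('utf-8')    # that character takes two bytes in UTF-8

print(len(s), len(encoded))    # 4 5
print([ord(c) for c in s])     # [115, 112, 228, 109]
```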

In [9]:
s = 'a\0\b\0c'
s

'a\x00\x08\x00c'

In [10]:
len(s)

5

In [11]:
print(s)

a c


In [12]:
s = '\001\002\x03'
s

'\x01\x02\x03'

In [13]:
print(s)




In [14]:
S = "s\tp\na\x00m"

In [15]:
S

's\tp\na\x00m'

In [16]:
len(S)

7

In [18]:
print(S)

s	p
a m


If Python does not recognize the character after a `\` as a valid escape code, it simply keeps the backslash in the resulting string (recent Python versions also emit a warning for such unrecognized escapes):

In [19]:
x = "C:\py\code"
x

'C:\\py\\code'

In [21]:
len(x)

10

### Raw Strings Suppress Escapes


In [22]:
path = r'C:\new\text.dat'
path

'C:\\new\\text.dat'

In [23]:
print(path)

C:\new\text.dat


In [24]:
len(path)

15
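
A raw string is just another way to write the same `str` object as its doubled-backslash form. One known limitation is that a raw string cannot end in a single backslash. The sketch below shows two common workarounds. The folder path is a made-up example.

```python
raw = r'C:\new\text.dat'
doubled = 'C:\\new\\text.dat'        # each \\ is one backslash character
print(raw == doubled)                # True

# r'C:\new\' is a syntax error, so add the final backslash another way.
folder_a = r'C:\new' + '\\'          # concatenate an escaped backslash
folder_b = r'C:\new\\'[:-1]          # or slice off the extra one
print(folder_a == folder_b)          # True
print(len(folder_a))                 # 7
```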

### Triple Quotes Code Multiline Block Strings

In [28]:
mantra = """Always look
 on the bright
side of life."""

In [29]:
mantra

'Always look\n on the bright\nside of life.'

In [30]:
print(mantra)

Always look
 on the bright
side of life.


In [32]:
menu = """spam #comments here added to string!
eggs
"""
menu

'spam #comments here added to string!\neggs\n'

In [35]:
menu = (
    "spam\n"  # comments here are ignored, not added to the string
    "eggs\n")
menu

'spam\neggs\n'

In [36]:
print(menu)

spam
eggs



## Strings in Action


In [37]:
len('abc')

3

In [39]:
'abc' + 'def'

'abcdef'

In [40]:
'Ni!' * 4

'Ni!Ni!Ni!Ni!'

In [41]:
print('-' * 80)

--------------------------------------------------------------------------------


In [43]:
myjob = "hacker"
for c in myjob: print(c, end=' ')

h a c k e r 

In [44]:
"k" in myjob

True

In [45]:
'z' in myjob

False

In [47]:
'spam' in 'abcspamdef'

True

### Indexing and Slicing

In [50]:
S = 'spam'
S[0], S[-2]

('s', 'a')

In [51]:
S[1:3], S[1:], S[:-1]

('pa', 'pam', 'spa')

#### Extended slicing: The third limit and slice objects

In [52]:
S = 'abcdefghijklmnop'

In [53]:
S[1:10:2]

'bdfhj'

In [55]:
S[::2]

'acegikmo'

In [56]:
"Hello"[::-1]

'olleH'

In [57]:
S = 'abcedfg'
S[5:1:-1]

'fdec'

In [58]:
S[1:5:-1]

''

A *slice object* passed as the index gives the same result as the equivalent slice syntax:

In [59]:
'spam'[1:3]

'pa'

In [60]:
'spam'[slice(1,3)]

'pa'

In [61]:
'spam'[::-1]

'maps'

In [62]:
'spam'[slice(None, None, -1)]

'maps'
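
Because a slice object is an ordinary value, it can be named and reused. The sketch below assumes a hypothetical fixed-width record layout (a 4-character code followed by a 6-character name). The names `CODE` and `NAME` are illustrative only.

```python
# Hypothetical layout: columns 0-3 hold a code, columns 4-9 hold a name.
CODE = slice(0, 4)
NAME = slice(4, 10)

record = 'A001Brian '
print(record[CODE])                  # A001
print(record[NAME].rstrip())         # Brian

# indices() resolves a slice against a given length into (start, stop, step).
print(NAME.indices(len(record)))     # (4, 10, 1)
```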

## String Conversion Tools


In [63]:
"42" + 1

TypeError: can only concatenate str (not "int") to str

In [64]:
int("42"), str(42)

(42, '42')

In [65]:
repr(42)

'42'

In [66]:
print(str('spam'), repr('spam'))

spam 'spam'


In [67]:
str('spam'), repr('spam')

('spam', "'spam'")

In [68]:
S = "42"
I = 1
S + I

TypeError: can only concatenate str (not "int") to str

In [69]:
int(S) + I

43

In [70]:
S + str(I)

'421'

In [71]:
str(3.1415), float("1.5")

('3.1415', 1.5)

In [73]:
text = "1.234E-10"
float(text)

1.234e-10
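
When the text may or may not be numeric, the conversions above can be wrapped in a small helper. `to_number` is a hypothetical function written for this note, not a built-in. It relies only on the fact that `int` and `float` raise `ValueError` for text they cannot parse.

```python
def to_number(text):
    """Return text as an int if possible, else a float, else None."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue                 # try the next converter
    return None


print(to_number('42'))               # 42
print(to_number('1.234E-10'))        # 1.234e-10
print(to_number('spam'))             # None
```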

### Character code conversion

In [74]:
ord('s')

115

In [75]:
chr(115)

's'

In [76]:
S = '5'
S = chr(ord(S)+1)

In [77]:
S

'6'

In [78]:
S = chr(ord(S) + 1)
S

'7'

In [79]:
int('5')

5

In [80]:
ord('5') - ord('0')

5

In [81]:
B = '1101'

In [82]:
I = 0
while B != '':
    I = I * 2 + (ord(B[0]) - ord('0'))
    B = B[1:]
    
I

13

In [83]:
int('1101', 2)

13

In [84]:
bin(13)

'0b1101'
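
The `ord`-based loop above generalizes to any base from 2 to 10. `digits_to_int` is a hypothetical helper for illustration. In practice the built-in `int(text, base)` already does this job.

```python
def digits_to_int(digits, base=2):
    """Convert a string of digit characters to an int, for bases 2 through 10."""
    value = 0
    for ch in digits:
        d = ord(ch) - ord('0')       # character code -> digit value
        if not 0 <= d < base:
            raise ValueError('invalid digit %r for base %d' % (ch, base))
        value = value * base + d     # shift left one place, then add the digit
    return value


print(digits_to_int('1101'))         # 13
print(digits_to_int('777', 8))       # 511
```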

## Changing Strings I


In [85]:
S = 'spam'
S[0] = 'x'

TypeError: 'str' object does not support item assignment

In [86]:
S = S + 'SPAM!'
S

'spamSPAM!'

In [87]:
S = S[:4] + 'Burger' + S[-1]

In [88]:
S

'spamBurger!'

In [89]:
S = 'splot'
S = S.replace('pl', 'pamal')

In [90]:
S

'spamalot'

In [92]:
'That is %d %s bird!' % (1, 'dead')

'That is 1 dead bird!'

In [94]:
'That is {0} {1} bird!'.format(1, 'dead')

'That is 1 dead bird!'
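
Each of the "changes" above really builds a new string object and rebinds the name. The original object is never modified. A short sketch with a second reference (`alias`, a name introduced here for illustration) makes this visible:

```python
S = 'spam'
alias = S                        # second reference to the same string object

S = S.replace('sp', 'SP')        # builds a new string and rebinds S

print(S, alias)                  # SPam spam
print(S is alias)                # False
```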

## String Methods
### Method Call Syntax

*Attribute fetches*
    
    `object.attribute`
    
*Call expressions*

    `object.method(arguments)`

In [96]:
S = 'spam'
result = S.find('pa')

In [97]:
result

1
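
A method call is these two steps combined: an attribute fetch followed by a call expression. The sketch below separates them. The names `finder` and `shout` are chosen only for illustration.

```python
S = 'spam'

finder = S.find                  # attribute fetch: a method bound to S
result = finder('pa')            # call expression on the fetched method
print(result)                    # 1

# getattr fetches the attribute by name; the trailing () calls it.
shout = getattr(S, 'upper')()
print(shout)                     # SPAM
```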

### Methods of Strings
### String Method Examples: Changing Strings II

In [106]:
S = 'spammy'

In [107]:
S = S[:3] + 'xx' + S[5:]

In [108]:
S

'spaxxy'

In [109]:
S = 'spammy'
S = S.replace('mm', 'xx')

In [110]:
S

'spaxxy'

In [111]:
'aa$bb$cc$dd'.replace("$", 'SPAM')

'aaSPAMbbSPAMccSPAMdd'

In [112]:
S = 'xxxxSPAMxxxxSPAMxxx'
where = S.find('SPAM')
where

4

In [114]:
S = S[:where] + 'EGGS' + S[where + 4:]

In [115]:
S

'xxxxEGGSxxxxSPAMxxx'

In [117]:
S = 'xxxxSPAMxxxxSPAMxxxx'
S.replace('SPAM', 'EGGS')

'xxxxEGGSxxxxEGGSxxxx'

In [119]:
S.replace('SPAM', 'EGGS', 1)

'xxxxEGGSxxxxSPAMxxxx'

In [120]:
S = 'spammy'
L = list(S)
L

['s', 'p', 'a', 'm', 'm', 'y']

In [124]:
L[3] = 'x'
L[4] = 'x'
L

['s', 'p', 'a', 'x', 'x', 'y']

In [125]:
S = ''.join(L)

In [126]:
S

'spaxxy'

In [128]:
'SPAM'.join(['eggs', 'sausage', 'ham', 'toast'])

'eggsSPAMsausageSPAMhamSPAMtoast'
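
The list-then-join pattern is handy when many positions change at once. `mask` below is a hypothetical helper that sketches the idea. It is not a string method.

```python
def mask(text, start, stop, fill='x'):
    """Return a copy of text with positions start..stop-1 replaced by fill."""
    chars = list(text)                       # explode into a mutable list
    for i in range(start, min(stop, len(chars))):
        chars[i] = fill                      # in-place change is allowed on lists
    return ''.join(chars)                    # implode back into a new string


print(mask('spammy', 3, 5))                  # spaxxy
print(mask('spam', 2, 10, '*'))              # sp**
```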

### String Method Examples: Parsing Text

In [129]:
line = 'aaa bbb ccc'
cols = line.split()
cols

['aaa', 'bbb', 'ccc']

In [130]:
line = 'bob,hacker,40'
line.split(',')

['bob', 'hacker', '40']

In [132]:
line = "i'mSPAMaSPAMlumberjack"
line.split('SPAM')

["i'm", 'a', 'lumberjack']
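
Splitting is usually followed by unpacking and type conversion. A minimal sketch using the comma-separated record from above follows. The dictionary keys are assumptions about what the fields mean.

```python
line = 'bob,hacker,40'

name, job, age = line.split(',')             # unpack the three fields
record = {'name': name, 'job': job, 'age': int(age)}
print(record)        # {'name': 'bob', 'job': 'hacker', 'age': 40}

# An optional second argument limits the number of splits.
print('a b c d'.split(' ', 1))               # ['a', 'b c d']
```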

### Other Common String Methods in Action

In [133]:
line = "The knights who say Ni!\n"
line.rstrip()

'The knights who say Ni!'

In [134]:
line.upper()

'THE KNIGHTS WHO SAY NI!\n'

In [135]:
line.isalpha()

False

In [138]:
line.endswith('Ni!\n')

True

In [140]:
line.startswith('The')

True
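
These methods chain naturally because each one returns a new string or a Boolean. The sketch below also uses the fact that `endswith` and `startswith` accept a tuple of alternatives. The sample line is an assumption for illustration.

```python
line = 'The knights who say Ni!\n'

clean = line.rstrip()                        # drop the trailing newline
print(clean.endswith(('Ni!', 'Ni')))         # True: any suffix in the tuple matches
print(clean.lower().startswith('the'))       # True: case-insensitive prefix test
print(line.isalpha(), 'Ni'.isalpha())        # False True
print(clean.split()[-1])                     # Ni!
```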

## String Formatting Expressions
Skipped for now.