<img src="http://hilpisch.com/tpq_logo.png" alt="The Python Quants" width="35%" align="right" border="0"><br>

# Python for Finance (2nd ed.)

**Mastering Data-Driven Finance**

&copy; Dr. Yves J. Hilpisch | The Python Quants GmbH

<img src="http://hilpisch.com/images/py4fi_2nd_shadow.png" width="300px" align="left">

# Data Types and Structures

## Basic Data Types

### Integers

In [2]:
a = 10
type(a)

int

In [2]:
a.bit_length()

4

`bit_length()` returns the number of bits needed to represent the integer in binary. Since 2^3 = 8 <= 10 < 16 = 2^4, the number 10 needs 4 bits.
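As a quick check of this reasoning, the sketch below compares `bit_length()` with the binary string returned by the built-in `bin()`. The variable names are illustrative and not part of the original notebook.

```python
# Compare bit_length() with the binary representation of the same integer.
a = 10
binary = bin(a)        # '0b1010': the '0b' prefix marks a binary literal
digits = binary[2:]    # drop the prefix to keep only the binary digits

print(binary)
print(len(digits))     # same value as a.bit_length()

# For a positive integer, 2 ** (n - 1) <= a < 2 ** n with n = a.bit_length().
n = a.bit_length()
print(2 ** (n - 1) <= a < 2 ** n)
```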

In [3]:
for a in range(1, 17):
    print(a, a.bit_length())

1 1
2 2
3 2
4 3
5 3
6 3
7 3
8 4
9 4
10 4
11 4
12 4
13 4
14 4
15 4
16 5


In [3]:
a = 100000
a.bit_length()

17

In [7]:
2**16

65536

In [8]:
2**17

131072

In [10]:
import numpy as np
np.log(100000) / np.log(2)

16.609640474436812

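10^5 lies between 2^16 and 2^17. The base-2 logarithm is about 16.61; the bit length is its integer part plus one (not a rounded value), which gives 17.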

In [12]:
googol = 10 ** 100
googol

10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

A googol is the mathematical name for the number 10^100.

In [13]:
googol.bit_length()

333

In [16]:
np.log(float(googol)) / np.log(2)

332.19280948873626

In [17]:
1 + 4

5

In [18]:
1 / 4

0.25

In [19]:
type(1 / 4)

float

In Python 3, division with `/` always returns a float, even when both operands are integers.
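For contrast, here is a small sketch of true division next to floor division and the remainder operator. The variable names are assumptions chosen for illustration.

```python
# True division always yields a float; floor division of two ints yields an int.
true_div = 4 / 2      # 2.0, a float even though the result is a whole number
floor_div = 7 // 2    # 3, an int (rounded down)
remainder = 7 % 2     # 1, the remainder of the floor division

print(type(true_div), true_div)
print(type(floor_div), floor_div)
print(remainder)
```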

### Floats

In [1]:
1.6 / 4

0.4

In [2]:
type(1.6 / 4)

float

In [3]:
b = 0.35
type(b)

float

In [4]:
b + 0.1

0.44999999999999996

Computers store floats in binary, so most decimal fractions cannot be represented exactly.

In [5]:
for i in range(17):
    x = i ** 0.5
    y = x ** 2
    print(i, i - y)

0 0.0
1 0.0
2 -4.440892098500626e-16
3 4.440892098500626e-16
4 0.0
5 -8.881784197001252e-16
6 8.881784197001252e-16
7 -8.881784197001252e-16
8 -1.7763568394002505e-15
9 0.0
10 -1.7763568394002505e-15
11 0.0
12 1.7763568394002505e-15
13 1.7763568394002505e-15
14 0.0
15 -1.7763568394002505e-15
16 0.0


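Each operation carries a small rounding error. These errors can add up over many operations and noticeably reduce the accuracy of the final result.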
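The accumulation effect can be seen in a minimal sketch that adds 0.1 ten times in an explicit loop. `math.fsum` and `math.isclose` are standard-library functions; the variable names are illustrative.

```python
import math

values = [0.1] * 10

# Add the values one by one; each addition may introduce a rounding error.
naive_total = 0.0
for v in values:
    naive_total += v

# math.fsum tracks partial sums to avoid losing precision along the way.
exact_total = math.fsum(values)

print(naive_total == 1.0)
print(exact_total == 1.0)
# When comparing floats, a tolerance is usually safer than ==.
print(math.isclose(naive_total, 1.0))
```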

In [6]:
c = 0.5
c.as_integer_ratio()

(1, 2)

In [7]:
b.as_integer_ratio()

(3152519739159347, 9007199254740992)

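`as_integer_ratio()` returns the float as an exact ratio of two integers. For 0.35 the denominator is 2^53, which shows that the stored binary value only approximates 0.35.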
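To make the stored value visible, the following sketch uses `fractions.Fraction` and `decimal.Decimal` from the standard library, both of which convert a float exactly. It is an illustration under those assumptions, not part of the original notebook.

```python
from decimal import Decimal
from fractions import Fraction

b = 0.35
numerator, denominator = b.as_integer_ratio()

# The denominator is a power of two because floats are stored in binary.
print(denominator == 2 ** 53)

# Fraction(b) is the exact stored value; it differs from the fraction 35/100.
print(Fraction(b) == Fraction(numerator, denominator))
print(Fraction(b) == Fraction(35, 100))

# Decimal(b) prints the stored value digit by digit.
print(Decimal(b))
```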

In [8]:
3152519739159347 / 9007199254740992

0.35

In [9]:
import decimal
from decimal import Decimal

In [10]:
decimal.getcontext()

Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[], traps=[InvalidOperation, DivisionByZero, Overflow])

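In [11]:
d = Decimal(1) / Decimal(11)
d

Decimal('0.09090909090909090909090909091')

`prec=28` means 28 significant digits, counted from the first non-zero digit.

In [12]:
decimal.getcontext().prec = 4

In [13]:
e = Decimal(1) / Decimal(11)
e

Decimal('0.09091')

In [14]:
decimal.getcontext().prec = 50

In [16]:
f = Decimal(1) / Decimal(11)
f

Decimal('0.090909090909090909090909090909090909090909090909091')

In [17]:
g = d + e + f
g

Decimal('0.27272818181818181818181818181909090909090909090909')

The first digit after the decimal point of the sum is non-zero, so the 50 significant digits are counted from the leading 2. The precision applies to the result of each arithmetic operation, not to the stored operands: `d` and `e` keep the digits they were created with, and the sum is rounded to the precision in effect when it is computed (here 50).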
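Changing the global context affects every later calculation. As an alternative, the sketch below uses `decimal.localcontext()`, which limits a precision change to a `with` block; the names `low` and `high` are illustrative.

```python
import decimal
from decimal import Decimal

# localcontext() changes the precision only inside the with block.
with decimal.localcontext() as ctx:
    ctx.prec = 4
    low = Decimal(1) / Decimal(11)

with decimal.localcontext() as ctx:
    ctx.prec = 10
    high = Decimal(1) / Decimal(11)

print(low)    # 0.09091
print(high)   # 0.09090909091
```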

### Booleans

In [18]:
import keyword

In [19]:
keyword.kwlist

['False',
 'None',
 'True',
 'and',
 'as',
 'assert',
 'async',
 'await',
 'break',
 'class',
 'continue',
 'def',
 'del',
 'elif',
 'else',
 'except',
 'finally',
 'for',
 'from',
 'global',
 'if',
 'import',
 'in',
 'is',
 'lambda',
 'nonlocal',
 'not',
 'or',
 'pass',
 'raise',
 'return',
 'try',
 'while',
 'with',
 'yield']

In [20]:
4 > 3  

True

In [21]:
type(4 > 3)

bool

In [22]:
type(False)

bool

In [23]:
4 >= 3  

True

In [24]:
4 < 3  

False

In [25]:
4 <= 3  

False

In [26]:
4 == 3  

False

In [32]:
4 != 3  

True

`<=` tests "less than or equal to", `==` tests equality, and `!=` tests inequality.

In [27]:
True and True

True

In [34]:
True and False

False

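You can think of `True` as 1 and `False` as 0: `and` then behaves like multiplication (the result is 0 or 1), and `or` behaves like a union (the result is `True` if at least one operand is `True`).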
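The analogy with 1 and 0 is more than a mnemonic, because `bool` is a subclass of `int`. A short sketch, with an illustrative list named `flags`:

```python
# bool is a subclass of int, so True and False behave like 1 and 0 in arithmetic.
flags = [4 > 3, 2 > 3, 4 != 3]     # [True, False, True]

print(isinstance(True, int))
count_true = sum(flags)            # counts the True values
product = int(True) * int(False)   # 0, mirroring `True and False`

# all() and any() generalise `and` and `or` to many operands.
print(count_true, product, all(flags), any(flags))
```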

In [30]:
False and False

False

In [31]:
True or True

True

In [32]:
True or False

True

In [33]:
False or False

False

In [34]:
not True

False

In [35]:
not False

True

In [36]:
(4 > 3) and (2 > 3)

False

In [37]:
(4 == 3) or (2 != 3)

True

In [38]:
not (4 != 4)

True

In [39]:
(not (4 != 4)) and (2 == 3)

False

In [40]:
if 4 > 3:  
    print('condition true')  

condition true


In [47]:
i = 0  
while i < 4:  
    print('condition true, i =', i)  
    i += 1  

condition true, i = 0
condition true, i = 1
condition true, i = 2
condition true, i = 3


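When `print` receives several arguments, it separates them with a single space by default.

`while` works like a repeated `if`: the body runs again and again and stops as soon as the condition is no longer met.

`i += 1` is shorthand for `i = i + 1`; likewise, `i *= 2` means `i = i * 2`.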

In [48]:
i = 1
i *= 2
i

2

In [49]:
int(True)

1

In [50]:
int(False)

0

In [51]:
float(True)

1.0

In [52]:
float(False)

0.0

In [51]:
bool(0)

False

In [52]:
bool(0.0)

False

Conversely, numbers (both integers and floats) can be converted to `bool`.

In [53]:
bool(1)

True

In [54]:
bool(10.5)

True

In [55]:
bool(-2)

True

Converting 0 to `bool` gives `False`; any non-zero number gives `True`.

### Strings

In [57]:
t = 'this is a String object'

In [58]:
t.capitalize()

'This is a string object'

`capitalize()` converts the first character to upper case and all remaining characters to lower case.

In [59]:
t.split()

['this', 'is', 'a', 'String', 'object']

`split()` breaks the string at whitespace and returns the pieces as a list.

In [62]:
t.find('String')

10

`find()` returns the zero-based index at which the substring starts, here the position of the `S` in `String`.

In [60]:
t.find('Python')

-1

If the substring is not found, `find()` returns -1.

In [63]:
t.replace(' ', '|')

'this|is|a|String|object'

In [64]:
t.replace(' ', '')

'thisisaStringobject'

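`replace()` substitutes every occurrence of the first argument with the second. The target can be a space, and replacing it with `''` removes all spaces.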

In [65]:
'httpt://www.python.org'.strip('htp:/')

'www.python.org'

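`strip()` removes characters only from the beginning and the end of a string. The argument `'htp:/'` is treated as a set of characters, so each character needs to be listed only once; characters are removed from both ends until one is reached that is not in the set.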
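Because the argument is a set of characters rather than a prefix, `strip()` can remove more than intended. The sketch below illustrates this with a hypothetical `url` variable and shows slicing as a more explicit alternative.

```python
url = 'http://www.python.org'

# strip() removes any leading/trailing characters found in the given set.
stripped = url.strip('htp:/')

# With 'w' and '.' added to the set, the leading 'www.' goes too, and so does
# the 'p' of 'python', because 'p' is also in the set.
over_stripped = url.strip('htp:/w.')

# Slicing off a known prefix is a more explicit alternative.
prefix = 'http://'
without_prefix = url[len(prefix):] if url.startswith(prefix) else url

print(stripped)
print(over_stripped)
print(without_prefix)
```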


### Excursion: Printing and String Replacements

In [66]:
print('Python for Finance')  

Python for Finance


In [67]:
print(t)  

this is a String object


In [65]:
i = 0
while i < 4:
    print(i)  
    i += 1

0
1
2
3


In [68]:
i = 0
while i < 4:
    print(i, end='|')  
    i += 1

0|1|2|3|

In [69]:
i = 0
while i < 4:
    print(i, end='')  
    i += 1

0123

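By default `print(i)` ends with a line break, so each number appears on its own line. `print(i, end='|')` keeps everything on one line with `|` after each value, and `end=''` keeps everything on one line with no separator.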
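`print` also has a `sep` parameter that controls what goes between several arguments. The sketch below writes into an `io.StringIO` buffer (an assumption made only so the result can be inspected as a string).

```python
import io

# Capture what print() writes so the effect of sep and end is visible as a string.
buffer = io.StringIO()
print(0, 1, 2, 3, sep='|', end=';', file=buffer)   # sep between arguments, end after them
print('a', 'b', file=buffer)                       # defaults: sep=' ', end='\n'
captured = buffer.getvalue()

print(repr(captured))
```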

In [81]:
'this is an integer %d' % 15

'this is an integer 15'

In [68]:
'this is an integer %4d' % 15  

'this is an integer   15'

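`%d` inserts the integer that follows `%`. `%4d` reserves a minimum width of 4 characters and pads unused positions with spaces on the left; `%04d` pads with zeros instead of spaces.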

In [69]:
'this is an integer %04d' % 15  

'this is an integer 0015'

In [75]:
'this is an integer %14d' % 15  

'this is an integer             15'

In [76]:
'this is a float %f' % 15.3456  

'this is a float 15.345600'

In [77]:
'this is a float %.2f' % 15.3456  

'this is a float 15.35'

In [82]:
'this is a float %8f' % 15.3456  

'this is a float 15.345600'

In [83]:
'this is a float %8.2f' % 15.3456  

'this is a float    15.35'

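`%8f`: when the formatted float is wider than the requested minimum width (here `15.345600` takes 9 characters), the width is ignored and the full value is printed.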

In [74]:
'this is a float %08.2f' % 15.3456  

'this is a float 00015.35'

In [84]:
'this is a float %-8.2f' % 15.3456  

'this is a float 15.35   '

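`%f` formats the float that follows `%` with six decimal places by default. `%.2f` (the same as `%0.2f`) keeps two decimal places, and `%8.2f` keeps two decimal places while reserving a minimum width of 8 characters. `%-8.2f` left-aligns the number within that width: the single space between the text and the number comes from the string itself, and the padding spaces follow the number up to the closing quote.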

In [85]:
'this is a string %s' % 'Python'  

'this is a string Python'

In [86]:
'this is a string %10s' % 'Python'  

'this is a string     Python'

`%s` inserts the string that follows `%`; `%10s` reserves a minimum width of 10 characters and right-aligns the string within it.

In [96]:
'this is an integer {:d}'.format(15)

'this is an integer 15'

In [97]:
'this is {:4d} an integer {:d}'.format(15, 16)

'this is   15 an integer 16'

In [106]:
'this is {} an integer {}'.format(15, 16)

'this is 15 an integer 16'

In [98]:
'this is an integer {:4d}'.format(15)

'this is an integer   15'

In [99]:
'this is an integer {:04d}'.format(15)

'this is an integer 0015'

In [100]:
'this is a float {:f}'.format(15.3456)

'this is a float 15.345600'

In [101]:
'this is a float {:.2f}'.format(15.3456)

'this is a float 15.35'

In [102]:
'this is a float {:8f}'.format(15.3456)

'this is a float 15.345600'

In [103]:
'this is a float {:8.2f}'.format(15.3456)

'this is a float    15.35'

In [104]:
'this is a float {:08.2f}'.format(15.3456)

'this is a float 00015.35'

In [107]:
'this is a float {:<08.2f}'.format(15.3456)

'this is a float 15.35000'

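The `<` in `{:<08.2f}` left-aligns the number, like `-` in `%-8.2f`. Because the `0` fill character is kept, the free positions on the right are filled with zeros instead of spaces, so the value reads as 15.35000; `{:<8.2f}` pads with spaces and reproduces the `%-8.2f` result.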

In [105]:
'this is a string {:s}'.format('Python')

'this is a string Python'

In [86]:
'this is a string {:10s}'.format('Python')

'this is a string Python    '

In [108]:
'this is a string {:>10s}'.format('Python')

'this is a string     Python'

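Strings are left-aligned by default, so `{:10s}` is the same as `{:<10s}`. This is the opposite of integers and floats, which are right-aligned by default. `{:>10s}` right-aligns the string.

The `format()` style works much like the `%` style, with `%` replaced by `{:}`. A bare `{}` uses the default format: Python picks the representation that matches the type of the value passed to `format()` (str, int, and float all work).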
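The same format specifications also work in f-strings (available since Python 3.6), which place the value directly inside the braces. A brief sketch with illustrative variable names:

```python
value = 15.3456
name = 'Python'

# The three styles accept the same width/precision specification.
as_percent = 'this is a float %8.2f' % value
as_format = 'this is a float {:8.2f}'.format(value)
as_fstring = f'this is a float {value:8.2f}'

print(as_percent == as_format == as_fstring)

left = f'{name:<10s}'    # left-aligned, the default for strings
right = f'{name:>10s}'   # right-aligned
print(repr(left), repr(right))
```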

In [109]:
i = 0
while i < 4:
    print('the number is %d' % i)
    i += 1

the number is 0
the number is 1
the number is 2
the number is 3


In [111]:
i = 0
while i < 4:
    print('the number is {:d}'.format(i))
    i += 1

the number is 0
the number is 1
the number is 2
the number is 3


### Excursion: Regular Expressions

In [2]:
import re

In [3]:
series = """
'01/18/2014 13:00:00', 100, '1st';
'01/18/2014 13:30:00', 110, '2nd';
'01/18/2014 14:00:00', 120, '3rd'
"""

In [4]:
dt_1 = re.compile('[0-9]+') 

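`re.compile` compiles a regular expression pattern and returns a pattern object. Compiling frequently used expressions once makes them easy to reuse and can improve efficiency.

Its `pattern` argument is the expression string; the optional `flags` argument modifies how the pattern is matched.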

In [5]:
result_1 = dt_1.findall(series)
result_1

['01',
 '18',
 '2014',
 '13',
 '00',
 '00',
 '100',
 '1',
 '01',
 '18',
 '2014',
 '13',
 '30',
 '00',
 '110',
 '2',
 '01',
 '18',
 '2014',
 '14',
 '00',
 '00',
 '120',
 '3']

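`findall` returns every non-overlapping match of the compiled pattern in `series`. `[0-9]` matches a single digit and `+` means "one or more repetitions", so `[0-9]+` matches runs of consecutive digits of any length.

The date comes back as the separate items `'01'`, `'18'`, `'2014'` because a match ends at the first character outside the class, such as the `/` separator.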

In [6]:
dt_2 = re.compile('[0-9/]+') 
result_2 = dt_2.findall(series)
result_2

['01/18/2014',
 '13',
 '00',
 '00',
 '100',
 '1',
 '01/18/2014',
 '13',
 '30',
 '00',
 '110',
 '2',
 '01/18/2014',
 '14',
 '00',
 '00',
 '120',
 '3']

With `/` added to the character class in `'[0-9/]+'`, a match no longer stops at `/`, so the complete date `'01/18/2014'` is returned.

In [7]:
dt_3 = re.compile('[0-9/a-z]+') 
result_3 = dt_3.findall(series)
result_3

['01/18/2014',
 '13',
 '00',
 '00',
 '100',
 '1st',
 '01/18/2014',
 '13',
 '30',
 '00',
 '110',
 '2nd',
 '01/18/2014',
 '14',
 '00',
 '00',
 '120',
 '3rd']

In [8]:
dt_4 = re.compile('[0-9/:]+') 
result_4 = dt_4.findall(series)
result_4

['01/18/2014',
 '13:00:00',
 '100',
 '1',
 '01/18/2014',
 '13:30:00',
 '110',
 '2',
 '01/18/2014',
 '14:00:00',
 '120',
 '3']

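With `:` added to the character class (`dt_4`), times such as `'13:00:00'` are no longer split at the colons and are returned whole.

With `a-z` added (`dt_3`), lower-case letters are matched as well, so `'3rd'` is returned whole.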

In [9]:
dt = re.compile(r"'[0-9/:\s]+'")  # datetime; raw string keeps \s intact

In [10]:
result = dt.findall(series)
result

["'01/18/2014 13:00:00'", "'01/18/2014 13:30:00'", "'01/18/2014 14:00:00'"]

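In `'[0-9/:\s]+'`, `\s` matches any whitespace character. It is needed so that `01/18/2014` and `13:00:00` are returned as one item instead of being split at the space between them. Since `\s` also matches line breaks, the expression is wrapped in single quotes: the quotes are literal characters of the pattern, so a match must start and end with a quote and contain only digits, `/`, `:`, and whitespace in between. Only the quoted date-time strings satisfy all of these conditions.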
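A possible variation, shown here only as a sketch, uses a raw string and a capture group so that `findall` returns the text between the quotes without the quotes themselves. The pattern and the name `stamps` are assumptions, not part of the original notebook.

```python
import re

series = """
'01/18/2014 13:00:00', 100, '1st';
'01/18/2014 13:30:00', 110, '2nd';
'01/18/2014 14:00:00', 120, '3rd'
"""

# The raw string (r"...") passes the backslash in \s to the regex engine unchanged.
# The parentheses form a capture group: findall returns only the captured part.
pattern = re.compile(r"'([0-9/]+\s[0-9:]+)'")
stamps = pattern.findall(series)
print(stamps)
```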

In [11]:
from datetime import datetime
pydt = datetime.strptime(result[0].replace("'", ""),
                         '%m/%d/%Y %H:%M:%S')
pydt

datetime.datetime(2014, 1, 18, 13, 0)

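`strftime` ("string format time") converts a `datetime` object into a string; `strptime` ("string parse time") parses a string into a `datetime` object.

`replace("'", "")` removes the single quotes from the string, because the format passed to `strptime` does not contain them.

`'%m/%d/%Y %H:%M:%S'` describes the layout of the string element by element: month `%m` (01) / day `%d` (18) / four-digit year `%Y` (2014), then hour `%H` (13) : minute `%M` (00) : second `%S` (00).

The directives are case-sensitive: `%m` is the zero-padded month, whereas `%M` is the minute.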

In [12]:
print(pydt)

2014-01-18 13:00:00


In [13]:
print(type(pydt))

<class 'datetime.datetime'>


In [14]:
pydt_1 = datetime.strptime(result.replace("'", ""),
                         '%m/%d/%Y %H:%M:%S')

AttributeError: 'list' object has no attribute 'replace'

`result` is a list, and a list has no `replace` method, so the conversion cannot be applied to the whole list at once.

In [15]:
for i in result:
    pydt_1 = datetime.strptime(i.replace("'", ""),'%m/%d/%Y %H:%M:%S')
    print(pydt_1)

2014-01-18 13:00:00
2014-01-18 13:30:00
2014-01-18 14:00:00


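Instead, loop over the list and apply `strptime` to each element; the parsed values can then be collected in a new list.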
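Collecting the parsed values can be done with a list comprehension. This is a minimal sketch that repeats the `result` list from above so it runs on its own; `fmt`, `parsed`, and `back` are illustrative names.

```python
from datetime import datetime

result = ["'01/18/2014 13:00:00'", "'01/18/2014 13:30:00'", "'01/18/2014 14:00:00'"]
fmt = '%m/%d/%Y %H:%M:%S'

# Parse every element and keep the datetime objects in a new list.
parsed = [datetime.strptime(item.strip("'"), fmt) for item in result]

# strftime goes the other way: datetime -> formatted string.
back = [dt.strftime('%Y-%m-%d %H:%M') for dt in parsed]

print(parsed[1])
print(back)
```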

## Basic Data Structures

### Tuples

In [232]:
t = (1, 2.5, 'data')
type(t)

tuple

In [168]:
t = 1, 2.5, 'data'
type(t)

tuple

In [169]:
t[2]

'data'

In [170]:
type(t[2])

str

In [171]:
t[2] = 'replace'

TypeError: 'tuple' object does not support item assignment

Tuples are immutable: once a tuple is created, its elements cannot be reassigned, hence the `TypeError`.

In [172]:
t.count('data')

1

In [173]:
t.index(1)

0

`count` returns how many times an element occurs in the tuple; `index` returns the position of its first occurrence.

### Lists

In [174]:
l = [1, 2.5, 'data']
l[2]

'data'

In [175]:
l = list(t)
l

[1, 2.5, 'data']

`list()` converts other iterable objects, such as a tuple, into a list.

In [176]:
type(l)

list

In [177]:
l.append((1.0, 1.5, 2.0)) 
l

[1, 2.5, 'data', (1.0, 1.5, 2.0)]

In [178]:
l.append([1.0, 1.5, 2.0]) 
l

[1, 2.5, 'data', (1.0, 1.5, 2.0), [1.0, 1.5, 2.0]]

In [179]:
l.append({'data': 3}) 
l

[1, 2.5, 'data', (1.0, 1.5, 2.0), [1.0, 1.5, 2.0], {'data': 3}]

In [180]:
l.extend((1.0, 1.5, 2.0))  
l 

[1, 2.5, 'data', (1.0, 1.5, 2.0), [1.0, 1.5, 2.0], {'data': 3}, 1.0, 1.5, 2.0]

In [181]:
l.extend({'data': 3})  
l

[1,
 2.5,
 'data',
 (1.0, 1.5, 2.0),
 [1.0, 1.5, 2.0],
 {'data': 3},
 1.0,
 1.5,
 2.0,
 'data']

In [182]:
l.insert(2, ('insert', 1))  
l

[1,
 2.5,
 ('insert', 1),
 'data',
 (1.0, 1.5, 2.0),
 [1.0, 1.5, 2.0],
 {'data': 3},
 1.0,
 1.5,
 2.0,
 'data']

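`append` attaches the object to the end of the list as a single element, whatever its type (tuple, list, or dict), and keeps it intact.

`extend` unpacks the elements of the given object and attaches them one by one to the end of the list; it accepts a tuple, list, or dict (for a dict, only the keys are added).

`insert` places the element at the given index; index 2 makes it the third element.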
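The difference between the two methods is easiest to see on a small list. The following sketch uses illustrative names (`base`, `appended`, `extended`, `from_dict`).

```python
base = [1, 2.5, 'data']

appended = list(base)
appended.append((1.0, 1.5))     # adds the tuple itself as ONE new element

extended = list(base)
extended.extend((1.0, 1.5))     # adds each element of the tuple separately

from_dict = list(base)
from_dict.extend({'key': 3})    # iterating over a dict yields its keys only

print(len(appended), appended)
print(len(extended), extended)
print(from_dict)
```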

In [183]:
l.remove('data')  
l

[1,
 2.5,
 ('insert', 1),
 (1.0, 1.5, 2.0),
 [1.0, 1.5, 2.0],
 {'data': 3},
 1.0,
 1.5,
 2.0,
 'data']

In [184]:
l.remove(1)
l

[2.5,
 ('insert', 1),
 (1.0, 1.5, 2.0),
 [1.0, 1.5, 2.0],
 {'data': 3},
 1.0,
 1.5,
 2.0,
 'data']

`remove` deletes the given element; if the element occurs more than once in the list, only the first occurrence is removed.

In [185]:
p = l.pop(3)  
print(p)
print(l)

[1.0, 1.5, 2.0]
[2.5, ('insert', 1), (1.0, 1.5, 2.0), {'data': 3}, 1.0, 1.5, 2.0, 'data']


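`p = l.pop(3)` removes the element at index 3 (the fourth element) from the list and returns it, so `p` holds that element and `l` no longer contains it. Indices refer to the current state of the list, so running the cell repeatedly removes a different element each time and `l` keeps shrinking.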

In [186]:
l[2:5]  

[(1.0, 1.5, 2.0), {'data': 3}, 1.0]

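The slice `l[2:5]` returns the third through fifth elements (indices 2, 3, and 4), three elements in total; the end index 5 is excluded.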
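A few more slice forms, sketched on an illustrative list named `letters`:

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']

middle = letters[2:5]        # indices 2, 3, 4; index 5 is excluded
first_two = letters[:2]      # an omitted start means "from the beginning"
last_two = letters[-2:]      # negative indices count from the end
every_other = letters[::2]   # the third value is the step

print(middle, first_two, last_two, every_other)
```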

### Excursion: Control Structures

In [192]:
for element in l[4: 7]:
    print(element ** 2)

1.0
2.25
4.0


In [193]:
r = range(0, 8, 1)  
r

range(0, 8)

In [194]:
type(r)

range

In [195]:
list(r)

[0, 1, 2, 3, 4, 5, 6, 7]

`range` includes the start value and excludes the stop value. The third argument in `range(0, 8, 1)` is the step, which defaults to 1.

In [198]:
for i in range(4, 7):
    print(l[i] ** 2)
# range(0, 2) is equivalent to range(0, 2, 1)

1.0
2.25
4.0


In [199]:
for i in range(1, 10):
    if i % 2 == 0:  
        print("%d is even" % i)
    elif i % 3 == 0:
        print("%d is multiple of 3" % i)
    else:
        print("%d is odd" % i)

1 is odd
2 is even
3 is multiple of 3
4 is even
5 is odd
6 is even
7 is odd
8 is even
9 is multiple of 3


`%` is the modulo operator: it returns the remainder of an integer division.

In [200]:
total = 0
while total < 100:
    total += 1
print(total)

100


In [201]:
total = 0
while total < 100:
    total += 1
    print(total)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100


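Compared with `while`, a `for` loop usually requires knowing the number of iterations in advance.

`while` works like an `if` that keeps executing until its condition fails. When `total` is 99, the condition `total < 100` still holds, so `total += 1` runs once more and the final value is 99 + 1 = 100.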

In [202]:
m = [i ** 2 for i in range(5)]
m

[0, 1, 4, 9, 16]

A list comprehension writes the loop in a single expression; it requires the `for ... in` clause.
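A comprehension can also carry a filter condition. The sketch below places an explicit loop next to the equivalent comprehension; the variable names are illustrative.

```python
# Explicit loop version.
squares_loop = []
for i in range(10):
    if i % 2 == 0:
        squares_loop.append(i ** 2)

# Equivalent list comprehension with a filter condition after the for clause.
squares_comp = [i ** 2 for i in range(10) if i % 2 == 0]

print(squares_loop == squares_comp, squares_comp)
```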

### Excursion: Functional Programming

In [203]:
def f(x):
    return x ** 2
f(2)

4

In [204]:
def even(x):
    return x % 2 == 0
even(3)

False

In [205]:
list(map(even, range(10)))

[True, False, True, False, True, False, True, False, True, False]

In [206]:
import numpy as np
print(list(map(even, range(10))))
print(list(map(np.sqrt, range(10))))

[True, False, True, False, True, False, True, False, True, False]
[0.0, 1.0, 1.4142135623730951, 1.7320508075688772, 2.0, 2.23606797749979, 2.449489742783178, 2.6457513110645907, 2.8284271247461903, 3.0]


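`map` applies a function (a user-defined one or a library function such as `np.sqrt`) to every element of an iterable. In `map(even, range(10))`, the first argument is the function and the second is the data.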

In [207]:
list(map(lambda x: x ** 2, range(10)))

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

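`lambda` defines an anonymous function; `x` is its parameter.

Combined with `lambda`, `map` can express almost any element-wise calculation on a sequence.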

In [208]:
list(filter(even, range(15)))

[0, 2, 4, 6, 8, 10, 12, 14]

In [209]:
list(filter(lambda x:x % 2 == 0, range(15)))

[0, 2, 4, 6, 8, 10, 12, 14]

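`filter` keeps the elements for which the function returns `True`. The first argument is the test function (a `lambda` works as well) and the second is the data.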
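In Python 3, `map` and `filter` return lazy iterators, which is why the cells above wrap them in `list()`. A short sketch of the two combined, with illustrative names:

```python
def even(x):
    return x % 2 == 0

# filter() selects the even numbers, map() squares them; nothing is computed yet.
lazy = map(lambda x: x ** 2, filter(even, range(10)))

via_map_filter = list(lazy)      # list() consumes the iterator
via_comprehension = [x ** 2 for x in range(10) if even(x)]
print(via_map_filter == via_comprehension)

# An iterator is exhausted after one pass, so a second list() call is empty.
second_pass = list(lazy)
print(second_pass)
```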

### Dicts

In [210]:
d = {
     'Name' : 'Angela Merkel',
     'Country' : 'Germany',
     'Profession' : 'Chancelor',
     'Age' : 64
     }
type(d)

dict

Each entry of a dict is a `key: value` pair.

In [211]:
print(d['Name'], d['Age'])

Angela Merkel 64


In [212]:
d.keys()

dict_keys(['Name', 'Country', 'Profession', 'Age'])

In [213]:
d.values()

dict_values(['Angela Merkel', 'Germany', 'Chancelor', 64])

In [214]:
d.items()

dict_items([('Name', 'Angela Merkel'), ('Country', 'Germany'), ('Profession', 'Chancelor'), ('Age', 64)])

In [215]:
print(d)

{'Name': 'Angela Merkel', 'Country': 'Germany', 'Profession': 'Chancelor', 'Age': 64}


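`keys()` returns all keys and `values()` returns all values. `items()` returns a view in which every key is paired with its value as a tuple. Printing `d` directly shows the dict itself, which differs from what `items()` returns. (Note that `keys`, `values`, and `items` are all plural.)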
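The tuples from `items()` can be unpacked directly in a loop header. The sketch below uses a shortened dict for illustration; `get()` with a default value is a standard dict method.

```python
d = {'Name': 'Angela Merkel', 'Country': 'Germany', 'Age': 64}

# items() yields (key, value) tuples that can be unpacked in the loop header.
for key, value in d.items():
    print(f'{key:<8s} -> {value}')

# The views are live: they reflect later changes to the dict.
keys = d.keys()
d['City'] = 'Berlin'
print(list(keys))

# get() returns a default instead of raising KeyError for a missing key.
missing = d.get('Party', 'n/a')
print(missing)
```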

In [216]:
birthday = True
if birthday:
    d['Age'] += 1
print(d['Age'])

65


Because `birthday` is `True`, `if birthday:` is equivalent to `if True:`, so the indented block is executed.

In [217]:
d['Age'] += 1
d['Age'] 

66

`d['Age'] += 1` performs arithmetic on the value stored under the key `'Age'` and writes the result back; the other values are unaffected.

In [218]:
for item in d.items():
    print(item)
    print(type(item))

('Name', 'Angela Merkel')
<class 'tuple'>
('Country', 'Germany')
<class 'tuple'>
('Profession', 'Chancelor')
<class 'tuple'>
('Age', 66)
<class 'tuple'>


In [219]:
for value in d.values():
    print(type(value))

<class 'str'>
<class 'str'>
<class 'str'>
<class 'int'>


### Sets

In [220]:
s = set(['u', 'd', 'ud', 'du', 'd', 'du'])
s

{'d', 'du', 'u', 'ud'}

Sets are unordered and contain no duplicates.

In [223]:
t_1 = set(['d', 'dd', 'uu', 'u'])

In [224]:
s.union(t_1)  

{'d', 'dd', 'du', 'u', 'ud', 'uu'}

In [225]:
s.intersection(t_1)  

{'d', 'u'}

In [226]:
s.difference(t_1)  

{'du', 'ud'}

In [227]:
t_1.difference(s)  

{'dd', 'uu'}

In [228]:
s.symmetric_difference(t_1)  

{'dd', 'du', 'ud', 'uu'}

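`union` returns all elements that are in either set, and `intersection` returns the elements that are in both. `s.difference(t_1)` returns what is in `s` but not in `t_1`; `t_1.difference(s)` returns what is in `t_1` but not in `s`. `s.symmetric_difference(t_1)` returns the elements that are in exactly one of the two sets.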
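Each of these methods has an operator form for the case where both operands are sets. A minimal sketch that repeats the two sets so it runs on its own:

```python
s = {'u', 'd', 'ud', 'du'}
t_1 = {'d', 'dd', 'uu', 'u'}

# Operator forms of the set methods (both operands must be sets).
same_union = (s | t_1) == s.union(t_1)
same_intersection = (s & t_1) == s.intersection(t_1)
same_difference = (s - t_1) == s.difference(t_1)
same_symmetric = (s ^ t_1) == s.symmetric_difference(t_1)

print(same_union, same_intersection, same_difference, same_symmetric)
```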

In [229]:
from random import randint
l_1 = [randint(0, 10) for i in range(1000)]  
len(l_1)  

1000

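`randint(0, 10)` returns a single random integer between 0 and 10, both endpoints included. The list comprehension calls it 1,000 times to build a list of 1,000 random integers.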

In [230]:
l_1[:20]

[0, 2, 9, 10, 2, 8, 4, 8, 6, 10, 0, 0, 10, 4, 7, 10, 1, 1, 10, 2]

In [231]:
s = set(l_1)
list(s)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

<a href="http://tpq.io" target="_blank">http://tpq.io</a> | <a href="http://twitter.com/dyjh" target="_blank">@dyjh</a> | <a href="mailto:training@tpq.io">training@tpq.io</a>