### Common Python data structures

When learning Python's data structures, a few concepts deserve particular attention:
* container
* iterable
* iterator
* generator
* list / dict / set

### Container

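A container is a data structure that groups multiple elements together. Its elements can be retrieved one at a time by iteration, and the in and not in operators test whether an element belongs to it.
Typically, all elements of a container are stored in memory. Commonly used containers include: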

* list, deque, ...
* set, frozenset, ...
* dict, OrderedDict, Counter, ...
* tuple, namedtuple
* str
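
As a rough illustration of the membership tests described above, the sketch below applies in and not in to several of the listed containers. All variable names and values are made up for this example. Note that for a dict the test looks at keys, and for a str it checks for a substring.

```python
# Membership tests on different built-in containers (illustrative values only)
numbers = [1, 2, 3]
letters = {"a", "b"}
ages = {"n1": 18, "n2": 19}
point = (3, 4)
text = "hello"

in_list = 2 in numbers                 # list: elements are compared one by one
not_in_set = "c" not in letters        # set: hash-based lookup
key_in_dict = "n1" in ages             # dict: tests keys, not values
value_in_dict = 18 in ages.values()    # values must be tested explicitly
in_tuple = 4 in point
in_str = "ell" in text                 # str: substring test

print(in_list, not_in_set, key_in_dict, value_in_dict, in_tuple, in_str)
```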

In [1]:
assert 1 in [1,2,3]

In [2]:
# assert 4 in [1,2,3], "item does not exist"  # would raise AssertionError
assert 4 not in [1,2,3]

In [3]:
# Containers from the collections module
from collections import deque, OrderedDict, Counter, namedtuple

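# frozenset: an immutable ("frozen") set with no add or remove methods; it is a built-in, not part of collections

# deque: double-ended queue
# OrderedDict: a dict that remembers insertion order
# Counter: given an iterable such as a list or tuple, returns a dict-like object with elements as keys and frequencies as values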
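
The comments above only name these containers, so here is a minimal sketch of how each one behaves. The sample values are invented for illustration.

```python
from collections import deque, OrderedDict, Counter

# frozenset is a built-in; it exposes no add/remove methods
fs = frozenset([1, 2, 2, 3])
has_add = hasattr(fs, "add")

# deque: appends and pops work at both ends
dq = deque([1, 2, 3])
dq.appendleft(0)
dq.append(4)
first = dq.popleft()

# OrderedDict: remembers insertion order and can move keys around
od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
od.move_to_end("a")
order = list(od)

# Counter: elements become keys, frequencies become values
cnt = Counter(["x", "y", "x", "z", "x"])
top = cnt.most_common(1)

print(has_add, first, list(dq), order, top)
```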

In [4]:
# namedtuple: a thin wrapper around tuple that gives each field a name

# Ordinary tuple usage
temp1 = ("qi","yue","zai","xian")
print(temp1)

# Or
temp2 = tuple(["qi","yue","zai","xian"])
print(temp2)

('qi', 'yue', 'zai', 'xian')
('qi', 'yue', 'zai', 'xian')


In [5]:
# namedtuple usage
temp3 = namedtuple("Xueyuan",["name","age","sex"])
print(temp3)

<class '__main__.Xueyuan'>


In [6]:
x1 = temp3(name="n1", age=18, sex="male")
x2 = temp3(name="n2", age=19, sex="female")

print(x1)
print(x2)

# A clearer name for the class
Xueyuan = namedtuple("Xueyuan", ["name","age","sex"])
x1 = Xueyuan(name="n1", age=18, sex="male")
x2 = Xueyuan(name="n2", age=19, sex="female")
print("\n")
print(x1)
print(x2)

Xueyuan(name='n1', age=18, sex='male')
Xueyuan(name='n2', age=19, sex='female')


Xueyuan(name='n1', age=18, sex='male')
Xueyuan(name='n2', age=19, sex='female')
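
Beyond printing, a namedtuple instance can be read by field name or by index, and it offers a few helper methods. The following is a small sketch reusing the Xueyuan definition from the cell above. The helper variable names are made up.

```python
from collections import namedtuple

Xueyuan = namedtuple("Xueyuan", ["name", "age", "sex"])
x1 = Xueyuan(name="n1", age=18, sex="male")

by_name = x1.name             # attribute access by field name
by_index = x1[1]              # still an ordinary tuple underneath
name, age, sex = x1           # unpacking works as with any tuple
as_dict = x1._asdict()        # mapping of field name -> value
older = x1._replace(age=19)   # returns a new tuple; x1 is unchanged

print(by_name, by_index, dict(as_dict), older)
```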


### Iterable

Many containers are iterable, but plenty of other objects are iterable too, such as open files, sockets, and so on. Any object that can return an iterator is called an iterable.
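
To make the point about non-container iterables concrete, the sketch below uses io.StringIO as a stand-in for an open text file (an assumption made so the example runs without touching the disk). Iterating over it yields one line at a time.

```python
import io

# io.StringIO stands in for an open text file in this sketch
fake_file = io.StringIO("first line\nsecond line\n")

it = iter(fake_file)            # a file object returns itself as its iterator
same_object = it is fake_file

# Iterating yields the content line by line
lines = [line.rstrip("\n") for line in fake_file]
print(same_object, lines)
```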

In [7]:
x = [1,3,6,14,11]

In [8]:
y = iter(x)

In [9]:
y

<list_iterator at 0x10c72fb38>

In [10]:
type(x)

list

In [11]:
type(y)

list_iterator

## **Two easily confused concepts**

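Iterator: obtained by calling the built-in iter function (iter is a function, not a keyword). Its elements can be read one at a time with next, or traversed with a for loop.

Generator: any function that contains the yield keyword is a generator function. Calling it does not run the body; it returns a generator object, which is itself an iterator.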
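
The distinction between a generator function and the object it returns can be checked directly. This is a minimal sketch; countdown is a hypothetical function written only for this illustration.

```python
import inspect

def countdown(n):
    # Hypothetical generator function: the yield makes it one
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)   # calling it returns a generator object; the body has not run yet

is_gen_func = inspect.isgeneratorfunction(countdown)  # describes the function
is_gen_obj = inspect.isgenerator(gen)                 # describes the returned object
is_own_iterator = iter(gen) is gen                    # a generator is its own iterator

values = [next(gen), next(gen), next(gen)]
print(is_gen_func, is_gen_obj, is_own_iterator, values)
```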

**Iterator example**

In [12]:
print("X:list")
for i in x:
    print(i)

# First, traverse y with a for loop
print("\nY:iterator")
for i in y:
    print(i)

X:list
1
3
6
14
11

Y:iterator
1
3
6
14
11


In [13]:
# Traverse again: y is now exhausted, so nothing is printed
for item in y:
    print(item)

In [14]:
# Calling next on the exhausted y raises StopIteration
next(y)

StopIteration: 
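
When the exception is not wanted, the built-in next accepts a second argument that is returned once the iterator is exhausted. This is a small sketch of that form; the sentinel object is just one possible choice of default.

```python
x = [1, 3]
y = iter(x)

sentinel = object()            # a unique marker that cannot collide with real data
seen = []
while True:
    item = next(y, sentinel)   # returns sentinel instead of raising StopIteration
    if item is sentinel:
        break
    seen.append(item)

after_exhaustion = next(y, None)   # still exhausted, so the default comes back
print(seen, after_exhaustion)
```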

In [15]:
# Access the elements of y with next
x = [1,2,4]
y = iter(x)
while True:
    try:
        print(next(y))
    except StopIteration:
        break

1
2
4


**Generator example**

In [16]:
# fibonacci
def fib(n):
    # Build the first n Fibonacci numbers eagerly and return them as a list
    result = [0, 1]
    for i in range(2, n):
        result.append(result[i-1] + result[i-2])
    return result[:n]  # the slice keeps the result correct for n = 0 or 1

temp = fib(7)
print(temp)

[0, 1, 1, 2, 3, 5, 8]


In [17]:
def fib_generator(n):
    x, y, counter = 0, 1, 0
    while counter < n:
        yield x
        x, y = y, x+y
        counter += 1
        
temp = fib_generator(7)
while True:
    try:
        print(next(temp))
    except StopIteration:
        break

0
1
1
2
3
5
8
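
One practical difference between the two versions is that the list holds every value at once, while the generator produces values only on request. The sketch below compares the two with sys.getsizeof. Note that getsizeof measures only the container object itself, not the integers it refers to, so the comparison is indicative rather than exact.

```python
import sys

def fib_generator(n):
    x, y, counter = 0, 1, 0
    while counter < n:
        yield x
        x, y = y, x + y
        counter += 1

as_list = list(fib_generator(1000))   # all 1000 values are held at once
as_gen = fib_generator(1000)          # nothing has been computed yet

list_size = sys.getsizeof(as_list)    # size of the list object only
gen_size = sys.getsizeof(as_gen)      # size of the generator object

first_three = [next(as_gen) for _ in range(3)]   # values appear only when requested
print(list_size > gen_size, first_three)
```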


### 总结一下

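Iterator: a stateful object that returns the next value each time next() is called on it.

Generator: a function that uses the yield keyword; calling it returns an iterator.

In addition:

Anything that can be used in a for loop is an iterable.

Anything that can be passed to the next function is an iterator. Objects such as list, dict, and str are only iterables, not iterators.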
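
These statements can be checked with the abstract base classes in collections.abc. The sketch below is one way to do it; the samples dictionary is made up for the illustration.

```python
from collections.abc import Iterable, Iterator

samples = {
    "list": [1, 2],
    "dict": {"a": 1},
    "str": "ab",
    "list_iterator": iter([1, 2]),
    "generator": (n for n in range(2)),
}

# For each sample, record (is it an iterable?, is it an iterator?)
report = {
    name: (isinstance(obj, Iterable), isinstance(obj, Iterator))
    for name, obj in samples.items()
}

for name, (is_iterable, is_iterator) in report.items():
    print(f"{name:14} iterable={is_iterable} iterator={is_iterator}")
```

One caveat: isinstance(obj, Iterable) only detects objects that define __iter__. Objects that are iterable solely through __getitem__ are not detected, so calling iter(obj) remains the more reliable test.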


### More examples: the itertools library

In [18]:
# Generate an infinite sequence
from itertools import count

counter = count(start=13)
help(count)

Help on class count in module itertools:

class count(builtins.object)
 |  count(start=0, step=1) --> count object
 |  
 |  Return a count object whose .__next__() method returns consecutive values.
 |  Equivalent to:
 |  
 |      def count(firstval=0, step=1):
 |          x = firstval
 |          while 1:
 |              yield x
 |              x += step
 |  
 |  Methods defined here:
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __iter__(self, /)
 |      Implement iter(self).
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |  
 |  __next__(self, /)
 |      Implement next(self).
 |  
 |  __reduce__(...)
 |      Return state information for pickling.
 |  
 |  __repr__(self, /)
 |      Return repr(self).



In [19]:
next(counter)

13

In [20]:
next(counter)

14

In [21]:
next(counter)

15

In [22]:
# Cycle through a finite sequence forever (the strings mean red, yellow, blue)
from itertools import cycle

colors = cycle(['红', '黄', '蓝'])

In [23]:
next(colors)

'红'

In [24]:
next(colors)

'黄'

In [25]:
next(colors)

'蓝'

In [26]:
next(colors)

'红'

In [27]:
# Take a finite slice from an infinite sequence

from itertools import cycle, islice
colors = cycle(['红', '黄', '蓝'])
limited = islice(colors, 0, 4)

for x in limited:
    print(x)

红
黄
蓝
红
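
The same islice approach can bound the infinite count iterator shown earlier. This is a brief sketch; the start and step values are arbitrary choices for the example.

```python
from itertools import count, islice

counter = count(start=13, step=2)        # 13, 15, 17, ... without end
first_five = list(islice(counter, 5))    # take only the first five values
next_value = next(counter)               # the counter resumes where islice stopped

# islice also accepts start, stop, and step, like a slice
evens = list(islice(count(0), 0, 10, 2))
print(first_five, next_value, evens)
```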


In [28]:
# An object that implements __iter__ is iterable; one that also implements __next__ is an iterator

class Fib:
    def __init__(self):
        self.prev = 0
        self.curr = 1
    
    def __iter__(self):
        return self
    
    def __next__(self):
        value = self.curr
        self.curr += self.prev
        self.prev = value
        return value

In [29]:
f = Fib()

In [30]:
list(islice(f, 0, 10))

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
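
Because Fib returns itself from __iter__, an instance keeps its state, and a second pass over f would continue where the first one stopped. If a restartable sequence is wanted, one option is to keep the iterable separate from its iterator. The sketch below shows this under that assumption; FibSequence is a hypothetical class, not part of the original notebook.

```python
from itertools import islice

class FibSequence:
    """Hypothetical iterable: every __iter__ call starts a fresh traversal."""

    def __iter__(self):
        # Writing __iter__ as a generator returns a new iterator on each call
        prev, curr = 0, 1
        while True:
            yield curr
            prev, curr = curr, prev + curr

seq = FibSequence()
first_pass = list(islice(seq, 5))
second_pass = list(islice(seq, 5))   # starts over from the beginning
print(first_pass, second_pass)
```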

In [31]:
def fib():
    prev, curr = 0,1
    while True:
        yield curr
        prev, curr = curr, curr+prev

In [32]:
f = fib()
list(islice(f, 0, 10))

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]