<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#类型转换" data-toc-modified-id="类型转换-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Type conversion</a></span></li><li><span><a href="#基本数据类型" data-toc-modified-id="基本数据类型-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Basic data types</a></span><ul class="toc-item"><li><span><a href="#list" data-toc-modified-id="list-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>list</a></span><ul class="toc-item"><li><span><a href="#tuple" data-toc-modified-id="tuple-2.1.1"><span class="toc-item-num">2.1.1&nbsp;&nbsp;</span>tuple</a></span></li></ul></li><li><span><a href="#OrderedDict" data-toc-modified-id="OrderedDict-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>OrderedDict</a></span></li></ul></li><li><span><a href="#sorted" data-toc-modified-id="sorted-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>sorted</a></span></li><li><span><a href="#迭代器和生成器" data-toc-modified-id="迭代器和生成器-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Iterators and generators</a></span><ul class="toc-item"><li><span><a href="#yield-生成器" data-toc-modified-id="yield-生成器-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Generators with yield</a></span></li><li><span><a href="#类作为可迭代对象" data-toc-modified-id="类作为可迭代对象-4.2"><span class="toc-item-num">4.2&nbsp;&nbsp;</span>Making a class iterable</a></span></li></ul></li><li><span><a href="#变长函数" data-toc-modified-id="变长函数-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Variadic functions</a></span></li><li><span><a href="#内置高阶函数" data-toc-modified-id="内置高阶函数-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Built-in higher-order functions</a></span></li><li><span><a href="#目录" data-toc-modified-id="目录-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Directories</a></span></li><li><span><a href="#正则表达式" data-toc-modified-id="正则表达式-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>Regular expressions</a></span></li><li><span><a href="#cheatsheet" data-toc-modified-id="cheatsheet-9"><span class="toc-item-num">9&nbsp;&nbsp;</span>cheatsheet</a></span></li></ul></div>

## Type conversion

In [1]:
a = 1.23

In [2]:
int(a)

1

In [3]:
int(2.67)

2

Note that `int()` does not round: whether the fractional part is above or below 0.5, the conversion simply keeps the integer part.
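To make the truncation behaviour concrete, here is a small sketch that compares `int()` with `math.floor()` and `round()` on a positive and a negative value. The sample values are illustrative and not taken from the cells above.

```python
import math

values = [2.67, -2.67]

# int() drops the fractional part, i.e. it truncates toward zero
truncated = [int(v) for v in values]          # [2, -2]

# math.floor() always rounds down, so it differs from int() for negatives
floored = [math.floor(v) for v in values]     # [2, -3]

# round() goes to the nearest integer; exact ties go to the even neighbour
rounded = [round(v) for v in values]          # [3, -3]
ties = [round(0.5), round(1.5), round(2.5)]   # [0, 2, 2]

print(truncated, floored, rounded, ties)
```

The tie-breaking rule is worth remembering: `round(2.5)` gives 2, not 3.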

In [10]:
bin(int(a))

'0b1'

In [11]:
oct(int(a))

'0o1'

In [12]:
hex(int(a))

'0x1'

In [13]:
round(a)

1

In [15]:
round(a, 1)

1.2

## Basic data types

### list

In [31]:
a = [1, 2, 3, 'name', 3, 4, 5]

In [32]:
a.remove(3)
a

[1, 2, 'name', 3, 4, 5]

In [33]:
try:
    a.remove(5)
except ValueError:
    print('[Error] tried to remove a non-existent element')

In [34]:
a.pop()

4

In [35]:
a

[1, 2, 'name', 3]

In [36]:
a.pop(1)

2

In [37]:
a

[1, 'name', 3]

In [38]:
a.remove('name')

In [39]:
a

[1, 3]

In [40]:
del a[1]

In [41]:
a

[1]
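The cells above mix three ways of deleting from a list. As a compact recap, the sketch below runs them side by side on a fresh list; the variable names and the `'missing'` value are made up for illustration.

```python
items = [1, 2, 3, 'name', 3, 4, 5]

# remove(value): deletes the first matching value and returns None
items.remove(3)            # the second 3 stays in the list

# pop(): removes and returns the last element
last = items.pop()

# pop(index): removes and returns the element at that index
second = items.pop(1)

# del items[index]: deletes by index without returning anything
del items[0]

# Removing a value that is not present raises ValueError
try:
    items.remove('missing')
    removed = True
except ValueError:
    removed = False

print(items, last, second, removed)
```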

#### tuple

In [49]:
t = (1, 2, 3)

In [50]:
t[1] = 4

TypeError: 'tuple' object does not support item assignment

In [51]:
# A trailing comma is required when
# the tuple has only one element.
t = (3,)

In [52]:
t

(3,)

### OrderedDict

In [54]:
from collections import OrderedDict

In [57]:
d = OrderedDict({'jack': 90, 'mark': 100, 'sally': '98', 'jane': 78})

In [58]:
d

OrderedDict([('jack', 90), ('mark', 100), ('sally', '98'), ('jane', 78)])

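The repr shows the contents of an OrderedDict as a list of `(key, value)` tuples in insertion order. This is only how the object is displayed; it does not mean the data is stored internally as a list of tuples.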
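Since plain `dict` also preserves insertion order from Python 3.7 on, it may help to see what `OrderedDict` still adds. The following is a minimal sketch with made-up data showing reordering, FIFO-style removal, and order-sensitive equality.

```python
from collections import OrderedDict

scores = OrderedDict([('jack', 90), ('mark', 100), ('jane', 78)])

# move_to_end reorders an existing key (last=False would move it to the front)
scores.move_to_end('jack')
order_after_move = list(scores)            # ['mark', 'jane', 'jack']

# popitem(last=False) removes and returns the oldest entry
first_key, first_value = scores.popitem(last=False)

# Equality between two OrderedDicts is order-sensitive; plain dicts ignore order
same_order = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)   # False
plain_equal = dict(a=1, b=2) == dict(b=2, a=1)                # True

print(order_after_move, first_key, first_value, same_order, plain_equal)
```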

## sorted

Sorting a dictionary

In [42]:
b = {'jack': 90, 'mark': 100, 'sally': '98', 'jane': 78}

In [44]:
# By default, sorted() on a dict sorts its keys
sorted(b)

['jack', 'jane', 'mark', 'sally']

In [47]:
# Sort the dict's (key, value) items by key
sorted(b.items(), key=lambda x: x[0])

[('jack', 90), ('jane', 78), ('mark', 100), ('sally', '98')]

In [61]:
sorted(d)

['jack', 'jane', 'mark', 'sally']

In [62]:
sorted(d.items())

[('jack', 90), ('jane', 78), ('mark', 100), ('sally', '98')]
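The examples above all sort by key. Sorting by value needs a little care with this particular dictionary, because `'sally'` maps to the string `'98'` while the other values are integers, and Python 3 refuses to compare `int` with `str`. One possible approach, assuming every value can be converted with `int()`, is sketched here.

```python
b = {'jack': 90, 'mark': 100, 'sally': '98', 'jane': 78}

# Normalise the mixed int/str values inside the key function,
# then sort from highest to lowest score
by_score = sorted(b.items(), key=lambda kv: int(kv[1]), reverse=True)
print(by_score)

# Rebuild a dict in that order (dicts keep insertion order since Python 3.7)
ranked = dict(by_score)
print(list(ranked))
```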

## Iterators and generators

References
+ https://blog.csdn.net/mieleizhi0522/article/details/82142856
+ https://luozhaoyu.iteye.com/blog/1513198

In [7]:
l = [1, 2, 3]

In [8]:
l = iter(l)

In [9]:
next(l)

1

In [11]:
next(l)

2

In [23]:
a = (x for x in range(10))
a

<generator object <genexpr> at 0x104642e58>

In [24]:
next(a)

0

In [25]:
next(a)

1

In [26]:
for x in a:
    print(x)

2
3
4
5
6
7
8
9


Because `next` has already been called twice on the generator `a`, the first two values are consumed, so the `for` loop resumes from 2.
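The same one-pass behaviour means a generator cannot be replayed once it is exhausted. A short sketch, using an illustrative generator of squares:

```python
squares = (x * x for x in range(4))

first = next(squares)      # consumes 0
rest = list(squares)       # consumes everything that is left
again = list(squares)      # the generator is exhausted, so this is empty

# next() with a default value avoids StopIteration on an exhausted generator
fallback = next(squares, None)

print(first, rest, again, fallback)
```

If the values are needed more than once, store them in a list or create a new generator.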

### Generators with yield

In [27]:
def fun(n):
    for x in range(n):
        yield x        

In [14]:
fun(2)

<generator object fun at 0x104642de0>

In [28]:
for x in fun(2):
    print(x)

0
1


In [71]:
def foo():
    print('starting...')
    while True:
        res = yield 4
        print('res = ', res)

In [73]:
g = foo()
g

<generator object foo at 0x10d71a7c8>

In [74]:
next(g)

starting...


4

In [75]:
next(g)

res =  None


4

In [76]:
next(g)

res =  None


4

In [77]:
g.send(7)

res =  7


4

In [78]:
next(g)

res =  None


4

In [80]:
g.send(10)

res =  10


4

In [81]:
next(g)

res =  None


4

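+ Because `foo` contains a `yield` statement, it is a generator function: calling it does not run the body but returns a generator, so `g` is a generator (which is also an iterator).
+ The first `next(g)` starts executing the body. It prints `starting...`, reaches `yield 4`, returns 4, and pauses there.
+ On the following `next(g)`, execution resumes at the paused `yield`. Since nothing was sent in, the `yield` expression evaluates to `None`, so `res` is `None`.
+ Every later plain `next(g)` behaves the same way: it resumes, prints `res =  None`, runs to the next `yield 4`, and returns 4.
+ `g.send(7)` also resumes the generator, but makes the paused `yield` expression evaluate to 7, so `res` becomes 7.
+ Finally, calling `next(g)` again resumes from the last `yield 4`. `res` is `None` again, not 7, because a sent value is only used for that single resumption.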
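A slightly more practical use of `send` is a generator that keeps state between calls. The sketch below is an illustrative running-average coroutine (the function name is made up); it also shows why the generator must be primed with one `next` before the first `send`.

```python
def running_average():
    """Yield the average of all values sent in so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        # The argument of send() becomes the value of this yield expression
        value = yield average
        if value is None:
            # A plain next() resumes with None; ignore it
            continue
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)               # prime: run up to the first yield
r1 = avg.send(10)       # 10.0
r2 = avg.send(20)       # 15.0
r3 = next(avg)          # 15.0, nothing new was sent
print(r1, r2, r3)
```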

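### Making a class iterable

For instances of a class to work as an iterator, the class must implement both `__iter__` and `__next__` (an object that is merely iterable only needs `__iter__`).
+ First, `__iter__` must return an iterator object.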

In [116]:
help(iter)

Help on built-in function iter in module builtins:

iter(...)
    iter(iterable) -> iterator
    iter(callable, sentinel) -> iterator
    
    Get an iterator from an object.  In the first form, the argument must
    supply its own iterator, or be a sequence.
    In the second form, the callable is called until it returns the sentinel.



In [212]:
class CounterCall(object):
    def __init__(self):
        self.__counter = 0
    
    # Implement __call__ so instances are callable, similar to overloading operator() in C++
    # iter(callable, sentinel) -> iterator
    # the callable is called until it returns the sentinel
    def __call__(self):
        print('invoke __call__')
        self.__counter += 1
        return self.__counter
    
    def get_counter(self):
        return self.__counter

In [213]:
c = CounterCall()
c.get_counter()

0

In [215]:
ci = iter(c, 4)
for x in ci:
    print('*' * 10)
    print(x)
    print('-' * 10)

invoke __call__
**********
1
----------
invoke __call__
**********
2
----------
invoke __call__
**********
3
----------
invoke __call__


In [216]:
c.get_counter()

4
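The two-argument form `iter(callable, sentinel)` is often used to read a stream in fixed-size chunks. The following sketch uses an in-memory `io.BytesIO` as a stand-in for a file; the chunk size of 4 is arbitrary.

```python
import io
from functools import partial

stream = io.BytesIO(b'abcdefghij')

# partial(stream.read, 4) is a zero-argument callable; iteration stops
# as soon as it returns the sentinel b'' (end of stream)
chunks = list(iter(partial(stream.read, 4), b''))
print(chunks)   # [b'abcd', b'efgh', b'ij']
```

As in the `CounterCall` example, the call that returns the sentinel still happens; its result is just not yielded.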

In [237]:
class CounterIter(object):
    def __init__(self):
        self.__counter = 0
    
    # iter(iterable) -> iterator
    # the argument must supply its own iterator, or be a sequence
    def __iter__(self):
        print('invoke __iter__')
        return self
    
    # Implement __next__ so the object can be iterated
    def __next__(self):
        print('invoke __next__')
        if self.__counter >= 5:
            raise StopIteration
        else:
            self.__counter += 1
            return self.__counter
    
    def get_counter(self):
        return self.__counter

In [238]:
cc = CounterIter()
cc.get_counter()

0

In [239]:
cci = iter(cc)
for x in cci:
    print('*' * 10)
    print('x:', x)
    print('-' * 10)

invoke __iter__
invoke __iter__
invoke __next__
**********
x: 1
----------
invoke __next__
**********
x: 2
----------
invoke __next__
**********
x: 3
----------
invoke __next__
**********
x: 4
----------
invoke __next__
**********
x: 5
----------
invoke __next__


**A `__next__` method that uses yield**

In [158]:
class CounterYield(object):
    def __iter__(self):
        self.__counter = 1
        # Return self as the iterator object
        return self
    
    def __next__(self):
        while True:
            print('before yield')
            yield self.__counter
            print('after yield')
            self.__counter += 1
            print('counter add one: ', self.__counter)
    
    def get_counter(self):
        return self.__counter

In [159]:
c = CounterYield()
c = iter(c)

In [96]:
# Because c's `__next__` contains a yield statement,
# calling next(c) returns a generator instead of a value
next(c)

<generator object CounterYield.__next__ at 0x10d71a9a8>

In [97]:
print(c.get_counter())
# 1. next(c) first returns a generator
# 2. The outer next then runs the body of __next__ up to yield, which produces 1
next(next(c))

1
before yield


1

In [98]:
print(c.get_counter())
# Next, next(c) returns another, brand-new generator
# The outer next runs the body of __next__ again, starting from the top
next(next(c))

1
before yield


1
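The cells above show that putting `yield` inside `__next__` makes every `next(c)` return a fresh generator, so the counter never advances. A more common pattern, sketched here under the assumption that a simple bounded counter is wanted, is to put `yield` in `__iter__` instead. The class name `CounterGen` and the `limit` parameter are illustrative.

```python
class CounterGen:
    """Iterable counter that yields 1..limit."""

    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        # __iter__ is a generator function: each call returns a new iterator,
        # and Python supplies __next__ and StopIteration for it
        counter = 0
        while counter < self.limit:
            counter += 1
            yield counter

c = CounterGen(3)
first_pass = list(c)
second_pass = list(c)    # works again because each loop gets a new generator
print(first_pass, second_pass)
```

Unlike `CounterIter`, which returns `self` and can only be traversed once, this object can be looped over repeatedly.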

## Variadic functions

In [106]:
# *args: collects the extra positional arguments into a tuple
# **kargs: collects the extra keyword arguments into a dict
def func(a, *args, **kargs):
    print('a = ', a)
    print('args = ', args)
    print('kargs = ', kargs)

In [98]:
func(1)

a =  1
args =  ()
kargs =  {}


In [100]:
func(1, 2, 3, 4)

a =  1
args =  (2, 3, 4)
kargs =  {}


In [101]:
func(1, 2, 3, name='test', age=23)

a =  1
args =  (2, 3)
kargs =  {'name': 'test', 'age': 23}


In [108]:
# !!! A positional argument cannot follow keyword arguments
func(1, 2, name='test', age=23, 3)

SyntaxError: positional argument follows keyword argument (<ipython-input-108-885397e5e4cf>, line 2)

In [111]:
# Nothing may follow **kargs, so a default parameter cannot be placed after it
def func(a, *args, **kargs, b=5):
    print('a = ', a)
    print('args = ', args)
    print('kargs = ', kargs)

SyntaxError: invalid syntax (<ipython-input-111-ca9d42338de6>, line 2)

In [112]:
def func(a, b=5, *args, **kargs):
    print('a = ', a)
    print('b = ', b)
    print('args = ', args)
    print('kargs = ', kargs)

In [113]:
func(1)

a =  1
b =  5
args =  ()
kargs =  {}


In [114]:
func(1, 2, 3, name='test', age=23)

a =  1
b =  2
args =  (3,)
kargs =  {'name': 'test', 'age': 23}
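A parameter with a default may still appear after `*args`, as long as it comes before `**kargs`; it then becomes keyword-only. The sketch below redefines `func` in that style (returning the values instead of printing them, purely for illustration) and also shows `*` and `**` unpacking at the call site.

```python
def func(a, *args, b=5, **kargs):
    # b comes after *args, so it can only be passed by keyword
    return a, args, b, kargs

r1 = func(1, 2, 3)                     # (1, (2, 3), 5, {})
r2 = func(1, 2, 3, b=9, name='test')   # (1, (2, 3), 9, {'name': 'test'})

# * and ** unpack a sequence and a mapping into arguments
positional = [1, 2, 3]
named = {'b': 7, 'age': 23}
r3 = func(*positional, **named)        # (1, (2, 3), 7, {'age': 23})

print(r1, r2, r3)
```

Compare this with `def func(a, b=5, *args, **kargs)` above, where the second positional argument is bound to `b`.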


## Built-in higher-order functions

In [121]:
l = [9, 0, -2, 5]

In [122]:
res = filter(lambda x: x >= 0, l)

In [123]:
res

<filter at 0x1054bd8d0>

In [124]:
list(res)

[9, 0, 5]

In [125]:
res = map(lambda x: x * x, l)

In [126]:
res

<map at 0x10505ecc0>

In [127]:
list(res)

[81, 0, 4, 25]

In [128]:
from functools import reduce

In [130]:
res = reduce(lambda a, b: a + b, l)

In [131]:
res

12
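Two details of these functions are easy to miss: `filter` and `map` return one-shot iterators, and `reduce` accepts an optional initial value. A brief sketch using the same list:

```python
from functools import reduce

l = [9, 0, -2, 5]

# map/filter objects are iterators: a second pass yields nothing
squares = map(lambda x: x * x, l)
first_pass = list(squares)     # [81, 0, 4, 25]
second_pass = list(squares)    # []

# The third argument of reduce is the starting accumulator; without it,
# reducing an empty sequence raises TypeError
total = reduce(lambda a, b: a + b, l, 0)          # 12
empty_total = reduce(lambda a, b: a + b, [], 0)   # 0

# A list comprehension can express filter + map in one step
non_negative_squares = [x * x for x in l if x >= 0]   # [81, 0, 25]

print(first_pass, second_pass, total, empty_total, non_negative_squares)
```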

## Directories

In [24]:
import os

In [25]:
dir(os)

['CLD_CONTINUED',
 'CLD_DUMPED',
 'CLD_EXITED',
 'CLD_TRAPPED',
 'DirEntry',
 'EX_CANTCREAT',
 'EX_CONFIG',
 'EX_DATAERR',
 'EX_IOERR',
 'EX_NOHOST',
 'EX_NOINPUT',
 'EX_NOPERM',
 'EX_NOUSER',
 'EX_OK',
 'EX_OSERR',
 'EX_OSFILE',
 'EX_PROTOCOL',
 'EX_SOFTWARE',
 'EX_TEMPFAIL',
 'EX_UNAVAILABLE',
 'EX_USAGE',
 'F_LOCK',
 'F_OK',
 'F_TEST',
 'F_TLOCK',
 'F_ULOCK',
 'MutableMapping',
 'NGROUPS_MAX',
 'O_ACCMODE',
 'O_APPEND',
 'O_ASYNC',
 'O_CLOEXEC',
 'O_CREAT',
 'O_DIRECTORY',
 'O_DSYNC',
 'O_EXCL',
 'O_EXLOCK',
 'O_NDELAY',
 'O_NOCTTY',
 'O_NOFOLLOW',
 'O_NONBLOCK',
 'O_RDONLY',
 'O_RDWR',
 'O_SHLOCK',
 'O_SYNC',
 'O_TRUNC',
 'O_WRONLY',
 'PRIO_PGRP',
 'PRIO_PROCESS',
 'PRIO_USER',
 'P_ALL',
 'P_NOWAIT',
 'P_NOWAITO',
 'P_PGID',
 'P_PID',
 'P_WAIT',
 'PathLike',
 'RTLD_GLOBAL',
 'RTLD_LAZY',
 'RTLD_LOCAL',
 'RTLD_NODELETE',
 'RTLD_NOLOAD',
 'RTLD_NOW',
 'R_OK',
 'SCHED_FIFO',
 'SCHED_OTHER',
 'SCHED_RR',
 'SEEK_CUR',
 'SEEK_END',
 'SEEK_SET',
 'ST_NOSUID',
 'ST_RDONLY',
 'TMP_MAX',
 'W

In [60]:
path = '/Users/yangqj/Documents/Documents/知识就是力量/OnlineCourse/MachineLearningCamp/Prepare'

In [61]:
os.listdir(path)

['.DS_Store', '.ipynb_checkpoints', 'Probability.ipynb', 'data']

In [63]:
files = [f for f in os.listdir(path) if os.path.isfile(os.path.join(path, f))]
files

['.DS_Store', 'Probability.ipynb']

In [64]:
dirs = [d for d in os.listdir(path) if os.path.isdir(os.path.join(path, d))]
dirs

['.ipynb_checkpoints', 'data']

In [54]:
for dirpath, dirnames, filenames in os.walk(path):
    print(dirpath, dirnames, filenames)

/Users/yangqj/Documents/Documents/知识就是力量/OnlineCourse/MachineLearningCamp/Prepare/ ['.ipynb_checkpoints', 'data'] ['.DS_Store', 'Probability.ipynb']
/Users/yangqj/Documents/Documents/知识就是力量/OnlineCourse/MachineLearningCamp/Prepare/.ipynb_checkpoints [] ['Probability-checkpoint.ipynb']
/Users/yangqj/Documents/Documents/知识就是力量/OnlineCourse/MachineLearningCamp/Prepare/data ['ml-statistics-quartile'] ['.DS_Store', 'ml-statistics-quartile.zip']
/Users/yangqj/Documents/Documents/知识就是力量/OnlineCourse/MachineLearningCamp/Prepare/data/ml-statistics-quartile ['.ipynb_checkpoints', '.git', 'data'] ['.DS_Store', '作业：历届世界杯主客场得分箱线图分析.ipynb']
/Users/yangqj/Documents/Documents/知识就是力量/OnlineCourse/MachineLearningCamp/Prepare/data/ml-statistics-quartile/.ipynb_checkpoints [] ['作业：历届世界杯主客场得分箱线图分析-checkpoint.ipynb']
/Users/yangqj/Documents/Documents/知识就是力量/OnlineCourse/MachineLearningCamp/Prepare/data/ml-statistics-quartile/.git ['objects', 'info', 'logs', 'hooks', 'refs'] ['.DS_Store', 'ORIG_HEAD', 'confi

In [65]:
os.getcwd()

'/Users/yangqj/Documents/Documents/知识就是力量/OnlineCourse/MachineLearningCamp/Notes'

In [66]:
os.path.abspath('.')

'/Users/yangqj/Documents/Documents/知识就是力量/OnlineCourse/MachineLearningCamp/Notes'

In [67]:
os.path.split(path)

('/Users/yangqj/Documents/Documents/知识就是力量/OnlineCourse/MachineLearningCamp',
 'Prepare')

In [68]:
os.path.splitext('Python基础.ipynb')

('Python基础', '.ipynb')
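The cells above depend on a directory that exists only on the author's machine. For a version that can be run anywhere, the following sketch builds a small throwaway tree with `tempfile` and applies the same `os` functions to it. All file and directory names here are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny tree: root/data/sub plus two empty files
    os.makedirs(os.path.join(root, 'data', 'sub'))
    for rel in ('notes.ipynb', os.path.join('data', 'raw.csv')):
        with open(os.path.join(root, rel), 'w') as fh:
            fh.write('')

    # os.scandir yields DirEntry objects, so no extra isfile/isdir call per name
    with os.scandir(root) as entries:
        files = sorted(e.name for e in entries if e.is_file())

    # Walk the whole tree and collect every file as a path relative to root
    all_files = sorted(
        os.path.relpath(os.path.join(dirpath, name), root)
        for dirpath, dirnames, filenames in os.walk(root)
        for name in filenames
    )

print(files)
print(all_files)
```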

## Regular expressions

Runoob tutorial (菜鸟教程): [https://www.runoob.com/regexp/regexp-metachar.html](https://www.runoob.com/regexp/regexp-metachar.html)

Practice site: [https://regexr.com/](https://regexr.com/)

In [3]:
import re

In [6]:
text = "industry industries industr123"

In [7]:
re.findall(r"industr(?:y|ies)", text)

['industry', 'industries']

In [10]:
# Positive lookahead: match the industr in industry / industries
re.findall(r"industr(?=y|ies)", text)

['industr', 'industr']

In [11]:
# Negative lookahead: match the industr in industr123
re.findall(r"industr(?!y|ies)", text)

['industr']

In [15]:
# Positive lookbehind: find industr in industry / industries
# and return the suffix that follows it
re.findall(r"(?<=industr)(?:y|ies)", text)

['y', 'ies']

In [23]:
# Match the text between two given strings
re.findall(r"(?<=industry )[\w]+(?= industr123)", text)

['industries']
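One pitfall when combining a lookbehind with alternation: `|` has the lowest precedence, so without a group the lookbehind only guards the first alternative. The sketch below demonstrates this on an illustrative string that contains a stray `ies`.

```python
import re

sample = "ties industry"

# Ungrouped: parsed as `(?<=industr)y` OR `ies`, so the "ies" in "ties" matches too
loose = re.findall(r"(?<=industr)y|ies", sample)

# Grouped: the lookbehind applies to both alternatives
strict = re.findall(r"(?<=industr)(?:y|ies)", sample)

print(loose)    # ['ies', 'y']
print(strict)   # ['y']
```

Using the non-capturing group `(?:...)` keeps `findall` returning the whole match rather than the group contents.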

## cheatsheet

https://www.pythonsheets.com/

https://perso.limsi.fr/pointal/_media/python:cours:mementopython3-english.pdf