# Python Advanced

![python](https://upload.wikimedia.org/wikipedia/commons/c/c3/Python-logo-notext.svg)

## generator

### description
* Generator functions allow you to declare a function that behaves like an iterator, i.e. it can be used in a for loop.

In [None]:
# Build and return a list
#from guppy import hpy
import psutil
import os
def firstn(n):
    num, nums = 0, []
    while num < n:
        nums.append(num)
        num += 1
    return nums
sum_of_first_n = sum(firstn(1000000))
# info = psutil.virtual_memory()
# #print u'memory in use:',psutil.Process(os.getpid()).memory_info().rss
# print u'total memory:',info.total
# print u'memory usage (%):',info.percent
# print u'cpu count:',psutil.cpu_count()#

### refactor with generator
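* Building the whole list first costs memory proportional to `n`, so we resort to the generator pattern. The following class implements the pattern as an iterator object that produces one value at a time.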

In [None]:
# Using the generator pattern (an iterable)
class firstn(object):
    def __init__(self, n):
        self.n = n
        self.num = 0  # only the current position is stored, never a list

    def __iter__(self):
        return self

    def __next__(self):
        return self.next()

    def next(self):
        if self.num < self.n:
            cur, self.num = self.num, self.num+1
            return cur
        else:
            raise StopIteration()
#sum_of_first_n = sum(firstn(1000000))
#print sum_of_first_n
#num = firstn(5)
# print next(num)
# print next(num)
# print next(num)
# print next(num)
# print next(num)

In [None]:
# a generator that yields items instead of returning a list
def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1
sum_of_first_n = sum(firstn(1000000))
print sum_of_first_n
# num = firstn(5)
# print next(num)
# print next(num)
# print next(num)
# print next(num)
# print next(num)

### Generator expressions 
* Generator expressions provide an additional shortcut to build generators out of expressions similar to that of list comprehensions.

* In fact, we can turn a list comprehension into a generator expression by replacing the square brackets ("[ ]") with parentheses. Alternately, we can think of list comprehensions as generator expressions wrapped in a list constructor. 

In [None]:
doubles_list = [2 * n for n in range(50)]
print '----', doubles_list

doubles_generator = (2 * n for n in range(50))
print '++++', doubles_generator

### range vs xrange

The performance improvement from the use of generators is the result of the lazy (on demand) generation of values, which translates to lower memory usage. Furthermore, we do not need to wait until all the elements have been generated before we start to use them. This is similar to the benefits provided by iterators, but the generator makes building iterators easy.
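
To make the memory claim concrete, here is a minimal sketch (written for Python 3; the size `n` and the variable names are illustrative assumptions) that compares the footprint of a fully built list with that of an equivalent generator expression. `sys.getsizeof` reports only the container itself, not the integers it refers to, so the real gap is larger than these numbers suggest.

```python
import sys

n = 100000  # illustrative size, not taken from the text above

as_list = [2 * i for i in range(n)]   # every element is built up front
as_gen = (2 * i for i in range(n))    # nothing is computed yet

list_size = sys.getsizeof(as_list)    # grows with n
gen_size = sys.getsizeof(as_gen)      # small and independent of n

first = next(as_gen)                  # values are produced on demand
print(list_size, gen_size, first)
```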


In [None]:
sum_of_first_n = sum(range(1000000))

sum_of_first_m = sum(xrange(1000000))

### iter function

In [None]:
my_string = "hello, world"
my_iter = iter(my_string)
print next(my_iter)
print next(my_iter)

my_list = [1, 2, 3, 4]
try:
    next(my_list)
except Exception, e:
    print 'exception:', e
my_iter = iter(my_list)
next(my_iter)


### practice

- implement the Fibonacci sequence with generators

In [None]:
# General implementation 
def fibon(n):
    a = b = 1
    result = []
    for i in range(n):
        result.append(a)
        a, b = b, a + b
    return result
for x in fibon(10):
    print(x)

In [None]:
# generator version
def fibon(n):
    a = b = 1
    for i in range(n):
        yield a
        a, b = b, a + b
for x in fibon(10):
    print(x)

## contextmanager

### introduction

- What is a context manager? A context manager sets something up when a block begins and tears it down when the block ends, automatically. For example, you might want to open a file, write something, and close it. This is perhaps the most classic example of a context manager. In fact, the file object returned by `open` is itself a context manager, so using it in a `with` statement closes the file for you.

- Python 2.5 introduced the `with` statement (through `from __future__ import with_statement`; it is always enabled from Python 2.6 on). The `with` statement lets developers use and create context managers.

In [None]:
# old way before python2.5
f_obj = open("test/test.txt","w")
try:
    f_obj.write("hello")
finally:
    f_obj.close()

In [None]:
# use context manager
with open("test/test.txt","w") as f_obj:
    f_obj.write("hello")

### create context manager

#### use `__enter__` and `__exit__`

In [None]:
import sqlite3

class DataConn:
    def __init__(self,db_name):
        self.db_name = db_name

    def __enter__(self):
        self.conn = sqlite3.connect(self.db_name)
        return self.conn

    def __exit__(self,exc_type,exc_val,exc_tb):
        self.conn.close()
        # returning a false value lets any exception from the with block propagate
        return False

if __name__ == "__main__":
    db = "test.db"
    with DataConn(db) as conn:
        cursor = conn.cursor()
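
The role of the `__exit__` arguments is easier to see in a stripped-down sketch. The `Tracker` class below is hypothetical (it is not part of `sqlite3` or of the example above); it records what `__exit__` receives and uses its return value to decide whether an exception from the `with` block is suppressed. Written for Python 3.

```python
class Tracker(object):
    """Hypothetical context manager that records how the block ended."""

    def __init__(self, swallow):
        self.swallow = swallow   # True: suppress exceptions, False: let them propagate
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # exc_type is None when the block finished without an exception
        outcome = exc_type.__name__ if exc_type else "clean"
        self.events.append("exit:" + outcome)
        return self.swallow


quiet = Tracker(swallow=True)
with quiet:
    raise ValueError("hidden")       # suppressed because __exit__ returns True

loud = Tracker(swallow=False)
try:
    with loud:
        raise ValueError("visible")  # propagates because __exit__ returns False
except ValueError:
    loud.events.append("caught")

print(quiet.events)
print(loud.events)
```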

#### use contextlib

In [None]:
from contextlib import contextmanager

@contextmanager
def file_open(path):
    f_obj = open(path,"w")  # open before the try so f_obj always exists in finally
    try:
        yield f_obj
    except OSError:
        print("We had an error!")
    finally:
        print("Closing file")
        f_obj.close()

with file_open("test.txt") as fobj:
    fobj.write("Testing context managers")

#### contextlib.closing

In [None]:
# simplified version of contextlib.closing, as shown in the Python docs
from contextlib import contextmanager
@contextmanager
def closing(thing):
    try:
        yield thing
    finally:
        thing.close()

In [None]:
from contextlib import closing
from urllib import urlopen
with closing(urlopen("http://www.baidu.com")) as webpage:
    for line in webpage:
        print '-----', line
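
`closing` works with any object that has a `close()` method, not only URL handles. A small sketch without network access, using a hypothetical `Connection` class as a stand-in for a real resource (Python 3):

```python
from contextlib import closing


class Connection(object):
    """Hypothetical resource; only the close() method matters to closing()."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


with closing(Connection()) as conn:
    inside = conn.closed   # still open inside the block

after = conn.closed        # close() was called when the block ended
print(inside, after)
```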

#### note
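- Most context managers you create, including every one built with `@contextmanager`, can only be used once in a `with` statement. Uncommenting the second `with` below raises an error, because the generator behind `context` has already been consumed.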

In [None]:
from contextlib import contextmanager
@contextmanager
def single():
    print("Yielding")
    yield
    print("Exiting context manager")

context = single()
with context:
    pass

# with context:
#     pass
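
The single-use behaviour can be observed without crashing the session by catching the error. This is a sketch for Python 3; the exception type raised on reuse differs between Python versions, so both candidates are caught, and the yielded value `"resource"` is only a placeholder.

```python
from contextlib import contextmanager


@contextmanager
def single():
    yield "resource"


context = single()
with context as value:
    first_use = value          # works: the generator runs up to its yield

try:
    with context:              # the same object cannot be entered again
        pass
    reused = True
except (RuntimeError, AttributeError) as exc:  # type depends on the Python version
    reused = False
    error_message = str(exc)

print(first_use, reused)
```

Calling `single()` again returns a fresh context manager, which is the usual way around the restriction.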


#### practice
- Generators can pipeline a series of operations.

Suppose we have a log file from a fast food chain. The fourth column of the log is the number of pizzas sold per hour, and we want to sum this figure over the last five years.

Assume that every field is a string and that missing data is recorded as "N/A". With generators, this can be done as follows:

In [None]:
# answer
with open('sells.log') as file:
    # take the fourth column; assumes columns are separated by whitespace
    pizza_col = (line.split()[3] for line in file)
    per_hour = (int(x) for x in pizza_col if x != 'N/A')
    print("Total pizzas sold = %d" % sum(per_hour))
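
Since `sells.log` is not included here, the same pipeline can be tried on a few in-memory lines. The sample rows and their layout (whitespace-separated, pizza count in the fourth column) are assumptions made for illustration.

```python
# hypothetical log lines: date, hour, store, pizzas sold
sample_log = [
    "2018-01-01 09:00 store1 12",
    "2018-01-01 10:00 store1 N/A",
    "2018-01-01 11:00 store1 30",
]

columns = (line.split() for line in sample_log)        # stage 1: split each line
pizza_col = (cols[3] for cols in columns)              # stage 2: pick the fourth column
per_hour = (int(x) for x in pizza_col if x != 'N/A')   # stage 3: drop missing values

total = sum(per_hour)   # nothing runs until sum() pulls values through the stages
print(total)
```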

## Unit test and TDD

![learning curve](//ifconfiger.com/media/programming_language_learning_curves_python.png?fileid=a9e2ae2d1a3c8d837beee6ee478df9d96592fdcb22837d72ff18e5be1c23bc48)

In [20]:
class NameIsProtected(Exception):
    """Exception raised when key is tried to be overridden."""

def my_fun(name=""):
    if name == "__init__":
        raise NameIsProtected
    else:
        return name.upper()

import unittest
from mock import Mock


class Test_1(unittest.TestCase):
    def test_01(self):
        try:
            my_fun('__init__')
        except NameIsProtected:
            self.assertTrue(True)
        else:
            self.assertTrue(False)
        
suite = unittest.TestLoader().loadTestsFromTestCase(Test_1)
unittest.TextTestRunner().run(suite)

.
----------------------------------------------------------------------
Ran 1 test in 0.003s

OK


<unittest.runner.TextTestResult run=1 errors=0 failures=0>

In [21]:
class Test_2(unittest.TestCase):

    def test_02(self):
        with self.assertRaises(NameIsProtected):
            my_fun('__init__')

    def test_03(self):
        self.assertEqual(my_fun('hi'), 'HI')
        
suite = unittest.TestLoader().loadTestsFromTestCase(Test_2)
unittest.TextTestRunner().run(suite)

..
----------------------------------------------------------------------
Ran 2 tests in 0.007s

OK


<unittest.runner.TextTestResult run=2 errors=0 failures=0>

In [27]:
import os
 
class classA():
 
    def getnum(self):
        return 0
 
 
    def self_function(self):
        try:
            if self.getnum()==0:
                print("self_function you are very good")
        except:
            print("self_function the except module")
        else:
            print("self_function the else module")

import unittest
from mock import Mock, patch


class Test_3(unittest.TestCase):
    
    @patch("__main__.classA.getnum")
    def test_03(self, mock_getnum):
        mock_getnum.side_effect = IOError
        classA().self_function()

    def test_04(self):
        # note: this assignment replaces the method for good; nothing restores it afterwards
        classA.getnum = Mock(side_effect = IOError)
        classA().self_function()
        
suite = unittest.TestLoader().loadTestsFromTestCase(Test_3)
unittest.TextTestRunner().run(suite)

..

self_function the except module
self_function the except module



----------------------------------------------------------------------
Ran 2 tests in 0.010s

OK


<unittest.runner.TextTestResult run=2 errors=0 failures=0>

In [33]:
# mock
class classB():
    def python_function(self,path):
        try:
            if  os.path.exists(path):
                print(" test_python_function you are very good")
        except:
            print("test_python_function the except module")
        else:
            print("test_python_function the else module")

class Test_4(unittest.TestCase):

    @patch("os.path.exists")
    def test_001(self, mock_exists):
        mock_exists.side_effect = IOError
        classB().python_function("/you")

    @patch("os.path.exists")
    def test_002(self, mock_exists):
        mock_exists.side_effect = lambda x: True
        classB().python_function("/you")

suite = unittest.TestLoader().loadTestsFromTestCase(Test_4)
unittest.TextTestRunner().run(suite)

..

test_python_function the except module
 test_python_function you are very good
test_python_function the else module



----------------------------------------------------------------------
Ran 2 tests in 0.006s

OK


<unittest.runner.TextTestResult run=2 errors=0 failures=0>

## setuptools, pip, virtualenv

### python project

In [None]:
# setup.py example
import re
import sys
import os
from os.path import abspath, dirname, join
from setuptools import setup, find_packages
#from distutils.core import setup


CURDIR = dirname(abspath(__file__))
with open(join(CURDIR, 'requirements.cfg')) as f:
    REQUIREMENTS = f.readlines()

with open(join(CURDIR, 'README.md')) as f:
    DESCRIPTION = f.read()

CLASSIFIERS = '''
Development Status :: 5 - Production/Stable
License :: OSI Approved :: Apache Software License
Operating System :: OS Independent
Programming Language :: Python
Programming Language :: Python :: 2.7
Topic :: Software Development :: Testing
Framework :: Robot Framework
Framework :: Robot Framework :: Library
'''.strip().splitlines()
#[join('script', s) for s in os.listdir('script')]
setup(
    name='robot_*',
    version='1.0.0',
    description='robot package library.',
    author='leo',
    author_email='liyaowang518@gmali.com',
    license='Apache License 2.0',
    keywords='robot package library',
    platforms='any',
    classifiers=CLASSIFIERS,
    install_requires=REQUIREMENTS,  # runtime dependencies belong in install_requires, not setup_requires
    zip_safe=False,
    packages=find_packages(),
    include_package_data=True,
    scripts=['script/my_*.py']
)

In [None]:
➜  nbs git:(master) python setup.py --help-commands
Standard commands:
  build             build everything needed to install
  build_py          "build" pure Python modules (copy to build directory)
  build_ext         build C/C++ and Cython extensions (compile/link to build directory)
  build_clib        build C/C++ libraries used by Python extensions
  build_scripts     "build" scripts (copy and fixup #! line)
  clean             clean up temporary files from 'build' command
  install           install everything from build directory
  install_lib       install all Python modules (extensions and pure Python)
  install_headers   install C/C++ header files
  install_scripts   install scripts (Python or otherwise)
  install_data      install data files
  sdist             create a source distribution (tarball, zip file, etc.)
  register          register the distribution with the Python package index
  bdist             create a built (binary) distribution
  bdist_dumb        create a "dumb" built distribution
  bdist_rpm         create an RPM distribution
  bdist_wininst     create an executable installer for MS Windows
  upload            upload binary package to PyPI
  check             perform some checks on the package

Extra commands:
  saveopts          save supplied options to setup.cfg or other config file
  testr             Run unit tests using testr
  develop           install package in 'development mode'
  upload_docs       Upload documentation to PyPI
  isort             Run isort on modules registered in setuptools
  test              run unit tests after in-place build
  setopt            set an option in setup.cfg or another config file
  nosetests         Run unit tests using nosetests
  install_egg_info  Install an .egg-info directory for the package
  rotate            delete older distributions, keeping N newest files
  bdist_wheel       create a wheel distribution
  egg_info          create a distribution's .egg-info directory
  alias             define a shortcut to invoke one or more commands
  easy_install      Find/get/install Python packages
  bdist_egg         create an "egg" distribution

usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...]
   or: setup.py --help [cmd1 cmd2 ...]
   or: setup.py --help-commands
   or: setup.py cmd --help

### pip

> pip --version

> pip --help

> pip install -U pip

> pip install SomePackage              # latest version
>
> pip install SomePackage==1.0.4       # specific version
>
> pip install 'SomePackage>=1.0.4'     # minimum version

upgrade
> pip install --upgrade SomePackage

uninstall
> pip uninstall SomePackage

search
> pip search SomePackage

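> pip show SomePackage           # show information about an installed package

> pip show -f SomePackage        # also list the files installed by the package

> pip list                       # list installed packages
>
> pip list -o                    # list packages that have a newer version available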


### virtualenv

### pipenv

#### install
> pip install pipenv

#### create a new virtual environment for a project
> cd project1

> pipenv install

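`pipenv install` behaves in one of three ways:

If the directory contains no Pipfile or Pipfile.lock, it creates a new virtual environment;

If they exist, it creates a virtual environment from the settings in the existing Pipfile and Pipfile.lock;

If a package name such as django follows the command, it installs that third-party package into the current virtual environment.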


#### activate the virtual env
> pipenv shell

#### deactivate the virtual env
> exit

#### install and uninstall package
> pipenv install gcp

> pipenv uninstall gcp

> pipenv install --dev django  # install as a development-only dependency

#### run
> pipenv run python your_script.py

In [None]:
$ pipenv
Usage: pipenv [OPTIONS] COMMAND [ARGS]...

Options:
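  --update         Update Pipenv & pip
  --where          Show the path of the project files
  --venv           Show the path of the virtual environment
  --py             Show the path of the virtual environment's Python interpreter
  --envs           Show the environment variable options
  --rm             Remove the virtual environment
  --bare           Minimal output
  --completion     Output shell completion code
  --man            Show the manual page
  --three / --two  Create the virtual environment with Python 3/2 (depends on the Python versions installed locally)
  --python TEXT    Use a specific Python version for the virtual environment
  --site-packages  Make the base interpreter's third-party packages available in the virtual environment
  --jumbotron      Not sure what this does
  --version        Show version information
  -h, --help       Show help

Commands:
  check      Check for security vulnerabilities
  graph      Show the current dependency graph
  install    Create the virtual environment or install third-party packages
  lock       Lock dependencies and generate Pipfile.lock
  open       View a package in your editor
  run        Run a command inside the virtual environment
  shell      Enter the virtual environment
  uninstall  Uninstall a package
  update     Uninstall all current packages and install their latest versions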


## big data analysis

## machine learning

## Socket

### Example: echo server

In [None]:
# simplest version
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1) # reuse port in multiple sockets
sock.bind(('127.0.0.1', 50070))
sock.listen(5)

conn, addr = sock.accept()
print 'Connected by', addr
while True:
    data = conn.recv(1024)
    if not data: break
    conn.sendall(data)  # send() may transmit only part of the buffer
conn.close()
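
The receive/send loop can be exercised without binding a port or starting a separate client. The sketch below (Python 3) is a test harness under stated assumptions rather than a server: it uses `socket.socketpair()` to obtain two already-connected sockets, runs the echo loop on one end in a thread, and talks to it from the other end. The helper name `echo_until_closed` is illustrative.

```python
import socket
import threading


def echo_until_closed(conn):
    # same loop as the server above: echo until the peer closes the connection
    while True:
        data = conn.recv(1024)
        if not data:
            break
        conn.sendall(data)
    conn.close()


server_end, client_end = socket.socketpair()   # two connected sockets, no port needed
worker = threading.Thread(target=echo_until_closed, args=(server_end,))
worker.start()

message = b"hello"
client_end.sendall(message)

reply = b""
while len(reply) < len(message):               # recv may return the data in pieces
    reply += client_end.recv(1024)

client_end.close()                             # makes recv() on the other end return b""
worker.join()
print(reply)
```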

In [None]:
# multi threading
import socket
from thread import start_new_thread

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1) # reuse port in multiple sockets
sock.bind(('127.0.0.1', 50070))
sock.listen(5)

def _handle_conn(conn):
    while True:
        data = conn.recv(1024)
        if not data: break
        conn.sendall(data)  # send() may transmit only part of the buffer
    conn.close()
    
while True:
    conn, addr = sock.accept()
    print 'Connected by', addr
    start_new_thread(_handle_conn, (conn,))

In [None]:
# asyncore
import asyncore
import socket

class EchoHandler(asyncore.dispatcher_with_send):
    def handle_read(self):
        data = self.recv(8192)
        if data:
            self.send(data)

class EchoServer(asyncore.dispatcher):
    def __init__(self, host, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.set_reuse_addr()
        self.bind((host, port))
        self.listen(5)

    def handle_accept(self):
        pair = self.accept()
        if pair is not None:
            sock, addr = pair
            print 'Incoming connection from %s' % repr(addr)
            handler = EchoHandler(sock)

server = EchoServer('localhost', 50071)
server2 = EchoServer('localhost', 50072)
asyncore.loop()

In [None]:
# gevent
from gevent.server import StreamServer

def echo(socket, address):
    print('New connection from %s:%s' % address)
    # using a makefile because we want to use readline()
    with socket.makefile(mode='rb') as rfileobj:
        while True:
            line = rfileobj.readline()
            if not line:
                break
            socket.sendall(line)
    print('%s:%s disconnected' % address)

server = StreamServer(('0.0.0.0', 50070), echo)
server.serve_forever()

## Multi Threading

In [None]:
# thread
import thread
from time import sleep

def sleep_echo(sleep_interval, msg):
    sleep(sleep_interval)
    print msg

thread.start_new_thread(sleep_echo, (2, 'hello'))
print 'world'

In [None]:
# threading
from threading import Thread
from time import sleep

class DelayEcho(Thread):
    def __init__(self, interval, msg):
        super(DelayEcho, self).__init__()
        self.interval = interval
        self.msg = msg
        self.daemon = True
        
    def run(self):
        sleep(self.interval)
        print self.msg

t = DelayEcho(2, 'hello')
t.start()
t.join()
print 'world'

In [None]:
# threading.Lock
import threading
from threading import Lock
from threading import Thread

resource_lock = Lock()

def update_resource():
    with resource_lock:
        print threading.currentThread().name
        
threads = map(lambda x: Thread(target=update_resource), xrange(10))
[t.start() for t in threads]
[t.join() for t in threads]
print 'main'
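
Printing a thread name does not show why the lock matters. Here is a sketch of a case where it does (Python 3; the counter, thread count, and iteration count are illustrative): several threads increment a shared total, and the `with` block makes each read-modify-write step atomic.

```python
import threading

state = {"count": 0}            # shared, mutable state
state_lock = threading.Lock()


def add_many(times):
    for _ in range(times):
        with state_lock:        # only one thread at a time updates the count
            state["count"] += 1


threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["count"])
```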

In [None]:
# threading.local
import threading
from time import sleep

_lock = threading.Lock()
resource = threading.local()
resource.name = 'default'

def update_resource():
    resource.name = threading.currentThread().name
    sleep(1)
    with _lock:
        print resource.name # each thread has its own value

threads = map(lambda x: threading.Thread(target=update_resource), xrange(5))
[t.start() for t in threads]
[t.join() for t in threads]
print resource.name # the main thread still sees 'default'

In [None]:
# Queue
from Queue import Queue
from threading import Thread

q = Queue()

def setter(q, v):
    q.put(v)

def getter(q):
    print q.get()
    
_setter = Thread(target=setter, args=(q, 1))
_getter = Thread(target=getter, args=(q, ))
_getter.start()
_setter.start()
_getter.join()
_setter.join()

In [None]:
from Queue import LifoQueue

q = LifoQueue()
q.put(1)
q.put(2)
print q.get()
print q.get()

In [None]:
from Queue import PriorityQueue

q = PriorityQueue()
q.put((1, 'a'))
q.put((3, 'b'))
q.put((2, 'c'))

print q.get()
print q.get()
print q.get()

### Limitation of Thread

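* No stop/interrupt: a thread cannot be killed or interrupted from outside; it has to exit on its own
* No multi-core support: in CPython the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time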
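
The usual workaround for the lack of a stop/interrupt operation is cooperative: the thread checks a flag and exits by itself. A minimal sketch with `threading.Event` (Python 3; the function and variable names are illustrative):

```python
import threading

stop_requested = threading.Event()
ticks = []


def poll():
    # keep working until the main thread asks us to stop
    while not stop_requested.is_set():
        ticks.append(1)
        stop_requested.wait(0.01)   # sleep, but wake early if the event is set


worker = threading.Thread(target=poll)
worker.start()

stop_requested.set()                # request a stop; nothing is forced
worker.join(timeout=2)
stopped = not worker.is_alive()
print(stopped)
```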

## Multi Processing

In [None]:
# multiprocessing
from multiprocessing import Process
from time import sleep

def delay_echo(interval, msg):
    sleep(interval)
    print msg
    
p = Process(target=delay_echo, args=(2, 'hello'))
p.start()
print 'world'
p.join()

In [None]:
# Pool
import urllib
from multiprocessing import Pool
# from multiprocessing.dummy import Pool
from time import sleep

urls = ['http://www.google.com',
        'http://www.facebook.com',
        'http://www.baidu.com']

def fetch_content(url):
    sleep(1)
    print url
    print len(urllib.urlopen(url).read())
    
pool = Pool()
pool.map(fetch_content, urls)
pool.close()
pool.join()
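
The commented-out import in the cell above points at `multiprocessing.dummy`, which exposes the same `Pool` API backed by threads instead of processes. A sketch that avoids network access (Python 3; the `pages` strings and the `content_length` helper are made up for illustration):

```python
from multiprocessing.dummy import Pool  # same interface, threads underneath

pages = ["<html>google</html>", "<html>facebook</html>", "<html>baidu</html>"]


def content_length(page):
    return len(page)


pool = Pool(3)
lengths = pool.map(content_length, pages)  # results keep the input order
pool.close()
pool.join()

print(lengths)
```

Threads suit I/O-bound work such as fetching URLs; for CPU-bound work the process-based `Pool` is the one that can use several cores.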

In [None]:
# Queue
from multiprocessing import Queue
# Lock
from multiprocessing import Lock

## Coroutine

In [None]:
# yield
def fibonacci():
    a, b = 1, 1
    yield a
    yield b
    while True:
        a, b = b, a+b
        yield b

fib = fibonacci()
for _ in xrange(20):
    print fib.next()
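
Because the generator never ends, the caller decides how many values to take. `itertools.islice` does the same job as the counting loop above. A sketch for Python 3, with the generator rewritten in a slightly more compact form:

```python
from itertools import islice


def fibonacci():
    a, b = 1, 1
    while True:          # infinite: values are only computed when requested
        yield a
        a, b = b, a + b


first_ten = list(islice(fibonacci(), 10))   # take exactly ten values, then stop
print(first_ten)
```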

In [None]:
# yield send
def puzzle_game():
    while True:
        answer = (yield 'type a word: ')
        if answer == 'harry':
            yield 'you got it'
        else:
            yield 'try again'
            
game = puzzle_game()
game.next()
game.send('jerry')
game.next()
game.send('harry')
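
`send` is also useful when the coroutine keeps state between calls. A sketch of a running-average coroutine (Python 3, where `game.next()` becomes `next(game)`; the name `running_average` is illustrative):

```python
def running_average():
    total, count, average = 0.0, 0, None
    while True:
        value = yield average      # hand back the current average, wait for a new value
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)                # prime the coroutine: run it up to the first yield

a1 = avg.send(10)        # average of [10]
a2 = avg.send(20)        # average of [10, 20]
a3 = avg.send(30)        # average of [10, 20, 30]
print(a1, a2, a3)
```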

In [None]:
# gevent, tornado
import gevent

def delay_echo(msg):
    gevent.sleep(2)
    print msg
    
gevent.spawn(delay_echo, 'hello world')
gevent.spawn(delay_echo, 'how are you')
gevent.wait()

**Python 3 has built-in coroutine support through the [asyncio](//docs.python.org/3.5/library/asyncio.html#module-asyncio) module.**
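
For comparison with the gevent cell above, a minimal `asyncio` sketch (assuming Python 3.7 or later for `asyncio.run`; the delays are shortened and the coroutines return their message instead of printing it):

```python
import asyncio


async def delay_echo(msg, delay):
    await asyncio.sleep(delay)   # yields control to the event loop while waiting
    return msg


async def main():
    # run both coroutines concurrently; results come back in argument order
    return await asyncio.gather(
        delay_echo('hello world', 0.02),
        delay_echo('how are you', 0.01),
    )


results = asyncio.run(main())
print(results)
```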

## Unit Testing and TDD


### Why "Unit Testing" is so important in Python

### Simple Example

### Bowling Game

#### Description:

Write a program to score a game of Ten-Pin Bowling.

* Input: string (described below) representing a bowling game
* Output: integer score

The scoring rules:

> Each game, or "line" of bowling, includes ten turns, or "frames" for the bowler.
> 
> In each frame, the bowler gets up to two tries to knock down all ten pins.
> 
> If the first ball in a frame knocks down all ten pins, this is called a "strike". The frame is over. The score for the frame is ten plus the total of the pins knocked down in the next two balls.
> 
> If the second ball in a frame knocks down all ten pins, this is called a "spare". The frame is over. The score for the frame is ten plus the number of pins knocked down in the next ball.
> 
> If, after both balls, there is still at least one of the ten pins standing the score for that frame is simply the total number of pins knocked down in those two balls.
> 
> If you get a spare in the last (10th) frame you get one more bonus ball. If you get a strike in the last(10th) frame you get two more bonus balls. These bonus throws are taken as part of the same turn. If a bonus ball knocks down all the pins, the process does not repeat. The bonus balls are only used to calculate the score of the final frame.

The game score is the total of all frame scores.

#### Examples:

* X indicates a strike
* / indicates a spare
* - indicates a miss
* | indicates a frame boundary
* The characters after the || indicate bonus balls

X|X|X|X|X|X|X|X|X|X||XX

* Ten strikes on the first ball of all ten frames.
* Two bonus balls, both strikes.

Score for each frame == 10 + score for next two balls == 10 + 10 + 10 == 30

Total score == 10 frames x 30 == 300

9-|9-|9-|9-|9-|9-|9-|9-|9-|9-||

* Nine pins hit on the first ball of all ten frames.
* Second ball of each frame misses last remaining pin.
* No bonus balls.

Score for each frame == 9

Total score == 10 frames x 9 == 90

5/|5/|5/|5/|5/|5/|5/|5/|5/|5/||5

* Five pins on the first ball of all ten frames.
* Second ball of each frame hits all five remaining pins, a spare.
* One bonus ball, hits five pins.

Score for each frame == 10 + score for next one ball == 10 + 5 == 15

Total score == 10 frames x 15 == 150

X|7/|9-|X|-8|8/|-6|X|X|X||81

Total score == 167
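
The 167 total for the last game can be checked frame by frame. The breakdown below applies the scoring rules above by hand; it is a worked check, not output from the scoring program.

```python
# X|7/|9-|X|-8|8/|-6|X|X|X||81
frame_scores = [
    10 + 7 + 3,    # frame 1: strike, plus the next two balls (7, then 3 for the spare)
    10 + 9,        # frame 2: spare, plus the next ball (9)
    9,             # frame 3: 9 and a miss
    10 + 0 + 8,    # frame 4: strike, plus the next two balls (miss, 8)
    8,             # frame 5: miss and 8
    10 + 0,        # frame 6: spare, plus the next ball (a miss)
    6,             # frame 7: miss and 6
    10 + 10 + 10,  # frame 8: strike, plus two more strikes
    10 + 10 + 8,   # frame 9: strike, plus a strike and the first bonus ball (8)
    10 + 8 + 1,    # frame 10: strike, plus both bonus balls (8 and 1)
]
total = sum(frame_scores)
print(total)
```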

```python
# test_bowling.py
import unittest
from bowling import get_bowlling_score  # the module file is bowling.py


class TestScore(unittest.TestCase):
    def test_all_missing(self):
        self._assert_score('--|--|--|--|--|--|--|--|--|--||', 0)
        
    def test_first_hit(self):
        self._assert_score('1-|--|--|--|--|--|--|--|--|--||', 1)
        
    def test_one_spare(self):
        self._assert_score('1/|--|--|--|--|--|--|--|--|--||', 10)
        
    def test_one_strike(self):
        self._assert_score('X|--|--|--|--|--|--|--|--|--||', 10)
        
    def test_two_hits(self):
        self._assert_score('12|--|--|--|--|--|--|--|--|--||', 3)
        
    def test_second_hit(self):
        self._assert_score('-5|--|--|--|--|--|--|--|--|--||', 5)
        
    def test_two_frames_hit(self):
        self._assert_score('13|1-|--|--|--|--|--|--|--|--||', 5)
        
    def test_multi_frames_hit(self):
        self._assert_score('1-|1-|-2|33|--|11|-1|--|--|-1||', 14)
        
    def test_spare_with_bonus(self):
        self._assert_score('1/|6-|--|--|--|--|--|--|--|--||', 22)
        
    def test_spare_with_bonus_2(self):
        self._assert_score('1/|62|--|--|--|--|--|--|--|--||', 24)
        
    def test_strike_after_spare(self):
        self._assert_score('1/|X|--|--|--|--|--|--|--|--||', 30)
        
    def test_strike_with_bonus(self):
        self._assert_score('X|12|--|--|--|--|--|--|--|--||', 16)
        
    def test_strike_after_strike(self):
        self._assert_score('X|X|12|--|--|--|--|--|--|--||', 37)
        
    def test_last_strike_with_bonus(self):
        self._assert_score('--|--|--|--|--|--|--|--|X|X||12', 34)
        
    def test_last_spare_with_bonus(self):
        self._assert_score('--|--|--|--|--|--|--|--|X|2/||2', 32)
        
    def test_last_all_strikes(self):
        self._assert_score('--|--|--|--|--|--|--|--|X|X||XX', 60)
        
    def test_last_strike_after_spare(self):
        self._assert_score('--|--|--|--|--|--|--|--|4/|X||XX', 50)

    def _assert_score(self, score_str, expect_score):
        self.assertEqual(get_bowlling_score(score_str), expect_score)

        
if __name__ == '__main__':
    unittest.main()
```

```python
# bowling.py
def get_bowlling_score(bowling_str):
    frames = bowling_str.split('|')
    return sum(get_frame_total_score(frames, index) for index in range(10))

def get_subsequent_balls(frames, index):
    return ''.join(frames[index+1:])

def get_frame_total_score(frames, index):
    if 'X' in frames[index]:
        return get_ball_score(get_subsequent_balls(frames, index)[:2])+10
    elif '/' in frames[index]:
        return 10+get_ball_score(get_subsequent_balls(frames,index)[:1])
    return get_ball_score(frames[index])

def get_ball_score(balls):
    if '/' in balls:
        return 10
    return sum({'-':0,'X':10,'1':1,'2':2,'3':3,'4':4,'5':5,'6':6,'7':7,'8':8,'9':9}[ball] for ball in balls)
```

### Mock

In [None]:
# mock
import time

def delay_print(msg, delay):
    time.sleep(delay)
    print msg
    
import unittest

time.sleep = lambda x: True

class TestDelayPrint(unittest.TestCase):
    def test_delay_print_empty_string(self):
        delay_print('', 5)
            
suite = unittest.TestLoader().loadTestsFromTestCase(TestDelayPrint)
unittest.TextTestRunner().run(suite)
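
Assigning to `time.sleep` replaces it for the whole process and never puts it back. A sketch of a scoped alternative uses `unittest.mock.patch` from the Python 3 standard library (the cells in this document use the separate `mock` package on Python 2). Here `delay_print` is changed to return the message so that the result can be checked:

```python
import time
from unittest import mock


def delay_print(msg, delay):
    time.sleep(delay)
    return msg


real_sleep = time.sleep

with mock.patch("time.sleep") as fake_sleep:
    result = delay_print("hi", 5)      # returns immediately, no real sleeping

restored = time.sleep is real_sleep    # the original is back after the block
print(result, fake_sleep.call_args, restored)
```

Because the patch is undone when the `with` block ends, later tests see the real `time.sleep` again.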

In [None]:
# mock for thread
from threading import Thread

def echo_in_process(interval, msg):
    from time import sleep
    sleep(interval)
    print msg

import time
time.sleep = lambda x: None # mock time.sleep

t = Thread(target=echo_in_process, args=(5, 'hello world'))
t.start()
t.join()

In [None]:
# mock in thread
from threading import Thread

def mock_in_thread():
    import time
    time.sleep = lambda x: None # mock time.sleep
    print 'after mock'

t = Thread(target=mock_in_thread)
t.start()
t.join()

from time import sleep
sleep(5)
print 'hello world'

In [None]:
# mock for child process
import multiprocessing

def echo_in_process(interval, msg):
    from time import sleep
    sleep(interval)
    print msg
    
import time
time.sleep = lambda x: None # mock time.sleep

p = multiprocessing.Process(target=echo_in_process, args=(5, 'hello world'))
p.start()
p.join()

## Big data analysis

* Memory
* Index

In [None]:
# iterator
# with open('access_10000.log') as fp:
#     for line in fp:
#         pass # proceed line

d = {'a': 1, 'b': 2, 'c': 3}
for k in d:
    print k

for k, v in d.iteritems():
    print k, v

from itertools import imap

imap(int, ('0', '1', '2'))

In [None]:
# numpy
import numpy as np

# 1 2 3
# 4 5 6
# 7 8 9
metrix = np.array([[1,2,3], [4,5,6], [7,8,9]])
metrix[:2, 1:] # slice
metrix[:2, 1:] = 0
metrix # a slice is a view, not a copy, so the assignment above changed metrix itself

In [None]:
(metrix[:2, 1:] + 3) * 2 # broadcast

In [None]:
bool_index = np.array([True, False, True])
metrix[bool_index]

In [None]:
metrix[metrix % 2 == 1]

In [None]:
# statistics on ndarray
print metrix.sum()
print metrix[metrix % 2 == 1].mean()

In [None]:
# pandas
import pandas as pd

data = pd.read_table('access_10000.log', sep=' ', names=[
        'src', 'field2', 'field3', 'datetime', 'timezone', 'method', 'code', 'length', 'referer', 'agent'])
del data['field2']
del data['field3']
methods = data['method'].str.split()
data['method'] = methods.apply(lambda x: x[0])
data['url'] = methods.apply(lambda x: x[1])
data['protocol'] = methods.apply(lambda x: x[2])
# TODO: merge datetime and timezone into a single datetime field

In [None]:
data[data['code']>300]['url'].unique() # get all invalid request urls

In [None]:
%matplotlib inline
import seaborn as sns

data['src'].value_counts()[:15].plot(kind='barh', figsize=(12, 5))

In [None]:
# use chunksize to handle huge dataset
import pandas as pd
from pandas import Series

data = pd.read_table('access_10000.log', sep=' ', chunksize=1000, names=[
        'src', 'field2', 'field3', 'datetime', 'timezone', 'method', 'code', 'length', 'referer', 'agent'])
invalid_visits = 0
for chunk in data:
    invalid_visits += len(chunk[chunk['code'] >= 300])
print invalid_visits

## Common Patterns

![bible](//a1.att.hudong.com/34/62/19300001337301131296620943684.jpg)

### Singleton


> http://blog.zhangyu.so/python/2016/02/16/design-patterns-of-python-borg/

### Decorator

> http://blog.zhangyu.so/python/2016/02/17/design-patterns-of-python-decorator/

### Proxy

> http://blog.zhangyu.so/python/2016/02/24/design-patterns-of-python-proxy/

### MapReduce

> http://blog.zhangyu.so/python/2016/02/19/design-patterns-of-python-mapreduce/

> https://wiki.python.org/moin/Generators

> https://eastlakeside.gitbooks.io/interpy-zh/content/Generators/Generators.html

> https://blog.csdn.net/wangliyao518/article/details/83444107