Some simple Python performance comparison snippets
===


## Speed comparison: abs vs. if

In [5]:
def abs_func(a, b, c):
    return (abs(a) + b) / c

def if_func(a, b, c):
    return (a + b) / c if a > 0 else (b - a) / c

In [6]:
%timeit abs_func(10, 2, 5)

239 ns ± 6.37 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [7]:
%timeit if_func(10, 2, 5)

207 ns ± 11.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [8]:
%timeit abs_func(-10, 2, 5)

261 ns ± 23.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [9]:
%timeit if_func(-10, 2, 5)

208 ns ± 11.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


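The `if` version is measurably faster (about 207 ns versus 239 to 261 ns per call). Both run in the nanosecond range, though, so outside of extreme cases the difference between them is small.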
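To repeat this comparison outside IPython, the standard-library `timeit` module can stand in for `%timeit`. The following is a minimal sketch, not the original measurement setup: the helper `best_ns` is a hypothetical name introduced here, and wrapping the call in a lambda adds a small constant overhead to both measurements, so absolute numbers will differ from the ones above.

```python
import timeit


def abs_func(a, b, c):
    return (abs(a) + b) / c


def if_func(a, b, c):
    return (a + b) / c if a > 0 else (b - a) / c


def best_ns(func, args, number=100_000, repeat=5):
    """Best per-call time in nanoseconds over `repeat` runs (hypothetical helper)."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e9


for args in [(10, 2, 5), (-10, 2, 5)]:
    # Both functions must return the same value before timings are compared.
    assert abs_func(*args) == if_func(*args)
    print(args,
          f"abs: {best_ns(abs_func, args):.0f} ns",
          f"if: {best_ns(if_func, args):.0f} ns")
```

Taking the minimum over several repeats reduces the influence of background noise on such short measurements.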

## Speed comparison: pandas append vs. list append

In [1]:
import pandas
import random

In [8]:
def list_append():
    a = []
    for i in range(1000):
        a.append({'a':random.randint(0,100000)})
        
    c = pandas.DataFrame(a)
    
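def pandas_append():
    a = pandas.DataFrame(columns=['a'])
    for i in range(1000):
        # DataFrame.append returns a new frame rather than modifying `a` in place.
        # The result is discarded here, so every iteration appends to an empty frame.
        # Note: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0.
        a.append({'a':random.randint(0,100000)},ignore_index=True)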

In [9]:
%timeit list_append()

3.96 ms ± 126 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [10]:
%timeit pandas_append()

1.12 s ± 42.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


pandas `append` is clearly far slower: roughly 1.12 s versus 3.96 ms per call, a gap of more than two orders of magnitude.
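On pandas 2.0 and later, `DataFrame.append` no longer exists, so the same comparison has to be written with `pandas.concat`. The sketch below is one possible way to do that. The function names `rows_then_frame` and `concat_each_row` are hypothetical, and no timings are claimed for it.

```python
import random

import pandas


def rows_then_frame(n=1000):
    """Collect plain dicts in a list and build the DataFrame once at the end."""
    rows = [{'a': random.randint(0, 100000)} for _ in range(n)]
    return pandas.DataFrame(rows)


def concat_each_row(n=1000):
    """Grow the frame one row at a time; each concat copies the existing data."""
    frame = None
    for _ in range(n):
        row = pandas.DataFrame([{'a': random.randint(0, 100000)}])
        # Start from the first row to avoid concatenating with an empty frame.
        frame = row if frame is None else pandas.concat([frame, row], ignore_index=True)
    return frame
```

In IPython, `%timeit rows_then_frame()` and `%timeit concat_each_row()` can then be compared the same way as the two functions above.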

## String comparison vs. number comparison

In [15]:
%timeit 1 == 1

34.6 ns ± 1.36 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [None]:
%timeit 1 == 2

In [16]:
%timeit 'a' == 'a'

34 ns ± 0.966 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [17]:
%timeit 'a' == 'b'

39 ns ± 1.04 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
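The `1 == 2` cell above has no recorded output. A minimal sketch for running all four cases in one go with the standard-library `timeit` module follows. Binding the operands to names in the setup string is an assumption made here so that every case executes an actual comparison of two variables; `CASES` and `time_cases` are hypothetical names, and no results are claimed.

```python
import timeit

# (label, statement, setup): the setup binds the operands to the names x and y.
CASES = [
    ("1 == 1", "x == y", "x = 1; y = 1"),
    ("1 == 2", "x == y", "x = 1; y = 2"),
    ("'a' == 'a'", "x == y", "x = 'a'; y = 'a'"),
    ("'a' == 'b'", "x == y", "x = 'a'; y = 'b'"),
]


def time_cases(cases, number=500_000, repeat=5):
    """Return {label: best time per loop in nanoseconds}."""
    results = {}
    for label, stmt, setup in cases:
        best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
        results[label] = best / number * 1e9
    return results


for label, ns in time_cases(CASES).items():
    print(f"{label:>10}: {ns:.1f} ns per loop")
```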


## Comparing two ways to resample trades into klines with pandas

In [8]:
import pandas
from dateutil.relativedelta import relativedelta
import random
def trades_to_1m_kline(frame: pandas.DataFrame) -> pandas.DataFrame:
    # Variant 1: call resample() separately for each column.
    kline = frame['price'].resample('1Min', label='right', closed='right').ohlc()
    kline['volume'] = frame['homeNotional'].resample('1Min', label='right', closed='right').sum()
    kline['turnover'] = frame['foreignNotional'].resample('1Min', label='right', closed='right').sum()
    # Forward-fill minutes without trades; ffill() replaces the deprecated fillna(method='ffill').
    kline.ffill(inplace=True)
    return kline

def trades_to_1m_kline2(frame: pandas.DataFrame) -> pandas.DataFrame:
    # Variant 2: create the resampler once and reuse it for every column.
    re_df = frame.resample('1Min', label='right', closed='right')
    kline = re_df['price'].ohlc()
    kline['volume'] = re_df['homeNotional'].sum()
    kline['turnover'] = re_df['foreignNotional'].sum()
    # Forward-fill minutes without trades; ffill() replaces the deprecated fillna(method='ffill').
    kline.ffill(inplace=True)
    return kline

def random_trade_frame(length: int, timestamp: pandas.Timestamp = pandas.Timestamp(2018, 1, 3)) -> pandas.DataFrame:
    r = lambda: random.randint(1, 1000)
    # random side 1 buy 2 sell
    r_s = lambda: random.randint(1, 2)
    # random tick direction
    r_t = lambda: random.randint(1, 4)

    columns = ["timestamp", "side", "size", "price", "tickDirection",
               "grossValue", "homeNotional", "foreignNotional"]
    df = pandas.DataFrame(
        [(timestamp + relativedelta(seconds=i), r_s(), r(), r(), r_t(), r(), r(), r()) for i in range(length)],
        columns=columns)
    df.set_index('timestamp', inplace=True)
    return df

In [9]:
df = random_trade_frame(100000)

In [10]:
%timeit trades_to_1m_kline(df)

10.5 ms ± 442 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [11]:
%timeit trades_to_1m_kline2(df)

7.14 ms ± 164 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
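The recorded timings (10.5 ms versus 7.14 ms) suggest that creating the resampler once and reusing it is faster than resampling each column separately. That comparison is only meaningful if both variants return the same frame, which the sketch below checks on a small input. The names `kline_resample_each`, `kline_resample_once`, and `small_trade_frame` are hypothetical stand-ins for the functions above, and the test frame is built with a seeded generator rather than the original `random_trade_frame`.

```python
import random

import pandas

RESAMPLE_OPTS = dict(label='right', closed='right')


def kline_resample_each(frame):
    """Resample each column separately (three resample calls)."""
    kline = frame['price'].resample('1min', **RESAMPLE_OPTS).ohlc()
    kline['volume'] = frame['homeNotional'].resample('1min', **RESAMPLE_OPTS).sum()
    kline['turnover'] = frame['foreignNotional'].resample('1min', **RESAMPLE_OPTS).sum()
    return kline.ffill()


def kline_resample_once(frame):
    """Create one resampler and reuse it for every column."""
    resampled = frame.resample('1min', **RESAMPLE_OPTS)
    kline = resampled['price'].ohlc()
    kline['volume'] = resampled['homeNotional'].sum()
    kline['turnover'] = resampled['foreignNotional'].sum()
    return kline.ffill()


def small_trade_frame(length, seed=0):
    """One trade per second starting 2018-01-03, with seeded random values."""
    rng = random.Random(seed)
    index = pandas.Timestamp('2018-01-03') + pandas.to_timedelta(list(range(length)), unit='s')
    data = {col: [rng.randint(1, 1000) for _ in range(length)]
            for col in ('price', 'homeNotional', 'foreignNotional')}
    return pandas.DataFrame(data, index=index)


frame = small_trade_frame(600)
# Raises AssertionError if the two variants disagree anywhere.
pandas.testing.assert_frame_equal(kline_resample_each(frame), kline_resample_once(frame))
```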
