<a href="https://colab.research.google.com/github/sakamoto-hands-on/Python_InteractiveComputing_and_Visualization/blob/master/Using_Python_to_write_FasterCode.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Writing fast code in standard Python

Python can be slow if you write it the way you would write C or Java. Written Pythonically, however, it is often fast enough.

In [0]:
import random
l = [random.normalvariate(0,1) for i in range(100000)]

In [0]:
def sum1():
  # Non-Pythonic: bad and slow
  res = 0
  for i in range(len(l)):
    res = res + l[i]
  return res

In [6]:
sum1()

357.0955700389667

In [7]:
%timeit sum1()

100 loops, best of 3: 5.39 ms per loop


Taking more than 5 ms to sum just 100,000 numbers is slow.

In [0]:
def sum2():
  # Still not good
  res = 0
  for x in l:
    res=res+x
  return res

In [9]:
sum2()

357.0955700389667

In [10]:
%timeit sum2()

100 loops, best of 3: 2.45 ms per loop


This is about twice as fast as the previous version.

In [0]:
def sum3():
  #GOOD
  return sum(l)

In [12]:
sum3()

357.0955700389667

In [13]:
%timeit sum3()

1000 loops, best of 3: 485 µs per loop


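Going by the timings above, `sum3` is roughly 11 times faster than `sum1` in this Colaboratory environment (5.39 ms vs. 485 µs). The reference book reports a 17x speedup.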
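The `%timeit` magic only works inside IPython, but the standard `timeit` module gives comparable measurements in a plain script. The following is a minimal sketch, not the notebook's own code: `data` is a hypothetical stand-in for `l`, and `sum_index`, `sum_iter`, and `best_time` are illustrative names. Absolute timings depend on the machine and the Python version.

```python
import random
import timeit

# Hypothetical stand-in for the notebook's list `l`.
data = [random.normalvariate(0, 1) for _ in range(100000)]

def sum_index(values):
    # Index-based loop, as in sum1().
    res = 0
    for i in range(len(values)):
        res = res + values[i]
    return res

def sum_iter(values):
    # Direct iteration over the list, as in sum2().
    res = 0
    for x in values:
        res = res + x
    return res

def best_time(func, values, repeat=3, number=10):
    # Best per-call time in seconds over `repeat` runs of `number` calls each.
    timings = timeit.repeat(lambda: func(values), repeat=repeat, number=number)
    return min(timings) / number

# Compare the two loops with the built-in sum(), as in sum3().
for func in (sum_index, sum_iter, sum):
    print(f"{func.__name__}: {best_time(func, data) * 1e3:.3f} ms per call")
```

Taking the minimum over several repeats mirrors the "best of 3" figure that `%timeit` prints in the cells above.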

###Trying it with a list of strings

In [14]:
strings = ['%.3f' %x for x in l]
strings[:3]

['1.577', '-0.066', '0.072']

In [0]:
def concat1():
  # Non-Pythonic: bad code
  cat = strings[0]
  for s in strings[1:]:
    cat=cat+','+s
  return cat  

In [16]:
concat1()[:24]

'1.577,-0.066,0.072,-0.89'

In [18]:
%timeit concat1()

1 loop, best of 3: 3.92 s per loop


In [0]:
def concat2():
  #GOOD
  return ','.join(strings)

In [20]:
concat2()[:24]

'1.577,-0.066,0.072,-0.89'

In [21]:
%timeit concat2()

1000 loops, best of 3: 1.13 ms per loop


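In this run, `concat2` is roughly 3,500 times faster than `concat1` (3.92 s vs. 1.13 ms). The reference book reports a 1640x speedup.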
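The gap comes from how the result is built. Repeated `+` can copy the growing string on every iteration, whereas `str.join` in CPython works out the final length first and fills the result in one pass. The following minimal sketch checks that both approaches return the same string and times them on a smaller list. The names `parts`, `concat_loop`, and `concat_join` are hypothetical, and the measured ratio varies with the interpreter and input size.

```python
import random
import timeit

# Hypothetical stand-in for the notebook's `strings`,
# kept smaller so the loop version finishes quickly.
parts = ['%.3f' % random.normalvariate(0, 1) for _ in range(10000)]

def concat_loop(items):
    # Builds the result with repeated concatenation, as in concat1().
    cat = items[0]
    for s in items[1:]:
        cat = cat + ',' + s
    return cat

def concat_join(items):
    # Builds the result with a single join call, as in concat2().
    return ','.join(items)

# Both approaches must produce the same string.
assert concat_loop(parts) == concat_join(parts)

for func in (concat_loop, concat_join):
    seconds = min(timeit.repeat(lambda: func(parts), repeat=3, number=5)) / 5
    print(f"{func.__name__}: {seconds * 1e3:.3f} ms per call")
```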

###Finally, count how often each integer from 0 to 100 occurs in a list of 100,000 integers

In [0]:
l=[random.randint(0,100) for _ in range(100000)]

In [0]:
def hist1():
  #BAD
  count={}
  for x in l:
    if x not in count:
      count[x] = 0
    count[x]+=1
  return count

In [24]:
hist1()

{0: 1017,
 1: 958,
 2: 1036,
 3: 991,
 4: 1045,
 5: 958,
 6: 1015,
 7: 987,
 8: 962,
 9: 981,
 10: 995,
 11: 1040,
 12: 1035,
 13: 1025,
 14: 1049,
 15: 988,
 16: 969,
 17: 964,
 18: 984,
 19: 1001,
 20: 1013,
 21: 961,
 22: 956,
 23: 987,
 24: 994,
 25: 1037,
 26: 964,
 27: 967,
 28: 1016,
 29: 1042,
 30: 999,
 31: 974,
 32: 998,
 33: 1051,
 34: 980,
 35: 990,
 36: 974,
 37: 953,
 38: 987,
 39: 924,
 40: 967,
 41: 1017,
 42: 990,
 43: 1006,
 44: 1030,
 45: 954,
 46: 949,
 47: 957,
 48: 991,
 49: 986,
 50: 986,
 51: 1008,
 52: 1006,
 53: 946,
 54: 1022,
 55: 982,
 56: 978,
 57: 1016,
 58: 1013,
 59: 986,
 60: 966,
 61: 908,
 62: 1000,
 63: 1017,
 64: 973,
 65: 1003,
 66: 1023,
 67: 1037,
 68: 1047,
 69: 998,
 70: 1033,
 71: 969,
 72: 987,
 73: 962,
 74: 955,
 75: 977,
 76: 1041,
 77: 985,
 78: 917,
 79: 945,
 80: 985,
 81: 970,
 82: 1011,
 83: 973,
 84: 963,
 85: 997,
 86: 983,
 87: 994,
 88: 969,
 89: 1019,
 90: 996,
 91: 1017,
 92: 924,
 93: 995,
 94: 962,
 95: 979,
 96: 949,
 97: 99

In [25]:
%timeit hist1()

100 loops, best of 3: 9.94 ms per loop


Python provides `defaultdict`, a dictionary structure that automatically creates a missing key the first time it is accessed.

In [0]:
from collections import defaultdict

In [0]:
def hist2():
  #BETTER
  count = defaultdict(int)
  for x in l:
    count[x]+=1
  return count

In [28]:
hist2()

defaultdict(int,
            {0: 1017,
             1: 958,
             2: 1036,
             3: 991,
             4: 1045,
             5: 958,
             6: 1015,
             7: 987,
             8: 962,
             9: 981,
             10: 995,
             11: 1040,
             12: 1035,
             13: 1025,
             14: 1049,
             15: 988,
             16: 969,
             17: 964,
             18: 984,
             19: 1001,
             20: 1013,
             21: 961,
             22: 956,
             23: 987,
             24: 994,
             25: 1037,
             26: 964,
             27: 967,
             28: 1016,
             29: 1042,
             30: 999,
             31: 974,
             32: 998,
             33: 1051,
             34: 980,
             35: 990,
             36: 974,
             37: 953,
             38: 987,
             39: 924,
             40: 967,
             41: 1017,
             42: 990,
             43: 1006,
         

In [29]:
%timeit hist2()

100 loops, best of 3: 7.63 ms per loop


This is slightly faster.
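What `defaultdict(int)` does on a missing key is easier to see in isolation. In the minimal sketch below, the keys are hypothetical sample values. Accessing an absent key calls the factory `int()`, which returns 0, and stores that value, so `count[x] += 1` needs no membership test.

```python
from collections import defaultdict

count = defaultdict(int)

# The key 7 does not exist yet: the factory int() supplies 0, then 1 is added.
count[7] += 1
count[7] += 1
print(count[7])        # 2

# Merely reading a missing key also inserts it with the default value.
print(count[42])       # 0
print(sorted(count))   # [7, 42]

# A plain dict raises KeyError in the same situation.
plain = {}
try:
    plain[7] += 1
except KeyError:
    print("plain dict: KeyError")
```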

Finally, the built-in `collections` module provides a `Counter` class that does exactly what we need.

In [0]:
from collections import Counter

In [0]:
def hist3():
  #GOOD
  return Counter(l)

In [32]:
hist3()

Counter({0: 1017,
         1: 958,
         2: 1036,
         3: 991,
         4: 1045,
         5: 958,
         6: 1015,
         7: 987,
         8: 962,
         9: 981,
         10: 995,
         11: 1040,
         12: 1035,
         13: 1025,
         14: 1049,
         15: 988,
         16: 969,
         17: 964,
         18: 984,
         19: 1001,
         20: 1013,
         21: 961,
         22: 956,
         23: 987,
         24: 994,
         25: 1037,
         26: 964,
         27: 967,
         28: 1016,
         29: 1042,
         30: 999,
         31: 974,
         32: 998,
         33: 1051,
         34: 980,
         35: 990,
         36: 974,
         37: 953,
         38: 987,
         39: 924,
         40: 967,
         41: 1017,
         42: 990,
         43: 1006,
         44: 1030,
         45: 954,
         46: 949,
         47: 957,
         48: 991,
         49: 986,
         50: 986,
         51: 1008,
         52: 1006,
         53: 946,
         54: 1022,


In [33]:
%timeit hist3()

100 loops, best of 3: 4.94 ms per loop


This is about twice as fast as the first version.
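Because `Counter` is a `dict` subclass, its result can be compared directly with the hand-written versions, and it adds helpers such as `most_common`. The following is a minimal sketch on a small hypothetical list `values`; the names `hist_dict` and `hist_default` are illustrative and mirror `hist1` and `hist2`.

```python
from collections import Counter, defaultdict

# Small hypothetical sample instead of the 100,000-element list.
values = [3, 1, 3, 2, 3, 1]

def hist_dict(items):
    # Manual counting with a plain dict, as in hist1().
    count = {}
    for x in items:
        if x not in count:
            count[x] = 0
        count[x] += 1
    return count

def hist_default(items):
    # Counting with defaultdict, as in hist2().
    count = defaultdict(int)
    for x in items:
        count[x] += 1
    return count

counts = Counter(values)

# All three approaches yield the same mapping.
assert dict(counts) == hist_dict(values) == dict(hist_default(values))

print(counts[3])              # 3
print(counts.most_common(2))  # [(3, 3), (1, 2)]
print(counts[99])             # 0 (a missing key reads as zero and is not inserted)
```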