# Numerical computation

Use the `**` operator for exponentiation.

In [1]:
2**10

1024

^ is the bitwise exclusive OR (XOR), not exponentiation.

In [2]:
2^10

8
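To see why 2^10 evaluates to 8, it can help to look at both operands in binary. The following is only a small sketch; the variable names are illustrative.

```python
a, b = 2, 10

# Show both operands and the XOR result as 4-bit binary strings.
print(format(a, "04b"))      # 0010
print(format(b, "04b"))      # 1010
print(format(a ^ b, "04b"))  # 1000, which is 8 in decimal

# Exponentiation: the ** operator and the built-in pow() agree.
power = a ** b
print(power == pow(a, b))    # True
```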

# Array initialization
## Creating an array initialized with zeros

In [3]:
import numpy as np
x = np.zeros(5)
x

array([0., 0., 0., 0., 0.])

For two or more dimensions, pass the length of each dimension **as a list, not as separate arguments**.

In [4]:
X = np.zeros([3,5])
X

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])
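A brief sketch of what happens with each calling style. A tuple works as well as a list, while separate arguments fail because the second positional parameter of np.zeros is the dtype. The variable names here are illustrative.

```python
import numpy as np

# A list and a tuple are both accepted as the shape.
from_list = np.zeros([3, 5])
from_tuple = np.zeros((3, 5))
print(from_list.shape, from_tuple.shape)

# Passing the lengths as separate arguments fails, because the second
# positional parameter of np.zeros is the dtype, not another dimension.
try:
    np.zeros(3, 5)
except TypeError as err:
    print("TypeError:", err)
```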

## Creating an array filled with random numbers

In [5]:
x = np.random.rand(5)
x

array([0.45786014, 0.80183077, 0.97250512, 0.63621742, 0.9995285 ])

For two or more dimensions, pass the length of each dimension **as separate arguments, not as a list**.

In [6]:
X = np.random.rand(3,5)
X

array([[0.67825629, 0.01156951, 0.0068161 , 0.69752329, 0.47612695],
       [0.15208256, 0.09543362, 0.81156197, 0.70008284, 0.38936996],
       [0.28094327, 0.37670593, 0.73712213, 0.81492328, 0.65110318]])
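As a side note, NumPy also provides the newer Generator interface via np.random.default_rng. Its methods take the shape as a single tuple, which matches the np.zeros convention. A minimal sketch, where the seed value is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(0)  # the seed 0 is an arbitrary choice

# Generator methods take the shape as a single tuple, like np.zeros.
U = rng.random((3, 5))            # uniform on [0, 1)
N = rng.standard_normal((3, 5))   # standard normal
print(U.shape, N.shape)
```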

## Appending values to an empty array

np.append copies the array on every call, so instead append the values to a Python list and convert it to an np.array at the end.

In [7]:
x = []
type(x)

list

In [8]:
for i in range(5):
    x.append(np.random.randn())
x

[1.1022456609757516,
 0.04787152013911488,
 -2.491423809077364,
 0.21612476089205074,
 -0.5348162830427939]

Once all values have been appended, convert the list to an np.array.

In [9]:
x = np.array(x)
x

array([ 1.10224566,  0.04787152, -2.49142381,  0.21612476, -0.53481628])
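The copying behaviour of np.append can be checked directly, as in the sketch below. It also shows preallocation as another option when the final length is known in advance. That technique is not used in the cells above and is included here only as an illustration.

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.append(a, 3.0)   # returns a new array; a is left unchanged
print(a.size, b.size)   # 2 3
print(b is a)           # False

# When the final length is known in advance, preallocating and filling
# by index is another option that avoids repeated copies.
n = 5
filled = np.empty(n)
for i in range(n):
    filled[i] = i ** 2
print(filled)
```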

# Matrix operations

## Matrix operators

The transpose is .T

In [10]:
X = np.random.randn(5,2)
X.T

array([[ 1.55636992, -0.02933722, -0.49111044, -0.65868838,  1.23510538],
       [-0.09765146, -2.15672381, -0.15531088, -0.98738182, -0.71820996]])

The inverse is np.linalg.inv()

In [11]:
X = np.random.randn(5,5)
np.linalg.inv(X)

array([[-0.45085016, -1.04776819, -0.20542419, -0.59578615, -2.46933752],
       [ 0.32600872, -1.16196903,  0.52798707, -0.4093146 , -0.03310271],
       [ 0.02149765,  0.60710885, -0.55455074, -0.38413664,  0.30573919],
       [-0.21734739,  1.74744197, -0.76971704,  1.20580675,  1.61508929],
       [ 0.71206654, -0.18545909,  0.54141091,  0.75307144,  0.87925816]])

The determinant is np.linalg.det()

In [12]:
np.linalg.det(X)

1.523857045905313

The trace is np.trace()

In [13]:
np.trace(X)

-2.6331219827868773
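These operators can be sanity-checked on a small matrix whose values are easy to verify by hand. The matrix A and vector b below are made up for illustration. np.linalg.solve is shown as a common alternative to forming the inverse when the goal is to solve a linear system.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # small example matrix chosen for illustration
A_inv = np.linalg.inv(A)

# A multiplied by its inverse should be (numerically) the identity.
print(np.allclose(A @ A_inv, np.eye(2)))

# det(A) = 2*3 - 1*1 = 5 and trace(A) = 2 + 3 = 5 for this matrix.
print(np.linalg.det(A), np.trace(A))

# To solve A x = b, np.linalg.solve avoids forming the inverse explicitly.
b = np.array([1.0, 2.0])
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))
```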

## Matrix product

With np.array, the matrix product is @

In [14]:
X = np.random.randn(3,5)
Y = np.random.randn(5,3)
X @ Y

array([[-1.25056585,  0.90598395,  1.14224006],
       [-0.49345218,  0.3137737 ,  1.72524985],
       [ 0.88022283,  0.45998124, -4.68323947]])

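With np.matrix, the product is *. (NumPy's documentation no longer recommends np.matrix; regular arrays with @ are preferred.)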

In [15]:
X = np.matrix(X)
Y = np.matrix(Y)
X * Y

matrix([[-1.25056585,  0.90598395,  1.14224006],
        [-0.49345218,  0.3137737 ,  1.72524985],
        [ 0.88022283,  0.45998124, -4.68323947]])
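A short sketch of how the shapes behave with plain arrays. The seed and variable names are arbitrary. Multiplying the same two arrays with * raises an error, because on np.array that operator works elementwise and these shapes cannot be broadcast together.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
X = rng.standard_normal((3, 5))
Y = rng.standard_normal((5, 3))

P = X @ Y                                # (3, 5) @ (5, 3) -> (3, 3)
print(P.shape)
print(np.allclose(P, np.matmul(X, Y)))   # @ on arrays is np.matmul

# On plain arrays, * is elementwise, so these shapes are incompatible.
try:
    X * Y
except ValueError as err:
    print("ValueError:", err)
```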

## Elementwise operations

With np.array, * is the elementwise product (Hadamard product)

In [16]:
X = np.random.randn(3,5)
Y = np.random.randn(3,5)
X * Y

array([[-0.03654157, -0.55163526,  0.14885947, -0.19407307,  2.00542704],
       [-1.86773146,  0.5145021 ,  0.66583041, -2.04078872, -0.12637374],
       [-0.36295933,  1.91678787,  0.17772953, -0.08239529, -0.02892473]])

NumPy's universal functions (functions that apply an operation to each element of an array and return an array) accept an array as their argument.

In [17]:
np.exp(X)

array([[0.77858857, 2.37622816, 0.73391409, 0.57698939, 0.27931764],
       [0.45790891, 3.21054859, 0.53738436, 0.23816819, 0.32436526],
       [3.32824898, 6.20273539, 9.04825092, 1.64498433, 0.79248892]])

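To apply your own function to each element of an array, wrap it with np.vectorize or np.frompyfunc. Note that np.frompyfunc returns an array of dtype object, as the output of In [20] shows.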

In [18]:
import math
def f(x):
    return 1.0 + x + x**2/2 + x**3/math.factorial(3)

In [19]:
x = np.random.randn(5)
np.vectorize(f)(X)

array([[0.77843295, 2.3481337 , 0.73355489, 0.57356208, 0.192149  ],
       [0.44453937, 3.11124259, 0.53188282, 0.10224641, 0.27005832],
       [3.21514988, 5.50333444, 7.40912892, 1.64214984, 0.79237247]])

In [20]:
np.frompyfunc(f,1,1)(X)

array([[0.7784329522706827, 2.3481337003067027, 0.7335548889239774,
        0.5735620830925868, 0.19214900141946512],
       [0.44453936695675994, 3.111242587750439, 0.5318828200251333,
        0.10224640754748965, 0.2700583190080261],
       [3.215149877271177, 5.503334438399696, 7.409128916718805,
        1.6421498369948793, 0.792372468066146]], dtype=object)

For a one-dimensional array, the built-in map function also works. It returns a map object, so wrap it in list() or similar.

In [21]:
list(map(f, x))

[0.8229072775900572,
 6.940234138593935,
 0.33954019808304836,
 0.5853958991773215,
 -0.05887679518859701]
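Because this particular f uses only arithmetic operators, it can also be called on an array directly. The sketch below compares the three approaches on a small made-up input and converts the object-dtype result of np.frompyfunc back to floats.

```python
import math
import numpy as np

def f(x):
    return 1.0 + x + x**2 / 2 + x**3 / math.factorial(3)

X = np.array([[0.0, 0.5], [1.0, -1.0]])   # illustrative input

direct = f(X)                      # arithmetic operators broadcast over arrays
vectorized = np.vectorize(f)(X)
as_object = np.frompyfunc(f, 1, 1)(X)

print(np.allclose(direct, vectorized))
print(as_object.dtype)                     # object
print(as_object.astype(float).dtype)       # float64
```

A function that branches on its argument (for example with if x > 0) would not work when called directly on an array, and that is where np.vectorize remains useful.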

# DataFrame

In [22]:
import pandas as pd

Define an empty DataFrame and add columns to it

In [23]:
df = pd.DataFrame()

In [24]:
df['A'] = ["abc", "def", "ghi", "jkl", "mno"]
df['B'] = [2,1,1,2,2]
df['C'] = [5.5, 6.6, 7.7, 8.8, 9.9]
df['D'] = np.random.rand(5)
df

| | A | B | C | D |
|---|---|---|---|---|
| 0 | abc | 2 | 5.5 | 0.226157 |
| 1 | def | 1 | 6.6 | 0.27289 |
| 2 | ghi | 1 | 7.7 | 0.626231 |
| 3 | jkl | 2 | 8.8 | 0.795708 |
| 4 | mno | 2 | 9.9 | 0.237503 |


df.head(n) returns the first n rows.

In [25]:
df.head(3)

| | A | B | C | D |
|---|---|---|---|---|
| 0 | abc | 2 | 5.5 | 0.226157 |
| 1 | def | 1 | 6.6 | 0.27289 |
| 2 | ghi | 1 | 7.7 | 0.626231 |


df.describe() returns summary statistics for the numeric columns.

In [26]:
df.describe()

| | B | C | D |
|---|---|---|---|
| count | 5.0 | 5.0 | 5.0 |
| mean | 1.6 | 7.7 | 0.431698 |
| std | 0.547723 | 1.739253 | 0.262453 |
| min | 1.0 | 5.5 | 0.226157 |
| 25% | 1.0 | 6.6 | 0.237503 |
| 50% | 2.0 | 7.7 | 0.27289 |
| 75% | 2.0 | 8.8 | 0.626231 |
| max | 2.0 | 9.9 | 0.795708 |


To extract a column, use df['column name']  
**A column of a pandas DataFrame is a pd.Series.**

In [27]:
type(df['A'])

pandas.core.series.Series

To add the result of a calculation as a new column:

In [28]:
df['E'] = df['C'] * df['D']

In [29]:
df

| | A | B | C | D | E |
|---|---|---|---|---|---|
| 0 | abc | 2 | 5.5 | 0.226157 | 1.243864 |
| 1 | def | 1 | 6.6 | 0.27289 | 1.801071 |
| 2 | ghi | 1 | 7.7 | 0.626231 | 4.821979 |
| 3 | jkl | 2 | 8.8 | 0.795708 | 7.002234 |
| 4 | mno | 2 | 9.9 | 0.237503 | 2.351282 |


## Sorting

In [30]:
df.sort_values(['B', 'A'])

| | A | B | C | D | E |
|---|---|---|---|---|---|
| 1 | def | 1 | 6.6 | 0.27289 | 1.801071 |
| 2 | ghi | 1 | 7.7 | 0.626231 | 4.821979 |
| 0 | abc | 2 | 5.5 | 0.226157 | 1.243864 |
| 3 | jkl | 2 | 8.8 | 0.795708 | 7.002234 |
| 4 | mno | 2 | 9.9 | 0.237503 | 2.351282 |
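The sort direction can be set per key through the ascending parameter. The sketch below uses a reduced stand-in for df with only columns A and B. It also shows that sort_values returns a new DataFrame rather than reordering the original.

```python
import pandas as pd

df = pd.DataFrame({"A": ["abc", "def", "ghi", "jkl", "mno"],
                   "B": [2, 1, 1, 2, 2]})

# B ascending, then A descending within each value of B.
out = df.sort_values(["B", "A"], ascending=[True, False])
print(out["A"].tolist())   # ['ghi', 'def', 'mno', 'jkl', 'abc']

# sort_values returns a new DataFrame; df itself keeps its original order.
print(df["A"].tolist())    # ['abc', 'def', 'ghi', 'jkl', 'mno']
```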


## Writing a DataFrame to CSV and reading it back

In [31]:
df.to_csv('test.csv')

In [32]:
df2 = pd.read_csv('test.csv')

In [33]:
df2

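| | Unnamed: 0 | A | B | C | D | E |
|---|---|---|---|---|---|---|
| 0 | 0 | abc | 2 | 5.5 | 0.226157 | 1.243864 |
| 1 | 1 | def | 1 | 6.6 | 0.27289 | 1.801071 |
| 2 | 2 | ghi | 1 | 7.7 | 0.626231 | 4.821979 |
| 3 | 3 | jkl | 2 | 8.8 | 0.795708 | 7.002234 |
| 4 | 4 | mno | 2 | 9.9 | 0.237503 | 2.351282 |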
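The extra Unnamed: 0 column in df2 appears because to_csv writes the index by default and read_csv then treats it as ordinary data. Two common ways to avoid this are sketched below, using a small stand-in frame and in-memory strings instead of test.csv.

```python
import io
import pandas as pd

df = pd.DataFrame({"A": ["abc", "def"], "B": [2, 1]})   # small stand-in frame

# Option 1: do not write the index at all.
csv_no_index = df.to_csv(index=False)   # returns a string when no path is given
df_a = pd.read_csv(io.StringIO(csv_no_index))

# Option 2: write the index, then tell read_csv that column 0 is the index.
csv_with_index = df.to_csv()
df_b = pd.read_csv(io.StringIO(csv_with_index), index_col=0)

print(list(df_a.columns), list(df_b.columns))   # no 'Unnamed: 0' in either
```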


## Querying a DataFrame

### Extracting rows with a condition

In [34]:
df[df['B'] == 1]

| | A | B | C | D | E |
|---|---|---|---|---|---|
| 1 | def | 1 | 6.6 | 0.27289 | 1.801071 |
| 2 | ghi | 1 | 7.7 | 0.626231 | 4.821979 |
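Several conditions can be combined in the same way. The sketch below uses a stand-in for df with columns A to C only. It also shows DataFrame.query, which expresses the same filter as a string.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["abc", "def", "ghi", "jkl", "mno"],
    "B": [2, 1, 1, 2, 2],
    "C": [5.5, 6.6, 7.7, 8.8, 9.9],
})

# Combine conditions with & / | and wrap each one in parentheses.
mask_result = df[(df["B"] == 2) & (df["C"] > 6.0)]

# DataFrame.query expresses the same filter as a string.
query_result = df.query("B == 2 and C > 6.0")

print(mask_result["A"].tolist())   # ['jkl', 'mno']
```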


Processing with a pandas method chain

In [35]:
res = (
    df.groupby('B')
    .agg({'D': 'max', 'E': 'sum'})  # string names; recent pandas warns on np.max / np.sum here
    .assign(F=lambda g: g['D'] + g['E'])  # g is the aggregated frame
    .sort_values('B')
)

In [36]:
res

| B | D | E | F |
|---|---|---|---|
| 1 | 0.626231 | 6.62305 | 7.249281 |
| 2 | 0.795708 | 10.59738 | 11.393089 |
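A variation on this chain uses named aggregation, which lets the output columns be named explicitly. The frame and the column names D_max and E_sum below are made up for illustration. reset_index turns the group key B back into a regular column.

```python
import pandas as pd

df = pd.DataFrame({
    "B": [2, 1, 1, 2, 2],
    "D": [0.2, 0.3, 0.6, 0.8, 0.2],   # made-up values for illustration
    "E": [1.0, 2.0, 5.0, 7.0, 2.0],
})

res = (
    df.groupby("B")
      .agg(D_max=("D", "max"), E_sum=("E", "sum"))   # named aggregation
      .assign(F=lambda g: g["D_max"] + g["E_sum"])
      .reset_index()                                  # turn B back into a column
)
print(res)
```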


# SQL

In [37]:
import sqlite3

In [38]:
db = sqlite3.connect('test.db')

In [39]:
c = db.cursor()

List the tables

In [40]:
c.execute("select * from sqlite_master where type='table'")
for row in c.fetchall():
    print(row)

('table', 'test', 'test', 2, 'CREATE TABLE test(\n            id INTEGER PRIMARY KEY,\n            name TEXT NOT NULL,\n            age INTEGER NOT NULL\n        )')


Drop the table

In [41]:
c.execute("""
    DROP TABLE IF EXISTS
        test          
""")

<sqlite3.Cursor at 0x1f5b737cdc0>

In [42]:
c.execute("""
    CREATE TABLE IF NOT EXISTS
        test(
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            age INTEGER NOT NULL
        )
""")

<sqlite3.Cursor at 0x1f5b737cdc0>

In [43]:
c.execute("""
    INSERT INTO test
        VALUES(
            1,
            'Yamada',
            39
        )
""")

<sqlite3.Cursor at 0x1f5b737cdc0>

In [44]:
c.execute("""
    INSERT INTO test
        VALUES(
            2,
            'Suzuki',
            26
        )
""")

<sqlite3.Cursor at 0x1f5b737cdc0>

In [45]:
c.execute("""
    INSERT INTO test
        VALUES(
            3,
            'Tanaka',
            43
        )
""")

<sqlite3.Cursor at 0x1f5b737cdc0>

In [46]:
c.execute("""
    SELECT
        *
    FROM
        test
""")

<sqlite3.Cursor at 0x1f5b737cdc0>

In [47]:
print(c.fetchall())

[(1, 'Yamada', 39), (2, 'Suzuki', 26), (3, 'Tanaka', 43)]
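The three INSERT statements above can also be written with ? placeholders and executemany, as in the sketch below. It uses an in-memory database so that test.db is not touched. Note that the cells above never call commit. With a file-backed database, the inserted rows are typically not persisted until db.commit() is called.

```python
import sqlite3

# An in-memory database keeps this sketch from touching test.db.
db = sqlite3.connect(":memory:")
c = db.cursor()
c.execute("""
    CREATE TABLE test(
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        age INTEGER NOT NULL
    )
""")

rows = [(1, "Yamada", 39), (2, "Suzuki", 26), (3, "Tanaka", 43)]
# ? placeholders let sqlite3 bind the values instead of formatting them into SQL.
c.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
db.commit()   # for a file-backed database, this persists the inserts

c.execute("SELECT name, age FROM test WHERE age > ? ORDER BY age", (30,))
result = c.fetchall()
print(result)   # [('Yamada', 39), ('Tanaka', 43)]
db.close()
```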
