## Numpy

A module for numerical analysis, mathematical processing, statistics, and more. Pandas, Matplotlib, and Scipy build on it.

### Numpy
```
!pip install numpy
```

In [1]:
!pip show numpy

Name: numpy
Version: 1.25.2
Summary: Fundamental package for array computing in Python
Home-page: https://www.numpy.org
Author: Travis E. Oliphant et al.
Author-email: 
License: BSD-3-Clause
Location: C:\source\iot-bigdata-2023\venv\Lib\site-packages
Requires: 
Required-by: pandas


In [2]:
import pandas as pd
import numpy as np

#### Why use Numpy
- Creating and processing a Python list of 1,000,000 items: roughly 200 ms
- Creating and processing a Numpy array of 1,000,000 items: roughly 20 ms (about 1/10 of the time)

In [9]:
py_list = list(range(1_000_000)) # list of 1 million items
np_arr = np.arange(1_000_000) # array of 1 million items

In [6]:
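# Note: on a list, * 100 repeats the list 100 times (100 million items); it does not multiply each element
%timeit for _ in range(10): py_list2 = py_list * 100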

5.52 s ± 43.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [10]:
%timeit for _ in range(10): np_arr2 = np_arr + 100

12.9 ms ± 335 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
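
The list cell above uses `py_list * 100`, which repeats the list, while the array cell adds 100 to every element, so the two timings measure different work. A like-for-like comparison could be sketched as follows. The standard-library `timeit` module and the list comprehension are assumptions added here, not part of the original notebook, and timings depend on the machine, so none are shown.

```python
import timeit

import numpy as np

n = 1_000_000
py_list = list(range(n))   # plain Python list
np_arr = np.arange(n)      # Numpy array holding the same values

# Element-wise addition on a list needs an explicit Python-level loop
list_time = timeit.timeit(lambda: [x + 100 for x in py_list], number=10)

# The same operation on an array is a single vectorized expression
array_time = timeit.timeit(lambda: np_arr + 100, number=10)

print(f"list comprehension: {list_time:.3f} s for 10 runs")
print(f"numpy array:        {array_time:.3f} s for 10 runs")

# Sanity check: both approaches produce the same values
same = [x + 100 for x in py_list[:5]] == (np_arr[:5] + 100).tolist()
```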


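#### Creating arrays and computing with Numpy

- Python lists support only the + (concatenation) and * (repetition) operators
- When matrix (vector) operations are needed, convert the list to a Numpy array first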
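
As a small illustration of the difference, the same operators behave differently on lists and on arrays. The variable names below are chosen for this sketch only.

```python
import numpy as np

list1 = [4, 5, 6]
list2 = [7, 8, 9]

# On lists, + concatenates and * repeats
concatenated = list1 + list2      # [4, 5, 6, 7, 8, 9]
repeated = list1 * 2              # [4, 5, 6, 4, 5, 6]

# After converting to arrays, the same operators work element by element
arr1 = np.array(list1)
arr2 = np.array(list2)
added = arr1 + arr2               # array([11, 13, 15])
scaled = arr1 * 2                 # array([ 8, 10, 12])

# The @ operator gives the dot product of two vectors
dot = arr1 @ arr2                 # 4*7 + 5*8 + 6*9 = 122
```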

#### Numpy operations

In [11]:
list1 = [4, 5, 6]
np_arr1 = np.array(list1)
np_arr1

array([4, 5, 6])

In [12]:
np_arr2 = np.array([[1, 2, 3], [4, 5, 6]])
np_arr2

array([[1, 2, 3],
       [4, 5, 6]])

In [13]:
list2 = [7, 8, 9]
np_arr3 = np.array(list1)
np_arr4 = np.array(list2)

In [14]:
np_arr3 + np_arr4

array([11, 13, 15])

In [15]:
np_arr4 - np_arr3

array([3, 3, 3])

In [17]:
np_arr4 / np_arr3

array([1.75, 1.6 , 1.5 ])

In [20]:
list(np.zeros(4))

[0.0, 0.0, 0.0, 0.0]

In [21]:
np.zeros(4000)

array([0., 0., 0., ..., 0., 0., 0.])

In [23]:
np.zeros([2, 3])

array([[0., 0., 0.],
       [0., 0., 0.]])

In [25]:
# Random arrays are widely useful; randn draws from the standard normal distribution
np.random.randn(5, 4)

array([[ 3.58917091e-01,  3.93342922e-01,  2.13450989e-02,
         1.29108943e+00],
       [-5.80107061e-01,  7.50122626e-01, -4.18785860e-01,
        -1.69806591e-01],
       [-1.51602650e+00,  1.54572685e-02, -5.28840806e-02,
         6.74913327e-02],
       [ 1.95427411e-04, -1.87799143e+00,  1.22808540e+00,
        -4.60037205e-01],
       [ 1.18840878e+00,  1.16774405e+00, -1.29379368e+00,
        -5.64315020e-01]])
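
Because `np.random.randn` returns different values on every run, the output above will not be reproduced exactly. When repeatable numbers are needed, one option is a seeded generator, sketched below. The seed value 42 is an arbitrary choice for this example.

```python
import numpy as np

rng = np.random.default_rng(42)          # seeded generator; 42 is arbitrary
sample = rng.standard_normal((5, 4))     # 5x4 draws from the standard normal

# Re-creating the generator with the same seed reproduces the same numbers
again = np.random.default_rng(42).standard_normal((5, 4))
print(np.array_equal(sample, again))     # True
```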

In [27]:
(np_arr3 + np_arr4) < 15

array([ True,  True, False])
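
A comparison like the one above yields a boolean array, which can then be used as a mask. The following sketch shows a few common uses; the variable names are illustrative.

```python
import numpy as np

total = np.array([4, 5, 6]) + np.array([7, 8, 9])   # array([11, 13, 15])
mask = total < 15                                   # array([ True,  True, False])

selected = total[mask]               # keep only elements where mask is True
count = int(mask.sum())              # True counts as 1, so this counts matches
clipped = np.where(mask, total, 0)   # keep matches, replace the rest with 0

print(selected)   # [11 13]
print(count)      # 2
print(clipped)    # [11 13  0]
```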

#### Numpy 속성

In [31]:
# A 1-D array has a single axis, so shape is (length,)
np_arr1.shape

(3,)

In [30]:
# From 2-D onward, shape is ordered (rows, columns)
np_arr2.shape

(2, 3)

In [33]:
np_arr2.dtype

dtype('int32')

In [36]:
np_arr2.T.shape

(3, 2)
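
The attributes above can be explored a little further. This sketch rebuilds the two arrays so that it runs on its own; `reshape`, `ndim`, `size`, and `astype` are additions not used in the original cells. The `int32` shown earlier is the default integer type on this Windows setup, and other platforms commonly report `int64`.

```python
import numpy as np

arr_1d = np.array([4, 5, 6])
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

print(arr_1d.ndim, arr_1d.shape)   # 1 (3,)
print(arr_2d.ndim, arr_2d.shape)   # 2 (2, 3)

# Transposing a 1-D array changes nothing because there is only one axis
print(arr_1d.T.shape)              # (3,)

# reshape turns it into an explicit row or column
row = arr_1d.reshape(1, 3)         # shape (1, 3)
col = arr_1d.reshape(3, 1)         # shape (3, 1)

# size is the total element count; astype converts to an explicit dtype
print(arr_2d.size)                 # 6
as_float = arr_2d.astype(np.float64)
```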

##### Other functions

In [41]:
np_arr5 = np.arange(10)
np_arr5

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [42]:
# Mean
np_arr5.mean()

4.5

In [43]:
# Sum
np_arr5.sum()

45

In [44]:
# Cumulative sum
np_arr5.cumsum()

array([ 0,  1,  3,  6, 10, 15, 21, 28, 36, 45])
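
The calls above reduce a 1-D array. On a 2-D array, the same methods accept an `axis` argument that selects the direction of the reduction. A brief sketch, using a small array chosen for this example:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

total = arr.sum()             # 21, over all elements
col_sums = arr.sum(axis=0)    # array([5, 7, 9]), one value per column
row_sums = arr.sum(axis=1)    # array([ 6, 15]), one value per row
row_means = arr.mean(axis=1)  # array([2., 5.])
running = arr.cumsum(axis=1)  # array([[ 1,  3,  6], [ 4,  9, 15]])
```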

In [45]:
np_arr6 = np.random.randn(10)
np_arr6

array([-1.65686549,  1.02890995, -1.86700653, -0.7479874 , -1.59129868,
        1.35495408, -0.78927493, -0.36217711,  0.76386755,  1.64895051])

In [46]:
np_arr6 = np_arr6 + 10

In [47]:
np_arr6

array([ 8.34313451, 11.02890995,  8.13299347,  9.2520126 ,  8.40870132,
       11.35495408,  9.21072507,  9.63782289, 10.76386755, 11.64895051])

In [48]:
# Sort in ascending order (in place)
np_arr6.sort()
np_arr6

array([ 8.13299347,  8.34313451,  8.40870132,  9.21072507,  9.2520126 ,
        9.63782289, 10.76386755, 11.02890995, 11.35495408, 11.64895051])

In [49]:
# Sort in descending order (negate, sort ascending, negate back)
-np.sort(-np_arr6)

array([11.64895051, 11.35495408, 11.02890995, 10.76386755,  9.63782289,
        9.2520126 ,  9.21072507,  8.40870132,  8.34313451,  8.13299347])
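
The negation trick above works for numeric arrays. An alternative that some readers may find easier to follow is to sort ascending and reverse the result, as sketched below with sample values chosen for this example. The sketch also contrasts `np.sort`, which returns a copy, with the in-place `sort` method.

```python
import numpy as np

values = np.array([3.5, 1.0, 2.25, 4.0])

ascending = np.sort(values)        # returns a sorted copy; values is unchanged
descending = ascending[::-1]       # reversed view of the sorted copy
order = np.argsort(values)         # indices that would sort values ascending

print(ascending)    # [1.   2.25 3.5  4.  ]
print(descending)   # [4.   3.5  2.25 1.  ]
print(order)        # [1 2 0 3]

values.sort()                      # in-place sort; returns None
```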