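## NumPy

NumPy is a library module for numerical analysis, mathematical processing, statistics, and similar work. It is typically used together with Pandas, Matplotlib, and SciPy.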

### NumPy
```
!pip install numpy
```

In [1]:
!pip install numpy



In [2]:
!pip show numpy

Name: numpy
Version: 1.25.2
Summary: Fundamental package for array computing in Python
Home-page: https://www.numpy.org
Author: Travis E. Oliphant et al.
Author-email: 
License: BSD-3-Clause
Location: C:\source\IoT-BigData-2023\da_env\Lib\site-packages
Requires: 
Required-by: pandas


In [3]:
!pip install matplotlib

Collecting matplotlib
  Obtaining dependency information for matplotlib from https://files.pythonhosted.org/packages/4d/9c/65830d4a56c47f5283eaa244dc1228c5da9c844a9f999ebcc2e69bf6cc65/matplotlib-3.7.2-cp311-cp311-win_amd64.whl.metadata
  Downloading matplotlib-3.7.2-cp311-cp311-win_amd64.whl.metadata (5.8 kB)
Collecting contourpy>=1.0.1 (from matplotlib)
  Obtaining dependency information for contourpy>=1.0.1 from https://files.pythonhosted.org/packages/16/09/989b982322439faa4bafffcd669e6f942b38fee897c2664c987bcd091dec/contourpy-1.1.0-cp311-cp311-win_amd64.whl.metadata
  Downloading contourpy-1.1.0-cp311-cp311-win_amd64.whl.metadata (5.7 kB)
Collecting cycler>=0.10 (from matplotlib)
  Downloading cycler-0.11.0-py3-none-any.whl (6.4 kB)
Collecting fonttools>=4.22.0 (from matplotlib)
  Obtaining dependency information for fonttools>=4.22.0 from https://files.pythonhosted.org/packages/52/65/aaa3d2b7a292d93cc2cf1c534d03ba3f744e480f15b3b2ab6ad68189f7ee/fonttools-4.42.0-cp311-cp311-win_amd64

In [10]:
import pandas as pd
import numpy as np

##### Why use NumPy
- Python: creating and processing a list of 1,000,000 items takes roughly 200 ms.
- NumPy: creating and processing an array of 1,000,000 items takes roughly 20 ms (about 1/10 of the time).

In [11]:
py_list = list(range(1_000_000))  # list of 1,000,000 items
np_arr = np.arange(1_000_000)  # array of 1,000,000 items

In [12]:
# Note: for a list, `* 3` repeats the list (3,000,000 items) rather than multiplying each element
%timeit for _ in range(10): py_list2 = py_list * 3

278 ms ± 3.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [13]:
# For an array, `* 3` multiplies every element by 3
%timeit for _ in range(10): np_arr2 = np_arr * 3

13.5 ms ± 51.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
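
In the two timing cells above, `py_list * 3` repeats the list while `np_arr * 3` multiplies each element, so the two cells do different work. A minimal sketch of a like-for-like comparison follows, assuming a list comprehension is an acceptable stand-in for the pure-Python side. It uses the standard `timeit` module instead of the `%timeit` magic so it also runs outside a notebook; the absolute timings depend on the machine.

```python
import timeit

import numpy as np

n = 1_000_000
py_list = list(range(n))
np_arr = np.arange(n)

# Element-wise multiplication on both sides, so the work is comparable.
list_result = [x * 3 for x in py_list]
arr_result = np_arr * 3

# Time three passes of each operation.
list_time = timeit.timeit(lambda: [x * 3 for x in py_list], number=3)
arr_time = timeit.timeit(lambda: np_arr * 3, number=3)
print(f"list comprehension: {list_time:.4f} s, NumPy: {arr_time:.4f} s")
```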


##### Numpy로 배열 생성, 연산

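- Python lists support only the `+` (concatenation) and `*` (repetition) operators.
- When matrix (vector) operations are needed, convert the list to a NumPy array first.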
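
The difference between the two bullet points above can be sketched as follows. The variable names `concatenated`, `repeated`, `arr_sum`, and `arr_scaled` are illustrative only.

```python
import numpy as np

list1 = [1, 2, 3]
list2 = [4, 5, 6]

# On lists, + joins and * repeats.
concatenated = list1 + list2   # [1, 2, 3, 4, 5, 6]
repeated = list1 * 2           # [1, 2, 3, 1, 2, 3]

# On arrays, the same operators work element by element.
arr_sum = np.array(list1) + np.array(list2)   # array([5, 7, 9])
arr_scaled = np.array(list1) * 2              # array([2, 4, 6])

print(concatenated, repeated, arr_sum, arr_scaled)
```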

In [18]:
list1 = [1, 2, 3]
list2 = [4, 5, 6]
np_arr1 = np.array(list2)

In [15]:
np_arr1

array([4, 5, 6])

In [16]:
np_arr2 = np.array([[1, 2, 3], [4, 5, 6]])
np_arr2

array([[1, 2, 3],
       [4, 5, 6]])

In [19]:
np_arr3 = np.array(list1)
np_arr4 = np.array(list2)

In [20]:
np_arr3 + np_arr4

array([5, 7, 9])

In [24]:
np_arr4 - np_arr3

array([3, 3, 3])

In [22]:
np_arr3 * np_arr4

array([ 4, 10, 18])

In [23]:
np_arr3 / np_arr4

array([0.25, 0.4 , 0.5 ])
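
The four operators above all act element by element. For the matrix (vector) products mentioned earlier, NumPy provides the `@` operator. The sketch below is an illustration with made-up variable names (`v1`, `v2`, `m`), not part of the original notebook.

```python
import numpy as np

v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
m = np.array([[1, 2, 3], [4, 5, 6]])

# Dot product of two vectors: 1*4 + 2*5 + 3*6
dot = v1 @ v2          # 32

# Matrix-vector product: each row of m dotted with v1
mat_vec = m @ v1       # array([14, 32])

# A scalar is applied to every element (broadcasting).
scaled = v1 * 10       # array([10, 20, 30])

print(dot, mat_vec, scaled)
```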

In [27]:
np.zeros(4)

array([0., 0., 0., 0.])

In [28]:
np.zeros([2, 3])

array([[0., 0., 0.],
       [0., 0., 0.]])

In [29]:
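# np.empty allocates memory without initializing it; the zeros below are whatever happened to be there
np.empty([3, 2])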

array([[0., 0.],
       [0., 0.],
       [0., 0.]])
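
`np.empty` only reserves memory, so its contents should be treated as arbitrary until they are assigned. A minimal sketch of how it is usually paired with an explicit fill, alongside `np.ones` and `np.full` for arrays that need a known starting value:

```python
import numpy as np

# Allocate first, then overwrite every element before reading it.
buf = np.empty((3, 2))
buf.fill(7.0)

# Alternatives that initialize on creation.
ones = np.ones((2, 3))      # all elements are 1.0
full = np.full((2, 2), 7)   # all elements are 7

print(buf, ones, full, sep="\n")
```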

In [32]:
# Random numbers are used very often
np.random.randn(10)

array([ 1.00957781,  0.44940077,  1.22315938,  0.26349322, -0.29488459,
       -0.01390809, -0.78738421, -0.02602198,  0.06165978,  1.13737847])

In [31]:
np.random.randn(5, 4)

array([[-0.20856057,  0.6311399 , -1.08467717, -0.15465393],
       [ 2.75866995,  2.47453907, -0.41830504, -1.07734918],
       [-1.07323397, -0.19728483,  0.25612072,  1.77493533],
       [ 0.77022291, -0.38256428, -2.36353631, -0.6406423 ],
       [-1.82884837,  0.92515964, -0.62376046, -1.51914419]])
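
The values above change on every run. When reproducible numbers are needed, one option is the `Generator` interface with a fixed seed. This is a sketch using `np.random.default_rng`, which is a different entry point from the `np.random.randn` calls in the notebook; the seed value 42 is arbitrary.

```python
import numpy as np

# A seeded generator produces the same sequence on every run.
rng = np.random.default_rng(seed=42)
samples = rng.standard_normal(10)        # 10 draws from N(0, 1)
matrix = rng.standard_normal((5, 4))     # 5x4 array of draws

# A second generator with the same seed repeats the first sequence.
rng2 = np.random.default_rng(seed=42)
same = rng2.standard_normal(10)

print(np.array_equal(samples, same))
```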

In [33]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [34]:
list(np.arange(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
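
`np.arange` also accepts start, stop, and step arguments, like `range`. Note that `list(...)` keeps NumPy scalar elements, while the array's `tolist()` method converts them to plain Python `int` values. A short sketch:

```python
import numpy as np

evens = np.arange(0, 10, 2)          # array([0, 2, 4, 6, 8])

as_list = np.arange(10).tolist()     # elements are Python ints
via_list = list(np.arange(10))       # elements are NumPy integers

print(type(as_list[0]), type(via_list[0]))
```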

In [37]:
(np_arr3 + np_arr4) < 9

array([ True,  True, False])
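
A boolean array like the one above is commonly used as a mask to select or count elements. A minimal sketch, with `total` standing in for `np_arr3 + np_arr4`:

```python
import numpy as np

total = np.array([5, 7, 9])
mask = total < 9                     # array([ True,  True, False])

selected = total[mask]               # keeps only elements where mask is True
count = mask.sum()                   # True counts as 1
replaced = np.where(mask, total, 0)  # keeps matches, puts 0 elsewhere

print(selected, count, replaced)
```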

##### Numpy 속성

In [39]:
# A 1-D array has a single axis, so its shape is (number of elements,)
np_arr1.shape

(3,)

In [40]:
# From 2-D onward, the shape is ordered (rows, columns)
np_arr2.shape

(2, 3)

In [41]:
np_arr2.dtype

dtype('int32')

In [42]:
np_arr2.T.shape

(3, 2)
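
Beyond `shape`, `dtype`, and `T`, a few related attributes and methods are often used together. The sketch below rebuilds the same 2x3 array under the illustrative name `arr`; note that `reshape` and `T` give the same shape here but arrange the elements differently.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

ndim = arr.ndim                      # number of axes: 2
size = arr.size                      # total number of elements: 6

reshaped = arr.reshape(3, 2)         # [[1, 2], [3, 4], [5, 6]]
transposed = arr.T                   # [[1, 4], [2, 5], [3, 6]]

as_float = arr.astype(np.float64)    # copy with a different dtype

print(ndim, size, reshaped.shape, transposed.shape, as_float.dtype)
```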

##### Other functions

In [43]:
np_arr5 = np.arange(10)
np_arr5

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [44]:
# Mean
np_arr5.mean()

4.5

In [45]:
# Sum
np_arr5.sum()

45

In [46]:
# Cumulative sum
np_arr5.cumsum()

array([ 0,  1,  3,  6, 10, 15, 21, 28, 36, 45])
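
On a 2-D array, the same methods accept an `axis` argument: `axis=0` aggregates down the rows (one result per column) and `axis=1` aggregates across the columns (one result per row). A small sketch with an illustrative array named `grid`:

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)    # [[0, 1, 2], [3, 4, 5]]

col_sums = grid.sum(axis=0)          # array([3, 5, 7])
row_sums = grid.sum(axis=1)          # array([ 3, 12])
row_means = grid.mean(axis=1)        # array([1., 4.])
row_cumsum = grid.cumsum(axis=1)     # [[0, 1, 3], [3, 7, 12]]

print(col_sums, row_sums, row_means, row_cumsum, sep="\n")
```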

In [47]:
np_arr6 = np.random.randn(10)
np_arr6

array([-0.20058755, -0.37016235,  0.00860167,  0.62420602,  1.26741498,
       -0.66199845, -0.0343861 ,  0.56648025, -1.74099398,  0.01011436])

In [49]:
np_arr6 = np_arr6 * 10

In [50]:
np_arr6

array([ -2.00587554,  -3.70162347,   0.08601668,   6.24206019,
        12.67414985,  -6.61998454,  -0.34386103,   5.66480255,
       -17.40993983,   0.10114365])

In [53]:
# Sort in ascending order (in place)
np_arr6.sort()

In [54]:
np_arr6

array([-17.40993983,  -6.61998454,  -3.70162347,  -2.00587554,
        -0.34386103,   0.08601668,   0.10114365,   5.66480255,
         6.24206019,  12.67414985])

In [56]:
# Sort in descending order with an unusual idiom: negate, sort ascending, negate again
-np.sort(-np_arr6)

array([ 12.67414985,   6.24206019,   5.66480255,   0.10114365,
         0.08601668,  -0.34386103,  -2.00587554,  -3.70162347,
        -6.61998454, -17.40993983])
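
The negation trick above works for numeric data. Another common way to get descending order is to reverse an ascending sort with `[::-1]`. The sketch below also shows the difference between `np.sort` (returns a sorted copy) and the `sort()` method (sorts in place and returns `None`); the array `values` is illustrative.

```python
import numpy as np

values = np.array([3.0, -1.0, 2.0])

sorted_copy = np.sort(values)        # new array; values is unchanged
descending = np.sort(values)[::-1]   # reverse of the ascending sort
order = np.argsort(values)           # indices that would sort the array

unchanged = values.tolist()          # still [3.0, -1.0, 2.0]

result = values.sort()               # sorts in place, returns None

print(sorted_copy, descending, order, values, result)
```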