## NumPy

A module for numerical analysis, mathematical computation, statistics, and similar work. Libraries such as Pandas, Matplotlib, and SciPy build on it.

### NumPy
```
!pip install numpy
```

In [1]:
!pip install numpy






In [2]:
!pip show numpy

Name: numpy
Version: 1.25.2
Summary: Fundamental package for array computing in Python
Home-page: https://www.numpy.org
Author: Travis E. Oliphant et al.
Author-email: 
License: BSD-3-Clause
Location: C:\Source\iot-bigdata-2023\da_env\Lib\site-packages
Requires: 
Required-by: contourpy, matplotlib, pandas


In [3]:
!pip install matplotlib






In [1]:
import pandas as pd
import numpy as np

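##### Why use NumPy
- Python: creating a list of 1,000,000 items and processing it takes roughly 200 ms
- NumPy: creating an array of 1,000,000 items and processing it takes roughly 20 ms (about one tenth of the time; exact figures vary by machine)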

In [2]:
py_list = list(range(1_000_000)) # list of 1,000,000 items
np_arr = np.arange(1_000_000) # array of 1,000,000 items

In [3]:
# Note: on a list, * 3 repeats the list (3,000,000 items); it does not multiply each element
%timeit for _ in range(10): py_list2 = py_list * 3

776 ms ± 47.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [4]:
%timeit for _ in range(10): np_arr2 = np_arr * 3

44.1 ms ± 6.75 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
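
Because `*` means repetition for a list and element-wise multiplication for an array, the two timings above measure different work. The following is a minimal sketch of a like-for-like comparison. It uses the standard-library `timeit` module instead of the `%timeit` magic so that it also runs outside Jupyter. The size `n` and the repeat count are arbitrary choices, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

n = 100_000  # smaller than the 1,000,000 used above, only to keep the sketch quick
py_list = list(range(n))
np_arr = np.arange(n)

# Element-wise multiplication on a list needs an explicit loop or comprehension.
list_result = [x * 3 for x in py_list]
# NumPy applies the multiplication to every element in one vectorized call.
arr_result = np_arr * 3

# Both approaches produce the same values.
same_values = list_result == arr_result.tolist()

list_time = timeit.timeit(lambda: [x * 3 for x in py_list], number=10)
arr_time = timeit.timeit(lambda: np_arr * 3, number=10)
print(f"list comprehension: {list_time:.4f} s, NumPy: {arr_time:.4f} s")
```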


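##### Creating arrays and computing with NumPy

- Python lists support only the `+` (concatenation) and `*` (repetition) operators
- When matrix (vector) operations are needed, convert the list to a NumPy array first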

In [5]:
list1 = [1, 2, 3]
list2 = [4, 5, 6]
list1 + list2

[1, 2, 3, 4, 5, 6]

In [6]:
list2 * 3

[4, 5, 6, 4, 5, 6, 4, 5, 6]

In [7]:
list2 - list1

TypeError: unsupported operand type(s) for -: 'list' and 'list'

In [8]:
list2 / list1

TypeError: unsupported operand type(s) for /: 'list' and 'list'
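
For reference, element-wise subtraction or division on plain lists has to be spelled out by hand, for example with `zip` and a comprehension. A small sketch using the same `list1` and `list2` values:

```python
list1 = [1, 2, 3]
list2 = [4, 5, 6]

# Pair up the elements and subtract them one by one.
diff = [b - a for a, b in zip(list1, list2)]
# The same pattern works for division.
ratio = [b / a for a, b in zip(list1, list2)]

print(diff)   # [3, 3, 3]
print(ratio)  # [4.0, 2.5, 2.0]
```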

##### NumPy operations

In [9]:
list2 = [4, 5, 6]
np_arr1 = np.array(list2)

In [10]:
np_arr1

array([4, 5, 6])

In [11]:
np_arr2 = np.array([[1, 2, 3], [4, 5, 6]])
np_arr2

array([[1, 2, 3],
       [4, 5, 6]])

In [12]:
np_arr3 = np.array(list1)
np_arr4 = np.array(list2)

In [13]:
np_arr3 + np_arr4

array([5, 7, 9])

In [14]:
np_arr3 * np_arr4

array([ 4, 10, 18])

In [15]:
np_arr4 - np_arr3

array([3, 3, 3])

In [16]:
np_arr4 / np_arr3

array([4. , 2.5, 2. ])
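
The operators above all work element by element. Since this section mentions matrix (vector) operations, here is a minimal sketch of a dot product and a matrix-vector product with the `@` operator. The names `a`, `b`, and `m` are illustrative only.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
m = np.array([[1, 2, 3], [4, 5, 6]])

elementwise = a * b   # multiplies matching positions
dot = a @ b           # 1*4 + 2*5 + 3*6
mat_vec = m @ a       # each row of m dotted with a

print(elementwise)  # [ 4 10 18]
print(dot)          # 32
print(mat_vec)      # [14 32]
```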

In [20]:
np.zeros(4000)

array([0., 0., 0., ..., 0., 0., 0.])

In [21]:
np.zeros([2, 3])

array([[0., 0., 0.],
       [0., 0., 0.]])

In [22]:
# np.empty does not initialize the values; the zeros below are not guaranteed
np.empty([3, 2])

array([[0., 0.],
       [0., 0.],
       [0., 0.]])
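
`np.empty` only allocates memory, so its contents are arbitrary until they are assigned. A minimal sketch contrasting it with `np.zeros`, `np.ones`, and `np.full`; the variable names are illustrative:

```python
import numpy as np

zeros = np.zeros((3, 2))      # every element is 0.0
ones = np.ones((3, 2))        # every element is 1.0
sevens = np.full((3, 2), 7)   # every element is 7

# np.empty allocates without initializing: only shape and dtype are reliable.
scratch = np.empty((3, 2))
scratch[:] = 1.5              # assign before reading the values

print(scratch.shape, scratch.dtype)  # (3, 2) float64
```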

In [29]:
# Random arrays are used often; randn samples from the standard normal distribution
np.random.randn(50)

array([ 0.0273828 ,  1.03079807, -0.39929704, -0.0557112 , -0.34929211,
        0.26664923, -0.42705386, -0.04869016, -0.23441697,  0.87855197,
       -1.21005883,  0.99404588, -2.14347293, -0.054699  ,  1.09650419,
        0.089915  ,  0.50565635, -1.0852419 ,  0.58486549,  0.40226606,
       -0.58560339,  0.25808484, -0.46787945,  2.36816636,  0.55409498,
        1.09391543, -0.90269189, -2.19146445, -0.04520954,  0.85819842,
       -0.34216046, -0.03462206,  0.2225608 , -1.04981676, -0.49249176,
        0.82797861, -2.84866569, -0.01297863,  0.6781788 , -1.07419867,
        0.46562511,  0.33710269, -0.61190447, -2.96825951,  0.39386314,
       -0.38169288, -0.42191319,  0.0591933 ,  2.23087898, -1.39786804])

In [25]:
np.random.randn(5, 4)

array([[ 0.47690381, -1.21013081,  0.55043987, -0.19568322],
       [ 0.13858958, -2.43928625,  0.58174316,  0.15077205],
       [-1.44025524, -1.33035312, -0.39024396, -1.75459458],
       [-1.8338721 ,  0.70041134,  0.46379291, -1.44116854],
       [-0.35573163,  0.17418356, -0.45742329, -0.17144576]])
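
Random output changes on every run. When reproducible values are needed, one option is a seeded generator. This sketch assumes NumPy 1.17 or later for `np.random.default_rng`, and the seed `42` is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
# standard_normal draws from the same distribution as np.random.randn
# (mean 0, standard deviation 1).
sample = rng.standard_normal((5, 4))

# Re-creating the generator with the same seed reproduces the same values.
again = np.random.default_rng(42).standard_normal((5, 4))

print(sample.shape)                   # (5, 4)
print(np.array_equal(sample, again))  # True
```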

In [30]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [31]:
list(np.arange(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [37]:
(np_arr3 + np_arr4) < 9

array([ True,  True, False])

In [40]:
(np_arr3 + np_arr4) == 9

array([False, False,  True])
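
A comparison returns a boolean array, which is typically used as a mask to select, count, or replace elements. A minimal sketch with the same values as above; `total`, `mask`, and the other names are illustrative:

```python
import numpy as np

total = np.array([1, 2, 3]) + np.array([4, 5, 6])  # array([5, 7, 9])
mask = total < 9                                    # array([ True,  True, False])

selected = total[mask]               # keep only elements where mask is True
count = int(mask.sum())              # True counts as 1
replaced = np.where(mask, 0, total)  # 0 where mask is True, original value elsewhere

print(selected)  # [5 7]
print(count)     # 2
print(replaced)  # [0 0 9]
```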

#### NumPy attributes

In [45]:
# A 1-D array's shape is a one-element tuple: (number of elements,)
np_arr1.shape

(3,)

In [47]:
# For a 2-D array, shape is (rows, columns)
np_arr2.shape

(2, 3)

In [48]:
np_arr2.dtype

dtype('int32')

In [51]:
np_arr2.T.shape

(3, 2)
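
A few related attributes are worth knowing alongside `shape`, `dtype`, and `T`. The sketch below passes `dtype` explicitly because the default integer type depends on the platform and NumPy version; the `int32` shown above likely reflects the Windows default. It also shows that `reshape(3, 2)` is not the same as a transpose, even though the resulting shape matches.

```python
import numpy as np

# dtype is given explicitly because the default integer type is platform dependent
arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(arr.ndim)    # 2
print(arr.size)    # 6
print(arr.shape)   # (2, 3)
print(arr.dtype)   # int64

transposed = arr.T            # rows and columns swapped: [[1, 4], [2, 5], [3, 6]]
reshaped = arr.reshape(3, 2)  # same element order, new layout: [[1, 2], [3, 4], [5, 6]]
```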

##### Other functions

In [52]:
np_arr5 = np.arange(10)
np_arr5

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [53]:
# Mean
np_arr5.mean()

4.5

In [54]:
# Sum
np_arr5.sum()

45

In [55]:
# Cumulative sum
np_arr5.cumsum()

array([ 0,  1,  3,  6, 10, 15, 21, 28, 36, 45])
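
On a 2-D array the same functions accept an `axis` argument that chooses the direction of the reduction. A minimal sketch; `grid` is an illustrative name:

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

col_sums = grid.sum(axis=0)       # collapse rows: one sum per column
row_sums = grid.sum(axis=1)       # collapse columns: one sum per row
row_cumsum = grid.cumsum(axis=1)  # running total along each row

print(col_sums)    # [3 5 7]
print(row_sums)    # [ 3 12]
print(row_cumsum)
```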

In [56]:
np_arr6 = np.random.randn(10)
np_arr6

array([-0.60660985,  1.92055891,  1.24534515,  1.48247981,  1.17812766,
       -1.1471446 , -0.01175818, -1.07862851, -0.50720267, -0.80396151])

In [61]:
np_arr6 = np_arr6 * 10

In [62]:
np_arr6

array([ -6.06609845,  19.20558906,  12.45345155,  14.8247981 ,
        11.78127659, -11.47144602,  -0.11758181, -10.78628514,
        -5.0720267 ,  -8.03961511])

In [63]:
# Sort in ascending order (in place)
np_arr6.sort()

In [65]:
np_arr6

array([-11.47144602, -10.78628514,  -8.03961511,  -6.06609845,
        -5.0720267 ,  -0.11758181,  11.78127659,  12.45345155,
        14.8247981 ,  19.20558906])

In [68]:
# Sort in descending order: negate, sort ascending, negate again (an unusual idiom!)
-np.sort(-np_arr6)

array([ 19.20558906,  14.8247981 ,  12.45345155,  11.78127659,
        -0.11758181,  -5.0720267 ,  -6.06609845,  -8.03961511,
       -10.78628514, -11.47144602])
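
The negation idiom works, but reversing an ascending sort gives the same result and may be easier to read. The sketch below also shows the difference between `np.sort`, which returns a sorted copy, and the `sort()` method, which sorts in place. The sample values are made up for illustration.

```python
import numpy as np

values = np.array([3.0, -1.0, 2.0])

sorted_copy = np.sort(values)   # returns a new sorted array; values is unchanged
descending = sorted_copy[::-1]  # reverse the ascending result
negated = -np.sort(-values)     # the negation idiom shown above

values.sort()                   # sorts in place and returns None

print(descending)  # [ 3.  2. -1.]
```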