# NumPy: Numerical Python

- Mathematical arrays, Matrices
- Mathematical Functions

NumPy's main object is the homogeneous multi-dimensional array.
It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.

Dimensions are called **axes**. The number of axes is the **rank**.

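[1, 2, 1] is an array of rank 1.
[[1,0,0], [0,1,2]] is an array of rank 2.

Its first axis has length 2 and its second axis has length 3.

For a 2-D array, axis 0 indexes the rows (it runs down each column) and axis 1 indexes the columns (it runs along each row); higher dimensions continue with axis 2, axis 3, ...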
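
The following is a small sketch of how axes show up in practice, reusing the rank-2 example above. The variable names (`m`, `col_sums`, `row_sums`) are illustrative only.

```python
import numpy as np

# The rank-2 example from above: 2 rows, 3 columns
m = np.array([[1, 0, 0],
              [0, 1, 2]])

print(m.ndim)    # number of axes (rank)
print(m.shape)   # (length of axis 0, length of axis 1)

# Reducing along axis 0 collapses the rows, leaving one value per column
col_sums = m.sum(axis=0)
# Reducing along axis 1 collapses the columns, leaving one value per row
row_sums = m.sum(axis=1)
print(col_sums, row_sums)
```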


* np.set_printoptions(threshold=sys.maxsize) # force NumPy to print the entire array (needs `import sys`; recent NumPy versions reject 'nan' here)

## 0. Basic Principles

In C, the data type of each variable is explicitly declared, whereas in Python types are dynamically inferred. This means, for example, that we can assign any kind of data to any variable.

This sort of flexibility is one piece that makes Python and other dynamically typed languages convenient and easy to use. Understanding how this works is an important piece of learning to analyze data efficiently.


### 1) A Python Integer is more than just an integer 
    
1) The standard Python implementation (CPython) is written in C.

2) This means that every Python object is simply a disguised C structure.

3) A single integer in Python actually contains four pieces:
    - reference count
    - type of the variable
    - size of the following data member
    - actual integer value
4) By contrast, a C integer is essentially a label for a position in memory whose bytes encode an integer value.

5) A Python integer is a pointer to a position in memory containing all the Python object information.

6) ***This extra information is what allows Python to be coded so freely and dynamically. However, all this additional information comes at a cost.***
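
One rough way to see this overhead is to compare the size of a Python integer object with the size of one element in a NumPy array. This is only a sketch: `sys.getsizeof` reports an implementation-specific number, so the exact value depends on the interpreter and platform.

```python
import sys
import numpy as np

py_int = 1
np_arr = np.array([1], dtype=np.int64)

# Size of the whole Python int object (header plus value); implementation specific
py_int_bytes = sys.getsizeof(py_int)
# Size of one raw int64 element stored inside a NumPy array
np_item_bytes = np_arr.itemsize

print(py_int_bytes, np_item_bytes)
```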




### 2) A Python list is more than just a list

1) The standard mutable multielement container in Python is the list.
    - L = list(range(10))        
    - L2 = [str(c) for c in L]
    
2) Because of Python’s dynamic typing, we can create heterogeneous lists.

3) But this flexibility comes at a cost: to allow these flexible types, each item in the list must carry its own type information, reference count, and other metadata. In other words, each item is a complete Python object.

4) So, in the special case that all variables are of the same type, much of this information is redundant.
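
As a small illustration of the points above, the sketch below builds a heterogeneous list and inspects the type of each item, then contrasts it with a NumPy array that records a single type for all elements. The names `mixed` and `homogeneous` are made up for this example.

```python
import numpy as np

# A heterogeneous list: every item is a full Python object with its own type
mixed = [True, "2", 3.0, 4]
item_types = [type(item).__name__ for item in mixed]
print(item_types)   # ['bool', 'str', 'float', 'int']

# A NumPy array stores one dtype shared by all elements
homogeneous = np.array([1, 2, 3, 4], dtype=np.int64)
print(homogeneous.dtype)
```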




### 3) Fixed-Type Arrays in Python

1) Python offers several different options for storing data in efficient, fixed-type data buffers.
    - import array
    - array.array('i', L)

2) Much more useful, however, is the ndarray object of the NumPy package.

3) Python's built-in array object provides efficient storage of array-based data; NumPy adds efficient operations on that data.
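
The difference between storage and operations can be sketched as follows. This is a minimal comparison, not a benchmark; the names `A`, `N`, `doubled` and `repeated` are illustrative.

```python
import array
import numpy as np

L = list(range(10))

# Built-in fixed-type buffer: compact storage, but few numerical operations
A = array.array('i', L)

# NumPy ndarray: fixed-type storage plus element-wise arithmetic
N = np.array(L, dtype=np.int32)
doubled = N * 2          # multiplies every element by 2

# With a plain list, * means repetition, not arithmetic
repeated = L * 2
print(A.typecode, doubled[:3], len(repeated))
```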




*So that's why we use NumPy!*

## 1. Array Creation

### 1) Specified Creation

In [1]:
import numpy as np

# Numpy Array

a = np.array([[1,2,3], [4,5,6]])
print(a)

print(np.array([ (1,2,3), (4,5,6)])) #same as above

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]


In [4]:
# array creation with list

np.array( [range(i,i+3) for i in [2,4,6] ])

array([[2, 3, 4],
       [4, 5, 6],
       [6, 7, 8]])

In [6]:
# array with type

c = np.array([[1,2],[3,4]], dtype=complex)
c

array([[ 1.+0.j,  2.+0.j],
       [ 3.+0.j,  4.+0.j]])

### 2) Empty / Random Creation

In [9]:
# arrays of zeros, ones, and uninitialized values

with_zero = np.zeros( (3,4) )
with_one = np.ones( (2,3,4), dtype=np.int16 )
with_empty = np.empty( 2, dtype=float )  # uninitialized: holds whatever was in memory, not random numbers

print(with_zero)
print(with_one)
print(with_empty)

[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]
[[[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]

 [[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]]
[  6.93166827e-310   4.67901755e-310]


In [5]:
# array filled with a given value

np.full( (3,5), 3.14 )

array([[ 3.14,  3.14,  3.14,  3.14,  3.14],
       [ 3.14,  3.14,  3.14,  3.14,  3.14],
       [ 3.14,  3.14,  3.14,  3.14,  3.14]])

In [7]:
# arrays with random values
print(np.random.random((3,3)))         # uniform on [0, 1)
print(np.random.normal(0, 1, (3,3)))   # normal distribution, mean 0, sd 1
print(np.random.randint(0,10,(3,3)))   # integers in [0, 10)
print(np.eye(3))  # identity matrix

[[ 0.08210476  0.74760541  0.81001923]
 [ 0.20704571  0.48183279  0.87213213]
 [ 0.51277309  0.31657756  0.00436055]]
[[-0.15963933 -0.24277317  1.1375256 ]
 [ 0.42268912  0.63659462 -0.75103618]
 [-0.52807377  0.1756799  -0.44492371]]
[[3 9 2]
 [1 5 7]
 [8 5 2]]
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]


### 3) Sequential Creation

In [11]:
print(np.arange(3))       # from 0 up to (not including) 3
print(np.arange(10,30,5)) # from 10 up to (not including) 30, step 5
print(np.arange(0,2,0.3)) # float step: the number of elements is hard to predict because of rounding
print(np.linspace(0,2,9)) # 9 evenly spaced numbers from 0 to 2 (inclusive), so the count is known
print(np.r_[1:4, 0, 4])   # special usage: concatenates slices and scalars into one array

[0 1 2]
[10 15 20 25]
[ 0.   0.3  0.6  0.9  1.2  1.5  1.8]
[ 0.    0.25  0.5   0.75  1.    1.25  1.5   1.75  2.  ]
[1 2 3 0 4]
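
The remark about float steps in `np.arange` can be checked with a short sketch. The values `1`, `1.3` and `0.1` are chosen here only for illustration; because of floating-point rounding the length of the `arange` result is printed rather than assumed.

```python
import numpy as np

# With a float step, rounding decides whether a value near the end point slips in,
# so the length is best checked rather than assumed
stepped = np.arange(1, 1.3, 0.1)
print(len(stepped), stepped)

# linspace takes the number of points directly, end point included
spaced = np.linspace(1, 1.3, 4)
print(len(spaced), spaced)
```

When the number of points matters more than the exact step, `np.linspace` is usually the safer choice.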


### 4) Attributes

In [8]:
a = np.arange(12).reshape(3,4)

print(a.ndim)     # number of axes of the array (its rank)
print(a.shape)    # length of the array along each axis
print(a.size)     # total number of elements = shape[0]*shape[1]
print(a.dtype)    # type of the elements
print(a.itemsize) # size in bytes of each element
print(a.nbytes)   # total size in bytes = itemsize * size
print(a.data)     # buffer containing the actual elements

2
(3, 4)
12
int64
8
96
<memory at 0x7f7b890b3cf0>


## 2. Array Indexing / Slicing / Iterating

In [17]:
# Array Indexing 

a = np.arange(10)**3
print(a)
print(a[-1])     # last element
print(a[2])
print(a[2:5])
print(a[:6:2])   # indices 0 to 5, every 2nd element
print(a[::-1])   # all elements in reverse order


# fromfunction

def f(x,y) :
    return 10*x+y

b = np.fromfunction(f, (5,4), dtype=int )  # f receives the index arrays as its arguments
print(b)


[  0   1   8  27  64 125 216 343 512 729]
729
8
[ 8 27 64]
[ 0  8 64]
[729 512 343 216 125  64  27   8   1   0]
[[ 0  1  2  3]
 [10 11 12 13]
 [20 21 22 23]
 [30 31 32 33]
 [40 41 42 43]]


In [19]:
# Indexing with booleans
b = a>16
a[b]

array([ 27,  64, 125, 216, 343, 512, 729])

In [24]:
# Matrix Indexing / Slicing

two_D_array = np.array([[1,2,3], [4,5,6]], float)
print(two_D_array)
print(two_D_array[1][1])
print(two_D_array[1, :])
print(two_D_array[:,2])

[[ 1.  2.  3.]
 [ 4.  5.  6.]]
5.0
[ 4.  5.  6.]
[ 3.  6.]


* When fewer indices are provided than the number of axes, the missing indices are considered complete slices 

In [24]:
b = np.fromfunction(f, (5,4), dtype=int )  # f receives the index arrays as its arguments

b[-1]  # equivalent to b[-1,:] , last row


array([40, 41, 42, 43])

* The dots (...) represent as many colons as needed to produce a complete indexing tuple.
* For an array x with 5 axes, x[1,2,...] is equivalent to x[1,2,:,:,:]

In [25]:
c = np.array( [[[0,1,2], [10,12,13]], [[100,101,102], [110,112,113]]])
print(c.shape)
print(c[1,...])  # same as c[1,:,:] or c[1]
print(c[...,2])  # same as c[:,:,2]

(2, 2, 3)
[[100 101 102]
 [110 112 113]]
[[  2  13]
 [102 113]]


* Iterating over a multi-dimensional array is done with respect to the first axis (one row at a time for a 2-D array)

In [26]:
for row1 in b:
    print(row1)
    
for element in b.flat :
    print(element)            # iterate over all the elements

[0 1 2 3]
[10 11 12 13]
[20 21 22 23]
[30 31 32 33]
[40 41 42 43]
0
1
2
3
10
11
12
13
20
21
22
23
30
31
32
33
40
41
42
43


One important thing to know about array slices is that they return views rather than copies of the array data. In a Python list, by contrast, slices are copies.
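
The difference between views and copies can be sketched as follows. The variable names are illustrative; `np.shares_memory` and `.copy()` are standard NumPy calls.

```python
import numpy as np

arr = np.arange(5)
view = arr[1:3]        # slice of an ndarray: a view on the same data
view[0] = 99           # also changes arr[1]

lst = list(range(5))
lst_slice = lst[1:3]   # slice of a list: a new list
lst_slice[0] = 99      # lst is unchanged

independent = arr[1:3].copy()   # explicit copy when a view is not wanted
independent[0] = -1             # arr is not affected by this
print(arr, lst)
print(np.shares_memory(arr, view), np.shares_memory(arr, independent))
```

Using `.copy()` is the usual way to get an independent array when later modifications should not propagate back to the original.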

## 3. Arithmetic Operations

In [37]:
# Basic Statistics

numbers = [1,2,3,4,5]
print(np.mean(numbers))
print(np.median(numbers))
print(np.std(numbers))

print("Now with npArray")

number = np.array([1,2,3,4,5], int)
print(number.sum())
print(number.mean())
print(number.min())

3.0
3.0
1.41421356237
Now with npArray
15
3.0
1


In [36]:
# Basic Statistics 2

b = np.arange(12).reshape(3,4)

print(b)
print(b.sum(axis=0))      # sum of each column
print(b.min(axis=1))      # min of each row
print(b.cumsum(axis=1))   # cumulative sum along each row

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[12 15 18 21]
[0 4 8]
[[ 0  1  3  6]
 [ 4  9 15 22]
 [ 8 17 27 38]]


In [33]:
# Array arithmetics

array_1 = np.array([1,2,3], float)
array_2 = np.array([5,2,6], float)
print(array_1+array_2)
print(array_1*array_2)

print(np.dot(array_1, array_2)) # dot product of the two vectors

[ 6.  4.  9.]
[  5.   4.  18.]
27.0


In [29]:
# Array arithmetics 2

a = np.array([20,30,40,50])
b = np.arange(4)

print(a<35)
print(np.exp(b))
print(np.add(a,b))

[ True  True False False]
[  1.           2.71828183   7.3890561   20.08553692]
[20 31 42 53]


In [30]:
# Matrix arithmetics

array_1 = np.array([[1,2], [3,4]], float)
array_2 = np.array([[5,6], [7,8]], float)
print(array_1+array_2)
print(array_1*array_2) # note: element-wise product, not the matrix product

print(np.dot(array_1, array_2)) # this is the matrix product (dot product)
print(array_1.dot(array_2)) # same as above

[[  6.   8.]
 [ 10.  12.]]
[[  5.  12.]
 [ 21.  32.]]
[[ 19.  22.]
 [ 43.  50.]]
[[ 19.  22.]
 [ 43.  50.]]
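
As a side note, Python 3.5+ also provides the `@` operator for matrix multiplication. The sketch below, with illustrative names `m1` and `m2`, contrasts it with the element-wise `*`; for 2-D arrays like these it should give the same result as `np.dot`.

```python
import numpy as np

m1 = np.array([[1, 2], [3, 4]], float)
m2 = np.array([[5, 6], [7, 8]], float)

elementwise = m1 * m2     # multiplies matching entries
matrix_prod = m1 @ m2     # matrix product, same as np.dot(m1, m2) for 2-D arrays
print(elementwise)
print(matrix_prod)
```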


* Upcasting occurs when arrays of different data types are combined: the result takes the more general (more precise) type.

In [32]:
a = np.ones(3, np.int32)
b = np.linspace(0, np.pi, 3)
print(b.dtype)
c = a+b
print(c.dtype)  # int + float is upcast to the more general type (float, in this case)

float64
float64
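
To check which type an operation would be upcast to without building the arrays first, `np.result_type` can be used. This is a small sketch; the variable names are illustrative.

```python
import numpy as np

# np.result_type reports the dtype that combining the given types would produce
int_plus_float = np.result_type(np.int32, np.float64)
float_plus_complex = np.result_type(np.float64, np.complex128)
print(int_plus_float, float_plus_complex)

# The same rule applies to an actual operation between arrays
a = np.ones(3, dtype=np.int32)
b = np.linspace(0, np.pi, 3)
summed = a + b
print(summed.dtype)
```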
