# Numpy Basics

## Import Numpy package

In [1]:
import numpy as np

## Define a numpy ndarray

In [2]:
arr = np.arange(1, 6)

## Numpy Array Output

In [3]:
arr

array([1, 2, 3, 4, 5])

In [4]:
print(arr)

[1 2 3 4 5]


## Ndarray Operation
For example, add 1 to each element of the list [1,2,3,4,5].
- Use a Python list to add one to each element

In [5]:
# Method 1: Loop
lst = [1,2,3,4,5]
for i in range(len(lst)):
    lst[i] = lst[i] + 1
print(lst)

[2, 3, 4, 5, 6]


In [6]:
# Method 2: List comprehension
lst1 = [1,2,3,4,5]
lst1 = [i + 1 for i in lst1]
print(lst1)

[2, 3, 4, 5, 6]


- Use a NumPy ndarray

In [7]:
arr = np.arange(1,6)
arr = arr + 1  # element-wise addition returns a new array, so assign the result
arr

array([2, 3, 4, 5, 6])
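Arithmetic on an ndarray returns a new array rather than modifying the original. The following is a minimal sketch of that behaviour; the names `plus_one` and `arr_inplace` are illustrative and do not appear elsewhere in this notebook.

```python
import numpy as np

arr = np.arange(1, 6)

# `arr + 1` builds a new array; `arr` itself is left untouched.
plus_one = arr + 1

# An in-place operator modifies the existing array instead.
arr_inplace = np.arange(1, 6)
arr_inplace += 1

print(arr.tolist())          # [1, 2, 3, 4, 5]
print(plus_one.tolist())     # [2, 3, 4, 5, 6]
print(arr_inplace.tolist())  # [2, 3, 4, 5, 6]
```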

NumPy is preferred for numerical computation because its core is written in C, which makes array operations considerably faster than equivalent pure-Python loops.
For example, compute the sum of the integers from 0 to 9999.
- Use Python's built-in sum with the IPython magic command "%timeit", which reports the mean time per loop over 7 runs

In [8]:
a = list(range(10000))
%timeit sum(a)

57.3 µs ± 204 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


- Use NumPy to calculate the sum of the same integers

In [9]:
b = np.array(a)
%timeit np.sum(b)

5.27 µs ± 92.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
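Outside IPython, where `%timeit` is not available, a similar comparison can be sketched with the standard-library `timeit` module. The repetition count of 1,000 is an arbitrary choice for illustration, and the absolute timings depend on the machine.

```python
import timeit

import numpy as np

a = list(range(10000))
b = np.array(a)

# Both approaches must agree on the result before timing is meaningful.
assert sum(a) == int(np.sum(b))

# Total time for 1,000 calls of each function.
t_list = timeit.timeit(lambda: sum(a), number=1000)
t_numpy = timeit.timeit(lambda: np.sum(b), number=1000)
print(f"built-in sum: {t_list:.4f} s, np.sum: {t_numpy:.4f} s")
```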


# Ndarray object
ndarray = n-dimensional array.

The ndarray object is a multi-dimensional array used to store elements of the same type. Each element in the ndarray has a memory area with the same storage size.

One-dimensional array: [1,2,3,4,5]        (Vector)

Two-dimensional array: [[1,2,3],
                        [4,5,6]]          (Matrix)

Three-dimensional array: [[[1,2,3],
                           [4,5,6]],
                          [[7,8,9],
                           [10,11,12]]]   (Tensor)

## Generate ndarray
### Generate a one-dimensional array

In [10]:
import numpy as np
a1 = [1,2,3,4,5]
arr1 = np.array(a1)
arr1

array([1, 2, 3, 4, 5])

### Generate a two-dimensional array (A matrix)

In [11]:
a2 = [[1,2,3],
      [4,5,6]]
arr2 = np.array(a2)
arr2

array([[1, 2, 3],
       [4, 5, 6]])

### Generate a three-dimensional array (A tensor)

In [12]:
a3 = [[[1,2,3],
       [4,5,6]],
      [[7,8,9],
       [10,11,12]]]
arr3 = np.array(a3)
arr3

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

## Properties of ndarray

| Property Name    | Explanation                                          |
| :--------------- | :--------------------------------------------------- |
| ndarray.shape    | the shape of the ndarray (its length along each axis) |
| ndarray.ndim     | the number of dimensions (axes) of the ndarray       |
| ndarray.size     | the total number of elements in the ndarray          |
| ndarray.itemsize | the size of each element, in bytes                   |
| ndarray.dtype    | the data type of each element                        |


### Look at the shape of an ndarray

In [13]:
arr2.shape

(2, 3)

In [14]:
arr3.shape

(2, 2, 3)

### Look at the number of dimensions of an ndarray

In [15]:
arr2.ndim

2

In [16]:
arr3.ndim

3

### Look at the number of elements in an ndarray

In [17]:
arr2.size

6

In [18]:
arr3.size

12

### Look at the size of each element

In [19]:
arr2.itemsize

4

In [20]:
arr3.itemsize

4

### Look at the type of each element

In [21]:
arr2.dtype

dtype('int32')
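The properties above are related to one another. As a small sketch (the array here is built with an explicit `np.int32` dtype so the byte counts do not depend on the platform, and `nbytes` is an additional standard ndarray attribute not listed in the table):

```python
import numpy as np

arr = np.arange(12, dtype=np.int32).reshape(2, 2, 3)

print(arr.shape)     # (2, 2, 3)
print(arr.ndim)      # 3, i.e. len(arr.shape)
print(arr.size)      # 12, i.e. 2 * 2 * 3
print(arr.itemsize)  # 4 bytes for int32
print(arr.nbytes)    # 48, i.e. size * itemsize
```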

## Ndarray element type

| Type Name          | Description                                 | Abbreviation           |
| ------------------ | ------------------------------------------- | ---------------------- |
| np.bool_           | Boolean type (1 byte)                       | '?'                    |
| np.int8/16/32/64   | Signed 8/16/32/64-bit integers              | 'i1', 'i2', 'i4', 'i8' |
| np.uint8/16/32/64  | Unsigned 8/16/32/64-bit integers            | 'u1', 'u2', 'u4', 'u8' |
| np.float16         | Half-precision 16-bit floating point        | 'f2'                   |
| np.float32         | Single-precision 32-bit floating point      | 'f4'                   |
| np.float64         | Double-precision 64-bit floating point      | 'f8'                   |
| np.complex64/128   | Complex number with 32/64-bit floats for the real and imaginary parts | 'c8', 'c16' |
| np.string_         | Byte string (named np.bytes_ in NumPy 2.0 and later) | 'S'           |
| np.str_            | Unicode string                              | 'U'                    |
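The string abbreviations can be passed to `np.dtype` (or to any `dtype=` argument) in place of the type objects. A short sketch checking a few of the pairs from the table; the `pairs` list is only an illustration:

```python
import numpy as np

# Each string code names the same dtype as the corresponding NumPy type.
pairs = [
    ("?", np.bool_),
    ("i1", np.int8),
    ("i4", np.int32),
    ("u2", np.uint16),
    ("f2", np.float16),
    ("f8", np.float64),
    ("c8", np.complex64),
    ("c16", np.complex128),
]
for code, np_type in pairs:
    assert np.dtype(code) == np.dtype(np_type)
    print(code, "->", np.dtype(code).name, np.dtype(code).itemsize, "byte(s)")
```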

## Data Type

### Compare types and size

In [22]:
# Set the ndarray data type as int8 (1 byte)
a4 = np.array([1, 2, 3, 4, 5], dtype = np.int8)
a4

array([1, 2, 3, 4, 5], dtype=int8)

In [23]:
# Look at the size of each element: 1 byte
a4.itemsize

1

In [24]:
# Set the ndarray data type as float64 (8 bytes)
a5 = np.array([1,2,3,4,5], dtype = np.float64)
a5

array([1., 2., 3., 4., 5.])

In [25]:
a5.itemsize

8

In [26]:
lst = list("abcd")
a6 = np.array(lst)  # Unicode strings get the U type; each character takes 4 bytes
print(a6.dtype, a6.itemsize)

<U1 4


In [27]:
lst2 = list("abcd")
a7 = np.array(lst2, dtype = np.string_)  # byte strings get the S type; each character takes 1 byte
print(a7.dtype, a7.itemsize)

|S1 1


In [28]:
a8 = np.array(["SLM", "JPM", "BOA","CITI", "CAPITAL ONE"], dtype = np.string_) # itemsize is 11 bytes because the longest string, "CAPITAL ONE", has 11 characters
print(a8.dtype, a8.itemsize)

|S11 11
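The element size of a string array is set by its longest string. The sketch below compares the two string types on the same data; it uses `np.bytes_`, which names the byte-string type in both older and newer NumPy releases, and the `<` in the printed Unicode dtype assumes a little-endian machine.

```python
import numpy as np

names = ["SLM", "JPM", "CAPITAL ONE"]
as_unicode = np.array(names)                 # U11: 4 bytes per character
as_bytes = np.array(names, dtype=np.bytes_)  # S11: 1 byte per character

print(as_unicode.dtype, as_unicode.itemsize)  # <U11 44
print(as_bytes.dtype, as_bytes.itemsize)      # |S11 11
```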


### Convert the type

In [29]:
a9 = np.array([1.5, 2, 3], dtype = np.float64)
a9.dtype, a9.itemsize

(dtype('float64'), 8)

Now convert it to int32 using the ndarray method "astype"

In [30]:
a10 = a9.astype(np.int32)
a10, a10.dtype, a10.itemsize

(array([1, 2, 3]), dtype('int32'), 4)

- astype(type)  
type: the data type to convert the array to.
> int32 --> float64: the target type is wide enough, so no information is lost.
>
> float64 --> int32: the target type cannot hold fractions, so the decimal part is truncated.
>
> string_ --> float64: if every string in the array represents a number, astype can also convert it to a numeric data type.
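The three conversions listed above can be sketched as follows. The sample values are made up for illustration, and the string case uses a Unicode string array rather than `np.string_`.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int32)
floats = np.array([1.5, -2.7, 3.9], dtype=np.float64)
numeric_strings = np.array(["1.5", "2", "3e2"])  # hypothetical sample data

as_float = ints.astype(np.float64)           # widening: values preserved
as_int = floats.astype(np.int32)             # fraction dropped, truncating toward zero
parsed = numeric_strings.astype(np.float64)  # strings parsed as numbers

print(as_float.tolist())  # [1.0, 2.0, 3.0]
print(as_int.tolist())    # [1, -2, 3]
print(parsed.tolist())    # [1.5, 2.0, 300.0]
```

Note that truncation is not rounding: -2.7 becomes -2 and 3.9 becomes 3.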

### tolist(): Convert the ndarray to a Python list

In [31]:
a11 = np.array([[1,2,3], [4,5,6]])
lst4 = a11.tolist()
lst4, type(lst4)  # See the type of a Python list

([[1, 2, 3], [4, 5, 6]], list)
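For comparison, here is a brief sketch of how `tolist()` differs from calling the built-in `list()` on an array: `tolist()` converts every level down to Python scalars, while `list()` only unpacks the first axis. The names `nested` and `shallow` are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

nested = arr.tolist()  # fully converted: nested lists of Python ints
shallow = list(arr)    # only the outer axis: a list of 1-D ndarrays

print(type(nested[0]), type(nested[0][0]))  # <class 'list'> <class 'int'>
print(type(shallow[0]))                     # <class 'numpy.ndarray'>
```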

### The data type is promoted automatically.

All elements of an ndarray must share the same data type. If the input mixes types, NumPy automatically promotes every element to the most general type present, in the order int $\rightarrow$ float $\rightarrow$ str.

In [32]:
a12 = np.array([1,2,3,4,5])
a12.dtype   # all elements are integers, so the dtype is an integer type

dtype('int32')

In [33]:
a13 = np.array([1, 2.5, 3, 4, 5])
a13.dtype  # one element is the float 2.5, so every element is stored as a float

dtype('float64')

In [34]:
a14 = np.array([1, 2.5, 3, 4, 5, 'SLM'])
a14.dtype    # one element is a string, so every element is converted to a Unicode string

dtype('<U32')

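Evaluating a14*2 raises a UFuncTypeError, because multiplication is not defined for Unicode string arrays. If the array is created with dtype np.object_ instead, each element stays a Python object and the multiplication works.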

In [35]:
a15 = np.array([1, 2.5, 3, 4, 5, 'SLM'], dtype = np.object_)
a15*2  # works on object arrays, but the result is not assigned, so a15 is unchanged
a15, a15.dtype

(array([1, 2.5, 3, 4, 5, 'SLM'], dtype=object), dtype('O'))
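To see what the multiplication actually produces on an object array, the result can be assigned and inspected. A minimal sketch with a shortened, illustrative array:

```python
import numpy as np

mixed = np.array([1, 2.5, 3, 'SLM'], dtype=np.object_)

# With dtype=object, `*` is applied to each Python object separately:
# numbers are doubled, while the string is repeated.
doubled = mixed * 2
print(doubled.tolist())  # [2, 5.0, 6, 'SLMSLM']
```

Object arrays give up most of NumPy's speed advantage, since each operation is dispatched to ordinary Python objects.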

# Create Numpy Ndarray
## Create numpy ndarray based on existing data
### array
**Syntax:** array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)

In [36]:
a16 = [1,2,3,4,5]
a17 = [[1,2,3],
       [4,5,6]]

In [37]:
arr4 = np.array(a16)
arr5 = np.array(a17)
print("arr4:", arr4)
print("arr5:", arr5)

arr4: [1 2 3 4 5]
arr5: [[1 2 3]
 [4 5 6]]


An ndarray can be created from a list, a tuple, or a mix of lists and tuples.

In [38]:
x = [1,2,3,4]           # List
x = (1,2,3,4)           # Tuple
x = [(1,2,3),(4,5,6)]   # Mix of list and tuple (this last assignment is the one used)
a = np.array(x)
print(a, type(a))

[[1 2 3]
 [4 5 6]] <class 'numpy.ndarray'>


### asarray
**Syntax:** asarray(a, dtype=None, order=None)

In [39]:
a = np.array([[1,2,3],[4,5,6]])
a1 = np.array(a)    # Generates a new array (a copy) from the existing array
a2 = np.asarray(a)  # Does not create a new array; returns the original a itself
a2

array([[1, 2, 3],
       [4, 5, 6]])

- Difference between array and asarray
Both array and asarray convert array-like data into an ndarray. The main difference appears when the input is already an ndarray: array makes a copy by default and occupies new memory, whereas asarray returns the input without copying (unless a different dtype is requested).

In [40]:
print('a1 is a?', a1 is a)
print('a2 is a?', a2 is a)

a1 is a? False
a2 is a? True
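A practical consequence of this difference is that changes to the original array show up through the `asarray` result but not through the `array` copy. A minimal sketch, with illustrative names `copied` and `aliased`:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
copied = np.array(a)     # independent copy
aliased = np.asarray(a)  # same object as `a`, no copy

a[0, 0] = 99  # modify the original in place

print(copied[0, 0])   # 1  (the copy is unaffected)
print(aliased[0, 0])  # 99 (the alias sees the change)
print(np.shares_memory(a, copied), np.shares_memory(a, aliased))  # False True
```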


## Create arrays based on shape or value
| Function Name       | Description                                                                      |
| ------------------- | -------------------------------------------------------------------------------- |
| np.ones             | Generate an array filled with ones                                               |
| np.ones_like        | Generate an array filled with ones that has the same shape and data type as a given array |
| np.zeros            | Generate an array filled with zeros                                              |
| np.zeros_like       | Generate an array filled with zeros that has the same shape and data type as a given array |
| np.empty            | Generate an array of a given shape without initializing its values              |
| np.empty_like       | Generate an array with the same shape and data type as a given array without initializing its values |
| np.full             | Generate an array of a specified shape and data type filled with a specific value |
| np.full_like        | Generate an array with the same shape and data type as a given array, filled with a specific value |
| np.eye, np.identity | Generate an identity matrix with ones on the diagonal and zeros elsewhere (np.eye also supports N x M shapes) |

### Generate an array filled with ones
**Syntax:**
- ones(shape, dtype=None, order='C')
- ones_like(a, dtype=None, order='K', subok=True, shape=None)

In [41]:
a = np.ones((3, 4),dtype = int)
a

array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]])

In [42]:
x = [[1, 2, 3], [4, 5, 6]]
b = np.ones_like(x)
b

array([[1, 1, 1],
       [1, 1, 1]])

### Generate an array filled with zeros
**Syntax:**
- zeros(shape, dtype=float, order='C')
- zeros_like(a, dtype=None, order='K', subok=True, shape=None)

In [43]:
a = np.zeros((3,4), dtype = np.float64)
a

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [44]:
x =  [[1, 2, 3], [4, 5, 6]]
b = np.zeros_like(x)
b

array([[0, 0, 0],
       [0, 0, 0]])

### Generate an array of a specified shape, data type, and with a specific value
**Syntax:**
- full(shape, fill_value, dtype=None, order='C')
- full_like(a, fill_value, dtype=None, order='K', subok=True)

In [45]:
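a = np.full((3,4),7)    # 3 x 4 array filled with 7
a                       # not displayed: only the last expression in a cell is shown
x = [[1,2,3],[4,5,6]]
a = np.full_like(x,7)   # same shape as x, filled with 7
a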

array([[7, 7, 7],
       [7, 7, 7]])
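One detail worth knowing is that `full_like` takes its data type from the given array unless `dtype` is passed, so a fractional fill value is cast to an integer when the template array holds integers. A short sketch with an illustrative fill value of 7.9:

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])  # integer array

# full_like reuses the dtype of `x`, so a float fill value is cast to int.
truncated = np.full_like(x, 7.9)

# Passing dtype explicitly keeps the fractional part.
kept = np.full_like(x, 7.9, dtype=np.float64)

print(truncated.tolist())  # [[7, 7, 7], [7, 7, 7]]
print(kept.tolist())       # [[7.9, 7.9, 7.9], [7.9, 7.9, 7.9]]
```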

### Generate an empty array of a given shape without initializing its values
**Syntax:**
- empty(shape, dtype=float, order='C')
- empty_like(prototype, dtype=None, order='K', subok=True, shape=None)

In [46]:
a = np.empty((3,4))  # values are uninitialized; any zeros shown are whatever happened to be in memory
print(a, a.itemsize)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]] 8


In [47]:
x = [[1,2,3],[4,5,6]] 
b = np.empty_like(x)
b  # Note that the values in the array are not guaranteed to be 0 or any other specific value;
   # they are whatever happened to be in the memory locations that were allocated for the array.

array([[-1714418096,         622,           0],
       [          0,      131074,   174418036]])

### Generate an identity matrix
**Difference between np.eye() and np.identity()**

**np.identity syntax:** np.identity(n, dtype=None)

**np.eye syntax:** np.eye(N, M=None, k=0, dtype=float)

- np.identity can only create square matrices.
- np.eye can also create rectangular matrices, and its k parameter selects which diagonal holds the ones. k = 0 is the main diagonal, k = 1 shifts the diagonal one position up, k = 2 shifts it two positions up, and so on; k = -1 shifts it one position down. If the absolute value of k is too large, the diagonal moves completely outside the matrix and the result is all zeros.

In [48]:
a = np.identity(4)
a

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [49]:
b = np.eye(3,4)
b

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.]])

In [50]:
a = np.eye(3)
b = np.eye(3, k = 1)
print('No shift:\n', a)
print('One shift to the right:\n', b)

No shift:
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
One shift to the right:
 [[0. 1. 0.]
 [0. 0. 1.]
 [0. 0. 0.]]


In [51]:
c = np.eye(5,6, k = -2)
c

array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0.],
       [1., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0.]])
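The effect of k can also be checked numerically. The sketch below counts the ones on each shifted diagonal of a 3 x 3 matrix and confirms that `np.identity` matches `np.eye` with the default k; the variable names are illustrative.

```python
import numpy as np

upper = np.eye(3, k=1)       # ones on the first diagonal above the main one
lower = np.eye(3, k=-1)      # ones on the first diagonal below the main one
out_of_range = np.eye(3, k=3)  # diagonal shifted entirely outside the matrix

# np.identity(n) matches np.eye(n) with the default k=0.
assert np.array_equal(np.identity(3), np.eye(3))

print(upper.sum(), lower.sum(), out_of_range.sum())  # 2.0 2.0 0.0
```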

### Generate or extract a diagonal array
**Syntax:** diag(v, k=0)

If the input v is a one-dimensional array, the diag function returns a matrix with the input array as its diagonal elements.

If the input v is a two-dimensional matrix, the diag function returns a one-dimensional array containing the diagonal elements of the input matrix.


In [52]:
a = np.arange(1, 4)
b = np.arange(1, 10).reshape(3, 3)
print('a: \n', a)
print('np.diag(a): \n', np.diag(a))
print('b: \n', b)
print('np.diag(b): \n', np.diag(b))

a: 
 [1 2 3]
np.diag(a): 
 [[1 0 0]
 [0 2 0]
 [0 0 3]]
b: 
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
np.diag(b): 
 [1 5 9]
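The k parameter of `diag` works like the one in `np.eye`: positive values select diagonals above the main one and negative values select diagonals below it. A small sketch, reusing the same 3 x 3 matrix as above:

```python
import numpy as np

b = np.arange(1, 10).reshape(3, 3)

above = np.diag(b, k=1)   # elements just above the main diagonal
below = np.diag(b, k=-1)  # elements just below the main diagonal
print(above.tolist())  # [2, 6]
print(below.tolist())  # [4, 8]

# Building a matrix from a vector and extracting its diagonal round-trips.
v = np.array([1, 2, 3])
assert np.array_equal(np.diag(np.diag(v)), v)
```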


## Create an array based on a numerical range
### Generate a sequence of numbers within a specified range.
**Syntax:** np.arange(start, stop, step, dtype = None)

In [53]:
# Define an arithmetic sequence starting at 10 with a step size of 2; the stop value 21 is excluded, so the last element is 20.
a = np.arange(10,21,2) # half-open interval [start, stop)
a

array([10, 12, 14, 16, 18, 20])
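Because the interval is half-open, the stop value itself is never included. A few illustrative calls make the rule concrete:

```python
import numpy as np

excl = np.arange(10, 20, 2)   # 20 is the stop value, so it is excluded
incl = np.arange(10, 21, 2)   # stop just past 20, so 20 is included
default_start = np.arange(5)  # start defaults to 0, step to 1
countdown = np.arange(5, 0, -1)  # a negative step counts down

print(excl.tolist())           # [10, 12, 14, 16, 18]
print(incl.tolist())           # [10, 12, 14, 16, 18, 20]
print(default_start.tolist())  # [0, 1, 2, 3, 4]
print(countdown.tolist())      # [5, 4, 3, 2, 1]
```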

### Generate an arithmetic sequence.
**Syntax:** np.linspace(start, stop, num = 50, endpoint = True, retstep = False, dtype = None)

In [54]:
a = np.linspace(0, 100, 11)
a

array([  0.,  10.,  20.,  30.,  40.,  50.,  60.,  70.,  80.,  90., 100.])
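The `endpoint` and `retstep` parameters from the syntax line can be sketched as follows: `endpoint=False` leaves out the stop value, and `retstep=True` additionally returns the spacing between samples. The sample counts are chosen only for illustration.

```python
import numpy as np

closed = np.linspace(0, 100, 5)                     # includes the stop value
half_open = np.linspace(0, 100, 5, endpoint=False)  # excludes the stop value
values, step = np.linspace(0, 100, 11, retstep=True)

print(closed.tolist())     # [0.0, 25.0, 50.0, 75.0, 100.0]
print(half_open.tolist())  # [0.0, 20.0, 40.0, 60.0, 80.0]
print(step)                # 10.0
```

Unlike np.arange, np.linspace takes the number of samples rather than the step size, which makes the length of the result predictable.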

### Generate a geometric sequence.
**Syntax:** np.logspace(start, stop, num = 50, endpoint = True, base = 10.0, dtype = None)

In [55]:
# Define a geometric sequence with base 10 (10^0, 10^1, 10^2)
a = np.logspace(0, 2, 3)
a

array([  1.,  10., 100.])

In [56]:
# Define a geometric sequence with base 2 (2^0, 2^1, ..., 2^9)
a = np.logspace(0, 9, 10, base = 2)
a

array([  1.,   2.,   4.,   8.,  16.,  32.,  64., 128., 256., 512.])
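One way to read `logspace` is that start and stop are exponents, not values: the result is the base raised to evenly spaced exponents. The sketch below checks that relationship against `linspace`; the names `exponents` and `powers` are illustrative.

```python
import numpy as np

exponents = np.linspace(0, 9, 10)       # 0, 1, ..., 9
powers = np.logspace(0, 9, 10, base=2)  # 2**0, 2**1, ..., 2**9

# logspace(start, stop, num, base=b) agrees with b ** linspace(start, stop, num).
assert np.allclose(powers, 2.0 ** exponents)
print(powers.tolist())
```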