# Data Processing

#### Data is the most important ingredient in machine learning. Given a large and diverse set of training data, a good deep learning model can significantly outperform non-deep-learning algorithms.


# Getting Started with NumPy

#### Most neural networks take input data that is either numeric or has been converted to a numeric form. For numeric data in Python, NumPy is the standard library: it lets us perform many operations on numeric data and convert the data into more usable forms.
#### NumPy aims to provide an array object that is up to **50x faster** than traditional Python lists.
### It's common practice to import NumPy under the **alias** **np**, like this:
> import **numpy** as **np**
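#### The actual speedup depends heavily on the operation and the array size. A rough way to check it on your own machine is sketched below with the standard-library timeit module. The array size and repeat count are arbitrary choices for illustration, and no particular speedup is guaranteed.

```python
import timeit

import numpy as np

n = 100_000  # arbitrary size chosen for illustration
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

def list_sum_of_squares():
    # Pure-Python loop over a list
    return sum(x * x for x in py_list)

def numpy_sum_of_squares():
    # Vectorized NumPy equivalent
    return int(np.sum(np_arr * np_arr))

# Time 10 runs of each version; results vary by machine
list_time = timeit.timeit(list_sum_of_squares, number=10)
numpy_time = timeit.timeit(numpy_sum_of_squares, number=10)
print('List : {:.4f} s'.format(list_time))
print('NumPy: {:.4f} s'.format(numpy_time))
```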

# NumPy Arrays

#### NumPy arrays behave much like Python lists with added features, although every element of an array shares a single type. You can easily convert a Python list to a NumPy array using the np.array function, which takes in a Python list as its required argument.
#### The function also has quite a few keyword arguments, but the main one to know is **dtype**.
#### The dtype keyword argument takes in a NumPy type and casts the array to the specified type.

In [2]:
import numpy as np

arr = np.array([[0, 1, 2], [3, 4, 5]],
               dtype=np.float32)
print(repr(arr))

array([[0., 1., 2.],
       [3., 4., 5.]], dtype=float32)
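
#### When dtype is omitted, NumPy infers a type from the contents of the list. The sketch below, using made-up values, shows that inference and one way arrays differ from lists: arithmetic is applied element by element.

```python
import numpy as np

ints = np.array([1, 2, 3])      # all integers -> an integer dtype (platform dependent)
mixed = np.array([1, 2.5, 3])   # one float upcasts every element to float64
print(ints.dtype, mixed.dtype)

# Multiplying a list repeats it; multiplying an array scales each element
print([1, 2, 3] * 2)            # [1, 2, 3, 1, 2, 3]
print((ints * 2).tolist())      # [2, 4, 6]
```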


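## Copying

#### Assigning an array to a new name does not copy it: both names refer to the same data, so a change made through one is visible through the other. Use the copy method to get an independent array.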

In [6]:
a = np.array([0, 1])
b = np.array([9, 8])
c = a
print('Array a: {}'.format(repr(a)))
c[0] = 5
print('Array a: {}'.format(repr(a)))

d = b.copy()
d[0] = 6
print('Array b: {}'.format(repr(b)))

Array a: array([0, 1])
Array a: array([5, 1])
Array b: array([9, 8])
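
#### One way to confirm which arrays share data is np.shares_memory. In this sketch, c is just another name for a, while d is an independent copy of b.

```python
import numpy as np

a = np.array([0, 1])
b = np.array([9, 8])
c = a          # plain assignment: no data is copied
d = b.copy()   # explicit copy: new, independent data

print(c is a)                   # True
print(np.shares_memory(a, c))   # True
print(np.shares_memory(b, d))   # False
```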


## Casting

In [8]:
arr = np.array([0, 1, 2])
print(arr.dtype)
arr = arr.astype(np.float32)
print(arr.dtype)

int64
float32
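
#### Casting from a float type to an integer type drops the fractional part rather than rounding, and astype returns a new array instead of changing the original. A minimal sketch with made-up values:

```python
import numpy as np

floats = np.array([1.7, -1.7, 2.2])
ints = floats.astype(np.int32)   # truncates toward zero

print(ints.tolist())    # [1, -1, 2]
print(floats.dtype)     # float64 (the original array is unchanged)
```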


## NaN & Infinity

* Note that np.nan cannot be stored in an integer array.
* Note that np.inf cannot be stored in an integer array.
### Try it by commenting and uncommenting the code chunks below

In [18]:
# # NaN
# arr = np.array([np.nan, 1, 2])
# print(repr(arr))

# arr = np.array([np.nan, 'abc'])
# print(repr(arr))

# # Will result in a ValueError
# np.array([np.nan, 1, 2], dtype=np.int32)

# Infinity
print(np.inf > 1000000)

arr = np.array([np.inf, 5])
print(repr(arr))

arr = np.array([-np.inf, 1])
print(repr(arr))

# Will result in an OverflowError
np.array([np.inf, 3], dtype=np.int32)

True
array([inf,  5.])
array([-inf,   1.])


OverflowError: cannot convert float infinity to integer
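
#### Because NaN never compares equal to anything, including itself, == cannot be used to find it. The sketch below uses np.isnan, np.isinf, and np.isfinite instead; the sample values are made up.

```python
import numpy as np

arr = np.array([np.nan, 1.0, np.inf, -np.inf])

print(np.nan == np.nan)            # False
print(np.isnan(arr).tolist())      # [True, False, False, False]
print(np.isinf(arr).tolist())      # [False, False, True, True]
print(np.isfinite(arr).tolist())   # [False, True, False, False]
```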

# Dimensions in Arrays

#### A dimension in arrays is one level of array depth (nested arrays).
### 0-D Arrays
#### 0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.
### 1-D Arrays
#### An array that has 0-D arrays as its elements is called a uni-dimensional or 1-D array.
**These are the most common and basic arrays.**
### 2-D Arrays
#### An array that has 1-D arrays as its elements is called a 2-D array.
**These are often used to represent matrices or 2nd-order tensors.**
>NumPy provides the numpy.linalg submodule for linear algebra on 2-D arrays. The older numpy.matrix class still exists, but the NumPy documentation no longer recommends it.
### 3-D arrays
#### An array that has 2-D arrays (matrices) as its elements is called a 3-D array.
**These are often used to represent a 3rd order tensor.** 
### Checking the Number of Dimensions
#### NumPy arrays provide the **ndim** attribute, an integer that tells us how many dimensions the array has.

In [45]:
# importing numpy and alias as np
import numpy as np

# 0-D
d = np.array(42)
print("0-D numpy array {}\n".format(d))

# 1-D
D = np.array([1, 2, 3])
print("1-D numpy array {}\n".format(D))

# 2-D
DD = np.array([[1, 2, 4], [3, 4, 5]])
print("2-D numpy array {}\n".format(DD))

# 3-D
DDD = np.array([[[1, 2, 3], [3, 4, 5]], [[6, 7, 8], [9, 10, 11]]])
print("3-D numpy array {}\n".format(DDD))

# Check Number of Dimensions?
print("The dimensions of DDD is {}\n".format(DDD.ndim))

0-D numpy array 42

1-D numpy array [1 2 3]

2-D numpy array [[1 2 4]
 [3 4 5]]

3-D numpy array [[[ 1  2  3]
  [ 3  4  5]]

 [[ 6  7  8]
  [ 9 10 11]]]

The dimensions of DDD is 3
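
#### The same check works for arrays of any depth. This sketch rebuilds one array per depth and prints ndim next to shape, which gives the length along each dimension.

```python
import numpy as np

arrays = [
    np.array(42),                                                   # 0-D
    np.array([1, 2, 3]),                                            # 1-D
    np.array([[1, 2, 4], [3, 4, 5]]),                               # 2-D
    np.array([[[1, 2, 3], [3, 4, 5]], [[6, 7, 8], [9, 10, 11]]]),   # 3-D
]

for arr in arrays:
    print('ndim = {}, shape = {}'.format(arr.ndim, arr.shape))
```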



# NumPy Ranged Data