# Introduction to NumPy
NumPy (short for Numerical Python) provides an efficient interface to store and operate on dense data buffers. In some ways, NumPy arrays are like Python’s built-in list type, but NumPy arrays provide much more efficient storage and data operations as the arrays grow larger in size. NumPy arrays form the core of nearly the entire ecosystem of data science tools in Python, so time spent learning to use NumPy effectively will be valuable no matter what aspect of data science interests you.

In [16]:
# Load libraries
import numpy as np


## Creating Arrays from Python Lists
- First, we can use `np.array` to create arrays from Python lists.
- Remember that, unlike Python lists, NumPy arrays must hold elements of a single type. If the types do not match, NumPy will upcast if possible (for example, integers are upcast to floating point).
- If we want to explicitly set the data type of the resulting array, we can use the `dtype` keyword.
- Finally, unlike Python lists, NumPy arrays can explicitly be multidimensional.

In [17]:
# Creating an integer array from a list
int_array = np.array([1, 2, 3, 4, 5, 6])
print(int_array)

# Here, integers are upcast to floating point
float_array = np.array([1.0, 2, 3, 4, 5, 6])
print(float_array)

# We can use dtype keyword to specify the data type
type_array = np.array([1, 2, 3, 4], dtype = float)
print(type_array)

# Creating a multidimensional array
mult_array = np.array([range(i, i + 3) for i in [2, 4, 6]])
print(mult_array)

[1 2 3 4 5 6]
[1. 2. 3. 4. 5. 6.]
[1. 2. 3. 4.]
[[2 3 4]
 [4 5 6]
 [6 7 8]]
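
To see the upcasting rule in action, it can help to inspect the `dtype` attribute of the result. The following is a minimal sketch; the variable names (`mixed`, `as_float`, `nested`) are illustrative and not part of the cells above.

```python
import numpy as np

# Mixing integers and floats: the integers are upcast to floating point
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)                 # float64

# An explicit dtype overrides the type NumPy would otherwise infer
as_float = np.array([1, 2, 3], dtype=float)
print(as_float.dtype)              # float64

# A nested list produces a two-dimensional array
nested = np.array([[1, 2, 3], [4, 5, 6]])
print(nested.ndim, nested.shape)   # 2 (2, 3)
```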


## Creating Arrays from Scratch
Especially for larger arrays, it is more efficient to create arrays from scratch using routines built into NumPy. Here I am going to describe two of the more commonly used routines, `np.arange()` and `np.linspace()`; the code cell below also uses `np.zeros()`, `np.ones()`, `np.full()`, and `np.eye()`.

`np.arange()` (short for "array range") returns an array of evenly spaced values within a specified interval. The syntax for using `np.arange()` is:

        np.arange([start,] stop[, step], dtype = None)

Where: 
- `start` (optional): The start of the interval, which is included in the output. If not specified, the default value is 0.
- `stop`: The end of the interval, which is not included in the output. This parameter is required.
- `step` (optional): The step size between each pair of adjacent values in the array. If not specified, the default value is 1.
- `dtype` (optional): The data type of the array. If not specified, NumPy will choose the appropriate data type based on the other parameters.
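
The parameters above can be illustrated with a few short calls. This is a minimal sketch with illustrative variable names; note in particular that the `stop` value never appears in the result.

```python
import numpy as np

# With a single argument, arange counts from 0 up to (but not including) stop
counting = np.arange(5)
print(counting)                # [0 1 2 3 4]

# With start and stop, the stop value itself is still excluded
window = np.arange(2, 6)
print(window)                  # [2 3 4 5]

# A negative step counts down
countdown = np.arange(10, 0, -3)
print(countdown)               # [10  7  4  1]

# The dtype can be set explicitly
as_float = np.arange(3, dtype=float)
print(as_float)                # [0. 1. 2.]
```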

`np.linspace()` also returns an array of evenly spaced numbers over a specified interval. Whereas `np.arange()` takes a step size, `np.linspace()` takes the number of values to generate. The syntax for using `np.linspace()` is:

        np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype = None)

Where:
- `start`: The start of the interval. This parameter is required.
- `stop`: The end of the interval. This parameter is required.
- `num` (optional): The number of values to generate. Default value is 50.
- `endpoint` (optional): Whether to include `stop` as the last value in the array. Default value is True.
- `retstep` (optional): Whether to also return the step size that was used. Default value is False.
- `dtype` (optional): The data type of the array. If not specified, NumPy will choose the appropriate data type based on the other parameters.
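
The effect of `endpoint` and `retstep` is easiest to see side by side. The following is a minimal sketch; the variable names are illustrative.

```python
import numpy as np

# endpoint=False leaves out the stop value, so the spacing becomes 1/5
half_open = np.linspace(0, 1, num=5, endpoint=False)
print(half_open)          # [0.  0.2 0.4 0.6 0.8]

# retstep=True returns a tuple: the array and the spacing between samples
samples, step = np.linspace(0, 1, num=5, retstep=True)
print(samples)            # [0.   0.25 0.5  0.75 1.  ]
print(step)               # 0.25
```

With the default `endpoint=True`, five values between 0 and 1 are spaced 0.25 apart; excluding the endpoint spreads the same five values over a half-open interval, so the spacing shrinks to 0.2.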


In [29]:
# Create a length-10 integer array filled with zeros
zero_array = np.zeros(shape = 10, dtype = int)
print(zero_array)

# Create a 3x5 floating-point array filled with 1s
one_array = np.ones(shape = (3, 5), dtype = float)
print(one_array)

# Create a 3x5 array filled with 3.14
value_array = np.full(shape = (3, 5), fill_value = 3.14)
print(value_array)

# Create an array filled with a linear sequence
# Starting at 0, ending at 20, stepping by 2
# (this is similar to the built-in range() function)
seq_array = np.arange(start = 0, stop = 20, step = 2)
print(seq_array)

# Create an array of five values evenly spaced between 0 and 1
even_array = np.linspace(start = 0, stop = 1, num = 5)
print(even_array)

# Create a 3x3 identity matrix
identity_array = np.eye(N = 3)
print(identity_array)

[0 0 0 0 0 0 0 0 0 0]
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
[[3.14 3.14 3.14 3.14 3.14]
 [3.14 3.14 3.14 3.14 3.14]
 [3.14 3.14 3.14 3.14 3.14]]
[ 0  2  4  6  8 10 12 14 16 18]
[0.   0.25 0.5  0.75 1.  ]
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


### Random Module
The `np.random` module is a submodule of NumPy that provides functions for generating random numbers. It's commonly used in machine learning applications, simulations, and statistical analysis. Here are some commonly used functions in `np.random`:

- `rand()`: generates an array of random samples from a uniform distribution over the interval [0, 1).
- `randn()`: generates an array of samples from a standard normal distribution (mean = 0, variance = 1).
- `randint()`: generates random integers from low (inclusive) to high (exclusive), where you can specify the size and shape of the output array.
- `random()`: generates an array of random floats between 0 and 1 (excluding 1).
- `choice()`: generates a random sample from a given 1D array, with replacement or without replacement.
- `shuffle()`: shuffles the elements of a given array in place (along the first axis for multidimensional arrays).
- `normal()`: generates an array of samples from a normal distribution with specified mean and standard deviation.
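
A few of the functions listed above can be sketched as follows. The seed value and variable names are illustrative assumptions; the drawn values depend on the seed and NumPy version, so no specific numbers are shown.

```python
import numpy as np

np.random.seed(0)  # fix the seed so repeated runs give the same values

# rand() takes the shape as separate arguments rather than a tuple
uniform = np.random.rand(2, 3)

# randn() draws from the standard normal distribution
standard = np.random.randn(4)

# choice() samples from a 1D array; replace=False forbids repeats
values = np.arange(10)
picked = np.random.choice(values, size=3, replace=False)

# shuffle() reorders the array in place and returns None
np.random.shuffle(values)

print(uniform.shape, standard.shape)  # (2, 3) (4,)
print(picked)
print(values)
```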


In [33]:
# Create a 3x3 array of uniformly distributed random values between 0 and 1
uni_array = np.random.random((3, 3))
print(uni_array)

# Create a 3x3 array of normally distributed random values with mean 0 and standard deviation 1
norm_array = np.random.normal(loc = 0, scale = 1, size = (3, 3))
print(norm_array)

# Create a 3x3 array of random integers in the interval [0, 10)
rand_int_array = np.random.randint(low = 0, high = 10, size = (3, 3))
print(rand_int_array)

[[0.34517572 0.73485715 0.97802204]
 [0.26860885 0.21435429 0.45992539]
 [0.13177993 0.17355146 0.50103694]]
[[-0.41906872  0.29326765 -1.12070206]
 [-1.31093232 -1.37388139  0.39575358]
 [-1.35720927 -1.65754914 -2.40962646]]
[[9 1 4]
 [6 2 2]
 [9 4 6]]


## The Basics of NumPy Arrays
Data manipulation in Python is nearly synonymous with NumPy array manipulation: even newer tools like pandas are built around the NumPy array. This section presents several examples that use NumPy array manipulation to access data and subarrays, and to split, reshape, and join arrays.

We’ll cover a few categories of basic array manipulations here:
- **Attributes of arrays**: Determining the size, shape, memory consumption, and data types of arrays
- **Indexing of arrays**: Getting and setting the value of individual array elements
- **Slicing of arrays**: Getting and setting smaller subarrays within a larger array
- **Reshaping of arrays**: Changing the shape of a given array
- **Joining and splitting of arrays**: Combining multiple arrays into one, and splitting one array into many

### Attributes of arrays
Each array has the attributes `ndim` (the number of dimensions), `shape` (the size of each dimension), and `size` (the total number of elements in the array).

In [34]:
# Creating different arrays
x1 = np.random.randint(10, size = 6) # One-dimensional array
x2 = np.random.randint(10, size = (3, 4)) # Two-dimensional array
x3 = np.random.randint(10, size = (3, 4, 5)) # Three-dimensional array

# Printing array attributes
print("x3 ndim: ", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size: ", x3.size)

# Another useful attribute is the dtype
print("dtype:", x3.dtype)

x3 ndim:  3
x3 shape: (3, 4, 5)
x3 size:  60
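
Memory consumption, mentioned in the overview above, can be read from the `itemsize` and `nbytes` attributes. The sketch below uses an illustrative array `arr` with an explicit `float64` dtype so that the numbers do not depend on the platform's default integer size.

```python
import numpy as np

# Same shape as x3 above, but with an explicit 8-byte floating-point dtype
arr = np.zeros((3, 4, 5), dtype=np.float64)

print("itemsize:", arr.itemsize, "bytes")  # itemsize: 8 bytes
print("nbytes:", arr.nbytes, "bytes")      # nbytes: 480 bytes

# nbytes is the number of elements times the size of each element
total = arr.size * arr.itemsize
print(total == arr.nbytes)                 # True
```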
