# Introduction to NumPy

## Objectives

- Introduce NumPy, a Python library for numerical computing.
- Demonstrate how to create and manipulate arrays with NumPy.
- Highlight the efficiency of NumPy arrays compared to Python lists.
- Showcase the functions available in NumPy for generating different arrays.
- Illustrate the use of NumPy arrays to handle large datasets efficiently.

## Background

NumPy (Numerical Python) is an open-source library and the foundational package for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays. Compared with Python lists, NumPy arrays are more compact, faster for numerical operations, and allow mathematical operations to be applied to entire arrays at once. Because of its efficiency and interoperability with other libraries, NumPy has become an indispensable tool for data scientists, researchers, and engineers working in Python.

## Datasets Used

The notebook does not reference external datasets; it utilizes synthetic data generated through various NumPy functions. 

## NumPy library

NumPy is a Python library used for working with arrays.

The NumPy (Numerical Python) package provides basic routines for manipulating large arrays and matrices of numeric data.

NumPy is usually imported under the np alias.

In [33]:
import numpy as np

Now the NumPy package can be referred to as np.

To check the installed NumPy version:

In [34]:
np.__version__

'1.26.4'

## NumPy array

The main feature of NumPy is the array object class. 

Arrays are similar to lists in Python, except that every element of an array must be of the same type, usually a numeric type like float or int. 

Arrays make operations on large amounts of numeric data very fast and are much more efficient than lists.
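As a rough illustration of this point, the sketch below doubles every value in a list and in an array and times both approaches. The variable names and the array length are illustrative choices, and the measured times depend on the machine, so no particular speed-up is claimed here.

```python
import timeit

import numpy as np

n = 100_000  # illustrative size, chosen only for this sketch
py_list = list(range(n))
np_arr = np.arange(n)

# Element-wise doubling: a Python-level loop versus one array expression.
list_result = [x * 2 for x in py_list]
arr_result = np_arr * 2

# Time each approach over a few repetitions; absolute numbers vary by machine.
list_time = timeit.timeit(lambda: [x * 2 for x in py_list], number=20)
arr_time = timeit.timeit(lambda: np_arr * 2, number=20)

print(f"list comprehension: {list_time:.4f} s")
print(f"NumPy expression:   {arr_time:.4f} s")
```

Both approaches produce the same values; the array version simply expresses the operation once for the whole array instead of once per element.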

### 0-D arrays

0-D arrays are scalars: they hold a single value and have no axes.

In [35]:
arr0 = np.array(5)
arr0

array(5)

### 1-D arrays

In [36]:
arr1 = np.array([1,2,3,4,5,6])
arr1

array([1, 2, 3, 4, 5, 6])

### 2-D arrays

In [37]:
arr2 = np.array([[1, 2, 3],
                [4, 5, 6]])
arr2

array([[1, 2, 3],
       [4, 5, 6]])

### 3-D arrays

In [38]:
arr3 = np.array([[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]])
arr3

array([[[1, 1, 1],
        [2, 2, 2]],

       [[3, 3, 3],
        [4, 4, 4]]])

All of them are NumPy arrays: np.ndarray

In [39]:
print('Type of arr0 =', type(arr0))
print('Type of arr1 =', type(arr1))
print('Type of arr2 =', type(arr2))
print('Type of arr3 =', type(arr3))

Type of arr0 = <class 'numpy.ndarray'>
Type of arr1 = <class 'numpy.ndarray'>
Type of arr2 = <class 'numpy.ndarray'>
Type of arr3 = <class 'numpy.ndarray'>


**Array Dimensions**: number of axes

In [40]:
print('Dimensions of arr0 =', arr0.ndim)
print('Dimensions of arr1 =', arr1.ndim)
print('Dimensions of arr2 =', arr2.ndim)
print('Dimensions of arr3 =', arr3.ndim)

Dimensions of arr0 = 0
Dimensions of arr1 = 1
Dimensions of arr2 = 2
Dimensions of arr3 = 3


**Array Shape**: number of elements in each dimension

In [41]:
print('Shape of arr0 =', arr0.shape)
print('Shape of arr1 =', arr1.shape)
print('Shape of arr2 =', arr2.shape)
print('Shape of arr3 =', arr3.shape)

Shape of arr0 = ()
Shape of arr1 = (6,)
Shape of arr2 = (2, 3)
Shape of arr3 = (2, 2, 3)


**Array Size**: total number of elements

In [42]:
print('Size of arr0 =', arr0.size)
print('Size of arr1 =', arr1.size)
print('Size of arr2 =', arr2.size)
print('Size of arr3 =', arr3.size)

Size of arr0 = 1
Size of arr1 = 6
Size of arr2 = 6
Size of arr3 = 12
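The three attributes above are related: `ndim` is the length of `shape`, and `size` is the product of the entries of `shape`. The following sketch checks this on a 3-D array and on a 0-D array; the variable names are illustrative.

```python
import math

import numpy as np

arr = np.array([[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]])
scalar = np.array(5)

# ndim is the number of entries in the shape tuple.
print(arr.ndim == len(arr.shape))        # True

# size is the product of the shape entries: 2 * 2 * 3 = 12.
print(arr.size == math.prod(arr.shape))  # True

# For a 0-D array the shape is empty, and the empty product is 1.
print(scalar.shape, scalar.size)         # () 1
```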


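**Type of the elements in the array**: in this case, all elements are integers. The default integer width (int32 in the output below) depends on the platform and NumPy version; on many systems it is int64.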

In [43]:
print('Type of elements in arr0 =', arr0.dtype)
print('Type of elements in arr1 =', arr1.dtype)
print('Type of elements in arr2 =', arr2.dtype)
print('Type of elements in arr3 =', arr3.dtype)

Type of elements in arr0 = int32
Type of elements in arr1 = int32
Type of elements in arr2 = int32
Type of elements in arr3 = int32
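If a specific element type is needed regardless of platform, it can be requested explicitly. The sketch below is one way to do this; it also uses `itemsize` and `nbytes` to show how much memory each element and the whole array occupy. The variable names are illustrative.

```python
import numpy as np

# Request 64-bit integers explicitly instead of relying on the platform default.
arr64 = np.array([1, 2, 3], dtype=np.int64)
print(arr64.dtype)     # int64
print(arr64.itemsize)  # 8 (bytes per element)
print(arr64.nbytes)    # 24 (3 elements * 8 bytes)

# astype returns a converted copy; the original array is unchanged.
arr_f32 = arr64.astype(np.float32)
print(arr_f32.dtype)   # float32
print(arr_f32.nbytes)  # 12 (3 elements * 4 bytes)
```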


## Array definitions

There are several ways to create arrays in NumPy

Creating an array from a **list**

In [44]:
arr_l = np.array([[1, 2, 3], [4, 5, 6]], dtype = 'float')
arr_l

array([[1., 2., 3.],
       [4., 5., 6.]])

In [45]:
print('Type of elements in arr_l =', arr_l.dtype)

Type of elements in arr_l = float64


Creating an array from a **tuple**

In [46]:
arr_t = np.array((1,2,3,4,5))
arr_t

array([1, 2, 3, 4, 5])

In [47]:
print('Type of elements in arr_t =', arr_t.dtype)

Type of elements in arr_t = int32


If one number in the tuple is a float, the array type will be float

In [48]:
arr_t = np.array((1,2,3,4,5.4))
arr_t

array([1. , 2. , 3. , 4. , 5.4])

In [49]:
print('Type of elements in arr_t =', arr_t.dtype)

Type of elements in arr_t = float64


Creating a 3x3 array with all **zeros**

In [50]:
np.zeros((3,3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

Creating a 3x3 array with all **ones**

In [51]:
np.ones((3,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

Creating a 3x3 array with all **fives**

In [52]:
np.full((3,3), 5)

array([[5, 5, 5],
       [5, 5, 5],
       [5, 5, 5]])
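Note that `np.zeros` and `np.ones` return floats by default, while `np.full` takes its type from the fill value. As a small sketch, the element type can be controlled either through the `dtype` argument or through the fill value itself; the variable names are illustrative.

```python
import numpy as np

# dtype=int gives integer zeros instead of the default floats.
zeros_int = np.zeros((3, 3), dtype=int)

# A float fill value produces a float array.
fives_float = np.full((3, 3), 5.0)

# dtype can also override the type implied by the fill value.
fives_forced = np.full((3, 3), 5, dtype=float)

print(zeros_int.dtype.kind)  # i (an integer type; exact width is platform-dependent)
print(fives_float.dtype)     # float64
print(fives_forced.dtype)    # float64
```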

Creating a 3x3 array with **random values between 0 and 1**

In [53]:
np.random.random((3,3)) 

array([[0.75928517, 0.06798045, 0.76337869],
       [0.75283553, 0.48402255, 0.8036277 ],
       [0.67072135, 0.73371274, 0.86516464]])

Creating a 3x3 array with integer **random values between 0 and 10**

In [54]:
np.random.randint(11, size=(3,3))

array([[1, 3, 4],
       [9, 3, 2],
       [9, 5, 4]])

Creating a 2x3 array with integer **random values between 0 and 9**

In [55]:
np.random.randint(10, size=(2,3))

array([[5, 1, 1],
       [1, 7, 1]])

Create a 2x3 array of **normally distributed random values** with mean 0 and standard deviation 1

In [56]:
np.random.normal(0,1,(2,3))

array([[-0.75043314, -0.4289831 , -1.1950208 ],
       [-1.93025115, -0.49967249,  0.81923846]])

Create a 3x3 array of **normally distributed random values** with mean 10 and standard deviation of 5

In [57]:
np.random.normal(10,5,(3,3))

array([[15.05839186,  9.35675618, 14.32052596],
       [ 2.82524967,  8.38390888, 15.23075851],
       [15.40709374, -0.65698764, 15.54662077]])
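The random arrays above change on every run. When reproducible values are needed, one option is NumPy's `Generator` interface, created with `np.random.default_rng` and a seed. The sketch below mirrors the examples above using that interface; the seed value 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, chosen for this sketch

uniform = rng.random((3, 3))                 # floats in [0, 1)
integers = rng.integers(0, 11, size=(3, 3))  # integers from 0 to 10 (11 is excluded)
normal = rng.normal(10, 5, size=(3, 3))      # mean 10, standard deviation 5

# A second generator with the same seed reproduces the same first draw.
rng_again = np.random.default_rng(42)
same = rng_again.random((3, 3))
print(np.array_equal(uniform, same))  # True
```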

Create a 3x3 identity matrix

In [58]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
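As a brief sketch of why the identity matrix is useful, multiplying a matrix by it leaves the matrix unchanged. The example also shows the `k` argument of `np.eye`, which shifts the diagonal of ones. The matrix `a` is made up for illustration.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
identity = np.eye(2)

# Matrix multiplication by the identity returns the original matrix.
print(np.array_equal(a @ identity, a))  # True

# k=1 places the ones on the diagonal just above the main one.
upper = np.eye(3, k=1)
print(upper)
```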

Create a sequence of 11 values evenly spaced between 0 and 5 

In [59]:
l2 = np.linspace(0, 5, 11) 
print('Number of elements =',len(l2))
l2

Number of elements = 11


array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. ])

Create a sequence of 6 values evenly spaced between 0 and 5 

In [60]:
np.linspace(0, 5, 6)

array([0., 1., 2., 3., 4., 5.])
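By default `np.linspace` includes both endpoints. The sketch below shows two optional arguments that help when reasoning about the spacing: `retstep=True` also returns the step between values, and `endpoint=False` leaves out the stop value. The variable names are illustrative.

```python
import numpy as np

# retstep=True returns the array together with the spacing between values.
values, step = np.linspace(0, 5, 11, retstep=True)
print(step)  # 0.5

# endpoint=False excludes the stop value, so 10 points cover 0.0 up to 4.5.
half_open = np.linspace(0, 5, 10, endpoint=False)
print(half_open)
```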

Create a sequence of values between 0 (included) and 20 (not included), stepping by 2

In [61]:
np.arange(0, 20, 2) 

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

If you want to include 20, the second parameter (the stop value) must be greater than 20, for example 21:

In [62]:
np.arange(0, 21, 2)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20])

Create an array filled with a linear sequence between 0 (included) and 21 (not included), stepping by 5

In [63]:
np.arange(0, 21, 5) 

array([ 0,  5, 10, 15, 20])

Create an array filled with a linear sequence between 0 and 28 (not included), stepping by 5

In [64]:
np.arange(0, 28, 5) 

array([ 0,  5, 10, 15, 20, 25])
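For integer arguments like these, the number of values that `np.arange` returns can be worked out in advance as the ceiling of `(stop - start) / step`. The sketch below checks this for the last example; the variable names are illustrative.

```python
import math

import numpy as np

start, stop, step = 0, 28, 5
seq = np.arange(start, stop, step)

# ceil((28 - 0) / 5) = ceil(5.6) = 6 values: 0, 5, 10, 15, 20, 25.
expected_len = math.ceil((stop - start) / step)
print(len(seq), expected_len)  # 6 6

# The stop value itself is never included.
print(seq[-1] < stop)  # True
```

With non-integer steps, floating-point rounding can make the length harder to predict, which is one reason `np.linspace` is often preferred when a specific number of points is required.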

## Conclusions

Key Takeaways:
- NumPy arrays offer speed and efficiency for numerical operations, outperforming Python lists, especially with large datasets.
- The library supports arrays of various dimensions, enabling complex data structures and computations.
- A wide range of functions for array creation allows for flexible data initialization and manipulation.
- Built-in attributes such as ndim, shape, size, and dtype make it easy to inspect an array's structure and element type during data analysis and preprocessing.
- NumPy's capabilities are foundational for advanced numerical computations and analyses across many scientific and engineering disciplines.

## References

- VanderPlas, J. (2017) Python Data Science Handbook: Essential Tools for Working with Data. USA: O’Reilly Media, Inc. chapter 2.