<a href="https://colab.research.google.com/github/EngineeredForHU/Machine-Learning-Reference/blob/main/PythonDataAnalysis/NumPyBasic(Ch4).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Chapter 4: NumPy Basics – Arrays and Vectorized Computation

In this chapter, we will explore the **core features of NumPy**, the foundational library for numerical and scientific computing in Python.  
We will cover:
- ndarray, an efficient multidimensional array providing fast array-oriented arithmetic operations and flexible broadcasting capabilities

- Mathematical functions for fast operations on entire arrays of data without having to write loops

- Tools for reading/writing array data to disk and working with memory-mapped files

- Linear algebra, random number generation, and Fourier transform capabilities

- A C API for connecting NumPy with libraries written in C, C++, or FORTRAN

By the end of this section, you’ll understand how to use NumPy arrays as the backbone of data analysis, paving the way for more advanced numerical techniques and machine learning workflows.
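
As a rough tour of a few of the capabilities listed above, the following sketch touches random number generation, disk I/O, linear algebra, and the Fourier transform. The file name `matrix.npy`, the seed, and the sample values are illustrative assumptions, not part of the chapter.

```python
import os
import tempfile

import numpy as np

# Random number generation: a seeded generator keeps the sketch reproducible.
rng = np.random.default_rng(seed=0)
samples = rng.standard_normal((3, 3))

# Disk I/O: save the array to a temporary .npy file and load it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "matrix.npy")  # hypothetical file name
    np.save(path, samples)
    loaded = np.load(path)

# Linear algebra: a matrix times its inverse is (approximately) the identity.
a = np.array([[2.0, 1.0],
              [1.0, 3.0]])
identity_check = a @ np.linalg.inv(a)

# Fourier transform: the inverse FFT recovers the original signal.
signal = np.array([0.0, 1.0, 0.0, -1.0])
recovered = np.fft.ifft(np.fft.fft(signal)).real

print(np.array_equal(samples, loaded))
print(np.allclose(identity_check, np.eye(2)))
print(np.allclose(signal, recovered))
```

Each check compares a round trip against its starting point, so the sketch does not depend on the particular random values drawn.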

For most data analysis applications, the main areas of functionality I’ll focus on are:

- Fast array-based operations for data munging and cleaning, subsetting and filtering, transformation, and any other kind of computation

- Common array algorithms like sorting, unique, and set operations

- Efficient descriptive statistics and aggregating/summarizing data

- Data alignment and relational data manipulations for merging and joining heterogeneous datasets

- Expressing conditional logic as array expressions instead of loops with if-elif-else branches

- Group-wise data manipulations (aggregation, transformation, and function application)
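
A few of the array-level items in this list can be sketched with plain NumPy. The sample values below are made up for illustration, and the functions shown (`np.where`, `np.sort`, `np.unique`, `np.intersect1d`) are only one way to express each idea.

```python
import numpy as np

values = np.array([3, -1, 4, -1, 5, -9, 2, 6])

# Conditional logic as an array expression: replace negatives with 0.
clipped = np.where(values < 0, 0, values)

# Filtering with a boolean mask: keep only the positive entries.
positives = values[values > 0]

# Common array algorithms: sorting, unique values, and a set operation.
sorted_values = np.sort(values)
unique_values = np.unique(values)
common = np.intersect1d(values, np.array([2, 3, 7]))

# Descriptive statistics over the whole array.
total = values.sum()
mean = values.mean()

print(clipped)        # [3 0 4 0 5 0 2 6]
print(positives)      # [3 4 5 2 6]
print(sorted_values)  # [-9 -1 -1  2  3  4  5  6]
print(unique_values)  # [-9 -1  2  3  4  5  6]
print(common)         # [2 3]
print(total, mean)    # 9 1.125
```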

## Introduction to NumPy
NumPy is central to numerical computing in Python largely because it is designed for efficiency on large arrays of data.

It provides efficient multidimensional array objects and a wide range of mathematical functions that operate on them.
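
One way to get a feel for the efficiency claim is to time the same computation on a Python list and on a NumPy array. This is only a rough sketch: the array size and repetition count are arbitrary choices, and the measured times depend on the machine.

```python
import timeit

import numpy as np

n = 200_000  # arbitrary size chosen for illustration
py_list = list(range(n))
np_arr = np.arange(n)

# Same computation two ways: double every element.
list_result = [x * 2 for x in py_list]
arr_result = np_arr * 2

# Time a few repetitions of each approach.
list_time = timeit.timeit(lambda: [x * 2 for x in py_list], number=5)
arr_time = timeit.timeit(lambda: np_arr * 2, number=5)

print(f"list comprehension: {list_time:.4f} s")
print(f"NumPy array:        {arr_time:.4f} s")
```

Both approaches produce the same values; the array version typically runs much faster because the loop happens in compiled code rather than in the Python interpreter.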

## Creating an Array (ndarray)
**ndarray = N-dimensional array**

The easiest way to create an array is to use the **array function** (`np.array`). It accepts any sequence-like object (including other arrays) and produces a new NumPy array containing the passed data.

**Note:** An **ndarray** holds `homogeneous` data: all of its elements share the same data type.
- To store elements of different types in one **ndarray**, pass `dtype=object`.
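
The note on data types can be illustrated with a short sketch. The values are arbitrary examples: the first array shows NumPy choosing one common type for mixed numeric input, and the second shows `dtype=object` holding unrelated Python objects.

```python
import numpy as np

# Mixed ints and floats are upcast to a single common type.
upcast = np.array([1, 2.5, 3])
print(upcast.dtype)  # float64

# dtype=object stores references to arbitrary Python objects instead.
mixed = np.array([1, "two", 3.0], dtype=object)
print(mixed.dtype)  # object
element_types = [type(item).__name__ for item in mixed]
print(element_types)  # ['int', 'str', 'float']

# The array function also accepts an existing array and copies its data.
copy_of_upcast = np.array(upcast)
print(np.shares_memory(upcast, copy_of_upcast))  # False
```

Object arrays are flexible, but operations on them fall back to ordinary Python objects, so they are usually a last resort.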

In [62]:
import numpy as np
# Create arrays with different numbers of dimensions

# creating a 1D array
arr_1d = np.array([1, 2, 3])

# creating a 2D array
arr_2d = np.array([[1, 2, 3], [3, 4, 6]])

# creating a 3D array
# A 3D array is like a stack of 2D arrays
arr_3d = np.array([[[1, 2, 0],
                    [3, 4, 6]],

                   [[5, 6, 9],
                    [7, 8, 10]]])

print(f"1D array:\n{arr_1d}\n")
print(f"2D array:\n{arr_2d}\n")
print(f"3D array:\n{arr_3d}")



1D array:
[1 2 3]

2D array:
[[1 2 3]
 [3 4 6]]

3D array:
[[[ 1  2  0]
  [ 3  4  6]]

 [[ 5  6  9]
  [ 7  8 10]]]


### NumPy Array Indexing
Knowing the basics of array indexing is important for analyzing and manipulating array objects.
- **Basic indexing:** You access an element by giving its position along each axis. Positions start at 0, and negative positions count back from the end.

In [63]:
# Some examples of array indexing
import numpy as np

print("1D array:")

# Creating 1D array
arr = np.array([1,2,3,4,5])

# printing arr
print(f"{arr}")

print(f'Single Element Access: arr[2] = {arr[2]}')

# negative indexing: -1 gets the last element
print(f'Negative Indexing: arr[-1] = {arr[-1]}\n')

# 2D array
print(f'2D array:')
arr2d = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(arr2d)

# Indexing 2D array
print(f"Array access: arr2d[1,1] = {arr2d[1,1]}")
print(f'Negative access: arr2d[-1,0] = {arr2d[-1,0]}\n')

# 3D Array
print(f"3D Array:")
arr3d = np.array([[[1,2,3],[4,5,6],[7,8,9]],[[11,22,33],[44,55,66],[77,88,99]]])
print(arr3d)
print(f'Array access: arr3d[1,1,0] = {arr3d[1,1,0]}')



1D array:
[1 2 3 4 5]
Single Element Access: arr[2] = 3
Negative Indexing: arr[-1] = 5

2D array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Array access: arr2d[1,1] = 5
Negative access: arr2d[-1,0] = 7

3D Array:
[[[ 1  2  3]
  [ 4  5  6]
  [ 7  8  9]]

 [[11 22 33]
  [44 55 66]
  [77 88 99]]]
Array access: arr3d[1,1,0] = 44
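
Beyond picking out single elements, an index can select a whole sub-array or be the target of an assignment. The sketch below reuses the same 3x3 values as `arr2d` above; the replacement value `100` is an arbitrary choice.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# A single index on a 2D array selects a whole row (a 1D array).
row = arr2d[1]
print(row)  # [4 5 6]

# Chained indexing reaches the same element as the comma form.
same_element = arr2d[1][2] == arr2d[1, 2]
print(same_element)  # True

# An indexed element can also be assigned to, which modifies the array in place.
arr2d[0, 0] = 100
print(arr2d[0])  # [100   2   3]
```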


### Slicing(NEED TO DO THIS)
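
Slicing selects a range of elements with `start:stop:step`, where `stop` is exclusive and any of the three parts may be omitted. The following is a minimal sketch with illustrative arrays that do not come from the cells above.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

print(arr[1:4])   # [20 30 40]
print(arr[:2])    # [10 20]
print(arr[::2])   # [10 30 50]
print(arr[::-1])  # [50 40 30 20 10]

# In 2D, give one slice per axis: rows 0-1 and columns 1-2 here.
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
top_right = arr2d[:2, 1:]
print(top_right)

# A basic slice is a view, so writing through it changes the original array.
view = arr[1:3]
view[0] = 99
print(arr)  # [10 99 30 40 50]
```

Because slices are views rather than copies, call `.copy()` on a slice when the original array must stay unchanged.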

## Basic Operations
Element-wise operations in NumPy let you apply mathematical calculations to each element of an array independently, all at once, without writing explicit loops.
This makes the code shorter, faster, and easier to read compared to manually iterating through elements.

In [60]:
import numpy as np

x = np.array([1,2,3])
y = np.array([4,5,6])

# Addition
add = x + y
print(f"Addition: {add}")

# Subtraction
subtract = x - y
print(f"Subtraction: {subtract}")

# Multiplication
multiply = x * y
print(f"Multiplication: {multiply}")

# Division
divide = x / y
print(f"division:{divide}")

Addition: [5 7 9]
Subtraction: [-3 -3 -3]
Multiplication: [ 4 10 18]
division:[0.25 0.4  0.5 ]
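
Element-wise operations also work when the operands have different shapes, as long as the shapes are compatible; NumPy then stretches the smaller operand across the larger one, which is called broadcasting. The arrays below are illustrative values chosen for this sketch.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
row = np.array([10, 20, 30])       # shape (3,)
column = np.array([[100], [200]])  # shape (2, 1)

# A scalar is applied to every element.
scaled = matrix * 2

# A 1D array of length 3 is added to each row of the (2, 3) matrix.
plus_row = matrix + row

# A (2, 1) column is added to each column of the (2, 3) matrix.
plus_column = matrix + column

print(scaled)
print(plus_row)
print(plus_column)
```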


In [61]:
import numpy as np

arr = np.array([-3,-1,0,1,3])

result = np.absolute(arr)

print(f"absolute value: {result}")

absolute value: [3 1 0 1 3]
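
`np.absolute` in the cell above is one of NumPy's universal functions (ufuncs), which apply a mathematical operation to every element of an array. A few more are sketched here with made-up input values; some take one array and some take two.

```python
import numpy as np

arr = np.array([1.0, 4.0, 9.0, 16.0])
other = np.array([2.0, 3.0, 10.0, 15.0])

# Unary ufuncs take a single array.
roots = np.sqrt(arr)
squares = np.square(arr)

# Binary ufuncs take two arrays and combine them element by element.
largest = np.maximum(arr, other)

print(roots)    # [1. 2. 3. 4.]
print(squares)  # [  1.  16.  81. 256.]
print(largest)  # [ 2.  4. 10. 16.]
```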


## Array Attributes
This section covers the basic array attributes before going deeper into ndarrays. Array attributes describe an array's layout (`shape`), its number of dimensions (`ndim`), its total element count (`size`), and the data type of its elements (`dtype`).
* **`ndim`** – Number of dimensions (axes) in the array.
  *Example:* A 1D array → `ndim = 1`, a 2D matrix → `ndim = 2`.

* **`shape`** – Size of the array in each dimension, returned as a tuple.
  *Example:* `(2, 3)` means 2 rows and 3 columns.

* **`size`** – Total number of elements in the array.
  *Example:* A `(2, 3)` array has `2 × 3 = 6` elements.

* **`dtype`** – Data type of the array’s elements.
  *Example:* `dtype('int64')` means all elements are 64-bit integers.

### 3D Array Example

In [77]:
import numpy as np

# Create a 3D array and print its attributes to confirm that it has three dimensions.

arr_3d = np.array([[[1,2,3,99],[4,5,6,99],[22,23,24,99]],[[7,8,9,99],[10,11,12,99],[33,44,55,99]]])
print(f'3D array:\n{arr_3d}\n')
print(f"The dimensions of this 3D matrix: {arr_3d.ndim}")
print(f"Shape of the matrix: {arr_3d.shape}")
print(f"Size of the matrix: {arr_3d.size} Elements")
print(f"dtype of this matrix: {arr_3d.dtype}")

3D array:
[[[ 1  2  3 99]
  [ 4  5  6 99]
  [22 23 24 99]]

 [[ 7  8  9 99]
  [10 11 12 99]
  [33 44 55 99]]]

The dimensions of this 3D matrix: 3
Shape of the matrix: (2, 3, 4)
Size of the matrix: 24 Elements
dtype of this matrix: int64


### 2D Array Example

In [81]:
import numpy as np

# Create a 2D array and print its attributes to confirm that it has two dimensions.

arr_2d = np.array([[1,2,3,99],[4,5,6,99],[22,23,24,99]])
print(f'2D array:\n{arr_2d}\n')
print(f"The dimensions of this 2D matrix: {arr_2d.ndim}")
print(f"Shape of the matrix: {arr_2d.shape}")
print(f"Size of the matrix: {arr_2d.size} Elements")
print(f"dtype of this matrix: {arr_2d.dtype}")

2D array:
[[ 1  2  3 99]
 [ 4  5  6 99]
 [22 23 24 99]]

The dimensions of this 2D matrix: 2
Shape of the matrix: (3, 4)
Size of the matrix: 12 Elements
dtype of this matrix: int64


## 4.1: The NumPy ndarray: A Multidimensional Array Object
One of the key features of NumPy is its N-dimensional array object, or ndarray, which is a fast, flexible container for large datasets in Python.     
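
What makes the ndarray flexible in practice is that arithmetic applies to the whole block of data at once, using the same syntax as for single numbers. The values below are illustrative and were chosen only to make the results easy to check.

```python
import numpy as np

data = np.array([[1.5, -0.1, 3.0],
                 [0.0, -3.0, 6.5]])

# Arithmetic on the array applies to every element in one step.
times_ten = data * 10
doubled = data + data

print(times_ten)
print(doubled)
print(data.shape, data.dtype)  # (2, 3) float64
```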

In [1]:
import numpy as np

In [5]:
names = np.array([["angel", "bob"], ["ariana", "briana"]])
print(names.shape)   # (2, 2)
print(names[0, 1])   # bob

(2, 2)
bob
