In [1]:
# Pandas - Data Manipulation
# NumPy - Numerical Computing (Numerical Python)
# Matplotlib - Visualization

# Introduction:
NumPy (Numerical Python) is a fundamental package for scientific computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with mathematical functions to operate on these arrays.
## Key Features:

* Fast array operations
* Mathematical and logical operations on arrays
* Broadcasting capabilities
* Linear algebra, random number generation, and more
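
As a rough illustration of the features listed above, the following sketch runs one small example of each. The sample values and variable names (`prices`, `grid`, and so on) are made up for this demonstration.

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])   # hypothetical sample data

# Fast array operation: element-wise arithmetic without a Python loop
discounted = prices * 0.9

# Logical operation applied to the whole array at once
is_expensive = prices > 15

# Broadcasting: the 1D row is applied to each row of the 2D array
grid = np.array([[1, 2, 3], [4, 5, 6]])
shifted = grid + np.array([10, 20, 30])

# Linear algebra: matrix product of a 2x3 matrix and its 3x2 transpose
product = grid @ grid.T

print(discounted)
print(is_expensive)
print(shifted)
print(product)
```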

# Installation

In [44]:
pip install numpy

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


# Importing NumPy

In [2]:
import numpy as np

# Array Creation
## Creating Arrays from Lists
### Single Dimensional Array (1D Array)


In [5]:
#Single Dimensional Array
n1 = np.array([10,20,30,40,50])
n1

array([10, 20, 30, 40, 50])

In [6]:
type(n1)

numpy.ndarray

## Explanation:

np.array([10, 20, 30, 40, 50]) creates a one-dimensional array (also called a vector)  
The array n1 contains 5 elements: 10, 20, 30, 40, 50  
This is different from a Python list because:  

* NumPy arrays are faster for mathematical operations  
* All elements must be the same data type (homogeneous)  
* NumPy provides vectorized operations (operations on entire array at once)  


The variable n1 now holds a NumPy array object  
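
The difference from a Python list can be seen in a small sketch like the one below. The names `values`, `arr`, and `mixed` are illustrative only.

```python
import numpy as np

values = [10, 20, 30, 40, 50]
arr = np.array(values)

# On a list, * repeats the sequence; on an array, it multiplies each element
doubled_list = values * 2   # a list with 10 items
doubled_arr = arr * 2       # array([ 20,  40,  60,  80, 100])

# Mixed numeric input is converted to one common data type (float64 here)
mixed = np.array([1, 2.5, 3])

print(len(doubled_list))
print(doubled_arr)
print(mixed.dtype)
```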

## When to use 1D arrays:  

* Storing a sequence of numbers (temperatures, prices, scores)  
* Mathematical vectors  
* Time series data  
* Any single list of values  

## Multi-Dimensional Array (2D Array)

In [7]:
#Multi Dimensional Array
n2= np.array([[10,20,30,40,50],[10,20,30,40,50]])
n2

array([[10, 20, 30, 40, 50],
       [10, 20, 30, 40, 50]])

In [8]:
type(n2)

numpy.ndarray

## Explanation:

* np.array([[...], [...]]) creates a two-dimensional array (also called a matrix).  
* The outer list contains 2 inner lists, making it a 2×5 array (2 rows, 5 columns).  
* Row 0: [10, 20, 30, 40, 50]  
* Row 1: [10, 20, 30, 40, 50]  
* Each row must have the same number of elements (5 in this case).  
* The array n2 is a rectangular grid of numbers.  

Structure Breakdown:  
        Column 0  Column 1  Column 2  Column 3  Column 4  
Row 0:     10        20        30        40        50  
Row 1:     10        20        30        40        50  

## Key Properties:

* Shape: (2, 5) - 2 rows and 5 columns
* Dimensions: 2 (it's a 2D array)
* Total elements: 10 (2 × 5)
* All elements are integers
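
These properties can be read directly from the array's attributes. The sketch below rebuilds `n2` so that it runs on its own. The exact integer type printed by `dtype` depends on the platform.

```python
import numpy as np

n2 = np.array([[10, 20, 30, 40, 50], [10, 20, 30, 40, 50]])

print(n2.shape)   # (2, 5): 2 rows, 5 columns
print(n2.ndim)    # 2: number of dimensions
print(n2.size)    # 10: total number of elements
print(n2.dtype)   # an integer type such as int64 or int32
```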

## When to use 2D arrays:

* Representing tables or spreadsheets
* Storing matrices for linear algebra
* Image data (grayscale images are 2D arrays)
* Multiple samples of data (each row = one sample)
* Grid-based data (game boards, geographic data)

# Arrays with Initial Values
## Array of Zeros

In [11]:
n3 = np.zeros((2,3))
n3

array([[0., 0., 0.],
       [0., 0., 0.]])

In [12]:
type(n3)

numpy.ndarray

## Explanation:

np.zeros((2, 3)) creates a 2D array filled with zeros  
The parameter (2, 3) is a tuple specifying the shape:  

* 2 = number of rows  
* 3 = number of columns  


Creates a 2×3 array (2 rows, 3 columns) = 6 total elements  
All values are initialized to 0.0 (floating-point zeros)  

Visual Structure:  
        Column 0  Column 1  Column 2  
Row 0:    0.0       0.0       0.0  
Row 1:    0.0       0.0       0.0 

## Key Points:

* Default data type is float64 (floating-point numbers)
* The zeros have decimal points: 0. not 0
* A quick way to allocate large arrays, since no Python-level loop is needed to fill them
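
If floating-point zeros are not what you need, the `dtype` parameter of `np.zeros()` selects a different type. A minimal sketch, with illustrative variable names:

```python
import numpy as np

default_zeros = np.zeros((2, 3))          # float64 by default
int_zeros = np.zeros((2, 3), dtype=int)   # integer zeros, printed without decimal points

print(default_zeros.dtype)
print(int_zeros)
```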

## When to use np.zeros():

* Initializing arrays before filling them with calculated values
* Placeholder arrays that will be updated later
* Accumulator arrays for summing values in loops
* Image processing - creating blank canvases
* Machine learning - initializing weight matrices

In [13]:
n4 = np.zeros((10,10))
n4

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

### Example Use Case:

In [17]:
# Create a blank array to store results
results = np.zeros((100, 5))  # Store 100 samples with 5 features each

# Placeholder for the real computation (hypothetical helper, not part of NumPy)
def calculate_something(i):
    return np.arange(5) * i   # 5 made-up feature values for sample i

# Later fill it with data in a loop
for i in range(100):
    results[i] = calculate_something(i)

### Different Shapes

In [21]:
np.zeros(5)           # 1D array: [0. 0. 0. 0. 0.]

array([0., 0., 0., 0., 0.])

In [22]:
np.zeros((3, 3))      # 2D array: 3×3 matrix of zeros

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [23]:
np.zeros((2, 4, 3))   # 3D array: 2×4×3 tensor of zeros

array([[[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]],

       [[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]]])

# Array with a Specific Value

In [18]:
n5 = np.full((3,3),60)
n5

array([[60, 60, 60],
       [60, 60, 60],
       [60, 60, 60]])

## Explanation:

np.full((3, 3), 60) creates a 2D array filled with the specific value 60  
First parameter (3, 3) is a tuple specifying the shape:  

* 3 rows
* 3 columns


Second parameter 60 is the fill value - every element will be this value  
Creates a 3×3 array (3 rows, 3 columns) = 9 total elements  
All 9 elements are initialized to 60  

## Visual Structure:  
        Column 0  Column 1  Column 2  
Row 0:    60        60        60  
Row 1:    60        60        60  
Row 2:    60        60        60  

## Key Points:  

* The fill value can be any number (integer, float, negative, etc.)
* Data type is automatically inferred from the fill value
* In this case: 60 (integer) → array of integers
* More flexible than np.zeros() or np.ones()
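
The type inference described above can be checked with a short sketch. The variable names are illustrative, and the exact integer type depends on the platform.

```python
import numpy as np

ints = np.full((2, 2), 60)                  # integer fill value -> integer array
floats = np.full((2, 2), 60.0)              # float fill value -> float64 array
forced = np.full((2, 2), 60, dtype=float)   # dtype given explicitly

# np.full with a fill value of 1.0 produces the same values as np.ones
same_as_ones = np.full((2, 2), 1.0)

print(ints.dtype, floats.dtype, forced.dtype)
print(np.array_equal(same_as_ones, np.ones((2, 2))))
```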

## When to use np.full():

* Initialize arrays with a default value other than 0 or 1
* Create constant arrays for mathematical operations
* Placeholder arrays with meaningful default values
* Testing and debugging with known values
* Image processing - creating colored backgrounds

### Example Use Cases:


In [26]:
# Initialize scores to 100
scores = np.full((10, 5), 100)  # 10 students, 5 subjects, all start at 100
scores

array([[100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100],
       [100, 100, 100, 100, 100]])

In [27]:
# Create a matrix with -1 for missing data
data = np.full((50, 20), -1)  # Mark missing values as -1
data

array([[-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
        -1, -1, -1, -1],
       [-1, -1, -1, -1, -1, -1

In [28]:
# Initialize temperatures to room temperature
temps = np.full((24, 7), 22.5)  # 24 hours, 7 days, start at 22.5°C
temps

array([[22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5],
       [22.5, 22.5, 22.5, 22.5,

### Different Fill Values:

In [30]:
np.full((2, 2), 7)      # Fill with integer 7

array([[7, 7],
       [7, 7]])

In [31]:
np.full((3, 3), 3.14)   # Fill with float 3.14

array([[3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14]])

In [32]:
np.full((2, 4), -99)    # Fill with negative number -99

array([[-99, -99, -99, -99],
       [-99, -99, -99, -99]])

In [29]:
np.full(5, 'A')         # Fill with string (creates string array)

array(['A', 'A', 'A', 'A', 'A'], dtype='<U1')

# Range-based Arrays
## Creating Arrays with arange()

In [33]:
n6 = np.arange(10,20)
n6

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

## Explanation:

* np.arange(10, 20) creates an array with values from 10 to 19
* First parameter 10 = start value (inclusive)
* Second parameter 20 = stop value (exclusive - not included)
* Default step = 1 (increments by 1)
* Generates 10 values: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
* Similar to Python's built-in range() function, but returns a NumPy array

## Key Points:

* The stop value (20) is NOT included in the output
* Step size defaults to 1 if not specified
* Returns a 1D array (vector)
* Very efficient for creating sequential numbers

### arange() Syntax Variations:

In [34]:
# Single parameter: start from 0
np.arange(5) 

array([0, 1, 2, 3, 4])

In [35]:
# Two parameters: start and stop
np.arange(10, 20)

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])

In [36]:
# Three parameters: start, stop, step
np.arange(0, 10, 2)    # Output: [0 2 4 6 8] (even numbers)
np.arange(1, 10, 2)    # Output: [1 3 5 7 9] (odd numbers)

array([1, 3, 5, 7, 9])

In [37]:
# Negative step (counting backwards)
np.arange(10, 0, -1) 

array([10,  9,  8,  7,  6,  5,  4,  3,  2,  1])

In [38]:
# Float step
np.arange(0, 1, 0.1)

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
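
With a float step, rounding in the step arithmetic can make the number of elements hard to predict. When the number of points matters more than the step size, `np.linspace()` is a common alternative because it takes the count directly. A minimal sketch:

```python
import numpy as np

# 10 evenly spaced points from 0 up to (but not including) 1
without_end = np.linspace(0, 1, 10, endpoint=False)

# 11 evenly spaced points from 0 to 1, with 1 included (the default)
with_end = np.linspace(0, 1, 11)

print(without_end)
print(with_end)
```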

### When to use np.arange():

* Creating indices for loops or indexing
* Generating sequences of numbers
* X-axis values for plotting
* Test data with predictable values
* Iterating over ranges in vectorized operations

### Example Use Cases:

In [41]:
# Create indices for a dataset
indices = np.arange(0, 100)  # 0 to 99
indices

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
       68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])

In [42]:
# Generate years for time series
years = np.arange(2000, 2025)  # 2000 to 2024
years

array([2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
       2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021,
       2022, 2023, 2024])

In [43]:
# Create evenly spaced angles
angles = np.arange(0, 360, 10)  # 0°, 10°, 20°, ..., 350°
angles

array([  0,  10,  20,  30,  40,  50,  60,  70,  80,  90, 100, 110, 120,
       130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250,
       260, 270, 280, 290, 300, 310, 320, 330, 340, 350])

### Comparison with Python's range():

* range() returns a lazy range object (arithmetic on its values needs a loop or a list comprehension)
* np.arange() returns a NumPy array (ready for math operations)
* NumPy arrays support vectorized operations (faster)
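
A small side-by-side sketch of the comparison above, using illustrative names:

```python
import numpy as np

r = range(5)
a = np.arange(5)

# range: arithmetic needs an explicit loop or comprehension
squares_list = [x ** 2 for x in r]

# arange: the operation applies to the whole array at once
squares_arr = a ** 2

print(squares_list)
print(squares_arr)
```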

# Random Arrays
## Random Integers with randint()

In [47]:
n7 = np.random.randint(1,50,12)
n7
# Note: Your output will be different (random values each time)

array([48, 20, 11, 39, 15, 47,  6,  6, 28,  5, 38, 21], dtype=int32)

## Explanation:

* np.random.randint(1, 50, 12) creates an array of 12 random integers
* First parameter 1 = low value (inclusive - can be selected)
* Second parameter 50 = high value (exclusive - NOT included)
* Third parameter 12 = size (number of random integers to generate)
* Generates 12 random integers between 1 and 49 (inclusive)
* Each time you run this code, you get different random values

### Key Points:

* Random values are between 1 (inclusive) and 50 (exclusive)
* Possible values: 1, 2, 3, ..., 48, 49 (NOT 50)
* Returns a 1D array with 12 elements
* Each value is randomly and independently selected
* Uses a pseudo-random number generator

### np.random.randint() Syntax Variations:

In [48]:
# Two parameters: low, high (generates single value)
np.random.randint(1, 10)         # Output: single random int like 7

1

In [49]:
# Three parameters: low, high, size (generates array)
np.random.randint(1, 50, 12)     # Output: array of 12 random ints

array([22, 37,  7, 10,  2, 34, 22, 34, 20,  5, 38, 16], dtype=int32)

In [50]:
# Size as tuple for multi-dimensional arrays
np.random.randint(0, 100, (3, 4))  # Output: 3×4 array of random ints

array([[33, 20, 71, 70],
       [26,  5,  7, 59],
       [26, 51, 14, 96]], dtype=int32)

In [51]:
# Only high parameter (low defaults to 0)
np.random.randint(10, size=5)    # Output: [0-9] range, 5 values

array([3, 6, 9, 0, 0], dtype=int32)

### When to use np.random.randint():

* Simulations and Monte Carlo methods
* Random sampling from a range
* Game development (dice rolls, random events)
* Test data generation for testing algorithms
* Random initialization of arrays
* Lottery numbers or random selection

### Example Use Cases:

In [52]:
# Simulate 100 dice rolls (1-6)
dice_rolls = np.random.randint(1, 7, 100)
dice_rolls

array([3, 2, 2, 3, 1, 6, 5, 3, 1, 5, 4, 5, 2, 1, 2, 5, 5, 6, 1, 6, 1, 6,
       3, 5, 6, 1, 5, 1, 5, 4, 6, 1, 4, 2, 5, 3, 6, 1, 2, 6, 3, 3, 6, 4,
       4, 3, 2, 2, 2, 5, 4, 1, 6, 3, 4, 3, 2, 2, 3, 5, 1, 2, 6, 6, 4, 1,
       1, 3, 6, 3, 2, 3, 1, 6, 6, 5, 6, 2, 3, 6, 4, 1, 1, 1, 5, 1, 5, 2,
       6, 6, 5, 1, 5, 1, 2, 6, 6, 1, 3, 2], dtype=int32)

In [53]:

# Generate random student IDs (1000-9999)
student_ids = np.random.randint(1000, 10000, 50)
student_ids

array([2781, 2975, 9193, 6846, 8807, 5634, 5610, 7004, 8491, 6095, 9390,
       9125, 9042, 7921, 6165, 5468, 7007, 2101, 1088, 1373, 5006, 3295,
       5245, 5068, 7842, 5591, 5437, 3865, 5708, 5460, 6716, 4007, 4012,
       2895, 5833, 9573, 4715, 7050, 4930, 6921, 9845, 1325, 4391, 3562,
       3685, 4782, 9859, 8669, 1518, 3354], dtype=int32)

In [54]:
# Create random matrix for testing
test_matrix = np.random.randint(0, 256, (10, 10))  # 10×10 matrix
test_matrix

array([[140, 223, 117, 203,  72,  78,  82, 229,  28, 215],
       [ 32,  33,  44,  26, 157, 134,  63,   7,  40,  57],
       [140, 231, 143, 211,  52,  40, 139, 118, 169,   4],
       [119,  93, 235, 154,   5, 231,  71, 197,  65, 224],
       [242, 165,  23, 111, 234,  43, 104, 197,  74,  80],
       [202,  95, 232,   5, 247,  17, 179, 128, 161,  47],
       [147,  60,  50,  90, 150, 136, 173, 182,  78, 108],
       [  4,  18, 144, 133, 147, 128, 224,  63,  68, 131],
       [235,  23,   3,  86, 120, 104, 245, 220,  26,  96],
       [ 59, 144,  75, 156, 127,  83,  49, 165, 216, 225]], dtype=int32)

In [55]:
# Random lottery-style numbers (1-49, pick 6)
# Note: randint draws with replacement, so the same number can appear twice
lottery = np.random.randint(1, 50, 6)
lottery

array([19, 35, 11, 37, 35,  5], dtype=int32)
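
Because `randint()` samples with replacement, a draw can contain the same number twice (35 appears twice in the output above). If unique numbers are required, one option is `np.random.choice()` with `replace=False`. A sketch, with the variable name `lottery_unique` chosen for illustration:

```python
import numpy as np

# Pick 6 distinct numbers from 1-49; replace=False prevents repeats
lottery_unique = np.random.choice(np.arange(1, 50), size=6, replace=False)

print(lottery_unique)
```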

### Setting Random Seed (for reproducibility):

In [57]:
# Set seed to get same random numbers each time
np.random.seed(42)
n8 = np.random.randint(1, 50, 12)  # Always same output with seed 42
n8

array([39, 29, 15, 43,  8, 21, 39, 19, 23, 11, 11, 24], dtype=int32)
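
NumPy also offers a newer `Generator` interface, created with `np.random.default_rng()`, where the seed belongs to a generator object instead of global state. The sketch below assumes you want two independent generators that produce identical draws. The values will generally differ from the `np.random.seed(42)` output above because the underlying algorithm is different.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

draw_a = rng_a.integers(1, 50, size=12)   # low inclusive, high exclusive
draw_b = rng_b.integers(1, 50, size=12)

print(np.array_equal(draw_a, draw_b))
```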

### Other Random Functions:

In [58]:
# Random floats between 0 and 1
np.random.random(5)    

array([0.33370861, 0.14286682, 0.65088847, 0.05641158, 0.72199877])

In [59]:
# Random floats in a range
np.random.uniform(1.5, 5.5, 10)  # 10 floats between 1.5 and 5.5

array([5.25421084, 1.50311506, 5.46884624, 3.96992604, 3.94661264,
       1.52826522, 1.5922497 , 3.59909864, 3.09944389, 1.68666265])

In [60]:
# Random from normal distribution
np.random.randn(5)               # Output: standard normal distribution

array([ 0.2220789 , -0.7679765 ,  0.1424646 , -0.03465218,  1.13433927])

In [61]:
n9 = np.array([[1,2,3,4],[1,2,3,4]])  # both rows go inside one outer list
n9

array([[1, 2, 3, 4],
       [1, 2, 3, 4]])

In [64]:
n9.shape

(2, 4)

In [67]:
n9.shape = (4,2)  # reshape in place: the same 8 elements, now 4 rows × 2 columns
n9

array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]])

# Saving and Loading Arrays
## Saving Arrays to Disk with save()

In [3]:
n10 = np.array([10,20,30,40,50,60,70,80,90])
np.save('my_numpy',n10)  # Creates a file: my_numpy.npy

### Explanation:

* np.save('my_numpy', n10) saves the array n10 to a file on disk
* First parameter 'my_numpy' is the filename (without extension)
* Second parameter n10 is the array to save
* NumPy automatically adds the .npy extension → creates my_numpy.npy
* The file is saved in binary format (not human-readable, but very efficient)

### Key Points:

* File extension .npy is automatically added by NumPy
* Saves in NumPy's binary format (compact and fast)
* Preserves data type, shape, and all array properties
* File can be loaded later using np.load()
* Overwrites existing file if filename already exists

### What Gets Saved:

* All array values (10, 20, 30, ...)
* Array shape (9,) - 1D array with 9 elements
* Array data type (int64 or int32, depending on the platform)
* All metadata about the array

### Loading Saved Arrays:

In [4]:
# Load the saved array
loaded_array = np.load('my_numpy.npy')
print(loaded_array)

[10 20 30 40 50 60 70 80 90]
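
To check that a save/load round trip preserves values, shape, and data type, a sketch like the following can be used. It writes to a temporary directory so no file is left behind. The file name `roundtrip_demo` is a placeholder chosen for this example.

```python
import os
import tempfile

import numpy as np

original = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "roundtrip_demo")   # placeholder file name
    np.save(path, original)                          # writes roundtrip_demo.npy
    restored = np.load(path + ".npy")

print(np.array_equal(original, restored))   # values match
print(restored.dtype == original.dtype)     # data type preserved
print(restored.shape)                       # shape preserved
```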
