# NumPy

NumPy (sometimes written Numpy) is a linear algebra library for Python. It is central to data science with Python because almost all of the libraries in the PyData ecosystem rely on NumPy as one of their main building blocks.

NumPy is also fast, because its core array operations are implemented in compiled C code rather than in interpreted Python. For more on why you would want to use arrays instead of lists, check out this [StackOverflow post](http://stackoverflow.com/questions/993984/why-numpy-instead-of-python-lists).
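
As a rough illustration of the difference, the sketch below doubles every element once with a list comprehension and once with a single vectorized array expression. The variable names are illustrative only, and no timing is claimed; actual speedups depend on array size and hardware.

```python
import numpy as np

values = list(range(5))                   # plain Python list
doubled_list = [2 * v for v in values]    # explicit Python-level loop

arr = np.array(values)
doubled_arr = 2 * arr                     # one vectorized operation on the whole array

print(doubled_list)
print(doubled_arr)
```

Note that `2 * values` on a plain list would repeat the list rather than double its elements, which is one reason arrays are more convenient for numeric work.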

## Installation Instructions

    pip install numpy

## Using NumPy

NumPy has many built-in functions and capabilities. We will focus on some of the most important aspects of NumPy: vectors, arrays, matrices, and number generation. Let's start by discussing arrays.

# NumPy Arrays

NumPy arrays are the main way we will use NumPy throughout the course. The arrays used here mostly come in two flavors: vectors and matrices. Vectors are strictly 1-D arrays and matrices are 2-D (note that a matrix can still have only one row or one column). Arrays can also have three or more dimensions.

Let's begin our introduction by exploring how to create NumPy arrays.


## Creating NumPy Arrays

### From a Python List

We can create an array by directly converting a list or list of lists:

In [1]:
import numpy as np

In [3]:
arr = np.array([1, 2, 3, 4, 5])
print(f"NumPy array: {arr} with type: {type(arr)}")

NumPy array: [1 2 3 4 5] with type: <class 'numpy.ndarray'>


In [4]:
# 2-D NumPy array
two_d_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(two_d_array)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
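
To make the vector versus matrix distinction concrete, here is a small sketch that compares a 1-D array with 2-D arrays that have only one row or one column. The variable names are illustrative; `ndim` and `shape` are standard array attributes.

```python
import numpy as np

vector = np.array([1, 2, 3])                 # 1-D array
row_matrix = np.array([[1, 2, 3]])           # 2-D array with a single row
column_matrix = np.array([[1], [2], [3]])    # 2-D array with a single column

for name, a in [("vector", vector),
                ("row_matrix", row_matrix),
                ("column_matrix", column_matrix)]:
    print(name, "ndim:", a.ndim, "shape:", a.shape)
```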


## Built-in Methods
NumPy provides several methods to create arrays efficiently:
- arange()     Returns evenly spaced values from start up to (but not including) stop; default step = 1.
- zeros()      Returns an array (1-D, 2-D, ...) with all elements set to zero.
- ones()       Same as zeros(), but filled with ones.
- linspace()   Returns a given number of evenly spaced values over a specified interval.
- eye()        Returns the identity matrix.
- random       Random number generation: random(), rand(), randn(), randint().

In [5]:
# arange() is similar to Python's range(start, stop, step)
# NumPy arrays have type 'ndarray' (n-dimensional array)
ndarray_from_arange = np.arange(1,10)
print(ndarray_from_arange)

[1 2 3 4 5 6 7 8 9]


In [6]:
print(np.arange(0, 10, 2))  # Creates array [0, 2, 4, 6, 8]

[0 2 4 6 8]


In [7]:
print(np.zeros(10))      # 1-D array of 10 zeros
# the default dtype is float; it can be changed with the dtype parameter

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]


In [8]:
print(np.zeros((2,3)))      # 2x3 matrix of zeros

[[0. 0. 0.]
 [0. 0. 0.]]


In [9]:
# instead of zeros we can create an ndarray of ones:
print(np.ones((3,3)))       # 3x3 matrix of ones

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


In [10]:
print(np.linspace(0, 5, 10)) # 10 evenly spaced points from 0 to 5 (both included)
# the distance between consecutive elements is constant

[0.         0.55555556 1.11111111 1.66666667 2.22222222 2.77777778
 3.33333333 3.88888889 4.44444444 5.        ]
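
As a brief sketch of how the spacing is determined, `linspace` can also return the step it used (`retstep=True`) and can leave out the stop value (`endpoint=False`). Both are existing keyword arguments; the variable names are illustrative.

```python
import numpy as np

# also return the spacing between neighbours: (5 - 0) / (10 - 1)
points, step = np.linspace(0, 5, 10, retstep=True)
print(step)

# exclude the stop value: the spacing becomes (5 - 0) / 10
half_open = np.linspace(0, 5, 10, endpoint=False)
print(half_open)
```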


In [11]:
print(np.eye(4))            # Identity matrix

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
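
The identity matrix leaves any compatible matrix unchanged under matrix multiplication. A minimal sketch with an illustrative 2x2 matrix:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
identity = np.eye(2)

product = a @ identity                # matrix multiplication
print(product)
print(np.array_equal(product, a))     # same values as the original matrix
```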


In [12]:
print(np.random.random(5))  # 5 random floats in the interval [0, 1)

[0.95706979 0.72041413 0.02780392 0.2461285  0.28462671]


In [13]:
print(np.random.random((2,2))) # 2x2 matrix of random floats in the interval [0, 1)

[[0.30558807 0.51632715]
 [0.1225847  0.02705911]]


In [14]:
# for random integers we can use randint(low, high, size); high is exclusive
print(np.random.randint(1,100, (4,4)))

[[ 5 51 58 54]
 [58 76 41 90]
 [24 99 86 43]
 [90 58 23  1]]


In [15]:
# to generate the same random numbers on different computers
# we can fix the seed with the seed() function
np.random.seed(105)

# 105 is an arbitrary choice; anyone who uses the same seed
# should get the same random numbers as shown here

# test: create an array with 5 random elements using seed 105
print(np.random.randint(1,10, 5))

[1 6 7 5 1]
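
NumPy also provides a newer interface, `np.random.default_rng`, which returns a generator object that carries its own seed instead of relying on global state. The sketch below only checks that two generators with the same seed agree; the numbers they produce are not expected to match the `np.random.seed(105)` output above.

```python
import numpy as np

rng_a = np.random.default_rng(105)   # independent generator seeded with 105
rng_b = np.random.default_rng(105)   # second generator, same seed

sample_a = rng_a.integers(1, 10, size=5)   # integers in [1, 10)
sample_b = rng_b.integers(1, 10, size=5)

print(np.array_equal(sample_a, sample_b))  # same seed, same sequence
```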


## Array Attributes and Methods
Understanding array properties like shape, min, max, etc.

In [16]:
arr_3_3 = np.random.randint(0, 100, (3,3))  # random 3x3 array of integers in [0, 100)
print(arr_3_3)
print("Shape:", arr_3_3.shape)

# shape returns a tuple with the length of each dimension of the array
# for a 1-D array, the tuple contains only the length: (n,)
# for a 2-D array, it has 2 elements: (rows, cols)
# for a 3-D array, it is (i, j, k), the length of each dimension

[[64 36 48]
 [36 45  1]
 [70 47 73]]
Shape: (3, 3)


In [17]:
# find max value and its index
# note: by default argmax works on the flattened (1-D) array, so it returns a single index
print("Max:", arr_3_3.max(), "at index", arr_3_3.argmax())

Max: 73 at index 8


In [18]:
# finding min value and its index
print("Min:", arr_3_3.min(), "at index", arr_3_3.argmin())

Min: 1 at index 5


In [19]:
# we can use for loops to get the (row, col) position of the min or max:

# find the index of the maximum value as (row, col)
# first step, take rows and cols from shape:
rows, cols = arr_3_3.shape
max_index_1d = arr_3_3.argmax()  # flat index, here 8

for row in range(rows):  # iterate through rows [0,1,2]
    for col in range(cols):
        # element (row, col) has flat index row * cols + col
        if row * cols + col == max_index_1d:
            print(f'Max value is {arr_3_3.max()} in position: {(row, col)}')

# other approaches are possible


Max value is 73 in position: (2, 2)
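
One such alternative is `np.unravel_index`, which converts a flat index into a (row, col) tuple for a given shape. The sketch below hard-codes the values from the example output so that it is self-contained; the conversion to plain `int` is only there for cleaner printing.

```python
import numpy as np

arr_3_3 = np.array([[64, 36, 48],
                    [36, 45,  1],
                    [70, 47, 73]])   # same values as the example output

# convert the flat argmax/argmin index into (row, col)
max_pos = tuple(int(i) for i in np.unravel_index(arr_3_3.argmax(), arr_3_3.shape))
min_pos = tuple(int(i) for i in np.unravel_index(arr_3_3.argmin(), arr_3_3.shape))

print("Max position:", max_pos)
print("Min position:", min_pos)
```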


In [20]:
# we can also find the max or min of each row (row-wise)
# to do that we iterate through columns [0,1,2] for the first row,
# columns [0,1,2] for the second row, ...
# the shape attribute returns (rows, cols): rows are at position 0
# and columns are at position 1
print(f'Rows: {arr_3_3.shape[0]}')
print(f'Cols: {arr_3_3.shape[1]}')

Rows: 3
Cols: 3


In [22]:
# let's find the max value of each row (iterating through columns)
print(arr_3_3)
print(f'Max value for each row: {arr_3_3.max(axis=1)}')
# axis=1 means the reduction runs along the columns, giving the max of each row

[[64 36 48]
 [36 45  1]
 [70 47 73]]
Max value for each row: [64 45 73]


In [25]:
# to find the min value of each column, we iterate
# through rows using axis=0
# (column-wise)
print(arr_3_3)
print(f'Min value for each column: {arr_3_3.min(axis=0)}')

[[64 36 48]
 [36 45  1]
 [70 47 73]]
Min value for each column: [36 36  1]


In [26]:
# axis is available in most NumPy reductions such as sum, mean, std ...
# another important parameter in most of these functions is 'keepdims'
print(arr_3_3)
print(f'Min value for each column: {arr_3_3.min(axis=0, keepdims=True)}')

# it finds the minimum of each column; because of keepdims=True the reduced
# axis (axis=0) is kept with length 1, so the result is a single row of shape (1, 3)

[[64 36 48]
 [36 45  1]
 [70 47 73]]
Min value for each column: [[36 36  1]]


In [28]:
# if axis=1 and keepdims=True:
# because of axis=1, the reduction runs along the columns (one value per row)
# because of keepdims=True, the reduced axis is kept with length 1,
# so the result is shown as a column of shape (3, 1)
print(f'Max value for each row: \n {arr_3_3.max(axis=1, keepdims=True)}')

Max value for each row: 
 [[64]
 [45]
 [73]]
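
One practical reason to keep the reduced dimension is broadcasting: a (3, 1) result lines up with the rows of the original (3, 3) array. The sketch below hard-codes the values from the example output and computes how far each element is from its row maximum; the variable names are illustrative.

```python
import numpy as np

arr = np.array([[64, 36, 48],
                [36, 45,  1],
                [70, 47, 73]])

row_max_flat = arr.max(axis=1)                 # shape (3,)
row_max_kept = arr.max(axis=1, keepdims=True)  # shape (3, 1)
print(row_max_flat.shape, row_max_kept.shape)

# with shape (3, 1), each row's maximum is broadcast across that row
distance_to_row_max = row_max_kept - arr
print(distance_to_row_max)
```

With the flat (3,) version, the same subtraction would instead align the maxima with the columns, which is not what is wanted here.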


## Data Type and Type Conversion
- Checking Data Type (dtype)
- Creating Arrays with a Specific Data Type
- Converting between types (astype()).

In [29]:
# we can define the type of elements in numpy array while creating the array:
arr = np.array([1, 2, 3], dtype='int')
print("Data type:", arr.dtype)

Data type: int64


In [30]:
# to convert elements to another type after creation
# we can use the astype() method
# note: it does not change the original array;
# it returns a copy with the specified type

arr_float = arr.astype('float')
print("Converted to float:", arr_float, arr_float.dtype)

Converted to float: [1. 2. 3.] float64
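
Conversion in the other direction, from float to int, drops the fractional part rather than rounding. A small sketch with illustrative values:

```python
import numpy as np

measurements = np.array([1.7, -1.7, 2.0])
as_int = measurements.astype(int)   # fractional part is dropped (truncation toward zero)

print(as_int)
print(measurements)                 # the original array is unchanged
```

If rounding is wanted instead, one option is to call `np.round` before converting.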


In [32]:
# previously we saw np.zeros() returning zeros with type float:
# we can change it while creating it, using dtype parameter
print(np.zeros(shape=5, dtype = int))

[0 0 0 0 0]


In [34]:
# converting elements to type str:
print(arr)
print(arr.astype(str)) # copy of arr, with elements converted to 'str'

[1 2 3]
['1' '2' '3']


In [35]:
# conversion will fail in this next case
vowel = np.array(['a','e','i','o','u','y'])
print(vowel, vowel.dtype)

# <U1 means a Unicode string of at most 1 character

['a' 'e' 'i' 'o' 'u' 'y'] <U1


In [38]:
# trying to convert to int will fail
try:
    print(vowel.astype(int))
except ValueError as error:
    print('Conversion failed:')
    print(error)

# it is good practice to wrap risky conversions in try/except

Conversion failed:
invalid literal for int() with base 10: np.str_('a')
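
By contrast, strings that do represent integers convert without error. The sketch below shows that case and wraps the try/except pattern in a small helper; `try_astype_int` is a hypothetical name introduced here for illustration, not a NumPy function.

```python
import numpy as np

digits = np.array(['1', '2', '3'])   # strings that represent integers
numbers = digits.astype(int)
print(numbers, numbers.dtype)

def try_astype_int(array):
    """Hypothetical helper: return an int copy, or None if conversion fails."""
    try:
        return array.astype(int)
    except ValueError:
        return None

print(try_astype_int(np.array(['a', 'e'])))   # conversion fails, so None is returned
```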


### reshape() method
The reshape() method in NumPy is used to change the shape of an array without changing its data. It returns an array with the specified shape (a view of the same data when possible, otherwise a copy), and the total number of elements must remain the same.

In [39]:
# Example 1: Reshaping a 1D array into a 2D array
array_1 = np.array([1, 2, 3, 4, 5, 6])
reshaped_arr = array_1.reshape(2, 3)
print(reshaped_arr)

[[1 2 3]
 [4 5 6]]


In [41]:
# it does not change the shape of the original array
print('Original array_1: ', array_1)

Original array_1:  [1 2 3 4 5 6]


In [43]:
# Example 2: Reshaping a 1D array into a 3D array
array_2 = np.arange(12)
reshaped_arr = array_2.reshape(2, 3, 2) # 2*3*2 = 12
print(reshaped_arr)

[[[ 0  1]
  [ 2  3]
  [ 4  5]]

 [[ 6  7]
  [ 8  9]
  [10 11]]]


In [44]:
# Example 3: Reshaping a 2D array into a 1D array
arr_2_3 = np.array([[1, 2, 3], [4, 5, 6]])
reshaped_arr = arr_2_3.reshape(6)  # 6 because of 6 elements
print(reshaped_arr)

[1 2 3 4 5 6]


In [45]:
# instead of hard-coding the number of elements ('6' in the previous case)
# we can do this:
reshaped_arr_dynamic = arr_2_3.reshape(-1)
# Here, -1 tells NumPy to calculate the required size automatically.
print(reshaped_arr_dynamic)

[1 2 3 4 5 6]
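
The `-1` placeholder also works for a single dimension of a multi-dimensional shape, as long as the total number of elements divides evenly. The sketch below uses an illustrative 12-element array and also shows the error raised when the sizes do not match.

```python
import numpy as np

data = np.arange(12)

three_cols = data.reshape(-1, 3)   # rows inferred: 12 / 3 = 4
two_rows = data.reshape(2, -1)     # columns inferred: 12 / 2 = 6
print(three_cols.shape, two_rows.shape)

# for a contiguous array like this one, the reshaped result shares the data
print(np.shares_memory(data, three_cols))

try:
    data.reshape(5, -1)            # 12 is not divisible by 5
except ValueError as error:
    print("reshape failed:", error)
```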


## NumPy Indexing and Fancy Indexing
Selecting specific elements from arrays.

In [46]:
arr_1d = np.array([10, 20, 30, 40, 50])
print(arr_1d[0])  # First element
print(arr_1d[-1]) # Last element

10
50


In [47]:
# 1D Array Slicing
# same as in python lists, using [:]

# Slice elements from index 1 to 3 (excluding index 3)
print(arr_1d[1:3])  # Output: [20 30]

# Slice from the beginning to index 3 (excluding index 3)
print(arr_1d[:3])  # Output: [10 20 30]

# Slice from index 2 to the end
print(arr_1d[2:])  # Output: [30 40 50]

[20 30]
[10 20 30]
[30 40 50]


In [49]:
# Slice with step (every second element)
print(arr_1d[::2])  # Output: [10 30 50]

# Reverse the array
print(arr_1d[::-1])  # Output: [50 40 30 20 10]

# Another way to reverse
print(np.flip(arr_1d))

[10 30 50]
[50 40 30 20 10]
[50 40 30 20 10]


In [50]:
# Simple Indexing in a 2D Array
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
print(arr_2d[1, 2])  # Row 1, Column 2

# Row index 1 → [4, 5, 6]
# Column index 2 in this row → 6

6


In [51]:
# Slicing in a 2D Array
# Extracting a subarray (First 2 rows and first 2 columns)
print(arr_2d[:2, :2])

[[1 2]
 [4 5]]


In [52]:
# Extracting a Single Row
print(arr_2d[1, :])  # Extracts all columns from Row index 1

[4 5 6]


In [53]:
# Extracting a Single Column
print(arr_2d[:, 2])  # Extracts all rows from Column index 2

[3 6 9]


In [54]:
# Skipping Elements with a Step in Slicing

print(arr_2d[::2, ::2])  # Every 2nd row and every 2nd column

[[1 3]
 [7 9]]
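
The examples above use basic indexing and slicing. Fancy indexing, mentioned in the section title, selects elements with a list of indices or a boolean mask instead. A minimal sketch that reuses the same example arrays (the result names are illustrative):

```python
import numpy as np

arr_1d = np.array([10, 20, 30, 40, 50])
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

picked = arr_1d[[0, 2, 4]]        # elements at positions 0, 2 and 4
large = arr_1d[arr_1d > 25]       # boolean mask: elements greater than 25
pairs = arr_2d[[0, 2], [1, 2]]    # elements at (0, 1) and (2, 2)
rows = arr_2d[[2, 0]]             # rows 2 and 0, in that order

print(picked, large, pairs)
print(rows)
```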


### NumPy copy() Method
The copy() method in NumPy creates a new independent copy of an array. Changes made to the copied array do not affect the original array.

### Why use copy()?
If you don't use copy(), modifying a new variable that refers to the original array will also modify the original data. This happens because plain assignment only creates another reference to the same array, and slices of an array are views (not actual copies) unless explicitly copied.


In [55]:
original_arr = np.array([1, 2, 3, 4, 5])
arr_view = original_arr  # No copy, just a reference

arr_view[0] = 100  # Modify the new variable
print(original_arr)  # Original array is also changed!

[100   2   3   4   5]


In [56]:
# Example With copy() (Independent Copy)

first_array = np.array([1, 2, 3, 4, 5])
arr_copy = first_array.copy()  # Creates an independent copy

arr_copy[0] = 100  # Modify the copied array
print(first_array)  # Original remains unchanged
print(arr_copy)  # Only the copy is changed


[1 2 3 4 5]
[100   2   3   4   5]
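
The same caution applies to slices: a basic slice is a view on the original data, so writing through it changes the original. A short sketch, with illustrative names, that contrasts a slice with an explicit copy of that slice:

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5])

view = original[1:4]            # basic slice: a view on the same data
view[0] = 99
print(original)                 # the change is visible in the original

safe = original[1:4].copy()     # explicit copy of the slice
safe[0] = -1
print(original)                 # unchanged by edits to the copy

print(np.shares_memory(original, view), np.shares_memory(original, safe))
```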
