# NumPy

- Arrays
- Array Attributes
- Data Types
- Filling Arrays
- NAN & INF
- Mathematical Operations
- Array Methods
- Structuring Methods
- Concatenating, Stacking, Splitting
- Random
- Importing and exporting NumPy arrays
- NumPy and pandas example

In [2]:
import numpy as np

### Arrays:

In [5]:
# Python
a = [1,2,3,4,5]
print(a)

[1, 2, 3, 4, 5]


In [12]:
# Numpy
b = np.array([1,2,3,4,5])
print(type(b))
print(b)
print(b[1])
print(b[1:])
print(b[:-2])
b[1] = 100
print(b)

<class 'numpy.ndarray'>
[1 2 3 4 5]
2
[2 3 4 5]
[1 2 3]
[  1 100   3   4   5]


### Array Attributes

In [24]:
c_mul = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(c_mul)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [25]:
print(c_mul[1, 1])  # 5
print(c_mul[1, :])  # [4 5 6]

5
[4 5 6]


In [21]:
# Selects every row of the column at index 1 => [2 5 8]
print(c_mul[:, 1])

[2 5 8]


In [23]:
# 1: selects rows from index 1 (inclusive) to the end.
# :2 selects columns from the beginning up to index 2 (exclusive).
print(c_mul[1:, :2]) 

[[4 5]
 [7 8]]


In [30]:
d_mul = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(d_mul.shape)

(3, 3)


In [36]:
e_mul = np.array(
    [
        [[1, 2, 3, 1], [4, 5, 6, 1], [7, 8, 9, 1]],
        [[10, 11, 12, 1], [13, 14, 15, 1], [16, 17, 18, 1]],
    ]
)

print(e_mul.shape) # (2, 3, 4) => 2 matrices, 3 rows, 4 columns
print(e_mul.ndim)  # number of dimensions (levels of list nesting)
print(e_mul.size)  # total number of elements in the array

(2, 3, 4)
3
24


---

### Data Types

An array holds a single data type, so if any element is a string, NumPy converts every element to a string.

In [43]:
f_mul = np.array([[1, 2, 3], [4, "Hello", 6], [7, 8, 9]])
print(f_mul.dtype)  # <U11 => Unicode strings of at most 11 characters
print(type(f_mul[1, 1]))  # <class 'numpy.str_'>

<U11
<class 'numpy.str_'>


In [40]:
g_mul = np.array([[1, 2, 3], [4, 1, 6], [7, 8, 9]])
print(g_mul.dtype)

int32


Typecasting:

In [49]:
h_mul = np.array([[1, 2, 3], [4, "1", 6], [7, 8, 9]], dtype=np.int32)
print(h_mul.dtype)

int32
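
An existing array can also be converted after creation with `astype`, which returns a new array and leaves the original untouched. A minimal sketch (the variable names are illustrative only):

```python
import numpy as np

# Strings that hold valid integers can be converted to a numeric dtype.
str_arr = np.array(["1", "2", "3"])
int_arr = str_arr.astype(np.int32)
print(int_arr.dtype)  # int32

# Converting floats to integers truncates toward zero (no rounding).
float_arr = np.array([1.9, -1.9, 2.5])
truncated = float_arr.astype(np.int64)
print(truncated)  # [ 1 -1  2]
```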


---

### Filling Arrays

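`np.full(shape, value)` creates an array of the given shape in which every element equals `value`. For the shape `(2, 3, 4, 5)`:<br>
2: The array has 2 blocks.<br>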
3: Each block contains 3 matrices.<br>
4: Each matrix has 4 rows.<br>
5: Each row has 5 columns.<br>

In [5]:
a = np.full((2, 3, 4, 5), 5)
a

array([[[[5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5]],

        [[5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5]],

        [[5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5]]],


       [[[5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5]],

        [[5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5]],

        [[5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5],
         [5, 5, 5, 5, 5]]]])

In [6]:
b = np.zeros((2, 3, 4))
b

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

In [7]:
c = np.ones((2, 3))
c

array([[1., 1., 1.],
       [1., 1., 1.]])

In [11]:
d = np.empty((2, 3, 4))
d

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])
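
Note that `np.empty` only allocates memory and does not initialize it. The zeros printed above are whatever happened to be in that memory, so they should not be relied on. A sketch of the usual pattern, which writes every element before reading any of them (`buf` is an illustrative name):

```python
import numpy as np

buf = np.empty((2, 3))  # contents are arbitrary at this point
for i in range(buf.shape[0]):
    buf[i, :] = i + 1   # overwrite every row before using the array
print(buf)
```

If the array needs a defined starting value, `np.zeros` or `np.full` is the safer choice.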

In [12]:
'''
An array with evenly spaced values within a specified range.
start = 10: The sequence starts at 10.
stop = 90: The sequence stops before 90 (90 is not included).
step = 3: The difference between each pair of consecutive values is 3.
'''

e = np.arange(10, 90, 3)
e

array([10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55, 58,
       61, 64, 67, 70, 73, 76, 79, 82, 85, 88])

In [29]:
e = np.arange(6)
e

array([0, 1, 2, 3, 4, 5])

In [13]:
"""
Creates an array of evenly spaced values over a specified interval.
start = 0: The sequence starts at 0.
stop = 1: The sequence ends at 1 (1 is included by default).
num = 5: The number of generated values is 5.
"""

f = np.linspace(0, 1, 5)
f

array([0.  , 0.25, 0.5 , 0.75, 1.  ])
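
By default `np.linspace` includes the stop value, which is the main difference from `np.arange`. The sketch below shows two optional parameters that control this behaviour, `endpoint` and `retstep`:

```python
import numpy as np

# endpoint=False excludes the stop value, as np.arange does.
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)  # [0.  0.2 0.4 0.6 0.8]

# retstep=True also returns the spacing between consecutive values.
values, step = np.linspace(0, 1, 5, retstep=True)
print(step)  # 0.25
```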

---

### NAN & INF

In [18]:
print(np.nan)
print(np.inf)

print(np.isnan(np.inf))
print(np.isinf(np.inf))



nan
inf
False
True
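
Two properties of NaN are easy to trip over: it never compares equal to anything, and it propagates through arithmetic. The following sketch illustrates both, along with `np.nanmean`, which ignores NaN values:

```python
import numpy as np

# NaN never compares equal to anything, including itself.
print(np.nan == np.nan)  # False

data = np.array([1.0, np.nan, 3.0])
print(np.isnan(data))    # [False  True False]
print(np.mean(data))     # nan, because NaN propagates through the sum
print(np.nanmean(data))  # 2.0, the mean of the non-NaN values
```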


---

### Mathematical Operations

In [20]:
# Python list
l1 = [1, 2, 3, 4, 5]
l2 = [10, 9, 8, 7, 6]

# Numpy array
a1 = np.array(l1)
a2 = np.array(l2)   


Multiplication:

In [21]:
# Repeats the whole list 5 times
print(l1 * 5)

[1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5]


In [22]:
# Each element of the array is multiplied by 5
print(a1 * 5)

[ 5 10 15 20 25]


Other operators (`+`, `-`, `/`):

In [24]:
# Takes two lists and concatenates them
print(l1 + l2)

[1, 2, 3, 4, 5, 10, 9, 8, 7, 6]


In [25]:
# NumPy arrays will perform element-wise addition, resulting in a new array 
# where each element is the sum of the corresponding elements.
print(a1 + a2)

[11 11 11 11 11]
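
Subtraction and division work element-wise in the same way, whereas Python lists do not support `-` or `/` at all and raise a `TypeError`. A short sketch that redefines the same two arrays so it runs on its own:

```python
import numpy as np

a1 = np.array([1, 2, 3, 4, 5])
a2 = np.array([10, 9, 8, 7, 6])

print(a2 - a1)   # [9 7 5 3 1]
print(a2 / a1)   # true division always produces floats
print(a2 // a1)  # floor division keeps integers: [10  4  2  1  1]
```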


Mathematical functions:

In [27]:
g = np.array([[1, 2], [3, 4]])
print(np.sqrt(g))
print(np.sin(g))

[[1.         1.41421356]
 [1.73205081 2.        ]]
[[ 0.84147098  0.90929743]
 [ 0.14112001 -0.7568025 ]]


---

### Array Methods

In [31]:
arr = np.array([1, 2, 3, 4, 5])

# Calculates the mean of array elements.
mean = np.mean(arr)
print(mean)

# Calculates the sum of array elements.
total = np.sum(arr)
print(total)

# Finds the maximum value in an array.
max_value = np.max(arr)
print(max_value)

# Finds the minimum value in an array.
min_value = np.min(arr)
print(min_value)

# Finds the index of the maximum value in an array.
index_max = np.argmax(arr)
print(index_max)

# Finds the index of the minimum value in an array.
index_min = np.argmin(arr)
print(index_min)

3.0
15
5
1
4
0
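
On multi-dimensional arrays these functions accept an `axis` argument that selects the dimension to reduce over. A minimal sketch with an illustrative 2x3 array `m`:

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])

print(m.sum())         # 21, over all elements
print(m.sum(axis=0))   # [5 7 9], one sum per column
print(m.sum(axis=1))   # [ 6 15], one sum per row
print(m.mean(axis=1))  # [2. 5.], one mean per row
```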


Reshapes an array without changing its data.

In [28]:
# np.arange(6) => [0 1 2 3 4 5]
arr = np.arange(6).reshape((2, 3))
print(arr)

[[0 1 2]
 [3 4 5]]
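
One dimension may be given as `-1`, in which case NumPy infers it from the total number of elements. If the sizes do not divide evenly, `reshape` raises a `ValueError`. A sketch with illustrative names:

```python
import numpy as np

arr12 = np.arange(12)

# -1 lets NumPy work out the missing dimension: 12 / 3 = 4 columns.
auto = arr12.reshape((3, -1))
print(auto.shape)  # (3, 4)

# 12 elements cannot be arranged into 5 rows, so this fails.
try:
    arr12.reshape((5, -1))
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```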


Transposes the dimensions of an array.

In [30]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
transposed = np.transpose(arr)
print(transposed)

[[1 4]
 [2 5]
 [3 6]]


Concatenates two or more arrays.

In [32]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
concatenated = np.concatenate((arr1, arr2))
print(concatenated)

[1 2 3 4 5 6]


Append values to the end of an array.

In [33]:
# Original array
arr = np.array([1, 2, 3, 4, 5])

# Append a single value
arr_appended = np.append(arr, 6)
print(arr_appended)

# Append multiple values
arr_appended_multiple = np.append(arr, [6, 7, 8])
print(arr_appended_multiple)

[1 2 3 4 5 6]
[1 2 3 4 5 6 7 8]


In [35]:
# 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

# Append along axis 0 (rows)
arr_appended_axis0 = np.append(arr_2d, [[7, 8, 9]], axis=0)
print(arr_appended_axis0)

print('------')

# Append along axis 1 (columns)
arr_appended_axis1 = np.append(arr_2d, [[7], [8]], axis=1)
print(arr_appended_axis1)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
------
[[1 2 3 7]
 [4 5 6 8]]


Insert values into an array at specified indices.

In [36]:
# Original array
arr = np.array([1, 2, 3, 4, 5])

# Insert a single value at index 2
arr_inserted = np.insert(arr, 2, 99)
print(arr_inserted)

# Insert multiple values at index 2
arr_inserted_multiple = np.insert(arr, 2, [99, 100])
print(arr_inserted_multiple)

[ 1  2 99  3  4  5]
[  1   2  99 100   3   4   5]


In [38]:
# 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

# Insert along axis 0 (rows)
arr_inserted_axis0 = np.insert(arr_2d, 1, [7, 8, 9], axis=0)
print(arr_inserted_axis0)

print("------")

# Insert along axis 1 (columns)
arr_inserted_axis1 = np.insert(arr_2d, 1, [7, 8], axis=1)
print(arr_inserted_axis1)

[[1 2 3]
 [7 8 9]
 [4 5 6]]
------
[[1 7 2 3]
 [4 8 5 6]]


Remove elements from an array at specified indices.

In [39]:
# Original array
arr = np.array([1, 2, 3, 4, 5])

# Delete the element at index 2
arr_deleted = np.delete(arr, 2)
print(arr_deleted)

# Delete multiple elements at indices 1 and 3
arr_deleted_multiple = np.delete(arr, [1, 3])
print(arr_deleted_multiple)

[1 2 4 5]
[1 3 5]


In [40]:
# 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

# Delete the second row (index 1)
arr_deleted_axis0 = np.delete(arr_2d, 1, axis=0)
print(arr_deleted_axis0)

print("------")

# Delete the second column (index 1)
arr_deleted_axis1 = np.delete(arr_2d, 1, axis=1)
print(arr_deleted_axis1)

[[1 2 3]]
------
[[1 3]
 [4 6]]


---

### Structuring Methods

The shape attribute returns the dimensions of the array.

In [41]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.shape)

(2, 3)


The `reshape()` method changes the shape of an array without changing its data.

In [42]:
arr = np.array([1, 2, 3, 4, 5, 6])
reshaped = np.reshape(arr, (2, 3))
print(reshaped)

[[1 2 3]
 [4 5 6]]


The `resize()` method changes the shape and size of an array in place.

In [43]:
arr = np.array([1, 2, 3, 4, 5, 6])
arr.resize((2, 3))
print(arr)

[[1 2 3]
 [4 5 6]]
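
Unlike `reshape`, resizing may change the total number of elements. The function `np.resize` and the method `ndarray.resize` fill the extra space differently, as this sketch shows. `refcheck=False` is used here only so the example also runs in interactive sessions, where extra references to the array can otherwise make the in-place resize raise an error:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# np.resize (the function) returns a new array and repeats the data to fill it.
repeated = np.resize(arr, (2, 4))
print(repeated)
# [[1 2 3 4]
#  [5 6 1 2]]

# ndarray.resize (the method) works in place and pads with zeros.
arr.resize((2, 4), refcheck=False)
print(arr)
# [[1 2 3 4]
#  [5 6 0 0]]
```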


The `flatten()` method returns a copy of the array collapsed into one dimension.

In [44]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
flattened = arr.flatten()
print(flattened)

[1 2 3 4 5 6]


The `ravel()` method returns a flattened array. Unlike `flatten()`, it returns a view when possible.

In [45]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
raveled = arr.ravel()
print(raveled)

[1 2 3 4 5 6]
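
The copy-versus-view distinction matters as soon as the result is modified. In the sketch below, writing to the `flatten()` result leaves the original alone, while writing to the `ravel()` result changes it (the array here is contiguous, so `ravel()` can return a view):

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr.flatten()
flat_view = arr.ravel()

print(np.shares_memory(arr, flat_copy))  # False
print(np.shares_memory(arr, flat_view))  # True

flat_copy[0] = -1  # does not affect arr
flat_view[1] = 99  # writes through to arr
print(arr)
# [[ 1 99  3]
#  [ 4  5  6]]
```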


The `swapaxes()` method interchanges two axes of an array.<br>
This can be particularly useful when you need to reorient the data in a multi-dimensional array.

- Original Array Shape: The original array arr has a shape of (2, 3), meaning it has 2 rows and 3 columns.
- Axes:
    - Axis 0 refers to the rows.
    - Axis 1 refers to the columns.

When you swap axis 0 with axis 1:
- The rows become columns.
- The columns become rows.

In [46]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
swapped = np.swapaxes(arr, 0, 1)
print(swapped)

[[1 4]
 [2 5]
 [3 6]]
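
For a 2-D array, swapping axes 0 and 1 is the same as transposing. The function becomes more useful with three or more dimensions, where any pair of axes can be exchanged. A sketch with an illustrative 3-D array `cube`:

```python
import numpy as np

cube = np.arange(24).reshape((2, 3, 4))

# Exchange the first and last axes; the middle axis stays in place.
swapped_cube = np.swapaxes(cube, 0, 2)
print(swapped_cube.shape)  # (4, 3, 2)

# The element at [i, j, k] in cube is found at [k, j, i] after the swap.
print(cube[1, 2, 3], swapped_cube[3, 2, 1])  # 23 23
```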


---

### Concatenating, Stacking, Splitting

Join a sequence of arrays along an existing axis.

In [47]:
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6]])

# Concatenate along axis 0 (rows)
concatenated = np.concatenate((arr1, arr2), axis=0)
print(concatenated)

[[1 2]
 [3 4]
 [5 6]]


Join a sequence of arrays along a new axis.

In [48]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Stack along a new axis
stacked = np.stack((arr1, arr2), axis=0)
print(stacked)

[[1 2 3]
 [4 5 6]]
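
The `axis` argument determines where the new dimension is inserted. With `axis=1` the inputs become columns instead of rows, and unlike `np.concatenate` the result always has one more dimension than the inputs. A short sketch:

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# axis=1 places each input array in its own column.
stacked_cols = np.stack((arr1, arr2), axis=1)
print(stacked_cols)
# [[1 4]
#  [2 5]
#  [3 6]]

# concatenate joins along an existing axis, so the result stays 1-D.
joined = np.concatenate((arr1, arr2))
print(stacked_cols.shape, joined.shape)  # (3, 2) (6,)
```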


Stacks arrays in sequence vertically (row-wise).

In [49]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Vertical stack
vstacked = np.vstack((arr1, arr2))
print(vstacked)

[[1 2 3]
 [4 5 6]]


Stacks arrays in sequence horizontally (column-wise). For 1-D arrays this concatenates them end to end.

In [51]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr3 = np.array([7, 8, 9])

# Horizontal stack
hstacked = np.hstack((arr1, arr2, arr3))
print(hstacked)

[1 2 3 4 5 6 7 8 9]


Splits an array into multiple sub-arrays.

In [52]:
arr = np.array([1, 2, 3, 4, 5, 6])

# Split into 3 equal sub-arrays
split_arr = np.split(arr, 3)
print(split_arr)

[array([1, 2]), array([3, 4]), array([5, 6])]
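
When given an integer, `np.split` requires the array to divide evenly and raises a `ValueError` otherwise. Two alternatives are sketched below: passing a list of indices to split at, and using `np.array_split`, which allows uneven pieces (the 7-element array is illustrative):

```python
import numpy as np

arr7 = np.arange(1, 8)  # [1 2 3 4 5 6 7]

# Split before index 2 and before index 5.
parts = np.split(arr7, [2, 5])
print(parts)  # [array([1, 2]), array([3, 4, 5]), array([6, 7])]

# array_split tolerates a length that is not divisible by 3.
uneven = np.array_split(arr7, 3)
print([p.tolist() for p in uneven])  # [[1, 2, 3], [4, 5], [6, 7]]
```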


Splits an array into multiple sub-arrays vertically (row-wise).

In [53]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Vertical split into 3 sub-arrays
vsplit_arr = np.vsplit(arr, 3)
print(vsplit_arr)

[array([[1, 2, 3]]), array([[4, 5, 6]]), array([[7, 8, 9]])]


Splits an array into multiple sub-arrays horizontally (column-wise).

In [55]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

# Horizontal split into 3 sub-arrays
hsplit_arr = np.hsplit(arr, 3)
hsplit_arr

[array([[1],
        [4]]),
 array([[2],
        [5]]),
 array([[3],
        [6]])]

---

### Random

##### Returns random integers from a specified range.

In [56]:
# Generate a single random integer between 0 (inclusive) and 10 (exclusive)
rand_int = np.random.randint(0, 10)
print(rand_int)

# Generate a 2x3 array of random integers between 0 (inclusive) and 10 (exclusive)
rand_int_array = np.random.randint(0, 10, size=(2, 3))
print(rand_int_array)

7
[[1 7 0]
 [4 2 5]]


##### Draw samples from a binomial distribution.

n: Number of trials (e.g., 10 coin flips).<br>
p: Probability of success (e.g., 0.5 for a fair coin).<br>
size: Number of samples to draw.<br>

In [57]:
# Parameters: n (number of trials), p (probability of success), size (output shape)
binom_sample = np.random.binomial(n=10, p=0.5, size=5)
print(binom_sample)

[3 5 5 6 5]


##### Draw samples from a normal (Gaussian) distribution.

loc: Mean of the distribution (e.g., 0).<br>
scale: Standard deviation of the distribution (e.g., 1).<br>
size: Number of samples to draw.<br>

In [58]:
# Parameters: loc (mean), scale (standard deviation), size (output shape)
normal_sample = np.random.normal(loc=0, scale=1, size=5)
print(normal_sample)

[ 1.39543127  0.39280393  1.24144979 -0.67939082  1.15626805]


##### Generates a random sample from a given 1-D array.

In [59]:
# Generate a single random choice from the array [1, 2, 3, 4, 5]
choice_sample = np.random.choice([1, 2, 3, 4, 5])
print(choice_sample)

# Generate a 3-element array of random choices from the array [1, 2, 3, 4, 5]
choice_array = np.random.choice([1, 2, 3, 4, 5], size=3)
print(choice_array)

4
[4 4 4]
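
The outputs above change on every run. When reproducible results are needed, one option is NumPy's `Generator` interface, created with `np.random.default_rng` and a fixed seed. The sketch below assumes a seed of 42, chosen arbitrarily; note that the generator method for integers is called `integers`, not `randint`:

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

ints_a = rng_a.integers(0, 10, size=(2, 3))  # 0 inclusive, 10 exclusive
ints_b = rng_b.integers(0, 10, size=(2, 3))
print(np.array_equal(ints_a, ints_b))  # True

# The same generator also offers the distributions shown above.
normal_sample = rng_a.normal(loc=0, scale=1, size=5)

# replace=False draws without repetition.
unique_choice = rng_a.choice([1, 2, 3, 4, 5], size=3, replace=False)
print(unique_choice)
```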


---

### Importing and exporting NumPy arrays

Importing and exporting NumPy arrays is essential for saving and loading data efficiently. NumPy provides several methods to handle this, including `np.save`, `np.load`, `np.savetxt`, and `np.loadtxt`.

##### Save and load arrays in NumPy's binary format (`.npy`).

In [61]:
# Create a sample array
arr = np.array([1, 2, 3, 4, 5])

# Save the array to a file
np.save("../datasets/array.npy", arr)

In [62]:
# Load the array from the file
loaded_arr = np.load("../datasets/array.npy")
print(loaded_arr)

[1 2 3 4 5]


##### Save and load arrays in a text format (e.g., `.txt` or `.csv`).

In [66]:
# Create a sample 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

# Save the array to a text file
np.savetxt("../datasets/array.csv", arr_2d, delimiter=",")

In [67]:
# Load the array from the text file
loaded_arr_2d = np.loadtxt("../datasets/array.csv", delimiter=",")
print(loaded_arr_2d)

[[1. 2. 3.]
 [4. 5. 6.]]
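
As the output shows, the integers come back as floats, because `np.savetxt` writes floating-point text by default and `np.loadtxt` reads floats by default. A sketch of one way to keep integers, using the `fmt` and `dtype` parameters; a temporary directory stands in for the `../datasets/` path so the example is self-contained:

```python
import os
import tempfile

import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "array.csv")
    # fmt="%d" writes each value as a plain integer.
    np.savetxt(path, arr_2d, delimiter=",", fmt="%d")
    # dtype=int parses the values as integers instead of floats.
    loaded_ints = np.loadtxt(path, delimiter=",", dtype=int)

print(loaded_ints)
# [[1 2 3]
#  [4 5 6]]
```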


---

### NumPy and pandas example

In [68]:
import pandas as pd

# Step 1: Create a NumPy Array
np_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("NumPy Array:")
print(np_array)

# Step 2: Convert NumPy Array to pandas DataFrame
df = pd.DataFrame(np_array, columns=["A", "B", "C"])
print("\nPandas DataFrame:")
print(df)

# Step 3: Perform Basic Operations with pandas
df["D"] = df["A"] + df["B"]
print("\nDataFrame after adding column 'D':")
print(df)

filtered_df = df[df["A"] > 2]
print("\nFiltered DataFrame (A > 2):")
print(filtered_df)

# Step 4: Convert pandas DataFrame back to NumPy Array
final_np_array = filtered_df.to_numpy()
print("\nFinal NumPy Array:")
print(final_np_array)

NumPy Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Pandas DataFrame:
   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9

DataFrame after adding column 'D':
   A  B  C   D
0  1  2  3   3
1  4  5  6   9
2  7  8  9  15

Filtered DataFrame (A > 2):
   A  B  C   D
1  4  5  6   9
2  7  8  9  15

Final NumPy Array:
[[ 4  5  6  9]
 [ 7  8  9 15]]


**Explanation**

1. **NumPy Array**: We start by creating a 2D NumPy array.
2. **pandas DataFrame**: We convert the NumPy array to a pandas DataFrame for easier data manipulation.
3. **Basic Operations**: We add a new column to the DataFrame and filter rows based on a condition.
4. **Back to NumPy**: Finally, we convert the filtered DataFrame back to a NumPy array.

This example shows how NumPy and pandas can be used together to handle and manipulate data efficiently.