# NumPy and Arrays



## Introduction

The first object we encounter in NumPy is the array. Arrays can be created from other Python objects, such as lists and ranges.


### Creating Arrays

In [4]:
import numpy as np

In [2]:
# Creating an array from a list
list1 = [2, 4, 6]
np.array(list1)

array([2, 4, 6])

In [3]:
# Creating an array from a list with mixed data types
np.array(['pippo', 2, 47])

array(['pippo', '2', '47'], dtype='<U21')

In [4]:
# Creating an array using the range function
np.array(range(20, 40, 5))

array([20, 25, 30, 35])

The range function generates a sequence of integers from start up to, but not including, end, in increments of step: range(start, end, step).

Note: The object we create is an instance of the ndarray class, while array is the function that constructs it.
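The distinction between the function and the class can be checked directly. The following is a minimal sketch; the variable names `arr` and `mixed` are illustrative and do not come from the notebook. It also shows that an array holds a single data type, which is why the mixed list above was converted entirely to strings.

```python
import numpy as np

# np.array is a function; the object it returns is an instance of np.ndarray
arr = np.array([2, 4, 6])
print(type(arr))                    # <class 'numpy.ndarray'>
print(isinstance(arr, np.ndarray))  # True

# An array holds a single dtype, so mixed input is coerced to a common type
mixed = np.array(['pippo', 2, 47])
print(mixed.dtype.kind)             # U (Unicode string)
```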



### Using np.arange()
We can also generate an array using the np.arange() function.

  



In [5]:
# Creating an array with np.arange()
np.arange(4)


array([0, 1, 2, 3])

In [6]:

np.arange(3, 11)


array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [7]:

np.arange(3, 11, 4)  # Syntax: (start, end, step)


array([3, 7])

## Creating Arrays of Zeros and Ones
We can use np.zeros and np.ones to create arrays of zeros or ones.


In [8]:
# Array of zeros
np.zeros(4)


array([0., 0., 0., 0.])

In [9]:
list1 = [1,2,3,4]
5 * np.array(list1)

array([ 5, 10, 15, 20])

In [10]:
np.array(list1) + 5

array([6, 7, 8, 9])

In [13]:
list10_15 = [10,11,12,13,14,15]
array3 = 3 * np.ones(6)
print(array3)
list10_15_array = np.array(list10_15)
list10_15_array + array3

[3. 3. 3. 3. 3. 3.]


array([13., 14., 15., 16., 17., 18.])

In [14]:
# The same element-wise sum with an explicit loop, for comparison
for i in range(len(list10_15)):
  print(list10_15[i] + array3[i])

13.0
14.0
15.0
16.0
17.0
18.0

In [15]:
# Array of ones
5 * np.ones(10)

#([5,5,5,5,5,5,5,5,5,5,5])


array([5., 5., 5., 5., 5., 5., 5., 5., 5., 5.])
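Both functions also accept a tuple as the shape, which gives a multi-dimensional array. The sketch below shows this, together with `np.full` as an alternative to multiplying `np.ones` by a scalar. The variable names are illustrative assumptions, not part of the notebook.

```python
import numpy as np

# Passing a tuple as the shape creates a multi-dimensional array
zeros_2d = np.zeros((2, 3))
ones_2d = np.ones((3, 3))

# np.full fills an array with an arbitrary constant,
# an alternative to multiplying np.ones by a scalar
fives = np.full(10, 5.0)

print(zeros_2d.shape)                          # (2, 3)
print(np.array_equal(fives, 5 * np.ones(10)))  # True
```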

## Vectorized Operations
An important feature of arrays is vectorized operations: when you perform an arithmetic operation on an array, it is applied to each element, with no explicit loop.

  



In [16]:
# Creating an array
np.arange(1, 7)


array([1, 2, 3, 4, 5, 6])

In [17]:
# Adding a scalar to an array
np.arange(1, 7) + 2

array([3, 4, 5, 6, 7, 8])

In [18]:
# Multiplying an array by a scalar
np.arange(1, 7) * 3


array([ 3,  6,  9, 12, 15, 18])
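To make "applied to each element" concrete, the sketch below compares a vectorized expression with the equivalent explicit loop. The names `values`, `vectorized`, and `looped` are illustrative only.

```python
import numpy as np

values = np.arange(1, 7)

# Vectorized: one expression, applied to every element
vectorized = values * 3 + 2

# Equivalent explicit loop over the elements
looped = np.array([v * 3 + 2 for v in values])

print(vectorized)                          # [ 5  8 11 14 17 20]
print(np.array_equal(vectorized, looped))  # True
```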

## Element-wise Operations
Arithmetic operators between two arrays of the same shape act element by element.
  



In [19]:
x_array = np.array([1, 3, 4, 5, 2, 6])
y_array = np.arange(1, 7)
z_array = np.arange(4,10)

In [20]:
# Adding two arrays
x_array + y_array
x_array + z_array


array([ 5,  8, 10, 12, 10, 15])

In [21]:
x_list = [1, 3, 4, 5, 2, 6]
y_list = [1,2,3,4,5,6]
x_list + y_list

[1, 3, 4, 5, 2, 6, 1, 2, 3, 4, 5, 6]

In [22]:
# Multiplying two arrays
x_array * y_array



array([ 1,  6, 12, 20, 10, 36])

In [23]:
# Squaring each element of the array
x_array ** 2


array([ 1,  9, 16, 25,  4, 36])

In [25]:
[1,2,3,4]**2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

## Mathematical Operations
NumPy provides many mathematical functions that operate on arrays element by element.

  



In [26]:
x_array = np.array([1, 3, 4, 5, 2])

In [27]:
# Square root of each element
np.sqrt(x_array)

array([1.        , 1.73205081, 2.        , 2.23606798, 1.41421356])

In [28]:
# Rounding the square roots to 2 decimal places
np.round(np.sqrt(x_array), 2)

array([1.  , 1.73, 2.  , 2.24, 1.41])

In [29]:
# Exponential of each element
np.exp(x_array)

# Sine of each element (in radians)
np.sin(x_array)

# Hypotenuse given the lengths of the other two sides
np.hypot(3, 4)

# Converting radians to degrees
print('Degrees: ', np.degrees(np.pi))



Degrees:  180.0


In [30]:
# Rounding to the nearest integer
print(np.rint(1.4))
print(np.rint(1.5))

1.0
2.0


In [31]:
# Cumulative sum
print(x_array)
np.cumsum(x_array)

[1 3 4 5 2]


array([ 1,  4,  8, 13, 15])

In [32]:
# Consecutive differences
np.ediff1d(x_array)

array([ 2,  1,  1, -3])

In [33]:
# Generating 5 equally spaced values between 0 and 10 (both endpoints included)
np.linspace(0, 10, 5)



array([ 0. ,  2.5,  5. ,  7.5, 10. ])

In [34]:
# Generating 20 equally spaced values between 1 and 10
np.linspace(1, 10, 20)


array([ 1.        ,  1.47368421,  1.94736842,  2.42105263,  2.89473684,
        3.36842105,  3.84210526,  4.31578947,  4.78947368,  5.26315789,
        5.73684211,  6.21052632,  6.68421053,  7.15789474,  7.63157895,
        8.10526316,  8.57894737,  9.05263158,  9.52631579, 10.        ])
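The spacing used by `np.linspace` follows from the endpoints and the number of points: (stop - start) / (num - 1). As a small sketch, the optional `retstep=True` argument returns that spacing along with the values; the variable names here are illustrative.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive values
values, step = np.linspace(0, 10, 5, retstep=True)
print(step)        # 2.5

# Unlike np.arange, the stop value is included by default
print(values[-1])  # 10.0
```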

## Creating 2-D or 3-D Arrays
The simplest way to create 2-D or 3-D arrays is by converting from lists.

  


In [35]:

# Creating a 2-D array from a list of lists
my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
np.array(my_matrix)



array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [36]:
# Creating a 3-D array
np.array([[[1, 0], [1, 1]], [[0.1, 0], [0.5, 0.7]]])


array([[[1. , 0. ],
        [1. , 1. ]],

       [[0.1, 0. ],
        [0.5, 0.7]]])

## Creating an Identity Matrix
To create an identity matrix, we can use the np.eye() function.

  



In [37]:
# Creating a 4x4 identity matrix
np.eye(4)

# Creating a 6x6 identity matrix
np.eye(6)



array([[1., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 1.]])

## Random Numbers in NumPy
NumPy provides numerous functionalities for generating (pseudo-)random numbers.


In [5]:
# Setting the seed for reproducibility
np.random.seed(1502)


In [54]:
# Generating random numbers uniformly distributed in [0, 1)
np.random.rand(5)

array([0.63935505, 0.79030897, 0.53047393, 0.23722854, 0.09067367])

In [46]:
# Generating a 3x3 matrix of random numbers
np.random.rand(3, 3)


array([[0.28045025, 0.48198631, 0.10486441],
       [0.33924931, 0.69361458, 0.66447322],
       [0.51270562, 0.72931104, 0.80151463]])

In [55]:
# Generating random numbers from a normal distribution (mean=0, variance=1)
np.random.randn(3, 4)

array([[-0.24024492, -1.07396596,  0.0479024 , -0.49332007],
       [-1.36045708, -1.09611306,  0.12510805, -1.85598727],
       [-2.01390746, -0.1747761 ,  2.36531021, -1.47066371]])

In [56]:
# Generating random integers within a specified range
np.random.randint(1, 10)

np.random.randint(1, 100, 10)  # low (inclusive), high (exclusive), size

np.random.randint(1, 100, (5, 5))


array([[59, 23, 63,  7, 98],
       [63, 13,  7, 24,  2],
       [48, 43, 37, 82, 31],
       [12, 28, 79, 83, 66],
       [71,  6, 10,  8, 88]], dtype=int32)

In [57]:
np.random.randint(1, 20)


2

In [6]:
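# input() returns a string, so convert it to int before comparing
input_variable = int(input())
random_variable = np.random.randint(1, 6)  # random integer from 1 to 5
if input_variable == random_variable: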
  print("You won!")
else:
  print("You lost!")

You lost!


If you want to sample values from an array, with or without replacement, use np.random.choice.


In [7]:
# Setting the seed for reproducibility
np.random.seed(1409)



In [8]:
# Sampling without replacement
np.random.choice(10, size=5, replace=False)



array([9, 5, 7, 1, 3], dtype=int32)

In [9]:
# Sampling with replacement
np.random.choice(np.arange(20, 30), size=5, replace=True)



array([20, 22, 25, 27, 22])

In [10]:
# Sampling from a list of strings
disney = ['pippo', 'pluto', 'topolino']
np.random.choice(disney, size=5, replace=True)


array(['pippo', 'pippo', 'topolino', 'topolino', 'topolino'], dtype='<U8')
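Recent NumPy versions also offer a Generator-based interface through `np.random.default_rng`. The sketch below mirrors the calls used in this section with that interface; it is an alternative, not what the notebook itself uses. The seed is reused from above, but the values produced will differ from those of the legacy functions.

```python
import numpy as np

rng = np.random.default_rng(1502)  # seeded generator

uniform = rng.random(5)                          # uniform on [0, 1)
normal = rng.standard_normal((3, 4))             # standard normal, shape (3, 4)
ints = rng.integers(1, 100, size=10)             # integers in [1, 100)
sample = rng.choice(10, size=5, replace=False)   # sampling without replacement

print(uniform.shape, normal.shape, ints.shape, sample.shape)
```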

## Extending NumPy Arrays
To extend an ndarray with new elements, we can use the append and concatenate functions.

  



In [13]:
x_array = np.array([9, 7, 5, 3, 1])

# Appending a single element
np.append(x_array, 12)

array([ 9,  7,  5,  3,  1, 12])

In [None]:
# For comparison, a Python list is extended in place with its append method
ones = []
ones.append(1)
ones

[1]

In [21]:
# Appending multiple elements
np.append(x_array, [12, 15])

# Appending another array
y_array = np.arange(1, 6)
np.append(x_array, y_array)

# Concatenating two arrays
np.concatenate((x_array, y_array))


array([9, 7, 5, 3, 1, 1, 2, 3, 4, 5])
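Unlike the `append` method of a list, `np.append` does not modify its argument: it returns a new array. A minimal sketch follows; the name `extended` is illustrative.

```python
import numpy as np

x_array = np.array([9, 7, 5, 3, 1])

# np.append returns a new array; x_array itself is unchanged
extended = np.append(x_array, 12)
print(x_array.size)   # 5
print(extended.size)  # 6

# To keep the result, assign it back to a name
x_array = np.append(x_array, 12)
print(x_array)        # [ 9  7  5  3  1 12]
```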

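When extending multidimensional arrays, specify the axis along which to perform the extension (0 for rows, 1 for columns). Without an axis argument, np.append flattens both arrays before joining them, while np.concatenate joins along axis 0.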

  


In [22]:
np.random.seed(111)
new_arr = np.random.randn(2, 2)

new_arr

array([[-1.13383833,  0.38431919],
       [ 1.49655378, -0.3553823 ]])

In [23]:
# Appending along axis 0 (rows)
np.append(np.eye(2), new_arr, axis=0)

array([[ 1.        ,  0.        ],
       [ 0.        ,  1.        ],
       [-1.13383833,  0.38431919],
       [ 1.49655378, -0.3553823 ]])

In [24]:
# Concatenating along axis 1 (columns)
np.concatenate((np.eye(2), new_arr), axis=1)


array([[ 1.        ,  0.        , -1.13383833,  0.38431919],
       [ 0.        ,  1.        ,  1.49655378, -0.3553823 ]])

## Utility Functions for Comparing/Manipulating Arrays
NumPy defines several utilities for intersecting or merging elements of 1-D arrays as if they were "sets" of objects in a mathematical sense.



In [25]:
# Finding unique elements
np.unique([1, 1, 2, 2, 3, 4, 5, 5, 5, 5])

array([1, 2, 3, 4, 5])

In [26]:
# Counting the elements of one array that are also in another array
# (np.isin replaces the deprecated np.in1d)
np.sum(np.isin([1, 2, 3, 4], np.arange(1, 9, 2)))


np.int64(2)

In [27]:
# Union of two arrays
np.union1d([1, 2, 3, 4], [5, 5, 5, 5])

array([1, 2, 3, 4, 5])

In [28]:
# Intersection of two arrays
np.intersect1d([1, 2, 3, 4, 5, 6], [5, 5, 5, 5])



array([5])

In [29]:
# Set difference
np.setdiff1d([1, 2, 3, 4, 5, 6], [5, 5, 5, 5])

#np.setdiff1d([5, 5, 5, 5], [1, 2, 3, 4, 5, 6])


array([1, 2, 3, 4, 6])

## Methods Available for NumPy Arrays
NumPy provides numerous methods and attributes for ndarray objects. The first ones we see are related to the type of values present in the array itself.


In [30]:
x_array = np.array([9, 7, 5, 3, 1])

# Data type of the array
x_array.dtype


dtype('int64')

In [31]:
# Changing the data type of the array
x_array = x_array.astype('int8')
x_array

array([9, 7, 5, 3, 1], dtype=int8)

In [32]:
# Creating an array with a specific data type
np.array(['1.1', '-2', '4.5'], dtype=np.double)

# np.string_ was removed in NumPy 2.0; np.bytes_ is its replacement
nstr = np.array(['1.1', '-2', '4.5'], dtype=np.bytes_)
nstr.astype(np.double)

array([ 1.1, -2. ,  4.5])

Not all values can be converted to all types!

In [33]:
nstr.astype(np.int32)  # Raises a ValueError: '1.1' is not a valid integer literal

Other methods are related to the size/shape of the array itself.

In [34]:
 # Shape of the array
x_array.shape

# Number of dimensions
x_array.ndim



1

In [35]:
# Creating a 2-D array
my_matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
my_matrix
np.array(my_matrix).shape
np.array(my_matrix).ndim



2

In [36]:
# Creating a 3-D array
ar3d = np.array([[[1, 0], [1, 1]], [[0.1, 0], [0.5, 0.7]]])
ar3d.shape
ar3d.ndim


3

NumPy also provides several summary statistics methods for arrays: min, max, argmin, argmax, mean, std, and sum.

  



In [37]:
x_array = np.array([9, 7, 5, 3, 1])

# Minimum value
x_array.min()

np.int64(1)

In [39]:
# Index of minimum value
x_array.argmin()

# Maximum value
x_array.max()

# Index of maximum value
x_array.argmax()

# Mean value
x_array.mean()

# Standard deviation
x_array.std()

# Sum of all elements
x_array.sum()

# For multi-dimensional arrays, the array is first flattened into 1-D, then the method is applied.
ar3d.std()


np.float64(0.42407988634218435)
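The summary methods also accept an optional `axis` argument, which applies the computation along one dimension instead of over the flattened array. A minimal sketch on a 3x3 matrix (the name `mat` is chosen here for illustration):

```python
import numpy as np

mat = np.arange(1, 10).reshape(3, 3)

print(mat.sum())        # 45, all elements
print(mat.sum(axis=0))  # [12 15 18], one value per column
print(mat.sum(axis=1))  # [ 6 15 24], one value per row
print(mat.max(axis=1))  # [3 6 9], maximum of each row
```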

You can find a complete list of the methods implemented for NumPy arrays in the NumPy ndarray documentation.



## Sorting NumPy Arrays
The sort method sorts an array in place, while np.sort returns a sorted copy of the array.

  



In [40]:
tmp_array = np.array([8, 4, 1, 2, 7])
tmp_array


array([8, 4, 1, 2, 7])

In [41]:
# Sorting the array in place
tmp_array.sort()
tmp_array


array([1, 2, 4, 7, 8])

In [42]:
# Sorting the array and returning a sorted copy
tmp_array = np.array([8, 4, 1, 2, 7])
np.sort(tmp_array)


array([1, 2, 4, 7, 8])

For multi-dimensional arrays, you can choose whether to sort by rows, columns, or other dimensions.


In [43]:
tmp_array2 = np.array([[8, 4, 1], [2, 7, 5]])
tmp_array2


array([[8, 4, 1],
       [2, 7, 5]])

In [44]:
# Sorting each column
tmp_array2.sort(0)
tmp_array2


array([[2, 4, 1],
       [8, 7, 5]])

In [45]:
# Sorting each row
tmp_array2.sort(1)
tmp_array2


array([[1, 2, 4],
       [5, 7, 8]])

The reshape method allows you to change the shape of an array.


In [46]:
x_array.reshape(5, 1)
ar3d = np.array([[[1, 0], [1, 1]], [[0.1, 0], [0.5, 0.7]]])
ar3d.reshape(1, 8)


array([[1. , 0. , 1. , 1. , 0.1, 0. , 0.5, 0.7]])
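One dimension passed to `reshape` may be given as -1, in which case NumPy infers it from the total number of elements. The sketch below shows this, along with the error raised when the requested shape does not match the element count. The names `column` and `grid` are illustrative.

```python
import numpy as np

x_array = np.array([9, 7, 5, 3, 1])

column = x_array.reshape(-1, 1)   # -1 lets NumPy infer the number of rows
print(column.shape)               # (5, 1)

grid = np.arange(12).reshape(3, -1)
print(grid.shape)                 # (3, 4)

# The total number of elements must be preserved
try:
    x_array.reshape(2, 3)
except ValueError as err:
    print("reshape failed:", err)
```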

## Slicing Arrays
Selection and slicing of arrays use the same syntax as for lists.


In [47]:
vec = np.arange(3, 11)
vec
#vec[1]


array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [48]:
vec[2:6]


array([5, 6, 7, 8])

In [49]:
vec[:3]


array([3, 4, 5])

In [50]:
vec[:]


array([ 3,  4,  5,  6,  7,  8,  9, 10])

In [51]:
vec[-1]


np.int64(10)

In [52]:
vec[1:5:2]


array([4, 6])

A key difference from lists is the ability to assign a constant value to a whole range of elements in an array.


In [53]:
vec[0:3] = 0
vec

array([ 0,  0,  0,  6,  7,  8,  9, 10])

In [54]:
vec[0:4] = 1
vec

array([ 1,  1,  1,  1,  7,  8,  9, 10])

Be careful! A slice is a view of the original array, so changes made to a slice affect the original object.

In [56]:
subvec = vec[0:4]
subvec[:] = 99
print(vec)
print(subvec)


[99 99 99 99  7  8  9 10]
[99 99 99 99]


In [57]:
# To avoid this effect, use the copy method of NumPy arrays.
vec = np.arange(3, 11)
subvec = vec[0:4].copy()
subvec

array([3, 4, 5, 6])

In [58]:
subvec[:] = 99
subvec
vec


array([ 3,  4,  5,  6,  7,  8,  9, 10])
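Whether two arrays share the same underlying data can be checked with `np.shares_memory`. The sketch below contrasts a slice, a copy, and the behavior of a plain Python list; the names `view`, `copy`, `lst`, and `sub` are illustrative.

```python
import numpy as np

vec = np.arange(3, 11)

view = vec[0:4]           # basic slicing returns a view
copy = vec[0:4].copy()    # .copy() allocates independent storage

print(np.shares_memory(vec, view))  # True
print(np.shares_memory(vec, copy))  # False

# By contrast, slicing a Python list always produces a new list
lst = list(range(3, 11))
sub = lst[0:4]
sub[0] = 99
print(lst[0])  # 3, the original list is unchanged
```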

You can also select elements of an array based on the values contained in a second array of boolean values.

In [None]:
vec = np.arange(3, 11)
print(vec)
vec > 5

[ 3  4  5  6  7  8  9 10]


array([False, False, False,  True,  True,  True,  True,  True])

In [None]:
vec[vec > 5] = 99
(vec > 5) & (vec < 800)

array([False, False, False,  True,  True,  True,  True,  True])

In [None]:
vec[(vec > 5) & (vec < 800)]


array([99, 99, 99, 99, 99])

For multi-dimensional arrays, you can access sub-blocks using the notation [interval1, interval2], choosing appropriate ranges of indices for each dimension.


In [None]:
mat = np.arange(1, 10).reshape(3, 3)
mat

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [None]:
mat[0:, :1]

array([[1],
       [4],
       [7]])

You can select elements using lists or arrays of indices, which also determine the extraction order.


In [None]:
mat[[2, 0],0]


array([7, 1])

In [None]:
mat[np.arange(0, 1)]


array([[1, 2, 3]])

In [None]:
mat[[False, True, True]]


array([[4, 5, 6],
       [7, 8, 9]])

You can also modify part of a multi-dimensional array by assigning to an indexed selection.


In [None]:
mat[~(np.array([False, True, True]))] = 0
mat

array([[0, 0, 0],
       [4, 5, 6],
       [7, 8, 9]])

In [None]:
mat[0] = np.arange(1, 4)
mat

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])


## The where Function
Another common operation in data science is building an array (possibly multi-dimensional) whose values depend on a condition evaluated element by element on another array of the same shape. NumPy implements this with the where function: np.where(condition, x, y) takes values from x where the condition is true and from y elsewhere.

In [None]:
z_array = np.arange(1, 13, 3).reshape(2, 2)
z_array

array([[ 1,  4],
       [ 7, 10]])

In [None]:
#np.where(z_array % 2, 1,0)
z_array % 2

array([[1, 0],
       [1, 0]])

In [None]:
np.where(z_array % 2, 'here', 'there')

array([['here', 'there'],
       ['here', 'there']], dtype='<U5')

In [None]:
c1 = [['a', 'b'], ['c', 'd']]
c2 = [['x', 'y'], ['w', 'z']]
np.where(z_array % 2, c1, c2)

array([['a', 'y'],
       ['c', 'z']], dtype='<U1')
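The three-argument form can also mix a constant with the original array. The sketch below replaces negative entries with zero and checks that a boolean mask gives the same result. The array `data` and the other names are made up for illustration.

```python
import numpy as np

data = np.array([-2, 5, -1, 3])

# Three-argument form: take 0 where the condition holds, data elsewhere
clipped = np.where(data < 0, 0, data)
print(clipped)  # [0 5 0 3]

# Same result with a boolean mask applied to a copy
masked = data.copy()
masked[masked < 0] = 0
print(np.array_equal(clipped, masked))  # True
```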

## Special Case of Slicing and where
Indexing a NumPy array has a special case that behaves differently from the slicing described so far but is useful in some situations. If the indices are given as a pair of lists of the same length, NumPy pairs them up position by position (the i-th row index with the i-th column index) and extracts the individual elements at those coordinates, returning them in a 1-D array.

In [None]:
mat = np.arange(1, 10).reshape(3, 3)
mat[[2, 0], [1, 2]]

array([8, 3])

In [None]:
mat[[0, 0, 1, 2], [1, 2, 2, 2]]

array([2, 3, 6, 9])


We can use the where function to produce such a pair of index arrays for the positions where a condition holds: simply pass the condition with no other arguments. In this case, the function returns a tuple of row and column indices identifying which elements of the array satisfy the boolean condition.



In [None]:
z_array = np.arange(1, 13, 3).reshape(2, 2)
z_array

array([[ 1,  4],
       [ 7, 10]])

In [None]:
np.where(z_array % 2)

(array([0, 1]), array([0, 0]))

In [None]:
np.where(z_array == 4)
z_array[np.where(z_array == 4)]

array([4])


## Special Values in NumPy
NumPy has two special floating-point values, inf and nan, that arise in calculations between arrays. They follow their own arithmetic rules, so operations involving them usually return inf or nan (sometimes with a warning) instead of raising an error.



In [None]:
np.arange(1, 3) / 0



array([inf, inf])

In [None]:
np.inf * 0

nan

In [None]:
np.arange(3) / 0



array([nan, inf, inf])

In [None]:
np.nan * 2

nan

In [None]:
5. + np.nan + 8

nan

In [None]:
np.nan + np.nan

nan

In [None]:
3 + np.inf

inf

In [None]:
np.inf - np.inf

nan

In [None]:
np.inf + np.nan

nan

In [None]:
np.inf * np.nan

nan
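Because nan propagates silently, it helps to know how to detect it. The sketch below uses `np.isnan`, `np.isinf`, `np.isfinite`, and `np.nansum`; the array `values` is made up for illustration.

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0, np.inf])

# nan is not equal to anything, including itself, so use np.isnan to detect it
print(np.nan == np.nan)     # False
print(np.isnan(values))     # [False  True False False]
print(np.isinf(values))     # [False False False  True]
print(np.isfinite(values))  # [ True False  True False]

# nan propagates through ordinary reductions; np.nansum skips nan entries
print(np.sum(values[:3]))     # nan
print(np.nansum(values[:3]))  # 4.0
```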

In [None]:
import time

# Timing a plain Python loop of 20000 iterations
a = 0
start = time.time()
for i in range(20000):
  a += 0

end = time.time()

end - start

0.0065610408782958984
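The cell above times a plain Python loop. Assuming the intent was to compare it with NumPy, the sketch below times a loop that sums the integers below 20000 against the vectorized equivalent. The names are illustrative and the timings depend on the machine, so no expected timings are shown.

```python
import time
import numpy as np

n = 20000
values = np.arange(n)

# Plain Python loop
start = time.perf_counter()
total_loop = 0
for v in range(n):
    total_loop += v
loop_time = time.perf_counter() - start

# Vectorized equivalent
start = time.perf_counter()
total_vec = values.sum()
vec_time = time.perf_counter() - start

print(total_loop == total_vec)  # True
print(loop_time, vec_time)      # timings depend on the machine
```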

# NumPy Exercises

## Exercise 1
Create a 1-D NumPy array with elements from 10 to 50 (inclusive).

## Exercise 2
Create a 1-D array of 20 equally spaced values between 1 and 10.

## Exercise 3
Create a 2x2 identity matrix.

## Exercise 4
Create a 3x3 array of all zeros and a 3x3 array of all ones.

## Exercise 5
Create a 1-D array with elements [1, 3, 5, 7, 9] and another with elements [2, 4, 6, 8, 10]. Add these two arrays.

## Exercise 6
Multiply the above two arrays element-wise.

## Exercise 7
Compute the square root of each element in the first array.

## Exercise 8
Create a 4x4 matrix with values from 0 to 15.

## Exercise 9
Slice the above matrix to get a 2x2 sub-matrix from the top left corner.

## Exercise 10
Replace all odd numbers in a 1-D array of 10 elements with -1.

## Exercise 11
Create a 3x3 matrix and find the cumulative sum along each row.

## Exercise 12
Create a 5x5 matrix with random values and find the maximum value in each row.

## Exercise 13
Normalize a 5x5 random matrix (subtract the mean and divide by the standard deviation).

## Exercise 14
Create a 5x5 matrix with values 1 to 25 and then subtract the mean of each row from the corresponding row elements.

## Exercise 15
Find the indices of the minimum and maximum values in a 1-D random array of 15 elements.







In [None]:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


In [None]:
even_numbers = []
#even_numbers.append(3)
#even_numbers

In [None]:
for number in numbers:
    print(number)
    if number % 2 == 0:
        even_numbers.append(number)
print(even_numbers)

1
2
3
4
5
6
7
8
9
10
[2, 4, 6, 8, 10]
