### Package Imports

In [1]:
import numpy as np

# Array Basics

## List != Array

Lists are objects in Python that can contain any type of data, whereas an array is also an object that represents a sequence of elements that are homogeneous in type and of predefined size.

In [3]:
my_list1 = [x * 10 for x in range(1, 11)]

my_list1

[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

## From List to Array 


    my_list = ["a", "e", "i", "o", "u"]
    my_array = np.array(my_list)

## **TO DO**:

**Create a list that contains the numbers of the 3 times table and then convert it into an array.**

*Note*: Use list comprehensions to create the Python list, it's a more compact syntax and the interpreter prefers it too!

```python
new_list = [ `operation to apply to each element` for `element in loop` in range(`start`, `stop`, `step`)]


In [5]:
# create the list
tab_three = []
# tab_three = list()


#create array from the list
tab_three = np.array()
tab_three

TypeError: array() missing required argument 'object' (pos 0)

### What has changed

The fundamental difference between Lists and Numpy Arrays is the data type of the elements they hold.

- A list can contain any type of data without enforcing homogeneity among its elements.
- An array contains only one data type.


In [6]:
valid_list = [2, "hello", "world", 42]

# np.array(valid_list)

## A Limitation or a Feature?

The fact that the array behaves this way may seem like a limitation, but the reality is that Python aims to make life easier for the programmer, often at the expense of performance, making certain types of operations harder (for the machine) to execute.

We should think of an array as a sequence of identical elements lined up one after another, whereas a list is a hodgepodge of scattered elements. Some elements can be very large, others very small, and we can locate them only because each element tells us where to find the next one—it's almost like a treasure hunt!


<img src="arrayVSlists.jpg">

## Numbers, But What Kind?

Data types in Numpy are primarily designed to handle and manipulate numbers of various sizes.  
The NumPy Documentation can help us better understand the available data types and how to use them:
- [Data Types](https://numpy.org/doc/stable/user/basics.types.html)

Having all data of the same type allows for better memory management and leads to significant performance benefits, especially when working with large datasets.


In [None]:
# np.array(["a", "e", "i", "o", "u"])

In [12]:
my_list = ["helloW", "world", "a", "b"]


my_array1 = np.array(my_list)
my_array2 = np.array([2,3,4,5,6])

**dtype** allow us to understand the data type of the array

In [13]:
# my_array1.dtype
# my_array2.dtype

dtype('int64')

In [10]:
# what happend with numbers as a string?
# ["1","3","5","7","9","11","13"]



with **astype()** we can change (or try to) the data type of the elements in the array 

## Operations

Another major difference is that arrays can be treated as a block of numbers and used directly for mathematical operations.

## **TO DO**:
Multiply all values in the **list** a number. *To do this on a list, a loop is required.*
1. create a list of number from 1 to 100 
2. using a loop calculate the product of each number of the list and a factor

In [11]:
num_list = []
factor = 4 


Now using array:

1. create the array startin from the list 
2. use the * between the array and the factor


### Why we use array 

One answer could be the ability to handle large chunck of data very fast. Let see how fast.

In [8]:
#Creating 2 object, one list and one array each with a million element inside
test_list = list(range(1_000_0000))
test_arr = np.arange(1_000_000)
# test_list, test_arr

In [9]:
%timeit operation_on_list = [3 * n for n in test_list] 
%timeit operation_on_arr = test_arr * 3

820 ms ± 55.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.48 ms ± 26.1 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


## [Broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html)

The operation we performed earlier to multiply an array by an integer can be generalized. In fact, we can think of the integer as a one-dimensional array.

This leads us to conclude that, to perform operations on arrays, they do not necessarily need to have the same size and shape.


In [15]:
a = np.array([[0.0, 10.0, 20.0, 30.0], [40.0, 50.0, 60.0, 70.0]])
b = np.array([1.,2.,3.])
# a + b 
a.shape, b.shape

((2, 4), (3,))

However, they must still meet certain constraints regarding their form, known as their **shape**.

For example, if we want to add or multiply arrays of different sizes, the smaller array needs to be "broadcasted" across the larger one. Broadcasting is a technique that NumPy uses to allow operations between arrays of different shapes, under specific conditions. 

The shape constraints ensure that NumPy knows how to align the elements of each array for operations. Typically, the dimensions must be compatible, which means either they are the same, or one of them is 1 so it can be stretched or repeated across the larger dimension. This flexibility allows for efficient array manipulations and simplifies the code, as we can work directly with arrays of different sizes without needing complex loops or extra transformations.


**Using Broadcast** and the `+` operator, find 2 array that sum together gives us this configuration:

| 0      | 1     | 2      | 3     | 4      | 5     |
|--------|-------|--------|-------|--------|-------|
| 1      | 2     | 3      | 4     | 5      | 6     |
| 2      | 3     | 4      | 5     | 6      | 7     |
| 3      | 4     | 5      | 6     | 7      | 8     |
| 4      | 5     | 6      | 7     | 8      | 9     |
| 5      | 6     | 7      | 8     | 9      | 10    |

## shape

The `shape` attribute is used to return the dimensions of an array. The `shape` attribute returns a tuple representing the number of elements along each dimension of the array.  
The first number represents the rows, and the second represents the columns.

To perform operations between two arrays, the number of columns in the first array must be equal to the number of rows in the second, or they must have the same shape.



In [35]:
c = np.array([0., 0.])
c.shape
b.shape
# a + c 
d = np.array([3.,4.,5.,6.])
d.shape


(1, 4)

## The Shape Can Change

The `reshape()` method in NumPy is used to change the shape of an array without altering its data. It allows you to resize the array to fit a new specified shape, as long as the total number of elements remains unchanged.

Transform the array: `[ 0  1  2  3  4  5  6  7  8  9 10 11]` into an array with 3 rows and 4 columns.


In [15]:
# make this row array a 3 row, 4 column array
my_array = np.array([ 0 , 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])


# Transposition

Sometimes it is necessary to flip arrays, turning rows into columns.

This operation is called transposition.


In [16]:
# on the array above try transposition

What is now the shape of the array?

### 

In [17]:
# Multidimensional Array 

array1D = np.array([1, 2, 3, 4, 5])  # mono dimensional Array
array2D = np.array([[1, 2, 3], [4, 5, 6]])  # bidimensional Array
array3D = np.array([[

    [
        [1, 2], 
        [3, 4]
    ], 
    [
        [5, 6], 
        [7, 8]
    ], 
    [
        [9, 10],
        [11, 12]
    ]]])  # Tridimensional Array
array4D = np.array([[[[1, 2], [3, 4]], [[5, 6], [7, 8]]], [[[9, 10], [11, 12]], [[13, 14], [15, 16]]]])

4

## Creating Arrays

For now, we have created arrays ourselves, but NumPy provides methods to create arrays more directly.

### arange()

To create an array in NumPy using the `arange()` function, you can specify a range of values and a step (increment) between the values. The basic syntax is as follows:

Creating an array with `arange`:

`array = np.arange(start, stop, step)`

Where:

- **start**: the starting value of the range (inclusive).
- **stop**: the final value of the range (exclusive).
- **step**: the increment between values.

If you do not specify `start`, the default value is 0. If you do not specify `step`, the default value is 1.



In [None]:
# Create an array of values from 0 to 9

# Create an array of values from 5 to 14

# Create an array of even values from 0 to 10

# Create an array of values from 10 to 1 (in descending order)


In [19]:
# Crea un array dei multipli di 10 tra le prime 9 cifre (escluso lo zero) e disponi l'array su 5 righe e 2 colonne
# my_array = 
my_array

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

### Values in a Range

`linspace()` is used to create an array of evenly spaced values within a specified range. The basic syntax of `linspace()` is as follows:

`array = np.linspace(start, stop, num)`

Where:

- **start**: the starting value of the range.
- **stop**: the final value of the range.
- **num**: the number of samples to generate between `start` and `stop`. This value indicates how many elements the array will have.

The main difference between `arange()` and `linspace()` is that `arange()` uses a specific step between values, while `linspace()` automatically determines the step based on the number of requested samples.


In [None]:
# Create an array of 5 evenly spaced values from 0 to 10

# Create an array of 10 evenly spaced values from 0 to 1

# Create an array of 20 evenly spaced values from -1 to 1



In [29]:
import matplotlib.pyplot as plt
# linspace is pretty useful to generate graph axis
# let's generate a graph of the sin(x) function

# Generate evenly spaced x values using linspace

# x =  # 100 value from 0 to 10

# Let's calculate the corresponding y values
# y = 

# draw the plot
# plt.plot(x, y)
# plt.title('sin function')
# plt.xlabel('x')
# plt.ylabel('sin(x)')
# plt.grid(True)
# plt.show()
#

### Random Numbers

The function `numpy.random.rand()` is used to generate random values uniformly distributed between 0 and 1.

You can specify the dimensions of the array you want to generate by passing arguments to the function. Here’s an example:


In [24]:
# Create an array from random numbers
np.random.seed(42)
array_rnd = np.random.rand(5)  # the argument is the shape of the generated array
print("Random number array: ")
print(array_rnd)

Random number array: 
[0.37454012 0.95071431 0.73199394 0.59865848 0.15601864]


To generate random integers, you can use the function `numpy.random.randint()`. This function generates random integers within a specified range.

Example:


In [26]:
array_randint = np.random.randint(1, 11, size=5)  # 5 random integers between 1 and 10
print("Random integers")
print(array_randint)

Random integers
[2 8 6 2 5]


In [36]:
# Generate a random array of integers between 1 and 100 for 1000 values, then reshape it into a 3 by 3
rand_arr = ''
rand_arr

''

## Random Numbers from a Distribution

We can generate random numbers that follows a particular law, for exemple random number taken from a normal distribution 

In [30]:
np.random.normal(0, 1, 10)

array([ 0.55008647, -1.15577476,  1.15768782, -0.90299533,  0.89018446,
        0.44465443, -0.8718596 , -0.28021882, -1.09660795, -1.60379216])

**Let's Try** to generate random array of value from a normal distribution and plot the data. What happend if the sample will grow?

# Array Slicing

It is a powerful way to extract portions of an array based on certain criteria. With slicing, you can select a subset of elements from an array according to their position along one or more axes.

Slicing NumPy arrays is done by specifying a sequence of indices or ranges separated by commas within square brackets. The indices or ranges indicate which elements of the array will be included in the selection.
- `array[1:4]` selects the elements from index 1 to index 3 (the final index is excluded).
- `array[::2]` selects every second element of the array.
- `matrix[:, 0]` selects all elements from the first column of the matrix.
- `matrix[1:, :2]` selects a rectangular portion of the matrix, including all rows from the second one onward and only the first two columns.


In [None]:
#first two rows all columns

In [62]:
#first one column

In [63]:
#row3rd 2nd element

# Filtering array

We can use boolean algebra directly on array

In [31]:
# Array with the names of the products
prod_names = np.array(["Prod_1", "Prod_2", "Prod_3", "Prod_4", "Prod_5"])

# Array with the prices of the products
prod_prices = np.array([10.99, 19.99, 5.49, 7.95, 12.50])

# Print the two arrays
print("Product names:", prod_names)
print("Product prices:", prod_prices)


Product names: ['Prod_1' 'Prod_2' 'Prod_3' 'Prod_4' 'Prod_5']
Product prices: [10.99 19.99  5.49  7.95 12.5 ]


In [32]:
cond_min10 = prod_prices <= 10
cond_min10

array([False, False,  True,  True, False])

applicando l'array booleano a un array con i dati otterrò l'array filtrato

In [41]:
prod_prices[cond_min10]

array([5.49, 7.95])

In [42]:
# trovare nell'array rand_arr i tutti i valori > 90

# Sorting

In [45]:
prices = np.array([5.99, 6.99, 22.49, 99.99, 4.99, 49.99])
prices.sort()

In [50]:
prices

array([ 4.99,  5.99,  6.99, 22.49, 49.99, 99.99])

In [43]:
# find the three higest prices

## Where 

Anther way to apply a condition to an array is the `where()` method. The syntax is
    where(`boolean condition`, `value if true`, `value if false`)

In [51]:
# find the price that are haiger than the mean
prices.mean()
# np.where()

31.74

#### Check Please

Calculate the check for the 2 table, try to use all the functionality seen for the array. 

The menu array contains the info about type of food and its price, the order array has 1 on the food ordered by a client, each row rapresent a client.

You can use `array.sum()` to calculate the sum of the elements inside an array.

- Create a new array in which each line will show a name of food, its price and a final row with total 

In [100]:
# array menu
menu_item = np.array(["beef", "pasta", "hamburger", "salad", "cake"])
menu_price = np.array([21.0, 13.5, 15.3, 7., 6.4])
## The menu
menu = np.vstack((menu_item, menu_price))
menu

array([['beef', 'pasta', 'hamburger', 'salad', 'cake'],
       ['21.0', '13.5', '15.3', '7.0', '6.4']], dtype='<U32')

In [101]:
# Orders of 3 clients 
orders = np.random.randint(0, 2, size=(3,5))
orders

array([[1, 0, 1, 1, 0],
       [1, 1, 1, 0, 1],
       [0, 1, 1, 1, 0]])