# Introduction

In this lesson we'll learn about an extremely important package called `numpy` (Numerical Python), which is the core of scientific computing in Python. Using numpy, you can create array objects and perform various operations on them very fast, while the objects themselves remain memory-efficient.

Let's visit the world of arrays!

## Numpy's array

So, what is an array object? 

An array object serves as a container for values; in other words, it is a collection of values. It is commonly called an n-dimensional array (ndarray for short), because such an array can have any number of dimensions.

Array objects can have one or more dimensions and contain homogeneous data:
- vector: 1 dimension
- matrix: 2 dimensions
- tensor: 3 or more dimensions
___
How do arrays store values?

I assume that you remember creating list objects. When a list is created, each element is placed at its own location in memory. If you perform an operation such as appending a new value, the list grows. This is different in the case of numpy's arrays. When an array object is created, a fixed-size block of memory, called the data buffer, is reserved for its values. Because the values sit next to each other in this buffer, ndarrays take advantage of **locality of reference**.

If you change the size of an array by performing some operation, numpy does not grow the original array in place; a completely new array is created to replace it. Since the values of an array are required to be of the same data type, every element takes up the same amount of memory.

These are the two main advantages of ndarrays that allow us to perform operations much faster than with other Python types.

Now we import the numpy library and give it the alias `np`:

In [None]:
import numpy as np

To check which version of the numpy library you are using, type `package_name.__version__`:

In [None]:
# Numpy package version
np.__version__

'1.19.5'
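
Returning to the fixed-size data buffer described earlier, the following is a minimal sketch of what "a completely new array" means in practice. It uses `np.append()` and the `itemsize` and `nbytes` attributes, none of which are covered in this lesson; the variable names are chosen only for illustration.

```python
import numpy as np

original = np.array([1, 2, 3], dtype="int64")

# np.append does not grow the array in place; it returns a new array
extended = np.append(original, 4)

print(original.size)         # 3, the original array is unchanged
print(extended.size)         # 4
print(extended is original)  # False, these are two separate objects

# Every element takes up the same number of bytes
print(original.itemsize)     # 8 bytes per int64 element
print(original.nbytes)       # 24 bytes in total (3 elements * 8 bytes)
```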

# 1. One-dimensional arrays

There are several ways how you can create arrays. Let's look at some examples.

## 1.1 Arrays creation

To create an ndarray we can use the `array()` function and pass all elements as a list. We can specify the desired data type using the `dtype` parameter, for example floats:



In [None]:
# Initializing 1-D array
a_array = np.array([15,77,90,34], dtype = "float64")
a_array

array([15., 77., 90., 34.])
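
As a small sketch of how the `dtype` parameter interacts with the values you pass, the example below compares the data type numpy infers on its own with one that is set explicitly. The variable names are illustrative, and the exact integer width that numpy infers depends on the platform, so only its kind is printed.

```python
import numpy as np

ints = np.array([1, 2, 3])                    # only integers: an integer dtype is inferred
mixed = np.array([1, 2.5, 3])                 # one float turns every element into a float
small = np.array([1, 2, 3], dtype="float32")  # explicitly requested, smaller float type

print(ints.dtype.kind)   # 'i' stands for a signed integer type
print(mixed.dtype)       # float64
print(small.dtype)       # float32
print(small.itemsize)    # 4 bytes per element
```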

**add a paragraph about Numpy data types**

Each element of the array has a specific index representing its position.

In the case of numpy's arrays, the index starts at 0. We can access a particular element in the same manner as in the case of lists, using indexing. For example, to access the second element of the `a_array` variable we would write the following code:

In [None]:
# Accessing the second element of a_array:
a_array[1]

77.0
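
Since indexing works in the same manner as for lists, negative indices and slices can be used as well. The short sketch below recreates `a_array` so that it runs on its own.

```python
import numpy as np

a_array = np.array([15, 77, 90, 34], dtype="float64")

print(a_array[0])     # 15.0, the first element
print(a_array[-1])    # 34.0, the last element
print(a_array[1:3])   # [77. 90.], indices 1 and 2 (the stop index is excluded)
```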




Dimensions are called axes in Numpy. `a_array` is a one-dimensional array, so there is only one axis, which holds 4 elements.

The `type()` function returns the type of an object, which in this case is the `ndarray` class:


In [None]:
# Type of a_array variable
type(a_array)

numpy.ndarray

Using the `dtype` attribute we can check the data type of the elements in the array:

In [None]:
# The data type of the array's elements
a_array.dtype

dtype('float64')

The `size` attribute returns the total number of elements:

In [None]:
# The total number of elements
a_array.size

4

The `shape` attribute returns the number of elements along each dimension in the form of a tuple. For a one-dimensional array the tuple holds a single value, which equals `size`; in general, `size` is the product of all values in `shape`.

In [None]:
# The shape of ndarray
a_array.shape

(4,)
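
To make the relationship between `shape`, `size` and `ndim` concrete, here is a minimal sketch that compares a one-dimensional array with a two-dimensional one (two-dimensional arrays are covered later in this notebook). The names `vector` and `matrix` are only illustrative.

```python
import numpy as np

vector = np.array([15, 77, 90, 34])
matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(vector.shape, vector.size)   # (4,) 4
print(matrix.shape, matrix.size)   # (2, 3) 6, size is 2 * 3
print(matrix.ndim)                 # 2, the same as len(matrix.shape)
```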



We can get the number of dimensions using the `ndim` attribute. When we apply this attribute to the `a_array` variable, the output will be 1 because there is only 1 dimension:

In [None]:
# Getting the number of dimensions
a_array.ndim

1

Numpy's `empty()` function is used for creating an empty array, which in reality is not empty at all. Let me explain: this function creates an array without initializing its values, so the array contains arbitrary values based on the state of the memory. Creating such an array is convenient and fast when you need some 'storage place' for a large number of values that will be filled in later.

For example, if we want to create an empty array with 20 elements, the code would look like the example below. The output will be a one-dimensional array with 20 elements.
To create an empty array, pass the number of values to the `shape` parameter (or leave out the parameter's name and write only the desired integer).
The data type of the values can be set with the `dtype` parameter; the default data type is float. Let's specify the integer data type:


In [None]:
# Initializing an empty array
b_array = np.empty(20, dtype = "int")
b_array

array([93827532678240,   446676598868,   137438953573,   446676598899,
         481036337249,   137438953573,   438086664303,   472446402592,
         416611827812,   489626271858,   519691042913,   416611827722,
         416611827807,   489626271858,   519691042913,   493921239086,
         416611827816,   433791697008,   408021893215,              0])
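
The 'storage place' idea can be sketched as follows: the array is allocated first and every element is overwritten afterwards, so the arbitrary initial values never matter. The name `squares` and the loop are illustrative only.

```python
import numpy as np

# Allocate space for 5 integers; the contents are arbitrary at this point
squares = np.empty(5, dtype="int64")

# Overwrite every element before the array is used
for i in range(squares.size):
    squares[i] = i * i

print(squares)   # [ 0  1  4  9 16]
```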

**add type casting of arrays**

In [None]:
# Explicitly converting integer values to floats
to_float = b_array.astype('float')
to_float.dtype

dtype('float64')
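
Casting in the opposite direction, from floats to integers, discards the fractional part instead of rounding. The sketch below uses illustrative values and also shows that `astype()` returns a new array and leaves the original untouched.

```python
import numpy as np

floats = np.array([1.9, -1.9, 2.5])

# The fractional part is dropped (truncation toward zero), not rounded
ints = floats.astype("int64")

print(ints)     # [ 1 -1  2]
print(floats)   # [ 1.9 -1.9  2.5], the original array is unchanged
```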

To create an array filled with **zeros**, use the `zeros()` function and specify the shape. Again, you can specify the data type of the array's elements:

In [None]:
# Initializing an array filled with 0's
c_array = np.zeros(10)
c_array

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

It is also possible to create an array filled with ones using the `ones()` function. The parameters of this function are the same as for the `zeros()` function.

In [None]:
# Initializing an array with 1's
d_array = np.ones(10)
d_array

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

To create an array filled with a specific value, use the `full()` function. The example below creates a one-dimensional array with 7 elements, each filled with the value 7:

In [None]:
# Initializing an array filled with 7's
e_array = np.full(7, fill_value = 7)
e_array

array([7, 7, 7, 7, 7, 7, 7])
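
As a short sketch of how the data type can be controlled for these three functions: `zeros()` and `ones()` produce floats unless `dtype` is given, while `full()` takes its data type from the fill value when `dtype` is left out. The variable names are illustrative.

```python
import numpy as np

int_zeros = np.zeros(3, dtype="int64")    # integers instead of the default floats
bool_ones = np.ones(3, dtype="bool")      # ones interpreted as True
float_full = np.full(3, fill_value=7.0)   # a float fill value gives a float array

print(int_zeros)          # [0 0 0]
print(bool_ones)          # [ True  True  True]
print(float_full)         # [7. 7. 7.]
print(float_full.dtype)   # float64
```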

Another useful numpy function is `arange()`.

Using this function you can create a sequence of evenly spaced values within a given range:

- `start`: start value of the sequence is 0 by default
- `stop`: this is the end value of sequence that **won't** be included in the result
- `step`: the step size (spacing) between values that is 1 by default

Optionally you can specify the desired data type using `dtype` parameter.

There are two ways to pass the arguments:

a) define parameter names and corresponding arguments:

```
my_array = np.arange(start = 5, stop = 50, step = 5)
```

b) define only positional arguments:

```
my_array = np.arange(5,50,5)
```

First, let's create a simple sequence of 10 values. To do so, just pass 10 as a positional argument inside the parentheses. Numpy automatically recognizes that 10 is the **stop** value and starts counting from 0. Notice that the values are separated by 1 by default:



In [None]:
# Creating a sequence of 10 elements
f_array = np.arange(10)
f_array

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Here is one thing you should be aware of: 

When you want to create a sequence of values from 0 up to 20 separated by 2, you might be tempted to write code like the one below, specifying only the stop and the step value, since you know that the start value is set to 0 by default.

This will return a `TypeError`:

```
my_array = np.arange(stop = 20, step = 2)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-27-c2eac05fa1c8> in <module>()
----> 1 e_array = np.arange(stop = 10, step = 2)

TypeError: arange() missing required argument 'start' (pos 1)
```
Why is that so? Although the default start value is 0, the numpy version used here (1.19.5) does not let you leave out the `start` parameter when the remaining parameters are passed by name.

So when you are creating a sequence, either name all of the parameters or pass them all as positional arguments to avoid confusing results:


In [None]:
# Creating a sequence with typed parameter names
g_array = np.arange(start = 0, stop = 20, step = 2)
g_array

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [None]:
# Creating a sequence of values with positional arguments
h_array = np.arange(0, 20, 2)
h_array

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

If you set a float data type using the `dtype` parameter or pass a float to the `step` parameter, the resulting values will be floats.

In the case of a float step size, it might happen that the last value of the sequence is greater than the defined `stop` value. In this case it is better to use the similar function `linspace()`. Let's look at what this function does and what its syntax looks like.

The `linspace()` function also creates a sequence of evenly spaced values within a given range, and it lets you control whether the endpoint is included.

The parameters are:

- `start`: the start value of the sequence
- `stop`: the end value of the sequence, which **will be included** in the output by default
- `num`: the number of elements you want to generate (the default is 50)
- `endpoint`: controls whether the last element is included in the output (set to True by default)
- `retstep`: if set to True, the spacing between elements is returned as well
- `dtype`: the data type of the output elements

> Note: The differences between `np.linspace()` and `np.arange()`: in the case of linspace, the **end point is included** in the output unless you specify otherwise. You don't specify the step size, but **the number of elements** to produce.
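
The float-step caveat of `arange()` and the endpoint control of `linspace()` can be sketched side by side. The values 0.1 and 0.4 are chosen only for illustration; with them, floating-point rounding makes `arange()` produce one element more than you might expect.

```python
import numpy as np

# Float step: rounding errors produce 4 elements, and the last one
# is approximately 0.4 even though the stop value should be excluded
float_steps = np.arange(0.1, 0.4, 0.1)
print(len(float_steps))   # 4

# linspace: the number of elements is stated explicitly
closed = np.linspace(0.1, 0.4, num=4)                      # stop value included
half_open = np.linspace(0.1, 0.4, num=3, endpoint=False)   # stop value excluded

print(len(closed), closed[-1])   # 4 0.4
print(len(half_open))            # 3
```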


Let's create a sequence of elements by specifying the start and the end point. The default number of produced elements is 50, but let's change that to 40 using the `num` parameter.


In [None]:
# Creating a sequence of values
a_seq = np.linspace(start = 5, stop = 20, num = 40)
print(type(a_seq))
print(a_seq)

<class 'numpy.ndarray'>
[ 5.          5.38461538  5.76923077  6.15384615  6.53846154  6.92307692
  7.30769231  7.69230769  8.07692308  8.46153846  8.84615385  9.23076923
  9.61538462 10.         10.38461538 10.76923077 11.15384615 11.53846154
 11.92307692 12.30769231 12.69230769 13.07692308 13.46153846 13.84615385
 14.23076923 14.61538462 15.         15.38461538 15.76923077 16.15384615
 16.53846154 16.92307692 17.30769231 17.69230769 18.07692308 18.46153846
 18.84615385 19.23076923 19.61538462 20.        ]


The output is a 1-D array containing evenly spaced float values from 5 to 20, with the last value included in the output. This means that we created a closed interval. Most of the time you will be using these three parameters to create such a vector, but it is possible to add optional parameters.

Let's create a sequence of integer values and store it in the `b_seq` variable. Notice that we always need to specify the start value. The stop value will be set to 60 and the number of elements to generate to 30. As further optional parameters, we'll set `dtype` to 'int' and exclude the last element by setting `endpoint` to False. The code below will therefore return a half-open interval of values:

In [None]:
# Creating a sequence of values
b_seq = np.linspace(start = 0, stop = 60, num = 30, dtype = 'int', endpoint = False)
b_seq

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32,
       34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58])

Notice the spacing between the values of the array: each value is separated from the next by 2. If you want to have more control over the increments between elements, use `np.arange()` instead.
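
The even spacing above works out because 60 divided by 30 is a whole number. As a hedged sketch of what can happen otherwise, the example below asks for 4 integers between 0 and 10; the fractional parts are dropped, so the integer result is no longer evenly spaced. The values are illustrative, and `np.diff()` (not covered in this lesson) returns the differences between neighbouring elements.

```python
import numpy as np

as_float = np.linspace(0, 10, num=4)              # roughly 0, 3.33, 6.67, 10
as_int = np.linspace(0, 10, num=4, dtype="int")   # fractional parts are dropped

print(as_int)            # [ 0  3  6 10]
print(np.diff(as_int))   # [3 3 4], the spacing is no longer constant
```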

Let's verify that the output elements are of integer data type using the `type` function. We'll select the first element using the index operator:

In [None]:
# Printing the data type of an element
print(type(b_seq[0]))

<class 'numpy.int64'>


If you want to see the spacing between values, set `retstep` to True. In this case, the output will be a `tuple` that holds the array and the step size as a float number:

In [None]:
b_step = np.linspace(start = 0, stop = 60, num = 30, dtype = 'int', endpoint = False, retstep = True)
b_step

(array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32,
        34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58]), 2.0)

We'll learn more about tuples later on within this notebook. For now, just note that a tuple is another type of container that can hold values.

Let's do some indexing to understand the above output better. First, printing the type of `b_step` shows that this variable is a tuple:

In [None]:
# Printing the type of b_step
type(b_step)

tuple

The first part of this tuple is the array that we created using `np.linspace`. We can access it by using index 0:

In [None]:
# Printing the type of an array
type(b_step[0])

numpy.ndarray

To print the data type of the first element (value 0) we need to specify two indices. We access the array using index 0 and then we access the first element of that array using index 0:

(Don't worry if this concept is not clear to you; we'll learn more about indexing in the 2-D array section of this notebook.)

In [None]:
# Printing the data type of the first element of the array
type(b_step[0][0])

numpy.int64

The second part of the tuple is the actual step size, which represents the spacing between elements of the array. Since this value is in the second position, we can access it using index 1:

In [None]:
# Printing the type of the step size
type(b_step[1])

numpy.float64
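
Instead of indexing into the tuple, the two parts can also be unpacked into separate variables in a single assignment. This is a minimal sketch with illustrative names (`values`, `step`); `dtype` is left at its default here, so the array holds floats.

```python
import numpy as np

# Unpack the returned tuple directly into the array and the step size
values, step = np.linspace(start=0, stop=60, num=30, endpoint=False, retstep=True)

print(step)          # 2.0
print(values[:3])    # [0. 2. 4.]
print(values.size)   # 30
```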

## 1.2 Adding and removing elements




# 2. Two-dimensional arrays

To create a 2-D array we'll again use the `array()` function and pass a list of 2 lists with the desired values as an argument, for example:



In [None]:
# Creating two-dimensional array
a_ndarray = np.array([[10,20,30], [70,80,90]])
print(a_ndarray)
print(type(a_ndarray))

[[10 20 30]
 [70 80 90]]
<class 'numpy.ndarray'>


The output of this code is a two-dimensional array, so there are **2 axes**. For a better understanding, look at the following image that represents the ndarray we just created:






The blue arrow represents **the first axis, which goes down along the rows** and starts at 0. The yellow arrow is the second axis, which runs **across the columns** and also starts at 0.

If we print the shape of `a_ndarray` we can see that there are 2 dimensions: 2 elements (rows) along the first axis and 3 elements (columns) along the second:

In [None]:
# Printing the shape of a_ndarray
a_ndarray.shape

(2, 3)
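
The direction of the two axes can be illustrated with a small sketch that sums the values along each of them. The `sum()` method and its `axis` parameter are not covered in this lesson and are used here only to show which way each axis runs.

```python
import numpy as np

a_ndarray = np.array([[10, 20, 30], [70, 80, 90]])

# The shape tuple can be unpacked into the number of rows and columns
rows, columns = a_ndarray.shape
print(rows, columns)            # 2 3

# Axis 0 runs down the rows: one sum per column
print(a_ndarray.sum(axis=0))    # [ 80 100 120]

# Axis 1 runs across the columns: one sum per row
print(a_ndarray.sum(axis=1))    # [ 60 240]

# One index per axis: row 1, column 2
print(a_ndarray[1, 2])          # 90
```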

## 2.1 Slicing 2-D arrays

**image of 2-D array - accessing specific element**

# 3. Vectorization

- arithmetic operations on arrays

# 4. Broadcasting

# 5. Reshaping arrays

# 7. Universal functions

