![Ironhack logo](https://i.imgur.com/1QgrNNw.png)

# Lab | Numpy

## Introduction

An important skill for a data scientist or data engineer is knowing where and how to find information that helps you accomplish your work. In this exercise, you will both practice the NumPy features we discussed in the lesson and learn new features by looking up documentation and references. You will work on your own, but remember that the teaching staff is at your service whenever you encounter problems.

## Getting Started
A series of comments tells you what to do step by step. Follow the instructions in order from top to bottom. Read each instruction carefully and provide your answer beneath it. You should also test your answers to make sure they are correct. If one of your responses is incorrect, you may not be able to proceed, because later responses may depend on earlier ones.


## Resources

Some of the questions in the assignment are not covered in our lesson. You will learn how to efficiently look up the information on your own. Below are some resources where you can find the information you need.

[Numpy User Guide](https://docs.scipy.org/doc/numpy/user/index.html)

[Numpy Reference](https://docs.scipy.org/doc/numpy/reference/)

[Google Search](https://www.google.com/search?q=how+to+use+numpy)



# Introduction to NumPy


#### 1. Import NumPy under the name np.

In [1]:
# your code here
import numpy as np

#### 2. Print your NumPy version.

In [2]:
# your code here
np.__version__

'1.19.1'

#### 3. Generate a 3x2x5 3-dimensional array with random values. Assign the array to variable *a*.
**Challenge**: there are at least three easy ways that use numpy to generate random arrays. How many ways can you find?

**Example of output**:
````python
[[[0.29932768, 0.85812686, 0.75266145, 0.09278988, 0.78358352],
  [0.13437453, 0.65695946, 0.82047594, 0.09764179, 0.52230096]],
 
 [[0.54248247, 0.06431281, 0.65902257, 0.92736679, 0.3302839 ],
  [0.86867236, 0.33960592, 0.62295821, 0.74563567, 0.24351584]],
 
 [[0.21276812, 0.06917533, 0.35106591, 0.82273425, 0.7910178 ],
  [0.37768961, 0.56107736, 0.99965953, 0.97615549, 0.2445537 ]]]
````

In [3]:
# Method 1
# Generate the array with floating-point numbers in the range [0.0, 1.0)
# using the np.random.random() function

a = np.random.random(size=(3,2,5))
print(a)

[[[6.13240020e-01 7.22369569e-01 8.75753254e-01 1.17610276e-01
   4.08166233e-01]
  [4.41100465e-01 2.80905635e-01 4.99977197e-01 2.16496573e-02
   9.62359339e-01]]

 [[7.59179811e-01 9.47942411e-01 1.44397259e-02 9.93037920e-01
   6.47399029e-01]
  [3.29547199e-01 7.50172740e-04 9.89470699e-01 9.46979562e-01
   9.27380668e-01]]

 [[9.52761172e-01 7.79216215e-01 3.14175145e-02 9.46598210e-02
   2.36610977e-01]
  [8.34632405e-01 8.75291727e-01 9.63961540e-01 9.54497029e-01
   4.33739588e-01]]]


In [4]:
# Method 2
# Generate the array with floating-point numbers in the range [0.0, 1.0)
# using a list comprehension

a = np.array([[[np.random.random() for column in range(5)] for row in range(2)] for number in range(3)])
print(a)

[[[0.02286161 0.68810213 0.56222181 0.99359585 0.35357002]
  [0.0990899  0.43146345 0.67346489 0.70885427 0.24013971]]

 [[0.13003288 0.35316504 0.5128444  0.10554247 0.87582265]
  [0.0459152  0.29320192 0.92999226 0.5381321  0.49562079]]

 [[0.32637998 0.92762941 0.41897758 0.04979319 0.67732049]
  [0.51242202 0.47884974 0.50698744 0.72175004 0.00637232]]]


In [5]:
# Method 3
# Generate the array with floating-point numbers in the range [0.0, 1.0)
# using reshape

a = np.reshape(np.random.random(30), (3, 2, 5))
print(a)

[[[0.12348241 0.50527688 0.49271015 0.87517456 0.20747976]
  [0.75630427 0.1649355  0.31186828 0.9677737  0.28869434]]

 [[0.4053397  0.23078235 0.312252   0.16574213 0.36737237]
  [0.30315092 0.641853   0.41816749 0.66189076 0.70597047]]

 [[0.70795566 0.72534775 0.25617834 0.98354511 0.70723502]
  [0.65958254 0.18435292 0.121184   0.64998462 0.04752008]]]
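
As a possible fourth method, NumPy 1.17 and later also offer the `Generator` interface through `np.random.default_rng`. The sketch below is only an illustration: the seed `42` and the names `rng` and `a_alt` are assumptions, not part of the lab.

```python
import numpy as np

# Hypothetical seed, used only to make the run reproducible
rng = np.random.default_rng(42)

# Floats drawn uniformly from [0.0, 1.0), arranged as a 3x2x5 array
a_alt = rng.random((3, 2, 5))

print(a_alt.shape)
```

Seeding the generator is optional. It is shown here because it makes the random values repeatable between runs.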


#### 4. Print *a*.


In [6]:
# your code here

print(a)

[[[0.12348241 0.50527688 0.49271015 0.87517456 0.20747976]
  [0.75630427 0.1649355  0.31186828 0.9677737  0.28869434]]

 [[0.4053397  0.23078235 0.312252   0.16574213 0.36737237]
  [0.30315092 0.641853   0.41816749 0.66189076 0.70597047]]

 [[0.70795566 0.72534775 0.25617834 0.98354511 0.70723502]
  [0.65958254 0.18435292 0.121184   0.64998462 0.04752008]]]


#### 5. Create a 5x2x3 3-dimensional array with all values equaling 1. Assign the array to variable *b*.

Expected output:

````python
      [[[1, 1, 1],
        [1, 1, 1]],

       [[1, 1, 1],
        [1, 1, 1]],

       [[1, 1, 1],
        [1, 1, 1]],

       [[1, 1, 1],
        [1, 1, 1]],

       [[1, 1, 1],
        [1, 1, 1]]]
````

In [7]:
# your code here

b = np.ones([5, 2, 3], dtype=int)

#### 6. Print *b*.


In [8]:
# your code here

print(b)

[[[1 1 1]
  [1 1 1]]

 [[1 1 1]
  [1 1 1]]

 [[1 1 1]
  [1 1 1]]

 [[1 1 1]
  [1 1 1]]

 [[1 1 1]
  [1 1 1]]]


#### 7. Do *a* and *b* have the same size? How do you prove that in Python code?

In [9]:
# your code here

# The shape attribute returns a tuple with the length of the array along each dimension.

# Shapes of arrays 'a' and 'b'
print(f'The size of the array "a" is {a.shape}.')
print(f'The size of the array "b" is {b.shape}.')

# To check whether they have the same size, compare the shapes of both arrays
print(f'The arrays "a" and "b" have the same size? {a.shape == b.shape}')

The size of the array "a" is (3, 2, 5).
The size of the array "b" is (5, 2, 3).
The arrays "a" and "b" have the same size? False
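
It may help to separate two attributes here: `size` is the total number of elements, while `shape` describes how those elements are arranged. A minimal sketch, rebuilding `a` and `b` so that it runs on its own:

```python
import numpy as np

a = np.random.random((3, 2, 5))
b = np.ones((5, 2, 3), dtype=int)

# size counts elements; shape records how they are arranged
print(a.size == b.size)    # True: both hold 30 elements
print(a.shape == b.shape)  # False: (3, 2, 5) vs (5, 2, 3)
```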


#### 8. Are you able to add *a* and *b*? Why or why not?


In [10]:
# your answer here

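# sum_a_b = a + b  # raises ValueError (the shapes cannot be broadcast together)

# No. The arrays 'a' and 'b' have different shapes, (3, 2, 5) and (5, 2, 3), as confirmed above,
# and those shapes are not broadcast-compatible, so NumPy cannot add them element by element.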
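
One way to confirm this without stopping the notebook is to wrap the addition in `try`/`except`. This is a sketch only; the flag name `added` is a hypothetical helper used for illustration.

```python
import numpy as np

a = np.random.random((3, 2, 5))
b = np.ones((5, 2, 3), dtype=int)

try:
    a + b
    added = True
except ValueError as error:
    # NumPy reports that the two shapes cannot be broadcast together
    added = False
    print(f'Addition failed: {error}')
```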

#### 9. Reshape *b* so that it has the same structure as *a* (i.e. becomes a 3x2x5 array). Assign the reshaped array to variable *c*.

Expected output:

````python
      [[[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]],

       [[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]],

       [[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]]]
````

In [11]:
# your code here

c = np.reshape(b, (3, 2, 5))
print(c)

[[[1 1 1 1 1]
  [1 1 1 1 1]]

 [[1 1 1 1 1]
  [1 1 1 1 1]]

 [[1 1 1 1 1]
  [1 1 1 1 1]]]
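
Two variations on the reshape may be worth knowing. The sketch below reuses `a.shape` instead of hard-coding the target and lets NumPy infer one dimension with `-1`. The name `c_inferred` is an assumption made for this example.

```python
import numpy as np

a = np.random.random((3, 2, 5))
b = np.ones((5, 2, 3), dtype=int)

# Reuse the target shape instead of hard-coding it
c = b.reshape(a.shape)

# -1 asks NumPy to infer that dimension from the element count (30 / (3 * 2) = 5)
c_inferred = b.reshape(3, 2, -1)

print(c.shape, c_inferred.shape)
```

Reshaping only works when the element count stays the same, which holds here because 5x2x3 and 3x2x5 both contain 30 elements.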


#### 10. Try to add *a* and *c*. Now it should work. Assign the sum to variable *d*. But why does it work now?

In [12]:
# your code/answer here

d = a + c

print('Now it is possible to add them together because both arrays have the same shape (3x2x5):', end=' ')
print('"c" stores the elements of "b" reshaped to 3x2x5.')

Now it is possible to add them together because both arrays have the same shape (3x2x5): "c" stores the elements of "b" reshaped to 3x2x5.
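
Equal shapes are sufficient for addition but not strictly required: NumPy can also broadcast a scalar across an array. As a small sketch, adding the scalar `1` should give the same result as adding the array of ones (`d_scalar` is a hypothetical name):

```python
import numpy as np

a = np.random.random((3, 2, 5))
c = np.ones((3, 2, 5), dtype=int)

d = a + c         # element-wise sum of two equally shaped arrays
d_scalar = a + 1  # the scalar 1 is broadcast to every element

print(np.array_equal(d, d_scalar))
```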


#### 11. Print *a* and *d*. Do you notice the difference and the relation between the two arrays in terms of their values? Explain.

In [13]:
# your code/answer here

print(f'Array "a":\n{a}')
print(f'\nArray "d":\n{d}')

print('\nEach element of the array "d" is exactly one greater than its corresponding element in the array "a",', end=' ')
print('since 1 (each element of "c") was added to its corresponding element of "a".')

Array "a":
[[[0.12348241 0.50527688 0.49271015 0.87517456 0.20747976]
  [0.75630427 0.1649355  0.31186828 0.9677737  0.28869434]]

 [[0.4053397  0.23078235 0.312252   0.16574213 0.36737237]
  [0.30315092 0.641853   0.41816749 0.66189076 0.70597047]]

 [[0.70795566 0.72534775 0.25617834 0.98354511 0.70723502]
  [0.65958254 0.18435292 0.121184   0.64998462 0.04752008]]]

Array "d":
[[[1.12348241 1.50527688 1.49271015 1.87517456 1.20747976]
  [1.75630427 1.1649355  1.31186828 1.9677737  1.28869434]]

 [[1.4053397  1.23078235 1.312252   1.16574213 1.36737237]
  [1.30315092 1.641853   1.41816749 1.66189076 1.70597047]]

 [[1.70795566 1.72534775 1.25617834 1.98354511 1.70723502]
  [1.65958254 1.18435292 1.121184   1.64998462 1.04752008]]]

Each element of the array "d" is exactly one greater than its corresponding element in the array "a", since 1 (each element of "c") was added to its corresponding element of "a".


#### 12. Multiply *a* and *c*. Assign the result to *e*.

In [14]:
# your code here

e = a * c

#### 13. Does *e* equal *a*? Why or why not?


In [15]:
# your code/answer here

# Print arrays 'a' and 'e' on the screen
print(f'Array "a":\n{a}')
print(f'\nArray "e":\n{e}')

# Check if arrays 'a' and 'e' are equal
if np.array_equal(e, a):
    print('\nThe array "e" EQUALS the array "a".')
else:
    print('\nThe array "e" DOES NOT equal the array "a".')

# Reason why the arrays 'e' and 'a' are the same
print('Multiplying 3-D arrays with * in NumPy is not the same as matrix multiplication.', end=' ')
print('With *, each element is multiplied by its corresponding element of the other array. So, since all', end=' ')
print('elements of the array "c" are one, multiplying the array "a" by "c" gives a result equal to', end=' ')
print('the array "a".')

Array "a":
[[[0.12348241 0.50527688 0.49271015 0.87517456 0.20747976]
  [0.75630427 0.1649355  0.31186828 0.9677737  0.28869434]]

 [[0.4053397  0.23078235 0.312252   0.16574213 0.36737237]
  [0.30315092 0.641853   0.41816749 0.66189076 0.70597047]]

 [[0.70795566 0.72534775 0.25617834 0.98354511 0.70723502]
  [0.65958254 0.18435292 0.121184   0.64998462 0.04752008]]]

Array "e":
[[[0.12348241 0.50527688 0.49271015 0.87517456 0.20747976]
  [0.75630427 0.1649355  0.31186828 0.9677737  0.28869434]]

 [[0.4053397  0.23078235 0.312252   0.16574213 0.36737237]
  [0.30315092 0.641853   0.41816749 0.66189076 0.70597047]]

 [[0.70795566 0.72534775 0.25617834 0.98354511 0.70723502]
  [0.65958254 0.18435292 0.121184   0.64998462 0.04752008]]]

The array "e" EQUALS the array "a".
Multiplying 3-D arrays with * in NumPy is not the same as matrix multiplication. With *, each element is multiplied by its corresponding element of the other array. So, since all elements of the array "c" are one, multiplying the array "a" by "c" gives a result equal to the array "a".
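
The contrast between element-wise and matrix multiplication is easier to see on a small 2x2 case. The arrays `m` and `ones` below are hypothetical values chosen for illustration, not part of the lab.

```python
import numpy as np

m = np.array([[1, 2],
              [3, 4]])
ones = np.ones((2, 2), dtype=int)

elementwise = m * ones     # each entry times its counterpart: m is unchanged
matrix_product = m @ ones  # rows of m combined with columns of ones

print(elementwise)
print(matrix_product)
```

With `*`, an array of ones acts as an identity. With `@`, each output entry is a row sum of `m`, so the result differs from `m`.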

#### 14. Identify the max, min, and mean values in *d*. Assign those values to variables *d_max*, *d_min* and *d_mean*.

In [16]:
# your code here

# Maximum value in 'd'
d_max = d.max()
print(f'The maximum value in the array "d" is {d_max}.')

# Minimum value in 'd'
d_min = d.min()
print(f'The minimum value in the array "d" is {d_min}.')

# Mean value in 'd'
d_mean = d.mean()
print(f'The mean of all elements of the array "d" is {d_mean}.')

The maximum value in the array "d" is 1.9835451057363715.
The minimum value in the array "d" is 1.0475200808329495.
The mean of all elements of the array "d" is 1.4649702366749584.
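
Called without arguments, `max`, `min`, and `mean` reduce over every element. They also accept an `axis` argument to reduce along selected dimensions only. A brief sketch on a stand-in array built with `np.arange` (an assumption, so that the values are predictable):

```python
import numpy as np

# Stand-in for d with known values 0.0 .. 29.0
d = np.arange(30, dtype=float).reshape(3, 2, 5)

d_max, d_min, d_mean = d.max(), d.min(), d.mean()  # over all 30 elements
max_per_position = d.max(axis=0)                   # reduce the first axis only
mean_per_block = d.mean(axis=(1, 2))               # one mean per 2x5 block

print(d_max, d_min, d_mean)
print(max_per_position.shape, mean_per_block)
```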


#### 15. Now we want to label the values in *d*. First create an empty array *f* with the same shape (i.e. 3x2x5) as *d* using `np.empty`.


In [17]:
# your code here

f = np.empty([3, 2, 5])
print(f)

[[[0.12348241 0.50527688 0.49271015 0.87517456 0.20747976]
  [0.75630427 0.1649355  0.31186828 0.9677737  0.28869434]]

 [[0.4053397  0.23078235 0.312252   0.16574213 0.36737237]
  [0.30315092 0.641853   0.41816749 0.66189076 0.70597047]]

 [[0.70795566 0.72534775 0.25617834 0.98354511 0.70723502]
  [0.65958254 0.18435292 0.121184   0.64998462 0.04752008]]]
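
The values printed above are not meaningful. `np.empty` allocates memory without initializing it, so the array shows whatever happened to be stored there. As a sketch of two related helpers, `np.empty_like` copies the shape and dtype of an existing array, and `np.zeros_like` does the same but fills the result with zeros. The stand-in `d` below is an assumption so the block runs on its own.

```python
import numpy as np

# Stand-in for d: random values in [1.0, 2.0)
d = np.random.random((3, 2, 5)) + 1

# Same shape and dtype as d, contents uninitialized (arbitrary)
f = np.empty_like(d)

# Same shape and dtype as d, contents guaranteed to be 0.0
f_zeros = np.zeros_like(d)

print(f.shape == d.shape)
```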


#### 16. Populate the values in *f*. 

For each value in *d*, if it's larger than *d_min* but smaller than *d_mean*, assign 25 to the corresponding value in *f*. If a value in *d* is larger than *d_mean* but smaller than *d_max*, assign 75 to the corresponding value in *f*. If a value equals *d_mean*, assign 50 to the corresponding value in *f*. Assign 0 to the corresponding value(s) in *f* for *d_min* in *d*. Assign 100 to the corresponding value(s) in *f* for *d_max* in *d*. In the end, *f* should have only the following values: 0, 25, 50, 75, and 100.

**Note**: you don't have to use Numpy in this question.

In [18]:
# your code here

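# Populating the array 'f': for each element, np.select picks the choice whose condition holds
condlist = [(d > d_min) & (d < d_mean), (d > d_mean) & (d < d_max), d == d_mean, d == d_min, d == d_max]
choicelist = [25, 75, 50, 0, 100]
# Write the selected labels into the existing array 'f' in place
f[...] = np.select(condlist, choicelist)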

#### 17. Print *d* and *f*. Do you have your expected *f*?
For instance, if your *d* is:
```python
[[[1.85836099, 1.67064465, 1.62576044, 1.40243961, 1.88454931],
  [1.75354326, 1.69403643, 1.36729252, 1.61415071, 1.12104981]],

[[1.72201435, 1.1862918 , 1.87078449, 1.7726778 , 1.88180042],
  [1.44747908, 1.31673383, 1.02000951, 1.52218947, 1.97066381]],

[[1.79129243, 1.74983003, 1.96028037, 1.85166831, 1.65450881],
 [1.18068344, 1.9587381 , 1.00656599, 1.93402165, 1.73514584]]]
```
Your *f* should be:
```python
[[[ 75.  75.  75.  25.  75.]
  [ 75.  75.  25.  25.  25.]]

 [[ 75.  25.  75.  75.  75.]
  [ 25.  25.  25.  25. 100.]]

 [[ 75.  75.  75.  75.  75.]
  [ 25.  75.   0.  75.  75.]]]
```

In [19]:
# your code here

# Print the array 'd'
print(f'Array "d"\n{d}')

# Print the array 'f'
print(f'\nArray "f"\n{np.select(condlist, choicelist)}')

Array "d"
[[[1.12348241 1.50527688 1.49271015 1.87517456 1.20747976]
  [1.75630427 1.1649355  1.31186828 1.9677737  1.28869434]]

 [[1.4053397  1.23078235 1.312252   1.16574213 1.36737237]
  [1.30315092 1.641853   1.41816749 1.66189076 1.70597047]]

 [[1.70795566 1.72534775 1.25617834 1.98354511 1.70723502]
  [1.65958254 1.18435292 1.121184   1.64998462 1.04752008]]]

Array "f"
[[[ 25  75  75  75  25]
  [ 75  25  25  75  25]]

 [[ 25  25  25  25  25]
  [ 25  75  25  75  75]]

 [[ 75  75  25 100  75]
  [ 75  25  25  75   0]]]
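
Since the note says NumPy is not required for the labelling itself, the same rules can also be written as an ordinary function applied element by element. This is a sketch under stated assumptions: the small array `d` uses exactly representable values so that its mean is exactly 1.5, and the helper name `label` is hypothetical.

```python
import numpy as np

# Small hypothetical d with exactly representable values (mean is 1.5)
d = np.array([[1.0, 1.25, 1.5],
              [1.5, 1.75, 2.0]])
d_min, d_max, d_mean = d.min(), d.max(), d.mean()

def label(value):
    """Map one value of d to 0, 25, 50, 75, or 100."""
    if value == d_min:
        return 0
    if value == d_max:
        return 100
    if value == d_mean:
        return 50
    return 25 if value < d_mean else 75

f = np.empty(d.shape)
# np.ndindex yields every index tuple, whatever the number of dimensions
for index in np.ndindex(d.shape):
    f[index] = label(d[index])

print(f)
```

The loop is slower than `np.select` on large arrays, but it spells out the order in which the rules are checked.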


#### 18. Bonus question: instead of using numbers (i.e. 0, 25, 50, 75, and 100), use string values  ("A", "B", "C", "D", and "E") to label the array elements. For the example above, the expected result is:

```python
[[['D' 'D' 'D' 'B' 'D']
  ['D' 'D' 'B' 'B' 'B']]

 [['D' 'B' 'D' 'D' 'D']
  ['B' 'B' 'B' 'B' 'E']]

 [['D' 'D' 'D' 'D' 'D']
  ['B' 'D' 'A' 'D' 'D']]]
```
**Note**: you don't have to use Numpy in this question.

In [20]:
# your code here

# Changing [0, 25, 50, 75, 100] to ['A', 'B', 'C', 'D', 'E']
condlist = [(d > d_min) & (d < d_mean), (d > d_mean) & (d < d_max), d == d_mean, d == d_min, d == d_max]
choicelist = ['B', 'D', 'C', 'A', 'E']

# Print the array 'd'
print(f'Array "d"\n{d}')

# Print the array 'f'
print(f'\nArray "f"\n{np.select(condlist, choicelist)}')

Array "d"
[[[1.12348241 1.50527688 1.49271015 1.87517456 1.20747976]
  [1.75630427 1.1649355  1.31186828 1.9677737  1.28869434]]

 [[1.4053397  1.23078235 1.312252   1.16574213 1.36737237]
  [1.30315092 1.641853   1.41816749 1.66189076 1.70597047]]

 [[1.70795566 1.72534775 1.25617834 1.98354511 1.70723502]
  [1.65958254 1.18435292 1.121184   1.64998462 1.04752008]]]

Array "f"
[[['B' 'D' 'D' 'D' 'B']
  ['D' 'B' 'B' 'D' 'B']]

 [['B' 'B' 'B' 'B' 'B']
  ['B' 'D' 'B' 'D' 'D']]

 [['D' 'D' 'B' 'E' 'D']
  ['D' 'B' 'B' 'D' 'A']]]
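
Another possible route, if the numeric labels from question 16 already exist, is to translate them with a dictionary. The sketch below assumes a small hypothetical `f`. The names `letters` and `f_letters` are illustrative only.

```python
import numpy as np

# Hypothetical numeric labels, in the form produced in question 16
f = np.array([[0, 25, 50],
              [50, 75, 100]])

letters = {0: 'A', 25: 'B', 50: 'C', 75: 'D', 100: 'E'}

# Look up each label in flat order, then restore the original shape
f_letters = np.array([letters[int(value)] for value in f.flat]).reshape(f.shape)

print(f_letters)
```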


## Additional Challenges for the Nerds

If you are way ahead of your classmates and willing to accept some tough challenges about NumPy, take one or several of the following Codewars *katas*. You need to already possess a good amount of knowledge in Python and statistics because you will need to write Python functions, do loops, write conditionals, and deal with matrices.

* [Insert dashes](https://www.codewars.com/kata/insert-dashes)
* [Thinkful - Logic Drills: Red and bumpy](https://www.codewars.com/kata/thinkful-logic-drills-red-and-bumpy)