![Ironhack logo](https://i.imgur.com/1QgrNNw.png)

# Lab | Numpy

## Introduction

An important skill for a data scientist or data engineer is knowing where and how to find the information needed to get the work done. In this exercise, you will practice the NumPy features discussed in the lesson and learn new ones by consulting documentation and references. You will work on your own, but remember that the teaching staff is available whenever you run into problems.

## Getting Started
The notebook contains comments that tell you what to do, step by step. Follow the instructions in order, from top to bottom. Read each one carefully and write your answer beneath it. Test your answers to make sure they are correct: later answers may depend on earlier ones, so an incorrect answer can block your progress.


## Resources

Some of the questions in this assignment are not covered in the lesson, so you will need to look up the information on your own. The resources below are good places to start.

[Numpy User Guide](https://docs.scipy.org/doc/numpy/user/index.html)

[Numpy Reference](https://docs.scipy.org/doc/numpy/reference/)

[Google Search](https://www.google.com/search?q=how+to+use+numpy)



# Introduction to NumPy


#### 1. Import NumPy under the name np.

In [1]:
import numpy as np

#### 2. Print your NumPy version.

In [2]:
np.__version__

'1.18.5'

#### 3. Generate a 3x2x5 3-dimensional array with random values. Assign the array to variable *a*.
**Challenge**: there are at least three easy ways that use numpy to generate random arrays. How many ways can you find?

**Example of output**:
````python
[[[0.29932768, 0.85812686, 0.75266145, 0.09278988, 0.78358352],
  [0.13437453, 0.65695946, 0.82047594, 0.09764179, 0.52230096]],
 
 [[0.54248247, 0.06431281, 0.65902257, 0.92736679, 0.3302839 ],
  [0.86867236, 0.33960592, 0.62295821, 0.74563567, 0.24351584]],
 
 [[0.21276812, 0.06917533, 0.35106591, 0.82273425, 0.7910178 ],
  [0.37768961, 0.56107736, 0.99965953, 0.97615549, 0.2445537 ]]]
````

In [3]:
# Method 1
a = np.random.rand(3,2,5)
print(a)

[[[0.14453707 0.11287009 0.98331048 0.53629646 0.14255057]
  [0.77327446 0.12792291 0.05871884 0.55537681 0.48584454]]

 [[0.2005378  0.59577668 0.10076208 0.69454724 0.26137859]
  [0.15572503 0.78438786 0.66630507 0.31064556 0.69286042]]

 [[0.49279237 0.05788412 0.64647295 0.05571305 0.30116672]
  [0.30804177 0.86370911 0.29374553 0.91956778 0.48696456]]]


In [4]:
# Method 2
a = np.random.random((3,2,5))
print(a)

[[[0.28678819 0.15849851 0.03691974 0.68588817 0.76493805]
  [0.68421736 0.64491295 0.84747222 0.04795944 0.58410382]]

 [[0.7128477  0.28465903 0.26336085 0.00449995 0.70468663]
  [0.26397474 0.46333672 0.82160001 0.24366319 0.00492096]]

 [[0.62675959 0.60809218 0.5358674  0.23083134 0.68045263]
  [0.08606731 0.2803108  0.62730471 0.28793762 0.68536936]]]


In [5]:
# Method 3
a = np.random.ranf((3,2,5))
print(a)

[[[0.20802047 0.09311642 0.7751273  0.59263683 0.86794075]
  [0.56336702 0.17111355 0.11889252 0.64445008 0.51973141]]

 [[0.36214574 0.90385213 0.89303767 0.85033592 0.77939822]
  [0.45294685 0.87405857 0.21890982 0.89352095 0.97012714]]

 [[0.44316829 0.78293822 0.25357848 0.57778661 0.43853764]
  [0.16080295 0.90409346 0.47812449 0.88177019 0.51785705]]]
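
A further way to answer the challenge is NumPy's `Generator` interface, available since NumPy 1.17. The sketch below is a minimal example under stated assumptions: the seed value and the name `a_rng` are illustrative and are not part of the lab.

```python
import numpy as np

# Seeded generator so the values are reproducible (the seed is arbitrary).
rng = np.random.default_rng(seed=42)

# Uniform floats in [0, 1) arranged as a 3x2x5 array.
a_rng = rng.random((3, 2, 5))

print(a_rng.shape)  # (3, 2, 5)
```

Seeding is optional; without it, each run produces different values, just like the three methods above.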


#### 4. Print *a*.


In [6]:
# your code here
print(a)

[[[0.20802047 0.09311642 0.7751273  0.59263683 0.86794075]
  [0.56336702 0.17111355 0.11889252 0.64445008 0.51973141]]

 [[0.36214574 0.90385213 0.89303767 0.85033592 0.77939822]
  [0.45294685 0.87405857 0.21890982 0.89352095 0.97012714]]

 [[0.44316829 0.78293822 0.25357848 0.57778661 0.43853764]
  [0.16080295 0.90409346 0.47812449 0.88177019 0.51785705]]]


#### 5. Create a 5x2x3 3-dimensional array with all values equaling 1. Assign the array to variable *b*.

Expected output:

````python
      [[[1, 1, 1],
        [1, 1, 1]],

       [[1, 1, 1],
        [1, 1, 1]],

       [[1, 1, 1],
        [1, 1, 1]],

       [[1, 1, 1],
        [1, 1, 1]],

       [[1, 1, 1],
        [1, 1, 1]]]
````

In [7]:
b = np.ones((5,2,3))

#### 6. Print *b*.


In [8]:
print(b)

[[[1. 1. 1.]
  [1. 1. 1.]]

 [[1. 1. 1.]
  [1. 1. 1.]]

 [[1. 1. 1.]
  [1. 1. 1.]]

 [[1. 1. 1.]
  [1. 1. 1.]]

 [[1. 1. 1.]
  [1. 1. 1.]]]


#### 7. Do *a* and *b* have the same size? How do you prove that in Python code?

In [11]:
a.shape == b.shape

False
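
Note that "size" and "shape" are different attributes. As a quick sketch (the arrays are rebuilt here so that the snippet runs on its own), `size` counts the elements while `shape` describes how they are arranged:

```python
import numpy as np

a = np.random.random((3, 2, 5))
b = np.ones((5, 2, 3))

print(a.size == b.size)    # True: both hold 3*2*5 = 30 elements
print(a.shape == b.shape)  # False: (3, 2, 5) vs (5, 2, 3)
```

So the two arrays hold the same number of elements, but they are not laid out the same way.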

#### 8. Are you able to add *a* and *b*? Why or why not?


In [12]:
# No: the shapes (3, 2, 5) and (5, 2, 3) differ and cannot be broadcast together
a + b

ValueError: operands could not be broadcast together with shapes (3,2,5) (5,2,3) 
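
The error comes from NumPy's broadcasting rules: shapes are compared from the last dimension backwards, and each pair of sizes must either match or contain a 1. The following sketch illustrates this; the helper names `added` and `row_offsets` are illustrative only.

```python
import numpy as np

a = np.random.random((3, 2, 5))
b = np.ones((5, 2, 3))

# Trailing dimensions are compared pairwise: 5 vs 3, 2 vs 2, 3 vs 5.
# The pairs 5 vs 3 and 3 vs 5 are neither equal nor contain a 1, so this fails.
try:
    a + b
    added = True
except ValueError as err:
    added = False
    print(err)

# A shape such as (2, 1) is compatible with (3, 2, 5): it is stretched to fit.
row_offsets = np.ones((2, 1))
print((a + row_offsets).shape)  # (3, 2, 5)
```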

#### 9. Reshape *b* so that it has the same structure as *a* (i.e. becomes a 3x2x5 array). Assign the reshaped array to variable *c*.

Expected output:

````python
      [[[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]],

       [[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]],

       [[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1]]]
````

In [13]:
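# reshape, as the task asks; b.T also yields shape (3, 2, 5), but by reversing the axes
c = b.reshape(3, 2, 5)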
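
Reshaping and transposing can produce the same shape without producing the same array. For an all-ones array such as *b* the two are indistinguishable, so the sketch below uses a hypothetical array `x` with distinct values to make the difference visible.

```python
import numpy as np

x = np.arange(30).reshape(5, 2, 3)  # distinct values 0..29

reshaped = x.reshape(3, 2, 5)  # keeps the row-major element order
transposed = x.T               # reverses the axes: (5, 2, 3) -> (3, 2, 5)

print(reshaped.shape, transposed.shape)      # (3, 2, 5) (3, 2, 5)
print(np.array_equal(reshaped, transposed))  # False
```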

#### 10. Try to add *a* and *c*. Now it should work. Assign the sum to variable *d*. But why does it work now?

In [14]:
# It works because a and c now have the same shape, so the addition is element-wise

In [15]:
#Method 1
d = a + c
print(d)

[[[1.20802047 1.09311642 1.7751273  1.59263683 1.86794075]
  [1.56336702 1.17111355 1.11889252 1.64445008 1.51973141]]

 [[1.36214574 1.90385213 1.89303767 1.85033592 1.77939822]
  [1.45294685 1.87405857 1.21890982 1.89352095 1.97012714]]

 [[1.44316829 1.78293822 1.25357848 1.57778661 1.43853764]
  [1.16080295 1.90409346 1.47812449 1.88177019 1.51785705]]]


In [16]:
#Method 2
d = np.add(a, c)
print(d)

[[[1.20802047 1.09311642 1.7751273  1.59263683 1.86794075]
  [1.56336702 1.17111355 1.11889252 1.64445008 1.51973141]]

 [[1.36214574 1.90385213 1.89303767 1.85033592 1.77939822]
  [1.45294685 1.87405857 1.21890982 1.89352095 1.97012714]]

 [[1.44316829 1.78293822 1.25357848 1.57778661 1.43853764]
  [1.16080295 1.90409346 1.47812449 1.88177019 1.51785705]]]


#### 11. Print *a* and *d*. How do the values of the two arrays differ, and how are they related? Explain.

In [58]:
print(a)

[[[0.37951599 0.91838319 0.16640135 0.06400807 0.2388544 ]
  [0.1412736  0.9538947  0.57195216 0.41294524 0.43271428]]

 [[0.21645545 0.57996177 0.53173609 0.68272013 0.27795107]
  [0.75465852 0.16205901 0.86576937 0.69167078 0.84476151]]

 [[0.04140834 0.87603687 0.71631573 0.56132983 0.16406292]
  [0.38237493 0.85141742 0.47725662 0.87265803 0.48147993]]]


In [59]:
print(d)

[[[1.37951599 1.91838319 1.16640135 1.06400807 1.2388544 ]
  [1.1412736  1.9538947  1.57195216 1.41294524 1.43271428]]

 [[1.21645545 1.57996177 1.53173609 1.68272013 1.27795107]
  [1.75465852 1.16205901 1.86576937 1.69167078 1.84476151]]

 [[1.04140834 1.87603687 1.71631573 1.56132983 1.16406292]
  [1.38237493 1.85141742 1.47725662 1.87265803 1.48147993]]]


In [60]:
# Each value in d equals the value at the same position in a plus 1, because c contains only ones
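
The relation can also be checked in code rather than by eye. This is a minimal sketch that rebuilds the arrays so it runs on its own; `np.allclose` is used instead of `==` because floating-point subtraction may be off by a tiny rounding error.

```python
import numpy as np

a = np.random.random((3, 2, 5))
c = np.ones((3, 2, 5))
d = a + c

# Every difference d - a should be 1, up to floating-point rounding.
print(np.allclose(d - a, 1.0))  # True
```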

#### 12. Multiply *a* and *c*. Assign the result to *e*.

In [17]:
# Method 1
e = a * c

In [18]:
# Method 2
e = np.multiply(a, c)

In [19]:
print(e)

[[[0.20802047 0.09311642 0.7751273  0.59263683 0.86794075]
  [0.56336702 0.17111355 0.11889252 0.64445008 0.51973141]]

 [[0.36214574 0.90385213 0.89303767 0.85033592 0.77939822]
  [0.45294685 0.87405857 0.21890982 0.89352095 0.97012714]]

 [[0.44316829 0.78293822 0.25357848 0.57778661 0.43853764]
  [0.16080295 0.90409346 0.47812449 0.88177019 0.51785705]]]


#### 13. Does *e* equal *a*? Why or why not?


In [20]:
# Yes: e equals a element by element, and both have the same shape.
# Every value in c is 1, so multiplying by c leaves a unchanged.
e == a

array([[[ True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True]],

       [[ True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True]],

       [[ True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True]]])

In [21]:
print(a.shape)
print(e.shape)

(3, 2, 5)
(3, 2, 5)
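
The element-wise comparison above returns a whole array of booleans. When a single yes/no answer is enough, it can be reduced as in the sketch below (arrays rebuilt so the snippet is self-contained).

```python
import numpy as np

a = np.random.random((3, 2, 5))
c = np.ones((3, 2, 5))
e = a * c

print((e == a).all())        # True: every element matches
print(np.array_equal(e, a))  # True: same shape and same elements
```

`np.array_equal` also checks the shapes, so it covers both cells above in one call.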


#### 14. Identify the max, min, and mean values in *d*. Assign those values to variables *d_max*, *d_min* and *d_mean*.

In [22]:
d_max = d.max()
print(d_max)

1.97012714232079


In [23]:
d_min = d.min()
print(d_min)

1.0931164241970919


In [24]:
d_mean = d.mean()
print(d_mean)

1.5730462244769516


#### 15. Now we want to label the values in *d*. First create an empty array *f* with the same shape (i.e. 3x2x5) as *d* using `np.empty`.


In [25]:
f = np.empty([3,2,5])

In [26]:
print(f)

[[[0.20802047 0.09311642 0.7751273  0.59263683 0.86794075]
  [0.56336702 0.17111355 0.11889252 0.64445008 0.51973141]]

 [[0.36214574 0.90385213 0.89303767 0.85033592 0.77939822]
  [0.45294685 0.87405857 0.21890982 0.89352095 0.97012714]]

 [[0.44316829 0.78293822 0.25357848 0.57778661 0.43853764]
  [0.16080295 0.90409346 0.47812449 0.88177019 0.51785705]]]
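
The printed values are not meaningful: `np.empty` allocates memory without initializing it, so the array shows whatever happened to be there. As a hedged sketch, the shape can also be taken from *d* directly instead of being typed by hand; `f_zeros` is an illustrative name for an alternative that starts from known values.

```python
import numpy as np

d = np.random.random((3, 2, 5)) + 1

# Same shape and dtype as d; the contents are uninitialized.
f = np.empty_like(d)
print(f.shape, f.dtype)  # (3, 2, 5) float64

# If defined starting values are preferred, zeros_like fills with 0.
f_zeros = np.zeros_like(d)
```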


#### 16. Populate the values in *f*. 

For each value in *d*: if it is larger than *d_min* but smaller than *d_mean*, assign 25 to the corresponding element of *f*; if it is larger than *d_mean* but smaller than *d_max*, assign 75; if it equals *d_mean*, assign 50; if it equals *d_min*, assign 0; and if it equals *d_max*, assign 100. In the end, *f* should contain only the values 0, 25, 50, 75, and 100.

**Note**: you don't have to use Numpy in this question.

In [27]:
for i in range(3):
    for j in range(2):
        for k in range(5):

            if d[i,j,k]>d_min and d[i,j,k]<d_mean:
                f[i,j,k]=25
            elif d[i,j,k]>d_mean and d[i,j,k]<d_max:
                f[i,j,k]=75
            elif d[i,j,k]==d_mean:
                f[i,j,k]=50
            elif d[i,j,k]==d_min:
                f[i,j,k]=0
            elif d[i,j,k]==d_max:
                f[i,j,k]=100

print(f)

[[[ 25.   0.  75.  75.  75.]
  [ 25.  25.  25.  75.  25.]]

 [[ 25.  75.  75.  75.  75.]
  [ 25.  75.  25.  75. 100.]]

 [[ 25.  75.  25.  75.  25.]
  [ 25.  75.  25.  75.  25.]]]
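
The nested loops work, but the same labelling can be expressed without explicit loops. The sketch below is one possible vectorized version using `np.select`; the name `f_vec` is illustrative, and *d* is regenerated so that the snippet runs on its own.

```python
import numpy as np

d = np.random.random((3, 2, 5)) + 1
d_min, d_max, d_mean = d.min(), d.max(), d.mean()

# Conditions are checked in order; the first one that matches decides the label.
conditions = [d == d_min, d == d_max, d == d_mean, d < d_mean, d > d_mean]
labels = [0, 100, 50, 25, 75]

f_vec = np.select(conditions, labels)
print(f_vec)
```

Putting the equality checks first matters: the minimum is also smaller than the mean, so it would otherwise be labelled 25.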


#### 17. Print *d* and *f*. Does *f* match what you expected?
For instance, if your *d* is:
```python
[[[1.85836099, 1.67064465, 1.62576044, 1.40243961, 1.88454931],
  [1.75354326, 1.69403643, 1.36729252, 1.61415071, 1.12104981]],

 [[1.72201435, 1.1862918 , 1.87078449, 1.7726778 , 1.88180042],
  [1.44747908, 1.31673383, 1.02000951, 1.52218947, 1.97066381]],

 [[1.79129243, 1.74983003, 1.96028037, 1.85166831, 1.65450881],
  [1.18068344, 1.9587381 , 1.00656599, 1.93402165, 1.73514584]]]
```
Your *f* should be:
```python
[[[ 75.  75.  75.  25.  75.]
  [ 75.  75.  25.  25.  25.]]

 [[ 75.  25.  75.  75.  75.]
  [ 25.  25.  25.  25. 100.]]

 [[ 75.  75.  75.  75.  75.]
  [ 25.  75.   0.  75.  75.]]]
```

In [28]:
print(d)

[[[1.20802047 1.09311642 1.7751273  1.59263683 1.86794075]
  [1.56336702 1.17111355 1.11889252 1.64445008 1.51973141]]

 [[1.36214574 1.90385213 1.89303767 1.85033592 1.77939822]
  [1.45294685 1.87405857 1.21890982 1.89352095 1.97012714]]

 [[1.44316829 1.78293822 1.25357848 1.57778661 1.43853764]
  [1.16080295 1.90409346 1.47812449 1.88177019 1.51785705]]]


In [29]:
print(f)

[[[ 25.   0.  75.  75.  75.]
  [ 25.  25.  25.  75.  25.]]

 [[ 25.  75.  75.  75.  75.]
  [ 25.  75.  25.  75. 100.]]

 [[ 25.  75.  25.  75.  25.]
  [ 25.  75.  25.  75.  25.]]]


#### 18. Bonus question: instead of using numbers (i.e. 0, 25, 50, 75, and 100), use string values ("A", "B", "C", "D", and "E") to label the array elements. For the example above, the expected result is:

```python
[[['D' 'D' 'D' 'B' 'D']
  ['D' 'D' 'B' 'B' 'B']]

 [['D' 'B' 'D' 'D' 'D']
  ['B' 'B' 'B' 'B' 'E']]

 [['D' 'D' 'D' 'D' 'D']
  ['B' 'D' 'A' 'D' 'D']]]
```
**Note**: you don't have to use Numpy in this question.

In [30]:
f = f.astype(str)
for i in range(3):
    for j in range(2):
        for k in range(5):

            if d[i,j,k]>d_min and d[i,j,k]<d_mean:
                f[i,j,k]="B"
            elif d[i,j,k]>d_mean and d[i,j,k]<d_max:
                f[i,j,k]="D"
            elif d[i,j,k]==d_mean:
                f[i,j,k]="C"
            elif d[i,j,k]==d_min:
                f[i,j,k]="A"
            elif d[i,j,k]==d_max:
                f[i,j,k]="E"

print(f) 

[[['B' 'A' 'D' 'D' 'D']
  ['B' 'B' 'B' 'D' 'B']]

 [['B' 'D' 'D' 'D' 'D']
  ['B' 'D' 'B' 'D' 'E']]

 [['B' 'D' 'B' 'D' 'B']
  ['B' 'D' 'B' 'D' 'B']]]
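
As with the numeric labels, the loops can be replaced by boolean-mask assignment. This is a sketch under stated assumptions: the name `g` is illustrative, *d* is regenerated so that the snippet is self-contained, and the array is created with a one-character string dtype up front instead of converting a float array.

```python
import numpy as np

d = np.random.random((3, 2, 5)) + 1
d_min, d_max, d_mean = d.min(), d.max(), d.mean()

# One-character Unicode strings; elements start out as empty strings.
g = np.empty(d.shape, dtype="<U1")

# Broad ranges first, then the exact matches, which overwrite them.
g[d < d_mean] = "B"
g[d > d_mean] = "D"
g[d == d_mean] = "C"
g[d == d_min] = "A"
g[d == d_max] = "E"

print(g)
```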
