<img src="../../../images/banners/data_processing.png" width="600"/>

# <img src="../../../../images/logos/python.png" width="23"/> NumPy

**Question:**  
What are four top benefits of NumPy?

**Answer:**
1. **More speed:** NumPy runs its core loops in compiled C code, which is typically much faster than the equivalent loop written in pure Python.
2. **Fewer loops:** NumPy helps you to reduce loops and keep from getting tangled up in iteration indices.
3. **Clearer code:** Without loops, your code will look more like the equations you’re trying to calculate.
4. **Better quality:** There are thousands of contributors working to keep NumPy fast, friendly, and bug free.   

---

**Question:**  
What does `ndarray` mean?

**Answer:**  
An ndarray is a (usually fixed-size) multidimensional container of items of the same type and size.

---

**Question:**  
What is the use of `shape` in NumPy?

**Answer:**  
The number of dimensions and items in an array is defined by its shape, which is a tuple of N non-negative integers that specify the size of each dimension.
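
As a small illustration (the array `a` is just an example name), the shape can be inspected alongside the related `ndim` and `size` attributes:

```python
import numpy as np

# A 2-D array with 2 rows and 3 columns.
a = np.zeros((2, 3))

print(a.shape)  # (2, 3): one entry per dimension
print(a.ndim)   # 2: the number of dimensions, i.e. len(a.shape)
print(a.size)   # 6: total number of items, the product of the shape
```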

---

**Question:**   
How can you specify the type of items in NumPy?

**Answer:**  
The type of items in the array is specified by a data-type object `dtype`, one of which is associated with each ndarray.
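
A minimal sketch of setting and converting the `dtype` (the variable names are illustrative):

```python
import numpy as np

# Request a specific element type when the array is created.
a = np.array([1, 2, 3], dtype=np.float32)
print(a.dtype)     # float32
print(a.itemsize)  # 4 (bytes per element)

# astype() returns a new array converted to another dtype.
b = a.astype(np.int16)
print(b.dtype)     # int16
```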

---

**Question:**  
Can an ndarray be a `view` of another ndarray?

**Answer:**  
Yes. Different ndarrays can share the same data, so changes made through one ndarray may be visible in another. An ndarray can therefore be a “view” of another ndarray, and the data it refers to is owned by the “base” ndarray.
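
For example, a basic slice is a view whose `base` attribute points back at the array that owns the data (the names `base` and `view` below are illustrative):

```python
import numpy as np

base = np.arange(6)
view = base[2:5]          # basic slicing returns a view, not a copy

view[0] = 100             # write through the view
print(base[2])            # 100: the change is visible in the base array
print(view.base is base)  # True
```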

---

**Question:**  
When you create an array from lists or tuples, what happens if the input consists of different data types?

**Answer:**  
The `array` function will normally cast all input elements into the most suitable data type required for the array. For example, if a list contains both floats and integers, the resulting array will be of type float. If it contains an integer and a boolean, the resulting array will consist of integers.
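
A quick check of both cases (the exact integer width, e.g. `int64`, depends on the platform and NumPy version, so only the kind of type is shown for the second array):

```python
import numpy as np

mixed_float = np.array([1, 2.5])   # integer + float
mixed_bool = np.array([3, True])   # integer + boolean

print(mixed_float.dtype)       # float64
print(mixed_bool.dtype.kind)   # 'i': a signed integer type
print(mixed_bool)              # [3 1]: True was cast to 1
```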

---

**Question:**  
What's the meaning of `vectorization` in NumPy?

**Answer:**  
Vectorization is the process of performing the same operation in the same way for each element in an array. This removes `for` loops from your code but achieves the same result.
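
As a sketch, here is the same temperature conversion written with an explicit loop and in vectorized form (the data and names are made up for illustration):

```python
import numpy as np

celsius = np.array([0.0, 25.0, 100.0])

# Loop version: one element at a time, indexed by hand.
fahrenheit_loop = np.empty_like(celsius)
for i in range(len(celsius)):
    fahrenheit_loop[i] = celsius[i] * 9 / 5 + 32

# Vectorized version: the expression is applied to every element.
fahrenheit_vec = celsius * 9 / 5 + 32

print(fahrenheit_vec.tolist())  # [32.0, 77.0, 212.0]
```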

---

**Question:**  
Which NumPy operations are vectorized?

**Answer:**  
Most NumPy operations (arithmetic, comparisons, mathematical functions, and reductions such as `sum` or `max`) are vectorized: you apply the operation to the whole array instead of to each element individually. This is not just neat and handy but also improves performance compared to using Python loops.

---

**Question:**  
Explain `broadcasting` in NumPy? What are the broadcasting rules?

**Answer:**  
The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes. Broadcasting provides a means of vectorizing array operations so that looping occurs in C instead of Python. It does this without making needless copies of data and usually leads to efficient algorithm implementations. There are, however, cases where broadcasting is a bad idea because it leads to inefficient use of memory that slows computation.
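The rules: shapes are compared dimension by dimension, starting from the trailing (rightmost) dimension. Two dimensions are compatible when they are equal or when one of them is 1. A missing leading dimension is treated as 1.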
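
A minimal sketch of compatible and incompatible shapes (the arrays are illustrative):

```python
import numpy as np

matrix = np.ones((3, 4))             # shape (3, 4)
row = np.array([0, 1, 2, 3])         # shape (4,), treated as (1, 4)
col = np.array([[10], [20], [30]])   # shape (3, 1)

print((matrix + row).shape)  # (3, 4): trailing dimensions 4 and 4 match
print((row + col).shape)     # (3, 4): each size-1 dimension is stretched

try:
    matrix + np.ones(3)      # trailing dimensions 4 and 3 are incompatible
except ValueError as exc:
    print("not broadcastable:", exc)
```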

---

**Question:**  
What's the difference between `C_CONTIGUOUS` flag and `F_CONTIGUOUS` flag of a NumPy array?

**Answer:**  
The flags indicate whether the array is stored in row-major (C) or column-major (Fortran) order, i.e. whether the entries of a row or of a column are stored at adjacent memory addresses. `C_CONTIGUOUS` means that row-wise operations on the array will usually be somewhat quicker. `F_CONTIGUOUS` means that column-wise operations will usually be faster.
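
The flags and the resulting memory layout can be inspected directly. This sketch assumes the default `float64` element type (8 bytes per element):

```python
import numpy as np

c_arr = np.zeros((3, 4))           # default layout is C order (row-major)
f_arr = np.asfortranarray(c_arr)   # same values in Fortran order (column-major)

print(c_arr.flags['C_CONTIGUOUS'], c_arr.flags['F_CONTIGUOUS'])  # True False
print(f_arr.flags['C_CONTIGUOUS'], f_arr.flags['F_CONTIGUOUS'])  # False True

# Strides show which neighbours are adjacent in memory.
print(c_arr.strides)  # (32, 8): next element in a row is 8 bytes away
print(f_arr.strides)  # (8, 24): next element in a column is 8 bytes away
```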

---

**Question:**  
What's the difference between `view` and `copy` of a NumPy array?

**Answer:**
A view refers to the same data as the original array, so modifying a view modifies the original array too. A copy owns its own data, so changes to a copy do not affect the original.

---

**Question:**  
How can you determine whether two arrays are copies or views of each other?

**Answer:**  
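By using the `np.may_share_memory(array1, array2)` function in NumPy. It only compares memory bounds, so it is fast but can report false positives; `np.shares_memory(array1, array2)` performs an exact check. You can also inspect the `.base` attribute, which is `None` for an array that owns its data.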
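
A short sketch comparing `np.may_share_memory` with the exact check `np.shares_memory` (the array names are illustrative):

```python
import numpy as np

a = np.arange(10)
view = a[::2]            # basic slicing: a view of `a`
copy = a[::2].copy()     # explicit copy: owns its own data

print(np.may_share_memory(a, view))  # True
print(np.may_share_memory(a, copy))  # False
print(view.base is a)                # True
print(copy.base is None)             # True

# Interleaved slices overlap in their memory bounds but share no element,
# so the bounds-only check reports a false positive.
evens, odds = a[::2], a[1::2]
print(np.may_share_memory(evens, odds))  # True
print(np.shares_memory(evens, odds))     # False
```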

---

**Question:**  
What's the meaning of strides in `numpy.ndarray`?

**Answer:**  
Strides are the indexing scheme of NumPy arrays: for each dimension, the stride is the number of bytes to jump in memory to reach the next element along that dimension.

---

**Question:**  
What's the output of the following code? Explain your answer.
```
import numpy as np
x = np.arange(24, dtype=np.int16).reshape(2, 3, 4)
print(x.strides)
```
**Answer:**  
`(24, 8, 2)`

The data type of this array is `int16`, which means each element is a 16-bit (2-byte) integer.
The elements along the first dimension are 24 bytes apart (3 × 4 elements × 2 bytes), so NumPy jumps 24 bytes to reach the next element in that dimension. The elements along the second dimension are 8 bytes apart (4 elements × 2 bytes), and the elements along the last dimension are 2 bytes apart.
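
The same numbers can be derived from the shape. This is a sketch that assumes a C-contiguous array, where the stride of an axis equals the item size times the product of the lengths of all later axes:

```python
import numpy as np

x = np.arange(24, dtype=np.int16).reshape(2, 3, 4)

# itemsize * (product of the sizes of the remaining axes), per axis
expected = tuple(
    x.itemsize * int(np.prod(x.shape[i + 1:])) for i in range(x.ndim)
)

print(x.strides)  # (24, 8, 2)
print(expected)   # (24, 8, 2)
```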


---

**Question:**  
Can you guess what the output is? Explain your answer.
```
import numpy as np
x = np.array([
    [[5, 3, 7, 1, 2],
     [2, 6, 4, 6, 3]],
    [[6, 1, 5, 1, 8],
     [4, 3, 2, 0, 9]],
])
print(x.max(axis=2))
```

**Answer:**  
```
[[7 6]
 [8 9]]
```
The shape of this array is (2, 2, 5). The call `x.max(axis=2)` collapses and removes the third dimension, so the resulting array has shape (2, 2), with each value equal to the maximum of the corresponding collapsed values.
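
To see how the chosen axis affects the result, this sketch reduces the same array along each axis and also shows the `keepdims` option, which keeps the collapsed axis with length 1:

```python
import numpy as np

x = np.array([
    [[5, 3, 7, 1, 2],
     [2, 6, 4, 6, 3]],
    [[6, 1, 5, 1, 8],
     [4, 3, 2, 0, 9]],
])

print(x.max(axis=0).shape)                 # (2, 5): first axis removed
print(x.max(axis=1).shape)                 # (2, 5): second axis removed
print(x.max(axis=2).shape)                 # (2, 2): third axis removed
print(x.max(axis=2, keepdims=True).shape)  # (2, 2, 1)
```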

---

**Question:**  
How many kinds of indexing are there in NumPy?

**Answer:**  
There are different kinds of indexing available, depending on `obj` in the expression `x[obj]`:
- Basic indexing
- Advanced indexing
- Field access

---

**Question:**  
Does NumPy basic slicing create a view or a copy of the array?

**Answer:**  
NumPy basic slicing creates a view, not a copy, unlike slicing built-in Python sequences such as strings, tuples, and lists. Take care when extracting a small portion of a large array that is no longer needed afterwards: the small portion holds a reference to the large original array, whose memory will not be released until all arrays derived from it are garbage-collected. In such cases an explicit `copy()` is recommended.
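
The contrast with a Python list can be sketched as follows (names are illustrative):

```python
import numpy as np

# Slicing a Python list makes a new list.
py_list = [1, 2, 3, 4]
py_slice = py_list[:2]
py_slice[0] = 99
print(py_list[0])   # 1: the original list is unchanged

# Slicing a NumPy array makes a view onto the same data.
np_arr = np.array([1, 2, 3, 4])
np_slice = np_arr[:2]
np_slice[0] = 99
print(np_arr[0])    # 99: the original array changed

# An explicit copy is independent of the original.
np_copy = np_arr[:2].copy()
np_copy[1] = -1
print(np_arr[1])    # 2
```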

---

**Question:**
What is the difference between `x[2]` and `x[0:2]` in the code below?
```
import numpy as np
x = np.array([
    [2, 0, 8, 4],
    [9, 4, 6, 8],
    [1, 3, 4, 6],
])
print(x[2])
print(x[0:2])
```
**Answer:**  
The output of `x[2]` (indexing) is the entire row with index 2 of array `x`, which is:
```
[1 3 4 6]
```
The output of `x[0:2]` (slicing) is an array with shape (2, 4) that contains the two rows of array `x` with index 0 and 1, which is:
```
[[2 0 8 4]
 [9 4 6 8]]
```
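
The difference also shows up in the shapes: an integer index removes a dimension, while a slice keeps it, even when the slice selects a single row. A short check:

```python
import numpy as np

x = np.array([
    [2, 0, 8, 4],
    [9, 4, 6, 8],
    [1, 3, 4, 6],
])

print(x[2].shape)     # (4,): integer index drops the first dimension
print(x[0:2].shape)   # (2, 4): slice keeps the first dimension
print(x[2:3].shape)   # (1, 4): a one-row slice is still 2-D
```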

---

**Question:**  
What's NumPy advanced indexing?

**Answer:**  
Advanced indexing is triggered when the selection object is a non-tuple sequence object, an ndarray (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool). There are two types of advanced indexing:
- Integer
- Boolean

Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view).
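
A small sketch of the copy behaviour. Note that assigning through an advanced index on the left-hand side still writes into the original array:

```python
import numpy as np

x = np.arange(5)

picked = x[[0, 2, 4]]   # integer advanced indexing: a copy
picked[0] = 99
print(x[0])             # 0: the original is unchanged

x[[0, 2, 4]] = -1       # assignment writes into `x` itself
print(x.tolist())       # [-1, 1, -1, 3, -1]
```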


---

**Question:**  
We have an array like below. What is the difference between `x[0, 2]` and `x[[0, 2]]`?
```
x = np.array([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
])
```
**Answer:**  
`x[0, 2]` is basic indexing and returns the single element at row 0, column 2, which is 30.  
`x[[0, 2]]` is advanced indexing and returns a copy of row 0 and row 2 of array `x`:
```
[[10 20 30]
 [70 80 90]]
```

---

**Question:**  
Can you use the `and` and `or` operators in advanced indexing?

**Answer:**  
No, because `and` and `or` operate on the truth value of the whole array, not element by element, and NumPy raises a `ValueError` when asked for the truth value of an array with more than one element. You have to use the bitwise operators `&`, `|`, and `~` instead. NumPy designates them as the vectorized, element-wise operators for combining Booleans.
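
A minimal demonstration (the array is illustrative). The parentheses around each comparison matter, because `&` binds more tightly than `>` and `<`:

```python
import numpy as np

x = np.array([5, 15, 25])

try:
    mask = (x > 10) and (x < 20)   # asks for the truth value of an array
except ValueError as exc:
    print("ValueError:", exc)

mask = (x > 10) & (x < 20)         # element-wise AND
print(mask.tolist())               # [False, True, False]
print(x[mask].tolist())            # [15]
print(x[~mask].tolist())           # [5, 25]
```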


---

**Question:**  
Consider the following array. Calculate the `sum` of elements greater than 20 and less than 90.
```
x = np.array([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
])
```
**Answer:**  
```
x[(x > 20) & (x < 90)].sum()
```
The result is `330` (30 + 40 + 50 + 60 + 70 + 80).

---

**Question:**  
What is the default axis in `np.sort()` and what is the difference between this function and `.sort()` method?

**Answer:**  
The default axis is `-1`, which means the last axis (the innermost dimension). The difference is that the `np.sort()` function returns a sorted copy of the array, whereas the `.sort()` method sorts the array in place and returns `None`.
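
A short sketch of both behaviours on a 2-D array, where the default `axis=-1` sorts within each row:

```python
import numpy as np

x = np.array([[3, 1, 2],
              [9, 7, 8]])

sorted_copy = np.sort(x)    # new array, sorted along the last axis
print(sorted_copy.tolist()) # [[1, 2, 3], [7, 8, 9]]
print(x.tolist())           # [[3, 1, 2], [9, 7, 8]]: unchanged

result = x.sort()           # sorts `x` in place
print(result)               # None
print(x.tolist())           # [[1, 2, 3], [7, 8, 9]]
```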


---

**Question:**   
Consider the following array. What is the difference between `np.sort(x, axis=0)` and `np.sort(x, axis=1)`?
```
x = np.array([
    [20, 70, 60],
    [80, 50, 30],
    [40, 10, 90],
])
```
**Answer:**  
In the output of `np.sort(x, axis=0)`, each column of the array still has all of its elements, but they have been sorted low-to-high within that column:
```
[[20 10 30]
 [40 50 60]
 [80 70 90]]
```
Similarly, `np.sort(x, axis=1)` does the same for rows:
```
[[20 60 70]
 [30 50 80]
 [10 40 90]]
```


---

**Question:**  
What happens if you set the first element of the following array to `'working'`?
```
x = np.array(['help', 'name', 'book'])
x[0] = 'working'
```
**Answer:**  
Array x would be:
```
['work' 'name' 'book']
```
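NumPy infers a fixed-width string dtype (`U4`, at most four characters) from the initial values, so `'working'` is silently truncated to its first four characters and the rest is lost.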
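
One possible way to avoid the truncation is to request a wider string dtype up front. The width `U7` below is just an example chosen to fit `'working'`:

```python
import numpy as np

x = np.array(['help', 'name', 'book'])
print(x.dtype == np.dtype('U4'))   # True: width inferred from the data
x[0] = 'working'
print(x[0])                        # work

wide = np.array(['help', 'name', 'book'], dtype='U7')
wide[0] = 'working'
print(wide[0])                     # working
```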


---

**Question:**  
Create a NumPy array named `data` that contains the following information, so that `print(data['Age'].mean())` prints `22.0`.
```
Reza, 19
Amir, 25
Hamed, 22
```
**Answer:**
```
import numpy as np

# Structured array: each record has a 'Name' and an 'Age' field.
data = np.array([
    ('Reza', 19),
    ('Amir', 25),
    ('Hamed', 22),
],
dtype = {
    'names': ('Name', 'Age'),
    'formats': ('U10', 'i2'),
})
print(data['Age'].mean())
```

---

**Question:**  
How can you store a NumPy array in a text or CSV file?


**Answer:**  
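You can use the `np.savetxt()` function to handle the exporting. This function accepts:
- first argument (`fname`): the path of the exported file
- second argument (`X`): the array to save
- third argument (`fmt`): the format string applied to each value
- `delimiter` keyword: the column separator, e.g. `','` for a CSV file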
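
A minimal sketch that writes a small array as CSV into a temporary directory. The file name, header text, and data are made up for illustration:

```python
import os
import tempfile

import numpy as np

data = np.array([[1.5, 2.0], [3.25, 4.0]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")
    # comments="" stops savetxt from prefixing the header line with "# ".
    np.savetxt(path, data, fmt="%.2f", delimiter=",", header="a,b", comments="")
    with open(path) as f:
        text = f.read()

print(text)
# a,b
# 1.50,2.00
# 3.25,4.00
```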


---

**Question:**  
How can you read a text or CSV file as a NumPy array?

**Answer:**  
You can use the `np.genfromtxt()` function and pass the following to it:
- file path
- `dtype`
- `delimiter`
- other options as needed

This function returns a NumPy array.
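
A small sketch that reads CSV text from an in-memory buffer instead of a real file. The column names and values are invented for the example:

```python
import io

import numpy as np

csv_text = "x,y\n1,2.5\n3,4.5\n"

# Skip the header line and parse everything as floats (the default dtype).
arr = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)
print(arr.shape)      # (2, 2)
print(arr.tolist())   # [[1.0, 2.5], [3.0, 4.5]]

# names=True uses the header line as field names of a structured array.
named = np.genfromtxt(io.StringIO(csv_text), delimiter=",", names=True)
print(named['y'].tolist())  # [2.5, 4.5]
```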

---

**Question:**  
How can you save or load NumPy arrays as NumPy binary files?


**Answer:**  
In NumPy, the `load()`, `save()`, `savez()`, and `savez_compressed()` functions help you to load and save NumPy binary files.  
NumPy binary files store the array's shape and data type along with its data, so the array can be reconstructed correctly even when you open the file on another machine with a different architecture.

---

**Question:**  
What is the difference between `save()` and `savez()` function in NumPy? What should be the extension of the file?

**Answer:**  
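By using `save()` you can save one NumPy array, and the file extension should be `.npy`, whereas by using `savez()` you can save multiple arrays in one file with the extension `.npz`. Both functions append the extension automatically if the file name does not already end with it.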
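
A round-trip sketch of both functions using a temporary directory. The file names and the keyword names `first` and `second` are arbitrary choices for the example:

```python
import os
import tempfile

import numpy as np

a = np.arange(6, dtype=np.int16).reshape(2, 3)
b = np.linspace(0.0, 1.0, 5)

with tempfile.TemporaryDirectory() as tmp:
    # One array per .npy file.
    npy_path = os.path.join(tmp, "single.npy")
    np.save(npy_path, a)
    loaded = np.load(npy_path)

    # Several arrays in one .npz archive, stored under keyword names.
    npz_path = os.path.join(tmp, "many.npz")
    np.savez(npz_path, first=a, second=b)
    with np.load(npz_path) as archive:
        names = sorted(archive.files)
        first = archive["first"]
        second = archive["second"]

print(loaded.dtype, loaded.shape)  # int16 (2, 3): dtype and shape survive
print(names)                       # ['first', 'second']
```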

---