---   

<h1 align="center">Introduction to Data Analyst and Data Science for beginners</h1>
<h1 align="center">Lecture no 2.7(NumPy-07)</h1>

---
<h3><div align="right">Ehtisham Sadiq</div></h3>    

<img align="left" width="500" height="500"  src="images/concatfunctions.png" >

## Recap:
- In the previous lectures, we introduced NumPy and covered different methods to create NumPy arrays, the difference between NumPy arrays and lists, basic NumPy operations, indexing and slicing of NumPy arrays, broadcasting and reshaping of NumPy arrays, and methods of array manipulation.

# Learning agenda of this notebook
1. Concatenating NumPy Arrays
2. Stacking NumPy Arrays
3. Splitting NumPy Arrays
4. Bonus:
   - NumPy Normal (Gaussian) Distribution (NumPy Random Normal)

## 1. Concatenating NumPy Arrays

<img align="right" width="500" height="200"  src="images/concataxis1.png" >
<img align="left" width="400" height="400"  src="images/concataxis0.png" > 

## 2. Stacking NumPy Arrays

<img align="left" width="350" height="350"  src="images/hs1.png" > 
<img align="right" width="350" height="350"  src="images/vs1.png" > 

## 3. Splitting NumPy Arrays

<img align="center" width="400" height="300"  src="images/splitting.png" > 

In [None]:
# To install this library in Jupyter notebook
#import sys
#!{sys.executable} -m pip install numpy

In [1]:
import numpy as np
np.__version__ , np.__path__

('1.22.3', ['/home/dell/.local/lib/python3.8/site-packages/numpy'])

## 1. Concatenating NumPy Arrays
<img align="right" width="400" height="200"  src="images/concataxis1.png" >
<img align="left" width="500" height="400"  src="images/concataxis0.png" > 

<br><br><br><br><br><br><br><br><br><br>

The `np.concatenate()` method is used to join arrays with respect to given axis.

```
np.concatenate(tup, axis=0)
```

- Where `tup` is a tuple (or other sequence) of ndarrays, for example `(arr1, arr2)`.
- If `axis=0`, the arrays are joined row-wise (vertically). For 2-D arrays, the number of columns must match.
- If `axis=1`, the arrays are joined column-wise (horizontally). For 2-D arrays, the number of rows must match.
- 1-D arrays can be of any size/length.
- The original arrays remain unchanged: the operation is not in-place and returns a new array.

**Example:** Concatenate two 1-D Arrays along axis = 0 (row wise). The 1-D arrays can be of any size/length.

In [2]:
import numpy as np
arr1 = np.random.randint(low = 1, high = 100, size = 5)
arr2 = np.random.randint(low = 1, high = 100, size = 3)
print("arr1 = ", arr1)
print("arr2 = ", arr2)

# axis=0 is the default, so the two calls below give the same result
arr3 = np.concatenate((arr1, arr2))
arr3 = np.concatenate((arr1, arr2), axis=0)
print("\nnp.concatenate((arr1,arr2)) = ", arr3)

arr1 =  [88 84 87 50  2]
arr2 =  [84 84 97]

np.concatenate((arr1,arr2)) =  [88 84 87 50  2 84 84 97]


You cannot concatenate 1-D arrays along `axis=1`, because a 1-D array has only one axis (`axis=0`).
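
The restriction on 1-D arrays can be checked directly. The following is a minimal sketch with two small hand-written arrays (`a`, `b` and `side_by_side` are illustrative names, not part of the lecture). It assumes that the out-of-bounds axis error NumPy raises is a subclass of `ValueError`, and shows one possible workaround: reshaping each array into a column first.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# A 1-D array has only axis 0, so asking for axis=1 is rejected.
try:
    np.concatenate((a, b), axis=1)
    raised = False
except ValueError as err:  # NumPy's AxisError is a subclass of ValueError
    raised = True
    print("Error:", err)

# Reshape each array to a column of shape (3, 1); now axis 1 exists.
side_by_side = np.concatenate((a.reshape(-1, 1), b.reshape(-1, 1)), axis=1)
print(side_by_side.shape)
print(side_by_side)
```

After reshaping, each original array becomes one column of the 2-D result.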

**Example:** Concatenate two 2-D Arrays along `axis=0` (vertically/row-wise). The number of columns of two arrays must match.

In [4]:
arr1 = np.random.randint(low = 1, high = 10, size = (2,3))
arr2 = np.random.randint(low = 1, high = 10, size = (3,3))
print("arr1 = \n", arr1)
print("arr2 = \n", arr2)
print("arr1.shape = \n", arr1.shape)
print("arr2.shape = \n", arr2.shape)


arr3 = np.concatenate((arr1, arr2), axis=0)
print("\nnp.concatenate((arr1,arr2)) = \n", arr3)

arr1 = 
 [[1 4 6]
 [2 3 6]]
arr2 = 
 [[5 7 1]
 [7 6 4]
 [1 5 7]]
arr1.shape = 
 (2, 3)
arr2.shape = 
 (3, 3)

np.concatenate((arr1,arr2)) = 
 [[1 4 6]
 [2 3 6]
 [5 7 1]
 [7 6 4]
 [1 5 7]]


**Example:** Concatenate two 2-D Arrays along `axis=1` (horizontally/column-wise). The number of rows of two arrays must match.

In [5]:
arr1 = np.random.randint(low = 1, high = 10, size = (2,2))
arr2 = np.random.randint(low = 1, high = 10, size = (2,3))
print("arr1 = \n", arr1)
print("arr2 = \n", arr2)
print("arr1.shape = \n", arr1.shape)
print("arr2.shape = \n", arr2.shape)


arr3 = np.concatenate((arr1, arr2), axis=1)
print("\nnp.concatenate((arr1,arr2)) = \n", arr3)

arr1 = 
 [[2 7]
 [2 3]]
arr2 = 
 [[7 9 6]
 [1 4 4]]
arr1.shape = 
 (2, 2)
arr2.shape = 
 (2, 3)

np.concatenate((arr1,arr2)) = 
 [[2 7 7 9 6]
 [2 3 1 4 4]]


## 2. Stacking NumPy Arrays
- Concatenating joins a sequence of arrays along an existing axis, while stacking can join them along an existing axis or along a new one.
- NumPy provides three direction-specific stacking functions:
    - `np.vstack()`: stacks arrays vertically (row-wise).
    - `np.hstack()`: stacks arrays horizontally (column-wise).
    - `np.dstack()`: stacks arrays depth-wise along the third axis.

**Note:** 
- `np.stack()` is more general than the three functions above: it offers an `axis` parameter that specifies where the new axis is placed in the result.
- `np.column_stack()` stacks 1-D arrays as columns into a 2-D array.
- `np.row_stack()` stacks 1-D arrays as rows into a 2-D array; it is an alias of `np.vstack()`.
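
To make the differences between these helpers concrete, here is a small sketch that applies three of them to the same pair of 1-D arrays and compares the resulting shapes. The arrays `a` and `b` are illustrative values chosen for readability.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

rows = np.vstack((a, b))         # each input becomes a row
cols = np.column_stack((a, b))   # each input becomes a column
depth = np.dstack((a, b))        # inputs are paired along a third axis

print("vstack shape:      ", rows.shape)
print("column_stack shape:", cols.shape)
print("dstack shape:      ", depth.shape)
```

The same six values end up in a 2-D array of rows, a 2-D array of columns, or a 3-D array, depending on the function used.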

<img align="right" width="250" height="100"  src="images/vstackfinal.png" > 

### a. Using `np.vstack()` for Row-Wise Concatenation
The `np.vstack()` method is used to stack arrays vertically or row-wise.

```
np.vstack(tup)
```

- Where `tup` is a tuple (or other sequence) of ndarrays.
- 1-D arrays must have the same size/length, while for 2-D arrays, the number of columns must match.
- It returns an ndarray formed by stacking the given arrays; the result is at least 2-D.
- The original arrays remain unchanged: the operation is not in-place.
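
The rules above can be verified with a short sketch. It uses hand-written arrays (`m1`, `m2` are illustrative names) to compare `np.vstack()` with `np.concatenate()` along `axis=0` for 2-D inputs, and to show that 1-D inputs of different lengths are rejected.

```python
import numpy as np

m1 = np.arange(6).reshape(2, 3)       # shape (2, 3)
m2 = np.arange(6, 15).reshape(3, 3)   # shape (3, 3)

# For 2-D inputs, vstack gives the same result as concatenate along axis 0.
same = np.array_equal(np.vstack((m1, m2)), np.concatenate((m1, m2), axis=0))
print("vstack == concatenate(axis=0):", same)

# 1-D inputs of different lengths cannot form the rows of one 2-D array.
try:
    np.vstack((np.array([1, 2, 3]), np.array([4, 5])))
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("Error:", err)
```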

**Example:** Perform vertical stacking of two 1-D Arrays, which must have the same size/length.

In [6]:
import numpy as np
arr1 = np.random.randint(low = 1, high = 10, size = 4)
arr2 = np.random.randint(low = 1, high = 10, size = 4)

print("arr1 = ", arr1)
print("arr2 = ", arr2)
print("arr1.shape = \n", arr1.shape)
print("arr2.shape = \n", arr2.shape)
  
arr3 = np.vstack((arr1, arr2))

print ("\nnp.vstack((arr1, arr2)):\n ", arr3)

arr1 =  [7 5 8 4]
arr2 =  [6 5 2 2]
arr1.shape = 
 (4,)
arr2.shape = 
 (4,)

np.vstack((arr1, arr2)):
  [[7 5 8 4]
 [6 5 2 2]]


Note: The output array is a 2-D array

**Example:** Perform vertical stacking of two 2-D Arrays. The number of columns of the two arrays must match.

In [7]:
arr1 = np.random.randint(low = 1, high = 10, size = (2,3))
arr2 = np.random.randint(low = 1, high = 10, size = (3,3))

print("arr1 = \n", arr1)
print("arr2 = \n", arr2)
print("arr1.shape = \n", arr1.shape)
print("arr2.shape = \n", arr2.shape)

arr3 = np.vstack((arr1, arr2))
print ("\n np.vstack((arr1, arr2)):\n ", arr3)

arr1 = 
 [[2 8 9]
 [2 9 2]]
arr2 = 
 [[9 9 4]
 [1 2 3]
 [3 2 7]]
arr1.shape = 
 (2, 3)
arr2.shape = 
 (3, 3)

 np.vstack((arr1, arr2)):
  [[2 8 9]
 [2 9 2]
 [9 9 4]
 [1 2 3]
 [3 2 7]]


<img align="right" width="250" height="100"  src="images/hstackfinal.png" > 

### b. Using `np.hstack()` for Column-Wise Concatenation
The `np.hstack()` method is used to stack arrays horizontally or column-wise.

```
np.hstack(tup)
```

- Where `tup` is a tuple (or other sequence) of ndarrays.
- 1-D arrays can have any size/length, while for 2-D arrays, the number of rows must match.
- It returns an ndarray formed by stacking the given arrays.
- The original arrays remain unchanged: the operation is not in-place.
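
As a quick check of these rules, the sketch below compares `np.hstack()` with `np.concatenate()` for both 1-D and 2-D inputs. The arrays and variable names (`v1`, `v2`, `m1`, `m2`) are illustrative.

```python
import numpy as np

# 1-D inputs of different lengths are simply joined end to end (axis 0).
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5])
flat = np.hstack((v1, v2))
same_1d = np.array_equal(flat, np.concatenate((v1, v2), axis=0))

# 2-D inputs with the same number of rows are joined along axis 1.
m1 = np.arange(4).reshape(2, 2)       # shape (2, 2)
m2 = np.arange(4, 10).reshape(2, 3)   # shape (2, 3)
wide = np.hstack((m1, m2))
same_2d = np.array_equal(wide, np.concatenate((m1, m2), axis=1))

print(flat, same_1d)
print(wide.shape, same_2d)
```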

**Example:** Perform horizontal stacking of two 1-D Arrays, which can be of different size/length.

In [8]:
arr1 = np.random.randint(low = 1, high = 10, size = 5)
arr2 = np.random.randint(low = 1, high = 10, size = 4)
print("arr1 = ", arr1)
print("arr2 = ", arr2)
print("arr1.shape = \n", arr1.shape)
print("arr2.shape = \n", arr2.shape)

  
arr3 = np.hstack((arr1, arr2))
print ("\n np.hstack((arr1, arr2)):\n ", arr3)

arr1 =  [2 2 6 5 5]
arr2 =  [9 1 9 1]
arr1.shape = 
 (5,)
arr2.shape = 
 (4,)

 np.hstack((arr1, arr2)):
  [2 2 6 5 5 9 1 9 1]


Note: The output array is a 1-D array

**Example:** Perform horizontal stacking of two 2-D Arrays. The number of rows of the two arrays must match.

In [9]:
arr1 = np.random.randint(low = 1, high = 10, size = (2,2))
arr2 = np.random.randint(low = 1, high = 10, size = (2,3))
print("arr1 = \n", arr1)
print("arr2 = \n", arr2)

arr3 = np.hstack((arr1, arr2))
print ("\n np.hstack((arr1, arr2)):\n ", arr3)

arr1 = 
 [[7 4]
 [7 3]]
arr2 = 
 [[6 6 6]
 [1 4 6]]

 np.hstack((arr1, arr2)):
  [[7 4 6 6 6]
 [7 3 1 4 6]]


### c. Using `np.stack()`

- The `np.stack()` method is used to join a sequence of arrays of the same shape along a new axis.
- The `axis` parameter specifies the index of the new axis in the dimensions of the result.
- For example, if `axis=0` the new axis is the first dimension, and if `axis=-1` it is the last dimension.

```
np.stack(arrays, axis=0)
```

- Where `arrays` is a tuple (or other sequence) of ndarrays.
- All input arrays must have exactly the same shape.
- It returns the array formed by stacking the given arrays, which has one more dimension than the input arrays.
- A typical use is building a 3-D array from 2-D arrays, for instance pixel data with a height (first axis), width (second axis), and r/g/b channels stacked along the third axis.



**Note:** Concatenating joins a sequence of arrays along an existing axis, whereas `np.stack()` joins a sequence of arrays along a new axis.
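
The effect of the new axis is easiest to see in the result shapes. The sketch below is an illustration with two `(2, 3)` arrays (`a` and `b` are placeholder names): it lists the shape produced by `np.stack()` for each possible axis, compares it with `np.concatenate()`, and shows that inputs with different shapes are rejected.

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((2, 3))

# Stacking inserts a new axis of length 2 (the number of inputs).
shapes = {axis: np.stack((a, b), axis=axis).shape for axis in (0, 1, 2, -1)}
for axis, shape in shapes.items():
    print(f"np.stack(..., axis={axis}).shape = {shape}")

# Concatenating keeps the number of dimensions and grows an existing axis.
concat_shape = np.concatenate((a, b), axis=0).shape
print("np.concatenate(..., axis=0).shape =", concat_shape)

# np.stack requires every input to have exactly the same shape.
try:
    np.stack((a, np.ones((3, 3))))
    raised = False
except ValueError as err:
    raised = True
    print("Error:", err)
```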

**Example:** Perform stacking of two 1-D Arrays, which must have the same size/shape.

In [10]:
import numpy as np
arr1 = np.random.randint(low = 1, high = 10, size = 4)
arr2 = np.random.randint(low = 1, high = 10, size = 4)
print("arr1 = ", arr1)
print("arr2 = ", arr2)
print("arr1.shape = \n", arr1.shape)
print("arr2.shape = \n", arr2.shape)

# Stacking the two arrays along axis 0
arr3 = np.stack((arr1, arr2), axis = 0)
print ("\n np.stack((arr1, arr2), axis=0):\n ", arr3)
  
# Stacking the two arrays along axis 1
arr4 = np.stack((arr1, arr2), axis = 1)
print ("\n np.stack((arr1, arr2), axis=1):\n ", arr4)


arr1 =  [6 6 3 1]
arr2 =  [7 1 4 7]
arr1.shape = 
 (4,)
arr2.shape = 
 (4,)

 np.stack((arr1, arr2), axis=0):
  [[6 6 3 1]
 [7 1 4 7]]

 np.stack((arr1, arr2), axis=1):
  [[6 7]
 [6 1]
 [3 4]
 [1 7]]


**Example:** Perform stacking of two 2-D Arrays, which must have the same size/shape.

In [11]:
import numpy as np
arr1 = np.random.randint(low = 1, high = 10, size = (2,3))
arr2 = np.random.randint(low = 1, high = 10, size = (2,3))
print("arr1 = \n", arr1)
print("arr2 = \n", arr2)
print("arr1.shape = \n", arr1.shape)
print("arr2.shape = \n", arr2.shape)


# Stacking the two arrays along axis 0
arr3 = np.stack((arr1, arr2), axis = 0)
print ("\n np.stack((arr1, arr2), axis=0): \n", arr3)
  
# Stacking the two arrays along axis 1
arr4 = np.stack((arr1, arr2), axis = 1)
print ("\n np.stack((arr1, arr2), axis=1):\n ", arr4)

# Stacking the two arrays along last axis
arr5 = np.stack((arr1, arr2), axis = -1)
print ("\n np.stack((arr1, arr2), axis=-1):\n ", arr5)

arr1 = 
 [[6 8 8]
 [6 8 1]]
arr2 = 
 [[9 6 8]
 [7 6 5]]
arr1.shape = 
 (2, 3)
arr2.shape = 
 (2, 3)

 np.stack((arr1, arr2), axis=0): 
 [[[6 8 8]
  [6 8 1]]

 [[9 6 8]
  [7 6 5]]]

 np.stack((arr1, arr2), axis=1):
  [[[6 8 8]
  [9 6 8]]

 [[6 8 1]
  [7 6 5]]]

 np.stack((arr1, arr2), axis=-1):
  [[[6 9]
  [8 6]
  [8 8]]

 [[6 7]
  [8 6]
  [1 5]]]


## 3. Splitting NumPy Arrays
- Splitting is the reverse of joining: it divides one array into multiple sub-arrays.
- NumPy provides three main splitting functions:
    - `np.split()`: splits an array into a list of multiple sub-arrays of equal size.
    - `np.hsplit()`: splits an array into multiple sub-arrays horizontally (column-wise).
    - `np.vsplit()`: splits an array into multiple sub-arrays vertically (row-wise).

### a. The `np.split()` and `np.array_split()` Methods

<img align="left" width="400" height="200"  src="images/split.png" > 
<img align="right" width="400" height="200"  src="images/array_split.png" > 

<br><br><br><br><br><br><br><br><br><br>
- The `np.split()` method splits an array into multiple sub-arrays of equal size.

```
np.split(arr, indices_or_sections, axis=0)
```
- Where,
    - `arr` is the array to be divided into sub-arrays.
    - `indices_or_sections`, when given as an integer N, is the number of equal sub-arrays into which `arr` will be divided along the axis. It can also be a sorted list of indices at which to split.
    - `axis` is the axis along which to split, default is 0.
- If an equal split is not possible, an error is raised. To avoid the error you can use `np.array_split()`, which allows sub-arrays of unequal size.
- It returns a list of sub-arrays as views into `arr`.
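
Two details of `np.split()` are worth trying out: its second argument may also be a list of indices at which to cut, and the returned sub-arrays are views, so writing to one of them changes the original array. The following sketch illustrates both with a small hand-written array; the names `first`, `middle` and `last` are illustrative.

```python
import numpy as np

arr = np.arange(10)

# A list of indices splits *at* those positions: [:3], [3:7], [7:]
first, middle, last = np.split(arr, [3, 7])
print(first, middle, last)

# The pieces are views into arr, not copies.
print("shares memory:", np.shares_memory(first, arr))
first[0] = 99
print("arr after editing the view:", arr)
```

If independent sub-arrays are needed, calling `.copy()` on each piece is one way to detach them from the original.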

**Example:** Use of `split()`

In [12]:
arr1 = np.random.randint(low = 1, high = 10, size = 20)
print("arr1:\n",arr1)

# The number of sections must be a factor of the array size (here it can be 1, 2, 4, 5, 10 or 20)
print("\nSub-arrays: \n", np.split(arr1, 2))

arr1:
 [1 4 9 2 3 3 8 2 6 2 1 7 1 3 3 8 9 6 7 9]

Sub-arrays: 
 [array([1, 4, 9, 2, 3, 3, 8, 2, 6, 2]), array([1, 7, 1, 3, 3, 8, 9, 6, 7, 9])]


**Example:** Use of `array_split()`

In [13]:
# create an array of 20 random integers
arr1 = np.random.randint(low = 1, high = 10, size = 20)
print("arr1:\n",arr1)

# array_split() does not raise an error if the number of sections is not a factor of the array size; the sub-arrays then differ in length
print("\nSub-arrays: \n", np.array_split(arr1, 3))

arr1:
 [9 7 6 7 4 1 8 3 7 3 3 4 5 5 5 9 7 3 4 2]

Sub-arrays: 
 [array([9, 7, 6, 7, 4, 1, 8]), array([3, 7, 3, 3, 4, 5, 5]), array([5, 9, 7, 3, 4, 2])]
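
The difference between the two functions can be seen side by side. This sketch uses a deterministic array of 20 elements (an illustrative choice, so the result is reproducible) and asks both functions for 3 sections.

```python
import numpy as np

arr = np.arange(20)

# 20 is not divisible by 3, so np.split refuses.
try:
    np.split(arr, 3)
    split_failed = False
except ValueError as err:
    split_failed = True
    print("np.split error:", err)

# np.array_split makes the leading sub-arrays one element longer instead.
parts = np.array_split(arr, 3)
sizes = [len(p) for p in parts]
print("array_split sizes:", sizes)
```

The sizes match the output above: the first `20 % 3 = 2` sub-arrays get 7 elements and the remaining one gets 6.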


<img align="right" width="150" height="50"  src="images/hsplit.png" > 

### b. The `np.hsplit()` Method
- The `np.hsplit()` method is used to split an array into multiple sub-arrays horizontally (column-wise).
- `np.hsplit()` is equivalent to `np.split()` with `axis=1`: the array is always split along the second axis, except for 1-D arrays, which are split along `axis=0`.

```
np.hsplit(arr, indices_or_sections)
```
- Where,
    - `arr` is the array to be divided into sub-arrays.
    - `indices_or_sections`, when given as an integer, is the number of equal sub-arrays into which `arr` will be divided along the axis. For `hsplit()`, it should be a factor of the number of columns, otherwise an error is raised.
- It returns a list of sub-arrays as views into `arr`.
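
A short sketch can confirm how `np.hsplit()` relates to `np.split()`. It uses a 4x4 float array; the variable names are illustrative. It also shows the 1-D case and a split at explicit column indices.

```python
import numpy as np

grid = np.arange(16.0).reshape(4, 4)

# hsplit on a 2-D array matches split along axis 1.
left, right = np.hsplit(grid, 2)
via_split = np.split(grid, 2, axis=1)
same = all(np.array_equal(x, y) for x, y in zip((left, right), via_split))
print("hsplit == split(axis=1):", same)

# A 1-D array has no second axis, so hsplit cuts it along axis 0.
halves = np.hsplit(np.arange(6), 2)
print(halves)

# A list of column indices gives pieces of unequal width: [:1], [1:3], [3:]
a, b, c = np.hsplit(grid, [1, 3])
print(a.shape, b.shape, c.shape)
```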

**Example:**

In [14]:
# create an array of float type with 4 rows and 4 columns with sequential values from 0 to 15
arr1 = np.arange(16.0).reshape(4,4)
# print array
print("arr1:\n",arr1)
print("shape: ", arr1.shape)

arr1:
 [[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]
 [12. 13. 14. 15.]]
shape:  (4, 4)


In [15]:
# horizontally split array into 2 subarrays
print("\nSub-arrays: \n", np.hsplit(arr1, 2))


Sub-arrays: 
 [array([[ 0.,  1.],
       [ 4.,  5.],
       [ 8.,  9.],
       [12., 13.]]), array([[ 2.,  3.],
       [ 6.,  7.],
       [10., 11.],
       [14., 15.]])]


<img align="right" width="200" height="50"  src="images/vsplit.png" > 

### c. The `np.vsplit()` Method
- The `np.vsplit()` method is used to split an array into multiple sub-arrays vertically (row-wise). It is not applicable to 1-D arrays.
- `np.vsplit()` is equivalent to `np.split()` with `axis=0`: the array is always split along the first axis regardless of the array dimension.

```
np.vsplit(arr, indices_or_sections)
```
- Where,
    - `arr` is the array to be divided into sub-arrays.
    - `indices_or_sections`, when given as an integer, is the number of equal sub-arrays into which `arr` will be divided along the axis. For `vsplit()`, it should be a factor of the number of rows, otherwise an error is raised.
- It returns a list of sub-arrays as views into `arr`.
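
The same kind of check works for `np.vsplit()`. The sketch below uses a 4x5 float array with illustrative names, compares the result with `np.split()` along `axis=0`, and shows that a 1-D input is rejected.

```python
import numpy as np

grid = np.arange(20.0).reshape(4, 5)

# vsplit on a 2-D array matches split along axis 0.
top, bottom = np.vsplit(grid, 2)
via_split = np.split(grid, 2, axis=0)
same = all(np.array_equal(x, y) for x, y in zip((top, bottom), via_split))
print("vsplit == split(axis=0):", same)
print(top.shape, bottom.shape)

# vsplit needs at least two dimensions, so a 1-D array raises an error.
try:
    np.vsplit(np.arange(6), 2)
    raised = False
except ValueError as err:
    raised = True
    print("Error:", err)
```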

**Example:**

In [29]:
# create an array of float type with 4 rows and 5 columns with sequential values from 0 to 19
arr1 = np.arange(20.0).reshape(4,5)
print("arr1:\n",arr1)
print("shape: ", arr1.shape)

arr1:
 [[ 0.  1.  2.  3.  4.]
 [ 5.  6.  7.  8.  9.]
 [10. 11. 12. 13. 14.]
 [15. 16. 17. 18. 19.]]
shape:  (4, 5)


In [30]:
# vertically split array into 2 subarrays (remember the number of sections must be a factor of the number of rows)
print("\nSub-arrays: \n", np.vsplit(arr1, 2))


Sub-arrays: 
 [array([[0., 1., 2., 3., 4.],
       [5., 6., 7., 8., 9.]]), array([[10., 11., 12., 13., 14.],
       [15., 16., 17., 18., 19.]])]


### Check Your Concepts
- Create a NumPy array with random values
- How to choose elements from a list with different probabilities using NumPy? (Hint: `np.random.choice()`)
- How to get a weighted random choice in Python?
- Generate random numbers from the uniform distribution using NumPy
- Get random elements from the geometric distribution
- Get random elements from the Laplace distribution
- Return a matrix of random values from a uniform distribution
- Return a matrix of random values from a Gaussian distribution
- Different ways to convert a Python dictionary to a NumPy array
- How to convert a list and tuple into NumPy arrays?
- Ways to convert array of strings to array of floats
- Convert a NumPy array into a csv file
- How to Convert an image to NumPy array and save it to CSV file using Python?
- How to save a NumPy array to a text file?
- Load data from a text file
- Plot line graph from NumPy array
- Create Histogram using NumPy

## NumPy Normal Distribution
- In this topic, we’ll learn how to use the NumPy `random.normal` function to create normal (or Gaussian) distributions. The function allows us to create distributions with specific means and standard deviations, and we can also create distributions of different sizes.

### What is the Normal (Gaussian) Distribution
- The normal distribution describes a common phenomenon that occurs when data is spread in a certain way. This means that the data aren’t skewed in a particular way, but are also not jumbled all over the place. In fact, they form a bell-curve, similar to the chart below:
![](images/Standard_deviation_diagram.svg_.png)

Many real-world quantities are approximately normally distributed, for example the heights and weights of people, blood pressure, marks on a test, and the dimensions of items produced by machinery.

When we say that data are distributed normally, we mean:
   - They are centered around a mean
   - Their spread around the mean follows fixed proportions measured in standard deviations

In the image above, the dark blue lines represent 1 standard deviation from the mean in both directions. According to a Gaussian distribution, `~68.2%` of values will fall within one standard deviation.
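
The 68.2% figure can be checked empirically with `np.random.normal()`. The sketch below is only an illustration: the mean of 170 and standard deviation of 10 (think of heights in cm) are assumed values, not data from this lecture, and the seed is fixed only so that the run is reproducible.

```python
import numpy as np

np.random.seed(42)  # fixed seed so the run is reproducible

mean, sd = 170.0, 10.0  # assumed, illustrative parameters
samples = np.random.normal(loc=mean, scale=sd, size=100_000)

# Fraction of samples that lie within one standard deviation of the mean.
within_one_sd = np.mean(np.abs(samples - mean) <= sd)

print("sample mean:", samples.mean())
print("sample std: ", samples.std())
print("fraction within 1 sd:", within_one_sd)
```

With this many samples, the printed fraction should come out close to 0.68, and the sample mean and standard deviation should be close to the values passed as `loc` and `scale`.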

# NumPy - Assignment no 07
- Here is the link to [NumPy - Assignment no 07]()