# **`Data Science Learners Hub`**

**Module : Python**

**email** : [datasciencelearnershub@gmail.com](mailto:datasciencelearnershub@gmail.com)

### **`1.3. NumPy Arrays:`**

**Creating NumPy Arrays:**

NumPy arrays are the foundation of numerical computing in Python. **<mark>They are homogeneous, multi-dimensional data structures that efficiently store and manipulate numerical data</mark>**. Here are various methods to create NumPy arrays:

| Method | Syntax | Return Type | Input Parameters | In-Place or Copy | One-liner Explanation | Peculiarities/Considerations |
| --- | --- | --- | --- | --- | --- | --- |
| `array()` | `np.array(object, ...)` | `ndarray` | `object`: Input data, such as a list or tuple | Copy | <mark>**Create an array from an existing Python list or tuple.**</mark> |  |
| `arange()` | `np.arange([start, ]stop, [step, ])` | `ndarray` | `start`, `stop`, `step`: Parameters defining range | Copy | Generate an array with <mark>**regularly spaced values**.</mark> | <mark>**The `stop` value is exclusive.**</mark> |
| `empty()` | `np.empty(shape, ...)` | `ndarray` | `shape`: **<mark>Int or tuple specifying array dimensions</mark>** | Copy | Create an uninitialized array with the given shape and dtype. | <mark>**Content of the array is not initialized; it may contain garbage values.**</mark> |
| `zeros()` | `np.zeros(shape, ...)` | `ndarray` | `shape`: Int or tuple specifying array dimensions | Copy | Create an array filled with zeros. | **<mark>Default data type is `float64`.</mark>** |
| `ones()` | `np.ones(shape, ...)` | `ndarray` | `shape`: Int or tuple specifying array dimensions | Copy | Create an array filled with ones. | **<mark>Default data type is `float64`.</mark>** |
| `full()` | `np.full(shape, fill_value, ...)` | `ndarray` | `shape`: Int or tuple specifying array dimensions, **<mark>`fill_value`: Constant value</mark>** | Copy | Create an array of specified shape and fill it with a constant value. | **<mark>Useful for initializing arrays with a specific value; the data type is inferred from `fill_value` unless `dtype` is given.</mark>** |
| `linspace()` | `np.linspace(start, stop, num=50, ...)` | `ndarray` | `start`, `stop`: Range limits, **<mark>`num`: Number of points</mark>** | Copy | <mark>Generate an array with a specified number of evenly spaced values.</mark> | **<mark>Commonly used for creating time intervals.</mark>** |
| `logspace()` | `np.logspace(start, stop, num=50, ...)` | `ndarray` | `start`, `stop`: Range limits, **<mark>`num`: Number of points</mark>** | Copy | Generate an array with a specified number of **<mark>logarithmically spaced values.</mark>** | Useful for creating logarithmically spaced scales. |
| `eye()` | `np.eye(N, M=None, k=0, ...)` | `ndarray` | **<mark>`N`: Number of rows, `M`: Number of columns (default is N), `k`: Index of the diagonal (0 is the main diagonal)</mark>** | Copy | Create a **<mark>2D identity matrix (diagonal elements are 1, others are 0).</mark>** | Useful for linear algebra and transformations. |
| `identity()` | `np.identity(n, dtype=None)` | `ndarray` | `n`: Number of rows (and columns) in the output | Copy | Create a square identity matrix of given size. | **<mark>Similar to `eye()`, but only creates square matrices.</mark>** |

**`Note`** : <mark>**For multi-dimensional arrays the shape is given as a tuple, e.g. `(2, 3)`; for a 1D array a single integer such as `8` is enough.**</mark>

1. **`np.array()`:**
    - The most basic way to create a NumPy array is **<mark>by converting an existing Python list or tuple.</mark>**

In [4]:
import numpy as np

arr_from_list = np.array([1, 2, 3, 4, 5])
print(arr_from_list)
print(type(arr_from_list))

[1 2 3 4 5]
<class 'numpy.ndarray'>


**`Note`** : **<mark>Observe that, unlike a printed Python list, a printed NumPy array shows no commas between its elements.</mark>**
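The same call also accepts tuples and nested lists. The short sketch below (variable names are illustrative, not part of the original notebook) shows how the nesting of the input decides the shape, and how mixed numeric types are converted to a single data type because an array is homogeneous.

```python
import numpy as np

# A tuple works the same way as a list.
arr_from_tuple = np.array((1, 2, 3))

# A nested list produces a 2D array (2 rows, 3 columns).
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

print(arr_from_tuple.shape)  # (3,)
print(arr_2d.shape)          # (2, 3)
print(arr_2d.ndim)           # 2

# Mixing ints and floats upcasts every element to float,
# because an array holds exactly one data type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)           # float64
```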

2. **`np.zeros()`:**
   - Creates an array filled with zeros of a specified shape.

In [5]:
import numpy as np

zeros_array = np.zeros((2, 3))
print(zeros_array)

# Note: (2, 3) is a tuple
# Why is there a '.' after each 0? The default dtype is float64, so each value is the float 0.0

[[0. 0. 0.]
 [0. 0. 0.]]


3. **`np.ones()`:**
   - Creates an array filled with ones of a specified shape.

In [6]:
import numpy as np

ones_array = np.ones((3, 2))
print(ones_array)

# Note: (3, 2) is a tuple
# Why is there a '.' after each 1? The default dtype is float64, so each value is the float 1.0

[[1. 1.]
 [1. 1.]
 [1. 1.]]


4. **`np.full()`:**
   - Creates an array of a specified shape and fills it with a constant value.

In [1]:
import numpy as np

full_array = np.full((3, 2),6)
print(full_array)

# Note: (3, 2) is a tuple
# Why is there no '.' after each 6? full() infers the dtype from fill_value, and 6 is an integer

[[6 6]
 [6 6]
 [6 6]]
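The presence or absence of the trailing `.` in the three outputs above comes down to the array's data type. A minimal sketch for checking this yourself, assuming only the `dtype` attribute and the optional `dtype` argument (variable names are illustrative):

```python
import numpy as np

zeros_array = np.zeros((2, 3))           # default dtype
full_int = np.full((3, 2), 6)            # dtype inferred from the int 6
full_float = np.full((3, 2), 6.0)        # dtype inferred from the float 6.0
zeros_int = np.zeros((2, 3), dtype=int)  # dtype requested explicitly

print(zeros_array.dtype)  # float64
print(full_int.dtype)     # an integer type (int64 on most platforms)
print(full_float.dtype)   # float64
print(zeros_int.dtype)    # an integer type
```

Passing `dtype=` is the usual way to get integer zeros or ones when the float default is not wanted.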


5. **`np.arange()`:**
   - Generates an array with <mark>`regularly spaced values within a given range.`</mark>

In [7]:
import numpy as np

range_array = np.arange(0, 10, 2)  # start, stop (exclusive), step
print(range_array)

[0 2 4 6 8]


#### Explanation:

**Is stop inclusive or exclusive in case of arange()?**

- In the case of np.arange(), **<mark>the stop value is exclusive</mark>**, meaning the generated array stops just before reaching the specified stop value. In the example, the array includes values up to, but not including, 10.
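A few more calls make the exclusive `stop` easier to see. This is a small sketch with illustrative variable names; it also shows that `start` defaults to 0 and that a negative `step` counts downwards.

```python
import numpy as np

# Only stop given: start defaults to 0, step to 1.
up_to_five = np.arange(5)
print(up_to_five.tolist())   # [0, 1, 2, 3, 4]

# stop (10) is never produced, even though it fits the step.
evens = np.arange(2, 10, 2)
print(evens.tolist())        # [2, 4, 6, 8]

# A negative step counts down; stop (0) is still excluded.
countdown = np.arange(10, 0, -3)
print(countdown.tolist())    # [10, 7, 4, 1]

# With non-integer steps the number of elements can be hard to
# predict because of floating-point rounding; np.linspace() is
# usually the safer choice there.
```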

6. **`np.linspace()`:**
    - Generates an array with a <mark>`specified number of evenly spaced values within a given range.`</mark>

In [2]:
import numpy as np

linspace_array = np.linspace(0, 1, 5)  # start, stop, number of points
print(linspace_array)

# Is the stop value (1) included in the output?

[0.   0.25 0.5  0.75 1.  ]


#### Explanation :

- **`np.linspace(0, 1, 5)`:**
    
    - The `linspace()` function takes three main arguments: `start`, `stop`, and `num` (number of points).
    - Here it generates 5 evenly spaced points between **<mark>0 and 1 (both inclusive).</mark>**
- **Does the number 1 (stop) get included in the output?**
    
    - **Yes, the number 1 (the `stop` value) is included in the output by default**, because `linspace()` treats both `start` and `stop` as endpoints of the range. Passing `endpoint=False` excludes it.
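The inclusive behaviour can be switched off. The sketch below uses the `endpoint` and `retstep` keyword arguments of `np.linspace()`; variable names are illustrative.

```python
import numpy as np

# Default: stop is included, so the spacing is (1 - 0) / (5 - 1) = 0.25.
with_stop = np.linspace(0, 1, 5)
print(with_stop)      # ends at 1.0

# endpoint=False: stop is excluded, so the spacing is (1 - 0) / 5 = 0.2.
without_stop = np.linspace(0, 1, 5, endpoint=False)
print(without_stop)   # ends at 0.8

# retstep=True also returns the spacing that was used.
values, step = np.linspace(0, 1, 5, retstep=True)
print(step)           # 0.25
```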

7. **`np.logspace()`:**
- Generate an array with a <mark>`specified number of logarithmically spaced values.`</mark>

In [2]:
import numpy as np

logspace_array = np.logspace(0, 1, 5)  # start, stop, number of points
print(logspace_array)

[ 1.          1.77827941  3.16227766  5.62341325 10.        ]


#### Explanation:

- `np.logspace(0, 1, 5)`: This function generates an array of values on a logarithmic scale. The parameters are as follows:
  - `0`: Start exponent of the sequence. In this case, it starts at 10^0, which is 1.
  - `1`: Stop exponent of the sequence. It stops at 10^1, which is 10.
  - `5`: Number of points to generate. In this case, it will generate 5 points.


In [2]:
import numpy as np

logspace_array = np.logspace(0, 2, 5)  # start, stop, number of points
print(logspace_array)

# NOTE: Here the values lie between 10^0 (= 1) and 10^2 (= 100)

[  1.           3.16227766  10.          31.6227766  100.        ]
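One way to read `logspace()` is as "evenly spaced exponents, then raise the base to them". The sketch below checks that reading against `np.linspace()` and also shows the `base` keyword; variable names are illustrative.

```python
import numpy as np

# Evenly spaced exponents 0, 0.5, 1, 1.5, 2 ...
exponents = np.linspace(0, 2, 5)
# ... raised to base 10 give the logspace values.
by_hand = 10.0 ** exponents
from_logspace = np.logspace(0, 2, 5)
print(np.allclose(by_hand, from_logspace))  # True

# The base defaults to 10 but can be changed: 2^0 ... 2^4.
powers_of_two = np.logspace(0, 4, 5, base=2)
print(powers_of_two.tolist())  # 1, 2, 4, 8, 16 as floats
```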


8. **`np.empty()`:**
- `Create an uninitialized array with shape and dtype`.
- Content of the array is not initialized; **`it may contain garbage values.`**

In [6]:
import numpy as np
np.empty((2,3))

array([[0., 0., 0.],
       [0., 0., 0.]])

#### More about syntax of empty()

The syntax for the `empty()` function is as follows:

```
numpy.empty(shape, dtype=float, order='C')

```

- `shape`: Tuple specifying the dimensions of the array.
- `dtype`: Data type of the array. <mark>Optional, and the default is `float64`.</mark>
- `order`: **<mark>Specifies whether to store the array data in row-major (`'C'`, default) or column-major (`'F'`) order.</mark>**

**Explanation of Order Parameter:**

`'C' (C-style, row-major):` <mark>In a 2-dimensional array, **elements are stored row by row**. This means that **elements of the same row are adjacent in memory**.</mark>

`'F' (Fortran-style, column-major):` <mark>In a 2-dimensional array, **elements are stored column by column**. This means that **elements of the same column are adjacent in memory.**</mark>

The choice between 'C' and 'F' can affect performance in certain operations, especially when working with large arrays or performing operations that involve accessing elements in a specific order (such as matrix multiplication). By default, 'C' (row-major) is used, which is generally suitable for most applications unless you specifically need column-major storage for compatibility with Fortran or certain numerical algorithms.
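The difference between the two layouts is visible through an array's `flags` and `strides` attributes. The following sketch assumes the default `float64` dtype (8 bytes per element); variable names are illustrative.

```python
import numpy as np

c_array = np.zeros((2, 3), order='C')  # row-major
f_array = np.zeros((2, 3), order='F')  # column-major

print(c_array.flags['C_CONTIGUOUS'])  # True
print(f_array.flags['F_CONTIGUOUS'])  # True

# strides = bytes to jump to reach the next row / next column.
# Row-major: the next column is 8 bytes away, the next row 3 * 8 = 24.
print(c_array.strides)  # (24, 8)
# Column-major: the next row is 8 bytes away, the next column 2 * 8 = 16.
print(f_array.strides)  # (8, 16)

# The layout changes how data sits in memory, not the values or shape.
print(np.array_equal(c_array, f_array))  # True
```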


#### `empty()` in detail: when to use it and when not to

```
import numpy as np

empty_array = np.empty((2, 3))
print(empty_array)

```

In this example, `empty((2, 3))` creates a 2x3 array without initializing the elements. <mark>**The actual values in the array will depend on the current state of the memory and are not meaningful. It's important to note that if you need an array with initialized values, you should consider using `zeros()`, `ones()`, or another appropriate function depending on your requirements. The `empty()` function is typically used when you need to allocate memory for an array but do not necessarily care about the initial values.**</mark>
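A typical pattern, sketched here with illustrative names, is to allocate with `empty()` and then overwrite every element before reading any of them. The loop is only for demonstration; the point is that no element is read before it is assigned.

```python
import numpy as np

# Allocate space for 5 floats; the contents are arbitrary at this point.
squares = np.empty(5)

# Overwrite every slot before the array is used anywhere.
for i in range(5):
    squares[i] = i ** 2

print(squares.tolist())  # [0.0, 1.0, 4.0, 9.0, 16.0]
```

If there is any chance that some elements are read before being written, `zeros()` is the safer choice.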

#### How to create a 1D array using `empty()`, `ones()`, or `zeros()`?

In [6]:
empty_array_1D = np.empty(8) 
print(empty_array_1D)

# NOTE : Similarly we can create 1D array using ones(), zeros() etc 

[0. 0. 0. 0. 0. 0. 0. 0.]


In [8]:
ones_array_1D = np.ones(5)
print(ones_array_1D)

[1. 1. 1. 1. 1.]


9. **`np.eye()`:**
    - Create a **<mark>2D</mark>** identity matrix (<mark>**diagonal elements are 1, others are 0**</mark>).
    - With `eye()` you can also create a matrix that is not square (by passing `M`) and shift the diagonal of ones (by passing `k`).

In [1]:
import numpy as np

# Create a 3x3 identity matrix
eye_matrix = np.eye(3)
print("3x3 Identity Matrix:")
print(eye_matrix)

# Create a 4x5 identity matrix with the main diagonal shifted by 1
eye_shifted = np.eye(4, 5, k=1)
print("\n4x5 Identity Matrix with Diagonal Shifted by 1:")
print(eye_shifted)


3x3 Identity Matrix:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

4x5 Identity Matrix with Diagonal Shifted by 1:
[[0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]


10. **`np.identity()`:**

- Create a **<mark>square identity matrix</mark>** of given size.

In [2]:
import numpy as np

# Create a 2x2 identity matrix
identity_matrix = np.identity(2)
print("2x2 Identity Matrix:")
print(identity_matrix)

# Create a 5x5 identity matrix
identity_5x5 = np.identity(5)
print("\n5x5 Identity Matrix:")
print(identity_5x5)


2x2 Identity Matrix:
[[1. 0.]
 [0. 1.]]

5x5 Identity Matrix:
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
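To compare the two functions side by side, the sketch below (illustrative names) checks that they agree for square matrices and shows the two things only `eye()` can do: rectangular shapes and shifted diagonals.

```python
import numpy as np

# For a square matrix both functions give the same result.
print(np.array_equal(np.eye(3), np.identity(3)))  # True

# Only eye() accepts a separate column count ...
rect = np.eye(2, 3)
print(rect.tolist())   # [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

# ... and a diagonal offset; a negative k moves the ones below the main diagonal.
below = np.eye(3, k=-1)
print(below.tolist())  # [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
```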


**Real-world Examples:**

1. **Temperature Data:**
   - **Scenario:** You have daily temperature readings for a week.
   - **Application:** Create a NumPy array to store and manipulate temperature data.

In [9]:
temperature_readings = np.array([22.5, 24.0, 23.8, 25.3, 21.7, 22.1, 23.5])
print(temperature_readings)

[22.5 24.  23.8 25.3 21.7 22.1 23.5]


2. **Financial Modeling:**
   - **Scenario:** You are modeling monthly sales figures for a company.
   - **Application:** Use `np.zeros()` to initialize an array for monthly sales and update it as new data becomes available.

In [10]:
monthly_sales = np.zeros(12)
print(monthly_sales)

# Note: the output is 1D because the shape is a single integer (12) rather than a tuple such as (3, 4)

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]


3. **Time Series Analysis:**
   - **Scenario:** You need a time series of timestamps at regular intervals.
   - **Application:** Generate a time series using `np.arange()` or `np.linspace()`.

In [11]:
from datetime import datetime, timedelta

start_date = datetime(2022, 1, 1)
time_series = np.array([start_date + timedelta(days=i) for i in range(10)])
print(time_series)


[datetime.datetime(2022, 1, 1, 0, 0) datetime.datetime(2022, 1, 2, 0, 0)
 datetime.datetime(2022, 1, 3, 0, 0) datetime.datetime(2022, 1, 4, 0, 0)
 datetime.datetime(2022, 1, 5, 0, 0) datetime.datetime(2022, 1, 6, 0, 0)
 datetime.datetime(2022, 1, 7, 0, 0) datetime.datetime(2022, 1, 8, 0, 0)
 datetime.datetime(2022, 1, 9, 0, 0) datetime.datetime(2022, 1, 10, 0, 0)]
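The same ten daily timestamps can also be produced directly with `np.arange()`, using NumPy's own `datetime64` type instead of Python `datetime` objects. This is an alternative sketch, not the approach used in the cell above; as with numbers, the stop date is exclusive.

```python
import numpy as np

# Daily dates from 2022-01-01 up to (but not including) 2022-01-11.
dates = np.arange('2022-01-01', '2022-01-11', dtype='datetime64[D]')

print(dates.shape)          # (10,)
print(dates[0], dates[-1])  # 2022-01-01 2022-01-10
print(dates.dtype)          # datetime64[D]
```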


4. **Physical Simulation:**
   - **Scenario:** Simulating the trajectory of a projectile over time.
   - **Application:** Use `np.linspace()` to generate time intervals and create an array representing the projectile's position at each time step.

In [2]:
import numpy as np
time_intervals = np.linspace(0, 5, 100)  # 100 time points from 0 to 5 seconds
print(time_intervals)

[0.         0.05050505 0.1010101  0.15151515 0.2020202  0.25252525
 0.3030303  0.35353535 0.4040404  0.45454545 0.50505051 0.55555556
 0.60606061 0.65656566 0.70707071 0.75757576 0.80808081 0.85858586
 0.90909091 0.95959596 1.01010101 1.06060606 1.11111111 1.16161616
 1.21212121 1.26262626 1.31313131 1.36363636 1.41414141 1.46464646
 1.51515152 1.56565657 1.61616162 1.66666667 1.71717172 1.76767677
 1.81818182 1.86868687 1.91919192 1.96969697 2.02020202 2.07070707
 2.12121212 2.17171717 2.22222222 2.27272727 2.32323232 2.37373737
 2.42424242 2.47474747 2.52525253 2.57575758 2.62626263 2.67676768
 2.72727273 2.77777778 2.82828283 2.87878788 2.92929293 2.97979798
 3.03030303 3.08080808 3.13131313 3.18181818 3.23232323 3.28282828
 3.33333333 3.38383838 3.43434343 3.48484848 3.53535354 3.58585859
 3.63636364 3.68686869 3.73737374 3.78787879 3.83838384 3.88888889
 3.93939394 3.98989899 4.04040404 4.09090909 4.14141414 4.19191919
 4.24242424 4.29292929 4.34343434 4.39393939 4.44444444 4.4949
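To complete the scenario, the time grid can be turned into positions. The sketch below assumes a projectile launched straight up with an initial speed of 24.5 m/s and gravity of 9.8 m/s², values chosen here only so that it lands at exactly 5 seconds; neither value comes from the original notebook.

```python
import numpy as np

time_intervals = np.linspace(0, 5, 100)  # 100 time points from 0 to 5 seconds

v0 = 24.5  # assumed initial upward speed in m/s (illustrative)
g = 9.8    # assumed gravitational acceleration in m/s^2

# Height at every time step, computed for the whole array at once:
# y(t) = v0 * t - 0.5 * g * t^2
height = v0 * time_intervals - 0.5 * g * time_intervals ** 2

print(height[0])     # 0.0 at launch
print(height.max())  # close to the theoretical peak of 30.625 m
```

Because the arithmetic is applied element by element, no explicit loop over the time steps is needed.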

## **Extra Innings**

#### `Flatten a 2D NumPy array into a 1D NumPy array`

**ravel()** :

- In NumPy, **<mark>ravel() is a function that returns a flattened one-dimensional array from a multi-dimensional array</mark>**. It returns a contiguous flattened array. **<mark>A flattened array is a 1D array containing all the elements of the original array in row-major (C-style) order.</mark>**

In [3]:
arr = np.array([[1,2,3],[3,4,5]])
arr.ravel()

array([1, 2, 3, 3, 4, 5])

In [2]:
import numpy as np
arr = np.array([[1,2,3],[3,4,5]])
array_1D = arr.ravel()
print(array_1D)

[1 2 3 3 4 5]
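One detail worth knowing is that `ravel()` returns a view of the original data whenever it can, while the related `flatten()` method always returns a copy. The sketch below (illustrative names) shows the practical consequence.

```python
import numpy as np

arr = np.array([[1, 2, 3], [3, 4, 5]])

raveled = arr.ravel()      # view of arr's data (arr is contiguous)
flattened = arr.flatten()  # independent copy

# Writing through the view changes the original array.
raveled[0] = 99
print(arr[0, 0])           # 99

# Writing to the copy leaves the original untouched.
flattened[1] = -1
print(arr[0, 1])           # 2

print(np.shares_memory(arr, raveled))    # True
print(np.shares_memory(arr, flattened))  # False
```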


#### Why is there a difference in output between the two snippets below?

In [None]:
import numpy as np

arr = np.array([[1,2,3],[3,4,5]])

array_1D = arr.ravel()

print(array_1D)

In [None]:
arr = np.array([[1,2,3],[3,4,5]])

arr.ravel()

#### Explanation

- `First Snippet`: `print(array_1D)` displays the string form of the flattened array, which gives the compact, list-like output `[1 2 3 3 4 5]` without commas and without marking it as a NumPy array object.
- `Second Snippet`: Without `print`, the return value of `arr.ravel()` is shown automatically by the interactive environment using its `repr`, which gives `array([1, 2, 3, 3, 4, 5])` with the `array(...)` wrapper and commas.

In both snippets `ravel()` flattens the 2D array into a 1D array. The second snippet simply does not assign the result to a variable or print it, so in a plain script (outside an interactive session) that line would display nothing.
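The two display forms can be reproduced explicitly with Python's built-in `str()` and `repr()`, which is a quick way to confirm what `print` and the interactive display each use. Variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [3, 4, 5]])
flat = arr.ravel()

# print() uses str(): compact, no commas.
print(str(flat))   # [1 2 3 3 4 5]

# The interactive display uses repr(): array(...) wrapper with commas.
print(repr(flat))  # array([1, 2, 3, 3, 4, 5])
```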