In [1]:
import numpy as np

# Characteristics of NumPy Arrays

1. **Homogeneous**
   - All elements in a NumPy array must have the same data type (e.g., `int`, `float`, `bool`).

2. **Multidimensional**
   - NumPy arrays can be 1D, 2D, or n-dimensional, making them flexible for various data representations.

3. **Efficient Memory Usage**
   - NumPy arrays are stored in contiguous blocks of memory, making operations faster and more efficient than Python lists.

4. **Fixed Size**
   - Once a NumPy array is created, its size is fixed: elements cannot be added or removed in place, and functions such as `np.append` return a new array.

5. **Vectorized Operations**
   - NumPy allows element-wise operations without explicit loops, enabling faster computations (e.g., `arr + 10` adds 10 to each element).

6. **Support for Advanced Indexing**
   - NumPy arrays support slicing, boolean indexing, and fancy indexing for data manipulation.

7. **Shape and Dimension**
   - Arrays have attributes like `shape` (size of each dimension) and `ndim` (number of dimensions).

8. **Broadcasting**
   - Enables operations on arrays of different shapes by "stretching" smaller arrays to match the larger shape when possible.

9. **Universal Functions (ufuncs)**
   - NumPy provides element-wise functions called ufuncs (e.g., `np.add`, `np.sqrt`, `np.sin`) that operate efficiently on whole arrays. Aggregations such as `np.sum` and `np.mean` are ordinary array functions rather than ufuncs.

10. **Data Type Flexibility**
    - NumPy supports various data types like integers (`int32`, `int64`), floats (`float32`, `float64`), complex numbers, and even user-defined types.

11. **Integration with Other Libraries**
    - NumPy arrays are widely used and integrated into libraries like Pandas, Matplotlib, and Scikit-learn.

12. **Efficient I/O**
    - Provides methods to read/write arrays to disk, like `np.save` and `np.load`, for fast storage and retrieval.

13. **Interoperability**
    - Supports conversion between Python lists and arrays, as well as compatibility with other array-like objects.
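
A few of the characteristics above can be checked directly. The snippet below is only an illustrative sketch (the variable names `mixed`, `base`, and `extended` are made up for this example): it shows that mixing integers and a float yields one common dtype, that `np.append` returns a new array instead of resizing the original, and that `base + 10` needs no explicit loop.

```python
import numpy as np

# Homogeneous: mixing ints and a float upcasts every element to float64
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)               # float64

# Fixed size: np.append builds a new array; the original is unchanged
base = np.array([1, 2, 3])
extended = np.append(base, 4)
print(base.size, extended.size)  # 3 4

# Vectorized: element-wise addition without an explicit loop
print(base + 10)                 # [11 12 13]
```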


The `shape` attribute of a NumPy array gives its dimensions as a tuple of integers, one entry per dimension (axis).

**Definition:** `array.shape` returns a tuple holding the number of elements along each dimension of the array.

**Structure:**

- A 1D array: `(number_of_elements,)`
- A 2D array: `(number_of_rows, number_of_columns)`
- A 3D array: `(depth, number_of_rows, number_of_columns)`

In [12]:
data_1d = [1, 2, 3, 4, 5]

np_array_1d = np.array(data_1d)

print(f"The 1D numpy array: {np_array_1d}")
print(f"Dimensions: {np_array_1d.ndim}")
print(f"Shape: {np_array_1d.shape}")
print(f"Size (no. of elements): {np_array_1d.size}")
print(f"Data Type of the numpy array: {np_array_1d.dtype}")

The 1D numpy array: [1 2 3 4 5]
Dimensions: 1
Shape: (5,)
Size (no. of elements): 5
Data Type of the numpy array: int32


In [13]:
data_2d = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]

np_array_2d = np.array(data_2d)

print(f"The 2D numpy array: {np_array_2d}")
print(f"Dimensions: {np_array_2d.ndim}")
print(f"Shape: {np_array_2d.shape}")
print(f"Size (no. of elements): {np_array_2d.size}")
print(f"Data Type of the numpy array: {np_array_2d.dtype}")

The 2D numpy array: [[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
Dimensions: 2
Shape: (2, 5)
Size (no. of elements): 10
Data Type of the numpy array: int32


### To create a valid 3D NumPy array, ensure that all sublists have the same shape:

In [14]:
data_3d = [[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]], [[14, 15, 16], [20, 15, 46]]]

np_array_3d = np.array(data_3d)

print(f"The 3D numpy array: {np_array_3d}")
print(f"Dimensions: {np_array_3d.ndim}")
print(f"Shape: {np_array_3d.shape}")
print(f"Size (no. of elements): {np_array_3d.size}")
print(f"Data Type of the numpy array: {np_array_3d.dtype}")

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (2, 2) + inhomogeneous part.

In [15]:
data_3d = [[[1, 2, 3], [6, 7, 8]], [[14, 15, 16], [20, 15, 46]]]

np_array_3d = np.array(data_3d)

print(f"The 3D numpy array: {np_array_3d}")
print(f"Dimensions: {np_array_3d.ndim}")
print(f"Shape: {np_array_3d.shape}")
print(f"Size (no. of elements): {np_array_3d.size}")
print(f"Data Type of the numpy array: {np_array_3d.dtype}")

The 3D numpy array: [[[ 1  2  3]
  [ 6  7  8]]

 [[14 15 16]
  [20 15 46]]]
Dimensions: 3
Shape: (2, 2, 3)
Size (no. of elements): 12
Data Type of the numpy array: int32


## Creating NumPy Arrays

In [21]:
zeros_array = np.zeros((2, 3))
ones_array = np.ones((2, 3))
specific_number_array = np.full((2, 3), 7)
identity_array = np.eye(3)
empty_array = np.empty((2, 3))

sequence_array = np.arange(0, 10, 2)
linspace_array = np.linspace(0, 5, 20)

print("Zeros Array:")
print(zeros_array)
print("\nOnes Array:")
print(ones_array)
print("\nSpecific Number Array (filled with 7):")
print(specific_number_array)
print("\nIdentity Array:")
print(identity_array)
print("\nEmpty Array (uninitialized):")
print(empty_array)
print("\nSequence Array (using np.arange):")
print(sequence_array)
print("\nLinspace Array (using np.linspace):")
print(linspace_array)

Zeros Array:
[[0. 0. 0.]
 [0. 0. 0.]]

Ones Array:
[[1. 1. 1.]
 [1. 1. 1.]]

Specific Number Array (filled with 7):
[[7 7 7]
 [7 7 7]]

Identity Array:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Empty Array (uninitialized):
[[1. 1. 1.]
 [1. 1. 1.]]

Sequence Array (using np.arange):
[0 2 4 6 8]

Linspace Array (using np.linspace):
[0.         0.26315789 0.52631579 0.78947368 1.05263158 1.31578947
 1.57894737 1.84210526 2.10526316 2.36842105 2.63157895 2.89473684
 3.15789474 3.42105263 3.68421053 3.94736842 4.21052632 4.47368421
 4.73684211 5.        ]


### Reading from files

In [24]:
data = np.loadtxt("data.txt")
# print(data)
data

array([[1., 2., 3.],
       [4., 5., 6.],
       [7., 8., 9.]])

In [27]:
data_csv = np.loadtxt("data.csv", delimiter = ',')
data_csv

array([[1., 2., 3.],
       [4., 5., 6.],
       [7., 8., 9.]])

In [36]:
sample_data = np.random.rand(5, 5)

np.savetxt("sample_csv.csv", sample_data, delimiter=',', fmt='%0.2f')

read_sample_data = np.loadtxt("sample_csv.csv", delimiter = ',')
read_sample_data

array([[0.76, 0.97, 0.74, 0.57, 0.48],
       [0.48, 0.18, 0.92, 0.85, 0.88],
       [0.04, 0.28, 0.12, 0.21, 0.6 ],
       [0.77, 0.25, 0.94, 0.55, 0.41],
       [0.37, 0.98, 0.12, 0.97, 0.8 ]])
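
Text files written with `np.savetxt` and `fmt='%0.2f'` keep only two decimal places. When exact values matter, the binary `np.save` / `np.load` pair mentioned earlier is an alternative. The following is a minimal sketch, assuming a temporary directory and a hypothetical file name `example.npy` so that nothing is left on disk.

```python
import os
import tempfile

import numpy as np

original = np.arange(6, dtype=np.float64).reshape(2, 3)

# Write to a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.npy")  # hypothetical file name
    np.save(path, original)    # binary .npy format keeps dtype and shape
    restored = np.load(path)   # fully read into memory before cleanup

print(restored.dtype, restored.shape)      # float64 (2, 3)
print(np.array_equal(original, restored))  # True
```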

## Array Manipulation

In [43]:
# Slicing a NumPy array returns a view: writing to the slice changes the original
org_array = np.array([10, 20, 30, 40, 50])
print(f"The array before: {org_array}")

sliced_arr = org_array[1:4]
sliced_arr[0] = 555

print(f"The array after: {org_array}")

# Slicing a plain Python list returns a copy: the original list is unchanged
org_array = [10, 20, 30, 40, 50]
print(f"\nThe array before: {org_array}")

sliced_arr = org_array[1:4]
sliced_arr[0] = 555

print(f"The array after: {org_array}")

The array before: [10 20 30 40 50]
The array after: [ 10 555  30  40  50]

The array before: [10, 20, 30, 40, 50]
The array after: [10, 20, 30, 40, 50]
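
The output above shows that a NumPy slice shares memory with the original array, while a list slice does not. If an independent slice is needed, one option is to call `.copy()` on it. The sketch below uses illustrative names (`view_slice`, `copy_slice`) and `np.shares_memory` to make the difference visible.

```python
import numpy as np

org_array = np.array([10, 20, 30, 40, 50])

view_slice = org_array[1:4]         # a view: shares memory with org_array
copy_slice = org_array[1:4].copy()  # an independent copy

copy_slice[0] = 555                 # does not touch org_array
print(org_array)                    # [10 20 30 40 50]
print(np.shares_memory(org_array, view_slice))  # True
print(np.shares_memory(org_array, copy_slice))  # False
```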


## Math Operations are element-wise

In [46]:
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([10, 20, 30, 40])

# +, -, /, *, **
print(arr1 + arr2)
print(arr2 ** arr1)

[11 22 33 44]
[     10     400   27000 2560000]
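
For contrast, the same operators mean something different on plain Python lists. This short sketch (with made-up names `values` and `arr`) compares the two behaviours side by side.

```python
import numpy as np

values = [1, 2, 3, 4]
arr = np.array(values)

print(values * 2)       # list repetition: [1, 2, 3, 4, 1, 2, 3, 4]
print(arr * 2)          # element-wise product: [2 4 6 8]

print(values + values)  # list concatenation: [1, 2, 3, 4, 1, 2, 3, 4]
print(arr + arr)        # element-wise sum: [2 4 6 8]
```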


# Broadcasting in NumPy

Broadcasting is a powerful feature in NumPy that allows operations on arrays of different shapes. It eliminates the need to manually reshape or replicate arrays to perform element-wise operations.

---

## How Broadcasting Works
When performing operations between two arrays, NumPy compares their shapes dimension by dimension, starting from the trailing (rightmost) dimension. Two dimensions are **compatible** when:
1. They are **equal**, or
2. One of them is **1**.

If the dimensions are not compatible, broadcasting raises an error. Otherwise, the smaller array is “broadcast” across the larger array to match its shape.

---

## Broadcasting Rules
1. **Compare shapes from right to left**.
2. Apply these conditions for each dimension:
   - If the dimensions match, proceed.
   - If one of the dimensions is `1`, "stretch" it to match the other dimension.
   - If neither condition is met, broadcasting is not possible.
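
The rules above can be tried out without building any arrays. The sketch below assumes NumPy 1.20 or later, where `np.broadcast_shapes` is available; it returns the resulting shape for compatible inputs and raises `ValueError` otherwise. The `compatible` flag is just an illustrative helper variable.

```python
import numpy as np

# Compatible: each pair of trailing dimensions is equal or contains a 1
print(np.broadcast_shapes((2, 3), (3,)))  # (2, 3)
print(np.broadcast_shapes((3, 1), (4,)))  # (3, 4)

# Incompatible: trailing dimensions 3 and 2 differ and neither is 1
try:
    np.broadcast_shapes((2, 3), (2,))
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```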

---

## Examples of Broadcasting

### 1. Adding a Scalar to an Array
A scalar has shape `()` and is broadcast to the shape of the array it is combined with.
```python
import numpy as np

arr = np.array([1, 2, 3])
result = arr + 10
print(result)  # Output: [11 12 13]
```

### 2. Broadcasting 1D and 2D Arrays
The 1D array is broadcast to match the shape of the 2D array.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
arr_1d = np.array([10, 20, 30])

result = arr_2d + arr_1d
print(result)
# Output:
# [[11 22 33]
#  [14 25 36]]
```

### 3. Broadcasting a Column Vector
A column vector is broadcast to match a 2D matrix.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
col_vector = np.array([[10], [20]])

result = arr_2d + col_vector
print(result)
# Output:
# [[11 12 13]
#  [24 25 26]]
```

In [49]:
# Broadcasting multiplication in NumPy

col_3x1 = np.array([[1], [2], [3]])  # shape (3, 1)
row_4 = np.array([1, 2, 3, 4])       # shape (4,)

# (3, 1) * (4,) broadcasts to shape (3, 4)
arr_result = col_3x1 * row_4
print(arr_result)

[[ 1  2  3  4]
 [ 2  4  6  8]
 [ 3  6  9 12]]


## Masking or Boolean Indexing

In [56]:
arr = np.array([10, 20, 25, 30, 40, 60])
mask = (arr > 10) & (arr < 50)

print(f"The filtering: {mask}")
print(f"The array after filtration: {arr[mask]}")

# or, more compactly, index with the condition directly
arr[(arr > 10) & (arr < 50)]

The filtering: [False  True  True  True  True False]
The array after filtration: [20 25 30 40]


array([20, 25, 30, 40])
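
Masks can also be combined with `|` (or), inverted with `~` (not), and used on the left-hand side of an assignment. The following is a small sketch on the same data; the names `outside` and `capped` are chosen only for this example.

```python
import numpy as np

arr = np.array([10, 20, 25, 30, 40, 60])

outside = (arr <= 10) | (arr >= 50)  # OR of two conditions
print(arr[outside])                  # [10 60]
print(arr[~outside])                 # [20 25 30 40]

capped = arr.copy()                  # work on a copy to keep arr intact
capped[capped > 30] = 30             # assign through a boolean mask
print(capped)                        # [10 20 25 30 30 30]
```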

## Data Types in NumPy and Casting

# Data Types in NumPy

NumPy provides a wide range of data types to handle different kinds of numerical and non-numerical data efficiently. Below is an overview of the commonly used data types in NumPy:

## Numerical Data Types
1. **Integer Types**:
   - `int8`: Integer (-128 to 127)
   - `int16`: Integer (-32,768 to 32,767)
   - `int32`: Integer (-2,147,483,648 to 2,147,483,647)
   - `int64`: Integer (-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807)

2. **Unsigned Integer Types**:
   - `uint8`: Unsigned integer (0 to 255)
   - `uint16`: Unsigned integer (0 to 65,535)
   - `uint32`: Unsigned integer (0 to 4,294,967,295)
   - `uint64`: Unsigned integer (0 to 18,446,744,073,709,551,615)

3. **Floating-Point Types**:
   - `float16`: Half precision floating-point
   - `float32`: Single precision floating-point
   - `float64`: Double precision floating-point (default)
   - `float128`: Extended precision floating-point (platform-dependent)

4. **Complex Numbers**:
   - `complex64`: Complex number with 32-bit real and imaginary parts
   - `complex128`: Complex number with 64-bit real and imaginary parts
   - `complex256`: Complex number with 128-bit real and imaginary parts (platform-dependent)

## Non-Numerical Data Types
1. **Boolean**:
   - `bool`: Boolean value (`True` or `False`).

2. **String**:
   - `str_`: Fixed-length Unicode string.
   - `unicode_`: Alias for `str_` in NumPy 1.x; removed in NumPy 2.0.

3. **Object**:
   - `object`: Generic Python object.

4. **Void**:
   - `void`: Represents raw data or custom binary data.
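
The ranges listed above do not have to be memorized. As a brief sketch, `np.iinfo` and `np.finfo` report the limits of integer and floating-point types at run time (the variable `int8_info` is just an illustrative name).

```python
import numpy as np

int8_info = np.iinfo(np.int8)
print(int8_info.min, int8_info.max)  # -128 127
print(np.iinfo(np.uint16).max)       # 65535

print(np.finfo(np.float32).bits)     # 32
print(np.finfo(np.float64).eps)      # smallest step above 1.0 for float64
```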

## Specifying Data Types
You can specify the data type explicitly when creating a NumPy array:
```python
import numpy as np

arr = np.array([1, 2, 3], dtype=np.int32)
print(arr.dtype)  # Output: int32
```


In [64]:
flt_array = np.array([1.1, 2.2, 3.3, 4.4])
print(flt_array.dtype)

# astype returns a new array; float-to-int casting drops the fractional part
int_array = flt_array.astype('int8')
print(int_array)

float64
[1 2 3 4]
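
Casting floats to integers with `astype` drops the fractional part rather than rounding, which matters for negative values. A minimal sketch, using illustrative names and `np.round` as one possible way to round before casting:

```python
import numpy as np

values = np.array([-1.7, -0.2, 1.7])

truncated = values.astype(np.int64)          # fractional part dropped
rounded = np.round(values).astype(np.int64)  # round first, then cast

print(truncated)  # [-1  0  1]
print(rounded)    # [-2  0  2]
```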


# Random Number Generation in NumPy

NumPy provides a **random module** (`np.random`) for generating random numbers and arrays and for random sampling. The most commonly used functions are:

- **Uniform distribution (`np.random.rand`)**: generates values in the range [0, 1) with equal probability. Useful for modeling scenarios where all outcomes are equally likely.
- **Normal distribution (`np.random.randn`)**: generates values from the standard normal distribution (mean 0, variance 1). Common in statistical modeling, where data is expected to follow a Gaussian distribution.
- **Random integer (`np.random.randint`)**: generates random integers in the range [low, high). When called with a single argument, as in `np.random.randint(5)` in the cell below, that argument is the exclusive upper bound and the lower bound is 0, so the result is an integer between 0 and 4.

In [69]:
np.random.seed(42)

# Generate 5 random floats sampled from a uniform distribution over [0, 1)
random_array_uniform = np.random.rand(5)
print("Random Array (Uniform Distribution):")
print(random_array_uniform)

# Generate 5 random floats sampled from a standard normal (Gaussian) distribution (mean 0, variance 1)
random_array_normal_dist = np.random.randn(5)
print("\nRandom Array (Normal Distribution):")
print(random_array_normal_dist)

# Generate a single random integer between 0 (inclusive) and 5 (exclusive)
random_array_int = np.random.randint(5)
print("\nRandom Integer (0 to 4):")
print(random_array_int)

# Generate 5 random integers between 1 (inclusive) and 100 (exclusive)
random_array_integers = np.random.randint(1, 100, size = 5)
print(random_array_integers)

Random Array (Uniform Distribution):
[0.37454012 0.95071431 0.73199394 0.59865848 0.15601864]

Random Array (Normal Distribution):
[ 0.27904129  1.01051528 -0.58087813 -0.52516981 -0.57138017]

Random Integer (0 to 4):
4
[89 49 91 59 42]
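
The functions above belong to NumPy's legacy random interface. The NumPy documentation recommends the newer `Generator` API, created with `np.random.default_rng`, for new code. The sketch below mirrors the cell above with that API; the variable names are illustrative, and the generated values differ from the legacy output even with the same seed because a different algorithm is used.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded Generator for reproducibility

uniform = rng.random(5)                  # floats in [0, 1)
normal = rng.standard_normal(5)          # mean 0, variance 1
integers = rng.integers(1, 100, size=5)  # integers in [1, 100)

# A second Generator with the same seed reproduces the same stream
same_uniform = np.random.default_rng(42).random(5)
print(np.array_equal(uniform, same_uniform))  # True
```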
