In [1]:
# ------------------------------------------------ Numpy Datatypes -----------------------------------------------
## Contents:--
    #-- Integer
    #-- Float
    #-- Complex
    #-- Boolean
    #-- String
    #-- Object

### **Integer Data Types in NumPy**
NumPy provides multiple **integer data types**, each using a different amount of memory and representing a different numeric range.  
These are useful for optimizing **memory usage** and **performance** when working with large datasets.

---
**🔹 Common Integer Types**
| Data Type | Bytes | Range |
|------------|--------|--------|
| `int8`  | 1 byte | -128 to 127 |
| `int16` | 2 bytes | -32,768 to 32,767 |
| `int32` | 4 bytes | -2,147,483,648 to 2,147,483,647 |
| `int64` | 8 bytes | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
---
➡️ **Why Different Integer Types Matter**
1. **Memory Efficiency:** Choosing `int8` or `int16` saves memory for large arrays of small integers.
2. **Performance:** Smaller data types can improve processing speed, especially when memory bandwidth is a constraint.
3. **Range Limitations:** A smaller integer type such as `int8` only holds values from -128 to 127. Values outside that range either wrap around silently or raise an error, depending on the operation and the NumPy version.
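
The ranges in the table can be checked with `np.iinfo`. The following is a minimal sketch: the sample values and the names `wide` and `narrow` are chosen only for illustration, and the wrap-around relies on the default (unsafe) casting of `astype`.

```python
import numpy as np

# Query the representable range of each integer type
for dtype in (np.int8, np.int16, np.int32, np.int64):
    info = np.iinfo(dtype)
    print(f"{info.dtype}: {info.min} to {info.max}")

# 200 does not fit in int8 (max 127); astype wraps it around silently
wide = np.array([100, 200], dtype=np.int64)
narrow = wide.astype(np.int8)
print(narrow)  # [100 -56]
```

Checking `np.iinfo` before down-casting is one way to avoid this kind of silent wrap-around.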

In [3]:
import numpy as np

arr_8  = np.array([1, 2, 3], dtype=np.int8)
arr_64 = np.array([1, 2, 3], dtype=np.int64)

print("int8 array size:", arr_8.nbytes, "bytes")
print("int64 array size:", arr_64.nbytes, "bytes")

int8 array size: 3 bytes
int64 array size: 24 bytes


In [6]:
# Example: Creating Integer Arrays

arr_int = np.array([1, 2, 3], dtype=np.int32)
print(arr_int)

print("Data type:", arr_int.dtype)

[1 2 3]
Data type: int32


### **Float and Complex Data Types in NumPy**
NumPy provides several floating-point and complex number data types that allow precise numerical computation, scientific calculations, and matrix operations.

---
**🔹 Floating-Point Types**
| Data Type | Bytes | Precision | Typical Use |
|------------|--------|------------|--------------|
| `float16` | 2 bytes | Low (3–4 decimal digits) | Memory-efficient storage, less precision |
| `float32` | 4 bytes | Medium (6–7 decimal digits) | Standard floating-point precision |
| `float64` | 8 bytes | High (15–16 decimal digits) | Default and most accurate |
---
**✅ Key Takeaways**
1. Use `float64` when precision matters most; it is NumPy's default float type.
2. Use `float16` or `float32` when working with limited memory or GPU computations.
3. NumPy's `complex64` and `complex128` efficiently handle complex numbers.
4. Access real and imaginary components using `.real` and `.imag`.
---
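
The precision figures above can be inspected with `np.finfo`. This is a small sketch rather than a benchmark: the test value `1 / 3` and the variable names are illustrative assumptions, and `precision` reports a conservative digit count, so it matches the lower end of each range in the table.

```python
import numpy as np

for dtype in (np.float16, np.float32, np.float64):
    info = np.finfo(dtype)
    # precision: approximate number of reliable decimal digits
    print(f"{info.dtype}: {info.bits} bits, ~{info.precision} digits, eps={info.eps}")

# The same value stored at three different precisions
value = 1 / 3
as_f16 = np.float16(value)
as_f32 = np.float32(value)
as_f64 = np.float64(value)
print(as_f16, as_f32, as_f64)
```

The rounding error grows as the type gets narrower, which is the trade-off against the memory saved.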

**🔹 Example: Creating Float Arrays**

In [10]:
import numpy as np

arr_float = np.array([1.0, 2.5, 3.8], dtype=np.float64)
print(arr_float)

print("Data type:", arr_float.dtype)

[1.  2.5 3.8]
Data type: float64


**🔹 Complex Number Types**
| Data Type    | Bytes    | Description                                   |
| ------------ | -------- | --------------------------------------------- |
| `complex64`  | 8 bytes  | 32-bit real + 32-bit imaginary part           |
| `complex128` | 16 bytes | 64-bit real + 64-bit imaginary part (default) |

In [9]:
arr_complex = np.array([1 + 2j, 3 + 4j], dtype=np.complex128)
print(arr_complex)

print("Data type:", arr_complex.dtype)

[1.+2.j 3.+4.j]
Data type: complex128


**🔹 Accessing Real and Imaginary Parts**

In [11]:
arr = np.array([2 + 3j, 4 + 5j])

print("Real part:", arr.real) # Real part
print("Imaginary part:", arr.imag) # Imaginary part

Real part: [2. 4.]
Imaginary part: [3. 5.]
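
To connect `.real` and `.imag` with the sizes in the complex types table, the sketch below checks the dtype of each component. The names `c64` and `c128` are illustrative, not from the original notebook.

```python
import numpy as np

c64 = np.array([1 + 2j, 3 + 4j], dtype=np.complex64)
c128 = np.array([1 + 2j, 3 + 4j], dtype=np.complex128)

# Each complex element is a pair of floats, each half the total width
print(c64.real.dtype, c64.itemsize)    # float32 8
print(c128.imag.dtype, c128.itemsize)  # float64 16
```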


### **Boolean Data Type in NumPy**
NumPy supports **Boolean arrays**, highly useful for **logical operations**, **filtering**, and **conditional indexing**.

---
**➡️ Key Takeaways**
- Use `np.bool_` to create Boolean arrays.
- Boolean arrays are essential for logical operations, filtering, and masking.
- Comparisons on NumPy arrays automatically return Boolean arrays.
- Boolean arrays are memory-efficient and fast for vectorized operations.
---
**➡️ Syntax**: 
``` python
arr_bool = np.array([True, False, True], dtype=np.bool_)
print(arr_bool)     # Output: [ True False  True]
```
- `np.bool_` (notice the underscore) is the NumPy Boolean data type.
- Each element is stored in a single byte holding 1 (`True`) or 0 (`False`).
- Plain `np.bool` was deprecated in NumPy 1.20 and removed in 1.24; NumPy 2.0 brought it back as a name for the same type. `np.bool_` works across all of these versions.

**🔹 Creating a Boolean Array from Conditions**

In [14]:
arr = np.array([10, 20, 30, 40, 50])
bool_arr = arr > 25

print(bool_arr)

[False False  True  True  True]


**🔹 Using Boolean Arrays for Filtering**

In [13]:
arr = np.array([10, 20, 30, 40, 50])
filtered = arr[arr > 25]

print(filtered)

[30 40 50]


In [15]:
import numpy as np

# Create an array of integers
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
even_mask = (arr % 2 == 0) # Boolean mask for even numbers

print(even_mask)

[False  True False  True False  True False  True False  True]
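
Masks like the one above can be combined before filtering. The following sketch assumes a second, hypothetical mask `large_mask` (values greater than 5) to show the element-wise operators `&`, `|`, and `~`.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

even_mask = arr % 2 == 0
large_mask = arr > 5  # illustrative second condition

# Combine masks element-wise with & (and), | (or), ~ (not)
print(arr[even_mask & large_mask])  # [ 6  8 10]
print(arr[even_mask | large_mask])  # [ 2  4  6  7  8  9 10]
print(arr[~even_mask])              # [1 3 5 7 9]

# True counts as 1, so summing a mask counts the matches
print(even_mask.sum())              # 5
```

Python's `and`, `or`, and `not` do not work element-wise on arrays, which is why the bitwise operators are used here.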


### **Strings and Object Data Types in NumPy**
NumPy allows storing **strings** and even **generic Python objects** in arrays.  
While NumPy is primarily designed for numerical data, it provides support for string and object data types for special use cases.

---
➡️ **Summary**
| Data Type | Example Syntax | Description                            |
| --------- | -------------- | -------------------------------------- |
| `S`       | `dtype='S5'`   | Fixed-length byte strings (ASCII only) |
| `str`     | `dtype=str`    | Fixed-width Unicode strings; width inferred from the longest element |
| `object`  | `dtype=object` | Allows mixed or complex data (slower)  |

**🔹 Fixed-Length Strings (`dtype = 'S'`)**: NumPy supports **fixed-length byte strings**, specified using `S` followed by the maximum number of bytes per element. Longer inputs are truncated.

In [None]:
import numpy as np

arr_str = np.array(['a', 'bc', 'def'], dtype='S3')  # 'S3' → each element can store up to 3 bytes (characters)

print(arr_str)
print("Data type:", arr_str.dtype)

[b'a' b'bc' b'def']
Data type: |S3


**🔹 Inferred-Width Strings (`dtype = str`)**: When you use `dtype = str`, NumPy picks one fixed width for the whole array, sized to fit the longest string. Individual elements are not variable-length.

In [19]:
import numpy as np

# dtype=str → NumPy infers one width from the longest string (<U6 means Unicode strings of at most 6 characters).
arr_exercise_str = np.array(['apple', 'banana', 'cherry'], dtype=str)

print(arr_exercise_str)
print("Data type:", arr_exercise_str.dtype)

['apple' 'banana' 'cherry']
Data type: <U6
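
Because the width is fixed once the array is created, assigning a longer string later truncates it. The sketch below illustrates this; the names `fruits` and `roomy` and the width `U10` are assumptions made for the example.

```python
import numpy as np

fruits = np.array(['apple', 'banana', 'cherry'], dtype=str)
print(fruits.dtype)  # <U6 on little-endian platforms

# 'watermelon' has 10 characters; only the first 6 fit
fruits[0] = 'watermelon'
print(fruits[0])     # waterm

# Requesting a wider dtype up front leaves room for longer strings
roomy = np.array(['apple', 'banana', 'cherry'], dtype='U10')
roomy[0] = 'watermelon'
print(roomy[0])      # watermelon
```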


**🔹 Object Data Type (`dtype = object`)**: The object data type allows NumPy arrays to store mixed data types (integers, floats, strings, etc.).

In [20]:
# dtype=object enables arrays to store any Python object: numbers, strings, lists, or custom objects.
arr_obj = np.array([1, 'a', 3.5], dtype=object)

print(arr_obj)
print("Data type:", arr_obj.dtype)

[1 'a' 3.5]
Data type: object
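
One way to see why object arrays are slower is that every operation is handed to the individual Python objects. A minimal sketch, reusing the same sample values:

```python
import numpy as np

arr_obj = np.array([1, 'a', 3.5], dtype=object)

# Each element keeps its original Python type
print([type(x).__name__ for x in arr_obj])  # ['int', 'str', 'float']

# The * operator is applied by each Python object in turn,
# so the string is repeated while the numbers are doubled
doubled = arr_obj * 2
print(doubled)  # [2 'aa' 7.0]
```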


##### **➡️ Example:** Compare the results when strings are declared with dtype `'S5'` vs. `str`

In [21]:
import numpy as np

# Using S5 datatype
arr_exercise_str = np.array(['apple', 'banana', 'cherry'], dtype='S5')
print(arr_exercise_str)

# Use S6 to avoid truncation
arr_exercise_str = np.array(['apple', 'banana', 'cherry'], dtype='S6')
print(arr_exercise_str)

# Use str to auto detect the string length
arr_exercise_str = np.array(['apple', 'banana', 'cherry'], dtype='str')
print(arr_exercise_str)

[b'apple' b'banan' b'cherr']
[b'apple' b'banana' b'cherry']
['apple' 'banana' 'cherry']


### **Data Type Conversion in NumPy (`astype()` Method)**
NumPy provides a simple and powerful way to **convert arrays from one data type to another** using the `astype()` method.  
This is especially useful when you need to perform mathematical operations, save memory, or prepare data for specific computations.

---
**🔹 Syntax**: **`array.astype(new_dtype)`**
- `new_dtype` → The target data type (e.g., `int`, `float`, `str`, or `'float64'`, `'int32'`, etc.)
- Returns a new array with the converted data type (the original array remains unchanged).
---
**➡️ Common Conversions**
| Conversion      | Code                | Notes                                   |
| --------------- | ------------------- | --------------------------------------- |
| `int` → `float` | `arr.astype(float)` | Convert for precise math operations     |
| `float` → `int` | `arr.astype(int)`   | Truncates decimals (rounds toward zero) |
| `int` → `str`   | `arr.astype(str)`   | For text display or labeling            |
| `bool` → `int`  | `arr.astype(int)`   | `True → 1`, `False → 0`                 |
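
The float and Boolean rows of the table can be sketched as follows. The sample arrays `floats` and `flags` are illustrative and were chosen to include a negative value.

```python
import numpy as np

floats = np.array([1.9, -1.9, 2.5])
flags = np.array([True, False, True])

ints = floats.astype(int)   # fractional part is dropped (toward zero)
print(ints)                 # [ 1 -1  2]
print(flags.astype(int))    # [1 0 1]

# astype returns a new array; the source keeps its dtype and values
print(floats.dtype, floats)
```

Note that `-1.9` becomes `-1`, not `-2`: the conversion truncates rather than rounding down.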


**🔹 Example 1: Integer to String Conversion**

In [24]:
import numpy as np

int_array = np.array([1, 2, 3, 4, 5]) # Create an array of integers
str_array = int_array.astype(str) # Convert integer array to string array

print(int_array)      # Output: [1 2 3 4 5]
print(str_array)      # Output: ['1' '2' '3' '4' '5']
print(str_array.dtype)  # Output: <U21 when the source is int64 -> NumPy reserves enough width for any int64 value, regardless of the digits actually present.

[1 2 3 4 5]
['1' '2' '3' '4' '5']
<U21
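
If the wide `<U21` dtype is a concern for memory, the strings can be re-cast to the width they actually need. This is one possible approach, not the only one; the names `width` and `compact` are hypothetical.

```python
import numpy as np

int_array = np.array([1, 2, 3, 4, 5])
str_array = int_array.astype(str)

# Length of the longest string actually stored in the array
width = int(np.char.str_len(str_array).max())
print(width)  # 1

# Re-cast to a Unicode dtype of exactly that width
compact = str_array.astype(f'U{width}')
print(compact)
print(compact.itemsize, "bytes per element vs", str_array.itemsize)
```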


**🔹 Example 2: Integer to Float Conversion**

In [25]:
int_array = np.array([1, 2, 3, 4]) # Create an array of integers

float_array = int_array.astype('float') # Convert integer array to float

print(int_array)     # Output: [1 2 3 4]
print(float_array)   # Output: [1. 2. 3. 4.]
print(float_array.dtype)  # Output: float64

[1 2 3 4]
[1. 2. 3. 4.]
float64


In [26]:
import numpy as np

int_array = np.array([10, 20, 30, 40, 50]) # Create an array of integers
str_array = int_array.astype(str) # Correct conversion to string array

print("Integer array:", int_array) # Print the integer array
print("String array:", str_array) # Print the string array

Integer array: [10 20 30 40 50]
String array: ['10' '20' '30' '40' '50']
