## Datatypes

### Topics covered:
- available dtypes, dtype sizes, converting between dtypes

Every NumPy array is a grid of elements of the same type. NumPy provides a large set of numeric datatypes that you can use to construct arrays. NumPy tries to infer a datatype from the values when you create an array, but functions that construct arrays usually also accept an optional `dtype` argument to specify the datatype explicitly.

**Why Do We Have So Many Data Types in NumPy?**
- Memory Efficiency: Different data types require different amounts of memory. For example, an int8 uses 1 byte, while an int64 uses 8 bytes. By choosing the appropriate data type, you can save memory, especially when working with large datasets.

- Performance: Operations on smaller data types can be faster because they require less memory bandwidth and can fit more data into CPU caches. This can lead to improved performance in numerical computations.

- Precision: Different data types offer varying levels of precision. For example, float32 has less precision than float64. Depending on the requirements of your calculations, you may need to choose a data type that provides the necessary precision.
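
The memory and precision points above can be checked directly. The following is a minimal sketch; the array length and the variable names `small` and `large` are chosen only for illustration.

```python
import numpy as np

# One million ones stored with two different integer dtypes
small = np.ones(1_000_000, dtype=np.int8)
large = np.ones(1_000_000, dtype=np.int64)

print(small.nbytes)  # 1000000 (1 byte per element)
print(large.nbytes)  # 8000000 (8 bytes per element)

# Machine epsilon: the gap between 1.0 and the next representable value
eps32 = np.finfo(np.float32).eps
eps64 = np.finfo(np.float64).eps
print(eps32 > eps64)  # True: float32 is coarser than float64

# 0.1 is rounded differently in float32 and float64, so the values differ
print(np.float32(0.1) == np.float64(0.1))  # False
```

`nbytes` reports only the size of the element buffer, which is usually what matters for large arrays.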

Here is an example:

In [2]:
import numpy as np

In [2]:
x = np.array([1, 2])   # Let numpy choose the datatype
print(x.dtype)         # Prints the platform default integer, e.g. "int64" ("int32" on some platforms)

x = np.array([1.0, 2.0])   # Let numpy choose the datatype
print(x.dtype)             # Prints "float64"

x = np.array([1, 2], dtype=np.int64)   # Force a particular datatype
print(x.dtype)                         # Prints "int64"

int32
float64
int64


- **Unsigned data types** are integer types that can represent only non-negative values (0 and positive numbers). They cannot store negative values.

**Unsigned Integer Types and Their Ranges**
```
Data Type   Description               Range
np.uint8    Unsigned 8-bit integer    0 to 255
np.uint16   Unsigned 16-bit integer   0 to 65,535
np.uint32   Unsigned 32-bit integer   0 to 4,294,967,295
np.uint64   Unsigned 64-bit integer   0 to 18,446,744,073,709,551,615
```

**Signed Integer Types and Their Ranges**
```
Data Type   Bits   Minimum Value                Maximum Value
np.int8     8      -128                         127
np.int16    16     -32,768                      32,767
np.int32    32     -2,147,483,648               2,147,483,647
np.int64    64     -9,223,372,036,854,775,808   9,223,372,036,854,775,807
```
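
The ranges in the two tables above do not need to be memorized: `np.iinfo` reports them for any integer dtype. A short sketch, where the `ranges` dictionary is only an illustrative helper:

```python
import numpy as np

int_types = (np.int8, np.int16, np.int32, np.int64,
             np.uint8, np.uint16, np.uint32, np.uint64)

# Map each dtype name to its (minimum, maximum) representable value
ranges = {}
for t in int_types:
    info = np.iinfo(t)
    ranges[np.dtype(t).name] = (info.min, info.max)
    print(f"{np.dtype(t).name:>6}: {info.min} to {info.max}")
```

For floating point types, `np.finfo` plays the same role.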

In [None]:
# available data types

# 1. Integer Types
# Available types: int8, int16, int32, int64
# Create an array of int32
a = np.array([1, 2, 3], dtype=np.int32)
print("Integer Array:", a)
print("Data Type:", a.dtype)

# 2. Unsigned Integer Types: 
# Available types: uint8, uint16, uint32, uint64
# Create an array of uint8
a = np.array([1, 2, 3], dtype=np.uint8)
print("Unsigned Integer Array:", a)
print("Data Type:", a.dtype)

# 3. Floating Point Types
# Available types: float16, float32, float64
# Create an array of float64
a = np.array([1.0, 2.0, 3.0], dtype=np.float64)
print("Floating Point Array:", a)
print("Data Type:", a.dtype)

# 4. Complex Types
# Available types: complex64, complex128
# Create an array of complex128
a = np.array([1+2j, 3+4j], dtype=np.complex128)
print("Complex Array:", a)
print("Data Type:", a.dtype)

# 5. Boolean Type
# Available type: bool_
# Create a boolean array
a = np.array([True, False, True], dtype=np.bool_)
print("Boolean Array:", a)
print("Data Type:", a.dtype)

# 6. String Type
# Available type: str_
# Create an array of strings
a = np.array(['apple', 'banana', 'cherry'], dtype=np.str_)
print("String Array:", a)
print("Data Type:", a.dtype)

# 7. Object Type
# Available type: object_
# Create an array of objects
a = np.array([1, 'apple', 3.14], dtype=np.object_)
print("Object Array:", a)
print("Data Type:", a.dtype)

Integer Array: [1 2 3]
Data Type: int32
Unsigned Integer Array: [1 2 3]
Data Type: uint8
Floating Point Array: [1. 2. 3.]
Data Type: float64
Complex Array: [1.+2.j 3.+4.j]
Data Type: complex128
Boolean Array: [ True False  True]
Data Type: bool
String Array: ['apple' 'banana' 'cherry']
Data Type: <U6
Object Array: [1 'apple' 3.14]
Data Type: object
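
The `<U6` reported for the string array means a fixed-width Unicode string of up to 6 characters, the length of the longest element (`'banana'`). The sketch below shows what that implies for storage and assignment; the replacement value `'watermelon'` is only an illustration.

```python
import numpy as np

fruits = np.array(['apple', 'banana', 'cherry'])

print(fruits.dtype.kind)      # U (Unicode string)
print(fruits.dtype.itemsize)  # 24: 6 characters x 4 bytes per character

# Assigning a longer string silently truncates it to the fixed width
fruits[0] = 'watermelon'
print(fruits[0])              # waterm
```

If longer strings may be stored later, it can help to request a wider dtype up front, for example `dtype='U20'`.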


## Important: when numbers are out of range

In [3]:
# Let's look at np.uint8, whose range is 0 to 255
a = np.array([3], dtype=np.uint8)
print(a)  # Output: [3]

a = np.array([255], dtype=np.uint8)
print(a)  # Output: [255]

# Values above 255 do not fit in uint8. Older NumPy versions wrapped them
# around (256 -> 0, 257 -> 1, ...); NumPy 2.x raises OverflowError instead.
a = np.array([254, 255, 256, 257, 258], dtype=np.uint8)
print(a)  # Older NumPy: [254 255   0   1   2]; NumPy 2.x: OverflowError

# Negative values are out of range for unsigned types as well
a = np.array([-1, -2, -3], dtype=np.uint8)
print(a)  # Older NumPy: [255 254 253]; not reached if the line above raises

[3]
[255]


OverflowError: Python integer 256 out of bounds for uint8
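
When out-of-range values are expected, it can be clearer to state explicitly what should happen to them. The sketch below, with sample values chosen for illustration, contrasts two options: `astype()`, which performs an unchecked cast so integer values wrap around, and `np.clip()`, which limits values to the valid range first.

```python
import numpy as np

# Created with the default integer dtype, so nothing overflows here
values = np.array([254, 255, 256, 257, -1])

# Unchecked cast: values wrap around modulo 256
wrapped = values.astype(np.uint8)
print(wrapped)   # [254 255   0   1 255]

# Saturate instead: clip to the uint8 range, then cast
info = np.iinfo(np.uint8)
clipped = np.clip(values, info.min, info.max).astype(np.uint8)
print(clipped)   # [254 255 255 255   0]
```

Which behavior is appropriate depends on the data. For image pixels, for example, clipping is usually the safer choice.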

In [4]:
# size of data types

# Demonstrating the size of different NumPy data types in bytes

# Integer Types
print("Integer Types:")
print("Size of int8:", np.dtype(np.int8).itemsize, "bytes")   # 1 byte
print("Size of int16:", np.dtype(np.int16).itemsize, "bytes") # 2 bytes
print("Size of int32:", np.dtype(np.int32).itemsize, "bytes") # 4 bytes
print("Size of int64:", np.dtype(np.int64).itemsize, "bytes") # 8 bytes

# Unsigned Integer Types
print("\nUnsigned Integer Types:")
print("Size of uint8:", np.dtype(np.uint8).itemsize, "bytes")   # 1 byte
print("Size of uint16:", np.dtype(np.uint16).itemsize, "bytes") # 2 bytes
print("Size of uint32:", np.dtype(np.uint32).itemsize, "bytes") # 4 bytes
print("Size of uint64:", np.dtype(np.uint64).itemsize, "bytes") # 8 bytes

# Floating Point Types
print("\nFloating Point Types:")
print("Size of float16:", np.dtype(np.float16).itemsize, "bytes") # 2 bytes
print("Size of float32:", np.dtype(np.float32).itemsize, "bytes") # 4 bytes
print("Size of float64:", np.dtype(np.float64).itemsize, "bytes") # 8 bytes

# Complex Types
print("\nComplex Types:")
print("Size of complex64:", np.dtype(np.complex64).itemsize, "bytes") # 8 bytes (2 x 4 bytes)
print("Size of complex128:", np.dtype(np.complex128).itemsize, "bytes") # 16 bytes (2 x 8 bytes)

# Boolean Type
print("\nBoolean Type:")
print("Size of bool:", np.dtype(np.bool_).itemsize, "bytes") # 1 byte

# Object Type
print("\nObject Type:")
print("Size of object:", np.dtype(np.object_).itemsize, "bytes") # Typically 8 bytes for pointer

Integer Types:
Size of int8: 1 bytes
Size of int16: 2 bytes
Size of int32: 4 bytes
Size of int64: 8 bytes

Unsigned Integer Types:
Size of uint8: 1 bytes
Size of uint16: 2 bytes
Size of uint32: 4 bytes
Size of uint64: 8 bytes

Floating Point Types:
Size of float16: 2 bytes
Size of float32: 4 bytes
Size of float64: 8 bytes

Complex Types:
Size of complex64: 8 bytes
Size of complex128: 16 bytes

Boolean Type:
Size of bool: 1 bytes

Object Type:
Size of object: 8 bytes


In [None]:
# change data types using astype() method

# Create a NumPy array with default data type (int)
a = np.array([1, 2, 3, 4, 5])
print("Original Array (int):")
print(a)
print("Data type:", a.dtype)

# Change data type to float
a_float = a.astype(float)
print("\nArray converted to float:")
print(a_float)
print("Data type:", a_float.dtype)

# Change data type to string
a_str = a.astype(str)
print("\nArray converted to string:")
print(a_str)
print("Data type:", a_str.dtype)

# Create a float array
a_float2 = np.array([1.1, 2.2, 3.3])
print("\nOriginal Array (float):")
print(a_float2)
print("Data type:", a_float2.dtype)

# Change data type to integer (will truncate the decimal part)
a_int2 = a_float2.astype(int)
print("\nArray converted to int:")
print(a_int2)
print("Data type:", a_int2.dtype)

# Change data type to a specific NumPy type (e.g., np.float32)
a_float32 = a.astype(np.float32)
print("\nArray converted to float32:")
print(a_float32)
print("Data type:", a_float32.dtype)

Original Array (int):
[1 2 3 4 5]
Data type: int64

Array converted to float:
[1. 2. 3. 4. 5.]
Data type: float64

Array converted to string:
['1' '2' '3' '4' '5']
Data type: <U21

Original Array (float):
[1.1 2.2 3.3]
Data type: float64

Array converted to int:
[1 2 3]
Data type: int64

Array converted to float32:
[1. 2. 3. 4. 5.]
Data type: float32
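
Two details of `astype()` are easy to miss: the float-to-int conversion truncates toward zero rather than rounding, and the method returns a new array instead of modifying the original. A minimal sketch with made-up sample values:

```python
import numpy as np

samples = np.array([1.7, -1.7, 2.5])   # illustrative values

# astype() drops the fractional part (truncation toward zero)
truncated = samples.astype(np.int64)
print(truncated)   # [ 1 -1  2]

# To round instead, round first and then convert (np.rint rounds ties to even)
rounded = np.rint(samples).astype(np.int64)
print(rounded)     # [ 2 -2  2]

# The original array is unchanged
print(samples.dtype)   # float64
```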
