#### Data Types in NumPy

In [None]:
# Most common data types in NumPy:
# NumPy's default integer type is platform-dependent: int64 on most 64-bit systems and int32 on 32-bit systems
# (NumPy versions before 2.0 also default to int32 on 64-bit Windows).
# We can also set the data type explicitly, either by passing the dtype argument at array creation
# or by calling the astype method to create a new array from an existing one.

# Integer literals: int32, int64
# Floating literals: float32, float64
# Literals are the actual values that are written directly in a program.

# Boolean: bool

# Complex numbers ->
# A complex number has two parts: a real part (an ordinary number) and an imaginary part (a number multiplied by the imaginary unit j).
# Complex numbers: complex64 (32-bit float for the real part + 32-bit float for the imaginary part = 64 bits), complex128 (64-bit float + 64-bit float = 128 bits; this is NumPy's default, regardless of system architecture)

# String: S (byte string) & U (Unicode string)
# Object: generic Python objects - object


#### Using NumPy for the string and object data types is not advised,
#### because NumPy is built for numerical computation.
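
One practical consequence, as a minimal sketch: NumPy string arrays are fixed-width, so assigning a longer string into an existing array silently truncates it. The array name `words` is illustrative only.

```python
import numpy as np

# Fixed-width Unicode array: the width is set by the longest string at creation.
words = np.array(["hello", "ai/ml"])
print(words.dtype)  # a 5-character Unicode dtype

# Assigning a longer string does not widen the array;
# the value is cut down to the first 5 characters.
words[0] = "thequickbrownfox"
print(words[0])
```

A plain Python list would keep the full string, which is one reason lists (or a library built for text) tend to be a better fit for string data.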

In [None]:
import numpy as np

# Common Data Types
arr = np.array([1,2,3,4,5])
arr2 = np.array([1,2,3,4,5.5])
arr3 = np.array(["hello","ai/ml"])  # U5: the longest string has 5 characters
arr4 = np.array(["hello","thequickbrownfox"])  # U16: the longest string has 16 characters
# The number is the length of the longest string in the array.

print(arr,arr.dtype)
print(arr2,arr2.dtype)
print(arr3,arr3.dtype)
print(arr4,arr4.dtype) 


# Complex Numbers
arrC1 = np.array([2 + 3j]) 
# NumPy stores Python complex numbers as complex128 by default,
# because both the real (2) and imaginary (3) parts are stored as float64 (64 + 64 bits).
# The real part is an ordinary number; the imaginary part is a number multiplied by the imaginary unit j.
arrC2 = np.array([5 + 8j])  

print(arrC1, arrC1.dtype)  

# Real parts and imaginary parts are added separately.
# So, 2 + 5 => 7 and 3j + 8j => 11j
print(arrC1 + arrC2,(arrC1+arrC2).dtype)  
print(arrC2 - arrC1)



# Objects
# When an array contains mixed types that NumPy can't convert into a single common data type,
# NumPy falls back to the object data type, where each element is a reference to a generic Python object.
arrO = np.array(["hello",{1,2,3},3.14])
print(arrO,arrO.dtype)

#### Explicitly Changing the data type

In [17]:
arrEx = np.array([1,2,3,4,5,6])
new_arr = arrEx.astype("float32")
print(arrEx,arrEx.dtype)
print(new_arr,new_arr.dtype)

new_arr2 = np.array([1,2,3,4,5],dtype="float64")
print(new_arr2,new_arr2.dtype)

[1 2 3 4 5 6] int64
[1. 2. 3. 4. 5. 6.] float32
[1. 2. 3. 4. 5.] float64
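
Two points are worth checking when casting, sketched here with illustrative variable names: `astype` returns a new array and leaves the original untouched, and casting floats to integers drops the fractional part (truncation toward zero) rather than rounding.

```python
import numpy as np

values = np.array([1.7, 2.2, -1.7])

# astype returns a copy with the requested dtype; the source array is unchanged.
as_int = values.astype("int64")

print(values, values.dtype)
print(as_int, as_int.dtype)
```

If rounding is wanted instead, one option is to call `np.round` on the float array before casting.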


#### Why does dtype matter?
- Memory efficiency
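    - `np.int8` uses 1 byte per element, `np.int32` uses 4 bytes per element, and `np.int64` uses 8 bytes per element.
- Performance
    - Smaller types often mean faster computations, because less data has to move through memory and cache.
- Compatibility
    - Grayscale (black & white) images often use `np.uint8` (pixel values 0-255).
    - Many ML libraries use `float32` by default.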
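
The per-element sizes listed under memory efficiency can be checked directly with the `itemsize` and `nbytes` attributes. This is a minimal sketch; the array length of 1,000 is an arbitrary choice for illustration.

```python
import numpy as np

n = 1_000  # arbitrary length for illustration

for dtype in (np.int8, np.int32, np.int64):
    arr = np.zeros(n, dtype=dtype)
    # itemsize is bytes per element; nbytes is the total size of the data buffer.
    print(arr.dtype, arr.itemsize, arr.nbytes)

# Bytes per element, keyed by dtype name.
sizes = {np.dtype(d).name: np.dtype(d).itemsize for d in (np.int8, np.int32, np.int64)}
print(sizes)
```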

#### Note:
- In some cases it is useful to downcast, i.e. convert the data to a smaller data type, to reduce memory usage and improve performance.
- Example: suppose you have a dataset of 1 million people's ages. Storing them as `int64` wastes memory because ages are small numbers (0-120). We can downcast these values to `int8` (range -128 to 127), cutting memory use from about 8 MB to about 1 MB.
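
The ages example could be sketched as follows. The data is randomly generated as a stand-in for a real dataset, and the range check before casting is an added precaution, since `astype` does not warn when integer values fall outside the target type's range.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
# Stand-in data: 1 million synthetic ages in the range 0-120.
ages = rng.integers(0, 121, size=1_000_000, dtype=np.int64)

# Confirm every value fits in int8 (-128 to 127) before downcasting.
limits = np.iinfo(np.int8)
assert ages.min() >= limits.min and ages.max() <= limits.max

ages_small = ages.astype(np.int8)

print(ages.nbytes)        # total bytes as int64
print(ages_small.nbytes)  # total bytes as int8
```

Since ages are never negative, `np.uint8` (range 0 to 255) would work equally well here and leaves more headroom.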
