<a href="https://colab.research.google.com/github/ranamaddy/numpy/blob/main/Topic_16%3D_NumPy_Data_Types.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Topic 16 = NumPy Data Types**


NumPy is a Python library that is used for scientific computing, and it provides a powerful array data type called ndarray. The ndarray can be used to store homogeneous data (i.e., data of the same type) in a multidimensional array.

NumPy supports a range of data types that can be used with the ndarray. Each data type has a full name (such as int32), and most also have a short character code (such as i4) that serves as shorthand notation. Here are the most commonly used categories of data types in NumPy:

1. int: integer (used to represent whole numbers, e.g. -1, 0, 42)

2. float: floating-point number (used to represent real numbers, e.g. 1.2, 42.42)

3. complex: complex number (used to represent complex numbers, e.g. 1.0 + 2.0j, 1.5 + 2.5j)

4. bool: Boolean (True or False)

5. object: Python object

6. bytes_: fixed-length byte string (e.g. b"ABCD"); named string_ in NumPy versions before 2.0

7. str_: fixed-length Unicode string (used to represent text data, given in quotation marks, e.g. "ABCD"); named unicode_ in NumPy versions before 2.0

Each data type has a corresponding bit depth, which determines the range of values that can be represented by the data type. For example, an int16 can represent integers between -32,768 and 32,767, while an int64 can represent integers between -9,223,372,036,854,775,808 and 9,223,372,036,854,775,807.
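The ranges quoted above do not have to be memorized, because NumPy can report them. The following is a minimal sketch using `np.iinfo`, which returns the limits of an integer type; the variable names are chosen only for this illustration.

```python
import numpy as np

# np.iinfo reports the smallest and largest value an integer type can hold.
int16_info = np.iinfo(np.int16)
int64_info = np.iinfo(np.int64)

print(int16_info.min, int16_info.max)   # -32768 32767
print(int64_info.min, int64_info.max)   # -9223372036854775808 9223372036854775807
```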

NumPy also supports structured data types, which allow you to create arrays whose elements are records with named fields, and each field can have a different data type. A structured data type can be specified in several ways, for example as a list of (field name, data type) tuples or as a dictionary with 'names' and 'formats' keys.
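As a rough illustration of a structured data type, the sketch below defines a record with two fields. The field names `name` and `age` and the sample records are made up for this example.

```python
import numpy as np

# Hypothetical record layout: a Unicode name of up to 10 characters
# and a 4-byte integer age.
person_dtype = np.dtype([('name', 'U10'), ('age', 'i4')])

people = np.array([('Alice', 30), ('Bob', 25)], dtype=person_dtype)

print(people['name'])   # all values of the 'name' field
print(people['age'])    # all values of the 'age' field
print(people.dtype.names)
```

Indexing the array with a field name returns an ordinary array holding just that field, so `people['age']` behaves like any other integer array.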



The most commonly used fixed-size data types are:

1. int8: 8-bit signed integer (-128 to 127)
2. uint8: 8-bit unsigned integer (0 to 255)
3. int16: 16-bit signed integer (-32768 to 32767)
4. uint16: 16-bit unsigned integer (0 to 65535)
5. int32: 32-bit signed integer (-2147483648 to 2147483647)
6. uint32: 32-bit unsigned integer (0 to 4294967295)
7. int64: 64-bit signed integer (-9223372036854775808 to 9223372036854775807)
8. uint64: 64-bit unsigned integer (0 to 18446744073709551615)
9. float16: 16-bit floating-point number
10. float32: 32-bit floating-point number
11. float64: 64-bit floating-point number
12. complex64: 64-bit complex number (two 32-bit floats)
13. complex128: 128-bit complex number (two 64-bit floats)
14. bool: Boolean (True or False)
15. object: Python object type, can hold any data type

These data types can be used to define the type of elements in a NumPy array or matrix, and they have different ranges and precision depending on the data type. It's important to choose the appropriate data type based on the type of data being stored to optimize memory usage and computation speed.
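To make the memory argument concrete, the sketch below stores the same 1000 values with two different integer types and compares the sizes. The sample data is arbitrary.

```python
import numpy as np

values = list(range(1000))

small = np.array(values, dtype=np.int16)   # 2 bytes per element
large = np.array(values, dtype=np.int64)   # 8 bytes per element

# itemsize is the size of one element in bytes; nbytes is the total for the array.
print(small.itemsize, small.nbytes)   # 2 2000
print(large.itemsize, large.nbytes)   # 8 8000
```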


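NumPy also lets you refer to data types by a one-character code, such as i for integers and u for unsigned integers. Most codes can be followed by a size in bytes, so i4 means a 4-byte (32-bit) integer.

Below is a list of the kinds of data types in NumPy and the characters used to represent them.

1. i - integer
2. b - boolean
3. u - unsigned integer
4. f - float
5. c - complex float
6. m - timedelta
7. M - datetime
8. O - object
9. S - byte string
10. U - unicode string
11. V - fixed chunk of memory for other type (void)

These characters identify the kind of a data type. When passed on their own as a dtype string, a few behave differently: 'b' creates an 8-bit signed integer, and '?' is the code that creates a boolean.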
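The character codes can be explored interactively. The sketch below, offered only as an illustration, converts a few code strings into dtypes and reads back the one-character kind of some named types through the `kind` attribute.

```python
import numpy as np

# A character code followed by a size in bytes names a concrete type.
print(np.dtype('i4'))    # int32
print(np.dtype('u2'))    # uint16
print(np.dtype('f8'))    # float64
print(np.dtype('c16'))   # complex128

# The .kind attribute reports the one-character category of a dtype.
names = ['int32', 'uint8', 'float64', 'bool', 'complex64']
kinds = {name: np.dtype(name).kind for name in names}
print(kinds)
```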

**Example 01: Get the data type of an array object:**

In [None]:
import numpy as np
arr = np.array([1, 2, 3, 4])
print(arr.dtype)

**Explanation**

The code creates a NumPy array named "arr" with the values [1, 2, 3, 4].

The "dtype" property of a NumPy array specifies the data type of the array elements.

The "print(arr.dtype)" statement prints out the data type of the elements in the "arr" array. On most platforms this is "int64" (64-bit integer), the default data type for integers in NumPy. On Windows with NumPy versions before 2.0, the default is "int32".

**Example 02: Get the data type of an array containing strings:**

In [None]:
import numpy as np
arr = np.array(['apple', 'banana', 'cherry'])
print(arr.dtype)

**Explanation**

The given code creates a NumPy array called "arr" with the values ['apple', 'banana', 'cherry'].

The "dtype" property of a NumPy array specifies the data type of the array elements.

Since the elements in the "arr" array are strings, the "print(arr.dtype)" statement will output "<U6". The "U" means Unicode string, the "6" is the length of the longest string in the array ('banana' and 'cherry'), and the "<" indicates little-endian byte order.
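Because NumPy string types have a fixed length, the length chosen at creation time matters later. The sketch below inspects the string dtype and shows that assigning a longer string silently truncates it; the replacement word is arbitrary.

```python
import numpy as np

arr = np.array(['apple', 'banana', 'cherry'])

print(arr.dtype.kind)   # U
print(arr.itemsize)     # 24 (6 characters x 4 bytes each)

# The array can hold at most 6 characters per element,
# so a longer string is cut off without any warning.
arr[0] = 'watermelon'
print(arr[0])           # waterm
```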

**Example 03: Create an array with data type 4 bytes integer:**

In [None]:
import numpy as np
arr = np.array([1, 2, 3, 4], dtype='i4')
print(arr)
print(arr.dtype)

**Explanation**

The given code creates a NumPy array called "arr" with the values [1, 2, 3, 4], and explicitly specifies the data type of the elements to be "i4" using the "dtype" parameter.

The "i4" data type refers to a 32-bit integer, meaning each element in the array is represented using 4 bytes (32 bits) of memory.

The "print(arr)" statement outputs the values of the "arr" array, which will be [1, 2, 3, 4].

The "print(arr.dtype)" statement outputs the data type of the elements in the "arr" array, which will be "int32" as "i4" is an alias for 32-bit integer data type in NumPy.
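The same data type can be requested in several equivalent ways. As a small sketch, the three arrays below use the character code, the NumPy type object, and the full name, and all end up with the same dtype.

```python
import numpy as np

a = np.array([1, 2, 3, 4], dtype='i4')       # character code
b = np.array([1, 2, 3, 4], dtype=np.int32)   # NumPy type object
c = np.array([1, 2, 3, 4], dtype='int32')    # full name as a string

print(a.dtype == b.dtype == c.dtype)   # True
print(a.itemsize)                      # 4
```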

**Example 04: Change data type from float to integer by using 'i' as parameter value:**

In [None]:
import numpy as np
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype('i')
print(newarr)
print(newarr.dtype)

**Explanation**

The given code creates a NumPy array called "arr" with the values [1.1, 2.1, 3.1]. Because the values contain decimal points, NumPy stores them as "float64".

The "astype('i')" call returns a new array, "newarr", whose elements have been converted to integers. The original "arr" is left unchanged. The conversion discards the fractional part of each value rather than rounding it.

The "print(newarr)" statement outputs the converted values, which will be [1 2 3].

The "print(newarr.dtype)" statement outputs the data type of "newarr", which will be "int32" on common platforms, because the character code "i" on its own corresponds to the C int type.
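Converting floats to integers with "astype" drops the fractional part instead of rounding, which is easy to overlook with negative numbers or values such as 1.9. The sketch below compares plain conversion with rounding first via `np.rint`; the sample values are chosen only to make the difference visible.

```python
import numpy as np

arr = np.array([1.9, -1.9, 2.5])

truncated = arr.astype('i')          # fractional part is dropped (toward zero)
rounded = np.rint(arr).astype('i')   # round to the nearest integer first

print(truncated.tolist())   # [1, -1, 2]
print(rounded.tolist())     # [2, -2, 2]
```

Note that `np.rint` rounds halves to the nearest even integer, which is why 2.5 becomes 2.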








**Example 05: Change data type from float to integer by using int as parameter value:**

In [None]:
import numpy as np
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype(int)
print(newarr)
print(newarr.dtype)

**Explanation**

This code creates a NumPy array called arr with three decimal numbers: 1.1, 2.1, and 3.1.

The statement newarr = arr.astype(int) creates a new NumPy array called newarr by converting the elements of arr to integers. This is done using the astype method with the argument int, which specifies that the data type of the new array should be NumPy's default integer type.

The statement print(newarr) outputs the contents of the newarr array, which are the values of arr with the fractional part discarded. In this case, the output will be [1 2 3].

The statement print(newarr.dtype) outputs the data type of the newarr array. This is int64 on most platforms, and int32 on Windows with NumPy versions before 2.0.
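When the exact integer width matters, for example when results must be identical across machines, it can be requested explicitly instead of relying on the default. A minimal sketch:

```python
import numpy as np

arr = np.array([1.1, 2.1, 3.1])

default_int = arr.astype(int)         # width depends on platform and NumPy version
explicit_int = arr.astype(np.int64)   # always 64-bit

print(default_int.dtype)
print(explicit_int.dtype)   # int64
```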

**Example 06: Change data type from integer to boolean:**

In [None]:
import numpy as np

arr = np.array([1, 0, 3])

newarr = arr.astype(bool)

print(newarr)
print(newarr.dtype)

**Explanation**

This code creates a NumPy array called arr with three integers: 1, 0, and 3.

The statement newarr = arr.astype(bool) creates a new NumPy array called newarr by converting the elements of arr to booleans. This is done using the astype method with the argument bool, which specifies that the data type of the new array should be boolean. Zero becomes False and every non-zero value becomes True.

The statement print(newarr) outputs the contents of the newarr array. In this case, the output will be [ True False  True].

The statement print(newarr.dtype) outputs the data type of the newarr array, which will be bool in this case.
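A boolean array produced this way is often used as a mask. The sketch below shows that negative numbers also count as True and uses the mask to select the non-zero elements; the extra value -2 is added only for illustration.

```python
import numpy as np

arr = np.array([1, 0, 3, -2])
mask = arr.astype(bool)   # zero -> False, anything else -> True

print(mask.tolist())        # [True, False, True, True]
print(int(mask.sum()))      # 3 (number of non-zero elements)
print(arr[mask].tolist())   # [1, 3, -2]
```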

**Example 07: Create an array with data type string:**

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4], dtype='S')

print(arr)
print(arr.dtype)

**Explanation**

The given code creates a NumPy array named "arr" from the four integers 1, 2, 3, and 4. The "dtype='S'" parameter specifies the data type "S", which stands for byte string, so each integer is converted to its text representation and stored as bytes.

The first print statement displays [b'1' b'2' b'3' b'4']. The b prefix marks each element as a byte string.

The second print statement outputs the data type of the array elements, which is "|S1": a byte string one byte long. NumPy chooses the length 1 because the longest element needs only one character.
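Byte strings can be converted to ordinary Unicode strings, or back to numbers, with "astype". The following sketch assumes the bytes contain plain ASCII digits, as in the example above.

```python
import numpy as np

byte_arr = np.array([1, 2, 3, 4], dtype='S')

text_arr = byte_arr.astype('U')    # byte strings -> Unicode strings
int_arr = text_arr.astype('i4')    # digit strings -> 32-bit integers

print(text_arr.tolist())   # ['1', '2', '3', '4']
print(int_arr.tolist())    # [1, 2, 3, 4]
```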

In [None]:
import numpy as np
# int8
x = np.array([1, 2, 3], dtype=np.int8)
print(x)

**Explanation:**

The given code creates a NumPy array named "x" which contains three elements [1, 2, 3]. The "dtype=np.int8" parameter is used to specify the data type of the array elements as 8-bit integer.

The maximum value that can be stored in an 8-bit signed integer is 127 and the minimum value is -128. A value outside this range cannot be stored correctly: depending on the NumPy version, creating the array either raises an OverflowError or silently wraps the value around, and arithmetic results that exceed the range wrap around without warning.

When we print the array using the print statement, it will display [1 2 3] because the values fit within the range of an 8-bit integer.

By specifying the data type as np.int8, each element takes 1 byte instead of the 4 bytes of np.int32 or the 8 bytes of np.float64, which saves memory, especially for large arrays.
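The wraparound behaviour is easiest to see with a value at the edge of the range. In the sketch below, adding 1 to 127 in int8 arithmetic produces -128, while widening to int16 first keeps the true result.

```python
import numpy as np

x = np.array([127], dtype=np.int8)
one = np.array([1], dtype=np.int8)

wrapped = x + one           # result stays int8, so it wraps around
print(wrapped.tolist())     # [-128]
print(wrapped.dtype)        # int8

safe = x.astype(np.int16) + one   # widen first to keep the true value
print(safe.tolist())        # [128]
```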

In [None]:
import numpy as np
# uint8
x = np.array([255, 128, 0], dtype=np.uint8)
print(x)

**Explanation**

The given code creates a NumPy array named "x" which contains three elements [255, 128, 0]. The "dtype=np.uint8" parameter is used to specify the data type of the array elements as 8-bit unsigned integer.

Unlike signed integers, unsigned integers only store zero and positive numbers. In this case, each element of the array can store values between 0 and 255.

When we print the array using the print statement, it will display the three values unchanged because each one fits within the range of an 8-bit unsigned integer.

An np.uint8 element takes 1 byte, the same as np.int8, so the array uses far less memory than it would with a larger data type like np.int32 or np.float64. This is why uint8 is a common choice for data such as image pixels.
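Unsigned types wrap around at both ends of their range. The sketch below shows 255 + 1 wrapping to 0 and 0 - 1 wrapping to 255 in uint8 arithmetic.

```python
import numpy as np

x = np.array([255, 0], dtype=np.uint8)
one = np.array([1, 1], dtype=np.uint8)

print((x + one).tolist())   # [0, 1]
print((x - one).tolist())   # [254, 255]
```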

In [None]:
import numpy as np
# int16
x = np.array([-32768, 0, 32767], dtype=np.int16)
print(x)

**Explanation**

The given code creates a NumPy array named "x" which contains three elements [-32768, 0, 32767]. The "dtype=np.int16" parameter is used to specify the data type of the array elements as 16-bit integer.

The maximum value that can be stored in a 16-bit signed integer is 32767 and the minimum value is -32768, so the first and last elements in the array are at the extreme ends of the range. A value outside this range cannot be stored correctly: depending on the NumPy version, creating the array either raises an OverflowError or silently wraps the value around.

When we print the array using the print statement, it will display the three values unchanged because each one fits within the range of a 16-bit integer.

By specifying the data type as np.int16, each element takes 2 bytes instead of the 4 bytes of np.int32 or the 8 bytes of np.float64, which saves memory, especially for large arrays. However, it's important to ensure that the values stored in the array, and the results of arithmetic on them, stay within the range of the data type.
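One way to guard against out-of-range values is to check them against the limits of the target type before creating the array. The helper `fits_in_dtype` below is a hypothetical function written for this illustration, not part of NumPy.

```python
import numpy as np

def fits_in_dtype(values, dtype):
    """Return True if every value lies within the range of the integer dtype.

    Hypothetical helper for illustration; it relies only on np.iinfo.
    """
    info = np.iinfo(dtype)
    return all(info.min <= v <= info.max for v in values)

print(fits_in_dtype([-32768, 0, 32767], np.int16))   # True
print(fits_in_dtype([0, 40000], np.int16))           # False
print(fits_in_dtype([0, 40000], np.uint16))          # True
```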

In [None]:
import numpy as np
# uint16
x = np.array([65535, 32768, 0], dtype=np.uint16)
print(x)

**Explanation**

The given code creates a NumPy array named "x" which contains three elements [65535, 32768, 0]. The "dtype=np.uint16" parameter is used to specify the data type of the array elements as 16-bit unsigned integer.

Unlike signed integers, unsigned integers only store zero and positive numbers. In this case, each element of the array can store values between 0 and 65535.

When we print the array using the print statement, it will display the three values unchanged because each one fits within the range of a 16-bit unsigned integer.

An np.uint16 element takes 2 bytes, the same as np.int16, so the array uses less memory than it would with a larger data type like np.int32 or np.float64. However, it's important to ensure that the values stored in the array don't exceed the range of the specified data type.

In [None]:
import numpy as np
# int32
x = np.array([-2147483648, 0, 2147483647], dtype=np.int32)
print(x)

**Explanation**

The given code creates a NumPy array named "x" which contains three elements [-2147483648, 0, 2147483647]. The "dtype=np.int32" parameter is used to specify the data type of the array elements as 32-bit integer.

The maximum value that can be stored in a 32-bit signed integer is 2147483647 and the minimum value is -2147483648, so the first and last elements in the array are at the extreme ends of the range. A value outside this range cannot be stored correctly: depending on the NumPy version, creating the array either raises an OverflowError or silently wraps the value around.

When we print the array using the print statement, it will display the three values unchanged because each one fits within the range of a 32-bit integer.

By specifying the data type as np.int32, the array can store much larger values than np.int16 allows, although arithmetic results that exceed the 32-bit range still wrap around. The trade-off is memory: each element takes 4 bytes, twice as much as np.int16 or np.uint16.

In [None]:
import numpy as np
# uint32
x = np.array([4294967295, 2147483648, 0], dtype=np.uint32)
print(x)

**Explanation**

The given code creates a NumPy array named "x" which contains three elements [4294967295, 2147483648, 0]. The "dtype=np.uint32" parameter is used to specify the data type of the array elements as 32-bit unsigned integer.

Unlike signed integers, unsigned integers only store zero and positive numbers. In this case, each element of the array can store values between 0 and 4294967295.

When we print the array using the print statement, it will display the three values unchanged because each one fits within the range of a 32-bit unsigned integer.

By specifying the data type as np.uint32, the array can store larger positive values than np.int32 allows. It's still important to ensure that the values stored in the array, and the results of arithmetic on them, don't exceed the range of the data type.

Each element takes 4 bytes, which is more than np.uint16 and exactly the same as np.int32. The unsigned type does not save memory compared to the signed one; it uses the bit that would otherwise hold the sign to double the largest positive value it can represent.
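The relationship between the signed and unsigned 32-bit types can be checked directly. The sketch below collects the element size and value range of both; the `summary` dictionary is just a convenient container for this example.

```python
import numpy as np

summary = {}
for dtype in (np.int32, np.uint32):
    info = np.iinfo(dtype)
    # itemsize is the number of bytes per element; min/max give the value range.
    summary[np.dtype(dtype).name] = (np.dtype(dtype).itemsize, info.min, info.max)

print(summary['int32'])    # (4, -2147483648, 2147483647)
print(summary['uint32'])   # (4, 0, 4294967295)
```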

In [None]:
import numpy as np
# int64
x = np.array([-9223372036854775808, 0, 9223372036854775807], dtype=np.int64)
print(x)

In [None]:
import numpy as np
# uint64
x = np.array([18446744073709551615, 9223372036854775808, 0], dtype=np.uint64)
print(x)

In [None]:
import numpy as np
# float16
x = np.array([1.0, 2.0, 3.0], dtype=np.float16)
print(x)

In [None]:
import numpy as np
# float32
x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
print(x)

In [None]:
import numpy as np
# float64
x = np.array([1.0, 2.0, 3.0], dtype=np.float64)
print(x)
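The three floating-point cells above print the same values, because 1.0, 2.0, and 3.0 are exactly representable in every float type. The types differ in precision, which shows up with a value such as 0.1. The sketch below uses `np.finfo` to report each type's size and approximate decimal precision, then measures the rounding error when storing 0.1.

```python
import numpy as np

float_types = (np.float16, np.float32, np.float64)

for t in float_types:
    info = np.finfo(t)
    # bits: storage size; precision: approximate number of reliable decimal digits
    print(np.dtype(t).name, info.bits, info.precision)

# 0.1 cannot be stored exactly in binary; lower precision means a larger error.
errors = {np.dtype(t).name: abs(float(t(0.1)) - 0.1) for t in float_types}
print(errors)
```

The error for float64 is reported as zero only because the comparison itself is done in 64-bit precision.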

In [None]:
import numpy as np
# complex64
x = np.array([1+2j, 3+4j, 5+6j], dtype=np.complex64)
print(x)

In [None]:
import numpy as np
# complex128
x = np.array([1+2j, 3+4j, 5+6j], dtype=np.complex128)
print(x)
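A complex number is stored as a pair of floats, one for the real part and one for the imaginary part, which is why complex64 is built from two 32-bit floats. A minimal sketch:

```python
import numpy as np

z = np.array([1+2j, 3+4j], dtype=np.complex64)

print(z.itemsize)        # 8 (two 4-byte floats per element)
print(z.real.dtype)      # float32
print(z.real.tolist())   # [1.0, 3.0]
print(z.imag.tolist())   # [2.0, 4.0]
```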

In [None]:
import numpy as np
# bool (np.bool_ works across NumPy versions; the np.bool alias was removed in 1.24 and restored in 2.0)
x = np.array([True, False, True], dtype=np.bool_)
print(x)

In [None]:
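import numpy as np
# object: each element is a reference to an arbitrary Python object
# (the np.object alias was removed in NumPy 1.24, so the built-in object is used)
x = np.array([1, "two", [3, 4, 5]], dtype=object)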
print(x)
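An object array stores references to ordinary Python objects, so each element keeps its original Python type. The sketch below lists those types; it trades the speed and compact storage of numeric dtypes for flexibility.

```python
import numpy as np

obj = np.array([1, "two", [3, 4, 5]], dtype=object)

print(obj.dtype)                                # object
print(obj.shape)                                # (3,)
print([type(item).__name__ for item in obj])    # ['int', 'str', 'list']
```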