In the previous lecture we used the dtype attribute of the ndarray class to check the data type of the elements in an array. All elements of a numpy array have the same type, unlike in a regular Python list, where the elements may be of different types.

Let's create an array and use the attribute again to check the data type:

In [2]:
import numpy as np
A = np.array([4, 2, 7])
print(A.dtype)

int32


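Here the elements are 32-bit integers. The default integer type depends on the platform and the numpy version, so on many systems you will see int64 instead. Let's try this array: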

In [3]:
B = np.array([.2, .4, 2.1])
print(B.dtype)

float64


Now they are 64-bit floating point numbers.

Here's another one:

In [4]:
C = np.array([2.1+3.j, 1.+2.8j])
print(C.dtype)

complex128


Now we have 128-bit complex numbers, each made up of two 64-bit floats for the real and imaginary parts. And now let's try this one out:

In [6]:
D = np.array([3, 2.4, 2.8+1.2j])
print(D.dtype)

complex128


As you can see, if we put an integer, a float and a complex number in one numpy array, they are all upcast to complex numbers, the most general of the three types, because both integers and floats can be easily converted to complex numbers.
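
The same upcasting rule can be checked without mixing values by hand. This is a minimal sketch that assumes numpy's promote_types function; the variable names are only for illustration:

```python
import numpy as np

# Ask numpy which type is general enough to hold both operand types.
int_and_float = np.promote_types(np.int32, np.float64)
float_and_complex = np.promote_types(np.float64, np.complex128)
print(int_and_float)      # float64
print(float_and_complex)  # complex128

# The same rule applies when np.array sees mixed elements.
mixed = np.array([3, 2.4])
print(mixed.dtype)        # float64
```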

The main numerical data types that we can use in numpy are signed integers (int8, int16, int32, int64), unsigned integers (uint8, uint16, uint32, uint64), floats (float16, float32, float64), complex numbers (complex64, complex128 and, on some platforms only, complex256) and bools. The number in each name tells us how many bits one element of that type occupies.
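
To see the link between the type names and their sizes, we can read the itemsize attribute of a dtype, which gives the size of one element in bytes. A small sketch (the dictionary name bits is just an illustration):

```python
import numpy as np

# itemsize is the size of one element in bytes; multiply by 8 to get bits.
names = ["int8", "uint16", "float32", "float64", "complex128", "bool"]
bits = {name: np.dtype(name).itemsize * 8 for name in names}

for name, size in bits.items():
    print(name, size)
```

Note that a bool still occupies a whole byte, even though a single bit would be enough to store True or False.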

Apart from this, we can use nonnumerical types as well, like strings, objects or user-defined types. For now, however, we'll concentrate on the numerical types.

So, what if we want to create an array of 8-bit unsigned integers? These are nonnegative integers that fit in 8 bits, i.e. the values from 0 to 255. Let's try:

In [7]:
E = np.array([1, 2, 3])
print(E.dtype)

int32


Well, the elements 1, 2 and 3 can definitely be interpreted as unsigned 8-bit integers because they are nonnegative and 8 bits is more than enough to represent them. However, as we can see, numpy does not pick the smallest possible type; the elements are treated as 32-bit signed integers.
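
We can ask numpy directly which values an integer type can hold, and which is the smallest type that would fit a given value. A minimal sketch using np.iinfo and np.min_scalar_type:

```python
import numpy as np

# Range of values that fit in an unsigned 8-bit integer.
info = np.iinfo(np.uint8)
print(info.min, info.max)   # 0 255

# Smallest type that can hold the value 3.
smallest = np.min_scalar_type(3)
print(smallest)             # uint8
```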

If we really want the elements to be of a specific data type, we can request it by passing the dtype argument to the np.array function. So, in our case:

In [8]:
E = np.array([1, 2, 3], dtype = np.uint8)
print(E.dtype)

uint8
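
One practical reason to choose a smaller type is memory. The sketch below compares the number of bytes used by the two versions of the array through the nbytes attribute; the array names are only for illustration:

```python
import numpy as np

default_ints = np.array([1, 2, 3])
small_ints = np.array([1, 2, 3], dtype=np.uint8)

# nbytes is the total size of the array data in bytes.
print(default_ints.nbytes)  # 12 or 24, depending on the default integer size
print(small_ints.nbytes)    # 3
```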


Another example. Let's create an array of 64-bit floats:

In [10]:
F = np.array([1, 2, 3], dtype = np.float64)
print(F.dtype)

float64


There's also another syntax for this:

In [16]:
F = np.array([1, 2, 3], dtype = 'f8')
print(F.dtype)

float64


Here the string 'f8' means an 8-byte float, which is the same as a 64-bit float.
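
Other types have similar short codes: a letter for the kind of data (i for signed integers, u for unsigned integers, f for floats, c for complex numbers) followed by the size in bytes. A short sketch that translates a few of them:

```python
import numpy as np

# Letter = kind of data, number = size of one element in bytes.
codes = ["i4", "u1", "f2", "f8", "c16"]
decoded = {code: np.dtype(code) for code in codes}

for code, dt in decoded.items():
    print(code, "->", dt)
```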

We can also specify the type only in general terms, such as an integer, a float or a complex number, and let numpy pick the default size:

In [18]:
G = np.array([1, 2, 3], int)
print(G.dtype)

int32


or like this:

In [20]:
H = np.array([1, 2, 3], complex)
print(H.dtype)

complex128
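
To see which numpy type each general Python type maps to on your machine, you can pass it to np.dtype. A small sketch; the result for int depends on the platform and the numpy version, the others are fixed:

```python
import numpy as np

# Map the built-in Python types to the numpy types used for them.
mapping = {t.__name__: np.dtype(t) for t in (int, float, complex, bool)}

for name, dt in mapping.items():
    print(name, "->", dt)
```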


We can also treat the data as boolean values, where 0 is interpreted as False and any other number as True:

In [21]:
I = np.array([5, 0, -2, 0, 7], dtype = bool)
print(I)
print(I.dtype)

[ True False  True False  True]
bool
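
Going in the other direction, True becomes 1 and False becomes 0. A short sketch (the names flags and as_ints are only for illustration) that converts the boolean array back to integers and counts the True values:

```python
import numpy as np

flags = np.array([5, 0, -2, 0, 7], dtype=bool)

# Converting back gives 1 for True and 0 for False;
# the original values 5, -2 and 7 are not recovered.
as_ints = flags.astype(int)
print(as_ints)                  # [1 0 1 0 1]
print(np.count_nonzero(flags))  # 3
```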


After we create an array, we cannot convert its elements to another type in place. What we can do is typecast the array by creating a new array with the desired type. We can do it like so:

In [22]:
J = np.array([1, 2, 3], float)
print(J)
print(J.dtype)

J = np.array(J, dtype = complex)
print(J)
print(J.dtype)

[1. 2. 3.]
float64
[1.+0.j 2.+0.j 3.+0.j]
complex128


or we can use the astype method of the ndarray class, which returns a new array of the requested type:

In [23]:
J = np.array([1, 2, 3], float)
print(J)
print(J.dtype)

J = J.astype(complex)
print(J)
print(J.dtype)

[1. 2. 3.]
float64
[1.+0.j 2.+0.j 3.+0.j]
complex128
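
Two things are worth checking when typecasting: the original array is left untouched, and casting to a narrower type can lose information. A minimal sketch, with illustrative names, that casts floats to integers:

```python
import numpy as np

original = np.array([1.7, -1.7, 2.0])
converted = original.astype(int)

# The original array keeps its type; astype built a separate array.
print(original.dtype)                         # float64
print(np.shares_memory(original, converted))  # False

# Casting floats to ints drops the fractional part (truncation toward zero).
print(converted)                              # [ 1 -1  2]
```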


EXERCISE

Create two arrays containing the numbers 10, 20 and 30. The elements in array X should be 16-bit ints, whereas the elements in array Y should be 16-bit floats. Then use the astype method to typecast the elements in array X to the same type as the elements in array Y, without passing the type directly (i.e. don't tell the code that it should be typecast to float16, instead read the type from the Y array using an appropriate attribute).

Print the arrays before and after the typecasting and the data types.

SOLUTION


In [24]:
X = np.array([10, 20, 30], dtype = np.int16)
print(X)
print(X.dtype)

Y = np.array([10, 20, 30], dtype = np.float16)
print(Y)
print(Y.dtype)

X = X.astype(Y.dtype)
print(X)
print(X.dtype)

[10 20 30]
int16
[10. 20. 30.]
float16
[10. 20. 30.]
float16
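
The values 10, 20 and 30 survive this typecast unchanged, but that is not guaranteed for every 16-bit integer, because float16 has only about 11 bits of precision. The sketch below (names are illustrative) casts to float16 and back to see which values are preserved:

```python
import numpy as np

small = np.array([10, 20, 30], dtype=np.int16)
large = np.array([2048, 2049, 2050], dtype=np.int16)

# Cast to float16 and back to int16, then compare with the input.
small_round_trip = small.astype(np.float16).astype(np.int16)
large_round_trip = large.astype(np.float16).astype(np.int16)

print(np.array_equal(small, small_round_trip))  # True
print(np.array_equal(large, large_round_trip))  # False: 2049 is not representable
```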
