In [2]:
import numpy as np
np.array([1, 2, 3]).dtype


dtype('int32')

In [3]:
np.array([255], np.uint8) + 1

array([0], dtype=uint8)

In [4]:
np.array([2 ** 31 - 1])

array([2147483647])

In [5]:
np.array([2 ** 31 - 1]) + 1

array([-2147483648])

In [6]:
np.array([2 ** 63 - 1]) + 1

array([-9223372036854775808], dtype=int64)

In [8]:
np.array([255], np.uint8)[0] + 1

256

In [9]:
np.array([2 ** 31 - 1])[0] + 1

RuntimeWarning: overflow encountered in scalar add

-2147483648

In [10]:
np.array([2 ** 63 - 1])[0] + 1

RuntimeWarning: overflow encountered in scalar add

-9223372036854775808

Unlike true floating point errors (where the hardware FPU sets a flag whenever it does an atomic operation that overflows), we need to implement the integer overflow detection ourselves. We do it on the scalars, but not arrays because it would be too slow to implement for every atomic operation on arrays. (Robert Kern, one of the NumPy core developers)

In [13]:
# You can turn it into an error
with np.errstate(over = 'raise'):
    print(np.array([2 ** 31 - 1])[0] + 1)

FloatingPointError: overflow encountered in scalar add

In [14]:
# or suppress it temporarily
with np.errstate(over = 'ignore'):
    print(np.array([2 ** 31 - 1])[0] + 1)

-2147483648
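
Since array operations skip the overflow check described in the quote above, one possible workaround is to do the arithmetic in a wider dtype and compare the result against the limits of the target dtype. The following is only a minimal sketch of that idea, not a built-in NumPy feature; the variable names are illustrative.

```python
import numpy as np

# The first element is the largest value that fits into int32
a = np.array([2**31 - 1, 10], dtype=np.int32)
limits = np.iinfo(np.int32)

# Do the addition in int64, where it cannot wrap around for these inputs
wide = a.astype(np.int64) + 1

# Flag the elements whose result would not fit back into int32
overflowed = (wide > limits.max) | (wide < limits.min)
print(overflowed)  # [ True False]
```

The extra copy costs memory and time, which is the trade-off the quote refers to.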


#### Floats
As the pure Python float did not diverge from the IEEE 754-standardized C double type (note the difference in naming), the transition of floating point numbers from Python to NumPy is pretty much hassle-free: Python float is directly compatible with np.float64, and Python complex with np.complex128.


In [15]:

x = np.array([-1234.5])
1 / (1 + np.exp(-x))
# Output: RuntimeWarning: overflow encountered in exp
# array([0.])

np.exp(np.array([1234.5]))
# Output: RuntimeWarning: overflow encountered in exp
# array([inf])




array([inf])

One thing that distinguishes floats from integers is that they are inexact.
You can’t compare two floats with a == b unless you’re sure they are represented exactly.
You can expect floats to represent integers exactly, but only up to a certain magnitude, limited by the number of significant digits (2**53, about 9.0e15, for a 64-bit float):

In [16]:
9279945539648888.0 + 1

9279945539648888.0

In [17]:
len('9279945539648888')

16
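
The limit on exactly representable integers, and the usual alternative to comparing floats with ==, can be sketched as follows. This is a minimal illustration that assumes 64-bit floats; np.isclose is used with its default tolerances.

```python
import numpy as np

limit = 2.0 ** 53  # float64 has a 53-bit significand

# Below the limit, adding 1 gives the next integer exactly
exact_below = (limit - 1) + 1 == limit

# At the limit, limit + 1 is not representable and rounds back to limit
lost_above = limit + 1 == limit

# Fractions such as 0.1 are not exact either, so compare with a tolerance
naive_equal = (0.1 + 0.2 == 0.3)
tolerant_equal = bool(np.isclose(0.1 + 0.2, 0.3))

print(exact_below, lost_above, naive_equal, tolerant_equal)  # True True False True
```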

For financial data, the decimal.Decimal type is handy, as it involves no additional tolerances at all:

In [18]:
from decimal import Decimal as D

a = np.array([D('0.1'), D('0.2')]); a
# Output: array([Decimal('0.1'), Decimal('0.2')], dtype=object)

a.sum()
# Output: Decimal('0.3')


Decimal('0.3')

For pure mathematical calculations, fractions.Fraction can be used:

In [19]:
from fractions import Fraction

a = np.array([1, 2]) + Fraction(); a
# Output: array([Fraction(1, 1), Fraction(2, 1)], dtype=object)

a /= 10; a
# Output: array([Fraction(1, 10), Fraction(1, 5)], dtype=object)

a.sum()
# Output: Fraction(3, 10)


Fraction(3, 10)

Complex numbers are treated the same way as floats. There are extra convenience functions with intuitive names like np.real(z), np.imag(z), np.abs(z), np.angle(z) that work on both scalars and arrays as a whole. The only difference from the pure Python complex is that np.complex_ does not work with integers:

In [20]:
np.array([1 + 2j])

array([1.+2.j])
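
The convenience functions named above can be tried on a small array. This is a minimal sketch with made-up values.

```python
import numpy as np

z = np.array([3 + 4j, 1j])

re = np.real(z)      # real parts
im = np.imag(z)      # imaginary parts
mag = np.abs(z)      # magnitudes
phase = np.angle(z)  # phase angles in radians

print(re, im, mag)  # [3. 0.] [4. 1.] [5. 1.]

# The same functions accept a single Python complex number
scalar_mag = np.abs(3 + 4j)
```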

#### Bools
The boolean values are stored as single bytes for better performance. np.bool_ is a separate type from Python’s bool because it doesn’t need reference counting and a link to the base class required for any pure Python type. So if you think that using 8 bits to store one bit of information is excessive, look at this:

In [22]:
import sys
sys.getsizeof(True)

28
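
For comparison, the per-element cost inside a NumPy array can be checked directly. This is a small sketch; using np.packbits is just one possible way to store flags more densely, at the price of having to unpack them before use.

```python
import numpy as np

flags = np.array([True, False, True, True, False, False, True, False] * 2)

print(flags.itemsize)  # 1 byte per element
print(flags.nbytes)    # 16 bytes for 16 flags

# If memory matters more than convenience, np.packbits stores 8 flags per byte
packed = np.packbits(flags)
print(packed.nbytes)   # 2

# np.unpackbits returns 0/1 values of dtype uint8, converted back to bool here
restored = np.unpackbits(packed).astype(bool)
```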

#### Strings
Initializing a NumPy array with a list of Python strings packs them into a fixed-width native NumPy dtype called np.str_.
Reserving the space necessary to fit the longest string for every element might look wasteful (especially in the fixed UCS-4 encoding, as opposed to the ‘dynamic’ choice of the UTF width in Python str):

In [23]:
import numpy as np

np.array(['abcde', 'x', 'y', 'x'])
# Output: array(['abcde', 'x', 'y', 'x'], dtype='<U5')
# Comments: 4 bytes per character (UCS-4), so 5 characters * 4 bytes = 20 bytes per element


array(['abcde', 'x', 'y', 'x'], dtype='<U5')

Another option is to keep references to Python str objects in a NumPy array of objects:

In [24]:
import numpy as np

np.array(['abcde', 'x', 'y', 'x'], object)
# Output: array(['abcde', 'x', 'y', 'x'], dtype=object)
# Comments: 1 byte per ASCII character, so each element size is 49 + len(element) bytes


array(['abcde', 'x', 'y', 'x'], dtype=object)

If you’re dealing with a raw sequence of bytes, NumPy has a fixed-length version of the Python bytes type called np.bytes_:

In [25]:
import numpy as np

np.array([b'abcde', b'x', b'y', b'x'])
# Output: array([b'abcde', b'x', b'y', b'x'], dtype='|S5')
# Comments: 1 byte per ASCII character, so each element size is 5 bytes


array([b'abcde', b'x', b'y', b'x'], dtype='|S5')
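
The element sizes quoted in the comments above can be read from the dtype. The sketch below also shows a consequence of the fixed width that is easy to miss: a longer string assigned later is silently truncated.

```python
import numpy as np

u = np.array(['abcde', 'x', 'y', 'x'])
s = np.array([b'abcde', b'x', b'y', b'x'])

print(u.dtype.itemsize)  # 20: 5 characters * 4 bytes
print(s.dtype.itemsize)  # 5: 5 characters * 1 byte

# The width is fixed at creation, so a longer string is cut to 5 characters
u[1] = 'a much longer string'
print(u[1])  # a muc
```

An array of dtype object does not have this limitation, since it only stores references.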

Here’s a useful function that decomposes a datetime64 array into an array of 7 integer columns (years, months, days, hours, minutes, seconds, microseconds):

In [26]:
def dt2cal(dt):
    # allocate output
    out = np.empty(dt.shape + (7,), dtype='u4')
    # decompose calendar floors
    Y, M, D, h, m, s = [dt.astype(f'M8[{x}]') for x in "YMDhms"]
    out[..., 0] = Y + 1970  # Gregorian Year
    out[..., 1] = (M - Y) + 1  # month
    out[..., 2] = (D - M) + 1  # day
    out[..., 3] = (dt - D).astype("m8[h]")  # hour
    out[..., 4] = (dt - h).astype("m8[m]")  # minute
    out[..., 5] = (dt - m).astype("m8[s]")  # second
    out[..., 6] = (dt - s).astype("m8[us]")  # microsecond
    return out

# Example usage:
a = np.array(['2021-12-15T09:00:00.000000', 
              '2021-12-18T19:00:00.000000', 
              '2021-12-24T09:00:00.000000'], dtype='datetime64[us]')

dt2cal(a)
# Output:
# array([[2021, 12, 15,  9,  0,  0,     0],
#        [2021, 12, 18, 19,  0,  0,     0],
#        [2021, 12, 24,  9,  0,  0,     0]], dtype=uint32)


array([[2021,   12,   15,    9,    0,    0,    0],
       [2021,   12,   18,   19,    0,    0,    0],
       [2021,   12,   24,    9,    0,    0,    0]], dtype=uint32)
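
The truncation idea behind dt2cal can also be seen on single fields. The sketch below is a simplified illustration with made-up timestamps, not a replacement for the function above: casting to a coarser unit floors the value, and the underlying integer counts units since 1970.

```python
import numpy as np

dt = np.array(['2021-12-15T09:00:00', '1999-01-31T23:59:59'], dtype='datetime64[s]')

# Truncating to year precision and reading the integer gives years since 1970
years = dt.astype('datetime64[Y]').astype(int) + 1970

# Months since 1970-01, reduced modulo 12, give the calendar month
months = dt.astype('datetime64[M]').astype(int) % 12 + 1

print(years.tolist(), months.tolist())  # [2021, 1999] [12, 1]
```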

#### Combinations thereof
A ‘structured array’ in NumPy is an array with a custom dtype made from the types described above as the basic building blocks (akin to struct in C).
A typical example is an RGB pixel color: a 3-byte type (usually 4 for alignment) in which the colors can be accessed by name:

In [27]:
import numpy as np

rgb = np.dtype([('x', np.uint8), ('y', np.uint8), ('z', np.uint8)])

a = np.zeros(5, rgb); a
# Output:
# array([(0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0)],
#       dtype=[('x', 'u1'), ('y', 'u1'), ('z', 'u1')])

a[0]
# Output: (0, 0, 0)

a[0]['x']
# Output: 0

a[0]['x'] = 10
a
# Output:
# array([(10,  0,  0), ( 0,  0,  0), ( 0,  0,  0), ( 0,  0,  0), ( 0,  0,  0)],
#       dtype=[('x', 'u1'), ('y', 'u1'), ('z', 'u1')])

a['z'] = 5
a
# Output:
# array([(10,  0,  5), ( 0,  0,  5), ( 0,  0,  5), ( 0,  0,  5), ( 0,  0,  5)],
#       dtype=[('x', 'u1'), ('y', 'u1'), ('z', 'u1')])


array([(10, 0, 5), ( 0, 0, 5), ( 0, 0, 5), ( 0, 0, 5), ( 0, 0, 5)],
      dtype=[('x', 'u1'), ('y', 'u1'), ('z', 'u1')])
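
Two properties of such a dtype are worth checking: its size in bytes, and the fact that selecting a field gives a view rather than a copy. This is a minimal sketch; the field names r, g, b are hypothetical and chosen only for this illustration.

```python
import numpy as np

# Hypothetical field names r, g, b (the cell above uses x, y, z)
rgb = np.dtype([('r', np.uint8), ('g', np.uint8), ('b', np.uint8)])
pixels = np.zeros(4, rgb)

print(rgb.itemsize)  # 3 bytes per pixel, no padding

# Selecting a field returns a view, so writing to it updates the parent array
red = pixels['r']
red[:] = 255
print(pixels[0])  # (255, 0, 0)
```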

Even though this syntax is convenient for addressing particular columns as a whole, neither structured arrays nor recarrays are something you’d want to use in the innermost loop of compute-intensive code:


In [29]:
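a = np.random.rand(100000, 2)  # two columns, so each row maps onto one (x, y) record

# reshape(-1) flattens the (100000, 1) view so that iteration yields scalar records
b = a.view(dtype=[('x', np.float64), ('y', np.float64)]).reshape(-1)

c = np.recarray(buf=a, shape=len(a), dtype=[('x', np.float64), ('y', np.float64)])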

# Reference calculation
s1 = 0
for r in a:
    s1 += (r[0]**2 + r[1]**2)**-1.5

# 5x slower
s2 = 0
for r in b:
    s2 += (r['x']**2 + r['y']**2)**-1.5

# 7x slower
s3 = 0
for r in c:
    s3 += (r.x**2 + r.y**2)**-1.5

# 20x faster
s1_fast = np.sum((a[:, 0]**2 + a[:, 1]**2)**-1.5)

# Same as s1
s2_fast = np.sum((b['x']**2 + b['y']**2)**-1.5)

# Same as s1
s3_fast = np.sum((c.x**2 + c.y**2)**-1.5)
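
The speed-ups quoted in the comments depend on the machine and the NumPy version. A way to measure them yourself is sketched below with timeit from the standard library; the helper names loop_sum and vector_sum are made up for this example, and no particular timing is claimed.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
xy = rng.random((10_000, 2)) + 0.1  # the offset keeps x**2 + y**2 away from zero
rec = xy.view(dtype=[('x', np.float64), ('y', np.float64)]).reshape(-1)

def loop_sum(records):
    """Element-by-element Python loop over the structured array."""
    total = 0.0
    for r in records:
        total += (r['x']**2 + r['y']**2)**-1.5
    return total

def vector_sum(records):
    """Same calculation on whole columns at once."""
    return np.sum((records['x']**2 + records['y']**2)**-1.5)

# Both approaches agree up to floating point rounding
same = bool(np.isclose(loop_sum(rec), vector_sum(rec)))

# Total time for three runs of each variant, in seconds
t_loop = timeit.timeit(lambda: loop_sum(rec), number=3)
t_vec = timeit.timeit(lambda: vector_sum(rec), number=3)
print(same, t_loop, t_vec)
```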


In [3]:
import numpy as np
# create a 1-dimensional array
arr_1d = np.array([5, 4, 3, 2, 1])
print(arr_1d)

[5 4 3 2 1]


In [4]:
# create a 2-dimensional array
arr_2d = np.array([[9, 8, 7], [6, 5, 4],[3, 2, 1]])
print(arr_2d)

[[9 8 7]
 [6 5 4]
 [3 2 1]]


In [5]:
# create an array of zeros
zeros = np.zeros((3, 5))
print(zeros)

[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]


In [9]:
# create an array of ones
ones = np.ones((4, 2))
print(ones)


[[1. 1.]
 [1. 1.]
 [1. 1.]
 [1. 1.]]


In [14]:
# create an array filled with a given number
full = np.full((3, 7), 666)
print(full)

[[666 666 666 666 666 666 666]
 [666 666 666 666 666 666 666]
 [666 666 666 666 666 666 666]]


In [15]:
# create an array from a range of numbers
arr_range = np.arange(10, 100, 4)
print(arr_range)

[10 14 18 22 26 30 34 38 42 46 50 54 58 62 66 70 74 78 82 86 90 94 98]


In [16]:
# create an array of evenly spaced numbers using linspace
arr_linspace = np.linspace(0, 10, 19)
# start 0, end 10 (included), 19 evenly spaced values
print(arr_linspace)

[ 0.          0.55555556  1.11111111  1.66666667  2.22222222  2.77777778
  3.33333333  3.88888889  4.44444444  5.          5.55555556  6.11111111
  6.66666667  7.22222222  7.77777778  8.33333333  8.88888889  9.44444444
 10.        ]
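
The difference between the two functions is that np.arange takes a step and excludes the stop value, while np.linspace takes a number of samples and includes the end point by default. A short sketch of how the length and the spacing follow from the arguments:

```python
import math
import numpy as np

start, stop, step = 10, 100, 4
r = np.arange(start, stop, step)

# The stop value is excluded, so the length is ceil((stop - start) / step)
print(len(r), math.ceil((stop - start) / step))  # 23 23

# retstep=True additionally returns the spacing between samples
values, spacing = np.linspace(0, 10, 19, retstep=True)
print(spacing)  # 10 / 18, about 0.5556
```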


In [20]:
# NumPy can also create arrays filled with random values from different distributions
# using the np.random module
rand_int = np.random.randint(1, 51, size = (1, 5))
print(rand_int)  # e.g. to play EuroMillions

[[ 8 45 43 24 32]]


In [21]:
# do the same with floats
rand_uniform = np.random.rand(3, 3) # random floats between 0 and 1
print(rand_uniform)

[[0.8518515  0.58339615 0.82804326]
 [0.72062106 0.81406305 0.67435065]
 [0.24059689 0.5393053  0.81992704]]


In [22]:
# create an array of random floats drawn from a normal distribution
rand_normal = np.random.randn(3, 5)
print(rand_normal)

[[ 0.0109932   0.39623004  1.02175575  0.12940946 -0.62328554]
 [-0.4206292  -0.20410721  1.27404493  0.48641331 -0.83650416]
 [-0.33399371  0.84741548  0.02101061  0.94347554  0.33105821]]
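
The cells above use the legacy np.random functions, whose output changes on every run. When reproducible numbers are needed, one option is the Generator interface created with np.random.default_rng. The sketch below mirrors the three cells above; the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # the seed value is arbitrary

ints = rng.integers(1, 51, size=(1, 5))  # integers from 1 to 50
uniform = rng.random((3, 3))             # floats in [0, 1)
normal = rng.standard_normal((3, 5))     # standard normal distribution

# Re-creating the generator with the same seed reproduces the same numbers
again = np.random.default_rng(seed=42).integers(1, 51, size=(1, 5))
print(np.array_equal(ints, again))  # True
```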


#### Array attributes and properties
Let’s look at several attributes and properties of NumPy arrays that provide important information about an array, such as its shape, size, and data type. Understanding them is crucial for manipulating data correctly and running mathematical operations on NumPy arrays.

The shape attribute returns a tuple with the size of the array along each dimension.

The ndim attribute gives the number of dimensions of the array.

In [27]:
arr = np.array([[[6, 5, 4], [3 ,2, 20]]])
print('Shape: ', arr.shape)
print('Number of dimensions: ', arr.ndim)

Shape:  (1, 2, 3)
Number of dimensions:  3


In [29]:
# The size attribute returns the total number of elements in the array.
# The dtype attribute gives the data type of the array elements.
arr = np.array([[[4, 5, 7, 9], [0, 1, 2, 0]]])
print('Size: ', arr.size)
print('Data type: ', arr.dtype)

Size:  8
Data type:  int32


In [30]:
# The itemsize attribute returns the size (in bytes) of each array element.
# The nbytes attribute gives the total number of bytes used by the array data.
arr = np.array([[[1, 1, 2], [3, 3, 5]]], dtype = np.float64)
print('Item size: ', arr.itemsize)
print('Total bytes: ', arr.nbytes)

Item size:  8
Total bytes:  48
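
These attributes are related to each other, which gives a quick way to estimate memory use before allocating a large array. A minimal sketch:

```python
import numpy as np

arr = np.zeros((1, 2, 3), dtype=np.float64)

# size is the product of the shape; nbytes is size * itemsize
print(arr.size == np.prod(arr.shape))         # True
print(arr.nbytes == arr.size * arr.itemsize)  # True

# Choosing a smaller dtype shrinks the memory footprint accordingly
small = arr.astype(np.float32)
print(small.nbytes)  # 24
```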


In [34]:
# You can change the shape of an array without altering its data using the reshape() function
arr = np.arange(1, 13)
print('Original array: ', arr)

reshaped_arr = arr.reshape(3, 4)
print('Reshaped array: ')
print(reshaped_arr)

Original array:  [ 1  2  3  4  5  6  7  8  9 10 11 12]
Reshaped array: 
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
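
Two details of reshape are easy to overlook: one dimension can be left for NumPy to infer, and for a contiguous array the result shares its data with the original. A short sketch under those assumptions:

```python
import numpy as np

arr = np.arange(1, 13)

# -1 asks NumPy to infer that dimension from the total size
inferred = arr.reshape(3, -1)
print(inferred.shape)  # (3, 4)

# For a contiguous array like this one, reshape returns a view
inferred[0, 0] = 100
print(arr[0])  # 100

# A shape that does not match the number of elements raises ValueError
try:
    arr.reshape(5, 3)
    error = None
except ValueError as exc:
    error = str(exc)
```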


In [44]:
# The transpose() function or the T attribute can be used to transpose an array,
# which swaps its rows and columns.
arr = np.array([[6, 5, 4], [3, 2, 1]])
print('Original array: ')
print(arr)

transposed_arr = arr.transpose()
print('Transposed array: ')
print(transposed_arr)

Original array: 
[[6 5 4]
 [3 2 1]]
Transposed array: 
[[6 3]
 [5 2]
 [4 1]]
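
The T attribute mentioned in the comment gives the same result as transpose(), and for arrays with more than two dimensions the axis order can be passed explicitly. A minimal sketch:

```python
import numpy as np

arr = np.array([[6, 5, 4], [3, 2, 1]])

# .T is shorthand for transpose() with the default axis order
print(np.array_equal(arr.T, arr.transpose()))  # True

# The transpose is a view, so no data is copied
print(np.shares_memory(arr, arr.T))  # True

# For more than two dimensions, the new axis order can be given explicitly
cube = np.zeros((2, 3, 4))
print(cube.transpose(1, 0, 2).shape)  # (3, 2, 4)
print(cube.T.shape)                   # (4, 3, 2)
```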


#### Array indexing and slicing
These techniques let you manipulate array elements efficiently and extract specific parts of the data for further analysis. Array indexing in NumPy works much like list indexing in Python: you use square brackets to access individual elements, specifying an index for each dimension.

In [41]:
arr = np.array([[6, 5, 4], [3, 2, 1]])
print('Element at position (0, 0):' , arr[0, 0])
print('Element at position (1, 1):' , arr[1, 1])
      

Element at position (0, 0): 6
Element at position (1, 1): 2


Array slicing extracts part of an array by specifying start and stop indices, and optionally a step, for each dimension. The syntax is similar to Python list slicing: a colon (:) separates the start, stop, and step values.

In [43]:
arr = np.array([6, 5, 4, 3, 2, 1])
print('Elements from index 1 to 4: ', arr[1 : 5])
print('Every second element: ', arr[::2])

Elements from index 1 to 4:  [5 4 3 2]
Every second element:  [6 4 2]
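
Negative indices and steps work as they do for lists, but there is one important difference: a NumPy slice is a view of the original data, not a copy. A small sketch:

```python
import numpy as np

arr = np.array([6, 5, 4, 3, 2, 1])

print(arr[-1])    # 1, the last element
print(arr[::-1])  # [1 2 3 4 5 6], reversed
print(arr[-3:])   # [3 2 1], the last three elements

# Unlike list slices, NumPy slices are views of the original data
part = arr[1:3]
part[:] = 0
print(arr)  # [6 0 0 3 2 1]
```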


For multidimensional arrays, use a comma to separate the slices for each dimension:

In [45]:
arr = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])
print('First two rows and columns: ')
print(arr[:2, :2])

First two rows and columns: 
[[9 8]
 [6 5]]
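
The same comma-separated syntax selects whole rows or columns. The sketch below also shows copy(), which is needed when the extracted block must not affect the original array.

```python
import numpy as np

arr = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])

print(arr[1])     # [6 5 4], the second row
print(arr[:, 0])  # [9 6 3], the first column

# Rows 0 and 2, columns 1 and 2
corner = arr[::2, 1:]

# Use copy() when the result must be independent of the original array
block = arr[:2, :2].copy()
block[0, 0] = -1
print(arr[0, 0])  # 9
```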



You can use indexing and slicing to modify array elements in NumPy. This lets you set particular elements to a new value or update part of the array based on a condition.

In [53]:
arr = np.array([5, 2, 4, 8, 9, 6])
arr[2] = 999
print('Modified array: ', arr)

# Change the value of the elements whose index is greater than 4
arr[5:] = -100  # set everything from the sixth element onward to -100

print('Array with index greater than 4 set to -100:', arr)

arr[arr > 4] = -100
print('Array with values greater than 4 set to -100:', arr)



Modified array:  [  5   2 999   8   9   6]
Array with index greater than 4 set to -100: [   5    2  999    8    9 -100]
Array with values greater than 4 set to -100: [-100    2 -100 -100 -100 -100]
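
The expression arr > 4 used above produces a boolean mask. When the original array should stay unchanged, np.where can build a new array from the same mask. A minimal sketch:

```python
import numpy as np

arr = np.array([5, 2, 4, 8, 9, 6])

mask = arr > 4
print(mask)  # [ True False False  True  True  True]

# np.where builds a new array and leaves the original untouched
replaced = np.where(mask, -100, arr)
print(replaced.tolist())  # [-100, 2, 4, -100, -100, -100]
print(arr.tolist())       # [5, 2, 4, 8, 9, 6]

# Conditions are combined with & and |, each wrapped in parentheses
between = arr[(arr > 2) & (arr < 9)]
print(between.tolist())  # [5, 4, 8, 6]
```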
