# Working with NumPy

Preliminary and rather incomplete notes on working with NumPy.

More details:

- https://www.w3schools.com/python/numpy/default.asp

These notes focus mostly on NumPy as used in pandas, in particular:

- <a href="#ndarray">np.ndarray</a>
- <a href="#random">np.random</a>

In [1]:
import numpy as np

## numpy.ndarray <a name="ndarray"></a>

- https://numpy.org/doc/stable/reference/arrays.ndarray.html
- https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html

In [2]:
# Both calls give you a new array without initializing its memory, so it contains arbitrary values and
# rerunning the cell may give different values. In the second case you explicitly pass an uninitialized buffer.

a1 = np.ndarray(shape=(2,3), dtype=int)
a2 = np.ndarray(shape=(2,3), dtype=int, buffer=np.empty((2,3)))
print(a1)
print(a2)

[[ 1152921504606846976  1152921504606846976 -9223372036854775800]
 [                   0           4294967296                    0]]
[[1152921504606846976 1152921504606846976      35871566856197]
 [5572452859464646656                   0                   0]]
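
The low-level `np.ndarray` constructor is rarely called directly; the NumPy documentation suggests building arrays with helper functions instead. A minimal sketch of the closest equivalents follows (the variable names are only illustrative, and `np.int64` is chosen here to keep the dtype explicit):

```python
import numpy as np

# Uninitialized memory: contents are arbitrary, like np.ndarray(shape=...)
uninitialized = np.empty((2, 3), dtype=np.int64)

# Initialized alternatives, usually what you actually want
zeros = np.zeros((2, 3), dtype=np.int64)
tens = np.full((2, 3), 10, dtype=np.int64)

print(uninitialized.shape, zeros.sum(), tens.sum())
```

Only the shape of `uninitialized` is predictable; its values should never be relied on.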


In [3]:
# Put in zero values. Without dtype=int you would get floats, because float is the default dtype. The input
# buffer can be larger than the shape defined by the ndarray but it cannot be smaller. Here we get 2 rows
# and 5 columns.

np.ndarray(shape=(2,5), dtype=int, buffer=np.zeros((10,10)))

array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
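
One caveat with `buffer=`: the bytes of the buffer are reinterpreted, not converted. The zeros above come out fine because a float zero and an integer zero share the same all-zero bit pattern. A small sketch with ones shows the difference (explicit `np.float64` and `np.int64` are assumptions made here so the sizes match on every platform):

```python
import numpy as np

ones_as_float = np.ones(2, dtype=np.float64)

# Same bytes, read as 64-bit integers: no conversion happens
reinterpreted = np.ndarray(shape=(2,), dtype=np.int64, buffer=ones_as_float)

# astype converts the values and returns a new array
converted = ones_as_float.astype(np.int64)

print(reinterpreted)  # the bit pattern of 1.0, not the number 1
print(converted)

# The ndarray shares memory with its buffer, so changes show through
ones_as_float[0] = 0.0
print(reinterpreted[0])
```

If the goal is a converted copy, `astype` is probably the safer route.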

In [4]:
# Fill the array with a given value, here 10.

np.ndarray(shape=(2,6), dtype=int, buffer=np.full((2,6), 10))

array([[10, 10, 10, 10, 10, 10],
       [10, 10, 10, 10, 10, 10]])

In [5]:
# Changing a cell: here we set the cell in the second row and the third column to 0.

array = np.ndarray(shape=(3,5), dtype=int, buffer=np.full((3,5), 10))
print(array[1])
array[1][2] = 0
print(array[1])

[10 10 10 10 10]
[10 10  0 10 10]


In [6]:
# two ways of accessing: array[x][y] and array[x,y]

array = np.array([[1,2,3],[4,5,6]])
print(array[1][0], array[1,0])

4 4
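
The two forms agree for single cells, but they diverge once slices are involved, because `array[x][y]` is two separate indexing steps. A short sketch of that difference (variable names are illustrative):

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6]])

first_column = array[:, 0]   # every row, first column
row_tail = array[1, 1:]      # second row, from the second column on

# array[:][0] is NOT the first column: array[:] is the whole array,
# and [0] then picks its first row
chained = array[:][0]

print(first_column, row_tail, chained)
```

For that reason the comma form is usually the one to prefer.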


In [7]:
# This does not work with strings, because, well, it is numpy not strpy. It does not raise an error, but
# dtype=str without a length becomes '<U0' (zero-length strings), so every cell ends up empty. Probably
# just stay away from that.

array = np.ndarray(shape=(2,6), dtype=str, buffer=np.full((2,6), 'hoppa'))
print(array[0])
array[0][0] = 'ole!'
print(array[0])
array

['' '' '' '' '' '']
['' '' '' '' '' '']


array([['', '', '', '', '', ''],
       ['', '', '', '', '', '']], dtype='<U0')
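
If string cells are really needed, letting NumPy infer a fixed-width string dtype seems to work better than going through `np.ndarray`. A minimal sketch, reusing the `'hoppa'` value from above (the word `'tralala'` is just a made-up longer string):

```python
import numpy as np

words = np.full((2, 6), 'hoppa')   # dtype is inferred as 5-character unicode
words[0, 0] = 'ole!'               # shorter strings fit
words[0, 1] = 'tralala'            # longer strings are silently truncated

print(words.dtype)
print(words[0])
```

The silent truncation is another reason to be careful with strings in NumPy arrays.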

In [8]:
# Create an ndarray straight from a nested list

array = np.array([[1, 2], [3, 4]])
print(f'{type(array)}  dtype={array.dtype}')
array

<class 'numpy.ndarray'>  dtype=int64


array([[1, 2],
       [3, 4]])
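
The dtype is inferred from the values unless one is given. A small sketch of how that inference behaves (the sample values are arbitrary):

```python
import numpy as np

ints = np.array([[1, 2], [3, 4]])
mixed = np.array([[1, 2], [3, 4.5]])                    # one float promotes everything
forced = np.array([[1, 2], [3, 4]], dtype=np.float32)   # an explicit dtype wins

print(ints.dtype.kind, mixed.dtype, forced.dtype)
```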

In [9]:
# The dtype is a bit of a mystery to me
# https://numpy.org/doc/stable/reference/generated/numpy.dtype.html
# the following gives you named columns (fields), could have used '<i2' or '<i1' as well

x = np.array([(1,2),(3,4)],dtype=[('a','<i4'),('b','<i1')])
print(x['a'], x['b'])

[1 3] [2 4]
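
To take some of the mystery out of the dtype strings, they can be inspected directly. The sketch below assumes the usual reading of the codes: `<` for little-endian, `i` for signed integer, and the digit for the size in bytes.

```python
import numpy as np

# '<i4' is a 4-byte integer, '<i1' a 1-byte integer
print(np.dtype('<i4').itemsize, np.dtype('<i1').itemsize)

# A structured dtype gives every element named fields, each with its own type
x = np.array([(1, 2), (3, 4)], dtype=[('a', '<i4'), ('b', '<i1')])
print(x.dtype.names)             # the field names
print(x.shape)                   # one dimension: two records
print(x[0]['a'], x['b'].dtype)   # a single field value, and a field's dtype
```

So the array is one-dimensional, and the "columns" are fields inside each record.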


In [10]:
# zero-dimensional, three-dimensional and multi-dimensional

# this is an array but also acts like a scalar
array = np.array(90)
print(array, type(array), array == 90, array == 99, array + 10, end='\n\n')

array = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [7, 8, 9]]])
print(array, end='\n\n')

array = np.array([1, 2, 3, 4], ndmin=5)
print(array)
print('number of dimensions :', array.ndim)

90 <class 'numpy.ndarray'> True False 100

[[[1 2 3]
  [4 5 6]]

 [[1 2 3]
  [7 8 9]]]

[[[[[1 2 3 4]]]]]
number of dimensions : 5
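
The shape attribute makes the difference between these cases more visible. A brief sketch (the names `scalar_like` and `padded` are illustrative only):

```python
import numpy as np

scalar_like = np.array(90)
print(scalar_like.ndim, scalar_like.shape)   # no dimensions, empty shape tuple
print(type(scalar_like.item()))              # item() returns a plain Python int

padded = np.array([1, 2, 3, 4], ndmin=5)
print(padded.shape)                          # axes of length 1 are prepended
print(padded.squeeze().shape)                # squeeze drops the length-1 axes
```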


In [11]:
# row-major versus column-major
# seems to be mostly about in-memory layout
# https://en.wikipedia.org/wiki/Row-_and_column-major_order

array = np.array([[1,2,3],[4,5,6]], order='C')
print(array[0])

array = np.array([[1,2,3],[4,5,6]], order='F')
print(array[0])

[1 2 3]
[1 2 3]
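
Indexing gives the same result for both orders, so the difference only shows up when looking at the memory layout. A sketch using strides and flags, with `np.int64` set explicitly here so the byte counts are predictable:

```python
import numpy as np

data = [[1, 2, 3], [4, 5, 6]]
c_array = np.array(data, dtype=np.int64, order='C')
f_array = np.array(data, dtype=np.int64, order='F')

# Same values, same indexing
print(np.array_equal(c_array, f_array))

# Different layout: strides are the bytes to step along each axis
print(c_array.strides, f_array.strides)
print(c_array.flags['C_CONTIGUOUS'], f_array.flags['F_CONTIGUOUS'])

# order='K' reads the elements in the order they sit in memory
print(c_array.ravel(order='K'), f_array.ravel(order='K'))
```

In row-major order a row is contiguous in memory; in column-major order a column is.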


In [12]:
# Mean of all values, then per column (axis 0), then per row (axis 1)

print(
    np.array([[1,2,3],[4,5,6]]).mean(),
    np.array([[1,2,3],[4,5,6]]).mean(0),
    np.array([[1,2,3],[4,5,6]]).mean(1))

3.5 [2.5 3.5 4.5] [2. 5.]
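
The axis argument names the axis that gets collapsed, which is easy to mix up. A small sketch with keyword arguments, plus `keepdims` as one possible way to reuse the result:

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6]])

column_means = array.mean(axis=0)   # collapse the rows: one value per column
row_means = array.mean(axis=1)      # collapse the columns: one value per row

# keepdims=True keeps the collapsed axis with length 1, so the result
# broadcasts against the original array
centered = array - array.mean(axis=1, keepdims=True)

print(column_means, row_means)
print(centered)
```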


In [13]:
# Reshape 27 numbers into 3 rows of 9, and into a 3x3x3 cube

print(np.arange(27).reshape((3,9)), end='\n\n')
print(np.arange(27).reshape((3,3,3)))

[[ 0  1  2  3  4  5  6  7  8]
 [ 9 10 11 12 13 14 15 16 17]
 [18 19 20 21 22 23 24 25 26]]

[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]]

 [[ 9 10 11]
  [12 13 14]
  [15 16 17]]

 [[18 19 20]
  [21 22 23]
  [24 25 26]]]
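
A few more properties of `reshape` that are worth knowing, shown as a sketch (the 4 by 7 shape is just an example of a size that does not fit 27 elements):

```python
import numpy as np

numbers = np.arange(27)

# -1 lets NumPy infer that dimension from the total size
inferred = numbers.reshape((3, -1))
print(inferred.shape)

# The total number of elements must stay the same
try:
    numbers.reshape((4, 7))
except ValueError as error:
    print('reshape failed:', error)

# For a contiguous array like this one, reshape returns a view,
# so the data is shared with the original
inferred[0, 0] = 100
print(numbers[0])
```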


## numpy.random <a name="random"></a>

- https://numpy.org/doc/stable/reference/random/index.html
- https://numpy.org/doc/stable/reference/random/generated/numpy.random.rand.html
- https://numpy.org/doc/stable/reference/random/generated/numpy.random.normal.html

In [44]:
# Create an array of the given shape filled with random numbers between 0 and 1;
# more precisely, each number is in [0,1)

np.random.rand(3,5)

array([[0.37256614, 0.56669596, 0.77546895, 0.18228675, 0.64546993],
       [0.35290813, 0.56794851, 0.45316717, 0.83217913, 0.75111696],
       [0.52881271, 0.48970144, 0.60287095, 0.64120829, 0.24909369]])
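
For new code, the NumPy documentation recommends the newer `Generator` interface, created with `np.random.default_rng`. A minimal sketch of the same draw with that interface; the seed value 42 is an arbitrary choice made here for repeatability:

```python
import numpy as np

rng = np.random.default_rng(seed=42)
values = rng.random((3, 5))            # uniform over [0, 1)

# A second generator with the same seed reproduces the same numbers
same_rng = np.random.default_rng(seed=42)
same_values = same_rng.random((3, 5))

print(values.shape, values.min() >= 0.0, values.max() < 1.0)
print(np.array_equal(values, same_values))
```

Note that the shape is passed as one tuple here, not as separate arguments as in `np.random.rand`.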

In [15]:
# Return one random number from a Gaussian distribution. The first argument is the mean (center) of the
# distribution, not a range: values cluster around it but are not bounded by it.

print(
    np.random.normal(),
    np.random.normal(10))

0.558624317817056 8.95739651691521
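
That the first argument acts as a center can be checked by drawing many samples and averaging them. A sketch using the `Generator` interface with the keyword names `loc` and `size`; the seed and sample count are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# loc is the mean; the standard deviation (scale) keeps its default of 1
samples = rng.normal(loc=10, size=100_000)

print(samples.mean())   # should be close to 10
print(samples.std())    # should be close to 1
print(samples.min(), samples.max())   # individual draws stray well away from 10
```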


In [60]:
# Return an array of random numbers pulled from a Gaussian distribution; you give it the
# center of the distribution, the standard deviation and the length of the array.

np.random.normal(0.0, 1.0, 10)

array([ 1.57906567,  0.80833633, -0.53700709, -0.40939393, -0.67742872,
        0.49380391, -0.91201225, -0.20768049, -0.41424752,  1.26916781])
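
The third argument can also be a shape tuple, and the standard deviation controls how widely the draws spread. A rough sketch that checks the familiar rule of thumb that about 68% of draws fall within one standard deviation (seed, scale and shape are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# mean 0.0, standard deviation 2.0, and a 2-D shape instead of a length
samples = rng.normal(0.0, 2.0, size=(1000, 100))

# Fraction of draws within one standard deviation of the mean
within_one_sd = np.mean(np.abs(samples) < 2.0)
print(samples.shape, within_one_sd)
```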

## Others

In [63]:
# Return 5 evenly spaced numbers from 0 to 100, both endpoints included

np.linspace(0, 100, 5)

array([  0.,  25.,  50.,  75., 100.])
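
`np.linspace` takes a count, while `np.arange` takes a step size, and they treat the end value differently. A short side-by-side sketch:

```python
import numpy as np

with_endpoint = np.linspace(0, 100, 5)                      # 0, 25, 50, 75, 100
without_endpoint = np.linspace(0, 100, 5, endpoint=False)   # 0, 20, 40, 60, 80
stepped = np.arange(0, 100, 25)                             # 0, 25, 50, 75 (stop excluded)

print(with_endpoint, without_endpoint, stepped)
```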

In [64]:
# Create an array of the given length filled with the same number

np.repeat(1, 10)

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
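
A few related functions behave slightly differently once the input is a sequence and not a single number. A quick comparison sketch (variable names are illustrative):

```python
import numpy as np

ones_repeat = np.repeat(1, 10)     # ten ones
ones_full = np.full(10, 1)         # same result, arguably clearer intent
each_repeated = np.repeat([1, 2], 3)   # each element repeated: 1 1 1 2 2 2
sequence_tiled = np.tile([1, 2], 3)    # whole sequence repeated: 1 2 1 2 1 2

print(ones_repeat, ones_full)
print(each_repeated, sequence_tiled)
```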