### Creating Arrays from Scratch

In [1]:
import numpy as np

We can also create `ndarray` objects by using certain functions that NumPy provides.

#### zeros

The `np.zeros` function is used to create arrays filled with just zeros.

In [2]:
a = np.zeros(5)

In [3]:
a

array([0., 0., 0., 0., 0.])

In [4]:
a.dtype

dtype('float64')

The default `dtype` is `float64`, but we can specify a different one:

In [5]:
a = np.zeros(5, dtype=int)

In [6]:
a

array([0, 0, 0, 0, 0])

In [7]:
a.dtype

dtype('int64')

Notice that I used the Python `int` type to specify `dtype`. This works because NumPy picks the C type that best matches the Python type: `int` maps to NumPy's default integer type (`int64` on most 64-bit platforms), and Python `float` maps to `np.float64`.
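One way to see this mapping is to ask NumPy directly which `dtype` it associates with each Python type. This is only a minimal sketch: the integer width depends on the platform and NumPy version, so the comment on it is deliberately vague.

```python
import numpy as np

# Ask NumPy which dtype it associates with each built-in Python type.
float_dtype = np.dtype(float)
int_dtype = np.dtype(int)
bool_dtype = np.dtype(bool)

print(float_dtype)  # float64
print(int_dtype)    # platform dependent, commonly int64
print(bool_dtype)   # bool

# Arrays created with a Python type as dtype use that same NumPy dtype.
a = np.zeros(5, dtype=float)
b = np.zeros(5, dtype=int)
print(a.dtype == float_dtype, b.dtype == int_dtype)  # True True
```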

So we can specify the length of the 1-D array, as well as the data type.

But we can also specify a shape for creating multi-dimensional arrays:

In [8]:
m = np.zeros((4, 3), dtype=np.uint8)

In [9]:
m

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]], dtype=uint8)

In [10]:
m.shape

(4, 3)
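The shape tuple is not limited to two dimensions. As a small sketch (the variable name `t` is just for illustration), a three-entry tuple produces a 3-D array:

```python
import numpy as np

# A shape tuple with three entries gives a 3-D array:
# 2 blocks, each with 3 rows and 4 columns.
t = np.zeros((2, 3, 4), dtype=np.uint8)

print(t.ndim)   # 3
print(t.shape)  # (2, 3, 4)
print(t.size)   # 24
```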

#### ones

The `np.ones` function works the same as `np.zeros` but will populate the array with `1`s.

In [11]:
m = np.ones((10, 2), dtype=float)

In [12]:
m

array([[1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.],
       [1., 1.]])

In [13]:
m.dtype

dtype('float64')

#### full

The `np.full` function is basically a more generic variant of `np.zeros` and `np.ones` where we can specify the element value we want to fill the array with.

In [14]:
m = np.full((2, 5), 3.14)

In [15]:
m

array([[3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14]])
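When no `dtype` is given, `np.full` takes the element type from the fill value, which is why the array above holds floats. The sketch below illustrates this, along with the relationship to `np.zeros`; the variable names are only for illustration.

```python
import numpy as np

int_filled = np.full((2, 3), 7)            # integer fill value -> integer array
float_filled = np.full((2, 3), 7.0)        # float fill value -> float64 array
forced = np.full((2, 3), 7, dtype=float)   # an explicit dtype takes precedence

print(float_filled.dtype)  # float64
print(forced.dtype)        # float64

# Filling with 0.0 gives the same array as np.zeros with the same shape.
same_as_zeros = np.array_equal(np.full((2, 3), 0.0), np.zeros((2, 3)))
print(same_as_zeros)  # True
```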

#### identity matrix

If you recall past lectures, we had some work to do to create identity matrices in Python. NumPy makes this much easier with the function `np.eye`.

In [16]:
m = np.eye(5)
m

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [17]:
m.dtype

dtype('float64')

And as usual, we can specify `dtype` if we want to:

In [18]:
m = np.eye(4, dtype=int)

In [19]:
m

array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]])

In [20]:
m.dtype

dtype('int64')

We can also create a non-square matrix, but this function expects the number of rows and the number of columns to be passed as separate arguments, not as a shape tuple. If the number of columns is omitted, it defaults to the number of rows.

In [21]:
np.eye(5, 3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 0.],
       [0., 0., 0.]])
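The same call can be written with keyword arguments, which may make the row and column counts easier to read. In this sketch, `N` and `M` are the parameter names `np.eye` uses for rows and columns:

```python
import numpy as np

# Rows and columns passed as two separate positional arguments.
m = np.eye(2, 4, dtype=int)

# The same matrix, using np.eye's keyword names N (rows) and M (columns).
same = np.eye(N=2, M=4, dtype=int)

print(m)
# [[1 0 0 0]
#  [0 1 0 0]]
print(np.array_equal(m, same))  # True
```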

#### range based

In Python we can create a list based on a range object:

In [22]:
list(range(2, 11, 2))

[2, 4, 6, 8, 10]

NumPy provides similar functionality, using the `np.arange` function.

In [23]:
np.arange(2, 11, 2)

array([ 2,  4,  6,  8, 10])

As usual, we can specify the element data type:

In [24]:
np.arange(2, 11, 2, dtype=np.uint8)

array([ 2,  4,  6,  8, 10], dtype=uint8)
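Like `range`, `np.arange` excludes the stop value and accepts a single argument. Unlike `range`, it also accepts a non-integer step. A short sketch, with values chosen so the results are exact in floating point:

```python
import numpy as np

# One argument: start defaults to 0 and step to 1, just like range(5).
a = np.arange(5)
print(a)  # [0 1 2 3 4]

# Unlike range, the step may be a float; the stop value is still excluded.
b = np.arange(0, 1, 0.25)
print(b)  # [0.   0.25 0.5  0.75]
```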

#### linspace

`linspace` (linear space) allows us to create a specified number of evenly spread values between two (inclusive) values. This can be very useful for generating 1-D arrays representing some axis (like an x-axis, or a time axis).

In [25]:
np.linspace(2, 10, num=5)

array([ 2.,  4.,  6.,  8., 10.])

In [26]:
np.linspace(2, 10, 10)

array([ 2.        ,  2.88888889,  3.77777778,  4.66666667,  5.55555556,
        6.44444444,  7.33333333,  8.22222222,  9.11111111, 10.        ])
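Two optional arguments of `np.linspace` are worth knowing when building an axis: `retstep=True` also returns the spacing between values, and `endpoint=False` excludes the stop value. A minimal sketch:

```python
import numpy as np

# retstep=True returns the array together with the spacing between values.
values, step = np.linspace(2, 10, num=5, retstep=True)
print(values)  # [ 2.  4.  6.  8. 10.]
print(step)    # 2.0

# endpoint=False excludes the stop value, much like np.arange does.
open_values = np.linspace(2, 10, num=4, endpoint=False)
print(open_values)  # [2. 4. 6. 8.]
```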

For example, we may want to generate some data to plot some function.

Suppose we want to generate data for `x --> sin(x)` - we can start by creating a linspace of evenly spaced values between `-2pi` and `2pi`:

In [27]:
import math

x_coords = np.linspace(-2 * math.pi, 2 * math.pi, 50)

In [28]:
x_coords

array([-6.28318531, -6.02672876, -5.77027222, -5.51381568, -5.25735913,
       -5.00090259, -4.74444605, -4.48798951, -4.23153296, -3.97507642,
       -3.71861988, -3.46216333, -3.20570679, -2.94925025, -2.6927937 ,
       -2.43633716, -2.17988062, -1.92342407, -1.66696753, -1.41051099,
       -1.15405444, -0.8975979 , -0.64114136, -0.38468481, -0.12822827,
        0.12822827,  0.38468481,  0.64114136,  0.8975979 ,  1.15405444,
        1.41051099,  1.66696753,  1.92342407,  2.17988062,  2.43633716,
        2.6927937 ,  2.94925025,  3.20570679,  3.46216333,  3.71861988,
        3.97507642,  4.23153296,  4.48798951,  4.74444605,  5.00090259,
        5.25735913,  5.51381568,  5.77027222,  6.02672876,  6.28318531])

And now we could create an array of the function values at each of those x coordinates:

In [29]:
y_values = np.array([math.sin(x) for x in x_coords])

In [30]:
y_values

array([ 2.44929360e-16,  2.53654584e-01,  4.90717552e-01,  6.95682551e-01,
        8.55142763e-01,  9.58667853e-01,  9.99486216e-01,  9.74927912e-01,
        8.86599306e-01,  7.40277997e-01,  5.45534901e-01,  3.15108218e-01,
        6.40702200e-02, -1.91158629e-01, -4.33883739e-01, -6.48228395e-01,
       -8.20172255e-01, -9.38468422e-01, -9.95379113e-01, -9.87181783e-01,
       -9.14412623e-01, -7.81831482e-01, -5.98110530e-01, -3.75267005e-01,
       -1.27877162e-01,  1.27877162e-01,  3.75267005e-01,  5.98110530e-01,
        7.81831482e-01,  9.14412623e-01,  9.87181783e-01,  9.95379113e-01,
        9.38468422e-01,  8.20172255e-01,  6.48228395e-01,  4.33883739e-01,
        1.91158629e-01, -6.40702200e-02, -3.15108218e-01, -5.45534901e-01,
       -7.40277997e-01, -8.86599306e-01, -9.74927912e-01, -9.99486216e-01,
       -9.58667853e-01, -8.55142763e-01, -6.95682551e-01, -4.90717552e-01,
       -2.53654584e-01, -2.44929360e-16])

We'll actually see a much better way to generate this new array that avoids using Python based calculations, and instead uses NumPy vectorization.
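As a brief preview, and only a sketch of what vectorization might look like here, NumPy's own `np.sin` can be applied to the whole array at once, with no Python loop:

```python
import math
import numpy as np

x_coords = np.linspace(-2 * math.pi, 2 * math.pi, 50)

# Element by element in Python, as in the cell above.
y_loop = np.array([math.sin(x) for x in x_coords])

# Vectorized: np.sin is applied to every element of the array at once.
y_vectorized = np.sin(x_coords)

print(y_vectorized.shape)                  # (50,)
print(np.allclose(y_loop, y_vectorized))   # True
```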

#### random

We can also generate arrays filled with random numbers - either floats or integers.

To do this, we can use the `random` module in the NumPy library (not the `random` module in Python's standard library).

In [31]:
np.random.random(5)

array([0.31780555, 0.06443291, 0.31728872, 0.91010808, 0.47373671])

This created a 1-D array with 5 random floats, each drawn uniformly from the interval `[0, 1)`.

We can also set the seed if we want reproducible results:

In [32]:
np.random.seed(0)
np.random.random(5)

array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ])

In [33]:
np.random.seed(0)
np.random.random(5)

array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ])
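NumPy (version 1.17 and later) also provides generator objects through `np.random.default_rng`. This lecture does not use them, and they produce different numbers than `np.random.seed` for the same seed value, but the same reproducibility idea applies. A minimal sketch:

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(0)
rng_b = np.random.default_rng(0)

first = rng_a.random(5)
second = rng_b.random(5)

print(np.array_equal(first, second))  # True
```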

We can specify the shape of the generated array by passing a shape tuple, instead of a length, as the first argument:

In [34]:
np.random.random((5, 3))

array([[0.64589411, 0.43758721, 0.891773  ],
       [0.96366276, 0.38344152, 0.79172504],
       [0.52889492, 0.56804456, 0.92559664],
       [0.07103606, 0.0871293 , 0.0202184 ],
       [0.83261985, 0.77815675, 0.87001215]])

We can also generate random integers in some specified range:

In [35]:
np.random.seed(0)
np.random.randint(1, 10, 5)

array([6, 1, 4, 4, 8])

This generated a 1-D array of random integers between `1` and `10` (not inclusive of `10`), of length `5`.

For example, we could simulate rolling a die, 10 times:

In [36]:
np.random.seed(0)
np.random.randint(1, 6 + 1, 10)

array([5, 6, 1, 4, 4, 4, 2, 4, 6, 3])

By using a shape, we can simulate rolling two dice:

In [37]:
np.random.seed(0)
np.random.randint(1, 7, (10, 2))

array([[5, 6],
       [1, 4],
       [4, 4],
       [2, 4],
       [6, 3],
       [5, 1],
       [1, 5],
       [3, 2],
       [1, 2],
       [6, 2]])

or even 5 dice - this is still just a 2-D array:

In [38]:
np.random.seed(0)
np.random.randint(1, 7, (10, 5))

array([[5, 6, 1, 4, 4],
       [4, 2, 4, 6, 3],
       [5, 1, 1, 5, 3],
       [2, 1, 2, 6, 2],
       [6, 1, 2, 5, 4],
       [1, 4, 6, 1, 3],
       [4, 1, 2, 4, 6],
       [4, 4, 1, 2, 2],
       [2, 1, 3, 5, 4],
       [4, 3, 5, 3, 1]])
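In these arrays each row is one roll and each column is one die. As a small sketch of how that layout could be used (the names `rolls` and `totals` are just for illustration), we can check the value range and add up the dice in each roll:

```python
import numpy as np

np.random.seed(0)
rolls = np.random.randint(1, 7, (10, 5))  # 10 rolls of 5 dice

# Every value lies in 1..6 because the upper bound 7 is excluded.
print(rolls.min() >= 1, rolls.max() <= 6)  # True True

# Summing across each row gives the total shown on each roll.
totals = rolls.sum(axis=1)
print(totals.shape)  # (10,)
```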

So now we have seen many ways of creating `ndarray` objects: either by converting a Python list (which could be loaded from a CSV file, for example), or by generating them with the specialized `numpy` functions we saw in this lecture.

Later we'll see how we can transform existing arrays into other arrays by passing them through functions.