At the core of NumPy is the ndarray, where nd stands for n-dimensional. An ndarray is a multidimensional
array of elements all of the same type. In other words, an ndarray is a grid that can take on
many shapes and can hold either numbers or strings. There are two main ways to create ndarrays in NumPy:

1. Using regular Python lists
2. Using built-in NumPy functions

We refer to 1D arrays as rank 1 arrays. In general, N-dimensional arrays have rank N, so
a 2D array is a rank 2 array.
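
As a small illustration of rank, the sketch below builds one array from a flat list and one from a nested list. The names `v` and `m` are chosen only for this example; the `ndim` attribute reports the number of dimensions, which is what this notebook calls the rank.

```python
import numpy as np

# Rank 1: a flat list becomes a 1D array.
v = np.array([1, 2, 3])
# Rank 2: a list of equal-length lists becomes a 2D array.
m = np.array([[1, 2, 3], [4, 5, 6]])

print(v.ndim, v.shape)  # rank and shape of the 1D array
print(m.ndim, m.shape)  # rank and shape of the 2D array
```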

### 1. Using regular Python lists 

In [4]:
import time
import numpy as np

In [7]:
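# create a NumPy array of 100 million random floats
x = np.random.random(100000000)

# compute the mean in pure Python
start = time.time()
# sum(x) / len(x)  # left commented out because it is slow, so the timer below prints 0.0
print(time.time() - start)  # took 8.32 sec to complete when the line above was run

# compute the mean with NumPy
start = time.time()
np.mean(x)
print(time.time() - start)  # took 0.14 sec to complete
# NumPy is far faster than the pure-Python computation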

0.0
0.13499975204467773
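
The comparison can be repeated on a smaller array so that both versions actually run in a reasonable time. This is only a sketch: the array size, the name `x_small`, and the use of the standard-library `timeit` module are choices made for this example, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

# Much smaller than the array above, to keep the pure-Python run short.
x_small = np.random.random(100_000)

# Both approaches compute the same value up to floating-point rounding.
py_mean = sum(x_small) / len(x_small)
np_mean = np.mean(x_small)
print(np.isclose(py_mean, np_mean))

# timeit runs each callable `number` times and returns the total elapsed seconds.
t_py = timeit.timeit(lambda: sum(x_small) / len(x_small), number=5)
t_np = timeit.timeit(lambda: np.mean(x_small), number=5)
print(f"pure Python: {t_py:.4f} s, NumPy: {t_np:.4f} s")
```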


In [8]:
x = np.array([0, 1, 3, 5, 7])
# NumPy arrays are mutable
# shape is the size of the array along each dimension
print(x, "   x has dimensions: (shape) ", x.shape)
# the type of x is ndarray
print("x is an object of type: ", type(x))
# the data type of the elements is available through the dtype attribute
# NumPy offers more numeric data types than built-in Python does
# an ndarray can hold elements of only a single data type
print("The elements in x are of type: ", x.dtype)
print("The array has total number of elements in it :", x.size)

[0 1 3 5 7]    x has dimensions: (shape)  (5,)
x is an object of type:  <class 'numpy.ndarray'>
The elements in x are of type:  int32
The array has total number of elements in it : 5


In [9]:
x = np.array([1, 2, "hello"])
print(x, x.shape, type(x), x.dtype, x.size)
# Even though the Python list had mixed data types, the elements in x are all of the same type, namely Unicode strings (dtype <U11 in the output below)

['1' '2' 'hello'] (3,) <class 'numpy.ndarray'> <U11 3
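
To make the conversion to strings concrete, the sketch below checks the dtype kind and the element type. The `dtype=object` variant is shown only as a possible alternative when the original Python objects must be kept; it is not something this notebook relies on.

```python
import numpy as np

s = np.array([1, 2, "hello"])
print(s.dtype.kind)   # 'U' marks a Unicode string dtype
print(type(s[0]))     # the former integer is now a NumPy string

# Alternative (an assumption of this sketch, not used elsewhere in the notebook):
# dtype=object keeps the original Python objects, at the cost of
# losing NumPy's fast numeric operations.
o = np.array([1, 2, "hello"], dtype=object)
print(type(o[0]), type(o[2]))
```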


In [11]:
# rank 2 ndarray
x = np.array([[1, 2.2, 3], [4, 5, 6], [7, 8.8, 9], [10, 11, 12.12]])
print(x, x.shape, type(x), x.dtype, x.size)
# The shape attribute returns the tuple (4, 3), telling us that x is of rank 2 and has 4 rows and 3 columns.
# The size attribute tells us that x has a total of 12 elements.

[[ 1.    2.2   3.  ]
 [ 4.    5.    6.  ]
 [ 7.    8.8   9.  ]
 [10.   11.   12.12]] (4, 3) <class 'numpy.ndarray'> float64 12


Although the list above mixes integers and floats, the dtype of x is float64. This is called upcasting. Since all the elements of an ndarray must be of the same type,
in this case NumPy upcasts the integers in x to floats in order to avoid losing precision in numerical computations.
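
A short sketch of upcasting on a smaller input follows. The names `ints` and `mixed` are made up for this example; `np.result_type` is used here only to show which dtype NumPy would pick when an integer dtype and a float dtype are combined.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])

print(ints.dtype.kind)   # 'i' for a signed integer dtype
print(mixed.dtype)       # float64, because one element is a float

# np.result_type reports the dtype NumPy would pick when combining two dtypes.
print(np.result_type(np.int64, np.float64))
```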

In [13]:
# NumPy infers the data type of the elements from the input
# You can also specify the dtype explicitly
z = np.array([2, 4.5, 7, 89.99, 6.3], dtype=np.int64)
print(z, type(z), z.dtype)   # The elements in z are of type int64; the fractional parts are dropped
# Specifying the data type of the ndarray can be useful when you don't want NumPy to accidentally
# choose the wrong data type, or when you only need a certain amount of precision
# in your calculations and you want to save memory.

[ 2  4  7 89  6] <class 'numpy.ndarray'> int64
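
Two points from the cell above can be checked directly: converting floats to an integer dtype drops the fractional part rather than rounding, and a smaller dtype uses less memory. The sketch below is illustrative only; the variable names and the array length of 1000 are arbitrary.

```python
import numpy as np

values = np.array([2, 4.5, 7, 89.99, 6.3])

truncated = values.astype(np.int64)          # drops the fractional part
rounded = np.rint(values).astype(np.int64)   # rounds to the nearest integer first
print(truncated)
print(rounded)

# nbytes is the memory used by the array elements.
a64 = np.zeros(1000, dtype=np.float64)
a32 = np.zeros(1000, dtype=np.float32)
print(a64.nbytes, a32.nbytes)  # 8000 4000
```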


In [14]:
# you can save a NumPy array to a file and load it later (np.save appends the .npy extension)
np.save("saved_array_to_file", x)
y = np.load("saved_array_to_file.npy")
print("\n\nloaded from file ...", y, y.dtype, y.size, y.shape)



loaded from file ... [[ 1.    2.2   3.  ]
 [ 4.    5.    6.  ]
 [ 7.    8.8   9.  ]
 [10.   11.   12.12]] float64 12 (4, 3)
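
The round trip can also be verified programmatically. The sketch below writes to a temporary directory so that nothing is left on disk; the file name `example_array` is hypothetical and chosen only for this example.

```python
import os
import tempfile
import numpy as np

arr = np.array([[1, 2.2, 3], [4, 5, 6]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example_array")  # hypothetical file name
    np.save(path, arr)                 # writes example_array.npy
    loaded = np.load(path + ".npy")    # np.save appended the .npy extension

# The loaded array has the same values, dtype, and shape as the original.
print(np.array_equal(arr, loaded), loaded.dtype, loaded.shape)
```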


### 2. Using built-in NumPy functions

In [15]:
# np.zeros creates an ndarray of the given shape, passed as a tuple with one entry per dimension
a = np.zeros((2, 3))
print(a)
# the default dtype of the elements is float; we can change it with the dtype argument
a = np.ones((2, 3, 4), dtype=int)
print(a, a.size, a.dtype, a.shape)

[[0. 0. 0.]
 [0. 0. 0.]]
[[[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]

 [[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]] 24 int32 (2, 3, 4)


In [16]:
# np.full creates an ndarray of the given shape with every element set to the second argument (similar to np.ones and np.zeros).
# By default the array has the same data type as the fill value.
a = np.full((2, 3), 4.5)
print(a, a.size, a.dtype, a.shape)

[[4.5 4.5 4.5]
 [4.5 4.5 4.5]] 6 float64 (2, 3)
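
The effect of the fill value on the dtype can be seen by comparing a float fill value with an integer one. This is a small sketch with made-up variable names; the last line shows that an explicit `dtype` argument overrides the default.

```python
import numpy as np

f = np.full((2, 3), 4.5)                   # float fill value gives a float dtype
i = np.full((2, 3), 4)                     # integer fill value gives an integer dtype
forced = np.full((2, 3), 4, dtype=float)   # the dtype argument overrides the default

print(f.dtype, i.dtype.kind, forced.dtype)
```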


A fundamental array in linear algebra is the identity matrix. An identity matrix is a square matrix that has only 1s on its main diagonal and zeros everywhere else. The identity matrix is also a diagonal matrix, that is, a square matrix whose nonzero values appear only on the main diagonal.

In [17]:
b = np.eye(5)  # np.eye(5) returns a 5 x 5 array, since an identity matrix is always square
print(b, b.size, b.dtype, b.shape)

[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]] 25 float64 (5, 5)


In [18]:
# diagonal matrix
c = np.diag([10, 20, 30, 40])
print(c, c.size, c.shape, c.dtype)

[[10  0  0  0]
 [ 0 20  0  0]
 [ 0  0 30  0]
 [ 0  0  0 40]] 16 (4, 4) int32
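
The sketch below checks the defining property of the identity matrix (multiplying by it leaves a matrix unchanged) and shows that `np.diag` also works in the other direction, extracting the diagonal of a 2D array. The 3 x 3 matrix `m` is an arbitrary example.

```python
import numpy as np

m = np.arange(1, 10).reshape(3, 3)
identity = np.eye(3)

# Multiplying by the identity matrix returns the original values.
print(np.array_equal(identity @ m, m))

# np.diag with a 1D input builds a diagonal matrix;
# with a 2D input it extracts the main diagonal.
d = np.diag([10, 20, 30])
print(np.diag(d))
print(np.diag(m))
```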


np.arange creates ndarrays that have evenly spaced values within a given interval. It is similar to range in Python (which creates an iterable sequence): the upper limit is exclusive, so it is not included in the result.

In [19]:
x = np.arange(10)
print(x, x.size, x.shape, x.dtype)
x = np.arange(4, 11)
print(x, x.size, x.shape, x.dtype)
x = np.arange(1, 10, 3)  # here the step is 3, i.e. the distance between two adjacent values in the array
print(x, x.size, x.shape, x.dtype)

[0 1 2 3 4 5 6 7 8 9] 10 (10,) int32
[ 4  5  6  7  8  9 10] 7 (7,) int32
[1 4 7] 3 (3,) int32


When non-integer steps are required, it is usually better to use the function np.linspace(start, stop, num=50).
The function returns num evenly spaced numbers over the closed interval [start, stop], with both start and stop inclusive.

In [20]:
b = np.linspace(0, 10, 5)
print(b, b.size, b.shape, b.dtype)  # size always equals num
# when num is omitted, the default number of elements is 50
b = np.linspace(0, 10)
print(b, b.size, b.shape, b.dtype)  # size always equals num

[ 0.   2.5  5.   7.5 10. ] 5 (5,) float64
[ 0.          0.20408163  0.40816327  0.6122449   0.81632653  1.02040816
  1.2244898   1.42857143  1.63265306  1.83673469  2.04081633  2.24489796
  2.44897959  2.65306122  2.85714286  3.06122449  3.26530612  3.46938776
  3.67346939  3.87755102  4.08163265  4.28571429  4.48979592  4.69387755
  4.89795918  5.10204082  5.30612245  5.51020408  5.71428571  5.91836735
  6.12244898  6.32653061  6.53061224  6.73469388  6.93877551  7.14285714
  7.34693878  7.55102041  7.75510204  7.95918367  8.16326531  8.36734694
  8.57142857  8.7755102   8.97959184  9.18367347  9.3877551   9.59183673
  9.79591837 10.        ] 50 (50,) float64


arange takes the step between values, whereas linspace takes the number of elements we want in a given interval. We can exclude the endpoint in linspace by passing the argument endpoint=False.

In [21]:
a = np.linspace(2.5, 25, 10)
print(a, a.shape, a.dtype)
a = np.linspace(0, 25, 10)
print(a, a.shape, a.dtype)
a = np.linspace(0, 25, 10, endpoint=False)
print(a, a.shape, a.dtype)

[ 2.5  5.   7.5 10.  12.5 15.  17.5 20.  22.5 25. ] (10,) float64
[ 0.          2.77777778  5.55555556  8.33333333 11.11111111 13.88888889
 16.66666667 19.44444444 22.22222222 25.        ] (10,) float64
[ 0.   2.5  5.   7.5 10.  12.5 15.  17.5 20.  22.5] (10,) float64
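
The relationship between the two functions can be checked with the `retstep` argument of `np.linspace`, which also returns the spacing it computed. The sketch below reuses the interval from the last call above; the comparison with `np.arange` is an illustration, not a general guarantee for every floating-point step.

```python
import numpy as np

# retstep=True also returns the spacing that linspace computed.
values, step = np.linspace(0, 25, 10, endpoint=False, retstep=True)
print(step)

# With endpoint=False, linspace covers the same half-open interval as arange.
same = np.allclose(values, np.arange(0, 25, 2.5))
print(same)
```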


So far we have only used the built-in functions np.arange() and np.linspace() to create rank 1 ndarrays. We can create a rank 2 array by combining them with reshape().
np.reshape(ndarray, new_shape) converts the given ndarray into the specified new_shape; new_shape must be compatible with the number of elements in the ndarray.

In [22]:
x = np.arange(20)
y = np.reshape(x, (4, 5))  # second input is a tuple giving the new shape
print("original ndarray :", x, x.size, x.shape)
print("ndarray after reshape", y, y.size, y.shape)
print("reshape with method ", x.reshape((4, 5)))
print("rank 2 ndarray with linspace ", np.linspace(0, 50, 10, endpoint=False).reshape((5, 2)))

original ndarray : [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19] 20 (20,)
ndarray after reshape [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]] 20 (4, 5)
reshape with method  [[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
rank 2 ndarray with linspace  [[ 0.  5.]
 [10. 15.]
 [20. 25.]
 [30. 35.]
 [40. 45.]]
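
What "compatible" means can be shown with a quick sketch: the new shape must account for exactly the same number of elements. Passing -1 for one dimension lets NumPy infer it, and an incompatible shape raises a ValueError. The flag `failed` is just a helper for this example.

```python
import numpy as np

x = np.arange(20)

# -1 asks NumPy to infer that dimension from the total number of elements.
y = x.reshape(4, -1)
print(y.shape)

# An incompatible shape raises ValueError: 20 elements cannot fill a 3 x 7 grid.
try:
    x.reshape(3, 7)
    failed = False
except ValueError:
    failed = True
print(failed)
```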


### Random array

In [23]:
# We create a 2 x 5 ndarray with random floats in the half-open interval [0.0, 1.0).
x = np.random.random((2, 5))
print("random ndarray ", x, x.dtype, x.size)

random ndarray  [[0.36615209 0.42097444 0.61497082 0.58088091 0.46867866]
 [0.73653539 0.55759112 0.95983696 0.02873054 0.16225788]] float64 10


In [24]:
# create ndarrays with random integers within a particular interval: np.random.randint(start, stop, size=shape)
# stop is exclusive
x = np.random.randint(5, 18, (2, 6))
print("random int [start, stop) rank 2 ndarray ", x, x.dtype, x.size, x.shape)

random int [start, stop) rank 2 ndarray  [[ 8  8 11 17  7 14]
 [16 17  8 17 14 14]] int32 12 (2, 6)
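
Because the values above change on every run, it can help to fix a seed when results need to be reproducible. The sketch below uses `np.random.default_rng`, the Generator interface available in NumPy 1.17 and later, as an alternative to the `np.random.random` and `np.random.randint` calls used in this notebook. The seed 42 is arbitrary.

```python
import numpy as np

rng_a = np.random.default_rng(42)  # 42 is an arbitrary seed chosen for this sketch
rng_b = np.random.default_rng(42)

floats_a = rng_a.random((2, 5))               # floats in [0.0, 1.0)
ints_a = rng_a.integers(5, 18, size=(2, 6))   # integers in [5, 18)

floats_b = rng_b.random((2, 5))
ints_b = rng_b.integers(5, 18, size=(2, 6))

# Two generators created with the same seed produce the same values.
print(np.array_equal(floats_a, floats_b), np.array_equal(ints_a, ints_b))
print(ints_a.min() >= 5, ints_a.max() < 18)
```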


In some cases, you may need to create ndarrays with random numbers that satisfy certain statistical properties. For example, you may want the random numbers in the ndarray to have an average of 0. NumPy allows you to create random
ndarrays with numbers drawn from various probability distributions.

The function np.random.normal(mean, standard deviation, size=shape), for example, creates an ndarray with the given shape that contains random numbers picked from a normal (Gaussian) distribution with the given mean and standard deviation.

In [25]:
x = np.random.normal(0, 0.1, size=(50, 50))
print(x, x.size, x.shape, x.dtype, type(x))
print(x.min(), x.max(), x.mean(), x.std(), "positive and negative number :", (x < 0).sum(), (x > 0).sum(), type((x<0)))

[[ 0.05682401 -0.1228124   0.14005236 ... -0.00752091  0.05367195
  -0.04600395]
 [ 0.0408086  -0.02035513  0.0790149  ... -0.0637998  -0.01043516
   0.04754993]
 [-0.02777672 -0.01074251 -0.06664601 ... -0.04382427  0.01115435
   0.06902208]
 ...
 [-0.09246595 -0.12382967  0.01758373 ...  0.02426526  0.01356994
  -0.06910082]
 [-0.11306558 -0.12282583  0.10991725 ... -0.13046863  0.0147325
  -0.23070754]
 [ 0.05594552 -0.00886304 -0.06087898 ... -0.00962179  0.23524497
  -0.11370831]] 2500 (50, 50) float64 <class 'numpy.ndarray'>
-0.29759627743856226 0.34703483546219316 0.0014496055001924381 0.10051233045394348 positive and negative number : 1254 1246 <class 'numpy.ndarray'>


As we can see, the average of the random numbers in the ndarray is close to zero, the maximum and minimum values in x are roughly symmetric about zero (the average), and we have about the same number of positive and negative values.
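
These observations can be turned into simple checks. The sketch below draws a larger sample so that the statistics are stable; it assumes the seeded Generator interface (`np.random.default_rng`, NumPy 1.17 and later) rather than `np.random.normal`, and the seed, sample size, and tolerances are arbitrary choices for this example.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, used only to make the sketch repeatable
sample = rng.normal(0, 0.1, size=(500, 500))

# The sample mean should be close to 0 and the sample standard deviation close to 0.1.
print(abs(sample.mean()) < 0.01)
print(abs(sample.std() - 0.1) < 0.01)

# Roughly half of the values should be negative.
frac_negative = (sample < 0).mean()
print(abs(frac_negative - 0.5) < 0.02)
```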

In [26]:
p = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
q = np.array(p)
# print("calculation ", (p < 5))  raises a TypeError: a list cannot be compared with an int
print("second ", (q < 5))  # returns an ndarray of True and False values

second  [ True  True  True  True  True False False False False False False]
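
The Boolean array produced by the comparison is often used as a mask to select elements. The sketch below shows this, along with the list comprehension a plain Python list would need for the same filter. The names `mask` and `small` are made up for this example.

```python
import numpy as np

p = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
q = np.array(p)

mask = q < 5          # element-wise comparison gives a Boolean ndarray
print(q[mask])        # keeps only the elements where the mask is True
print(mask.sum())     # True counts as 1, so this is the number of matches

# A plain list has no element-wise comparison, so it needs a comprehension.
small = [v for v in p if v < 5]
print(small)
```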


In [27]:
# create a 4 x 4 ndarray that only contains consecutive even numbers from 2 to 32 (inclusive)
x = np.linspace(2, 32, 16, dtype=int).reshape((4, 4))
print(x)

[[ 2  4  6  8]
 [10 12 14 16]
 [18 20 22 24]
 [26 28 30 32]]
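
Since the values here are integers with a fixed step, the same array can also be built with np.arange. This is offered as an alternative sketch, not as a replacement for the linspace solution above; note that the stop value is 33 because the upper limit of arange is exclusive.

```python
import numpy as np

# Even numbers from 2 to 32 inclusive: stop is 33 because arange excludes the upper limit.
evens = np.arange(2, 33, 2).reshape(4, 4)
print(evens)

# Both constructions give the same values.
same = np.array_equal(evens, np.linspace(2, 32, 16, dtype=int).reshape(4, 4))
print(same)
```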
