# NumPy Basics

**By Shaiful Islam**




NumPy is a Python library for working with arrays. It provides a fast and efficient way to handle large datasets. If you use Python for data science, you have either used NumPy or at least heard of it. Most statistical analysis that needs data held in memory relies on NumPy.


### Installation 

```bash
pip install numpy
```



### Importing NumPy


In [4]:
import numpy as np
print(np.__version__)


2.4.1


### Why NumPy?

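1. NumPy array operations are typically much faster than equivalent loops over Python lists.
2. NumPy arrays are more compact than Python lists: elements sit in one contiguous block of a single type.
3. NumPy uses less memory per element, since it stores raw values rather than full Python objects.
4. NumPy is easier to work with for numerical tasks, thanks to vectorized operations and built-in functions.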


In [5]:
## NumPy is meant for creating homogeneous n-dimensional arrays (n = 1, 2, 3, ...).
## Unlike Python lists, all elements of a NumPy array must be of the same type.
py_arr = [1, 2, "Hello", 3, "World"]  # Python list can have mixed types

# Create NumPy array with homogeneous type
numpy_arr = np.array([1, 2, 3, 4, 5], dtype=np.int8)  # All elements are integers

In [12]:
import sys

print(numpy_arr.nbytes)  # bytes used by the array's data buffer
total_size = sys.getsizeof(py_arr)  # size of the list object itself

for item in py_arr:
    total_size += sys.getsizeof(item)

print(total_size)

5
280


In [13]:
import sys
print(sys.getsizeof(py_arr))  # Size of Python list
print(sys.getsizeof(numpy_arr))  # Size of NumPy array

104
117


In [14]:
print(numpy_arr.nbytes)

5


In [16]:
numpy_arr = np.array([1,2,"Hello",3,"World"], dtype=np.int32) # Error

ValueError: invalid literal for int() with base 10: 'Hello'
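
The error above comes from forcing an integer dtype. Without an explicit dtype, NumPy does not fail on mixed input; it picks one common type that can hold every element. A minimal sketch of that behavior (the variable names are illustrative):

```python
import numpy as np

# Mixed ints and strings: NumPy converts everything to a single string dtype
mixed = np.array([1, 2, "Hello", 3, "World"])
print(mixed.dtype.kind)  # 'U' means a Unicode string dtype
print(mixed)

# Mixed ints and floats: the ints are upcast to float
numbers = np.array([1, 2, 3.5])
print(numbers.dtype)
```

The array is still homogeneous in both cases; the numbers in the first one are simply stored as strings.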

NumPy advantages compared to Python lists:
1. Uses less memory
2. Provides many built-in functions for array operations
3. Allows specifying data types for optimization
4. Better performance for numerical computations


In [None]:
import sys

# Create Python list
py_arr = [1, 2, 3, 4, 5, 6]

# Create NumPy array with same values
numpy_arr = np.array([1, 2, 3, 4, 5, 6], dtype=np.int8)

# Estimate memory usage of the Python list's elements (ignores the list object itself)
sizeof_py_arr = sys.getsizeof(1) * len(py_arr)  # Size = 168 (6 ints x 28 bytes)

# Calculate memory usage of NumPy array
sizeof_numpy_arr = numpy_arr.itemsize * numpy_arr.size  # Size = 6 (6 elements x 1 byte)

# Print memory usage comparison
print(sizeof_py_arr)
print(sizeof_numpy_arr)

168
6


In [17]:
import numpy as np

py_arr = [1, 2, 3, 4, 5, 6]

# Create NumPy array with same values
numpy_arr = np.array([1, 2, 3, 4, 5, 6], dtype=np.int8)

print(numpy_arr.itemsize * numpy_arr.size)

6


Optimizing Further

In [None]:
# For NumPy arrays elements limited to 1 Byte / 8 Bits
numpy_arr = np.array([1, 2, 3, 4, 5, 6], dtype=np.int8)
sizeof_numpy_arr = numpy_arr.itemsize * numpy_arr.size  # Size = 6
print(sizeof_numpy_arr)

# For NumPy arrays elements limited to 2 Bytes / 16 Bits
numpy_arr = np.array([1, 2, 3, 4, 5, 6], dtype=np.int16)
sizeof_numpy_arr = numpy_arr.itemsize * numpy_arr.size  # Size = 12
print(sizeof_numpy_arr)
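
A smaller dtype saves memory but also narrows the range of values each element can hold. The sketch below, which is an illustration rather than part of the original notebook, prints the limits of the two types used above and shows that array arithmetic past the limit wraps around without an error:

```python
import numpy as np

# Value ranges for the two integer types used above
for dtype in (np.int8, np.int16):
    info = np.iinfo(dtype)
    print(dtype.__name__, info.min, info.max)

# Arithmetic that exceeds the range wraps around silently for arrays
small = np.array([127], dtype=np.int8)
wrapped = small + 1
print(wrapped)  # [-128]
```

Choosing the smallest dtype is therefore only safe when the data, and any intermediate results, are known to fit.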

In [None]:
size = 1000000
py_list = list(range(size))
print(py_list[:10])  # preview only; the full list has 1,000,000 elements

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [None]:
# Let's compare the performance of NumPy arrays vs Python lists
import time
import numpy as np

# Create large Python list and NumPy array
size = 1000000
py_list = list(range(size))
np_array = np.array(range(size))

# Time operation on Python list
start_time = time.time()
py_result = [x * 2 for x in py_list]
py_time = time.time() - start_time

print(py_time)

0.05861353874206543


In [None]:
a = np.array([2,4,6])
print(a*2)

[ 4  8 12]


In [None]:

# Time operation on NumPy array 
start_time = time.time()
np_result = np_array * 2
np_time = time.time() - start_time

print(f"Python list time: {py_time:.4f} seconds")
print(f"NumPy array time: {np_time:.4f} seconds")
print(f"NumPy is {py_time/np_time:.1f}x faster")

Python list time: 0.0453 seconds
NumPy array time: 0.0027 seconds
NumPy is 16.8x faster
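
A single timed run can be noisy, and the exact speedup varies between machines and runs. One possible way to get steadier numbers is the standard library's timeit module, which repeats each operation. This is a sketch with an assumed, smaller array size so it finishes quickly:

```python
import timeit
import numpy as np

size = 100_000  # smaller than above so the repeated runs stay quick
py_list = list(range(size))
np_array = np.arange(size)

# Run each operation 20 times and report the average time per run
runs = 20
py_time = timeit.timeit(lambda: [x * 2 for x in py_list], number=runs) / runs
np_time = timeit.timeit(lambda: np_array * 2, number=runs) / runs

print(f"Python list: {py_time:.6f} s per run")
print(f"NumPy array: {np_time:.6f} s per run")
```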


### Using NumPy to create n-dimensional arrays
- An n-dimensional array is generally used to represent a matrix or a tensor, mainly for mathematical calculations.
- Compared to n-dimensional arrays built from nested Python lists, NumPy not only reduces memory usage but also provides many additional features that make mathematical calculations easier.


Creating a NumPy array


In [None]:
ls = [10, 12, 14, 16, 20, 22]
ls_array = np.array(ls)
print(ls_array)
print(type(ls_array))
type(ls)


[10 12 14 16 20 22]
<class 'numpy.ndarray'>


list

In [19]:
arr_1d = np.array([1, 2, 3, 4, 5])
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr_1d)
print(arr_2d)
print(arr_1d.nbytes)
print(arr_2d.nbytes)

[1 2 3 4 5]
[[1 2 3]
 [4 5 6]]
40
48


In [None]:
array_0 = np.ones((3,3))
print(array_0)



[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


In [None]:
array_0 = np.ones((2,3,3))
print(array_0)

[[[1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]]

 [[1. 1. 1.]
  [1. 1. 1.]
  [1. 1. 1.]]]


In [None]:

array_1 = np.zeros((4,2,2,3))
print(array_1)

[[[[0. 0. 0.]
   [0. 0. 0.]]

  [[0. 0. 0.]
   [0. 0. 0.]]]


 [[[0. 0. 0.]
   [0. 0. 0.]]

  [[0. 0. 0.]
   [0. 0. 0.]]]


 [[[0. 0. 0.]
   [0. 0. 0.]]

  [[0. 0. 0.]
   [0. 0. 0.]]]


 [[[0. 0. 0.]
   [0. 0. 0.]]

  [[0. 0. 0.]
   [0. 0. 0.]]]]


How do we know the size of the array?

In [None]:
print(np.shape(array_0))

(2, 3, 3)


The dimensions of an array can be changed at runtime as long as the product of the new dimensions equals the total number of elements.
For example, a 2 * 5 matrix can be reshaped into 5 * 2, and a 1 * 4 into 2 * 2.

In [None]:
array_2=np.zeros((3,6))
print(array_2)

[[0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]]


In [None]:
ls_array = array_2.reshape((9,2))
print(ls_array)



[[0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]
 [0. 0.]]
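
The reshape rule described above can be checked directly. The sketch below lets NumPy infer one dimension with -1 and shows what happens when the requested shape does not match the element count (the shapes are chosen only for illustration):

```python
import numpy as np

arr = np.zeros((3, 6))  # 18 elements

# -1 asks NumPy to infer that dimension from the total size
inferred = arr.reshape((-1, 2))
print(inferred.shape)  # (9, 2)

# A shape whose product is not 18 is rejected
try:
    arr.reshape((4, 5))
except ValueError as err:
    print("reshape failed:", err)
```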


**NumPy can also generate a predefined sequence of numbers for an array.**

The output of np.arange is always a one-dimensional sequence of numbers. However, we can call reshape on this output to get the dimensions of our choice.

In [None]:
np_nd_arr = np.arange(0, 16)
print(np_nd_arr)



[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]


In [None]:
np_nd_arr = np_nd_arr.reshape((2, 2, 2, 2))
print(np_nd_arr)
np_nd_arr.shape

[[[[ 0  1]
   [ 2  3]]

  [[ 4  5]
   [ 6  7]]]


 [[[ 8  9]
   [10 11]]

  [[12 13]
   [14 15]]]]


(2, 2, 2, 2)

Just as reshape can convert an array into a multi-dimensional array, ravel can flatten any N-D array into a one-dimensional array.


In [None]:
f_arr = np_nd_arr.ravel()
print(f_arr)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]
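
NumPy offers two common ways to flatten an array, ravel and flatten, and they differ in whether the result shares memory with the original. A small sketch of the difference, assuming a contiguous array like the one above:

```python
import numpy as np

nd = np.arange(16).reshape((2, 2, 2, 2))

flat_view = nd.ravel()    # a view when the data is contiguous, as it is here
flat_copy = nd.flatten()  # always a copy

print(np.shares_memory(nd, flat_view))  # True
print(np.shares_memory(nd, flat_copy))  # False

# Writing through the view changes the original array
flat_view[0] = 99
print(nd[0, 0, 0, 0])  # 99
```

Use flatten when the flattened result will be modified and the original must stay untouched.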


### Slicing an array



In [20]:
np_nd_arr = np.arange(0, 100)
print(np_nd_arr)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
 96 97 98 99]


In [None]:
np_nd_arr = np.arange(0, 100)
np_nd_arr = np_nd_arr.reshape((10, 10))
print(np_nd_arr.shape)
print(np_nd_arr)
print(np_nd_arr[3:,3:])

(10, 10)
[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]
 [30 31 32 33 34 35 36 37 38 39]
 [40 41 42 43 44 45 46 47 48 49]
 [50 51 52 53 54 55 56 57 58 59]
 [60 61 62 63 64 65 66 67 68 69]
 [70 71 72 73 74 75 76 77 78 79]
 [80 81 82 83 84 85 86 87 88 89]
 [90 91 92 93 94 95 96 97 98 99]]
[[33 34 35 36 37 38 39]
 [43 44 45 46 47 48 49]
 [53 54 55 56 57 58 59]
 [63 64 65 66 67 68 69]
 [73 74 75 76 77 78 79]
 [83 84 85 86 87 88 89]
 [93 94 95 96 97 98 99]]
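
Slicing works per axis, using the same start:stop:step form as Python lists. The sketch below shows a few more patterns on the same 10 x 10 grid and illustrates that a basic slice is a view of the original data (the variable names are illustrative):

```python
import numpy as np

grid = np.arange(100).reshape((10, 10))

print(grid[2])          # row 2: values 20 to 29
print(grid[:, 0])       # first column: 0, 10, ..., 90
print(grid[:2, -3:])    # first two rows, last three columns
print(grid[::5, ::5])   # every fifth row and column

# A basic slice is a view: editing it edits the original
corner = grid[:2, :2]
corner[0, 0] = -1
print(grid[0, 0])  # -1
```

Call .copy() on a slice when an independent array is needed.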


### Mathematical operations on NumPy n-Dimension Arrays

- NumPy is not only about storing data efficiently; it also makes it extremely easy to perform mathematical operations on that data.

- Operations on n-dimensional arrays are applied element by element, so they behave much like the corresponding mathematical operations.
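
As a brief illustration of these points, arithmetic and comparisons apply to every element at once, and a smaller array can be combined with a larger one when their shapes are compatible (NumPy calls this broadcasting). The arrays here are made up for the example:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
row = np.array([10, 20, 30])

# A scalar is applied to every element
print(matrix * 2)

# A 1-D array of length 3 is broadcast across both rows
print(matrix + row)

# Comparisons are element-wise too and return a boolean array
print(matrix > 3)
```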



In [None]:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))


[1 2 3 4 5]
<class 'numpy.ndarray'>


In [None]:
print(np.zeros((2,3)))     # 2x3 matrix of zeros
print(np.ones((2,3)))      # 2x3 matrix of ones
print(np.arange(1,10,2))   # numbers 1 to 9 with step 2


[[0. 0. 0.]
 [0. 0. 0.]]
[[1. 1. 1.]
 [1. 1. 1.]]
[1 3 5 7 9]


In [None]:
arr = np.arange(1, 13)    # 1 to 12
print(arr.reshape(3,4))   # reshape into 3x4


[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [None]:
arr = np.array([10, 20, 30, 40, 50])
print(arr[0])    # first element
print(arr[-1])   # last element
print(arr[1:4])  # slice


10
50
[20 30 40]


In [None]:
a = np.array([1,2,3])
b = np.array([4,5,6])

print(a + b)   # addition
print(a * b)   # element-wise multiplication
print(a ** 2)  # power
print(np.sqrt(a))
print(np.sort(a))

[5 7 9]
[ 4 10 18]
[1 4 9]
[1.         1.41421356 1.73205081]
[1 2 3]


In [None]:
data = np.array([10, 20, 30, 40, 50])

print("Mean:", np.mean(data))
print("Median:", np.median(data))
print("Std Dev:", np.std(data))
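
One detail worth knowing: np.std divides by N by default (population standard deviation). For a sample standard deviation, which divides by N - 1, pass ddof=1. A short sketch on the same data:

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

population_std = np.std(data)      # divides by N (the default, ddof=0)
sample_std = np.std(data, ddof=1)  # divides by N - 1

print("Population std:", population_std)
print("Sample std:", sample_std)
```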


In [None]:
# np.dot() performs matrix multiplication on 2-D arrays.
# .T gives the transpose.
A = np.array([[1,2],[3,4]])
B = np.array([[5,6],[7,8]])

print("Matrix Multiplication:\n", np.dot(A,B))
print("Transpose:\n", A.T)


Matrix Multiplication:
 [[19 22]
 [43 50]]
Transpose:
 [[1 3]
 [2 4]]
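
For 2-D arrays, the @ operator gives the same result as np.dot, while * multiplies element by element. The two are easy to confuse, so here is a side-by-side sketch using the same matrices:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

matrix_product = A @ B       # same result as np.dot(A, B) for 2-D arrays
elementwise_product = A * B  # multiplies matching entries only

print(matrix_product)
print(elementwise_product)
```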


In [2]:
import numpy as np
print(np.random.rand(2,3))  # random floats 0–1
print(np.random.randint(1,10,(2,3)))  # random ints


[[0.78575494 0.1359721  0.07324226]
 [0.54053611 0.93928431 0.15581864]]
[[1 1 4]
 [6 6 2]]
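
The random values above change on every run. When reproducible results are needed, one option is NumPy's Generator interface with a fixed seed. This is a sketch; the seed value 42 is arbitrary:

```python
import numpy as np

# A seeded Generator produces the same numbers on every run
rng = np.random.default_rng(seed=42)
floats = rng.random((2, 3))              # floats in [0, 1)
ints = rng.integers(1, 10, size=(2, 3))  # ints from 1 to 9 (10 is excluded)

print(floats)
print(ints)
```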


**Numpy**

In [3]:
import numpy as np
arr = np.array([1, 2, 3])
print(arr)

[1 2 3]


In [7]:
arr=np.array([1, 2, 3], dtype=np.int8)
print(arr.dtype)
print(arr)

int8
[1 2 3]


In [14]:
print(np.zeros((2,3)))     # 2x3 matrix of zeros
print(np.ones((2,3)))      # 2x3 matrix of ones
print(np.arange(1,10,2))   # numbers 1 to 9 with step 2
print(np.full((2,3), 7))  # 2x3 matrix filled with 7
print(np.full(3, 7)) # 1D array of size 3 filled with 7
print(np.linspace(0,10,5)) # 5 numbers from 0 to 10 inclusive
print(np.random.rand(2,3))  # random floats 0–1
print(np.random.randint(1,10,(2,3)))  # random ints
print(np.eye(3))  # 3x3 identity matrix


[[0. 0. 0.]
 [0. 0. 0.]]
[[1. 1. 1.]
 [1. 1. 1.]]
[1 3 5 7 9]
[[7 7 7]
 [7 7 7]]
[7 7 7]
[ 0.   2.5  5.   7.5 10. ]
[[0.46852689 0.92611621 0.37800502]
 [0.75696402 0.2753138  0.59617401]]
[[6 1 3]
 [6 3 6]]
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [15]:
arr = np.array([[1,2,3],[4,5,6]])

In [18]:
print(arr.dtype)
print(arr.ndim)
print(arr.shape)
print(arr.size)
print(arr.nbytes)

int64
2
(2, 3)
6
48


In [19]:
arr = np.array([10,20,30,40])
print(arr[1])
print(arr[1:3])

20
[20 30]


In [20]:
arr = np.array([[1,2,3],[4,5,6]])
print(arr[1,2])

6


In [26]:
print(arr.shape)
print(arr.reshape(3,2))
print(arr.reshape(2,3,1))
print(arr.flatten()) #1D copy of array
print(arr.ravel()) #1D view of array

(2, 3)
[[1 2]
 [3 4]
 [5 6]]
[[[1]
  [2]
  [3]]

 [[4]
  [5]
  [6]]]
[1 2 3 4 5 6]
[1 2 3 4 5 6]


In [29]:
a = np.array([1,2,3])
b = np.array([4,5,6])
print(a + b)
print(a * b)
print(a ** 2)
print(np.sqrt(a))
print(np.sort(a))
print(np.sum(a))
print(np.mean(a))
print(np.std(a))
print(a+5)

[5 7 9]
[ 4 10 18]
[1 4 9]
[1.         1.41421356 1.73205081]
[1 2 3]
6
2.0
0.816496580927726
[6 7 8]


In [31]:
print(a.var())
print(b.min())
print(b.max())
print(b.argmax())
print(b.argmin())
print(np.dot(a,b))
print(np.cross(a,b))
print(np.matmul(a.reshape(3,1),b.reshape(1,3)))
print(np.log(a))
print(np.exp(a))

0.6666666666666666
4
6
2
0
32
[-3  6 -3]
[[ 4  5  6]
 [ 8 10 12]
 [12 15 18]]
[0.         0.69314718 1.09861229]
[ 2.71828183  7.3890561  20.08553692]


In [32]:
arr = np.array([[1,2,3],[4,5,6]])
print(np.sum(arr, axis=0))
print(np.sum(arr, axis=1))

[5 7 9]
[ 6 15]


In [33]:
arr = np.array([10,20,30,40])
print(arr[arr > 20])

[30 40]
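
The comparison arr > 20 produces a boolean array, and indexing with it keeps only the elements where it is True. Conditions can be combined, and a mask can also be used for assignment, as this sketch shows:

```python
import numpy as np

arr = np.array([10, 20, 30, 40])

# Combine conditions with & (and) and | (or); each condition needs parentheses
middle = arr[(arr > 10) & (arr < 40)]
print(middle)  # [20 30]

# A mask can also select the elements to overwrite
capped = arr.copy()
capped[capped > 25] = 25
print(capped)  # [10 20 25 25]
```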


In [34]:
import numpy as np
arr = np.array([40, 10, 30, 20])
print(np.sort(arr))

[10 20 30 40]


In [35]:
print(np.argsort(arr))

[1 3 2 0]


In [36]:
print(np.where(arr > 25))

(array([0, 2]),)
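
With a single argument, np.where returns a tuple of index arrays, one per dimension, which is why the output above is wrapped in parentheses. With three arguments it chooses between two values element by element. A sketch of both forms on the same array:

```python
import numpy as np

arr = np.array([40, 10, 30, 20])

# One argument: indices where the condition holds
indices = np.where(arr > 25)[0]
print(indices)  # [0 2]

# Three arguments: pick from the second or third argument element by element
replaced = np.where(arr > 25, arr, 0)
print(replaced)  # [40  0 30  0]
```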


In [37]:
arr2 = np.array([1, 2, 2, 3, 3, 3])
print(np.unique(arr2))

[1 2 3]


In [38]:
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
print(np.matmul(a, b))

[[19 22]
 [43 50]]


In [None]:
print(np.linalg.det(a)) # determinant

-2.0000000000000004


In [None]:
print(np.linalg.inv(a)) # inverse

In [40]:
a = np.array([[1,2],[3,4]])
print(np.linalg.det(a)) # determinant
print(np.linalg.inv(a)) # inverse

-2.0000000000000004
[[-2.   1. ]
 [ 1.5 -0.5]]
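
The determinant prints as -2.0000000000000004 rather than -2 because of floating-point rounding. For the same reason, results are best checked with np.allclose rather than exact equality. The sketch below verifies the inverse and solves a small linear system; the right-hand side b is a made-up example:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
a_inv = np.linalg.inv(a)

# Multiplying a matrix by its inverse should give the identity (up to rounding)
print(np.allclose(a @ a_inv, np.eye(2)))  # True

# To solve a @ x = b, np.linalg.solve avoids forming the inverse explicitly
b = np.array([5, 6])  # hypothetical right-hand side
x = np.linalg.solve(a, b)
print(x)
```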


In [41]:
import numpy as np
arr = np.array([10, 20, 30, 40])
np.save('data.npy', arr)

In [43]:
loaded_arr = np.load('data.npy')
print(loaded_arr)

[10 20 30 40]


In [44]:
np.save('data.npy', arr)
new_arr = np.load('data.npy')
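
np.save stores one array per file. To keep several arrays together, np.savez writes them into a single .npz archive keyed by name. A minimal sketch, using a temporary directory and a hypothetical file name so nothing is left on disk:

```python
import os
import tempfile
import numpy as np

arr = np.array([10, 20, 30, 40])
matrix = np.arange(6).reshape((2, 3))

# Write to a temporary directory so no files are left behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "arrays.npz")  # hypothetical file name
    np.savez(path, arr=arr, matrix=matrix)

    # np.load on an .npz file returns a dict-like object keyed by name
    with np.load(path) as archive:
        loaded_arr = archive["arr"]
        loaded_matrix = archive["matrix"]

print(loaded_arr)
print(loaded_matrix)
```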