# **Getting Started with NumPy**

Hey everyone! It's me **Nir Bahadur Raya**. It's February 26, 2023. 

This notebook is a compilation of the NumPy concepts that I learned and revised. I wrote it as a revision guide to review and solidify my own understanding of NumPy. It can also serve as a quick reference for anyone looking to learn NumPy or refresh their knowledge of it.

**NumPy** is a Python library that provides support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. NumPy is a fundamental library for data science and scientific computing with Python, as it provides the foundation for other libraries such as Pandas, Matplotlib, and SciPy.

NumPy arrays are similar to Python lists, but they have several advantages: they are more memory-efficient, faster, and more convenient for mathematical operations. The following example demonstrates the speed advantage by multiplying two arrays element by element, first with a single vectorized NumPy expression and then with an explicit Python loop. You don't need any prior knowledge of NumPy to understand the example. The output shows how much faster the vectorized calculation can be. However, keep in mind that the speed of computations also depends on the hardware you're using, so your results may vary.

In [1]:
import numpy as np
import time

# create two large random arrays and a result array of the same size
n = 1000000
a = np.random.rand(n)
b = np.random.rand(n)
c = np.zeros(n)

# measure the time to perform element-wise multiplication using NumPy
start_time = time.time()
c = a * b
end_time = time.time()
numpy_time = end_time - start_time
print("Time taken by NumPy:", numpy_time)

# measure the time to perform element-wise multiplication using an explicit Python loop
start_time = time.time()
for i in range(n):
    c[i] = a[i] * b[i]
end_time = time.time()
python_time = end_time - start_time
print("Time taken by Python:", python_time)

# compare the times
print("NumPy is", python_time/numpy_time, "times faster than Python.")


Time taken by NumPy: 0.004288434982299805
Time taken by Python: 0.3609745502471924
NumPy is 84.1739589703675 times faster than Python.
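
A single `time.time()` measurement can be noisy, and the loop above still indexes NumPy arrays rather than plain Python lists. As a rough sketch of a steadier comparison, the version below uses the standard-library `timeit` module and converts the data to lists first. The names `multiply_numpy` and `multiply_lists`, the array size, and the repeat counts are my own choices for illustration, not part of the original cell.

```python
import timeit

import numpy as np

n = 100_000
rng = np.random.default_rng(seed=0)  # seeded so the data is repeatable
a = rng.random(n)
b = rng.random(n)

# plain Python lists holding the same values
a_list = a.tolist()
b_list = b.tolist()


def multiply_numpy():
    # one vectorized expression, executed in compiled code
    return a * b


def multiply_lists():
    # element-by-element loop over plain Python floats
    return [x * y for x, y in zip(a_list, b_list)]


# run each function 10 times per trial and keep the best of 5 trials
numpy_time = min(timeit.repeat(multiply_numpy, number=10, repeat=5)) / 10
list_time = min(timeit.repeat(multiply_lists, number=10, repeat=5)) / 10

print("NumPy:", numpy_time, "seconds per call")
print("Lists:", list_time, "seconds per call")
```

Taking the minimum over several trials reduces the influence of other processes running on the machine. The exact numbers will still differ from one computer to another.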


**1. Installing NumPy**

NumPy is not included in Python by default, so you'll need to install it separately. You can use pip, the package installer for Python, to install NumPy by running the following command in your terminal or command prompt:

In [2]:
# pip install numpy

NumPy comes pre-installed in Google Colab, so you don't need to install it separately there. You can simply import NumPy and start using its functions in your code cells.

When you import NumPy, you can assign it an alias 'np' so that you can refer to it using 'np' instead of typing 'numpy' every time you want to use a function or an object from NumPy. This can be done by:

In [3]:
import numpy as np

**2. Creating NumPy Arrays**

The primary data structure in NumPy is the ndarray (short for "n-dimensional array"), which is a multi-dimensional container for homogeneous data. You can create an ndarray using the np.array() function, like this:

In [4]:
a = np.array([1, 2, 3])  # 1D array
b = np.array([[1, 2, 3], [4, 5, 6]])  # 2D array
c = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])  # 3D array

Here, we're creating 1D, 2D, and 3D arrays a, b, and c, respectively. Note that the arrays are homogeneous (i.e., all elements of an array share the same data type) and that the number of dimensions follows from how deeply the lists are nested.
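
To see what "homogeneous" means in practice, here is a small sketch of what happens when the input list mixes integers and a float. The variable names `mixed` and `forced` are just for illustration.

```python
import numpy as np

# integers and a float in the same list: NumPy picks one type for all elements
mixed = np.array([1, 2, 3.5])
print(mixed)        # prints "[1.  2.  3.5]"
print(mixed.dtype)  # prints "float64"

# the data type can also be requested explicitly with the dtype argument
forced = np.array([1, 2, 3], dtype=np.float64)
print(forced)       # prints "[1. 2. 3.]"
```

The integers are converted to floats so that every element fits the same data type.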

**3. Array Attributes**

NumPy arrays have several attributes that you can use to get information about the array, such as its shape, size, and data type:

In [5]:
a = np.array([1, 2, 3])
print(a.shape)  # prints "(3,)"
print(a.size)   # prints "3"
print(a.dtype)  # prints "int64" or "int32", depending on platform and NumPy version

(3,)
3
int32


Here, we're printing the shape, size, and data type of the 1D array a.
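
The same attributes are more informative on arrays with more than one dimension. The sketch below reuses the 2D and 3D arrays from the previous section and also prints `ndim`, the number of dimensions.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])                 # 2 rows, 3 columns
c = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])   # 2 blocks of 2x2

print(b.ndim, b.shape, b.size)  # prints "2 (2, 3) 6"
print(c.ndim, c.shape, c.size)  # prints "3 (2, 2, 2) 8"
```

Note that `size` is the total number of elements, which equals the product of the entries in `shape`.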

**4. Array Indexing**

You can access individual elements of a NumPy array using indexing, which works similarly to indexing in lists:

In [6]:
a = np.array([1, 2, 3])
print(a[0])  # prints "1"
print(a[-1])  # prints "3"

b = np.array([[1, 2, 3], [4, 5, 6]])
print(b[0, 0])  # prints "1"
print(b[1, 2])  # prints "6"

1
3
1
6


Here, we're accessing individual elements of the 1D array 'a' and the 2D array 'b' using indexing.
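
A few more indexing cases may be worth trying on the same 2D array: a single index returns a whole row, negative indices work on every axis, and an index past the end raises an error. This is a minimal sketch, not an exhaustive list.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])

print(b[0])       # prints "[1 2 3]" (the whole first row)
print(b[-1, -1])  # prints "6" (last row, last column)
print(b[1][2])    # prints "6" (same element as b[1, 2], selected in two steps)

# an index outside the array bounds raises IndexError
try:
    b[2, 0]
except IndexError as err:
    print("IndexError:", err)
```

Writing `b[1, 2]` is generally preferred over `b[1][2]`, because the second form first builds the intermediate row and then indexes into it.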

**5. Array Slicing**

You can also extract sub-arrays from a NumPy array using slicing, which works similarly to slicing in lists:

In [7]:
a = np.array([1, 2, 3, 4, 5])
print(a[1:4])  # prints "[2 3 4]"

b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(b[:2, 1:])  # prints "[[2 3]
                  #          [5 6]]"

[2 3 4]
[[2 3]
 [5 6]]


Here, we're extracting sub-arrays from the 1D array 'a' and the 2D array 'b'. The syntax for slicing is array[start:end:step], where start and end specify the range of indices to include (end is not inclusive), and step specifies the step size between indices. You can also use negative indices to count from the end of the array. In the example above, we're slicing a to include elements with indices 1, 2, and 3, and we're slicing b to include the first two rows and the second and third columns.
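
The cell above doesn't use the step or negative indices mentioned in the explanation, so here is a short sketch that does, on the same two arrays.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
print(a[::2])   # prints "[1 3 5]" (every second element)
print(a[-2:])   # prints "[4 5]" (the last two elements)
print(a[::-1])  # prints "[5 4 3 2 1]" (reversed with a step of -1)

b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(b[::2, 0])  # prints "[1 7]" (first column of every second row)
```

Omitting start or end means "from the beginning" or "up to the end" of that axis.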

You can also assign values to a slice of an array:

In [8]:
a[1:4] = np.array([10, 11, 12])
print(a)  # prints "[ 1 10 11 12  5]"

[ 1 10 11 12  5]


Here, we're assigning new values to the slice of 'a' with indices 1, 2, and 3.

Note that when you slice an array, NumPy returns a view of the original array, not a copy. This is different from slicing a Python list, which creates a new list. It means that if you modify the view, the original array will also be modified. If you want to create a copy of an array, you can use the 'copy()' method:

In [9]:
a = np.array([1, 2, 3, 4, 5])
b = a[1:4]  # b is a view of a
b[0] = 10
print(a)    # prints "[ 1 10  3  4  5]"
print(b)    # prints "[10  3  4]"
a = np.array([1, 2, 3, 4, 5])
b = a[1:4].copy()
b[0] = 10
print(a)  # prints "[1 2 3 4 5]"
print(b)  # prints "[10  3  4]"

[ 1 10  3  4  5]
[10  3  4]
[1 2 3 4 5]
[10  3  4]


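Here, the first half shows that modifying the view 'b' also changes the original array 'a'. In the second half, we're creating a copy of the slice of 'a' with indices 1, 2, and 3, and modifying the copy without affecting the original array 'a'.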
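
If you're unsure whether two arrays share the same data, you can check instead of guessing. The sketch below uses `np.shares_memory()` and the `base` attribute. The variable names `view` and `copied` are just for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
view = a[1:4]           # a slice: shares data with a
copied = a[1:4].copy()  # an independent copy

print(np.shares_memory(a, view))    # prints "True"
print(np.shares_memory(a, copied))  # prints "False"

# a view remembers the array it was taken from; a copy owns its data
print(view.base is a)       # prints "True"
print(copied.base is None)  # prints "True"
```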

**6. Array Reshaping**

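You can reshape a NumPy array using the 'reshape()' method, which returns an array with the same data but a new shape. The new shape must hold the same number of elements as the original, and NumPy returns a view of the original data whenever possible rather than a copy: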

In [10]:
a = np.array([1, 2, 3, 4, 5, 6])
b = a.reshape(2, 3)  # reshape to 2x3 array
print(b)
# prints "[[1 2 3]
#          [4 5 6]]"

[[1 2 3]
 [4 5 6]]


Here, we're reshaping the 1D array 'a' into a 2D array 'b' with a shape of (2, 3).
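
Three details of `reshape()` are easy to miss, so here is a small sketch of them: the result can share data with the original, one dimension can be given as -1 so that NumPy works it out, and a shape with the wrong number of elements is rejected.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
b = a.reshape(2, 3)

# for a simple array like this one, the reshaped array is a view of a
b[0, 0] = 100
print(a)  # prints "[100   2   3   4   5   6]"

# -1 lets NumPy infer the missing dimension (6 elements / 3 rows = 2 columns)
c = a.reshape(3, -1)
print(c.shape)  # prints "(3, 2)"

# 4 x 2 would need 8 elements, so this raises ValueError
try:
    a.reshape(4, 2)
except ValueError as err:
    print("ValueError:", err)
```

If you need a reshaped array that is independent of the original, you can call `copy()` on the result.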

**7. Array Broadcasting**

NumPy can broadcast arrays to perform element-wise operations even if they don't have the same shape, as long as the shapes are compatible. The smaller array is broadcast to match the shape of the larger array:

In [11]:
a = np.array([1, 2, 3])
b = np.array([[1, 2, 3], [4, 5, 6]])
c = a + b
print(c)
# prints "[[2 4 6]
#          [5 7 9]]"


[[2 4 6]
 [5 7 9]]


Here, we're adding the 1D array 'a' to the 2D array 'b', and NumPy is automatically broadcasting the smaller array 'a' to match the shape of 'b'. In effect, 'a' (shape (3,)) is added to each of the two rows of 'b' (shape (2, 3)).
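
Broadcasting also works between a column and a row, and it fails when the shapes can't be matched. The sketch below shows both cases. The names `col` and `row` are just for illustration.

```python
import numpy as np

col = np.array([[10], [20]])  # shape (2, 1)
row = np.array([1, 2, 3])     # shape (3,)

# the column is stretched across 3 columns, the row across 2 rows
result = col + row
print(result.shape)  # prints "(2, 3)"
print(result)
# prints "[[11 12 13]
#          [21 22 23]]"

# shapes (2,) and (3,) are not compatible, so this raises ValueError
try:
    np.array([1, 2]) + np.array([1, 2, 3])
except ValueError as err:
    print("ValueError:", err)
```

Roughly speaking, two dimensions are compatible when they are equal or when one of them is 1, with shapes compared from the last dimension backwards.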

**8. Array Concatenation**

You can concatenate two or more NumPy arrays along a given axis using the 'concatenate()' function or the 'vstack()' and 'hstack()' functions:

In [12]:
# create two 1D arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# concatenate the 1D arrays end to end (axis 0 is their only axis)
c = np.concatenate([a, b])
print(c)  # prints [1 2 3 4 5 6]

# create two 2D arrays
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

# concatenate the arrays along the 0th axis (vertically)
z = np.concatenate([x, y])
print(z)  # prints [[1 2]
          #         [3 4]
          #         [5 6]
          #         [7 8]]

# concatenate the arrays along the 1st axis (horizontally)
z = np.concatenate([x, y], axis=1)
print(z)  # prints [[1 2 5 6] 
          #         [3 4 7 8]]

e = np.hstack((x, y))  # horizontally stack arrays
print(e)   # prints [[1 2 5 6] 
           #         [3 4 7 8]]

[1 2 3 4 5 6]
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
[[1 2 5 6]
 [3 4 7 8]]
[[1 2 5 6]
 [3 4 7 8]]


First, we create two 1D arrays 'a' and 'b' and concatenate them along the 0th axis (axis=0, the default) to produce a new array 'c'. The function takes a sequence of arrays as input, which is why we pass the list [a, b] as an argument.

Next, we create two 2D arrays 'x' and 'y'. We then use the 'np.concatenate()' function again to vertically stack the two arrays along the 0th axis to produce a new array 'z'. This time, we pass [x, y] as an argument to concatenate the two arrays. 

We then use the 'np.concatenate()' function again to horizontally stack the two arrays along the 1st axis (axis=1) to produce a new array 'z'. This time, we pass [x, y] and the argument axis=1 to concatenate the two arrays.

Finally, we use the 'np.hstack()' function to horizontally stack the two arrays 'x' and 'y' to produce a new array 'e'. For 2D arrays, this is equivalent to using 'np.concatenate()' with axis=1. Both functions accept any sequence of arrays, so a tuple and a list work equally well.
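
The cell above doesn't show 'vstack()', so here is a short sketch of it with the same arrays. For 2D inputs it matches 'np.concatenate()' along axis 0, while for 1D inputs it stacks them as rows of a new 2D array.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

# for 2D arrays, vstack gives the same result as concatenating along axis 0
v = np.vstack((x, y))
print(np.array_equal(v, np.concatenate([x, y], axis=0)))  # prints "True"

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# for 1D arrays, each input becomes one row of a 2D result
w = np.vstack((a, b))
print(w.shape)  # prints "(2, 3)"
print(w)
# prints "[[1 2 3]
#          [4 5 6]]"
```

Compare this with `np.concatenate([a, b])`, which joins the same 1D arrays into a single 1D array of length 6.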

**9. Array Operations**

NumPy provides many mathematical operations for arrays, such as element-wise arithmetic, trigonometric functions, statistical functions, and linear algebra operations:

In [13]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = a + b
d = np.sin(a)
e = np.mean(b)
f = np.dot(a, b)  # dot product of two arrays
print(c)  # prints "[5 7 9]"
print(d)  # prints "[0.84147098 0.90929743 0.14112001]"
print(e)  # prints "5.0"
print(f)  # prints "32"

[5 7 9]
[0.84147098 0.90929743 0.14112001]
5.0
32


Here, we're adding 'a' and 'b' element-wise, taking the sine of 'a', computing the mean of 'b', and taking the dot product of 'a' and 'b'.
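
To make the dot product less of a black box, the sketch below computes it step by step and then shows how a statistical function like 'mean()' can work along one axis of a 2D array. The names `products` and `m` are just for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# the dot product is the sum of the element-wise products: 1*4 + 2*5 + 3*6
products = a * b
print(products)         # prints "[ 4 10 18]"
print(products.sum())   # prints "32"
print(np.dot(a, b))     # prints "32"

m = np.array([[1, 2, 3], [4, 5, 6]])
print(m.mean())         # prints "3.5" (mean of all elements)
print(m.mean(axis=0))   # prints "[2.5 3.5 4.5]" (mean of each column)
print(m.mean(axis=1))   # prints "[2. 5.]" (mean of each row)
```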

These are the basic concepts of NumPy that you'll need to start working with data in Python. Once you're comfortable with these concepts, you can explore more advanced topics.