# Array-Oriented Programming with NumPy - Part 1


1. Introduction

2. Creating an array using different approaches (Constructors)

3. Indexing and slicing (Getter and Setter)

4. NumPy calculation methods (Reduction)

## 1. Introduction

The `NumPy` (Numerical Python) library is the favored Python array implementation. It provides a high-performance, feature-rich $n$-dimensional array type called `array`. Array operations are typically one or two orders of magnitude faster than those on `lists`. 

Although the built-in `lists` can also possess multiple dimensions and be processed using nested loops, a key advantage of NumPy is "array-oriented programming," which employs <u>functional-style programming</u> and <u>internal iteration</u> to make array manipulation concise and straightforward, reducing the likelihood of bugs that can arise from explicitly programmed loops.

<center><img src="https://drive.google.com/uc?id=1zVQ_M0bO6--WuESRutEovfp0zNndgNDh" width="60%" height="60%"></center>

In `Python`, types are dynamically determined, and there is no need for manual memory allocation. This flexibility highlights that `Python` variables encompass more than just their values; they also include additional information about the value's <u>type</u> and <u>size</u>:

<center><img src="https://drive.google.com/uc?id=18QfgstLWM2C0xm7EhP-aA8-IfZiFBSmn" width="60%" height="60%"></center>

Likewise, the `list` in `Python` is highly versatile, capable of storing <u>objects</u> with different types. However, this flexibility comes with a trade-off: to accommodate these adaptable types, each item in the `list` must carry its type, size, and other details. 

Every element is a `Python` object. In cases where all variables share the same type, a significant portion of this information becomes redundant, making data storage in a fixed-type `array` considerably more efficient. The distinction between a dynamic-type `list` and a fixed-type `NumPy` `array` can be illustrated as follows:

<center><img src="https://drive.google.com/uc?id=18QyPOocwIfUX6X3VXJI6uu3vQpoksfZZ" width="70%" height="70%"></center>

From the figure, we can see that at the implementation level, the `array` primarily consists of a single pointer to a contiguous data block. In contrast, the `Python` `list` features a pointer to a block of pointers, each of which points to a `Python` object, such as a `Python` `integer`.

All in all, the primary benefit of the `list` is its flexibility. Since each `list` element is a comprehensive structure containing data and type information, the `list` can accommodate data of any type. Fixed-type `NumPy` `arrays` do not offer this level of adaptability, but they have two advantages:

- They are significantly more efficient for storing and manipulating data.
- Every object consists of <u>data</u> and <u>methods</u>. The `array` object of the `NumPy` package not only stores array-based data efficiently but also provides efficient operations on that data.

In the first step, we need to install `NumPy` as follows:

In [1]:
package_name = "numpy"

try:
    __import__(package_name)
    print(f"{package_name} is already installed.")
except ImportError:
    print(f"{package_name} not found. Installing...")
    %pip install {package_name}

numpy is already installed.


The official `NumPy` documentation recommends importing the `numpy` <u>module</u> as `np` so that we can access its methods with `np.`:

In [2]:
import numpy as np

## 2. Creating an `array` using different approaches (Constructors)

### 2.1 Creating an `array` from a fixed sequence

The `numpy` module offers numerous <u>functions</u> to create arrays. In this case, we employ the `array()` function, which accepts a sequence of elements and returns a new `array` containing the input elements. For instance, let's pass a `list`:

In [3]:
import numpy as np
numbers = np.array([2, 3, 5, 7, 11])
numbers, type(numbers)

(array([ 2,  3,  5,  7, 11]), numpy.ndarray)

The `array()` function copies its <u>argument</u>'s contents into the `array`. Note that the type is `numpy.ndarray` and that the displayed output prefixes the data with the word `array`.
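
The copying behavior can be checked with a small sketch: if the contents are really copied, changing the source `list` afterwards should leave the `array` untouched. The variable name `source` is only illustrative.

```python
import numpy as np

source = [2, 3, 5, 7, 11]
numbers = np.array(source)   # the list's contents are copied into a new array

source[0] = 99               # modify the original list afterwards

print(source)                # [99, 3, 5, 7, 11]
print(numbers.tolist())      # [2, 3, 5, 7, 11]
```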

The `array()` function copies its argument's dimensions. Let's create an `array` from a two-row-by-three-column nested `list`:

In [4]:
np.array([[1, 2, 3], [4, 5, 6]]), type(np.array([[1, 2, 3], [4, 5, 6]]))

(array([[1, 2, 3],
        [4, 5, 6]]),
 numpy.ndarray)

A 2D array is a sequence of 1D arrays that represent each row.

#### `array` Attributes

The `array` function determines an array's element type from its argument's elements. We can check the element type with an array's `dtype` <u>attribute</u>:

In [5]:
integers = np.array([[1, 2, 3], [4, 5, 6]])
floats = np.array([0.0, 0.1, 0.2, 0.3, 0.4])

integers.dtype, floats.dtype

(dtype('int32'), dtype('float64'))

In the upcoming section, we will notice that several array-creation functions include a `dtype` keyword argument, allowing us to define an array's element type.
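
The `array()` function itself also accepts `dtype`. The following minimal sketch (with illustrative variable names) requests specific element types and shows that mixing integers and floats leads `NumPy` to infer a floating-point type.

```python
import numpy as np

# Request a specific element type instead of letting NumPy infer one.
as_float = np.array([1, 2, 3], dtype=np.float64)
as_int8 = np.array([1, 2, 3], dtype=np.int8)

print(as_float.dtype)   # float64
print(as_int8.dtype)    # int8

# Mixing integers and floats makes NumPy infer a floating-point type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)      # float64
```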

> For efficiency purposes, `NumPy` is written in the C programming language and utilizes C's data types. By default, `NumPy` stores integers as `int_` values, which correspond to 64-bit (8-byte) integers on most platforms. On some platforms (for example, Windows with `NumPy` versions before 2.0) the default is a 32-bit integer, which is why the output above shows `int32`. Floating-point numbers are stored as `float64` values, corresponding to 64-bit (8-byte) floating-point values (`double`) in C. In our examples, we will most commonly encounter the types `int64` (or `int32`), `float64`, and `bool` for Boolean values. The full list of supported types can be found at [https://docs.scipy.org/doc/numpy/user/basics.types.html](https://docs.scipy.org/doc/numpy/user/basics.types.html).

The attribute `ndim` contains an array's number of dimensions and the attribute `shape` contains a `tuple` specifying an array's dimensions: 

In [6]:
print(integers.ndim)
print(floats.ndim)

2
1


In [7]:
print(integers.shape)
print(floats.shape)

(2, 3)
(5,)


Here, `integers` has 2 rows and 3 columns (6 elements), and `floats` is one-dimensional, containing 5 floating-point numbers.

We can view an array's total number of elements with the attribute `size` and the number of bytes required to store each element with `itemsize`:

In [8]:
print(integers.size)
print(integers.itemsize)
print(floats.size)
print(floats.itemsize)

6
4
5
8


Note that the `size` of `integers` is the product of the values in its `shape` `tuple`: two rows with three elements each, totaling six elements. The `itemsize` of `integers` is 4 because, in the output above, it holds `int32` values (on platforms where the default integer type is `int64`, it would be 8), whereas the `itemsize` of `floats` is 8 because it holds `float64` values.
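
As a quick check of how these attributes fit together, the sketch below multiplies `size` by `itemsize` and compares the result with the `nbytes` attribute, which reports the total bytes used by the elements. The `dtype=np.int64` argument is an assumption added here so that the numbers are the same on every platform.

```python
import numpy as np

# dtype is fixed explicitly so the result does not depend on the platform default.
integers = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)
floats = np.array([0.0, 0.1, 0.2, 0.3, 0.4])

print(integers.size * integers.itemsize, integers.nbytes)  # 48 48
print(floats.size * floats.itemsize, floats.nbytes)        # 40 40
```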

### 2.2 Filling `array` with specific values

`NumPy` offers the functions `zeros()`, `ones()`, and `full()` for creating arrays filled with 0s, 1s, or a specified value, respectively. By default, `zeros()` and `ones()` generate arrays containing `float64` values. We will demonstrate how to customize the element type shortly. The first argument for these functions should be either an `integer` or a `tuple` of integers defining the desired dimensions. When given an integer, each function returns a one-dimensional array containing the specified number of elements:

In [9]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

When provided with a `tuple` of integers, these functions return a multidimensional array featuring the specified dimensions. We can define the array's element type using the `dtype` keyword argument for the `zeros()` and `ones()` functions:

In [10]:
np.ones((2, 4), dtype=np.int32)

array([[1, 1, 1, 1],
       [1, 1, 1, 1]])

The `array` returned by `full()` contains elements with the second argument's value and type: 

In [11]:
np.full((3, 5), 13+2j), np.full((3, 5), 13+2j).dtype

(array([[13.+2.j, 13.+2.j, 13.+2.j, 13.+2.j, 13.+2.j],
        [13.+2.j, 13.+2.j, 13.+2.j, 13.+2.j, 13.+2.j],
        [13.+2.j, 13.+2.j, 13.+2.j, 13.+2.j, 13.+2.j]]),
 dtype('complex128'))
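
To illustrate how `full()` picks its element type, here is a small sketch with an integer fill value, a float fill value, and an explicit `dtype` that overrides the inferred type. The variable names are illustrative only.

```python
import numpy as np

sevens = np.full((2, 3), 7)                       # integer fill value -> integer array
halves = np.full((2, 3), 0.5)                     # float fill value -> float64 array
as_float = np.full((2, 3), 7, dtype=np.float64)   # explicit dtype overrides the inferred type

print(sevens.dtype.kind)    # i  (a signed integer; the exact width depends on the platform)
print(halves.dtype)         # float64
print(as_float.tolist())    # [[7.0, 7.0, 7.0], [7.0, 7.0, 7.0]]
```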

### 2.3 Creating an `array` from a sequence generated by different methods

#### Creating a sequence with a fixed step using `arange()`

We can employ `NumPy`'s `arange()` function to create integer ranges, similar to using the built-in `range()` function. With a single argument, `arange()` counts from 0 up to, but not including, that value. With two arguments, they determine the starting and ending values of the range, with the ending value excluded from the array. The optional third argument represents the step size, which has a default value of 1:

In [12]:
np.arange(5)

array([0, 1, 2, 3, 4])

In [13]:
np.arange(5, 10)

array([5, 6, 7, 8, 9])

In [14]:
np.arange(10, 1, -2) 

array([10,  8,  6,  4,  2])

Note that, like `range()`, `arange()` accepts up to three arguments, `numpy.arange(start, stop, step)`, of which the first and third can be omitted.
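
The correspondence with `range()` can be verified directly. The sketch below also shows one difference: `arange()` accepts a non-integer step, which `range()` does not.

```python
import numpy as np

# Same start/stop/step semantics as range(), with the stop value excluded.
print(list(range(10, 1, -2)))          # [10, 8, 6, 4, 2]
print(np.arange(10, 1, -2).tolist())   # [10, 8, 6, 4, 2]

# Unlike range(), arange() also accepts a non-integer step.
halves = np.arange(0, 2, 0.5)
print(halves.tolist())                 # [0.0, 0.5, 1.0, 1.5]
```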

#### Creating a sequence with a fixed number of samples using `linspace()`

Additionally, we can generate evenly spaced floating-point ranges using `NumPy`'s `linspace()` function. The first two arguments of the function determine the starting and ending values of the range, with the ending value included in the `array`. The optional keyword argument `num` designates the number of evenly spaced values to create:

In [15]:
np.linspace(0.0, 1.0, num=5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])
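
Two optional keyword arguments of `linspace()` help to see what "evenly spaced" means here. In this sketch, `retstep=True` also returns the spacing between samples, and `endpoint=False` excludes the ending value, as `arange()` does.

```python
import numpy as np

# retstep=True additionally returns the spacing between consecutive samples.
values, step = np.linspace(0.0, 1.0, num=5, retstep=True)
print(values.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]
print(step)              # 0.25

# endpoint=False excludes the ending value, so five samples cover 0.0 up to 0.8.
open_ended = np.linspace(0.0, 1.0, num=5, endpoint=False)
print(open_ended)
```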

#### Reshaping an `array` 

We can also first create an `array` using the previous methods and then utilize the `array` method `reshape()` to convert the one-dimensional array into a multidimensional array. Let's generate an array containing values from 1 to 20 and then reshape it into a matrix with four rows and five columns:

In [16]:
np.arange(1, 21).reshape(4, 5)

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20]])

Note the ***chained method*** calls in the previous example. Initially, `arange()` generates an array containing values 1 to 20. Then, we invoke `reshape()` on that array to obtain the displayed 4-by-5 array. We can `reshape()` any array as long as the new shape contains the same number of elements as the original. Thus, a six-element one-dimensional array can be transformed into a 3-by-2 or 2-by-3 array, and vice versa!
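
The element-count rule can be sketched as follows: a six-element array reshapes into 2-by-3 or 3-by-2, while a shape that needs a different number of elements raises a `ValueError`.

```python
import numpy as np

six = np.arange(1, 7)            # six elements: 1 through 6

print(six.reshape(2, 3).shape)   # (2, 3)
print(six.reshape(3, 2).shape)   # (3, 2)

# Six elements cannot fill a 4-by-2 array, so NumPy raises ValueError.
try:
    six.reshape(4, 2)
except ValueError as error:
    print("reshape failed:", error)
```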

### Example 1: `list` vs. `array` performance: Introducing `%%timeit`

Most `array` operations execute significantly faster than corresponding `list` operations. To demonstrate, we'll use the `%%timeit` magic command, which benchmarks the average duration of operations. 

In [17]:
import random

Here, let's use the `random` module’s `randint()` function with a list comprehension to create a list of six million die rolls and time the operation using `%%timeit`:

In [18]:
%%timeit 
rolls_list = [random.randint(1, 6) for i in range(0, 6_000_000)]  # underscores group the digits of a long integer for readability

3.69 s ± 48 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


> By default, `%%timeit` executes a statement in a loop, and it runs the loop seven times. If we do not indicate the number of loops, [`%%timeit`](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit) chooses an appropriate value.

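Now, let's use the `randint()` function from the `numpy.random` module to create an array of six million die rolls. Unlike `random.randint()`, its upper bound is exclusive, so we pass 7 to obtain values from 1 to 6: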

In [19]:
%%timeit 
rolls_array = np.random.randint(1, 7, 6_000_000)

42.1 ms ± 227 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
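
The `%%timeit` magic is only available in IPython and Jupyter. Outside a notebook, a similar comparison can be sketched with the standard library's `timeit` module. The problem size `N`, the helper function names, and the repeat count of 5 are choices made here to keep the run short; they are not from the measurements above, and the timings will differ between machines.

```python
import random
import timeit

import numpy as np

N = 100_000  # smaller than the six million used above, to keep the run short


def rolls_with_list():
    """Build a list of N die rolls with a list comprehension."""
    return [random.randint(1, 6) for _ in range(N)]


def rolls_with_array():
    """Build an array of N die rolls; the upper bound 7 is exclusive."""
    return np.random.randint(1, 7, N)


# number=5 calls each function five times and returns the total elapsed seconds.
list_seconds = timeit.timeit(rolls_with_list, number=5)
array_seconds = timeit.timeit(rolls_with_array, number=5)

print(f"list:  {list_seconds / 5:.4f} s per run")
print(f"array: {array_seconds / 5:.4f} s per run")
```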


## 3. Indexing and slicing (Getter and Setter)

One-dimensional `arrays` can be *indexed* and *sliced* using the same syntax and techniques applied when handling other sequence data types, such as built-in `lists` or `tuples`.
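
For example, the familiar `list` syntax for positive and negative indices and for slices with a step carries over unchanged. The array `primes` below is only illustrative.

```python
import numpy as np

primes = np.array([2, 3, 5, 7, 11])

print(primes[0])              # 2   (first element)
print(primes[-1])             # 11  (last element)
print(primes[1:4].tolist())   # [3, 5, 7]
print(primes[::2].tolist())   # [2, 5, 11]
```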

To select an element in a two-dimensional array, specify two indices containing the element's row and column indices in square brackets:

In [20]:
import numpy as np
grades = np.array([[87, 96, 70], [60, 87, 90],
                   [94, 77, 92], [100, 81, 82]])
grades

array([[ 87,  96,  70],
       [ 60,  87,  90],
       [ 94,  77,  92],
       [100,  81,  82]])

In [21]:
grades[0, 1]  # row 0, column 1

96

To select a single row, we can specify only one index in square brackets, as in `grades[1]`. Writing `grades[1, :]` with an explicit slice for the columns is equivalent:

In [22]:
grades, grades[1,:]

(array([[ 87,  96,  70],
        [ 60,  87,  90],
        [ 94,  77,  92],
        [100,  81,  82]]),
 array([60, 87, 90]))

To select multiple sequential rows, use slice notation:

In [23]:
grades[0:2]

array([[87, 96, 70],
       [60, 87, 90]])

To select multiple non-sequential rows, use a list of row indices, which is called *fancy indexing*:

In [24]:
grades[[1, 3]]

array([[ 60,  87,  90],
       [100,  81,  82]])

Let's select only the elements in the first column: 

In [25]:
grades, grades[:, 0]

(array([[ 87,  96,  70],
        [ 60,  87,  90],
        [ 94,  77,  92],
        [100,  81,  82]]),
 array([ 87,  60,  94, 100]))

The 0 after the comma signifies that we are selecting only column 0. The `:` before the comma indicates which rows within that column to choose. In this instance, `:` is a slice representing all rows. We can also select consecutive columns using a slice:

In [26]:
grades[:, 1:3]

array([[96, 70],
       [87, 90],
       [77, 92],
       [81, 82]])

or specific columns with fancy indexing using a list of column indices:

In [27]:
grades, grades[:, [0, 2]]

(array([[ 87,  96,  70],
        [ 60,  87,  90],
        [ 94,  77,  92],
        [100,  81,  82]]),
 array([[ 87,  70],
        [ 60,  90],
        [ 94,  92],
        [100,  82]]))

An `array` is <u>mutable</u>. Therefore, to modify its elements, we can place any of the previous indexing expressions on the left-hand side of an assignment:

In [28]:
print(grades)
grades[3, 2] = 42
grades

[[ 87  96  70]
 [ 60  87  90]
 [ 94  77  92]
 [100  81  82]]


array([[ 87,  96,  70],
       [ 60,  87,  90],
       [ 94,  77,  92],
       [100,  81,  42]])
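
The same idea extends beyond single elements. As a minimal sketch, a row index, a column slice, or a fancy index can each appear on the left-hand side. Note that assigning a single value to a slice, as in the second statement, stores that value in every selected element.

```python
import numpy as np

grades = np.array([[87, 96, 70], [60, 87, 90],
                   [94, 77, 92], [100, 81, 82]])

grades[0] = [0, 0, 0]     # replace an entire row
grades[:, 2] = 100        # store 100 in every element of column 2
grades[[1, 3], 0] = 50    # fancy indexing: rows 1 and 3 of column 0

print(grades.tolist())
# [[0, 0, 100], [50, 87, 100], [94, 77, 100], [50, 81, 100]]
```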

### Views: Shallow copies

***Views*** are objects that see the data in other objects, instead of having their own copies of the data. Views are also referred to as ***shallow copies***. Several `array` methods and slicing operations generate views of an `array`'s data. The `array` method `view()` returns a new `array` object with a view of the original `array` object's data. First, let's create an `array` and a view of that `array`:

In [30]:
numbers = np.arange(1, 6)
numbers2 = numbers.view()

We can use the built-in `id()` function to verify that `numbers` and `numbers2` are different objects:

In [31]:
id(numbers), id(numbers2)

(2566766449744, 2566766451472)

`NumPy` also has a handy function called `shares_memory()` that can be utilized in this scenario:

In [32]:
np.shares_memory(numbers, numbers2)

True
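
Another way to inspect this relationship, shown here as a small sketch, is the `base` attribute: for a view it refers to the array that owns the data, and for an array that owns its data it is `None`.

```python
import numpy as np

numbers = np.arange(1, 6)
numbers2 = numbers.view()

print(numbers2 is numbers)        # False: two distinct array objects
print(numbers2.base is numbers)   # True: numbers2 borrows its data from numbers
print(numbers.base is None)       # True: numbers owns its data
```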

To prove that `numbers2` views the same data as `numbers`, let's modify an element in `numbers`, then display both arrays:

In [33]:
numbers[1] *= 10
numbers

array([ 1, 20,  3,  4,  5])

In [34]:
numbers2

array([ 1, 20,  3,  4,  5])

Similarly, changing a value in the view also changes that value in the original array:

In [35]:
numbers2[1] /= 5
numbers, numbers2

(array([1, 4, 3, 4, 5]), array([1, 4, 3, 4, 5]))

Slices also create views. Let’s make `numbers2` a slice that views only the first three elements of `numbers`:

In [36]:
numbers2 = numbers[0:3]
numbers2

array([1, 4, 3])

Now, let's modify an element both arrays share, then display them. Again, we see that `numbers2` is a view of `numbers`:

In [37]:
numbers[1] *= 20
numbers

array([ 1, 80,  3,  4,  5])

In [38]:
numbers2

array([ 1, 80,  3])

> Note that this behavior differs from that of a `list`, where slicing creates a new `list` that is independent of the original!
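
The contrast can be seen side by side in the following sketch, which applies the same slice and the same modification to a `list` and to an `array`. The variable names are illustrative.

```python
import numpy as np

values_list = [1, 2, 3, 4, 5]
list_slice = values_list[0:3]      # a new list, independent of values_list
values_list[1] = 80
print(list_slice)                  # [1, 2, 3]

values_array = np.array([1, 2, 3, 4, 5])
array_slice = values_array[0:3]    # a view onto the same data
values_array[1] = 80
print(array_slice.tolist())        # [1, 80, 3]
```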

### Deep Copies

While views are distinct `array` objects, they save memory by sharing element data with other `arrays`. Nonetheless, when dealing with mutable values, it is occasionally essential to create a ***deep copy*** containing independent copies of the original data.

> This is particularly crucial in multi-core programming, where different components of our program may try to modify our data simultaneously, potentially leading to data corruption.

The `array` method `copy()` returns a new `array` object with a deep copy of the original `array` object's data. First, let's create an `array` and a deep copy of that `array`:

In [39]:
numbers = np.arange(1, 6)
numbers2 = numbers.copy()

To prove that `numbers2` has a separate copy of the data in `numbers`, let's modify an element in `numbers`, then display both arrays: 

In [40]:
numbers[1] *= 5
numbers

array([ 1, 10,  3,  4,  5])

In [41]:
numbers2

array([1, 2, 3, 4, 5])

>  Recall that if we need deep copies of other types of `Python` objects, just pass them to the `copy` module’s `deepcopy()` function. 
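
As a brief reminder of how that works for ordinary `Python` objects, the sketch below compares `copy.copy()` (a shallow copy) with `copy.deepcopy()` on a nested `list`.

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)      # new outer list, but the same inner lists
deep = copy.deepcopy(nested)     # new outer list and new inner lists

nested[0][0] = 99                # modify an inner list of the original

print(shallow[0][0])   # 99 (the inner list is shared)
print(deep[0][0])      # 1  (the deep copy is independent)
```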

### More about Reshaping and Transposing 

We've used `array` method `reshape()` to produce two-dimensional arrays from one-dimensional ranges. `NumPy` provides various other ways to reshape arrays.

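Both the `reshape()` and `resize()` array methods allow us to alter an array's dimensions. The `reshape()` method returns a view (shallow copy) of the original array with the new dimensions whenever possible, leaving the original array's shape unaltered. Because the view shares its data with the original, modifying an element through the view also changes the original: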

In [42]:
grades = np.array([[87, 96, 70], [99, 87, 90]])
grades

array([[87, 96, 70],
       [99, 87, 90]])

In [43]:
grades2 = grades.reshape(1, 6)

In [44]:
grades2[0, 0] = 0
grades2, grades

(array([[ 0, 96, 70, 99, 87, 90]]),
 array([[ 0, 96, 70],
        [99, 87, 90]]))
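
`reshape()` cannot always return a view. The following sketch uses `np.shares_memory()` to show one case where a view is possible and one where `NumPy` has to copy, namely when the data of a transposed array is no longer laid out contiguously in the requested order. The array `grid` is illustrative.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)

flat_view = grid.reshape(6)      # contiguous data: a view is possible
flat_copy = grid.T.reshape(6)    # transposed data is not contiguous: NumPy copies

print(np.shares_memory(grid, flat_view))   # True
print(np.shares_memory(grid, flat_copy))   # False
print(flat_copy.tolist())                  # [0, 3, 1, 4, 2, 5]
```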

A widely used technique involves using `-1` to specify the shape in `reshape()`. The length of the dimension set to `-1` is automatically deduced based on the specified values of other dimensions:

In [45]:
grades, grades.reshape(-1, 3) # Same as grades.reshape(2, 3)

(array([[ 0, 96, 70],
        [99, 87, 90]]),
 array([[ 0, 96, 70],
        [99, 87, 90]]))
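
A short sketch of how the `-1` placeholder is resolved for a 12-element array: the missing length is the total number of elements divided by the product of the lengths that were given.

```python
import numpy as np

values = np.arange(12)

print(values.reshape(3, -1).shape)      # (3, 4)    because 12 / 3 = 4
print(values.reshape(-1, 6).shape)      # (2, 6)    because 12 / 6 = 2
print(values.reshape(2, 3, -1).shape)   # (2, 3, 2) because 12 / (2 * 3) = 2

# 12 is not divisible by 5, so no length can be inferred.
try:
    values.reshape(5, -1)
except ValueError as error:
    print("reshape failed:", error)
```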

Method `resize()`, on the other hand, modifies the original `array`'s shape <u>in-place</u>:

In [46]:
grades.resize(1, 6)
grades

array([[ 0, 96, 70, 99, 87, 90]])

We can also do the opposite: take a multidimensional array and flatten it into a single dimension with the method `flatten()`, which deep copies the original array's data:

In [47]:
grades = np.array([[87, 96, 70], [99, 87, 90]])
grades

array([[87, 96, 70],
       [99, 87, 90]])

In [48]:
flattened = grades.flatten()
flattened

array([87, 96, 70, 99, 87, 90])

In [49]:
flattened[0] = 100
grades # Original array does not change

array([[87, 96, 70],
       [99, 87, 90]])
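
For contrast, `NumPy` also provides the `ravel()` method, which is not used elsewhere in this section. It flattens an array too, but returns a view when possible. The sketch below compares the two with `np.shares_memory()`.

```python
import numpy as np

grades = np.array([[87, 96, 70], [99, 87, 90]])

flattened = grades.flatten()   # always an independent copy
raveled = grades.ravel()       # a view here, because the data is contiguous

print(np.shares_memory(grades, flattened))   # False
print(np.shares_memory(grades, raveled))     # True

raveled[0] = 100               # writing through the view changes grades
print(grades[0, 0])            # 100
print(flattened[0])            # 87
```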

Additionally, we can transpose an `array`'s rows and columns: the `T` attribute returns a transposed view of the array.

Assume that the original `grades` `array` presents two students' grades (the rows) across three exams (the columns). Let's transpose the rows and columns to examine the data as the grades for three exams (the rows) taken by two students (the columns):

In [50]:
grades.T

array([[87, 99],
       [96, 87],
       [70, 90]])

Transposing does not modify the original array:

In [51]:
grades

array([[87, 96, 70],
       [99, 87, 90]])
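
Because `T` returns a view, reading it leaves `grades` unchanged, but assigning through the transposed view does modify the shared data. A minimal sketch:

```python
import numpy as np

grades = np.array([[87, 96, 70], [99, 87, 90]])
transposed = grades.T

print(transposed.shape)                       # (3, 2)
print(np.shares_memory(grades, transposed))   # True

# Row 0, column 1 of the transpose is row 1, column 0 of grades.
transposed[0, 1] = 0
print(grades[1, 0])                           # 0
```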

Finally, we can combine `arrays` by adding more columns or more rows — known as horizontal stacking and vertical stacking. Let's first create another 2-by-3 `array` of grades:

In [52]:
grades2 = np.array([[94, 77, 90], [100, 81, 82]])
grades2

array([[ 94,  77,  90],
       [100,  81,  82]])

Suppose `grades2` represents three more exam grades for the two students in the `grades` array. We can merge `grades` and `grades2` using `NumPy`'s `hstack()` (horizontal stack) function by passing a `tuple` containing the arrays to combine. The extra parentheses are necessary because `hstack()` expects a single argument:

In [53]:
np.hstack((grades, grades2))

array([[ 87,  96,  70,  94,  77,  90],
       [ 99,  87,  90, 100,  81,  82]])

Moving forward, let's suppose that `grades2` represents the grades of two additional students on three exams. In this scenario, we can combine `grades` and `grades2` using `NumPy`'s `vstack()` (vertical stack) function:

In [54]:
np.vstack((grades, grades2))

array([[ 87,  96,  70],
       [ 99,  87,  90],
       [ 94,  77,  90],
       [100,  81,  82]])
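
The effect of the two stacking functions on the shape can be summarized in a short sketch. It also compares the results with `np.concatenate()`, a more general function not covered above, which for two-dimensional arrays joins along `axis=1` (columns) or `axis=0` (rows).

```python
import numpy as np

grades = np.array([[87, 96, 70], [99, 87, 90]])
grades2 = np.array([[94, 77, 90], [100, 81, 82]])

wide = np.hstack((grades, grades2))   # same rows, more columns
tall = np.vstack((grades, grades2))   # same columns, more rows

print(wide.shape)   # (2, 6)
print(tall.shape)   # (4, 3)

# For two-dimensional arrays these match concatenation along axis 1 and axis 0.
print(np.array_equal(wide, np.concatenate((grades, grades2), axis=1)))   # True
print(np.array_equal(tall, np.concatenate((grades, grades2), axis=0)))   # True
```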

### Exercise 1: Suppose we are developing a chess game that provides two special checkerboards as follows:

<center><img src="https://drive.google.com/uc?id=1zAMarVoAZE2immDDiwo9JGQ2eGNkEVTJ" width="20%" height="20%"></center>

<center><img src="https://drive.google.com/uc?id=1zA_3dLW7dJOFh9SaMVIsuGJo73JckWF7" width="40%" height="40%"></center>

We decide to use 1 to represent the white square and 0 to represent the black square. Write a program to create two 2D arrays to represent the two checkerboards as follows:

```python
[[1, 0, 1, 0, 1, 0],
 [0, 1, 0, 1, 0, 1],
 [1, 0, 1, 0, 1, 0],
 [0, 1, 0, 1, 0, 1],
 [1, 0, 1, 0, 1, 0],
 [0, 1, 0, 1, 0, 1]]
```

```python
[[1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1],
 [0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0],
 [1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1]]
```

Note that you should not hardcode the above arrays directly; use `NumPy` functions and methods to create them. After you have finished the exercise, you can display a checkerboard using the following code cell, which assumes that the checkerboard to plot is stored in a variable named `chb2`.

In [None]:
# Your answer here

In [None]:
# Your answer here

In [None]:
# Plot the checkerboard
package_name = "matplotlib"

try:
    __import__(package_name)
    print(f"{package_name} is already installed.")
except ImportError:
    print(f"{package_name} not found. Installing...")
    %pip install {package_name}

import matplotlib.pyplot as plt
plt.imshow(chb2, cmap='gray')
plt.show()

## 4. `NumPy` calculation methods (Reduction)

An `array` includes several methods that carry out computations based on its contents. **By default, these methods disregard the array's shape and utilize all the elements in the calculations.** 

For instance, when computing the mean of an array, it sums all of its elements irrespective of its shape, and then divides by the total number of elements. **We can also execute these calculations on each dimension.** For example, in a two-dimensional array, we can determine the mean of each row and each column.

In [55]:
import numpy as np
grades = np.array([[87, 96, 70], [100, 87, 90],
                    [94, 77, 90], [100, 81, 82]])
grades

array([[ 87,  96,  70],
       [100,  87,  90],
       [ 94,  77,  90],
       [100,  81,  82]])

We can use methods to calculate `sum()`, `min()`, `max()`, `mean()`, `std()` (standard deviation) and `var()` (variance) — each is a functional-style programming reduction:

In [56]:
print(grades.sum())
print(grades.min())
print(grades.max())
print(grades.mean())
print(grades.std())
print(grades.var())

1054
70
100
87.83333333333333
8.792357792739987
77.30555555555556
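
To make the idea of a reduction concrete, the sketch below recomputes the sum, mean, and variance with explicit nested loops and compares them with the array methods. It relies on the fact that `var()` and `std()` compute the population variance and standard deviation by default (dividing by the number of elements).

```python
import math

import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90],
                   [94, 77, 90], [100, 81, 82]])

# External iteration: visit every element explicitly.
total = 0
count = 0
for row in grades:
    for grade in row:
        total += grade
        count += 1
mean = total / count

# Population variance: the average squared deviation from the mean.
squared_deviations = 0.0
for row in grades:
    for grade in row:
        squared_deviations += (grade - mean) ** 2
variance = squared_deviations / count

print(total, grades.sum())                                # 1054 1054
print(math.isclose(mean, grades.mean()))                  # True
print(math.isclose(variance, grades.var()))               # True
print(math.isclose(math.sqrt(variance), grades.std()))    # True
```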


### Calculations by Row or Column

Numerous calculation methods can be applied to specific `array` dimensions, referred to as the `array`'s ***axes***. These methods accept an `axis` keyword argument that designates the dimension to be utilized in the calculation, providing a convenient means to perform computations by row or column in a two-dimensional `array`.

Suppose we want to find the maximum grade for each exam, represented by the columns of `grades`. By specifying `axis=0`, the calculation is performed on all the row values within each column:

In [57]:
grades, grades.max(axis=0), grades.argmax(axis=0)

(array([[ 87,  96,  70],
        [100,  87,  90],
        [ 94,  77,  90],
        [100,  81,  82]]),
 array([100,  96,  90]),
 array([1, 0, 1], dtype=int64))

Here, 100 is the maximum value in the first column and its corresponding index (row) is 1 (if there are duplicate elements, the index of the first element will be reported). 96 and 90 are the maximum values in the second and third columns, respectively.
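
The tie-breaking rule can be checked on the first column, which contains 100 twice. The sketch also shows that, without `axis`, `argmax()` returns an index into the flattened array, which can be converted back to a row and column with `divmod()`.

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90],
                   [94, 77, 90], [100, 81, 82]])

first_column = grades[:, 0]          # 87, 100, 94, 100
print(first_column.argmax())         # 1 (the first of the two 100s)

# Without axis, the index refers to the flattened array.
flat_index = grades.argmax()
print(flat_index)                    # 3

# Convert the flat index back to (row, column) using the number of columns.
print(divmod(int(flat_index), grades.shape[1]))   # (1, 0)
```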

In [58]:
grades, grades.mean(axis=0)

(array([[ 87,  96,  70],
        [100,  87,  90],
        [ 94,  77,  90],
        [100,  81,  82]]),
 array([95.25, 85.25, 83.  ]))

Hence, 95.25 above represents the average of the first column's grades (87, 100, 94, and 100), 85.25 is the average of the second column's grades (96, 87, 77, and 81), and 83 is the average of the third column's grades (70, 90, 90, and 82).

Similarly, specifying `axis=1` performs the calculation on all the column values within each individual row. To determine each student's average grade for all exams, we can use:

In [59]:
grades.mean(axis=1)

array([84.33333333, 92.33333333, 87.        , 87.66666667])

This generates four averages — one for the values in each row. Therefore, 84.33333333 is the average of row 0's grades (87, 96, and 70), and the other averages correspond to the remaining rows. For more methods, refer to [https://numpy.org/doc/stable/reference/arrays.ndarray.html](https://numpy.org/doc/stable/reference/arrays.ndarray.html).

<center><img src="https://drive.google.com/uc?id=18XpLrRhwLp9YoFqg4AdmqJvZ4pb1-Wu-" width="50%" height="50%"></center>
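
One way to remember the `axis` argument is that the named axis is the one that disappears from the result. The following sketch checks the resulting shapes and reproduces both calculations with explicit loops over rows and columns.

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90],
                   [94, 77, 90], [100, 81, 82]])

print(grades.shape)                 # (4, 3)
print(grades.mean(axis=0).shape)    # (3,)  one value per column (exam)
print(grades.mean(axis=1).shape)    # (4,)  one value per row (student)

# The same results with explicit loops over rows and columns.
row_means = [row.mean() for row in grades]
column_means = [grades[:, j].mean() for j in range(grades.shape[1])]

print(np.allclose(row_means, grades.mean(axis=1)))       # True
print(np.allclose(column_means, grades.mean(axis=0)))    # True
```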

> For more operations such as methods related to linear algebra, we can use the sub-module `numpy.linalg`, which implements basic linear algebra, such as solving linear systems, singular value decomposition, etc. However, it is not guaranteed to be compiled using efficient routines, and thus we recommend the use of `scipy.linalg`, which we will introduce in a later chapter.

### Exercise 2: Find the maximum and minimum values of the function $f(x) = x^2$ on the interval $[-3, 5]$ by substituting 1000 evenly spaced numbers between $-3$ and $5$ into the function. What is the corresponding $x$ value for the maximum and minimum values and how do they compare with the actual values?

Hint: You may find `np.linspace()`, `np.max()/np.min()` and `np.argmax()/np.argmin()` useful.

In [None]:
# Your answer here

## References

1. [https://scipy-lectures.org/intro/numpy/index.html](https://scipy-lectures.org/intro/numpy/index.html)

2. [https://scipy-lectures.org/advanced/advanced_numpy/index.html](https://scipy-lectures.org/advanced/advanced_numpy/index.html)

3. [https://jakevdp.github.io/PythonDataScienceHandbook/02.01-understanding-data-types.html](https://jakevdp.github.io/PythonDataScienceHandbook/02.01-understanding-data-types.html)

## Key terms

- **Array-Oriented Programming**: This is a style of programming that emphasizes the use of arrays and collective operations, reducing the need for explicit loops. It leads to concise, readable, and often more efficient code.
- **Functional-Style Programming**: This is a paradigm where programs are built using pure functions without changing the state or using mutable data. It encourages the creation of new objects from existing ones and minimizes side effects.
- **Class**: In object-oriented programming (OOP), a class is a blueprint for creating objects. It defines a set of attributes that will characterize any object that is instantiated from the class. These attributes are data members and methods, accessed via dot notation.
- **Object**: In object-oriented programming (OOP), an object is an instance of a class. It is a concrete entity that represents something in the program that can perform certain actions and has certain characteristics. Objects are characterized by methods (representing behaviors) and attributes (representing states or properties).
- **Type**: The term "type" in programming generally refers to the kind of data that a variable contains. For example, the type could be an integer, a string, a list, a boolean, or a more complex user-defined type. 
- **Module**: A module is a file containing Python definitions and statements. The file name is the module name with the suffix `.py` added. Modules are used to organize code in a logical and manageable way. In other languages, similar concepts are packages or libraries.
- **Function**: A function is a block of organized, reusable code that is used to perform a single, related action. Functions provide better modularity for our application and a high degree of code reuse.
- **Constructor**: A special method in a class in object-oriented programming. Its primary purpose is to initialize new instances of that class. It can set the initial state of an object by assigning values to instance variables. The constructor method is automatically called when an object of a class is instantiated.
- **Argument**: In programming, an argument is a value that is passed to a function or method when it is called. In function definitions, the variables that receive these passed values are called parameters.
- **Keyword**: In programming, a keyword is a word that is reserved by a program because the word has a special meaning. Keywords can be commands or parameters and cannot be used for other purposes, such as naming variables or functions.
- **Attribute**: In OOP, an attribute is a named property of a class. It has a type and is typically used to hold data. The state of an object is determined by the values of its attributes.
- **Method**: A method in OOP is a function that belongs to an object. Methods define the behavior of the object. They often operate on the object's attributes and may be used to modify the object's state.
- **Chain Method**: In programming, method chaining refers to the practice of making multiple method calls in a single statement. This works because each call returns an object (either the object on which it was called or a new one), allowing another method to be called immediately on the result. This technique can make code more concise.
- **Getter**: A getter is a method used in OOP that gets the value of a private attribute. It's a way of accessing read-only data. It provides indirect access to private attributes, often for the purpose of retrieving values.
- **Setter**: A setter is a method used in OOP to control changes to a variable. It's a way of accessing write-only data. Setter methods are commonly used to validate the data that will be stored in an object, ensuring it meets certain criteria before it is assigned to an attribute.
- **Indexing**: Indexing is a programming concept used to access specific elements within data structures such as arrays, lists, or strings. Each element in the data structure is assigned a unique numerical identifier, known as an index, which can be used to retrieve that element. Indexing begins from zero in Python.
- **Fancy Indexing**: Fancy Indexing, also known as integer array indexing or boolean indexing, is a method in Python (especially NumPy) that allows you to select and manipulate complex patterns of data. It involves passing an array (or a list) of integers or booleans in place of an index to access multiple, non-sequential, or conditional elements of the data structure.
- **Slicing**: Slicing is a method used in programming to access a range or subset of elements from a data structure. It is particularly useful when you want to create a new list, array, or string from a specific portion of an existing one.
- **Copies**: In programming, a copy refers to creating a new instance that replicates the content of an existing object or data structure. Making a copy involves allocating new memory space where the copied data is stored. Changes made to the copy do not affect the original data structure and vice versa. This process is also known as deep copying.
- **Views**: In programming, a view, on the other hand, does not involve the creation of a new data structure. Instead, it creates a new way of accessing or viewing the original data. When changes are made to the view, they directly affect the original data. This is because the view and the original data structure share the same memory space. This process is also known as shallow copying.
- **Mutable**: In programming, a mutable object is one whose state or value can be changed after it's created. This means that methods or operations can modify the data held by the object without needing to create a new object.
- **Reduction**: In the context of programming and data processing, reduction often refers to the process of reducing a set of values to a single value. This can be done through various operations such as sum, product, minimum, maximum, mean, etc. For example, if you have an array of numbers, a reduction operation could be to find the sum of all the numbers in the array, thus 'reducing' the array to a single value.