Foundational Python for Data Science

By Kennedy Behrman

Addison-Wesley Professional, September 2021

https://learning.oreilly.com/library/view/foundational-python-for/9780136624417/

The first library we will look at, NumPy, is the backbone of many of the other data science libraries. In this chapter, you will learn about the NumPy array, which is an efficient multidimensional data structure.

## Installing and Importing NumPy

In [2]:
!pip install numpy



In [3]:
import numpy as np

## Creating Arrays

A NumPy array is a data structure that is designed to efficiently handle operations on large data sets. These data sets can be of varying dimensions and can contain numerous data types—though not in the same object. NumPy arrays are used as input and output to many other libraries and are used as the underpinning of other data structures that are important to data science, such as those in Pandas and SciPy.


In [4]:
np.array([1,2,3])      # Array from list

array([1, 2, 3])

In [5]:
np.zeros(3)            # Array of zeros 

array([0., 0., 0.])

In [6]:
np.ones(3)             # Array of ones 

array([1., 1., 1.])

In [7]:
np.empty(3)            # Array of arbitrary data 

array([1., 1., 1.])

In [8]:
np.arange(3)           # Array from range of numbers 

array([0, 1, 2])

In [9]:
np.arange(0, 12, 3)    # Array from range of numbers 

array([0, 3, 6, 9])

In [10]:
np.linspace(0, 21, 7)  # Array over an interval 

array([ 0. ,  3.5,  7. , 10.5, 14. , 17.5, 21. ])

Arrays have dimensions. A one-dimensional array has only one dimension, which is the number of elements. In the case of the `np.array` method, the dimension matches that of the list(s) used as input. For the `np.zeros`, `np.ones`, and `np.empty` methods, the dimension is given as an explicit argument.

The `np.arange` method produces an array in a way similar to a `range` sequence. The resulting dimension and values match those that would be produced by using `range`. You can specify beginning, ending, and step values.

The `np.linspace` method produces evenly spaced numbers over an interval. The first two arguments define the interval, and the third defines the number of items.

The `np.empty` method is useful in producing large arrays efficiently. Keep in mind that because the data is arbitrary, you should only use it in cases where you will replace all of the original data.
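
As a rough illustration of the points above, the following sketch compares `np.arange` with `range`, checks the spacing produced by `np.linspace`, and fills an `np.empty` array before reading it. The variable names are chosen here for illustration only.

```python
import numpy as np

# np.arange mirrors range: same start, exclusive stop, and step
from_arange = np.arange(0, 12, 3)
from_range = np.array(range(0, 12, 3))
same_values = np.array_equal(from_arange, from_range)   # True

# np.linspace includes both endpoints; the step is (stop - start) / (num - 1)
points = np.linspace(0, 21, 7)
step = points[1] - points[0]                             # 3.5

# np.empty skips initialization, so overwrite every element before use
buffer = np.empty(3)
buffer[:] = [10, 20, 30]
print(same_values, step, buffer)
```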

**Characteristics of an Array**

In [14]:
oned = np.arange(21)
oned

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20])

In [15]:
oned.dtype     # Data type 

dtype('int64')

In [16]:
oned.size      # Number of elements 

21

In [17]:
oned.nbytes    # Bytes(memory) consumed by elements of the array 

168

In [18]:
oned.shape     # Number of elements in each dimension 

(21,)

In [19]:
oned.ndim      # Number of dimensions 

1

In [20]:
type(oned)

numpy.ndarray

**Matrix from Lists**

In [21]:
list_o_lists = [[1,2,3],
                [4,5,6],
                [7,8,9]]

twod = np.array(list_o_lists)
twod

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [22]:
twod.shape

(3, 3)

In [23]:
twod.ndim

2

**Using `reshape`**

You can produce an array with the same elements but different dimensions by using the `reshape` method. This method takes the new shape as arguments. The next listing demonstrates using a one-dimensional array to produce a two-dimensional one and then producing one-dimensional and three-dimensional arrays from the two-dimensional one.

In [24]:
oned = np.arange(12)
oned

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [25]:
twod = oned.reshape(3,4)
twod

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [26]:
twod.reshape(12)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [27]:
twod.reshape(2,2,3)

array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 6,  7,  8],
        [ 9, 10, 11]]])

The shape you provide for an array must be consistent with the number of elements in it. For example, if you take the 12-element array `twod` and try to reshape it with dimensions that do not multiply to 12, you get an error:

In [28]:
twod.reshape(2,3)

ValueError: cannot reshape array of size 12 into shape (2,3)

In [29]:
twod.reshape(13,1)

ValueError: cannot reshape array of size 12 into shape (13,1)

Reshaping is commonly used with the `np.zeros`, `np.ones`, and `np.empty` methods to produce multidimensional arrays with default values. For example, you could create a three-dimensional array of ones like this:

In [30]:
np.ones(12).reshape(2,3,2)

array([[[1., 1.],
        [1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.],
        [1., 1.]]])
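
As an alternative to reshaping, `np.zeros`, `np.ones`, and `np.empty` also accept a shape tuple directly. The sketch below shows that both routes give the same array, and that `reshape` can infer one dimension when you pass `-1` for it. The variable names are illustrative.

```python
import numpy as np

# Passing a shape tuple creates the multidimensional array in one step
ones_direct = np.ones((2, 3, 2))
ones_reshaped = np.ones(12).reshape(2, 3, 2)
print(np.array_equal(ones_direct, ones_reshaped))   # True

# A single -1 asks reshape to infer that dimension from the element count
inferred = np.arange(12).reshape(3, -1)
print(inferred.shape)                                # (3, 4)
```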

## Indexing and Slicing

You can access the data in arrays by indexing and slicing. In the next listing, you can see that indexing and slicing with a one-dimensional array is the same as with a list. You can index individual elements from the start or end of an array by supplying an index number, or select multiple elements by using a slice.

**Indexing and Slicing a One-Dimensional Array**

In [31]:
oned = np.arange(21)
oned

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20])

In [32]:
oned[3]

3

In [33]:
oned[-1]

20

In [34]:
oned[3:9]

array([3, 4, 5, 6, 7, 8])

**Indexing and Slicing a Two-Dimensional Array**

For multidimensional arrays, you can supply one argument for each dimension. If you omit the argument for a dimension, it defaults to all elements of that dimension. So, if you supply a single number as an argument to a two-dimensional array, that number will indicate which row to return. If you supply single-number arguments for all dimensions, a single element is returned. You can also supply a slice for any dimension. In return you get a subarray of elements, whose dimensions are determined by the length of your slices. The next listing demonstrates various options for indexing and slicing a two-dimensional array.

In [35]:
twod = np.arange(21).reshape(3,7)
twod

array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20]])

In [36]:
twod[2]   # Accessing row 2

array([14, 15, 16, 17, 18, 19, 20])

In [37]:
twod[2, 3]  # Accessing item at row 2, column 3

17

In [38]:
twod[0:2]         # Accessing rows 0 and 1

array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13]])

In [40]:
twod[:, 3]        # Accessing column 3

array([ 3, 10, 17])

In [41]:
twod[0:2, -3:]    # Accessing the last three columns of rows 0 and 1

array([[ 4,  5,  6],
       [11, 12, 13]])
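
The way an index affects the shape of the result can be checked directly. In this small sketch (variable names are illustrative), an integer index removes a dimension, whereas a slice keeps it, even when the slice selects a single row.

```python
import numpy as np

twod = np.arange(21).reshape(3, 7)

# An integer index removes that dimension; a slice keeps it
row = twod[2]            # shape (7,)
row_kept = twod[2:3]     # shape (1, 7)
column = twod[:, 3]      # shape (3,)
block = twod[0:2, -3:]   # shape (2, 3)

# A step value in a slice selects every n-th element
every_other_column = twod[:, ::2]   # shape (3, 4)

for name, arr in [("row", row), ("row_kept", row_kept), ("column", column),
                  ("block", block), ("every_other_column", every_other_column)]:
    print(name, arr.shape)
```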

**Changing Values in an Array**

You can assign new values to an existing array, much as you would with a list, by using indexing and slicing. If you assign a value to a slice, the whole slice is updated with the new value. The next listing demonstrates how to update a single element and a slice of a two-dimensional array.


In [42]:
twod = np.arange(21).reshape(3,7)
twod

array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20]])

In [43]:
twod[0,0] = 33
twod

array([[33,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20]])

In [45]:
twod[1:,:3] = 0
twod

array([[33,  1,  2,  3,  4,  5,  6],
       [ 0,  0,  0, 10, 11, 12, 13],
       [ 0,  0,  0, 17, 18, 19, 20]])
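
Slice assignment is not limited to a single value. As a minimal sketch, assuming the same `twod` array as above, you can also assign a list or array whose shape is compatible with the slice:

```python
import numpy as np

twod = np.arange(21).reshape(3, 7)

# A scalar fills the whole slice
twod[1:, :3] = 0

# A sequence with a matching length is copied element by element
twod[0, :3] = [100, 200, 300]

# A one-dimensional sequence is repeated for each selected row
twod[1:, -2:] = [-1, -2]
print(twod)
```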

## Element-by-Element Operations

An array is not a sequence. Arrays do share some characteristics with lists, and on some level it is easy to think of the data in an array as a list of lists. There are many differences between arrays and sequences, however. One area of difference is when performing operations between the items in two arrays or two sequences.

Remember that when you do an operation such as multiplication with a sequence, the operation is done to the sequence, not to its contents. So, if you multiply a list by zero, the result is a list with a length of zero:

In [51]:
[1, 2, 3] * 0

[]

You cannot multiply two lists, even if they are the same length:

In [49]:
[1, 2, 3] * [4, 5, 6]

TypeError: can't multiply sequence by non-int of type 'list'

**Element-by-Element Operations with Lists**

You can write code to perform operations between the elements of lists. For example, the next listing demonstrates looping through two lists in order to create a third list that contains the results of multiplying pairs of elements. The `zip()` function is used to pair up the two lists, producing a series of tuples, with each tuple containing one element from each of the original lists.

In [56]:
L1 = list(range(10))
L2 = list(range(10, 0, -1))

L3 = []
for i, j in zip(L1, L2):
    L3.append(i*j)

for x,y,z in zip(L1, L2, L3):
    print(f"{x:2} * {y:2} = {z:4}")

 0 * 10 =    0
 1 *  9 =    9
 2 *  8 =   16
 3 *  7 =   21
 4 *  6 =   24
 5 *  5 =   25
 6 *  4 =   24
 7 *  3 =   21
 8 *  2 =   16
 9 *  1 =    9


**Element-by-Element Operations with Arrays**

While it is possible to use loops to perform element-by-element operations on lists, it is much simpler to use NumPy arrays for such operations. Arrays do element-by-element operations by default. The next listing demonstrates multiplication, addition, and division operations between two arrays. Notice that the operations in each case are done between the elements of the arrays.

In [57]:
array1 = np.array(L1)
array2 = np.array(L2)

In [58]:
array1 * array2

array([ 0,  9, 16, 21, 24, 25, 24, 21, 16,  9])

In [59]:
array1 + array2

array([10, 10, 10, 10, 10, 10, 10, 10, 10, 10])

In [60]:
array1 / array2

array([0.        , 0.11111111, 0.25      , 0.42857143, 0.66666667,
       1.        , 1.5       , 2.33333333, 4.        , 9.        ])

NumPy also performs element-by-element operations between arrays and numbers:

In [62]:
array1 * 10

array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

NumPy functions that are applied to an array operate on each element:

In [63]:
np.sqrt(array1)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131, 2.82842712, 3.        ])
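
To make the contrast between sequences and arrays concrete, here is a short side-by-side sketch. The names `values` and `array` are illustrative. The same operators repeat or concatenate a list but compute element by element on an array.

```python
import numpy as np

values = [1, 2, 3]
array = np.array(values)

print(values * 2)        # list repetition: [1, 2, 3, 1, 2, 3]
print(array * 2)         # element-wise scaling: [2 4 6]

print(values + values)   # list concatenation: [1, 2, 3, 1, 2, 3]
print(array + array)     # element-wise addition: [2 4 6]
```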

## Filtering Values

One of the most used aspects of NumPy arrays and the data structures built on top of them is the ability to filter values based on conditions of your choosing. In this way, you can use an array to answer questions about your data.

**Filtering Using Booleans**

The next code shows a two-dimensional array of integers, called `twod`. A second array, `mask`, has the **same dimensions** as `twod`, but it contains Boolean values. `mask` specifies which elements from `twod` to return. The resulting array contains the elements from `twod` whose corresponding positions in `mask` have the value True.

In [64]:
twod = np.arange(21).reshape(3,7)
twod

array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20]])

In [65]:
mask = np.array([[ True,  False,  True,  True,  False, True, False],
                 [ True,  False,  True,  True,  False, True, False],
                 [ True,  False,  True,  True,  False, True, False]])

In [66]:
twod[mask]

array([ 0,  2,  3,  5,  7,  9, 10, 12, 14, 16, 17, 19])

**Filtering Using Comparison**

Comparison operators, which return a single Boolean when used with single values, return an array of Booleans when used with an array. So, if you use the less-than operator (`<`) against the array `twod` as follows, the result is an array with True for every item that is below five and False for the rest:

In [67]:
twod < 5

array([[ True,  True,  True,  True,  True, False, False],
       [False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False]])

You can use this result as a mask to get only the values that are True with the comparison. For example, the next listing creates a mask and then returns only the values of `twod` that are less than 5.

In [68]:
mask = twod < 5
mask

array([[ True,  True,  True,  True,  True, False, False],
       [False, False, False, False, False, False, False],
       [False, False, False, False, False, False, False]])

In [69]:
twod[mask]

array([0, 1, 2, 3, 4])

**Filtering Using Multiple Comparisons**

As you can see, you can use comparison and order operators to easily extract knowledge from data. You can also combine these comparisons to create more complex masks. The next listing uses `&` to join two conditions to create a mask that evaluates to True only for items meeting both conditions.

In [70]:
mask = (twod < 5) & (twod%2 == 0)

twod[mask]

array([0, 2, 4])
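
Conditions can be combined in other ways as well. The following sketch, using the same `twod` array, shows `|` for "either condition" and `~` for inverting a mask. The variable names are illustrative.

```python
import numpy as np

twod = np.arange(21).reshape(3, 7)

# | keeps items meeting either condition
low_or_high = twod[(twod < 3) | (twod > 17)]
print(low_or_high)       # [ 0  1  2 18 19 20]

# ~ inverts a mask
not_even = twod[~(twod % 2 == 0)]
print(not_even)          # [ 1  3  5  7  9 11 13 15 17 19]
```

The parentheses around each comparison matter: `&` and `|` bind more tightly than comparison operators, so without them Python would not group the expression as two separate comparisons.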

**Note**

> Filtering using masks is a process that you will use time and time again, especially with Pandas `DataFrames`, which are built on top of NumPy `arrays`. You will learn about DataFrames in Chapter 9, “Pandas.”

## Views Versus Copies

NumPy arrays are designed to work efficiently with large data sets. One of the ways this is accomplished is by using **views**. When you slice an array, the returned array is a view and not a copy. A view allows you to look at the same data differently, so memory and processing power are not spent copying data every time you slice. If you change a value in a view of an array, you change that value in the original array as well as in any other views that represent that item. Filtering with a Boolean mask is different: it returns a copy, so changes to the filtered result do not affect the original array.

**Changing Values in a View**

For example, the next listing takes a slice from the array `data1` and names it `data2`. It then replaces the value `11` in `data2` with `-1`. When you go back to `data1`, you can see that the item that used to have a value of `11` is now set to `-1`.

In [71]:
data1 = np.arange(24).reshape(4,6)
data1

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])

In [72]:
data2 = data1[:2,3:]
data2

array([[ 3,  4,  5],
       [ 9, 10, 11]])

In [73]:
data2[1,2] = -1
data2

array([[ 3,  4,  5],
       [ 9, 10, -1]])

In [74]:
data1

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, -1],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
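
One way to check whether a result shares memory with its source is `np.shares_memory`. In the sketch below (variable names are illustrative), a basic slice shares memory with the original, whereas indexing with a Boolean mask produces an independent copy.

```python
import numpy as np

data1 = np.arange(24).reshape(4, 6)

sliced = data1[:2, 3:]          # basic slicing
masked = data1[data1 > 20]      # Boolean mask

print(np.shares_memory(data1, sliced))   # True: the slice is a view
print(np.shares_memory(data1, masked))   # False: the mask result is a copy

# Changing the mask result leaves the original untouched
masked[0] = -1
print(data1[3, 3])                        # 21
```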

**Changing Values in a Copy**

This behavior can lead to bugs and miscalculations, but if you understand it, you can gain some important benefits when working with large data sets. If you want to change data from a slice without changing it in the original array, you can make a copy. For example, in the next listing, notice that when an item is changed in the copy, the original array remains unchanged.

In [75]:
data1 = np.arange(24).reshape(4,6)
data1

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])

In [77]:
data2 = data1[:2,3:].copy()
data2

array([[ 3,  4,  5],
       [ 9, 10, 11]])

In [78]:
data2[1,2] = -1
data2

array([[ 3,  4,  5],
       [ 9, 10, -1]])

In [79]:
data1

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])

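Most NumPy functions and methods return a new array object rather than changing the input in place. A new object is not always a copy of the data, however: `reshape` returns a view when possible, so call `.copy()` if you need independent data. In the next listing, reshaping produces a new array and leaves the shape of `data1` unchanged: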

In [83]:
data2 = data1.reshape(6,4)
data2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])

In [84]:
data1

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23]])
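
One caveat is worth checking for yourself: a new array object does not always mean new memory. In this minimal sketch, the reshaped array has its own shape but still shares data with `data1`, and an explicit `.copy()` is what makes it independent. The name `independent` is illustrative.

```python
import numpy as np

data1 = np.arange(24).reshape(4, 6)
data2 = data1.reshape(6, 4)

print(data1.shape, data2.shape)              # (4, 6) (6, 4)
print(np.shares_memory(data1, data2))        # True

independent = data1.reshape(6, 4).copy()
print(np.shares_memory(data1, independent))  # False
```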

## Some Array Methods

NumPy arrays have built-in methods both to get statistical summary data and to perform matrix operations.

**Introspection**

The next listing shows methods producing summary statistics. There are methods to get the maximum, minimum, sum, mean, and standard deviation. **All these methods produce results across the whole array unless an axis is specified**. If an axis value of `1` is specified, an array with results for **each row** is produced. With an axis value of `0`, an array of results is produced for **each column**.

In [85]:
data = np.arange(12).reshape(3,4)
data

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [86]:
data.max()

11

In [87]:
data.min()

0

In [88]:
data.sum()

66

In [89]:
data.mean()

5.5

In [90]:
data.std()

3.452052529534663

In [91]:
data.sum(axis=1)

array([ 6, 22, 38])

In [92]:
data.sum(axis=0)

array([12, 15, 18, 21])
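
The `axis` argument works the same way for the other summary methods. A helpful way to read it, sketched below with the same `data` array, is that the axis you name is the one that gets collapsed.

```python
import numpy as np

data = np.arange(12).reshape(3, 4)

# axis=0 collapses the rows, leaving one result per column
print(data.mean(axis=0))    # [4. 5. 6. 7.]

# axis=1 collapses the columns, leaving one result per row
print(data.max(axis=1))     # [ 3  7 11]

# The result shape is the input shape with the chosen axis removed
print(data.sum(axis=0).shape, data.sum(axis=1).shape)   # (4,) (3,)
```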

**Matrix Operations**

The next listing demonstrates some of the matrix operations that are available with arrays. These include returning the transpose, returning matrix products, and returning the diagonal. Remember that you can use the multiplication operator (`*`) between arrays to perform element-by-element multiplication. If you want to calculate the matrix product (dot product) of two matrices, you need to use the `@` operator or the `.dot()` method.

In [93]:
A1 = np.arange(9).reshape(3,3)
A1

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [95]:
A1.T  # Transpose 

array([[0, 3, 6],
       [1, 4, 7],
       [2, 5, 8]])

In [96]:
A2 = np.ones(9).reshape(3,3)
A2

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [97]:
A1 * A2  # element-wise multiplication

array([[0., 1., 2.],
       [3., 4., 5.],
       [6., 7., 8.]])

In [98]:
A1 @ A2  # Matrix product 

array([[ 3.,  3.,  3.],
       [12., 12., 12.],
       [21., 21., 21.]])

In [99]:
A1.dot(A2)  # Matrix product = dot product

array([[ 3.,  3.,  3.],
       [12., 12., 12.],
       [21., 21., 21.]])

In [100]:
A1.diagonal()    # Diagonal 

array([0, 4, 8])
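
The difference between `*` and `@` is easier to see with matrices that are not square. In the following sketch (the names `left` and `right` are illustrative), a `(3, 4)` array and a `(4, 2)` array have a matrix product of shape `(3, 2)`, but they cannot be multiplied element by element.

```python
import numpy as np

left = np.arange(12).reshape(3, 4)
right = np.ones((4, 2))

product = left @ right
print(product.shape)    # (3, 2)

# Each entry is a row of `left` dotted with a column of `right`;
# with a column of ones, that is simply the row sum.
print(product[:, 0])    # [ 6. 22. 38.]

# Element-wise multiplication of these shapes is not possible
try:
    left * right
except ValueError as error:
    print("ValueError:", error)
```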

**Setting Type Automatically**

An array, unlike many sequence types, can contain **only one data type**. You cannot have an array that contains both strings and integers. If you do not specify the data type, NumPy guesses the type, based on the data. The next listing shows that when you start with integers, NumPy sets the data type to `int64` (the default on most 64-bit platforms). You can also see, by checking the `nbytes` attribute, that the data for this array takes 800 bytes of memory: 100 elements at 64 bits, or 8 bytes, each.

In [101]:
darray = np.arange(100)
darray

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
       68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])

In [102]:
darray.dtype

dtype('int64')

In [103]:
darray.nbytes

800

In [104]:
100*64/8

800.0
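
The same arithmetic can be done with the array's own attributes. In this sketch, `dtype=np.int64` is passed explicitly (an assumption made here so that the numbers do not depend on the platform default), and `itemsize` gives the number of bytes per element.

```python
import numpy as np

# dtype is set explicitly so the result does not depend on the platform
darray = np.arange(100, dtype=np.int64)

print(darray.itemsize)                                  # 8 bytes per element
print(darray.size * darray.itemsize)                    # 800
print(darray.nbytes == darray.size * darray.itemsize)   # True
```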

**Setting Type Explicitly**

For larger data sets, you can control the amount of memory used by setting the data type explicitly. The `int8` data type can represent numbers from `-128` to `127`, so it would be adequate for a data set of `0–99`. You can set an array’s data type at creation by using the parameter `dtype`. The next listing does this to bring the size of the data down to 100 bytes.



In [105]:
darray = np.arange(100, dtype=np.int8)
darray

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
       68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99],
      dtype=int8)

In [106]:
darray.nbytes

100

Because an array can store only one data type, you cannot insert data that cannot be cast to that data type. For example, if you try to add a string to the int8 array, you get an error:

In [107]:
darray[14] = 'a'

ValueError: invalid literal for int() with base 10: 'a'

A subtle error occurs if you assign to an array data of a finer granularity than the array’s data type; this can lead to **data loss**. For example, say that you assign the floating-point number 0.5 to an element of the int8 array. The value is silently truncated to 0:

In [109]:
darray[14] = 0.5
darray[14]

0
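
Two checks can help avoid this kind of loss. The sketch below uses `np.iinfo` to look up the range of an integer type and `astype` to convert the array to a floating-point type before assigning a fractional value. The names `limits` and `as_float` are illustrative.

```python
import numpy as np

# np.iinfo reports the range an integer type can hold
limits = np.iinfo(np.int8)
print(limits.min, limits.max)        # -128 127

darray = np.arange(100, dtype=np.int8)

# Converting to a floating-point type first keeps the fractional part
as_float = darray.astype(np.float64)
as_float[14] = 0.5
print(as_float[14])                  # 0.5
print(darray[14])                    # 14: astype returned a new array
```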

## Broadcasting

You can perform operations between arrays of different dimensions. For each dimension, the operation can be done when the sizes are the same or when the size is one for at least one of the arrays. The next listing adds 1 to each element of the array `A1` three different ways: first with an array of ones with the same dimensions `(3, 3)`, then with a one-dimensional array of shape `(3,)`, which is treated as `(1, 3)`, and finally by using the integer 1.

In [111]:
A1 = np.array([[1,2,3],
               [4,5,6],
               [7,8,9]])
A2 = np.array([[1,1,1],
               [1,1,1],
               [1,1,1]])


In [112]:
A1 + A2

array([[ 2,  3,  4],
       [ 5,  6,  7],
       [ 8,  9, 10]])

In [113]:
A2 = np.array([1,1,1])

In [114]:
A1 + A2

array([[ 2,  3,  4],
       [ 5,  6,  7],
       [ 8,  9, 10]])

In [116]:
A1 + 1

array([[ 2,  3,  4],
       [ 5,  6,  7],
       [ 8,  9, 10]])

In all three cases, the result is the same: an array of dimension `(3, 3)`. This is called **broadcasting** because a dimension of one is expanded to fit the higher dimension. So if you do an operation with arrays of dimensions `(1, 3, 4, 4)` and `(5, 3, 4, 1)`, the resulting array will have the dimensions `(5, 3, 4, 4)`. Broadcasting does not work with dimensions that are different but not one.
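
The shapes mentioned above can be checked with a few lines of code. This is a small sketch using arrays of zeros, so only the shapes matter. The last case shows the error raised when dimensions differ and neither is one.

```python
import numpy as np

a = np.zeros((1, 3, 4, 4))
b = np.zeros((5, 3, 4, 1))
print((a + b).shape)        # (5, 3, 4, 4)

# Shapes are compared from the last dimension backward; (3,) lines up
# with the final dimension of (3, 3), so it behaves like (1, 3).
print((np.zeros((3, 3)) + np.zeros(3)).shape)   # (3, 3)

# Dimensions that differ and are not one cannot be broadcast
try:
    np.zeros((3, 3)) + np.zeros(4)
except ValueError as error:
    print("ValueError:", error)
```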

**Expanding Dimensions**

The next listing does an operation on arrays with the dimensions `(2, 1, 5)` and `(2, 7, 1)`. The resulting array has the dimensions `(2, 7, 5)`.

In [117]:
A4 = np.arange(10).reshape(2,1,5)
A4

array([[[0, 1, 2, 3, 4]],

       [[5, 6, 7, 8, 9]]])

In [118]:
A5 = np.arange(14).reshape(2,7,1)
A5

array([[[ 0],
        [ 1],
        [ 2],
        [ 3],
        [ 4],
        [ 5],
        [ 6]],

       [[ 7],
        [ 8],
        [ 9],
        [10],
        [11],
        [12],
        [13]]])

In [119]:
A6 = A4 - A5
A6

array([[[ 0,  1,  2,  3,  4],
        [-1,  0,  1,  2,  3],
        [-2, -1,  0,  1,  2],
        [-3, -2, -1,  0,  1],
        [-4, -3, -2, -1,  0],
        [-5, -4, -3, -2, -1],
        [-6, -5, -4, -3, -2]],

       [[-2, -1,  0,  1,  2],
        [-3, -2, -1,  0,  1],
        [-4, -3, -2, -1,  0],
        [-5, -4, -3, -2, -1],
        [-6, -5, -4, -3, -2],
        [-7, -6, -5, -4, -3],
        [-8, -7, -6, -5, -4]]])

In [120]:
A6.shape

(2, 7, 5)
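
A dimension of one does not have to come from `reshape`. As a sketch of a common alternative, `np.newaxis` inserts a dimension of length one while indexing, which is enough to build a table of all pairwise differences between two one-dimensional arrays. The names used here are illustrative.

```python
import numpy as np

row_values = np.arange(5)       # shape (5,)
col_values = np.arange(7)       # shape (7,)

# np.newaxis inserts a dimension of length one, giving shapes (1, 5) and (7, 1)
differences = row_values[np.newaxis, :] - col_values[:, np.newaxis]

print(differences.shape)        # (7, 5)
print(differences[0])           # [0 1 2 3 4]
print(differences[6])           # [-6 -5 -4 -3 -2]
```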

## NumPy Math

In addition to the NumPy array, the NumPy library offers many mathematical functions, including trigonometric functions, logarithmic functions, and arithmetic functions. These functions are designed to be performed with NumPy arrays and are often used in conjunction with data types in other libraries. This section takes a quick look at NumPy polynomials.

NumPy offers the class `poly1d` for modeling **one-dimensional polynomials**. To use this class, you need to import it from NumPy:

In [121]:
from numpy import poly1d

In [122]:
poly1d((4,5))

poly1d([4, 5])

If you print a `poly1d` object, it shows the polynomial representation:

In [123]:
c = poly1d([4,3,2,1])
print(c)

   3     2
4 x + 3 x + 2 x + 1


If for a second argument you supply the value `True`, the first argument is interpreted as roots rather than coefficients. The following example models the polynomial resulting from the calculation $(x - 4)(x - 3)(x - 2)(x - 1)$:

In [124]:
r = poly1d([4,3,2,1], True)
print(r)

   4      3      2
1 x - 10 x + 35 x - 50 x + 24


You can evaluate a polynomial by supplying the `x` value as an argument to the object itself. For example, you can evaluate the preceding polynomial for a value of `x` equal to 5:

In [125]:
r(5)

24.0
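
A `poly1d` object also exposes its coefficients and roots as attributes, and it can be evaluated at many points at once. The following sketch rebuilds the polynomial `r` from its roots. The roots are recovered numerically, so they are compared approximately rather than exactly. The name `x_values` is illustrative.

```python
import numpy as np
from numpy import poly1d

r = poly1d([4, 3, 2, 1], True)    # built from roots

print(r.coeffs)                   # coefficients, highest power first
print(np.sort(r.roots))           # roots recovered numerically, close to 1, 2, 3, 4

# Calling the polynomial with an array evaluates it at every element
x_values = np.array([0, 1, 5])
print(r(x_values))                # values close to [24, 0, 24]
```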

**Polynomials**

The `poly1d` class allows you to do operations between polynomials, such as addition and multiplication. It also offers polynomial functionality, such as taking derivatives and antiderivatives, as methods. The next listing demonstrates the use of this class with polynomials.

In [126]:
p1 = poly1d((2,3))
print(p1)

 
2 x + 3


In [127]:
p2 = poly1d((1,2,3))
p2

poly1d([1, 2, 3])

In [129]:
p2 * p1 # Multiplying polynomials

poly1d([ 2,  7, 12,  9])

In [130]:
print(p2.deriv())      # Taking the derivative

 
2 x + 2


In [131]:
print(p2.integ())      # Returning anti-derivative

        3     2
0.3333 x + 1 x + 3 x
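
As a quick consistency check on these methods, the sketch below adds the two polynomials and confirms that differentiating the antiderivative gives back the original coefficients. It also uses the `k` parameter of `integ` to set the integration constant. The name `round_trip` is illustrative.

```python
import numpy as np
from numpy import poly1d

p1 = poly1d((2, 3))        # 2x + 3
p2 = poly1d((1, 2, 3))     # x^2 + 2x + 3

print((p1 + p2).coeffs)    # [1 4 6]

# Differentiating the antiderivative should give back the original polynomial
round_trip = p2.integ().deriv()
print(np.allclose(round_trip.coeffs, p2.coeffs))   # True

# integ accepts an integration constant through the k parameter
print(p2.integ(k=5)(0))    # 5.0
```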


The `poly1d` class is just one of the many specialized mathematical tools offered in the NumPy toolkit. These tools are used in conjunction with many of the other specialized tools that you will learn about in the coming chapters.

## Questions

1. Name three differences between NumPy arrays and Python lists.

2. Given the following code, what would you expect for the final value of `d2`?

In [132]:
d1 = np.array([[0, 1, 3],
               [4, 2, 9]])
d2 = d1[:, 1:]

3. Given the following code, what would you expect for the final value of `d1[0,2]`?

In [133]:
d1 = np.array([[0, 1, 3],
               [4, 2, 9]])
d2 = d1[:, 1:]
d2[0,1] = 0

4. If you add two arrays of dimensions `(1, 2, 3)` and `(5, 2, 1)`, what will be the resulting array’s dimensions?

5. Use the `poly1d` class to model the following polynomial:

In [135]:
s = """
   4     3     2
6 x + 2 x + 5 x + x -10
"""
print(s)


   4     3     2
6 x + 2 x + 5 x + x -10

