# Book: Python Data Science Handbook
Essential Tools for Working With Data\
By **Jake VanderPlas** (2017)

## Chapter 1: IPython: Beyond Normal Python

In [2]:
L = [1, 2, 3]
L.insert?

Signature: L.insert(index, object, /)
Docstring: Insert object before index.
Type:      builtin_function_or_method

In [3]:
def square(a):
    return a**2
square??

Signature: square(a)
Docstring: <no docstring>
Source:   
def square(a):
    return a**2
File:      /tmp/ipykernel_20779/825368989.py
Type:      function

In [4]:
square(3)

9
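
Outside an IPython session, much of what `?` reports can be recovered with the standard-library `inspect` module. The sketch below is an illustration, not a description of what IPython does internally; the docstring given to `square` is an assumption added so there is something to display.

```python
import inspect

def square(a):
    """Return the square of a."""  # docstring added for illustration
    return a ** 2

# Roughly the "Signature" and "Docstring" fields shown by `square?`
sig = inspect.signature(square)
doc = inspect.getdoc(square)

print("Signature:", f"square{sig}")
print("Docstring:", doc)
```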

Magic Commands

- Line magics: _prefixed by a single % character; they operate on a single line of input_
- Cell magics: _prefixed by a double % (%%); they operate on multiple lines of input_

In [5]:
def func1(a, b):
    return a / b
def func2(x):
    a = x
    b = x - 1
    return func1(a, b)

## %xmode 
- `%xmode` is short for *exception mode*; it controls how much information is printed when an exception is raised.
- It takes one of three modes: Plain, Context, or Verbose.
- The default is Context.

In [6]:
%xmode Plain

Exception reporting mode: Plain


In [7]:
func2(1)

ZeroDivisionError: division by zero

In [None]:
%xmode Verbose

Exception reporting mode: Verbose


In [None]:
func2(1)

ZeroDivisionError: division by zero

In [None]:
%xmode Context

Exception reporting mode: Context


In [None]:
func2(1)

ZeroDivisionError: division by zero
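
The same failure can be examined in plain Python with the standard-library `traceback` module, which gives a rough idea of the call chain that the `%xmode` settings display in more or less detail. This sketch reuses `func1` and `func2` from above; the names `call_chain` and `message` are illustrative only.

```python
import traceback

def func1(a, b):
    return a / b

def func2(x):
    a = x
    b = x - 1          # b is 0 when x is 1
    return func1(a, b)

try:
    func2(1)
except ZeroDivisionError as exc:
    # Each frame records one step of the call chain, outermost first
    frames = traceback.extract_tb(exc.__traceback__)
    call_chain = [frame.name for frame in frames]
    message = str(exc)

print("innermost calls:", call_chain[-2:])
print("error message:", message)
```

The last two entries of `call_chain` are `func2` and `func1`, which matches the order in which the functions were entered before the division failed.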

## Chapter 2: Introduction to NumPy

The NumPy and Pandas packages are Python's specialized tools for handling numerical arrays.

NumPy = _Numerical Python_

In [1]:
import numpy as np
np.__version__

'2.2.3'

In [None]:
# += is the addition assignment operator: x += 4 is equivalent to x = x + 4
x = 2
x += 4
print(x)

6


### Arrays
Unlike Python lists, a NumPy array is constrained to elements that all share the same type.

In [4]:
np.array([1,4,2,5,3])

# Explicitly setting the data type
np.array([1,4,2,5,3], dtype='float32')

array([1., 4., 2., 5., 3.], dtype=float32)
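
A short sketch of what the single-type constraint means in practice: when the inputs mix integers and floats, NumPy upcasts them to a common type, and writing a float into an integer array truncates the value without warning. The variable names are illustrative.

```python
import numpy as np

# Mixing integers and floats: NumPy upcasts everything to floating point
mixed = np.array([3.14, 4, 2, 3])
print(mixed.dtype)  # float64

# An integer array keeps its type; assigning a float truncates it silently
ints = np.array([1, 4, 2, 5, 3])
ints[0] = 3.99
print(ints[0])  # 3
```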

Unlike Python lists, NumPy arrays can be explicitly multidimensional:

In [5]:
# Nested lists result in multidimensional arrays
np.array([range(i, i+3) for i in [2,4,6]])

array([[2, 3, 4],
       [4, 5, 6],
       [6, 7, 8]])

The inner lists are treated as rows of the resulting two-dimensional array.
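
To make the row interpretation concrete, the following sketch rebuilds the same array and looks at its shape, its first row, and its first column. The name `grid` is illustrative.

```python
import numpy as np

grid = np.array([range(i, i + 3) for i in [2, 4, 6]])

print(grid.shape)  # (3, 3)
print(grid[0])     # [2 3 4]  (first inner list -> first row)
print(grid[:, 0])  # [2 4 6]  (first element of each inner list)
```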

#### Creating Arrays from Scratch

In [6]:
# Create a length-10 integer array filled with zeros
np.zeros(10, dtype=int)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [7]:
# Create a 3x5 floating-point array filled with 1s
np.ones((3,5), dtype=float)

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [8]:
# Create a 3x5 array filled with 3.14
np.full((3,5), 3.14)

array([[3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14],
       [3.14, 3.14, 3.14, 3.14, 3.14]])

In [10]:
# Create an array filled with a linear sequence
# Starting at 0, ending at 20, stepping by 2
# (this is similar to the built-in range() function)
np.arange(0,20,2)

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [12]:
# Create an array of five values evenly spaced between 0 and 1
np.linspace(0,1,5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])
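
The two sequence builders treat their endpoint differently, which is easy to overlook. As a small sketch: `np.arange` takes a step and excludes the stop value, while `np.linspace` takes a count and includes the stop value by default.

```python
import numpy as np

stepped = np.arange(0, 20, 2)    # stop value 20 is excluded
spaced = np.linspace(0, 1, 5)    # stop value 1 is included by default

print(stepped[-1], len(stepped))  # 18 10
print(spaced[-1], len(spaced))    # 1.0 5
print(np.diff(spaced))            # [0.25 0.25 0.25 0.25]
```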

In [13]:
# Create a 3x3 array of uniformly distributed 
# random values between 0 and 1
np.random.random((3,3))

array([[0.07268278, 0.8853733 , 0.30940915],
       [0.0323405 , 0.17538956, 0.18072717],
       [0.39409606, 0.68565646, 0.05765352]])

In [15]:
# Create a 3x3 array of random integers in the half-open interval [0, 10)
np.random.randint(0,10, (3,3))

array([[5, 4, 3],
       [9, 2, 7],
       [2, 0, 9]])
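
The cells above use the legacy `np.random.*` functions. NumPy also provides a `Generator` interface through `np.random.default_rng`, which is not used in these notes; the sketch below assumes a reasonably recent NumPy and picks the seed `42` arbitrarily. It shows the same two kinds of draws and that a fixed seed makes them repeatable.

```python
import numpy as np

rng = np.random.default_rng(42)           # seed value chosen arbitrarily
floats = rng.random((3, 3))               # uniform values in [0, 1)
ints = rng.integers(0, 10, size=(3, 3))   # integers in [0, 10): 10 is excluded

# Re-creating the generator with the same seed reproduces the same draws
rng_again = np.random.default_rng(42)
same_floats = rng_again.random((3, 3))
print(np.array_equal(floats, same_floats))  # True
```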

In [16]:
# Create a 3x3 identity matrix
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [17]:
# Create an uninitialized array of three values (floats by default)
# The values will be whatever happens to already exist at that
# memory location
np.empty(3)

array([1., 1., 1.])
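
Because the contents of an `np.empty` array are arbitrary, it is only safe to read it after every element has been written. The sketch below also shows that the result is floating point unless a `dtype` is requested; the names `scratch` and `int_scratch` are illustrative.

```python
import numpy as np

scratch = np.empty(3)          # contents are arbitrary until written
print(scratch.dtype)           # float64 (the default dtype)

scratch[:] = [1.5, 2.5, 3.5]   # fill every slot before reading
print(scratch)                 # [1.5 2.5 3.5]

int_scratch = np.empty(3, dtype=int)  # request integers explicitly
print(int_scratch.dtype.kind)         # i
```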

#### NumPy Standard Data Types

![image.png](attachment:image.png)

In [18]:
# You can specify them using a string
np.zeros(10, dtype='int16')

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int16)

In [19]:
# Or using the associated NumPy object
np.zeros(10, dtype=np.int16)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int16)
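
As a quick check that the string form and the NumPy object form name the same type, the sketch below compares the resulting dtypes and prints the per-element size of two common types.

```python
import numpy as np

from_string = np.zeros(10, dtype='int16')
from_object = np.zeros(10, dtype=np.int16)

print(from_string.dtype == from_object.dtype)  # True
print(np.dtype('int16').itemsize)              # 2 (bytes per element)
print(np.dtype('float64').itemsize)            # 8
```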

#### The Basics of NumPy Arrays
Data manipulation in Python is nearly synonymous with NumPy array manipulation: even newer tools like Pandas (Chapter 3) are built around the NumPy array.

##### 1. NumPy Array Attributes
Determining the size, shape, memory consumption, and data types of arrays.

In [20]:
# import numpy as np # Already imported
np.random.seed(0) # Seed for reproducibility

x1 = np.random.randint(10, size=6) # One-dimensional array
x2 = np.random.randint(10, size=(3,4)) # Two-dimensional array
x3 = np.random.randint(10, size=(3,4,5)) # Three-dimensional array

Each array has attributes `ndim` (the number of dimensions), `shape` (the size of each dimension), and `size` (the total size of the array):

In [23]:
x2

array([[3, 5, 2, 4],
       [7, 6, 8, 8],
       [1, 6, 7, 7]])

In [24]:
print("x2 ndim: ", x2.ndim)
print("x2 shape:", x2.shape)
print("x2 size: ", x2.size)

x2 ndim:  2
x2 shape: (3, 4)
x2 size:  12


In [21]:
print("x3 ndim: ", x3.ndim)
print("x3 shape:", x3.shape)
print("x3 size: ", x3.size)

x3 ndim:  3
x3 shape: (3, 4, 5)
x3 size:  60


In [22]:
x3

array([[[8, 1, 5, 9, 8],
        [9, 4, 3, 0, 3],
        [5, 0, 2, 3, 8],
        [1, 3, 3, 3, 7]],

       [[0, 1, 9, 9, 0],
        [4, 7, 3, 2, 7],
        [2, 0, 0, 4, 5],
        [5, 6, 8, 4, 1]],

       [[4, 9, 8, 1, 1],
        [7, 9, 9, 3, 6],
        [7, 2, 0, 3, 5],
        [9, 4, 4, 6, 4]]])
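
The three attributes are tied together: `ndim` is the length of `shape`, and `size` is the product of the entries of `shape`. A minimal sketch, using a zero-filled array of the same shape as `x3` since the values do not matter here:

```python
import math
import numpy as np

arr = np.zeros((3, 4, 5))  # same shape as x3; contents do not matter here

print(arr.ndim == len(arr.shape))        # True
print(arr.size == math.prod(arr.shape))  # True
```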

Another useful attribute is the `dtype`, the data type of the array:

In [26]:
# dtype
print("dtype:", x3.dtype)

dtype: int64


Other attributes include `itemsize`, which lists the size (in bytes) of each array element, and `nbytes`, which lists the total size (in bytes) of the array. 

(In general, we expect that `nbytes` is equal to `itemsize` times `size`.)

In [27]:
# itemsize
print("itemsize:", x3.itemsize, "bytes")
print("nbytes:", x3.nbytes, "bytes")

itemsize: 8 bytes
nbytes: 480 bytes
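
The relationship between the byte counts can be checked directly. This sketch fixes the dtype explicitly so the numbers do not depend on the platform's default integer, and shows how a smaller element type shrinks the total; `a64` and `a16` are illustrative names.

```python
import numpy as np

a64 = np.zeros((3, 4, 5), dtype=np.int64)
a16 = a64.astype(np.int16)   # same values, smaller element type

print(a64.itemsize, a64.nbytes)  # 8 480
print(a16.itemsize, a16.nbytes)  # 2 120
print(a64.nbytes == a64.itemsize * a64.size)  # True
```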


##### 2. Array Indexing: Accessing Single Elements

In [28]:
x1

array([5, 0, 3, 3, 7, 9])

In a **one-dimensional array**, you can access the $i^{th}$ value (counting from zero) by specifying the desired index in square brackets `[]`, just as with Python lists:

In [None]:
print(x1[0])
print(x1[4])

5
7
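
Two further behaviours carry over from Python lists and are worth a quick sketch: negative indices count from the end of the array, and an index past the last element raises `IndexError`. The array is rebuilt from the values shown above so the example runs on its own; the flag `out_of_range` is illustrative.

```python
import numpy as np

x1 = np.array([5, 0, 3, 3, 7, 9])  # same values as x1 above

print(x1[0])    # 5
print(x1[-1])   # 9 (negative indices count from the end)

out_of_range = False
try:
    x1[6]       # valid indices run from 0 to 5
except IndexError as err:
    out_of_range = True
    print("IndexError:", err)
```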


##### 3. NumPy Array Attributes