# Python and NumPy Tutorial with Jupyter

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/GLI-Lab/jupyterlite/blob/main/content/data_science/2024-fall/Tutorial.ipynb)

ref: https://cs231n.github.io/python-numpy-tutorial/

Python is an excellent general-purpose programming language, and paired with libraries such as NumPy, SciPy, and Matplotlib it becomes a powerful environment for scientific computing. Whether you already know Python or are just starting out, this tutorial gives a quick introduction to the language and to its use in scientific computing.

Several libraries extend Python's capabilities in specific areas:

- **NumPy**: Essential for handling large arrays and matrices, NumPy offers efficient numerical computation tools, allowing for advanced mathematical operations with ease.

- **SciPy**: Built on top of NumPy, SciPy adds more functionality for scientific and technical computing, including modules for optimization, integration, and signal processing.

- **Matplotlib**: This library is the go-to for data visualization in Python, making it easy to create static, animated, or interactive plots with just a few lines of code.

- **scikit-learn (sklearn)**: Widely used in the machine learning community, scikit-learn provides a simple and efficient platform for data mining, machine learning algorithms, and statistical modeling.

Together, these libraries transform Python into a comprehensive environment for data analysis, visualization, and computational science.

In this tutorial, we will explore the following topics:

- **Basic Python**: containers (Lists, Dictionaries, Sets, Tuples) and functions

- **NumPy**: array creation, indexing, data types, mathematical operations, and broadcasting

- **IPython**: interactive notebooks using Jupyter and typical workflows

## 📌 Basic Python

In [1]:
!python --version

Python 3.11.9


In [2]:
import random

random_integers = [random.randint(1, 1000) for _ in range(10000)]

In [3]:
%%time

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)

print(quick_sort(random_integers)[:100])

[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10]
CPU times: user 5.5 ms, sys: 0 ns, total: 5.5 ms
Wall time: 5.42 ms


In [4]:
%%time

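def bubble_sort(arr):
    # Sorts arr in place and returns the same list object.
    n = len(arr)
    for i in range(n):
        # After pass i, the last i + 1 elements are in their final positions.
        for j in range(0, n-i-1):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]
    return arr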

print(bubble_sort(random_integers)[:100])

[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10]
CPU times: user 2.2 s, sys: 0 ns, total: 2.2 s
Wall time: 2.2 s
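
The gap between the two timings above (milliseconds versus seconds on 10,000 integers) is consistent with quick sort's average O(n log n) cost against bubble sort's O(n^2). As a rough cross-check outside the notebook's `%%time` magic, the sketch below times both functions with `time.perf_counter`. The helper `time_sort` and the list size of 2,000 are illustrative assumptions, not part of the original notebook. Each run receives its own copy of the data because `bubble_sort` sorts in place.

```python
import random
import time

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)

def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr

def time_sort(sort_fn, data):
    """Hypothetical helper: run sort_fn on a copy of data, return (result, seconds)."""
    work = list(data)  # copy, so an in-place sort cannot affect later runs
    start = time.perf_counter()
    result = sort_fn(work)
    return result, time.perf_counter() - start

data = [random.randint(1, 1000) for _ in range(2000)]

quick_result, quick_secs = time_sort(quick_sort, data)
bubble_result, bubble_secs = time_sort(bubble_sort, data)
builtin_result, builtin_secs = time_sort(sorted, data)

print(f'quick_sort:  {quick_secs:.4f} s')
print(f'bubble_sort: {bubble_secs:.4f} s')
print(f'sorted:      {builtin_secs:.4f} s')
```

Exact numbers depend on the machine, but all three calls should return the same sorted list.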


### Numbers

In [5]:
x = 3
print(x, type(x))

3 <class 'int'>


In [6]:
print(x + 1)   # Addition
print(x - 1)   # Subtraction
print(x * 2)   # Multiplication
print(x ** 2)  # Exponentiation

4
2
6
9


In [7]:
x += 1
print(x)

x *= 2
print(x)

4
8


In [8]:
y = 2.5
print(type(y))
print(y, y + 1, y * 2, y ** 2)

<class 'float'>
2.5 3.5 5.0 6.25


### Booleans

In [9]:
t, f = True, False
print(type(t))

<class 'bool'>


In [10]:
print(t and f)  # AND
print(t or f)   # OR
print(not t)    # NOT
print(t != f)   # XOR

False
True
False
True


### Strings

In [11]:
hello = 'hello'
world = "world"
print(hello, len(hello))

hello 5


In [12]:
hw = hello + ' ' + world  # String concatenation
print(hw)

hello world


In [13]:
hw2024 = '{} {} {}'.format(hello, world, 2024)  # str.format
print(hw2024)

hw2024 = f'{hello} {world} 2024'                # f-string (Python 3.6+)
print(hw2024)

hello world 2024
hello world 2024


In [14]:
s = "hello"
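print(s.capitalize())           # Capitalize the first letter: "Hello"
print(s.upper())                # Convert to uppercase: "HELLO"
print(s.rjust(7))               # Right-justify, padding with spaces: "  hello"
print(s.center(7))              # Center, padding with spaces: " hello "
print(s.replace('l', '(ell)'))  # Replace every 'l': "he(ell)(ell)o"
print('  world '.strip())       # Strip leading and trailing whitespace: "world"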

Hello
HELLO
  hello
 hello 
he(ell)(ell)o
world


### Containers

#### Lists

In [15]:
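xs = [3, 1, 2]    # Create a list
print(xs, xs[2])  # "[3, 1, 2] 2"
print(xs[-1])     # Negative indices count from the end: "2"

xs[2] = 'foo'     # Lists can hold elements of different types
print(xs)         # "[3, 1, 'foo']"

xs.append('bar')  # Add a new element to the end
print(xs)         # "[3, 1, 'foo', 'bar']"

x = xs.pop()      # Remove and return the last element
print(x, xs)      # "bar [3, 1, 'foo']"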

[3, 1, 2] 2
2
[3, 1, 'foo']
[3, 1, 'foo', 'bar']
bar [3, 1, 'foo']


In [16]:
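nums = list(range(5))     # range(5) yields the integers 0 through 4
print(nums)               # "[0, 1, 2, 3, 4]"
print(nums[2:4])          # Index 2 up to 4 (exclusive): "[2, 3]"
print(nums[2:])           # Index 2 to the end: "[2, 3, 4]"
print(nums[:2])           # Start up to index 2 (exclusive): "[0, 1]"
print(nums[:])            # The whole list: "[0, 1, 2, 3, 4]"
print(nums[:-1])          # Slice indices can be negative: "[0, 1, 2, 3]"

nums[2:4] = [8, 9]        # Assign a new sublist to a slice
print(nums)               # "[0, 1, 8, 9, 4]"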

[0, 1, 2, 3, 4]
[2, 3]
[2, 3, 4]
[0, 1]
[0, 1, 2, 3, 4]
[0, 1, 2, 3]
[0, 1, 8, 9, 4]


In [17]:
animals = ['cat', 'dog', 'monkey']

for animal in animals:
    print(animal)

for idx, animal in enumerate(animals):
    print('#%d: %s' % (idx + 1, animal))

cat
dog
monkey
#1: cat
#2: dog
#3: monkey


##### List Comprehensions

In [18]:
nums = [0, 1, 2, 3, 4]
squares = [x ** 2 for x in nums]
print(squares)

nums = [0, 1, 2, 3, 4]
even_squares = [x ** 2 for x in nums if x % 2 == 0]
print(even_squares)

[0, 1, 4, 9, 16]
[0, 4, 16]
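
A comprehension with a condition is shorthand for a loop that filters and appends. As a small sketch (the name `even_squares_loop` is introduced here only for illustration), the two forms below build the same list:

```python
nums = [0, 1, 2, 3, 4]

# Explicit loop: filter with `if`, then append the transformed value
even_squares_loop = []
for x in nums:
    if x % 2 == 0:
        even_squares_loop.append(x ** 2)

# Equivalent comprehension
even_squares = [x ** 2 for x in nums if x % 2 == 0]

print(even_squares_loop)                  # [0, 4, 16]
print(even_squares_loop == even_squares)  # True
```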


#### Dictionaries

In [19]:
d = {'cat': 'cute', 'dog': 'furry'} 

print(d['cat'])                # "cute"
print('cat' in d)              # "True"

d['fish'] = 'wet'              
print(d['fish'])               # "wet"
# print(d['monkey'])           # KeyError

print(d.get('monkey', 'N/A'))  # "N/A"
print(d.get('fish', 'N/A'))    # "wet"

del d['fish']                  # Remove
print(d.get('fish', 'N/A'))    # "N/A"

cute
True
wet
N/A
wet
N/A


##### Dictionary Comprehensions

In [20]:
nums = [0, 1, 2, 3, 4]
even_num_to_square = {x: x ** 2 for x in nums if x % 2 == 0}
print(even_num_to_square)

{0: 0, 2: 4, 4: 16}


#### Sets

In [21]:
animals = {'cat', 'dog'}

print('cat' in animals)   # "True"
print('fish' in animals)  # "False"

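animals.add('fish')       # Add an element to the set
print('fish' in animals)  # "True"
print(len(animals))       # "3"

animals.add('cat')        # Adding an existing element does nothing
print(len(animals))       # "3"

animals.remove('cat')     # Remove an element from the set
print(len(animals))       # "2"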

True
False
True
3
3
2


##### Set Comprehensions

In [22]:
from math import sqrt

nums = {int(sqrt(x)) for x in range(30)}
print(nums)

{0, 1, 2, 3, 4, 5}


#### Tuples

In [23]:
d = {(x, x + 1): x for x in range(10)}  # Dictionary with tuple keys
t = (5, 6)        # Create a tuple
print(type(t))    # "<class 'tuple'>"
print(d[t])       # "5"
print(d[(1, 2)])  # "1"

<class 'tuple'>
5
1
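
Tuples can serve as dictionary keys in the cell above because they are immutable and therefore hashable, whereas lists are not. The following sketch, with illustrative names `point` and `lookup`, shows both properties by catching the `TypeError` that Python raises:

```python
point = (5, 6)
lookup = {point: 'tuple key works'}
print(lookup[(5, 6)])  # tuple key works

# A list is mutable and unhashable, so it cannot be a dictionary key
try:
    lookup[[5, 6]] = 'list key'
    list_key_error = None
except TypeError as err:
    list_key_error = err
    print('TypeError:', err)

# Tuples do not support item assignment
try:
    point[0] = 1
    assign_error = None
except TypeError as err:
    assign_error = err
    print('TypeError:', err)
```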


### Functions

In [24]:
def sign(x):
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'zero'

for x in [-1, 0, 1]:
    print(sign(x))  # "negative", "zero", "positive"

negative
zero
positive


In [25]:
def hello(name, loud=False):
    if loud:
        print('HELLO, %s!' % name.upper())
    else:
        print('Hello, %s' % name)

hello('Bob')              # "Hello, Bob"
hello('Fred', loud=True)  # "HELLO, FRED!"

Hello, Bob
HELLO, FRED!


## 📌 NumPy

In [26]:
import numpy as np

### Arrays

In [27]:
a = np.array([1, 2, 3])   # Create a rank 1 array

print(type(a))            # "<class 'numpy.ndarray'>"
print(a.shape)            # "(3,)"
print(a[0], a[1], a[2])   # "1 2 3"

a[0] = 5                  # Change an element of the array
print(a)                  # "[5 2 3]"

<class 'numpy.ndarray'>
(3,)
1 2 3
[5 2 3]


In [28]:
b = np.array([[1, 2, 3],
              [4, 5, 6]])

print(b.shape)                     # "(2, 3)"
print(b[0, 0], b[0, 1], b[1, 0])   # "1 2 4"

(2, 3)
1 2 4


In [29]:
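a = np.zeros((2,2))   # Array of all zeros
print(a)
# [[0. 0.]
#  [0. 0.]]

b = np.ones((1,2))    # Array of all ones
print(b)
# [[1. 1.]]

c = np.full((2,2), 7) # Constant array; the dtype follows the fill value (int here)
print(c)
# [[7 7]
#  [7 7]]

d = np.eye(2)         # 2x2 identity matrix
print(d)
# [[1. 0.]
#  [0. 1.]]

e = np.random.random((2,2))  # Random values in [0, 1); output differs on every run
print(e)
# e.g. [[0.91940167 0.08143941]
#       [0.68744134 0.87236687]]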

[[0. 0.]
 [0. 0.]]
[[1. 1.]]
[[7 7]
 [7 7]]
[[1. 0.]
 [0. 1.]]
[[0.10820736 0.93265503]
 [0.2792792  0.97664304]]


### Array indexing

#### Slicing

In [30]:
a = np.array([[1, 2, 3, 4], 
              [5, 6, 7, 8], 
              [9, 10, 11, 12]])

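# Slice out the first 2 rows and columns 1 and 2; b has shape (2, 2)
b = a[:2, 1:3]
# [[2 3]
#  [6 7]]

# A slice is a view into the same data, so modifying it modifies the original array
print(a[0, 1])   # "2"
b[0, 0] = 77     # b[0, 0] is the same element as a[0, 1]
print(a[0, 1])   # "77"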

2
77
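
Because a slice is a view, writes through it change the original array. When an independent array is needed, calling `copy()` on the slice is one common approach. The sketch below uses `np.shares_memory` to check which arrays overlap; the names `view` and `independent` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

view = a[:2, 1:3]                # Basic slicing returns a view onto a's data
independent = a[:2, 1:3].copy()  # copy() allocates separate storage

print(np.shares_memory(a, view))         # True
print(np.shares_memory(a, independent))  # False

independent[0, 0] = 77  # Does not touch a
print(a[0, 1])          # 2

view[0, 0] = 77         # Writes through to a
print(a[0, 1])          # 77
```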


#### Integer array indexing

In [31]:
a = np.array([[1, 2, 3, 4], 
              [5, 6, 7, 8], 
              [9, 10, 11, 12]])

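row_r1 = a[1, :]             # Integer index plus slice: rank 1 view of the second row
row_r2 = a[1:2, :]           # Slices only: rank 2 view of the second row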
print(row_r1, row_r1.shape)  # "[5 6 7 8] (4,)"
print(row_r2, row_r2.shape)  # "[[5 6 7 8]] (1, 4)"

col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape)  # "[ 2  6 10] (3,)"
print(col_r2, col_r2.shape)
# [[ 2]
#  [ 6]
#  [10]] (3, 1)

[5 6 7 8] (4,)
[[5 6 7 8]] (1, 4)
[ 2  6 10] (3,)
[[ 2]
 [ 6]
 [10]] (3, 1)


In [32]:
a = np.array([[1, 2], 
              [3, 4], 
              [5, 6]])

print(a[[0, 1, 2], [0, 1, 0]])                # "[1 4 5]"
print(np.array([a[0, 0], a[1, 1], a[2, 0]]))  # "[1 4 5]"

print(a[[0, 0], [1, 1]])             # "[2 2]"
print(np.array([a[0, 1], a[0, 1]]))  # "[2 2]"

[1 4 5]
[1 4 5]
[2 2]
[2 2]


In [33]:
a = np.array([[1, 2, 3], 
              [4, 5, 6], 
              [7, 8, 9], 
              [10, 11, 12]])

print(a)
# [[ 1  2  3]
#  [ 4  5  6]
#  [ 7  8  9]
#  [10 11 12]]

b = np.array([0, 2, 0, 1])  # Column index to pick from each row

print(a[np.arange(4), b])   # Select one element from each row
# [ 1  6  7 11]

a[np.arange(4), b] += 10    # Add 10 to one element in each row
print(a)
# [[11  2  3]
#  [ 4  5 16]
#  [17  8  9]
#  [10 21 12]]

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
[ 1  6  7 11]
[[11  2  3]
 [ 4  5 16]
 [17  8  9]
 [10 21 12]]


#### Boolean array indexing

In [34]:
a = np.array([[1, 2], 
              [3, 4], 
              [5, 6]])

bool_idx = (a > 2)

print(bool_idx)
# [[False False]
#  [ True  True]
#  [ True  True]]

print(a[bool_idx])  
# [3 4 5 6]

print(a[a > 2])     
# [3 4 5 6]

[[False False]
 [ True  True]
 [ True  True]]
[3 4 5 6]
[3 4 5 6]
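
A boolean mask can also appear on the left-hand side of an assignment, which overwrites only the selected elements. The sketch below shows that pattern next to `np.where`, which builds a new array instead of modifying one in place. The names `clipped` and `replaced` and the replacement values are illustrative.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])

clipped = a.copy()
clipped[clipped > 2] = 0  # Overwrite every element selected by the mask
print(clipped)
# [[1 2]
#  [0 0]
#  [0 0]]

# np.where(condition, value_if_true, value_if_false) returns a new array
replaced = np.where(a > 2, -1, a)
print(replaced)
# [[ 1  2]
#  [-1 -1]
#  [-1 -1]]
```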


### Datatypes

In [35]:
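x = np.array([1, 2])                  # Let NumPy choose the dtype
print(x.dtype)  # "int64" (the default integer type is platform-dependent)

x = np.array([1.0, 2.0])              # Let NumPy choose the dtype
print(x.dtype)  # "float64"

x = np.array([1, 2], dtype=np.int64)  # Force a particular dtype
print(x.dtype)  # "int64"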

int64
float64
int64
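
An array's dtype is fixed when the array is created. To convert afterwards, `astype` returns a new array of the requested type. A minimal sketch with illustrative values follows; note that converting floats to integers drops the fractional part rather than rounding.

```python
import numpy as np

x = np.array([1.7, -2.3, 3.0])
print(x.dtype)               # float64

as_int = x.astype(np.int64)  # New array; fractional parts are truncated toward zero
print(as_int, as_int.dtype)  # [ 1 -2  3] int64

mixed = np.array([1, 2.5])   # Mixed int/float input is upcast to float64
print(mixed.dtype)           # float64
```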


### Array math

In [36]:
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

#### Elementwise

In [37]:
# Elementwise sum; the operator and the function give the same result
print(x + y)
print(np.add(x, y))
# [[ 6.  8.]
#  [10. 12.]]

# Elementwise difference
print(x - y)
print(np.subtract(x, y))
# [[-4. -4.]
#  [-4. -4.]]

# Elementwise product
print(x * y)
print(np.multiply(x, y))
# [[ 5. 12.]
#  [21. 32.]]

# Elementwise division
print(x / y)
print(np.divide(x, y))
# [[0.2        0.33333333]
#  [0.42857143 0.5       ]]

# Elementwise square root
print(np.sqrt(x))
# [[1.         1.41421356]
#  [1.73205081 2.        ]]

[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]
[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]
[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[1.         1.41421356]
 [1.73205081 2.        ]]


#### Product

In [38]:
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both print 219
print(v.dot(w))
print(np.dot(v, w))

# Matrix / vector product; both print the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))

# Matrix / matrix product; both print the rank 2 array
print(x.dot(y))
print(np.dot(x, y))
# [[19 22]
#  [43 50]]

219
219
[29 67]
[29 67]
[[19 22]
 [43 50]]
[[19 22]
 [43 50]]
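
For the 1-D and 2-D arrays used above, the `@` operator (Python 3.5+) gives the same results as `dot` and is often easier to read. A short sketch reusing the same values:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])
w = np.array([11, 12])

print(v @ w)  # 219
print(x @ v)  # [29 67]
print(x @ y)
# [[19 22]
#  [43 50]]

print(np.array_equal(x @ y, np.dot(x, y)))  # True
```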


#### Sum

In [39]:
x = np.array([[1,2],[3,4]])

print(np.sum(x))          # "10"
print(np.sum(x, axis=0))  # "[4 6]"
print(np.sum(x, axis=1))  # "[3 7]"

10
[4 6]
[3 7]
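
With `axis=0` the sum collapses the rows and yields one value per column; with `axis=1` it collapses the columns and yields one value per row. As a sketch of one common use, passing `keepdims=True` keeps the reduced axis with length 1, so the result can divide the original array row by row. The names `row_sums` and `normalized` are illustrative.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

col_sums = x.sum(axis=0)                 # One sum per column
print(col_sums, col_sums.shape)          # [4 6] (2,)

row_sums = x.sum(axis=1, keepdims=True)  # One sum per row, kept as a column
print(row_sums.shape)                    # (2, 1)

normalized = x / row_sums                # Each row now sums to 1
print(normalized.sum(axis=1))
```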


### Broadcasting

#### Outer product

In [40]:
v = np.array([1,2,3])  # v has shape (3,)
w = np.array([4,5])    # w has shape (2,)

print(np.reshape(v, (3, 1)) * w)
# [[ 4  5]
#  [ 8 10]
#  [12 15]]

[[ 4  5]
 [ 8 10]
 [12 15]]
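
Reshaping `v` into a column so that it broadcasts against `w` is one way to form an outer product. The sketch below compares it with two equivalent spellings, `np.newaxis` indexing and `np.outer`; the variable names are illustrative.

```python
import numpy as np

v = np.array([1, 2, 3])  # shape (3,)
w = np.array([4, 5])     # shape (2,)

via_reshape = np.reshape(v, (3, 1)) * w  # (3, 1) * (2,) broadcasts to (3, 2)
via_newaxis = v[:, np.newaxis] * w       # Same column shape via indexing
via_outer = np.outer(v, w)               # Dedicated function

print(via_outer)
# [[ 4  5]
#  [ 8 10]
#  [12 15]]
print(np.array_equal(via_reshape, via_outer))  # True
print(np.array_equal(via_newaxis, via_outer))  # True
```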


#### Add

In [41]:
x = np.array([[1,2,3], 
              [4,5,6]])

print(x + v)
# [[2 4 6]
#  [5 7 9]]

print((x.T + w).T)
# [[ 5  6  7]
#  [ 9 10 11]]

print(x + np.reshape(w, (2, 1)))
# [[ 5  6  7]
#  [ 9 10 11]]

[[2 4 6]
 [5 7 9]]
[[ 5  6  7]
 [ 9 10 11]]
[[ 5  6  7]
 [ 9 10 11]]
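
The examples above work because broadcasting aligns shapes from the right and requires each pair of dimensions to be equal or to contain a 1. That is why `x + v` succeeds directly, while `w` has to be transposed or reshaped first. The sketch below checks the rule with `np.broadcast_shapes`, which assumes NumPy 1.20 or later, and shows the error raised for an incompatible pair.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])  # shape (2, 3)
v = np.array([1, 2, 3])    # shape (3,)
w = np.array([4, 5])       # shape (2,)

# Compatible pairs: trailing dimensions match or one of them is 1
print(np.broadcast_shapes(x.shape, v.shape))  # (2, 3)
print(np.broadcast_shapes(x.shape, (2, 1)))   # (2, 3)

# Incompatible pair: trailing dimensions are 3 and 2
try:
    x + w
    broadcast_error = None
except ValueError as err:
    broadcast_error = err
    print('broadcast failed:', err)

# Giving w shape (2, 1) makes it compatible
print(x + w[:, np.newaxis])
# [[ 5  6  7]
#  [ 9 10 11]]
```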


#### Multiply

In [42]:
print(x * 2)
# [[ 2  4  6]
#  [ 8 10 12]]

[[ 2  4  6]
 [ 8 10 12]]
