### Python Fundamentals

Some familiarity with Python is ideal, but we'll start from the ground up and work through the basics: `hello world`, basic and more advanced data types, control structures, and user-defined functions.

We'll pay special attention to some of the important language-specific features of Python.

This is an ipynb ("IPython notebook") file. It combines code cells and markdown cells, so that code and documentation live side by side. This supports the notion of "literate programming" and reproducible workflows.

I'm a **markdown cell**. You can use my ilk to document what you're doing. Most of the cells below are **code** cells, which execute Python code. You can also document inside code cells with comments, which start with `#`.

*Liberally add cells and experiment with the notions presented, to help internalize Python!*

<img src="monty.jpg" alt="Monty Python!" style="width:500px;"/>

#### Goals Today:
- Scalar types and mutability vs. immutability
- Variable name binding
- Advanced types: Lists, Tuples, Dictionaries, and Sets
- List comprehensions
- Sequence functions
- Control structures and user-defined functions
- Generators
- The Importance of Being Pythonic, and the Zen of Python
- Start matplotlib


And the overall Intro to Python overview:


#### Pythonic programming

- Review of Python basics and advanced types
- List comprehensions, etc.
- The Zen of Python


#### Matplotlib

- Fundamental plotting library for scientific computing and data science with Python
- Interface with pandas, seaborn, and geopandas


#### NumPy

- NumPy was developed for fast computation with large arrays
- Fast vectorized operations without need for loops
- C API for connecting NumPy with libraries written in C, C++, FORTRAN

- NumPy stores data internally in large contiguous blocks of memory
- NumPy's core routines are written in C and act on that memory without Python interpreter overhead
- Array operations are typically much faster than equivalent loops over built-in Python lists


#### Pandas

- The pandas library is our fundamental library for working with tabular data/data frames.

- pandas is often used with numerical computing tools NumPy and SciPy, analytical libraries like scikit-learn, and data visualization libraries such as matplotlib

- Adopts parts of NumPy's style for array-based computing and data processing without `for` loops


In [None]:
### Note: The Anaconda or miniconda package manager is strongly recommended

### Additionally, it is advisable to use a virtual environment
### You can set up and install packages like so in the Anaconda terminal:

#conda create -n my_env
#conda activate my_env
#conda config --env --add channels conda-forge
#conda config --env --set channel_priority strict
#conda install <my_package>

### You may also need to install packages using "pip install <package>"

In [None]:
## The packages below are included with Anaconda. You'll need to install them yourself if you use a new virtual environment or miniconda

In [23]:
## At minimum we'll use matplotlib.pyplot and numpy for almost everything
## And usually pandas too, so let's just import now:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


#Note you can do things like:
########

#from numpy import sin

#Or:
#from numpy import *

#I don't generally recommend doing the latter


In [24]:
#Also, we will need geopandas shortly, so let's go ahead and get it now
#####

import geopandas as gpd

<img src="Hello_World/hello1.jpg" alt="Hello World!" style="width:300px;"/>

In [187]:
x = "HELLO"
y = "WORLD"

x = 42

print(str(x) + ' ' + y + '\n\nTHIS IS SO GOOD')

#Also,
x = 42
print(x, ' ', y, '\n\nTHIS IS SO GOOD')

42 WORLD

THIS IS SO GOOD
42   WORLD 

THIS IS SO GOOD


<img src="Hello_World/hello2.jpg" alt="Hello World!" style="width:350px;"/>

#### Python Scalar Types

Python has several built-in single-value *scalar* types:


| Type | Description |
| :- | :- |
| `None` | Python's null value; only a single instance of `None` exists |
| `str` | String of Unicode characters |
| `bytes` | Raw binary data (a sequence of bytes) |
| `float` | Double-precision (64-bit) float (no separate `double` type) |
| `int` | Arbitrary-precision signed integer |
| `bool` | `True` or `False` boolean |

**Scalar types are always immutable**

In [188]:
#Implicit casting occurs only in very obvious cases:

x = 5
y = x + .1

type(y)

float

In [None]:
#Explicitly cast with:
float(x)
int(x)
str(x)

#Stuff like this will fail:
#int("5.5")
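A minimal sketch of what explicit casting accepts and rejects. The helper `to_int` is a hypothetical name introduced here for illustration; it shows one possible way to handle strings such as `"5.5"`, not the only one.

```python
# int() parses integer-looking strings, but not strings with a decimal point
print(int("5"))      # 5
print(float("5.5"))  # 5.5

# Converting a float to an int truncates toward zero (it does not round)
print(int(5.9))      # 5
print(int(-5.9))     # -5

def to_int(text):
    """Hypothetical helper: parse text as an int, going through float if needed."""
    try:
        return int(text)
    except ValueError:
        # int("5.5") raises ValueError, so parse as a float first, then truncate
        return int(float(text))

converted = to_int("5.5")
print(converted)     # 5
```

Going through `float` first is an assumption about what you want: it silently drops the fractional part, so use `round()` instead if rounding is the intent.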

#### Immutable, all!

In [189]:
#Make a float:
x = 1.1

#Look at the id in memory:
hex(id(x))

'0x14d86a6ccb0'

In [190]:
#Now, do an operation on the float:
x = x + 2

#And id in memory?
hex(id(x))

'0x14d86a6cd70'

In [194]:
#So, x now refers to a new object

# Note with small ints and bools, equal values can share one object (a CPython implementation detail):
#######
x = 4
y = 3+1

print(id(x) == id(y))

#Bools:
x = False
y = 5 > 10

print(id(x) == id(y))

True
True
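The identity checks above come out `True` because CPython reuses small integer and boolean objects, which is an implementation detail rather than a language guarantee. As a rough sketch of the general rule, `==` compares values while `is` compares identity (the same thing `id()` reports):

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate list object
c = a           # a second name bound to the same object

same_value = (a == b)      # True: the values match
same_object = (a is b)     # False: two distinct objects
alias = (a is c)           # True: both names refer to one object

print(same_value, same_object, alias)   # True False True
```

In practice, use `==` to compare values and reserve `is` for singletons such as `None`.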


In [195]:
x = 'Monty'

In [201]:
#Strings are also immutable!
#####

#This works just fine, assigning 'x' to new objects
x = 'Monty'
x = x + ' Python'

#This does not work:
#x[0] = 'D'

#This does:
x = 'D' + x[1:]

x

'Donty Python'

In [204]:
#Note None:
x = None
y = None

print(x == False)

False


In [209]:
'String'.lower().islower()

True

### Typing and Variable Name Binding

- Everything in Python, including functions, etc. is a Python object, and has an associated type and internal data.


- Any time you assign a variable you are creating a *reference* to the object on the right-hand side. Assignment is also known as *binding*: **Binding a name to an object.**


- Python is a **strongly typed language**:  objects still have types and implicit conversions only occur in obvious cases.


- But the references to objects have no type, so you can do stuff like:

In [210]:
a = 5
print('I\'m a ', type(a), ', living at ', hex(id(a)))

a = 'I\'m a string now!'
print('I\'m a ', type(a), ', living at ', hex(id(a)))

I'm a  <class 'int'> , living at  0x14dd72169b0
I'm a  <class 'str'> , living at  0x14ddc299cb0
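To illustrate the "strongly typed" point, here is a small sketch: a name can be rebound to any type, but the objects themselves are not silently converted. The variable names are chosen only for this example.

```python
count = 5
label = "5"

# Mixing int and str is not converted implicitly; it raises TypeError
try:
    result = count + label
except TypeError as err:
    result = None
    print("TypeError:", err)

# Explicit conversion states the intent either way
as_number = count + int(label)    # 10
as_text = str(count) + label      # '55'
print(as_number, as_text)         # 10 55
```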


#### Be very careful:

Python uses "pass by assignment"

- When passing variables to a function, new local variables are created referencing the original objects, without any copying

- If you bind a new object to a passed variable within the function, the change is *not* reflected in the parent scope

- However, you can alter *mutable* objects within the function and this *is* reflected in the parent scope

For example:

In [211]:
def foo1(x):
    x = x + 1
    
y = 5
foo1(y)
print(y)

5


In [212]:
def foo1(x):
    x = x + 1
    
def foo2(x):
    x = [1, 2, 3]
    
def foo3(x):
    x.append(99)
    
y = 2
foo1(y)
print(y)

y = [2]
foo2(y)
print(y)

y = [2]
foo3(y)
print(y)

2
[2]
[2, 99]
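If you do not want a function to change the caller's list, one common option is to copy inside the function and return the new object. The sketch below uses two hypothetical helpers, `append_in_place` and `append_copy`, to contrast the two behaviors:

```python
def append_in_place(items, value):
    """Mutates the caller's list: the change is visible outside the function."""
    items.append(value)

def append_copy(items, value):
    """Leaves the caller's list alone and returns a new list instead."""
    new_items = items.copy()   # shallow copy, so the original is untouched
    new_items.append(value)
    return new_items

original = [2]
result = append_copy(original, 99)
print(original, result)   # [2] [2, 99]

append_in_place(original, 99)
print(original)           # [2, 99]
```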


### Help?

<img src="Hello_World/hello3.jpg" alt="Hello World!" style="width:375px;"/>


In [213]:
?float

In [214]:
dir(str)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'removeprefix',
 'removesuffix',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',


In [215]:
x = [1,2,3]

In [216]:
help(x)

Help on list object:

class list(object)
 |  list(iterable=(), /)
 |  
 |  Built-in mutable sequence.
 |  
 |  If no argument is given, the constructor creates a new empty list.
 |  The argument must be an iterable if specified.
 |  
 |  Methods defined here:
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __iadd__(self, value, /)
 |      Implement self+=value.
 |  
 |  __imul__(self, value, /)
 |      Implement self*=value.
 |  
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate sign

In [None]:
#Add <tab>:

#x.

### Don't Forget to Comment!

In [217]:
### Comment!

"""Also comment
But also multi-line string"""

'Also comment\nBut also multi-line string'

### Advanced variable types

- Lists
- Dictionaries
- Tuples
- Sets

### Lists

Lists are ordered, array-like structures that are *mutable*. List elements can be of any type, including other lists, dictionaries, and tuples.

In [218]:
#Declare a list like so:
L = [3, 5, 9.2, 'oye', ['a', 'b', 8.8, [1, 2,3]]] #Note the list as the last element of L

print(L)
type(L)

[3, 5, 9.2, 'oye', ['a', 'b', 8.8, [1, 2, 3]]]


list

In [224]:
#Access by index, starting at 0
L[4][3][:]

[1, 2, 3]

#### Methods: Lists, like most Python objects, have methods that you can call...

In [225]:
L[4][0]

'a'

In [226]:
#Note some useful string methods:
#And how we "chain" methods:

L[4][0].upper().zfill(3)

'00A'

In [227]:
## Check out the many attributes/methods of the mighty list:
######

dir(list)


['__add__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

In [228]:
#Let's try some!
#Note that these typically alter the list in place:

L = [1, 5, 3, 2]

L.sort()  #Could add reverse = True as optional argument
print(L)

L.reverse()
print(L)

[1, 2, 3, 5]
[5, 3, 2, 1]


In [229]:
#Once again, variable name binding:
#####

#To see:
a = [1, 5, 3, 2]

b = a
b[0] = 99

print(a)
print(b)


[99, 5, 3, 2]
[99, 5, 3, 2]


In [233]:
#Copy method, or [:]:

#So instead:
a = [1, 5, 3, 2]

if True:  #Flip to False to try the other approach; both make a shallow copy
    b = a[:]
else:
    b = a.copy()
    
    
b[0] = 88

print(a)
print(b)


[1, 5, 3, 2]
[88, 5, 3, 2]
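Both `a[:]` and `a.copy()` make a *shallow* copy: the outer list is new, but any nested mutable objects are still shared. A short sketch of the difference, using the standard library's `copy.deepcopy` for a fully independent copy:

```python
import copy

nested = [1, [2, 3]]

shallow = nested.copy()          # same effect as nested[:]
deep = copy.deepcopy(nested)     # recursively copies the inner list too

shallow[0] = 99         # replaces a top-level item: nested is unaffected
shallow[1].append(4)    # mutates the shared inner list: nested sees this

print(nested)   # [1, [2, 3, 4]]
print(shallow)  # [99, [2, 3, 4]]
print(deep)     # [1, [2, 3]]
```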


#### Accessing lists and other list-like objects:

In [234]:
#Various ways to access...
a[0]

a[2:4]

a[:] #Everything
a[:-1] #Everything except last element

a[-1] #Last element
a[-3] #Third from last
a[-2:] #Second to last to the end

a[1:6:2] #1:6 by 2, note this is different order than MATLAB

a[::2] #Everything by 2

a[::-1] #Go backwards

[2, 3, 5, 1]

#### Extending/Appending Lists

In [257]:
#Let's append and extend our lists, I say
#Consider the following

a = [1, 2, 3]
b = [4, 5, 6]

a + b

#a*5
#Go from there...
#a.append(b)
#a

#a.extend(b)
#a

#a*3 + b*5

[1, 2, 3, 4, 5, 6]

In [252]:
a

[1, 2, 3]

#### Math on lists?

In [261]:
#Try the following
#############

a = [1,2,3]
b = [1,1,1]

a + [1]

[1, 2, 3, 1]

In [262]:
a + [1]

[1, 2, 3, 1]

In [263]:
a + b

[1, 2, 3, 1, 1, 1]

In [264]:
a * 3

[1, 2, 3, 1, 2, 3, 1, 2, 3]

In [265]:
## To actually do math...
#######

#List comprehension comes in handy:

a = [1,2,3]

[i + 1 for i in a]


[2, 3, 4]

In [266]:
[i*3 for i in a]

[3, 6, 9]

In [267]:
#Add two lists:
#####

a = [1,2,3]
b = [5,10,15]

[i + j for i, j in zip(a, b)]


[6, 12, 18]

In [270]:
for i,j in zip(a,b):
    print(i,j)

1 5
2 10
3 15


In [269]:
list(zip(a, b))

[(1, 5), (2, 10), (3, 15)]

#### List Comprehension

The basic form of a list comprehension is:

```
[expr for val in collection if condition]
```

Which is equivalent to:

```
result = []
for val in collection:
    if condition:
        result.append(expr)
```

The condition is optional. Without it, `expr` is evaluated and appended for every `val` in the collection.

In [33]:
[i**3 if (i%3 == 0) else i for i in [1,2,3] if i < 15]

[1, 2, 27]

In [274]:
list(range(1,10))

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [275]:
#Create a new array of the even values, squared:
#C-style:
x = list(range(1,10))

y = []
for k in x:
    if (k % 2 == 0):
        y.append(k**2)

y

[4, 16, 36, 64]

In [283]:
#Python Style:
x = list(range(1,10))

#y = [el**2 for el in x if el % 2 == 0]

#Note that an if-else must go BEFORE the "for" (a trailing "if ... else" after the collection will not work):
y = [el**2 if el % 2 == 0 else el**3 for el in x]
y

[1, 4, 27, 16, 125, 36, 343, 64, 729]

### Dictionaries

Dictionaries are mutable collections of key/value pairs, declared using curly braces `{}`.

In [284]:
#Make a dictionary
my_dict = {'thing1':12.3, 'thing2':14, 'cat':'hat', 'yurtle':'turtle', 5:42}
my_dict

{'thing1': 12.3, 'thing2': 14, 'cat': 'hat', 'yurtle': 'turtle', 5: 42}

In [287]:
#Get by key
my_dict['thing1']

12.3

In [None]:
#Note that dictionaries are accessed by key, not by position (insertion order is preserved since Python 3.7)
#This raises a KeyError, because 0 is not a key:
my_dict[0]
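A small sketch of how a missing key behaves, and how `.get()` avoids the exception. The dictionary here is a trimmed-down stand-in for `my_dict`, and the default value `'missing'` is just an illustrative choice.

```python
my_dict = {'thing1': 12.3, 'cat': 'hat', 5: 42}

# A missing key raises KeyError with square brackets
try:
    my_dict[0]
    found = True
except KeyError:
    found = False

# .get() returns a default instead of raising
value = my_dict.get(0, 'missing')
print(found, value)       # False missing

# Integer keys are still keys, not positions
print(my_dict[5])         # 42
```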


In [None]:
#Can add to dictionary
#my_dict.update({'lorax': 'trees'})
#my_dict

#Can also just use:
my_dict['lorax'] = 'trees'
my_dict

In [None]:
#Can pop an entry by key: Remove key and return the item
a = my_dict.pop('cat')
print(a)
my_dict

#Using del also an option:
#del my_dict['cat']
#my_dict

In [None]:
#Can also pop last key:item pair
a = my_dict.popitem()
print(a)
my_dict

In [None]:
#See if a key or value in the dictionary
'thing1' in my_dict
'turtle' in my_dict.values()

In [None]:
dir(dict)

Note that any Python object can be a value, but keys must be *hashable* objects. In practice that means immutable objects such as scalar types and tuples (below), provided the tuple's elements are themselves hashable. To check, use the `hash()` function:

In [None]:
hash("sdf")
hash([1,2,3])
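The following sketch shows what hashability means for dictionary keys: a tuple of hashable items works as a key, while a list (or a tuple containing a list) raises `TypeError`. The names `grid` and `bad` are only for illustration.

```python
# Tuples of hashable items can be dictionary keys
grid = {(0, 0): "origin", (1, 2): "point"}
print(grid[(1, 2)])   # point

# Lists are mutable and unhashable, so using one as a key raises TypeError
try:
    bad = {[1, 2]: "nope"}
except TypeError as err:
    bad = None
    print("TypeError:", err)

# A tuple that contains a list is not hashable either
try:
    hash((1, [2, 3]))
    tuple_with_list_hashable = True
except TypeError:
    tuple_with_list_hashable = False

print(tuple_with_list_hashable)   # False
```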

### Tuples

Tuples are ordered, *immutable* array-like objects

In [290]:
#Let's make us a tuple
t = 1, 3, 'test', 9, [1,2,3]
t

(1, 3, 'test', 9, [1, 2, 3])

In [291]:
#Index and access similar to lists
#But tuples are immutable
t[4][0]

1

In [292]:
#We can have lists, dictionaries, other tuples, etc. as elements
t2 = ([1,2], (5,6,7), {'huey':'dewey', 10.1:20})

t2[0][1]

2

In [293]:
#Don't actually need parentheses...
t3 = [1,2,3], 15, 'string!', True, False

t3

([1, 2, 3], 15, 'string!', True, False)

In [None]:
#Unless necessary for more complex expressions, e.g. nested tuples (a tuple of tuples):
nested_tuple = (4, 5, 6), (7, 8)


In [296]:
tuple(range(0,5))

(0, 1, 2, 3, 4)

In [None]:
#Can convert any sequence or iterator to a tuple with tuple():

tuple([1,2,3])

#tuple(range(5,15))

In [None]:
tuple("string")

Tuples are *immutable*. The following raises a `TypeError`:

In [297]:
t = tuple([1,2,3,False])

t[0] = 10

TypeError: 'tuple' object does not support item assignment

However, we *can* modify mutable objects within a tuple:

In [298]:
t = 1, 5, [1, 2], True, "string"

t[2].extend([1,5,6])
t[2].remove(1)
t[2][1] = 99

t

(1, 5, [2, 99, 5, 6], True, 'string')

**Unpacking tuples**

In [299]:
#Can unpack like so:
###

t = (4, 5, 6)

a, b, c = t
print(a,b,c)

4 5 6


In [301]:
#For a nested tuple:
###

t = 4, 5, (6, 7)

a, b, c = t

print(a,b,c)

#OR

a, b, (c, d) = t
print(a,b,c,d)

4 5 (6, 7)
4 5 6 7


### Sets
Sets are unordered, unindexed collections that do not allow duplicate values. You can add or remove items, but you cannot change an item in place, and every item must be hashable.

In [304]:
#A quick example
A = {1, 2, 2, 3, 4, 5, 5}
print(A)

A.add(6)
A.discard(2)

A

{1, 2, 3, 4, 5}


{1, 3, 4, 5, 6}

#### Can be a useful shortcut to getting unique elements:

In [305]:
a = [1, 2, 3, 1, 1, 5]

print(set(a))

print(len(set(a)))

{1, 2, 3, 5}
4
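Beyond de-duplication, sets support the usual mathematical set operations. A brief sketch follows; since sets do not remember order, the last lines show `dict.fromkeys` as one way to get unique values while keeping first-seen order.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

union = A | B             # {1, 2, 3, 4, 5}
intersection = A & B      # {3, 4}
difference = A - B        # {1, 2}
symmetric = A ^ B         # {1, 2, 5}

print(union, intersection, difference, symmetric)

# Membership tests work with "in"
print(3 in A)   # True

# Unique values in first-seen order, using a dict's insertion ordering
values = [3, 1, 3, 2, 1]
unique_in_order = list(dict.fromkeys(values))
print(unique_in_order)   # [3, 1, 2]
```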


### Built-In Sequence Functions

#### enumerate

In [307]:
#When we use a for loop, we loop over an iterable object
#Often want to track index. Can do:

index = 0
L = [1,5,6,"sdf",11]

for k in L:
    print("Index " + str(index) + " has value: " + str(k))
    
    index += 1

Index 0 has value: 1
Index 1 has value: 5
Index 2 has value: 6
Index 3 has value: sdf
Index 4 has value: 11


In [308]:
#Alternative is to use enumerate(): Returns sequence of (i, value) tuples:

for index, value in enumerate(L):
    print("Index " + str(index) + " has value: " + str(value))

Index 0 has value: 1
Index 1 has value: 5
Index 2 has value: 6
Index 3 has value: sdf
Index 4 has value: 11


In [309]:
#We can also map the (unique) values in a list to their location in the list, using a dictionary plus enumerate:

my_list = ['archer', 'mage', 'fighter']

mapping = {}

for index, value in enumerate(my_list):
    mapping[value] = index
    
print(mapping)

#Now can do:
my_list[mapping['fighter']] = 'barbarian'
my_list

{'archer': 0, 'mage': 1, 'fighter': 2}


['archer', 'mage', 'barbarian']

#### zip
`zip` pairs up the elements of several sequences, returning a lazy iterator of tuples. Wrap it in `list()` to see the pairs:

In [320]:
seq1 = [1, 2, 3]
seq2 = ["one", "two", "three"]

zipped = zip(seq1, seq2)
print(zipped)

#print(list(zipped))


<zip object at 0x0000014D95E36200>


In [315]:
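#A zip object is a one-shot iterator: the first list() call consumes it,
#so a second call returns an empty list
list(zipped)
list(zipped)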

[]

In [321]:
#Can also "unzip":
########

L = list(zip(seq1, seq2))  #Re-create the zip: an exhausted zip object yields nothing

a, b = zip(*L)

print(a, b)

(1, 2, 3) ('one', 'two', 'three')


In [322]:
#Note if you zip sequences of unequal length:
#########

seq1 = [1, 2, 3]
seq2 = ["one", "two", "three"]
seq3 = ["A", "B"]

zipped = zip(seq1, seq2, seq3)
l = list(zipped)
l

[(1, 'one', 'A'), (2, 'two', 'B')]
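As the output shows, `zip` silently stops at the shortest input. If you would rather keep every element, the standard library's `itertools.zip_longest` pads the shorter inputs. A minimal sketch, where the `"?"` fill value is an arbitrary choice for illustration:

```python
from itertools import zip_longest

seq1 = [1, 2, 3]
seq3 = ["A", "B"]

# zip stops at the shortest input
short = list(zip(seq1, seq3))
print(short)    # [(1, 'A'), (2, 'B')]

# zip_longest pads the shorter input with fillvalue (None by default)
padded = list(zip_longest(seq1, seq3, fillvalue="?"))
print(padded)   # [(1, 'A'), (2, 'B'), (3, '?')]
```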

### Control Structures

- Conditional statements
- `for`, `while`
- `range`
- `try`-`except`

In [None]:
#Basic if-elif-else
#####
mylist = [1,2,3,4,5]

if (2 in mylist):
    print('It is!')
elif (4 in mylist):
    print('Ooh, found this!')
elif (7 in mylist):
    print('A boring 7.')
else:
    print('None, oh no!')

#More standard comparisons
# ==, !=, and, or, <=, >=, <, >

In [None]:
#Special option is pass:
if (2 == 2):
    #I'll implement something brilliant later
    pass
else:
    pass


In [324]:
#Note try-except
####

try:
    x = (1,2,3)
    #x[0] = 99
except TypeError:
    print('Failure!')
else:
    print('Success! But remember, the paths of glory lead but to the grave.')
finally:
    print('Depressing either way!')
    

Success! But remember, the paths of glory lead but to the grave.
Depressing either way!
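It is usually better to catch specific exception types than to catch everything, so that unexpected errors still surface. A sketch with a hypothetical helper, `safe_divide`, that handles two different failure modes:

```python
def safe_divide(a, b):
    """Hypothetical helper: return a / b, or None if the division is not possible."""
    try:
        result = a / b
    except ZeroDivisionError:
        print("Cannot divide by zero")
        return None
    except TypeError as err:
        print("Bad operand types:", err)
        return None
    else:
        # Runs only when no exception was raised
        return result

print(safe_divide(6, 3))     # 2.0
print(safe_divide(1, 0))     # None
print(safe_divide("6", 3))   # None
```

Returning `None` on failure is just one design choice for this sketch; re-raising or returning a sentinel value may suit other situations better.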


In [None]:
#For loops
#########

#Note, we always use "in" structure, often with range function
for k in range(0,10,2):
    print(k)

In [None]:
#Range stuff
a = range(0,10)
a

#list(a)


In [None]:
b = range(0, 10, 3)
for k in b:
    print(k)

In [None]:
#Construct list using for and range
mylist = [i for i in range(0,20,2)]

mylist

In [None]:
#Can iterate over lots of things:
name_list = ['A', 'B', 'C', 'D']
my_dict ={'thing1':12.3, 'thing2':14, 'cat':'hat', 'yurtle':'turtle', 5:42}

for n in name_list:
    print(n)
    
print('')
    
for d in my_dict.items():
    print(d[1])
    
print('')

for k in my_dict:
    print(k, my_dict[k])

In [None]:
#The venerable while loop

count = 0
while (count < 5):
    count += 1
    
print('count is now ' + str(count))

In [None]:
#We have some special commands for looping
#break, continue
my_list = [i for i in range(3,10)]

#Let's try to find the first index corresponding to 5 in my_list
i = -1
for k in range(len(my_list)):
    
    print(k)
    
    if (my_list[k] == 5):
        i = k
        break
        
    #Let's avoid printing 'Hey!'
    continue
    
    print('Hey!')

In [None]:
#Compare to:
my_list.index(5)


In [None]:
#What if we want to find all indices?
#Can do with a for loop:
my_list = [i for i in range(3,10)] + [1,5,5,2,5]

index_list = []

for k in range(len(my_list)):
    
    if (my_list[k] == 5):
        index_list.append(k)

index_list


In [None]:
#OR MAX PYTHON "List Comprehension"
######

index_list = [i for i, x in enumerate(my_list) if x == 5]
index_list

### List, Set, and Dict Comprehension and being "Pythonic"

We have already seen that the basic form of a **list comprehension** is:

```
[expr for val in collection if condition]
```

Which is equivalent to:

```
result = []
for val in collection:
    if condition:
        result.append(expr)
```

The condition is optional. Without it, `expr` is evaluated and appended for every `val` in the collection.

In [None]:
#Get ridiculous...

#Square if divisible by 2 but not by 3
#Cube if divisible by 3
#Otherwise keep the same
#Do for the first 6 indices

x = list(range(1,10))

x = [el**2 if (el % 2 == 0 and el % 3 != 0) else el**3 if el % 3 == 0 else el for i, el in enumerate(x) if i < 6]
x


#### Set Comprehensions

For set comprehensions, we just use `{}` instead of `[]`:

```
{expr for val in collection if condition}
```

#### Dict Comprehensions

Use format:

```
{key-expr : value-expr for value in collection if condition}
```
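The templates above can be sketched concretely as follows. The word list is made-up sample data for illustration.

```python
words = ["spam", "eggs", "ham", "spam", "bacon"]

# Set comprehension: the distinct word lengths (sorted only for display)
lengths = {len(w) for w in words}
print(sorted(lengths))   # [3, 4, 5]

# Dict comprehension: map each distinct word to its length
word_lengths = {w: len(w) for w in words}
print(word_lengths)   # {'spam': 4, 'eggs': 4, 'ham': 3, 'bacon': 5}

# With a condition: keep only the words longer than 3 letters
long_words = {w: len(w) for w in words if len(w) > 3}
print(long_words)     # {'spam': 4, 'eggs': 4, 'bacon': 5}
```

Duplicate keys collapse in a dict comprehension just as duplicate values do in a set, which is why `'spam'` appears only once.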


### Functions

In [325]:
#Let's define our own functions!

#Can set default values
def add(a = 3, b = 4):
    x = a + b
    
    return x

In [326]:
#Can return multiple values
#Really a tuple

def get_123():
    return 1,2,3

In [330]:
#Try calling...
####
get_123()


(1, 2, 3)
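A short sketch of how default values, keyword arguments, and tuple returns behave at the call site. The two functions are repeated here so the example runs on its own.

```python
def add(a=3, b=4):
    return a + b

def get_123():
    return 1, 2, 3

print(add())          # 7: both defaults used
print(add(10))        # 14: a=10, b keeps its default
print(add(b=10))      # 13: the keyword argument overrides only b

# The returned tuple can be unpacked straight into names
first, second, third = get_123()
print(first, second, third)   # 1 2 3

# Use * to gather "the rest" into a list
head, *rest = get_123()
print(head, rest)     # 1 [2, 3]
```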

In [None]:
#Let's consider variable numbers of positional arguments
#####
def add_n(*args):
    
    print(args)
    
    total_sum = 0
    
    for k in args:
        total_sum += k
        
    return (total_sum)

In [None]:
#Let's consider variable numbers of keyword arguments
#####
def test_var(**kwargs):
    
    print(kwargs)
    
    for (k,v) in kwargs.items():
        print(k,v)
        

In [None]:
test_var(x=4, fat='rat')
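The `*` and `**` syntax also works in the other direction: at the call site, it unpacks an existing list or dict into arguments. A sketch using a hypothetical function, `describe`, that simply returns what it received:

```python
def describe(*args, **kwargs):
    """Hypothetical helper: report what was received positionally and by keyword."""
    return args, kwargs

# Positional arguments arrive as a tuple, keyword arguments as a dict
print(describe(1, 2, x=4, fat="rat"))   # ((1, 2), {'x': 4, 'fat': 'rat'})

# * and ** at the call site unpack existing collections into arguments
numbers = [1, 2, 3]
options = {"x": 4, "fat": "rat"}
received_args, received_kwargs = describe(*numbers, **options)
print(received_args)     # (1, 2, 3)
print(received_kwargs)   # {'x': 4, 'fat': 'rat'}
```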

### Anonymous Functions

In [34]:
#Can define simple functions using lambda:
cube = lambda x: x**3
mult = lambda x,y: x*y

cube(3)
#Try out...

27

In [36]:
#Can use a lambda expression without naming the function, hence, "anonymous function"

(lambda x: x**3)(5)


125
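Anonymous functions are most useful when passed directly to another function. A minimal sketch with the built-ins `sorted` and `map`, using made-up sample data:

```python
pairs = [("mage", 3), ("archer", 1), ("fighter", 2)]

# Sort by the second item of each tuple
by_rank = sorted(pairs, key=lambda pair: pair[1])
print(by_rank)   # [('archer', 1), ('fighter', 2), ('mage', 3)]

# Sort strings by length, longest first (key=len would work equally well here)
names = ["mage", "archer", "fighter"]
by_length = sorted(names, key=lambda name: len(name), reverse=True)
print(by_length)   # ['fighter', 'archer', 'mage']

# map applies a function to every item; list() collects the lazy result
cubes = list(map(lambda x: x**3, [1, 2, 3]))
print(cubes)     # [1, 8, 27]
```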

### Generators

Iterators are objects that yield, in turn, objects to the Python interpreter when used in a context like a `for` loop:

Consider iterating over a range:

In [None]:
for k in range(5):
    print(k)
    

In [None]:
#Python creates an iterator to do this:
range_iterator = iter(range(5))

range_iterator

In [None]:
list(range_iterator)

#Do twice:
#list(range_iterator)
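Under the hood, a `for` loop repeatedly calls `next()` on the iterator until it raises `StopIteration`. The sketch below steps through that by hand:

```python
range_iterator = iter(range(3))

# next() pulls one value at a time
first = next(range_iterator)    # 0
second = next(range_iterator)   # 1
third = next(range_iterator)    # 2

# Once exhausted, next() raises StopIteration; a for loop handles this for you
try:
    next(range_iterator)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)   # 0 1 2 True

# Passing a default value avoids the exception
print(next(range_iterator, "done"))      # done
```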

We can make a *generator* to construct an iterable object.

- Generators return a sequence of results *lazily*
- Use `yield` instead of `return` in a function

Ex:

In [37]:
#Generate cubes from 1 to n:
def gen_cubes(n = 10):
    for i in range(1, n+1):
        yield i**3

In [40]:
gen = gen_cubes(int(1e3))
gen

<generator object gen_cubes at 0x000002440A4D2270>

In [41]:
#for x in gen:
#    print(x)
    
list(gen)

[1,
 8,
 27,
 64,
 125,
 216,
 343,
 512,
 729,
 1000,
 1331,
 1728,
 2197,
 2744,
 3375,
 4096,
 4913,
 5832,
 6859,
 8000,
 9261,
 10648,
 12167,
 13824,
 15625,
 17576,
 19683,
 21952,
 24389,
 27000,
 29791,
 32768,
 35937,
 39304,
 42875,
 46656,
 50653,
 54872,
 59319,
 64000,
 68921,
 74088,
 79507,
 85184,
 91125,
 97336,
 103823,
 110592,
 117649,
 125000,
 132651,
 140608,
 148877,
 157464,
 166375,
 175616,
 185193,
 195112,
 205379,
 216000,
 226981,
 238328,
 250047,
 262144,
 274625,
 287496,
 300763,
 314432,
 328509,
 343000,
 357911,
 373248,
 389017,
 405224,
 421875,
 438976,
 456533,
 474552,
 493039,
 512000,
 531441,
 551368,
 571787,
 592704,
 614125,
 636056,
 658503,
 681472,
 704969,
 729000,
 753571,
 778688,
 804357,
 830584,
 857375,
 884736,
 912673,
 941192,
 970299,
 1000000,
 1030301,
 1061208,
 1092727,
 1124864,
 1157625,
 1191016,
 1225043,
 1259712,
 1295029,
 1331000,
 1367631,
 1404928,
 1442897,
 1481544,
 1520875,
 1560896,
 1601613,
 1643032,
 

In [42]:
list(gen)

[]

#### Can also use a *generator expression*, analogous to list comprehension. Use ():

Note that generators are "forgetful": you can only go through the values once.

In [None]:
gen = (x ** 2 for x in range(11))

#max(gen)

for x in gen:
    print(x)
    
list(gen)  #Returns []: the for loop above already exhausted the generator
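Because generators produce values lazily, they can describe sequences that are never held in memory all at once, or that have no end at all. A sketch, where `count_cubes` is a hypothetical unbounded variant of the cube generator and `itertools.islice` takes just the first few values:

```python
from itertools import islice

def count_cubes():
    """Hypothetical generator: yields 1, 8, 27, ... with no upper limit."""
    i = 1
    while True:
        yield i**3
        i += 1

# islice takes the first 5 values without ever building the full sequence
first_five = list(islice(count_cubes(), 5))
print(first_five)   # [1, 8, 27, 64, 125]

# A generator expression feeds sum() one value at a time
total = sum(x**2 for x in range(11))
print(total)        # 385
```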

### A final note on being "Pythonic"

In [43]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


See **PEP 8**: https://peps.python.org/pep-0008/#introduction

"A Foolish Consistency is the Hobgoblin of Little Minds"

Paragraph from Emerson:

"A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines. With consistency a great soul has simply nothing to do. He may as well concern himself with his shadow on the wall. Speak what you think now in hard words, and to-morrow speak what to-morrow thinks in hard words again, though it contradict every thing you said to-day. -- Ah, so you shall be sure to be misunderstood. -- Is it so bad, then, to be misunderstood? Pythagoras was misunderstood, and Socrates, and Jesus, and Luther, and Copernicus, and Galileo, and Newton, and every pure and wise spirit that ever took flesh. To be great is to be misunderstood."

In [None]:
#Example 1:
#Get sum from number a=5 to b=50, inclusive
#C-style:
a = 5
b = 50
total_sum = 0

while a <= b:
    total_sum += a
    a += 1
total_sum

In [None]:
#Python style
total_sum = sum(range(5,51))
total_sum

In [None]:
#Square every even value in an array
#C-style:
x = [1, 2, 3, 4, 5, 6, 7, 8]

for k in range(len(x)):
    if (x[k] % 2 == 0):
        x[k] = x[k]**2

x

In [None]:
#Python style
x = [1, 2, 3, 4, 5, 6, 7, 8]

x = [el**2 if el % 2 == 0 else el for el in x]
x

### snake_case is generally considered appropriate for *Python*

**snake_case_is_the_best_case**

**CamelCase** for classes (aka **StudlyCaps**)

<img src="https://i.redd.it/24p3e2gvotg31.jpg" alt="Hello World!" style="width:350px;"/>

## Plotting in matplotlib

The fundamentals:

- Introduce basic plots
- Annotation, tips and tricks
- Customization
- Advanced subplotting
- Colormaps and custom colormaps; matplotlib colorcycle
- Custom legends
- Saving figures

In [None]:
#Note we'll use some numpy, imported as np at the start, also pyplot:
#Just reimport:

import matplotlib.pyplot as plt
import numpy as np

In [None]:
## We'll also use these for custom colormaps:
from matplotlib import cm
from matplotlib.colors import ListedColormap


In [None]:
#Bare bones plot:
#######

x = np.linspace(0,10,100)
y = np.sin(x)

#And can just plot with good ole' matplotlib
plt.plot(y)

#Adds a plot to same figure:
plt.plot(y**2)

In [None]:
#Let's now use an AxesSubplot object for plotting and customize a little
#######

fig1, ax1 = plt.subplots(1, 1, figsize=(8, 6)) #, dpi=120)


ax1.plot(x,y/4, linewidth=3, linestyle="dashed", color=(.9, .1, .1, .7), label="Line 1")
ax1.plot(x,y**2, linewidth=5, color='blue', alpha=.5, label="Line 2")


#Let's set labels and fontsizes:
ax1.set_xlabel('Why, this is the x-axis', fontsize=16, fontweight='bold')
ax1.set_ylabel('And here\'s y', fontsize=16)
ax1.tick_params(axis='both', labelsize=16)
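#Bold the tick labels. Newer matplotlib versions only accept text properties in
#set_xticks() when labels are passed too, so style the existing labels instead:
plt.setp(ax1.get_xticklabels(), fontweight='bold')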


#We can rotate the tick labels:
ax1.set_xticks(ax1.get_xticks(), ax1.get_xticklabels(), rotation=60, ha='right')


#Title and legend:
#Add a y offset to the title:
ax1.set_title("The Title!", fontsize=16, fontweight="bold", y = 1.1);
ax1.legend(fontsize=14, loc='lower left')


#Plus, let's customize the x and y limits:
ax1.set_xlim([0, 10]);
ax1.set_ylim([-.5, 1.25]);

In [None]:
## Scatter Plots:

#Note the subplotting:

#We can make a "scatter plot" like this:
fig1, ax1 = plt.subplots(1, 2, figsize=(12, 5))

ax1[0].plot(x, y, 'o', markersize = 10, markerfacecolor='red', markeredgecolor='black', markeredgewidth=3)


#Or use the dedicated scatter method, which can vary color and size per point:
ax1[1].scatter(x, y, c = y, s = abs(y)*100 + 10, cmap='viridis', edgecolor='black')


In [None]:
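#Note how size scaling differs: markersize in plot() is in points,
#while s in scatter() is an area in points**2 (so s = markersize**2 matches)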
fig1, ax1 = plt.subplots(1, 2, figsize=(8, 4))

for k in range(1,5):
    ax1[0].plot(k, 1, 'o', markersize = k*10, markerfacecolor='red', markeredgecolor='black', markeredgewidth=1)

    ax1[1].scatter(k, .975, s = k*100, c = 'blue', edgecolor='black')
    ax1[1].scatter(k, 1, s = k**2*100, c = 'red', edgecolor='black')
    
for i in range(2):
    ax1[i].set_xlim([.5,4.5])
    ax1[i].set_ylim([.96,1.01])

### Subplots and *Advanced Subplots*

In [None]:
#Simplest method
################

fig1, ax1 = plt.subplots(2,3,figsize=(8,6))


#And proceed:
for i in range(2):
    for j in range(3):
        
        x = np.linspace(0,10,100)
        y = np.sin(x + np.random.uniform(-2,2))

        ax1[i, j].plot(x, y, marker = '.', markersize = 7)

In [None]:
fig1, ax1 = plt.subplots(2,3,figsize=(8,6))

#Sometimes this is useful:     
ax1 = ax1.flatten()

#And proceed:
for i in range(6):
    x = np.linspace(0,10,100)
    y = np.sin(x + np.random.uniform(-2,2)) 

    ax1[i].plot(x, y, marker = '.', markersize = 7)

    
    #Also often useful:
    ####
    if (i < 3):
        ax1[i].set_xticks([])
    if (i % 3 != 0):
        ax1[i].set_yticks([])

#### Grab the default colorcycle:

In [None]:
#Note, compare colors:
fig1, ax1 =  plt.subplots(1,1,figsize=(8,6))

#All in one plot:
for i in range(6):
    x = np.linspace(0,10,100)
    y = np.sin(x + np.random.uniform(-2,2)) 

    ax1.plot(x, y, marker = '.', markersize = 7)
    

#Can get the default colorcycle
######
prop_cycle = plt.rcParams['axes.prop_cycle']
colors = prop_cycle.by_key()['color']


#And:
fig1, ax1 = plt.subplots(2,3,figsize=(8,6))

ax1 = ax1.flatten()

for i in range(6):
    x = np.linspace(0,10,100)
    y = np.sin(x + np.random.uniform(-2,2)) 

    ax1[i].plot(x, y, marker = '.', markersize = 7, color=colors[i])

    


In [None]:
## "Turn off" unused subplots?
###########

fig1, ax1 = plt.subplots(2,3,figsize=(8,6))

ax1 = ax1.flatten()
for i in range(4):
    ax1[i].plot(x, y)

    
#Use this!!
ax1[5].set_axis_off()
ax1[4].set_axis_off()   

In [None]:
#Alternative subplot method
###########################

#Can set up grids of subplots using add_subplot:

#Start with a nice figure
fig = plt.figure(figsize=(10,6), dpi=90)

ax1 = fig.add_subplot(2, 2, 1) 
ax2 = fig.add_subplot(2, 2, 2)  
ax3 = fig.add_subplot(2, 2, 3)
ax4 = fig.add_subplot(2, 2, 4)

#Can make a nested tuple or list of ax for easier reference:
ax = [[ax1, ax2], [ax3, ax4]]

In [None]:
#This style is more useful in creating custom grids:
#Example:

fig = plt.figure(figsize=(12,8))

ax1 = fig.add_subplot(4, 4, 1) 
ax2 = fig.add_subplot(4, 4, 2)
ax3 = fig.add_subplot(4, 4, 5)
ax4 = fig.add_subplot(4, 4, 6)

ax5 = fig.add_subplot(2, 2, 3)
ax6 = fig.add_subplot(1, 2, 2)


#We can also adjust space around subplots:
###
plt.subplots_adjust(wspace=0.25, hspace=0.25)

#And add one title to rule them all!
####
plt.suptitle('There Can Be Only One', fontweight='bold', fontsize=18)

#### Inset axes

Let's preview some geopandas here as well...

In [None]:
#We'll need this:

from mpl_toolkits.axes_grid1.inset_locator import inset_axes

In [None]:
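#Some built-in geopandas teaching data:
#(Note: gpd.datasets was deprecated in GeoPandas 0.14 and removed in 1.0. On newer versions,
# download the Natural Earth data yourself and pass its path to gpd.read_file())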
#Recall we had:
import geopandas as gpd

#"gdf" is a common generic variable name = GeoDataFrame
gdf = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))


In [None]:
gdf.head()

In [None]:
#Super basic plot: See how beautiful and automatic this is?!
fig, ax1 = plt.subplots(1,1, figsize=(12, 10), dpi=300)

gdf.plot(ax = ax1, column = 'pop_est', edgecolor='black', linewidth=1, cmap='jet')

In [None]:
#Now let's inset a histogram of the population estimates:
##########

fig, ax1 = plt.subplots(1,1, figsize=(12, 10), dpi=300)

#Note the added alpha argument:
#Also, we'll use the LOG of population:
gdf.plot(ax = ax1, column = np.log10(gdf.pop_est), edgecolor='black', alpha=.5, linewidth=1, cmap='jet')

axins = inset_axes(ax1, width="20%", height="30%", loc=3, borderpad=6)


#Make a histogram on a LOG scale:
######

#Note: Try this vs. next:

if (0):
    n, bins, patches = axins.hist(gdf.pop_est, bins = 30, rwidth = 1, facecolor='red', \
                                edgecolor='black', linewidth = .5, alpha=0.75)

    #Set scale to LOG:
    axins.set_xscale('log')

else:
    my_bins = np.logspace(2,10,30)

    n, bins, patches = axins.hist(gdf.pop_est, bins = my_bins, rwidth = 1, facecolor='red', \
                                edgecolor='black', linewidth = .5, alpha=0.75)

    #Set scale to LOG:
    axins.set_xscale('log')
    
axins.set_title('Pop. Histogram: \n Log Scale', fontsize=10, fontweight='bold')
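
The two branches above differ in how the bins are built: `bins = 30` gives equal-width bins on a linear axis, which no longer look equal once the x-scale is logarithmic, while `np.logspace` gives edges that are evenly spaced in log10. A small NumPy-only check, using the same edges as above (`linear_widths` and `log_widths` are names introduced here):

```python
import numpy as np

# 30 edges from 10**2 to 10**10 define 29 bins
my_bins = np.logspace(2, 10, 30)

# Bin widths grow on a linear scale...
linear_widths = np.diff(my_bins)

# ...but are constant in log10 units, so bars look equally wide on a log axis
log_widths = np.diff(np.log10(my_bins))

print(len(my_bins) - 1)                    # number of bins
print(np.allclose(log_widths, 8 / 29))     # each bin spans 8/29 of a decade
```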


In [None]:
#By Continents...
#######
fig, ax1 = plt.subplots(3,3, figsize=(12, 10), dpi=300)

ax1 = ax1.flatten()

#Note the enumerate: Loop through unique continents:
for i, c in enumerate(gdf.continent.unique()):

    gdf_plot = gdf.loc[gdf['continent'] == c]
    
    gdf_plot.plot(ax = ax1[i], column = 'pop_est', edgecolor='black', alpha=.5, linewidth=1, cmap='jet')
    
    #Throw this in:
    ax1[i].set_xticks([])
    ax1[i].set_yticks([])
    ax1[i].set_title(c)
    
    
    #And inset axes:
    axins = inset_axes(ax1[i], width="30%", height="40%", loc=1, borderpad=0)

    #my_bins = np.logspace(2,10,30)
    axins.hist(gdf_plot.pop_est, bins = 30, rwidth = 1, facecolor='red', \
                                edgecolor='black', linewidth = .5, alpha=0.75)

#And:
ax1[8].set_axis_off()

#### And super extra advanced subplots...

- Also note some bar plot customization
- And custom colormap

In [None]:
#Observe for reference...
#########

fig = plt.figure(figsize=(15, 14), constrained_layout=False, dpi=300)

widths = [.8, .1, .3]
heights = [.05, .75, .5, .05, .75, .5]

spec = fig.add_gridspec(ncols=3, nrows=6, width_ratios=widths,
                          height_ratios=heights, wspace=0.01, hspace=.55)

#Main Plot 1
ax1 = fig.add_subplot(spec[0:3,0])

#Main Plot 2
ax2 = fig.add_subplot(spec[3:6,0])

#Barplots 1
ax3 = fig.add_subplot(spec[1,2])
ax3b = fig.add_subplot(spec[2,2])

#Barplots 2
ax4 = fig.add_subplot(spec[4,2])
ax4b = fig.add_subplot(spec[5,2])


#Maps:
#####

gdf.plot(ax=ax1, column='pop_est', edgecolor='black', linewidth=.75, alpha=1, cmap='jet')
ax1.set_axis_off()
ax1.set_title('Population', fontweight='bold')

gdf.plot(ax=ax2, column='gdp_md_est', edgecolor='black', linewidth=.75, alpha=1, cmap='jet')
ax2.set_axis_off()
ax2.set_title('GDP', fontweight='bold')


##############
#Top barplots
###############

gdf = gdf.sort_values(by = 'pop_est', ascending=False)

X = 5

#A colormap:
from matplotlib import cm
#(cm.get_cmap was removed in Matplotlib 3.9; plt.get_cmap takes the same arguments)
my_cmap = plt.get_cmap('jet_r', X)

ax3.bar(gdf['name'].iloc[0:X], np.round(gdf['pop_est'].iloc[0:X] / 1e6), edgecolor='black',
            color=my_cmap([i for i in range(X)]))

for container in ax3.containers[0:X]:
    ax3.bar_label(container, fontweight='bold', fontsize=9)
        
ax3.set_ylabel('Top Pops (millions)')
ax3.spines['top'].set_visible(False)
ax3.spines['right'].set_visible(False)
   
ax3.set_xticks(ax3.get_xticks(), ax3.get_xticklabels(), rotation=15, ha='right', fontsize=8);


##

X = 50
my_cmap = plt.get_cmap('jet_r', X)

ax3b.bar(gdf['name'].iloc[0:X], np.round(gdf['pop_est'].iloc[0:X] / 1e6),
            color=my_cmap([i for i in range(X)]))


ax3b.set_ylabel('Top 50 Pops')
ax3b.spines['top'].set_visible(False)
ax3b.spines['right'].set_visible(False)
ax3b.set_xticks([])


###############
#Bottom barplots
###############

#Just a hacky cut-and-paste job of above
#And modified to hard code gdp_est, ugh

gdf = gdf.sort_values(by = 'gdp_md_est', ascending=False)

X = 5

#A colormap:
from matplotlib import cm
my_cmap = plt.get_cmap('jet_r', X)

ax4.bar(gdf['name'].iloc[0:X], np.round(gdf['gdp_md_est'].iloc[0:X] / 1e6), edgecolor='black',
            color=my_cmap([i for i in range(X)]))

for container in ax4.containers[0:X]:
    ax4.bar_label(container, fontweight='bold', fontsize=9)
        
ax4.set_ylabel('Top GDPs (trillions)')
ax4.spines['top'].set_visible(False)
ax4.spines['right'].set_visible(False)
   
ax4.set_xticks(ax4.get_xticks(), ax4.get_xticklabels(), rotation=15, ha='right', fontsize=8);


##

X = 50
my_cmap = plt.get_cmap('jet_r', X)

ax4b.bar(gdf['name'].iloc[0:X], np.round(gdf['gdp_md_est'].iloc[0:X] / 1e6),
            color=my_cmap([i for i in range(X)]))


ax4b.set_ylabel('Top 50 GDPs')
ax4b.spines['top'].set_visible(False)
ax4b.spines['right'].set_visible(False)
ax4b.set_xticks([])

### Text, annotation, and drawing

In [None]:
#Add some annotations to a random walk time-series
##########

#Note the cumsum() method:
######

N = 500
data = np.random.choice([-1,1], N).cumsum()
t = np.arange(N)

fig1, ax1 = plt.subplots(1, 1, figsize=(10,8), dpi=90)

ax1.plot(t, data, color=(.7, .3, .3))


#Add some text:
####

N1 = 50

x = t[N1]; y = data[N1] + 10
text_str = 'I\'m text!'

#Could do:
#ax1.text(x, y, text_str, fontsize=16)

#Fancier:
ax1.text(x, y, text_str, fontsize=16, color=(.75, 0.1, 0.1, .9), fontweight='bold',
         bbox={'facecolor':'blue','alpha':.25,'edgecolor':'black','pad':10},
         ha='center', va='center')


#Add some "Annotations"
#########

crisis_data = ((150, 'Big Crisis'), (200, 'Little Crisis'), (300, 'Meh Crisis'))

for n, label in crisis_data:
    ax1.annotate(label, xy = (n, data[n] + 2),
                        xytext = (n, data[n] + 5),
                        arrowprops = dict(facecolor='green', edgecolor='blue', headwidth=8, width=3, headlength=8),
                        fontsize=14, color='blue')

In [None]:
#Can make shapes:
################

fig = plt.figure(figsize = (10,8), dpi=90)

ax1 = fig.add_subplot(1,1,1)

rect = plt.Rectangle((0.1, .2), .5, .15, color='blue', alpha=.5)
circ = plt.Circle((.5, .5), .2, color=(.9, .0, .0), alpha=.6)
pgon = plt.Polygon([[.15, .15], [.35, .4], [.2, .6]], color='green', alpha=.5)

ax1.add_patch(rect)
ax1.add_patch(circ)
ax1.add_patch(pgon)

### Custom Legends

Here's one way to make a custom legend:

In [None]:
from matplotlib.patches import Patch
from matplotlib.lines import Line2D

fig1, ax1 = plt.subplots(1, 1, figsize=(4,3), dpi=120)

legend_elements = [Patch(facecolor='blue', edgecolor='black', alpha=1, label='Ocean'),
                   Patch(facecolor='green', edgecolor='black', alpha=1, label='Land'),
                   
                   Line2D([0], [0], alpha=.75, marker='o', color='w', label='City',
                          markeredgecolor='black', markerfacecolor='darkblue', markersize=16),
                   Line2D([0], [0], alpha=.75, marker='o', color='w', label='Village',
                          markerfacecolor='red', markersize=10),
                   
                   Line2D([0], [0], alpha=1, color='grey', label='Road', linewidth=4)]


#Like so:
#ax1.legend(handles=legend_elements, fontsize=11, loc='lower left')

#Or so:
ax1.legend(handles=legend_elements, fontsize=11, bbox_to_anchor=(0.6, .85))


### Custom Colormaps

Some convenient ways to get a custom colormap:

In [None]:
from matplotlib import cm
from matplotlib.colors import ListedColormap

#(cm.get_cmap was removed in Matplotlib 3.9; plt.get_cmap takes the same arguments)
my_cmap = plt.get_cmap('jet', 4)

my_cmap

In [None]:
my_cmap = plt.get_cmap('plasma_r', 15)

my_cmap

In [None]:
#Listed:
####

vals = [[46/255, 169/255, 222/255],
        [255/255, 199/255, 2/255],
        [88/255, 143/255, 48/255]]

my_cmap = ListedColormap(vals)

my_cmap

In [None]:
#Note this hack to add alpha:
####

my_cmap = plt.get_cmap('viridis', 8)
my_cmap


In [None]:
new_map = [list(my_cmap(i))[:3] + [.5] for i in range(8)]

my_cmap = ListedColormap(new_map)
my_cmap
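
The same alpha adjustment can be written with array operations: calling a colormap with an integer array returns an `(N, 4)` RGBA array whose last column is alpha. A sketch, assuming `plt.get_cmap` for the lookup; `base`, `rgba`, and `half_alpha_cmap` are names introduced here:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

base = plt.get_cmap('viridis', 8)   # 8 discrete colors

rgba = base(np.arange(8))           # shape (8, 4): columns are R, G, B, A
rgba[:, 3] = 0.5                    # overwrite the alpha column

half_alpha_cmap = ListedColormap(rgba)
```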

## Saving plots: Important!

Use something like:

```
plt.savefig('My_Fig.png', dpi=300, facecolor="white", bbox_inches='tight', pad_inches=0.05)
```
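
A minimal end-to-end sketch of saving a figure with those options. The temporary directory is only there to keep the example self-contained; in practice you would pass your own path:

```python
import os
import tempfile
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))
x = np.linspace(0, 10, 100)
ax.plot(x, np.sin(x))

# Hypothetical output location for this example
out_path = os.path.join(tempfile.mkdtemp(), 'My_Fig.png')

# fig.savefig takes the same arguments as plt.savefig, but names the figure explicitly
fig.savefig(out_path, dpi=300, facecolor='white', bbox_inches='tight', pad_inches=0.05)
plt.close(fig)
```

Calling `savefig` on the figure object avoids saving the wrong figure when several are open.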

### NumPy

The following cells go over some fundamentals of NumPy. We will probably not cover most of this in class, but it is provided as a reference.

- NumPy was developed for fast computation with large arrays
- Fast vectorized operations without need for loops
- C API for connecting NumPy with libraries written in C, C++, FORTRAN

- NumPy stores data internally in large contiguous blocks of memory
- NumPy's core routines are written in C and act on that memory without Python interpreter overhead
- Much faster than Python lists for numerical work on large arrays

The N-dimensional array object, `ndarray`, is the primary data container in NumPy.

In [None]:
#Let's make a numpy array, and plot
#Already saw some of this
import numpy as np

#One way to make an ndarray (nd = n-dimensional array, very similar to MATLAB arrays/matrices)
x = np.array([1,2,3,4,5,6,7,8,9,10])

#A better way (the stop value is exclusive, so use 11 to include 10):
y = np.array(np.arange(1,11))

#Or just:
z = np.arange(1,11)

#And can do simple plot, as above:
plt.plot(x,x**2)

In [None]:
#Another good way to make numpy arrays
####
x = np.linspace(0,10,100)

y = np.cos(x)**3

plt.plot(x,y)

`ndarray`s have `ndim`, `shape`, and `dtype`:

In [None]:
data = np.random.randn(2,3).round(2)

display(data)

print(data.ndim)
print(data.shape)
print(data.dtype)

In [None]:
%%time

#numpy is *way* faster than lists
#Can see with the following...
#Do element-wise multiplication a bunch of times...

#Will import time package
import time


arr_list = list(range(0,int(1e6)))
arr_np = np.arange(0,1e6)

#Do list way
####
start = time.time()
for k in range(10):
    arr_list = [i*2 for i in arr_list]

end = time.time()

print('Elapsed: ' + str(end-start))


#Do numpy way
####
start = time.time()
for k in range(10):
    arr_np = arr_np*2

end = time.time()

print('Elapsed: ' + str(end-start))
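
For more reliable numbers than a single `time.time()` difference, the standard-library `timeit` module can repeat each statement. A sketch with smaller arrays so it runs quickly (the array size and repeat count are arbitrary choices made for this example):

```python
import timeit
import numpy as np

n = 100_000
arr_list = list(range(n))
arr_np = np.arange(n)

# Time 20 repetitions of each approach
t_list = timeit.timeit(lambda: [i * 2 for i in arr_list], number=20)
t_np = timeit.timeit(lambda: arr_np * 2, number=20)

print(f'list comprehension: {t_list:.4f} s')
print(f'numpy:              {t_np:.4f} s')

# Both approaches compute the same values
same = np.array_equal(np.array([i * 2 for i in arr_list]), arr_np * 2)
print(same)
```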


In [None]:
%%time
#Can also do a quick version using magic %time:
#Above is a magic command applied to whole cell: MUST come on first line of cell (including comments)
#A magic command prefixed with a single % applies to the following line

%time for k in range(100): arr_np = arr_np*2   

In [None]:
#Note int vs. float, multiplication vs. division
####
#Try ushort, cdouble,...
arr_np = np.arange(0,1e6, dtype= np.float32)

%time for k in range(100): arr_np = arr_np*2

In [None]:
#Some casting
arr_np = np.arange(0,1e1, dtype=np.int32)

arr_np = arr_np.astype(np.int64)

type(arr_np[0])
#######
#See: https://numpy.org/doc/stable/user/basics.types.html for type lists
#######

In [None]:
#Note scalars of different types:
f16 = np.float16(.1)
f32 = np.float32(.1)
f64 = np.float64(.1)

f16 == f32 == f64

#type(f16)
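
The chained comparison above is `False` because 0.1 has no exact binary representation, so each precision stores a slightly different value. When approximate equality is what you want, `np.isclose` compares within a tolerance. A short sketch:

```python
import numpy as np

f16 = np.float16(.1)
f32 = np.float32(.1)
f64 = np.float64(.1)

# Exact comparisons across precisions
print(f16 == f32, f32 == f64)

# The default tolerances accept float32 vs. float64 but not float16 vs. float64...
print(np.isclose(f32, f64))
print(np.isclose(f16, f64))

# ...so loosen rtol when comparing against low-precision values
print(np.isclose(f16, f64, rtol=1e-3))
```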

In [None]:
#Note overflow: This can be a source of cryptic errors (2**31 exceeds the int32 maximum of 2**31 - 1)
u32 = np.int32(2)
u32 = u32**31
u32
#type(u32)
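
One way to guard against this is to check the range of a dtype with `np.iinfo` and to cast to a wider type before the operation. A minimal sketch (`info` and `wide` are names introduced here):

```python
import numpy as np

info = np.iinfo(np.int32)
print(info.min, info.max)      # representable range of a signed 32-bit integer

# Casting to 64 bits before exponentiating leaves plenty of headroom
wide = np.int64(2) ** 31
print(wide)

# The result does not fit in an int32
print(wide > info.max)
```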

#### Multi-dimensional arrays

More on creating arrays: there are many built-in functions for making arrays, in addition to `array`:

In [None]:
#A random array
###
rand_arr = np.random.rand(4,3).round(2)

rand_arr

In [None]:
#empty allocates memory without initializing it, so the contents are arbitrary:
B = np.empty([5,3])
B

In [None]:
#Some other arrays
#Note slightly different formats
A = np.eye(5,5) #Or identity
A
#B = np.ones([2,3])
#C = np.zeros([3,5])
#D = np.empty((2, 3, 2))

In [None]:
#And this way:
A = np.array([[1,2,3],[4,5,6]])
A

In [None]:
#Also have ones_like, zeros_like, empty_like:
B = np.ones_like(A)
B
#C = np.zeros_like(A)
#D = np.empty_like(A)

In [None]:
#Can do matrix multiplication
A = np.array([[1,2],[3,4]])
B = np.array([[1, 2], [1,1]])
print(A)
print("")
print(B)

print("")

#Element-wise:
print(A*B)

#Matrix multiplication:
np.dot(A,B)
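
For 2-D arrays, the `@` operator (equivalently `np.matmul`) gives the same matrix product as `np.dot`, and is often easier to read. Using the same `A` and `B` (`elementwise` and `matprod` are names introduced here):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[1, 2], [1, 1]])

elementwise = A * B     # multiplies matching entries
matprod = A @ B         # rows of A times columns of B

print(elementwise)
print(matprod)
```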

Arithmetic on NumPy arrays:

In [None]:
#Addition, subtraction, scalar multiplication...
A = np.ones([2,3])


A*2 - A*3

In [None]:
A * .3

In [None]:
#Can do inner (dot) and outer products:
a = np.arange(4)
b = np.array([-2,1,0,3])

np.dot(a,b)

In [None]:
np.outer(a,b)

Concatenating:

In [None]:
#Concatenate
A = np.ones([2,2])
B = np.ones([2,2])

A
np.concatenate((A,B), axis=0)
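
The `axis` argument decides whether arrays are joined as new rows or new columns; `np.vstack` and `np.hstack` are shorthands for the two 2-D cases. A quick shape check (`B` is filled with zeros here only to make the two blocks distinguishable):

```python
import numpy as np

A = np.ones([2, 2])
B = np.zeros([2, 2])

rows = np.concatenate((A, B), axis=0)   # B goes below A
cols = np.concatenate((A, B), axis=1)   # B goes to the right of A

print(rows.shape, cols.shape)
```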

#### Basic Indexing and Slicing

In [None]:
#In 1-D, similar to Python lists:
#######

a = np.arange(10)

a[2:5]

a[:-5]
a[-5:-2]

a[::-2]
a[::2]
a[-3:-5]

#Can use a list in numpy!:
a[[1,2,3]]

a[-5:-2]

In [None]:
#Slicing a Python list makes a (shallow) copy, so a is unchanged below:
a = [1,2,3,4,5]
b = a[:]

b[0] = 99
a

In [None]:
#Note slicing yields views (references to the same data) in numpy, so writing to the slice changes a:
a = np.arange(10)

a_slice = a[5:8]
#a_slice

a_slice[0:2] = 99

a

In [None]:
#Need to use copy()
a = np.arange(10)

a_slice = a[5:8].copy()
a_slice[0:2] = 99
a
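
To check whether two arrays share data, rather than inferring it from side effects, `np.shares_memory` and the `.base` attribute are handy. A short sketch (`view` and `copied` are names introduced here):

```python
import numpy as np

a = np.arange(10)

view = a[5:8]            # basic slicing returns a view
copied = a[5:8].copy()   # an independent copy

print(np.shares_memory(a, view))     # True
print(np.shares_memory(a, copied))   # False
print(view.base is a)                # True: the view refers back to a
print(copied.base is None)           # True: the copy owns its data
```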

#### Slicing in 2-D...

In [None]:
#In 2-D
#########
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
A

In [None]:
#Same as A[0] (A[:] is the whole array, so [0] then picks the first row):
A[:][0]
#A[1, 2]

In [None]:
A[0:3, 1]

In [None]:
A[:,0]

#Note this doesn't give the first column; it returns the first row instead:
#A[:][0]
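
The difference comes from how the expression is evaluated: `A[:][0]` is two separate indexing steps (first `A[:]`, then `[0]` on the result), while `A[:, 0]` is a single index with one entry per axis. With the same `A`:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

first_row = A[:][0]    # A[:] is all of A, so this is just A[0]
first_col = A[:, 0]    # all rows, column 0

print(first_row)
print(first_col)
```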

In [None]:
A[:,1]
A[:,1:2]
A[:,1:3]

A[:2, 1:]

A[0:2,-1]
A[0:2,::-1]

In [None]:
#We can assign single values to slices:
A[1] = 99
A

In [None]:
A[:] = 42
A

In [None]:
A[0:2,0:2] = 1

A

In [None]:
#We can cast with astype; by default this always creates a copy, so A is unchanged:
######

#Try with np.float64:
B = A[0:2,0:2].astype(np.float64)
B[:,:] = 2

display(B)
A
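
To confirm the copy behaviour, `np.shares_memory` can be used. By default `astype` returns a new array even when the dtype is unchanged; passing `copy=False` allows NumPy to hand back the original when no conversion is needed. A sketch (`same_dtype` and `no_copy` are names introduced here):

```python
import numpy as np

A = np.ones((3, 3), dtype=np.int64)

B = A[0:2, 0:2].astype(np.float64)   # different dtype: a new array
B[:, :] = 2

same_dtype = A.astype(np.int64)               # still a copy by default
no_copy = A.astype(np.int64, copy=False)      # no conversion needed, so no copy

print(np.shares_memory(A, B))            # False
print(np.shares_memory(A, same_dtype))   # False
print(np.shares_memory(A, no_copy))      # True
print(A)                                 # unchanged by the write to B
```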

#### 3-D arrays

In [None]:
a = np.zeros([2,3,4])
a

In [None]:
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
arr3d

In [None]:
arr3d.ndim

In [None]:
arr3d[0, :, 0:2]

In [None]:
arr3d[0:2, 0]

#### Boolean Indexing

In [None]:
A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[0, 1, 4], [6, -2, 7]])

print(A, "\n")
print(B, "\n")

#Can do boolean masking similar to R:
print(A > B, "\n")

A[A > B]
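
A mask with the same shape as the array returns a flattened 1-D selection. To keep the original shape and choose between two arrays element by element, `np.where` takes the same kind of mask. With the `A` and `B` above (`picked` and `larger` are names introduced here):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[0, 1, 4], [6, -2, 7]])

picked = A[A > B]                 # 1-D array of the entries where A exceeds B
larger = np.where(A > B, A, B)    # same shape as A: A where the mask is True, else B

print(picked)
print(larger)
```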


In [None]:
#Consider also:
data = np.array(np.arange(1,29)) #np.random.randn(7,4).round(2)

data = data.reshape((7,4))

data

In [None]:
#Now do boolean indexing:

codes = np.array(['A', 'B', 'C', 'D', 'A', 'E', 'F'])

#codes == 'A'

data[codes == 'A']

In [None]:
data[~(codes == 'A')]
#Or:
data[codes != 'A']

In [None]:
#Combine masks with & and | (Python's and/or don't work element-wise); parenthesize each comparison:
mask = (codes == 'A') | (codes == 'B')

data[mask]
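
When matching against several values, chaining `|` gets unwieldy; `np.isin` builds the same mask in one call. A sketch with the same `data` and `codes` (`mask_or`, `mask_isin`, and `selected` are names introduced here):

```python
import numpy as np

data = np.arange(1, 29).reshape((7, 4))
codes = np.array(['A', 'B', 'C', 'D', 'A', 'E', 'F'])

mask_or = (codes == 'A') | (codes == 'B')
mask_isin = np.isin(codes, ['A', 'B'])

selected = data[mask_isin]   # rows 0, 1, and 4

print(np.array_equal(mask_or, mask_isin))
print(selected)
```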

In [None]:
#More masking
####
data[data > 20] = 0
data

data[mask] = -1

data

#### Re-shaping and transposing

In [None]:
arr = np.arange(15).reshape((3, 5))
print(arr)
arr.T

In [None]:
#In higher dimensions:
arr = np.arange(16).reshape(2,2,4)
print(arr)

#Re-order the axes: swap axes 0 and 1, while axis 2 stays in place
arr.transpose(1,0,2)
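
Because the array above has shape `(2, 2, 4)`, swapping the first two axes leaves the shape unchanged, which can hide what happened. A sketch with three distinct axis lengths makes the reordering visible (`swapped` is a name introduced here):

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

swapped = arr.transpose(1, 0, 2)   # new axis 0 is old axis 1, and vice versa

print(arr.shape, swapped.shape)

# Element [i, j, k] of the original sits at [j, i, k] after the swap
print(arr[1, 2, 3] == swapped[2, 1, 3])
```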

#### Universal Functions

Universal functions (ufuncs) perform element-wise operations on ndarrays. Many are simple *unary* ufuncs that take a single array:

In [None]:
A = np.arange(10)
A

In [None]:
A = np.arange(10)

print(np.sqrt(A), '\n')

print(np.exp(A))

*Binary* ufuncs act on two arrays and return a single array:

In [None]:
x = np.array([1, 2, 3, 4, 9, 10])
y = np.array([8, 5, 4, 1, 9, 10])

np.maximum(x, y)

A few ufuncs act on a single array and return two arrays:

In [None]:
x = (np.random.randn(7) * 5).round(2)

print(x, '\n')

remainder, whole_part = np.modf(x)

print(remainder)
print(whole_part)

### Some Aggregation Functions in numpy

In [None]:
x = np.arange(20).reshape(5,4)
x

In [None]:
x.mean()

In [None]:
x.sum()
x.min()
x.max()
x.std() #or x.var()

In [None]:
x.mean(axis=1)

In [None]:
x.sum(axis=0)

In [None]:
x.cumsum(axis=0)

In [None]:
x.cumprod(axis=1)
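
A useful way to read the `axis` argument: it names the axis that gets collapsed. So `axis=0` aggregates down the rows and returns one value per column, while `axis=1` returns one value per row. With the same `x` (`col_sums`, `row_means`, and `centered` are names introduced here):

```python
import numpy as np

x = np.arange(20).reshape(5, 4)

col_sums = x.sum(axis=0)      # shape (4,): one sum per column
row_means = x.mean(axis=1)    # shape (5,): one mean per row

# keepdims=True keeps the collapsed axis with length 1, which helps broadcasting
centered = x - x.mean(axis=1, keepdims=True)

print(col_sums)
print(row_means)
print(centered.shape)
```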

Booleans count as 1 (`True`) and 0 (`False`) in aggregations, so `sum` counts the `True` values:

In [None]:
x = np.random.randn(50)

(x > 0).sum()

#(x > 0).any()
#(x > 0).all()

Get unique elements, sort:

In [None]:
x = np.array([7, 2, 1, 2, 1, 2, 3, 4, 2])

x.sort()
x

#To reverse sort:
#x[::-1].sort()

#Or
x.sort()
x = x[::-1]
x

In [None]:
#Note that this sorts:
np.unique(x)

#Same values as the following (which returns a list rather than an array):
sorted(set(x))
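
`np.unique` can also report how often each value occurs via `return_counts=True`. A sketch with the same values (`values` and `counts` are names introduced here):

```python
import numpy as np

x = np.array([7, 2, 1, 2, 1, 2, 3, 4, 2])

values, counts = np.unique(x, return_counts=True)

# Pair each unique value with its number of occurrences
for v, c in zip(values, counts):
    print(v, c)
```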

To sort along an axis...

In [None]:
x = np.random.randint(1, 50, 12).reshape(3,4)

print(x)

x.sort(0)
x

#### Linear Algebra

- From above, `A * B` gives us element-wise multiplication. We use `np.dot(A,B)` (or, equivalently for 2-D arrays, `A @ B`) for actual matrix multiplication.

- `numpy.linalg` has a standard set of matrix decomposition and other matrix functions, like inverse, determinant, etc.

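Common functions (`diag`, `dot`, and `trace` are available in the top-level `numpy` namespace; the rest are in `numpy.linalg`):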
- `diag` ~ Return diagonal elements as 1D array
- `dot` ~ Matrix multiplication
- `trace` ~ Sum of diagonal elements
- `det` ~ Matrix determinant
- `eig` ~ eigenvalues/vectors of square matrix
- `inv` ~ Inverse of square matrix
- `pinv` ~ Moore-Penrose pseudo-inverse
- `qr` ~ QR Decomposition
- `svd` ~ Singular Value Decomposition
- `solve` ~ Solve linear system Ax = b for x, where A is square matrix
- `lstsq` ~ Compute least-squares solution to Ax=b
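
As a small illustration of `solve`, `inv`, and `det` from the list above, here is a 2x2 system chosen (for this example only) so that the answer is easy to verify by hand:

```python
import numpy as np

A = np.array([[3., 1.],
              [1., 2.]])
b = np.array([9., 8.])

x = np.linalg.solve(A, b)       # solves A x = b
print(x)
print(np.allclose(A @ x, b))

# The inverse gives the same answer, though solve is generally preferred numerically
x_inv = np.linalg.inv(A) @ b

det_A = np.linalg.det(A)        # 3*2 - 1*1 = 5, up to floating-point rounding
print(det_A)
```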

In [None]:
#Ex:
#(the "* 0" switches the random noise off; X is then singular, so we use the pseudo-inverse)
X = ((np.arange(16) + np.random.randn(16) / 10 * 0).round(2)).reshape(4,4)

print(X)
np.linalg.pinv(X)
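
The pseudo-inverse is most often met in least-squares problems: for a tall matrix with independent columns, `pinv(X) @ b` matches the `lstsq` solution. A sketch with a made-up 3x2 system (all values here are illustrative):

```python
import numpy as np

X = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
b = np.array([1., 2., 4.])

X_pinv = np.linalg.pinv(X)
coef_pinv = X_pinv @ b

# lstsq returns the solution plus residuals, rank, and singular values
coef_lstsq, residuals, rank, sv = np.linalg.lstsq(X, b, rcond=None)

print(np.allclose(coef_pinv, coef_lstsq))

# For this full-column-rank X, pinv(X) acts as a left inverse
print(np.allclose(X_pinv @ X, np.eye(2)))
```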