# Python Summary Guide

Notes influenced by [cheatsheet](https://github.com/gto76/python-cheatsheet)

## Collections:
[List](#list)<br>
[Dictionary](#dictionary)<br>
[Set](#set)<br>

## Data Types
[Type](#types)<br>
[String](#string)<br>
[Formatting](#formatting)<br>

In [174]:
import numpy as np
import pandas as pd
import itertools     # itertools.chain.from_iterable(lst) --> flatten list
import functools     # functools.reduce(func, lst) --> fold list into one value (e.g. product of elements)
import collections   # collections.defaultdict and collections.Counter
import re            # REGEX


<p><a name='list'></a></p>

# Collections
## Lists
- Ordered and changeable (mutable) values
- May be heterogeneous (e.g. mix numbers and strings)

**Basics**

In [175]:
lst = [*range(1,9)]
nest_lst = list(zip(lst, sorted(lst, reverse=True))) # list of tuples [(1,8),(2,7),...]

# Indexing/Slicing
lst[1:6:2]          # lst[from_inclusive : to_exclusive : +- step]
lst[2]              # el at index 2
lst[-1]             # last el (negative indices count from the end)
lst[:2]             # slice 0 (inclusive) --> 2 (exclusive)
lst[2:]             # slice el 2 (inclusive) --> end
lst[2:-1]           # slice el 2 (inclusive) --> -1, last el (exclusive)
lst[::-1]           # list in reverse
lst[::-2]           # list in reverse, step=2
lst[:-1]            # all but the last el
lst[:-1:2]          # all but the last el, step=2


[1, 3, 5, 7]
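
The slicing rules above can be checked with a minimal sketch. The list `nums` and the variable names are hypothetical, chosen only for illustration.

```python
nums = [*range(1, 9)]        # [1, 2, 3, 4, 5, 6, 7, 8]

middle = nums[1:6:2]         # indices 1, 3, 5 --> [2, 4, 6]
all_but_last = nums[:-1]     # stop is exclusive, so the last el is dropped
backwards = nums[::-1]       # negative step walks the list from the end
shallow_copy = nums[:]       # slicing returns a new list; nums is not mutated
```

Note that `[:-1]` and `[::-1]` differ by a single colon but do very different things.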

**Methods**

In [176]:
# Adding to lists
lst.append(10)                     # add el to end of list (mutates; returns None)
lst += [9]                         # add list to end of existing list
lst.extend([1,2,3,4])              # add list to end of existing list (mutates; returns None)
list_of_chars = list('string')     # splits string ['s', 't', 'r', 'i', 'n', 'g']

# Flattening lists
flatten_lst = [*itertools.chain.from_iterable(nest_lst)] # flatten list
flatten_lst = sum( nest_lst, () )  # flatten into a tuple; start value must match inner type (tuple here, [] for lists)

# Sorting lists
lst.sort(reverse=False)              # .sort *mutates* list --> returns None
sort_lst = sorted(lst,reverse=False) # sorted() returns new list
reversed(lst)                        # returns reverse iterator (does not sort)
sorted_by_one_el = sorted(nest_lst, key=lambda row: row[0])            # sort nest_lst by single el
sorted_by_multi_el = sorted(nest_lst, key=lambda row: (row[0],row[1])) # sort by 1st el, if tie, sort by 2nd el

# Arithmetic on lists
sum_of_els = sum(lst)                               # sum all elements in list
elementwise_sum = [sum(pair) for pair in nest_lst]  # sum els per inner struct
product_of_els = functools.reduce(lambda output, el: output*el, lst) # or simply np.prod(lst)


In [177]:
n_occur = lst.count(1)  # number of occurrences (works for strings)

idx = lst.index(1)            # index of *first* occurrence
np.where(np.array(lst)==1)[0] # convert to array to find all indices of occurrence

lst.insert(5, 1000) # insert (index, val) in list

el = lst.pop()  # removes/returns last value in list (mutates list)
lst.remove(4)   # removes 1st occurrence of VALUE (mutates); raises ValueError if missing
lst.clear()     # removes all items (mutates)
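
A common pitfall with the methods above is that the mutating ones return `None`. The sketch below illustrates this with hypothetical lists, and also shows a pure-Python alternative to `np.where` for finding every index of a value.

```python
values = [3, 1, 2]

result = values.sort()         # sorts in place; result is None
ordered = sorted([3, 1, 2])    # returns a new sorted list

# every index at which the value 1 occurs, without NumPy
all_idx = [i for i, v in enumerate([1, 5, 1, 7]) if v == 1]
```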

**List Comprehension**
- More concise, and often faster, than the equivalent for-loop
- The same syntax creates dictionaries and sets; wrap a generator expression in tuple() to create a tuple


In [178]:
# For loop to create list of squared values
squares = []
for x in range(10):
    squares.append(x**2)
    
# Use map() to create list
squares = list(map(lambda x: x**2, range(10)))

# Use list comprehension to create list
squares = [ x**2 for x in range(10) ]

In [179]:
# Notation:
[ [x,y] for x in range(2) for y in range(3) ]              # nested for-loop notation
[ x for x in range(5) if x%2==0 ]                          # if-statement notation
[ [x,'Even'] if x%2==0 else [x,'Odd'] for x in range(5) ] # if-else statement notation


[[0, 'Even'], [1, 'Odd'], [2, 'Even'], [3, 'Odd'], [4, 'Even']]
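
The same comprehension syntax extends to other collections. This is a minimal sketch with hypothetical variable names.

```python
square_map = {x: x**2 for x in range(4)}       # dict comprehension
remainders = {x % 3 for x in range(10)}        # set comprehension (duplicates dropped)
square_tuple = tuple(x**2 for x in range(4))   # generator expression passed to tuple()
```

There is no tuple comprehension: parentheses alone, as in `(x**2 for x in range(4))`, create a lazy generator rather than a tuple.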

<p><a name='dictionary'></a></p>

## Dictionary
- Collection of key-value pairs; changeable and indexed by key (keeps insertion order since Python 3.7)
- Ex: {key: [values]}


**Basics**

In [180]:
# Create dictionary
keys = ['even','odd']
values = [[*range(0,11,2)], [*range(1,11,2)]]
dict_ = dict(zip(keys,values))                 # zip keys and values together

dict_ = {'even': [*range(0,11,2)], 
         'odd':[*range(1,11,2)]}


# Retrieve features of dictionary
dict_.keys()       # return view object containing all keys
dict_.values()     # return view object containing all values 
dict_.items()      # return view object of (key, value) tuples

# Retrieve values of key
dict_['even']      # return list of values
dict_.get('even')  # return list of values

# Handle missing keys
dict_.get('name','NoName')        # returns default value if key not found (dict unchanged)
dict_.setdefault('name','NoName') # inserts key with default if missing; returns its value
dict_.get('name')                 # returns 'NoName', inserted by setdefault above

# Update dictionary
dict_.update({'odd': [*range(1,11,3)]}) # replace values of existing key (or add new key)
dict_.pop('odd')                        # remove key from dict; returns its values
dict_['odd'] = [*range(1,11,2)]         # add key and values to dict
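
The difference between `get` and `setdefault` is easy to miss: only the latter modifies the dictionary. A minimal sketch, using a hypothetical `parity` dict:

```python
parity = {'even': [0, 2, 4]}

missing = parity.get('odd', [])       # returns the default; parity is unchanged
had_key = 'odd' in parity             # False, get() did not insert anything
odds = parity.setdefault('odd', [])   # inserts 'odd': [] and returns that list
odds.append(1)                        # mutates the list stored inside parity
```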


**collections.defaultdict()** - dict subclass that creates default value

In [181]:
from collections import defaultdict

dict_ = defaultdict(int)       # default factory int --> missing keys default to 0
dict_ = defaultdict(lambda: 1) # missing keys default to 1
dict_['test']                  # returns 1 because 'test' not key (key is also inserted)


1
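
A typical use of `defaultdict` is grouping items without checking whether a key already exists. The word list below is hypothetical data for illustration.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']  # hypothetical data

by_letter = defaultdict(list)         # missing keys start as an empty list
for word in words:
    by_letter[word[0]].append(word)   # no KeyError on the first word of each letter
```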

**collections.Counter** - dict subclass that counts occurrences of unique values

In [182]:
from collections import Counter

lst = ['blue', 'blue', 'blue', 'red', 'red', 'yellow']
counter = Counter(lst) # dict-like {color: count}
counter.most_common()  # list of (color, count) tuples, most common first


[('blue', 3), ('red', 2), ('yellow', 1)]
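
A few other `Counter` behaviours are worth knowing. This is a minimal sketch reusing the colour data from the cell above; the variable names are illustrative.

```python
from collections import Counter

colors = Counter(['blue', 'blue', 'blue', 'red', 'red', 'yellow'])

top_two = colors.most_common(2)   # limit the result to the n most common
missing = colors['green']         # 0; missing keys do not raise KeyError
colors.update(['red', 'red'])     # add more observations in place
```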

<p><a name='set'></a></p>


## Sets
- Unordered collection of unique, hashable (immutable) elements
- The set itself may be modified

In [183]:
setA = {1,2,3,4,3,2,5}    # create set {1, 2, 3, 4, 5}
setB = set(range(2,12,2)) # create set {2, 4, 6, 8, 10}

setA.add(6)     # add el
setA |= {7}     # add el
setA.update( [7,8,9,10] )  # add collection to set

setA.union(setB)                # values in A or B (or both)
setA.intersection(setB)         # values in both A and B
setA.difference(setB)           # values in A but not in B
setA.symmetric_difference(setB) # values in A or B, but not both 

setA.issubset(setB)   # is every el of A in B
setA.issuperset(setB) # is every el of B in A

setA.pop()      # remove and return an arbitrary val (mutates); raises KeyError if set is empty
setA.remove(6)  # remove val from set; raises KeyError if missing
setA.discard(6) # remove val; DOESN'T raise error if missing
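
The set methods above also have operator forms. A minimal sketch with two small hypothetical sets:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b          # same as a.union(b)
common = a & b         # same as a.intersection(b)
only_a = a - b         # same as a.difference(b)
either = a ^ b         # same as a.symmetric_difference(b)
is_sub = {3, 4} <= a   # same as {3, 4}.issubset(a)
```

The operators require both operands to be sets, whereas the methods accept any iterable.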


<p><a name='types'></a></p>

# Object Types
## Types
- Everything in Python is an object
- Every object has a type
- "Type" and "class" are synonymous

In [184]:
type('str')               # return object type
type([1,2,3])             # return object type
isinstance([1,2,3], list) # bool; is object an instance of the given type


True
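
`isinstance` respects inheritance and accepts a tuple of types, while comparing `type()` directly does not. A minimal sketch:

```python
is_number = isinstance(3.0, (int, float))   # a tuple checks several types at once
bool_is_int = isinstance(True, int)         # bool is a subclass of int
exact_match = type(True) is int             # type() ignores inheritance
```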

<p><a name='string'></a></p>

## Strings

**Basics**

In [185]:
# Creating a string
A = ' Jonathan Andrew Harris '
lines = 'This is line one.\nThis is line 2.'

# Stylize string
A.lower()      # lowercase all char
A.upper()      # uppercase all char
A.capitalize() # 1st char of string uppercased, rest lowercased
A.title()      # 1st char of all words capitalized

# Remove leading/trailing whitespace or characters
A = A.strip()     # remove whitespace characters from both ends
A.lstrip()        # remove whitespace from left end
A.rstrip()        # remove whitespace from right end
B = A.strip('t')  # remove the given characters ('t') from both ends

# Split string into list of substrings
A.split()                        # split str; default splits on runs of whitespace
A.split(sep=',')                 # split str by separator
A.split(sep=None, maxsplit=1)    # split at most maxsplit times
lines.splitlines(keepends=False) # split on line boundaries (\n, \r, \r\n, ...)

# Join list of strings into single string
lst = ['This','is','a','list']
''.join(lst)  # 'Thisisalist'
' '.join(lst) # 'This is a list'
'='.join(lst) # 'This=is=a=list'

# Check if string contains substring
'Harris' in A            # bool; 'substring in string'
A.startswith('Jonathan') # bool; string.startswith(substring)
A.endswith('Andrew')     # bool; string.endswith(substring)
A.find('a')              # returns start index of 1st match, or -1
A.index('a')             # same, but raises ValueError if missing 

# Replace characters within string
A.replace('n', 'XXX', 1) # replace substring with new substring; 3rd arg = max number of replacements


'JoXXXathan Andrew Harris'
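
Combining `split` and `join` is a common way to normalize whitespace. The sketch below uses a hypothetical messy input string.

```python
raw = '  Jonathan   Andrew  Harris '    # hypothetical messy input

clean = ' '.join(raw.split())           # collapse runs of whitespace to single spaces
first, rest = clean.split(maxsplit=1)   # split once, on the first whitespace
stripped = 'xxhixx'.strip('x')          # strip a specific character from both ends
```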

**REGEX**

In [186]:
# Substitute substring with new substring
pattern = r'an'
new_substring = 'AN'
string = A.lower()
re.sub(pattern, new_substring, string, count=2) # substitute

# Find all occurrences of substring
re.findall(pattern, string) # ['an', 'an']
re.finditer(pattern, string) # return iter object

# Split string by substring
test = 'bca bca bca bca '
re.split(r'a ',test)      # ['bc', 'bc', 'bc', 'bc', '']

# Determine if substring in string
re.search(r'an', string) # match object; search for 1st occurrence
re.match(r'jo', string)  # match object; search only at beginning of string


<re.Match object; span=(0, 2), match='jo'>
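
Match objects carry more than a yes/no answer. The sketch below, using the same lowercase name as the cell above, shows match positions, capture groups, and the `None` returned when nothing matches.

```python
import re

text = 'jonathan andrew harris'

spans = [m.span() for m in re.finditer(r'an', text)]   # (start, end) of each match
first_two = re.search(r'(\w+) (\w+)', text).groups()   # captured groups as a tuple
no_match = re.match(r'an', text)                       # None; match() only tries position 0
```

Because `re.search` and `re.match` return `None` on failure, it is usually safer to test the result before calling methods such as `.groups()` on it.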

<p><a name='formatting'></a></p>

## Formatting

**Strings**

In [187]:
'{0}'.format('abca')      # just str, no additional space
'{0:<10}'.format('abca')  # n char long, str on left;  'abca      '
'{0:>10}'.format('abca')  # n char long, str on right;  '      abca'
'{0:^10}'.format('abca')  # n char long, str in middle;  '   abca   '
'{0:.^10}'.format('abca') # fill with char instead of white space; '...abca...'


'...abca...'

**Numbers**

In [188]:
'{0:10,}'.format(123456)  # n char long, padded with white space; '   123,456'
'{0:,}'.format(123456)    # add , as thousands separator; '123,456'
'{0:+,}'.format(123456)   # always show sign before number; '+123,456'


'+123,456'

**Floats**

In [189]:
# Round to n significant figures {0:.n}
'{0:.3}'.format(1.23456)    # round to 3 sigfigs; '1.23'
'{0:10.3}'.format(1.23456)  # n char long, add white space; '      1.23' 

# Specify n decimal places {0:.nf}
'{0:.3f}'.format(1.23456)   # 3 decimal places; '1.235'

# Convert to scientific notation {0:.ne}
'{0:.3e}'.format(1.23456)   # 3 decimal places, sci-not; '1.235e+00'

# Convert to percentage {0:.n%} -- multiply by 100
'{0:.3%}'.format(1.23456)   # 3 decimal places, as percentage; '123.456%'


'123.456%'
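
The same format specifiers work inside f-strings (Python 3.6+), which are often easier to read than `.format()`. A minimal sketch with hypothetical variable names:

```python
name, count, ratio = 'abca', 123456, 1.23456

centered = f'{name:.^10}'   # centered in 10 chars, filled with '.'
grouped = f'{count:+,}'     # sign plus thousands separator
fixed = f'{ratio:.3f}'      # 3 decimal places
percent = f'{ratio:.1%}'    # multiplied by 100, 1 decimal place
```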