#### Writing Efficient Code

Efficiency Definition:
1. Minimal completion time (fast runtime)
2. Minimal resource consumption (small memory footprint)
3. Code is Pythonic

##### Zen of Python
    import this
    
**Use built-in functions over creating your own**

**Use NumPy arrays over lists for homogeneous numerical values**

**Use literal syntax over the formal constructor names when creating lists, tuples, and dicts**
    ex: use () over tuple(), {} over dict(), [] over list()
    
**To find the memory usage (in bytes) of a variable:**
    - import sys
    - sys.getsizeof(variablename)
    
**Finding items in sets is faster than finding them in lists or tuples**
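
A minimal sketch of the two tips above, assuming made-up sample data (numbers_list and numbers_set are hypothetical names). It times a membership test on a list and on a set with the standard timeit module, then checks object sizes with sys.getsizeof. Exact timings will vary by machine.

```python
import sys
import timeit

# Hypothetical sample data: the same 100,000 integers stored two ways
numbers_list = list(range(100_000))
numbers_set = set(numbers_list)

# Look up a value near the end of the data, 100 times each
list_time = timeit.timeit(lambda: 99_999 in numbers_list, number=100)
set_time = timeit.timeit(lambda: 99_999 in numbers_set, number=100)
print(f"list lookup: {list_time:.5f}s, set lookup: {set_time:.5f}s")

# sys.getsizeof reports the size of the container object itself, in bytes
list_bytes = sys.getsizeof(numbers_list)
set_bytes = sys.getsizeof(numbers_set)
print(list_bytes, set_bytes)
```

Note that sys.getsizeof is shallow: it does not add up the sizes of the items stored inside the container.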

### Checking for PEP-8 errors

    import pycodestyle

    # Create a StyleGuide instance
    style_checker = pycodestyle.StyleGuide()

    # Run PEP 8 check on multiple files
    result = style_checker.check_files(['nay_pep8.py', 'yay_pep8.py'])

    # Print result of PEP 8 style check
    print(result.messages)

#### Profilers

##### line profiler
must be installed first: pip install line_profiler

load: %load_ext line_profiler

- %lprun: Run code with the line-by-line profiler: ex: %lprun -f funcname funcname(arguments)

##### memory profiler
must be installed first: pip install memory_profiler

load: %load_ext memory_profiler

- %mprun: Run code with the line-by-line memory profiler: ex: %mprun -f funcname funcname(args)
    - function must be saved as a file in order for memory_profiler to work
    - then file imported: from saved_func_file import funcname

#### Pulling up Docstrings
You can even see how a method works by pulling up its docstring!
You can do this by writing ? after the method and running the cell.
- pd.read_csv?
- pd.read_csv() #Move your cursor inside the parentheses and press shift+tab

### Help function

Calling help() with just about anything in the parentheses will bring up useful information on it

- help(function)
- help(package)
- help(datatype)
- help(value)

#### Main data types
- boolean = True / False
- integer = 10
- float = 10.01
- string = "123abc"
- list = [ value1, value2, … ]
- dictionary = { key1:value1, key2:value2, …} 

#### Numeric operators
- \+ addition
- \- subtraction
- \* multiplication
- / division
- ** exponent
- % modulus
- // floor division


#### Comparison Operators
- == equal to
- != not equal to
- \> greater than
- < less than
- \>= greater than or equal to
- <= less than or equal to

#### Boolean Operators
- and logical AND
- or logical OR
- not logical NOT

#### Special Characters
- _#  - comment_
- _\n  - new line_

#### String Operators
- string[i] retrieves character at position i
- string[-1] retrieves last character
- string[i:j] retrieves characters from position i up to, but not including, position j
- string[i:j:k] retrieves every k-th character from position i up to, but not including, position j
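
A short sketch of the indexing and slicing rules above, using a made-up string (text is a hypothetical variable name):

```python
text = "python"

first_char = text[0]        # character at position 0
last_char = text[-1]        # last character
middle = text[1:4]          # positions 1, 2 and 3 (4 is not included)
every_other = text[0:6:2]   # positions 0, 2 and 4

print(first_char, last_char, middle, every_other)
```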


#### List Operations
- list = [] defines an empty list
- list[i] = x stores x with index i
- list[i] retrieves the item with index i
- list[-1] retrieves last item
- list[i:j] retrieves items from index i up to, but not including, index j
- del list[i] removes the item with index i

#### List Comprehension

**preferred over using for loops to iterate through lists**

- example: [(2\*i, 2\*j) for i in nums for j in nums if j > 10]
    - .......return.........for loop.......for loop........condition


for loop

    gen1_gen2_name_lengths_loop = []

    for name,gen in zip(poke_names, poke_gens):
        if gen < 3:
            name_length = len(name)
            poke_tuple = (name, name_length)
            gen1_gen2_name_lengths_loop.append(poke_tuple)
            
_*below performs same operation, only faster*_

list comprehension

    gen1_gen2_pokemon = [name for name,gen in zip(poke_names, poke_gens) if gen < 3]
    
map names

    name_lengths_map = map(len, gen1_gen2_pokemon)
    
combine lists

    gen1_gen2_name_lengths = [*zip(gen1_gen2_pokemon, name_lengths_map)]
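
To check that both approaches give the same result, here is a runnable sketch. The contents of poke_names and poke_gens are made-up sample data, since the original lists are not shown in these notes.

```python
# Hypothetical sample data standing in for poke_names and poke_gens
poke_names = ["Bulbasaur", "Chikorita", "Treecko", "Pikachu"]
poke_gens = [1, 2, 3, 1]

# for loop version
gen1_gen2_name_lengths_loop = []
for name, gen in zip(poke_names, poke_gens):
    if gen < 3:
        gen1_gen2_name_lengths_loop.append((name, len(name)))

# list comprehension + map + zip version
gen1_gen2_pokemon = [name for name, gen in zip(poke_names, poke_gens) if gen < 3]
name_lengths_map = map(len, gen1_gen2_pokemon)
gen1_gen2_name_lengths = [*zip(gen1_gen2_pokemon, name_lengths_map)]

print(gen1_gen2_name_lengths)
```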

#### Dictionary Operations
- dict = {} defines an empty dictionary
- dict[k] = x stores x associated to key k
- dict[k] retrieves the item with key k
- del dict[k] removes the item with key k

#### String Methods
- string.upper() converts to uppercase
- string.lower() converts to lowercase
- string.count(x, start, end) counts how many times x appears (optional start and end points)
- string.find(x,start,end) returns the index of the first occurrence of x (optional start and end points) - if x is not found, returns -1
- string.index(x) - same as find(), only raises a ValueError ('substring not found') instead of returning -1
- string.replace(x,y,count) replaces x with y (optional count - # of occurrences to replace)
- string.strip(x) removes the characters in x from the beginning and end of a string (default is whitespace) - can be lstrip (remove left) or rstrip (remove right)
- separator.join(L) returns a string made of the values in L (a list of strings) joined by separator
- string.format(x) returns a string that includes formatted x
- string.split(x) splits a string by the specified delimiter x (whitespace is default)
- string.splitlines() splits a string at line boundaries (\n)
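
A small sketch of several of the methods above on a made-up string (all variable names are hypothetical):

```python
raw = "  Hello, World  "

cleaned = raw.strip()                      # whitespace removed from both ends
position = cleaned.find("World")           # index of the first occurrence
missing = cleaned.find("xyz")              # -1 because "xyz" is not present

# index() raises ValueError where find() would return -1
try:
    cleaned.index("xyz")
    index_failed = False
except ValueError:
    index_failed = True

replaced = cleaned.replace("l", "L", 2)    # only the first 2 matches are replaced
parts = cleaned.split(", ")                # split on a delimiter
joined = "-".join(parts)                   # join a list back into one string
lines = "first\nsecond".splitlines()       # split at line boundaries

print(cleaned, position, missing, replaced, parts, joined, lines)
```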

#### String Interpolation (position formatting)

**'text{placeholder1} text{placeholder_n}'.format(placeholder1_value, placeholder_n_value)**

- can use variables:

    my_text = "Hello, {} World"
    
    placeholder = "Jason's"
    
    print(my_text.format(placeholder))
    
- can use index numbers to reorder the arguments ex: {0} = 1st argument, {1} = 2nd argument (negative index numbers are not supported)
- can use named placeholders ex: title = 'Title' - my_text = 'The Movie {title}'.format(title=title)
- format specifier - {0:f}% = format the argument at index 0 as a float
- can also round decimal places in float - {0:.2f}%
- can also use datetime - "Now it is: {:%Y-%m-%d %H:%M}".format(datetime.now())
- using dictionary terms:

    #Create a dictionary
    
    plan = {"field": courses[0],"tool": courses[1]}
    
    my_message = "If you are interested in {data[field]}, you can take the course related to {data[tool]}"
    
    print(my_message.format(data=plan))
    
##### literal formatting (also called f-strings)

    name = 'Jason'
    print(f'Hello, my name is {name}')
    
- conversions: ex: {name!r}

    !s - calls str() on the value
    
    !r - calls repr() on the value (printable representation)
    
    !a - calls ascii() on the value (like !r, but escapes non-ASCII characters)
    
- can call inline operations {num1 * num2} in literal formatting
- can also call functions {my_function(x)} in literal formatting
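
A brief sketch of the f-string features listed above. The helper shout() and the sample values are hypothetical and exist only for illustration.

```python
name = "Jason"
num1, num2 = 6, 7


def shout(text):
    """Hypothetical helper, used only to show a function call inside an f-string."""
    return text.upper() + "!"


greeting = f"Hello, my name is {name!r}"          # !r applies repr()
product = f"{num1} x {num2} = {num1 * num2}"      # inline operation
called = f"{shout(name)}"                         # function call
percent = f"{0.12345:.2f}%"                       # format specifier

print(greeting, product, called, percent, sep="\n")
```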

##### template formatting

    from string import Template
    
    our_tool = tools[0]
    our_fee = tools[1]
    our_pay = tools[2]

    course = Template("We are offering a 3-month beginner course on $tool just for $$ $fee ${pay}ly")
    
    print(course.substitute(tool=our_tool, fee=our_fee, pay=our_pay))

#### List methods
- list.append(x) adds x to the end of the list
- list.extend(L) appends the items of L to the end of the list
- list.insert(i,x) inserts x at i position
- list.remove(x) removes the first list item whose value is x
- list.pop(i) removes the item at position i and returns its value (defaults to the last item)
- list.clear() removes all items from the list
- list.index(x) returns the index of the first item whose value is x
- list.count(x) returns the number of times x appears in the list
- list.sort() sorts list items
- list.reverse() reverses list elements
- list.copy() returns a copy of the list
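
A quick sketch that walks through the list methods above on a made-up list (letters is a hypothetical name); the comments show the list after each step.

```python
letters = ["b", "a"]

letters.append("c")             # ['b', 'a', 'c']
letters.extend(["a", "d"])      # ['b', 'a', 'c', 'a', 'd']
letters.insert(0, "z")          # ['z', 'b', 'a', 'c', 'a', 'd']
first_a = letters.index("a")    # index of the first 'a'
count_a = letters.count("a")    # how many times 'a' appears
popped = letters.pop()          # removes and returns the last item
letters.remove("z")             # ['b', 'a', 'c', 'a']
letters.sort()                  # sorts in place: ['a', 'a', 'b', 'c']

print(letters, first_a, count_a, popped)
```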

#### Dictionary Methods
- dict.keys() returns a view of the keys
- dict.values() returns a view of the values
- dict.items() returns a view of the pairs (key,value)
- dict.get(k) returns the value associated to the key k (None if k is missing)
- dict.pop(k) removes the item associated to the key k and returns its value
- dict.update(D) adds keys-values (D) to dictionary
- dict.clear() removes all keys-values from the dictionary
- dict.copy() returns a copy of the dictionary
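
A small sketch of the dictionary methods above, using a made-up prices dictionary (all names and values are hypothetical):

```python
prices = {"apple": 1.5, "pear": 2.0}

prices["kiwi"] = 0.5                          # store a new key
missing = prices.get("plum")                  # None, no error for a missing key
fallback = prices.get("plum", 0.0)            # optional default value
removed = prices.pop("pear")                  # removes the key, returns its value
prices.update({"apple": 1.75, "fig": 3.0})    # overwrite and add keys
pairs = list(prices.items())                  # convert the view to a list

print(prices, missing, fallback, removed, pairs)
```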


#### Built-in functions

- print(x, sep='y') prints x objects separated by y
- input(s) prints s and waits for an input that will be returned
- len(x) returns the length of x (string, list or dict)
- min(L) returns the minimum value in L
- max(L) returns the maximum value in L
- sum(L) returns the sum of the values in L
- range(n1,n2,n) returns a sequence of numbers from n1 up to, but not including, n2 in steps of n - can enter the stop number only if starting at 0 - must be converted to list to manipulate
- abs(n) returns the absolute value of n
- round(n1,n) returns the n1 number rounded to n digits
- type(x) returns the data type of x (string, float, list, dict …)
- str(x) converts x to string
- list(x) converts x to a list
- int(x) converts x to an integer number
- float(x) converts x to a float number
- help(x) prints help about x
- map(function, L) Applies function to values in L
    - using lambda (anonymous function) with map() - map(lambda x: _one-time function_, L)
    - saves using a for loop to iterate through data
- zip(L1, L2) - combines two or more iterables into a zip object of paired tuples
    - must be unpacked or converted to a list to view the paired tuples
    - can specify # of items to combine from each list by slicing with [#start:#finish]
- enumerate(values, start=n) creates a numbered indexed pair for each value ex: (0,'a')(1,'b') - can set starting index with the start argument - must be converted to list to manipulate
- [\*function(arguments)] - unpacks the iterable returned by a function (ex: map or zip) into a list
- {\*\*dict1, \*\*dict2} - unpacks dictionaries into a new dictionary
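
A compact sketch of map, zip, enumerate and unpacking from the list above. The sample lists and the defaults dictionary are made up for illustration.

```python
names = ["ann", "bob", "cy"]
ages = [31, 25, 40]

upper_names = list(map(str.upper, names))            # apply a function to each item
squares = list(map(lambda x: x ** 2, range(1, 4)))   # lambda with map
pairs = [*zip(names, ages)]                          # unpack a zip object into a list
first_two = [*zip(names[0:2], ages[0:2])]            # slice to limit the items combined
numbered = list(enumerate(names, start=1))           # index pairs starting at 1

defaults = {"sep": "-"}
merged = {**defaults, "end": "!"}                    # unpack a dict into a new dict

print(upper_names, squares, pairs, first_two, numbered, merged, sep="\n")
```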

#### Collections Module

import collections

**provides alternate data types to lists, tuples, sets and dicts**

- Counter - returns a dict-like object that counts each item; most_common() lists the (item, count) pairs in order from high->low

#### Itertools Module

import itertools

**provides alternate ways to iterate through data in lieu of loops**

- combinations(variable, # of items per comb.) - results in every combination of that many items from variable, without repeating an item or a pairing
    - must be unpacked to view as list of tuples
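
A minimal sketch of combinations, assuming a made-up list of three items (poke_types is a hypothetical name):

```python
from itertools import combinations

poke_types = ["Bug", "Fire", "Ghost"]   # hypothetical sample data

# combinations returns an iterator, so unpack it to view the tuples
type_pairs = [*combinations(poke_types, 2)]

print(type_pairs)
```

Order does not matter in a combination, so ('Fire', 'Bug') is not listed separately from ('Bug', 'Fire').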

#### Conditional Statements
- if (condition) :
 
- elif (condition) :

- else:

- if (value) in (list):

#### Data Validation
- try:

- except (error):

- else:
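
A short sketch of the try / except / else pattern. The function to_int() is a hypothetical helper written only for this example.

```python
def to_int(text):
    """Hypothetical helper: return text as an int, or None if it is not a valid integer."""
    try:
        value = int(text)
    except ValueError:
        # runs only when int() raised a ValueError
        return None
    else:
        # runs only when no exception was raised
        return value


print(to_int("42"), to_int("abc"))
```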


#### Working with Files and Folders
- import os
- os.getcwd()
- os.makedirs(path)
- os.chdir(path)
- os.listdir(path)


#### Loops
- while - runs while condition is met
- for - runs through each item specified
- nested for loop - loop within a loop

- if - conditional statement (followed by a condition)
- elif - adding another conditional statement if needed
- else - use at the end of the if/elif chain for when the if and/or elif conditions are not met

##### best practices
1. understand what is being done in each loop iteration
2. move one-time calculations outside (above the loop)
3. use holistic conversions outside (below) the loop
4. anything done once should be outside the loop
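
One possible illustration of these practices, with made-up data (names, total_chars and the other variables are hypothetical): the total is calculated once above the loop, and the conversion to a dictionary happens once below it.

```python
names = ["ann", "bob", "cy"]   # hypothetical sample data

# one-time calculation, moved above the loop
total_chars = sum(len(name) for name in names)

name_shares = []
for name in names:
    # only the work that differs per item stays inside the loop
    name_shares.append((name, len(name) / total_chars))

# holistic conversion, done once below the loop
shares_by_name = dict(name_shares)

print(shares_by_name)
```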

#### Loop Control Statements
- break - finishes loop execution
- continue - jumps to next iteration
- pass - does nothing
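
A small sketch showing all three statements in a loop (the variable names are made up):

```python
found = []
for number in range(10):
    if number % 2 == 0:
        continue          # skip even numbers, jump to the next iteration
    if number > 7:
        break             # stop the loop entirely
    found.append(number)

for number in range(3):
    pass                  # placeholder: the loop runs but does nothing

print(found)
```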


#### Running External Programs
- import os
- os.system(command)

#### Functions
- def function(arg1, arg2, *args, kwarg1, kwarg2, **kwargs):
- return (data)
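
A sketch of how each kind of parameter in the signature above receives its values. The function summarize() and the default for kwarg2 are hypothetical, added only for illustration.

```python
def summarize(arg1, arg2, *args, kwarg1, kwarg2=None, **kwargs):
    """Hypothetical function that reports where each argument ended up."""
    return {
        "positional": (arg1, arg2),
        "extra_positional": args,          # tuple of any extra positional arguments
        "keyword_only": (kwarg1, kwarg2),  # must be passed by name
        "extra_keyword": kwargs,           # dict of any extra keyword arguments
    }


result = summarize(1, 2, 3, 4, kwarg1="a", color="red")
print(result)
```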

#### Generators
Functions that yield their results one at a time, creating an iterable that is only evaluated as it is looped over
- def function(params):
- yield func(params)
- can create a comprehension for a generator:
    - gen = (((i, j), i + j) for i in func1(8) for j in func2(10))
                  yield          for loop         for loop         condition (if, when,etc)
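
A minimal sketch of a generator function and a generator comprehension. countdown() is a hypothetical example function, and range() stands in for func1 and func2.

```python
def countdown(start):
    """Hypothetical generator: yields start, start - 1, ... down to 1."""
    while start > 0:
        yield start
        start -= 1


values = list(countdown(3))   # a generator must be looped over or converted to view it

# generator comprehension: parentheses instead of square brackets
gen = (((i, j), i + j) for i in range(2) for j in range(2) if i != j)
pairs = list(gen)

print(values, pairs)
```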

#### Modules
- import module, then call module.function()

- from module import * (or from module import function), then call function()


#### Reading and Writing Files
- f = open(path,'r')
- f.read(size)
- f.readline(size)
- f.close()

- f = open(path,'r')
- for line in f:

- f.close() 

- f = open(path,'w')
- f.write('str')
- f.close()
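
A runnable sketch of writing and then reading a file. It uses a with block, which is not in the notes above but closes the file automatically, so f.close() is not needed. The temporary folder and the file name example.txt are assumptions made for the example.

```python
import os
import tempfile

# Hypothetical file location inside a fresh temporary folder
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# 'w' mode creates the file (or overwrites it) and writes text
with open(path, "w") as f:
    f.write("first line\nsecond line\n")

# 'r' mode reads; looping over f yields one line at a time
with open(path, "r") as f:
    lines = [line.rstrip("\n") for line in f]

print(lines)
```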

#### Sets
- set(l) - Returns a set object
- set1.intersection(set2) - Returns a set of items common to both sets
- set1.difference(set2) - Returns a set of items in set1 not in set2
- set1.symmetric_difference(set2) - Returns a set of items that are in exactly one of the two sets
- set1.union(set2) - Returns a set of all items from both sets, without duplicates
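
A quick sketch of the four set operations on two made-up sets:

```python
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5}

common = set1.intersection(set2)               # in both sets
only_in_set1 = set1.difference(set2)           # in set1 but not in set2
in_one_only = set1.symmetric_difference(set2)  # in exactly one of the sets
everything = set1.union(set2)                  # all items, no duplicates

print(common, only_in_set1, in_one_only, everything)
```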

#### Regular Expressions
- import re - Import the Regular Expressions module
- re.search(r"abc",string) - Returns a match object if the regex "abc" is found anywhere in string, otherwise None
- re.match(r"abc",string) - same as search, but only matches at the beginning of string
- re.split(r'delimiter',string) - splits string by delimiter provided
- re.sub(r"abc","xyz",string) - Returns a string where all instances matching regex "abc" are replaced by "xyz"

##### metacharacters
- \d = digit (ex: 'User9' returns from re.findall(r'User\d',string))
- \D = non-digit (ex: 'UserN' returns from re.findall(r'User\D',string))
- \w = word character - any letter, digit, or underscore
- \W = non-word character - anything that is not a letter, digit, or underscore
- \s = whitespace
- \S = non-whitespace - any non-whitespace character
- . - matches any character (except a newline)
- ^ - matches at the start of the string (^string)
- $ - matches at the end of the string (string$)
- \ - put in front of a character that has another operation to match it literally
- | - basically an 'or' operator cat|dog|bird
- [] - can use to denote a set of values [a-zA-Z] [0-9] and symbols [%^&!]
- [^] - use to negate a set [^0-9] = match anything that is not a number
- () - groups regex terms
- (?:regex) - will match but not return what is in the parentheses (non-capturing)
##### quantifiers - applies only to the character on its left
- \+ - shows up one or more times (ex: 04-13 = \d+-\d+)
- \* - shows up zero or more times
- ? - shows up zero or one time (placed after another quantifier, it converts a greedy search to a lazy search)
- {n,m} - shows up minimum n times to maximum m times
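
A short sketch of the functions, metacharacters and quantifiers above. The sample strings are made up for illustration.

```python
import re

text = "User9 logged in on 04-13 and UserN failed"   # hypothetical sample string

digit_users = re.findall(r"User\d", text)        # 'User' followed by a digit
non_digit_users = re.findall(r"User\D", text)    # 'User' followed by a non-digit
dates = re.findall(r"\d+-\d+", text)             # one or more digits, a dash, more digits
pets = re.findall(r"cat|dog|bird", "a cat and a dog")   # | works as 'or'

greedy = re.search(r"<.+>", "<b>bold</b>").group()   # + takes as much as it can
lazy = re.search(r"<.+?>", "<b>bold</b>").group()    # +? takes as little as it can

at_start = re.match(r"logged", text)             # None: 'logged' is not at the start
masked = re.sub(r"\d", "#", "a1b22")             # replace every digit

print(digit_users, non_digit_users, dates, pets, greedy, lazy, at_start, masked)
```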
##### backreference groups
    for string in html_tags:
        #Complete the regex and find if it matches a closed HTML tags
        match_tag =  re.match(r"<(\w+)>.*?</\1>", string)
 
        if match_tag:
            #If it matches print the first group capture
            print("Your tag {} is closed".format(match_tag.group(1))) 
        else:
            #If it doesn't match capture only the tag 
            notmatch_tag = re.match(r"<(\w+)>", string)
            #Print the first group capture
            print("Close your {} tag!".format(notmatch_tag.group(1)))
##### lookaround
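- regex(?=reference) = positive lookahead - matches regex only if it is followed by reference
- regex(?!reference) = negative lookahead - matches regex only if it is not followed by reference
- (?<=reference)regex = positive lookbehind - matches regex only if it is preceded by reference
- (?<!reference)regex = negative lookbehind - matches regex only if it is not preceded by reference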
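
A small sketch of the four lookarounds on a made-up string. The \b word boundaries are an addition, used here so that a number is only matched as a whole.

```python
import re

text = "price: 100 USD, cost: 200 EUR"   # hypothetical sample string

before_usd = re.findall(r"\d+(?= USD)", text)            # number followed by ' USD'
not_before_usd = re.findall(r"\b\d+\b(?! USD)", text)    # number not followed by ' USD'
after_cost = re.findall(r"(?<=cost: )\d+", text)         # number preceded by 'cost: '
not_after_cost = re.findall(r"(?<!cost: )\b\d+\b", text) # number not preceded by 'cost: '

print(before_usd, not_before_usd, after_cost, not_after_cost)
```

The reference text is only checked, never returned, which is why each result holds just the number.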

#### Datetime
- import datetime as dt - Import the datetime module
- now = dt.datetime.now() - Assign datetime object representing the current time to now
- wks4 = dt.timedelta(weeks=4) - Assign a timedelta object representing a timespan of 4 weeks to wks4

- newyear_2020 = dt.datetime(year=2020, month=12, day=31) - Assign a datetime object representing December 31, 2020 to newyear_2020

- newyear_2020.strftime("%A, %b %d, %Y") - Returns "Thursday, Dec 31, 2020"
- dt.datetime.strptime('Dec 31, 2020',"%b %d, %Y") - Return a datetime object representing December 31, 2020
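
A runnable sketch that ties the datetime calls above together. The names four_weeks_later and elapsed_seconds are hypothetical, and the day and month names assume the default English locale.

```python
import datetime as dt

newyear_2020 = dt.datetime(year=2020, month=12, day=31)
wks4 = dt.timedelta(weeks=4)

formatted = newyear_2020.strftime("%A, %b %d, %Y")          # datetime -> string
parsed = dt.datetime.strptime("Dec 31, 2020", "%b %d, %Y")  # string -> datetime

four_weeks_later = newyear_2020 + wks4     # datetime + timedelta = datetime
elapsed_seconds = wks4.total_seconds()     # length of the timedelta in seconds

print(formatted, parsed, four_weeks_later, elapsed_seconds, sep="\n")
```

The format string given to strptime has to match the layout of the text exactly, including commas and spaces.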

##### dateutil
has the tz module, which holds timezone database information
 - use tz.gettz('Continent/City') to load that particular timezone
 - timezone.utc - sets the timezone to UTC
##### timedelta
class that represents a change in time
 - timedelta(hours=-3)
 - .total_seconds() - total time in seconds that the timedelta spans
##### daylight savings
use the tz module, it accounts for DST, otherwise, use the UTC timezone
 - tz.datetime_ambiguous(data) - returns True if the time occurs twice in that timezone (ex: when clocks are set back at the end of DST)
 - data.astimezone(timezone) - converts the datetime to another timezone (returns a new datetime, so assign the result to keep it)


#### Random
- import random - Import the random module
- random.random() - Returns a random float between 0.0 and 1.0 (1.0 not included)
- random.randint(0,10) - Returns a random integer between 0 and 10 (both included)
- random.choice(l) - Returns a random item from the list l

#### Counter
from collections import Counter - Import the Counter class

c = Counter(l) - Assign a Counter (dict-like) object with the counts of each unique item from l, to c

c.most_common(3) - Return the 3 most common items from l, each paired with its count
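
A minimal sketch of Counter on a made-up list (letters is a hypothetical name standing in for l):

```python
from collections import Counter

letters = ["a", "b", "a", "c", "a", "b"]   # hypothetical sample data

c = Counter(letters)
top_two = c.most_common(2)   # list of (item, count) pairs, highest count first
count_of_a = c["a"]          # counts can be looked up like dictionary values

print(c, top_two, count_of_a)
```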

#### Magic Commands
- %lsmagic: provides a list of all magic commands
- %time: Time the execution of a single statement
- %timeit: Time repeated execution of a single statement for more accuracy
- %%timeit: Runs timeit on multiple lines of code
    - set number of runs for timeit: %timeit -r _number_
    - set number of loops for timeit: %timeit -n _number_
    - %timeit -o: saves the output to a variable for comparing run times (ex: times = %timeit -o funcname())
        - times.timings - shows all saved times
        - times.best - shows best time
        - times.worst - shows worst time


#### Command Mode (Jupyter)
- shift + enter run cell, select below
- ctrl + enter run cell
- option + enter run cell, insert below
- A insert cell above
- B insert cell below
- C copy cell
- V paste cell
- D , D delete selected cell
- shift + M merge selected cells, or current cell with cell below if only one cell selected
- I , I interrupt kernel
- 0 , 0 restart kernel (with dialog)
- Y change cell to code mode
- M change cell to markdown mode (good for documentation)

#### Cell Edit Mode
- ctrl + click for multi-cursor editing
- ctrl + / toggle comment lines
- tab code completion or indent
- shift + tab tooltip
- ctrl + shift + - split cell