# Coding Best Practices
- What constitutes "good" and "bad" code is somewhat subjective, but there are commonly agreed-upon coding practices that fall into both categories
- This section is divided into 4 parts:
    1. Code Smell (bad code)
    1. Pythonic Code (good code)
    1. General Reminders
    1. Tips for Writing Functions

---

## Code Smell
- The smell of mercaptan (added to natural gas) can help people identify a gas leak before something blows up
- The smell of bad code can help people identify problems in code before their code blows up
- **Code smell**--bad coding practice that could potentially lead to a bug

**CODE SMELLS**

1. **Duplicate code**--if there are duplicate chunks of the same code it is tedious to edit because we must perform the same edit on multiple chunks.  We can deduplicate by placing chunks inside functions or adding looping structures.  The longer the code chunks or the more the chunk is repeated, the smellier the code.

1. **Magic numbers**--numbers within code can be mysterious (*like magic!*) if there is no explanation for why they were chosen.  It is good practice to add a comment.  If the number should not be modified, it can be written as a constant.  To do this, place an assignment statement near the top of the script that assigns the number to a variable named in all uppercase letters, with underscores as needed.  Other data types can be made into constants in the same way.  This is not an official "constant object".  It is a normal assignment statement, but the all-caps name is a hint to other programmers that the variable should not be modified.

1. **Commented out code**--commenting out code can be useful for debugging.  If commented out code is left in, other readers may wonder if/when the code should ever be used.  Commented out code should be removed once debugging is finished.

1. **Dead code**--dead code is code that will never actually run because of the structure of the script.  E.g. a function definition for a function that is never called, or lines of code below a return statement in a function definition.

1. **Print debugging**--print debugging is when we temporarily place print statements into code to display what variables contain.  This is good for very short scripts to help see what variables may be contributing to bugs.  The practice is considered a code smell because when used in longer scripts it can be tedious to remove them and it is easy to accidentally leave them in.  Instead, debuggers can be used to view variable contents and data types.  Loggers can be used to record information onscreen or to a file.  Logs can be customized to include variable contents and data types.

1. **Variables with numeric suffixes**--variables like `password1`, `password2`, `password3` are differentiated by numbers, but it is unclear what these numbers mean.  Instead, use descriptive variable names.  These situations might also call for a collection and a looping structure.

1. **Classes that should just be functions or modules**--in some other programming languages, functions must be written within a class definition.  In Python it is often simpler to just create a function definition within a module and avoid an extra, unneeded class definition.  That being said, if there are many user-defined functions that all work on the same data type, it can be helpful to make that data type a class and turn those functions into methods of the class.

1. **List comprehensions within list comprehensions**--when a comprehension is placed within another comprehension it becomes hard to comprehend. Similarly, it is possible for a single comprehension to include two for statements within it.  This is hard to comprehend.  In both circumstances it would be better to use 1 for loop and 1 comprehension.  Alternatively, we could use 2 for loops, or even generator objects.

1. **Empty except blocks and poor error messages**--except blocks can be empty (containing only a pass statement) to prevent an exception from being raised.  This likely only hides a problem temporarily instead of addressing it.  Except statements may also be bare excepts that are not tailored to any specific exception.  Instead, we should avoid bare excepts, tailor except statements to the problem at hand, and provide a detailed message to the user that explains how they can fix the problem.

**CODE SMELL MYTHS**

- A few practices are often called code smells but are actually fine.  These "smells" can be ignored:
1. **Myth: functions should have only one return statement at the end.**  The myth comes from the FORTRAN language, in which functions were supposed to have "one entry, one exit".

1. **Myth: flag arguments are bad.**  Flag arguments are binary parameters in a function that are often specified using Boolean values.  Functions are supposed to do only one thing, and the worry is that flag arguments could allow functions to do multiple, unrelated things.  In reality, most flag arguments simply tweak the function's behavior slightly, like any other function argument.
    - E.g. in `sorted([1,4,2], reverse=True)`, `reverse=True` is a flag argument
    
1. **Myth: functions should have at most one try statement.**  Some say that try except blocks of code should be made into their own local function and be called within an enclosing function.  However, this leads to unnecessary complexity.

1. **Myth: global variables are bad.**  This is only half true.  Global variables are often bad.  This is because every global variable effectively becomes an argument for functions in the future.  More arguments mean more complexity and bugs.  It can be hard to track down where global variables are set and understand why they have the values they do.  The exceptions to the rule are global variables that represent constants.  These are often placed at the top of a script.  Because they are assigned right away and never changed, they are unlikely to be the cause of any bugs.

1. **Myth: comments are unnecessary.**  Use comments that explain the "why" of code.
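
To illustrate the first myth, here is a minimal sketch of a function with several return statements.  The function name and temperature thresholds are hypothetical and chosen only for illustration.  Each early return handles one case and exits, which is arguably easier to follow than carrying a result variable down to a single return at the end.

```python
def describe_temperature(celsius):
    """Return a short description of a temperature (hypothetical example)."""
    if celsius is None:
        return "unknown"  # Early exit for missing data
    if celsius < 0:
        return "freezing"
    if celsius < 25:
        return "mild"
    return "hot"


print(describe_temperature(None))
print(describe_temperature(-5))
print(describe_temperature(30))
```

Note that every return statement outputs a string, so callers never have to guess which data type they will get back.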

---

**EXAMPLES**

**Duplicate Code**

- Original duplicate code

In [1]:
print('Hello World!')
print('What is your astrological sign?')

print('Hello Moon!')
print('What is your astrological sign?')

print('Hello Sun!')
print('What is your astrological sign?')

Hello World!
What is your astrological sign?
Hello Moon!
What is your astrological sign?
Hello Sun!
What is your astrological sign?


- Function definition

In [2]:
def salutation(subject):
    print(f'Hello {subject}!')
    print('What is your astrological sign?')
salutation('World')
salutation('Moon')
salutation('Sun')

Hello World!
What is your astrological sign?
Hello Moon!
What is your astrological sign?
Hello Sun!
What is your astrological sign?


- Loop

In [3]:
for subject in ['World', 'Moon', 'Sun']:
    print(f'Hello {subject}!')
    print('What is your astrological sign?')

Hello World!
What is your astrological sign?
Hello Moon!
What is your astrological sign?
Hello Sun!
What is your astrological sign?


- Function definition and loop

In [4]:
def salutation(subject):
    print(f'Hello {subject}!')
    print('What is your astrological sign?')
for subject in ['World', 'Moon', 'Sun']:
    salutation(subject)

Hello World!
What is your astrological sign?
Hello Moon!
What is your astrological sign?
Hello Sun!
What is your astrological sign?


**Magic Numbers**

In [5]:
SECONDS_IN_YEAR = 31536000
SUMMER_MONTHS = ['June', 'July', 'August'] 

**Dead Code**

In [6]:
def salutation(subject):
    return f'Hello {subject}!'
    return 'What is your astrological sign?'  # Dead code.  Should be deleted.
salutation("World")

'Hello World!'

**Numerical Suffixes**

- Instead of using numerical suffixes to differentiate values, we can store the values in a collection

- Original numerical suffixes

In [7]:
password1 = "password"
password2 = "123456"
password3 = "111111"
print(password1)
print(password2)
print(password3)

password
123456
111111


- Collection

In [8]:
passwords = ("password", "123456", "111111")
for password in passwords:
    print(password)

password
123456
111111


**Unnecessary Classes**

- Original class and method

In [9]:
class Dice:
    def __init__(self):
        self.sides = 6
    def roll(self):
        import random
        return random.randint(1, self.sides)
dice_object = Dice()
print(dice_object.roll())

6


- Turned into a function

In [10]:
def roll():
    import random
    return random.randint(1, 6)
print(roll())

6


**Nested Comprehensions**

- Original nested comprehensions

In [11]:
outer_list = [['a', 'b'], [1, 2, 3]]
new_outer = [[j for j in inner_list] for inner_list in outer_list]
print(new_outer)

[['a', 'b'], [1, 2, 3]]


- 1 for loop and 1 comprehension

In [12]:
outer_list = [['a', 'b'], [1, 2, 3]]
new_outer = []
for inner_list in outer_list:
    new_outer.append([j for j in inner_list])
print(new_outer)

[['a', 'b'], [1, 2, 3]]


- 2 for loops

In [13]:
outer_list = [['a', 'b'], [1, 2, 3]]
new_outer = []
for inner_list in outer_list:
    new_inner = []
    for j in inner_list:
        new_inner.append(j)
    new_outer.append(new_inner)
print(new_outer)

[['a', 'b'], [1, 2, 3]]


- Comprehension with 2 for statements to flatten list

In [14]:
outer_list = [['a', 'b'], [1, 2, 3]]
flat_list = [j for inner_list in outer_list for j in inner_list]
print(flat_list)

['a', 'b', 1, 2, 3]


- 2 for loops to flatten list

In [15]:
outer_list = [['a', 'b'], [1, 2, 3]]
flat_list = []
for inner_list in outer_list:
    for j in inner_list:
        flat_list.append(j)
print(flat_list)

['a', 'b', 1, 2, 3]
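
**Empty Except Blocks**

The sketch below contrasts a bare, empty except block with an except statement tailored to one exception.  Both function names and the error message wording are hypothetical.

```python
def parse_age_smelly(text):
    """Smelly: the bare except hides every problem and silently returns None."""
    try:
        return int(text)
    except:
        pass


def parse_age(text):
    """Better: catch only the expected exception and explain how to fix it."""
    try:
        return int(text)
    except ValueError:
        raise ValueError(
            f"Could not read {text!r} as an age.  Enter a whole number such as 42."
        ) from None


print(parse_age_smelly("forty-two"))
print(parse_age("42"))

try:
    parse_age("forty-two")
except ValueError as error:
    message = str(error)
    print(message)
```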


---

## Pythonic Code
- **Pythonic**--good Python coding practices
- *The Zen of Python* lays out aphorisms for writing Pythonic code

1. Use `enumerate()` instead of `range(len())` to return both an index position and its associated item in a for loop. See *For Loops* section for examples.

1. Use a context manager instead of `open()` and `close()`.  If we use `open()` to open a file we later need to use `close()` to close it to prevent errors.  It is easy to forget to close it when there are many conditional statements.  Additionally, if any exception is raised after the file is opened and before the file is closed, the file stays open, which can lead to errors.  Instead, use a "context manager" `with open()` statement.  See *Read and Write* section for examples.
1. Use `is` to compare with `None` instead of `==`.  See *Identity Operator* for examples.
1. Use f-strings.  See *String Formatting* section for examples.
1. Use raw strings if our string has many backslashes instead of escaping with `\\`.  See *Raw Strings* section for examples.
1. Use `copy.copy()` to explicitly copy a collection instead of slicing the entire collection with `[:]`.  Though they do the same thing, when `[:]` is used it is not clear to the reader that a shallow copy is being created.  See *References and Copies* section for more information on shallow and deep copies.
1. Use the dictionary methods `.get()` and `.setdefault()` instead of if statements.  Normally, we'd need an if statement to check whether a key is in the dictionary before accessing it with `[key]` to avoid a `KeyError` exception.  `.get()` and `.setdefault()` allow us to skip these if statements.  See *Dictionary* sections for examples.
1. Use the collections module and `collections.defaultdict()` to set many default dictionary key values.  The dictionary method `.setdefault()` does this, but it can be tedious if we need to do this for every key.  The argument is a data type (or any other callable) that is called with no arguments to create the default value.  `int` uses `0` as default.  `float` uses `0.0` as default.  `str` uses `""` as default.  Collections use empty collections.  E.g. `list` uses `[]`.
1. "Switch" statements and dictionaries.  "Switch" statements are if-elif-else statements used for variable assignments.  These are Pythonic, but verbose.  Another way to retrieve a value based on a name is using a dictionary.  This is less clear, but more concise.  Either grammar can be used.
1. "Ternary" Operator-- if-else statements used for variable assignment (similar to "switch").  Ternary means three.  However, in programming it is synonymous with conditional expressions.  Instead of the standard, multi-lined if-else statement, "ugly" one line ternary statements can be used.  Ternary statements were introduced in Python because programmers were using an even worse short-circuit evaluation that could lead to bugs.  Variable assignment in multi-line if-else statements is by far easiest to read, ternary statements are concise but hard to read, and short-circuit should not be used as it can introduce bugs.
1. Use chained assignments and chained comparison operators instead of the `and` operator.
1. Use `in` instead of multiple or-equals statements.
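
As a quick illustration of the first item in the list above, the sketch below loops over a made-up list both ways.  The two loops print the same thing, but the `enumerate()` version avoids indexing into the list by hand.

```python
planets = ["Mercury", "Venus", "Earth"]

# Unpythonic: build index positions, then look each item up.
for i in range(len(planets)):
    print(i, planets[i])

# Pythonic: enumerate() yields (index, item) pairs directly.
for i, planet in enumerate(planets):
    print(i, planet)
```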

---

**EXAMPLES**

In [16]:
import collections

**`collections.defaultdict()`**

In [17]:
my_dict = {}
print(my_dict.setdefault("key1", 0))
print(my_dict.setdefault("key2", 0))
print(my_dict.setdefault("key3", 0))

0
0
0


In [18]:
my_dict = collections.defaultdict(int)
print(my_dict["key1"])
print(my_dict["key2"])
print(my_dict["key3"])

0
0
0
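
- The sketch below compares an if statement guard with `.get()` and `.setdefault()`, then uses `collections.defaultdict(list)` to group items.  The dictionary contents are made up for illustration.

```python
import collections

inventory = {"apples": 3}

# if statement guard to avoid a KeyError
if "pears" in inventory:
    pears = inventory["pears"]
else:
    pears = 0

# .get() returns the default without changing the dictionary
pears = inventory.get("pears", 0)
print(pears, inventory)

# .setdefault() returns the default and also stores it under the missing key
pears = inventory.setdefault("pears", 0)
print(pears, inventory)

# defaultdict(list) creates an empty list the first time each key is used
words_by_letter = collections.defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    words_by_letter[word[0]].append(word)
print(dict(words_by_letter))
```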


**Switch statements and dictionaries**

- The first example is a "switch" statement that conditionally assigns values to variables
- The second example uses keys in a dictionary to retrieve values

In [19]:
season = "Fall"
if season == "Winter":
    holiday = "New Year's Day"
elif season == "Spring":
    holiday = "May Day"
elif season == "Summer":
    holiday = "Juneteenth"
elif season == "Fall":
    holiday = "Halloween"
else:
    holiday = "Personal Day"
print(holiday)

Halloween


In [20]:
my_dict = {
    "Winter": "New Year's Day",
    "Spring": "May Day",
    "Summer": "Juneteenth",
    "Fall": "Halloween"}
print(my_dict["Fall"])
print(my_dict.get("Other Season", "Personal Day"))  # Like else statement above

Halloween
Personal Day


**Ternary**
- The first example is a standard multi-line if-else assignment statement
- The second example is a ternary operator assignment statement
- The third example is a short-circuit evaluation assignment statement
- The fourth example is a buggy short-circuit evaluation assignment statement.  If the value for True is a "falsey" value (False, None, 0, 0.0, "", [], etc.), then the `and` expression evaluates as falsey even though the condition is True, and the value for False is assigned instead.  This would likely not be the intended outcome.

In [21]:
x = 0
if x < 1:
    my_variable = "Value when true"
else:
    my_variable = "Value when false"
print(my_variable)

Value when true


In [22]:
x = 0
my_variable = "Value when true" if x < 1 else "Value when false"
print(my_variable)

Value when true


In [23]:
x = 0
my_variable = x < 1 and "Value when true" or "Value when false"
print(my_variable)

Value when true


In [24]:
x = 0
my_variable =  x < 1 and x or "Value when false"
print(my_variable)

Value when false


**Chaining Operators and Assignments**

- The first example shows an unpythonic `and` operator
- The second example shows pythonic chaining comparison operators
- The third example shows an unpythonic `and` operator
- The fourth example shows pythonic chaining assignment and equality operators

In [25]:
x = 0
if x < 5 and x > -1:
    print("Conditions true")

Conditions true


In [26]:
x = 0
if -1 < x < 5:
    print("Conditions true")

Conditions true


In [27]:
x = y = z = 0
if x == y and x == z:
    print("Conditions true")

Conditions true


In [28]:
x = y = z = 0
if x == y == z:
    print("Conditions true")

Conditions true


**`in`**

- The first example shows unpythonic `or` statements
- The second example shows a pythonic `in` statement

In [29]:
my_string = "c"
if my_string == "a" or my_string == 'b'  or my_string == 'c':
    print("It's a match")  

It's a match


In [30]:
my_string = "c"
if my_string in ("a", "b", "c"):
    print("It's a match")  

It's a match
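
**Context Manager**

- A minimal sketch of `with open()` statements.  The file name is hypothetical, and a temporary directory is used only so that the example does not touch real files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.txt")  # Hypothetical file name

    # The file is closed automatically when the with block ends,
    # even if an exception is raised inside it.
    with open(path, "w") as file_object:
        file_object.write("spam\n")

    with open(path) as file_object:
        contents = file_object.read()

print(repr(contents))
print(file_object.closed)
```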


---

## General Reminders
1. Don't add or delete items from a list while looping over that same list.  This changes the index positions of items between iterations and can lead to bugs.  Note that it is perfectly fine to modify/edit items in place with a loop as this does not affect the index positions.
    - Instead of adding:
        1. Create new empty list
        1. For loop that iterates over original list and appends new items to new list
        1. Extend original list with new list
    - Instead of deleting:
        1. Create new empty list
        1. For loop that iterates over original list and appends only the items we want to keep to new list
        1. Reassign new list to original variable name
1. Don't copy mutable values without `copy.copy()` or `copy.deepcopy()`.  A plain assignment statement only creates another reference to the same object.  See *References* section for more detail.
1. Don't use mutable objects for default arguments in function definitions.  This means that we should not use lists, sets, or dictionaries as default arguments.  Because they are mutable, they can be changed during a function call.  The next time that function is run the default argument would be changed and we'd get unexpected results.
1. Don't use for loops and string concatenation or string formatting to create new strings.  Instead, append to a list and then join list.  This increases speed.  The more iterations the more it matters.
1. Don't expect `sort()` to sort alphabetically.  It sorts ASCIIbetically.  To sort alphabetically, use the argument `key=str.lower`.  See *ASCII and Unicode* section for more information.
1. Don't assume floating point numbers are perfectly accurate.  See *Floating Point* section for details.
1. Don't chain `!=` operators together.  `a != b != c` only checks that `a != b` and `b != c`.  It does not compare `a` with `c`, which is usually unexpected.
1. Don't forget the comma in single-item tuples.  With no comma, no tuple data type.
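1. Don't use `++` or `--`.  Unlike some languages, Python does not have increment and decrement operators.  Placed before a variable, these act as double positive signs and double negative signs and leave the value unchanged.  `x++` on its own is a syntax error.  Use `+= 1` or `-= 1` instead.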
1. Don't use `all()` on an empty collection.  The `all()` function inputs a collection and returns `True` if there are no falsey values in that collection.  If there are no values at all it also returns `True`.  This may be an unexpected result.
1. Don't use `True` and `False` for mathematics.  Boolean values are a subclass of int, so `True` actually evaluates to 1 while `False` evaluates to 0.  Convert these to integers explicitly to write clear code.
1. Don't chain multiple, different types of operators together.  Comparison operators such as `==` and `in` chain the same way `<` does, so the result can be surprising.
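
The first reminder in the list above can be sketched as follows.  The lists are made up for illustration.  The buggy loop skips items because each removal shifts the remaining items to the left.

```python
numbers = [1, 2, 2, 3, 4, 4]

# Buggy: deleting from the same list we are looping over.
buggy = numbers.copy()
for number in buggy:
    if number % 2 == 0:
        buggy.remove(number)
print(buggy)  # Some even numbers survive

# Instead of deleting: append the items to keep to a new list, then reassign.
kept = []
for number in numbers:
    if number % 2 != 0:
        kept.append(number)
numbers = kept
print(numbers)

# Instead of adding: append to a new list, then extend the original.
words = ["a", "b"]
new_words = []
for word in words:
    new_words.append(word.upper())
words.extend(new_words)
print(words)
```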

---

**EXAMPLES**

In [31]:
import time

- Don't use for loops with string concatenation/formatting to create a single long string.  Instead use lists.  The first example is slower than the second.

In [32]:
seconds_start = time.time()
final_string = ""
for i in range(100000):
    final_string = final_string + "spam"
seconds_end = time.time()
seconds_runtime = seconds_end - seconds_start
print(seconds_runtime)

0.024005413055419922


In [33]:
seconds_start = time.time()
my_list = []
for i in range(100000):
    my_list.append("spam")
final_string = "".join(my_list)
seconds_end = time.time()
seconds_runtime = seconds_end - seconds_start
print(seconds_runtime)

0.01100301742553711
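
- Don't use mutable objects as default arguments.  The function names below are hypothetical.  The default list is created once, when the function is defined, so the buggy version keeps growing the same list across calls.  A common workaround, sketched here, is to use `None` as the default and create the list inside the function.

```python
def add_topping_buggy(topping, toppings=[]):
    """Buggy: every call without `toppings` shares one default list."""
    toppings.append(topping)
    return toppings


print(add_topping_buggy("cheese"))
print(add_topping_buggy("olives"))  # Still contains 'cheese'


def add_topping(topping, toppings=None):
    """Safer: create a fresh list on each call."""
    if toppings is None:
        toppings = []
    toppings.append(topping)
    return toppings


print(add_topping("cheese"))
print(add_topping("olives"))
```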


- Don't chain `!=`

In [34]:
a = 'spam'
b = 'beans'
c = 'spam'

print(a != b != c)
print((a != b) and (b != c))  # Equivalent to above
print(a != b and b != c and a != c)  # What we really mean

True
True
False


- Don't chain multiple types of operators together

In [35]:
False == False in [False]

True
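
- A few of the other reminders, sketched with made-up values: sorting is case-sensitive by default, a single-item tuple needs a comma, `all()` returns `True` for an empty collection, and `++`/`--` do not change a variable.

```python
names = ["bob", "Alice", "carol", "Dave"]
print(sorted(names))                 # Uppercase letters sort before lowercase
print(sorted(names, key=str.lower))  # Alphabetical, ignoring case

not_a_tuple = ("spam")  # Just a string in parentheses
a_tuple = ("spam",)     # The comma makes it a tuple
print(type(not_a_tuple).__name__, type(a_tuple).__name__)

print(all([]))  # True, because there are no falsey values

x = 5
print(++x, --x)  # Double positive and double negative signs
print(x)         # x is unchanged
```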

---

## Tips for Writing Functions
- How many functions should a script have?  
    - If a code chunk occurs more than once we should consider putting it into a function.  The longer the chunk and the more repetitions the better candidate it is to be placed into its own function.
    - Even if code doesn't repeat many programmers will separate conceptually distinct lines of code into different functions. Each function is often less than 30 lines of code and definitely less than 200 lines. 
- There are pros and cons to creating many shorter functions
    1. Pros:
        - Easier to understand each individual function
        - Shorter code likely requires fewer parameters.  Easier to debug.
    1. Cons:
        - Must define more functions, creating more function names and more docstrings
        - The overall script becomes harder to understand as we have to figure out how functions work together
- How many parameters should each function have?  0-3 parameters is great.  4-6 is okay.  More than about 6 parameters often means the function is over-complicated.  A function is supposed to have one main purpose, and if it has too many options it should be broken into multiple shorter functions.
- Which argument types should we use?
   - Positional arguments are good if we have 1-3 arguments, they are always used in the function, and it is easy to remember what they are and in which order they are supposed to be written
   - Keyword arguments (`kwargs`) are good if the keywords reduce confusion caused by positional arguments.  Keyword arguments are also used with default parameters.
   - Default parameters are good if the majority of function calls use a particular parameter.  Common defaults are 0, 1, "", None, True, and False.    
   - If we are uncertain how many arguments the user will pass and we will use all these arguments, then we should either use a single iterable object or `*args`
        - If a function usually deals with a data structure created while the program is running, it’s better to have it accept a single iterable argument.   E.g. `sum()`.
        - If a function usually deals with arguments that the programmer specifies while writing the code, it’s better to use the `*args` syntax to accept a varying number of arguments.  E.g. `print()`.
    - `**kwargs` is a good option if keys reduce confusion and we don't know how many values will be passed.  It's mostly used in wrapper functions.
- To reduce the chances of bugs:
     - Don't use mutable objects as default arguments
     - Choose the simplest argument type possible.  There are even argument types not covered here such as keyword only and position only arguments!  These are more complex.
     - Avoid mixing argument types where possible.  Parameters must be defined in a certain order and arguments must be called in a certain order depending on the argument type.  **Different web resources give conflicting advice on order.  When in doubt keep it simple!!!**:
        - The order of parameters in a function definition might be:
            1. Positional arguments, `kwargs` (same as positional arguments until function call)
            1. `*args`
            1. Default parameters 
            1. `**kwargs`
         - The order of arguments in a function call might be:
            1. Positional arguments
            1. `*args` (collects remaining positional objects into tuple)
            1. Default parameters, `kwargs`, `**kwargs` (collects remaining key-value pairs into dictionary)
    - There can be multiple return statements within a single function.  It is recommended that all return statements within a single function output the same data type.  This will help prevent type errors.
- The best way to learn may be to look at other Python functions
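
As one possible illustration of mixing argument types, the sketch below defines a hypothetical `order_pizza()` function whose parameters follow the order listed above: a positional argument, `*args`, a default parameter, and `**kwargs`.  The parameter names are made up for the example.

```python
def order_pizza(size, *toppings, crust="thin", **extras):
    """Hypothetical function that mixes several parameter types."""
    return {"size": size, "toppings": toppings, "crust": crust, "extras": extras}


# Only the positional argument is required.
print(order_pizza("large"))

# Extra positional values go into the toppings tuple.
# Unrecognized keywords go into the extras dictionary.
print(order_pizza("large", "cheese", "olives", crust="thick", delivery=True))
```

Even in this small example the call is harder to read than a function with two or three plain parameters, which is a reason to keep signatures simple where possible.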

---

**EXAMPLES**

- The `print()` function uses:
    1. `*args` for an unknown number of user-specified values to print, joined by the separator
    1. Default parameter to change separator character
    1. Default parameter to change line ending character
    1. Two more default parameters: `file` (where the output is written, `sys.stdout` by default) and `flush` (whether to forcibly flush the stream)

In [36]:
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.



- The `sum()` function uses:
    1. Positional argument that is single iterable object.  This will be a collection of numbers to sum.
    1. `/`, which marks the parameters before it (here, `iterable`) as positional-only, so they cannot be passed by keyword
    1. Default parameter

In [37]:
help(sum)

Help on built-in function sum in module builtins:

sum(iterable, /, start=0)
    Return the sum of a 'start' value (default: 0) plus an iterable of numbers
    
    When the iterable is empty, return the start value.
    This function is intended specifically for use with numeric values and may
    reject non-numeric types.
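
- The `/` in the signature above can be explored with a short sketch.  It assumes a Python version whose `help(sum)` matches the output shown, where `start` may be passed by keyword but `iterable` may not.

```python
numbers = [1, 2, 3]

print(sum(numbers))            # iterable passed by position
print(sum(numbers, 10))        # start passed by position
print(sum(numbers, start=10))  # start passed by keyword

# Parameters before the / are positional-only, so this raises a TypeError.
keyword_rejected = False
try:
    sum(iterable=numbers)
except TypeError as error:
    keyword_rejected = True
    print("TypeError:", error)
```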



- The `pow` (power) function uses:
    1. Positional argument for the base value
    1. Positional argument for the exponent value
    1. Default parameter for modulus operator

In [38]:
help(pow)

Help on built-in function pow in module builtins:

pow(base, exp, mod=None)
    Equivalent to base**exp with 2 arguments or base**exp % mod with 3 arguments
    
    Some types, such as ints, are able to use a more efficient algorithm when
    invoked using the three argument form.



---