# ML DevOps Engineer
## Course 1 - Writing Clean Code
**5 Key Areas**
1. Writing clean and modular code
2. Refactoring code
3. Optimizing code to be more efficient
4. Writing documentation
5. Following [PEP8](https://www.python.org/dev/peps/pep-0008/ "PEP8 Guide") & Linting 
---
---
### Best Coding Practices - Meaningful Names

#### Be descriptive and imply type:
For `booleans`, you can prefix with `is_` or `has_` to make it clear it is a condition.
<BR>You can also use parts of speech to imply types, like using `verbs for functions` and `nouns for variables`.


In [5]:
is_new = True
has_child = False
age = 16
def multiply(x1, x2):
    return x1 * x2

---
#### Be consistent but clearly differentiate:
👍🏻 `age_list` and `age` are easier to differentiate than<br>👎🏻 `ages` and `age`.
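A minimal sketch of why the suffix helps. The values and the `*_next_year` names are illustrative assumptions, not part of the original notes.

```python
# Harder to scan: `ages` and `age` differ by a single character,
# so mixing them up inside the comprehension is easy to miss.
ages = [16, 21, 35]
ages_next_year = [age + 1 for age in ages]

# Easier to scan: the collection carries a distinct suffix.
age_list = [16, 21, 35]
age_list_next_year = [age + 1 for age in age_list]

print(age_list_next_year)
```

Both versions compute the same result; only the second makes the list and its elements visually distinct.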

---
#### Avoid abbreviations and single letters:
Avoid abbreviations and single-letter names; the usual exceptions are counters and common math variables. You can decide when to make further exceptions based on the audience for your code. If you work with other data scientists, certain variable names may be common knowledge. If you work with full-stack engineers, you may need to provide more descriptive names in these cases as well.

In [6]:
# Bad Example
s = [88, 92, 77, 65, 80.5]  # student test scores

# Better
test_scores = [88, 92, 77, 65, 80.5]    

#### BUT Long names aren't the same as descriptive names:
You should be descriptive, but only with relevant information.<BR>
For example, a good function name describes what the function does without including details about its implementation or highly specific uses.<BR>
Try testing how effective your names are by asking a fellow programmer to guess the purpose of a function or variable based on its name, without looking at your code.<BR>
Coming up with meaningful names often requires effort to get it right.


In [11]:
# Bad Example
def count_unique_values_of_names_list_with_set(names_list):
    return len(set(names_list))

# Better
def count_unique_values(arr):
    return len(set(arr))

---
#### Use built-in functions:
Built-in functions, and functions from established libraries such as NumPy, are optimized, often computationally faster, and easier to read.

In [10]:
# Bad Example
print(sum(s)/len(s))        # print mean of test scores

# Better
import numpy as np
print(np.mean(test_scores))

80.5
80.5
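As a complement to the NumPy example, here is a sketch that uses only Python built-ins and the standard library. The hand-written loop and the names `manual_mean`, `highest`, and `builtin_highest` are illustrative assumptions.

```python
from statistics import mean

test_scores = [88, 92, 77, 65, 80.5]

# Hand-written loop: more lines, more places for a bug to hide.
total = 0
highest = test_scores[0]
for score in test_scores:
    total += score
    if score > highest:
        highest = score
manual_mean = total / len(test_scores)

# Built-in and standard-library equivalents: one readable call each.
builtin_mean = mean(test_scores)
builtin_highest = max(test_scores)

print(manual_mean, builtin_mean)
print(highest, builtin_highest)
```

`statistics.mean` and `max` need no third-party dependency, which can matter when NumPy is not already part of the project.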


---
#### Use whitespace properly:
* Organize your code with consistent indentation: the standard is to use four spaces for each indent. You can make this a default in your text editor.
* Separate sections with blank lines to keep your code well organized and readable.
* Try to limit your lines to around `79 characters`, which is the guideline given in the [PEP 8](https://www.python.org/dev/peps/pep-0008/?#code-lay-out "PEP8 Code Layout Style Guide") style guide.
  - In many good text editors, there is a setting to display a subtle line that indicates where the `79 character limit` is.

In [12]:
# ToDo: Add some examples for good indentation, e.g., functions with many parameters
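One possible illustration of the whitespace guidelines above, assuming a hypothetical function `compute_final_grade` with made-up weights. It shows four-space indentation, one parameter per line when the signature would exceed 79 characters, and blank lines between logical sections.

```python
def compute_final_grade(
    homework_scores,
    exam_scores,
    homework_weight=0.4,
    exam_weight=0.6,
):
    """Return a weighted average of homework and exam scores."""
    # Section 1: summarize each group of scores.
    homework_mean = sum(homework_scores) / len(homework_scores)
    exam_mean = sum(exam_scores) / len(exam_scores)

    # Section 2: combine the summaries.
    return homework_weight * homework_mean + exam_weight * exam_mean


# Long calls can be wrapped the same way as long signatures.
final_grade = compute_final_grade(
    homework_scores=[90, 80],
    exam_scores=[70, 90],
)
print(final_grade)
```

Placing the closing parenthesis on its own line keeps the parameter list visually separate from the function body.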

---
---
### Best Coding Practices - Modular Code
#### DRY (Don't Repeat Yourself):
Don't repeat yourself! Modularization allows you to reuse parts of your code.<br>
Generalize and consolidate repeated code in functions or loops.
#### Abstract out logic to improve readability:
Abstracting code into a function not only makes it less repetitive but also improves readability through descriptive function names. It is possible to over-engineer this and end up with far too many modules, so use your judgement.

In [14]:
# Bad Example
v = [12, 32, 93, 24, 85]
s1 = []
for x in v:
    s1.append(x + 5)
print(sum(s1)/len(s1))

s2 = []
for x in v:
    s2.append(x + 10)
print(sum(s2)/len(s2))


# Better
import math
import numpy as np

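def flat_curve(arr, n):
    """Add the constant n to every value in arr."""
    return [i + n for i in arr]

def square_root_curve(arr):
    """Replace every value in arr with its square root times 10."""
    return [math.sqrt(i) * 10 for i in arr]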

test_scores = [12, 32, 93, 24, 85]
curved_5 = flat_curve(test_scores, 5)
curved_10 = flat_curve(test_scores, 10)
curved_sqrt = square_root_curve(test_scores)

for score_list in test_scores, curved_5, curved_10, curved_sqrt:
    print(np.mean(score_list))

54.2
59.2
49.2
54.2
59.2
65.76626113696466


---

#### Minimize the number of entities (functions, classes, modules, etc.)
There are trade-offs to using function calls instead of inline logic. If you break your code into an unnecessary number of functions and modules, you will have to jump around to view the implementation details of something that may be too small to be worth separating. Creating more modules doesn't necessarily result in effective modularization.

---
#### Functions should do one thing
Each function you write should be focused on doing one thing. If a function is doing multiple things, it becomes more difficult to generalize and reuse. Generally, if there's an "and" in your function name, consider refactoring.
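A short sketch of splitting a function with "and" in its name. The function names and the sample data are hypothetical and only serve to illustrate the idea.

```python
# Harder to reuse: one function both filters and averages.
def clean_and_average(scores):
    valid = [score for score in scores if score is not None]
    return sum(valid) / len(valid)


# Easier to reuse: each function does exactly one thing.
def drop_missing(scores):
    """Return the scores with None entries removed."""
    return [score for score in scores if score is not None]


def average(scores):
    """Return the arithmetic mean of the scores."""
    return sum(scores) / len(scores)


raw_scores = [88, None, 92, 77]
valid_scores = drop_missing(raw_scores)
print(average(valid_scores))
```

After the split, `drop_missing` can feed other statistics and `average` can be applied to data that is already clean.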

---
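#### Arbitrary variable names can be more effective in certain functions
In general-purpose functions, arbitrary names such as `arr` or `n` can actually make the code more readable than domain-specific names, because they signal that the function is not tied to one particular use.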
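A minimal sketch of the difference, assuming a hypothetical normalization helper. Both function names and the sample inputs are illustrative.

```python
# Domain-specific names suggest the helper only works for test scores.
def normalize_test_scores(test_scores, highest_test_score):
    return [score / highest_test_score for score in test_scores]


# Arbitrary names signal that any numeric sequence is accepted.
def normalize(arr, max_value):
    return [x / max_value for x in arr]


print(normalize([50, 75, 100], 100))  # test scores
print(normalize([2, 4, 8], 8))        # any other numeric data
```

The descriptive names still belong at the call site, where the specific data is known.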


---
#### Try to use no more than three arguments per function
Try to use no more than three arguments when possible. This is not a hard rule, and there are times when it is more appropriate to use many parameters. But in many cases, it's more effective to use fewer arguments. Remember that we are modularizing to simplify our code and make it more efficient. If your function has a lot of parameters, you may want to rethink how you are splitting up the work.
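One way to reduce a long parameter list is to group related values into a single object. The sketch below assumes a hypothetical `CurveSettings` dataclass and curve functions; it is one option among several, not a prescribed pattern.

```python
from dataclasses import dataclass


# Five positional parameters are easy to mix up at the call site.
def curve_scores_verbose(scores, offset, scale, min_score, max_score):
    return [
        min(max(score * scale + offset, min_score), max_score)
        for score in scores
    ]


@dataclass
class CurveSettings:
    """Hypothetical container for related curve parameters."""
    offset: float = 0.0
    scale: float = 1.0
    min_score: float = 0.0
    max_score: float = 100.0


# Two parameters: the data and the settings that describe the curve.
def curve_scores(scores, settings):
    curved = []
    for score in scores:
        value = score * settings.scale + settings.offset
        # Clamp the curved score to the allowed range.
        curved.append(min(max(value, settings.min_score), settings.max_score))
    return curved


settings = CurveSettings(offset=5)
print(curve_scores([12, 93, 98], settings))
```

The defaults in `CurveSettings` mean a caller only names the values that differ from the common case.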