## Characteristics of Production Code
**CLEAN:** readable, simple, and concise. Clean code is a characteristic of production-quality code that is crucial for collaboration and maintainability in software development.

**MODULAR:** logically broken up into functions and modules. Modularity is another important characteristic of production-quality code; it makes your code more organized, efficient, and reusable.
    * Don't repeat yourself (DRY)
    * Abstract out logic to improve readability
    * Minimize the number of entities (functions, classes, modules, etc.)
    * Functions should do one thing
    * Arbitrary (generic) variable names can be more effective in general-purpose functions, e.g. `arr` for a function that accepts any list
    * Try to use fewer than three arguments per function
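
To illustrate "functions should do one thing", here is a minimal sketch that keeps curving and averaging in separate functions with few arguments each. The names `apply_flat_curve` and `mean_score` are hypothetical and chosen only for this illustration.

```python
from statistics import mean


def apply_flat_curve(scores, points):
    """Return a new list with `points` added to every score."""
    return [score + points for score in scores]


def mean_score(scores):
    """Return the arithmetic mean of a list of scores."""
    return mean(scores)


test_scores = [88, 92, 79, 93, 85]

# Each function does one job, so the two steps can be reused independently.
curved_scores = apply_flat_curve(test_scores, 5)
print(mean_score(curved_scores))
```

Because neither function prints or modifies its input, each one can be tested and reused on its own.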

**MODULE:** a file. Modules allow code to be reused by encapsulating it in files that can be imported into other files.
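
The sketch below shows the idea in a single runnable cell: it writes a small file, then imports it as a module. The file name `curves.py` and the function `flat_curve` are assumptions made for this example. In a real project you would save the file next to your script and simply write `import curves`.

```python
import importlib.util
import tempfile
from pathlib import Path

# Source code of a hypothetical module; normally this lives in its own file.
module_source = '''
def flat_curve(scores, points):
    """Return a new list with `points` added to every score."""
    return [score + points for score in scores]
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    module_path = Path(tmp_dir) / "curves.py"
    module_path.write_text(module_source)

    # Load the file as a module named "curves" (equivalent to `import curves`
    # when the file sits on the import path).
    spec = importlib.util.spec_from_file_location("curves", module_path)
    curves = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(curves)

# The function defined in the file is now reusable from this code.
print(curves.flat_curve([88, 92], 5))
```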

In [15]:
# Repetitive version: the same curve-and-average logic is written three times (violates DRY)
import math
import numpy as np
test_scores = [88, 92, 79, 93, 85]

# List comprehension
curved_5 = [score + 5 for score in test_scores]
print(np.mean(curved_5))

curved_10 = [score + 10 for score in test_scores]
print(np.mean(curved_10))

curved_sqrt = [math.sqrt(score) * 10 for score in test_scores]
print(np.mean(curved_sqrt))

92.4
97.4
93.44776840374746


In [16]:
# Abstract out logic to improve readability
import math
import numpy as np

def flat_curve(arr, n):
    return [i + n for i in arr]

def square_root_curve(arr):
    return [math.sqrt(i) * 10 for i in arr]

test_scores = [88, 92, 79, 93, 85]

curved_5 = flat_curve(test_scores, 5)
curved_10 = flat_curve(test_scores, 10)
curved_sqrt = square_root_curve(test_scores)

for score_list in test_scores, curved_5, curved_10, curved_sqrt:
    print(np.mean(score_list))


87.4
92.4
97.4
93.44776840374746


## Writing Clean Code
* Nice Whitespace
    * Organize your code with consistent indentation - the standard is to use 4 spaces for each indent.
    * Separate sections with blank lines to keep your code well organized and readable.
    * Try to limit your lines to around 79 characters, which is the guideline given in the PEP 8 style guide.  
    [PEP 8 guidelines for code layout](https://www.python.org/dev/peps/pep-0008/?#code-lay-out)
    
* Meaningful Names
    * Be descriptive and imply type: e.g., for booleans, use the prefix `is_` or `has_` to make it clear that the name is a condition.
    * Be consistent but clearly differentiate: e.g., `age_list` and `age` are easier to tell apart than `ages` and `age`.
    * Avoid abbreviations and especially single letters.
    * Long names are not necessarily descriptive names. (Don't put implementation details in the name of a function.)
    * Use verbs for function names.

In [17]:
# Be descriptive and imply type
age_list = [47, 12, 28]

for i, age in enumerate(age_list):
    if age < 18:
        is_minor = True
        age_list[i] = "minor"
        
age_list

[47, 'minor', 28]
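
As a further sketch of the naming guidelines, the version below uses an `is_` prefix for a function that returns a boolean and a verb for a function that performs an action. The names `MINIMUM_ADULT_AGE`, `is_minor`, and `label_ages` are hypothetical, and the "adult" label is an assumption added for symmetry.

```python
# Hypothetical threshold, named so the comparison below explains itself.
MINIMUM_ADULT_AGE = 18


def is_minor(age):
    """Return True if `age` is below the adult threshold."""
    return age < MINIMUM_ADULT_AGE


def label_ages(ages):
    """Return 'minor' or 'adult' for each age, leaving the input unchanged."""
    return ["minor" if is_minor(age) else "adult" for age in ages]


ages = [47, 12, 28]
age_labels = label_ages(ages)
print(age_labels)
```

Returning a new list of labels instead of overwriting entries keeps `ages` purely numeric, so each name continues to describe the type it holds.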

## Code refactoring

**REFACTORING:** restructuring your code to improve its internal structure, without changing its external functionality. This gives you a chance to clean and modularize your program after you've got it working.
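
A minimal before-and-after sketch of this idea follows. Both versions return the same mean and (population) standard deviation, so the external behavior is unchanged while the internal structure becomes more modular. All function names here are hypothetical.

```python
import math


def summarize_before(values):
    """Working but unstructured version: everything happens in one function."""
    total = 0
    for v in values:
        total += v
    m = total / len(values)
    sq = 0
    for v in values:
        sq += (v - m) ** 2
    return m, math.sqrt(sq / len(values))


def compute_mean(values):
    """Return the arithmetic mean of `values`."""
    return sum(values) / len(values)


def compute_std(values):
    """Return the population standard deviation of `values`."""
    mean_value = compute_mean(values)
    squared_deviations = [(v - mean_value) ** 2 for v in values]
    return math.sqrt(compute_mean(squared_deviations))


def summarize_after(values):
    """Refactored version: same result, built from small reusable functions."""
    return compute_mean(values), compute_std(values)


scores = [88, 92, 79, 93, 85]
before = summarize_before(scores)
after = summarize_after(scores)

# Refactoring must not change results (compared with a float tolerance).
print(all(math.isclose(b, a) for b, a in zip(before, after)))
```

Checking the old and new versions against each other like this is a simple way to confirm that a refactor preserved behavior.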

### Refactor: Wine Quality Analysis
In this exercise, you'll refactor code that analyzes a wine quality dataset taken from the UCI Machine Learning Repository [here](https://archive.ics.uci.edu/ml/datasets/wine+quality). Each row contains data on a wine sample, including several physicochemical properties gathered from tests, as well as a quality rating evaluated by wine experts.

The code in this notebook first renames the columns of the dataset and then calculates some statistics on how some features may be related to quality ratings. Can you refactor this code to make it cleaner and more modular?

In [20]:
import pandas as pd
df = pd.read_csv('data/winequality-red.csv', sep=';')
df.head()

FileNotFoundError: [Errno 2] File data/winequality-red.csv does not exist: 'data/winequality-red.csv'
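
Since the CSV file could not be loaded above, the following sketch uses a tiny made-up DataFrame to show one possible shape for the refactored code. The rows are invented for illustration, and the helper names `clean_column_names` and `mean_quality_by_median_split`, as well as the choice of a median split, are assumptions rather than the exercise's official solution.

```python
import pandas as pd

# Made-up stand-in rows; the real notebook reads data/winequality-red.csv.
df = pd.DataFrame({
    "fixed acidity": [7.4, 7.8, 11.2, 7.9],
    "residual sugar": [1.9, 2.6, 1.9, 1.2],
    "alcohol": [9.4, 9.8, 10.5, 11.0],
    "quality": [5, 5, 6, 7],
})


def clean_column_names(frame):
    """Return a copy of `frame` with spaces in column names replaced by underscores."""
    return frame.rename(columns=lambda name: name.replace(" ", "_"))


def mean_quality_by_median_split(frame, feature):
    """Return mean quality for rows below ('low') and at or above ('high') the feature's median."""
    median = frame[feature].median()
    labels = (frame[feature] >= median).map({True: "high", False: "low"})
    return frame.groupby(labels)["quality"].mean()


df = clean_column_names(df)

# One loop replaces a copy-pasted block per feature.
for feature in ["alcohol", "residual_sugar"]:
    print(feature)
    print(mean_quality_by_median_split(df, feature))
```

Putting the per-feature logic in one function means adding another feature only requires extending the list in the loop.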