<a href="https://colab.research.google.com/github/shap0011/machine_learning_fall_2024/blob/main/variable_naming_comments_readability_error_handling.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Naming best practices and code readability

### When writing code in Python, it’s important to make sure that others can easily understand it. Below are three ways to do this:

1. Giving variables obvious names.

2. Defining explicit functions.

3. Organizing your code.

#### Add descriptive comments.
1. What am I doing at each step? Why am I doing it?

2. What is my thought process here?


#### Use spacing and line breaks to improve readability.

#### **Variable names:** should be all lowercase with the words separated by underscores. A variable called `VariableIsGlobal` cannot be read as easily or as quickly as `variable_is_global`.

In [1]:
import numpy as np
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
!ls '/content/drive/My Drive/Colab Notebooks'

'Copy of Goodlife_Fitness_Solution.ipynb'
 Data_Cleaning_Solution.ipynb
 EDA_Solution.ipynb
 Feature_Engineering_Solution.ipynb
 Goodlife_Fitness_Solution.ipynb
 my_functions.py
 __pycache__
 python_best_practices.ipynb
 Real_Estate_Solution.ipynb
 test_module.ipynb
 Unsupervised_Clustering_Solution.ipynb
 Untitled
 Untitled0.ipynb
 Untitled1.ipynb
 variable_naming_comments_readability_error_handling.ipynb


In [3]:
import sys
sys.path.append('/content/drive/My Drive/Colab Notebooks')  # Add the folder to the system path

In [4]:
# Import the helper functions from my_functions.py
from my_functions import (
    add_numbers,
    divide_numbers,
    multiply_numbers,
    subtract_numbers,
)

In [19]:
add_numbers(3,5)

8
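The contents of `my_functions.py` are not shown in this notebook. As a rough sketch only, a module exposing the four imported names might look like the following. The function bodies, type hints, and the zero check in `divide_numbers` are assumptions, not the actual file.

```python
# Hypothetical contents of my_functions.py (assumed, not the real file).

def add_numbers(a: float, b: float) -> float:
    """Return the sum of a and b."""
    return a + b


def subtract_numbers(a: float, b: float) -> float:
    """Return a minus b."""
    return a - b


def multiply_numbers(a: float, b: float) -> float:
    """Return the product of a and b."""
    return a * b


def divide_numbers(a: float, b: float) -> float:
    """Return a divided by b; raise ValueError if b is zero (assumed behavior)."""
    if b == 0:
        raise ValueError("Cannot divide by zero.")
    return a / b


print(add_numbers(3, 5))  # 8
```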

### Bad way to name variables

In [20]:
lst = ["bananas","butter","cheese","toothpaste"]

for i in lst:
    print(i)

bananas
butter
cheese
toothpaste


In [21]:
lst1 = ['butter','milk','eggs','cheese']
lst2 = ['tomatoes','carrot','spinach','cabbage']
lst3 = ['ham','chicken','tuna','turkey']

### A better way

In [22]:
groceries = ["bananas","butter","cheese","toothpaste"]

for i in groceries:
    print(i)

bananas
butter
cheese
toothpaste


### Even better

In [23]:
groceries = ["bananas","butter","cheese","toothpaste"]

for grocery in groceries:
    print(grocery)

bananas
butter
cheese
toothpaste


In [25]:
dairy_list = ['butter','milk','eggs','cheese']
veg_list = ['tomatoes','carrot','spinach','cabbage']
meat_list = ['ham','chicken','tuna','turkey']

### Constants or Final

* Constants (or finals) are written in UPPERCASE, with the words separated by underscores to keep them readable, like `PI` or `MAX_VALUE`.
* This convention is widespread; following it avoids confusion with normal variables. Python does not enforce it, so an uppercase name can still be reassigned.
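A minimal sketch of the uppercase convention for constants. The values and the helper functions `circle_area` and `clamp_to_max` are made up for illustration.

```python
# Module-level constants use UPPERCASE_WITH_UNDERSCORES (illustrative values).
PI = 3.14159
MAX_VALUE = 100


def circle_area(radius: float) -> float:
    """Return the area of a circle with the given radius."""
    return PI * radius ** 2


def clamp_to_max(value: float) -> float:
    """Return value, limited to at most MAX_VALUE."""
    return min(value, MAX_VALUE)


print(clamp_to_max(250))  # 100
```

The uppercase names signal to a reader that these values are not meant to change, even though the interpreter would allow it.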

### camelCase vs under_scores

More details here - https://dewitters.com/dewitters-tao-of-coding/

### Good naming and commenting

#### As mentioned above, here are three practices that keep code readable and reduce the need for extra comments:

1. Giving variables obvious names
2. Defining explicit functions
3. Organizing your code

### Different ways to comment in Python

In [10]:
# This is a comment
# You can press Ctrl + / (on Windows) to comment out a single line or several selected lines

In [26]:
print("This will run.")  # Run this

This will run.


In [27]:
def multiline_example():
    # This is a pretty good example
    # of how you can spread comments
    # over multiple lines in Python
    pass

While the cell below gives you multiline functionality, it isn’t technically a comment. It’s a string that isn’t assigned to any variable, so nothing in your program references it. Python evaluates the string and then discards it (which is why the notebook echoes it as the cell’s output), so it can effectively act as a comment.

In [31]:
# Multiline string used as a comment
"""
If I really hate pressing `enter` and
typing all those hash marks, I could
just do this instead

"""

'\nIf I really hate pressing `enter` and\ntyping all those hash marks, I could\njust do this instead\n\n'
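To make the difference between a bare string and a real docstring concrete, here is a small sketch. The function names `get_greeting` and `get_farewell` are hypothetical. A string only becomes a docstring when it is the first statement in a function, class, or module.

```python
def get_greeting(name: str) -> str:
    """Return a greeting for the given name."""
    return f"Hello, {name}!"


def get_farewell(name: str) -> str:
    message = f"Goodbye, {name}!"
    """This bare string is evaluated and discarded; it is not a docstring."""
    return message


print(get_greeting.__doc__)  # Return a greeting for the given name.
print(get_farewell.__doc__)  # None
```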

### Writing comments

In [29]:
def get_sum(x: float, y: float) -> float:
    """
    Calculate the sum of two numbers.

    Args:
        x (float): The first number.
        y (float): The second number.

    Returns:
        float: The sum of x and y.
    """
    return x + y

In [32]:
get_sum(1,2)

3

### How to practice commenting

1. Start writing comments for yourself in your own code. Make it a point to include simple comments wherever they would be helpful for your (current and future) understanding.
2. Add some clarity to complex functions, and put a docstring at the top of all your scripts.
3. Go back and review old code that you’ve written. See where anything might not make sense, and clean up the code.

### Examples

#### Bad

In [16]:
print('one'); print('two')
x=1
if x == 1: print('one')

# The issue is readability: several statements share one line,
# and the body of the if statement is not on its own indented line.
# if <complex comparison> and <other complex comparison>:
    # do something

one
two
one


#### Good

In [17]:
print('one')
print('two')

if x == 1:
    print('one')

# # uncomment for run through
# cond1 = <complex comparison>
# cond2 = <other complex comparison>
# if cond1 and cond2:
#     # do something

one
two
one
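The commented-out pattern above uses placeholders, so it cannot run as written. A runnable sketch of the same idea, with made-up values and variable names, could look like this:

```python
# Hypothetical example values.
age = 25
has_membership = True
visits_this_month = 3

# Give each complex comparison a descriptive name first.
is_working_age = 18 <= age < 65
can_book_visit = has_membership and visits_this_month < 10

# The if statement now reads almost like a sentence.
if is_working_age and can_book_visit:
    print("booking allowed")
```

Naming the intermediate conditions documents the intent of each comparison without a separate comment.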



**I highly recommend watching this video** - A masterclass to teach you all Python best practices - https://www.youtube.com/watch?v=ubGeHQRjNog

# Errors
## These are going to make your life miserable this semester
### But Python helps you a lot to identify where the problem is, and you can easily learn how to fix them!

### Understanding Traceback
Python generates a traceback when an exception occurs during the execution of a program. A Python program can run into problems in two ways:

1. **Syntax Error** - The code breaks Python’s grammar rules (for example, a missing parenthesis or colon). Python detects this when it parses the code, before any line runs, so nothing in the cell executes until the code is corrected.


2. **Logical Error (Exception)** - This error surfaces only during execution, when an exceptional condition occurs within the program. It is typically caused by data the program was not designed to handle, such as a zero denominator.
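The two kinds of errors can be seen side by side in the sketch below. It uses the built-in `compile()` only so that the syntax error can be caught and printed; in a normal notebook cell, a syntax error stops the whole cell before anything runs. The `errors_seen` list is just a helper for this illustration.

```python
errors_seen = []

# A syntax error is detected when the source is parsed, before any line runs.
broken_source = 'print("Hello, world!"'  # missing closing parenthesis
try:
    compile(broken_source, "<example>", "exec")
except SyntaxError as error:
    errors_seen.append("SyntaxError")
    print("Caught at parse time:", error.msg)

# An exception is raised while syntactically valid code is running.
try:
    result = 10 / 0
except ZeroDivisionError as error:
    errors_seen.append("ZeroDivisionError")
    print("Caught at run time:", error)

print(errors_seen)  # ['SyntaxError', 'ZeroDivisionError']
```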

![pic1.png](https://github.com/gt-cse-6040/bootcamp/blob/main/Module%200/Session%202/pic1.png?raw=1)

### How to read the coding error messages
1. The arrow always points to the line of code that failed.

### Let’s look at coding error messages in the cells below.

In [38]:
# The commented-out line below has a syntax error (missing closing parenthesis).
# Uncomment it to see the error message.
# print("Hello, world!"
print("Hello, world!")  # Fixed version, so the cell runs

Hello, world!


In [35]:
# Correct version would be:
print("Hello, world!")

Hello, world!


In [37]:
# This code can raise an exception
def divide(a, b):
    return a / b

# Uncomment the next line to see a ZeroDivisionError
# result = divide(10, 0)
result = divide(10, 1)
print(result)

10.0
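One common way to handle this exception is a `try`/`except` block. The wrapper `safe_divide` below is a hypothetical helper, and returning `None` on failure is just one possible design choice.

```python
def divide(a: float, b: float) -> float:
    """Return a divided by b."""
    return a / b


def safe_divide(a: float, b: float):
    """Return a / b, or None when b is zero (hypothetical helper)."""
    try:
        return divide(a, b)
    except ZeroDivisionError:
        print("Cannot divide by zero; returning None.")
        return None


print(safe_divide(10, 1))  # 10.0
print(safe_divide(10, 0))  # None, printed after the warning message
```

Catching only `ZeroDivisionError`, rather than every exception, keeps unrelated bugs visible.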


In [41]:
# The commented-out lines are not indented; using them in place of the indented lines raises an IndentationError
def Division():
# A = Num / Den
  A = Num / Den
# print ("Quotient ", A)
  print ("Quotient ", A)

Num = int (input ("numerator "))
Den = int (input ("denominator "))

Division()


numerator 10
denominator 5
Quotient  2.0


In [42]:
# This is correct.
def Division():
    Num = int (input ("numerator "))
    Den = int (input ("denominator "))
    A = Num / Den
    print ("Quotient ", A)

Division()
# input 0 denominator to see error message structure

numerator 40
denominator 8
Quotient  5.0


#### What you see is the ERROR TRACEBACK.
1. The BOTTOM ARROW will always point to the line of code that directly caused the failure. It may be a line of code in your notebook, or a line of code from the underlying Python system.

2. The TOP ARROW will point to the initial line of code that was executed, and you can start here, working down, to see the sequence of code that was executed (not every line will be shown).

3. If your function is calling other functions within your notebook, each of those lines will be included in the error trace.

4. What you want to do is work your way down to the first line of code (arrowed to) that is from the notebook, and that you wrote. This will tell you where your underlying problem is.
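The points above can be tried out with a small sketch in which one function calls another. The names `compute_ratio` and `report_ratio` are made up for this example. The standard-library `traceback.format_exc()` returns the same text the notebook would display, so it can be inspected without stopping the cell.

```python
import traceback


def compute_ratio(numerator, denominator):
    # The failure happens on this line, so it appears last in the traceback.
    return numerator / denominator


def report_ratio(numerator, denominator):
    # This call appears earlier in the traceback, above compute_ratio.
    return compute_ratio(numerator, denominator)


try:
    report_ratio(10, 0)
except ZeroDivisionError:
    trace_text = traceback.format_exc()
    print(trace_text)
```

Reading the printed text from top to bottom follows the calls in order, and the final line gives the error type and message.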

#### The bottom of the error block will give the ERROR TEXT. If you do not understand what this text means, GOOGLE SEARCH THE TEXT!!

#### Here is the link to an excellent article to understand how to read error messages and the traceback:  https://realpython.com/python-traceback/

### Functions
* A function is a block of reusable code that performs a specific task.

* Functions do things, and their names should make this clear. Therefore, always include a verb in the name, no exceptions! Use the same naming style as for variables: all lowercase words separated by underscores.

In [43]:
# Function
def greet(name):
    return f"Hello, {name}!"

In [44]:
# Usage
print(greet("Alice"))  # Output: Hello, Alice!

Hello, Alice!


### Classes

Class names use `UpperCamelCase`: each word starts with a capital letter, and no underscores are used.


In [46]:
import numpy as np

class DataStats:
    def __init__(self, data):
        self.data = np.array(data)

    def mean(self) -> float:
        return np.mean(self.data)

    def median(self) -> float:
        return np.median(self.data)

    def std_dev(self) -> float:
        return np.std(self.data)

    def range(self) -> float:
        return np.max(self.data) - np.min(self.data)

    def add_data_point(self, value: float):
        self.data = np.append(self.data, value)

    def remove_outliers(self, threshold: float = 3):
        z_scores = np.abs((self.data - self.mean()) / self.std_dev())
        self.data = self.data[z_scores < threshold]

    def summary(self) -> dict:
        return {
            "mean": self.mean(),
            "median": self.median(),
            "std_dev": self.std_dev(),
            "range": self.range(),
            "count": len(self.data)
        }


In [47]:
# Example usage
data = [1, 2, 3, 4, 5, 100]  # Note: 100 is an outlier
stats = DataStats(data)

print("Initial summary:", stats.summary())

Initial summary: {'mean': 19.166666666666668, 'median': 3.5, 'std_dev': 36.17281053805775, 'range': 99, 'count': 6}
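One detail worth checking by hand: with the default `threshold` of 3, the z-score rule used in `remove_outliers` would not drop the value 100 from this data set, because its z-score is only about 2.23. The sketch below reproduces the same rule using only the standard library (`statistics.pstdev` is the population standard deviation, matching the default of `np.std`). The stricter threshold of 2 is an arbitrary value chosen for illustration.

```python
import statistics

data = [1, 2, 3, 4, 5, 100]

mean = statistics.fmean(data)
std_dev = statistics.pstdev(data)  # population standard deviation

# Absolute z-score of every point, as in remove_outliers.
z_scores = [abs(value - mean) / std_dev for value in data]
print(round(z_scores[-1], 2))  # 2.23

# Keep only the points whose z-score is below the threshold.
kept_default = [v for v, z in zip(data, z_scores) if z < 3]
kept_strict = [v for v, z in zip(data, z_scores) if z < 2]

print(kept_default)  # [1, 2, 3, 4, 5, 100]
print(kept_strict)   # [1, 2, 3, 4, 5]
```

A single extreme value inflates the standard deviation of a small sample, which is why its own z-score stays below 3 here.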

