## How to Interact with this Jupyter Notebook

In this activity, you will use a Jupyter Notebook, which integrates both text and code. The gray boxes contain executable code, which you will run in order to view its output. The text in between the code provides instructions.

## Scenario: Bug Squashing 101: Debugging Python Code

You've just been hired as a data analyst at a renowned insect collection company. Your first task is to clean up their extensive insect database, which has unfortunately been plagued by some pesky data entry errors (and maybe a few actual bugs!).

Through a series of interactive exercises, you'll learn how to identify and fix common data inconsistencies and errors using Python. You'll analyze error messages, strategically use print statements, and apply your Python skills to ensure the accuracy and integrity of the insect collection data. By the end of this activity, you'll be a bug-squashing champion, ready to tackle any messy dataset with confidence and precision!

### Project Summary:
You'll encounter and learn how to resolve common errors that can occur when working with data, such as:

* FileNotFoundError: Troubleshoot issues related to missing or inaccessible data files, ensuring your scripts can locate and read the crucial insect collection records.

* TypeError: Navigate through type-related errors, ensuring that you're performing operations on compatible data types (like numerical measurements and species names) and converting them when necessary.

* IndexError: Master list indexing and avoid accessing insect records beyond their valid range, preventing unexpected crashes in your data cleaning scripts.

* KeyError: Confidently work with insect data dictionaries, ensuring that you're accessing existing keys (like `'species'` or `'legs'`) and handling missing keys gracefully.

* AttributeError: Understand the distinction between different data types and their available methods, avoiding attempts to use incompatible operations on insect data.

* ZeroDivisionError: Prevent division by zero errors by implementing checks and providing alternative solutions when encountering such scenarios, especially when calculating ratios or proportions based on insect attributes.

#### **The FileNotFoundError**

You'll start with a common error you might encounter when working with files: the `FileNotFoundError`. This error occurs when Python can't find the file you're trying to access.

***First, run the cell below and review the error.***

##### **Debugging Tips**

* **Understanding the Error:** The error message clearly states that the file `'insects.csv'` doesn't exist in the expected location. 
* Read the error message carefully: It often points directly to the problem.
* Check file names and paths: Typos are common culprits.
* Use print statements: Print the filename or path before trying to access it to confirm its correctness.
* Verify file existence: Make sure the file is where you think it is.
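The last two tips can be combined into a small check that runs before you try to read the file. This is a minimal sketch using only the standard library; `check_data_file` is a hypothetical helper name introduced here for illustration, not part of the notebook.

```python
from pathlib import Path

def check_data_file(filename):
    """Return True if the file exists, printing where Python looked for it."""
    path = Path(filename)
    # resolve() shows the absolute path, which reveals the working directory in use
    print(f"Looking for: {path.resolve()}")
    return path.is_file()

# 'insects.csv' is the misspelled filename from this exercise
if not check_data_file('insects.csv'):
    # List the CSV files that are actually present, to help spot a typo in the name
    csv_files = sorted(p.name for p in Path('.').glob('*.csv'))
    print("CSV files in this folder:", csv_files)
```

Printing the resolved path is often enough to spot the problem, because it shows both the exact filename and the folder Python is searching.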

##### **Still Stuck?**
* If you are still stuck, remove or comment out the code under `# Deliberate Error` and uncomment the code below the following line to resolve the error: `# Correct approach (commented out for reference)`. 
* Helpful Tip: You can comment and uncomment multiple lines of code at once by selecting them and pressing `Ctrl + /` on your keyboard.

```python
# Import a powerful tool called "pandas" that you'll use to work with and organize data easily
import pandas as pd

# Deliberate Error: Incorrect filename
insects_df = pd.read_csv('insects.csv')

# Correct approach (commented out for reference)
# insects_df = pd.read_csv('insect_collection.csv')
```

FileNotFoundError: [Errno 2] No such file or directory: 'insects.csv'

Great debugging skills!

***In the next cell, there is no error.***

Please run the cell, which reads the CSV file into a Pandas DataFrame, converts this DataFrame into a list of dictionaries named `all_insects_data` using the `to_dict()` method, and prints `all_insects_data`. 

Note that each dictionary in the list represents a row (an insect) from the CSV file, with the column names as keys and their corresponding values.

```python
# Convert the DataFrame to a list of dictionaries
all_insects_data = insects_df.to_dict(orient='records')

# Print the list of dictionaries
print(all_insects_data)
```

[{'name': 'Butterfly', 'species': 'Lepidoptera', 'legs': 6, 'wings': 2}, {'name': 'Beetle', 'species': 'Coleoptera', 'legs': 4, 'wings': 2}, {'name': 'Ant', 'species': 'Hymenoptera', 'legs': 6, 'wings': 0}, {'name': 'Spider', 'species': 'Araneae', 'legs': 8, 'wings': 0}]


#### **The TypeError - Integers**

Next, you'll explore how to access specific information within your insect dictionaries. You'll try to retrieve the first digit of the number of legs an insect has. Remember, dictionaries use keys to access their values.

***First, run the cell below and review the error.***

##### **Debugging Tips**

* **Understanding the Error:** The `TypeError` indicates you're trying to use the subscript operator `[]` (used for accessing elements within sequences like lists or strings) on an integer, which isn't allowed.
* Check Data Types: Use the `type()` function to confirm the data type of a variable if you're unsure.
* Think Logically: Integers represent single numerical values, so trying to access an "element" within them doesn't make sense.
* Review Your Code: Carefully examine the line causing the error and consider what you're trying to achieve. Are you using the right approach for the data type you're working with?

##### **Still Stuck?**
* If you are still stuck, remove or comment out the code under `# Deliberate Error` and uncomment the code below the following line to resolve the error: `# Correct approach (commented out for reference)`. 
* Helpful Tip: You can comment and uncomment multiple lines of code at once by selecting them and pressing `Ctrl + /` on your keyboard.

```python
# Deliberate Error: Trying to access an element within an integer
first_insect = all_insects_data[0]
first_digit_of_legs = first_insect['legs'][0]

# Correct approach (commented out for reference)
# first_digit_of_legs = str(first_insect['legs'])[0]

print(f"The first digit of the number of legs the {first_insect['name']} has is: {first_digit_of_legs}")
```

TypeError: 'int' object is not subscriptable

##### **Why the Correct Approach Works**

The corrected approach works because it converts the integer into a type that supports indexing before indexing into it. Here's a breakdown:

* Type Conversion: We use `str(first_insect['legs'])` to convert the integer value representing the number of legs into a string. This is crucial because strings are sequences of characters, allowing us to access individual characters using indexing.

* Indexing: Once we have a string representation of the number of legs, we can use `[0]` to access its first character, which corresponds to the first digit of the original number. Note that the result is itself a string (for example `'6'`), not an integer.

In essence, we're leveraging the sequence-like nature of strings to extract the desired information from a numerical value.

This technique is handy whenever you need to manipulate or analyze individual digits within a number. Remember, understanding data types and their capabilities is key to effective debugging and problem-solving in Python.
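The conversion steps described above can be tried on a standalone value. This is a small sketch; the variable `legs` is a stand-in for `first_insect['legs']` so that the snippet runs on its own.

```python
legs = 6  # an integer, standing in for first_insect['legs']

# type() confirms what kind of value you are working with
print(type(legs))  # <class 'int'>

# Convert to a string first, then index into the string
legs_as_text = str(legs)
first_digit = legs_as_text[0]
print(first_digit)        # 6
print(type(first_digit))  # <class 'str'>

# Convert back with int() if you need to do arithmetic with the digit
first_digit_number = int(first_digit)
print(first_digit_number + 1)  # 7
```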

#### **The TypeError - Strings**

In this cell, you'll explore another kind of `TypeError` while working with strings.

***First, run the cell below and review the error.***

##### **Debugging Tips**

* **Understanding the Error:** The `TypeError` says you're trying to combine (concatenate) a string and an integer directly using the `+` operator, which isn't allowed in Python.
* Type Conversion: Convert the integer to a string using `str()` before concatenation.
* f-strings: Consider using f-strings (formatted string literals) for a more concise and readable way to embed variables within strings.

##### **Still Stuck?**
* If you are still stuck, remove or comment out the code under `# Deliberate Error` and uncomment the code below the following line to resolve the error: `# Correct approach (commented out for reference)`. 
* Helpful Tip: You can comment and uncomment multiple lines of code at once by selecting them and pressing `Ctrl + /` on your keyboard.

```python
# Deliberate Error: Trying to concatenate a string and an integer directly
first_insect = all_insects_data[0]
sentence = "The " + first_insect['name'] + " has " + first_insect['legs'] + " legs."

# Correct approach (commented out for reference)
# sentence = f"The {first_insect['name']} has {first_insect['legs']} legs."

print(sentence)
```

TypeError: can only concatenate str (not "int") to str

##### **Why the Correct Approach Works**

The corrected approach, `sentence = f"The {first_insect['name']} has {first_insect['legs']} legs."`, works seamlessly because it leverages the power of f-strings (formatted string literals) in Python.

* F-strings for Elegant String Formatting: F-strings provide a concise and readable way to embed variables directly within strings. By enclosing variables within curly braces `{}`, their values are automatically converted to strings and inserted into the final string.

* Implicit Type Conversion: In our example, the integer value `first_insect['legs']` is implicitly converted to a string when included within the f-string. This eliminates the need for explicit type conversion using `str()`, making the code cleaner and less prone to errors.
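Both fixes mentioned in the debugging tips, explicit `str()` conversion and an f-string, produce the same sentence. The sketch below compares them on a sample record; `insect` is an illustrative dictionary in the same shape as the notebook's data.

```python
# Sample record in the shape used in this activity
insect = {'name': 'Butterfly', 'legs': 6}

# Option 1: convert the integer explicitly before concatenating
with_str = "The " + insect['name'] + " has " + str(insect['legs']) + " legs."

# Option 2: let the f-string format the value for you
with_fstring = f"The {insect['name']} has {insect['legs']} legs."

print(with_str)                  # The Butterfly has 6 legs.
print(with_str == with_fstring)  # True
```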

#### Accessing Elements in a List - The IndexError

Next, you'll try to access information about an insect at a specific position (index) within your list of insect dictionaries `all_insects_data`. You'll deliberately try to access an index that is out of bounds.

***First, run the cell below and review the error.***

##### **Debugging Tips**
* **Understanding the Error:** The `IndexError` indicates you're trying to access an element at an index that doesn't exist within the list.
* Check List Length: Use the `len()` function to get the number of elements in a list. Remember, list indices start at 0, so the last valid index is `len(list) - 1`.
* Use Conditional Statements: Before accessing an element at a specific index, check if the index is within the valid range using an `if` statement.

##### **Still Stuck?**
* If you are still stuck, remove or comment out the code under `# Deliberate Error` and uncomment the code below the following line to resolve the error: `# Correct approach (commented out for reference)`. 
* Helpful Tip: You can comment and uncomment multiple lines of code at once by selecting them and pressing `Ctrl + /` on your keyboard.

```python
# Deliberate Error: Accessing an index that's out of bounds
tenth_insect = all_insects_data[9]  # The 10th insect would be at index 9, but the list is shorter
print(tenth_insect)

# Correct approach (commented out for reference)
# if len(all_insects_data) > 9:  # Check if the 10th insect exists
#     tenth_insect = all_insects_data[9]  # Access the 10th insect (index 9)
# else:
#     print("There are not enough insects in the collection.")
```

IndexError: list index out of range

##### **Why the Correct Approach Works**

The corrected approach, which includes an if condition to check the list length before accessing an element, prevents the IndexError by ensuring we only attempt to access elements within the valid range of the list.

* `len()` for List Length: The call `len(all_insects_data)` returns the total number of elements (insect dictionaries) in the list.

* Zero-Based Indexing: Python lists use zero-based indexing, meaning the first element is at index 0, the second at index 1, and so on. The last valid index is `len(list) - 1`.

* Conditional Access: The condition `if len(all_insects_data) > 9` checks whether there are at least 10 elements in the list. If so, the code safely accesses the 10th element (at index 9). Otherwise, it prints an informative message indicating that the requested insect doesn't exist.
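The length check can be wrapped in a reusable function. This is a minimal sketch, assuming you want to refer to records by their 1-based position ("the 10th insect"); `get_insect_at` and `sample` are hypothetical names introduced for illustration.

```python
def get_insect_at(records, position):
    """Return the record at a 1-based position, or None if it does not exist."""
    index = position - 1  # "10th insect" corresponds to index 9
    # The lower bound matters too: a negative index would silently count from the end
    if 0 <= index < len(records):
        return records[index]
    return None

sample = [{'name': 'Butterfly'}, {'name': 'Beetle'}, {'name': 'Ant'}, {'name': 'Spider'}]

print(get_insect_at(sample, 2))   # {'name': 'Beetle'}
print(get_insect_at(sample, 10))  # None
```

Returning `None` instead of raising an error lets the calling code decide how to report the missing record.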

#### Accessing Dictionary Values - The KeyError

In the following cell, you'll try to access the color of an insect, even though this information isn't present in your insect dictionaries.

***First, run the cell below and review the error.***

##### **Debugging Tips**

* **Understand the Error:** The `KeyError` says you're trying to access a key that doesn't exist in the dictionary.
* Check for Key Existence:
    * Use the `in` operator: `if 'key_name' in dictionary:`
    * Use the `.get()` method: `value = dictionary.get('key_name', default_value)` (returns `default_value` if the key doesn't exist)
* Handle Missing Keys Gracefully: Provide informative messages or default values when keys are missing, instead of letting your code crash.

##### **Still Stuck?**
* If you are still stuck, remove or comment out the code under `# Deliberate Error` and uncomment the code below the following line to resolve the error: `# Correct approach (commented out for reference)`. 
* Helpful Tip: You can comment and uncomment multiple lines of code at once by selecting them and pressing `Ctrl + /` on your keyboard.

```python
# Deliberate Error: Accessing a non-existent key
first_insect = all_insects_data[0]
color = first_insect['color']
print(f"The color of the {first_insect['name']} is: {color}")

# Correct approach (commented out for reference)
# if 'color' in first_insect:
#     color = first_insect['color']
# else:
#     print("Color information is not available for this insect.")
```

KeyError: 'color'

##### **Why the Correct Approach Works**

The corrected approach uses the `in` operator to check for key existence, which prevents the `KeyError` by accessing the `'color'` key only if it is present in the `first_insect` dictionary.

* `in` Operator for Key Membership: The expression `'color' in first_insect` checks whether the key `'color'` exists in the `first_insect` dictionary. It returns `True` if the key is found, and `False` otherwise.

* Conditional Access: The `if 'color' in first_insect:` condition runs the code that reads the `'color'` value only if the key is present. If the key is missing, the code prints an informative message instead of raising an error.
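The `.get()` method from the debugging tips is an alternative to the `in` check. The sketch below uses a sample record and the placeholder default `'unknown'`, both chosen here for illustration.

```python
# Sample record in the shape used in this activity (it has no 'color' key)
insect = {'name': 'Butterfly', 'species': 'Lepidoptera', 'legs': 6, 'wings': 2}

# .get() returns the default instead of raising KeyError when the key is missing
color = insect.get('color', 'unknown')
print(f"The color of the {insect['name']} is: {color}")  # The color of the Butterfly is: unknown

# When the key exists, .get() returns its value and ignores the default
legs = insect.get('legs', 'unknown')
print(legs)  # 6
```

Note that `.get()` does not add the key to the dictionary; it only supplies a fallback value for that one lookup.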

#### Adding New Information - The AttributeError

Next, you'll try to add the color 'brown' to your `first_insect` dictionary. You'll mistakenly try to use the `.append()` method, which is designed for lists, not dictionaries.

***First, run the cell below and review the error.***

##### **Debugging Tips**

* **Understand the Error:** The `AttributeError` says you're trying to use a method (`.append()`) that doesn't exist for the dictionary data type.
* Know Your Data Structures:
    * Dictionaries use key-value pairs. To add a new key-value pair, simply assign a value to a new key: `dictionary['new_key'] = new_value`
    * Lists are ordered collections of items. To add an item to the end of a list, use the `.append()` method: `list.append(new_item)`
* Review Documentation: If you're unsure about the available methods for a particular data type, consult the official Python documentation.
    
##### **Still Stuck?**
* If you are still stuck, remove or comment out the code under `# Deliberate Error` and uncomment the code below the following line to resolve the error: `# Correct approach (commented out for reference)`. 
* Helpful Tip: You can comment and uncomment multiple lines of code at once by selecting them and pressing `Ctrl + /` on your keyboard.

```python
# Deliberate Error: Trying to use .append() on a dictionary
first_insect = all_insects_data[0]
first_insect.append('color', 'brown')

# Correct approach (commented out for reference)
# first_insect['color'] = 'brown'

print(first_insect)
```

AttributeError: 'dict' object has no attribute 'append'

##### **Why the Correct Approach Works**

The corrected approach, `first_insect['color'] = 'brown'`, successfully adds the 'color' key with the value 'brown' to the `first_insect` dictionary because it uses the correct syntax for assigning values to dictionary keys.

* Key-Value Assignment: In Python dictionaries, you can add a new key-value pair or update an existing one using the assignment operator `=`. The syntax is `dictionary[key] = value`.

* Dynamic Nature of Dictionaries: Dictionaries are mutable, meaning you can modify them by adding, updating, or removing key-value pairs after they've been created.
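To see the difference between the two data structures side by side, you can ask Python whether an object provides a method before calling it. This is a small sketch with sample values; `insect` and `names` are illustrative variables, and `hasattr()` is a built-in function.

```python
insect = {'name': 'Butterfly', 'legs': 6}  # a dictionary
names = ['Butterfly', 'Beetle']            # a list

# hasattr() reports whether an object provides a given method or attribute
print(hasattr(insect, 'append'))  # False
print(hasattr(names, 'append'))   # True

# Dictionaries: assign to a key to add or update a value
insect['color'] = 'brown'

# Lists: use .append() to add an item to the end
names.append('Ant')

print(insect)  # {'name': 'Butterfly', 'legs': 6, 'color': 'brown'}
print(names)   # ['Butterfly', 'Beetle', 'Ant']
```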

#### Handling Division by Zero - The ZeroDivisionError

Lastly, you'll calculate the ratio of legs to wings for each insect. You'll need to be careful about insects with zero wings to avoid division by zero errors.

***First, run the cell below and review the error.***

##### **Debugging Tips**
* **Understand the Error:** The `ZeroDivisionError` occurs when you try to divide a number by zero.
* Check for Zero Before Dividing: Use an `if` statement to ensure the divisor is not zero before performing the division.
* Handle the Zero Case Gracefully: Provide an appropriate message or alternative calculation when division by zero is encountered.


##### **Still Stuck?**
* If you are still stuck, remove or comment out the code under `# Deliberate Error` and uncomment the code below the following line to resolve the error: `# Correct approach (commented out for reference)`. 
* Helpful Tip: You can comment and uncomment multiple lines of code at once by selecting them and pressing `Ctrl + /` on your keyboard.

```python
# Deliberate Error: Potential division by zero
for insect in all_insects_data:
    leg_to_wing_ratio = insect['legs'] / insect['wings']

# Correct approach (commented out for reference)
# for insect in all_insects_data:
#     if insect['wings'] != 0:
#         leg_to_wing_ratio = insect['legs'] / insect['wings']
#     else:
#         leg_to_wing_ratio = "N/A (Insect has no wings)"
#     print(f"The leg-to-wing ratio for the {insect['name']} is: {leg_to_wing_ratio}")
```

```text
The leg-to-wing ratio for the Butterfly is: 3.0
The leg-to-wing ratio for the Beetle is: 2.0
The leg-to-wing ratio for the Ant is: N/A (Insect has no wings)
The leg-to-wing ratio for the Spider is: N/A (Insect has no wings)
```


##### **Why the Correct Approach Works**

The corrected approach, which includes an `if` condition to check for zero wings, prevents the `ZeroDivisionError` by avoiding division by zero.

* Zero Check: The condition `if insect['wings'] != 0` checks if the number of wings is not zero.

* Conditional Calculation: If the number of wings is not zero, the leg-to-wing ratio is calculated and stored in `leg_to_wing_ratio`.

* Handling Zero Wings: If the number of wings is zero, the string `"N/A (Insect has no wings)"` is stored in `leg_to_wing_ratio` instead of performing the division. The `print()` call then runs for every insect, so the case where the calculation is not possible is reported rather than crashing the loop.
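If you plan to do further calculations with the ratios, one option is to return `None` for the zero case rather than a text message, so that numeric results and missing results are easy to tell apart. This is a sketch of that variation, not the notebook's own solution; the function `leg_to_wing_ratio` and the `sample` list are hypothetical.

```python
def leg_to_wing_ratio(insect):
    """Return legs / wings, or None when the insect has no wings."""
    if insect['wings'] == 0:
        return None  # the ratio is undefined, so signal "no value"
    return insect['legs'] / insect['wings']

sample = [
    {'name': 'Butterfly', 'legs': 6, 'wings': 2},
    {'name': 'Ant', 'legs': 6, 'wings': 0},
]

# Build a name -> ratio lookup without ever dividing by zero
ratios = {insect['name']: leg_to_wing_ratio(insect) for insect in sample}
print(ratios)  # {'Butterfly': 3.0, 'Ant': None}
```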

## Activity Recap: Cleaning the Insect Collection Database

Congratulations on completing your first task as a data analyst at the insect collection company! You've successfully applied your Python debugging skills to identify and resolve various errors, ensuring the accuracy and integrity of the insect collection database. Let's recap what you've learned:

* FileNotFoundError: You've learned how to troubleshoot issues related to missing or inaccessible data files, ensuring your scripts can locate and read the crucial insect collection records.

* TypeError: You've navigated through type-related errors, ensuring that you're performing operations on compatible data types (like numerical measurements and species names) and converting them when necessary.

* IndexError: You've mastered list indexing and learned how to avoid accessing insect records beyond their valid range, preventing unexpected crashes in your data cleaning scripts.

* KeyError: You've gained confidence in working with insect data dictionaries, ensuring that you're accessing existing keys (like `'species'` or `'legs'`) and handling missing keys gracefully.

* AttributeError: You now understand the distinction between different data types and their available methods, avoiding attempts to use incompatible operations on insect data.

* ZeroDivisionError: You've learned how to prevent division by zero errors by implementing checks and providing alternative solutions when encountering such scenarios, especially when calculating ratios or proportions based on insect attributes.

**Remember:**

* Debugging is an essential part of programming. It's about systematically identifying and fixing errors in your code. Don't get discouraged if you encounter errors; it's a natural part of the learning process!