# Chapter 10: Files and Exceptions

## Learning Objectives for Today's Lecture on Files and Exceptions

By the end of this chapter, you will be able to:

### 1. Understand File Operations
- Learn to work with files to enable programs to analyze large amounts of data efficiently.
- Understand how to read from and write to files in Python.

### 2. Error Handling Basics
- Learn to handle errors to prevent program crashes in unexpected situations.
- Understand how to make programs more robust when dealing with issues like missing files or bad data.

### 3. Introduction to Exceptions
- Explore how Python uses exceptions to manage errors that arise during program execution.
- Learn to use exception handling to address both innocent mistakes and malicious attempts to disrupt program functionality.

### 4. Utilizing the JSON Module
- Discover how the `json` module can be used to save user data for future use.
- Understand the process of loading and saving data in JSON format for persistence.

### 5. Improving Program Usability
- Learn how working with files and saving data enhances user experience by allowing programs to remember user input across sessions.
- Understand how handling exceptions makes programs more stable and reliable.

### 6. Building Robust Programs
- Develop skills to make your programs more applicable, usable, and stable in real-world scenarios.
- Address practical challenges related to file handling and error management effectively.


## Reading from a File

Reading the Contents of a File
- To begin, we need a file (`pi_digits.txt`) with a few lines of text in it. 
- Let’s start with a file that contains pi to 30 decimal places, with 10 decimal places per line:

In [40]:
from pathlib import Path

path = Path('pi_digits.txt')
contents = path.read_text()
print(contents)

3.1415926535
  8979323846
  2643383279


In [41]:
from pathlib import Path
path = Path('pi_digits.txt')
contents = path.read_text()  # the text may end with a newline, which print() shows as an extra blank line
contents = contents.rstrip()
print(contents)

3.1415926535
  8979323846
  2643383279


We can strip the trailing newline character when we read the contents of the file, by applying the `rstrip()` method immediately after calling
`read_text()`:

In [5]:
contents = path.read_text().rstrip() # this approach is called method chaining
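
To see exactly what `rstrip()` removes, here is a minimal sketch. It writes a small sample file into a temporary folder (the folder and the name `sample.txt` are assumptions made only for this illustration) and compares the raw text with the stripped text.

```python
from pathlib import Path
import tempfile

# Hypothetical sample file in a temporary folder, so no real file is touched.
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / 'sample.txt'
    path.write_text("3.1415926535\n  8979323846\n", encoding='utf-8')

    raw = path.read_text(encoding='utf-8')
    stripped = path.read_text(encoding='utf-8').rstrip()  # method chaining

print(repr(raw))       # repr() makes the trailing newline visible
print(repr(stripped))  # the trailing newline is gone; inner newlines remain
```

Only whitespace at the very end of the string is removed, so the line breaks between the digits are kept.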

## Relative and Absolute File Paths

- There are two main ways to specify paths in programming. 
- A relative file path tells Python to look for a given location relative to the current working directory, which is usually the directory where the currently running program file is stored.
- You can also tell Python exactly where the file is on your computer, regardless of where the program that’s being executed is stored. This is called an `absolute file path`.
- Windows systems use a backslash `(\)` instead of a forward slash `(/)` when displaying file paths, but you should use forward slashes in your code, even on Windows. 
- The `pathlib` library will automatically use the correct representation of the path when it interacts with your system, or any user’s system.
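
The difference between the two kinds of path can be sketched as follows. The folder name `text_files` is taken from the cells in this section; the file does not need to exist for this code to run, because the code only builds and inspects paths.

```python
from pathlib import Path

# The '/' operator joins path parts and works on every operating system.
relative = Path('text_files') / 'pi_digits.txt'

# Anchoring the relative path at the current working directory gives an absolute path.
absolute = Path.cwd() / relative

print(relative.is_absolute())
print(absolute.is_absolute())
print(absolute.name)   # the final part of the path, i.e. the file name
```

Printing `absolute` on your own machine shows where Python would actually look for the file.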

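In [None]:
# An absolute path as Windows displays it:
# C:\Users\userr\Documents\GitHub\ArewaDataScience\python-programming-fellowship\02_Python-Lessons\pi_digits.txt

In [None]:
# In an ordinary string, \n is a newline and \t is a tab, so a Windows path with backslashes should be written as a raw string (r"...")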

In [43]:
from pathlib import Path
path = Path(r"C:\Users\userr\Documents\GitHub\ArewaDataScience\python-programming-fellowship\02_Python-Lessons\text_files\pi_digits.txt")
contents = path.read_text()
contents = contents.rstrip()
print(contents)

3.1415926535
  8979323846
  2643383279


In [45]:
PI = 22/7  # a rough approximation of pi
PI

3.142857142857143

In [44]:
from pathlib import Path
path = Path("text_files/pi_digits.txt")  # forward slashes work on every operating system, including Windows
contents = path.read_text()
contents = contents.rstrip()
print(contents)

3.1415926535
  8979323846
  2643383279


In [None]:
from pathlib import Path
# A raw string cannot end with a backslash, so forward slashes are used here.
# This path names a folder; append a file name to it before calling read_text().
path = Path("../00_Stage-1-Getting-Started/")
contents = path.read_text()
contents = contents.rstrip()
print(contents)

## Accessing a File’s Lines
When you’re working with a file, you’ll often want to examine each line of the file. You might be looking for certain information in the file, or you might want to modify the text in the file in some way. 

For example:
- You might want to read through a file of weather data and work with any line that includes the word **sunny** in the description of that day’s weather.  
- In a news report, you might look for any line with the tag `<headline>` and rewrite that line with a specific kind of formatting.

You can use the `splitlines()` method to turn a long string into a list of lines, and then use a `for` loop to examine each line from a file, one at a time:


In [47]:
from pathlib import Path

path = Path('pi_digits.txt')
contents = path.read_text()
lines = contents.splitlines()

# print(lines)
for line in lines:
    print(line)

# print(lines[0])
# print(lines[1])
# print(lines[2])

3.1415926535
  8979323846
  2643383279
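
The weather example mentioned above can be sketched in the same way. The report text here is made up for illustration; in a real program it would come from `path.read_text()`.

```python
# Hypothetical weather report, one day per line.
contents = "Mon: sunny, 31\nTue: cloudy, 27\nWed: sunny, 33\n"

sunny_lines = []
for line in contents.splitlines():
    if 'sunny' in line:          # keep only the lines that mention sunny weather
        sunny_lines.append(line)

print(sunny_lines)
```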


In [48]:
from pathlib import Path

path = Path('weather.txt')

contents = path.read_text()
lines = contents.splitlines()
for line in lines:
    print(line)

23
32
21
30
36
39
40
32
30



## Working with a File’s Contents

In [None]:
from pathlib import Path

def fahrenheit(temp_in_c):
    """Convert a Celsius reading (read from the file as text) to Fahrenheit."""
    temp_in_c = float(temp_in_c)
    return temp_in_c * 9/5 + 32

path = Path('weather.txt')

contents = path.read_text()
lines = contents.splitlines()
for line in lines:
    print(fahrenheit(line))

When Python reads from a text file, it interprets all text in the file as a string. If you
read in a number and want to work with that value in a numerical context, you’ll
have to convert it to an integer using the `int()` function or a float using the `float()`
function.
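
A short sketch of why the conversion matters. The readings below are sample values written inline; they stand in for the text of a file such as `weather.txt`.

```python
# Hypothetical file contents: one temperature reading per line.
contents = "23\n32\n21\n"
lines = contents.splitlines()

# Without conversion, + joins the two strings instead of adding numbers.
joined = lines[0] + lines[1]

# After conversion, the values behave like numbers.
readings = [float(line) for line in lines]
average = sum(readings) / len(readings)

print(joined)
print(readings)
print(round(average, 2))
```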

In [49]:
from pathlib import Path
path = Path('pi_digits.txt')
contents = path.read_text()
lines = contents.splitlines()
pi_string = ''
for line in lines:
    pi_string += line
print(pi_string)
print(len(pi_string))

3.1415926535  8979323846  2643383279
36


In [50]:
from pathlib import Path
path = Path('pi_digits.txt')
contents = path.read_text()
lines = contents.splitlines()
pi_string = ''
for line in lines:
    pi_string += line.lstrip()
print(pi_string)
print(len(pi_string))

3.141592653589793238462643383279
32


## Large Files: One Million Digits
The same approach works on much larger files.
The file `pi_million_digits.txt` holds pi to 1,000,000 decimal places. The code below builds one long string from it
and prints only the first 50 decimal places so the output stays readable.

In [51]:
from pathlib import Path
path = Path('pi_million_digits.txt')
contents = path.read_text()
lines = contents.splitlines()
pi_string = ''
for line in lines:
    pi_string += line.lstrip()
    
print(f"{pi_string[:52]}...")
print(len(pi_string))

3.14159265358979323846264338327950288419716939937510...
1000002


In [53]:
name = "Arewa Data Science"
name[2:4]

'ew'

In [55]:
from pathlib import Path
path = Path('pi_million_digits.txt')
contents = path.read_text()
lines = contents.splitlines()
pi_string = ''
for line in lines:
    pi_string += line.lstrip()
    
birthday = input("Enter your birthday, in the form mmddyy: ") # 021094
if birthday in pi_string:
    print("Your birthday appears in the first million digits of pi!")
else:
    print("Your birthday does not appear in the first million digits of pi.")

Your birthday appears in the first million digits of pi!
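
The `in` operator only says whether the digits appear. If you also want to know where they appear, the string method `find()` is one option. This is a small sketch that uses a short stand-in string and a hard-coded search value (both assumptions for illustration) instead of the full file and `input()`.

```python
# Short stand-in for the million-digit string built above.
pi_string = "3.14159265358979323846264338327950288419716939937510"
birthday = "793238"  # hypothetical input; a real program would call input()

position = pi_string.find(birthday)  # index of the first match, or -1 if absent
if position == -1:
    print("Not found.")
else:
    print(f"Found at index {position}.")
```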


## Writing to a File
Writing a Single Line

Once you have a path defined, you can write to a file using the `write_text()` method.
To see how this works, let’s write a simple message and store it in a file instead of printing it to the screen:

In [58]:
from pathlib import Path

path = Path('programming_02.txt')
path.write_text("I love programming.")

19

## Writing Multiple Lines
To write more than one line, build a single string that contains newline characters (`\n`) and pass it to `write_text()`.
The `write_text()` method does a few things behind the scenes.
If the file that path points to doesn’t exist, it creates that file.
Also, after writing the string to the file, it makes sure the file is closed properly.
Files that aren’t closed properly can lead to missing or corrupted data.

In [None]:
from pathlib import Path
contents = "I love programming.\n"
contents += "I love creating new games.\n"
contents += "I also love working with data.\n"
path = Path('programming.txt')
path.write_text(contents)

**Note**
- Be careful when calling `write_text()` on a path object.
- If the file already exists, `write_text()` will erase the current contents of the file and write new contents to the file.
- Later in this chapter, you’ll learn to check whether a file exists using `pathlib`.
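
The overwriting behaviour can be seen in a small sketch. It works in a temporary folder with a made-up file name (`notes.txt`), and it uses `Path.open()` in append mode as one possible way to keep existing text; that append step is an addition for illustration, not something the notes above rely on.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / 'notes.txt'      # hypothetical file name
    path.write_text("first draft\n", encoding='utf-8')
    path.write_text("second draft\n", encoding='utf-8')   # replaces the first draft
    after_overwrite = path.read_text(encoding='utf-8')

    # To keep the existing text, open the file in append mode instead.
    with path.open('a', encoding='utf-8') as f:
        f.write("one more line\n")
    after_append = path.read_text(encoding='utf-8')

print(after_overwrite)
print(after_append)
```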

## Exceptions

- Python uses special objects called exceptions to manage errors that arise during a program’s execution. 
- Whenever an error occurs that makes Python unsure of what to do next, it creates an exception object. 
- If you write code that handles the exception, the program will continue running. 
- If you don’t handle the exception, the program will halt and show a traceback, which includes a report of the exception that was raised.

## Handling the ZeroDivisionError Exception

In [59]:
def div(num1, num2):
    result = num1 / num2
    return result

In [62]:
div(2,0)

ZeroDivisionError: division by zero

In [64]:
div(1,0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001)

1e+166

In [63]:
print(5/0)

ZeroDivisionError: division by zero

In [65]:
def div_2(num1, num2):
    if num2 == 0:
        raise ZeroDivisionError("Cannot divide by zero.")
    result = num1 / num2
    return result

In [67]:
div_2(2,0)

ZeroDivisionError: Cannot divide by zero.

## Using try-except Blocks

In [70]:
try:
    print(5/0)
except ZeroDivisionError:
    print("You can't divide by zero!")

You can't divide by zero!
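
An `except` clause can also give the exception object a name with `as`, which lets you read the message it carries. The sketch below reuses the `div_2()` function from the earlier cell; the names `error`, `result`, and `message` are chosen here for illustration.

```python
def div_2(num1, num2):
    if num2 == 0:
        raise ZeroDivisionError("Cannot divide by zero.")
    return num1 / num2

try:
    result = div_2(2, 0)
except ZeroDivisionError as error:
    result = None
    message = str(error)   # the text that was passed to raise
    print(f"Handled: {message}")
```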


## Using Exceptions to Prevent Crashes


In [72]:
print("Give me two numbers, and I'll divide them.")
print("Enter 'q' to quit.")

while True:
    first_number = input("\nFirst number: ")
    if first_number == 'q':
        break
    second_number = input("Second number: ")
    if second_number == 'q':
        break
    answer = int(first_number) / int(second_number)
    print(answer)

Give me two numbers, and I'll divide them.
Enter 'q' to quit.


ZeroDivisionError: division by zero

## The else Block

In [73]:
print("Give me two numbers, and I'll divide them.")
print("Enter 'q' to quit.")
while True:
    first_number = input("\nFirst number: ")
    if first_number == 'q':
        break
    second_number = input("Second number: ")
    if second_number == 'q':
        break
    try:
        answer = int(first_number) / int(second_number)
    except ZeroDivisionError:
        print("You can't divide by 0!")
    else:
        print(answer)



Give me two numbers, and I'll divide them.
Enter 'q' to quit.
6.0
2.0
You can't divide by 0!


## Handling the FileNotFoundError Exception

In [106]:
from pathlib import Path

path = Path('alice.txt')
contents = path.read_text(encoding='utf-8')

FileNotFoundError: [Errno 2] No such file or directory: 'alice.txt'

In [75]:
from pathlib import Path
path = Path('alice.txt')
try:
    contents = path.read_text(encoding='utf-8')
except FileNotFoundError:
    print(f"Sorry, the file {path} does not exist.")

Sorry, the file alice.txt does not exist.


##  Handling Multiple Exceptions
- A `try` block must be followed by at least one `except` block (or a `finally` block), and it can have several `except` blocks.
- Multiple `except` blocks allow us to handle each exception differently.
- The exception type named after each `except` keyword indicates which kind of exception that block handles.
- For example: `ZeroDivisionError`, `ValueError`, `IndexError`, `FileNotFoundError`, and so on.
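
As a sketch of several `except` blocks on one `try`, the division calculator can be extended to cope with text that is not a number. The function name `divide_text()` and its messages are assumptions made for this example.

```python
def divide_text(first, second):
    """Divide two numbers given as strings; return a float or an error message."""
    try:
        answer = int(first) / int(second)
    except ValueError:
        # int() could not convert one of the strings
        return "Please enter whole numbers only."
    except ZeroDivisionError:
        return "You can't divide by 0!"
    else:
        return answer

print(divide_text('6', '3'))
print(divide_text('six', '3'))
print(divide_text('6', '0'))
```

Python checks the `except` blocks from top to bottom and runs the first one whose type matches the exception that was raised.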

## Common Exceptions in Python

Python provides several built-in exceptions that can be handled using `try...except` blocks. Below is a list of notable ones:

### 1. General Exceptions
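- **Exception**: Base class for all built-in, non-system-exiting exceptions and for user-defined exceptions. Catching it directly is not recommended when a more specific exception type applies.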

### 2. System-Related Exceptions
- **SystemExit**: Raised when `sys.exit()` is called.
- **KeyboardInterrupt**: Raised when the user interrupts program execution (e.g., pressing Ctrl+C).
- **MemoryError**: Raised when an operation runs out of memory.

### 3. Arithmetic Exceptions
- **ZeroDivisionError**: Raised when dividing by zero.
- **OverflowError**: Raised when a numerical operation exceeds the limits of the numeric type.
- **FloatingPointError**: Defined for floating-point failures, but not currently raised by Python itself.

### 4. Attribute and Lookup Errors
- **AttributeError**: Raised when an invalid attribute reference or assignment is made.
- **KeyError**: Raised when a dictionary key is not found.
- **IndexError**: Raised when a sequence index is out of range.

### 5. Input/Output Exceptions
- **IOError**: In Python 3, an alias of `OSError`; raised when an I/O operation (e.g., file opening) fails.
- **FileNotFoundError**: Raised when a file or directory is requested but does not exist.
- **EOFError**: Raised when `input()` hits an end-of-file condition without reading any data.

### 6. Import Errors
- **ImportError**: Raised when an import statement fails.
- **ModuleNotFoundError**: A subclass of `ImportError` for missing modules.

### 7. Value and Type Errors
- **ValueError**: Raised when a function receives an argument of the right type but an inappropriate value.
- **TypeError**: Raised when an operation or function is applied to an object of inappropriate type.

### 8. Specific Built-In Errors
- **AssertionError**: Raised when an assertion statement fails.
- **NameError**: Raised when a variable is not found in the local or global namespace.
- **UnboundLocalError**: A subclass of `NameError` raised when a local variable is referenced before a value has been assigned to it.

### 9. OS-Related Exceptions
- **OSError**: Base class for OS-related errors.
  - **FileExistsError**: Raised when trying to create a file or directory that already exists.
  - **PermissionError**: Raised when there’s a permission issue.

### 10. Runtime and Syntax Errors
- **RuntimeError**: A generic error for issues that do not fall into other categories.
- **RecursionError**: Raised when the maximum recursion depth is exceeded.
- **SyntaxError**: Raised when there’s an error in Python syntax.

### 11. Warnings (Exception Subclasses, Usually Issued Through the `warnings` Module)
- **DeprecationWarning**: Warns about deprecated features.
- **SyntaxWarning**: Warns about questionable syntax.
- **RuntimeWarning**: Warns about suspicious runtime behavior.
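
Several of the relationships in this list can be checked directly with `issubclass()`. The snippet below is only a quick way to explore the hierarchy; the error it raises is created on purpose for the demonstration.

```python
# Each line checks one relationship between built-in exception classes.
print(issubclass(FileNotFoundError, OSError))
print(IOError is OSError)
print(issubclass(ModuleNotFoundError, ImportError))
print(issubclass(UnboundLocalError, NameError))
print(issubclass(KeyboardInterrupt, Exception))    # derives from BaseException instead
print(issubclass(DeprecationWarning, Exception))

# Because of these relationships, a handler for a parent class also catches its subclasses.
try:
    raise FileNotFoundError("demo error raised on purpose")
except OSError as error:
    caught = type(error).__name__

print(caught)
```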

## Analyzing Text

In [107]:
from pathlib import Path
path = Path('exceptions/alice.txt')
try:
    contents = path.read_text(encoding='utf-8')
except FileNotFoundError:
    print(f"Sorry, the file {path} does not exist.")
else:
    # Count the approximate number of words in the file:
    words = contents.split()
    num_words = len(words)
    print(f"The file {path} has about {num_words} words.")


The file exceptions\alice.txt has about 29594 words.


<pre><strong>
The count is a little high because extra information is provided by the publisher in the text file used here, 
but it’s a good approximation of the length of Alice in Wonderland.</strong></pre>

## Working with Multiple Files

In [113]:
from pathlib import Path

def count_words(path):
    """Count the approximate number of words in a file."""
    try:
        contents = path.read_text(encoding='utf-8')
    except FileNotFoundError:
        print(f"Sorry, the file {path} does not exist.")
    else:
        # Count the approximate number of words in the file:
        words = contents.split()
        num_words = len(words)
        print(f"The file {path} has about {num_words} words.")

In [111]:
path = Path('alice.txt')
# count_words(path)
path

WindowsPath('alice.txt')

In [109]:
path = Path('exceptions/alice.txt')
count_words(path)

The file exceptions\alice.txt has about 29594 words.


In [114]:
filenames = ['alice.txt', 'siddhartha.txt', 'moby_dick.txt', 'little_women.txt']
for filename in filenames:
    path = Path('exceptions/'+filename)
    count_words(path)

The file exceptions\alice.txt has about 29594 words.
Sorry, the file exceptions\siddhartha.txt does not exist.
The file exceptions\moby_dick.txt has about 215864 words.
The file exceptions\little_women.txt has about 189142 words.
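
If you want to use the counts later (for example, to add them up), one option is to return the number instead of printing it. The function `word_count()` below is a hypothetical variant of `count_words()`, and it runs against a small sample file in a temporary folder rather than the book files.

```python
from pathlib import Path
import tempfile

def word_count(path):
    """Return the approximate number of words in a file, or None if it is missing."""
    try:
        contents = path.read_text(encoding='utf-8')
    except FileNotFoundError:
        return None
    return len(contents.split())

with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / 'sample.txt'            # hypothetical file
    sample.write_text("Down the rabbit hole", encoding='utf-8')
    found = word_count(sample)
    missing = word_count(Path(folder) / 'nope.txt')  # deliberately not created

print(found, missing)
```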


## Failing Silently

### Using `pass` to Handle Exceptions Silently

In some cases, you may want your program to continue running even when an exception occurs, without notifying the user. To achieve this, you can use Python's `pass` statement in the `except` block. The `pass` statement explicitly tells Python to do nothing in that block.

Here’s how it works:

## Example: Silent Exception Handling
In [115]:
from pathlib import Path

def count_words(path):
    """Count the approximate number of words in a file."""
    try:
        contents = path.read_text(encoding='utf-8')
    except FileNotFoundError:
        pass
    else:
        # Count the approximate number of words in the file:
        words = contents.split()
        num_words = len(words)
        print(f"The file {path} has about {num_words} words.")

In [116]:
filenames = ['alice.txt', 'siddhartha.txt', 'moby_dick.txt', 'little_women.txt']
for filename in filenames:
    path = Path('exceptions/'+filename)
    count_words(path)

The file exceptions\alice.txt has about 29594 words.
The file exceptions\moby_dick.txt has about 215864 words.
The file exceptions\little_women.txt has about 189142 words.


## Deciding Which Errors to Report

### Reporting Errors vs. Silent Failures in Python Programs

When designing programs, deciding whether to report errors to users or let the program fail silently depends on the context and user expectations. Below are key considerations to guide this decision:

## When to Report Errors:
- **User Awareness**: If users know which texts or files should be analyzed, they may appreciate being informed about why certain texts were not processed.
- **Clear Communication**: Providing error messages helps users understand what went wrong and allows them to take corrective action.

## When to Let the Program Fail Silently:
- **Unnecessary Information**: If users do not know which specific inputs are expected (e.g., analyzing multiple unknown files), reporting missing items might confuse or overwhelm them.
- **Improved Usability**: Avoiding unnecessary error messages can enhance the user experience by focusing only on relevant outcomes.

## Balancing Error Reporting:
- Python’s error-handling structures, such as `try-except` blocks, allow fine-grained control over error communication.
- The level of detail shared with users is a design decision that should prioritize usability.
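
One possible middle ground between reporting every error and failing silently is to collect the problems and let the caller decide whether to show them. The sketch below assumes a hypothetical function `count_all()` with a `report_missing` switch; neither name comes from the lesson's code.

```python
from pathlib import Path
import tempfile

def count_all(paths, report_missing=True):
    """Count words per file; optionally report files that could not be found."""
    counts, missing = {}, []
    for path in paths:
        try:
            contents = path.read_text(encoding='utf-8')
        except FileNotFoundError:
            missing.append(path.name)      # remember the problem instead of stopping
        else:
            counts[path.name] = len(contents.split())
    if report_missing and missing:
        print(f"Skipped missing files: {', '.join(missing)}")
    return counts, missing

with tempfile.TemporaryDirectory() as folder:
    present = Path(folder) / 'alice.txt'           # hypothetical sample file
    present.write_text("Alice was beginning to get very tired", encoding='utf-8')
    absent = Path(folder) / 'siddhartha.txt'       # deliberately not created
    counts, missing = count_all([present, absent])

print(counts)
```

Calling `count_all(paths, report_missing=False)` gives the silent behaviour while still returning the list of missing files for anyone who wants it.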

## External Dependencies and Exceptions:
- Programs are more likely to encounter exceptions when they depend on external factors like:
  - **User Input**: Invalid or unexpected input from users.
  - **File Existence**: Missing or inaccessible files.
  - **Network Availability**: Issues with connectivity or external services.
- A well-written and thoroughly tested program minimizes internal errors (e.g., syntax or logical errors), but external dependencies remain a common source of exceptions.

## Key Takeaway:
With experience, you’ll develop an intuition for where to include exception-handling blocks and decide the appropriate amount of information to share with users.

For more guidance on exception handling, refer to [Python’s Official Documentation](https://docs.python.org/3/tutorial/errors.html).


## Storing Data

The <strong>JSON</strong> (JavaScript Object Notation) format was originally developed for
JavaScript. However, it has since become a common format used by many languages,
including Python.

### Using `json.dumps()` and `json.loads()`

Here’s how you can use `json.dumps()` and `json.loads()` to store and retrieve data:

1. **Storing Data with `json.dumps()`**:
   - Write a program to store a set of numbers.
   - Use the `json.dumps()` function to convert the data to the JSON format.
   - The function takes one argument: a piece of data to be converted into JSON format.
   - It returns a string representation of the JSON data.
   - You can then write this string to a data file.

2. **Reading Data with `json.loads()`**:
   - Write another program to read the JSON data back into memory.
   - Use the `json.loads()` function to convert the JSON string back into its original Python data structure.

3. **Example Workflow**:
   - First program:
     - Use `json.dumps()` to convert and store data in a file.
   - Second program:
     - Use `json.loads()` to read and load data from the file back into memory.


In [119]:
from pathlib import Path
import json

numbers = [2, 3, 5, 7, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21,]
path = Path('numbers.json')
contents = json.dumps(numbers)
path.write_text(contents);

Now we’ll write a separate program that uses `json.loads()` to read the
list back into memory:

In [120]:
from pathlib import Path
import json

path = Path('numbers.json')
contents = path.read_text()
numbers = json.loads(contents)
print(numbers)

[2, 3, 5, 7, 11, 13, 14, 15, 16, 17, 18, 19, 20, 21]
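
The same two functions work for other data structures, with one thing to keep in mind: JSON has fewer types than Python, so some values come back in a slightly different form. The record below is made-up sample data used only to show the round trip.

```python
import json

# Hypothetical record containing a tuple, which JSON has no type for.
record = {'name': 'Aremu', 'scores': (90, 85), 'active': True}

text = json.dumps(record)      # Python object -> JSON string
restored = json.loads(text)    # JSON string -> Python object

print(text)
print(restored)
print(type(restored['scores']).__name__)   # the tuple comes back as a list
```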


## Saving and Reading User-Generated Data

### Saving Data with `json`

Saving data with `json` is particularly useful when working with user-generated data. Without storing user information, the data would be lost once the program stops running. 

Here’s an example scenario:

1. **Prompting for User Input**:
   - The program asks for the user's name the first time it is run.
   - The name is stored using `json`.

2. **Remembering User Data**:
   - On subsequent runs, the program retrieves and remembers the user's name.

3. **Storing the User's Name**:
   - Begin by writing code to store the user's name in a JSON file for later use.

This approach ensures a seamless and personalized user experience across sessions.


In [121]:
from pathlib import Path
import json

username = input("What is your name? ")
path = Path('username.json')
contents = json.dumps(username)
path.write_text(contents)
print(f"We'll remember you when you come back, {username}!")

We'll remember you when you come back, Aremu Bala!


Now let’s write a new program that greets a user whose name has
already been stored:

In [122]:
from pathlib import Path
import json

path = Path('username.json')
contents = path.read_text()

username = json.loads(contents)
print(f"Welcome back, {username}!")

Welcome back, Aremu Bala!


We could write a try-except block here to respond appropriately
if username.json doesn’t exist, but instead we’ll use a handy method from the
pathlib module:

In [126]:
from pathlib import Path
import json

path = Path('username.json')
if path.exists():
    contents = path.read_text()
    username = json.loads(contents)
    print(f"Welcome back, {username}!")
else:
    username = input("What is your name? ")
    contents = json.dumps(username)
    path.write_text(contents)
    print(f"We'll remember you when you come back, {username}!")

Welcome back, Aremu Bala!
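
For comparison, here is a sketch of the try-except alternative mentioned above. The helper name `load_username()` is an assumption for this example. It also handles a file that exists but does not hold valid JSON, using `json.JSONDecodeError`, and it runs in a temporary folder so it does not touch your real `username.json`.

```python
from pathlib import Path
import json
import tempfile

def load_username(path):
    """Return the stored username, or None if the file is missing or not valid JSON."""
    try:
        contents = path.read_text(encoding='utf-8')
        return json.loads(contents)
    except FileNotFoundError:
        return None          # nothing stored yet
    except json.JSONDecodeError:
        return None          # file exists but does not contain valid JSON

with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / 'username.json'
    before = load_username(path)                      # file not created yet
    path.write_text(json.dumps("Aremu Bala"), encoding='utf-8')
    after = load_username(path)
    path.write_text("", encoding='utf-8')             # simulate an empty, damaged file
    damaged = load_username(path)

print(before, after, damaged)
```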


## Refactoring

In [127]:
from pathlib import Path
import json
def greet_user():
    """Greet the user by name."""
    path = Path('username.json')
    if path.exists():
        contents = path.read_text()
        username = json.loads(contents)
        print(f"Welcome back, {username}!")
    else:
        username = input("What is your name? ")
        contents = json.dumps(username)
        path.write_text(contents)
        print(f"We'll remember you when you come back, {username}!")

greet_user()

Welcome back, Aremu Bala!


Let’s refactor greet_user() so it’s not doing so many different tasks. 

We’ll start by moving the code for retrieving a stored username to a separate function:

In [128]:
from pathlib import Path
import json
def get_stored_username(path):
    """Get stored username if available."""

    if path.exists():
        contents = path.read_text()
        username = json.loads(contents)
        return username
    else:
        return None
    
def greet_user():
    """Greet the user by name."""
    path = Path('username.json')
    username = get_stored_username(path)
    if username:
        print(f"Welcome back, {username}!")
    else:
        username = input("What is your name? ")
        contents = json.dumps(username)
        path.write_text(contents)
        print(f"We'll remember you when you come back, {username}!")


greet_user()

Welcome back, Aremu Bala!


In [129]:
from pathlib import Path
import json
def get_stored_username(path):
    """Get stored username if available."""
    if path.exists():
        contents = path.read_text()
        username = json.loads(contents)
        return username
    else:
        return None
def get_new_username(path):
    """Prompt for a new username."""
    username = input("What is your name? ")
    contents = json.dumps(username)
    path.write_text(contents)
    return username
def greet_user():
    """Greet the user by name."""
    path = Path('username.json')
    username = get_stored_username(path)
    
    if username:
        print(f"Welcome back, {username}!")
    else:
        username = get_new_username(path)
        print(f"We'll remember you when you come back, {username}!")

greet_user()

Welcome back, Aremu Bala!



## Chapter Summary: Files and Exceptions

In this chapter, you learned the following:

1. **Reading Files**:
   - How to read the entire contents of a file.
   - Techniques to process the file's contents one line at a time when needed.

2. **Writing to Files**:
   - Methods to write as much text as desired to a file.

3. **Handling Exceptions**:
   - Understanding exceptions.
   - Learning how to handle common exceptions encountered in programs.

4. **Storing Data**:
   - Techniques for storing Python data structures.
   - Saving user-provided information so that users do not have to start over each time they run a program.
