# File I/O

## File Directory
* Your file explorer shows all the files stored on your computer, organized into folders (also called directories).
* Each file has a path, which is the hierarchy of folders you need to follow to find the file.
  * In Windows, paths use backslashes and start with a drive letter: C:\Users\YourUsername\Programming\My_Code.py
  * In Unix (Mac or Linux), paths use forward slashes and start at the root folder (/): /Users/YourUsername/Programming/My_Code.py
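
As a rough illustration of how a path breaks down into folders and a file name, the sketch below parses two paths in the style of the bullets above with Python's `pathlib` module. The `Pure...` classes only work with the path text, so nothing needs to exist on disk; the variable names are made up for this example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Example paths only; these classes never touch the disk.
windows_path = PureWindowsPath(r"C:\Users\YourUsername\Programming\My_Code.py")
unix_path = PurePosixPath("/Users/YourUsername/Programming/My_Code.py")

print(windows_path.name)    # the file name, including its extension
print(windows_path.parent)  # the folder that contains the file
print(unix_path.parts)      # every level of the hierarchy, in order
print(unix_path.suffix)     # just the extension
```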

## Opening Files
* To open a file, use the open() function. The two parameters you'll use most are filename (required) and mode (optional).
  * Filename is the name of your file, as a string in quotes.
    * If the file is in the same folder as your code, all you need is the name. If it's in a different folder, you need the entire path.
    * Make sure to include the correct extension (for example, .txt or .csv).
* open() returns a file object that you can use later to read from or write to the file.

In [3]:
# Mode 'w' (write) creates the file if it doesn't exist yet; if it does exist, its contents are erased
f = open("my_file.txt", 'w')
# monday.txt already exists and we only want to read it, so the default mode ('r') is fine
monday = open("monday.txt")

## Closing Files
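* Make sure to close your files! An open file holds on to system resources, and data you've written may not be saved to disk until the file is closed.
* There are multiple ways to do this: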

In [None]:
# One option is to open your file, set it as a variable, then close it later

file = open("my_file.txt")
# file processing
file.close()

# If your code runs into an error during the file processing, file.close() never runs and the file stays open

In [None]:
# Option 2
# To ensure the file closes properly, even if there are errors in your file processing, you can use a try-finally block

file = open("my_file.txt")
try:
  # file processing
  pass
finally:
  file.close()

In [None]:
# Option 3
# If you open your file using a with statement, the file closes automatically when the block ends, even if an error occurs during the file processing

with open("my_file.txt") as file:
  # file processing
  pass
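
To see the automatic closing in action, here is a small sketch that raises an error on purpose inside a with block and then checks the file object's `closed` attribute. It assumes a throwaway file in a temporary folder (the `scratch_path` name is made up for this example) so it doesn't touch your real files.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary folder.
scratch_path = os.path.join(tempfile.mkdtemp(), "my_file.txt")

with open(scratch_path, "w") as file:
    file.write("hello\n")

try:
    with open(scratch_path) as file:
        first_line = file.readline()
        # Simulate something going wrong during file processing.
        raise ValueError("simulated error during file processing")
except ValueError:
    pass

# The with statement closed the file even though an error was raised.
print(file.closed)
```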

## File Mode
* The mode is the second, optional parameter to the open function, and allows you to put the file in "read-only" or "write-only" mode
* Most commonly used modes:
  * 'r' open for reading (default)
  * 'w' open for writing, which creates the file if it doesn't exist and truncates (erases) it if it does

### All Modes
* 'r' open for reading (default). The file must already exist. Starts at the beginning of the file.
* 'r+' open for reading and writing. The file must already exist, and it is not truncated. Starts at the beginning of the file.
* 'w' open for writing. Create the file if it does not exist, otherwise truncate (erase) it. Starts at the beginning of the file.
* 'w+' open for reading and writing. Create the file if it does not exist, otherwise truncate it. Starts at the beginning of the file.
* 'a' open for writing. Create the file if it doesn't exist, and append to the end of the file instead of overwriting. Every write ends up at the current end of the file.
* 'a+' open for reading and writing. Create the file if it doesn't exist, and append to the end of the file instead of overwriting. Every write ends up at the current end of the file.
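
The difference between 'w' and 'a' is easiest to see by trying both on the same file. The following is a minimal sketch, assuming a throwaway file in a temporary folder (`scratch_path` and the file name are made up for this example).

```python
import os
import tempfile

# Hypothetical scratch file so nothing important gets overwritten.
scratch_path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(scratch_path, "w") as f:   # 'w' creates the file
    f.write("first\n")

with open(scratch_path, "a") as f:   # 'a' keeps "first" and adds to the end
    f.write("second\n")

with open(scratch_path) as f:        # default mode 'r'
    after_append = f.read()

with open(scratch_path, "w") as f:   # 'w' again: the earlier lines are erased
    f.write("third\n")

with open(scratch_path) as f:
    after_truncate = f.read()

print(repr(after_append))
print(repr(after_truncate))
```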

## Reading Files
Once you have a file object that was opened for reading, you can read the file.
There are three methods to read data from a file:
* .read() Reads the entire file into a single string (which contains a '\n' at the end of each line)
* .readline() Reads the next line of the file into a string, including the '\n' at the end
* .readlines() Reads the entire file into a list, where each element in the list is a string representing one line, including the '\n' at the end

In [4]:
with open("monday.txt", 'r') as monday:
  output = monday.read()
  print(output)

Happy Monday everyone!
What is Lorem Ipsum?
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.

Why do we use it?
It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout. The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using 'Content here, content here', making it look like readable English. Many desktop publishing packages and web page editors no

In [7]:
with open("monday.txt", 'r') as monday:
  line = monday.readline()
  print(line)

Happy Monday everyone!



In [8]:
with open("monday.txt", 'r') as monday:
  lines = monday.readlines()
  for line in lines:
    print(line)

Happy Monday everyone!

What is Lorem Ipsum?

Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.



Why do we use it?

It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout. The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using 'Content here, content here', making it look like readable English. Many desktop publishing packages and web page edito
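
The blank lines in the output above appear because each string from .readlines() still ends in '\n', and print() adds another newline of its own. One way around that, sketched below, is to loop over the file object directly and strip the trailing newline from each line. The sketch uses a made-up two-line stand-in for monday.txt in a temporary folder.

```python
import os
import tempfile

# Hypothetical two-line stand-in for monday.txt.
scratch_path = os.path.join(tempfile.mkdtemp(), "monday_demo.txt")
with open(scratch_path, "w") as f:
    f.write("Happy Monday everyone!\nWhat is Lorem Ipsum?\n")

with open(scratch_path) as monday:
    raw_lines = monday.readlines()  # each string keeps its trailing '\n'

with open(scratch_path) as monday:
    # A file object can be looped over line by line without readlines().
    clean_lines = [line.rstrip("\n") for line in monday]

print(raw_lines)
for line in clean_lines:
    print(line)  # no extra blank line between lines now
```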

In [9]:
'''
Exercise
Create a program that reads a text file named "example.txt" and outputs the number of lines in the file.
(Assume your text file is in the same directory as your Python file)
Hint: Open the file in read-only mode, and use the .readlines() function.

'''

with open('example.txt') as f:
  lines = f.readlines()
  output = len(lines)
  print(output)

17


In [12]:
'''Exercise
You are a data analyst who must analyze a company's sales data to determine which products are selling the best.
The data is stored in a CSV file named "sales_data.csv". The file contains the following columns: Product Name, Date Sold, Units Sold, Price per Unit, Total Sale
Write a Python script that reads the data from the CSV file and calculates the total revenue generated by each product.
Make sure to ignore the first line, since that's the header.
You can go through the file line by line, and split each line into strings based on commas to get the individual column values.
You can keep track of each product and its total sale in a dictionary.
The only two relevant columns are Product Name and Total Sale.

https://docs.python.org/3/library/csv.html
'''
sales_dict = {}
with open("sales_data.csv") as data:
  lines = data.readlines()
  # Start at index 1 to skip the header line
  for line in lines[1:]:
    product_name, date_sold, units_sold, price_per_unit, total_sale = line.strip().split(',')
    # Add this sale to the product's running total (starting from 0 the first time we see it)
    sales_dict[product_name] = sales_dict.get(product_name, 0) + float(total_sale)
  print(sales_dict)

{'Jibbers': 62.37, 'Jabbers': 94.5, 'Willers': 45.0, 'Wonkers': 96.0}


In [18]:
sales_dict = {}
with open("sales_data.csv") as data:
  next(data)  # skip the header line
  output = data.readlines()
  print(output)
for o in output:
  product_name, date, quantity, price, total = o.strip().split(",")
  # Add to the product's running total instead of overwriting it
  sales_dict[product_name] = sales_dict.get(product_name, 0) + float(total)
print(sales_dict)

['Jibbers,12/1/2023,11,5.67,62.37\n', 'Jabbers,11/30/2023,14,6.75,94.5\n', 'Willers,10/12/2023,10,4.5,45\n', 'Wonkers,12/3/2023,12,8,96\n']
{'Jibbers': 62.37, 'Jabbers': 94.5, 'Willers': 45.0, 'Wonkers': 96.0}
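
The exercise links to Python's csv module, which can do the comma splitting for you. Below is a hedged sketch of the same idea using `csv.DictReader`. It assumes the header row uses exactly the column names listed in the exercise, and it reads from an in-memory stand-in for sales_data.csv (including one made-up repeat row) so it can run on its own.

```python
import csv
import io

# Stand-in for sales_data.csv; header names are assumed from the exercise text.
sample_csv = io.StringIO(
    "Product Name,Date Sold,Units Sold,Price per Unit,Total Sale\n"
    "Jibbers,12/1/2023,11,5.67,62.37\n"
    "Jabbers,11/30/2023,14,6.75,94.5\n"
    "Jibbers,12/2/2023,2,5.67,11.34\n"  # made-up repeat row to show the running total
)

revenue_by_product = {}
# DictReader uses the header line as keys, so there is no header to skip by hand.
for row in csv.DictReader(sample_csv):
    name = row["Product Name"]
    revenue_by_product[name] = revenue_by_product.get(name, 0.0) + float(row["Total Sale"])

print(revenue_by_product)
```

With a real file, you would pass the file object from `open("sales_data.csv", newline="")` to `csv.DictReader` instead of the in-memory stand-in.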


## Writing to Files
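Make sure to open your file in a mode that allows writing ('w', 'a', or one of the '+' modes); the default mode is read-only.
There are two methods to write data to a file:
* .write(S) Writes the string S to the file exactly as given. No newline is added, so include '\n' yourself if you want the line to end.
* .writelines(L) For a list L containing strings, writes each string to the file in order. No newlines are added between them, so each string should already end in '\n' if it's meant to be its own line.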
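
Here is a short sketch of how the two methods treat newlines, assuming a throwaway file in a temporary folder (`scratch_path` and the file name are made up for this example). Strings written without '\n' run together on the same line.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary folder.
scratch_path = os.path.join(tempfile.mkdtemp(), "write_demo.txt")

with open(scratch_path, "w") as f:
    f.write("one")                     # no newline, so the next write continues this line
    f.write("two\n")
    f.writelines(["three", "four\n"])  # also runs together: "threefour"
    # Adding '\n' to each item puts every item on its own line.
    f.writelines(item + "\n" for item in ["five", "six"])

with open(scratch_path) as f:
    written = f.read()

print(repr(written))
```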


In [20]:
sales_data = ['Jibbers,12/1/2023,11, 5.67\n', 'Jabbers,11/30/2023,14, 6.75\n', 'Willers,10/12/2023,10, 4.50\n', 'Wonkers,12/3/2023,12, 8.00\n']

with open("new_sales_data.csv", 'w') as file:
  file.writelines(sales_data)

In [21]:
# This is how we can check if it worked
with open("new_sales_data.csv", 'r') as file:
  check = file.readlines()

if check == sales_data:
  print("The data is the same!")
else:
  print("The data was not written/read properly.")

The data is the same!


In [27]:

'''
Exercise
Create a program that asks the user for their name and favorite color, then writes this information to a new text file named "user_info.txt".
Write the name on the first line of the file, and the favorite color on the second line of the file.

'''

username = input("Enter your name: ")
user_color = input("Enter your favorite color: ")

with open("user_info.txt", 'w') as f:
  f.write(username + '\n' + user_color)


Enter your name: Pat
Enter your favorite color: Blue


In [28]:
with open("user_info.txt", 'r') as file:
  check = file.read()
  print(check)

Pat
Blue


## File Formatting in Windows vs Unix
* Text files store line endings differently depending on whether they were created on Windows or Unix (Mac/Linux)
  * In Windows, every line ends with \r\n (\r is a carriage return, \n is a line feed)
  * In Unix, every line ends with just \n
* This usually doesn't cause issues for simple file processing, because Python converts \r\n to \n by default when it reads a file in text mode. The raw bytes stored on disk still differ between operating systems, though.
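
To make the difference visible, the sketch below writes a file with Windows-style line endings as raw bytes and then reads it back twice: once with the default settings and once with `newline=""`, which tells open() to leave line endings untouched. The file name and variable names are made up for this example.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary folder.
scratch_path = os.path.join(tempfile.mkdtemp(), "windows_style.txt")

# Write raw bytes ('wb') so the file has Windows line endings on any operating system.
with open(scratch_path, "wb") as f:
    f.write(b"first\r\nsecond\r\n")

with open(scratch_path) as f:              # default text mode: \r\n is read as \n
    translated = f.read()

with open(scratch_path, newline="") as f:  # newline="" turns the translation off
    untranslated = f.read()

print(repr(translated))
print(repr(untranslated))
```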