> **Disclaimer:** File I/O works differently depending on your working environment (for example, a local installation versus Colab). Make sure you know how to give your notebook access to the files.

# Introduction

First of all: **a file object and the file's content are NOT the same**. A [file object][File] is the Pythonic way of "communicating" with a file, e.g. to query its properties or manage its attributes. One of the many actions available through a file object is reading or writing the file's content. This "communication" is set up by the built-in function [`open()`][open], which creates the file object and also sets some of its initial properties, such as the mode.

[File]: https://docs.python.org/2/library/stdtypes.html#file-objects "File object"
[open]: https://docs.python.org/2/library/functions.html#open "open() documentation"

## Open and close

_File_ objects are created by the built-in function `open(file, mode='r')`, where _file_ is the path to the file (absolute, or relative to the current working directory) and _mode_ is the mode in which the file is opened. Several modes are available, but the most common ones are **'r'** for reading (the default), **'w'** for writing and **'a'** for appending.

It is not a healthy habit to leave open _File_ objects "hanging" in the file system, so we make sure to close them once we are done with them. The following scripts illustrate increasingly better syntax for working with a file.

> **Note:** This example uses the file example.txt, which is available [here](https://drive.google.com/drive/folders/1KQXg5CpZ8u59ybkvOnFPzo_WaLHxek_g?usp=sharing).

#### open() - 1

In [1]:
fname = "example.txt"

In [2]:
my_file = open(fname, 'r')
# Here do something with the file...

# my_file.closed
# my_file.close()

In [None]:
my_file.close()
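
One weakness of the pattern above is that if an error is raised between `open()` and `close()`, the `close()` call is never reached. A minimal sketch of one way to guard against this with `try`/`finally` follows. The scratch file `demo.txt` and the variable names are hypothetical, chosen only to keep the example self-contained.

```python
import os
import tempfile

# Hypothetical scratch file, so the sketch does not depend on example.txt
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
setup = open(demo_path, "w")
setup.write("some text\n")
setup.close()

my_file = open(demo_path, "r")
try:
    # Anything in this block may raise an error...
    content = my_file.read()
finally:
    # ...but this line runs either way, so the file is always closed
    my_file.close()

print(my_file.closed)
```

The `closed` attribute reports whether the file object has been closed.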

#### open() - 2

Finally, Python supports the following syntax to wrap it all compactly. **This is how it is usually done.**

In [None]:
with open(fname, 'r') as my_file:
    # Here do something with the file...
    pass

print("Done!")
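
To see what the `with` statement does for us, the sketch below checks the `closed` attribute after the block ends, both normally and after an error. The scratch file `demo.txt` and the simulated `ValueError` are assumptions made purely for illustration.

```python
import os
import tempfile

# Hypothetical scratch file, so the sketch does not depend on example.txt
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(demo_path, "w") as my_file:
    my_file.write("hello\n")

# The file was closed automatically when the with-block ended
closed_after_block = my_file.closed
print(closed_after_block)

try:
    with open(demo_path, "r") as my_file:
        raise ValueError("simulated failure while processing the file")
except ValueError:
    pass

# The file is closed even though the block was left because of an error
closed_after_error = my_file.closed
print(closed_after_error)
```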

# Reading

There are several ways to read the data of a file, and we will see two of them:
* Iteratively with a `for`-loop
* As a whole with the `read()` method

## Read with a `for`-loop

_File_ objects are their own iterators, and their "elements" are their lines. Iterating over a _File_ object with a _for_ loop iterates over the lines of the file. Note that each line includes the "\n" at its end (hence the double-spaced print).

In [7]:
fname = "example.txt"
line_counter = 1

with open(fname) as f:
    for line in f:
        print(line_counter, ":", line)
        line_counter += 1
    print("Done reading the file!")

1 : This is the first line.

2 : This is the second line.

3 : This is the third and last line.
Done reading the file!


In [4]:
fname = "example.txt"

with open(fname) as f:
    lines = list(f)

In [5]:
lines

['This is the first line.\n',
 'This is the second line.\n',
 'This is the third and last line.']

> **Note:** Why are the lines double-spaced in the output of the `for`-loop example above?
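
Each line read from the file already ends with "\n", and `print()` adds one of its own. One possible way to avoid the extra blank lines is to strip the trailing newline before printing, as sketched below. The scratch file and its three short lines are hypothetical, and `enumerate()` is used here in place of a manual counter.

```python
import os
import tempfile

# Hypothetical scratch file (written here only as setup for the sketch)
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo_path, "w") as f:
    f.write("first\nsecond\nthird")

numbered = []
with open(demo_path) as f:
    # enumerate() supplies the line number, starting from 1
    for line_number, line in enumerate(f, start=1):
        clean = line.rstrip("\n")  # drop the newline that came from the file
        numbered.append(f"{line_number} : {clean}")
        print(f"{line_number} : {clean}")
```

An alternative with the same effect is to keep the line as it is and call `print(line, end="")`.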

## Read with _read()_

This method is the simplest one, as it reads the entire content of the file into a single string.

In [18]:
fname = "example.txt"

with open(fname) as f:
    text = f.read()

In [19]:
text

'This is the first line.\nThis is the second line.\nThis is the third and last line.'

In [None]:
text.split("\n")
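
A detail worth knowing when splitting text into lines: if the text ends with a newline, `split("\n")` produces an extra empty string at the end, while `splitlines()` does not. The sample strings below are made up for illustration.

```python
text_without_final_newline = "first\nsecond\nthird"
text_with_final_newline = "first\nsecond\nthird\n"

# Without a final newline, split() gives one item per line
a = text_without_final_newline.split("\n")
print(a)  # ['first', 'second', 'third']

# With a final newline, split() adds an empty string at the end
b = text_with_final_newline.split("\n")
print(b)  # ['first', 'second', 'third', '']

# splitlines() ignores the final newline
c = text_with_final_newline.splitlines()
print(c)  # ['first', 'second', 'third']
```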

> **Your turn:** Read the file "players.txt". Can you tell how many lines it has?

### Solution

In [10]:
open?

In [11]:
with open("players.txt") as players_file:
    line_count = 0
    for line in players_file:
        line_count += 1

line_count

7

In [14]:
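with open("players.txt") as players_file:
    text = players_file.read()
    # count("\n") counts newline characters, not lines: the result is one short
    # here because the last line of the file does not end with "\n".
    line_count = text.count("\n")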

line_count

6
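
The two solutions disagree (7 versus 6) because the first counts lines while the second counts newline characters, and the last line of the file has no trailing newline. The sketch below reproduces the effect on a small hypothetical scratch file and shows `splitlines()` as one way to count lines from the full text.

```python
import os
import tempfile

# Hypothetical scratch file with three lines and NO final newline
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(demo_path, "w") as f:
    f.write("first\nsecond\nthird")

# Counting by iteration counts every line, including the last one
with open(demo_path) as f:
    lines_by_iteration = sum(1 for _ in f)

with open(demo_path) as f:
    text = f.read()

# Counting "\n" misses the last line, which has no newline after it
newline_characters = text.count("\n")

# splitlines() returns one item per line, with or without a final newline
lines_by_splitlines = len(text.splitlines())

print(lines_by_iteration, newline_characters, lines_by_splitlines)  # 3 2 3
```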

# Writing

### Writing methods

Similarly to `read()`, there is `write()` for writing. `write()` expects a single string and writes it to the file as-is, without adding a newline. Opening a file in 'w' mode creates the file if it does not exist, and erases its content if it does.

In [20]:
fname = "example.txt"

str1 = "I. This is the first line."
str2 = "II. This is the second line."
str3 = "III. This is the third and last line."

In [24]:
with open(fname, 'w') as f:
    f.write(str1 + '\n')
    f.write(str2 + '\n')
    f.write(str3)

In [25]:
lines = [
    f"{str1}\n",
    f"{str2}\n",
    f"{str3}\n"
]

print(lines)

with open(fname, 'w') as f:
    f.writelines(lines)
    # Same as:
    # for line in lines:
    #     f.write(line)

['I. This is the first line.\n', 'II. This is the second line.\n', 'III. This is the third and last line.\n']
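
Despite its name, `writelines()` does not add line separators, which is why each string in the list above ends with "\n". The sketch below shows the difference on a hypothetical scratch file with made-up words.

```python
import os
import tempfile

# Hypothetical scratch file and sample data
demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
words = ["alpha", "beta", "gamma"]

# Without newlines in the strings, everything ends up on one line
with open(demo_path, "w") as f:
    f.writelines(words)
with open(demo_path) as f:
    glued = f.read()
print(repr(glued))  # 'alphabetagamma'

# Adding "\n" to each string puts every word on its own line
with open(demo_path, "w") as f:
    f.writelines(word + "\n" for word in words)
with open(demo_path) as f:
    separated = f.read()
print(repr(separated))  # 'alpha\nbeta\ngamma\n'
```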


### Writing modes

In standard writing mode, indicated by 'w', a new file will be created and an existing file will be overwritten.

Compare the example above with the following:

In [29]:
with open(fname, 'w') as f:
    f.write(str1 + '\n')

with open(fname, 'w') as f:
    f.write(str2 + '\n')

with open(fname, 'w') as f:
    f.write(str3)

Testing...

In [27]:
with open(fname, 'r') as f:
    print(f.read())

III. This is the third and last line.


If we want to append the data to what is already in the file, then we should use the append mode, indicated by 'a'.

In [31]:
with open(fname, 'w') as f:
    f.write(str1 + '\n')

with open(fname, 'a') as f:
    f.write(str2 + '\n')

with open(fname, 'a') as f:
    f.write(str3)

Testing...

In [None]:
with open(fname, 'r') as f:
    print(f.read())

In [32]:
import os

In [33]:
os.listdir()

['.config', 'word length.txt', 'example.txt', 'players.txt', 'sample_data']

## Additional notes

* Many other file-related functionalities (copying, moving, deleting, checking existence, etc.) are available in other standard-library modules, such as `os`, `shutil` and `pathlib`.

* The concept of "opening" is very general and is used by many other **file-like** objects, including web pages, I/O streams and others.

* Two other common aspects of working with files are not covered here, and the reader is encouraged to explore them further by referring to the following:
    * Buffering - the _open()_ argument _buffering_ and the _File_ method _flush()_
    * Position - the _File_ methods _seek()_ and _tell()_

* File extensions (e.g. txt, csv, html, etc.) are irrelevant for the _open()_ functionality. They are used by the OS to relate files to their relevant application.

* For dealing with paths, see this [blog post about the Python pathlib module][1].
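
As a taste of the file-related functionality that lives outside `open()`, here is a minimal sketch using the standard-library modules `pathlib`, `shutil` and `tempfile`. The file names `notes.txt` and `notes_backup.txt` are hypothetical, and everything happens in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path

# Work in a temporary directory so no real files are touched
work_dir = Path(tempfile.mkdtemp())
original = work_dir / "notes.txt"        # the / operator joins path parts
backup = work_dir / "notes_backup.txt"

# write_text() opens the file, writes the string and closes the file
original.write_text("some notes\n")

# Existence check before the copy is made
exists_before_copy = backup.exists()
print(exists_before_copy)  # False

# Copy the file, then read the copy back
shutil.copy(original, backup)
copied_text = backup.read_text()
print(repr(copied_text))  # 'some notes\n'

# Delete the original file
original.unlink()
exists_after_delete = original.exists()
print(exists_after_delete)  # False
```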


[1]: https://treyhunner.com/2018/12/why-you-should-be-using-pathlib/
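
Two of the notes above, file-like objects and position, can be illustrated together. The sketch below uses `io.StringIO`, an in-memory file-like object from the standard library, to show `tell()` and `seek()`. The sample text is made up.

```python
import io

# An in-memory, file-like object: no file on disk is involved
buffer = io.StringIO("first line\nsecond line\n")

# Reading moves the current position forward
first = buffer.readline()
position_after_first = buffer.tell()
print(position_after_first)  # 11, the length of "first line\n"

# read() continues from the current position, not from the start
rest = buffer.read()
print(repr(rest))  # 'second line\n'

# seek(0) moves back to the beginning, so the data can be read again
buffer.seek(0)
first_again = buffer.readline()
print(repr(first_again))  # 'first line\n'
```

The same methods exist on _File_ objects returned by `open()`, although for files opened in text mode the number returned by `tell()` should be treated as an opaque value to pass back to `seek()`.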

# Exercises


## Exercise 1 <--- Homework 9/7/2024

> Part 1: Create a new empty file named "my_file.txt".<br/>
> Part 2: Write into the new file the words: "hello world".<br/>
> Part 3: Open the file and check its contents.

### Solution

In [None]:
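# Option 1
my_file = open("my_file.txt", 'w')  # 'w' mode creates the file if it does not exist
my_file.write("hello world")
my_file.close()

In [None]:
# Option 2
with open("my_file.txt", 'w') as my_file:
    my_file.write("hello world")

In [None]:
# Part 3: read the file back (or double-click it in Colab) and see "hello world"
with open("my_file.txt") as my_file:
    print(my_file.read())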

## Exercise 2

* Part 1 - Use the list of words to create a file called “word length.txt”, containing a line saying “The word \<word\> has \<n\> letters” for each word in the list.
* Part 2 - Read the file “word length.txt” and print its contents.

In [None]:
my_words = ['The', 'quick', 'brown', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']

### Solution

In [None]:
# part 1:
with open('word length.txt', 'w') as f:
    for word in my_words:
        f.write('The word {} has {} letters.\n'.format(word, len(word)))    # you could use f-strings just the same

# part 2:
with open('word length.txt', 'r') as f:
    print(f.read())