# Chapter 8 – Reading and Writing Files

### os.path.join() function

To make your program work on every OS (Windows, Linux, and OS X), you need to build file paths in a way that doesn't depend on the platform. Windows separates folders with a backslash (`\`), while Linux and OS X use a forward slash (`/`). Python's `os.path.join()` function handles this for you, see below.

In [1]:
import os
os.path.join('usr', 'bin', 'spam')

'usr/bin/spam'

**Note:** `os.path.join()` joins its arguments in the order given, using the folder separator of the OS the program is running on.

**Note2:** The later examples won't repeat `import os`, but you need to import it in every new program.

Here's another example that joins each filename in a list to a folder path.

In [6]:
myFiles = ['accounts.txt', 'details.csv', 'invite.docx']
for filename in myFiles:
    print(os.path.join('/usr/vilasboasmv', filename))

/usr/vilasboasmv/accounts.txt
/usr/vilasboasmv/details.csv
/usr/vilasboasmv/invite.docx

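To see what `os.path.join()` would return on each platform without switching machines, the sketch below calls the two standard-library modules that sit behind `os.path`: `posixpath` (Linux and OS X rules) and `ntpath` (Windows rules). Calling them directly is only an illustration and not something the original notebook does; in a real program, plain `os.path` picks the right one for you.

```python
import ntpath      # the Windows implementation of os.path
import posixpath   # the Linux / OS X implementation of os.path

parts = ('usr', 'bin', 'spam')

posix_path = posixpath.join(*parts)
windows_path = ntpath.join(*parts)

print(posix_path)    # usr/bin/spam
print(windows_path)  # usr\bin\spam
```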

### The Current Working Directory

You can ask Python for the current working directory with `os.getcwd()`, which returns it as a string. Another useful function is `os.chdir()`, which changes the current working directory.

We won't change the directory in this notebook, because it is linked to GitHub and doing so could cause problems during study.

In [8]:
os.getcwd()

'/home/vilasboasmv/Automate_Boring_Stuff/Chapter 8'

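Since `os.chdir()` is not run in this notebook, here is a minimal sketch of how it could be tried safely. It assumes a throwaway folder from the standard `tempfile` module (not part of the original notes) and switches back to the starting directory afterwards.

```python
import os
import tempfile

original_dir = os.getcwd()           # remember where we started

with tempfile.TemporaryDirectory() as scratch_dir:
    os.chdir(scratch_dir)            # relative paths now resolve inside scratch_dir
    inside_dir = os.getcwd()
    os.chdir(original_dir)           # go back before the temporary folder is deleted

print(inside_dir)
print(os.getcwd() == original_dir)   # True
```

Saving the result of `os.getcwd()` first is what makes it possible to undo the change.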
### Absolute vs. Relative Paths

**Absolute path:** always begins with the root folder.

**Relative path:** relative to the program's current working directory.

There are also two special folder names: dot (.) and dot-dot (..).

**dot:** means "this folder".

**dot-dot:** means the "parent folder".

Here's an example:

![alt text](https://automatetheboringstuff.com/images/000032.jpg "dot and dot-dot example")

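One way to check how dot and dot-dot are interpreted is to let Python simplify a path that contains them. The sketch below uses `normpath()`, a function not covered in these notes, from `posixpath` (the Linux / OS X flavour of `os.path`) so that the output is the same on every machine. The folder names are made up for illustration.

```python
import posixpath  # Linux / OS X flavour of os.path, used for predictable output

# Hypothetical paths, chosen only to show how . and .. are resolved.
same_folder = posixpath.normpath('/usr/bin/./spam')         # "." stays in /usr/bin
parent_folder = posixpath.normpath('/usr/bin/../lib/spam')  # ".." steps up to /usr
relative = posixpath.normpath('./eggs/../spam.txt')         # also works on relative paths

print(same_folder)    # /usr/bin/spam
print(parent_folder)  # /usr/lib/spam
print(relative)       # spam.txt
```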
### Creating New Folders

You can create new folders with the `os.makedirs()` function, which also creates any intermediate folders that don't exist yet. For example:

```python3
import os
os.makedirs('/usr/documents/newfolder')
```

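Writing to `/usr` usually needs administrator rights, so a safer way to experiment is inside a temporary folder. This is a sketch under that assumption; the `tempfile` module and the `exist_ok=True` argument are additions that the original example does not use.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base_dir:
    new_folder = os.path.join(base_dir, 'documents', 'newfolder')

    os.makedirs(new_folder)                 # creates 'documents' and 'newfolder' in one call
    os.makedirs(new_folder, exist_ok=True)  # no error even though the folder already exists

    created = os.path.isdir(new_folder)

print(created)  # True
```

Without `exist_ok=True`, the second call would raise `FileExistsError`.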
### Handling Absolute and Relative Paths
Calling `os.path.abspath(path)` will return a string of the absolute path of the argument. This is an easy way to convert a relative path into an absolute one.

Calling `os.path.isabs(path)` will return True if the argument is an absolute path and False if it is a relative path.

Calling `os.path.relpath(path, start)` will return a string of a relative path from the start path to path. If start is not provided, the current working directory is used as the start path.

See these examples:

In [5]:
print(os.path.abspath('.'))
print(os.path.abspath('./usr'))
print(os.path.isabs('.'))
print(os.path.isabs(os.path.abspath('.')))

/home/vilasboasmv/Automate_Boring_Stuff/Chapter 8
/home/vilasboasmv/Automate_Boring_Stuff/Chapter 8/usr
False
True


Here `os.path.relpath('.', '/')` returns the path from the root (`/`) to the current working directory. The result is the same string as `os.getcwd()` except that it lacks the leading root slash, so the comparison on the third line prints False.

In [12]:
print(os.path.relpath('.', '/'))
print(os.getcwd())
print(os.path.relpath('.', '/') == os.getcwd())

home/vilasboasmv/Automate_Boring_Stuff/Chapter 8
/home/vilasboasmv/Automate_Boring_Stuff/Chapter 8
False

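The `start` argument does not have to be the root. As a rough illustration with made-up folders, the sketch below shows a relative path that goes down, one that has to climb with dot-dot, and the case where both arguments are the same. It uses `posixpath`, the Linux / OS X flavour of `os.path`, so the results do not depend on the machine.

```python
import posixpath  # Linux / OS X flavour of os.path, used for predictable output

# Hypothetical folders, used only to show the three cases.
down = posixpath.relpath('/home/user/docs', '/home')   # target is below the start
up = posixpath.relpath('/etc', '/home/user')           # must climb two levels first
same = posixpath.relpath('/home/user', '/home/user')   # start and target are equal

print(down)  # user/docs
print(up)    # ../../etc
print(same)  # .
```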

Calling `os.path.dirname(path)` will return a string of everything that comes before the last slash in the path argument. Calling `os.path.basename(path)` will return a string of everything that comes after the last slash in the path argument.

Here is how both functions apply to the following file path:

> /home/vilasboasmv/downloads/Anaconda3-5.2.0-Linux-x86_64.sh

In [15]:
path = '/home/vilasboasmv/downloads/Anaconda3-5.2.0-Linux-x86_64.sh'
print(os.path.basename(path))
print(os.path.dirname(path))

Anaconda3-5.2.0-Linux-x86_64.sh
/home/vilasboasmv/downloads


If you want both values together in a tuple, use the `os.path.split()` function.

In [16]:
os.path.split(path)

('/home/vilasboasmv/downloads', 'Anaconda3-5.2.0-Linux-x86_64.sh')

You can also call the string `split()` method with `os.path.sep` (a string variable, not a function) to get a list of every folder in the path.

In [17]:
path.split(os.path.sep)

['', 'home', 'vilasboasmv', 'downloads', 'Anaconda3-5.2.0-Linux-x86_64.sh']

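As a quick sanity check, the two halves returned by `split()` can be joined back into the original path. The sketch below also uses `splitext()`, which these notes do not cover, to separate the file extension. It relies on `posixpath`, the Linux / OS X flavour of `os.path`, so the forward slashes behave the same everywhere.

```python
import posixpath  # Linux / OS X flavour of os.path, used for predictable output

path = '/home/vilasboasmv/downloads/Anaconda3-5.2.0-Linux-x86_64.sh'

folder, filename = posixpath.split(path)         # same pair as dirname() and basename()
rebuilt = posixpath.join(folder, filename)       # joining the pair gives the path back
stem, extension = posixpath.splitext(filename)   # splits at the last dot

print(rebuilt == path)  # True
print(stem)             # Anaconda3-5.2.0-Linux-x86_64
print(extension)        # .sh
```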
### Finding File Sizes and Folder Contents

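Calling `os.path.getsize(path)` will return the size in bytes of the file in the path argument. When the path is a folder, as in the example below, the value is the size of the directory entry itself (4096 bytes here, on Linux), not the total size of the files inside it. Calling `os.listdir(path)` will return a list of filename strings for each file in the path argument. (Note that this function is in the os module, not os.path.)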

In [6]:
print('The size of this directory is: '+ str(os.path.getsize('.'))+' bytes.')
print(os.listdir('.'))

The size of this directory is: 4096 bytes.
['.ipynb_checkpoints', 'Chapter 8 – Reading and Writing Files.ipynb']


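You can also sum the sizes of all the files in a directory. Note that `filename` has to be passed to `os.path.join()`; otherwise `os.path.getsize()` measures the folder itself on every iteration.

In [7]:
totalSizes = 0
for filename in os.listdir('/home/vilasboasmv/Downloads'):
    # Join the folder and the filename so that each item is measured in turn.
    totalSizes = totalSizes + os.path.getsize(os.path.join('/home/vilasboasmv/Downloads', filename))
print(totalSizes)

The number printed depends on the contents of the folder.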

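A slightly more defensive version of the same idea can be sketched as a function that counts only regular files and skips subfolders. The name `folder_size` and the demo files are hypothetical, and the temporary folder exists only so that the expected total is known in advance.

```python
import os
import tempfile

def folder_size(folder):
    """Return the total size in bytes of the regular files directly inside folder."""
    total = 0
    for filename in os.listdir(folder):
        full_path = os.path.join(folder, filename)
        if os.path.isfile(full_path):   # skip subfolders and other non-file entries
            total += os.path.getsize(full_path)
    return total

with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, 'a.txt'), 'w') as f:
        f.write('12345')                # 5 bytes
    with open(os.path.join(demo_dir, 'b.txt'), 'w') as f:
        f.write('1234567')              # 7 bytes
    os.makedirs(os.path.join(demo_dir, 'subfolder'))  # ignored by folder_size
    demo_total = folder_size(demo_dir)

print(demo_total)  # 12
```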

### Checking Path Validity

Calling `os.path.exists(path)` will return True if the file or folder referred to in the argument exists and will return False if it does not exist.

Calling `os.path.isfile(path)` will return True if the path argument exists and is a file, and will return False otherwise.

Calling `os.path.isdir(path)` will return True if the path argument exists and is a folder, and will return False otherwise.

Examples:

In [11]:
print(os.path.exists('/home'))
print(os.path.exists('/madeup'))
print(os.path.isfile('.'))
print(os.path.isfile('/home/vilasboasmv/Downloads/foot.JPG'))
print(os.path.isdir('.'))
print(os.path.isdir('/home/vilasboasmv/Downloads/foot.JPG'))

True
False
False
True
True
False

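The three checks can be combined into one small helper. This is only a sketch: the function name `describe_path` and the file names are made up, and a temporary folder stands in for real paths so the result does not depend on your machine.

```python
import os
import tempfile

def describe_path(path):
    """Return 'missing', 'file', 'folder' or 'other' for the given path."""
    if not os.path.exists(path):
        return 'missing'
    if os.path.isfile(path):
        return 'file'
    if os.path.isdir(path):
        return 'folder'
    return 'other'   # exists, but is neither a regular file nor a folder

with tempfile.TemporaryDirectory() as demo_dir:
    file_path = os.path.join(demo_dir, 'foot.txt')
    with open(file_path, 'w') as f:
        f.write('x')

    results = [
        describe_path(demo_dir),
        describe_path(file_path),
        describe_path(os.path.join(demo_dir, 'madeup')),
    ]

print(results)  # ['folder', 'file', 'missing']
```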

## The File Reading/Writing Process

There are three steps to reading or writing files in Python.

1. Call the `open()` function to return a File object.

2. Call the `read()` or `write()` method on the File object.

3. Close the file by calling the `close()` method on the File object.

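The three steps above can be sketched with a `with` statement, which the original notes do not use. It calls `close()` automatically when the indented block ends, even if an error occurs. The temporary folder is an assumption made so that the example leaves no files behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    hello_path = os.path.join(demo_dir, 'Hello.txt')

    # Step 1: open() returns a File object (here in write mode).
    with open(hello_path, 'w') as hello_file:
        # Step 2: call write() on the File object.
        hello_file.write('Hello World!\n')
    # Step 3: the file was closed automatically at the end of the block.

    # The same three steps, this time for reading.
    with open(hello_path) as hello_file:
        hello_content = hello_file.read()

    closed_after_block = hello_file.closed

print(hello_content)        # Hello World!
print(closed_after_block)   # True
```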
### Opening files with `open()` Function

See the example:

In [13]:
helloFile = open('./Hello.txt')

Called with only a filename, `open()` opens the file in read-only mode. In other words, Python can't write to or modify the file through this File object.

### Reading the file content

Now you can start reading it with `read()`.

In [14]:
helloContent = helloFile.read()
print(helloContent)

Hello World!



Alternatively, you can use the `readlines()` method to get a list of string values from the file, one string for each line of text. See the example.

In [15]:
sonnetFile = open('sonnet29.txt')
sonnetFile.readlines()    

["When, in disgrace with fortune and men's eyes,\n",
 'I all alone beweep my outcast state,\n',
 'And trouble deaf heaven with my bootless cries,\n',
 'And look upon myself and curse my fate,\n']

A list of lines is often easier to work with than one large string.

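Each string returned by `readlines()` still ends with `\n`. A common alternative, sketched here as an addition to the notes, is to loop over the File object directly and strip the newline from each line. The example first writes the four sonnet lines to a temporary file so it can run on its own.

```python
import os
import tempfile

sonnet_text = (
    "When, in disgrace with fortune and men's eyes,\n"
    "I all alone beweep my outcast state,\n"
    "And trouble deaf heaven with my bootless cries,\n"
    "And look upon myself and curse my fate,\n"
)

with tempfile.TemporaryDirectory() as demo_dir:
    sonnet_path = os.path.join(demo_dir, 'sonnet29.txt')
    with open(sonnet_path, 'w') as sonnet_file:
        sonnet_file.write(sonnet_text)

    # Looping over a File object yields one line at a time.
    with open(sonnet_path) as sonnet_file:
        stripped_lines = [line.rstrip('\n') for line in sonnet_file]

print(len(stripped_lines))   # 4
print(stripped_lines[1])     # I all alone beweep my outcast state,
```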
### Writing to Files

When you call `open()` with only a filename, the second argument defaults to read mode. You can also request read mode explicitly by passing 'r' as the second argument, as in this example:

```python3
open('./example.txt', 'r')
```

To write to a file, you need to open it in write mode by passing 'w' as the second argument. Write mode overwrites everything that is already in the file. To keep the existing data, use append mode by passing 'a' as the second argument; whatever you write is then added to the end of the file.

If the file doesn't exist when you call `open()` in write or append mode, Python creates a new, empty file.

After reading or writing a file, close it by calling the `close()` method before opening it again.

In [2]:
baconFile = open('bacon.txt', 'w')
baconFile.write('Hello World!\n')

13

In [3]:
baconFile.close()
baconFile = open('bacon.txt', 'a')
baconFile.write('Bacon is a not a vegetable.')

27

In [4]:
baconFile.close()
baconFile = open('bacon.txt')
content = baconFile.read()
baconFile.close()
print(content)

Hello World!
Bacon is a not a vegetable.

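The bacon.txt example shows append mode but never shows write mode replacing existing text. The sketch below makes that difference visible. The file name `demo.txt`, the temporary folder, and the `with` statements are assumptions added for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, 'demo.txt')   # does not exist yet

    with open(demo_path, 'w') as f:    # write mode creates the file
        f.write('first\n')

    with open(demo_path, 'w') as f:    # write mode again: old content is discarded
        f.write('second\n')

    with open(demo_path, 'a') as f:    # append mode: added after the existing content
        f.write('third\n')

    with open(demo_path) as f:         # default read mode
        final_content = f.read()

print(final_content)
```

The line 'first' is gone from the result because the second `open()` call used write mode.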

### Saving Variables with the shelve Module

You can save variables to disk and load them again later with the shelve module. See the example:

In [5]:
import shelve
shelfFile = shelve.open('mydata')
cats = ['Zophie', 'Pooka', 'Simon']
shelfFile['cats'] = cats
shelfFile.close()
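
The example above only stores the list. Reading it back could look like the sketch below, which assumes a temporary folder for the shelf files and uses `with` so that each shelf is closed automatically. Shelf values are accessed by key, much like a dictionary.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    shelf_path = os.path.join(demo_dir, 'mydata')

    # Save the list under the key 'cats'.
    with shelve.open(shelf_path) as shelf_file:
        shelf_file['cats'] = ['Zophie', 'Pooka', 'Simon']

    # Reopen the shelf later and load the data again.
    with shelve.open(shelf_path) as shelf_file:
        stored_keys = list(shelf_file.keys())
        stored_cats = shelf_file['cats']

print(stored_keys)  # ['cats']
print(stored_cats)  # ['Zophie', 'Pooka', 'Simon']
```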