# Lecture 3

## I/O (Reading From and Writing To Files)
## Navigating The File System
----

## I/O

### Reading

Before any file operation we need to `open` the file.

In [1]:
f = open('Linux_2k.log')

In [2]:
print(f)

<_io.TextIOWrapper name='Linux_2k.log' mode='r' encoding='cp1250'>


In [3]:
f.read()

'Jun 14 15:16:01 combo sshd(pam_unix)[19939]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=218.188.2.4 \nJun 14 15:16:02 combo sshd(pam_unix)[19937]: check pass; user unknown\nJun 14 15:16:02 combo sshd(pam_unix)[19937]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=218.188.2.4 \nJun 15 02:04:59 combo sshd(pam_unix)[20882]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=220-135-151-1.hinet-ip.hinet.net  user=root\nJun 15 02:04:59 combo sshd(pam_unix)[20884]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=220-135-151-1.hinet-ip.hinet.net  user=root\nJun 15 04:06:18 combo su(pam_unix)[21416]: session opened for user cyrus by (uid=0)\nJun 15 04:06:19 combo su(pam_unix)[21416]: session closed for user cyrus\nJun 15 04:06:20 combo logrotate: ALERT exited abnormally with [1]\nJun 15 04:12:42 combo su(pam_unix)[22644]: session opened for user news by (uid=0)\nJun 15 04:12:43 combo su(pam_uni

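You also need to `close` the file when you are done with it. An open file keeps holding system resources, and on some platforms (Windows in particular) other programs may be unable to rename or delete it while your program still has it open.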

In [4]:
f.close()

Note: We are using a system log example from the [Loghub](https://github.com/logpai/loghub) repository. The relevant documentation can be found on [arxiv.org](https://arxiv.org/abs/2008.06448).

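You can also pass *encoding information* to the `open()` function to avoid garbled characters. Without it, Python falls back to a platform-dependent default (here `cp1250`, as the `print(f)` output above shows).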

In [5]:
f = open('city_names.txt')
f.read()

'ÄŚeskĂ˝ Krumlov, PĂ©cs, KrakĂłw'

In [6]:
f.close()

In [7]:
f = open('city_names.txt', encoding = 'utf-8')
f.read()

'Český Krumlov, Pécs, Kraków'

In [8]:
f.close()

You can find encoding options for all languages and character sets in the documentation of the [codecs module](https://docs.python.org/3/library/codecs.html#standard-encodings). 
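To see where the garbled characters come from, the following minimal sketch skips the file entirely and works on the raw bytes. It assumes the text was saved as UTF-8 and then decoded as `cp1250`, which is what the two cells above suggest; the variable names are only illustrative.

```python
# The same city names as in city_names.txt, held in memory for the demo.
text = 'Český Krumlov, Pécs, Kraków'

# Encoding turns the string into bytes, which is what a file actually stores.
raw_bytes = text.encode('utf-8')

# Decoding those bytes with the wrong codec produces the "funny characters".
garbled = raw_bytes.decode('cp1250')

# Decoding with the codec that was used for writing restores the text.
correct = raw_bytes.decode('utf-8')

print(garbled)
print(correct)
```

Each accented letter takes two bytes in UTF-8, and `cp1250` reads every byte as a separate character, so the garbled string is longer than the original.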

Multiline text can also be read sequentially, one line at a time.

In [9]:
f = open('Linux_2k.log')

In [10]:
f.readline()

'Jun 14 15:16:01 combo sshd(pam_unix)[19939]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=218.188.2.4 \n'

In [11]:
f.readline()

'Jun 14 15:16:02 combo sshd(pam_unix)[19937]: check pass; user unknown\n'

In [12]:
f.close()
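A small sketch of what happens when `readline()` runs out of lines: it returns an empty string, which can be used as a stop signal. The example writes its own two-line file into a temporary folder, so the file name `demo.log` and its content are assumptions made for illustration, not part of the Loghub data.

```python
import os
import tempfile

# Work in a throwaway folder so no real files are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.log')  # hypothetical file name

    # Create a small two-line file to read from.
    f = open(demo_path, mode='w', encoding='utf-8')
    f.write('first line\nsecond line\n')
    f.close()

    f = open(demo_path, encoding='utf-8')
    first = f.readline()    # includes the trailing '\n'
    second = f.readline()
    at_end = f.readline()   # no lines left: an empty string is returned
    f.close()

print(repr(first))
print(repr(second))
print(repr(at_end))
```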

The best way to read and write files is to use the `with` statement. It ensures that the file is closed when the block inside the `with` statement is exited, even if an error occurs there. We don't need to call the `close()` method explicitly; it is done internally.

In [13]:
with open('Linux_2k.log', encoding="utf-8") as f:
    for line in f:                # remember to indent! 
        print(line)

# After the operation the connection to the file is closed. 

Jun 14 15:16:01 combo sshd(pam_unix)[19939]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=218.188.2.4 

Jun 14 15:16:02 combo sshd(pam_unix)[19937]: check pass; user unknown

Jun 14 15:16:02 combo sshd(pam_unix)[19937]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=218.188.2.4 

Jun 15 02:04:59 combo sshd(pam_unix)[20882]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=220-135-151-1.hinet-ip.hinet.net  user=root

Jun 15 02:04:59 combo sshd(pam_unix)[20884]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=220-135-151-1.hinet-ip.hinet.net  user=root

Jun 15 04:06:18 combo su(pam_unix)[21416]: session opened for user cyrus by (uid=0)

Jun 15 04:06:19 combo su(pam_unix)[21416]: session closed for user cyrus

Jun 15 04:06:20 combo logrotate: ALERT exited abnormally with [1]

Jun 15 04:12:42 combo su(pam_unix)[22644]: session opened for user news by (uid=0)

Jun 15 04:12:43 combo su(pam_unix
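The blank lines in the output above appear because every line read from the file already ends with `'\n'`, and `print()` adds another one. The sketch below strips the line ending while reading and checks the file's `closed` attribute after the block to confirm that `with` really closed it. It uses a small temporary file with made-up content instead of the log file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.log')  # hypothetical file name

    # Prepare a short file to iterate over.
    with open(demo_path, mode='w', encoding='utf-8') as f:
        f.write('line one\nline two\n')

    with open(demo_path, encoding='utf-8') as f:
        # rstrip('\n') removes the line ending so print() does not double it.
        lines = [line.rstrip('\n') for line in f]

    # Outside the with block the file object still exists, but it is closed.
    closed_after_block = f.closed

for line in lines:
    print(line)
print(closed_after_block)
```

An alternative is `print(line, end='')`, which keeps the original line ending and stops `print()` from adding its own.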

### Writing

In [14]:
with open(file = 'message.txt', mode = 'w', encoding = 'utf-8') as write_text:
    write_text.write('Hello Monthy! \nThis is Python class on file I/O.')

There are four basic modes for opening a file:
- "r" - Read - Default value. Opens a file for reading, raises an error if the file does not exist
- "a" - Append - Opens a file for appending, creates the file if it does not exist
- "w" - Write - Opens a file for writing, creates the file if it does not exist, and overwrites its content if it does
- "x" - Create - Creates the specified file, raises an error if the file already exists
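The behaviour of the modes listed above can be tried out safely in a temporary folder. This is only a sketch; the file names `modes_demo.txt` and `missing.txt` are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'modes_demo.txt')  # hypothetical file name

    # "x" creates the file because it does not exist yet.
    with open(demo_path, mode='x', encoding='utf-8') as f:
        f.write('created\n')

    # A second "x" on the same path fails because the file now exists.
    try:
        open(demo_path, mode='x', encoding='utf-8')
        x_failed = False
    except FileExistsError:
        x_failed = True

    # "a" keeps the existing content and adds to the end.
    with open(demo_path, mode='a', encoding='utf-8') as f:
        f.write('appended\n')
    with open(demo_path, mode='r', encoding='utf-8') as f:
        after_append = f.read()

    # "w" discards the existing content before writing.
    with open(demo_path, mode='w', encoding='utf-8') as f:
        f.write('overwritten\n')
    with open(demo_path, mode='r', encoding='utf-8') as f:
        after_write = f.read()

    # "r" on a file that does not exist raises an error.
    try:
        open(os.path.join(tmp_dir, 'missing.txt'), mode='r')
        r_failed = False
    except FileNotFoundError:
        r_failed = True

print(repr(after_append))
print(repr(after_write))
print(x_failed, r_failed)
```

Because "w" silently discards existing content, "x" is the safer choice when a file must not be overwritten by accident.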

## Navigating The File System

One way to navigate your file system is to use the `os` module. This module provides functions for getting directory information, creating and deleting folders, listing files, etc.

In [15]:
import os

`getcwd()` returns your current working directory, and `listdir()` lists the entries (files and folders) in the directory of your choice. (If you don't pass the `path` parameter, it lists the entries in your current working directory.)

In [16]:
current_directory = os.getcwd()

In [17]:
files = os.listdir(current_directory)

In [18]:
print(files)

['.ipynb_checkpoints', 'city_names.txt', 'data_IO.ipynb', 'Linux_2k.log', 'message.txt', 'README.md']


In [19]:
type(files)

list
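Since `listdir()` returns plain names for both files and folders, a common follow-up is to filter the list. The sketch below builds a small temporary folder (the names `notes.txt` and `data` are made up) and keeps only the regular files, using `os.path.isfile()`.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create one file and one sub-folder to list.
    open(os.path.join(tmp_dir, 'notes.txt'), mode='w').close()
    os.mkdir(os.path.join(tmp_dir, 'data'))

    # listdir() gives names only, in arbitrary order, so sort them for display.
    entries = sorted(os.listdir(tmp_dir))

    # isfile() needs the full path, not just the bare name.
    only_files = [name for name in entries
                  if os.path.isfile(os.path.join(tmp_dir, name))]

print(entries)
print(only_files)
```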

Many function names in the `os` module mirror familiar Linux commands. `mkdir()` creates a new directory, and `path.join()` builds a path from its components. Note that `path.join()` uses the appropriate directory separator for your operating system: forward slashes on Linux and macOS, backslashes on Windows (shown doubled in the output below because the backslash is escaped in Python string literals).

In [20]:
path = 'C:\\Users\\'
 
# Join various path components; the empty last component adds a trailing separator
os.path.join(path, 'Documents', 'Python_classes', '')

'C:\\Users\\Documents\\Python_classes\\'
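The following sketch combines `path.join()` with `mkdir()`, which the text mentions but the cells above do not run. It works inside a temporary folder, so the folder names are placeholders. `os.makedirs()` is used in addition to `mkdir()` because `mkdir()` creates only the last component of a path and fails if a parent folder is missing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Build the path first, then create the folder.
    new_dir = os.path.join(tmp_dir, 'Python_classes')  # hypothetical folder name
    os.mkdir(new_dir)
    created = os.path.isdir(new_dir)

    # makedirs() also creates the missing parent folder 'Documents'.
    nested_dir = os.path.join(tmp_dir, 'Documents', 'Lecture_3')
    os.makedirs(nested_dir, exist_ok=True)
    nested_created = os.path.isdir(nested_dir)

# join() inserts os.sep, the separator of the operating system running the code.
relative_path = os.path.join('Documents', 'Python_classes')
uses_native_separator = relative_path == 'Documents' + os.sep + 'Python_classes'

print(created, nested_created)
print(relative_path)
```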