# **File input/output**
______________________________

## Contents:
- [Open File](#Open-file)
- [Close file](#Close-file)
- [Read file](#Read-file)
- [Position in the file](#Position-in-the-file)
- [Context Manager](#Context-manager)
- [Write in the file](#Write-in-the-file)
- [Useful functions](#Useful-functions)
- [Examples](#Examples)

## **`Open file`**

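A file is opened with the built-in `open()` function: `<variable> = open(<file_name>, <access_mode>)`.

**`<variable>`** here is a `file object` (often loosely called a file handle). It is a reference to the open file and does not itself store any of the file's content.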

Possible **`<access_modes>`** are:

| literal | mode | description |
| --- | --- | --- |
| `'x'` | create | create a new file (error if file exists) |
| `'r'` | read | open for reading only (by default) |
| `'w'` | write | open for writing; new content replaces the old. If the file does not exist, a new one is created |
| `'a'` | append | open for appending; new content is added at the end of the file. If the file does not exist, a new one is created |
| `'t'` | text | open in a text mode (by default) |
| `'b'` | binary | working with binary files |
| `'r+'` | read + write | open for reading and writing |
| | | modes can be joined, default mode is `'rt'` |

**`<file_name>`** should also include the path to the file if the file is located outside the working directory/folder
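As a rough illustration of how the modes in the table behave, the sketch below works on a throwaway file in a temporary directory. The file name `demo.txt` and all variable names are made up for this example.

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical file name

    f = open(demo_path, 'x')       # 'x': create a new file
    f.write('first\n')
    f.close()

    try:
        open(demo_path, 'x')       # the file exists now, so 'x' fails
        x_failed = False
    except FileExistsError:
        x_failed = True

    f = open(demo_path, 'a')       # 'a': add to the end
    f.write('second\n')
    f.close()

    f = open(demo_path, 'r')       # 'r': read only
    after_append = f.read()
    f.close()

    f = open(demo_path, 'w')       # 'w': old content is discarded
    f.write('replaced\n')
    f.close()

    f = open(demo_path)            # default mode is 'rt'
    after_write = f.read()
    f.close()

print(x_failed)             # True
print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_write))    # 'replaced\n'
```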

In [5]:
# this should create a file test.txt in C:\Users\temp\ (Windows)
#test_file = open('C:\\Users\\temp\\test.txt', 'w')

#or:

#test_file = open(r'C:\Users\temp\test.txt', 'w')

When working with Windows file paths, it is convenient to use `raw` strings, because a backslash in a normal string can start an escape sequence such as `'\n'` or `'\t'`:

In [6]:
# normal string
path1 = 'C:\new\test.txt'
print(path1)

#raw string
path2 = r'C:\new\test.txt'
print(path2)

C:
ew	est.txt
C:\new\test.txt
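The difference between the string forms can also be checked by comparing them directly. The sketch below additionally uses `pathlib.PureWindowsPath` (not mentioned in the text above) only to show how a path written with forward slashes would be rendered on Windows; it runs on any operating system.

```python
from pathlib import PureWindowsPath

normal = 'C:\new\test.txt'      # '\n' and '\t' are read as newline and tab
escaped = 'C:\\new\\test.txt'   # every backslash is doubled
raw = r'C:\new\test.txt'        # raw string: backslashes are kept as typed

print(len(normal), len(raw))    # 13 15
print(escaped == raw)           # True

# Forward slashes are also accepted in Windows paths.
forward = str(PureWindowsPath('C:/new/test.txt'))
print(forward == raw)           # True
```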


#### **`Encoding`**

We can specify the file encoding with the `encoding` parameter

In [7]:
file1 = open('test_files/test.txt', 'r', encoding='utf-8')

file1.encoding

'utf-8'
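To see why the encoding matters, here is a small sketch that writes one non-ASCII word and reads it back in three ways. The file name `encoded.txt` and the sample word are assumptions made for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'encoded.txt')   # hypothetical file name

    f = open(path, 'w', encoding='utf-8')
    f.write('caf\u00e9')           # 'café': 4 characters
    f.close()

    f = open(path, 'rb')           # binary mode: raw bytes, no decoding
    raw_bytes = f.read()
    f.close()

    f = open(path, 'r', encoding='utf-8')     # same encoding as when writing
    text = f.read()
    f.close()

    f = open(path, 'r', encoding='latin-1')   # wrong encoding on purpose
    wrong = f.read()
    f.close()

print(len(text), len(raw_bytes))   # 4 5  (the last letter takes 2 bytes in UTF-8)
print(text == 'caf\u00e9')         # True
print(wrong == text)               # False: the bytes were decoded differently
```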

## **`Close file`**

After we finish working with a file, it should be closed

In [8]:
file1.close() # closing previously opened file 'test.txt'

To check whether a file is closed, use the attribute `closed`

In [9]:
file1.closed

True
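If an error occurs between `open()` and `close()`, the file may stay open. One common way to guard against that is a `try`/`finally` block, sketched below with a hypothetical temporary file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt')   # hypothetical file name

    file = open(path, 'w')
    try:
        file.write('Python\n')
        closed_inside = file.closed    # still open at this point
    finally:
        file.close()                   # runs even if write() raised an exception

    closed_after = file.closed

print(closed_inside, closed_after)     # False True
```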

## **`Read file`**

#### To read from the file we can use the following methods:
* `read()`
* `readline()`
* `readlines()`

#### **`read()`**

Method `read()` reads everything from the file at once and returns a `string`, which can contain special characters such as `'\n'`, `'\t'`, etc. Called with a number, as in `read(6)`, it reads at most that many characters

In [10]:
file = open('test_files/test.txt', 'r')

content = file.read() # read all and save all into 'content'

#limited_content = file.read(6) # read 6 symbols

file.close()

In [11]:
content

'Python\nJava\nJavascript\nC#\nC\nC++\nPHP\nR\nObjective-C'
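A minimal sketch of `read()` with a size argument, using a small temporary file with made-up content. Each call continues from where the previous one stopped.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'langs.txt')   # hypothetical file name

    f = open(path, 'w')
    f.write('Python\nJava\n')
    f.close()

    f = open(path, 'r')
    first = f.read(6)    # at most 6 characters
    rest = f.read()      # everything that is left
    at_end = f.read()    # nothing left to read
    f.close()

print(repr(first))    # 'Python'
print(repr(rest))     # '\nJava\n'
print(repr(at_end))   # ''
```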

#### **`readline()`**

Method `readline()` reads one line from the file (up to and including `'\n'`) and returns it. If there are no more lines to read, `readline()` returns an empty string

In [12]:
file = open('test_files/test.txt', 'r')

language = file.readline()

file.close()

In [13]:
language, language.strip()

('Python\n', 'Python')

In [14]:
# read several lines:
file = open('test_files/test.txt', 'r') # open test.txt with 'r'-argument, meaning 'read'

s1 = file.readline() # read the first line in the file
s2 = file.readline() # read the second line in the file

file.close() # close the file (obligatory)

In [15]:
s1, s2.rstrip()

('Python\n', 'Java')

In [16]:
# read all the lines:
file = open('test_files/test.txt')

for line in file:
    print(line.strip())
    
file.close()    

Python
Java
Javascript
C#
C
C++
PHP
R
Objective-C
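The same loop can be written with `readline()` alone by stopping at the empty string. The sketch below uses made-up content with a blank line in the middle to show that a blank line is returned as `'\n'`, not as `''`, so it does not end the loop.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'langs.txt')   # hypothetical file name

    f = open(path, 'w')
    f.write('Python\n\nJava')    # note the blank second line
    f.close()

    lines = []
    f = open(path)
    while True:
        line = f.readline()
        if line == '':           # empty string: end of file reached
            break
        lines.append(line.rstrip('\n'))
    f.close()

print(lines)   # ['Python', '', 'Java']
```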


#### **`readlines()`**

Method `readlines()` reads all lines from the file and returns a `list` of all lines with '\n' at the end of each line

In [17]:
file = open('test_files/test.txt', 'r')

langs = file.readlines() # with all special symbols; the position is now at the end of the file

# clean version (built from 'langs', because a second file.readlines() would return an empty list):
langs1 = [line.strip() for line in langs]
# or
langs2 = list(map(str.strip, langs))
# or
langs3 = list(map(lambda line: line.strip(), langs))

file.close()

In [18]:
langs

['Python\n',
 'Java\n',
 'Javascript\n',
 'C#\n',
 'C\n',
 'C++\n',
 'PHP\n',
 'R\n',
 'Objective-C']
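Another way to get the lines without `'\n'` is `read().splitlines()`. The sketch below compares both approaches on a temporary file with made-up content, and also shows that a second `readlines()` call on the same open file returns an empty list.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'langs.txt')   # hypothetical file name

    f = open(path, 'w')
    f.write('Python\nJava\nC')
    f.close()

    f = open(path)
    with_newlines = f.readlines()   # keeps '\n' at the end of each line
    second_call = f.readlines()     # position is at the end, nothing is left
    f.close()

    f = open(path)
    without_newlines = f.read().splitlines()   # drops the line endings
    f.close()

print(with_newlines)      # ['Python\n', 'Java\n', 'C']
print(second_call)        # []
print(without_newlines)   # ['Python', 'Java', 'C']
```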

## **`Position in the file`**

When we read text from a file using the `read()` or `readlines()` methods, the current position is moved to the end of the file. When using the `readline()` method, the current position is moved to the next line in the file.

After the reading is completed, nothing more can be read from the file: all subsequent calls to `read()` or `readline()` return an empty string.

To re-read data from a file, we can:
* reopen the file, then the position will again go to the beginning
* move the position using the **`seek()`** file method.

#### **`Method seek()`**

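The `seek()` method sets the current position, given as an offset in bytes from the beginning of the file. For files opened in text mode, the only valid offsets are `0` and values previously returned by `tell()`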

In [19]:
file = open('test_files/file.txt', 'r', encoding='utf-8')
line1 = file.readline().strip()
file.seek(0)               # set the position to the beginning
line2 = file.readline()

print(line1, line2, sep='\n')

file.close()

Kate
Kate



#### **`Method tell()`**

The `tell()` method returns the current position in bytes

In [20]:
file = open('test_files/file.txt', 'r', encoding='utf-8')

print(file.tell())
print(file.readline().strip())
print(file.tell())

file.close()

0
Kate
5
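`tell()` and `seek()` work well together: a position saved with `tell()` can be used later as a bookmark. The sketch below assumes a temporary file with the same three names as above; the file name and variable names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'names.txt')   # hypothetical file name

    f = open(path, 'w', encoding='utf-8')
    f.write('Kate\nMaria\nAnn\n')
    f.close()

    f = open(path, 'r', encoding='utf-8')
    first = f.readline()
    bookmark = f.tell()           # remember the position after the first line
    second = f.readline()

    f.seek(bookmark)              # jump back to the bookmark
    second_again = f.readline()

    f.seek(0)                     # jump back to the beginning
    first_again = f.readline()
    f.close()

print(second == second_again)     # True
print(first == first_again)       # True
```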


## **`Context manager`**

Context managers allow you to allocate and release resources precisely when you want to. The most common way to use one is the `with` statement, which closes the file automatically when the block ends, even if an error occurs.

In [21]:
# reading file using 'with'
with open('test_files/file.txt') as input_file:
    s3 = input_file.readline()
    s4 = input_file.readline()
# no need to close the file in this case

In [22]:
# reading file by iterating through its lines
with open('test_files/file.txt') as inf:
    for line in inf:
        line = line.strip()
        print(line)

Kate
Maria
Ann


With the context manager, we can work with multiple files at once:

In [23]:
with open('test_files/file.txt', 'r') as input_file, open('test_files/output.txt', 'w') as output_file:
    print(input_file.read())

Kate
Maria
Ann
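A small sketch of two files opened in one `with` statement, where each line of the input is transformed and written to the output. The file names and the upper-casing step are assumptions chosen for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    src = os.path.join(tmp_dir, 'names.txt')         # hypothetical input file
    dst = os.path.join(tmp_dir, 'names_upper.txt')   # hypothetical output file

    with open(src, 'w') as f:
        f.write('Kate\nMaria\nAnn\n')

    # Both files are open inside the block and both are closed after it.
    with open(src) as input_file, open(dst, 'w') as output_file:
        for line in input_file:
            output_file.write(line.upper())

    both_closed = input_file.closed and output_file.closed

    with open(dst) as f:
        result = f.read()

print(both_closed)    # True
print(repr(result))   # 'KATE\nMARIA\nANN\n'
```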



## **`Write in the file`**

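There are 3 ways to write to a file:
* method `write()` – writes a string to the file (no `'\n'` is added automatically)
* method `writelines()` – writes a list of strings to the file (also without adding `'\n'`)
* function `print(..., file=output_file)` – prints to the file object `output_file` instead of the screen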


#### **`Method write()`**

If the file is opened in `'w'` mode, its contents are fully overwritten by the new lines. <br>
If the file is opened in `'a'` mode, new lines are appended to the end of the file. <br>
If the file is opened in `'r+'` mode, its contents are partially overwritten, starting from the beginning of the file

In [24]:
# 1st option
ouf = open('test_files/file.txt', 'w') # open file.txt for writing ('w')
ouf.write('this text is written from the python program\n')
ouf.write(str(25)) # if we want to write a number, we need to make it string
ouf.close()

In [25]:
# 2nd option
with open('test_files/file.txt', 'a') as ouf:
    ouf.write('another text\n')
    ouf.write(str(34.32))
# no need to close the file here

In [26]:
with open('test_files/file.txt', 'r+') as ouf: # 'r+' starts writing at the beginning of the file
    ouf.write('another text\n')     # these characters overwrite the old ones in place,
    ouf.write(str(34.32))           # and so do these
# old content beyond the written characters stays in the file unchanged
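The partial overwrite in `'r+'` mode works character by character, not line by line. A minimal sketch with a hypothetical temporary file:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'names.txt')   # hypothetical file name

    with open(path, 'w') as f:
        f.write('Kate\nMaria\nAnn\n')

    with open(path, 'r+') as f:
        f.write('XY')            # replaces only the first two characters

    with open(path) as f:
        result = f.read()

print(repr(result))   # 'XYte\nMaria\nAnn\n'
```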

#### **`Method writelines()`**

In [27]:
names = ['Tom\n', 'John\n', 'Mark\n'] # a list of strings each with '\n' at the end

with open('test_files/file.txt', 'a', encoding='utf-8') as file:
    file.writelines(names) # write a list of strings into the file

#### **`Function print()`**

In [28]:
with open('test_files/file.txt', 'w', encoding='utf-8') as output:
    print('Kate', file=output)
    print('Maria', file=output)
    print('Ann', file=output)
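For comparison, the sketch below writes the same three names with `write()`, `writelines()` and `print()` into three hypothetical temporary files and checks that the results are identical. Only `print()` adds the `'\n'` on its own.

```python
import os
import tempfile

names = ['Kate', 'Maria', 'Ann']

with tempfile.TemporaryDirectory() as tmp_dir:
    paths = [os.path.join(tmp_dir, f'out{i}.txt') for i in range(3)]  # hypothetical names

    with open(paths[0], 'w') as out:
        for name in names:
            out.write(name + '\n')                       # '\n' added by hand

    with open(paths[1], 'w') as out:
        out.writelines(name + '\n' for name in names)    # '\n' added by hand

    with open(paths[2], 'w') as out:
        for name in names:
            print(name, file=out)                        # print() adds '\n' itself

    contents = []
    for path in paths:
        with open(path) as f:
            contents.append(f.read())

print(contents[0] == contents[1] == contents[2])   # True
print(repr(contents[0]))                           # 'Kate\nMaria\nAnn\n'
```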

## **`Useful functions`**

1. `String methods` can be used to clean up lines read from a file:
    * s = input_file.readline()**.rstrip()** - removes whitespace characters such as '\t', '\n', etc. from the end of the line
    * s = input_file.readline()**.lstrip()** - removes whitespace characters such as '\t', '\n', etc. from the beginning of the line
    * s = input_file.readline()**.strip()** - removes whitespace characters such as '\t', '\n', etc. from both ends of the line
2. os.path**.join**('.', 'dirname', 'filename.txt') - builds a path to the file ('./dirname/filename.txt' on Linux and macOS)
3. `file.closed` - returns `True` if the file is closed, else `False`
4. `file.mode` - returns the access mode
5. `file.name` - returns the name of the file
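The helpers listed above can be tried out in a few lines. The sample string and the file name `demo.txt` are made up for this sketch.

```python
import os
import tempfile

line = '\t Kate \n'
right = line.rstrip()    # whitespace removed from the end only
left = line.lstrip()     # whitespace removed from the beginning only
both = line.strip()      # whitespace removed from both ends

print(repr(right))   # '\t Kate'
print(repr(left))    # 'Kate \n'
print(repr(both))    # 'Kate'

# os.path.join uses the separator of the current operating system.
joined = os.path.join('.', 'dirname', 'filename.txt')
print(joined == '.' + os.sep + 'dirname' + os.sep + 'filename.txt')   # True

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.txt')   # hypothetical file name
    with open(demo_path, 'w') as f:
        attrs = (f.name, f.mode, f.closed)          # inspected while the file is open
    closed_after = f.closed

print(attrs[1], attrs[2], closed_after)   # w False True
```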

## **`Examples`**

#### Print the penultimate line from the file

In [29]:
with open('test_files/file.txt') as f:
    print(f.readlines()[-2])

Maria



#### Print a random line from the file

In [31]:
import random
with open('test_files/file.txt') as f:
    print(random.choice(f.readlines()))

Ann



#### Print the first line of the file reversed

In [32]:
with open('test_files/file.txt', encoding='utf-8') as file:
    print(file.readline()[::-1])


etaK


#### Print all lines from the file in reverse order

In [33]:
with open('test_files/file.txt', encoding='utf-8') as file:
    print(*file.readlines()[::-1], sep='')

Ann
Maria
Kate



#### Print the longest line from the file

In [34]:
with open('test_files/test.txt', encoding='utf-8') as file:
    lines = [line.strip() for line in file]
    max_len = max(map(len, lines))    # compute the maximum length once
    # print every line that has the maximum length
    print(*(line for line in lines if len(line) == max_len), sep='\n')

Objective-C


#### Print the sum of numbers in each line from the file

In [35]:
with open('test_files/numbers.txt', encoding='utf-8') as file:
    for line in file:
        print(sum(map(int, line.split())))

3
7
8


#### Print the statistics of the file: number of letters, number of words, number of lines

In [36]:
with open('test_files/file.txt', encoding='utf-8') as file:
    content = file.read()   # read once, then compute every statistic from the string

num_lines = len(content.splitlines())   # splitlines() does not count the final '\n' as an extra line
num_words = len(content.split())
num_letters = sum(char.isalpha() for char in content)

print('Input file contains:')
print(f'{num_letters} letters\n{num_words} words\n{num_lines} lines')

Input file contains:
12 letters
3 words
3 lines


#### Print random 'name language' pairs, with names and languages stored in 2 files

In [37]:
import random
with open('test_files/file.txt') as names,  open('test_files/test.txt') as langs:
    n, s = names.read().split(), langs.read().split()

print(*[(random.choice(n), random.choice(s)) for i in range(3)])

('Ann', 'Objective-C') ('Maria', 'R') ('Kate', 'C')


#### Create a function 'read_csv' that reads a data.csv file and transforms it to a list of dictionaries

In [38]:
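def read_csv(file_name):
    with open(file_name) as file:
        lst = []
        keys = file.readline().strip().split(',')   # the first line holds the column names
        for line in file:
            # pair each column name with the value in the same position
            lst.append(dict(zip(keys, line.strip().split(','))))
    return lst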

In [39]:
read_csv('test_files/data.csv')

[{'name': 'George', 'address': '4312 Abbey Road', 'age': '22'},
 {'name': 'John', 'address': '54 Love Ave', 'age': '21'}]
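Splitting on `','` by hand breaks as soon as a value itself contains a comma. As an alternative sketch, the standard `csv` module handles quoted values. The function name `read_csv_with_module` and the sample data (including the quoted address) are assumptions made for this example.

```python
import csv
import os
import tempfile


def read_csv_with_module(file_name):
    """Hypothetical variant of read_csv built on csv.DictReader."""
    # newline='' is what the csv module documentation recommends for csv files
    with open(file_name, newline='') as file:
        return [dict(row) for row in csv.DictReader(file)]


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.csv')   # hypothetical sample file
    with open(path, 'w', newline='') as f:
        f.write('name,address,age\n')
        f.write('George,"4312 Abbey Road, Apt 2",22\n')   # comma inside quotes
        f.write('John,54 Love Ave,21\n')

    rows = read_csv_with_module(path)

print(rows[0]['address'])   # 4312 Abbey Road, Apt 2
print(len(rows))            # 2
```

As in the hand-written version, all values are returned as strings.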

#### Create a program that writes to a text file random.txt 25 random numbers in the range from 111 to 777 (inclusive), each on a new line

In [40]:
import random
with open('test_files/random.txt', 'w') as file:
    file.writelines([f'{random.randint(111, 777)}\n' for i in range(25)])

#### Create a program that numbers the lines of one file and writes them to another

In [41]:
with open('test_files/file.txt') as inp, open('test_files/output.txt', 'w') as out:
    for i, j in enumerate(inp, start=1):
        print(f'{i}) {j}', end='', file=out)

#### Given a file with student names and marks for 3 tests, write a program to count the number of students who passed all three tests. The test is considered passed if the mark is not less than 65

In [42]:
with open('test_files/grades.txt', 'r') as inp:
    d = []
    for line in inp:
        name, *marks = line.strip().split()
        if all(map(lambda x: int(x) >= 65, marks)):
            d.append(name)
    print(len(d))

2
