# IO

> In computing, input/output or I/O (or, informally, io or IO) is the communication between an information processing system, such as a computer, and the outside world, possibly a human or another information processing system.  from [wikipedia](https://en.wikipedia.org/wiki/Input/output#:~:text=In%20computing%2C%20input%2Foutput%20or,or%20another%20information%20processing%20system.)

# File-based IO
The first kind of IO we're going to discuss is files. A file is a well-known abstraction (yes, it is an abstraction) that the operating system provides, so that we don't have to deal with locations on a spinning disk.

In [6]:
# Shakespeare's sonnets
with open('./files/pg1105.txt', 'r') as f:
    res = f.readlines()
    print(len(res))
    print(res[34:40])
    # What happens if I try to reread it?
    

2854
['\n', 'The Complete Works of William Shakespeare\n', 'The Sonnets\n', '\n', 'November, 1997  [Etext #1105]\n', '\n']


In [8]:
# We might want to read line by line
with open('./files/pg1105.txt', 'r') as f:
    for n in range(1,10):
        line = f.readline()
        print(line)

﻿This Etext file is presented by Project Gutenberg, in

cooperation with World Library, Inc., from their Library of the

Future and Shakespeare CDROMS.  Project Gutenberg often releases

Etexts that are NOT placed in the Public Domain!!



*This Etext has certain copyright implications you should read!*



<<THIS ELECTRONIC VERSION OF THE COMPLETE WORKS OF WILLIAM

SHAKESPEARE IS COPYRIGHT 1990-1993 BY WORLD LIBRARY, INC., AND IS
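
The first line printed above appears to start with an invisible byte order mark (BOM, `\ufeff`), which some tools put at the start of UTF-8 files. A minimal sketch of how the `encoding` argument of `open` deals with it; the temporary file and its contents are made up for illustration:

```python
import os
import tempfile

# Hypothetical sample file: the 'utf-8-sig' codec writes a BOM before the text
path = os.path.join(tempfile.mkdtemp(), 'bom_sample.txt')
with open(path, 'w', encoding='utf-8-sig') as f:
    f.write('This Etext file\n')

# Plain utf-8 keeps the BOM as the first character of the first line
with open(path, 'r', encoding='utf-8') as f:
    with_bom = f.readline()

# utf-8-sig strips the BOM if it is there
with open(path, 'r', encoding='utf-8-sig') as f:
    without_bom = f.readline()

print(repr(with_bom))
print(repr(without_bom))
```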



In [9]:
# Obviously we need to clean up all those empty lines and keep only the lines of eternal prose

interesting_lines = []
with open('./files/pg1105.txt', 'r') as f:
    for line in f:
        # Note I can just iterate on lines directly
        if line != '\n':
            interesting_lines.append(line)
        
print(interesting_lines[20:30])


['**Welcome To The World of Free Plain Vanilla Electronic Texts**\n', '**Etexts Readable By Both Humans and By Computers, Since 1971**\n', '*These Etexts Prepared By Hundreds of Volunteers and Donations*\n', 'Information on contacting Project Gutenberg to get Etexts, and\n', 'further information is included below.  We need your donations.\n', 'The Complete Works of William Shakespeare\n', 'The Sonnets\n', 'November, 1997  [Etext #1105]\n', 'The Library of the Future Complete Works of William Shakespeare\n', 'Library of the Future is a TradeMark (TM) of World Library Inc.\n']
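
The `line != '\n'` test only drops lines that are exactly a newline, so a line holding nothing but spaces or tabs would still get through. A possible variant, sketched on made-up in-memory text rather than the sonnets file, filters with `str.strip()` instead:

```python
import io

# Made-up sample text; io.StringIO stands in for an opened file
sample = io.StringIO('first line\n\n   \t\nsecond line\n')

# line.strip() is an empty (falsy) string for blank or whitespace-only lines
interesting = [line for line in sample if line.strip()]
print(interesting)
```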


In [10]:
# So let's try to write to a new file
with open('./files/new.txt', 'r') as f:
    f.writelines(interesting_lines)
    
# OH

FileNotFoundError: [Errno 2] No such file or directory: './files/new.txt'

In [11]:
# Notice this is special Jupyter (IPython) syntax: a line starting with ! runs a shell command
!touch ./files/interesting.txt

In [12]:
# So now the interesting.txt file exists; can we write our lines to it?

with open('./files/interesting.txt', 'r') as f:
    f.writelines(interesting_lines)
    
# OH

UnsupportedOperation: not writable

In [13]:
# Note the 'r' flag: it stands for 'read'. Now let's change it to 'w' for 'write'

with open('./files/interesting.txt', 'w') as f:
    f.writelines(interesting_lines)
    


In [15]:
# Quick check if something happened
!ls -al files

total 248
drwxrwxr-x 3 alonisser alonisser   4096 Sep 30 11:28 .
drwxrwxr-x 6 alonisser alonisser   4096 Sep 30 11:28 ..
-rw-rw-r-- 1 alonisser alonisser     23 Sep 30 01:01 data.json
-rw-rw-r-- 1 alonisser alonisser 111926 Sep 30 11:28 interesting.txt
drwxrwxr-x 2 alonisser alonisser   4096 Sep 30 01:06 .ipynb_checkpoints
-rw-rw-r-- 1 alonisser alonisser     11 Sep 30 01:08 loaded.json
-rw-rw-r-- 1 alonisser alonisser 115149 Sep 29 02:10 pg1105.txt


In [16]:
!head ./files/interesting.txt

﻿This Etext file is presented by Project Gutenberg, in
cooperation with World Library, Inc., from their Library of the
Future and Shakespeare CDROMS.  Project Gutenberg often releases
Etexts that are NOT placed in the Public Domain!!
*This Etext has certain copyright implications you should read!*
<<THIS ELECTRONIC VERSION OF THE COMPLETE WORKS OF WILLIAM
SHAKESPEARE IS COPYRIGHT 1990-1993 BY WORLD LIBRARY, INC., AND IS
PROVIDED BY PROJECT GUTENBERG WITH PERMISSION.  ELECTRONIC AND
MACHINE READABLE COPIES MAY BE DISTRIBUTED SO LONG AS SUCH COPIES
(1) ARE FOR YOUR OR OTHERS PERSONAL USE ONLY, AND (2) ARE NOT
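
Both errors above came from the mode flag passed to `open`. A small sketch of the most common text modes, run against a throwaway temporary directory (the file name is made up):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')  # hypothetical file

# 'w' creates the file if needed and truncates any existing content
with open(path, 'w') as f:
    f.write('first\n')

# 'a' appends to the end instead of truncating
with open(path, 'a') as f:
    f.write('second\n')

# 'r' (the default) only reads
with open(path, 'r') as f:
    content = f.read()

# 'x' creates a new file and refuses to overwrite an existing one
try:
    open(path, 'x')
    exclusive_failed = False
except FileExistsError:
    exclusive_failed = True

print(repr(content), exclusive_failed)
```

Since `'w'` creates a missing file by itself, only the containing directory has to exist beforehand.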


## IO Streams
To use exact terminology, what we're dealing with here are IO streams, so we can also "seek": move the cursor backwards or forwards on the stream.

In [20]:
# Shakespeare's sonnets
with open('./files/pg1105.txt', 'r') as f:
    res = f.readlines()
    print(len(res))
    print(f'Current byte position {f.tell()}')
    res = f.readlines()
    print(len(res))
    
    f.seek(10000) # Note that seek is in bytes in the stream
    print(f'Current byte position {f.tell()}')
    res = f.readlines()
    print(len(res))

2854
Current byte position 115149
0
Current byte position 10000
2615
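
A self-contained sketch of `tell` and `seek`, using a small made-up temporary file opened in binary mode so that positions are plain byte offsets:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'stream.txt')  # hypothetical file
with open(path, 'wb') as f:
    f.write(b'line one\nline two\nline three\n')

with open(path, 'rb') as f:
    first_pass = f.readlines()
    end_position = f.tell()      # the cursor now sits at the end of the stream
    second_pass = f.readlines()  # nothing left to read

    f.seek(0)                    # rewind to the start
    rewound = f.readlines()

    f.seek(9)                    # skip the 9 bytes of b'line one\n'
    from_second_line = f.readline()

print(end_position, second_pass, from_second_line)
```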


In [27]:
from io import SEEK_END

In [29]:
## Reading an IO stream can be lazy
# We can open a file more than once
f = open('./files/other_name.txt', 'w')
# Now another part of the program opens the same file
d = open('./files/other_name.txt')
print(d.readline())
f.seek(0, SEEK_END)
f.writelines(['NO FILE'])
f.flush()  # push the buffered write to the OS so the reader can see it

print(d.readline())
# print(f.read())
f.close()
d.close()

NO FILE
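
Whether the second handle sees the new text depends on when the writer's buffer reaches the operating system. A sketch, assuming a throwaway temporary file and a write small enough to stay in Python's buffer until it is flushed:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'shared.txt')  # hypothetical file

writer = open(path, 'w')
reader = open(path)

writer.write('NO FILE\n')
before_flush = reader.readline()  # the text is still in the writer's buffer

writer.flush()                    # hand the buffered text to the OS
after_flush = reader.readline()   # now the reader can see it

writer.close()
reader.close()
print(repr(before_flush), repr(after_flush))
```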



## One big string

Sometimes we just need one big string, not a list of lines.

In [30]:
# Consider this example

!cat ./files/data.json

# A JSON document with a line break is perfectly valid, but if we try to read it line by line...

{"number":
       	42}


In [32]:
import json
with open('./files/data.json', 'r') as f:
    res = f.readlines()
    print(res)
    json.loads(res)

['{"number":\n', '       \t42}\n']


TypeError: the JSON object must be str, bytes or bytearray, not list

In [33]:
# BUT if we treat it not as a list of lines but as one blob (bytes here, because of 'rb'), it parses fine
with open('./files/data.json', 'rb') as f:
    res = f.read()
    print(res)
    a = json.loads(res)
    print(a)


b'{"number":\n       \t42}\n'
{'number': 42}
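
The `json` module can also read from a file object directly with `json.load`, which saves the explicit `read()` call. A sketch, with `io.StringIO` standing in for the opened `data.json`:

```python
import io
import json

# Same content as data.json above, held in memory instead of on disk
document = '{"number":\n       \t42}\n'

# json.loads parses a whole string (or bytes object) at once
from_string = json.loads(document)

# json.load takes a file-like object and calls .read() on it for us
from_file = json.load(io.StringIO(document))

print(from_string, from_file)
```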


# Files and File Like Objects

Note that, like many other things in Python, the file object from the `io` module **implements an interface**, and that interface can be implemented in other ways.

Enter: File Like Objects

![IO: The moon of jupyter](https://upload.wikimedia.org/wikipedia/commons/7/7b/Io_highest_resolution_true_color.jpg)

In [36]:
# File-like objects
# Suppose I've got a function that reads a file, loads a JSON list from it, and doubles each element

def list_opener(file):
    res = file.read()
    a_list = json.loads(res)
    return [x*2 for x in a_list]

In [34]:
!cat ./files/loaded.json

[1,2,3,4,5]

In [38]:
# So for the loaded.json file I expect to get
expected_result = [2,4,6,8,10]
with open('./files/loaded.json') as f:
    actual_list = list_opener(f)
    assert actual_list == expected_result, 'It was supposed to be the same list'
# The assertion passes silently: doubling [1,2,3,4,5] gives exactly the expected list

In [40]:
# Let's say I want to reuse this API, but now I don't have a file. One way might be to write a file to disk and then "open" it
# But there is a better way

import io
expected_result = [2,4,6,8,10]

synthetic_file = io.StringIO('[1,2,3,4,5]')

res = list_opener(synthetic_file)
print(res)
assert res == expected_result, 'It was supposed to be the same list'

[2, 4, 6, 8, 10]
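
What `list_opener` really needs is any object with a `read()` method. A sketch with three such objects; `FixedReader` is a made-up class for illustration, and `list_opener` is repeated so the block runs on its own:

```python
import io
import json

def list_opener(file):
    res = file.read()
    a_list = json.loads(res)
    return [x * 2 for x in a_list]

class FixedReader:
    """Hypothetical minimal file-like object: it only implements read()."""
    def read(self):
        return '[1,2,3,4,5]'

from_text = list_opener(io.StringIO('[1,2,3,4,5]'))    # in-memory text stream
from_bytes = list_opener(io.BytesIO(b'[1,2,3,4,5]'))   # in-memory binary stream
from_custom = list_opener(FixedReader())               # plain duck typing

print(from_text, from_bytes, from_custom)
```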


# IO streams: stdout and stderr

In [None]:
import sys

> In a UNIX-style operating system, there are three so-called "streams", which represent file-like objects through which input to and output from programs is directed. The stream known as standard input generally represents the keyboard and is the basic source of user input to text-based programs. The streams known as standard output and standard error are the default destinations for output from programs, and generally represent the screen of the user's computer. For many simple tasks, Python provides functions so that you don't have to deal with these streams directly. For example, the print statement directs its output to the standard output stream; the raw_input function reads its input from the standard input stream.

Note that the quote describes Python 2: in Python 3, `print` is a function and `raw_input` was renamed to `input`.

**STDIN** User input

**STDOUT** Regular output (defaults to screen)

**STDERR** Error output (defaults to screen)
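
In Python these streams are available as file-like objects on the `sys` module. A short sketch of writing to them, and of temporarily redirecting standard output into an in-memory buffer with `contextlib.redirect_stdout`:

```python
import contextlib
import io
import sys

# print writes to sys.stdout by default; file= picks another stream
print('regular output')
print('error output', file=sys.stderr)

# The streams are file-like objects, so they have write() too
sys.stdout.write('written directly\n')

# Temporarily point stdout at an in-memory stream
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    print('captured')

captured = buffer.getvalue()
print(repr(captured))
```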

## Let's jump to the terminal and try to run input.py

## Other kinds of IO
While we talked about **IO**, I discussed almost nothing but files, which is IO to disk. But what other kinds of IO might we have? Network, screen, mouse (clicks), etc.

# Thinking about IO

Computing is done in the CPU, which is the heart and brain of a computer. CPUs are FAST. Main memory access is FAST. IO? It may seem fast to us, but it really isn't.

IO is where computing meets the laws of physics 

![Latency numbers every programmer should know](https://camo.githubusercontent.com/77f72259e1eb58596b564d1ad823af1853bc60a3/687474703a2f2f692e696d6775722e636f6d2f6b307431652e706e67)

And IO costs not only time, but also other resources: disk space, network bandwidth, etc.
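
One rough way to get a feel for the difference is to time the same payload going to memory and to disk. This is a sketch only: the payload size is arbitrary, results vary a lot between machines, and `os.fsync` is used to ask the OS to push the data out to the device rather than leave it in a cache:

```python
import io
import os
import tempfile
import time

payload = b'x' * (1024 * 1024)  # 1 MiB of made-up data

# Write to an in-memory stream
start = time.perf_counter()
io.BytesIO().write(payload)
memory_seconds = time.perf_counter() - start

# Write to a real file and force it out to the device
path = os.path.join(tempfile.mkdtemp(), 'payload.bin')  # hypothetical file
start = time.perf_counter()
with open(path, 'wb') as f:
    f.write(payload)
    f.flush()
    os.fsync(f.fileno())
disk_seconds = time.perf_counter() - start

print(f'memory: {memory_seconds:.6f}s, disk: {disk_seconds:.6f}s')
```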

# Homework

1. Write a program that prints itself.

Also print:
* The number of comments
* The number of functions

* Bonus: succeed at printing itself even when run from another folder, for example:

```bash
python printer.py
```
(Hint: the `os` modules are what you should be checking.)

2. Write a program that makes HTTP requests to a list of sites (from sites.txt here), read from a file. It should make each request 100 times, calculate the mean and median response time for each site, and then produce a report with the results.
(Hint: the `requests` library)
* Bonus: error handling for failing requests
* Bonus: make the report in an HTML format that can be shown in a browser
* Bonus: print the report on your home/office printer

You'll need to use the built-in `datetime` library.
(Hint: to actually print, you might want to use a program you already have installed; `popen` is your friend.)