The io module provides facilities for dealing with various types of I/O:
    1. Text I/O
    2. Binary I/O
    3. Raw I/O

File object
    - A concrete object belonging to any of these categories
    - Also called a stream or a file-like object

In [1]:
import io

In [2]:
print(io.__doc__)

The io module provides the Python interfaces to stream handling. The
builtin open function is defined in this module.

At the top of the I/O hierarchy is the abstract base class IOBase. It
defines the basic interface to a stream. Note, however, that there is no
separation between reading and writing to streams; implementations are
allowed to raise an OSError if they do not support a given operation.

Extending IOBase is RawIOBase which deals simply with the reading and
writing of raw bytes to a stream. FileIO subclasses RawIOBase to provide
an interface to OS files.

BufferedIOBase deals with buffering on a raw byte stream (RawIOBase). Its
subclasses, BufferedWriter, BufferedReader, and BufferedRWPair buffer
streams that are readable, writable, and both respectively.
BufferedRandom provides a buffered interface to random access
streams. BytesIO is a simple stream of in-memory bytes.

Another IOBase subclass, TextIOBase, deals with the encoding and decoding
of streams into text. TextIOW

In [3]:
print(dir(io))

['BlockingIOError', 'BufferedIOBase', 'BufferedRWPair', 'BufferedRandom', 'BufferedReader', 'BufferedWriter', 'BytesIO', 'DEFAULT_BUFFER_SIZE', 'FileIO', 'IOBase', 'IncrementalNewlineDecoder', 'OpenWrapper', 'RawIOBase', 'SEEK_CUR', 'SEEK_END', 'SEEK_SET', 'StringIO', 'TextIOBase', 'TextIOWrapper', 'UnsupportedOperation', '_WindowsConsoleIO', '__all__', '__author__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_io', 'abc', 'open', 'open_code']
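
The hierarchy described in the module docstring can be checked directly. The sketch below is only an illustration: the names pairs and writer are introduced here, and the choice of classes is arbitrary. It also shows the point about unsupported operations: a write-only stream still has a read() method, but calling it raises io.UnsupportedOperation, which is a subclass of OSError.

```python
import io

# Concrete classes paired with the abstract base each one falls under.
pairs = [
    (io.FileIO, io.RawIOBase),
    (io.BufferedReader, io.BufferedIOBase),
    (io.BytesIO, io.BufferedIOBase),
    (io.TextIOWrapper, io.TextIOBase),
    (io.StringIO, io.TextIOBase),
]
for concrete, base in pairs:
    print(f"{concrete.__name__:14} {base.__name__:15} {issubclass(concrete, base)}")

# A write-only buffered stream (hypothetical example object).
writer = io.BufferedWriter(io.BytesIO())
print(writer.readable(), writer.writable())

try:
    writer.read()  # the method exists, but the operation is not supported
except io.UnsupportedOperation as ex:
    print(type(ex).__name__, isinstance(ex, OSError))
```

Calling readable(), writable(), or seekable() before an operation is usually cleaner than catching the exception afterwards.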


## In-memory Streams

In [4]:
# Writing to a buffer
output = io.StringIO()
output.write("This goes into the buffer. ")
print("And so does this.", file=output)

In [5]:
# Retrieve the value written
print(output.getvalue())

This goes into the buffer. And so does this.



In [6]:
output.close()  # discard buffer memory

In [7]:
try:
    output.write("This goes into the buffer. ")
except ValueError as ex:
    print(ex)

I/O operation on closed file
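
Because the buffer is discarded on close(), a with block is a convenient way to scope an in-memory stream. This is a minimal sketch; buffer and snapshot are names chosen for the example.

```python
import io

with io.StringIO() as buffer:
    buffer.write("scratch data")
    snapshot = buffer.getvalue()  # copy the contents out before the block ends
    print(buffer.closed)          # still open inside the block

print(buffer.closed)  # closed automatically on exit
print(snapshot)       # the copied string is unaffected by the close
```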


In [8]:
# Initialize a read buffer
input_buffer = io.StringIO("Initial value for read buffer")

# Read from the buffer
print(input_buffer.read())

Initial value for read buffer
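
One detail worth knowing about initial values: the stream position starts at 0, so writing to a pre-filled StringIO overwrites the existing text rather than appending to it. The sketch below uses an example string of its own.

```python
import io

buffer = io.StringIO("Hello, world")

# The position starts at 0, so this replaces the first character.
buffer.write("J")
after_overwrite = buffer.getvalue()
print(after_overwrite)

# Move to the end of the stream first in order to append.
buffer.seek(0, io.SEEK_END)
buffer.write("!")
after_append = buffer.getvalue()
print(after_append)
```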


In [9]:
sentence = """
Welcome! Are you completely new to programming? 
If not then we presume you will be looking for information about why and 
how to get started with Python. Fortunately an experienced programmer in 
any programming language (whatever it may be) can pick up Python very quickly. 
It's also easy for beginners to use and learn, so jump in!
"""

In [10]:
stream_fh = io.StringIO(sentence)
stream_fh

<_io.StringIO at 0x26cd5cd0160>

In [11]:
stream_fh.read(10)

'\nWelcome! '

In [12]:
stream_fh.tell()

10

In [13]:
stream_fh.seek(0)

0

In [14]:
stream_fh.read(16)

'\nWelcome! Are yo'

In [15]:
stream_fh.readline()

'u completely new to programming? \n'

In [16]:
stream_fh.readline()

'If not then we presume you will be looking for information about why and \n'

In [17]:
stream_fh.readlines()

['how to get started with Python. Fortunately an experienced programmer in \n',
 'any programming language (whatever it may be) can pick up Python very quickly. \n',
 "It's also easy for beginners to use and learn, so jump in!\n"]
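
A stream can also be iterated directly, which yields one line at a time in the same way readline() does. A minimal sketch with its own small input (the variable names are illustrative):

```python
import io

stream = io.StringIO("alpha\nbeta\ngamma\n")

# Iterating consumes the stream line by line.
stripped = [line.rstrip("\n") for line in stream]
print(stripped)

# The stream is now exhausted, so read() returns an empty string.
leftover = stream.read()
print(repr(leftover))

# Rewind to read again from the start.
stream.seek(0)
first = stream.readline()
position = stream.tell()
print(repr(first), position)
```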

## Working with Byte Streams

In [18]:
# Writing to a buffer
output = io.BytesIO()
output.write("This goes into the buffer. ".encode("utf-8"))
output.write("ÁÇÊ".encode("utf-8"))

# Retrieve the value written
print(output.getvalue())

output.close()  # discard buffer memory

b'This goes into the buffer. \xc3\x81\xc3\x87\xc3\x8a'


In [19]:
# Initialize a read buffer
input_buffer = io.BytesIO(b"Initial value for read buffer")

# Read from the buffer
print(input_buffer.read())

b'Initial value for read buffer'
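
Text and byte streams do not accept each other's data: BytesIO takes bytes-like objects and StringIO takes str. The sketch below demonstrates both errors and then reads a byte stream in fixed-size chunks. The chunk size of 4 is arbitrary, and the := operator needs Python 3.8 or later.

```python
import io

binary = io.BytesIO()
try:
    binary.write("text")  # str is rejected by a byte stream
except TypeError as ex:
    binary_error = ex
    print("BytesIO:", ex)

text = io.StringIO()
try:
    text.write(b"bytes")  # bytes are rejected by a text stream
except TypeError as ex:
    text_error = ex
    print("StringIO:", ex)

# Read a byte stream in chunks until read() returns b"".
source = io.BytesIO(b"0123456789")
chunks = []
while chunk := source.read(4):
    chunks.append(chunk)
print(chunks)
```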


## Wrapping Byte Streams for Text Data

In [20]:
# Writing to a buffer
output = io.BytesIO()
wrapper = io.TextIOWrapper(
    output,
    encoding="utf-8",
    write_through=True,
)
wrapper.write("This goes into the buffer. ")
wrapper.write("ÁÇÊ")

# Retrieve the value written
print(output.getvalue())

output.close()  # discard buffer memory

# Initialize a read buffer
input_buffer = io.BytesIO(
    b"Initial value for read buffer with unicode characters " + "ÁÇÊ".encode("utf-8")
)
wrapper = io.TextIOWrapper(input_buffer, encoding="utf-8")

# Read from the buffer
print(wrapper.read())

b'This goes into the buffer. \xc3\x81\xc3\x87\xc3\x8a'
Initial value for read buffer with unicode characters ÁÇÊ
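
The write example above passes write_through=True. Without it, TextIOWrapper may hold encoded data in its own buffer, so an explicit flush() is needed before inspecting the underlying byte stream. The sketch below also uses detach() to separate the wrapper from the byte stream so the stream stays usable afterwards. The names raw, before_flush, and after_flush are illustrative, and newline="\n" is passed to keep the output identical across platforms.

```python
import io

raw = io.BytesIO()
wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

wrapper.write("ÁÇÊ\n")
before_flush = raw.getvalue()  # may not contain the text yet
print(before_flush)

wrapper.flush()                # push pending text down to the byte stream
after_flush = raw.getvalue()
print(after_flush)

# Detach the wrapper so the underlying stream is left open and usable.
wrapper.detach()
print(raw.closed)
```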


## Reading and Modifying Buffer Data

In [21]:
b = io.BytesIO(b"abcdef")
view = b.getbuffer()
view[2:4] = b"56"
b.getvalue()

b'ab56ef'
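
The view returned by getbuffer() shares memory with the stream, so the stream cannot be resized or closed while the view exists. A minimal sketch of that restriction follows; the variable resize_error is introduced only for this example.

```python
import io

b = io.BytesIO(b"abcdef")
view = b.getbuffer()

resize_error = None
b.seek(0, io.SEEK_END)
try:
    b.write(b"gh")  # would grow the buffer while a view is exported
except BufferError as ex:
    resize_error = ex
    print("BufferError:", ex)

# Release the view, after which the stream can grow again.
view.release()
b.write(b"gh")
print(b.getvalue())
```

Using the view in a with block (with b.getbuffer() as view:) releases it automatically.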

__Question:__ How does building an ordinary string compare with writing to a StringIO stream?

In [22]:
ordinary_string = ""
for i in range(100):
    ordinary_string += str(i)

print(ordinary_string)

0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899


In [23]:
str_stream_string = io.StringIO()
for i in range(100):
    str_stream_string.write(str(i))

print(str_stream_string)
print(str_stream_string.getvalue())

<_io.StringIO object at 0x0000026CD5CD01F0>
0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899


In [24]:
import sys

print(f"{sys.getsizeof(ordinary_string)   =}")
print(f"{sys.getsizeof(str_stream_string) =}")

sys.getsizeof(ordinary_string)   =239
sys.getsizeof(str_stream_string) =136


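__NOTE:__ sys.getsizeof reports only the shallow size of each object. The 136 bytes shown for the StringIO object are fewer than the 190 characters it holds, so the figure evidently excludes the stream's internal buffer. This comparison therefore does not show that streams are more memory efficient. The practical benefit of StringIO is a file-like interface for building text incrementally.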
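
For building a string from many pieces, a third option is worth comparing: str.join. The sketch below only checks that the three approaches produce the same result. It makes no performance or memory claim, and the variable names are illustrative.

```python
import io

pieces = [str(i) for i in range(100)]

# 1. Repeated concatenation
concatenated = ""
for piece in pieces:
    concatenated += piece

# 2. Writing to an in-memory text stream
stream = io.StringIO()
for piece in pieces:
    stream.write(piece)
streamed = stream.getvalue()

# 3. Joining the pieces in one call
joined = "".join(pieces)

print(concatenated == streamed == joined)
print(len(joined))
```

To measure actual memory use, a tool that tracks allocations, such as the standard tracemalloc module, is a better fit than sys.getsizeof.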

## Usage

In [25]:
import csv
import io

reader = csv.reader(io.StringIO("a,b,c\n1,2,3"))
print([r for r in reader])
# output [['a', 'b', 'c'], ['1', '2', '3']]

[['a', 'b', 'c'], ['1', '2', '3']]
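
The same idea works in the other direction: csv.writer accepts any object with a write() method, so a StringIO can collect CSV output without touching the disk. This is a minimal sketch that reuses the rows from the cell above. The buffer is created with newline="" so the writer's own line endings are stored unchanged.

```python
import csv
import io

buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerows([["a", "b", "c"], [1, 2, 3]])

csv_text = buffer.getvalue()
print(repr(csv_text))  # csv.writer terminates rows with \r\n by default
```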


In [26]:
import gzip
import io

byte_stream = io.BytesIO()
gzip_file = gzip.GzipFile(fileobj=byte_stream, mode="wb")
gzip_file.write(b"Hello World")
gzip_file.close()

byte_stream.getvalue()

b'\x1f\x8b\x08\x00l\x91\x0c_\x02\xff\xf3H\xcd\xc9\xc9W\x08\xcf/\xcaI\x01\x00V\xb1\x17J\x0b\x00\x00\x00'
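
To confirm the compressed bytes are valid, the stream can be rewound and read back through GzipFile in read mode. This sketch shows the round trip; compressed and restored are names chosen here. Closing a GzipFile does not close the underlying fileobj, which is why the BytesIO can be reused.

```python
import gzip
import io

compressed = io.BytesIO()
with gzip.GzipFile(fileobj=compressed, mode="wb") as gz:
    gz.write(b"Hello World")

# Rewind and decompress from the same in-memory stream.
compressed.seek(0)
with gzip.GzipFile(fileobj=compressed, mode="rb") as gz:
    restored = gz.read()

print(restored)
print(gzip.decompress(compressed.getvalue()) == restored)
```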
