In [1]:
"""
Opening files and file objects


1. In Python, you open and read a file by using the built-in open function and various built-in reading operations. Calling open doesn’t read
   anything from the file; instead, it returns an object called a file object that you can use to access the opened file. A file object keeps
   track of a file and how much of the file has been read or written. All Python file I/O is done using file objects rather than filenames.

2. In the example, the first call to readline returns the first line in the file object, everything up to and including the first newline
   character or the entire file if there’s no newline character in the file; the next call to readline returns the second line, if it
   exists, and so on.

3. The first argument to the open function is a pathname. In the example, you’re opening what you expect to be an existing file
   in the data subdirectory of the current working directory.

4. Note also that this example uses the with keyword, indicating that the file will be opened with a context manager. It’s enough to note
   that this style of opening files better manages potential I/O errors and is generally preferred.
"""
with open('data/word_count.txt', 'r') as file_object:
    line = file_object.readline()
    print(line)


Python provides a complete set of control flow elements,
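
The behavior of successive readline calls described in point 2 can be sketched as follows. The temporary file and the names tmp_dir and sample_path are assumptions made for illustration, so that the sketch does not depend on data/word_count.txt.

```python
import os
import tempfile

# Build a small sample file (hypothetical path) to read from.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.txt")
with open(sample_path, "w") as f:
    f.write("first line\nsecond line\n")

with open(sample_path, "r") as file_object:
    first = file_object.readline()    # first line, trailing newline included
    second = file_object.readline()   # second line
    at_end = file_object.readline()   # empty string once nothing is left

print(repr(first), repr(second), repr(at_end))
print(file_object.closed)  # the with block has already closed the file
```

The empty string returned at the end of the file is what makes readline usable as a loop condition.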



In [2]:
"""
Closing files


1. After all data has been read from or written to a file object, it should be closed. Closing a file object frees up system resources,
   allows the underlying file to be read or written to by other code, and in general makes the program more reliable. For small scripts,
   not closing a file object generally doesn’t have much of an effect; file objects are automatically closed when the script or program
   terminates. For larger programs, too many open file objects may exhaust system resources, causing the program to abort.

2. You close a file object by calling its close method when the file object is no longer needed. Alternatively, opening the file with a
   context manager (the with keyword) closes it automatically when the with block ends.
"""
file_object = open('data/word_count.txt', 'r')
line = file_object.readline()
print(line)
file_object.close()


Python provides a complete set of control flow elements,
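
When close is called by hand, one common pattern is to wrap the reads in try/finally so that the file is released even if a read fails. The following is a minimal sketch of that pattern; the temporary file is an assumption for illustration.

```python
import os
import tempfile

# Hypothetical sample file so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.txt")
with open(sample_path, "w") as f:
    f.write("only line\n")

file_object = open(sample_path, "r")
try:
    line = file_object.readline()
finally:
    # Runs whether or not readline raised, so the file is always closed.
    file_object.close()

print(repr(line), file_object.closed)
```

A with statement does roughly the same bookkeeping in less code, which is one reason it is generally preferred.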



In [3]:
"""
Opening files in write or other modes


1. The second argument of the open function is a string denoting how the file should be opened. 'r' means “Open the file for reading,”
   'w' means “Open the file for writing” (any data already in the file will be erased), and 'a' means “Open the file for appending”
   (new data will be appended to the end of any data already in the file). If you want to open the file for reading, you can leave out
   the second argument; 'r' is the default.

2. Depending on the operating system, open may also have access to additional file modes. open can also take an optional third argument,
   which defines how reads or writes for that file are buffered. Buffering is the process of holding data in memory until enough data has
   been requested or written to justify the time cost of doing a disk access. Other parameters to open control the encoding for text files
   and the handling of newline characters in text files.
"""
file_object = open('data/myfile.txt', 'w')
file_object.write("Hello World!")
file_object.close()
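
The difference between 'w' and 'a' can be seen in a short sketch. The file name modes.txt and the temporary directory are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical file

with open(path, "w") as f:   # creates the file
    f.write("one\n")

with open(path, "w") as f:   # 'w' erases the existing contents first
    f.write("two\n")

with open(path, "a") as f:   # 'a' keeps the contents and adds to the end
    f.write("three\n")

with open(path) as f:        # mode omitted, so the default 'r' is used
    contents = f.read()

print(repr(contents))
```

Only the second and third writes survive, because the second open in 'w' mode discarded "one".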

In [3]:
"""
Counting file lines


1. If there’s nothing more to be read from the file, readline returns an empty string, which makes it easy to (for example) count the
   number of lines in a file.

2. An even shorter way to count all the lines is to use the readlines method, which reads all the lines in a file and returns them
   as a list of strings, one string per line (with trailing newlines still included).

3. If you happen to be counting all the lines in a huge file, of course, this method may cause your computer to run out of memory because
   it reads the entire file into memory at once. It’s also possible to overflow memory with readline if you have the misfortune to try to
   read a line from a huge file that contains no newline characters, although this situation is highly unlikely. To handle such circumstances,
   both readline and readlines can take an optional argument affecting the amount of data they read at any one time.

4. Another way to iterate over all of the lines in a file is to treat the file object as an iterator in a for loop. This method has
   the advantage that the lines are read into memory as needed, so even with large files, running out of memory isn’t a concern.
   The other advantage of this method is that it’s simpler and easier to read.
"""

# Method 1: call readline until it returns an empty string
with open('data/word_count.txt', 'r') as file_object:
    count = 0
    while file_object.readline() != "":
        count = count + 1
print(count)

# Method 2: read every line into a list and take its length
with open("data/word_count.txt", 'r') as file_object:
    print(len(file_object.readlines()))

# Method 3: iterate over the file object, one line at a time
with open("data/word_count.txt", 'r') as file_object:
    count = 0
    for line in file_object:
        count = count + 1
print(count)

4
4
4
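
Two of the points above lend themselves to a short sketch: counting lines by iterating over the file object, and limiting how much readline reads by passing a size argument. The sample file is an assumption for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")  # hypothetical file
with open(path, "w") as f:
    f.write("alpha\nbeta\ngamma\n")

# Iterating reads one line at a time, so memory use stays small.
with open(path) as f:
    line_count = sum(1 for _ in f)

# In text mode, readline(n) returns at most n characters of the current line.
with open(path) as f:
    piece = f.readline(3)   # the first three characters of "alpha\n"
    rest = f.readline()     # the remainder of that same line

print(line_count, repr(piece), repr(rest))
```

The size argument is a cap, not a request for whole lines, so the second readline call picks up where the first one stopped.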


In [9]:
"""
newline in text mode


1. A possible problem with the read methods arises from newline translation in text mode, that is, when you open a file without adding
   a b to the mode. By default (newline=None), Python 3 reads in universal newlines mode: "\n", "\r", and "\r\n" line endings are all
   converted to "\n", regardless of platform. You can control this treatment by passing the newline parameter when you open the file:
   newline="" still recognizes all three endings but returns them untranslated, whereas newline="\n", "\r", or "\r\n" treats only that
   string as a line ending and returns it untranslated.

2. If the file has been opened in binary mode, the newline parameter isn’t needed, because all bytes are returned exactly as they are
   in the file.
"""

# Treat only "\r" as a line ending; "\n" characters are left inside the lines.
file_object = open("data/word_count.txt", newline="\r")
for line in file_object:
    print(line)
file_object.close()

Python provides a complete set of control flow elements,
including while and for loops, and conditionals.
Python uses the level of indentation to group blocks
of code with control elements.
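
The effect of the newline parameter on reading is easier to see with a file that mixes line endings. The sketch below writes such a file in binary mode (so nothing is translated on the way in) and reads it back three ways; the file name is an assumption for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "mixed.txt")  # hypothetical file
with open(path, "wb") as f:
    f.write(b"a\r\nb\rc\n")   # three different line endings

# Default (newline=None): every ending is recognized and translated to "\n".
with open(path) as f:
    default_lines = f.readlines()

# newline="": every ending is recognized but returned untranslated.
with open(path, newline="") as f:
    raw_lines = f.readlines()

# newline="\r": only "\r" ends a line; "\n" stays inside the data.
with open(path, newline="\r") as f:
    cr_lines = f.readlines()

print(default_lines)
print(raw_lines)
print(cr_lines)
```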


In [12]:
"""
write & writelines


1. The write methods that correspond to the readline and readlines methods are the write and writelines methods. Note that there’s no
   writeline function.
   
2. The write method writes a single string, which can span multiple lines if newline characters are embedded within the string. write
   doesn’t add a newline after it writes its argument; if you want a newline in the output, you must put it there yourself.
   If you open a file in text mode (using w), any \n characters are mapped to the platform-specific line ending, os.linesep (that is,
   '\r\n' on Windows; on Linux and macOS the line ending is already '\n'). Again, opening the file with newline="" or newline="\n"
   prevents this translation.

3. The writelines method is something of a misnomer because it doesn’t necessarily write lines; it takes a list of strings as an argument
   and writes them, one after the other, to the given file object without writing newlines. If the strings in the list end with newlines,
   they’re written as lines; otherwise, they’re effectively concatenated in the file. But writelines is a precise inverse of readlines
   in that it can be used on the list returned by readlines to write a file identical to the file readlines read from.
"""
lines = ["123\n", "hello\r", "word"]
output = open("data/myfile2.txt", 'w')
output.writelines(lines)
output.close()
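
The relationship between readlines and writelines described in point 3 can be sketched as a round trip. The three temporary files are assumptions for illustration.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "src.txt")        # hypothetical source file
dst = os.path.join(tmp_dir, "dst.txt")        # hypothetical copy
joined = os.path.join(tmp_dir, "joined.txt")  # hypothetical file without newlines

with open(src, "w") as f:
    f.write("one\ntwo\n")

# Round trip: the list returned by readlines goes straight into writelines.
with open(src) as f:
    lines = f.readlines()
with open(dst, "w") as f:
    f.writelines(lines)

# Without trailing newlines the strings are simply concatenated.
with open(joined, "w") as f:
    f.writelines(["ab", "cd"])

with open(src, "rb") as a, open(dst, "rb") as b:
    same = a.read() == b.read()
with open(joined) as f:
    joined_text = f.read()

print(lines, same, repr(joined_text))
```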

In [14]:
"""
Binary mode


1. On some occasions, you may want to read all the data in a file into a single bytes object, especially if the data isn’t a string,
   and you want to get it all into memory so you can treat it as a byte sequence. Or you may want to read data from a file as bytes
   objects of a fixed size.

2. You may be reading data without explicit newlines, for example, where each line is assumed to be a sequence of characters of a fixed
   size. To do so, use the read method. Without any argument, this method reads all of a file from the current position and returns that
   data as a bytes object. With a single-integer argument, it reads that number of bytes (or fewer, if there isn’t enough data in the file
   to satisfy the request) and returns a bytes object of at most the given size.

3. Keep in mind that files open in binary mode deal only in bytes, not strings. To use the data as strings, you must decode any bytes
   objects to string objects. This point is often important in dealing with network protocols, where data streams often behave as files
   but need to be interpreted as bytes, not strings.
"""

input_file = open("data/myfile2.txt", 'rb')
header = input_file.read(4)
data = input_file.read()
print(header, data)
input_file.close()

b'123\n' b'hello\rword'
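
Reading fixed-size pieces and decoding them to strings, as points 2 and 3 describe, might look like the following sketch. The chunk size of 4, the ASCII encoding, and the file name are assumptions chosen for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "fixed.bin")  # hypothetical file
with open(path, "wb") as f:
    f.write(b"AAAABBBBCC")   # two full 4-byte chunks and a short tail

chunks = []
with open(path, "rb") as f:
    while True:
        chunk = f.read(4)    # returns at most 4 bytes
        if not chunk:        # b'' signals the end of the file
            break
        chunks.append(chunk)

# Bytes must be decoded before they can be used as strings.
text_chunks = [c.decode("ascii") for c in chunks]

print(chunks)
print(text_chunks)
```

The last chunk is shorter than requested, which is why code that expects fixed-size records usually checks the length of what read returned.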


In [15]:
"""
Reading and writing with pathlib


In addition to its path-manipulation powers, a Path object can be used to read and write text and binary files. This capability can be
convenient because no open or close is required, and separate methods are used for text and binary operations. One limitation, however,
is that you have no way to append by using write_text or write_bytes, because writing replaces any existing content.
"""
from pathlib import Path


p_text = Path('data/myfile3.txt')
return_value = p_text.write_text('Text file contents')
print(return_value, p_text.read_text())

p_binary = Path('data/myfile4.txt')
return_value = p_binary.write_bytes(b'Binary file contents')
print(return_value, p_binary.read_bytes())

18 Text file contents
20 b'Binary file contents'
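
When appending is needed, one possible workaround is the Path.open method, which accepts the same mode strings as the built-in open. The sketch below shows write_text replacing content and Path.open("a") adding to it; the file name notes.txt is an assumption for illustration.

```python
import tempfile
from pathlib import Path

p = Path(tempfile.mkdtemp()) / "notes.txt"   # hypothetical file

written = p.write_text("first\n")   # returns the number of characters written
p.write_text("second\n")            # replaces the previous contents

with p.open("a") as f:              # Path.open takes the usual mode strings
    f.write("third\n")

contents = p.read_text()
print(written, repr(contents))
```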


In [21]:
"""
Screen input/output and redirection

1. You can use the built-in input function to prompt for and read an input string. The prompt is optional, and the newline
   at the end of the input line is stripped off. To read in numbers by using input, you need to explicitly convert the string
   that input returns to the correct number type.

2. input writes its prompt to the standard output and reads from the standard input. Lower-level access to these and standard error
   can be obtained by using the sys module, which has sys.stdin, sys.stdout, and sys.stderr attributes. These attributes can be treated
   as specialized file objects. For sys.stdin, you have the read, readline, and readlines methods. For sys.stdout and sys.stderr, you
   can use the standard print function as well as the write and writelines methods, which operate as they do for other file objects.

3. You can redirect standard input to read from a file. Similarly, standard output or standard error can be set to write to files
   and then programmatically restored to their original values by using sys.__stdin__, sys.__stdout__, and sys.__stderr__.


# input
x = input("enter file name to use: ")
print(x)
y = int(input("enter your number: "))
print(y)
"""
# sys.stdout, sys.stdin, sys.stderr
import sys

#print("Write to the standard output.")
#sys.stdout.write("Write to the standard output2.\n")
#s = sys.stdin.readline()
#print(s)

# redirect
f = open("./data/myfile5.txt", 'w')
sys.stdout = f
sys.stdout.writelines(["A first line.\n", "A second line.\n"])
print("3+4")
sys.stdout = sys.__stdout__
f.close()

f = open("./data/myfile6.txt", 'w')
# The print function can also write to any file without changing standard output.
print("A first line.", "A second line.", sep="\n", file=f)
f.close()
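
As an alternative to assigning to sys.stdout by hand, the standard library's contextlib.redirect_stdout restores the previous stream automatically when its with block ends. This is a minimal sketch that captures into an in-memory io.StringIO buffer rather than a file; the variable names are assumptions for illustration.

```python
import contextlib
import io

buffer = io.StringIO()   # in-memory text stream standing in for a file

with contextlib.redirect_stdout(buffer):
    print("captured line")       # goes to buffer, not the console

print("back on the console")     # standard output is restored here

captured = buffer.getvalue()
print(repr(captured))
```

Because the previous stream is saved and put back, this approach also behaves well when sys.stdout had already been replaced by something other than sys.__stdout__.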

In [26]:
"""
The module mio
"""
sys.path.append('./module')
import mio

mio.capture_output(file="./data/capture_file")
print('hello world1')
print("test2")
mio.restore_output()
mio.print_file(file="./data/capture_file")
mio.clear_file()

In [24]:
"""
Reading structured binary data with the struct module

1. For very simple storage needs, it’s usually best to use text or bytes input and output. For more sophisticated applications, Python
   provides the ability to easily read or write arbitrary Python objects (pickling). This ability is much less error-prone than directly
   writing and reading your own binary data and is highly recommended.

2. But there’s at least one situation in which you’ll likely need to know how to read or write binary data: when you’re dealing with
   files that are generated or used by other programs. 

3. As you've seen, Python supports explicit binary input or output by using bytes instead of strings if you open the file in binary mode.
   But because most binary files rely on a particular structure to help parse the values, writing your own code to read and split them into
   variables correctly is often more work than it’s worth. Instead, you can use the standard struct module to permit you to treat those
   bytes objects as formatted byte sequences with some specific meaning.

4. struct gets even better; you can insert other special characters into the format string to indicate that data should be read/written
   in big-endian, little-endian, or machine-native-endian format (default is machine-native) and to indicate that things like a C short
   integer should be sized either as native to the machine (the default) or as standard C sizes.
"""
import struct

# Define a format string understandable to the struct module, which tells the module how the data in one of your records is packed
record_format = 'hd4s'

# determine the number of bytes used to contain data in such a format
record_size = struct.calcsize(record_format)

# read records one at a time until the file is exhausted
result_list = []
with open('./data/myfile7.txt', 'rb') as input_file:
    while True:
        record = input_file.read(record_size)
        if record == b'':
            break
        # unpack the record into a tuple
        result_list.append(struct.unpack(record_format, record))
print(result_list)


[(7, 3.14, b'gbye')]
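
The cell above only reads records. A sketch of the writing side, using struct.pack with the byte-order prefix mentioned in point 4, might look like this. The "<" prefix (little-endian, standard sizes, no padding), the second sample record, and the file name are assumptions for illustration; they differ from the native format 'hd4s' used above.

```python
import os
import struct
import tempfile

# "<" selects little-endian with standard sizes and no padding, so a record
# is 2 (short) + 8 (double) + 4 (4-byte string) = 14 bytes on any machine.
record_format = "<hd4s"
record_size = struct.calcsize(record_format)

path = os.path.join(tempfile.mkdtemp(), "records.bin")  # hypothetical file
with open(path, "wb") as f:
    f.write(struct.pack(record_format, 7, 3.14, b"gbye"))
    f.write(struct.pack(record_format, 8, 2.5, b"helo"))

records = []
with open(path, "rb") as f:
    while True:
        record = f.read(record_size)
        if len(record) < record_size:   # end of file or truncated record
            break
        records.append(struct.unpack(record_format, record))

print(record_size, records)
```

With the native format, the record size can be larger because of alignment padding, which is why an explicit prefix is often used for files shared between machines.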


In [27]:
"""
Pickling object files

1. Python can write almost any data structure into a file, read that data structure back out of a file, and re-create it with just a few
   commands. This capability is unusual but can be useful, because it can save you many pages of code that do nothing but dump the state
   of a program into a file (and can save a similar amount of code that does nothing but read that state back in). Python provides this
   capability via the pickle module.

2. It doesn’t matter what was stored in a, b, and c. The content might be as simple as numbers or as complex as a list of dictionaries
   containing instances of user-defined classes. pickle.dump saves everything. Each call to pickle.load then returns the next stored
   object, in the order in which the objects were dumped, so the data can be restored to a, b, and c or to any other variables.

3. The pickle module can store almost anything in this manner. It can handle lists, tuples, numbers, strings, dictionaries, and just about
   anything made up of these types of objects, which includes instances of most classes. It also handles shared objects, cyclic references,
   and other complex memory structures correctly, storing shared objects only once and restoring them as shared objects, not as identical
   copies. But code objects (what Python uses to store byte-compiled code) and system resources (like files or sockets) can’t be pickled.
 
4. More often than not, you won’t want to save your entire program state with pickle. Most applications can have multiple documents open
   at one time, for example. If you saved the entire state of the program, you would effectively save all open documents in one file.
   An easy and effective way of saving and restoring only data of interest is to write a save function that stores all data you want to
   save into a dictionary and then uses pickle to save the dictionary. Then you can use a complementary restore function to read
   the dictionary back in (again using pickle) and to assign the values in the dictionary to the appropriate program variables.
   This technique also has the advantage that there’s no possibility of reading values back in an incorrect order— that is, an order
   different from the order in which the values were stored.

5. Reasons not to pickle:
   --- Pickling is neither particularly fast nor space-efficient as a means of serialization. Even using JSON to store serialized objects
       is faster and results in smaller files on disk.
   --- Pickling isn’t secure, and loading a pickle with malicious content can result in the execution of arbitrary code on your machine.
       Therefore, you should avoid pickling if there’s any chance at all that the pickle file will be accessible to anyone who might
       alter it. 
"""
import pickle

# sample objects to pickle
a = 1
b = [1, 2, '3asd', b'hhh']
c = {'d': 1, 'c': 2}

file = open("./data/state", "wb")
pickle.dump(a, file)
pickle.dump(b, file)
pickle.dump(c, file)
file.close()

file = open("./data/state", 'rb')
e = pickle.load(file)
d = pickle.load(file)
f = pickle.load(file)
file.close()

print(e, d, f)

1 [1, 2, '3asd', b'hhh'] {'d': 1, 'c': 2}
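
The save and restore functions suggested in point 4 could be sketched as follows. The names save_state and restore_state, the dictionary keys, and the temporary file are assumptions for illustration.

```python
import os
import pickle
import tempfile

def save_state(path, state):
    """Pickle a dictionary holding the values of interest (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(state, f)

def restore_state(path):
    """Load the dictionary back (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)

state_path = os.path.join(tempfile.mkdtemp(), "state")  # hypothetical file

save_state(state_path, {
    "a": 1,
    "b": [1, 2, "3asd", b"hhh"],
    "c": {"d": 1, "c": 2},
})

restored = restore_state(state_path)
# Values are looked up by key, so the order of storage no longer matters.
a, b, c = restored["a"], restored["b"], restored["c"]
print(a, b, c)
```

As the notes above warn, a sketch like this should only load pickle files that come from a trusted source.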


In [28]:
"""
Shelve module

1. Think of a shelve object as a dictionary that stores its data in a file on disk rather than in memory, which means that you still have
   the convenience of access with a key, but you aren’t limited by the amount of available RAM.

2. The Python shelve module permits the reading or writing of pieces of data in large files without reading or writing the entire file.
   
3. shelve.open returns a shelf object that permits basic dictionary operations, key assignment or lookup, del, in, and the keys method.
   But unlike a normal dictionary, shelf objects store their data on disk, not in memory. Unfortunately, shelf objects do have one
   significant restriction compared with dictionaries: They can use only strings as keys, versus the wide range of key types allowable
   in dictionaries. 
"""

import shelve

book = shelve.open("./data/address")
book['flintstone'] = ('fred', '555-1234', '1233 Bedrock Place')
book['rubble'] = ('barney', '555-4321', '1235 Bedrock Place')
book.close()

book = shelve.open("./data/address")
print(book['flintstone'])
book.close()

('fred', '555-1234', '1233 Bedrock Place')
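
The dictionary-style operations listed in point 3 (key assignment, lookup, del, in, and keys) can be sketched with a shelf opened as a context manager, which closes it automatically. The temporary shelf path is an assumption for illustration.

```python
import os
import shelve
import tempfile

shelf_path = os.path.join(tempfile.mkdtemp(), "address")  # hypothetical shelf

# A shelf can be used in a with statement, so close is called for you.
with shelve.open(shelf_path) as book:
    book["flintstone"] = ("fred", "555-1234", "1233 Bedrock Place")
    book["rubble"] = ("barney", "555-4321", "1235 Bedrock Place")

with shelve.open(shelf_path) as book:
    has_rubble = "rubble" in book      # membership test
    del book["rubble"]                 # remove an entry
    remaining = sorted(book.keys())    # keys are always strings
    fred = book["flintstone"]          # lookup by key

print(has_rubble, remaining, fred)
```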
