<a href="https://colab.research.google.com/github/justalge/another_python_totorial/blob/main/week4/Lecture_7_files_packages_exceptions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# File Management

A file, or computer file, is a chunk of logically related data or information that can be used by computer programs. A file is usually kept on a permanent storage medium, e.g. a hard disk drive. Human users, programs, and scripts use a unique name and path to access a file for reading and modification.

The term "file", as we have described it in the previous paragraph, appeared very early in the history of computers. Its usage can be traced back to the year 1952, when punch cards were used.

#### Reading and Writing Files in Python

The syntax for reading and writing files in Python is similar to that of programming languages like C, C++, Java, and Perl, but a lot easier to handle.

In [None]:
# We will start with writing a file. We have a string which contains part of the
# definition of a general file from Wikipedia:

definition = """

A computer file is a computer resource for recording data discretely in a
computer storage device. Just as words can be written 
to paper, so can information be written to a computer
file. Files can be edited and transferred through the 
internet on that particular computer system.
"""

# We will write this into a file with the name file_definition.txt:

open("file_definition.txt", "w").write(definition)

!ls

file_definition.txt  sample_data


In [None]:
# Let's look at the file content:

!cat file_definition.txt


A computer file is a computer resource for recording data discretely in a
computer storage device. Just as words can be written 
to paper, so can information be written to a computer
file. Files can be edited and transferred through the 
internet on that particular computer system.

In [None]:
print(definition, file=open("file_definition.txt", "w"))


A computer file is a computer resource for recording data discretely in a
computer storage device. Just as words can be written 
to paper, so can information be written to a computer
file. Files can be edited and transferred through the 
internet on that particular computer system.


In [None]:
# We successfully created and have written to a text file. Now, we want to see
# how to read this file from Python. We can read the whole text file into one
# string, as you can see in the following code:

text = open("file_definition.txt").read()
print(text)


A computer file is a computer resource for recording data discretely in a
computer storage device. Just as words can be written 
to paper, so can information be written to a computer
file. Files can be edited and transferred through the 
internet on that particular computer system.


In [None]:
# Reading a text file into one string object is okay, as long as the file is
# not too large. If a file is large, we can read it line by line. We
# demonstrate how this can be achieved in the following example with a small file:

with open("file_definition.txt", "r") as fh:
    for line in fh:
        print(line.strip())
        print('--------------------------------------')


--------------------------------------
A computer file is a computer resource for recording data discretely in a
--------------------------------------
computer storage device. Just as words can be written
--------------------------------------
to paper, so can information be written to a computer
--------------------------------------
file. Files can be edited and transferred through the
--------------------------------------
internet on that particular computer system.
--------------------------------------


In [None]:
'bird flies'.split()

['bird', 'flies']

In [None]:
'   \n sdflkdsjfkldsjfkldsfjklds \n    '.strip()

'sdflkdsjfkldsjfkldsfjklds'

In [None]:
# Some people don't use the with statement to read or write files. This is not
# a good idea. The code above without with looks like this:

fh = open("file_definition.txt")
for line in fh:
    print(line.strip())
    print('---------------------------')
fh.close()


---------------------------
A computer file is a computer resource for recording data discretely in a
---------------------------
computer storage device. Just as words can be written
---------------------------
to paper, so can information be written to a computer
---------------------------
file. Files can be edited and transferred through the
---------------------------
internet on that particular computer system.
---------------------------


**A striking difference between the two implementations is the usage of close. If we use with, we do not have to explicitly close the file. The file will be closed automatically when the with block ends. Without with, we have to explicitly close the file, like in our second example with fh.close(). There is a more important difference between them: if an exception occurs inside of the with block, the file will still be closed. If an exception occurs in the variant without with before the close, the file will not be closed. This means you should always use the with statement.**
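The difference can be made visible with a small sketch. The file name `with_demo.txt` and the deliberately raised `ValueError` are assumptions made only for this illustration; the `closed` attribute of a file object reports whether it has been closed.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

# Variant with "with": the file is closed although an exception is raised.
try:
    with open(path, "w") as fh_with:
        fh_with.write("some text")
        raise ValueError("simulated error inside the with block")
except ValueError:
    pass
closed_after_with = fh_with.closed
print(closed_after_with)      # True

# Variant without "with": the exception skips the call to close().
fh_plain = open(path, "w")
try:
    fh_plain.write("some text")
    raise ValueError("simulated error before close()")
    fh_plain.close()          # never reached
except ValueError:
    pass
closed_without_with = fh_plain.closed
print(closed_without_with)    # False

fh_plain.close()              # clean up by hand
```

The second variant could be repaired with a try/finally block, but the with statement expresses the same guarantee in fewer lines.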

We have already seen how to write into a file with "write". The following example shows how to read one file line by line, change the lines, and write the changed content into another file. The input file, pythonista_and_python.txt, can be downloaded:

In [None]:
# First, let's download the file:

! wget https://www.python-course.eu/pythonista_and_python.txt
!ls

--2021-09-27 09:26:23--  https://www.python-course.eu/pythonista_and_python.txt
Resolving www.python-course.eu (www.python-course.eu)... 138.201.17.115, 2a01:4f8:171:286f::4
Connecting to www.python-course.eu (www.python-course.eu)|138.201.17.115|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 406 [text/plain]
Saving to: ‘pythonista_and_python.txt’


2021-09-27 09:26:24 (31.8 MB/s) - ‘pythonista_and_python.txt’ saved [406/406]

file_definition.txt  pythonista_and_python.txt	sample_data  text_file.txt


In [None]:
with open("pythonista_and_python.txt") as infile:
    with open("python_newbie_and_the_guru.txt", "w") as outfile:
        for line in infile:
            line = line.replace("Pythonista", "Python newbie")
            line = line.replace("Python snake", "Python guru")
            print(line.rstrip())
            # write the line into the file:
            outfile.write(line)
            

A blue Python newbie, green behind the ears, went to Pythonia.
She wanted to visit the famous wise green Python guru.
She wanted to ask her about the white way to avoid the black.
The bright path to program in a yellow, green, or blue style.
The green Python turned red, when she addressed her.
The Python newbie turned yellow in turn.
After a long but not endless loop  the wise Python uttered:
"The rainbow!"


As we have already mentioned: if a file is not too large and we have to do replacements like in the previous example, we don't need to read and write the file line by line. It is simpler to use the read method, which returns a string containing the complete content of the file, including the line endings. We can apply the changes to this string and save it into the new file. The example below works without a with construct because no reference to the file object is kept: CPython closes such a file as soon as the object is discarded after reading or writing. Other Python implementations may close it later, so the with statement remains the safer choice:

In [None]:
txt = open("pythonista_and_python.txt").read()
txt = txt.replace("Pythonista", "Python newbie")
txt = txt.replace("Python snake", "Python guru")
open("python_newbie_and_the_guru.txt", "w").write(txt)

!ls

file_definition.txt	   python_newbie_and_the_guru.txt  text_file.txt
pythonista_and_python.txt  sample_data


#### Resetting the File's Current Position

It's possible to set - or reset - a file's position to a certain position, also called the offset. To do this, we use the method seek. The parameter of seek determines the offset which we want to set the current position to. To work with seek, we will often need the method tell which "tells" us the current position. When we have just opened a file, it will be zero. Before we demonstrate the way of working of both seek and tell, we create a simple file on which we will perform our commands:

In [None]:
open("small_text.txt", "w").write("brown is her favorite colour")

!ls

file_definition.txt	   python_newbie_and_the_guru.txt  small_text.txt
pythonista_and_python.txt  sample_data			   text_file.txt


In [None]:
# The method tell returns the current stream position, i.e. the position where
# we will continue, when we use a "read", "readline" or so on:

fh = open("small_text.txt")
fh.tell()

# Zero tells us that we are positioned at the first character of the file.

0

In [None]:
!cat small_text.txt

brown is her favorite colour

In [None]:
# We will now read the next five characters of the file:

fh.read(5)

'brown'

In [None]:
# Using tell again shows that we are located at position 5:

fh.tell()

5

In [None]:
# Using read without parameters will read the remainder of the file
# starting from this position:

fh.read()

' is her favorite colour'

In [None]:
# Using tell again tells us the position after the last character of the
# file. For this ASCII-only file, the number equals the number of characters in the file!

fh.tell()

28

With seek we can move the position to an arbitrary place in the file. The method seek takes two parameters:

`fh.seek(offset, startpoint_for_offset)`

where fh is the file pointer we are working with. The parameter offset specifies how many positions the pointer will be moved. The question is from which position the pointer should be moved. This position is specified by the second parameter startpoint_for_offset. It can have the following values:

* 0: reference point is the beginning of the file
* 1: reference point is the current file position
* 2: reference point is the end of the file

If the startpoint_for_offset parameter is not given, it defaults to 0.

WARNING: With a nonzero offset, the values 1 and 2 for the second parameter work only if the file has been opened in binary mode. We will cover this later!
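The restriction on the second parameter can be checked with a short sketch. The file name `seek_demo.txt` is an assumption for this illustration; the text is the same as in small_text.txt. In text mode a nonzero offset relative to the end raises `io.UnsupportedOperation`, while the same call succeeds in binary mode.

```python
import io
import os
import tempfile

# Hypothetical scratch file with the same content as small_text.txt.
path = os.path.join(tempfile.mkdtemp(), "seek_demo.txt")
with open(path, "w") as fh:
    fh.write("brown is her favorite colour")

# Text mode: a nonzero offset relative to the end of the file is rejected.
with open(path, "r") as fh:
    try:
        fh.seek(-6, 2)
        text_mode_result = "no error"
    except io.UnsupportedOperation:
        text_mode_result = "io.UnsupportedOperation"
    fh.seek(0, 2)    # a zero offset relative to the end is allowed

# Binary mode: the same seek works and positions us 6 bytes before the end.
with open(path, "rb") as fh:
    fh.seek(-6, 2)
    tail = fh.read()

print(text_mode_result)   # io.UnsupportedOperation
print(tail)               # b'colour'
```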

In [None]:
# The following examples use the default behaviour:

fh.seek(13)
print(fh.tell())   # just to show you, what seek did!
fh.read()          # reading the remainder of the file

13


'favorite colour'

It is also possible to move the position relative to the current position. If we want to move k characters to the right, we can just set the argument of seek to fh.tell() + k.

In [None]:
k = 6
fh.seek(5)    # setting the position to 5
fh.seek(fh.tell() + k)   #  moving k positions to the right
print("We are now at position: ", fh.tell())

We are now at position:  11


`seek` does not accept negative positions. On the other hand, it doesn't matter if the position is larger than the length of the file. In the following, we define a function that sets the position to zero if the computed value is negative. Because seeking past the end of the file does no harm, the function keeps values greater than the length of the file and does not check the file length at all:

In [None]:
def relative_seek(fp, k):
    """ relative_seek moves the position of the file pointer fp k characters
    to the left (k<0) or right (k>0)
    """
    position = fp.tell() + k
    if position < 0:
        position = 0
    # use the parameter fp, not a global file handle
    fp.seek(position)
    

with open("small_text.txt") as fh:
    print(fh.tell())
    relative_seek(fh, 7)
    print(fh.tell())
    relative_seek(fh, -5)
    print(fh.tell())
    relative_seek(fh, -10)
    print(fh.tell())

0
7
2
0


You might have wondered, when we wrote the function relative_seek, why we did not use the second parameter of seek. After all, the help text says "1 -- current stream position;". What the help text doesn't say is that seek needs a file opened in binary mode, e.g. with "rb" (binary read), if the second parameter is set to 1 or 2 and the offset is not zero. We show this in the next subchapter.

#### Binary read

So far we have only used the first parameter of open, i.e. the filename. The second parameter is optional and is set to "r" (read) by default. "r" means that the file is read in text mode. In text mode, if encoding (another parameter of open) is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding.

The second parameter specifies the mode of access to the file or in other words the mode in which the file is opened. Files opened in binary mode (appending 'b' to the mode argument) return contents as bytes objects without any decoding.

We will demonstrate this in the following example. To demonstrate the different effects, we need a string containing characters which are not included in standard ASCII. This is why we use a Turkish text, because it uses many special characters and umlauts. The English translation is "See you, I'll come tomorrow."

In [None]:
# We will write a file with the Turkish text "Görüşürüz, yarın geleceğim.":

txt = "Görüşürüz, yarın geleceğim." 
number_of_chars_written = open("see_you_tomorrow.txt", "w").write(txt)

In [None]:
# We will read this file in text mode and in binary mode to demonstrate the differences:

text = open("see_you_tomorrow.txt", "r").read()
print("text mode: ", text)
text_binary = open("see_you_tomorrow.txt", "rb").read()
print("binary mode: ", text_binary)

text mode:  Görüşürüz, yarın geleceğim.
binary mode:  b'G\xc3\xb6r\xc3\xbc\xc5\x9f\xc3\xbcr\xc3\xbcz, yar\xc4\xb1n gelece\xc4\x9fim.'


In [None]:
# In binary mode, the characters which are not plain ASCII like "ö", "ü", "ş",
# "ğ" and "ı" are represented by more than one byte, in our case by two bytes each.
# 14 bytes are needed for "Görüşürüz":

text_binary[:14]

# "ö" for example consists of the two bytes "\xc3" and "\xb6".

b'G\xc3\xb6r\xc3\xbc\xc5\x9f\xc3\xbcr\xc3\xbcz'

In [None]:
# There are two ways to turn a byte string into a string again:

t = text_binary.decode("utf-8")
print(t)
t2 = str(text_binary, "utf-8")
print(t2)

Görüşürüz, yarın geleceğim.
Görüşürüz, yarın geleceğim.
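Because the default text encoding is platform dependent, it can help to pass the `encoding` parameter of `open` explicitly. The following is a minimal sketch, assuming UTF-8 is the desired encoding; the file name `see_you_tomorrow_utf8.txt` is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "see_you_tomorrow_utf8.txt")
txt = "Görüşürüz, yarın geleceğim."

# Write and read with an explicit encoding instead of the platform default.
with open(path, "w", encoding="utf-8") as fh:
    fh.write(txt)

with open(path, "r", encoding="utf-8") as fh:
    text = fh.read()

# Binary mode returns the raw bytes without any decoding.
with open(path, "rb") as fh:
    raw = fh.read()

print(text == txt)                   # the text survives the round trip
print(len(txt), len(raw))            # more bytes than characters
print(raw.decode("utf-8") == txt)    # decoding the bytes gives the text back
```

With an explicit encoding, the file content no longer depends on the locale of the machine that wrote it.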


It is possible to use the values "1" and "2" for the second parameter of seek, if we open a file in binary format:

In [None]:
with open("see_you_tomorrow.txt", "rb") as fh:
    x = fh.read(14)
    print(x)
    # move 5 bytes to the right from the current position:
    fh.seek(5, 1)
    x = fh.read(3)
    print(x)
    print(str(x, "utf-8"))
    # let's move to the 8th byte from the right side of the byte string:
    fh.seek(-8, 2)
    x = fh.read(5)
    print(x)
    print(str(x, "utf-8"))

b'G\xc3\xb6r\xc3\xbc\xc5\x9f\xc3\xbcr\xc3\xbcz'
b'\xc4\xb1n'
ın
b'ece\xc4\x9f'
eceğ


#### Read and Write to the Same File

In the following example we will open a file for reading and writing at the same time with the mode "w+". If the file doesn't exist, it will be created; if it exists, its content is deleted. If you want to open an existing file for reading and writing, you should use "r+" instead, because this will not delete the content of the file.

In [None]:
fh = open('colours.txt', 'w+')
fh.write('The colour brown')

#Go to the 12th byte in the file, counting starts with 0
fh.seek(11)   
print(fh.read(5))
print(fh.tell())
fh.seek(11)
fh.write('green')
fh.seek(0)
content = fh.read()
print(content)

brown
16
The colour green
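The difference between "r+" and "w+" can be sketched as follows. The file name `colours_demo.txt` is an assumption for this illustration.

```python
import os
import tempfile

# Hypothetical scratch file with some initial content.
path = os.path.join(tempfile.mkdtemp(), "colours_demo.txt")
with open(path, "w") as fh:
    fh.write("The colour brown")

# "r+" keeps the existing content; writing overwrites it in place.
with open(path, "r+") as fh:
    before = fh.read()
    fh.seek(11)
    fh.write("green")
    fh.seek(0)
    after_r_plus = fh.read()

# "w+" truncates the file to length zero as soon as it is opened.
with open(path, "w+") as fh:
    after_w_plus = fh.read()

print(before)               # The colour brown
print(after_r_plus)         # The colour green
print(repr(after_w_plus))   # ''
```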


#### "How to get into a Pickle"

You can save your data in an easy way so that you, or rather your program, can read it again at a later date. We are "pickling" the data so that nothing gets lost.

Python offers a module for this purpose, which is called "pickle". With the algorithms of the pickle module we can serialize and de-serialize Python object structures. "Pickling" denotes the process which converts a Python object hierarchy into a byte stream, and "unpickling" on the other hand is the inverse operation, i.e. the byte stream is converted back into an object hierarchy. What we call pickling (and unpickling) is also known as "serialization" or "flattening" a data structure.

An object can be dumped with the dump method of the pickle module:

`pickle.dump(obj, file[,protocol, *, fix_imports=True])`

`dump()` writes a pickled representation of obj to the open file object file. The optional protocol argument tells the pickler to use the given protocol:

* Protocol version 0 is the original (before Python3) human-readable (ascii) protocol and is backwards compatible with previous versions of Python.
* Protocol version 1 is the old binary format which is also compatible with previous versions of Python.
* Protocol version 2 was introduced in Python 2.3. It provides much more efficient pickling of new-style classes.
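* Protocol version 3 was introduced with Python 3.0. It has explicit support for bytes and cannot be unpickled by Python 2.x pickle modules. It was the default protocol in Python 3.0 to 3.7.
* Protocol version 4 was introduced with Python 3.4. It adds support for very large objects and became the default protocol in Python 3.8.
* Protocol version 5 was introduced with Python 3.8. It adds support for out-of-band data.

The default protocol depends on the Python version: it was 3 in Python 3.0 to 3.7 and became 4 in Python 3.8. The constant pickle.DEFAULT_PROTOCOL holds the value used by the running interpreter.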

**If fix_imports is True and protocol is less than 3, pickle will try to map the new Python3 names to the old module names used in Python2, so that the pickle data stream is readable with Python 2.**

Objects which have been dumped to a file with pickle.dump can be reread into a program by using the method pickle.load(file). pickle.load recognizes automatically, which format had been used for writing the data. A simple example:

In [None]:
import pickle

cities = ["Paris", "Dijon", "Lyon", "Strasbourg"]
fh = open("data.pkl", "bw")
pickle.dump(cities, fh)
fh.close()

# The same can be written with a with statement, which closes the file automatically:
with open("data.pkl", "bw") as h:
    pickle.dump(cities, h)

!ls

colours.txt	     pythonista_and_python.txt	     see_you_tomorrow.txt
data.pkl	     python_newbie_and_the_guru.txt  small_text.txt
file_definition.txt  sample_data		     text_file.txt


In [None]:
# The file data.pkl can be read in again by Python in the same or another 
# session or by a different program:

f = open("data.pkl", "rb")
villes = pickle.load(f)
print(villes)

['Paris', 'Dijon', 'Lyon', 'Strasbourg']

In [None]:
# Only the objects and not their names are saved. That's why we use the assignment
# to villes in the previous example, i.e. villes = pickle.load(f).

# In our previous example, we had pickled only one object, i.e. a list of French
# cities. But what about pickling multiple objects? The solution is easy: We pack
# the objects into another object, so we will only have to pickle one object again.
# We will pack two lists "programming_languages" and "python_dialects" into a tuple
# pickle_object in the following example:

import pickle
fh = open("data.pkl","bw")
programming_languages = ["Python", "Perl", "C++", "Java", "Lisp"]
python_dialects = ["Jython", "IronPython", "CPython"]
pickle_object = (programming_languages, python_dialects)
pickle.dump(pickle_object,fh)
fh.close()

# The pickled data from the previous example, - i.e. the data which we have written
# to the file data.pkl, - can be separated into two lists again, when we reread the data:

f = open("data.pkl","rb")
languages, dialects = pickle.load(f)
print(languages, dialects) 

['Python', 'Perl', 'C++', 'Java', 'Lisp'] ['Jython', 'IronPython', 'CPython']
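The protocol argument described above can be tried out with a short sketch. It uses `pickle.dumps` and `pickle.loads`, which work on bytes objects instead of files; the dictionary `data` is a made-up example. `pickle.DEFAULT_PROTOCOL` and `pickle.HIGHEST_PROTOCOL` are constants of the pickle module whose values depend on the Python version.

```python
import pickle

# Made-up example data.
data = {"cities": ["Paris", "Dijon", "Lyon"], "count": 3}

results = []
for protocol in (0, 2, pickle.DEFAULT_PROTOCOL, pickle.HIGHEST_PROTOCOL):
    blob = pickle.dumps(data, protocol=protocol)
    # No protocol argument is needed for loading: the format is detected.
    restored = pickle.loads(blob)
    results.append(restored == data)
    print(protocol, len(blob), restored == data)
```

The printed sizes usually show that the binary protocols are more compact than protocol 0.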


#### `shelve` Module

One drawback of the pickle module is that it is only capable of pickling one object at a time, which has to be unpickled in one go. Let's imagine this data object is a dictionary. It may be desirable that we don't have to save and load the whole dictionary every time, but can save and load just a single value corresponding to just one key. The shelve module is the solution to this problem. A "shelf", as used in the shelve module, is a persistent, dictionary-like object. The difference with dbm databases is that the values (not the keys!) in a shelf can be essentially arbitrary Python objects, i.e. anything that the "pickle" module can handle. This includes most class instances, recursive data types, and objects containing lots of shared sub-objects. The keys have to be strings.

The shelve module is easy to use. Actually, it is as easy as using a dictionary in Python. Before we can use a shelf object, we have to import the module. After this, we have to open a shelf object with the function open of the shelve module. The open function opens a special shelf file for reading and writing:


In [None]:
import shelve
s = shelve.open("MyShelve")
s

<shelve.DbfilenameShelf at 0x7f4cebf2d590>

If the file "MyShelve" already exists, the open method will try to open it. If it isn't a shelf file, - i.e. a file which has been created with the shelve module, - we will get an error message. If the file doesn't exist, it will be created.

We can use s like an ordinary dictionary, if we use strings as keys:

In [None]:
s["street"] = "Fleet Str"
s["city"] = "London"
for key in s:
    print(key)

city
street


In [None]:
# A shelf object has to be closed with the close method:

s.close()

In [None]:
# We can use the previously created shelf file in another program or in an interactive Python session:

s = shelve.open("MyShelve")

print(s["street"])

print(s["city"])

Fleet Str
London


In [None]:
# It is also possible to cast a shelf object into an "ordinary" dictionary with the dict function:

dict(s)

{'city': 'London', 'street': 'Fleet Str'}
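The advantage over a single pickled dictionary, namely loading one value by its key, can be sketched like this. The shelf name `demo_shelf` and the stored lists are assumptions for this illustration; the with statement closes the shelf automatically.

```python
import os
import shelve
import tempfile

# Hypothetical shelf file in a temporary directory.
shelf_path = os.path.join(tempfile.mkdtemp(), "demo_shelf")

# Store two independent objects under two string keys.
with shelve.open(shelf_path) as shelf:
    shelf["languages"] = ["Python", "Perl", "C++"]
    shelf["dialects"] = ["Jython", "IronPython", "CPython"]

# Reopen the shelf and load only the value that is needed.
with shelve.open(shelf_path) as shelf:
    dialects = shelf["dialects"]
    keys = sorted(shelf)

print(dialects)   # ['Jython', 'IronPython', 'CPython']
print(keys)       # ['dialects', 'languages']
```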

# Modular Programming and Modules

![](https://www.python-course.eu/images/legos.webp)

Modular design means that a complex system is broken down into smaller parts or components, i.e. modules. These components can be independently created and tested. In many cases, they can be even used in other systems as well.

If you want to develop programs which are readable, reliable and maintainable without too much effort, you have to use some kind of modular software design, especially if your application has a certain size. There is a variety of concepts for designing software in modular form. Modular programming is a software design technique to split your code into separate parts. These parts are called modules. The focus for this separation should be to have modules with no or just few dependencies upon other modules. In other words: minimization of dependencies is the goal. When creating a modular system, several modules are built separately and more or less independently. The executable application will be created by putting them together.

#### Importing Modules

So far we haven't explained what a Python module is. In a nutshell: **every file which has the file extension .py and consists of proper Python code is a module!** There is no special syntax required to make such a file a module. A module can contain arbitrary objects, for example functions, classes or attributes. All those objects can be accessed after an import. There are different ways to import a module. We demonstrate this with the math module:

In [None]:
import math

The module math provides mathematical constants and functions, e.g. π (math.pi), the sine function (math.sin()) and the cosine function (math.cos()). Every attribute or function can only be accessed by putting "math." in front of the name:

In [None]:
math.pi

3.141592653589793

In [None]:
math.sin(math.pi/2)

1.0

In [None]:
math.cos(math.pi/2)

6.123233995736766e-17

In [None]:
# It's possible to import more than one module in one import statement. In this
# case the module names are separated by commas:

import math, random

Import statements can be positioned anywhere in the program, but it's good style to place them directly at the beginning of a program.

If only certain objects of a module are needed, we can import only those:

In [None]:
from math import sin, pi

The other objects, e.g. cos, are not available after this import. We are capable of accessing sin and pi directly, i.e. without prefixing them with "math." Instead of explicitly importing certain objects from a module, it's also possible to import everything in the namespace of the imported module **except names beginning with an underscore "_"**. This can be achieved by using an asterisk in the import:

In [None]:
from math import *

sin(3.01) + tan(cos(2.1)) + e

2.2968833711382604

It's not recommended to use the asterisk notation in an import statement, except when working in the interactive Python shell. One reason is that the origin of a name can be quite obscure, because it can't be seen from which module it might have been imported. We will demonstrate another serious complication in the following example:

In [None]:
from numpy import *
from math import *
print(cos(3))

-0.9899924966004454


In [None]:
# Let's modify the previous example slightly by changing the order of the imports:

from math import *
from numpy import *
print(cos(3))

-0.9899924966004454


In [None]:
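# The results are the same =) but they could be different, because in the first
# case cos refers to math.cos and in the second case to numpy.cos: the later
# import overwrites the name.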
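A case where the order of the imports does change the result can be sketched with two standard library modules that both define `sqrt`. The choice of `math` and `cmath` is an assumption made for this illustration, and explicit imports are used in place of the asterisk to keep the sketch short; the name clash works the same way.

```python
from math import sqrt
first = sqrt            # this is math.sqrt

from cmath import sqrt
second = sqrt           # the later import has replaced the name: cmath.sqrt

print(first(4))         # 2.0
print(second(4))        # (2+0j)
print(second(-1))       # 1j

# math.sqrt rejects negative numbers, cmath.sqrt does not:
try:
    first(-1)
    math_result = "no error"
except ValueError:
    math_result = "ValueError"
print(math_result)      # ValueError
```

Writing `math.sqrt` and `cmath.sqrt` after a plain `import math, cmath` avoids this ambiguity.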

People use the asterisk notation because it is so convenient: it avoids a lot of tedious typing. Another way to reduce the typing effort is to rename a namespace. A good example of this is the numpy module. You will hardly find an example or a tutorial that imports this module with the statement:

In [None]:
import numpy

It's like an unwritten law to import it with:

In [None]:
import numpy as np
np.diag([3, 11, 7, 9])

array([[ 3,  0,  0,  0],
       [ 0, 11,  0,  0],
       [ 0,  0,  7,  0],
       [ 0,  0,  0,  9]])

#### Designing and Writing Modules

But how do we create modules in Python? A module in Python is just a file containing Python definitions and statements. The module name is moulded out of the file name by removing the suffix .py. For example, if the file name is fibonacci.py, the module name is fibonacci.

Let's turn our Fibonacci functions into a module. There is hardly anything to be done, we just save the following code in the file fibonacci.py:

In [None]:
code = '''

def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)
def ifib(n):
    a, b = 0, 1
    for i in range(n):
        a, b = b, a + b
    return a

'''

with open('fibonacci.py', 'w') as handle:
    print(code, file=handle)

!cat fibonacci.py



def fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n-1) + fib(n-2)
def ifib(n):
    a, b = 0, 1
    for i in range(n):
        a, b = b, a + b
    return a




In [None]:
import fibonacci
fibonacci.fib(7)

13

Don't try to call the recursive version of the Fibonacci function with large arguments like we did with the iterative version. A value like 42 is already too large. You will have to wait for a long time!

As you can easily imagine: It's a pain if you have to use those functions often in your program and you always have to type in the fully qualified name, i.e. fibonacci.fib(7). One solution consists in assigning a local name to a module function to get a shorter name:

In [None]:
fib = fibonacci.ifib
fib(10)

55

But it's better to import the necessary functions directly into your module, e.g. with "from fibonacci import fib, ifib".
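A direct import of a function can be sketched as follows. To keep the sketch self-contained, it writes a small module into a temporary directory and adds that directory to `sys.path`; the module name `fibonacci_demo` is hypothetical and only contains the iterative function from above.

```python
import os
import sys
import tempfile

# Write a hypothetical module file into a temporary directory.
module_dir = tempfile.mkdtemp()
source = '''
def ifib(n):
    a, b = 0, 1
    for i in range(n):
        a, b = b, a + b
    return a
'''
with open(os.path.join(module_dir, "fibonacci_demo.py"), "w") as handle:
    handle.write(source)

# Make the directory visible to the import system.
sys.path.insert(0, module_dir)

# Import the function directly; no module prefix is needed afterwards.
from fibonacci_demo import ifib

result = ifib(10)
print(result)   # 55
```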

#### More on Modules

Usually, modules contain functions or classes, but there can be "plain" statements in them as well. These statements can be used to initialize the module. They are only executed when the module is imported.

Let's look at a module which consists of just one statement:

In [None]:
code = 'print("The module is imported now!")'

with open('one_time.py', 'w') as handle:
    print(code, file=handle)

!cat one_time.py

print("The module is imported now!")


In [None]:
import one_time

The module is imported now!


In [None]:
# second time:

import one_time

We can see that the module code was executed only once. Each module is executed only once per interpreter session, program, or script; later import statements reuse the module that has already been loaded. If you change a module and want to reload it, you can restart the interpreter. In Python 2.x, it was also possible to reimport the module by using the built-in reload, i.e. reload(modulename). This built-in no longer exists in Python 3.x.

In Python 3, you have to execute an "import importlib" and use importlib.reload(my_module). Alternatively, you can use "from importlib import reload" and call reload(my_module). Older tutorials use the imp module for this, but imp.reload has been deprecated since Python 3.4 and the imp module was removed in Python 3.12.

In [None]:
from importlib import reload
reload(one_time)
import one_time
print('--------')
reload(one_time)

# importlib.reload replaces imp.reload, which has been deprecated since Python 3.4

The module is imported now!
--------
The module is imported now!


<module 'one_time' from '/content/one_time.py'>
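That a repeated import does not run the module code again, while `importlib.reload` does, can also be checked without looking at printed output. In this sketch the module name `reload_demo` and the environment variable `RELOAD_DEMO_RUNS` are hypothetical; the module counts how often its body has been executed.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module that counts how often its body is executed.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "reload_demo.py"), "w") as handle:
    handle.write(
        "import os\n"
        "runs = int(os.environ.get('RELOAD_DEMO_RUNS', '0')) + 1\n"
        "os.environ['RELOAD_DEMO_RUNS'] = str(runs)\n"
    )
sys.path.insert(0, module_dir)

# Start from a clean slate in case this sketch is run more than once.
os.environ.pop("RELOAD_DEMO_RUNS", None)
sys.modules.pop("reload_demo", None)

import reload_demo
import reload_demo                       # second import: body is not executed again
runs_after_imports = reload_demo.runs

importlib.reload(reload_demo)            # reload: body is executed once more
runs_after_reload = reload_demo.runs

print(runs_after_imports)   # 1
print(runs_after_reload)    # 2
```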

#### Executing Modules as Scripts

Essentially a Python module is a script, so it can be run as a script:

In [None]:
!python one_time.py

The module is imported now!


The module which has been started as a script will be executed as if it had been imported, with one exception: the system variable `__name__` is set to `"__main__"`. So it's possible to program different behaviour into a module for the two cases. With the following conditional statement the file can be used as a module or as a script, but only if it is run as a script will the function ifib be started with a command line argument:

In [None]:
code = '''
def ifib(n):
    a, b = 0, 1
    for i in range(n):
        a, b = b, a + b
        print(a)
    return a

if __name__ == "__main__":
    import sys
    ifib(int(sys.argv[1]))
'''

with open('my_module.py', 'w') as handle:
    print(code, file=handle)

!cat my_module.py


def ifib(n):
    a, b = 0, 1
    for i in range(n):
        a, b = b, a + b
        print(a)
    return a

if __name__ == "__main__":
    import sys
    ifib(int(sys.argv[1]))



In [None]:
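# Suppose a second file, file1.py, contains:
#
#     import my_module
#     ...
#
# "python file1.py" imports my_module, so the if block is skipped.
# "python my_module.py", followed by a number, runs the if block.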

In [None]:
# If it is run as a script, we get the following output:

!python my_module.py 50

1
1
2
3
5
8
13
21
34
55
89
144
233
377
610
987
1597
2584
4181
6765
10946
17711
28657
46368
75025
121393
196418
317811
514229
832040
1346269
2178309
3524578
5702887
9227465
14930352
24157817
39088169
63245986
102334155
165580141
267914296
433494437
701408733
1134903170
1836311903
2971215073
4807526976
7778742049
12586269025


In [None]:
# If it is imported, the code in the if block will not be executed:

import my_module
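The two values of `__name__` can be observed directly. The following sketch writes a hypothetical module `name_demo.py` into a temporary directory, runs it once as a script with the current interpreter (via `subprocess.run` and `sys.executable`), and then imports it.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical module that only reports its own __name__.
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "name_demo.py")
with open(module_path, "w") as handle:
    handle.write("print('__name__ is', __name__)\n")

# Case 1: run the file as a script in a separate interpreter process.
completed = subprocess.run(
    [sys.executable, module_path], capture_output=True, text=True
)
script_output = completed.stdout.strip()
print(script_output)        # __name__ is __main__

# Case 2: import the file as a module.
sys.path.insert(0, module_dir)
import name_demo            # prints: __name__ is name_demo
imported_name = name_demo.__name__
```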

#### Kinds of Modules

There are different kinds of modules:

* Those written in Python. They have the suffix .py.
* Dynamically linked C modules. Suffixes are: .dll, .pyd, .so, .sl, ...
* C modules linked with the interpreter.

It's possible to get a complete list of the modules linked with the interpreter:

In [None]:
import sys
print(sys.builtin_module_names)



#### Module Search Path

If you import a module, let's say "import xyz", the interpreter searches for this module in the following locations and in the order given:

* The directory of the top-level file, i.e. the file being executed.
* The directories of PYTHONPATH, if this global environment variable of your operating system is set.
* The standard installation path, e.g. /usr/lib/python3.5 on Linux/Unix.
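The search locations that the running interpreter actually uses are available as the list `sys.path`. A minimal sketch for inspecting it follows; the printed entries differ from system to system, so no output is shown.

```python
import os
import sys

# The directories are searched in the order in which they appear in sys.path.
for entry in sys.path:
    print(repr(entry))

# PYTHONPATH may be unset, in which case None is printed.
pythonpath = os.environ.get("PYTHONPATH")
print(pythonpath)
```

Since `sys.path` is an ordinary list, a program can also extend it at run time, e.g. with `sys.path.insert(0, some_directory)`.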

In [None]:
# It's possible to find out where a module is located after it has been imported:

import numpy
numpy.__file__

'/usr/local/lib/python3.7/dist-packages/numpy/__init__.py'

#### Content of a Module

With the built-in function dir() and the name of the module as an argument, you can list all valid attributes and methods for that module.



In [None]:
import math
dir(math)

['__doc__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 'acos',
 'acosh',
 'asin',
 'asinh',
 'atan',
 'atan2',
 'atanh',
 'ceil',
 'copysign',
 'cos',
 'cosh',
 'degrees',
 'e',
 'erf',
 'erfc',
 'exp',
 'expm1',
 'fabs',
 'factorial',
 'floor',
 'fmod',
 'frexp',
 'fsum',
 'gamma',
 'gcd',
 'hypot',
 'inf',
 'isclose',
 'isfinite',
 'isinf',
 'isnan',
 'ldexp',
 'lgamma',
 'log',
 'log10',
 'log1p',
 'log2',
 'modf',
 'nan',
 'pi',
 'pow',
 'radians',
 'remainder',
 'sin',
 'sinh',
 'sqrt',
 'tan',
 'tanh',
 'tau',
 'trunc']

Calling dir() without an argument returns a list with the names in the current local scope.

It's possible to get a list of the Built-in functions, exceptions, and other objects by importing the builtins module:

In [None]:
_aa = 3


In [None]:
def f():
    import torch
f()

In [None]:
torch

NameError: ignored

In [None]:
import builtins
dir(builtins)

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PermissionError',
 'ProcessLookupError',
 'RecursionError',
 'ReferenceError',
 'RuntimeError',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SystemError',
 'SystemExit',
 'TabError',
 'TimeoutError',
 'True',
 'TypeError',
 'UnboundLocalError',
 'UnicodeDecode

# Packages

If you have created a lot of modules at some point in time, you may lose track of them. You may have dozens or hundreds of modules that fall into different categories. The situation is similar to a file system: instead of keeping all files in just one directory, you put them into different directories, organized according to the topics of the files. In the same way, Python modules are organized into packages.

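A package is basically a directory with Python files and a file with the name \_\_init__.py. This means that every directory inside of the Python path which contains a file named \_\_init__.py will be treated as a package by Python. It's possible to put several modules into a package. Since Python 3.3, a directory without \_\_init__.py can also be imported as a so-called namespace package, which is why the import in the example below works before this file has been created.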

We will demonstrate with a very simple example how to create a package with some Python modules. First of all, we need a directory. The name of this directory will be the name of the package which we want to create. We will call our package "pac". This directory should contain a file with the name \_\_init__.py. This file can be empty, or it can contain valid Python code. This code will be executed when the package is imported, so it can be used to initialize the package, e.g. to make sure that some other modules are imported or some values are set. Now we can put all of the Python files which will be the modules of our package into this directory. We create two simple files a.py and b.py just for the sake of filling the package with modules.
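The steps above can also be sketched in a self-contained way that does not touch the working directory. This is only an illustration under stated assumptions: the package name `demo_package`, the module `greetings` and the variable `initialized` are hypothetical names chosen for this sketch, and the package is built in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package name, chosen only for this sketch.
package_name = "demo_package"

with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / package_name
    package_dir.mkdir()
    # The code in __init__.py runs once, when the package is first imported.
    (package_dir / "__init__.py").write_text("initialized = True\n")
    (package_dir / "greetings.py").write_text(
        "def hello():\n    return 'hello from greetings'\n"
    )

    sys.path.insert(0, tmp)  # make the temporary directory importable
    try:
        importlib.invalidate_caches()
        package = importlib.import_module(package_name)
        greetings = importlib.import_module(package_name + ".greetings")
        message = greetings.hello()
        print(package.initialized)
        print(message)
    finally:
        sys.path.remove(tmp)  # leave the search path as we found it
```

The value set in \_\_init__.py is available as an attribute of the package object, which is the mechanism a package can use to prepare shared state on import.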

In [None]:
code_a = '''
def bar():
    print("Hello, function 'bar' from module 'a' calling")
'''
code_b = '''
def foo():
    print("Hello, function 'foo' from module 'b' calling")
'''

!mkdir pac

with open('pac/a.py', 'w') as handle:
    print(code_a, file=handle)

with open('pac/b.py', 'w') as handle:
    print(code_b, file=handle)

In [None]:
import pac

In [None]:
pac/a

TypeError: ignored

The package pac has been loaded, but neither the module "a" nor the module "b" has been imported along with it. We can import the modules a and b in the following way:

In [None]:
from pac import a, b
a.bar()
b.foo()

Hello, function 'bar' from module 'a' calling
Hello, function 'foo' from module 'b' calling


As we have seen, we can access neither "a" nor "b" by solely importing the package `pac`.

Yet, there is a way to load these modules automatically. We can use the file \_\_init__.py for this purpose. All we have to do is put the following lines into \_\_init__.py, which our example has not created so far:

In [None]:
code = '''
import pac.a
import pac.b
'''

with open('pac/__init__.py', 'w') as handle:
    print(code, file=handle)

import pac

In [None]:
pac.a

<module 'pac.a' from '/content/pac/a.py'>

# Relative import errors

https://napuzba.com/a/import-error-relative-no-parent/p4

https://stackoverflow.com/questions/16981921/relative-imports-in-python-3

# Errors and Exceptions

An exception is an error that happens during the execution of a program. Exceptions are known to non-programmers as instances that do not conform to a general rule. The name "exception" in computer science has this meaning as well: It implies that the problem (the exception) doesn't occur frequently, i.e. the exception is the "exception to the rule". Exception handling is a construct in some programming languages to handle or deal with errors automatically. Many programming languages like C++, Objective-C, PHP, Java, Ruby, Python, and many others have built-in support for exception handling.

#### Exception Handling in Python

Exception handling in Python is very similar to Java. The code which might raise an exception is placed in a try block. While in Java exceptions are caught by catch clauses, in Python we have clauses introduced by the "except" keyword. It's also possible to create "custom-made" exceptions: with the raise statement it's possible to force a specified exception to occur.

Let's look at a simple example. Assume we want to ask the user to enter an integer. If we use input(), the input will be a string, which we have to convert into an integer with int(). If the input isn't a valid integer, int() raises a ValueError. We show this in the following interactive session:

In [None]:
!ls

colours.txt	     MyShelve.db		python_newbie_and_the_guru.txt
data.pkl	     one_time.py		sample_data
fibonacci.py	     pac			see_you_tomorrow.txt
file_definition.txt  __pycache__		small_text.txt
my_module.py	     pythonista_and_python.txt	text_file.txt


In [None]:
!cd ../; cd content; ls

colours.txt	     MyShelve.db		python_newbie_and_the_guru.txt
data.pkl	     one_time.py		sample_data
fibonacci.py	     pac			see_you_tomorrow.txt
file_definition.txt  __pycache__		small_text.txt
my_module.py	     pythonista_and_python.txt	text_file.txt


In [None]:
n = int(input("Please enter a number: "))

Please enter a number: yy


ValueError: ignored

In [None]:
while True:
    try:
        n = input("Please enter an integer: ")
        n = int(n)
        break
    except ValueError:
        print("No valid integer! Please try again ...")
print("Great, you successfully entered an integer!")

Please enter an integer: yy
No valid integer! Please try again ...
Please enter an integer: 99
Great, you successfully entered an integer!


This is a loop that ends only once a valid integer has been entered. When the while loop is entered, the code within the try clause is executed statement by statement. If no exception occurs, execution reaches the break statement and the while loop is left. If an exception occurs, e.g. in the conversion of n to an integer, the rest of the try block is skipped and the except clause is executed. The raised error, in our case a ValueError, has to match one of the names after except; in our example there is only one, ValueError. After the message has been printed, the loop starts over with a new input().
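The same control flow can be sketched without interactive input, which makes it easier to try out. The function `first_valid_integer` and the list of candidate strings are assumptions made for this illustration; they stand in for the repeated calls to input().

```python
def first_valid_integer(candidates):
    """Return the first string in candidates that converts to int, else None."""
    for text in candidates:
        try:
            # int() raises ValueError for strings such as "yy".
            return int(text)
        except ValueError:
            # The rest of the try block is skipped and the loop continues.
            print(f"{text!r} is not a valid integer, trying the next one ...")
    return None

result = first_valid_integer(["yy", "99"])
print(result)  # 99
```

Here the return statement plays the role of break: it is only reached when the conversion succeeds.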

#### Multiple Except Clauses

A try statement may have more than one except clause for different exceptions. But at most one except clause will be executed.
Our next example shows a try clause, in which we open a file for reading, read a line from this file and convert this line into an integer. There are at least two possible exceptions:

* an IOError (in Python 3 this name is an alias of OSError)
* a ValueError

Just in case, we add an unnamed except clause for any unexpected error:

In [None]:
import sys

try:
    f = open('integers.txt')
    s = f.readline()
    i = int(s.strip())
except IOError as e:
    errno, strerror = e.args
    print("I/O error({0}): {1}".format(errno,strerror))
    # e can be printed directly without using .args:
    # print(e)
except ValueError:
    print("No valid integer in line.")
except:
    print("Unexpected error:", sys.exc_info()[0])
    raise

I/O error(2): No such file or directory


The handling of the IOError in the previous example is of special interest. The except clause for the IOError specifies a variable "e" after the exception name (IOError). The variable "e" is bound to an exception instance with the arguments stored in instance.args. If we call the above script with a non-existing file, we get the message:

`I/O error(2): No such file or directory`

And if the file integers.txt is not readable, e.g. if we don't have the permission to read it, we get the following message:

`I/O error(13): Permission denied`
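Besides `args`, the exception instance also exposes the error number and message as the attributes `errno` and `strerror`. The sketch below assumes that the hypothetical file `no_such_file_for_this_sketch.txt` does not exist in the working directory; the wording of the message may differ between operating systems.

```python
import errno

try:
    # Hypothetical file name, assumed not to exist in the working directory.
    open("no_such_file_for_this_sketch.txt")
except OSError as e:
    caught_errno = e.errno        # numeric code, 2 (ENOENT) for a missing file
    caught_message = e.strerror   # human-readable message from the OS
    caught_type = type(e).__name__
    print("I/O error({0}): {1}".format(caught_errno, caught_message))

print(caught_type)         # FileNotFoundError, a subclass of OSError
print(IOError is OSError)  # True in Python 3
```

Catching `OSError` (or its alias `IOError`) therefore also covers the more specific subclasses such as `FileNotFoundError` and `PermissionError`.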

An except clause may name more than one exception in a tuple of error names, as we see in the following example:

In [None]:
try:
    f = open('integers.txt')
    s = f.readline()
    i = int(s.strip())
except (IOError, ValueError):
    print("An I/O error or a ValueError occurred")
except:
    print("An unexpected error occurred")
    raise

An I/O error or a ValueError occurred


Now we want to demonstrate what happens if we call a function within a try block and an exception occurs inside the function call:

In [None]:
def f():
    x = int("four")

try:
    f()
except ValueError as e:
    print("got it :-) ", e)


print("Let's get on")

got it :-)  invalid literal for int() with base 10: 'four'
Let's get on


The function does not catch the exception itself, so it propagates to the caller, where the except clause of the try block catches it.

We will extend our example now so that the function will catch the exception directly:

In [None]:
def f():
    try:
        x = int("four")
    except ValueError as e:
        print("got it in the function :-) ", e)

try:
    f()
except ValueError as e:
    print("got it :-) ", e)


print("Let's get on")

got it in the function :-)  invalid literal for int() with base 10: 'four'
Let's get on


As expected, the exception is caught inside the function and not by the caller's except clause.

We now add a "raise" statement, which raises the ValueError again, so that the exception is propagated to the caller:

In [None]:
def f():
    try:
        x = int("four")
    except ValueError as e:
        print("got it in the function :-) ", e)
        raise

try:
    f()
except ValueError as e:
    print("got it :-) ", e)

print("Let's get on")

got it in the function :-)  invalid literal for int() with base 10: 'four'
got it :-)  invalid literal for int() with base 10: 'four'
Let's get on


#### Custom-made Exceptions

It's possible to raise exceptions yourself:

In [None]:
raise SyntaxError("Sorry, my fault!")

SyntaxError: ignored

The best, or Pythonic, way to do this is to define an exception class which inherits from the Exception class. You will have to go through the chapter on Object Oriented Programming to fully understand the following example:

In [None]:
class MyException(Exception):
    pass

raise MyException("An exception doesn't always prove the rule!")

MyException: ignored
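A custom exception class can also carry extra data for the code that catches it. The following is a minimal sketch; the class `InsufficientFundsError` and the function `withdraw` are hypothetical names invented for this illustration.

```python
class InsufficientFundsError(Exception):
    """Raised when a withdrawal exceeds the available balance."""

    def __init__(self, balance, amount):
        # The message passed to Exception is what str(e) and print(e) show.
        super().__init__(f"cannot withdraw {amount}, balance is only {balance}")
        self.balance = balance
        self.amount = amount


def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFundsError(balance, amount)
    return balance - amount


try:
    withdraw(50, 80)
except InsufficientFundsError as e:
    # The handler can use the attributes stored on the exception instance.
    error_text = str(e)
    shortfall = e.amount - e.balance
    print(error_text)
    print("shortfall:", shortfall)
```

Because the class inherits from Exception, it is also caught by a general `except Exception` clause, while a specific `except InsufficientFundsError` clause lets callers react to this one condition only.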

#### Clean-up Actions (try ... finally)

So far the try statement has always been paired with except clauses, but there is another way to use it. The try statement can be followed by a finally clause. Finally clauses are called clean-up or termination clauses because they are executed under all circumstances, i.e. a "finally" clause always runs, regardless of whether an exception occurred in the try block or not. If an exception occurred and was not handled, it is re-raised after the finally clause has run. A simple example to demonstrate the finally clause:

In [None]:
try:
    x = float(input("Your number: "))
    inverse = 1.0 / x
finally:
    print("There may or may not have been an exception.")
print("The inverse: ", inverse)

Your number: 35
There may or may not have been an exception.
The inverse:  0.02857142857142857
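The run above did not raise an exception. The following sketch records what happens in both cases; the function `invert` and the list `events` are assumptions made for this illustration.

```python
events = []

def invert(x):
    try:
        return 1.0 / x
    finally:
        # Runs on a normal return and also when ZeroDivisionError is raised.
        events.append(f"finally ran for x={x}")

invert(4)

try:
    invert(0)
except ZeroDivisionError:
    # The exception reaches the caller only after the finally clause has run.
    events.append("caller saw ZeroDivisionError")

print(events)
```

The finally clause runs in both calls, and in the second call it runs before the caller's except clause sees the exception.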


#### Combining try, except and finally

In [None]:
try:
    x = float(input("Your number: "))
    inverse = 1.0 / x
except ValueError:
    print("You should have given either an int or a float")
except ZeroDivisionError:
    print("Infinity")
finally:
    print("There may or may not have been an exception.")

Your number: 56
There may or may not have been an exception.


#### else Clause

The try ... except statement has an optional else clause. An else block has to be positioned after all the except clauses. An else clause will be executed if the try clause doesn't raise an exception.

The following example tries to open a file and, if that succeeds, reads all of its lines into a list called "text":

In [None]:
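import sys
file_name = sys.argv[1]
text = []
try:
    fh = open(file_name, 'r')
except IOError:
    print('cannot open', file_name)
else:
    # Executed only if open() did not raise an exception
    text = fh.readlines()
    fh.close()

# Guard against files with fewer than 101 lines
if len(text) > 100:
    print(text[100])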

cannot open -f


In [None]:
def f1():
    print('enter f1')
    raise Exception('hooray!')
    print('exit f1')

def f2():
    print('enter f2')
    f1()
    # try:
    #     f1()
    # except:
    #     print('Exception is caught by f2')
    print('exit f2') 

def f3():
    print('enter f3')
    try:
        f2()
    except:
        print('Exception is caught by f3')
    print('exit f3')

f3()

enter f3
enter f2
enter f1
Exception is caught by f3
exit f3


In [None]:
L = list(map(int, input().split()))

min1 = min(L[0], L[1])
min2 = max(L[1], L[0])

max2 = min(L[0], L[1])
max1 = max(L[1], L[0])

for i in L[2:]:

    if i > max1:
        max2 = max1
        max1 = i
    elif i > max2:
        max2 = i

    if i < min1:
        min2 = min1
        min1 = i
    elif i < min2:
        min2 = i

if min1 * min2 >= max2 * max1:
    print(min1, min2)
else:
    print(max2, max1)

-1 0
[-1, 0]
-1 -1
