# Basic Routines
* Author: Johannes Maucher
* Last Update: 10.11.2020
* References:
    * https://www.tutorialspoint.com/python/python_basic_syntax.htm

Contents:
* Getting help
* comments
* include packages and naming of functions from packages
* Indent as syntax element
* Files of type .py / .ipynb
* Run .py - script
* install packages with conda or pip

<figure align="center">
<img width="800" src="https://maucher.home.hdm-stuttgart.de/Pics/DS_Python_GetStart_All.png">
</figure>

## Packages and Modules
In [Notebook Getting Started](01GettingStarted.ipynb), different environments for writing Python code were introduced, e.g. the IPython shell, scripts of type *.py*, and Jupyter notebooks. Independent of the environment used, there are

* some basic functions, which are available directly, e.g. to determine the length of a list `L` one can just apply `len(L)`
* many functions, which are defined in modules of the Python standard library. In order to use these functions one must first **import** the module. The functions and modules included in standard Python are listed in the [documentation of the Python Standard Library](https://docs.python.org/3.6/library/index.html).
* many functions and classes, which are defined in modules that are not contained in the Python standard distribution. In order to use functions from external packages one must first **download and install** the package. Then the module must be imported in the same way as the modules of the Python standard distribution.
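
The three cases above can be sketched in a few lines. This is a minimal illustration, not part of the original notebook: the list `L` and the choice of `numpy` as an example of an external package are assumptions made here. `importlib.util.find_spec` only checks whether a package can be found; it does not import it.

```python
import importlib.util
import math

L = [3, 1, 2]

# 1) Built-in function: available without any import
n_items = len(L)

# 2) Standard library: the module must be imported first
root = math.sqrt(16)

# 3) External package: must be installed before it can be imported.
#    find_spec returns None if the package cannot be found.
numpy_available = importlib.util.find_spec("numpy") is not None

print(n_items, root, numpy_available)
```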

> **Note:** In Python a **module** is just a file with extension *.py*. Such a file can contain the definition of **functions** and **classes** or a sequence of statements (instructions). A **package** consists of one or more modules.

<figure align="center">
<img width="600" src="https://maucher.home.hdm-stuttgart.de/Pics/pythonModularity.png">
</figure>

### Download and install external packages

#### Pip
External Python packages are available from the software repository **Python Package Index (PyPI)**. The [official page of PyPI](https://pypi.python.org/pypi) contains a list of all currently available packages. The application **pip** is the package manager of PyPI. It can be launched from the command line in order to download, install, update or remove packages. The most important pip-commands are:

```
pip install <<package_name>>
pip search <<package_name>>
pip show <<package_name>>
pip uninstall <<package_name>>
```

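All functions and options provided by pip are described in the [official pip documentation](https://pip.pypa.io/en/stable/). Note that PyPI no longer supports `pip search`; packages can be searched on the PyPI website instead.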
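
Whether a package is already installed can also be checked from within Python. The following is a minimal sketch, assuming Python 3.8 or newer (where `importlib.metadata` is part of the standard library). The helper name `installed_version` and the second package name are hypothetical and chosen only for illustration.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


print(installed_version("pip"))
print(installed_version("surely-not-installed-package-xyz"))
```

The returned string corresponds to the version that `pip show <<package_name>>` reports for an installed package.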

#### Conda
If [Anaconda](https://www.anaconda.com) is installed, the included conda package manager can be applied as an alternative to pip. However, the conda package list is less comprehensive than PyPI. The available packages are listed on the [Anaconda Package List](https://docs.continuum.io/anaconda/packages/pkg-docs). From the Anaconda shell these packages can be downloaded and installed by

```
conda install <<package_name>>
```

Alternatively, packages and virtual environments can be managed via the environment-view of the **Anaconda Navigator** (figure below).

<figure align="center">
<img width="900" src="https://maucher.home.hdm-stuttgart.de/Pics/anacondaNavigatorEnvironment.PNG">
</figure>

### Import of packages and modules
All installed packages can be imported by

```
import <<package_name>>
```
For example, to import the `time` module of the Python standard library:

In [4]:
import time

All modules, functions and classes of an imported package can be accessed by prefixing the name of the package:
```
<<package_name>>.<<function_name>>
```
For example, in order to get the current local time as a string variable one can apply the `ctime()`-function of the `time`-package as follows: 

In [7]:
time.ctime()

'Tue Nov 10 09:50:28 2020'

Once the package (module) is imported, its functions and classes can be viewed by typing the name of the package, followed by a dot, and then pressing the *tab*-key. In the case of the *time*-package this yields:

<figure align="center">
<img width="200" src="https://maucher.home.hdm-stuttgart.de/Pics/timeSubmodules.PNG">
</figure>
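
Outside of environments with tab completion, a similar overview can be obtained with the built-in function `dir()`. A minimal sketch; filtering out names that start with an underscore is a choice made here to hide internal attributes:

```python
import time

# dir() lists all attribute names of the module; keep only the public ones
public_names = [name for name in dir(time) if not name.startswith("_")]
print(public_names)
```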

Some packages are quite comprehensive. If only one or a few functions are required, one can import just the required modules or functions. Python still loads the module that defines them, but only the selected names are added to the current namespace:
```
from <<package_name>> import <<module_name>>, <<function_name>>, <<function_name>>, ...
```
For example, if only the functions `ctime()` and `sleep()` are required from the `time`-package, then they can be imported and applied as follows:

In [9]:
from time import ctime,sleep
print(ctime()) #current local time
sleep(10)     #stop current thread for 10 seconds
print(ctime())

Tue Nov 10 09:50:36 2020
Tue Nov 10 09:50:46 2020


Note that if this version of import is applied, the function-name need not be prefixed by the package-name. This is also the case if the following version of import is applied:
```
from <<package_name>> import *
```
This imports all public names of the package, and the functions can be called without the package-name prefix. The drawback of this option is that it is no longer apparent which package provides a function (if several packages are imported in this way). Moreover, serious errors (namespace conflicts) can arise if the same function name is defined in different imported packages.

In the case of long package names it may be a good choice to apply the following option for import:
```
import <<package_name>> as <<alias_name>>
```
For example, in order to avoid prefixing with the long name `statsmodels.api`, one can apply an arbitrary alias as follows:


In [10]:
#!pip install statsmodels

In [11]:
import statsmodels.api as stats
a=stats.datasets.co2.load()
print(a.data)

[('1958-03-29T00:00:00.000000000', 316.1)
 ('1958-04-05T00:00:00.000000000', 317.3)
 ('1958-04-12T00:00:00.000000000', 317.6) ...
 ('2001-12-15T00:00:00.000000000', 371.2)
 ('2001-12-22T00:00:00.000000000', 371.3)
 ('2001-12-29T00:00:00.000000000', 371.5)]
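
The namespace conflict described above, and how prefixes or aliases avoid it, can be illustrated with two standard-library modules that both define a function `sqrt`. The choice of `math` and `cmath` and the aliases `m` and `cm` are assumptions for this sketch. Explicit `from ... import` statements are used here; the same overwriting happens, less visibly, with `import *`.

```python
from math import sqrt
first = sqrt(4)            # math.sqrt returns a float

from cmath import sqrt     # silently rebinds the name sqrt
second = sqrt(4)           # cmath.sqrt returns a complex number

# With aliases it stays obvious which module provides the function
import cmath as cm
import math as m

real_root = m.sqrt(4)
complex_root = cm.sqrt(-4)

print(first, second, real_root, complex_root)
```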


## Getting Help
Help on modules and functions can be obtained in different ways. Online documentation is available for all [modules of the Python standard library](https://docs.python.org/3.6/library/index.html) and for the [packages provided in the PyPI repository](https://pypi.python.org/pypi?%3Aaction=browse). Inline help can be obtained by

```
help(<<package_name>>)
```
or, in IPython and Jupyter notebooks, equivalently by
```
<<package_name>>?
```
In the same way help on individual functions or objects can be queried.
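
The text that `help()` displays is based on an object's docstring, which can also be retrieved programmatically. A minimal sketch, using `inspect.getdoc` from the standard library and `time.ctime` as an arbitrary example:

```python
import inspect
import time

# getdoc returns the cleaned-up docstring of the given object
doc = inspect.getdoc(time.ctime)
print(doc)

# help(time.ctime) shows the same information, formatted for reading
```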


## Input and Output
### print
The value of an arbitrary variable can be printed to the output (console or jupyter notebook) by
```
print(variable_name)
```
For example:

In [12]:
a=3.5
print(a)

3.5


**Formatted output - old way:**

The `print` function also supports formatted output. For example:

In [13]:
a=5
b=3.0
print("%2d divided by %3f is %2.3f"%(a,b,a/b)) # old way of formatted output

 5 divided by 3.000000 is 1.667


Here `%2d` is a placeholder for an integer that is padded to a minimum width of 2 characters, `%3f` is a placeholder for a float with a minimum width of 3 characters and the default of 6 digits after the decimal point, and `%2.3f` is a placeholder for a float with a minimum total width of 2 characters and 3 digits after the decimal point. The variables to be inserted at the placeholder positions must be given as a tuple in parentheses, separated from the string by a `%` character.

**Formatted output - new way:**

In the new way of formatted output the placeholders are defined like this
```
{2:2.3f}
```
This denotes a float with a minimum total width of 2 characters and 3 digits after the decimal point. The `2` before the `:` (colon) indicates that the argument at index 2 of the argument list of the `format` method is inserted here.


In [14]:
print("{0:2d} divided by {1:2f} is {2:2.3f}".format(a,b,a/b)) # new way of formatted output
print("{1:2f} divided by {0:2d} is {2:2.3f}".format(a,b,b/a))

 5 divided by 3.000000 is 1.667
3.000000 divided by  5 is 0.600
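
Since Python 3.6 the same format specifications can also be used in formatted string literals (f-strings), which are not covered in the original notebook. In this minimal sketch the expression is written directly inside the braces instead of being passed to `format`:

```python
a = 5
b = 3.0

# Same format specifications as above, but with the expressions inline
line = f"{a:2d} divided by {b:f} is {a / b:2.3f}"
print(line)

# A larger minimum width pads the number with leading spaces
padded = f"{a / b:8.3f}"
print(repr(padded))
```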


### Input
Python programs may require user input. For this the `input()` function can be applied. Note that this function does not interpret the input; it always returns a string. To have the input interpreted as a Python expression, `eval(input())` can be applied. Since `eval` executes arbitrary code, it should only be used with trusted input.

In both versions the program stops at the line with the input-instruction and waits for a user-input, which must be terminated by pressing the *Return*-key.

In [33]:
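try:
    # eval() interprets the typed text as a Python expression
    i=eval(input("Some keyboard input, please! "))
    print(i)
    print(type(i))
except Exception:
    print("ERROR: Input impossible")

try:
    # input() alone always returns the typed text as a string
    r=input("Some keyboard input, please! ")
    print(r)
    print(type(r))
except Exception:
    print("ERROR: Input impossible")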

Some keyboard input, please! 3*4
12
<class 'int'>
Some keyboard input, please! 3*4
3*4
<class 'str'>
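
If only literal values such as numbers or lists are expected, `ast.literal_eval` from the standard library is a more restrictive alternative to `eval`. The following is a sketch of one possible approach, not the method used in this notebook. The helper name `parse_literal` is hypothetical, and unlike `eval` it does not evaluate expressions such as `3*4`.

```python
import ast


def parse_literal(text):
    """Convert text to a Python literal (number, string, list, ...), or None if that fails."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None


print(parse_literal("12"), type(parse_literal("12")))
print(parse_literal("3.5"))
print(parse_literal("3*4"))   # an expression, not a literal
```

In a program the helper could be applied to keyboard input as `parse_literal(input("Some keyboard input, please! "))`.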


### Read from File

In Python input from files and output to files is realized by *file objects*. A file object is returned by the function `open(filename,mode)`, where the string-variable `filename` defines the (path and) name of the file that shall be accessed, and the string-variable `mode` defines how the file shall be accessed. E.g. 
* `mode = 'w'` for writing to a file (if the file already exists it will first be erased)
* `mode = 'a'` for appending to a file
* `mode = 'r'` for reading from a file.
* `mode = 'r+'` for reading from and writing to a file.
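
The effect of the modes `'w'`, `'a'` and `'r'` can be sketched as follows. To avoid touching existing files, this example writes to a hypothetical file `modes_demo.txt` in a temporary directory; the file name and the use of `tempfile` are assumptions made for this illustration.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, so no existing file is touched
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "w") as f:      # creates the file
    f.write("first\n")
with open(path, "w") as f:      # erases the previous content
    f.write("second\n")
with open(path, "a") as f:      # appends to the existing content
    f.write("third\n")
with open(path, "r") as f:      # reads the result
    content = f.read()

print(content)
```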

Once a file object has been created, its `read()` method can be called to read the entire file content into a single string-variable. If file access is no longer needed, the file object's `close()`-method should be called to release all resources.

In [17]:
fin = open("exampleFile.txt","r")
print("Type of fin:  ",type(fin))
filecontents = fin.read()
print("Type of content:  ",type(filecontents))
print("Content:\n", filecontents)
fin.close()

Type of fin:   <class '_io.TextIOWrapper'>
Type of content:   <class 'str'>
Content:
 This is the first row.
The sentence, which starts in the second row
ends in the third row.

Last line follows an empty line.


An alternative way to read in all contents of a file into a single string variable is:

In [18]:
with open("exampleFile.txt","r") as f:
    read_data = f.read()
print(read_data)

This is the first row.
The sentence, which starts in the second row
ends in the third row.

Last line follows an empty line.


This option is generally recommended, because the `with` statement closes the file properly after access, even if an exception is raised inside the block.
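
What the `with` statement does for file objects is roughly comparable to an explicit `try`/`finally` construct. The sketch below compares both variants. It first creates its own small file in a temporary directory (an assumption made here so that the example is self-contained); the variable names are illustrative.

```python
import os
import tempfile

# Create a small example file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "exampleFile.txt")
with open(path, "w") as f:
    f.write("This is the first row.\n")

# Explicit version: close() is called even if read() raises an exception
fin = open(path, "r")
try:
    data_explicit = fin.read()
finally:
    fin.close()

# Shorter version: the with statement closes the file automatically
with open(path, "r") as fin2:
    data_with = fin2.read()

print(fin.closed, fin2.closed)
```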



If only single lines shall be read instead of the entire file content, the `readline()`-method can be applied as follows:

In [19]:
fin = open("exampleFile.txt","r")
print(fin.readline())
print(fin.readline())
fin.close()

This is the first row.

The sentence, which starts in the second row



In the while-loop of the following code snippet the contents of a file are read line by line. The code demonstrates that
* for empty rows in the file, `readline()` returns `'\n'`, which is the line-break control sequence
* at the end of the file, `readline()` returns an empty string.


In [20]:
fin = open("exampleFile.txt","r")
while True:
    s=fin.readline()
    print(len(s))
    if len(s)>0:
        if s=="\n":
            print("Empty row")
        else:
            print(s)
    else:
        break
fin.close()

23
This is the first row.

45
The sentence, which starts in the second row

23
ends in the third row.

1
Empty row
32
Last line follows an empty line.
0


The most memory-efficient method for reading file contents is to loop over the file object, which yields one line at a time, as follows:

In [21]:
with open("exampleFile.txt","r") as fin:
    for line in fin:
        print(line)

This is the first row.

The sentence, which starts in the second row

ends in the third row.



Last line follows an empty line.


The `readlines()`-method returns a list of string-variables. Each element in this list contains the contents of a single line:

In [22]:
with open("exampleFile.txt","r") as fin:
    linelist=fin.readlines()
print(linelist)

['This is the first row.\n', 'The sentence, which starts in the second row\n', 'ends in the third row.\n', '\n', 'Last line follows an empty line.']
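
As the output shows, each line keeps its trailing `\n`. If the line breaks are not wanted, they can be removed while reading. The following sketch shows two common options. It writes a shortened version of the example file to a temporary directory first, which is an assumption made here so that the example is self-contained.

```python
import os
import tempfile

# Shortened copy of the example file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "exampleFile.txt")
with open(path, "w") as f:
    f.write("This is the first row.\n\nLast line follows an empty line.")

# Option 1: strip the line break from each line while looping over the file
with open(path, "r") as fin:
    stripped = [line.rstrip("\n") for line in fin]

# Option 2: read everything and split at the line breaks
with open(path, "r") as fin:
    split = fin.read().splitlines()

print(stripped)
```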


### Write to File

Files can be written in two different modes. If the file shall be erased before new text is written to it, the parameter `mode` of the `open(filename,mode)` function must be set to `mode='w'`. If the new content shall be appended to the existing contents of the file, it must be set to `mode='a'`. For writing, the file object returned by `open(filename,mode)` provides the methods `write()` and `writelines()`. The former writes the string passed as its argument to the file. The argument of the latter must be a list of strings, which are written to the file one after another; no line breaks are inserted unless the list elements end with `\n`:

In [23]:
fout=open("outPutFile.txt",'w')
fout.write("My first row is here.\n")
fout.write("Here is the second row.\n")
usertext=input("Enter some text for the third row! ")
fout.write(usertext+'\n')
fout.writelines(['line 1','line 2'])
fout.close()

Enter some text for the third row! this is example text


In [24]:
with open("outPutFile.txt",'a') as fout:
    fout.write("Appended this line")

Note that in this way only string-variables can be written to files. Non-string-variables need to be converted to strings before writing:

In [34]:
with open("outPutFile.txt",'a') as fout:
    num=12.7
    try:
        fout.write(num)
    except TypeError:  # write() accepts only strings
        print("ERROR: Can't write this data type to file")

ERROR: Can't write this data type to file


In [35]:
with open("outPutFile.txt",'a') as fout:
    fout.write('\n')
    fout.write(str(num))
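
Two details from this section can be combined in a short sketch: `writelines()` only produces separate lines if each string ends with `\n`, and numbers can be converted on the fly, either by `print` with its `file` argument or by a formatted string. The temporary directory is an assumption made here to keep the example self-contained and to avoid overwriting the file used above.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "outPutFile.txt")
rows = ["line 1", "line 2"]
num = 12.7

with open(path, "w") as fout:
    # Append the line break to each element before writing
    fout.writelines(row + "\n" for row in rows)
    # print() converts its argument to a string and adds a line break
    print(num, file=fout)

with open(path, "a") as fout:
    # A formatted string controls the number of decimal places
    fout.write(f"{num:.2f}\n")

with open(path, "r") as fin:
    written = fin.read()

print(written)
```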

> There exist many Python packages for accessing files of specific types, e.g. *.csv*, Excel, or JSON, and even other media formats such as *.jpeg, .png, .wav, .mp3, .mp4*. Accessing databases is the subject of the lecture [Database Access](07DataBaseSQL.ipynb).