# 4. handling input/output

1. [file handling](#file handling)
1. [modules](#modules)

## 4.1 File handling <a name="file handling" />

_The different modes in which a file can be opened can be found at [reading and writing files](https://docs.python.org/3.6/tutorial/inputoutput.html#reading-and-writing-files)_
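
As a quick illustration of the three most common modes, here is a minimal sketch. The file name `modes_demo.txt` and the temporary directory are assumptions made for the example, so that nothing in `./extra` is touched.

```python
import os
import tempfile

# hypothetical scratch file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "w") as fid:   # "w": create the file, or truncate it if it exists
    fid.write("first\n")

with open(path, "a") as fid:   # "a": append to the end of the file
    fid.write("second\n")

with open(path, "r") as fid:   # "r": read only (this is the default mode)
    after_append = fid.read()

with open(path, "w") as fid:   # opening with "w" again discards the old content
    fid.write("third\n")

with open(path) as fid:
    after_rewrite = fid.read()

print(repr(after_append))
print(repr(after_rewrite))
```

The second `print` shows only the last line written, because `"w"` empties the file before writing.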

### 4.1.1 File handling context

* a file must be opened, manipulated, then closed

In [9]:
fid = open("./extra/test.txt", "w+")
print("working with file: file open? {}".format(not(fid.closed)))
if not fid.closed:
    print("do not forget to close your file to avoid data corruption")
    fid.close()
print("stopped working with file: file open? {}\n".format(not(fid.closed)))

working with file: file open? True
do not forget to close your file to avoid data corruption
stopped working with file: file open? False



* easier &rightarrow; context manager<br />
  when leaving the context (decrease indentation) &rightarrow; automatic closing of file 

In [11]:
with open("./extra/test.txt", "r") as fid:
    print("context active: file open? {}".format(not(fid.closed)))
print("context inactive: file open? {}".format(not(fid.closed)))

context active: file open? True
context inactive: file open? False
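
The context manager also closes the file when an error interrupts the block. A minimal sketch, assuming a hypothetical scratch file `context_demo.txt` in a temporary directory:

```python
import os
import tempfile

# hypothetical scratch file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "context_demo.txt")

try:
    with open(path, "w") as fid:
        fid.write("some text\n")
        # simulate a failure while the file is still open
        raise ValueError("something went wrong while the file was open")
except ValueError as err:
    print("caught: {}".format(err))

# the name fid still exists after the block, but the file behind it is closed
print("file closed after the error? {}".format(fid.closed))
```

With a plain `open()` / `close()` pair, the same guarantee would need an explicit `try: ... finally: fid.close()`.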


#### reading from and writing to a file

In [None]:
text = ''
with open("./extra/test.txt", "r") as fid:
    for line in fid:
        text += line # print(line) would add a second line break after each line, so concatenate instead

# first 250 characters of the text
print([text[:250]]) # inside a list the string is shown as its repr, so line breaks appear as \n

print('\n', '-' * 25, ' full text starts here ', '-' * 25, '\n')

# full text
print(text)

In [None]:
text = 'March 20, 2018 is the 250th birthday of Jean-Baptiste Joseph Fourier\nHappy birthday Count Fourier!'
with open("./extra/myFile.txt", "w+") as fid: # "w+" truncates an existing file; use "a" to append to its end
    fid.write(text) # write() takes one string; writelines() expects an iterable of strings
    
text = ''
with open("./extra/myFile.txt", "r") as fid:
    for line in fid:
        text += line
print(text)
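
The loops above rebuild the text line by line. For comparison, here is a small sketch of the other common ways to read a file. The file name `read_demo.txt` and the three sample lines are assumptions made for the example.

```python
import os
import tempfile

# hypothetical scratch file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")

lines = ["first line\n", "second line\n", "third line\n"]
with open(path, "w") as fid:
    fid.writelines(lines)        # writes each string as is, without adding line breaks

with open(path) as fid:
    whole = fid.read()           # the whole file as one string

with open(path) as fid:
    as_list = fid.readlines()    # a list of lines, each keeping its trailing "\n"

with open(path) as fid:
    for line in fid:             # line by line, without loading the whole file
        print(line, end="")      # end="" suppresses the extra line break of print()
```

Iterating over the file object is usually the better choice for large files, since only one line is held in memory at a time.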

### 4.1.2 Python scripts

* save all your statements in a `.py` file, e.g., `test.py`

* a collection of variables, functions, statements, &#8230;

* execute statements from command line / import functionalities

```bash
$ python test.py
```

_HINT:_ a .py file can be executed from the command line, but it can also be imported. Statements that must run only when the file is executed (on any system: Windows, macOS, Linux), and not when it is imported, go under the following condition

```python
if __name__ == "__main__":
    pass  # replace with the statements to execute
```

We will see later what happens to the part outside this conditional when the file is imported.
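
A minimal sketch of such a file follows. The file name `greeting.py` and the function `greet` are hypothetical, chosen only for illustration.

```python
# hypothetical content of a file named greeting.py

def greet(name):
    """Return a greeting for `name`."""
    return "Hello, {}!".format(name)


if __name__ == "__main__":
    # runs for `python greeting.py`, is skipped for `import greeting`
    print(greet("world"))
```

Running `python greeting.py` prints the greeting, whereas `import greeting` only defines `greet` without printing anything.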

_HINT:_ For automatic execution by the correct Python interpreter on Unix/Linux platforms, add the shebang! on the first line of the file

```bash
#!/usr/bin/env python3
```

### 4.1.3 dynamically writing a Python script

One can dynamically create a Python script by writing into a .py file using formatters!

In [25]:
a = 42 # change this value to any Python object whose repr() is a valid literal (number, string, list, dict, ...)
with open("myfirstscript.py", "w") as fid:
    fid.write("#!/usr/bin/env python3\n\n") # the Python 3 shebang!
    fid.write( # new lines should use the correct indentation needed in the file (otherwise use \n\t escapes)
        """if __name__ == '__main__':
    a = {0!r}
    print('a is {{}}'.format(a))
""".format(a)) # {0!r} inserts repr(a); double braces {{ }} survive the outer format() as a literal {}

In [26]:
!python myfirstscript.py

a is 42
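
The same idea can be sketched without the notebook-only `!python` syntax. The example below is an illustration under assumptions: the script name `generated_script.py`, the temporary directory and the sample value are all made up for the example. It embeds the value through `repr()` and runs the generated script with the interpreter that is currently active.

```python
import os
import subprocess
import sys
import tempfile

# example value, chosen for illustration
value = {"answer": 42, "names": ["Joseph", "Fourier"]}

# hypothetical script name inside a temporary directory
script_path = os.path.join(tempfile.mkdtemp(), "generated_script.py")

# {!r} inserts repr(value), i.e. the value written as a Python literal;
# {{}} is an escaped pair of braces that ends up as {} in the script
template = ("#!/usr/bin/env python3\n\n"
            "if __name__ == '__main__':\n"
            "    a = {!r}\n"
            "    print('a is {{}}'.format(a))\n")

with open(script_path, "w") as fid:
    fid.write(template.format(value))

# sys.executable is the interpreter running this code
result = subprocess.run([sys.executable, script_path],
                        stdout=subprocess.PIPE, universal_newlines=True)
print(result.stdout, end="")
```

Using `repr()` keeps strings quoted in the generated file, which a plain `{}` placeholder would not do.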


### Exercise: Making a word histogram
_HINT:_ Search for "Aligning the text and specifying a width" on [formatting strings](https://docs.python.org/3/library/string.html#format-string-syntax)

In [39]:
from string import ascii_letters
from urllib.request import urlopen
from collections import OrderedDict

d = dict() # initialise an empty dictionary

with urlopen("http://www.textfiles.com/science/kennedy.txt") as f: # iterates over lines like open(filename, "rb"): each line is bytes
    for line in f:
        for word in line.split():
            word = word.decode("ascii") # bytes --> str
            word = "".join([c for c in word.lower() if c in ascii_letters]) # make lower case, purge punctuation
            # *** to complete *** : use the dictionary d to store word frequencies
            d[word] = d.get(word, 0) + 1 # get() returns 0 for a word not seen before

# sorting the dictionary such that words with highest frequency appear first
d_ordered = OrderedDict(sorted(d.items(), key=lambda x: x[1])[::-1] ) 

L = max([len(k) for k in d_ordered.keys()])
print(L)

# printing
for k in d_ordered:
    """
    *** to complete *** with print statement such as to print lines with words and their frequency
    word1      : +++++++
    long_word2 : +++
    """
    print('{:{width}s}: {}'.format(k, d_ordered[k] * '+', width=L + 1)) # pad to the longest word instead of a hard-coded 17

16
the              : ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
and              : +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
of               : ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
a                : +++++++++++++++++++++++++++++
space            : ++++++++++++++++++++++++++++
is               : +++++++++++++++++++++++
                 : ++++++++++++++++++++++
for              : ++++++++++++++++++++
center           : ++++++++++++++++++++
in               : +++++++++++++++++++
to               : ++++++++++++++++++
kennedy          : +++++++++++++++++
launch           : ++++++++++++++
are              : +++++++++++++
at               : ++++++++++++
as               : ++++++++++++
with             : +++++++++++
spaceport        : +++++++++
work             : +++++++++
force            : +++++++++
nasa             : +++++++++
it               : +++++++++
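
As an alternative to the manual dictionary bookkeeping, the standard library class `collections.Counter` does the counting and the sorting. The sketch below uses a short stand-in sentence (an assumption, so that it runs without a network connection) and skips tokens that are empty after purging, which are the ones that show up as a row without a word in the output above.

```python
from collections import Counter
from string import ascii_letters

# short stand-in text, chosen for illustration
text = "the space center and the launch pad: the center of space flight"

words = []
for word in text.split():
    word = "".join(c for c in word.lower() if c in ascii_letters)
    if word:  # skip tokens that are empty after purging
        words.append(word)

counts = Counter(words)                 # maps each word to its frequency
width = max(len(w) for w in counts)     # length of the longest word

# most_common() yields (word, frequency) pairs, highest frequency first
for word, n in counts.most_common():
    print("{:{width}s} : {}".format(word, n * "+", width=width))
```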

## 4.2 Packages and modules <a name="modules" />

Packages extend the basic functionality of Python through one or more modules. When installing Python, all of the modules of the [standard library](https://docs.python.org/3/library/index.html) are automatically available.

* packages are to Python what libraries are to C
* __modules__ &in; __packages__
* the standard library of a Python distribution is a collection of such modules

    * `math`, `sys`, `random`, `datetime`, `email`, &#8230;
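
A short sketch using three of these standard library modules; the values are chosen only for illustration.

```python
import datetime
import math
import random

root = math.sqrt(16)                      # square root, returned as a float
print(root)

birthday = datetime.date(2018, 3, 20)     # a calendar date
weekday = birthday.weekday()              # Monday is 0, Sunday is 6
print(weekday)

random.seed(0)                            # fix the seed to make the draw repeatable
draw = random.choice(["math", "sys", "random", "datetime"])
print(draw)
```

None of these modules needs a separate installation, since they ship with the interpreter.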

One of the most used stacks is the scientific python stack &rightarrow; [scipy ecosystem](https://scipy.org/about.html)
* [scipy](http://scipy.org/): special functions, ODE solvers, &hellip;
* [numpy](http://www.numpy.org/): numerical computation
* [matplotlib](http://matplotlib.org/): plotting
* [IPython](http://ipython.org/) (+[Jupyter](http://jupyter.org/)): interactive console
* [sympy](http://www.sympy.org/): symbolic computing
* [pandas](http://pandas.pydata.org/): data structures and computation
* [nose](http://nose.readthedocs.io/en/latest/): unit testing

Frequently augmented with the [scikits](https://www.scipy.org/scikits.html) packages ([list](http://scikits.appspot.com/scikits))

Machine learning:
* [scikit-learn](http://scikit-learn.org/stable/index.html)

Image processing:
* [scikit-image](http://scikit-image.org/)
* [openCV](https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_tutorials.html)
* python imaging library [PIL](http://pythonware.com/products/pil/)

and many, many more &hellip;

### 4.2.1 simple import

* `import module`

    * functions must be called as `module.function` &rightarrow; namespaces


* `from module import function1, function2`

    * only specific functions are imported

    * functions can be called as `function1`


* `from module import *`

    * __avoid__ whenever possible!
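
The import forms above can be compared in a small sketch. It uses only the standard `math` and `builtins` modules and lists which built-in names a star import of `math` would silently replace. The variable names `star_names` and `shadowed` are made up for the example.

```python
import builtins
import math                   # names stay behind the prefix: math.sqrt, math.pi
from math import pi, sqrt     # only these two names enter the current namespace

print(math.sqrt(16), sqrt(16), pi == math.pi)

# names that `from math import *` would bring in: the module's __all__ if it
# defines one, otherwise every name that does not start with an underscore
star_names = set(getattr(math, "__all__",
                         [name for name in dir(math) if not name.startswith("_")]))

# built-ins that would be replaced by a function of the same name from math
shadowed = sorted(star_names & set(dir(builtins)))
print(shadowed)

# the two versions do not behave identically: int result versus float result
print(pow(2, 3), math.pow(2, 3))
```

After a star import, a later call to `pow` would quietly use the `math` version, which is one reason to prefer the explicit forms.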
    
### 4.2.2 more advanced import

* `import module as modulename`

  * choose your own namespace for the module
  
  * frequently used: `import numpy as np`, `import matplotlib.pyplot as plt`
  

* `from module import function as fn`

  * choose your name for a specific function
  
  * function can be called as `fn`

#### `log` is a function in the `math` module
after executing the `from math import log` cell (In [42]), try executing the failing cell (In [41]) again!

In [40]:
import math
math.log(100)/math.log(10)

2.0

In [41]:
log(100)/log(10) # log is only known in the namespace 'math'

NameError: name 'log' is not defined

In [42]:
from math import log
log(100)/log(10)

2.0

In [43]:
from math import log as ln
ln(100)

4.605170185988092

### 4.2.3 creating your own module (easy starter's guide)
For more information see the [modules tutorial](https://docs.python.org/3/tutorial/modules.html)

In [72]:
print(abc) # variable not defined

NameError: name 'abc' is not defined

In [73]:
z = 'abc' # change this value to any Python object whose repr() is a valid literal
with open("mymodule.py", "w") as fid:
    fid.write("#!/usr/bin/env python3\n\n") # the Python 3 shebang!
    fid.write("abc = {!r}".format(z)) # {!r} writes repr(z), here 'abc' including the quotes
print(abc) # abc is still unknown, but the lines before have been executed

NameError: name 'abc' is not defined

In [74]:
with open("mymodule.py", "r") as fid:
    for line in fid:
        print(line)

#!/usr/bin/env python3



abc = 'abc'


In [75]:
import mymodule

print(mymodule.abc) # but it is in this module

from mymodule import abc as q

print(q)

abc
abc


In [76]:
import importlib # Python remembers which modules have already been imported, no new import takes place
importlib.reload(mymodule) # force reload of the module

<module 'mymodule' from '/home/rphlypo/Documents/Pro/GrenobleINP/FormationContinue/PythonDebutants/notebook/mymodule.py'>
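
The effect of `importlib.reload` is easiest to see when the module file changes between two imports. The sketch below works under assumptions: the module name `reload_demo` and the temporary directory are hypothetical, and bytecode caching is switched off so that the quick rewrite of the file is not hidden by a cached `.pyc`.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True          # do not cache compiled modules for this demo

module_dir = tempfile.mkdtemp()         # hypothetical location of the module
module_path = os.path.join(module_dir, "reload_demo.py")
sys.path.insert(0, module_dir)          # make the directory importable

with open(module_path, "w") as fid:
    fid.write("abc = 'first version'\n")

importlib.invalidate_caches()           # make sure the new file is found
import reload_demo
before = reload_demo.abc

with open(module_path, "w") as fid:     # change the module on disk
    fid.write("abc = 'second, longer version'\n")

import reload_demo                      # already imported: nothing happens
unchanged = reload_demo.abc

importlib.reload(reload_demo)           # executes the file again
after = reload_demo.abc

print(before, unchanged, after, sep="\n")
```

Names imported earlier with `from reload_demo import abc` would keep their old value, since `reload` only updates the module object itself.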

#### Exercise: create a module

* use an external editor (you can use new &rightarrow; textfile from the notebook tree)
* name your module first_name.py
* the module should contain the variables that can be imported
  * age = your_Python_age (int, number of years of experience)
  * firstName = 'your_first_name' (string)
  * lastName = 'your_last_name' (string)
* when called from the command line, a string should be printed:
  * 'Hi, I'm /firstName/ /lastName/ , and I am /age/ Python years old'
  
Use `import <module>` and display the variables with calls to `print`, then run `!python <module>.py`