# Input and Output
- The str() function returns a human-readable representation of a value.
- repr() generates a representation that can be read by the interpreter.
- For objects that don’t have a particular representation for human consumption, str() returns the same value as repr().
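
The fallback described in the last bullet can be seen with user-defined classes. The sketch below is only an illustration: `Point` and `Tag` are hypothetical classes, not part of the original notebook. `Point` defines both `__str__` and `__repr__`, while `Tag` defines only `__repr__`, so `str()` falls back to it.

```python
class Point:
    """Hypothetical class that defines both representations."""
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __repr__(self):
        # Unambiguous form, written as valid Python that recreates the object
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        # Friendlier form used by print() and str()
        return f"({self.x}, {self.y})"


class Tag:
    """Hypothetical class that defines only __repr__."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Tag({self.name!r})"


p = Point(1, 2)
t = Tag("io")
print(str(p), repr(p))     # (1, 2) Point(1, 2)
print(str(t) == repr(t))   # True: str() falls back to repr()
```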

In [27]:
s = 'Hello, world.'
str(s)

'Hello, world.'

In [28]:
l = list(range(4))
str(l)

'[0, 1, 2, 3]'

In [29]:
repr(s)

"'Hello, world.'"

In [30]:
repr(l)

'[0, 1, 2, 3]'

In [31]:
x = 10 * 3.25
y = 200 * 200
s = 'The value of x is ' + str(x) + ', and y is ' + repr(y) + '...'
print(s)

The value of x is 32.5, and y is 40000...


repr() of a string adds string quotes and backslashes:

In [32]:
hello = 'hello, world\n'
hellos = repr(hello)
hellos

"'hello, world\\n'"

The argument to repr() may be any Python object:

In [33]:
repr((x, y, ('spam', 'eggs')))

"(32.5, 40000, ('spam', 'eggs'))"

In [34]:
n = 7
for x in range(1, n):
    for i in range(n):
        print(repr(x**i).rjust(i+2), end=' ') # rjust or center can be used
    print()

 1   1    1     1      1       1        1 
 1   2    4     8     16      32       64 
 1   3    9    27     81     243      729 
 1   4   16    64    256    1024     4096 
 1   5   25   125    625    3125    15625 
 1   6   36   216   1296    7776    46656 


In [35]:
for x in range(1, n):
    for i in range(n):
        print("%07d" % x**i, end=' ')  # old C format
    print()

0000001 0000001 0000001 0000001 0000001 0000001 0000001 
0000001 0000002 0000004 0000008 0000016 0000032 0000064 
0000001 0000003 0000009 0000027 0000081 0000243 0000729 
0000001 0000004 0000016 0000064 0000256 0001024 0004096 
0000001 0000005 0000025 0000125 0000625 0003125 0015625 
0000001 0000006 0000036 0000216 0001296 0007776 0046656 


# Usage of the `str.format()` method 

In [36]:
print('We are at the {} in {}!'.format('osur', 'Rennes'))

We are at the osur in Rennes!


In [37]:
print('From {0} to  {1}'.format('November 17', 'November 24'))

From November 17 to  November 24


In [38]:
print('It takes place at {place}'.format(place='Milon room'))

It takes place at Milon room


In [39]:
import math
print('The value of PI is approximately {:.7g}.'.format(math.pi))

The value of PI is approximately 3.141593.


## Formatted string literals (Python 3.6+)

In [40]:
print(f'The value of PI is approximately {math.pi:.4f}.')

The value of PI is approximately 3.1416.


In [41]:
name = "Fred"
print(f"He said his name is {name}.")
print(f"He said his name is {name!r}.")

He said his name is Fred.
He said his name is 'Fred'.


In [42]:
f"He said his name is {repr(name)}."  # repr() is equivalent to !r

"He said his name is 'Fred'."

In [43]:
width, precision = 10, 4
value = 12.34567
print(f"result: {value:{width}.{precision}f}")  # nested fields

result:    12.3457


In [44]:
from datetime import datetime
today = datetime(year=2017, month=1, day=27)
print(f"{today:%B %d, %Y}")  # using date format specifier

January 27, 2017
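
The three formatting styles seen so far (the old `%` operator, `str.format()` and f-strings) share most of the same format specification. As a small sketch, with illustrative variable names that are not from the original notebook, the following builds the same string three ways and then shows the alignment options.

```python
value = 42
pi = 3.14159265

old_style = "%07d | %8.3f" % (value, pi)               # old C format
format_method = "{:07d} | {:8.3f}".format(value, pi)   # str.format()
f_string = f"{value:07d} | {pi:8.3f}"                  # formatted string literal

print(f_string)
print(old_style == format_method == f_string)

# Alignment: < left, > right, ^ centered, with an optional fill character
aligned = f"[{'left':<10}] [{'right':>10}] [{'mid':*^10}]"
print(aligned)
```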


# Exercise
Create a list containing the values of [binomial coefficients](https://en.wikipedia.org/wiki/Binomial_coefficient) and reproduce [Pascal's triangle](https://en.wikipedia.org/wiki/Pascal%27s_triangle):
<pre>
                 1  
               1   1  
             1   2   1  
           1   3   3   1  
         1   4   6   4   1  
       1   5  10  10   5   1  
     1   6  15  20  15   6   1  
   1   7  21  35  35  21   7   1  
</pre>

<button data-toggle="collapse" data-target="#pascal" class='btn btn-primary'>Solution</button>
<div id="pascal" class="collapse">
```python
def binomial(n, p):
    b = 1    
    for i in range(1, min(p, n - p) + 1):
        b *= n
        b = b // i
        n -= 1
    return b

def pascal_triangle(n):
    for i in range(n):
        line = [binomial(i, j) for j in range(i+1)]
        s = (n-i) * 3 * " " # number of spaces
        for c in line:
            s += repr(c).rjust(3) + 3 * " " # coeffs repr split by 3 spaces
        print(s)
        
pascal_triangle(10)
```
</div>

In [54]:
# %load solutions/pascal.py
def binomial(n, p):
    b = 1    
    for i in range(1, min(p, n - p) + 1):
        b *= n
        b = b // i
        n -= 1
    return b

def pascal_triangle(n):
    spaces = 2*n
    for i in range(n):
        for j in range(spaces):
            print('', end=" ")
        for j in range(i+1):
            print("{:3d}".format(binomial(i, j)), end=" ")
        print()
        spaces = spaces - 2
        
        

pascal_triangle(10)


                      1 
                    1   1 
                  1   2   1 
                1   3   3   1 
              1   4   6   4   1 
            1   5  10  10   5   1 
          1   6  15  20  15   6   1 
        1   7  21  35  35  21   7   1 
      1   8  28  56  70  56  28   8   1 
    1   9  36  84 126 126  84  36   9   1 
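
As an alternative sketch (not the notebook's solution), each row can be built from the previous one, since every inner coefficient is the sum of the two entries above it. The helper name `pascal_rows` is hypothetical, and the cross-check assumes Python 3.8 or later, where `math.comb` is available.

```python
import math

def pascal_rows(n):
    """Yield the first n rows of Pascal's triangle using the row recurrence."""
    row = [1]
    for _ in range(n):
        yield row
        # Each inner entry is the sum of the two entries above it
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]

rows = list(pascal_rows(8))
width = len(" ".join(f"{c:3d}" for c in rows[-1]))
for row in rows:
    # Center every row on the width of the last (longest) row
    print(" ".join(f"{c:3d}" for c in row).center(width))

# Cross-check against math.comb (Python 3.8+)
check = all(row[j] == math.comb(i, j)
            for i, row in enumerate(rows) for j in range(i + 1))
print(check)
```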


# Reading and Writing Files

`open()` returns a file object and is most commonly called with two arguments: the file name and the access mode.


In [55]:
f = open('workfile.txt', 'w')
f.write("1. This is a txt file.\n")
f.write("2. \\n is used to begin a new line")
f.close()
!cat workfile.txt

1. This is a txt file.
2. \n is used to begin a new line

`mode` can be:
- 'r' when the file will only be read.
- 'w' for writing only (an existing file with the same name will be erased).
- 'a' opens the file for appending; any data written to the file is automatically added to the end.
- 'r+' opens the file for both reading and writing.
- The mode argument is optional; 'r' will be assumed if it’s omitted.
- Normally, files are opened in text mode.
- 'b' appended to the mode opens the file in binary mode.
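
The modes listed above can be tried in sequence on a scratch file. This is a minimal sketch: the file name `modes.txt` and the temporary directory are assumptions chosen so that no existing file is overwritten.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "modes.txt")

with open(path, "w") as f:      # create (or truncate) and write
    f.write("first line\n")

with open(path, "a") as f:      # append at the end of the file
    f.write("second line\n")

with open(path, "r+") as f:     # read and write without truncating
    content = f.read()
    f.write("third line\n")     # the position is at the end after read()

with open(path, "rb") as f:     # binary mode returns bytes, not str
    raw = f.read()

print(content)
print(type(raw), len(raw.splitlines()))
```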

In [56]:
with open('workfile.txt') as f:
    read_text = f.read()
f.closed

True

In [57]:
read_text

'1. This is a txt file.\n2. \\n is used to begin a new line'

In [58]:
lines = []
with open('workfile.txt') as f:
    lines.append(f.readline())
    lines.append(f.readline())
    lines.append(f.readline())
    
lines

['1. This is a txt file.\n', '2. \\n is used to begin a new line', '']

- `f.readline()` returns an empty string when the end of the file has been reached.
- `f.readlines()` or `list(f)` reads all the lines of a file into a list.
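
The two bullets above can be checked on a small file. The following is a sketch under assumptions: the scratch file `lines.txt` and its three-line content are made up for the illustration.

```python
import os
import tempfile

# Hypothetical scratch file with three lines (no trailing newline)
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(path, "w") as f:
    f.write("alpha\nbeta\ngamma")

with open(path) as f:
    all_lines = f.readlines()      # every line, newline characters included

with open(path) as f:
    same_lines = list(f)           # equivalent to readlines()

with open(path) as f:
    stripped = [line.rstrip("\n") for line in f]  # iterate and drop newlines

with open(path) as f:
    first = f.readline()
    rest = f.read()                # continues from the current position
    at_end = f.readline()          # '' signals the end of the file

print(all_lines)
print(stripped)
print(repr(at_end))
```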

For reading lines from a file, you can loop over the file object. This is memory efficient, fast, and leads to simple code:

In [59]:
with open('workfile.txt') as f:
    for line in f:
        print(line, end='')

1. This is a txt file.
2. \n is used to begin a new line

### Exercise: Wordcount Example

[WordCount](https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v1.0) is a simple application that counts the number of occurrences of each word in a given input set.

- Use the `lorem` module to write some text to the file "sample.txt".
- Write a function `words` that takes a file name as input and returns a sorted list of the words in the file.
- Write a function `reduce` that reads the output of `words`, sums the occurrences of each word, and returns the result as a dictionary
`{word1: occurrences1, word2: occurrences2}`.
- You can check the results using piped shell commands:
```sh
cat sample.txt | fmt -1 | tr '[:upper:]' '[:lower:]' | tr -d '.' | sort | uniq -c
```

<button data-toggle="collapse" data-target="#words" class='btn btn-primary'>Solution</button>
<div id="words" class="collapse">
```python
from lorem import text

with open('sample.txt','w') as f:
    f.write(text())
    
def words(file):
    """ Parse a file and return a sorted list of words """
    with open(file) as f:
        text = f.read()
        
    return sorted(text.lower().replace('.','').split())

words('sample.txt')
```
</div>

### Install `lorem` with conda in the current Jupyter kernel

```python
import sys
!conda install --yes --prefix {sys.prefix} -c conda-forge lorem
```

### Install `lorem` with pip in the current Jupyter kernel
```python
import sys
!{sys.executable} -m pip install lorem
```

In [63]:
from lorem import text

with open('sample.txt','w') as f:
    f.write(text())
    

'Dolorem porro voluptatem modi dolor. Labore velit velit ut non non eius. Amet est velit velit. Sed aliquam sit quiquia quiquia dolore quisquam. Labore porro quiquia labore. Quisquam eius velit etincidunt magnam porro quaerat. Modi quisquam magnam porro. Quaerat quisquam aliquam sed eius neque ipsum.\n\nQuaerat quaerat quaerat non amet est. Est non quaerat non eius aliquam dolorem dolore. Neque quaerat quiquia est est. Velit quisquam voluptatem non quaerat tempora. Neque porro sed modi neque modi. Ipsum adipisci neque eius ipsum velit labore. Amet non est dolore magnam. Ut etincidunt modi aliquam. Ut eius adipisci ipsum.\n\nAmet labore aliquam amet magnam ut modi adipisci. Numquam ipsum dolore non ipsum numquam. Dolorem quaerat dolore tempora. Est dolore ut quiquia consectetur. Est labore sed voluptatem consectetur porro porro.'

In [62]:
%cat sample.txt

Etincidunt tempora labore sed neque. Sit porro quisquam eius amet. Aliquam magnam quaerat aliquam. Quiquia tempora aliquam dolorem dolore aliquam dolor. Dolorem ipsum velit etincidunt sit porro magnam quisquam. Est amet dolorem quisquam aliquam adipisci numquam non. Porro amet sed est aliquam. Numquam quiquia velit quaerat sed est dolore. Sit ipsum sit sed dolor velit velit. Ut neque quiquia porro quaerat magnam adipisci.

Quisquam ut modi amet. Aliquam magnam consectetur tempora. Dolore labore dolore consectetur. Est est neque quaerat sed porro. Quisquam porro magnam magnam. Consectetur dolor modi sed. Sit sit neque tempora quaerat. Labore quisquam non adipisci adipisci. Est labore dolorem est modi quiquia quiquia. Dolorem magnam quisquam labore tempora ipsum.

Porro ipsum consectetur quiquia. Numquam est dolore quiquia dolorem labore. Neque velit tempora modi dolor eius adipisci. Magnam porro adipisci sed neque. Velit neque magnam non aliquam tempora. Modi consectetur etincidunt 

In [73]:
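def words(filename):
    with open(filename) as f:
        for line in f:
            # Each iteration overwrites `line`, so only the last line
            # of the file is left when the loop ends.
            line = line.strip().lower().replace('.','')
    return sorted(line.split(' '))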

words('sample.txt')

['adipisci',
 'adipisci',
 'adipisci',
 'adipisci',
 'adipisci',
 'adipisci',
 'aliquam',
 'aliquam',
 'aliquam',
 'aliquam',
 'amet',
 'amet',
 'consectetur',
 'consectetur',
 'dolor',
 'dolorem',
 'dolorem',
 'etincidunt',
 'etincidunt',
 'ipsum',
 'ipsum',
 'ipsum',
 'labore',
 'magnam',
 'modi',
 'neque',
 'neque',
 'neque',
 'porro',
 'porro',
 'porro',
 'porro',
 'quiquia',
 'quiquia',
 'quiquia',
 'quiquia',
 'quiquia',
 'quisquam',
 'quisquam',
 'quisquam',
 'sed',
 'sed',
 'tempora',
 'tempora',
 'tempora',
 'ut',
 'velit',
 'velit',
 'velit']

In [81]:
l = []
help(sorted)

Help on built-in function sorted in module builtins:

sorted(iterable, /, *, key=None, reverse=False)
    Return a new list containing all items from the iterable in ascending order.
    
    A custom key function can be supplied to customize the sort order, and the
    reverse flag can be set to request the result in descending order.



In [94]:
import operator

def reduce(words):
    # Assumes `words` is a non-empty, sorted list, so equal words are adjacent.
    d = {}
    current_word = words[0]
    d[current_word] = 1
    n = len(words)
    for i in range(1,n):
        if words[i] == current_word:
            d[words[i]] += 1
        else:
            d[words[i]] = 1
            current_word = words[i]
    #return sorted(d.items(), key=operator.itemgetter(1),reverse=True)
    return sorted(d.items(), key=lambda x:x[1], reverse=True)

#reduce(words('sample.txt'))
                  

In [100]:
%load_ext fortranmagic
%env['CC'] = 'gcc-7'

The fortranmagic extension is already loaded. To reload it, use:
  %reload_ext fortranmagic
env: ['CC']='gcc-7'


In [107]:
%%fortran
subroutine somme(  tab, s )
integer :: n
integer, dimension(:), intent(in) :: tab
integer, intent(out) :: s
n = size(tab)
s = 0
do i = 1, n
   s = s + tab(i)
end do
end subroutine somme
        

In [109]:
import numpy as np
tab = np.array([1,2,3,4,5])
somme( tab)

15

In [28]:
def words(file):
    """ Parse a file and return a sorted list of words """
    pass

words('sample.txt')
#[('adipisci', 1),
# ('adipisci', 1),
# ('adipisci', 1),
# ('aliquam', 1),
# ('aliquam', 1),

In [29]:
d = {}
d['word1'] = 3
d['word2'] = 2
d

{'word1': 3, 'word2': 2}

In [21]:
def reduce(words):
    """ Count the number of occurrences of each word in the list
    and return a dictionary """
    pass

reduce(words('sample.txt'))
#{'neque': 80,
# 'ut': 80,
# 'est': 76,
# 'amet': 74,
# 'magnam': 74,
# 'adipisci': 73,

<button data-toggle="collapse" data-target="#reduce" class='btn btn-primary'>Solution</button>
<div id="reduce" class="collapse">
```python
def reduce(words):
    """ Count the number of occurrences of each word in the list
    and return a dictionary """
    current_word = None
    result = {}
    for word in words:
        if current_word is None:
            current_word = word
            result[word] = 0  # Add the first word in result

        # this if only works because words output is sorted 
        if current_word == word:
            result[word] += 1
        else:
            current_word = word
            result[word] = 1
                       
    return result
reduce(words('sample.txt'))
```
</div>
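
The counts can be cross-checked with `collections.Counter` from the standard library, which does not rely on the input being sorted. This is a sketch rather than the exercise's intended solution: `count_words` is a hypothetical helper, and the sample sentence is a short excerpt of the lorem text shown earlier.

```python
from collections import Counter

def count_words(text):
    """Hypothetical helper: count words in a string, ignoring case and periods."""
    return Counter(text.lower().replace('.', '').split())

sample = "Porro ipsum consectetur quiquia. Numquam est dolore quiquia dolorem labore."
counts = count_words(sample)

print(counts["quiquia"])        # 2
print(counts.most_common(1))    # the most frequent word with its count
```

A `Counter` is a `dict` subclass, so it can be compared directly with the dictionary returned by `reduce`.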

# Saving structured data with json

- JSON (JavaScript Object Notation) is a popular data interchange format.
- JSON format is commonly used by modern applications to allow for data exchange. 
- JSON can be used to communicate with applications written in other languages.

In [30]:
import json
json.dumps([1, 'simple', 'list'])

'[1, "simple", "list"]'

In [34]:
x = dict(name="Pierre Navaro", organization="CNRS", position="IR")
with open('workfile.json','w') as f:
    json.dump(x, f)

In [35]:
with open('workfile.json','r') as f:
    x = json.load(f)
x

{'name': 'Pierre Navaro', 'organization': 'CNRS', 'position': 'IR'}

In [36]:
%cat workfile.json

{"name": "Pierre Navaro", "organization": "CNRS", "position": "IR"}
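
A round trip through JSON does not always give back exactly the same Python objects. The sketch below uses a made-up record (the `tags`, `active` and `score` fields are assumptions for the illustration) to show pretty-printing with `indent` and `sort_keys`, and how tuples come back as lists.

```python
import json

# Hypothetical record mixing several Python types
record = {"name": "Pierre Navaro", "tags": ("python", "io"),
          "active": True, "score": None}

# Pretty-printed output with a stable key order
text = json.dumps(record, indent=2, sort_keys=True)
restored = json.loads(text)

print(text)
print(restored["tags"])       # tuples are written as JSON arrays and read back as lists
print(restored == record)     # False, because the tuple became a list
```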

For big data structures, `ujson` is a faster alternative:
https://pypi.python.org/pypi/ujson
