# Introduction to Python and Jupyter notebooks

Python is a general-purpose programming language. It is very popular for data science thanks to the large ecosystem of libraries and tools built around it.

See https://www.python.org/ and https://en.wikipedia.org/wiki/Python_(programming_language)

Python can be used from the command line through a REPL, but we will use it through notebooks.

* What is a REPL? What does the acronym REPL stand for? (Try to find out using Google.)

Python uses spaces or tabs for code blocks. **By convention, we will use 4 spaces for blocks** (this is the default in Jupyter).

The Python language design is distinguished by its emphasis on readability, simplicity, and explicitness. 
Some people go so far as to liken it to “executable pseudocode”.

Python uses whitespace (tabs or spaces) to structure code instead of braces, as many other languages do.
A colon denotes the start of an indented code block; all the code in the block must be indented by the same amount until the end of the block.

One major reason that whitespace matters is that it results in most Python code looking cosmetically similar, which means less cognitive dissonance when you read a piece of code that you didn’t write yourself (or wrote in a hurry a year ago!).

In [None]:
array = [4,1,2,3,5,7,8,5]
pivot = 4
lower = []
greater = []
for x in array:
    if x < pivot:
        lower.append(x)
    else:
        greater.append(x)

Comments

In [5]:
# Comments start with a hash sign (#), and can be anywhere
a = 4
# This line is a comment

b = 5  # Inlined comment (2 spaces after code by convention)

Semicolon

In [6]:
# Several statements on the same line
a = 5; b = 6; c = 3

print(a)
print(b)
print(c) 

5
6
3


## Everything is an object

Every number, string, data structure, function, class, module, etc. exists in the Python interpreter in its own “box”, which is referred to as a Python object.

* `type` and `isinstance`
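
A minimal sketch of the two built-ins named above. The variable names are illustrative only and do not come from the notebook:

```python
# type() returns the class of an object; isinstance() checks membership in a class
n = 42
s = "hello"

type_of_n = type(n)    # the int class
type_of_s = type(s)    # the str class

# isinstance() also accepts a tuple of types
n_is_number = isinstance(n, (int, float))
s_is_number = isinstance(s, (int, float))

# Functions are objects too, so they also have a type
len_type_name = type(len).__name__

print(type_of_n, type_of_s, n_is_number, s_is_number, len_type_name)
```

`isinstance` is usually preferred over comparing `type(...)` results directly, because it also accepts subclasses.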

Objects have attributes

In [4]:
a = "hello"
# Check . with autocompletion
a.capitalize()

'Hello'

In [5]:
dir(a)  # Output of cell hidden for brevity (click the dots to get the full output)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']

## Strongly typed

Strongly typed means that Python will not silently convert objects between unrelated types (for example, a string into a number). You must do any necessary conversion yourself.

In [6]:
4 + "5"

TypeError: unsupported operand type(s) for +: 'int' and 'str'

In [9]:
a = "5"
4 + int(a)

9

## Main types

### Scalars

int, float (Python 3 has no separate `long` type: `int` has arbitrary precision)

str, bool

None

In [13]:
# Integers
x = 45**4511

In [14]:
type(x)

int

In [15]:
type(3)

int

In [19]:
isinstance(x, int)

True

In [20]:
a = 4
b = 5
a + b

9

In [21]:
a*b

20

In [22]:
a/b

0.8

In [23]:
a//b

0

In [24]:
a % b

4

In [25]:
# Float
a = 9.122
b = 28.12121

a+b


37.243210000000005

In [26]:
a*b

256.52167762

In [27]:
7*3.5

24.5

In [28]:
int(7*3.5)

24

In [29]:
# Boolean
x = True
y = False

In [30]:
true  # booleans are capitalized: True and False

NameError: name 'true' is not defined

In [31]:
x and y

False

In [32]:
x or y

True

In [33]:
not y

True

In [34]:
x & y

False

In [36]:
x | y

True

In [40]:
x

True

In [41]:
x or pepe

True

In [42]:
x and pepe  # pepe does not exist

NameError: name 'pepe' is not defined
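
The last two cells behave differently because `and` and `or` short-circuit: the right-hand operand is evaluated only when it is needed to decide the result. A minimal sketch, where `undefined_name` is a hypothetical name that is deliberately never assigned:

```python
x = True

# `or` stops at the first truthy operand, so the right-hand side is never evaluated
result_or = x or undefined_name   # no NameError: evaluation stops at x

# `and` must evaluate the right-hand side when the left one is truthy
try:
    result_and = x and undefined_name
except NameError as err:
    result_and = None
    error_message = str(err)

print(result_or)
print(error_message)
```

The operators `&` and `|` do not short-circuit: both operands are always evaluated.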

In [43]:
# Strings
x = "esto es una cadena"
y = 'esto es otra cadena'
z = "Puedo usar las 'comillas' simples sin problema"
a = 'Puedo usar las "comillas" dobles sin problema'

In [44]:
print(x)
print(y)
print(z)
print(a)

esto es una cadena
esto es otra cadena
Puedo usar las 'comillas' simples sin problema
Puedo usar las "comillas" dobles sin problema


In [45]:
x + y

'esto es una cadenaesto es otra cadena'

In [46]:
x + ", " + y

'esto es una cadena, esto es otra cadena'

In [47]:
"""
Esto es un texto,
con varias l'ineas

Y m'as todav'ia
"""

"\nEsto es un texto,\ncon varias l'ineas\n\nY m'as todav'ia\n"

In [48]:
"Esto es un texto
Con varias l'ineas"

SyntaxError: EOL while scanning string literal (<ipython-input-48-3b14886bf14a>, line 1)

In [49]:
int("5")

5

In [50]:
float(5)

5.0

In [51]:
str(5)

'5'

In [52]:
x = 7
"El resultado es " + x

TypeError: must be str, not int

In [53]:
"El resultado es " + str(x)

'El resultado es 7'

### Data structures

lists, tuples, dicts and sets

In [54]:
[4,2,1,2,5,6,6,3,3,3]

[4, 2, 1, 2, 5, 6, 6, 3, 3, 3]

In [55]:
[4,4,2.7,"hola",3,2.5]

[4, 4, 2.7, 'hola', 3, 2.5]

In [58]:
x = [4,5,3,12,2,3,6,6,7,8]

In [59]:
x[0]

4

In [60]:
x[1]

5

In [61]:
x[2:5]

[3, 12, 2]

In [62]:
y = [1,2,3]

In [63]:
x + y

[4, 5, 3, 12, 2, 3, 6, 6, 7, 8, 1, 2, 3]

In [64]:
s = "esto es una cadena"

In [65]:
s[0], s[7]

('e', ' ')

In [67]:
x + s  # s looks like a list, but it is not one

TypeError: can only concatenate list (not "str") to list

In [68]:
y.append(123)

In [69]:
y

[1, 2, 3, 123]

In [70]:
len(y)

4

In [71]:
y.reverse()

In [72]:
y

[123, 3, 2, 1]

In [73]:
y.sort()

In [74]:
y

[1, 2, 3, 123]

In [75]:
y.pop()

123

In [76]:
y

[1, 2, 3]

In [77]:
y.pop()  # Return the last element and remove it from the list

3

In [78]:
y

[1, 2]

In [79]:
?list.pop

Docstring:
L.pop([index]) -> item -- remove and return item at index (default last).
Raises IndexError if list is empty or index is out of range.
Type:      method_descriptor


In [80]:
y.pop(0)

1

In [81]:
y

[2]

In [82]:
y.pop()

2

In [83]:
y

[]

In [85]:
# Dictionaries

# Metadata about a person
d = {
    'nombre': 'Pepe',
    'apellidos': 'Pepon',
    7: [1,2,3]
}

In [86]:
d['nombre']

'Pepe'

In [87]:
d['apellidos']

'Pepon'

In [88]:
d['direccion']   # this key does not exist

KeyError: 'direccion'

In [89]:
d[7]

[1, 2, 3]

In [90]:
d['direccion'] = 'aqui hay que poner algo'

In [91]:
d

{'nombre': 'Pepe',
 'apellidos': 'Pepon',
 7: [1, 2, 3],
 'direccion': 'aqui hay que poner algo'}

In [92]:
d['direccion'] = {
    'calle': 'Soledad',
    'cod. postal': 28001
}

In [93]:
d

{'nombre': 'Pepe',
 'apellidos': 'Pepon',
 7: [1, 2, 3],
 'direccion': {'calle': 'Soledad', 'cod. postal': 28001}}

In [94]:
d.keys()

dict_keys(['nombre', 'apellidos', 7, 'direccion'])

In [95]:
d.pop(7)

[1, 2, 3]

In [96]:
d

{'nombre': 'Pepe',
 'apellidos': 'Pepon',
 'direccion': {'calle': 'Soledad', 'cod. postal': 28001}}

In [97]:
d.values()

dict_values(['Pepe', 'Pepon', {'calle': 'Soledad', 'cod. postal': 28001}])

In [99]:
# Tuples

# Lists can be modified (they are mutable)
x

[4, 5, 3, 12, 2, 3, 6, 6, 7, 8]

In [100]:
x[5] = 'me he colado'

In [101]:
x

[4, 5, 3, 12, 2, 'me he colado', 6, 6, 7, 8]

In [102]:
t = (4,2,1,2,5,2)

In [103]:
t

(4, 2, 1, 2, 5, 2)

In [104]:
len(t)

6

In [105]:
t[3]

2

In [106]:
t[0]

4

In [107]:
t[3] = 'hola'

TypeError: 'tuple' object does not support item assignment

In [108]:
(3,2,'hola',2,2)

(3, 2, 'hola', 2, 2)

In [109]:
# Sets

x

[4, 5, 3, 12, 2, 'me he colado', 6, 6, 7, 8]

In [113]:
c1 = set(x)

In [114]:
c2 = set([3,12,7,8])

In [115]:
c1 - c2

{2, 4, 5, 6, 'me he colado'}

In [117]:
c1.intersection(c2)

{3, 7, 8, 12}

## Flow control

if - elif - else

for, while loops (break and continue)

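Many other languages have a `switch` statement. Python has no `switch`; chains of `if`/`elif` are used instead (Python 3.10 added a `match` statement that covers similar cases).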
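
Besides a chain of `if`/`elif`, a dictionary lookup is a common way to get switch-like behaviour. A minimal sketch of that approach; `describe_day` and its labels are hypothetical names chosen for illustration:

```python
def describe_day(day_number):
    """Return a label for a day number, emulating a switch with a dict lookup."""
    labels = {
        1: "Monday",
        2: "Tuesday",
        3: "Wednesday",
    }
    # dict.get() with a second argument plays the role of the default branch
    return labels.get(day_number, "Unknown day")


print(describe_day(2))
print(describe_day(9))
```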

In [119]:
# if - elif - else
x = 7
y = 5
if(x==y):
    print("Son iguales")
else:
    print("No son iguales")



No son iguales


In [122]:
x = 3
y = 3
if(x>y):
    print("Es mayor")
elif(x<y):
    print("Es menor")
else:
    print("Son iguales")

Son iguales


In [125]:
# For loops
x = [12,35,6,6,2,3,3,3]

for e in x:
    print(e)

12
35
6
6
2
3
3
3


In [127]:
for e in x:
    if(e%2 == 0):
        print(e)

12
6
6
2


In [128]:
x

[12, 35, 6, 6, 2, 3, 3, 3]

In [129]:
for e in x:
    if(e == 3):
        break
        
    print(e)

12
35
6
6
2


In [130]:
for e in x:
    if(e == 6):
        continue
        
    print(e)

12
35
2
3
3
3


In [134]:
# While loops
i = 0
while i < 4:
    print(x[i])
    i += 1

12
35
6
6


## Functions

return (and None when omitted)

Positional and keyword arguments

In [135]:
def nombre(arg1, arg2, arg3):
    # do something here
    
    return arg1 + arg2 + arg3

In [136]:
nombre(4,3,2)

9

In [137]:
nombre(2)

TypeError: nombre() missing 2 required positional arguments: 'arg2' and 'arg3'

In [138]:
def escribe(s):
    print(s)

In [139]:
escribe("Hola")

Hola


In [140]:
x = escribe("Hola")

Hola


In [142]:
x is None  # idiomatic test for None (preferred over x == None)

True

In [143]:
x

In [144]:
if x:
    print("x es algo")
else:
    print("resulta que x es None")

resulta que x es None
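
Note that `if x:` tests truthiness, not only `None`: values such as `0`, `""` and empty containers also take the `else` branch. A sketch of the difference, using a hypothetical helper named `describe`:

```python
def describe(value):
    """Distinguish None from other falsy values."""
    if value is None:
        return "None"
    if not value:
        return "falsy but not None"
    return "truthy"


for candidate in [None, 0, "", [], 7, "hola"]:
    print(repr(candidate), "->", describe(candidate))
```

When a function may legitimately return `0` or an empty string, testing with `is None` avoids confusing those results with a missing value.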


In [147]:
# If we want to return an "error", it is a good idea to return None
from math import sqrt

def raiz_cuadrada(n):
    if(n>=0):
        return sqrt(n)
    else:
        return None  # or simply a bare return

In [148]:
raiz_cuadrada(7)

2.6457513110645907

In [150]:
raiz_cuadrada(-6)  # Returns None

Local variables

In [161]:
var_fuera = 5

def mi_funcion(arg1):
    var_dentro = 7
    print("Veo var dentro" + str(var_dentro))
    print("Y tambi'en var_fuera")
    print(var_fuera)

In [159]:
mi_funcion(5.6)

Veo var dentro7
Y tambi'en var_fuera
5


In [160]:
var_dentro  # This was local to the function; it does not exist in the current context

NameError: name 'var_dentro' is not defined

In [162]:
x = 7

def mi_funcion(x):
    print("Cu'anto vale x?")
    print(x)

In [163]:
x

7

In [164]:
mi_funcion(3)

Cu'anto vale x?
3


In [165]:
x

7

How to return multiple values? How to consume that output?

In [173]:
def mi_funcion(a,b,c):
    suma = a+b+c
    mul = a*b*c
    
    return suma, mul

In [174]:
mi_funcion(3,4,1)

(8, 12)

In [175]:
a, b = mi_funcion(3,4,1)

In [176]:
a

8

In [177]:
b

12

In [178]:
a = mi_funcion(3,4,1)

In [179]:
a

(8, 12)

In [180]:
# How do I ignore the values I do not want?
_, b = mi_funcion(3,4,1)

In [181]:
b

12

In [182]:
_

8

In [185]:
_,_ = mi_funcion(3,4,1)  # not an error: the same variable name can appear twice
# so we can use _ to ignore several values

In [184]:
_

12

Positional and keyword arguments

In [186]:
# By default, arguments are positional

def mi_funcion(arg1, arg2, arg3):
    return arg1+arg2+arg3

In [187]:
mi_funcion(4,5,6)  # arg1 -> 4   arg2 -> 5  arg3 -> 6

15

In [192]:
# We can also define keyword arguments with default values
def mi_funcion(arg1, nom1=1, nom2='hola'):
    print(nom1)
    print(nom2)
    return arg1

In [193]:
mi_funcion(4,5,6)

5
6


4

In [194]:
mi_funcion(4,nom2=7)

1
7


4

In [195]:
mi_funcion(4,nom1=5)

5
hola


4

In [196]:
mi_funcion(4,nom2='Hola soy nom2', nom1='Hola soy nom1')

Hola soy nom1
Hola soy nom2


4

Anonymous (lambda) functions

In [197]:
lambda x: x*2

<function __main__.<lambda>(x)>

In [198]:
f = lambda x,y: x*y

In [199]:
f(7,8)

56

Functions as objects: how to reference, how to call

* Example with map

In [202]:
x = [5,4,2,2,1,2,4,5,6,1,2]

list(map(lambda n: n*2, x))

[10, 8, 4, 4, 2, 4, 8, 10, 12, 2, 4]

In [203]:
list(map(sqrt, x))

[2.23606797749979,
 2.0,
 1.4142135623730951,
 1.4142135623730951,
 1.0,
 1.4142135623730951,
 2.0,
 2.23606797749979,
 2.449489742783178,
 1.0,
 1.4142135623730951]

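Call by value or call by reference? Python passes references to objects (sometimes called "call by sharing"): a function that mutates a mutable argument, such as a list, changes the caller's object (warning!). Rebinding the parameter name inside the function does not affect the caller.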

In [204]:
l = [1,3,5,2,5,6,67,7,2]

In [206]:
l[6]

67

In [207]:
def g(ns):
    ns[6] = 89
    return len(ns)

In [208]:
g(l)

9

In [210]:
l

[1, 3, 5, 2, 5, 6, 89, 7, 2]

In [211]:
# Can I trick Python into passing by value? Assigning to another name does not help: l2 is the same list as l

In [212]:
l = [1,3,5,2,5,6,67,7,2]
l2 = l
g(l2)
l

[1, 3, 5, 2, 5, 6, 89, 7, 2]
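
To keep the caller's list intact, one option is to pass a copy. A minimal sketch that redefines the function `g` from above so the block runs on its own; the names `original`, `alias` and `shallow` are illustrative:

```python
def g(ns):
    ns[6] = 89          # mutates the list object it receives
    return len(ns)


original = [1, 3, 5, 2, 5, 6, 67, 7, 2]

alias = original            # same object, no copy is made
shallow = original.copy()   # new list with the same elements; list(original) or original[:] also work

g(shallow)
print(original[6])          # the original is untouched
print(shallow[6])           # only the copy changed
print(alias is original, shallow is original)
```

A shallow copy duplicates only the outer list. For nested structures, `copy.deepcopy` from the standard library copies the inner objects as well.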

# Exercises

## Exercise 1

Write a function `centenario` that takes a name and a year of birth as inputs and produces a text stating the year in which the person will turn 100.

* call to function: 	`centenario("Antonio", 1967)`
* output: 	`Antonio will reach 100 years in 2067.`

The function should return a string, and it should not have any *side effect*. If the input string is empty, or the year is equal to or lower than 0, it should return `None`.

In [227]:
def centenario(nombre, anho):
    if (not nombre) or (anho <= 0):
        return None
    
    anho_cent = anho + 100
    s = nombre + " will reach 100 years in " + str(anho_cent) + "."
    return s

In [228]:
centenario("Antonio", 1967)

'Antonio will reach 100 years in 2067.'

In [229]:
centenario("", 2004)

In [230]:
centenario("Pepe", -156)

## Exercise 2

Implement a function that takes as input three variables, and returns the largest of the three.  Do this without using the Python `max()` function! Make one version without any local variable and another with one local variable.

* call to function: `max3(3,4,2)`
* output: `4`


In [270]:
def max3_1(n1, n2, n3):
    # Using max (for reference only; the exercise asks to avoid it)
    return max([n1,n2,n3])

def max3_2(n1, n2, n3):
    # With one local variable
    m = n1
    if n2>m:
        m = n2        
    if n3>m:
        m = n3        
    return m

def max3_2bis(n1, n2, n3):
    # Another way
    l = [n2, n3]
    m = n1
    for e in l:
        if e>m:
            m = e
            
    return m
    
def max3_3(n1, n2, n3):
    # Without local variables
    if n1>n2:
        if n1>n3:
            return n1
        else:
            return n3
    else:
        if n2>n3:
            return n2
        else:
            return n3

def max3_3c(n1, n2, n3):
    # Without local variables, using elif
    if n1>n2:
        if n1>n3:
            return n1
        else:
            return n3
    elif n2>n3:
        return n2
    else:
        return n3

        
def max3_3bis(n1, n2, n3):
    # Without local variables (deliberately wrong version)
    if n1>n2>n3: # WARNING: this is wrong; the comparisons should be combined with and/or as needed
        return n1
    elif n2>n3:
        return n2
    else:
        return n3

In [264]:
max3_3bis(4,5,6)

6

In [265]:
max3_3bis(4,6,3)

6

In [266]:
max3_3bis(-500,-600,-700)

-500

In [269]:
max3_3bis(3,1,2)

2

# Types in greater detail

Numeric types: arbitrary-length ints, methods, operations (integer and real division)

In [None]:
# Already covered above, in the cells on ints

Complex numbers

In [272]:
3+2j

(3+2j)

In [273]:
3+2*j  # This does not work: a bare j is read as a variable name

NameError: name 'j' is not defined

In [274]:
c = 3+2j

In [277]:
c.conjugate()

(3-2j)

In [278]:
c*c.conjugate()

(13+0j)

Strings: single and multi line, strings as collections, interpolations and formatting, common methods (find, replace, split, count, isdigit, join)

In [281]:
from math import pi

In [283]:
# Interpolation
nombre = 'Antonio'
anho = 2018


s = "%s, estamos en el anho %d, somos muy viejos y pi vale %.4f" % (nombre, anho, pi)
print(s)

Antonio, estamos en el anho 2018, somos muy viejos y pi vale 3.1416


In [284]:
# Common methods
s.find("h")

25

In [285]:
s[25]

'h'

In [287]:
s.find("m")  # Returns the position of the first occurrence

13

In [288]:
s.find("muy")

40

In [291]:
s.replace("m", "XXX")

'Antonio, estaXXXos en el anho 2018, soXXXos XXXuy viejos y pi vale 3.1416'

In [292]:
s.replace("am","PEPEPEPE")

'Antonio, estPEPEPEPEos en el anho 2018, somos muy viejos y pi vale 3.1416'

In [293]:
# The variable s has not changed (replace returns a new string)
s

'Antonio, estamos en el anho 2018, somos muy viejos y pi vale 3.1416'

In [294]:
s.split()

['Antonio,',
 'estamos',
 'en',
 'el',
 'anho',
 '2018,',
 'somos',
 'muy',
 'viejos',
 'y',
 'pi',
 'vale',
 '3.1416']

In [295]:
s.split(",")

['Antonio', ' estamos en el anho 2018', ' somos muy viejos y pi vale 3.1416']

In [298]:
s = "4,5,1,2,6,7,78,8,2"
s.split(",")

['4', '5', '1', '2', '6', '7', '78', '8', '2']
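
The remaining methods listed above (`count`, `isdigit`, `join`) and two alternatives to `%` interpolation can be sketched as follows. The variable names are illustrative, and f-strings assume Python 3.6 or later:

```python
s = "4,5,1,2,6,7,78,8,2"
parts = s.split(",")

n_commas = s.count(",")                       # number of occurrences of a substring
all_digits = all(p.isdigit() for p in parts)  # isdigit() is True for strings made only of digits
joined = "-".join(parts)                      # join() is the inverse of split()

# Two alternatives to %-interpolation: str.format() and f-strings
total = sum(int(p) for p in parts)
msg_format = "{} numbers, total {}".format(len(parts), total)
msg_fstring = f"{len(parts)} numbers, total {total}"

print(n_commas, all_digits, joined)
print(msg_format)
print(msg_fstring)
```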

Booleans and boolean operators

In [None]:
# Already covered above

Common type conversions: int, float, str, bool

In [299]:
# Already covered above
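
As a compact recap of the conversions named above, here is a small sketch. The values are arbitrary examples, not taken from the notebook:

```python
# Conversions between the basic scalar types
as_int = int("42")          # str -> int
as_float = float("3.5")     # str -> float
truncated = int(3.99)       # float -> int truncates toward zero
as_str = str(3.5)           # number -> str

# bool(): zero, empty strings and empty containers are False, everything else is True
truthiness = [bool(v) for v in (0, 1, "", "False", [], [0])]

# A string that does not look like an integer cannot be converted with int()
try:
    int("3.5")
except ValueError as err:
    conversion_error = str(err)

print(as_int, as_float, truncated, as_str)
print(truthiness)
print(conversion_error)
```

Note that `bool("False")` is `True`, because any non-empty string is truthy.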

# Exercises

## Exercise 3

Write a function that computes the number of lines, the number of words, and the length of a string, similar to what the `wc` command does on the command line.

In [309]:
def wc(s):
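    nc = len(s.replace(" ", "").replace("\n", ""))  # characters, excluding spaces and newlines
    nw = len(s.split())  # words separated by any whitespace
    nl = len(s.split('\n'))  # lines; note that an empty string counts as one line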
    
    return (nl, nw, nc)

In [310]:
x = """Varias palabras en el mismo texto 

Varias l'ineas, algunas vac'ias incluso

Ya est'a"""

In [311]:
wc(x)

(5, 13, 70)

In [312]:
wc("")

(1, 0, 0)

## Exercise 4

Write a Python function that removes the character at index n from a string. If the input string is empty, return None.

In [368]:
def remove_n(s, n):
    if (not s) or (n<0):  # checking n <= len(s) is redundant
        return None
    
    # Characters before position n, then characters after it
    s1 = s[0:n]
    s2 = s[n+1:]
    
    return s1+s2

def remove_n_2(s, n):
    if (not s) or (n<0):  # checking n <= len(s) is redundant
        return None

    s2 = ""
    i = -1
    
    for c in s:
        i += 1
        if(i==n):
            continue
        s2 = s2 + c
        
    return s2

In [369]:
s

'4,5,1,2,6,7,78,8,2'

In [370]:
remove_n("", 5)

In [371]:
remove_n(s, -7)

In [372]:
remove_n_2(s,0)

',5,1,2,6,7,78,8,2'

In [373]:
remove_n_2(s,len(s)-1)

'4,5,1,2,6,7,78,8,'

In [374]:
remove_n_2(s,5)

'4,5,12,6,7,78,8,2'

In [375]:
remove_n_2(s,123123)

'4,5,1,2,6,7,78,8,2'

In [338]:
s[12312323:]

''

In [342]:
s[123131232:len(s)]

''

# Data structures in greater detail

## Lists

* Create lists, empty lists. What can a list contain?
* Indexing
* Common methods: append, extend, insert, remove, pop
* Operations with lists: concatenate, sort, reverse
* Operators: in

In [1]:
x = [1,2,3,5,12,2,5,5,23,3]

In [2]:
x.append('cualquier cosa')

In [3]:
x

[1, 2, 3, 5, 12, 2, 5, 5, 23, 3, 'cualquier cosa']

In [4]:
x + 'otra cosa'

TypeError: can only concatenate list (not "str") to list

In [5]:
x + ['otra cosa']

[1, 2, 3, 5, 12, 2, 5, 5, 23, 3, 'cualquier cosa', 'otra cosa']

In [6]:
?x.extend

Docstring: L.extend(iterable) -> None -- extend list by appending elements from the iterable
Type:      builtin_function_or_method


In [7]:
help(x.extend)

Help on built-in function extend:

extend(...) method of builtins.list instance
    L.extend(iterable) -> None -- extend list by appending elements from the iterable



In [8]:
x.extend([67,454,2])

In [9]:
x

[1, 2, 3, 5, 12, 2, 5, 5, 23, 3, 'cualquier cosa', 67, 454, 2]

In [10]:
x + [785,2,3,4,1,1]

[1,
 2,
 3,
 5,
 12,
 2,
 5,
 5,
 23,
 3,
 'cualquier cosa',
 67,
 454,
 2,
 785,
 2,
 3,
 4,
 1,
 1]

In [11]:
?x.insert

Docstring: L.insert(index, object) -- insert object before index
Type:      builtin_function_or_method


In [12]:
x = x[1:5]

In [13]:
x

[2, 3, 5, 12]

In [14]:
x.insert(2,'un elemento')

In [15]:
x

[2, 3, 'un elemento', 5, 12]

In [16]:
x.remove(2)

In [17]:
x

[3, 'un elemento', 5, 12]

In [19]:
?x.pop()

Docstring:
L.pop([index]) -> item -- remove and return item at index (default last).
Raises IndexError if list is empty or index is out of range.
Type:      builtin_function_or_method


In [21]:
x.pop()  # removes the last element

12

In [23]:
[].pop()

IndexError: pop from empty list

In [24]:
l = [12,3,3,5,6,6,7]
while l:
    print(l.pop())

7
6
6
5
3
3
12
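
The operations listed at the start of this section that are not exercised above (sorting, reversing, the `in` operator) can be sketched as follows. The list `numbers` is an arbitrary example:

```python
numbers = [12, 3, 3, 5, 6, 6, 7]

# sorted() returns a new list; list.sort() sorts in place and returns None
ascending = sorted(numbers)
descending = sorted(numbers, reverse=True)

in_place = numbers.copy()
in_place.reverse()              # reverses the copy in place

# `in` tests membership by comparing element by element
has_five = 5 in numbers
has_ten = 10 in numbers

# Concatenation with + builds a new list and leaves both operands unchanged
combined = numbers + [100]

print(ascending, descending, in_place, has_five, has_ten, combined)
```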


List comprehensions

In [25]:
s = "5,4,1,2,5,5,2,4,5,6,6"
s.split(",")

['5', '4', '1', '2', '5', '5', '2', '4', '5', '6', '6']

In [26]:
# Not recommended
l = []
for n in s.split(","):
    if n.isdigit():
        l.append(int(n))
        
l

[5, 4, 1, 2, 5, 5, 2, 4, 5, 6, 6]

In [27]:
[int(n) for n in s.split(",")]

[5, 4, 1, 2, 5, 5, 2, 4, 5, 6, 6]

In [28]:
[int(n) for n in s.split(",") if n.isdigit()]

[5, 4, 1, 2, 5, 5, 2, 4, 5, 6, 6]

## Tuples

Immutable objects that can be indexed. Operations: concatenation, multiplication.

In [30]:
a = (1,2)
b = (3,4)

a + b

(1, 2, 3, 4)

In [31]:
a = a + b
a

(1, 2, 3, 4)

In [32]:
5*(1,2)

(1, 2, 1, 2, 1, 2, 1, 2, 1, 2)

In [59]:
x 

[1, 2, 3, 4, 5, 6, 7, 8]

In [60]:
27 in x

False

In [61]:
5 in x

True

## Dictionaries

Also called maps: each key is associated with a value (key -> value).

In [33]:
d = {
    'nombre': 'Pepe',
    'edad': 47,
    'hijos': ['Pepa', 'Pepito']
}

In [34]:
d

{'nombre': 'Pepe', 'edad': 47, 'hijos': ['Pepa', 'Pepito']}

In [35]:
d.keys()

dict_keys(['nombre', 'edad', 'hijos'])

In [36]:
d.values()

dict_values(['Pepe', 47, ['Pepa', 'Pepito']])

How to iterate over a dictionary with a `for` loop, how to merge dictionaries

In [38]:
for e in d:
    print("Clave %s  -> Valor %s" % (e, str(d[e])))

Clave nombre  -> Valor Pepe
Clave edad  -> Valor 47
Clave hijos  -> Valor ['Pepa', 'Pepito']
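
Two related idioms, sketched under the assumption of Python 3.5 or later (needed for the `{**a, **b}` syntax): iterating over key/value pairs with `items()`, and merging dictionaries. The dictionary `extra` and its values are hypothetical:

```python
d = {"nombre": "Pepe", "edad": 47}
extra = {"edad": 48, "ciudad": "Madrid"}   # hypothetical second dictionary

# items() yields (key, value) pairs, which avoids the d[key] lookup
pairs = [(key, value) for key, value in d.items()]

# Merge into a new dict: on duplicate keys the right-most value wins
merged = {**d, **extra}

# update() merges in place instead
in_place = dict(d)      # copy first, to keep d unchanged
in_place.update(extra)

print(pairs)
print(merged)
print(in_place == merged)
```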


Dict comprehensions

In [39]:
[e for e in d]

['nombre', 'edad', 'hijos']

In [40]:
[d[e] for e in d]

['Pepe', 47, ['Pepa', 'Pepito']]
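
The two cells above are list comprehensions that iterate over a dictionary. A comprehension that builds a dictionary uses braces and a `key: value` expression. A minimal sketch with illustrative names:

```python
d = {"nombre": "Pepe", "edad": 47, "hijos": ["Pepa", "Pepito"]}

# {key_expr: value_expr for ... in ...} builds a new dict
upper_keys = {k.upper(): v for k, v in d.items()}

# A filter clause works as in list comprehensions
only_strings = {k: v for k, v in d.items() if isinstance(v, str)}

# Build a dict from a list: name -> length
lengths = {name: len(name) for name in d["hijos"]}

print(upper_keys)
print(only_strings)
print(lengths)
```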

## Sets

union, intersection, difference

In [41]:
a = set([4,5,4,2,5,6,7])
b = set([4,5])

In [43]:
c = set(5,4,4,4)  # error: set() takes a single iterable, such as a list

TypeError: set expected at most 1 arguments, got 4

In [44]:
a

{2, 4, 5, 6, 7}

In [45]:
b

{4, 5}

In [46]:
a - b

{2, 6, 7}

In [47]:
b - a

set()

In [48]:
a.intersection(b)

{4, 5}

In [49]:
b.intersection(a)

{4, 5}

In [50]:
a.union(b)

{2, 4, 5, 6, 7}
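
Sets also support operator forms of these methods, as well as set literals written with braces. A short sketch; the sets here are arbitrary examples:

```python
a = {4, 5, 2, 6, 7}     # set literal; duplicates would be dropped
b = {4, 5, 9}

union = a | b                   # same as a.union(b)
intersection = a & b            # same as a.intersection(b)
difference = a - b              # same as a.difference(b)
symmetric = a ^ b               # elements in exactly one of the two sets

is_subset = {4, 5} <= a         # subset test

print(union, intersection, difference, symmetric, is_subset)
```

Note that `{}` creates an empty dict, not an empty set; an empty set is written `set()`.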

# Exercises

## Some (maybe not so) quick exercises

* For the sequence `[1, 2, 3, 4, 5, 6, 7, 8]`, get the squared values using a lambda function.
* Prepare a list with 10 names. Write code that converts all vowels to upper case and every other character to lower case.
* Prepare again a list with 10 names. Write a function with two input variables, a list and a character, that returns the list of names containing the input character one or more times.
  * (continuing the previous exercise) Reverse the order of the names in the resulting list.
* Write a Python function that takes a list of words and returns the length of the longest one.


In [58]:
x = list(range(1,9))
[n**2 for n in x]

f = lambda n: n**2
[f(n) for n in x]

list(map(lambda n: n**2, x))

[1, 4, 9, 16, 25, 36, 49, 64]

In [70]:
names = ['Pepe','Alberto','Ursula']
vowels = 'aeiou'
names_transf = []

for n in names:
    nr = ""
    for l in n:
        if l.lower() in vowels:
            nr += l.upper()
        else:
            nr += l.lower()
    names_transf.append(nr)
names_transf

['pEpE', 'AlbErtO', 'UrsUlA']

In [74]:
def filter_names(ns, ch):
    return [n for n in ns if ch.lower() in n.lower()]

In [75]:
filter_names(names, 'e')

['Pepe', 'Alberto']

In [76]:
filter_names(names, 'a')

['Alberto', 'Ursula']

In [99]:
def filter_and_rev_names(ns, ch):
    results = []
    #ns.reverse()  -> Not this, because it modifies the input argument (passed by reference)
    for n in ns:
        if ch.lower() in n.lower():
            results.append(n)
    
    results.reverse()   # This is fine, because it does not modify the input arguments
    
    return results

In [100]:
filter_and_rev_names(names, 'a')

['Ursula', 'Alberto']

In [101]:
def longest_word_length(ws):
    lengths = [len(w) for w in ws]
    return max(lengths)

In [102]:
longest_word_length(names)

7

In [103]:
# Proposed exercise -> return the longest word, not its length
# Another exercise -> return both the word and its length
# (if several words have the same length, return only one of them)

## More exercises

* Categorize a list of words by their first letter: the result maps each first letter to all the words from the input list that start with that letter. (hint: use dict)
* Sort a collection of strings by the number of distinct letters in each string. (hint: use set and lambda)
* Reverse the word order of the input string using a comprehension.


In [110]:
def categorize_words(ws):
    d = {}
    for w in ws:
        fc = w[0].lower()        
        if fc in d.keys():
            l = d[fc]
            l.append(w)
        else:
            d[fc] = [w]
            
    return d

In [111]:
categorize_words(names + ["Ambrosio"])

{'p': ['Pepe'], 'a': ['Alberto', 'Ambrosio'], 'u': ['Ursula']}

In [116]:
p = names[0]
len(set(p.lower()))

2

In [125]:
lengths = [len(set(w.lower())) for w in names]
lengths.sort()
lengths

[2, 5, 7]

In [154]:
def num_diff_chars(w):
    return len(set(w.lower()))

def categorize(f, ws):
    # Dictionary of words
    d = {}
    for w in ws:
        k = f(w)
        if k in d.keys():
            l = d[k]
            l.append(w)
        else:
            d[k] = [w]

    return d

def sort_words(ws):
    lengths = [num_diff_chars(w) for w in ws]
    lengths = list(set(lengths))
    lengths.sort() 
    
    d = categorize(num_diff_chars, ws)
    
    sorted_words = []
    for length in lengths:
        words = d[length]  # e.g. ['Ursula', 'Antonio']
        for w in words:
            sorted_words.append(w)
    
    return sorted_words

In [155]:
sort_words(names)

['Pepe', 'Ursula', 'Antonio', 'Alberto']
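
The hint in the exercise (set and lambda) suggests a shorter route through the built-in `sorted` with a `key` function. A minimal sketch; the list `names` is redefined here as an assumption, so the block runs on its own:

```python
names = ["Pepe", "Alberto", "Ursula", "Antonio"]   # assumed input list

# The key function maps each word to its number of distinct letters;
# sorted() is stable, so words with equal keys keep their input order
sorted_names = sorted(names, key=lambda w: len(set(w.lower())))

print(sorted_names)
```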

In [156]:
# Repeat the previous exercise, reusing the generic function we have just written
def categorize_words_2(ws):
    return categorize(lambda w: w[0].lower(), ws)

In [157]:
categorize_words(names)

{'p': ['Pepe'], 'a': ['Alberto', 'Antonio'], 'u': ['Ursula']}

In [158]:
categorize_words_2(names)

{'p': ['Pepe'], 'a': ['Alberto', 'Antonio'], 'u': ['Ursula']}

In [162]:
assert categorize_words(names) == categorize_words_2(names)

In [170]:
# Can this be done with a comprehension?
w = "Hola que tal"
rw = list(w)
rw.reverse()
rw = "".join(rw)
rw

'lat euq aloH'

In [182]:
def reverse_string(w):
    rw = ""    
    for c in w:
        rw = c + rw
    return rw

def reverse_with_pop(w):
    lw = list(w)
    rw = ""
    while len(lw) > 0:
        rw = rw + lw.pop()
    return rw

In [174]:
reverse_string("Hola")

'aloH'

In [177]:
reverse_with_pop("Hola")

'aloH'

In [181]:
w = "Hola que tal"
lw = list(w)
"".join([lw.pop() for c in w]) 

'lat euq aloH'
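
The cells above reverse the characters of the string. If the exercise is read as reversing the order of the words instead, a comprehension over descending indices is one option, and slicing with a negative step is a shorter alternative. A sketch with illustrative names:

```python
w = "Hola que tal"
words = w.split()

# Reverse the order of the words with a comprehension over descending indices
reversed_words = " ".join([words[i] for i in range(len(words) - 1, -1, -1)])

# Slicing with a negative step does the same, for words and for characters
reversed_words_slice = " ".join(words[::-1])
reversed_chars = w[::-1]

print(reversed_words)
print(reversed_chars)
```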

# Shell commands and magic in notebooks cells

If a line starts with the `!` symbol, it is executed as a shell command instead of Python code.

In [190]:
!pwd 

/home/dsc/notebooks


In [191]:
!ls -hl

total 116K
-rw-rw-r-- 1 dsc dsc 112K may 12 12:33 05_1_intro_python.ipynb
-rw-r--r-- 1 dsc dsc 2,2K may 11 17:32 Un notebook de prueba para ver c'omo funciona Jupyter.ipynb


In [192]:
!ls ~ 

anaconda3  Downloads	 pasword_github.txt  Repos	     Videos
Data	   metastore_db  Pictures	     Templates
Desktop    Music	 Public		     Untitled.ipynb
Documents  notebooks	 R		     untitled.txt


In [194]:
!ls ~/Data/

airline_tickets  challenge  opentraveldata  README.md  shell  us_dot


In [195]:
!head -5 ~/Data/README.md

# Data
This directory contains some samples of data that will be used for the sessions in the master.


Cells accept *magic* commands that perform special tasks, for instance measuring the time it takes to run a cell. The `%lsmagic` command lists the available magics.

In [196]:
%lsmagic

Available line magics:
%alias  %alias_magic  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %popd  %pprint  %precision  %profile  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%js  %%latex  %%markdown  %%perl  %%prun  %%pypy  %%python  %%python

In [199]:
%%time

categorize_words(names)

CPU times: user 6 µs, sys: 1e+03 ns, total: 7 µs
Wall time: 9.78 µs


{'p': ['Pepe'], 'a': ['Alberto', 'Antonio'], 'u': ['Ursula']}

In [204]:
%timeit categorize_words(names) 
%timeit categorize_words_2(names)

1.23 µs ± 13.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
1.67 µs ± 29.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
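
The `%time` and `%timeit` magics only work inside IPython or Jupyter. In a plain Python script, the standard-library `timeit` module gives comparable measurements. A minimal sketch; the `categorize_words` defined here is a simplified stand-in written for this example, not the notebook's version:

```python
import timeit


def categorize_words(ws):
    """Simplified stand-in: group words by their lower-cased first letter."""
    d = {}
    for w in ws:
        d.setdefault(w[0].lower(), []).append(w)
    return d


names = ["Pepe", "Alberto", "Ursula"]
n_calls = 10000

# timeit.timeit() returns the total time in seconds for `number` calls
total_seconds = timeit.timeit(lambda: categorize_words(names), number=n_calls)
per_call_us = total_seconds / n_calls * 1e6

print(f"{per_call_us:.2f} µs per call")
```

The absolute numbers depend on the machine, so they are only meaningful when comparing alternatives measured in the same session.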


# Files

We can read data from files stored on the hard drive. Data can in fact come from anywhere, but with the Python standard library we will focus only on local files.

File open modes:

* `r` Read-only mode
* `w` Write-only mode. Creates a new file (deleting any file with the same name)
* `a` Append to existing file (create it if it does not exist)
* `r+` Read and write
* `b` Add to mode for binary files, that is `rb` or `wb`

Common functions: read, **readlines**, write, **writelines**, **close**, flush, seek, tell, closed
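
The modes and functions above can be sketched as follows. The example writes to a temporary directory so no real file is touched; the file name `example.txt` and its contents are hypothetical:

```python
import os
import tempfile

# Work in a temporary directory that is removed at the end of the block
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.txt")   # hypothetical file name

    # "w" creates (or truncates) the file; `with` closes it automatically
    with open(path, "w") as f:
        f.writelines(["first line\n", "second line\n"])

    # "a" appends at the end of the existing file
    with open(path, "a") as f:
        f.write("third line\n")

    # "r" reads; readlines() keeps the trailing newline of each line
    with open(path, "r") as f:
        lines = f.readlines()
        closed_inside = f.closed

    closed_after = f.closed

stripped = [line.rstrip() for line in lines]
print(stripped)
print(closed_inside, closed_after)
```

Using `with` is a design choice that makes an explicit `close()` unnecessary: the file is closed when the block ends, even if an error occurs inside it.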

# Exercises

* Create a function that accepts a search string and returns the number of lines containing that string in a command history (hint: use `a in b`).
* List the files in `~/Data/opentraveldata/`.
* Repeat the same for `/home/dsc/Data/us_dot/otp` and `~/Data/us_dot/traffic/`.
* Write a function that takes a text filename and a pattern as input parameters, and returns the number of case-insensitive occurrences of the pattern inside the text (similar to: `grep -i -o pattern file | wc -l`).
* Open `Finn.txt` and read its lines into a list. Remove trailing white space from each line. Write the resulting list to a new file. How many lines does the new file have? (hint: an empty list is made with `[]`)
* Open `Finn.txt` and read its lines into a list. Create a new version, `Finn_nbl.txt`, with no blank lines.
* Obtain the difference in the number of lines between the original Finn file and the one without blank lines, and print the result (with Python code).