# Character strings

Strings behave like lists of characters and are therefore iterable, so the slicing and indexing operations introduced previously apply to them. As seen previously, strings can be defined using single or double quote characters:


In [1]:
fruit = "banana"

In [2]:
colombian_sweet = 'bocadillo'

The common operators `+` and `*` give the following results when applied to strings:

| Operator | Input types | Result                          |
| -------- | ----------- | ------------------------------- |
| `+`      | `str + str` | Concatenates both strings       |
| `*`      | `str * int` | Concatenates `int` copies of `str` |

For instance, we can apply these operations to our data:

In [3]:
fruit + colombian_sweet

'bananabocadillo'

In [4]:
fruit * 3

'bananabananabanana'

In [5]:
colombian_sweet[0]

'b'

In [6]:
colombian_sweet[:7]

'bocadil'

However, unlike lists, strings are immutable, which means that index-based assignment cannot be performed:

In [7]:
fruit[2] = 'z'

TypeError: 'str' object does not support item assignment
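Since item assignment is not available, a modified string has to be built as a new object. The following is a minimal sketch, assuming we want to replace the character at index 2; the name `patched` is only illustrative:

```python
fruit = "banana"

# Build a new string from the parts before and after index 2
patched = fruit[:2] + "z" + fruit[3:]

print(patched)  # bazana
print(fruit)    # banana (the original string is unchanged)
```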

Strings define many useful methods. To list all of them, we can type the variable name followed by a dot and then press <kbd>Tab</kbd>. For instance, let's list all the operations available on our `fruit` variable and pick one of them:

In [10]:
fruit.upper()

'BANANA'
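Outside an interactive session with <kbd>Tab</kbd> completion, the built-in `dir` function offers a similar listing. A small sketch (the name `public_methods` is illustrative, and filtering out names that start with an underscore is just one reasonable choice):

```python
fruit = "banana"

# dir() returns every attribute name; keep only the public ones
public_methods = [name for name in dir(fruit) if not name.startswith("_")]

print(len(public_methods))
print("upper" in public_methods)  # True
```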

**Note**: All these methods return a new string each time, since strings are immutable.
Here are some of the most important string manipulation methods available:

* **upper**: Converts all characters to uppercase

In [11]:
fruit.upper()

'BANANA'

* **count**: Counts the number of occurrences of a substring

In [12]:
fruit.count('a')

3

* **replace**: Replaces every occurrence of a substring with another string

In [15]:
# replace returns a new string; fruit itself is left unchanged
fruit.replace('a', 'o')
fruit

'banana'

In [16]:
fruit.replace('ban', 'en')

'enana'
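To keep the result of `replace`, it has to be assigned to a name. The sketch below also uses the optional third argument of `str.replace`, which limits the number of replacements; the name `fruit_with_o` is illustrative:

```python
fruit = "banana"

# Store the new string under its own name to keep it
fruit_with_o = fruit.replace("a", "o")
print(fruit_with_o)  # bonono

# The optional third argument limits how many matches are replaced
print(fruit.replace("a", "o", 1))  # bonana
```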

* **split**: Splits a string on whitespace and returns a list containing the resulting words.

In [17]:
s = "Hello, world! hello world"

In [18]:
s.split()

['Hello,', 'world!', 'hello', 'world']

It can also split a string using another string as the separator:

In [19]:
colombian_sweet.split('ca')

['bo', 'dillo']
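The counterpart of `split` is `join`, which is called on the separator and glues a list of strings back together. A minimal sketch (the names `words` and `rejoined` are illustrative):

```python
s = "Hello, world! hello world"
words = s.split()

# join is called on the separator and takes the list of pieces
rejoined = " ".join(words)
print(rejoined)         # Hello, world! hello world
print("-".join(words))  # Hello,-world!-hello-world
```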

* **rstrip:** Removes any trailing whitespace characters (including newlines) from the end of a string

In [20]:
s = 'I want my line without any newline\n'
s.rstrip()

'I want my line without any newline'

It also works with Windows line endings (CRLF):

In [28]:
s = 'Windows is the original sin  \r\n'
s.rstrip()
mipalabra = "dulce "*49+"dulce"
mipalabra


'dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce dulce'

## Problemas

### Problema 1

Tomar la variable `dulce`, hacer que se repita 50 veces, y separar las palabras con un espacio, de tal forma que obtengamos algo como lo siguiente, pero **sin** generar un espacio al final.

        'bocadillo bocadillo ...'

mipalabra = "dulce "*49+"dulce"
mipalabra
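One alternative is `join`, which only places the separator between items and therefore never leaves a trailing space. This is a sketch, assuming the word to repeat is the `'bocadillo'` string stored in `colombian_sweet` earlier; the name `repeated` is illustrative:

```python
colombian_sweet = "bocadillo"

# A list with 50 copies of the word, joined by single spaces
repeated = " ".join([colombian_sweet] * 50)

print(repeated[:29])           # bocadillo bocadillo bocadillo
print(repeated.endswith(" "))  # False
```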

### Problem 2

How many times does the word `banano` appear in the following string?

In [36]:
muchas_frutas = 'banAnobanAnobananobanaNobananobananobanaNobaNanobanano\
bananobananobaNanobananobananobaNanobAnanobananobananobanaNobananobanAno\
bananobananobanaNobananobananobananobananobananobananobananobananobAnAno\
bAnanobananobananobananobananobananobanANobananobananobanaNobananobanano\
bananobanaNobAnAnobananobananobananobananobananobAnAnobananobananobanano\
baNanobananobananobaNaNobananobANanobananobananobananobAnanobananobanano\
bananobananobAnanobananobaNAnobananobananobananobaNanobanaNobANanobanano\
baNanobananobananobAnanobananobananobananobaNAnobananobanANobananobAnano\
bANanobanAnobananobaNanobananobananobananobananobananobananobAnanobanano\
bananobanAnobananobananobanAnobananobananobananobanAnobananobananobaNano\
bAnanobananobAnanobaNanobananobanaNobananobananobanANobananobananobANAno\
bananobananobaNAnobanaNobAnanobanAnobananobananobanAnobaNanobananobanaNo\
banaNobANAnobananobananobanAnobananobananobanANobananobanAnobananobanano\
banaNobananobAnanobananobAnanobananobanANobananobananobanAnobanaNobanano\
bananobAnanobananobaNanobananobanANobananobananobananobaNAnobananobanAno\
bananobananobananobaNanobananobananobanAnobananobananobANanobananobanano\
bananobananobaNanobananobananobananobAnanobananobananobananobananobanano\
bananobanANobananobanaNobAnanobananobaNanobaNAnobananobananobananobanano\
bananobananobananobananobananobAnanobanaNobananobananobaNAnobananobanANo\
bananobanaNobananobananobananobananobananobaNanobananobanaNobanAnobanAno\
bananobanAno'




*Answer*:

    150

In [38]:
muchas_frutas.count("banano")

150

### Problem 3

How many times does `banano` appear in the previous string, regardless of whether some of its letters are uppercase?

*Answer*:

    239

In [40]:
mf=muchas_frutas.upper()
mf.count("BANANO")


239

### Problem 4

What does the `center` method produce?

Experiment with the following commands to see what it produces:

In [42]:
dulce="manzana"
dulce.center(2)

'manzana'

In [43]:
dulce.center(10)

' manzana  '

In [44]:
dulce.center(16)

'    manzana     '

In [45]:
dulce.center(30)

'           manzana            '
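The padding becomes easier to see with a visible fill character, which `center` accepts as an optional second argument. The sketch below also shows the related `ljust` and `rjust` methods for comparison:

```python
dulce = "manzana"

# The optional second argument is the fill character (a space by default)
print(dulce.center(11, "*"))  # **manzana**
print(dulce.ljust(10, "."))   # manzana...
print(dulce.rjust(10, "."))   # ...manzana
```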

# Tuples

Tuples are immutable, ordered sequences of objects. Like strings, they support the common iterable operations such as slicing and indexing; however, we cannot modify any of their values.

We define tuples using round parentheses instead of square brackets:

In [47]:
tp = (1, 2, 3, 4, 'a')
tp

(1, 2, 3, 4, 'a')

In [48]:
tp[3]

4

In [49]:
tp[-1]

'a'

In [50]:
tp[2:]

(3, 4, 'a')

Remember, tuples are immutable:

In [56]:
tp[4] = 'b'

TypeError: 'tuple' object does not support item assignment

**Note**: We can define a tuple without enclosing its elements in parentheses, a common practice among Python programmers:

In [57]:
tp1 = 'a', 'b', 2
tp1

('a', 'b', 2)

# Iterable unpacking

In Python we can "unpack" the contents of an iterable and assign them to variables using the notation ``var1, var2, ..., varN = iterable``. For this to work, the iterable must contain exactly ``N`` elements; otherwise a ``ValueError`` is raised:

In [58]:
# Unpack a list
a, b, c = [2, 3, 4]

print(a, b, c)

2 3 4


In [60]:
# Unpack a string
a, b, c = '123'
print(a, b, c)

1 2 3


In [61]:
# Unpack a tuple

a, b, c = 4, -2, 1+9j
print(a, b, c)

4 -2 (1+9j)


In [62]:
# Fewer elements than variables:
a, b, c = [2, 3]

ValueError: not enough values to unpack (expected 3, got 2)

In [63]:
# Fewer variables than elements
a, b = [4, 5, 6] 

ValueError: too many values to unpack (expected 2)

In [65]:
# Some tricks!
a, b, c = [3, 4, 'jasdlkasd']
print(a, b, c)

3 4 jasdlkasd
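When the number of elements is not known in advance, Python 3 also allows one starred name in the target list to collect the remaining items into a list. This goes slightly beyond the notation described above, so treat it as an optional sketch; the names `first`, `rest`, `head`, `middle` and `tail` are illustrative:

```python
# The starred name collects whatever is left over, as a list
first, *rest = [4, 5, 6]
print(first, rest)  # 4 [5, 6]

# It also works in the middle, and on any iterable such as a string
head, *middle, tail = "abcde"
print(head, middle, tail)  # a ['b', 'c', 'd'] e
```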


## Problems

### Problem 1

Is it possible to compute the average of the values in the following tuple?

In [66]:
li = (3, 18, 17, 44, 14, 12, 29, 19, 4, 6, 17, 7, 14, 6, 8, 17, 17, 21, 65,\
      19, 10, 31, 92, 17, 5, 15, 3, 14, 20, 12, 29, 57, 15, 2, 17, 1, 6, 17, 2,\
      71, 12, 11, 62, 14, 9, 20, 43, 19, 4, 15)

In [67]:
# Write the solution here
# sum and len accept tuples directly, so no conversion to list is needed
sum(li) / len(li)

20.04

### Problem 2

Create a tuple that has a single element

In [90]:
# Write the solution here
a=(3, 5, 'rojo')
max(a)

TypeError: unorderable types: str() > int()
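The cell above does not actually build a single-element tuple. One way to do it, shown here as a minimal sketch with illustrative names, relies on the fact that it is the comma, not the parentheses, that makes a tuple:

```python
single = (3,)        # the trailing comma makes this a tuple
not_a_tuple = (3)    # plain parentheses: this is just the integer 3
also_single = 3,     # parentheses are optional, the comma is not

print(type(single), len(single))  # <class 'tuple'> 1
print(type(not_a_tuple))          # <class 'int'>
print(single == also_single)      # True
```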

### Problem 3

What effect does this operation have

In [83]:
x, y, z = tp1
tp1

('a', 'b', 2)

given the value of `tp1` defined above?

In [84]:
# Get the values of x, y, z here
print(x,y,z)

a b 2


With this in mind, explain what happens when this operation is performed on the elements of a list

In [85]:
l = [-1, 6, 7, 9]
l[0], l[2] = l[2], l[0]

In [86]:
# Print the list l here
print(l)

[7, 6, -1, 9]
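The swap works because the right-hand side is evaluated first, producing a tuple, which is then unpacked into the two targets on the left. The sketch below spells out those two steps with an illustrative intermediate name, `packed`:

```python
l = [-1, 6, 7, 9]

# Step 1: the right-hand side is packed into a tuple
packed = (l[2], l[0])
print(packed)  # (7, -1)

# Step 2: the tuple is unpacked into the two list positions
l[0], l[2] = packed
print(l)  # [7, 6, -1, 9]
```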


### Problem 4

Why, in contrast, does this operation fail?

In [87]:
u, v = tp1

ValueError: too many values to unpack (expected 2)

### Problem 5

How is the maximum of a tuple computed?

In [88]:
# Write the solution here
max(li)

92

# Dictionaries

Dictionaries are one of the most used data structures in Python, if not the most used. Until now, we have seen iterable data types that are indexed using integers. Instead of integer indices, dictionaries allow any **immutable** (strictly speaking, hashable) object to be used as an index. This creates an associative array in which a value object is retrieved using a key object.

For instance, suppose we would like to store a list of users and their passwords, such that given a username, we can obtain the corresponding password.

To accomplish this task, we can use a dictionary, as follows:

In [91]:
codes = {'Luis': 2257, 'Juan': 9739, 'Carlos': 5591}

As we can see, dictionaries are defined using curly braces, with a colon separating each key from its value and commas separating the pairs: ``{k: v}``

To retrieve a value from a dictionary, we just need to index it by its key object:

In [92]:
codes['Carlos']

5591

Let's retrieve Juan's password:

In [93]:
codes['Juan']

9739
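The requirement that keys be immutable can be checked directly: a tuple works as a key, while a list is rejected. This is a small sketch with a hypothetical `grid` dictionary; the exact wording of the error message depends on the Python version, so it is printed rather than quoted:

```python
# Tuples are immutable, so they are valid keys
grid = {(0, 0): "origin", (1, 2): "point"}
print(grid[(1, 2)])  # point

# Lists are mutable, so using one as a key raises TypeError
list_key_rejected = False
try:
    bad = {[0, 0]: "origin"}
except TypeError as error:
    list_key_rejected = True
    print("TypeError:", error)
```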

If someone decides to change their password, we can just use the usual index assignment notation:

In [94]:
codes['Luis'] = 1627

In [95]:
codes

{'Carlos': 5591, 'Juan': 9739, 'Luis': 1627}

**Note**: In the Python version used to produce the output above, dictionaries are unordered, which means that keys keep no particular order at definition or insertion time. For example, ``Luis`` was defined first, but it appears last. Since Python 3.7, plain dictionaries preserve insertion order; on older versions, ``OrderedDict`` from the ``collections`` module provides ordering while offering the same operations as a plain dictionary.

If someone decides to cancel their subscription, we can use ``pop`` or ``del``:

In [96]:
val = codes.pop('Juan')
print(val)

del codes['Luis']

9739


In [97]:
codes

{'Carlos': 5591}
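Both ``pop`` and ``del`` raise a ``KeyError`` when the key does not exist. A minimal sketch of how this can be handled, using the optional default argument of ``pop`` (the name `removed` is illustrative):

```python
codes = {'Carlos': 5591}

# With a default value, pop does not fail on a missing key
removed = codes.pop('Juan', None)
print(removed)  # None

# del has no default, so a missing key raises KeyError
try:
    del codes['Juan']
except KeyError as error:
    print("KeyError:", error)

print(codes)  # {'Carlos': 5591}
```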

If we would like to add a new user, we can use our index assignment notation as well:

In [98]:
codes['Jorge'] = 6621

In [99]:
codes

{'Carlos': 5591, 'Jorge': 6621}

To check whether a person is already in the dictionary, we use the ``in`` operator:

In [100]:
'Carlos' in codes

True

In [101]:
'José' in codes

False
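A membership test is often followed by a lookup. The ``get`` method combines both steps, returning a default instead of raising ``KeyError`` when the key is absent. A short sketch with illustrative names:

```python
codes = {'Carlos': 5591, 'Jorge': 6621}

# get returns the stored value when the key exists
carlos_code = codes.get('Carlos')
print(carlos_code)  # 5591

# For a missing key it returns None, or the default given as second argument
print(codes.get('José'))     # None
print(codes.get('José', 0))  # 0
```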

Finally, to get all of a dictionary's keys and values, we can call the ``keys`` and ``values`` methods, respectively:

In [102]:
codes.keys()

dict_keys(['Carlos', 'Jorge'])

In [103]:
codes.values()

dict_values([5591, 6621])

**Note:** The return values of both methods are iterable view objects that cannot be sliced or indexed. We will see how to work with them later.
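One way to work with these return values is to convert them to a list, which can then be indexed. The sketch below also uses the related ``items`` method, not covered above, to walk over keys and values together; the name `key_list` is illustrative, and the order shown assumes Python 3.7 or later:

```python
codes = {'Carlos': 5591, 'Jorge': 6621}

# Converting to a list makes indexing and slicing possible
key_list = list(codes.keys())
print(key_list[0])  # Carlos

# items() yields (key, value) pairs that can be unpacked in a loop
for name, code in codes.items():
    print(name, code)
```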

## Problems

### Problem 1

Given the following dictionary, which stores the grades of several students

In [106]:
notas = {
    'Juan': [4.5, 3.7, 3.4, 5],
    'Alicia': [3.5, 3.1, 4.2, 3.9],
    'Germán': [2.6, 3.0, 3.9, 4.1]
}

compute:

* Juan's average grade (remember that `sum` and `len` can be used to obtain the average).

*Answer*

    4.15

In [119]:
# Write the solution here
pj=sum(notas['Juan'])/len(notas['Juan'])
pj


4.15

* The average grade of the course

*Answer*

    3.74

In [122]:
# Write the solution here
pj=sum(notas['Juan'])/len(notas['Juan'])
pa=sum(notas['Alicia'])/len(notas['Alicia'])
pg=sum(notas['Germán'])/len(notas['Germán'])

prom=(pj+pa+pg)/3
prom

3.741666666666667
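Averaging the three per-student averages matches the overall average here only because every student has the same number of grades. A sketch of a more general approach, which pools all grades before dividing, follows; it uses generator expressions, which are not introduced above, and the names `total`, `count` and `course_average` are illustrative:

```python
notas = {
    'Juan': [4.5, 3.7, 3.4, 5],
    'Alicia': [3.5, 3.1, 4.2, 3.9],
    'Germán': [2.6, 3.0, 3.9, 4.1]
}

# Add up every grade and count how many grades there are in total
total = sum(sum(grades) for grades in notas.values())
count = sum(len(grades) for grades in notas.values())

course_average = total / count
print(round(course_average, 2))  # 3.74
```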

# Datatype conversion and other goodies

To perform datatype conversion, we can call the built-in type names as functions, for example:

* **int**: Conversion to integer datatype

In [123]:
int(3.99)

3

In [124]:
int('6')

6
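Two details of `int` are easy to miss: it truncates toward zero instead of rounding, and it rejects strings that contain a decimal point. A minimal sketch of both behaviours (the name `converted` is illustrative):

```python
# int truncates toward zero; round picks the nearest integer
print(int(-3.99))   # -3
print(round(3.99))  # 4

# A string with a decimal point cannot be converted directly
try:
    int('3.99')
except ValueError as error:
    print("ValueError:", error)

# Going through float first works
converted = int(float('3.99'))
print(converted)  # 3
```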

* **float**: Conversion to float datatype

In [125]:
float(12)

12.0

In [128]:
float('4.23')

4.23

* **str**: String representation of a given object

In [129]:
str(36.1)

'36.1'

In [131]:
miTupla=(1,2,'rojo')
str(miTupla)

"(1, 2, 'rojo')"

* **list**: Converts an iterable or generator object to a list

In [132]:
list((3, 2, 4))

[3, 2, 4]

In [133]:
list('1457')

['1', '4', '5', '7']

If the given value is a dictionary, ``list`` only takes its keys:

In [136]:
list({'a': 12, 'b': 5})

['a', 'b']

* **dict**: Dictionary initialization from an iterable that contains (key, value) pairs.

In [None]:
dict([[10, 'a'], [15, 't']])

We can use ``zip`` to join a list of keys with a list of values:

In [None]:
keys = ['A', 'B', 'C']
values = [12, 34, -1 + 9j]
pairs = list(zip(keys, values))
print(pairs)

dict(pairs)
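The intermediate list is not strictly needed, since ``dict`` accepts the ``zip`` object directly. A compact sketch of the same conversion, with the illustrative name `mapping` (the printed order assumes Python 3.7 or later):

```python
keys = ['A', 'B', 'C']
values = [12, 34, -1 + 9j]

# zip pairs each key with the value at the same position
mapping = dict(zip(keys, values))

print(mapping)       # {'A': 12, 'B': 34, 'C': (-1+9j)}
print(mapping['C'])  # (-1+9j)
```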