## Data types: ways to classify information

Like many other languages, Python has several built-in data types for representing information.

## Literals and variables, *literally*

- **Literals**: fixed values written directly in the code, such as `2`, `12.3` or `"hello"`.
- **Variables**: references to values located in the memory of the device that executes the program.

In Python, variables are not tied to any specific data type (the language is dynamically typed), so a variable can be reassigned to a value of any other data type without raising an error.
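
As a minimal sketch of that reassignment (the name `counter` is only illustrative), the same variable can hold an integer first and a string afterwards:

```python
# The name `counter` is hypothetical, chosen only for this illustration.
counter = 2              # the name refers to an integer
print(counter + 1)       # 3

counter = "two"          # the same name now refers to a string
print(counter.upper())   # TWO
```

Python raises no error on the second assignment because the type belongs to the value, not to the variable name.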

In [1]:
# These are literals:

2
12.3
"hello"

# This is a literal assigned to a variable:

var = 2

# Once the variable is assigned, its value can be accessed by using its name.
# The example below passes the variable as argument to the print function:

print(var)

2


### _Exercise 1: Variable creation_

Create a variable named `my_first_variable` that contains the value `69`.

- [Click here to open the script in the editor](./solutions/exercise_1.py)
- Test the script using `Ctrl + Shift + P` > `Tasks: Run Task` > `Test exercise 1`

### _Exercise: Determining the data type of a variable_

1. Look up the **function** that takes a variable as an argument and returns the data type of its value.
2. Assign the value returned by that function to a variable named `known`.

- [Click here to open the script in the editor](./solutions/exercise_2.py)
- Test the script using `Ctrl + Shift + P` > `Tasks: Run Task` > `Test exercise 2`

In [3]:
# Do not modify this (base data):

unknown = "What am I?"

# Write your code below:



## Non-sequential data types

**Sequentiality**: the capacity of a given data type to define a sequence of values.

Python has three built-in non-sequential numeric data types:
- `int`: whole numbers (no decimal part)
- `float`: numbers with a decimal part
- `complex`: numbers with a real and an imaginary part

In [4]:
# Let's assign non-sequential values to variables:

# Integer value:

integer_variable = 1

# Float value:

float_variable = 1.2

# Complex value:

complex_variable = 1 + 2j

### _Exercise: Non-sequential data types_

Create two variables named `var_1` and `var_2` that contain, respectively, the value of Pi (to four decimal places) and the number of hours in a day.

On the same line as each variable definition, add a comment stating the data type it contains.

In [5]:
# Write your answer below:



## Sequential data types

The basic sequential data types in Python are strings (`str`), tuples (`tuple`), lists (`list`), sets (`set`) and dictionaries (`dict`):

* **Strings**: ordered collections of characters.
* **Tuples/Lists**: ordered collections of elements.
* **Sets**: unordered collections of unique elements.
* **Dictionaries**: collections of elements related through key-value pairs.

_The difference between tuples and lists is that tuples do not allow their elements to be modified, while lists do._

_Strictly speaking, Python only calls the ordered, index-based types (strings, tuples and lists) sequences; sets and dictionaries are grouped with them here because they also hold several elements._

In [6]:
# Let's assign sequential values to variables:

# String value:

string_value = "Strong string strung in the stream"

# Tuple value:

tuple_variable = (1, 2, 3)

# List value:

list_variable = [1, 2, 3]

# Set value:

set_variable = {1, 2, 3}

# Dictionary value:

dictionary_variable = {
    "key_1": "value_1",
    "key_2": "value_2",
}
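
The difference between ordered and unordered collections described above can be sketched with a few comparisons. The variable name `repeated` is only illustrative:

```python
# A set literal silently discards repeated elements.
repeated = {1, 2, 2, 3, 3, 3}
print(repeated == {1, 2, 3})       # True

# Order is irrelevant for sets, but it matters for lists and tuples.
print({1, 2, 3} == {3, 2, 1})      # True
print([1, 2, 3] == [3, 2, 1])      # False
print((1, 2, 3) == (1, 2, 3))      # True
```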

### _Exercise: Sequential data types_

Create five variables (name them as you like) that contain the following data:

1. A list with the values from `1` to `5` (inclusive).
2. A set with the values from `-2` to `0` (inclusive).
3. A string with the characters `'a'`, `'b'` and `'c'`.
4. A list of the words contained in the sentence `"I'm learning Python"`.
5. A dictionary that maps the keys `"seconds"`, `"minutes"` and `"hours"` to the number of seconds, minutes and hours in a day (respectively).

In [7]:
# Write your answer below:



## Data type conversion (type casting)

Python allows the data type of a value to be changed with some flexibility. For example, a `float` can be converted into an `int` and vice versa, but not every conversion succeeds: a `str` can only be converted into an `int` when its text represents a whole number.

In [8]:
# This is a float value assigned to a variable:

variable = 6.9
print("The value of the variable is: ", variable)
print("The data type of the variable is: ", type(variable))

# Now, its value is cast into an integer:

variable = int(variable)  # The value will be truncated (not rounded).
print("The value of the variable is: ", variable)
print("The data type of the variable is: ", type(variable))

The value of the variable is:  6.9
The data type of the variable is:  <class 'float'>
The value of the variable is:  6
The data type of the variable is:  <class 'int'>
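
Conversions are not limited to `float` and `int`. The sketch below shows a few other conversions that succeed, using the built-in constructors `int`, `float`, `str`, `tuple`, `set` and `list`. All variable names are illustrative:

```python
# Text that represents a number can be converted into that number.
whole_from_text = int("42")
decimal_from_text = float("3.5")

# Any number can be converted into text.
text_from_number = str(7)

# Collections can be converted into one another.
as_tuple = tuple([1, 2, 3])      # list -> tuple
unique_values = set([1, 1, 2])   # list -> set (duplicates are dropped)
characters = list("abc")         # str -> list of characters

print(whole_from_text + 1)       # 43
print(decimal_from_text * 2)     # 7.0
print(text_from_number + "!")    # 7!
print(as_tuple)                  # (1, 2, 3)
print(characters)                # ['a', 'b', 'c']
```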


### _Exercise: Data type conversion (1)_

Convert any `int` to a `float` and assign the result to a variable. Then, convert any `float` to an `int` and assign the result to a different variable.

Finally, print the results to the screen.

In [9]:
# Write your answer below:



### _Exercise: Data type conversion (2)_

Try creating a variable that contains a `str` and attempt to change its data type to `int`. What happens?

In [10]:
# Write your answer below:



# Boolean values... true or false?

The boolean data type (`bool`) represents truth values. Python relies on it internally whenever it has to decide whether a condition holds, usually without anyone noticing.

The only two values of this data type are the literals `True` and `False`.

A curious fact about this data type is that every built-in value in Python has an implicit truth value. In other words, any of the data types seen so far can be *cast* to `bool`, as can be seen below:

In [11]:
# Boolean casting for every basic data type in Python:

print("Boolean value of True: ", bool(True))
print("Boolean value of False: ", bool(False))

print("Boolean value of 1: ", bool(1))
print("Boolean value of 1.2: ", bool(1.2))
print("Boolean value of 1 + 2j:", bool(1 + 2j))

print("Boolean value of 'Hello':", bool("Hello"))
print("Boolean value of (1, 2, 3):", bool((1, 2, 3)))
print("Boolean value of []:", bool([]))
print("Boolean value of {1, 2, 3}:", bool({1, 2, 3}))
print("Boolean value of {'key_1': 'value_1'}:", bool({"key_1": "value_1"}))

Boolean value of True:  True
Boolean value of False:  False
Boolean value of 1:  True
Boolean value of 1.2:  True
Boolean value of 1 + 2j: True
Boolean value of 'Hello': True
Boolean value of (1, 2, 3): True
Boolean value of []: False
Boolean value of {1, 2, 3}: True
Boolean value of {'key_1': 'value_1'}: True


In [12]:
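# Wait... what? The `bool` type itself is an object (truthy by default),
# while calling `bool()` without arguments returns `False`.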

print("Boolean value of the boolean type: ", bool(bool))
print("Boolean value of a generic boolean value: ", bool(bool()))

Boolean value of the boolean type:  True
Boolean value of a generic boolean value:  False
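
As a rough rule for the built-in types covered here, zero values and empty collections convert to `False`, and everything else converts to `True`. The following sketch illustrates this with a few hand-picked values (variable names are illustrative):

```python
# Zero values and empty containers are falsy:
zero_int = bool(0)        # False
zero_float = bool(0.0)    # False
empty_text = bool("")     # False
empty_tuple = bool(())    # False
empty_dict = bool({})     # False

# Anything non-zero or non-empty is truthy, even if it "looks" empty:
space_text = bool(" ")    # True: a space is still one character
zero_text = bool("0")     # True: the string contains one character
nested_list = bool([[]])  # True: the outer list holds one element

print(zero_int, zero_float, empty_text, empty_tuple, empty_dict)
print(space_text, zero_text, nested_list)
```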


### _Exercise: Booleans_

Create a series of variables (at least 4, at most 6) that contain different data types (whichever you like, with whatever contents you see fit) and print their boolean values to the screen.

_**Hint**: try creating an empty list and a tuple with several elements, and reason why their boolean values are `False` and `True`, respectively._

In [13]:
# Write your answer below:



# Sequential data types: one thing after another

Sequential data types are those that store a sequence of elements. They provide useful methods for accessing, modifying and deleting elements.

A relevant question when studying sequential data types is how to access the elements they contain. In Python, there are two ways of doing so: using indices and using *slices*. An index accesses a single element, while a slice accesses a range of elements.

## Length of a sequence: your best friend

Knowing the length of a sequence is very useful when working with sequential data types, since it tells you how many elements the sequence contains.

The length of **any** sequence can be obtained with the `len()` function, which returns an integer. Note that applying this function to a non-sequential data type raises a `TypeError`.

In [14]:
# This is a string:

var = "Hello!"

# Its length can be determined using the `len` function:

print("Length (amount of characters) of the string: ", len(var))

Length (amount of characters) of the string:  6
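
The same function works on the other collection types, and it fails on a number. The sketch below uses a `try`/`except` block only to capture the error message instead of stopping the program. The variable names are illustrative:

```python
list_length = len([10, 20, 30])        # number of elements
dict_length = len({"a": 1, "b": 2})    # number of keys

print(list_length)   # 3
print(dict_length)   # 2

# An integer is not a sequence, so it has no length.
try:
    len(42)
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```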


### Indices

Indices are integer values that represent a position in a sequence. In Python, indices start at `0` and end at `n-1`, where `n` is the length of the sequence.

Let's take a *string* as an example:

In [15]:
var = "I am a sequence"

# Knowing that indices start at `0` and end at `len(var) - 1`:

print("First character of the string: ", var[0])
print("Last character of the string: ", var[len(var) - 1])
print("Some character in the middle: ", var[7])

# Negative indices are supported in Python:

print("Last character of the string: ", var[-1])
print("Penultimate character of the string: ", var[-2])
print("Some character in the middle: ", var[-10])
print("First character of the string: ", var[-len(var)])

First character of the string:  I
Last character of the string:  e
Some character in the middle:  s
Last character of the string:  e
Penultimate character of the string:  c
Some character in the middle:  a
First character of the string:  I
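
Indexing behaves the same way on lists and tuples. As a small sketch (with an illustrative list named `letters`), asking for a position beyond the last valid index raises an `IndexError`:

```python
letters = ["a", "b", "c"]

first = letters[0]     # "a"
last = letters[-1]     # "c"
print(first, last)

# The valid indices are 0, 1 and 2, so index 3 is out of range.
try:
    letters[3]
except IndexError as error:
    index_error = str(error)
    print("IndexError:", index_error)
```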


### Slices

Simply put, slices are ranges of indices that give access to several elements of a sequence at once. They are really useful when you want to extract a specific part of a sequence. They are written in a `[start:stop:step]` format, where:

- `start` is the index of the first element to be included in the slice. Defaults to `0`.
- `stop` is the index at which the slice ends; the element at this index is not included. Defaults to the length of the sequence.
- `step` is the increment between indices. Defaults to `1`.

A key point to keep in mind is that the first index of a slice is included in the result, while the last one is not. Let's take a look at an example:

In [16]:
var = "I am another sequence"

# Examples:

print("Whole string: ", var[0:len(var)])
print("First 8 characters: ", var[0:8])  # The element at index 8 is not included.
print("Elements in even positions: ", var[0:len(var):2])

# Start, stop and step values can be omitted:

print("Whole string: ", var[::])
print("First 8 characters: ", var[:8])
print("Elements in even positions: ", var[::2])

Whole string:  I am another sequence
First 8 characters:  I am ano
Elements in even positions:  Ia nte eune
Whole string:  I am another sequence
First 8 characters:  I am ano
Elements in even positions:  Ia nte eune
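
Slices are not exclusive to strings. The following sketch applies the same `[start:stop:step]` notation to a list of numbers, chosen so that each value equals its own index (the names are illustrative):

```python
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

middle = numbers[3:7]         # [3, 4, 5, 6]: index 7 is excluded
last_three = numbers[-3:]     # [7, 8, 9]: negative indices also work
every_third = numbers[::3]    # [0, 3, 6, 9]
without_ends = numbers[1:-1]  # [1, 2, 3, 4, 5, 6, 7, 8]

# Unlike single indices, slices that go past the end do not raise an error.
beyond = numbers[5:100]       # [5, 6, 7, 8, 9]

print(middle, last_three, every_third, without_ends, beyond)
```

Note that the number of elements in `numbers[3:7]` is `7 - 3 = 4`, which is a handy consequence of excluding the `stop` index.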


## _Exercise: Reverse string using slices_

## Strings

Strings are probably one of the most versatile data types in Python. They support transformations of almost every kind, from switching between uppercase and lowercase to searching for and replacing pieces of text.

### Methods

String methods are usually called on a variable that holds a `str`. They can also be called directly on a literal, although that is rarely useful.

The following are some basic modifications of this data type.

In [17]:
# Basic string manipulation:

string = "This is definitely NOT Flipped Learning"

# Conversion to lowercase:

print("Lowercase: ", string.lower())

# Conversion to uppercase:

print("Uppercase: ", string.upper())

# Capitalize the leading character of the string (the rest becomes lowercase):

print("Capitalize: ", string.capitalize())

# Capitalize the leading character of each word in the string:

print("Title: ", string.title())

# Swap upper and lower case characters:

print("Swap: ", string.swapcase())

Lowercase:  this is definitely not flipped learning
Uppercase:  THIS IS DEFINITELY NOT FLIPPED LEARNING
Capitalize:  This is definitely not flipped learning
Title:  This Is Definitely Not Flipped Learning
Swap:  tHIS IS DEFINITELY not fLIPPED lEARNING


These operations are fairly simple, so it is useful to know some more advanced ones in order to manipulate text more extensively:

In [18]:
# Advanced string manipulation:

string = "    Lorem ipsum dolor sit down and pay attention    "

# Find a substring in the string:

print("Finding \"sit down\" position: ", string.find("sit down"))

# Replace a substring with another substring:

print("Replacing spaces with underscores: ", string.replace(' ', '_'))

# Removing leading and trailing whitespace in the string:

print("Removing leading and trailing whitespace: ", string.strip())

Finding "sit down" position:  22
Replacing spaces with underscores:  ____Lorem_ipsum_dolor_sit_down_and_pay_attention____
Removing leading and trailing whitespace:  Lorem ipsum dolor sit down and pay attention
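
One detail worth keeping in mind is that strings are immutable: every method used above returns a new string and leaves the original untouched. A minimal sketch, with illustrative variable names:

```python
greeting = "hello"
shouted = greeting.upper()

print(greeting)   # hello (the original is unchanged)
print(shouted)    # HELLO

# Individual characters cannot be replaced in place.
try:
    greeting[0] = "H"
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# Since each method returns a string, calls can be chained.
cleaned = "  Hello World  ".strip().lower().replace(" ", "_")
print(cleaned)    # hello_world
```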


## Lists

Lists group a series of elements that can be accessed and modified freely: they are **mutable** structures.

### Methods

As with strings, the list to be manipulated should be assigned to a variable so that the changes made to it can be inspected afterwards.

Some operations on lists include the following methods:

In [19]:
# List methods:

my_list = [3, 2, 1, ['a', 'b']]

# List length:

print("List length: ", len(my_list))
print("Length of the last item in the list: ", len(my_list[-1]), end="\n\n")

# Single item addition:

print("List before append: ", my_list)
my_list.append("hello")
print("List after append: ", my_list, end="\n\n")

# Multiple item addition:

print("List before extend: ", my_list)
my_list.extend(["how", "are", "you"])
print("List after extend: ", my_list, end="\n\n")

# Item removal:

print("Base list: ", my_list)
my_list.pop()
print("Last element removed: ", my_list)
my_list.pop(len(my_list) // 2)
print("Middle element removed: ", my_list, end="\n\n")

# Count how many times a value appears in the list:

print("How many 2s are in the list? There is ", my_list.count(2), end="\n\n")

# Reverse list contents:

print("List before reverse: ", my_list)
my_list.reverse()
print("List after reverse: ", my_list, end="\n\n")

# Remove every item from the list:

print("List before clear: ", my_list)
my_list.clear()
print("List after clear: ", my_list)

List length:  4
Length of the last item in the list:  2

List before append:  [3, 2, 1, ['a', 'b']]
List after append:  [3, 2, 1, ['a', 'b'], 'hello']

List before extend:  [3, 2, 1, ['a', 'b'], 'hello']
List after extend:  [3, 2, 1, ['a', 'b'], 'hello', 'how', 'are', 'you']

Base list:  [3, 2, 1, ['a', 'b'], 'hello', 'how', 'are', 'you']
Last element removed:  [3, 2, 1, ['a', 'b'], 'hello', 'how', 'are']
Middle element removed:  [3, 2, 1, 'hello', 'how', 'are']

How many 2s are in the list? There is  1

List before reverse:  [3, 2, 1, 'hello', 'how', 'are']
List after reverse:  ['are', 'how', 'hello', 1, 2, 3]

List before clear:  ['are', 'how', 'hello', 1, 2, 3]
List after clear:  []
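
A common pitfall with these methods is that most of them modify the list in place and return `None`, while `pop` returns the element it removes. The sketch below illustrates this, together with `insert`, `remove` and `index` (the list `numbers` is only an example):

```python
numbers = [10, 20, 30]

# `append` changes the list itself and returns None.
result = numbers.append(40)
print(result)     # None
print(numbers)    # [10, 20, 30, 40]

numbers.insert(1, 15)           # insert 15 at index 1
numbers.remove(30)              # remove the first occurrence of 30
position = numbers.index(20)    # index of the first occurrence of 20
removed = numbers.pop(0)        # remove and return the item at index 0

print(position)   # 2
print(removed)    # 10
print(numbers)    # [15, 20, 40]
```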


#### _Exercise: Copying lists_

Starting from a variable named `list_1` that contains a list, create another variable named `list_2` that contains the elements of the first list (without copying them by hand).

*__Warning__: the two lists must be independent (modifying `list_1` must not modify `list_2`).*

In [20]:
# Do not modify this (base data):

list_1 = [1, 2, 3, 4, 5]

# Write your code below:



# Do not modify this (automatic answer check):

assert list_1 is not list_2, "Error: Lists are not independent."
print("OK!")


#### _Exercise: Sorting lists_

Given a list `list_1`, store in the variable `list_2` a sorted (in ascending order) and independent copy of the first one.

_**Hint**: look for the list method that fits the requirements of this exercise._

In [None]:
# Do not modify this (base data):

list_1 = [5, 3, -1, 2, 3, 0]

# Write your code below:

 

# Do not modify this (automatic answer check):

assert list_1 is not list_2, "Error: lists are not independent."
assert all(list_2[i] <= list_2[i + 1] for i in range(len(list_2) - 1)), "Error: the list is not sorted."
print("OK!")

## Tuples

Tuples are sequential data types that are very similar to lists. The main difference is that lists allow the content of their positions to be edited, while tuples do not: tuples are **immutable** structures.
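
A short sketch of that difference, using an illustrative tuple named `point` and a list named `coordinates`:

```python
point = (3, 4)
coordinates = [3, 4]

# Reading works the same way for both types.
print(point[0], coordinates[0])   # 3 3

# Writing works for the list...
coordinates[0] = 10
print(coordinates)                # [10, 4]

# ...but fails for the tuple.
try:
    point[0] = 10
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# To "change" a tuple, a new one has to be built.
moved_point = (10, point[1])
print(moved_point)                # (10, 4)
```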

## Dictionaries

Dictionaries are, as mentioned before, collections of key-value pairs. They have some peculiarities: for instance, if a key is repeated, its value is overwritten with the most recent one, so a dictionary never contains duplicate keys.

It is important to keep in mind that the data types used as dictionary keys must be immutable, that is, their contents cannot be modified. This is the case for integers, floats, complex numbers, strings and tuples (as long as the tuple itself only contains immutable elements). Lists cannot be used as dictionary keys.
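
Both peculiarities can be checked with a small sketch. The dictionaries `prices` and `grid` are hypothetical examples:

```python
# A repeated key keeps only its most recent value.
prices = {"apple": 1, "pear": 2, "apple": 3}
print(prices)           # {'apple': 3, 'pear': 2}

# A tuple is immutable, so it is a valid key.
grid = {(0, 0): "origin", (1, 2): "somewhere"}
print(grid[(1, 2)])     # somewhere

# A list is mutable, so using it as a key raises a TypeError.
try:
    invalid = {[0, 0]: "origin"}
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```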
 
The values of a dictionary are not accessed through positional indices, as happens with lists, tuples or strings. Instead, the key associated with the desired value goes between the brackets.

In [None]:
# Dictionary access example:

# Dictionary definition:

my_dict = {
    "key_1": 1,
    "key_2": 2,
    "key_3": 3
}

# Dictionary access:

print(my_dict)
print(my_dict["key_2"])

In [None]:
# Dictionary update example:

my_dict = {
    "key_1": 1,
    "key_2": 2,
    "key_3": 3
}

my_dict["key_2"] = 5  # Updates the value related to the "key_2" key.

print(my_dict)
print(my_dict["key_2"])
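
Beyond reading and updating existing keys, a few other everyday operations are worth knowing. The sketch below uses a hypothetical dictionary named `inventory` and only standard `dict` features (`in`, `get`, `del` and `pop`):

```python
inventory = {"apples": 3, "pears": 0}

# Assigning to a key that does not exist yet adds a new entry.
inventory["plums"] = 7

# `in` checks whether a key is present.
has_apples = "apples" in inventory      # True
has_kiwis = "kiwis" in inventory        # False

# Accessing a missing key with brackets raises a KeyError;
# `get` returns a default value instead.
kiwis = inventory.get("kiwis", 0)       # 0

# Entries can be removed with `del` or with `pop`, which returns the value.
del inventory["pears"]
plums = inventory.pop("plums")          # 7

print(has_apples, has_kiwis, kiwis, plums)   # True False 0 7
print(inventory)                             # {'apples': 3}
```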

# Navigation

- **Previous lesson**: [Introduction](./introduction.ipynb)
- **Next lesson**: [Operators](./operators.ipynb)