# Session 03

[![Open and Execute in Google Colaboratory](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/astrojuanlu/ie-mbd-python-data-analysis-i/blob/main/sessions/Session%2003.ipynb)

- Strings
- Slicing and indexing
- Data structure concepts: mutability and order
- Tuples, properties and methods

## Strings

Python can also manipulate text using strings, which are written between either double or single quotes. Both forms are equivalent; these notes use double quotes.

In [None]:
name = "Juan Luis"
name

Strings can also be combined using operators. For example, strings overload the addition operator `+` to perform concatenation:

In [None]:
"Juan" + " " + "Luis"

Strings contain characters, and they can be extracted with an operation called indexing:

In [None]:
name[0]  # indexing: extracts the first character of `name`

Also, strings have _methods_: functions that are attached to them. They are accessed using the dot (`.`):

In [None]:
name.lower()  # `lower` returns a string with all characters in lowercase

<div class="alert alert-info">For a list of all string methods, see the documentation at https://docs.python.org/3/library/stdtypes.html#string-methods</div>
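
As a small illustration of the dot syntax, the sketch below calls a few other built-in `str` methods on the same `name` string. The variable names `shouted`, `u_count` and `starts_with_juan` are only chosen for this example.

```python
name = "Juan Luis"

shouted = name.upper()                      # new string with all characters in uppercase
u_count = name.count("u")                   # number of occurrences of "u"
starts_with_juan = name.startswith("Juan")  # True if the string begins with "Juan"

print(shouted)           # JUAN LUIS
print(u_count)           # 2
print(starts_with_juan)  # True
print(name)              # Juan Luis (methods return new values, `name` is unchanged)
```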

## Indexing and slicing

Indices start at zero: `0` corresponds to the first element, `1` to the second, and so forth. Think of the index as an offset from the beginning:

In [None]:
name[0]

In [None]:
name[1]

Negative indices start from the end:

In [None]:
name[-1]

In [None]:
name[-2]

You can obtain the length of the string using the built-in function `len`:

In [None]:
len(name)
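
The sketch below relates `len` to the indices seen so far: the last valid index is `len(name) - 1`, and a negative index `-k` refers to the same character as `len(name) - k`. The names `n`, `last`, `first_from_end` and `message` are only illustrative.

```python
name = "Juan Luis"
n = len(name)  # 9 characters, including the space

last = name[n - 1]         # same character as name[-1]
first_from_end = name[-n]  # same character as name[0]

print(last, name[-1])            # s s
print(first_from_end, name[0])   # J J

# Valid indices go from -n to n - 1; anything outside raises IndexError
try:
    name[n]
except IndexError as error:
    message = str(error)
    print(message)  # string index out of range
```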

To obtain a portion of the string, you use a similar syntax, which in this case is called slicing:

In [None]:
name[1:5]

Some notes:

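- The syntax is `[start_index:end_index:step]`
- The element at `start_index` is included; the element at `end_index` is not
- When a value is omitted, its default is used (the defaults below apply to a positive step)
  - The default start is `0` (the beginning)
  - The default end is the length of the string, so the slice runs through the last character
  - The default step is `1`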

In [None]:
name[1::2]
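
To make the defaults concrete, here is a short sketch that omits different parts of the slice on the same `name` string. The variable names are only chosen for this example.

```python
name = "Juan Luis"

first_word = name[:4]    # start omitted: begins at index 0 -> "Juan"
second_word = name[5:]   # end omitted: runs to the end -> "Luis"
every_other = name[::2]  # start and end omitted, step 2 -> "Ja us"
last_four = name[-4:]    # negative indices work in slices too -> "Luis"
backwards = name[::-1]   # a negative step walks the string in reverse -> "siuL nauJ"

# The end index is excluded, so this slice has 5 - 1 = 4 characters
middle = name[1:5]       # -> "uan "
print(len(middle))       # 4
```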

## Mutability and order

In Python, everything is an object. Those objects live in memory, and variables point to them.

Some objects can be changed in place after they are created. We say that such objects are **mutable**; objects that cannot be changed are **immutable**.

We have already seen strings. Strings are immutable: their characters cannot be changed in place. Trying to do so raises a `TypeError`:

In [None]:
name = "John Lewis"

name[0] = "X"  # raises TypeError: 'str' object does not support item assignment

Notice that assigning a new value to a variable is a different operation: the original object is not modified.

In [None]:
name = "Juan Luis"  # `name` now points to a different object!
name
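
The following sketch puts both ideas side by side. Since a string cannot be edited in place, a common workaround is to build a new string from pieces of the old one. The names `original` and `new_name` are only illustrative.

```python
name = "John Lewis"
original = name  # a second variable pointing to the same string object

try:
    name[0] = "X"  # item assignment is not allowed on strings
except TypeError as error:
    print("Cannot modify a string in place:", error)

# Build a new string instead: "X" followed by everything after the first character
new_name = "X" + name[1:]
print(new_name)  # Xohn Lewis

# Rebinding `name` does not affect the object that `original` still points to
name = "Juan Luis"
print(original)  # John Lewis
print(name)      # Juan Luis
```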

## Tuples

A tuple is an ordered sequence of elements separated by commas. Unlike strings, which contain only characters, tuples are heterogeneous containers: their individual elements can be numbers, strings, other tuples, or any other object.

In [None]:
my_tuple = 1, 2.0, "3", "four"
my_tuple

In [None]:
my_tuple[0]

In [None]:
len(my_tuple)

Tuples, like strings, are also immutable:

In [None]:
my_tuple[0] = -1  # raises TypeError: 'tuple' object does not support item assignment
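
Tuples support the same slicing syntax as strings and have two methods of their own, `index` and `count`. The sketch below shows them on `my_tuple`, together with concatenation, which creates a new tuple rather than changing the existing one. The variable names are only chosen for this example.

```python
my_tuple = 1, 2.0, "3", "four"

first_two = my_tuple[:2]            # slicing a tuple returns a new tuple -> (1, 2.0)
position = my_tuple.index("3")      # index of the first occurrence of "3" -> 2
how_many = my_tuple.count("four")   # number of occurrences of "four" -> 1

# Concatenation builds a new tuple; note the trailing comma in the one-element tuple
longer = my_tuple + (5,)

print(first_two)      # (1, 2.0)
print(longer)         # (1, 2.0, '3', 'four', 5)
print(len(my_tuple))  # 4 (the original tuple is unchanged)
```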

## Exercises

### 1. Find string methods

Find which `str` method produces each of these transformations:

| Original | Transformed |
| --- | --- |
| `"Tim O'Reilly"` | `"tim o'reilly"` |
| `"Tim O'Reilly"` | `"Tom O'Reolly"` |
| `"tim o'reilly"` | `"Tim O'Reilly"` |
| `"Tim O'Reilly"` | `"tIM o'rEILLY"` |
| `"Tim O'Reilly"` | `"    Tim O'Reilly    "` (20 chars) |

### 2. Categorize our clients

We want to categorize the clients of our telecom company according to some features and bucket them into two groups:

- Group 1: women aged 20 to 25 (both included) or users of any gender younger than 20, in both cases with average monthly data consumption over 5 GB
- Group 2: men aged 35 to 45 with average monthly data consumption between 2 and 5 GB, or women aged 30 to 40 with average monthly data consumption between 3 and 8 GB

Which group, if any, does each user belong to?

_Tip: Make each user a tuple_

| Id | Age (years) | Sex | Average monthly consumption (GB) |
| --- | --- | --- | --- |
| 1 | 40 | male | 10.2 |
| 2 | 50 | female | 5.4 |
| 3 | 23 | female | 8.0 |
| 4 | 18 | male | 2.5 |
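
As a possible starting point for the tip above, the sketch below stores one user as a tuple and reads its fields by index. The user shown here is hypothetical (it is not a row of the table), and the rule is made up for illustration; it is not one of the two groups of the exercise.

```python
# Hypothetical user, not one of the rows in the table above
# Field order: (id, age in years, sex, average monthly consumption in GB)
user = (99, 31, "female", 1.7)

age = user[1]
sex = user[2]
consumption = user[3]

# Made-up rule for illustration only: women aged 18 to 40 using under 2 GB per month
matches_rule = sex == "female" and 18 <= age <= 40 and consumption < 2

print(matches_rule)  # True
```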