# Session 02

## Strings

Strings are sequences of characters, and individual characters can be extracted with an operation called indexing:

In [1]:
name = "Juan Luis"
name

'Juan Luis'

Index `0` corresponds to the first element, `1` to the second, and so forth. Think of the index as an offset from the beginning:

In [2]:
name[0]

'J'

In [3]:
name[1]

'u'

Negative indices count from the end, so `-1` is the last element:

In [4]:
name[-1]

's'

In [5]:
name[-2]

'i'

You can get the length of a string using the built-in function `len`:

In [6]:
len(name)

9
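Since indices start at zero, the last valid index is one less than the length. A minimal sketch of that relationship, reusing the same `name` string (`last_index` is a name introduced here only for illustration):

```python
name = "Juan Luis"

last_index = len(name) - 1   # 8, because indices start at 0
last_char = name[last_index]
print(last_char)             # s
print(name[-1] == last_char)  # True

try:
    name[len(name)]          # index 9 does not exist
except IndexError as error:
    message = str(error)
    print("IndexError:", message)
```

Asking for an index equal to or greater than the length raises an `IndexError` instead of returning an empty result.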

To obtain a portion of the string, you use a similar syntax, which in this case is called slicing:

In [7]:
name[1:5]

'uan '

Some notes:

- The syntax is `[start_index:end_index:step]`, and the element at `end_index` is not included
- When an element is not specified, the default is used
  - The default start is `0` (the beginning)
  - The default end is the end of the string
  - The default step is `1`

In [8]:
name[1::2]

'unLi'
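The slicing defaults can be checked one at a time. The following sketch reuses the same `name` string; the variable names are only illustrative. The last line uses a negative step, in which case the slice runs backwards and the default start and end swap roles.

```python
name = "Juan Luis"

first_four = name[:4]       # start omitted: begins at index 0
from_fifth = name[5:]       # end omitted: runs to the end of the string
full_copy = name[:]         # both omitted: the whole string
every_other = name[::2]     # step of 2, starting at index 0
reversed_name = name[::-1]  # negative step walks the string backwards

print(first_four)     # Juan
print(from_fifth)     # Luis
print(every_other)    # Ja us
print(reversed_name)  # siuL nauJ
```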

Strings are **immutable**, which means that you can't modify their contents:

In [9]:
name[0] = "X"

TypeError: 'str' object does not support item assignment

Also, strings have methods: functions that are attached to them.

In [10]:
name.lower()  # `lower` returns a string with all characters in lowercase

'juan luis'

<div class="alert alert-info">For a list of all string methods, see the documentation at https://docs.python.org/3/library/stdtypes.html#string-methods</div>
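Because strings are immutable, a method never changes the string it is called on; it returns a new object. A short sketch with three methods chosen here as examples (`upper`, `startswith`, and `count`):

```python
name = "Juan Luis"

shouted = name.upper()                 # a new string, all uppercase
starts_with_j = name.startswith("J")   # a boolean
count_u = name.count("u")              # number of occurrences of "u"

print(shouted)        # JUAN LUIS
print(starts_with_j)  # True
print(count_u)        # 2
print(name)           # Juan Luis (the original is unchanged)
```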

## Tuples

Tuples are heterogeneous containers: their individual elements can be numbers, strings, other tuples, or any other object.

In [11]:
my_tuple = 1, 2.0, "3", "four"
my_tuple

(1, 2.0, '3', 'four')

In [12]:
my_tuple[0]

1

In [13]:
len(my_tuple)

4

Tuples are also immutable:

In [14]:
my_tuple[0] = -1

TypeError: 'tuple' object does not support item assignment
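Since a tuple can't be modified, "changing" an element really means building a new tuple. A minimal sketch using slicing and concatenation, both of which work on tuples as they do on strings (`same_tuple` and `updated` are names introduced here for illustration):

```python
my_tuple = 1, 2.0, "3", "four"

# Parentheses are optional when creating a tuple, so this one is equal
same_tuple = (1, 2.0, "3", "four")
print(my_tuple == same_tuple)  # True

# Build a new tuple from a one-element tuple plus a slice of the old one
updated = (-1,) + my_tuple[1:]  # note the trailing comma in (-1,)

print(updated)   # (-1, 2.0, '3', 'four')
print(my_tuple)  # (1, 2.0, '3', 'four')
```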

## Lists

Lists are heterogeneous, mutable containers. To create them, use square brackets:

In [15]:
my_list = [1, 2.0, "3", "four"]
my_list

[1, 2.0, '3', 'four']

<div class="alert alert-warning">Square brackets are used for two purposes in Python: (1) indexing/slicing and (2) creating lists. Don't confuse the two: indexing and slicing brackets always come right after an object!</div>

In [16]:
my_list[0]  # Indexing a list

1

In [17]:
my_list + ["F$V3"]  # Concatenating `my_list` with a list of one element

[1, 2.0, '3', 'four', 'F$V3']

Mutable means that you can change the individual elements of the list:

In [18]:
my_list[0] = -1
my_list

[-1, 2.0, '3', 'four']
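It helps to distinguish operations that return a new list from operations that modify the list in place. The sketch below uses a hypothetical list called `numbers` so that `my_list` is left alone:

```python
numbers = [1, 2, 3]  # hypothetical list, used only for this example

longer = numbers + [4]  # concatenation returns a new list
print(numbers)          # [1, 2, 3] (the original is untouched)
print(longer)           # [1, 2, 3, 4]

numbers.append(4)       # append modifies the list in place and returns None
numbers[0] = 100        # item assignment also modifies it in place
print(numbers)          # [100, 2, 3, 4]
```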

## More methods

Some string methods return lists:

In [19]:
name.split()

['Juan', 'Luis']

This means that you can chain several operations:

In [20]:
# With an intermediate variable
parts = name.split()
parts[0]

'Juan'

In [21]:
# Without an intermediate variable, same result
name.split()[0]

'Juan'

A more complicated example:

In [22]:
"jcano@faculty.ie.edu".split("@")[-1].split(".")[-2:]

['ie', 'edu']

Let's examine what's happening here:

In [23]:
(
    "jcano@faculty.ie.edu"  # string
    .split("@")  # str.split returns a list of two elements
    [-1]  # pick the last element of the list, which is a string
    .split(".")  # str.split returns another list
    [-2:]  # slice from the second-to-last element onwards
)

['ie', 'edu']
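The same chain can be unrolled with one intermediate variable per step, which makes each partial result easy to inspect. The variable names are illustrative, and the final `str.join` call is an extra step, not part of the original chain, that glues the pieces back together:

```python
email = "jcano@faculty.ie.edu"

user_and_domain = email.split("@")  # ['jcano', 'faculty.ie.edu']
domain = user_and_domain[-1]        # 'faculty.ie.edu'
domain_parts = domain.split(".")    # ['faculty', 'ie', 'edu']
last_two = domain_parts[-2:]        # ['ie', 'edu']

joined = ".".join(last_two)         # join the list back into one string
print(joined)                       # ie.edu
```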

## Converting objects

In some cases you can convert one object into another. To do this, use the corresponding built-in function (`list`, `tuple`, `str`):

In [24]:
0.1

0.1

In [25]:
str(0.1)  # Notice the quotes: this is a string!

'0.1'

In [26]:
list(my_tuple)  # Converts the tuple above into a list, notice the square brackets!

[1, 2.0, '3', 'four']

In [27]:
tuple(my_list)  # Converts the list above into a tuple, notice the parentheses!

(-1, 2.0, '3', 'four')

Other times, the conversion succeeds but loses information:

In [28]:
int(1.5)

1

Or it fails outright:

In [29]:
int("hello")

ValueError: invalid literal for int() with base 10: 'hello'
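A few more conversions, as a sketch of where the boundaries lie. `int` truncates toward zero instead of rounding, and it only accepts strings that look like integers, so a string such as `"1.5"` has to go through `float` first. The `try`/`except` block is one possible way to handle the failure and is shown here only for illustration:

```python
from_text = int("42")          # 42: the string holds a valid integer
as_float = float("1.5")        # 1.5
truncated = int(1.9)           # 1: truncation, not rounding
negative = int(-1.9)           # -1: truncation goes toward zero
two_steps = int(float("1.5"))  # 1: convert to float first, then to int

try:
    int("1.5")                 # not a valid integer literal
except ValueError as error:
    message = str(error)
    print("Conversion failed:", message)
```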

## Special sequences

There are some special sequences in Python that are very useful, as is the case with ranges:

In [30]:
range(1, 10)

range(1, 10)

That doesn't say much, but see what happens if you convert it to a list:

In [31]:
list(range(1, 10))

[1, 2, 3, 4, 5, 6, 7, 8, 9]
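As with slices, the end of a range is excluded, and `range` accepts an optional third argument for the step. Ranges also support `len`, indexing, and slicing without being converted to a list first. A short sketch (variable names are illustrative):

```python
evens = list(range(0, 10, 2))      # [0, 2, 4, 6, 8]
countdown = list(range(5, 0, -1))  # [5, 4, 3, 2, 1]
first_five = list(range(5))        # [0, 1, 2, 3, 4] (start defaults to 0)

numbers = range(1, 10)
print(len(numbers))   # 9
print(numbers[0])     # 1
print(numbers[-1])    # 9
print(numbers[2:5])   # range(3, 6)
```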

## Exercises

### 1. Find string methods

Find which `str` methods achieve each of these results:

| Original | Transformed |
| --- | --- |
| `"Tim O'Reilly"` | `"tim o'reilly"` |
| `"Tim O'Reilly"` | `"Tom O'Reolly"` |
| `"tim o'reilly"` | `"Tim O'Reilly"` |
| `"Tim O'Reilly"` | `"tIM o'rEILLY"` |
| `"Tim O'Reilly"` | `"    Tim O'Reilly    "` (20 chars) |

### 2. Weird list

Create this list without conditionals, loops, or hardcoding the values:

```
[49, 0, 47, 0, 45, 0, 43, 0, 41, 0, 39, 0, 37, 0, 35, 0, 33, 0, 31, 0, 29, 0, 27, 0, 25, 0, 23, 0, 21, 0, 19, 0, 17, 0, 15, 0, 13, 0, 11, 0, 9, 0, 7, 0, 5, 0, 3, 0, 1, 0]
```

### 3. Extract titles

Given the following list of names:

- Braund, Mr. Owen Harris
- Cumings, Mrs. John Bradley (Florence Briggs Thayer)
- Heikkinen, Miss. Laina
- Futrelle, Mrs. Jacques Heath (Lily May Peel)
- Allen, Mr. William Henry

Work out which string methods and slicing operations you can use to extract the titles, like this:

```
"Mr"
"Mrs"
"Miss"
"Mrs"
"Mr"
```

### 4. Categorize our clients

We want to categorize the clients of our telecom company according to some features, and bucket them into two groups:

- Women aged 20 to 25 (both included) or users of any gender younger than 20, in both cases with average monthly data consumption over 5 GB
- Men aged 35 to 45 with average monthly data consumption between 2 and 5 GB, or women aged 30 to 40 with average monthly data consumption between 3 and 8 GB

Which group, if any, does each user belong to?

_Tip: Represent each user as a tuple._

| Id | Age (years) | Sex | Average monthly consumption (GB) |
| --- | --- | --- | --- |
| 1 | 40 | male | 10.2 |
| 2 | 50 | female | 5.4 |
| 3 | 23 | female | 8.0 |
| 4 | 18 | male | 2.5 |