# Strings, indexing and slicing
This notebook introduces the basic properties of Python strings, as well as the basics for working with indexes, especially for slicing strings.

Before starting with the main content of this notebook, let's introduce the concept of **comments**.

Everything written to the right of a hash symbol (`#`) is ignored when the code is executed:

In [1]:
# print('Hello sunshine!')
# print('Hello mom and dad!')
print('Hello world!')
# print('Goodbye!')

Hello world!


Comments are generally used to add explanations within the code, or to prevent part of the code from being executed (maybe because it returns an error).

In [2]:
# This code prints different ages of a given person

# First, the name and date of birth of a person are assigned to variables
name = "James" # This variable keeps the first name
birth = 1991   # This variable keeps the year of birth

# Second, the current age of the given person is assigned to a new variable
age = 2020 - birth

# Printing messages
# The first message presents the given person
print("{} was born in {}".format(name, birth))
# The second message prints the age in Earth solar years
print("{} is {} years old.".format(name, age))
# The third message prints the age in lunar years
print("In lunar years, {} is {} years old.".format(name, age * 1.03))
# The fourth message prints the age in Martian years
print("If {} were Martian, he would be {:.2f} years old.".format(name, age / 1.88))

James was born in 1991
James is 29 years old.
In lunar years, James is 29.87 years old.
If James were Martian, he would be 15.43 years old.


As you can see, the last `print()` statement has been commented out because something is wrong with it and an error is returned (maybe you can spot the error and fix it).

⇒ **Note**: Jupyter Notebooks make code easier to read by showing different elements in different colors.

## Strings
Python works with different data types:
- strings are text data, and are defined by quotation marks (for example, `"Hello world!"`)
- integers are numbers without decimals (for example, `1991`)
- floating points are numbers with decimals (for example, `1.88`)

To find out the data type of a given value, the function `type()` can be used:

In [3]:
x = "Hello world!"
y = 1991
z = 1.88

# To print the data type, I use the type() function within the .format() method
print("The data type of variable x is {}".format(type(x)))
print("The data type of variable y is {}".format(type(y)))
print("The data type of variable z is {}".format(type(z)))

The data type of variable x is <class 'str'>
The data type of variable y is <class 'int'>
The data type of variable z is <class 'float'>


These data types are referred to by abbreviated names: `str` for strings, `int` for integers, and `float` for floating points.

Conveniently, there are functions with these same names, namely `str()`, `int()` and `float()`, that convert a value from one data type to another:

In [4]:
# Convert data to string
v1 = str(1991)   # Convert integer to string
v2 = str(1.88)   # Convert floating point to string

print('{} is a {}'.format(v1, type(v1)))
print('{} is a {}'.format(v2, type(v2)))
print() # This prints an empty line for better readability

# Convert data to integer
v3 = int("2020")   # Convert string to integer
v4 = int(12.34)    # Convert floating point to integer

print('{} is a {}'.format(v3, type(v3)))
print('{} is a {}'.format(v4, type(v4)))  # Notice that int() discards the decimals (12.34 becomes 12)
print()

# Convert data to floating point
v5 = float("56.78")   # Convert string to floating point
v6 = float(1770)      # Convert integer to floating point

print('{} is a {}'.format(v5, type(v5)))
print('{} is a {}'.format(v6, type(v6)))  # Notice that when printed, a 0 decimal is added

1991 is a <class 'str'>
1.88 is a <class 'str'>

2020 is a <class 'int'>
12 is a <class 'int'>

56.78 is a <class 'float'>
1770.0 is a <class 'float'>
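
The conversion from floating point to integer deserves a closer look. The following sketch is not part of the original cells: it uses the built-in function `round()` and a `try`/`except` block, neither of which has been introduced so far, only to show that `int()` cuts off the decimals instead of rounding, and that a string containing decimals cannot be converted to an integer in a single step.

```python
# int() discards the decimal part (it truncates toward zero); it does not round
truncated = int(12.99)
rounded = round(12.99)      # round() returns the nearest integer instead
negative = int(-12.99)

print(truncated, rounded, negative)

# A string that contains decimals cannot be converted directly to an integer
try:
    int("12.34")
except ValueError as error:
    print("ValueError:", error)

# Converting first to floating point, and then to integer, does work
two_steps = int(float("12.34"))
print(two_steps)
```

If you need the closest whole number rather than the truncated one, `round()` is probably what you want.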


To define a string you can use either double quotation marks `"..."` or single quotation marks `'...'`, as long as the opening and closing quotation marks are of the same type.

In [5]:
print("I like double quotation marks.")
print('But I prefer single quotation marks.')

I like double quotation marks.
But I prefer single quotation marks.


To print quotation marks of a given type, you can enclose them within quotation marks of the other type.

In [6]:
print('I like "double" quotation marks.')
print("But I prefer 'single' quotation marks.")

I like "double" quotation marks.
But I prefer 'single' quotation marks.


There are some special characters, known as **escape sequences**, always preceded by a backslash `\`:

In [7]:
# '\n' is used for printing new lines
print("Freude, schöner Götterfunken,\nTochter aus Elisium,\nWir betreten feuertrunken\nHimmlische, dein Heiligthum.")
print()

# '\t' is used for tabulation
print("I start at the left margin.")
print("\tAnd I start eight blank spaces to the right.")
print()

# "\"" and '\'' are used for printing quotation marks of the same type as those enclosing them
print("I like \"double\" quotation marks.")
print('But I prefer \'single\' quotation marks.')
print()

# '\\' is used for printing backslash
print("My code is saved in Documents\\CompMethEthno\\Code")

Freude, schöner Götterfunken,
Tochter aus Elisium,
Wir betreten feuertrunken
Himmlische, dein Heiligthum.

I start at the left margin.
	And I start eight blank spaces to the right.

I like "double" quotation marks.
But I prefer 'single' quotation marks.

My code is saved in Documents\CompMethEthno\Code
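
Although an escape sequence is typed with two symbols, Python stores it as a single character. The short sketch below is an addition to the original cells and checks this with the built-in function `len()`, which returns the number of characters in a string. It also shows a raw string (prefix `r`), an alternative way of writing backslashes that is not covered elsewhere in this notebook.

```python
# An escape sequence is written with two symbols, but it counts as one character
newline = "\n"
print(len(newline))

line = "Do\tRe"     # five characters: D, o, tab, R, e
print(len(line))

# In a raw string (prefix r), backslashes are kept exactly as they are written
raw_path = r"Documents\CompMethEthno\Code"
escaped_path = "Documents\\CompMethEthno\\Code"
print(raw_path == escaped_path)
```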


Strings can be concatenated using the operator `+`, resulting in a new string.

In [8]:
w1 = 'Beet'
w2 = 'hoven'

print('Hello ' + w1 + w2 + '!') # Notice that I added a space after 'Hello'

Hello Beethoven!


Only strings can be concatenated with this method. To concatenate strings with other data types, the function `str()` can be used.

⇒ **Note**: In the following cells, the variables `name` and `birth` defined in the second cell of this notebook are used. Make sure that you executed that cell.

In [9]:
print(name + ' was born in ' + str(birth)) # Notice the spaces before and after 'was born in'

James was born in 1991
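
To see why `str()` is needed, here is a small sketch of what happens without it. The variables `name` and `birth` are defined again so that the cell can run on its own, and the `try`/`except` block (not introduced in this notebook) is only there to catch the error and print it instead of stopping the execution.

```python
name = "James"
birth = 1991

# Concatenating a string with an integer raises a TypeError
try:
    message = name + ' was born in ' + birth
except TypeError as error:
    message = None
    print("TypeError:", error)

# Converting the integer with str() solves the problem
fixed = name + ' was born in ' + str(birth)
print(fixed)
```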


Using the function `print()`, different elements can be printed at the same time by separating them with commas. In the output, `print()` inserts a blank space between them. With this method, different data types can be printed in a single command.

In [10]:
print(name, birth)

James 1991
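
The blank space between the elements is only the default behavior. As a brief illustration that goes slightly beyond the original cells, the optional arguments `sep` and `end` of `print()` can change what is written between the elements and at the end of the line. The variables `name` and `birth` are defined again so that the cell can run on its own.

```python
name = "James"
birth = 1991

# sep replaces the blank space that print() inserts between the elements
print(name, birth, sep=", ")

# end replaces the new line that print() adds after the last element
print(name, end=" ")
print(birth)
```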


By now, you might have already realized that Python offers different ways for achieving the same output. Using one or the other depends on your needs, or even on your style of coding.

In [11]:
print('{} was born in {}'.format(name, birth))
print(name + ' was born in ' + str(birth))
print(name, 'was born in', birth)      # Here the spaces are not needed, they are added for each comma

James was born in 1991
James was born in 1991
James was born in 1991
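
One more option, not used in the rest of this notebook, is the formatted string literal or "f-string", available from Python 3.6 on. It works like `.format()`, but the variables are written directly inside the curly brackets. The sketch below defines `name` and `birth` again so that it can run on its own.

```python
name = "James"
birth = 1991

# The prefix f allows variables and expressions inside the curly brackets
message = f"{name} was born in {birth}"
print(message)

# Format specifications such as :.2f work in the same way as with .format()
martian = f"{(2020 - birth) / 1.88:.2f}"
print(f"If {name} were Martian, he would be {martian} years old.")
```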


## Indexing and slicing
Strings are ordered sequences of characters, and the position of each character in the sequence is indexed with a number. In Python, **indexing starts at 0**. This means that the first character of a string is in the position with index 0, the second character has index 1, and so on.

For example, let's consider the string `"Hello world!"`. The indexing of this string would be like this:

|string|`H`|`e`|`l`|`l`|`o`|` `|`w`|`o`|`r`|`l`|`d`|`!`|
|------|---|---|---|---|---|---|---|---|---|---|---|---|
|index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11|

⇒ **Note**: the quotation marks are ignored for indexing, they are just used to indicate that whatever is enclosed by them is a string.

We can access the character in a given position by using its index in square brackets `[ ]` after the string.

In [12]:
phrase = 'Hello world!'

print(phrase[0])
print(phrase[6])
print(phrase[11])

H
w
!
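
Because indexing starts at 0, the last valid index is always one less than the number of characters. The following sketch is an addition to the original cells. It uses the built-in function `len()` to count the characters, and a `try`/`except` block (not introduced in this notebook) to show the error raised by an index that is too large.

```python
phrase = 'Hello world!'

# len() returns the number of characters in the string
length = len(phrase)
print(length)

# The last character is at index length - 1, not at index length
last = phrase[length - 1]
print(last)

# An index beyond the end of the string raises an IndexError
try:
    phrase[length]
except IndexError as error:
    print("IndexError:", error)
```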


Indexes are also assigned in backward order, using negative numbers, in this way:

|string|`H`|`e`|`l`|`l`|`o`|` `|`w`|`o`|`r`|`l`|`d`|`!`|
|------|---|---|---|---|---|---|---|---|---|---|---|---|
|index |-12|-11|-10| -9| -8| -7| -6| -5| -4| -3| -2| -1|

In [13]:
print(phrase[2] + phrase[4] + phrase[-3] + phrase[-1])

lol!
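
Positive and negative indexes are two ways of pointing at the same positions. As a small check (an addition to the original cells, assuming the built-in function `len()` to count the characters), a negative index `-k` points at the same character as the positive index `len(phrase) - k`.

```python
phrase = 'Hello world!'

# -1 and len(phrase) - 1 both point at the last character
print(phrase[-1], phrase[len(phrase) - 1])

# -3 and len(phrase) - 3 both point at the same 'l'
print(phrase[-3], phrase[len(phrase) - 3])

# -12 and 0 both point at the first character
first_negative = phrase[-len(phrase)]
first_positive = phrase[0]
print(first_negative, first_positive)
```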


Two indexes can be combined, separated by a colon (`:`), to **slice** the given string. The slice starts at the first index, and **ends at the position previous to the second index**.

So, let's say that we want to return the slice `ello` from the phrase `Hello world!`. These characters correspond to the indexes `1`, `2`, `3` and `4`. Therefore, the indexes that we need to use for slicing are `1` and `5`, because the slice starts at the first index (`1`) and ends at the position previous to the second index (if we give `5`, it ends at `4`).

In [14]:
print(phrase[1:5])

ello
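
A practical consequence of this rule is that the number of characters in a slice equals the second index minus the first one. The sketch below is an addition to the original cells: it checks this with the built-in function `len()`, and it also shows that negative indexes can be used for slicing in the same way.

```python
phrase = 'Hello world!'

# The slice 1:5 contains 5 - 1 = 4 characters
piece = phrase[1:5]
print(piece, len(piece))

# Negative indexes point at the same positions: -11 is 1, and -7 is 5
same_piece = phrase[-11:-7]
mixed_piece = phrase[1:-7]
print(same_piece, mixed_piece)
```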


When slicing, if we omit the first index, the returned slice starts from the beginning. If the second index is omitted, the slice ends at the end.

In [15]:
# To return 'Hello'
print(phrase[:5])

# To return 'world!'
print(phrase[6:])

Hello
world!
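
Since the second index is excluded from the slice, cutting a string at the same index from both sides gives two pieces that do not overlap and do not lose any character. The following sketch is an addition to the original cells. It illustrates this, and it also shows that a slice, unlike a single index, does not raise an error when an index goes beyond the end of the string.

```python
phrase = 'Hello world!'

# Both slices use the index 5: the first ends before it, the second starts at it
left = phrase[:5]
right = phrase[5:]
print(left + right == phrase)

# Omitting both indexes returns the whole string
whole = phrase[:]
print(whole)

# An index that is too large is simply cut at the end of the string
tail = phrase[6:100]
print(tail)
```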


## Exercises
### Exercise 1. Debugging
When code has a problem, or doesn't work properly, programmers say that it has "bugs." And I can assure you that when you start coding, your code is going to have bugs. So, an important task for all programmers is "debugging," that is, fixing problems and correcting code. It is good to get used to debugging from the beginning.

The following piece of code is full of bugs. It will return error messages until everything is solved. These error messages are quite cryptic, but they actually help a lot. Start reading the error messages from the last line. This last line tells you the type of error and what the error is. For example, if you run the following cell as it is, you will get a `SyntaxError` (look at the last line). Read what it tells you about that error. Then, if you read the error message from the beginning, you will find the line in which the error occurred, in this case `line 5`. The error message will even try to tell you the exact position where the error is located, by copying the bugged line and marking the exact position with the symbol `^` underneath. These indications of the line and the exact location of the bug might not always be accurate (in fact, that is the case with this first error message), because the way the computer processes the code is not exactly the way we read it. But in any case, it is always good to get as much help as possible from these error messages.

So, can you fully debug this code?

In [19]:
animal = "lamb"

print ("Mary had a little {},".format(animal))
print ("\tIts fleece was white as snow,")
print ("And every where that Mary went")
print ("\tThe"  + animal + "was sure to go;")
print ("He followed her to school one day—")
print ("\tThat was against the rule,")
print ("It made the children laugh and play,")
print ("\tTo see a {} at school.".format(animal))

Mary had a little lamb,
	Its fleece was white as snow,
And every where that Mary went
	Thelambwas sure to go;
He followed her to school one day—
	That was against the rule,
It made the children laugh and play,
	To see a lamb at school.


### Exercise 2. More debugging
Sorry, this code also has a few bugs. Please try to debug it. Once you get the code running, read the output carefully. You might want to fix a few more things.

In [24]:
name = "John"
height = 178

print("I have a friend called " + name)
print("He is", height, "centimetres tall.")
print("However his friend from the USA don't understand the metric system.")

inches = height / 2.54

print("Talking to him, {} is {:.2f} inches tall.".format(name, inches))
print('But height is usually expressed in feet.')
print('So name is {:.2f} feet and {:.2f} inches tall.'.format(inches / 12, inches % 12))

I have a friend called John
He is 178 centimetres tall.
However his friend from the USA don't understand the metric system.
Talking to him, John is 70.08 inches tall.
But height is usually expressed in feet.
So name is 5.84 feet and 10.08 inches tall.


### Exercise 3. Let's play with slicing!
This is just a game. You have to print the phrases I give you in comments, but **only using slices** from the given string. That is, you are not allowed to add text or characters that are not already present in the given string.

The next cell is just an example for you to understand the exercise. You start playing in the following one.

In [25]:
course = "Computational Methods in Ethnomusicology"

# music
print(course[30:-5])

# Methodology
print(course[14:20]+course[-5:])

# My mom is cool
print(course[14]+course[-1], course[2]+course[1:3], course[-7]+course[-8], course[-6:-4]+course[-5:-3])

music
Methodology
My mom is cool


Now your turn!

In [82]:
verse = "If music be the food of love, play on (Shakespeare)"

# pear
print(verse[-6:-2])

# velofood
print(verse[26:28]+verse[24:26]+verse[16:20])

# move on
print (verse[3]+verse[25:28], verse[-16:-14])

# Shake the fool
print(verse[-12:-7],  verse[12:15], verse[16:19]+verse[24])

# I love food (Shakespeare)
print(verse[0], verse[24:28],verse[16:20], verse[-13:51])

pear
velofood
move on
Shake the fool
I love food (Shakespeare)
