# Appendix 1: Strings

## Advanced string operations

### Concatenation: `+`

For strings, the `+` operator performs concatenation: it joins multiple strings together.

In [None]:
word1 = 'Hello'
word2 = 'Python'
greet = word1 + ' ' + word2 + '!'
print(greet)
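
One detail worth keeping in mind is that `+` only joins strings with other strings. The sketch below uses hypothetical variables (`language`, `year`) that are not part of the examples above, and shows that a number has to be converted with `str()` before it can be concatenated.

```python
# Hypothetical values, chosen only for illustration.
language = 'Python'
year = 1991

# language + year would raise a TypeError, so the number is converted first.
sentence = language + ' first appeared in ' + str(year) + '.'
print(sentence)
```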

### Multiplication: `*`

The `*` operator "multiplies" a string: it repeats the string the given number of times and concatenates the copies.

In [None]:
greet3times = greet * 3
print(greet3times)
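
A common use of string multiplication is building separator lines. The following is a minimal sketch with made-up variable names (`title`, `underline`, `nothing`):

```python
title = 'Strings'

# Repeat a single character to draw a line under the title.
underline = '-' * 7
print(title)
print(underline)

# Multiplying by zero yields an empty string.
nothing = 'abc' * 0
print(repr(nothing))
```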

### Length: `len()`

The `len()` function returns the length of the string, that is, the number of characters in it.

In [None]:
print(len(greet))

### String indexing and slicing: `[]`

A single character of a string can be accessed by indexing it, starting from zero:

In [None]:
print(greet[0])

*Question:* what will happen if we index with a negative number?

In [None]:
print(greet[-1])
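
To make the behaviour of negative indices explicit, the sketch below (using its own `text` variable, so it can be run independently) compares a negative index with the equivalent non-negative one.

```python
text = 'Hello Python!'

# A negative index counts from the end: -1 is the last character.
last = text[-1]

# text[-k] refers to the same character as text[len(text) - k].
same_as_last = text[len(text) - 1]
print(last, same_as_last)

# Seven characters from the end is where 'Python!' starts.
print(text[-7])
```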

*Question:* what will happen if we index with a number larger than the length of the string?

In [None]:
print(greet[100])
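
If the program should keep running after such an access, the error can be caught. This is only a sketch, assuming that handling the error with `try`/`except` is acceptable at this point; the `text` and `message` names are illustrative.

```python
text = 'Hello Python!'

try:
    print(text[100])
except IndexError as error:
    # Indexing past the end raises an IndexError instead of returning a value.
    message = str(error)
    print('IndexError:', message)
```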

We can also create substrings by fetching a slice of a string.  
Note that the end index is exclusive, so if the slice is given as `[4:6]`, then the characters at index 4 and 5 will be sliced.

In [None]:
print(greet[0:5])
print(greet[6:7])
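
The `[4:6]` case mentioned above can be tried directly. The sketch below uses its own `text` variable and also illustrates a consequence of the exclusive end index: two adjacent slices join back into the original string without overlap.

```python
text = 'Hello Python!'

part = text[4:6]      # the characters at index 4 and 5
print(repr(part))     # repr() makes the space visible

# Because the end index is exclusive, the slice length is end - start.
print(len(part))

# Adjacent slices share the boundary index without duplicating a character.
rejoined = text[0:6] + text[6:13]
print(rejoined == text)
```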

The first (start) index can be omitted; by default it is zero:

In [None]:
print(greet[:5])

The second (end) index can also be omitted; by default it is the end of the string:

In [None]:
print(greet[6:])

*Question:* what happens if we omit both the start and the end index?

In [None]:
print(greet[:])

*Question:* what happens if we use negative indices?

In [None]:
print(greet[-7:])
print(greet[1:-2])

*Question:* what happens if the end index is larger than the length of the string?

In [None]:
print(greet[6:100])
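
In contrast to indexing a single character, slicing does not raise an error for out-of-range bounds. A short sketch with an illustrative `text` variable:

```python
text = 'Hello Python!'

# The end index is silently limited to the length of the string.
tail = text[6:100]
print(tail)

# A slice that lies completely outside the string is simply empty.
outside = text[50:100]
print(repr(outside))
```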

## Built-in string functions

A comprehensive list of the built-in string methods can be found in the [String Methods](https://docs.python.org/3/library/stdtypes.html#string-methods) section of the reference documentation.

These string functions are *methods*, which means they are called on a string instance (value or variable) in the form `stringvar.method(parameters)`.
They do not modify the original string, but return a new instance.
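
The following sketch illustrates this with `upper` (described below) and hypothetical variable names: the original string is left untouched, so the result has to be stored if it is needed later.

```python
original = 'Hello Python!'

# The method returns a new string; 'original' is not changed.
shouted = original.upper()
print(original)
print(shouted)

# To "update" a variable, assign the result back to it.
updated = original
updated = updated.upper()
print(updated)
```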

### Lowercase: `lower`

Converts all letters to lowercase.

In [None]:
print(greet)
greet_lower = greet.lower()
print(greet_lower)

### Uppercase: `upper`

Converts all letters to uppercase.

In [None]:
print(greet)
greet_upper = greet.upper()
print(greet_upper)

### Capitalization: `capitalize` and `title`

`capitalize` converts the very first letter of the string to uppercase, while `title` converts the first letter of each word to uppercase. In both cases, the remaining letters are turned to lowercase.

In [None]:
print(greet_lower)
greet_capital = greet_lower.capitalize()
print(greet_capital)

greet_title = greet_lower.title()
print(greet_title)
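
The effect on the remaining letters is easier to see with mixed-case input. The sketch below uses a made-up `messy` string for this purpose.

```python
messy = 'hELLO pYTHON!'

# Only the very first letter becomes uppercase, everything else lowercase.
capitalized = messy.capitalize()
print(capitalized)

# The first letter of each word becomes uppercase, the rest lowercase.
titled = messy.title()
print(titled)
```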

### Substring search: `find`

Looks up the first occurrence of a character or a substring in a string. The result is the starting index position of the first occurrence as an integer (`int`). Keep in mind that the first index is `0`! The returned value is `-1` if the substring was not found.

In [None]:
print(greet)
location = greet.find('Python')
print(location)

print(greet)
location = greet.find('java')
print(location)

The starting index of the search can also be passed to the function.
This way, multiple occurrences of a substring can be looked up.

In [None]:
print(greet3times)
location = greet3times.find('Python')
print(location)

location = greet3times.find('Python', location + 1)
print(location)
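
Repeating this step in a loop collects every occurrence. The following is a minimal sketch, assuming a `while` loop and a list are acceptable here; `text` and `positions` are illustrative names.

```python
text = 'Hello Python!' * 3

positions = []
location = text.find('Python')
while location != -1:
    positions.append(location)
    # Continue the search right after the previous match.
    location = text.find('Python', location + 1)

print(positions)
```

The loop stops as soon as `find` returns `-1`, that is, when no further occurrence exists.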

This function is case-sensitive.  
If you would like to match both lowercase and uppercase variants, you may convert the string to lowercase first!

In [None]:
print(greet)
location = greet.find('python')
print(location)

print(greet.lower())
location = greet.lower().find('python')
print(location)

### Substring replace: `replace`

Replaces **all** occurrences of a substring with another substring.

This function is also case-sensitive.

In [None]:
greet_alternative = greet3times.replace('Hello', 'Hi')
print(greet_alternative)
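
Two further details can be sketched with a separate `text` variable: a search term with different casing does not match at all, and the optional third argument of `replace` limits how many occurrences are replaced.

```python
text = 'Hello Python! Hello Python!'

# Case-sensitive: 'hello' does not occur, so nothing is replaced.
unchanged = text.replace('hello', 'Hi')
print(unchanged)

# The optional third argument limits the number of replacements.
first_only = text.replace('Hello', 'Hi', 1)
print(first_only)
```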

### Stripping: `lstrip`, `rstrip`, `strip`

All three functions trim unwanted whitespace characters (spaces, tabs, newlines) from a string.
* `lstrip` - removes whitespace characters from the left-hand side.
* `rstrip` - removes whitespace characters from the right-hand side.
* `strip` - removes whitespace characters from both sides.

In [None]:
greet_world = '   --== Hello World  ==-- '
print(greet_world.lstrip())
print(greet_world.rstrip())
print(greet_world.strip())

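The characters to remove can also be specified explicitly; the argument is treated as a set of characters, and any combination of them is removed from both ends: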

In [None]:
print(greet_world.strip(' -='))
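
The sketch below, with an illustrative `decorated` variable, shows that the order of the characters in the argument does not matter and that characters inside the string are never removed.

```python
decorated = '   --== Hello World  ==-- '

# Any mix of spaces, dashes and equal signs is removed from both ends.
cleaned = decorated.strip(' -=')
print(repr(cleaned))

# The argument is a set of characters, so its order is irrelevant.
same = decorated.strip('=- ')
print(cleaned == same)

# lstrip and rstrip accept the same kind of argument.
left_only = decorated.lstrip(' -=')
print(repr(left_only))
```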

### Prefix and suffix check: `startswith`, `endswith`

These functions verify whether a string starts or ends with the given substring. The result is a *boolean* value (`True` or `False`).

These functions are also case-sensitive.

In [None]:
print(greet.startswith('Hello'))
print(greet.startswith('Hi'))
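
`endswith` works the same way on the other end of the string. A short sketch with its own `text` variable:

```python
text = 'Hello Python!'

ends_with_mark = text.endswith('!')
print(ends_with_mark)

# False: the string ends with '!', not with 'Python'.
ends_with_word = text.endswith('Python')
print(ends_with_word)

# False: the check is case-sensitive.
starts_lower = text.startswith('hello')
print(starts_lower)
```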

### Splitting: `split`

Splits a string into a list of substrings at each occurrence of a so-called *separator* or *delimiter* character or string.
The *separator* is not included in the resulting substrings.

In [None]:
print(greet3times)
words = greet3times.split('!')
print(words)

*Question:* why is there an empty string at the end of the result list?
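
As a separate illustration, `split` is often combined with `strip` when processing comma-separated input. The sketch below uses a made-up `line` value; only the separator is removed by `split`, so surrounding spaces have to be trimmed from each part afterwards.

```python
line = 'red, green ,blue'

# Splitting removes the commas but keeps the surrounding spaces.
raw_parts = line.split(',')
print(raw_parts)

# Trim each part individually.
clean_parts = []
for part in raw_parts:
    clean_parts.append(part.strip())
print(clean_parts)
```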

## Logical operations on strings

### Containment check: `in`

Verifies whether a letter or a substring occurs *anywhere* inside a string.
The result is a *boolean* value (`True` or `False`).

In [None]:
print('p' in greet)
print('P' in greet)
print('Python' in greet)

In [None]:
if 'P' in greet:
    print('Contains a letter P!')
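
Since the containment check is case-sensitive, a case-insensitive variant can be sketched by converting the string to lowercase first. The variable names here are illustrative.

```python
text = 'Hello Python!'

# False: the string contains 'Python', not 'python'.
exact = 'python' in text
print(exact)

# True: both sides are lowercase now.
ignoring_case = 'python' in text.lower()
print(ignoring_case)

# 'not in' is the negated form of the check.
missing = 'Java' not in text
print(missing)
```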

### Equality check: `==`

Perform a case-sensitive equality check between two strings.

In [None]:
if word2 == 'Python':
    print('It was Python.')
else:
    print('It was not Python.')
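
When user input is compared, differences in casing or stray spaces are often irrelevant. A minimal sketch, assuming a hypothetical `answer` value, normalizes the string before the comparison:

```python
answer = ' PYTHON '

# The raw comparison fails because of the spaces and the casing.
raw_match = answer == 'python'
print(raw_match)

# Trim and lowercase first, then compare.
normalized_match = answer.strip().lower() == 'python'
print(normalized_match)
```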

---

## Summary exercise on strings

**Task:** request the name, birth year, email address and spoken languages of the user.
The spoken languages are requested as a string, separated by commas.

Check whether the following validation rules are satisfied. If any of the data is invalid, display an error message and ask the user to enter the data again.
 - The name must contain at least 2 parts. (There must be a space inside it.)
 - The birth year must be a number, between 1900 and 2019.
 - The email address must contain an `@` character and must end with the `elte.hu` domain.


Once the data has been entered successfully, trim any unnecessary whitespace and display it in a corrected format:
 - The name shall be displayed with each part starting with a capital letter.
 - Besides the birth year, display the calculated (possible) age of the user.
 - The email address shall be lowercase.
 - The spoken languages shall be displayed as a list of languages instead of a single string.
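
As a starting point, one of the rules could be checked as sketched below. This is not a full solution: the function name `parse_birth_year`, its parameters, and the assumed current year of 2019 (taken from the upper bound in the task) are all hypothetical choices, and the remaining rules are left as the exercise.

```python
def parse_birth_year(text, min_year=1900, max_year=2019):
    """Return the birth year as an int, or None if the input is invalid."""
    try:
        # int() raises a ValueError if the text is not a whole number.
        year = int(text.strip())
    except ValueError:
        return None
    if min_year <= year <= max_year:
        return year
    return None


# Assumption: the current year is 2019, matching the upper bound of the task.
current_year = 2019

year = parse_birth_year(' 1990 ')
if year is None:
    print('Invalid birth year, please try again.')
else:
    print('Birth year:', year, '- possible age:', current_year - year)
```

In a complete program, the call would be placed in a loop that repeats the `input()` prompt until the function returns a valid year.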