# Silly Strings


# What are strings?

So far, we have used string, integer, float, and boolean values.

```python
bank_name = 'JSeaMorgan Chase & Co.'
principal_amount = 1815
introductory_bonus = 0
annual_interest_rate = 2.9

while True:
    ...
```

But strings are not like integers, floats, and booleans! Each integer, float, or boolean is a single value. A string, however, is a **sequence** (an ordered collection) of values: characters, to be precise!

Lists are also sequences. In many ways, strings behave like lists whose elements are characters.

<div class="row">
    <img src="state-diagram-of-list.png" alt="State diagram of list" style="width: 50%" class="column">
    <img src="state-diagram-of-string.png" alt="State diagram of string" style="width: 50%" class="column">
</div>
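
To make the comparison concrete, here is a small sketch showing that a string supports the same basic sequence operations as a list of characters. The variable names are made up for illustration.

```python
# A string and a list of characters support the same sequence operations.
cheese = 'Brie'
cheese_letters = ['B', 'r', 'i', 'e']

# Both are ordered, so each element has a position (index).
first_from_string = cheese[0]
first_from_list = cheese_letters[0]
print(first_from_string, first_from_list)

# Both have a length.
print(len(cheese), len(cheese_letters))

# Converting a string to a list yields its characters in order.
print(list(cheese) == cheese_letters)
```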

## Strings represent textual data

## Web applications

- Email
- Password
- Name
- Posts
- Comments

```python
email = 'ykim@allegheny.edu'
password = 'illnevertell'
name = 'Maria'
posts = [
    'Today, I went biking.',
    'Look at my salad!',
    'Feeling cute, might delete later.'
]
comments = ['Good luck!', "Let's catch up sometime :)"]
```

## Computational biology

Genomic sequence!

```python
normal = 'ATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACT'
sample = 'ATGGTGCACCTGACTCCTGTGGAGAAGTCTGCCGTTACT'
```

## How to create a string

Surround text in quotes (single `'` or double `"`--choose your own adventure!).

In [None]:
cheese = 'Cheddar'
fruit = "apple"
lyrics = '''I'm a lumberjack, and I'm okay.
    I sleep all night and I work all day.''' # What am I?
print(lyrics)

## Sometimes, you want to insert dynamic values into static strings

```python
# Remember me?
print(f'''
Bank: {bank_name}
Principal amount: ${principal_amount}
Introductory bonus: ${introductory_bonus}
Annual interest rate: {annual_interest_rate}%
Time: {time} years
Accrued amount: ${accrued_amount}
''')
```

## How to create a format string

1. Start the string with the letter `f`
2. Then, create the string as usual
3. Within the string, surround expressions that evaluate to the dynamic values you want to insert with curly braces `{}`

In [None]:
profession = 'programmer'
lyrics = f'''I'm a {profession}, and I'm okay.
I sleep all night and I work all day.'''
print(lyrics)
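
The curly braces can hold any expression, not just a variable name. The sketch below assumes simple (non-compounding) interest and a made-up value for `time`. The `:.2f` format specifier, which rounds to two decimal places, is standard Python but has not been introduced in this module.

```python
principal_amount = 1815
annual_interest_rate = 2.9
time = 2  # Years; a made-up value for illustration

# Expressions inside {} are evaluated before being inserted.
# Assumes simple interest: principal * rate * time.
summary = f'Interest after {time} years: ${principal_amount * annual_interest_rate / 100 * time:.2f}'
print(summary)
```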

# Accessing a character, or characters, in a string

Sometimes, you want to get a subsection, such as a character or characters, of a string, but not the whole string. A subsection of a string is called a **substring**.

## When would you want to access a substring?


To check for known point mutations:

![Sickle cell anemia mutation](sickle-cell-anemia-mutation.png)

## How to access a character in a string

Use the bracket operator `[]` to specify the index of the character.

The first character of a string is at index 0.

In [None]:
normal = 'ATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACT'
nucleotide = normal[19]
print(f'The 20th nucleotide is {nucleotide}.') # Why 20th?

In [None]:
normal = 'ATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACT'
sample = 'ATGGTGCACCTGACTCCTGTGGAGAAGTCTGCCGTTACT'

# Does patient have sickle-cell anemia?

mutation_location = 19
if sample[mutation_location] != normal[mutation_location]: # Index can be any expression that evaluates to an integer
    print('Sickle-cell mutation found.')
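
If the location of a mutation is not known in advance, one option is to compare the two sequences position by position. The following is a minimal sketch: `find_mismatches` is a hypothetical helper, not part of any library, and it assumes both sequences have the same length.

```python
def find_mismatches(reference, candidate):
    """Return the indices at which two equal-length strings differ."""
    mismatches = []
    for index in range(len(reference)):
        if reference[index] != candidate[index]:
            mismatches.append(index)
    return mismatches


normal = 'ATGGTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACT'
sample = 'ATGGTGCACCTGACTCCTGTGGAGAAGTCTGCCGTTACT'
print(find_mismatches(normal, sample))
```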

## How to access characters in a string

You can access multiple characters at once by using the slice operator—`[n:m]`.

The slice operator returns the characters from index `n` up to, but not including, index `m`.

In [None]:
title = "Monty Python"
print(title[0:5])
print(title[6:12])
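
Either bound of a slice can be left out: omitting `n` starts at the beginning of the string, and omitting `m` runs to the end. A short sketch using the same `title`:

```python
title = "Monty Python"

first_word = title[:5]   # Same as title[0:5]
second_word = title[6:]  # Same as title[6:len(title)]
print(first_word)
print(second_word)

# The slice stops *before* index m, so m - n characters are returned.
middle = title[2:7]
print(len(middle))
```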

# Strings are immutable

In the Love It or List It module, we learned that you can assign an element, or elements, in a list using bracket notation:

In [None]:
maria_lucky_numbers = [7, 128, 23]
print(maria_lucky_numbers)
maria_lucky_numbers[0] = 17
print(maria_lucky_numbers)

So, can we assign a character in a string using bracket notation?

In [None]:
fruit = 'apple'
print(fruit)
fruit[0] = 'y' # Raises TypeError: 'str' object does not support item assignment
print(fruit) # Never reached

No! Strings are **immutable**: once created, they cannot be changed.

<img src="immutable-string.png" width="600" class="center">

So, what are we to do to get a string with the value `'ypple'`? We simply have to create a new one.

In [None]:
fruit = 'apple'
print(fruit)
fruit = 'y' + fruit[1:] # Slice makes a copy
print(fruit)

<img src="immutable-string-solution.png" width="600" class="center">
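
The same idea works for any position, not just the first. Below is a sketch of a hypothetical helper, `replace_char`, that builds a new string from two slices and the replacement character. The original string is left untouched.

```python
def replace_char(text, index, new_char):
    """Return a copy of text with the character at index replaced."""
    # Everything before index + the new character + everything after index
    return text[:index] + new_char + text[index + 1:]


fruit = 'apple'
new_fruit = replace_char(fruit, 0, 'y')
print(new_fruit)
print(fruit)  # The original string is unchanged

# Attempting item assignment raises a TypeError.
try:
    fruit[0] = 'y'
except TypeError as error:
    print(f'TypeError: {error}')
```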


## But... why?

- Performance: Because a string cannot change, Python can allocate space for it when it is created; its storage requirements are fixed

- Performance: You can safely reuse string objects (because you know they won't change!)

In [None]:
# Sharing is caring
my_fruit = 'apple'
your_fruit = 'apple'
print(my_fruit is your_fruit) # True in CPython, which reuses one object for identical string literals

- Philosophy: Strings in Python are considered as "elemental" as numbers. "No amount of activity will change the value 8 to anything else, and in Python, no amount of activity will change the string 'eight' to anything else."
- History: The creator of Python (Guido van Rossum) designed it that way

# Getting the length of a string


## When do you need the length of a string?

Password validation!

![Password validation](password-validation.png)

## How to get the length of a string

Use the `len` function.

In [None]:
password = input('Enter a password: ')
if len(password) < 8:
    print('Password must be at least 8 characters long.')
else:
    print('Password is valid!')
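
Wrapping the check in a function makes it easier to reuse and to try out without typing input. This is only a sketch: `is_long_enough` and `MIN_PASSWORD_LENGTH` are hypothetical names, and real password rules usually involve more than length.

```python
MIN_PASSWORD_LENGTH = 8  # Assumed minimum, matching the example above


def is_long_enough(password):
    """Return True if password has at least MIN_PASSWORD_LENGTH characters."""
    return len(password) >= MIN_PASSWORD_LENGTH


print(is_long_enough('illnevertell'))
print(is_long_enough('secret'))
```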

# Traversing a string

## When do you want to traverse a string?

Search engines

- Searching problem: Does a website's content contain the query? Return result
- Counting problem: How many times does a website's content contain the query? Prioritize result

## How to traverse a string

1. `while` loop
2. `for` loop

### `while` loop

In [None]:
query = 'p'
content = '''A puppy is a juvenile dog.
A puppy's coat color may change as the puppy grows older,
as is commonly seen in breeds such as the Yorkshire Terrier.'''

index = 0
while index < len(content):
    print(content[index])
    if content[index] == query:
        print('Found a match!')
        break # Why is breaking efficient?
    index += 1

In [None]:
query = 'puppy'
content = '''A puppy is a juvenile dog.
A puppy's coat color may change as the puppy grows older,
as is commonly seen in breeds such as the Yorkshire Terrier.'''

index = 0
while index < len(content) - len(query) + 1:
    word = content[index:index + len(query)]
    print(word)
    if word == query:
        print('Found a match!')
        break
    index += 1

### `for` loop

In [None]:
query = 'puppy'
content = '''A puppy is a juvenile dog.
A puppy's coat color may change as the puppy grows older,
as is commonly seen in breeds such as the Yorkshire Terrier.'''

for index in range(len(content) - len(query) + 1):
    word = content[index:index + len(query)]
    print(word)
    if word == query:
        print('Found a match!')
        break

This pattern of traversing a string and returning (or breaking!) when we find what we are looking for is called a **search**.

The pattern of traversing a string and counting how many times we find what we are looking for is called a **count**.

In [None]:
query = 'puppy'
content = '''A puppy is a juvenile dog.
A puppy's coat color may change as the puppy grows older,
as is commonly seen in breeds such as the Yorkshire Terrier.'''

count = 0 # Initialize counter
for index in range(len(content) - len(query) + 1):
    word = content[index:index + len(query)]
    if word == query:
        print('Found a match!')
        count += 1 # Increment counter
print(f'Number of matches: {count}')
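
Python strings also have a built-in `count` method. One difference worth knowing: `str.count` counts non-overlapping matches, while the traversal above checks every starting index and therefore counts overlapping matches too. The sketch below wraps the traversal in a hypothetical function, `count_matches`, to compare the two.

```python
def count_matches(content, query):
    """Count matches of query in content, including overlapping ones."""
    count = 0
    for index in range(len(content) - len(query) + 1):
        if content[index:index + len(query)] == query:
            count += 1
    return count


content = '''A puppy is a juvenile dog.
A puppy's coat color may change as the puppy grows older,
as is commonly seen in breeds such as the Yorkshire Terrier.'''

# No overlaps here, so both approaches agree.
print(count_matches(content, 'puppy'), content.count('puppy'))

# Overlapping matches: 'aa' starts at index 0 and at index 1 of 'aaa'.
print(count_matches('aaa', 'aa'), 'aaa'.count('aa'))
```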

# String methods

## What's the difference between methods and functions?

A method *is* a function!

```python
fruits = ['apple', 'banana', 'cantaloupe']
```

- Called by naming it and following the name with parentheses: `fruits.append('peach')`
- Can take input (arguments) and give output (return value): `fruits.pop() # Deletes and returns last element`

Specifically, methods are functions that *belong* to objects.

*Which* methods an object has depends on its type.

`fruits` has methods `append` and `pop` because it is a `list`.

A method is applied to the object it is called **on**.

You call a method **on** an object with dot notation: `<object>.<method>`.

`fruits.append('peach')` applies the `append` method on the `fruits` object, which is a list.
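
To see the difference in how they are called, compare a regular function such as `len` with the list methods mentioned above. A small sketch:

```python
fruits = ['apple', 'banana', 'cantaloupe']

# A function is called on its own; the object is passed as an argument.
number_of_fruits = len(fruits)

# A method is called on an object with dot notation.
fruits.append('peach')     # Mutates the list; returns None
last_fruit = fruits.pop()  # Removes and returns the last element

print(number_of_fruits)
print(last_fruit)
print(fruits)
```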

## `upper` method

In [None]:
fruit = 'banana'
louder_fruit = fruit.upper()
print(louder_fruit)

The `upper` method was called **on** the `fruit` object, which is a string. It returns the string with all characters in uppercase.

We know that `fruit` has an `upper` method because it is a string.

## `find` method

In [None]:
fruit = 'banana'
index = fruit.find('a')
print(index)

The `find` method was called **on** the `fruit` object, which is a string. It searches for the substring (`'a'`) within the string and returns the index of the first match.

We know that `fruit` has a `find` method because it is a string.

You can find the complete list of string methods in the Python documentation: http://docs.python.org/3/library/stdtypes.html#string-methods.
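
Two details of `find` are easy to miss: it returns `-1` when there is no match, and it accepts an optional second argument giving the index at which to start searching. A short sketch, which also uses the `lower` method (the counterpart of `upper`):

```python
fruit = 'banana'

first_a = fruit.find('a')                # Index of the first 'a'
second_a = fruit.find('a', first_a + 1)  # Start searching after the first match
missing = fruit.find('z')                # No match, so find returns -1

print(first_a, second_a, missing)

quieter_fruit = 'BANANA'.lower()
print(quieter_fruit)
```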

# The `in` operator

As with lists, the `in` operator tests membership. For strings, it takes two strings and returns `True` if the first is a substring of the second.

In [None]:
print('a' in 'banana')

In [None]:
query = 'puppy'
content = '''A puppy is a juvenile dog.
A puppy's coat color may change as the puppy grows older,
as is commonly seen in breeds such as the Yorkshire Terrier.'''
print(query in content)
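
One difference from lists is worth a sketch: on a list, `in` checks whether a value is one of the elements, whereas on a string it checks for a substring of any length. The variable names here are made up for illustration.

```python
letters = ['b', 'a', 'n', 'a', 'n', 'a']
word = 'banana'

element_in_list = 'a' in letters      # True: 'a' is an element of the list
chunk_in_list = 'nan' in letters      # False: no single element equals 'nan'
chunk_in_string = 'nan' in word       # True: 'nan' is a substring of 'banana'

print(element_in_list, chunk_in_list, chunk_in_string)
```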

## So, why go through all the trouble of traversing a string?

In [None]:
query = 'puppy'
content = '''A puppy is a juvenile dog.
A puppy's coat color may change as the puppy grows older,
as is commonly seen in breeds such as the Yorkshire Terrier.'''

print(f'Using in operator: {query in content}')

for index in range(len(content) - len(query) + 1):
    word = content[index:index + len(query)]
    if word == query:
        print('Using for loop: True')
        break

What if you want to require two matches?

You cannot do that with the `in` operator alone! Traversal to the rescue!

In [None]:
query = 'puppy'
content = '''A puppy is a juvenile dog.
A puppy's coat color may change as the puppy grows older,
as is commonly seen in breeds such as the Yorkshire Terrier.'''

count = 0
for index in range(len(content) - len(query) + 1):
    word = content[index:index + len(query)]
    if word == query:
        count += 1
        if count == 2:
            print('Found two matches!')
            break

# String comparison

Relational operators (e.g. `==`, `<`, `>`) work on strings as well!

In [None]:
ans = input('Answer "yes" or "no": ')
if ans == 'yes':
    print('Answered "yes"!')
elif ans == 'no':
    print('Answered "no"!')
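
The ordering operators compare strings character by character, using each character's numeric code (which the built-in `ord` function returns). The first position where the strings differ decides the result. A brief sketch:

```python
# 'apple' and 'apricot' share 'ap'; the third characters decide the order.
apple_first = 'apple' < 'apricot'
print(apple_first)
print(ord('p'), ord('r'))  # 'p' has a smaller code than 'r'

# If one string is a prefix of the other, the shorter one comes first.
prefix_first = 'app' < 'apple'
print(prefix_first)
```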

## Sorting strings

<div>
    <img src="contacts.jpeg" class="center" width="300">
</div>

In [None]:
first_name = input('Enter first name: ')
last_name = input('Enter last name: ')
name = ' '.join([first_name, last_name])
contact_first_name = 'Julia'
contact_last_name = 'Fillory'
contact_name = ' '.join([contact_first_name, contact_last_name])

# Compare last names first; fall back to first names on a tie
if last_name < contact_last_name:
    print(f'{name} comes before {contact_name}')
elif last_name > contact_last_name:
    print(f'{name} comes after {contact_name}')
else:
    if first_name < contact_first_name:
        print(f'{name} comes before {contact_name}')
    elif first_name > contact_first_name:
        print(f'{name} comes after {contact_name}')

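Be aware of case when sorting! String comparison is case-sensitive, and all uppercase letters come before all lowercase letters, so `'Zed' < 'adams'` is `True`.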
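
A common remedy for the case problem, sketched below, is to compare lowercased copies of the strings. The names are made up for illustration, and `key=str.lower` is one standard way to apply the same idea when calling `sorted`.

```python
last_name = 'adams'
contact_last_name = 'Zed'

# Case-sensitive: 'Z' (code 90) comes before 'a' (code 97).
case_sensitive = last_name < contact_last_name
print(case_sensitive)

# Case-insensitive: compare lowercased copies instead.
case_insensitive = last_name.lower() < contact_last_name.lower()
print(case_insensitive)

# The same idea works when sorting a list of names.
names = ['adams', 'Zed', 'Baker']
print(sorted(names))
print(sorted(names, key=str.lower))
```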