# Strings

## Contents:
1. What is a string?
2. String examples
3. Multiline strings
4. Escape sequences
5. Working with strings and overloading
6. Strings within strings
7. Strings and functions
8. String methods
9. Formatting strings

## What is a string?

`str`, short for _string_, is Python's text data type. Strings are sequences of characters, including letters, digits, and symbols. Strings are surrounded by either single (`'`) or double (`"`) quotes. Which one to use is a matter of preference, but the opening and closing quotes of a string must match, and our choice should be consistent across our code. If the text itself contains apostrophes, single quotes, or double quotes, we may need to surround it with the other kind of quote.

## String examples

In [None]:
'This is a string'

In [None]:
"So is this"

In [None]:
"these quotes do not match'

In [5]:
'Let's see if this works

SyntaxError: invalid syntax (1618629434.py, line 1)

In [None]:
# use double quotes to avoid the error
"Let's see if this works now"

## Multiline strings

If we want a string to span multiple lines of code, we need to wrap it in triple quotes (`'''` or `"""`). We saw this before with our docstrings!

In [None]:
'''
I am a multiline string

I'm very useful for documenting functions
and for storing longer texts.

And you don't have to worry about apostrophes or "quote"s!
'''

## Exercise 1
Create a variable that holds a multiline string about anything of your choice and output it.

In [3]:
# Your code goes here
x = '''the rain in Spain
falls mainly in the plains'''

print(x)

the rain in Spain
falls mainly in the plains


In [4]:
x

'the rain in Spain\nfalls mainly in the plains'

## Escape sequences

Where did those `\n` characters come from and where did our line breaks go in that last string? `\n` is an _escape sequence_, a combination of characters that means something else. Here, it means "new line", not literally "\n". More generally, the backslash (`\`) is an escape character that can be used to indicate that the next character should be treated differently.

### Common escape sequences

| Escape sequence | Description        |
|-----------------|--------------------|
| \\'             | Single quote       |
| \\"             | Double quote       |
| \\\\            | Backslash          |
| \\t             | Tab                |
| \\n             | Newline            |
| \\r             | Carriage return    |

Python interprets escape sequences when it reads the string literal, so they take effect whether or not we call `print()`. The only difference is in how the result is displayed: echoing a string in the notebook shows its escaped representation, while `print()` shows the characters themselves. Here are some common escape sequences:

1. `\\`: This escape sequence represents a single backslash character. You can use it directly in a string to include a backslash.
   
   Example:
   ```python
   my_string = 'This is a backslash: \\'
   ```

2. `\"`: This escape sequence represents a double quotation mark (`"`). It allows you to include double quotes within a double-quoted string.

   Example:
   ```python
   my_string = "This is a double-quoted string with a quote inside: \"example\""
   ```

3. `\'`: This escape sequence represents a single quotation mark (`'`). It allows you to include a single quote or apostrophe within a single-quoted string.

   Example:
   ```python
   my_string = 'This is a single-quoted string with a quote inside: won\'t'
   ```

These escape sequences are part of the string literal itself, so they do not need a `print()` call to take effect. They allow you to include special characters within your string literals.
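To see the difference between how a string is stored and how it is displayed, here is a minimal sketch. The variable `menu` and its contents are made up for illustration.

```python
# `menu` is a hypothetical example string containing tabs and a newline
menu = 'Coffee\t2.50\nTea\t2.00'

# repr() gives the escaped representation, like echoing the value in a notebook cell
menu_repr = repr(menu)
print(menu_repr)

# print() displays the characters that the escape sequences stand for
print(menu)

# each escape sequence is stored as a single character
newline_length = len('\n')
print(newline_length)
```

The first `print()` shows the `\t` and `\n` sequences as typed, while the second shows the text laid out across two lines.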

In [6]:
'This string won\'t result in an error thanks to the escape sequence'

"This string won't result in an error thanks to the escape sequence"

## Working with strings and overloading

Some arithmetic and comparison operators also work with strings, though their meaning changes. When an operator does different things depending on the data provided, we say that it is _overloaded_. It's important to understand what type of data is being used with an overloaded operator so that the results are as expected.

### Strings and arithmetic operators

In [7]:
# adding strings together concatenates them
'hello' + ' ' + 'world'

'hello world'

In [8]:
# multiplying a string repeats it
'ha' * 3

'hahaha'

In [9]:
# mixing data types results in an error
'The year is ' + 2020

TypeError: can only concatenate str (not "int") to str
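One way to avoid this error is to convert the number to a string with the built-in `str()` function before concatenating. A minimal sketch, where the variable names `year` and `message` are chosen just for this example:

```python
# `year` and `message` are illustrative names
year = 2020

# str() converts the integer to text so that + concatenates two strings
message = 'The year is ' + str(year)
print(message)
```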

### Strings and comparison operators

We can use the `==` and `!=` operators to compare strings. These comparisons are case-sensitive. We can also compare a string with a number, but the two are never equal, even when they look alike.

In [10]:
'apple' == 'Apple'

False

In [11]:
'apple' != 'Apple'

True

In [12]:
'20' == 20

False
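If we want to compare the values rather than the types, we can convert one side first. The sketch below assumes the string holds a valid whole number; the variable names are made up for illustration.

```python
# illustrative values: the same number stored as text and as an integer
text_value = '20'
number_value = 20

# comparing a str with an int directly is always False
direct_comparison = text_value == number_value

# converting the string to an int compares the numeric values
as_numbers = int(text_value) == number_value

# converting the int to a string compares the text
as_text = text_value == str(number_value)

print(direct_comparison, as_numbers, as_text)
```

Note that `int()` raises a `ValueError` if the string does not look like a whole number.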

## Strings within strings


We can extract a piece of a string by _slicing_ it. For example, if we want to get initials from a first name and last name, we may slice the first letter of each.

To slice a string, we add square brackets (`[]`) to the end of it. To get a single character, we put the _index_, or position, of the character we want inside the brackets.

To get multiple characters, we put the starting index of our slice, then a colon (`:`) and finally the index where we want to end our slice. **Strings in Python are _zero-indexed_, meaning that the first letter is at index 0, not 1.** Slices do not include the character at the ending index position.

### Slicing single characters

In [13]:
first_name = 'Ada'
last_name = 'Lovelace'

# print initials
print('Initials are', first_name[0], last_name[0])

Initials are A L


If we don't provide a starting index, our string slice will go from the beginning to the ending index. Similarly, if we don't provide an ending index, our string slice will go from the starting index to the end.

In [14]:
print(first_name[:1], last_name[4:])

A lace


In [15]:
phone_number = '+1 555-123-4567'

# get the area code
phone_number[3:6]

'555'

We can even use negative indices. Here, we slice from the fourth-to-last character to the end.

In [16]:
phone_number[-4:]

'4567'

## Exercise 2
Get the middle 6 letters of this word: `supercalifragilisticexpialidocious`

In [18]:
# Your code goes here
x = 'supercalifragilisticexpialidocious'
len(x)

34

In [19]:
x[33]

's'

In [20]:
x[16]

's'
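One possible way to finish this exercise is to compute the starting index from the length of the word. This is only a sketch of one approach; the names `word`, `slice_length`, `start`, and `middle` are chosen for this example.

```python
word = 'supercalifragilisticexpialidocious'
slice_length = 6

# leave the same number of characters on each side of the slice
start = (len(word) - slice_length) // 2

# the slice ends just before index start + slice_length
middle = word[start:start + slice_length]
print(middle)
```

With 34 characters, 14 remain on each side, so the slice covers indices 14 through 19.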

### Checking for strings

We can also check for character sequences, or _substrings_, within a string. There are multiple ways to do this, but one of the simplest is with the `in` operator.

In [21]:
job_qualifications = 'The successful applicant will be proficient in R, Python, SQL, statistics, and data visualization.'

In [22]:
job_qualifications

'The successful applicant will be proficient in R, Python, SQL, statistics, and data visualization.'

In [23]:
'R' in job_qualifications

True

In [24]:
' r ' in job_qualifications

False

In [25]:
'JavaScript' in job_qualifications

False

In [26]:
'licant' in job_qualifications

True

In [27]:
'r' in job_qualifications

True
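As the results above suggest, `in` is case-sensitive. If we want a check that ignores case, one option is to lowercase the text first with the `lower()` string method (string methods are introduced later in this notebook). A minimal sketch:

```python
job_qualifications = 'The successful applicant will be proficient in R, Python, SQL, statistics, and data visualization.'

# the text contains 'Python' with a capital P, so this exact check fails
exact_match = 'python' in job_qualifications

# lowercasing the text first makes the check ignore case
ignoring_case = 'python' in job_qualifications.lower()

print(exact_match, ignoring_case)
```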

## Strings and functions

String data works with built-in functions we've seen before, like `print()`, as well as with others like `len()`, which returns the length of the argument we pass in. For a string, that is the number of characters.

In [29]:
# We can print strings, ints, and floats together
print(1, 'fish,', 2, 'fish')

1 fish, 2 fish


In [28]:
len('Sometimes we type long-winded sentences to meet word counts or character counts.')

80

## String methods

Strings in Python also have _methods_, which are functions that work only for a specific type of data. We'll learn about methods in more depth later. The key points here are:

- String methods work only on strings -- not on integers, floats, or booleans
- Methods are called differently from other functions

To use a string method, we specify a string, then follow it with a period and the method being called.

In [30]:
# convert a string to all caps
'I am not yelling'.upper()

'I AM NOT YELLING'

In [31]:
# count the 'e's
'This string is unusual'.count('e')

0

In [32]:
# check if a string ends with a substring
'file_name.csv'.endswith('.csv')

True

In [33]:
# replace text
'long file name with spaces.csv'.replace(' ', '_')

'long_file_name_with_spaces.csv'
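String methods return a new string rather than changing the original, so method calls can be chained one after another. A small sketch, using a made-up file name:

```python
# `file_name` is a hypothetical example value
file_name = 'Long File Name With Spaces.CSV'

# lower() returns a new string, and replace() is then called on that result
clean_name = file_name.lower().replace(' ', '_')

print(clean_name)
# the original string is unchanged
print(file_name)
```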

For a full list of string methods, see the [Python documentation](https://docs.python.org/3/library/stdtypes.html#string-methods).

## Formatting strings

The `format()` string method makes it easier to combine text with other data. We can create text templates with curly braces (`{}`), then pass values in.

In [34]:
first_name = 'Ada'
last_name = 'Lovelace'

'Ada Lovelace\'s initials are {}. {}.'.format(first_name[0], last_name[0])

"Ada Lovelace's initials are A. L."

If we preface the string with `f`, we can pass values directly into the curly braces.

In [35]:
f'Ada Lovelace\'s initials are {first_name[0]}. {last_name[0]}.'

"Ada Lovelace's initials are A. L."

- **f-string**: A string prefixed with `f`, as in `f'...'`. It allows for more concise and readable string interpolation by directly embedding expressions within the string.
- **`.format()` method**: This method allows for string formatting by specifying placeholders `{}` within the string and providing values to replace those placeholders in the `.format()` method.

Both expressions produce the same result: they insert the first characters of the `first_name` and `last_name` variables into the string to state Ada Lovelace's initials. The choice between them often comes down to personal preference and the Python version being used, since f-strings require Python 3.6 or later. Many developers prefer f-strings for their simplicity and readability.
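Both approaches accept expressions such as function calls, not just variables. The sketch below builds the same sentence both ways; the sentence and the names `with_format` and `with_fstring` are made up for illustration.

```python
first_name = 'Ada'
last_name = 'Lovelace'

# format(): values fill the {} placeholders in order
with_format = '{} {} has {} letters in her last name.'.format(
    first_name, last_name, len(last_name)
)

# f-string: expressions go directly inside the curly braces
with_fstring = f'{first_name} {last_name} has {len(last_name)} letters in her last name.'

print(with_fstring)
print(with_format == with_fstring)
```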

When in doubt, we can call for `help()`!

In [None]:
help(str.lower)