# Basic Python Types: Strings

**Last Updated**: 2021-04-27

In this module, you will be introduced to the final basic Python data type: the *string (str)*. While integers and floats cover numeric data, strings represent any piece of text, from a single character to an entire book. Strings are an essential part of working with Python, and this module will show you the tools you can use to work with this crucial data type.

## Table of Contents

1. [Using This Notebook](#Using-This-Notebook)
2. [Constructing Strings](#Constructing-Strings)
3. [Splitting Strings](#Splitting-Strings)
4. [Concatenating Strings](#Concatenating-Strings)
5. [Structure of a String](#Structure-of-a-String)
6. [Searching for a Substring](#Searching-for-a-Substring)
7. [String Formatting](#String-Formatting)
8. [Example Problems](#Example-Problems)
9. [Closing](#Closing)
10. [Release Notes](#Release-Notes)

## Using This Notebook

To use this notebook, first reset the kernel and clear all of the outputs. Afterward, work through the notebook in sequential order, running each code cell as it appears. Code cells for the example problems may not run correctly until you fill in the indicated sections with your solutions.

## Constructing Strings

There are two primary ways to define a new string: wrap the text in either 1) double quotes (`"`) or 2) single quotes (`'`). The opening and closing quotes must match, so you cannot start a string with one style and end it with the other. There is no functional difference between the two styles, but it is recommended that you use only one style throughout a file or project.

In [1]:
string_single = 'This is a string defined using single quotes'
string_double = "This is a string defined using double quotes"

print(string_single)
print(string_double)

This is a string defined using single quotes
This is a string defined using double quotes


Certain characters, such as the double quotes or single quotes that define the string, require a special *escape character* (`\`) when included in the string.

In [2]:
# The " must be escaped because it would normally define the string boundary
string_esc_1 = "This notebook is named \"Basic Python Types\"."

# The ' does not have to be escaped because the string is defined by "
string_esc_2 = "This isn't too bad..."

print(string_esc_1)
print(string_esc_2)

This notebook is named "Basic Python Types".
This isn't too bad...
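
The reverse case works the same way. As a minimal sketch (the variable names `string_esc_3` and `string_esc_4` are illustrative and not part of the original notebook), a string defined with single quotes must escape any apostrophe it contains, and the escape character itself does not become part of the string:

```python
# The ' must be escaped because the string is defined by '
string_esc_3 = 'This isn\'t too bad...'

# The same text defined by " needs no escape character
string_esc_4 = "This isn't too bad..."

# Both definitions produce the same string
same_text = string_esc_3 == string_esc_4
print(same_text)

# The backslash is not stored: an escaped quote is a single character
escaped_quote_length = len("\"")
print(escaped_quote_length)
```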


## Splitting Strings

In some cases, you may want to *split* strings into individual words or sections. Python offers the `str.split()` method to do this.

In [3]:
# By default, split() separates on white space
unsplit_string1 = "The answer to life, the universe, and everything is 42."
print(unsplit_string1.split())

# You can specify a separator instead of using white space
unsplit_string2 = "A-dash-separated-string."
print(unsplit_string2.split('-'))

['The', 'answer', 'to', 'life,', 'the', 'universe,', 'and', 'everything', 'is', '42.']
['A', 'dash', 'separated', 'string.']
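
One detail worth seeing in code is that calling `split()` with no argument is not the same as calling `split(" ")`. The sketch below uses a made-up string, `spaced_string`, with three spaces between its words to show the difference:

```python
spaced_string = "Hello   world"  # three spaces between the words

# With no argument, a run of white space counts as a single separator
default_split = spaced_string.split()

# With an explicit " " separator, every space is its own split point,
# which leaves empty strings between consecutive spaces
explicit_split = spaced_string.split(" ")

print(default_split)
print(explicit_split)
```

If your text may contain repeated spaces, tabs, or newlines, the no-argument form is usually the one you want.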


## Concatenating Strings

In other cases, you may want to combine, or *concatenate*, multiple strings. There are three ways to do this. First, you can use the addition (`+`) operator to combine multiple strings in a similar way to adding numeric types. Notably, the addition operator doesn't add any separators, so you must explicitly include spaces if needed.

In [4]:
# Use the `+` operator like a numeric type
add_greetings = "Hello" + " " + "world!"
print(add_greetings)

Hello world!


You can also use the multiplication (`*`) operator to create a new string made up of duplicates. Similar to the situation above, the operator does not add white space or separators when creating the final output.

In [5]:
# Use the `*` operator like a numeric type
mult_greetings = "Hello world! " * 5
print(mult_greetings)

Hello world! Hello world! Hello world! Hello world! Hello world! 


Lastly, Python offers the `str.join()` method to combine strings. This method allows you to define the *separator* that it will use to generate the concatenated string.

In [6]:
# Use str.join()
sep = " "  # Use a " " separator
join_greeting = sep.join(["Hello", "world!"])
print(join_greeting)

Hello world!
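
Because `str.join()` accepts any separator, it can be paired with `str.split()` to break a string apart and put it back together. The following is a small sketch that reuses the dash-separated string from the splitting section (the variable names are illustrative):

```python
original = "A-dash-separated-string."

# Split on "-" to get the individual pieces
pieces = original.split("-")

# Joining with the same separator rebuilds the original string
rejoined = "-".join(pieces)

# Joining with a different separator swaps the dashes for spaces
spaced = " ".join(pieces)

print(rejoined)
print(spaced)
```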


## Structure of a String

Before we move into more complicated methods, let's take a closer look at the structure of a string. In Python, a string is an ordered sequence of individual characters. Each character has an *implicit* numeric location, or **index**, starting at 0. To access a specific character by its index, you use *square bracket indexing*: `my_string[index]`. For example, let's take a look at the string "Hello, world!".

In [7]:
# Let's create a variable so we can work with it
hello = "Hello, world!"

# Python is 0-indexed, so the first character is associated with index 0
print(hello[0])

H


If you want the number of characters in a string, you can use the built-in `len(my_string)` function to get it quickly. **Note:** because indexing starts at 0, the last index of a string is `len(my_string) - 1`. If you try to access an index that doesn't exist, Python raises an `IndexError`.

In [8]:
# Let's get the length of hello
hello_len = len(hello)
print(hello_len)

13


In [9]:
# If you access an index outside of the string, an error is raised
hello[hello_len]  # Remember that you need to subtract 1 from the len

IndexError: string index out of range
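
To make the note above concrete, here is a short sketch of two ways to reach the last character without an error. The second uses a negative index, which counts from the end of the string; the variable names are illustrative:

```python
hello = "Hello, world!"

# Subtract 1 from the length to get the last valid index
last_by_len = hello[len(hello) - 1]

# A negative index counts from the back, so -1 is the last character
last_by_negative = hello[-1]

print(last_by_len)
print(last_by_negative)
```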

## String Slicing and Substrings

You may often want to grab a small piece, or *substring*, of a larger string. This process, called **slicing**, uses the index positions introduced in the section above to define the start and end points of the substring you want. The syntax builds on the square bracket syntax we used to get a single character.

In [10]:
# Use start:stop to define the range
# The stop index is not included

hello[0:5]

'Hello'

In [11]:
# You can slice from anywhere within the string
hello[7:13]

'world!'

In [12]:
# You can omit the start or stop index as a shortcut
print(hello[:5])  # Omitted start: slice from the first character up to the stop
print(hello[7:])  # Omitted stop: slice from the start through the last character

Hello
world!


In [13]:
# You can use a negative start number to slice from the back
hello[-6:]

'world!'

In [14]:
# You can also determine how frequently you pick characters by passing in a third number
hello[::2]  # Pick every 2nd character

'Hlo ol!'
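
The start, stop, and step values can be combined in a few more ways. The sketch below is not exhaustive; it shows three behaviors that follow from the rules above, using illustrative variable names:

```python
hello = "Hello, world!"

# A negative stop counts from the back, just like a negative start
without_punctuation = hello[-6:-1]

# Unlike single-character indexing, a slice that runs past the end
# of the string does not raise an error
past_the_end = hello[7:100]

# A negative step walks through the string backwards
reversed_hello = hello[::-1]

print(without_punctuation)
print(past_the_end)
print(reversed_hello)
```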

## Searching for a Substring

Sometimes, you want to determine if a string exists in another string. Depending on what you want to do, there are two ways you can do this: using `str.find()` or using the `in` operator.

In [15]:
# `str.find()` will return the location of the first instance of the search string.
hello.find("world")

7

In [16]:
# str.find() will return -1 if the search string doesn't exist
hello.find("Greetings")

-1

In [17]:
# The `in` operator is a Boolean operator that returns True if the search string exists
"Hello" in hello

True

Therefore, you should use `str.find()` only if you need the exact location of the substring. Otherwise, you should use the `in` operator.
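
One reason to prefer `in` for a simple yes-or-no check is that the number returned by `str.find()` is easy to misread as a Boolean. The following sketch (with illustrative variable names) shows a match at index 0, which Python treats as false, and a failed search returning -1, which Python treats as true:

```python
hello = "Hello, world!"

# "Hello" is found at index 0, but 0 counts as False in a Boolean test
found_position = hello.find("Hello")
found_as_bool = bool(found_position)

# "Greetings" is not found, but -1 counts as True in a Boolean test
missing_position = hello.find("Greetings")
missing_as_bool = bool(missing_position)

# The in operator gives the answer you would expect in both cases
print(found_position, found_as_bool, "Hello" in hello)
print(missing_position, missing_as_bool, "Greetings" in hello)
```

If you do use `str.find()` in a condition, compare the result to -1 explicitly.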

## String Formatting

In all of the previous examples, we created strings by hard-coding the characters directly. However, you may want to create dynamic strings that use variables or other values from other parts of your script. To insert these values into a string, you can use a process called *string formatting*. The method that we use for formatting is the `str.format()` method. The `str.format()` method has many options to customize the output (see the [documentation](https://docs.python.org/3/library/string.html#formatstrings)), but we will only cover a subset of these options in this module.

In general, the syntax for this method is the following:

```python
"This is a string that {} will edit".format(values_to_insert)
```

In the original string, you use *placeholders*, written as `{}`, to indicate where to insert a value. Then, you pass the values to insert, separated by commas, as arguments to the `.format()` method.

In [18]:
# Simple string formatting example
"This is a string that {} will edit".format("I")

'This is a string that I will edit'

You can also use multiple placeholders to insert multiple values into a string. By default, each placeholder is assigned an *implicit* zero-indexed location corresponding to the order in which the arguments are passed to the `.format()` method.

In [19]:
"{} and {} will work together.".format("Jim", "Pam")

'Jim and Pam will work together.'

However, you can provide an *explicit* location to each placeholder instead of relying on the implicit numbering. These numbers refer to the order of the arguments passed to the `.format()` method.

In [20]:
"{1} and {0} will work together.".format("Jim", "Pam")

'Pam and Jim will work together.'

Instead of using an index location, you can name the arguments passed to `.format()` and use them in the placeholders. **Note: the naming of the arguments is separate from the name of a variable.**

In [21]:
person1 = "Jim"
person2 = "Pam"

"{person1} and {person2} will work together.".format(person1=person1, person2=person2)

'Jim and Pam will work together.'
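
To illustrate the note above, the sketch below uses placeholder names (`first` and `second`, chosen only for this example) that differ from the variable names. The placeholder names decide where each value lands, regardless of what the variables are called:

```python
person1 = "Jim"
person2 = "Pam"

# The names on the left of each = belong to the placeholders;
# the names on the right are the variables that supply the values
sentence = "{first} and {second} will work together.".format(first=person2, second=person1)

print(sentence)
```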

You can also pass numeric types directly as `.format()` arguments instead of having to explicitly convert them into strings.

In [22]:
"There are {} bags that weigh an average of {} pounds".format(5, 3.533325235252)

'There are 5 bags that weigh an average of 3.533325235252 pounds'
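
For comparison, here is a brief sketch of the same idea using the `+` operator from the concatenation section, which does require an explicit conversion with `str()`. The variable names are illustrative:

```python
bags = 5

# .format() converts the integer to text for you
with_format = "There are {} bags".format(bags)

# The + operator only combines strings, so the integer must be converted first
with_plus = "There are " + str(bags) + " bags"

print(with_format == with_plus)

# Leaving out str() raises a TypeError
try:
    "There are " + bags + " bags"
    conversion_needed = False
except TypeError:
    conversion_needed = True

print(conversion_needed)
```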

You can customize the format of the numeric type by including the appropriate options in the placeholder. For a list of the most up-to-date options, refer to the [documentation](https://docs.python.org/3/library/string.html#formatstrings). In the example below, we restrict (with rounding) floats to two decimal places using the `:.nf` syntax, where `n` is the number of decimal places to show.

In [23]:
"There are {number} bags that weigh an average of {pounds:.2f} pounds".format(number=5, pounds=3.533325235252)

'There are 5 bags that weigh an average of 3.53 pounds'
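
As a short sketch of how `n` changes the result, the example below formats the same value with several precisions. It also shows one other option from the format specification, the `,` thousands separator, which is not otherwise covered in this module; the variable names are illustrative:

```python
pounds = 3.533325235252

# Changing n changes how many decimal places are kept (with rounding)
two_places = "{:.2f}".format(pounds)
four_places = "{:.4f}".format(pounds)
zero_places = "{:.0f}".format(pounds)

# A comma in the placeholder adds thousands separators to a number
with_commas = "{:,}".format(1234567)

print(two_places, four_places, zero_places)
print(with_commas)
```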

## Example Problems

There are several example problems included below that you can use to test your understanding of the skills introduced in this module. A solutions notebook is provided at the following links:

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/anthony-agbay/introduction-to-python-environment/main?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252Fanthony-agbay%252Fintroduction-to-python%26urlpath%3Dlab%252Ftree%252Fintroduction-to-python%252Fmodules%252Fbasic-python-types-strings%252Fbasic-python-types-strings-solutions.ipynb%26branch%3Dmain) [![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/anthony-agbay/introduction-to-python/blob/main/modules/basic-python-types-strings/basic-python-types-strings-solutions.ipynb)



### Example Problem 1: Format a String to Describe Pi to 8 Decimal Places

Using the basic outline of code below, use string formatting to recreate the following string, which describes $\pi$ to 8 decimal places. Note the period at the end of the string:

```python
"Pi to 8 decimal points is: 3.14159265."
```

In [None]:
# You can access the value of pi using `pi`
from math import pi

my_string = ""  # INSERT YOUR SOLUTION HERE

## DO NOT EDIT THE CODE BELOW
try:
    assert my_string == "Pi to 8 decimal points is: 3.14159265."
    print("Your solution is correct.")
except AssertionError:
    print("Your answer is incorrect. Please try again.")

### Example Problem 2: Extract the Title of My Favorite Book From a Sentence

Given the following sentence, extract the title of the book, *The Paper Menagerie and Other Stories*, without passing in the specific index locations of the start and end of the title.

```python
"My favorite book is The Paper Menagerie and Other Stories written by Ken Liu."
```

You do not need to worry about generalizing the approach to other book titles. You may need multiple lines and variables to create the final solution.

In [None]:
## PROVIDED VALUES AND VARIABLES ##
original_sentence = "My favorite book is The Paper Menagerie and Other Stories written by Ken Liu."

## YOUR SOLUTION SPACE ##
title_string = "" # Make sure title_string has your solution value


## DO NOT EDIT THE CODE BELOW
try:
    assert title_string == "The Paper Menagerie and Other Stories"
    print("Your solution is correct.")
except AssertionError:
    print("Your answer is incorrect. Please try again.")

## Closing

In this module, you learned about the final basic type in Python, the string. Python includes many methods for working with strings that were not covered in this introductory module. However, you now have a strong foundation to build upon as you continue to work with Python.

---

## Release Notes

- **2021-04-27**
    - Initial posting
    
---

**[Return to the Introduction to Python Homepage](https://walkintheforest.com/Content/Introduction+to+Python/%F0%9F%90%8D+Introduction+to+Python)**