# **Working with Text**
Manipulating textual data is critical for language technology. Luckily, Python provides some powerful tools for working with strings.

## **Strings and Lists**
Python strings share many features with the lists we saw previously. For example, we can get the length of a string:

In [None]:
name = "Hasan"
print(len(name))

We can get a character at a particular index or a range of characters:

In [None]:
name = "Hasan"

print(name[2])

print(name[2:])
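
Indexing starts at 0, and a slice `start:stop` includes `start` but excludes `stop`. As a small sketch (the extra variable names are only for illustration), negative indices count from the end of the string, and either side of the slice can be left out:

```python
name = "Hasan"

first = name[0]      # index 0 is the first character
last = name[-1]      # negative indices count from the end
middle = name[1:3]   # characters at index 1 and 2 (index 3 is excluded)
prefix = name[:2]    # leaving out the start means "from the beginning"

print(first, last, middle, prefix)
```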

And we can loop over the characters in the string:

In [None]:
name = "Hasan"

for character in name:
    print(character)

Strings also support concatenation:

In [None]:
given = "Jack"
family = "Reed"

print(given + family)

...and the `in` operator:

In [None]:
sentence = "The quick brown fox jumped over the lazy dog"

if "fox" in sentence:
    print("We found a fox")
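
Two details are worth keeping in mind here. Concatenation joins the strings exactly as they are, so any space has to be added explicitly, and `in` is case-sensitive. A short sketch using the same example values (`full_name` is an illustrative name):

```python
given = "Jack"
family = "Reed"

# Concatenation does not insert a space for us
full_name = given + " " + family
print(full_name)

sentence = "The quick brown fox jumped over the lazy dog"

# "Fox" with a capital F does not match the lowercase "fox" in the sentence
print("fox" in sentence)
print("Fox" in sentence)
```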

#### **Exercise 1**
Print out all of the 2-character substrings that occur in `sentence`, the string defined in the cell below. For instance, the first substring should be "He". There should be 12 substrings in total.

<details>
  <summary>Show answer</summary>
      <pre style="background-color: honeydew; padding: 10px; border-radius: 5px;"><code style="background: none;">for index in range(len(sentence) - 1):
    print(sentence[index:index+2])</code></pre>
</details>

In [None]:
sentence = "Hello, world!"

# TODO: Loop over the string and print out all the 2-character substrings


## **String Methods**
Python strings also have a number of text-specific methods that will be helpful. For instance, we will use the `lower` method to convert strings to all lowercase characters.

In [None]:
sentence = "Adam told Sarah that he bought a new Honda Civic in New York"

print(sentence.lower())
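
One common use of `lower` is case-insensitive matching. The sketch below (the `query` variable and the result names are just illustrative) lowercases both the text and the search term before using `in`:

```python
sentence = "Adam told Sarah that he bought a new Honda Civic in New York"
query = "honda"

# The original sentence contains "Honda", not "honda"
found_exact = query in sentence

# Lowercasing both sides makes the comparison ignore case
found_ignoring_case = query.lower() in sentence.lower()

print(found_exact, found_ignoring_case)
```

Note that `lower` returns a new string; `sentence` itself is left unchanged.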

The `split` method separates a string into a list of pieces. By default it splits at whitespace, but we can provide some other separator instead.

In [None]:
sentence = "Adam told Sarah that he bought a new Honda Civic in New York"

print(sentence.split())
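
To split on something other than whitespace, pass the separator as an argument. A minimal sketch, using a made-up date string and a made-up comma-separated line:

```python
date = "2024-01-15"             # example value, chosen for illustration
fields = "apple,banana,cherry"  # example value, chosen for illustration

date_parts = date.split("-")
fruit = fields.split(",")

print(date_parts)
print(fruit)
```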

We can join a list of strings together using the `join` method.

In [None]:
string_list = ["Adam", "told", "Sarah", "that", "he", "bought", "a", "new", "Honda", "Civic", "in", "New", "York"]

"#".join(string_list)
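
The string that `join` is called on is the separator placed between the items. As a sketch, joining with a single space rebuilds a sentence from its words, so `split` and `join` can undo each other when the words were separated by single spaces:

```python
sentence = "Adam told Sarah that he bought a new Honda Civic in New York"

words = sentence.split()   # list of 13 words
rebuilt = " ".join(words)  # put a single space between each pair of words

print(rebuilt)
print(rebuilt == sentence)
```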

Some other helpful methods include:
- `capitalize`
- `find`
- `isdigit`
- `isspace`
- `replace`
- `upper`

See the complete list [here](https://www.w3schools.com/python/python_strings_methods.asp).
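
As a quick, non-exhaustive sketch of the methods listed above (the example strings are made up for illustration):

```python
text = "hello world"

print(text.capitalize())               # first character uppercase, the rest lowercase
print(text.upper())                    # every character uppercase
print(text.find("world"))              # index where "world" starts
print(text.find("python"))             # -1 means "not found"
print(text.replace("world", "there"))  # swap one substring for another
print("2024".isdigit())                # True only if every character is a digit
print("   ".isspace())                 # True only if every character is whitespace
```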

#### **Exercise 2**
Go to the link above and pick one of the string methods not listed here. Try it out below, and write a comment explaining how it works.

In [None]:
# TODO: Your string method


## **Format Strings**
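Often you will need to include variable values in a string. One easy way to do that is with **format strings** (also called f-strings): put an `f` before the opening quote, then write each variable inside curly braces.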

In [None]:
calculation_result = 12345 * 6789

print(f"The result of the calculation is {calculation_result}")

Format strings work for just about any type of variable.

In [None]:
my_list = [10, 100.1, "abc"]

print(f"Here's a random list: {my_list}")
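
The curly braces can also hold an expression, optionally followed by a colon and a format specifier that controls how the value is displayed. A brief sketch (the `price` variable and the result names are hypothetical):

```python
calculation_result = 12345 * 6789
price = 3.14159  # hypothetical value for illustration

with_commas = f"{calculation_result:,}"  # add thousands separators
two_decimals = f"{price:.2f}"            # show two digits after the decimal point
expression = f"{2 + 3}"                  # an expression is evaluated before formatting

print(with_commas, two_decimals, expression)
```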

## **Escape Characters**
Occasionally, you might see **escape characters** in strings you are working with. These begin with a backslash `\` and stand for characters that are hard to type directly inside a string. For instance, the escape character `\n` indicates a newline.

In [None]:
my_two_line_string = "Hello,\nworld"
print(my_two_line_string)

Another commonly used escape character is the tab character `\t`, which can help line up columns of text.

In [None]:
str1 = "Michael\t22"
str2 = "Teddy\t24"

print(str1)
print(str2)
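
Escapes are also how we write a quote or a literal backslash inside a string. A small sketch with made-up example strings:

```python
quoted = "She said \"hello\""  # \" puts a double quote inside a double-quoted string
path = "C:\\temp"              # \\ stands for a single backslash

print(quoted)
print(path)

# Each escape sequence counts as one character
print(len("a\nb"))
```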

## **Summary**
In this lesson, we learned about Python strings, including:
- Functionality shared with lists such as `len` and looping
- String-specific methods such as `lower` and `split`
- Interpolating variables with format strings
- Escape characters

At this point, you've completed the first set of skills. Congratulations!

Next, you'll learn about reading and writing files.

[Next Lesson](<./7. Files.ipynb>)