# Working with Text Data: Strings

- [Download the lecture notes](https://philchodrow.github.io/PIC16A/content/basics/strings.ipynb). 

A *string* is a sequence of zero or more characters enclosed in either `'single'` or `"double"` quotation marks. For most strings, it doesn't matter whether you use single or double quotes: 

In [2]:
a = "to boldly go"
b = 'to boldly go'
a == b

True

The choice does matter when the string itself contains quotation marks. Quotes of the other kind can appear inside the string freely, but an unescaped quote of the same kind as the enclosing pair ends the string early: 

In [5]:
a = "Picard says 'to boldly go'"
b = 'Picard says "to boldly go"'
c = "Picard says "to boldly go""
# ---

SyntaxError: invalid syntax (<ipython-input-5-9bdba32c05c3>, line 3)

What if you need to include both kinds of quotation marks in the same string? In this case, we need to use a backslash `\` to *escape* the quotation marks that match the enclosing pair: 

In [7]:
d = 'Picard says "That\'s Kirk\'s line."'
d

'Picard says "That\'s Kirk\'s line."'

The `print()` function displays a pleasant, human-readable representation of many Python objects. 

In [8]:
print(d)
# ---

Picard says "That's Kirk's line."
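
As a small sketch of the escaping rule, the same sentence can be written inside either kind of quotes, as long as we escape whichever quotation mark matches the enclosing pair. The variable names `single`, `double`, and `same` are just illustrative.

```python
# Inside '...' we escape the single quotes; inside "..." we escape the double quotes.
single = 'Picard says "That\'s Kirk\'s line."'
double = "Picard says \"That's Kirk's line.\""

# Both spellings describe exactly the same string.
same = single == double
print(same)

# print() shows the string without the backslashes.
print(double)
```

The backslashes are part of how we type the string, not part of the string itself.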


## Basic String Manipulations

Python gives us several ways to manipulate strings. An especially important one is *concatenation*, which can be achieved with `+`:

In [9]:
"U.S.S. Enterprise " + "D"

'U.S.S. Enterprise D'

We can also do "multiplication." In the context of strings, multiplication means *repetition*: 

In [13]:
"Picard "*3
3*"Picard "

'Picard Picard Picard '

Concatenation is a useful tool for constructing strings using variables:

In [15]:
x = "boldly"
"to " + x + " go"

'to boldly go'
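
Repetition and concatenation can be combined. Here is a minimal sketch that builds a simple banner around a name; `name`, `border`, and `banner` are hypothetical variable names chosen for this example.

```python
name = "Picard"

# Repetition builds the border, concatenation glues the pieces together.
border = "=" * 3
banner = border + " " + name + " " + border

print(banner)  # === Picard ===
```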

When we want to form messages involving numbers, we generally need to use the `str()` function to convert those numbers into strings prior to concatenation: 

In [16]:
x = 9
"Deep Space " + x
# ---

TypeError: can only concatenate str (not "int") to str

In [17]:
"Deep Space " + str(x)

'Deep Space 9'
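
The reason for the explicit conversion is that `+` means different things for numbers and for strings. The following sketch, with illustrative variable names, shows both meanings side by side:

```python
x = 9

# Arithmetic happens first, then the result is converted to a string.
next_label = "Deep Space " + str(x + 1)

# After conversion, + concatenates the digits instead of adding the numbers.
digits = str(x) + str(x)

print(next_label)  # Deep Space 10
print(digits)      # 99
```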

## String Indexing

Like C++, Python uses 0-based indexing. It also supports negative indices to count backwards from the end of a string. 

<figure class="image" style="width:50%">
  <img src="https://cdn.programiz.com/sites/tutorial2program/files/python-list-index.png" alt="The word 'probe' is shown with positive indices (left to right) zero through four and negative indices (left to right) -5 through -1. ">
  <figcaption><i>Illustration of string indexing in Python</i></figcaption>
</figure>




In [18]:
s = "Alpha Quadrant"

In [19]:
s[0]

'A'

In [20]:
s[-1]

't'

Take a moment to predict the output of the following: 

In [21]:
s[2], s[-2]

('p', 'n')
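
One way to check predictions like this is to relate negative indices to positive ones. The sketch below uses the built-in `len()` function (not introduced above) to get the number of characters in `s`; the names `n`, `last_positive`, and `last_negative` are illustrative.

```python
s = "Alpha Quadrant"
n = len(s)  # number of characters in s, here 14

# The last character sits at positive index n - 1 and at negative index -1.
last_positive = s[n - 1]
last_negative = s[-1]

# More generally, s[-k] is the same character as s[n - k].
print(last_positive == last_negative)  # True
print(s[-2] == s[n - 2])               # True
```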

We can easily grab substrings using *slice* notation with `:`. `s[start:stop]` will get the characters starting at index `start`, up to **and not including** index `stop`. 

In [22]:
s[0:3]

'Alp'

In [23]:
s[-4:-2]

'ra'

We can also use the syntax `s[start:stop:interval]` to get letters that are `interval` apart. 

In [24]:
# every other letter from indices 0 through 5

s[0:6:2]

'Apa'

In [25]:
# leaving `start` and `stop` blank indicates you want to go through the whole string

s[::2]

'ApaQarn'

In [26]:
# s backwards
s[::-1]

'tnardauQ ahplA'
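
It is also possible to leave only one of `start` or `stop` blank. A short sketch using the same string, with illustrative variable names:

```python
s = "Alpha Quadrant"

first_word = s[:5]   # from the beginning up to (not including) index 5
second_word = s[6:]  # from index 6 through the end
last_eight = s[-8:]  # the final eight characters

print(first_word)   # Alpha
print(second_word)  # Quadrant
print(last_eight)   # Quadrant

# Because `stop` is excluded, splitting at any index loses nothing.
print(s[:5] + s[5:] == s)  # True
```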

An important thing you **can't** do with string indexing is modify the letters in a string. This is because strings are immutable (cannot be changed in place). 

In [27]:
s[0] = "a"
# ---

TypeError: 'str' object does not support item assignment
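
Since we can't change `s` in place, one workaround is to build a *new* string out of pieces of the old one. This is a minimal sketch using only slicing and concatenation; `new_s` is an illustrative name.

```python
s = "Alpha Quadrant"

# Combine a replacement first letter with everything after index 0.
new_s = "a" + s[1:]

print(new_s)  # alpha Quadrant
print(s)      # Alpha Quadrant (the original is unchanged)
```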

Later, we'll try out a few *string methods*: functions attached to strings that return new, modified strings rather than changing the original, like this: 

In [28]:
s.replace("Alpha", "Gamma")

'Gamma Quadrant'
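
Note that `replace()` leaves `s` itself untouched, consistent with immutability. To keep the result, we assign it to a variable, as in this short sketch (the name `t` is illustrative):

```python
s = "Alpha Quadrant"
t = s.replace("Alpha", "Gamma")

print(t)  # Gamma Quadrant
print(s)  # Alpha Quadrant
```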