# Python Data Types: Strings

In Python, text data is represented with the string data type. It gets a bit more complicated under the hood (and depending on which version of Python you are using), but for now, string means text.

Strings are declared by wrapping text in matching single ('text') or double quotes ("text").  
You can also declare strings in matching groups of triple quotes (either three single or three double quotes; both are called triple quotes).  
Don't mix and match: the closing quote must be the same kind as the opening quote.

In [1]:
s = 'this is a string'

In [2]:
s = "this is also a string"

In [3]:
s = """this is also also a string"""

In [4]:
s = '''this is also also also a string'''

For the most part, single and double quotes are interchangeable. The one caveat is text that itself contains a single quote, such as the apostrophe in a contraction or possessive (they're, Michael's). In that case, wrap the string in double (or triple) quotes; otherwise Python will interpret the apostrophe as the end of the string.

In [5]:
# This will raise an error
s = 'Michael's code is ok, i guess..'

SyntaxError: invalid syntax (<ipython-input-5-4968a85a35c9>, line 2)

In [6]:
# This is ok
s = "Michael's code is ok, i guess.."

In [7]:
s

"Michael's code is ok, i guess.."

Note: the inverse is true if your text data contains a double quote for some reason; in this case you would need to declare the string by wrapping the text in single (or triple) quotes.

In [8]:
s = 'this is a weird " but " totally " acceptable " text " string'
s

'this is a weird " but " totally " acceptable " text " string'
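As a quick check of the point that the quote style only changes how the literal is written and not the resulting string, here is a minimal sketch. The variable names are illustrative and not part of the original notebook.

```python
# The same text written with three different quote styles (names are illustrative).
single = 'plain text'
double = "plain text"
triple = """plain text"""

# All three literals produce equal string values.
print(single == double == triple)

# Triple quotes can hold both kinds of quote without any escaping.
mixed = '''Michael's "ok" code'''
print(mixed)
```

Which style to use is mostly a matter of which quote characters appear in the text itself.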

Triple quotes are primarily used for multiline strings. They can be triple single ('''...''') or triple double ("""...""").

In [9]:
# This multiline string wrapped in double quotes will raise an error since 
# Python doesn't know the string continues to the next line
x = "This
is
a
multiline
string"

SyntaxError: EOL while scanning string literal (<ipython-input-9-4f0ed96673fa>, line 3)

In [11]:
# When you genuinely need a multiline string, use triple quotes
x = """This
is
a
multiline
string"""

In [12]:
x

'This\nis\na\nmultiline\nstring'

To keep things simple, use triple quotes for multiline strings and double quotes for everything else.  
If you get some sort of string literal SyntaxError, double check your quotes.
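To make the link between a triple-quoted literal and the `\n` characters in its value concrete, here is a small sketch. The names `multi` and `one_line` are made up for illustration.

```python
# A triple-quoted literal that spans several lines of source code...
multi = """This
is
a
multiline
string"""

# ...has the same value as a one-line literal with explicit \n characters.
one_line = "This\nis\na\nmultiline\nstring"
print(multi == one_line)

# splitlines() returns the individual lines as a list.
print(multi.splitlines())
```

Either form produces the same string; the triple-quoted version is usually easier to read when the text is long.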

### Escape characters

There are a few characters and character sequences that have special meaning in Python strings. The '\' character can be used to 'escape' the special meaning of these characters and character sequences. In the earlier example of a single quote embedded in a string, we could also have placed a '\' immediately before the single quote to tell Python to treat it as a literal character rather than as the end of the string.

In [13]:
# This is a problem - where does my string end??
s = 'a string with a ' in it..'

SyntaxError: invalid syntax (<ipython-input-13-65bc4e86599b>, line 2)

In [16]:
# this is ok since the meaning of the ' is escaped/ignored
s = 'a string with a \' in it..'
s

"a string with a ' in it.."

The '\' character can also be used to declare special sequences in Python strings. The important ones are '\n' and '\t'.  

In [None]:
# \n indicates a new line in Python and is basically like hitting return in a text editor
s = 'this is the first line \nand this is the second line'
print(s)

In [None]:
# \t means tab and is like hitting tab in a text editor
s = 'this comes before the tab \t and this comes after'
print(s)

In [None]:
path = r'C:\Users\mtroyer\Documents\Learning_Python_for_Fun_and_Profit\3. Data Types'

In the event you need a \n or \t in the string without the special meaning, you guessed it, you can escape it with a preceding '\'.

In [18]:
# This \n is treated as a newline character
s = 'this is the first line \nand this is the second line'
print(s)

this is the first line 
and this is the second line


In [19]:
# This \n is escaped with the preceding '\' and printed as is
s = 'this is the first line \\nand this is the second line'
print(s)

this is the first line \nand this is the second line
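One way to see the difference between the two cells above is to count characters. The sketch below (with illustrative variable names) shows that `'\n'` is a single newline character, while `'\\n'` is two ordinary characters.

```python
newline = '\n'    # one character: a line break
escaped = '\\n'   # two characters: a backslash followed by the letter n

print(len(newline))
print(len(escaped))

# repr() shows the string the way you would type it as a literal.
print(repr(escaped))
```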


This is especially important in file paths!

In [20]:
# this is gonna get totally mangled.. notice all the \n's 
path = 'C:\new_user\new_folder\new_file'
print(path)

C:
ew_user
ew_folder
ew_file


In [22]:
# this will work since all the \n's are escaped
path = 'C:\\new_user\\new_folder\\new_file'
print(path)

C:\new_user\new_folder\new_file


File paths usually contain many backslashes, so it is often easier to declare the entire string as a raw string, meaning everything is treated as is and without special meaning.  
To declare a raw string, prefix the opening quote with a lowercase 'r'.

In [23]:
# this is a total mess
path = 'C:\new_user\new_folder\new_file'
print(path)

C:
ew_user
ew_folder
ew_file


In [24]:
# this is ok!
path = r'C:\new_user\new_folder\new_file'
print(path)

C:\new_user\new_folder\new_file
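The two ways of writing the path, escaped backslashes and a raw string, should give exactly the same value. A minimal sketch to confirm that, using illustrative variable names:

```python
# The same Windows-style path written two ways (names are illustrative).
escaped_path = 'C:\\new_user\\new_folder\\new_file'
raw_path = r'C:\new_user\new_folder\new_file'

# Both literals produce the same string.
print(escaped_path == raw_path)

# Neither one contains an actual newline character.
print('\n' in raw_path)
```

The raw form is usually easier to read and harder to get wrong when a path has many backslashes.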


### String Operations

There are a number of operations you can perform on string objects. The important ones are index, slice, split, and concatenate.

In [27]:
# You can index strings using square brackets, e.g. to get the 11th character in this string
# Remember Python is 0-indexed, so the first item in the string is item 0
# The 11th item, in that case, is index [10]
s = 'this is a text string'
s[10]

't'

In [33]:
# You can slice strings using square brackets with a starting and ending index separated by a colon (:)
# Remember, when slicing sequences in Python, the starting value is inclusive, while the ending value is exclusive.
# e.g. string[1:5] will slice a string starting with index position 1 up to but NOT including index position 5.
s = 'this is a text string'
s[0:11]

'this is a t'

Notice that the slice stops at index position [10], the 't', and does not include index position [11], the 'e'.
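The inclusive-start, exclusive-end rule can be checked directly on the same example string. This is only a sketch; the variable names are illustrative.

```python
s = 'this is a text string'

eleventh = s[10]        # index 10 is the 11th character
first_eleven = s[0:11]  # index positions 0 through 10; position 11 is excluded

print(eleventh)
print(first_eleven)

# The length of a slice is the end index minus the start index.
print(len(first_eleven) == 11 - 0)

# The last character of the slice is the character at index 10.
print(first_eleven[10] == s[10])
```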

In [47]:
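# You can split a string using the .split() method, which returns a list of the split pieces.
# .split() splits on whitespace by default, but you can specify any separator you want.
s = 'a string with a bunch of spaces in it'
print(s.split())

# A backslash separator has to be escaped, even when the path itself is a raw string
s = r'C:\new_use\new_folder\new_file'
print(s.split('\\'))

['a', 'string', 'with', 'a', 'bunch', 'of', 'spaces', 'in', 'it']
['C:', 'new_use', 'new_folder', 'new_file']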


In [43]:
s = 'a string. with. a. bunch. of. periods. in it'
s.split('.')

['a string', ' with', ' a', ' bunch', ' of', ' periods', ' in it']
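Note the leading spaces on most of the pieces in the output above: the separator is removed, but the space that followed each period is kept. Since the separator can be more than one character, one possible way to avoid this, shown here as a sketch with illustrative names, is to split on a period followed by a space.

```python
s = 'a string. with. a. bunch. of. periods. in it'

# Splitting on '.' alone leaves the space after each period in the pieces.
pieces_with_spaces = s.split('.')

# Splitting on '. ' (period plus space) removes both characters.
clean_pieces = s.split('. ')

print(pieces_with_spaces)
print(clean_pieces)
```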

In [44]:
# You can add two or more strings together (concatenate) using the '+' operator.
s = 'this is how you ' + 'concatenate two strings'
s

'this is how you concatenate two strings'

In [45]:
# Or with previously defined string variables
s1 = 'this is how you ' 
s2 = 'concatenate two strings'
s3 = s1 + s2
s3

'this is how you concatenate two strings'
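One thing to watch for: '+' only concatenates strings with other strings. Adding a string and a number raises a TypeError, so the number has to be converted with `str()` first. A minimal sketch, with made-up variable names:

```python
name = 'Michael'
count = 3

# name + count would raise a TypeError, because count is an int, not a string.
# Converting it with str() first makes the concatenation work.
message = name + ' has ' + str(count) + ' strings'
print(message)
```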