#Strings

String manipulation is a core skill in software engineering, and Python comes with a powerful set of built-in tools that make creating and modifying strings easy.

##Creating Strings

You can create a string in Python using either single or double quotes:

In [1]:
single_quote_string = 'This is a single quote string'
print(single_quote_string)

This is a single quote string


In [2]:
double_quote_string = "This is a double quote string"
print(double_quote_string)

This is a double quote string


A string literal written with single or double quotes must fit on one line of source code. To write a string that spans several lines, use triple quotes (three single or three double quotes):

In [3]:
multiline_string = """To be or not to be
that is the question"""
print(multiline_string)

To be or not to be
that is the question


###String Constructor: `str()`

In addition to using quotes, we can create a new string with a constructor. Just as integers and floats have the constructors `int()` and `float()`, strings have their own constructor, `str()`. For example:

In [4]:
constructor_string = str("This is a double quote string")
print(constructor_string)

This is a double quote string


Like its numeric counterparts, `str()` can also convert other Python data types to strings:

In [5]:
float_string = str(98.6)
print(float_string)

98.6


In [6]:
bool_string = str(True)
print(bool_string)

True
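
As a further sketch, the same conversion works for other built-in types. The values and variable names below are made up for illustration:

```python
# Illustrative values only; the variable names are hypothetical.
int_string = str(42)
none_string = str(None)
list_string = str([1, 2, 3])

print(int_string, none_string, list_string)  # 42 None [1, 2, 3]
print(type(int_string))                      # <class 'str'>
```

In each case the result is a new string, so it can be used anywhere a string is expected.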


##Using escape character `\`

Python provides a mechanism for including special characters and effects in strings that would otherwise be difficult to represent. By preceding a character with the backslash (`\`) character, you give it a special meaning. Here are some uses for the escape character:

###Newline  (`\n`)

Insert a newline, even in a single-line string.

In [7]:
newline_string = "To be or not to be\nThat is a question"
print(newline_string)

To be or not to be
That is a question
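
A quick check, offered as a sketch: the `\n` escape should produce exactly the same string as a triple-quoted literal containing a real line break. The variable names are chosen only for this example:

```python
escaped_version = "To be or not to be\nThat is a question"
triple_quoted_version = """To be or not to be
That is a question"""

# Both literals contain the same characters, including one newline.
print(escaped_version == triple_quoted_version)  # True

# The two-character escape sequence is stored as a single character.
print(len("\n"))  # 1
```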


###Tab (`\t`)

Insert a tab inside a string.

In [8]:
tab_string = "To be or not to be\tThat is a question"
print(tab_string)

To be or not to be	That is a question


###Single or double quotes (`\'`), (`\"`)

In [9]:
quote_within_quote = "Shakespeare said: \"To be or not to be, that is a question.\""
print(quote_within_quote)

Shakespeare said: "To be or not to be, that is a question."
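
One alternative worth knowing, shown here as a sketch with made-up variable names: delimiting the string with the other kind of quote avoids the escapes entirely and yields the same string.

```python
escaped_double = "Shakespeare said: \"To be or not to be\""
single_delimited = 'Shakespeare said: "To be or not to be"'

# Same characters, different source spelling.
print(escaped_double == single_delimited)  # True

# The single-quote escape works the same way inside a single-quoted string.
escaped_apostrophe = 'It\'s a question'
print(escaped_apostrophe)  # It's a question
```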


###Backslash (`\\`)

In [10]:
escaped_backslash_string = "I just wanted to use this backslash \\ in this example"
print(escaped_backslash_string)

I just wanted to use this backslash \ in this example
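
Escaping the backslash matters most when the character that follows it would otherwise form another escape sequence. The sketch below uses a hypothetical Windows-style path, where `\t` and `\n` would be read as a tab and a newline:

```python
# Hypothetical path, used only to illustrate the problem.
unescaped_path = "C:\temp\notes.txt"    # \t and \n are interpreted as escapes
escaped_path = "C:\\temp\\notes.txt"    # each \\ is one literal backslash

print("\t" in unescaped_path)  # True: the string contains a tab
print("\n" in unescaped_path)  # True: the string contains a newline
print(escaped_path)            # C:\temp\notes.txt
```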


##String Operators

Python has a host of operators at your disposal for string manipulation. Strings can use these operators because strings are Python sequences: a string is represented as a sequence of characters. Because of this, strings support sequence operations, such as indexing and slicing, that non-sequence objects such as integers do not.

###Concatenation (`+`)

In [11]:
concat_string = "To be or not to be" + "that is a question"
print(concat_string)

To be or not to bethat is a question


Notice that the concatenation operator did what you would expect: it combined the two strings. A new string was created and assigned to the variable `concat_string`. Also note that there is no space between the words "be" and "that": string concatenation joins the two strings exactly as they are. Later in this lesson we will go over helper functions that can make concatenation easier.
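
A minimal sketch of two ways to get the missing space. The variable names are made up here, and `str.join` is shown only as one built-in option; it is not necessarily the helper this lesson covers later:

```python
first_half = "To be or not to be"
second_half = "that is a question"

# Add the separating space explicitly.
spaced_string = first_half + " " + second_half
print(spaced_string)  # To be or not to be that is a question

# str.join places the separator between every item of a sequence.
joined_string = " ".join([first_half, second_half])
print(joined_string == spaced_string)  # True
```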

###Duplication (`*`)

Repeat the string the specified number of times.

In [12]:
duplicate_string = "To be or not to be " * 3
print(duplicate_string)

To be or not to be To be or not to be To be or not to be 
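
Repetition is handy for building fixed-width text such as divider lines. The following is a small sketch with illustrative names:

```python
divider = "-" * 20
print(divider)
print(len(divider))     # 20

# The operands can appear in either order.
print(3 * "ab")         # ababab

# Repeating zero times gives the empty string.
print("abc" * 0 == "")  # True
```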


###Character Extraction (`[]`)

Retrieve the nth character of a string, counting from zero. This is possible because strings are Python sequences. Note: negative numbers count from the end of the string, with index -1 being the last character in the string, -2 being the second-to-last character, and so on.

In [13]:
concat_string = "To be or not to be" + "that is a question"
print("concat_string[0] : " + concat_string[0])
print("concat_string[4] : " + concat_string[4])
print("concat_string[-1]: " + concat_string[-1])

concat_string[0] : T
concat_string[4] : e
concat_string[-1]: n
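
To make the index bounds concrete, here is a short sketch (the variable names are illustrative). The last valid index is `len(word) - 1`, and anything past it raises an `IndexError`:

```python
word = "question"
last_index = len(word) - 1

# Index -1 and index len(word) - 1 refer to the same character.
print(word[last_index] == word[-1])  # True

# Indexing one past the end fails instead of returning an empty value.
try:
    word[len(word)]
except IndexError as error:
    print("IndexError:", error)
```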


###Slice (`[start:end:step]`)

Extract a substring from a string:

1. `[:]` extracts the entire sequence from start to end.
2. `[start:]` extracts from the start offset to the end.
3. `[:end]` extracts from the beginning to the end offset minus 1.
4. `[start:end]` extracts from the start offset to the end offset minus 1.
5. `[start:end:step]` extracts from the start offset to the end offset minus 1, taking every `step`-th character.

In [14]:
letters = 'abcdefghijklmnopqrstuvwxyz'
print(letters[:])

abcdefghijklmnopqrstuvwxyz


Print every character after the 5th, through the end of the string:

In [15]:
print(letters[5:])

fghijklmnopqrstuvwxyz


Print every character from the beginning through the 20th character:

In [16]:
print(letters[:20])

abcdefghijklmnopqrst


Print every character after the 5th, through the 20th character:

In [17]:
print(letters[5:20])

fghijklmnopqrst


Print every other character from beginning to end, starting with the first character:

In [18]:
print(letters[0:26:2])

acegikmoqsuwy
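
A few variations on the slices above, offered as a sketch rather than an exhaustive list: the start and end offsets can be omitted when a step is given, a negative step walks the string backwards, and an end offset past the end of the string is clamped rather than raising an error.

```python
letters = 'abcdefghijklmnopqrstuvwxyz'

# Omitting start and end is equivalent to spelling them out.
print(letters[::2] == letters[0:26:2])  # True

# Every other character, starting with the second one.
print(letters[1::2])                    # bdfhjlnprtvxz

# A step of -1 returns the string reversed.
print(letters[::-1])                    # zyxwvutsrqponmlkjihgfedcba

# Out-of-range slice offsets are clamped to the string's length.
print(letters[5:100] == letters[5:])    # True
```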


Combine all of these techniques and you have a powerful arsenal for manipulating strings to build the dataset that you need. We will now look at how to package some of the code we have been writing into a single module so that we can use it in real-life situations.