# Introductory Notes

Throughout this entire notebook you should be experimenting with the code in the non-text cells. A great way to begin to get a feel for Python is by playing with it. So have some fun by changing the values in the cells and then running them again with Shift-Enter. Before you do, think about what you expect the output to be, and make sure your intuition matches up with what you run. If it doesn't, take some time to think about what happened so you can hone your intuition.

At the end of each section there will be some questions to help further your understanding. Remember, in Python we can always manually test things by trying them out; however, you should try to think about the answers to these questions before you run some code. This way you can check and verify your understanding of the section's topic.

#### Working with individual characters in strings

We know how to work with an entire string via some of the methods that we've discussed, but what if we wanted to work with the individual characters? There are a couple of ways to do this, but the first we'll focus on is through indexing. We know that to Python, a string is just a collection of characters. It turns out that we can access the individual characters simply by asking Python for a given numbered element in our collection (i.e. the string).  We do this by placing the element number that we want in square brackets, `[]`,  right after our string (or variable, if it's stored in one). This element number is referred to as the **index** of the character (or element, if it's not a string - more on this soon).


In [1]:
my_str_variable = 'Test String'

In [2]:
my_str_variable[1]

'e'

In [3]:
my_str_variable[5]

'S'

In [4]:
my_str_variable[-1]

'g'

In [6]:
my_str_variable[-3] 

'i'

Using indices like this, we can access any element of a string. But why is the element at index 1 `e`, and not `T`? After all, `T` is the first element in the string. Also, what are those negative numbers doing? To answer the first question: Python (like many programming languages) starts indexing at 0, which means that the first element of our string (and of any collection that supports indexing) is at index 0. We refer to languages that work this way as **zero indexed**. As for the negative numbers, they are a way to access elements starting from the end of the string rather than the beginning. Indexing from the end starts at -1 and counts downwards from there. So, we would use -2 to access the `n` in the string.
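
To make the two counting directions concrete, here is a small sketch on a different string. The name `word` and its value are just illustrative assumptions, chosen so the indices are easy to check by hand.

```python
word = 'Python'  # hypothetical example string; valid indices run from 0 to 5

first_char = word[0]       # 'P': counting from the front starts at 0
last_from_end = word[-1]   # 'n': counting from the end starts at -1
last_from_start = word[5]  # 'n': the same character, reached from the front
second_to_last = word[-2]  # 'o'

print(first_char)
print(last_from_end == last_from_start)  # prints True
```

Every character can be reached by either a non-negative or a negative index; which one to use is mostly a matter of which end of the string you care about.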

Note that we can also access a run of consecutive characters (a **substring**) by combining two index numbers separated by a colon, `:`. On the left side of the colon, specify the index where the range should start and, on the right, the index where it should stop. For example:

In [7]:
my_str_variable[1:3]

'es'

In [8]:
my_str_variable[5:9]

'Stri'

In [9]:
my_str_variable[-6:-1]

'Strin'

In [10]:
my_str_variable[1:]

'est String'

In [11]:
my_str_variable[:-1]

'Test Strin'

This indexing turns out to be pretty useful. You might notice, though, that when indexing from `[1:3]`, only the letters at index 1 and 2 are returned; when indexing from `[5:9]`, we get the letters at indices 5, 6, 7, and 8. This is because the indices that you pass in are inclusive on the left side, and exclusive on the right side. This means that when you index, you will grab letters from the starting index that you give up to but not including letters at the ending index that you give. 

What about those last two examples, where there isn't an ending index or a starting one? If you don't give an ending index, then Python assumes you want to go to the end of the string. Similarly, if you don't give a starting index, Python assumes that your starting index is the first index in the string. Remember, this is the zeroth index in Python (don't worry if this feels confusing, you'll get used to it quickly).
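
One consequence of the inclusive-start, exclusive-stop rule is that splitting a string at any index and gluing the two pieces back together gives back the original string, with no character repeated or lost. A minimal sketch, again using a hypothetical example string `word`:

```python
word = 'Python'  # hypothetical example string

middle = word[1:4]  # indices 1, 2, and 3 -> 'yth' (index 4 is excluded)
front = word[:2]    # no start given, so begin at index 0 -> 'Py'
back = word[2:]     # no stop given, so run to the end -> 'thon'

print(middle)
print(front + back == word)  # prints True: index 2 appears in exactly one piece
```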

Is there a way to grab elements at regular intervals in a string? For example, what if we wanted to grab every second letter? Python allows us to do this by passing in an optional third number while indexing. This optional third number, separated from the second by another colon (`:`), tells Python the step size by which to move through the string when indexing. So, if we wanted to grab every second letter from the beginning to the end, we could index with `[::2]`. If we wanted to grab every 3rd letter starting from the letter at index 2 and stopping before the letter at index 10, we could use the indexing `[2:10:3]`.

Remember that below, `my_str_variable` still stores 'Test String'. 

In [12]:
my_str_variable[::2]

'Ts tig'

In [13]:
my_str_variable[2:10:3]

'sSi'
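
The start, stop, and step values can be mixed freely, and any of them can be left out. A small sketch on a hypothetical example string `word`, short enough to check each result by hand:

```python
word = 'Python'  # hypothetical example string

every_second = word[::2]       # indices 0, 2, 4 -> 'Pto'
second_from_one = word[1::2]   # indices 1, 3, 5 -> 'yhn'
bounded = word[1:5:2]          # indices 1, 3 (stops before index 5) -> 'yh'

print(every_second)
print(second_from_one)
print(bounded)
```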

**Indexing Questions**

Assume that we are still working with the same string - 'Test String'.

1. How would I index into the string to access the first letter, `T`?
2. How would I index into the string to access the last letter, `g`, using both positive and negative indexing?
3. How would I index into the string to grab the substring `est`?
4. How would I index into the string to grab the substring `Str`?
5. How would I access every 3rd letter in the string?

Got it, enough indexing already! Is there a way to cycle (or step through) each one of the letters one by one, and do something with the conditional logic we learned, rather than just grabbing a certain letter or group of letters? Of course! (Why would I ask a question for which the answer was no? That would be lame.)

#### Iteration and Strings

We can cycle through all of the letters in our string (a process called **iteration**) in one of a couple of different ways. Let's first look at cycling through with a `while` loop, since we worked with those last week.

In [14]:
my_str, idx = 'hello', 0
while idx < 5:
    print my_str[idx]
    idx += 1

h
e
l
l
o


This `while` loop will **iterate** over the letters of our string `hello`, printing each one until `idx` reaches the value 5, at which point the condition `idx < 5` is no longer true and the loop stops. Since we knew the length of our string (i.e., it's 5 letters long), we knew that we could use the condition `while idx < 5:` for our loop checking, and ensure that all the letters would be printed. What if we didn't know the length ahead of time, though? There is actually a function that we can use to figure this out (we'll talk much more about functions and how they work later). It's `len()`: we simply call `len()` with our string passed as an argument, and it returns the length of our string.

In [15]:
len(my_str)

5

Now, we can write our `while` loop to be a little bit more general:

In [16]:
my_str, idx = 'hello', 0
while idx < len(my_str):
    print my_str[idx]
    idx += 1

h
e
l
l
o


Great! But we did mention that there are other ways to iterate over the letters in our string, and in Python we generally try to stay away from `while` loops when all we want to do is step through a collection.

The other way that we can iterate over the letters in our string is to use a `for` loop. `for` loops are built on the same idea as `while` loops (doing something over and over again), but instead of continuing until some condition is no longer met, `for` loops operate directly on iterables (things that can be stepped through one element at a time, such as strings). This leaves the concern about when to stop for Python to figure out. With a `for` loop, we don't have to care how many iterations/cycles the loop will go through. Let's look at the syntax of a `for` loop.

In [17]:
my_str = 'hello'
for idx in range(len(my_str)):
    print my_str[idx]

h
e
l
l
o


**Note**: the `range()` function (which we will cover in more depth when we get to functions) as used above simply gives us a list of numbers from 0 up to but not including the inputted number. In the case above, since `len(my_str)` is 5, `range(len(my_str))` returns a list of integers from 0 to 4.
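
To see the values that `range()` produces, we can print them out. The sketch below wraps the call in `list()`, which is an addition not used elsewhere in this notebook: under Python 2 `range()` already returns a list, while under Python 3 it returns a lazy range object, and `list()` makes the numbers visible in both.

```python
my_str = 'hello'

# Collect the indices that the for loop above steps through.
indices = list(range(len(my_str)))

print(indices)  # prints [0, 1, 2, 3, 4]
```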

This `for` loop does the exact same thing as the `while` loop we wrote above, but with slightly different syntax. How does it work? At each iteration of the loop, `idx` is assigned one of the values in `range(len(my_str))`, and then the code within the indented block is run with that value of `idx`. How does Python know what the values of `idx` will be? Python simply goes through the values of whatever comes after the `in` keyword **in order**, and assigns those values to `idx`, one per iteration of the loop. Since `range(len(my_str))` gives us a list of integers from 0 to 4, those values get assigned to `idx` as we run through the `for` loop. Let's look at one of our favorite kinds of tables to view this:

| After loop # | idx | What's Printed |
| ------------ |:---:|:--------------:|
|      1       |  0  |       'h'      |
|      2       |  1  |       'e'      |
|      3       |  2  |       'l'      |
|      4       |  3  |       'l'      |  
|      5       |  4  |       'o'      |

Note that with our `for` loop, the `idx` variable is automatically changed, rather than us having to manually update it (like we did in the `while` loop). This is one of the incredibly nice aspects of `for` loops! But wait, it gets even better!

It turns out that the above implementation of our `for` loop is actually considered to be non-Pythonic. This is because the way that `for` loops are constructed allows us to achieve the same output as above by writing the following:

In [18]:
my_str = 'hello'
for char in my_str:
    print char

h
e
l
l
o


What's going on here!? Well, instead of iterating over all of the integers in a `range(len(my_str))` call like we did in our first `for` loop, we've gotten Python to simply iterate over all of the individual characters in our string, `my_str`. In each iteration of this `for` loop, `char` stores a different letter of `my_str`, and then the call `print char` prints that character. In the end, we get the same result as either of our `while` loops above, and the less Pythonic `for` loop that we wrote above. This way is considered to be the Pythonic way to iterate over a string (and other iterables, which we'll cover next class), and so it's an important concept to grasp.

Why is it more Pythonic? That's a good question. When we say that something is more **Pythonic**, we mean that it uses the language in a way that makes the code more readable while also taking advantage of Python's power to make the solution more efficient. Let's look at how this applies to the final implementation of our `for` loop.

We can see that it is more readable since we don't have to index into our string anymore. This means that there is less to follow along with and keep track of; rather than keeping track of both the current index we are on and what letter that index corresponds to in our string, all we have to keep track of is the current letter we're on. Our code also just looks cleaner and simpler. In terms of making our code more efficient, since we no longer have to index into the string to grab characters, we have fewer steps in each iteration of the loop. This means less work for Python to do.
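
As a quick check that the two styles really do the same job, the sketch below rebuilds the string once with each loop and compares the results. The names `by_index` and `by_char` are illustrative, and building a string with `+=` is used here only to have something to compare.

```python
my_str = 'hello'

# Index-based loop: look up each character by its position.
by_index = ''
for idx in range(len(my_str)):
    by_index += my_str[idx]

# Direct loop: Python hands us each character in turn.
by_char = ''
for char in my_str:
    by_char += char

print(by_index == by_char)  # prints True
```

The second loop has no index variable and no lookup step, which is the readability gain described above.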

**String Iteration Questions**

1. Write a for loop to iterate over the letters of the string 'Today is Tuesday' and print each one. 
2. Adjust the body of that for loop to add the letter 'z' to each of the letters before you print it. Your output should look as follows:

```
Tz
oz
dz
az
yz
.
.
.
dz
az
yz
```

#### A Quick Aside on String Formatting 

There's one more thing that we should talk about before moving on from our discussion of strings - string formatting. String formatting lets us build a string from a template with placeholders in it. Probably most usefully, it allows us to insert the contents of variables into strings dynamically. We'll get an idea of how and when this is most useful as we work through this course. For now, let's just look at the syntax of it all.

In [1]:
my_name = 'Sean'

In [2]:
print('Hello %s' % my_name)

Hello Sean


In [3]:
print('Hello {}'.format(my_name))

Hello Sean


How is this working? Well, in each case, it's filling in a given part of our string with the value of our variable. In the first case, we use a `%` sign to denote where the replacement should happen, followed by a letter to denote what type of value will be inserted there (`s` is used for a string, `d` for an integer, etc.). You can find what each letter denotes [here](https://docs.python.org/2/library/stdtypes.html#string-formatting). In the second case, we use curly braces, `{}`, to denote where the replacement should take place. We can also place numbers, or even names, inside these braces and reference them in the `format()` method...

In [4]:
print('Hello {0}'.format(my_name))

Hello Sean


In [5]:
print('Hello {name}'.format(name=my_name))

Hello Sean


We won't use string formatting much beyond fairly simple cases, but there are many more things you can do with it - you can read about them [here](https://docs.python.org/2/library/string.html#format-specification-mini-language). In general, though, string formatting is more readable and more flexible than building a string with a bunch of concatenation.
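
To illustrate the comparison with concatenation, here is a sketch that builds the same message three ways. The variable names and values are made up for the example; note that the `%` version passes its two values together in parentheses.

```python
name = 'Sean'  # hypothetical values for illustration
count = 3

# Concatenation: the number must be converted with str() first.
by_concat = 'Hello ' + name + ', you have ' + str(count) + ' new messages'

# %-formatting: %s marks a string slot, %d an integer slot.
by_percent = 'Hello %s, you have %d new messages' % (name, count)

# format(): the empty braces are filled in order.
by_format = 'Hello {}, you have {} new messages'.format(name, count)

print(by_format)
print(by_concat == by_percent == by_format)  # prints True
```

All three produce the same string, but the formatted versions keep the sentence in one piece, which tends to be easier to read and to change later.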

**String Formatting Questions**

1. Can you print a string that adds your name (in a variable) to 'Hello ', instead of the name Sean like we did above?