# In-Class Exercises, Ch. 3, For Loops

Relevant reading:
- [Python & Pandas Field Guide: Section 2.4.4. String Operations](https://snakebear.science/02-ValuesAndVariables/Operators.html#string-operations)
- [Python & Pandas Field Guide: Section 3.1. For Loops](https://snakebear.science/03-ControlStructures/ForLoops.html)

Key ideas in these exercises:
1. String data
2. For loops

***
## Some Examples of String Repetition
Look at the following line of code. Make a prediction about what it will do and then run it.

In [None]:
print("Hello " * 5)

Do the same for the code below. Notice that any 'white space' in a string is treated just like any other part of the string: a space is just another character, like an 'a' or a '5'.

In [None]:
print("Hello " * 5)
print("Hello" * 5)
print("        !!!            " * 5)
print("byeGood" * 10)
print("What?" * 0)
print("|Snake Bear!" * 30)
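
One way to confirm that spaces really are repeated along with everything else is to measure the results with ``len()``. This is only a small sketch; the variable names are made up for illustration.

```python
# A trailing space is part of the string, so it gets repeated too.
with_space = "Hello " * 5
without_space = "Hello" * 5

print(len(with_space))     # 6 characters repeated 5 times
print(len(without_space))  # 5 characters repeated 5 times
```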

***
## Exercise 1
This code prints an abstract pattern.  It is not very interesting, because it doesn't use the string repetition operator or other fun things.

In [None]:
print("-------")
print("------|")
print("-----||")
print("----|||")
print("---||||")
print("--|||||")
print("-||||||")
print("|||||||")

### To Do: Abstract Pattern with String Repetition
Write a program that uses string repetition to make each line of the pattern, printing the exact same thing.  That is, you'll still have eight `print()` statements, but the string to be printed in each won't be written out directly, but rather created using an expression with string repetition.

Start with the first and last lines; they're easier.  Then work on the lines in between; they'll require string repetition *and* concatenation. 
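
As a hint for combining the two operators, here is a minimal sketch that uses different characters from the pattern above. The name ``arrow`` is just an illustration.

```python
# Repetition (*) is applied before concatenation (+), so each piece
# is repeated first and then the pieces are joined together.
arrow = "=" * 4 + ">" * 2
print(arrow)  # ====>>
```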

In [None]:
# Write your code here

### To Do: Shorter Code
The code you wrote above isn't much better than the original code.  In fact, you could claim it is less clear.  However, can you see the pattern in the lines of code you've written?  That pattern means each line is doing just about the same thing as the others, only slightly differently.  That's what *loops* are great for!  In fact, we can use a loop to *simplify* that code.

Write a program that makes the same pattern using only *two* lines of code.  Think through how you can call ``print()`` eight times with only two lines of code...  Ask for a tip if you need one!
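
If it helps, here is a rough sketch of a two-line loop that prints something different on each pass. It is not the pattern from the exercise, only an illustration of how the loop variable can be used inside the ``print()`` call.

```python
# range(4) produces 0, 1, 2, 3, so the body runs four times.
for i in range(4):
    print("ha" * i)  # the first pass prints an empty line, since "ha" * 0 is ""
```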

In [None]:
# Write your code here

When you have this working, ask an instructor to come over and look at your code.  We might be able to give a few additional pointers on your code.

### To Do: A More Flexible Program

The loop will also let us make it more *general*, meaning we could use it to easily make the same pattern in a bunch of different sizes.

Take the simplified code you wrote above and make a new program that asks the user to input a number (at least 3), then prints out the same basic pattern but with as many lines as the number the user provided.  For example:

```
Please enter a number greater than two: 6

-----
----|
---||
--|||
-||||
|||||
```

Or:

```
Please enter a number greater than two: 3

--
-|
||
```

Make sure you test your program with a wide range of values.  Just because it works for one input doesn't mean it will work for every input.
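
One detail that often causes trouble here: ``input()`` always returns a string, so it must be converted before it can be used as a number. The sketch below uses a hard-coded stand-in (``user_text``, a made-up name) instead of a real ``input()`` call so it can run without anyone typing.

```python
# In the notebook this line would be:
#     user_text = input("Please enter a number greater than two: ")
user_text = "6"  # stand-in for what the user typed

count = int(user_text)  # convert the text "6" into the number 6
print(count - 1)        # arithmetic works now that count is an int

# Trying several values in a row is a quick way to test a range of inputs.
for test_value in [3, 4, 10]:
    print("Testing with", test_value)
```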

In [None]:
# Write your code here

***
## Exercise 2 

### Some Notes on Getting Text Data

In the first chapter of the textbook, we saw some code that counts words in a string of text that start with a given letter.  Here we are going to do something similar to practice using for loops.

Look at the code cell below. The first line of code assigns a string to the variable called ``filename``. The string happens to be the name of an actual file that we are going to use. In this case it's a text file containing the opening text of the book *A Tale of Two Cities*.

The next two lines open the file and then assign its contents to the variable ``file_contents``. The final line takes the file contents (the words) and puts them separately in a list and then assigns that list to the variable ``words``.
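
To see what ``split()`` does without opening a file, here is a small sketch on a short string. The names ``sample_text`` and ``sample_words`` are invented for this example.

```python
# A short stand-in for the contents of a text file.
sample_text = "It was the best of times,\nit was the worst of times,"

# With no arguments, split() breaks the string at any run of white space
# (spaces, tabs, newlines) and returns the pieces as a list.
sample_words = sample_text.split()
print(sample_words)
```

Notice that the commas stay attached to ``times,`` because ``split()`` only looks at white space.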

You can change the ``filename`` variable (currently ``"atotc_opening.txt"``) to refer to a different file and re-run the cell to get a list of words from a different text file.  There are three text files included in this directory already (see the file browser on the left side of your screen), and you can upload your own using the up-arrow button, also on the left.  Note: if you want to analyze your own writing, you will need to save it as a ``.txt`` file before you upload it.

### To Do: Get Some Text Data
Execute the code cell below to read the file and create the ``words`` list.  There will be no output, but it will create the variables ``file_contents`` and ``words`` that you can then use in later cells.  You do *not* need to copy this code into your code below.

In [None]:
filename = "atotc_opening.txt"

with open(filename) as f:
    file_contents = f.read()

words = file_contents.split()

### Some Notes on len()
If you have executed the code cell above, we should now have our text in two forms. 
The first is the entire text as a single long string assigned to the ``file_contents`` variable.
The second is the entire text broken up into individual words, put into a list, and then assigned to the ``words`` variable. 

Remember the ``len()`` function? It can tell us the length of the ``file_contents`` string, which is the total number of characters in our text.
The ``len()`` function can also give us the number of items in a list, so we can use ``len()`` to tell us the total number of words in our text.
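
A quick sketch of both uses on tiny made-up values (``phrase`` and ``pieces`` are illustrative names, not variables from the cell above):

```python
phrase = "best of times"            # one string
pieces = ["best", "of", "times"]    # a list of three strings

print(len(phrase))  # 13: counts every character, including the two spaces
print(len(pieces))  # 3: counts the items in the list
```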

### To Do: Number of Characters and Words
Write some code to print out the number of characters in the text and the number of words in the text.
For example:  There are 684,768 characters and 121,567 words in ``pandp.txt`` (the novel *Pride and Prejudice*).

In [None]:
# Write your code here

### Some Notes on For Loops and Whitespace

Do you think Jane Austen used longer words on average than Charles Dickens?

We could write a pretty simple program to answer this question. First we would need to know the number of characters in all of the words in our text. Now you may be thinking we've already done that in the character count from above. But that's not quite the same. Take a minute and see if you can figure out why it's different.

The character count above counted all the characters in the text, which includes white space, not just letters. So we can't use that character count to answer our question. Instead we need to ignore the white space and count just the letters. If we can get the sum of all of the letters and then divide that by the number of words, we will have the average word length.
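
The difference is easy to see on a very small example. This sketch uses a made-up string and adds the word lengths by hand, just to show the two counts side by side.

```python
tiny_text = "a bb ccc"

all_characters = len(tiny_text)                      # letters plus two spaces
letters_only = len("a") + len("bb") + len("ccc")     # letters in the words

print(all_characters)  # 8
print(letters_only)    # 6
```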

So how do we count just the letters? There are many ways to do it, but for this exercise we want you to write a program that counts the letters in each word of our ``words`` list. When our ``words`` list was created, all the white space was removed, so it has all the letters we want and no white space (punctuation attached to a word still counts as part of it). The challenge here is to write a program that moves through the list, counts all the letters in each word, and adds up a total.

Remember that a for loop lets us repeat some code for every item in a list.  To write code that adds the lengths of the words, think about what steps need to be repeated; those steps will go into the body of the for loop.  You will also need to think about how to use a variable to store and update the growing sum.  As always, ask for a tip if you need one!
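
As a tip, the general shape of a running total looks something like the sketch below. It adds up made-up numbers rather than word lengths, so you will still need to adapt it.

```python
prices = [3, 5, 8]  # made-up numbers for illustration

total = 0                    # start the running total before the loop
for price in prices:
    total = total + price    # update the total once per item

print(total)  # printed once, after the loop has finished
```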

*[Suggestion: if you have a for loop repeating once for every word in the list, you probably **don't** want to put a ``print()`` statement inside that loop.  It will print way more than is useful, and it may take a long time to finish, too.]*

### To Do: Loop and Count

Write a program that uses a for loop that moves through the word list, counts all the letters in each word, adds up a total, and then divides that total by the word count. Print the results.

Your code should find that in ``pandp.txt`` (*Pride and Prejudice*) there are 121567 words with a total of 560736 characters. The average word length is 4.6126. What is the average length of the words in ``atotc.txt`` (the complete text of *A Tale of Two Cities*)?

In [None]:
# Write your code here

***
## Exercise 3

If you still have time, here's a more challenging one.  Using the ``words`` list generated above, write code that will find the *longest* word in the list.

It's often useful to think through a problem like this first in terms of how *you* would solve it yourself, without a computer.  That is, if someone gave you a book or other long text, a notepad, and a pencil, what steps would you follow to find the longest word in the text?  Then, think about what steps you are repeating, and try to make a formal series of instructions for doing the work.  

Once you know what steps you would follow you can think about how to translate that into instructions for the computer to follow.  This basic process helps you work from a high level of understanding down to a more detailed level with the instructions themselves, and it can be much easier than trying to think through all of the details of the code all at once.
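
One common way to turn "keep track of the best one I've seen so far" into code is sketched below. It assumes you are comfortable with an ``if`` statement, and it works on made-up numbers rather than words, so it is a starting point and not a solution.

```python
heights = [150, 172, 165, 181, 160]  # made-up data

tallest = heights[0]         # the best value seen so far
for height in heights:
    if height > tallest:     # is this one bigger than the best so far?
        tallest = height     # if so, remember it instead

print(tallest)  # 181
```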

(Remember that when testing, you can change the ``filename`` variable and re-run the cell that reads the file to get different data into ``words`` if you want to see what your code does on the other files.)

### To Do: Longest Words

Using the ``words`` list generated above, write code that will find the *longest* word in the list.

Your code should find that in ``pandp.txt``, the longest word has 28 characters.  In fact, this will point out a flaw in the algorithm; you'll see that the longest "word" isn't a word at all due to the simplicity of the processing.  Always check your results to see if they make sense!  Here, that result is fine, but in general you may find that simple initial solutions don't always work as well as you would like.
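
One plausible way a non-word can end up in the list: ``split()`` only breaks text at white space, so punctuation stays attached to whatever is next to it. The sentence below is invented for illustration and is not taken from the novel.

```python
sample = "She paused--then laughed."
tokens = sample.split()
print(tokens)  # ['She', 'paused--then', 'laughed.']
```

Here ``paused--then`` is treated as a single 12-character "word" because there is no white space inside it.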

In [None]:
# Write your code here

Feel free to show this one to an instructor in class, or bring it to office hours if you want someone to look it over later.