# Python Basics

## Let's search some text

You already know the components of programming. You have been exercising the reasoning programming relies on for your entire life, probably without even realizing it. Programming is just a way to take the logic you already use on a daily basis and express it in a way a computer can understand and act upon.

It's just learning how to write in a different language.

One very important disclaimer before we start doing just that: **Nobody memorizes this stuff**. We all have to look stuff up all the time. We don’t expect you to memorize it, either. Ask questions. Ask us to review things we’ve already told you.

(Most of us ask questions we've asked before daily — we just ask them of Google.)

Now for some code. Let's say you want to search 130,000 lines of text for certain terms: which are most common, how frequently they occur, and how often they appear in concentrated clusters, which might indicate places you want to look at more closely.

No person wants to do that by hand. And people are bad at precisely that kind of work. But it's perfect for a computer.

That length happens to correspond to The Iliad. In groups of two or three, think about a book like that. In your groups, figure out two things:

- A whole text is made up of what parts?
- What is the first thing you need to know to begin to search a file of text? The second thing? Third thing?

Roughly, the steps might look like this:
1. open the file
2. break the file into individual lines
3. begin to examine each line
4. if the line contains the term you're looking for, capture that
5. does anything else about the line interest you? Is your term there multiple times, for instance?
6. if none of your conditions are met, keep going

This is a program! See, you already know how to program. Now let’s take a minute to step through this the way a computer might.

In Python and other languages, we use the concept of **variables** to store values. A variable is just an easy way to reference a value we want to keep track of. So if we want to store a search term and how often our program has found it, we probably want to assign them to variables to keep track of them.
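
As a minimal sketch of that idea, here is one variable holding a string and another holding a number that gets updated. The names `greeting` and `times_seen` are made up for illustration and are not part of the search program.

```python
# assign a string value to a variable (hypothetical name, for illustration only)
greeting = 'hello'

# assign an integer value to another variable
times_seen = 0

# a variable can be updated; from here on it refers to the new value
times_seen = times_seen + 1

print(greeting)    # hello
print(times_seen)  # 1
```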

Create a string that represents the search term we want to find and assign it to a variable `search_term`. Let's search for the string `'Achilles'`:

In [55]:
search_term = 'Achilles'  # This could just as easily be 'horse' or 'Helen' or 'Agamemnon' or 'sand' -- or 'Trojan'


Now tell Python where the file is and to open it. The path to the file can be one variable: `file_location`. We can use that to open the file itself and store the opened file in a variable `file_to_read`. Then we read every line of the file into a list, `all_lines`.

In [56]:
file_location = 'sample_data/iliad.txt'
file_to_read = open(file_location)
all_lines = file_to_read.readlines()
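
One common alternative, shown here only as a sketch, is to open the file inside a `with` block, which closes the file for you when the block ends. So that it runs anywhere, this example writes a tiny stand-in file first; the names `sample_text`, `sample_location`, `sample_file` and `sample_lines` are hypothetical and not part of the lesson's program.

```python
import os
import tempfile

# create a small stand-in file so this example runs anywhere;
# in the lesson, the real file lives at 'sample_data/iliad.txt'
sample_text = 'The Iliad of Homer\n\nTranslated by Alexander Pope,\n'
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write(sample_text)
    sample_location = tmp.name

# 'with' closes the file automatically when the indented block ends
with open(sample_location) as sample_file:
    sample_lines = sample_file.readlines()

print(sample_lines)        # a list with one string per line
print(sample_file.closed)  # True

# clean up the stand-in file
os.remove(sample_location)
```

Each item in the list is one line of the file, including the `\n` that marks the end of the line.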

In [57]:
all_lines

['The Iliad of Homer\n',
 '\n',
 '\n',
 'Translated by Alexander Pope,\n',
 '\n',
 'with notes by the\n',
 'Rev. Theodore Alois Buckley, M.A., F.S.A.\n',
 '\n',
 'and\n',
 '\n',
 "Flaxman's Designs.\n",
 '\n',
 '1899\n',
 '\n',
 '\n',
 '\n',
 '\n',
 '\n',
 'CONTENTS\n',
 '\n',
 '\n',
 'INTRODUCTION.\n',
 "POPE'S PREFACE TO THE ILIAD OF HOMER\n",
 'BOOK I.\n',
 'BOOK II.\n',
 'BOOK III.\n',
 'BOOK IV.\n',
 'BOOK V.\n',
 'BOOK VI.\n',
 'BOOK VII.\n',
 'BOOK VIII.\n',
 'BOOK IX.\n',
 'BOOK X.\n',
 'BOOK XI.\n',
 'BOOK XII.\n',
 'BOOK XIII.\n',
 'BOOK XIV.\n',
 'BOOK XV.\n',
 'BOOK XVI.\n',
 'BOOK XVII.\n',
 'BOOK XVIII.\n',
 'BOOK XIX.\n',
 'BOOK XX.\n',
 'BOOK XXI.\n',
 'BOOK XXII.\n',
 'BOOK XXIII.\n',
 'BOOK XXIV.\n',
 'CONCLUDING NOTE.\n',
 '\n',
 '\n',
 '\n',
 '\n',
 '\n',
 'ILLUSTRATIONS\n',
 '\n',
 '\n',
 'HOMER INVOKING THE MUSE.\n',
 'MARS.\n',
 'MINERVA REPRESSING THE FURY OF ACHILLES.\n',
 'THE DEPARTURE OF BRISEIS FROM THE TENT OF ACHILLES.\n',
 'THETIS CALLING BRIAREUS TO THE

Now create a variable `term_count` containing the integer value of how many times we've seen our search term in the text. So far, that's zero.

In [58]:
term_count = 0  # how many times our search_term has occurred

So any time we want to check how many times we've seen our `search_term` or check where our `file_location` is, we can use these variables instead of typing out the value itself!

If you forget what one of the variables is set to, you can `print` it. Let's also add a comment to remind us what the next variable does.

In [59]:
# how many lines contain at least two of our search_term
multi_term_line = 0
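
To check what a variable currently holds, pass it to `print`. This is a small sketch that assumes the values the lesson has assigned up to this point; the label text inside the second `print` is just an example.

```python
# the same values the lesson has assigned so far
search_term = 'Achilles'
term_count = 0

# print a variable on its own
print(search_term)                   # Achilles

# or print a label and the variable together
print('term_count is', term_count)   # term_count is 0
```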

When the term appears more than once on a line, we want to remember where that happened. Keep track of the current line number in a `line_number` variable and collect all the relevant line numbers into a list assigned to the variable `line_numbers_list`.

In [60]:
# so far, zero
line_number = 0


In [61]:
# an empty list we hope to fill with lines we might want to explore in greater detail
line_numbers_list = []
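
Lists grow one item at a time with `append`. Here is a quick sketch using a throwaway list; the name `example_numbers` and the values in it are made up for illustration.

```python
# start with an empty list (hypothetical name, for illustration only)
example_numbers = []

# append adds one item to the end of the list
example_numbers.append(12)
example_numbers.append(40)

print(example_numbers)       # [12, 40]
print(len(example_numbers))  # 2
```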

Remember that a string is just a series of characters, not a word or a sentence.
So you can represent those characters as lowercase or uppercase, or see whether the string starts with or ends with specific things.
Try to make our `search_term` lowercase:

In [62]:
# we want to compare lowercase only against lowercase
search_term = search_term.lower()
print(search_term)

achilles
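
A few of the other string tools mentioned above, sketched on made-up strings (`example` and `example_line` are hypothetical names). The last two lines show why matching case matters: Python treats `'A'` and `'a'` as different characters.

```python
example = 'Achilles'

print(example.upper())             # ACHILLES
print(example.startswith('Ach'))   # True
print(example.endswith('les'))     # True

# comparisons are case-sensitive, so lowercase both sides first
example_line = 'Achilles wept'
print('achilles' in example_line)          # False
print('achilles' in example_line.lower())  # True
```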


We've decided to standardize our strings for comparison by making them lowercase. Cool. Now we need to do the comparing. Our open file is ready to explore. And to do that, we'll need a `for` loop. The loop will assign each line to a variable on the fly, then reference that variable to do stuff we tell it to:

In [63]:
# begin looping line by line through our file
for line in all_lines:
    # increment the line_number (line_number = line_number + 1)
    line_number += 1
    # make the line lowercase
    line = line.lower()
    # check whether our search_term is in the line
    if search_term in line:
        # if it is, use a tool Python gives us to count how many times
        # and add that to the number of times we've seen already
        term_count += line.count(search_term)
        # if it has counted more than one in the line, we know it's there multiple times;
        # keep track of that, too
        if line.count(search_term) > 1:
            multi_term_line += 1
            # and add the line number to the list using a tool Python gives us for lists
            line_numbers_list.append(line_number)
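
To try the same logic without the full text, here is one possible sketch that wraps it in a function and runs it on three made-up lines. The function name `count_term`, its parameters, and the sample lines are all assumptions for illustration; they are not from the Iliad file or the lesson's code. It also uses `enumerate`, a built-in that hands back a running number alongside each item, instead of incrementing a counter by hand.

```python
def count_term(lines, term):
    """Return (total occurrences, count of lines with 2+ occurrences, those line numbers)."""
    term = term.lower()
    total = 0
    multi_lines = 0
    multi_line_numbers = []
    # enumerate numbers the lines for us, starting at 1
    for number, line in enumerate(lines, start=1):
        # count once per line and reuse the result
        occurrences = line.lower().count(term)
        total += occurrences
        if occurrences > 1:
            multi_lines += 1
            multi_line_numbers.append(number)
    return total, multi_lines, multi_line_numbers


# made-up sample lines, not quoted from the real text
made_up_lines = [
    'Achilles spoke, and Achilles wept.\n',
    'The ships lay quiet on the sand.\n',
    'Then ACHILLES rose.\n',
]

result = count_term(made_up_lines, 'Achilles')
print(result)  # (3, 1, [1])
```

The term appears three times in total, and only line 1 contains it more than once.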