Title: COMPSCI 32  
Author: Mike Smith  
Copyright: Copyright 2021 by Michael D. Smith. All rights reserved.

# A1S1: Scripts #

## P1: Read a Children's Book ##

Today we learn to write our first script. Recall that our goal is to get the machine to act in a certain way, to perform a specific computation that we desire. The purpose of a script is to articulate the sequence of steps that will, when executed, cause the machine to perform the actions we desire.

In order to be run, scripts must eventually be written in a programming language. Computer programming languages are not as flexible and forgiving (or ambiguous, depending on your point of view) as human languages. Each programming language defines its own specific syntax for the collection of general concepts that set the foundation for programming in almost all programming languages. You should know that it is easy to get bogged down in the minutiae of these syntactic details. Python, while better than many, still has its own share of such nit-picky details.

As you start to write your own scripts, I don't want you to start by worrying about how you say something with precise Python syntax. Why? Because you are just starting to communicate in this new language, and as a child begins to talk, most parents do not correct their every grammatical mistake. Instead, parents encourage the child's broken English (or whatever your native tongue is). The goal is to encourage them to *put words to their ideas*.

As we start to program, we have the same goal. Broken English for us is what computer scientists call *pseudocode*. In pseudocode, it is fair game to write down the step you'd like the computer to take at this point in your script in any English phrase that makes sense to you. If you want the computer to open a window on the screen, you can type "open a window on the screen". If you want the computer to add 5 to 6, you can type "add 5 + 6". I don't care what you write as long as you (and hopefully others) have a strong sense of what you mean. Every time I write some new code in class, I'll try to start with some of my own pseudocode.

Once we have written a nice chunk of pseudocode (i.e., some code that does something interesting), we can then translate that pseudocode into actual Python code and try to run it on our machine. I encourage you not to short-circuit these two steps by writing directly in Python until you find yourself thinking in Python. Even after decades of programming, I still sketch out some pseudocode (and often draw pictures) before I try to write any actual lines of code in any of the programming languages I know.

**Step 1: Define the problem to be solved.** Okay, let's start writing some code to solve a problem. We have been talking about children learning to read and speak, and this brought to mind one of my favorite children's books, *The Cat in the Hat* by Dr. Seuss. I just happen to have a plain text file here with the beginning of the book. Let's see if we can get the computer to read it to us.

**Step 2: Consider your inputs.** Once you've defined a problem to be solved, you should next consider the types of inputs you'll provide to the script you write. At this early stage in the design process, you don't have to consider the full range of inputs, but you should at least review (or list) some example inputs. For our current problem, let's take a quick look at this plain text file I just mentioned.

As you can see, the file contains only the text of *The Cat in the Hat*. It doesn't include any of the whimsical pictures.

**Step 3: Choose your IDE.** I want to write some pseudocode, but where am I going to do it? There are numerous applications that help us to write, run, and debug our code. These tools are typically called an *editor*, an *interpreter* (or *compiler*), and a *debugger*. While you can use three standalone tools, many programmers like to use what is called an *Integrated Development Environment (IDE)*. An IDE packages together all the tools you'll need in a single application. 

I'll often use Microsoft's Visual Studio Code IDE in class. You may choose to use this tool. It's free. You may find that it has a steep learning curve, but don't let yourself be overwhelmed with its many features. I use only a small slice of its functionality myself.

You do have other choices, especially as we get started. The "Tools and Environments" page on our Canvas course site talks about the concept of editors and interpreters in more detail, and it tells you about a range of choices and the tradeoffs in those choices. Different choices give you different advantages.

As this book begins, I encourage you to follow along and write your own version of the script we'll build together in a browser-based IDE called *Replit*. You can access it by opening your favorite browser and typing [replit.com](http://replit.com) in the address bar. When the page appears, go ahead and sign up (unless you already did this in class) and then log in.

The narrative in this first chapter is built on Replit and its interface as of October 2021. One of the most appealing features of Replit is that you can get coding without having to struggle with the complexities of creating a Python development environment on your own personal machine. This is a task that we'll take on later, but let's get started with the fun parts.

You may, however, have some familiarity and experience with another IDE. If so, please feel free to try Replit or follow along in the environment you already know as we develop our very first Python script together.

Finally, I should mention this book is available as an Interactive Python Notebook (`.ipynb`). If you load this first chapter into Google Colab, for example, you can run most of the code blocks. Code blocks that DO NOT start with a comment of the form "`### Informative message`" are code blocks associated with the script we're building together. You can always run them. On the other hand, code blocks that DO start with a `###` comment may or may not be runnable. Those that are runnable often encourage you to try a piece of Python code as a learning exercise. The rest exist for other pedagogical purposes and are marked as "`### NOT a script and therefore NOT executable`". Please be alert to these informative messages.

Both Replit and Google Colab allow us to write, run, and debug Python code. They differ in that Replit's approach assumes you're writing scripts to be then fed to the Python interpreter while Google Colab's approach foregrounds the interactive Python interpreter. If you don't understand what this means, that's ok. You will very shortly. Just remember this difference if you're reading this text and using Google Colab instead of Replit.

**Step 4: Getting started with Replit.** Using the Replit IDE, let's start building our first Python script together.

Click the "Create Repl" button; you may have to click the sidebar button to see it. In the resulting dialog box, choose a Python template, change the title to "seuss", and click the "Create Repl" button at the bottom of the dialog box.

What results is a window segmented into three pieces. On the left is a file browser, in the middle is the editor where we will write our scripts, and on the right is where we'll see the results of running our scripts. Right now, we have one file `main.py`, which is selected and displayed as empty. The `.py` extension tells the system that it will find (eventually!) Python code in this file.

We are now ready to start solving our problem of getting the computer to read a book to us!

**Step 5: Pseudocode and comments.** In the center panel, write a piece of pseudocode that says:

In [None]:
# Open the book so we can read it

We just wrote in English what we want our script to do. This explanation begins with a character (`#`) that holds special meaning in Python. When the Python interpreter sees it, it tells the interpreter to ignore everything from the hash mark to the end of the line. This is like a note to ourselves or to anyone reading our script. It's not a part of the script that we want the computer to interpret or act out. Programmers refer to things like this as *comments*. While this is only one of several ways to write comments in Python, this is sufficient for us right now.
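Here is a small sketch (not part of our script) showing the two places a hash-mark comment can appear: on a line of its own, or trailing a statement. The name `total` is made up for this illustration, and the `=` and `print` it relies on are explained later in this chapter.

```python
### EXECUTABLE, but NOT part of our script
# This whole line is a comment, so the interpreter ignores it.
total = 2 + 3  # The interpreter also ignores everything after this hash mark.
print(total)
```

Only the arithmetic and the `print` are acted upon; the English text is there for human readers.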

What are comments good for? According to Swaroop in [*A Byte of Python*](https://python.swaroopch.com/), "Code tells you how, comments should tell you why." And that's generally true, but we're not commenting a completed piece of code. We are simply using this commenting capability to sketch out what we want our script to do.

**Step 6: Commands and input parameters.** Normally, we would write more pseudocode to sketch out what we want the computer to do for us, but let's convert this pseudocode into Python code so that we can start to get a feel for what it is like to command the computer to do things for us.

If we had a physical book, the first thing we would do to begin reading it would be to open it up. Let's write `open('CatInTheHat.txt')` below the comment line. This happens to be a valid line of Python code. It tells the Python interpreter to do something. What does it tell the Python interpreter to do? To open the file named `CatInTheHat.txt`.

Let's talk about the syntax here for a moment. The form of this statement to the Python interpreter is: *command*(*input parameters*). In the case of an `open` command, the input parameter tells the system what we want opened.

What is this command `open`? It is something that the Python interpreter knows how to do because: (1) it is built into the Python language; (2) it was added to the language by some other person; or (3) we added it to the language ourselves! In the case of `open`, it is a built-in function. We will talk about this more later, but for now, that's all you need to know.
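To see the *command*(*input parameters*) shape with something other than `open`, here is a sketch that uses two other built-in functions, `len` and `max`. The names `name_length` and `larger` are made up for this illustration.

```python
### EXECUTABLE, but NOT part of our script
# Same shape each time: a command name, then its input parameters in parentheses.
name_length = len('CatInTheHat.txt')  # one input parameter
larger = max(5, 6)                    # two input parameters, separated by a comma
print(name_length, larger)
```

`len` counts the characters in the file name, and `max` picks the larger of its two inputs.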

Great! Now what? 

**Step 7: Scripts versus execution.** Well, so far we've only written some things in a script. It's like a playwright writing words on a page. What's the equivalent in programming of a playwright handing the script to some actors so that they can perform the script (i.e., so that the computer can execute our script)? In this IDE, the big green button at the top of our window labeled "Run" is a big hint. This button tells the IDE to take our script and run the entire thing.

Let's try it with the script we've written so far, which is copied below:

In [None]:
# Open the book so we can read it
open('CatInTheHat.txt')

Oops, we got a whole lot of red text in the rightmost panel. 

**Step 8: Our first error.** No worries. The Python interpreter is simply trying to tell us that it couldn't understand our script in some way. If it helps, think about the interpreter as an actor whose English isn't too great. What can we learn from the broken English in this error message?

Yup, `FileNotFoundError` is a great clue, and the file that the interpreter couldn't find was our `CatInTheHat.txt` file. In addition, we can learn that it had this trouble when it tried to execute our `open` command on line 2 of our script.

This makes sense because there is nothing but our `main.py` file in our file view panel. How can we change this? Well, we could click the "Add File" button in the file view panel, but then we'd have to type in our text file. I'd rather not type in something that already exists. So where can I find it? It's in the `txts` directory inside the `a1s1` (which stands for Act I, Scene I) directory under "Files" on the course's Canvas site. Great, if we click the three vertical dots at the top of the "Files" panel on Replit, we see that we can "Upload File". So, let's download the `CatInTheHat.txt` file from Canvas (to our machines using the three vertical dots next to the file on Canvas) and then upload it to our individual Replit sessions.

With that done, let's click the green Run button again.

Success! I mean, at least no error this time. But also nothing appeared to happen. The right panel in Replit simply displayed a blank line.

**Step 9: The interactive interpreter.** The rightmost panel in Replit isn't just a place where we see the results of the execution of our script, but it is also where we can feed pieces of Python code directly to the *interactive* Python interpreter. It's like asking the actor to read one line of your script so that you can see how it sounds, or in our case, what the line causes the interpreter to do.

Let's feed the two lines of our script to the interactive Python interpreter one at a time.

When we feed our comment line to the interactive interpreter, what happens? Nothing! Exactly what the interpreter should have done with a comment line.

Let's now type our `open` line in the rightmost panel. What happened this time? Something interesting. It's not red-colored text, and so it is not an error. The interpreter is trying to tell us something different. In technical jargon, the interactive Python interpreter prints for us the value computed by the statement it executed. To understand what this means, let's try something simpler.

**Step 10: The interactive interpreter as a calculator.** What happens if I write `2 + 3` as the input to this interactive Python interpreter? Before I hit return, take a moment and think about what I'm asking the interpreter to do.

Cool, we can use the interpreter like an infix calculator. It takes an arithmetic expression, and it prints what it calculates. More generally, the interpreter will print for us the value it computes, if any, for each statement we ask it to execute.
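If you'd like a few expressions to start with, here is a sketch. At the interactive prompt you can type just the expression inside each `print(...)`; the `print` wrapper (a command we meet properly later in this chapter) is there only so that the block also shows its results when run as a whole.

```python
### EXECUTABLE, but NOT part of our script
print(2 + 3)        # addition
print(2 + 3 * 4)    # multiplication happens before addition
print((2 + 3) * 4)  # parentheses change the order of evaluation
print(7 / 2)        # division gives a number with a decimal point
```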

Take a bit of time and play with the interpreter as a calculator.


Ok, so we have validated that the interactive Python interpreter prints the value of any statement we ask it to evaluate. So, what is this value that it printed when we asked it to open a file?

**Step 11: Python help.** Wouldn't it be nice to ask for some help? Yes, it would. So let's do it! Type `help(open)` in the interactive interpreter and execute it.

In [None]:
### EXECUTABLE, but NOT part of our script
help(open)

The `help` command is another Python built-in function, and it tells us about what we passed to this function. We can learn that the built-in function `open` takes a filename and a whole bunch of other input parameters. These other parameters, however, are optional parameters, which are given default values if you do not specify them. That's what the `=` syntax means.

For now, let's look only at `mode` and `encoding`. `mode` defaults to `'r'`, which means that we are going to read the file. Alternatively, we might want to write the file. `encoding` is something we will talk about in greater detail later, but for now you can consider it as telling the interpreter in what language the file is written (e.g., a computer equivalent of asking if a sentence is written in English or French).
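The sketch below spells out both optional parameters instead of relying on their defaults. It runs ahead of our story in several ways, so treat it as a preview: it uses a throwaway directory from the standard `tempfile` module so that no real file is touched, and the file name `note.txt` and the names `out_file`, `in_file`, and `contents` are all made up for this illustration.

```python
### EXECUTABLE, but NOT part of our script
import os
import tempfile

# Work in a throwaway directory so that no real files are touched.
with tempfile.TemporaryDirectory() as scratch_dir:
    path = os.path.join(scratch_dir, 'note.txt')

    # mode='w' opens the file for writing (creating it if needed).
    out_file = open(path, mode='w', encoding='utf-8')
    out_file.write('The sun did not shine.\n')
    out_file.close()

    # mode='r' is the default, so open(path, encoding='utf-8') would do the same.
    in_file = open(path, mode='r', encoding='utf-8')
    contents = in_file.read()
    in_file.close()

print(contents)
```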

Finally, we notice that the explanation starts by telling us that this built-in function opens a file and returns a stream. Again, we will talk about file I/O in greater detail later in the semester, but think about it. When I open a book, I probably open the cover and put my finger on the first word on the first page. You can think of a stream as my virtual finger telling me where I'll read the next characters in the file.

Back to what was printed, the system is telling me about its virtual finger on the page. Roughly, I have something that does I/O on a text file whose name is `CatInTheHat.txt`. The file is open for reading, and the interpreter assumes that the content of the file is encoded in this UTF-8 language.

**Step 12: Revisiting our first error.** What happens if we ask the interactive Python interpreter to open a file that doesn't exist? We saw that this caused an error when we executed our whole script and the text file we wanted wasn't in the list of files. Will this error message look different when the problem occurs in the interactive interpreter? I don't know. Ok, I do, but let's pretend I don't.

Let's misspell the file name and execute that `open` statement. Type and execute `open('CatInTheHut.txt')` in the interactive interpreter.

We got a red error message again, as expected. This time it shouldn't look as scary, as we're already familiar with it. Yup, it's a `FileNotFoundError`, just as we might have expected. This is a type of `OSError` that `help` told us would be raised upon failure to successfully complete the `open` operation.
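We can ask Python to confirm that `FileNotFoundError` really is a kind of `OSError`. The sketch below uses the built-in `issubclass` function and a `try`/`except` statement, neither of which we have covered yet, so read it as a preview. The empty throwaway directory (from the standard `tempfile` module) is my assumption to guarantee that the misspelled file does not exist.

```python
### EXECUTABLE, but NOT part of our script
import os
import tempfile

# Is every FileNotFoundError also an OSError?
print(issubclass(FileNotFoundError, OSError))

# Try to open a misspelled file name inside an empty throwaway directory.
with tempfile.TemporaryDirectory() as empty_dir:
    missing_path = os.path.join(empty_dir, 'CatInTheHut.txt')
    try:
        open(missing_path)
    except OSError as error:
        # We asked to catch the general OSError; record which specific error arrived.
        error_type_name = type(error).__name__

print(error_type_name)
```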

The only difference this time is that the `Traceback` looks a bit different. It doesn't tell us where in our script the execution error occurred, but that there was a failure on line 1 of this cryptic file "`<stdin>`". We don't really need to understand this broken English, since this traceback information is only really needed when we have an error buried somewhere in a multi-line script. 

**Step 13: Undefined names.** Ok, let's get back to our script in the middle panel and what we were supposed to be doing. After `open`, the interpreter is sitting waiting for us to tell it what we want it to do with the open file (our virtual open book).

Well, how about we read the first line? If Python has a built-in command to open a file, maybe it has another built-in function called `readline`. Let's type `readline()` on line 3 of the `main.py` script and run the entire script again.

In [None]:
# Open the book so we can read it
open('CatInTheHat.txt')
readline()

Oops, another error that says, `NameError: name 'readline' is not defined`. This seems to be telling us that Python doesn't know what to do when we tell it to execute a command called `readline`.

Let's look at the online Python documentation and in particular at the list of [built-in functions](https://docs.python.org/3/library/functions.html#built-in-funcs).

The command `open` is there, but no `readline`. In fact, there are no read functions at all. There must be something wrong with our thinking. What good is it to open a file if you cannot read it?

**Step 14: Talking through your confusion.** Let's try these commands on a person to see if we can glean what might have gone wrong. You should try it with a friend and I'll try it with one of my teaching fellows (TFs). Command them to `open('book_name')`, where `book_name` is the name of a book that they're holding, and then command them to `readline()`.

Did your friend or the TF do what was commanded? Often the answer is "Yes, but." Yes, they opened the book I specified, but it's hit or miss if they read a line from that book. When I said `readline()`, I didn't specify what line I wanted read. What if our script had opened two files? How would the interpreter know which of the open files it should read?

This is an example of what's good and bad about computers. They are very literal. They will do exactly what you ask them to do and nothing more.

If you're lucky, when you tell a computer to do something ambiguous, it won't do anything and tell you that it is confused by your command. That's what the Python interpreter did when I asked it to execute a command name that it didn't know.

If you're unlucky, it will do something random, as the TF did for me, and you need to determine for yourself if the computer did what you wanted it to do. We will spend a good bit of time in this class talking about how to avoid these situations and what to do when you find yourself in them.

**Step 15: Naming the result of a computation.** Maybe we can solve this problem of ours by using the result of the `open` command. In particular, we want to tell the interpreter to use that thing it computed during `open` to read the first line in that file's stream. You know, that line under its virtual finger. This sounds promising. Let's delete `readline()` and type a new piece of pseudocode:

In [None]:
# Open the book so we can read it
open('CatInTheHat.txt')
# Read the first line using that thing you computed

Hmmm. We have a naming problem. How do I tell the interpreter what "thing" I'm talking about? Luckily, Python and all other programming languages allow us to name the results of a computation.

Edit the `open` statement as follows, which assigns the result of the `open` command to the name `my_open_book`:

In [None]:
# Open the book so we can read it
my_open_book = open('CatInTheHat.txt')
# Read the first line using that thing you computed

A single equal sign (`=`) is the *assignment operator* in Python. We'll talk more about naming and the intricacies of assignment, but for the moment, you can think of this single equal sign as saying, "Whatever `open` produced as a result, we will call it `my_open_book`."
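Assignment works the same way for any computed result, not just for open files. Here is a sketch with plain numbers; the names `total` and `doubled` are made up for this illustration.

```python
### EXECUTABLE, but NOT part of our script
total = 5 + 6        # compute the sum, then call that result `total`
print(total)
doubled = total * 2  # a name can be used wherever its value is needed
print(doubled)
```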

With a name for our open book, we can append `my_open_book.readline()` to the end of our script, and the interpreter will go ahead and read the first line from the specified file. Here's the entire script we've written so far:

In [None]:
# Open the book so we can read it
my_open_book = open('CatInTheHat.txt')
# Read the first line using that thing that you computed
my_open_book.readline()

Click the green Run button and run it. No errors! But also no output.[^fn1] 

**Step 16: Debugging.** The script ran, but it didn't produce the output that we expected. It's one thing to have the machine report to you that something didn't work, but it can be even more frustrating when the machine completes its execution and the result isn't what you expected. Where do you start to figure out what went wrong?

We will learn that there's no single, simple answer to this important question. The good news is that there are many effective strategies you can employ to determine what went wrong. Because our current script is so small, we'll start with the most straightforward, which you can think of as the brute force approach. In this approach, we start at the beginning and review the operation of each line in our script. To help us in this, we can use the interactive interpreter, which not only allows us to execute an individual line of Python code, but it will also keep the state of our computations.

It's perhaps easier to understand what this means if we run through the execution of our script in the interactive interpreter. Feed each line of our script that isn't a comment, one at a time, to the interactive interpreter. The following is a transcript of what happened for me when I did exactly this.

In [None]:
### NOT a script and therefore NOT executable
> my_open_book = open('CatInTheHat.txt')
> my_open_book
<_io.TextIOWrapper name='CatInTheHat.txt' mode='r' encoding='UTF-8'>
> my_open_book.readline()
'The sun did not shine.\n'

You'll see that I typed the name of the result we computed in our `open` command at the interactive interpreter's prompt after the first line in our script, and the interpreter nicely told me the value associated with that name (i.e., a `TextIOWrapper` object). We don't (yet) fully understand the details of this object, but it looks ok.

More importantly, by typing this line at the interactive interpreter's prompt, which didn't exist in our original script, we can see what it means for the interactive interpreter to keep the state of our computation. It hasn't forgotten about the result we computed in the `open` command or that we asked for it to be named `my_open_book`. In this way, we can look at (and even modify) the computation state at any point in the execution of our script. 

By the way, congratulations! You've successfully begun your first debugging session.

Continuing on, I then typed the second line of our script, and hit return. Whoa! This time the interpreter printed the first line in the book! Our script will read the first line of our book!

**Step 17: Running versus debugging a script.** The interactive interpreter is a great way to test little pieces of code. It's like asking the actors in the theater to run through just a small piece of the overall script to test out that piece. We expect our actors to jump in and out of character while we run these tests on pieces of the script.

When it comes to showtime, however, we want slightly different behavior. Actors should stay on script and do only what's specified in the script. Extending this description to our computer environment, running the entire Python script by clicking the green Run button in Replit tells the Python interpreter that this is its showtime. The showtime run of our entire Python script printed nothing in the rightmost execution panel because we didn't tell the Python interpreter to print anything. In particular, the last statement we told the interpreter to execute was to read a line from our file. And that's all it did. If we want the interpreter to print the line it read, we must tell it explicitly to do this. It's like the difference between asking a human actor to read a line in a book while onstage versus reading a line in a book and saying it out loud.

**Step 18: Printing to the console pane.** Replit calls one of the tabs in its rightmost panel the *console*. We will also refer to the console as the *terminal window*, since that's where the text we print will appear when we run our Python scripts at a command prompt in a terminal window.

To print to the console or terminal window, you simply command the interpreter to `print`. The argument to the `print` command is what we want to print, which in our case is that thing the interpreter read in `readline`. I guess we had better name it so that we can refer to it in our `print` command. Let's call it `the_line`. The following is what our script looks like after these additions.

In [None]:
# Open the book so we can read it
my_open_book = open('CatInTheHat.txt')
# Read the first line using that thing that you computed
the_line = my_open_book.readline()
print(the_line)

In pseudocode, `print(the_line)` says something like "please print the value of the object named `the_line` to the terminal screen."
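One detail you may notice when you run the script: the line that `readline` hands back ends with a newline character (the `\n` in the Step 16 transcript), and `print` adds a newline of its own. The sketch below starts from that same string value rather than from the file, so it runs anywhere. The `end=''` parameter and the built-in `repr` function are things we haven't covered yet.

```python
### EXECUTABLE, but NOT part of our script
# The value that readline returned in the Step 16 transcript, newline included.
the_line = 'The sun did not shine.\n'

print(the_line)          # the text, then a blank line (two newlines in total)
print(the_line, end='')  # the text followed only by its own newline
print(repr(the_line))    # the string as Python writes it, with quotes and \n visible
```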

Do we dare try running our entire script now? Are you ready for showtime? Go ahead and run it. 

Success! We have started to create a digital agent that can read a book to us, or it could be a digital actor that could read a script to an audience.

**Step 19: Statements, objects, attributes, types, and string literals.** Let's take a moment here and make sure we understand everything we have done and why the interpreter understood some of what we did and not other things we tried.

We have been writing *statements*, one per line, that the Python interpreter reads. As the interpreter reads a line, it discards anything that is a comment, and if there's something left (the actual statement), it performs the requested computation.

When the Python interpreter executes a computation based on a statement in our script, it works with things called *objects*. Almost everything in Python is an object. For now, think of it as a blob with *attributes*. We haven't looked at the attributes of any object yet, but the collection of attributes an object has is defined by its *type*, also called its *class*. 

For example, the result of our `open` command was an object, which we named `my_open_book`. The result of the `readline` command was also an object, which we named `the_line` and printed to the terminal screen.

We'll encounter attributes soon and talk more about types and classes later in the course, but for now you should begin to understand what a Python object is. For example, the first Python statement in our script that isn't a comment refers to three distinct objects. (1) `'CatInTheHat.txt'` is a *literal* object, which has a fixed value; (2) `open` without its parentheses and arguments is an object that allows us to do things; and (3) there is an object produced by the execution of `open` that we have named `my_open_book`. You will also hear people talk about `my_open_book` as a *variable*, since we can have it name one object at one time and a different object at another.

Python has a few builtin functions that can help you understand what kind of objects you have in your script. The first of these is the builtin function `type`. Try it out by running the following four commands in the interactive interpreter as I've done below:

In [None]:
### NOT a script and therefore NOT executable
> type('CatInTheHat.txt')
<class 'str'>
> type(open)
<class 'builtin_function_or_method'>
> my_open_book = open('CatInTheHat.txt')
> type(my_open_book)
<class '_io.TextIOWrapper'>

Unsurprisingly, `'CatInTheHat.txt'` is a string, which is what we wanted it to be. It is of type `class 'str'`. You can write strings in Python by enclosing a set of characters in either single or double quotes. This flexibility makes it easy to include single or double quote characters in a Python string. For instance, try typing `type("Mike's")` at the interactive interpreter's prompt and hit return.
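Here is a sketch of the two quoting styles side by side. The names `possessive` and `quotation`, and the sample sentences, are made up for this illustration.

```python
### EXECUTABLE, but NOT part of our script
possessive = "Mike's"           # double quotes let the string contain a single quote
quotation = 'He said "hello".'  # single quotes let the string contain double quotes
print(possessive, type(possessive))
print(quotation, type(quotation))
print('seuss' == "seuss")       # the choice of quote marks does not change the string
```

Both values are of type `str`; the quote marks are only how we write the string down.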

The `open` command is a builtin function as we hoped. For now, we will use the terms "command", "function", and "method" interchangeably. They all roughly mean: "go do something complex based possibly on some input data I provide and return to me the result of your computation."

Finally, `my_open_book` names an object whose type is the type of the result that the `open` command produced, namely `_io.TextIOWrapper`.

**Step 20: Aliasing.** I need to emphasize that this variable `my_open_book` may be one of many names for the same object. For instance, I can create another name for the object named `my_open_book` with the assignment operator.

In [None]:
### EXECUTABLE, but NOT part of our script
my_open_book = open('CatInTheHat.txt')
the_same_book = my_open_book

Let's verify this claim and learn another builtin command along the way. First let's look at what the `type` command tells us about `the_same_book`.

In [None]:
### EXECUTABLE, but NOT part of our script
my_open_book = open('CatInTheHat.txt')
the_same_book = my_open_book
type(the_same_book)

This command tells us that both variables have the same type. We'd expect that if both named the same object, since each object has only one type. But `the_same_book` might not refer to the same single object as `my_open_book`, since it might be simply a copy of that original object. We need to determine if the assignment operator just creates a new name for an object, or if it creates a copy of the object and gives the new name to the new copy.

This is the purpose of the builtin `id` function. It tells us the value of an object's identifier, which is an attribute common to every Python object. Object identifiers are unique: no two objects that exist at the same time have the same object identifier. It's like Social Security Numbers (SSNs) in the United States; no two people should have the same SSN. Let's print the object identifiers for both of our variables.

In [None]:
### EXECUTABLE, but NOT part of our script
my_open_book = open('CatInTheHat.txt')
the_same_book = my_open_book
id(my_open_book), id(the_same_book)

You probably got back two large numbers, and if you look closely, they are the same number. Therefore, `my_open_book` and `the_same_book` are two different names for the same object. This is equivalent to the fact that I respond to both Mike and Michael.

Why is this an important fact to know? Have the Python interpreter execute the following statements and watch what the last two produce:

In [None]:
### EXECUTABLE, but NOT part of our script
my_open_book = open('CatInTheHat.txt')
the_same_book = my_open_book
my_open_book.readline()
the_same_book.readline()

What happened? My virtual finger moved along in the book no matter which name I used to read a line! This is called *aliasing* in computer science, which makes sense given our common understanding of the term.
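If you don't have the text file handy, the sketch below reproduces the effect with `io.StringIO`, a standard-library object that behaves like an open text file but keeps its text in memory. Using it here is my substitution for `open`, and the second line of text is a made-up stand-in rather than the book's actual words.

```python
### EXECUTABLE, but NOT part of our script
import io

# An in-memory stand-in for our open book (the second line is invented).
my_open_book = io.StringIO('The sun did not shine.\nA second stand-in line.\n')
the_same_book = my_open_book          # a second name for the same object

first = my_open_book.readline()       # read through one name...
second = the_same_book.readline()     # ...and the other name continues where it left off

print(repr(first))
print(repr(second))
```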

Again, I raise this point for two reasons: (1) You should have the right mental model for what's happening in the interpreter; and (2) you will be caught by this aliasing bug at some point in your writing of Python scripts.
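To give a flavor of how the aliasing bug tends to show up, here is a sketch using a Python list, a type we haven't introduced yet. The names `shopping`, `same_list`, and `copied_list` are made up for this illustration, and `list(shopping)` is one of several ways to make a separate copy.

```python
### EXECUTABLE, but NOT part of our script
shopping = ['hat', 'fish']
same_list = shopping          # a second name for the same list object
copied_list = list(shopping)  # a separate list with the same contents

same_list.append('cat')       # change the list through one name...

print(shopping)               # ...and the change shows up under the other name
print(copied_list)            # the copy is unaffected
print(id(shopping) == id(same_list), id(shopping) == id(copied_list))
```

If you expected `shopping` to stay unchanged, you have just met the bug: `same_list` was never a copy.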

**Step 21: Namespaces.** I want to get back to building our digital actor that reads our book, but I should take another moment to explain the syntax of the `readline` statement that actually worked.

We saw that `readline` by itself was not a name for a Python object because we received a `NameError` when we tried using it. (If you pass `readline` as the argument to the builtin `type` or `id` commands, you'll learn the same fact.) But by typing `my_open_book.readline()`, we were able to successfully call a function by that name that reads a line. (You might want to try asking `type` and `id` about `my_open_book.readline` in the interactive interpreter and see what happens.) What's going on here?

The computer science term for this is *namespaces*. This concept isn't hard because we use namespaces in our daily lives all the time. For example, when I use the name "Christina" at the office, I mean my administrative assistant. When I think of "Christina" at home, I mean my wife. My office and my home are two different namespaces, and I replace one with the other as I move from home to work and back again.

As another informative example, when I think of "Jim" while at work, I'm thinking of Jim Waldo, my long-time co-instructor. If someone mentions "Jim" to me at home, I'll look confused. I don't have anyone at home named "Jim".

This second example is what happened with the name `readline`. This name is known within the namespace of an `_io.TextIOWrapper`, but not in the namespace of builtin functions.

Python provides a builtin function, called `dir`, that lets you see what names are currently defined in a namespace. Type `dir()` at the interactive interpreter's prompt, and it will show you the names in the current, local namespace. Hey, there are our two defined names, `my_open_book` and `the_same_book`. When we did an assignment to these names (via the `=` operator), Python put these names in our local namespace so that later uses of those names worked without error.

In the list of names we got back from `dir`, you'll also see a number of special names with double underscores at the start and end of the identifiers. I don't want to dwell on these special names at this point, but I will direct your attention to the name `__builtins__`. Type `dir(__builtins__)` and you can see that this is where Python keeps the names for its builtin functions and exceptions (e.g., `help`, `type`, `dir`, `NameError`, and `FileNotFoundError`).

Of particular note is that the name `readline` is not in the local namespace or the builtin namespace. What about the namespace associated with the object `my_open_book`? Let's type `dir(my_open_book)` and see. Bingo! There it is! These are the attributes of the object known to us as `my_open_book` (and `the_same_book`).

Putting this all together: when we wrote `my_open_book.readline()`, you can read that as saying, "We expect that in the namespace of the Python object named `my_open_book` there is an attribute called `readline`. And because we followed that attribute with empty parentheses, we believe that this attribute is a function we can invoke without any input parameters. Assuming we're correct, execute this command and please return to me the result of that computation."
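Here is a short sketch of those namespace lookups written as a script rather than typed at the prompt. It makes two assumptions: it uses `io.StringIO` as an in-memory stand-in for our open book, and it reaches the builtin namespace through `import builtins`, which names the same module that `__builtins__` refers to when you type it at the interactive prompt. The name `reader` is just for illustration.

```python
import builtins
import io

my_open_book = io.StringIO('first line\n')   # in-memory stand-in for a file

print('readline' in dir(builtins))       # False: not a builtin name
print('readline' in dir(my_open_book))   # True: an attribute of the object
print('print' in dir(builtins))          # True: print lives with the builtins

# The two halves of my_open_book.readline(), taken one at a time.
reader = my_open_book.readline   # look up the name in the object's namespace
print(reader())                  # the parentheses invoke what we found
```

Splitting the lookup from the call shows that the dot selects a namespace and the parentheses do the invoking.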

Give yourself a high-five. You have learned a lot about writing scripts that a program can interpret and how it accomplishes this.


**Step 22: Reading more than one line, by duplicating commands.** I don't know about you, but if someone told me that they were going to read to me and then stopped reading after the first line, I'd probably be unhappy with them. How do we get our script to read the next line in our book?

Well, we saw earlier that each call to `readline` on the object representing our open book moves my virtual finger forward. So, if we add another copy of the command `my_open_book.readline()` to the end of our script, I bet we would read the next line and move our virtual finger to the third line. And if we want to see the line we just read, we should print again. Since our script is starting to get long, I'll also remove the comments we originally inserted. When your script looks like the following, hit that green Run button.

In [None]:
my_open_book = open('CatInTheHat.txt')
the_line = my_open_book.readline()
print(the_line)
the_line = my_open_book.readline()
print(the_line)

Yea!

**Step 23: Carriage returns.** You might be annoyed by the blank line inserted between the first two lines of our book. To understand why this happens, we first need to talk about special characters, and specifically, the carriage return character.

What, might you ask, is a carriage return? It is a term that had meaning when people used [old style typewriters](https://www.youtube.com/watch?v=FkUXn5bOwzk). Around 1:28 into the linked video, the person pushes the silver handle to reset the carriage to the start of the next line. This silver handle is a carriage return. On our computers when we're in a text editor, we simply hit [the button labeled "return" (Apple Mac) or "enter" (IBM PC)](https://en.wikipedia.org/wiki/Enter_key).

The question for us is how do we represent this carriage return in our text documents? The answer is that it becomes a special character sequence in our strings. We can find this sequence at the end of each line we read in using the `readline` method on a stream. Again, we can use the interactive interpreter to take a peek behind the scenes at our computation. Remember, in the interactive interpreter, typing an object's name tells the interactive interpreter that we want to see the value of that object.

In [None]:
### EXECUTABLE, but NOT part of our script
my_open_book = open('CatInTheHat.txt')
the_line = my_open_book.readline()
the_line

The backslash followed by a little n (`\n`) is the printed representation for a carriage return (also called a newline character). Strictly speaking, `\r` is a separate carriage-return character, but I'll use the two terms interchangeably for `\n`. You can type a backslash followed by a little n in any string literal to say that you want a carriage return at that point.

Now type `print(the_line)` and hit return in the interactive interpreter. Notice that when we print the value of this object, the computer system knows that we want an actual return and not a character representation of the carriage return.
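In a script, where typing a bare name shows nothing, the builtin `repr` function gives the same behind-the-scenes view. The sketch below assumes a string literal standing in for the line we read from the file.

```python
the_line = 'The sun did not shine.\n'   # stands in for a line read from the file

print(repr(the_line))   # shows the characters, including the backslash-n
print(len(the_line))    # 23: the backslash-n pair counts as one character
print(the_line)         # shows the text with a real line break
```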

Wait a minute! How many carriage returns occurred in the execution of the `print` command that ends the following script?

In [None]:
### EXECUTABLE, but NOT part of our script
my_open_book = open('CatInTheHat.txt')
the_line = my_open_book.readline()
print(the_line)

Two! One at the end of "The sun did not shine." and another at the end of the blank line that follows it.

We can verify this by creating a new string object without any carriage return at the end of the string.

In [None]:
### EXECUTABLE, but NOT part of our script
a_plain_line = 'The sun did not shine.'
print(a_plain_line)

There are a lot of formatting options associated with `print`, and I encourage you to read about them. We won't cover them all. But to answer our mystery, we can look quickly at `help(print)` and notice this input parameter called `end`. By default, a newline character is appended to the end of anything we print with `print`.

To fix our printed output, we simply need to write our script so that this extra newline isn't appended. That means we just specify the `end` parameter we want, which is nothing (i.e., a string with no characters). A string with no characters is written `''`, which looks like a double quote but is actually two single quotes right next to each other (i.e., we type a single quote to indicate the start of a string, type no characters in the string, and then type another single quote to mark the end of the string).

Here's our updated script, which eliminates the unwanted blank lines in our printed output.

In [None]:
my_open_book = open('CatInTheHat.txt')
the_line = my_open_book.readline()
print(the_line, end='')
the_line = my_open_book.readline()
print(the_line, end='')
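If you want to check exactly which characters `print` produces, one option is to send its output into an in-memory stream and inspect the result. This sketch assumes `io.StringIO` objects as the destinations and uses the `file` parameter of `print`, which you can also find in `help(print)`. The names `default_output` and `trimmed_output` are mine.

```python
import io

the_line = 'The sun did not shine.\n'   # stands in for a line read from the file

default_output = io.StringIO()
print(the_line, file=default_output)            # print appends its own newline

trimmed_output = io.StringIO()
print(the_line, end='', file=trimmed_output)    # nothing appended

print(repr(default_output.getvalue()))   # 'The sun did not shine.\n\n'
print(repr(trimmed_output.getvalue()))   # 'The sun did not shine.\n'
```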

**Step 24: Control-flow statements.** We have a script that reads the first two lines of the story, and we could clearly copy the last two statements many more times if we wanted the entire story read. But that creates a script that is tailored to the length of this specific story. What if we wanted to build a digital actor that could read an entire story no matter what its length?

This is the purpose of *control-flow* statements in computer programs. The behavior of our script changes based on the testing of some condition in our code. In this case, the condition we want to test is if we have run out of lines in our story.

What would the pseudocode for such a script look like? Let's write some and figure out the actual Python syntax we need later. Instead of duplicating the `readline` and `print` statements, let's reuse the original ones and insert a test to see if we just printed the last line of the story.

In [None]:
my_open_book = open('CatInTheHat.txt')
# Here.
the_line = my_open_book.readline()
print(the_line, end='')
# Was that the last line? If yes, quit. If no, go here.

The pseudocode in the comment at the end of the script illustrates the basic structure of a *conditional branch* in most programming languages. With it, the computer tests some condition, and if the condition is true, it executes one block of statements, otherwise it executes a different block of statements.

The pseudocode also illustrates that we can add statements that (unconditionally) jump around in our script, both forward (toward statements later in our script) and backward (toward statements earlier in our script). With this new capability, we don't have to just execute a script one statement at a time from the first to the last. Computer programs now look more like musical scores than the script for a play. Notwithstanding this issue with my theater analogy, I'll continue to use it.

**Step 25: Unit tests.** Before we convert this pseudocode to actual Python statements, let's ask ourselves if this logic is correct. This is not easy to do. Because I designed this logic, I'm already inclined to believe it is correct. That's just human nature.

A good way to avoid this trap is to pick a set of test inputs from different parts of the input space, and then carefully work through our code thinking about what it will do on each input. When we have running code, we can perform these same tests by building actual test inputs and running the code to see what happens, but for now, let's just analyze things by hand.

What describes the space of inputs for our script?

It is certainly true that we're hoping to read a number of different input files a line at a time. It might be useful, therefore, to test our pseudocode on input files of different lengths, as measured in lines. Let's try text files of three different lengths: (a) a file with no lines; (b) a file with one line; and (c) a file with more than one line.

Starting with a file with no lines in it, what will our script do?

Let's see. It first opens the empty file. It next attempts to read a line from this empty file. What does `readline` return in this case? We can use the builtin `help` function to answer this question.

If you type `help(my_open_book)`, Python assumes we want to know what we can do with the object that our variable `my_open_book` names. What a nice feature of the Python help command!

In [None]:
### EXECUTABLE, but NOT part of our script
my_open_book = open('CatInTheHat.txt')
help(my_open_book)

If you scroll down to where help talks about `readline`, you'll see that `readline` returns an empty string if EOF is hit immediately. EOF stands for *end of file*, meaning that there's nothing more to read (or in this case, nothing to read at all).
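You can also confirm this behavior directly. The sketch below assumes `io.StringIO` objects as in-memory stand-ins for three small files; the names and contents are made up for illustration.

```python
import io

empty_book = io.StringIO('')              # a "file" with no lines
blank_line_book = io.StringIO('\n')       # a "file" holding one blank line
one_line_book = io.StringIO('only line\n')

print(repr(empty_book.readline()))        # '': EOF hit immediately
print(repr(blank_line_book.readline()))   # '\n': a blank line is not EOF
print(repr(one_line_book.readline()))     # 'only line\n'
print(repr(one_line_book.readline()))     # '': nothing left to read
```

Notice that a blank line still comes back as `'\n'`, so an empty string can only mean that we ran out of file.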

Is it ok to print an empty string? Let's try it.

In [None]:
### EXECUTABLE, but NOT part of our script
print('')

Yup! No complaints from the interpreter; it happily printed nothing.

Next, we need to answer the question whether this was the last line in the file. Well, not really. We've stepped beyond the last line of the file. Yet, if we change the condition we test to ask, "Did `readline` hit EOF?", things should work just the same.

In [None]:
my_open_book = open('CatInTheHat.txt')
# Here.
the_line = my_open_book.readline()
print(the_line, end='')
# Did readline hit EOF? If yes, quit. If no, go here.

**Step 26: Be open to change.** Take note of this simple change, and think about what we just did there. It is an example of making small adjustments to the logic of our originally conceived algorithm so that what we are designing fits better with the set of things that our programming language makes easy for us to do.

Be open to this. Playwrights, artists, engineers, and other designers do this all the time. They listen to the "backtalk" from their materials or, in the case of playwrights, from their actors. There are typically many ways to accomplish something, and it is often a Good Thing to listen to this backtalk and adjust your design accordingly.

**Step 27: No spaghetti code.** Let's now convert this pseudocode into actual Python code. This is a bit tricky because computer scientists are quite picky about the kind of control flow you should write in a high-level language (HLL) like Python.

By HLL, I mean a programming language that is closer to something we humans can read and understand than the types of statements that the hardware processor in our computers can read and understand. We will talk later in this course about how what we write is converted into something that the hardware on our machines understands.

Our commented pseudocode contains something that looks like a `goto` statement. The interpreter is sitting at one point in our script's list of statements, and a `goto` statement says jump to somewhere else in the script. This somewhere is often identified with what computer scientists call a *label* (e.g., our comment stating `Here`), but it could be something as simple as the line number of the statement where the interpreter should start interpreting next. In particular, I could change our comment to read "Did readline hit EOF? If yes, quit. If no, goto line 2."

In [None]:
my_open_book = open('CatInTheHat.txt')
the_line = my_open_book.readline()
print(the_line, end='')
# Did readline hit EOF? If yes, quit. If no, goto line 2.

This tiny `goto` doesn't look like it is a problem, and it is what we will effectively do in a moment. The problem with this `goto` is that it gives you the power to go anywhere in your script.

As John Dalberg-Acton wrote in an [1887 letter](https://history.hanover.edu/courses/excerpts/165acton.html) to Bishop Mandell Creighton, "Power tends to corrupt, and absolute power corrupts absolutely." Used in an unconstrained way, this tiny `goto` leads, it has been argued (vehemently), to unmaintainable spaghetti code. You can read about [this debate](https://en.wikipedia.org/wiki/Goto) from last century on Wikipedia.

**Step 28: Testing a condition.** From `help`, we know that `readline` on line 2 of our script returns an empty string on EOF, and we can test for that condition by writing the expression `the_line == ''`. This says, "Is the value of the object called `the_line` equal to an empty string?"

Notice the special way we wrote "is equal to?", which distinguishes this operation from the assignment operator we used in lines 1 and 2. The double equal (`==`) operator compares the objects to its left and right, and it returns `True` if they are equal and `False` if they are not.
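A tiny sketch can make the difference between the two operators concrete. The string values here are stand-ins for what `readline` might return, and the names `at_eof` and `still_reading` are mine.

```python
the_line = ''                   # single =: the name the_line now refers to ''
at_eof = (the_line == '')       # double ==: compare, then keep the answer
print(at_eof)                   # True

the_line = 'The sun did not shine.\n'
still_reading = (the_line == '')
print(still_reading)            # False
```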

`True` and `False` are builtin literals of type `bool`, and combined with the `if` statement, you can use them to dynamically control the execution of one or two groups of other statements. For example,

In [None]:
my_open_book = open('CatInTheHat.txt')
the_line = my_open_book.readline()
print(the_line, end='')
# Did readline hit EOF? If yes, quit. If no, goto line 2.
if the_line == '':
    print("The End.")

You can put any expression that evaluates to a `bool` between the `if` and the colon (`:`). Only if this expression is `True` will the indented statements after the `if` be executed.
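When you want to choose between two groups of statements, Python lets you pair the `if` with an `else` clause. Our script doesn't use `else`, so treat the following as a side sketch; the name `message` and the second string are made up for illustration.

```python
the_line = ''   # pretend readline just returned this

if the_line == '':
    message = 'The End.'         # runs only when the condition is True
else:
    message = 'Keep reading.'    # runs only when the condition is False

print(message)   # The End.
```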

**Step 29: Indentation.** Indentation, or the whitespace at the start of each line in your script, holds meaning in Python. This shouldn't be a completely strange concept to you for we use whitespace in our writing to, for example, indicate the start of a new paragraph.

Other programming languages use explicit symbols to group statements together. The difference is more taste than anything, but Python believes that indentation without any extra special symbols makes it easier to read and understand a script.

Take the last two paragraphs. I could have written them as follows if I had chosen to indicate new paragraphs with a paragraph mark (¶) rather than a blank line.

Indentation, or the whitespace at the start of each line in your script, holds meaning in Python. This shouldn't be a completely strange concept to you for we use whitespace in our writing to, for example, indicate the start of a new paragraph. ¶ Other programming languages use explicit symbols to group statements together. The difference is more taste than anything, but Python believes that indentation without any extra special symbols makes it easier to read and understand a script.

Which do you find easier to read?

**Step 30: Spaces, not tabs.** We will talk more about indentation, but for now, please realize that it matters to the way that Python will interpret your script. If you want to avoid problems in your Python programming, use spaces, not tabs, when indenting your Python statements. Tabs are another special character, and these characters can be interpreted differently by different applications.

Furthermore, all the Python statements you want grouped together should be indented by the same amount. If they are not, you will have problems.
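Here is a small sketch of how indentation decides which statements belong to an `if`. The counter `statements_run` is a made-up name that simply records how many of the three assignments actually execute.

```python
the_line = 'The sun did not shine.\n'   # a non-empty line, so the test is False

statements_run = 0
if the_line == '':
    statements_run = statements_run + 1   # indented: grouped under the if
    statements_run = statements_run + 1   # same indentation: same group
statements_run = statements_run + 1       # back at the left margin: always runs

print(statements_run)   # 1
```

Only the final assignment runs, because the two indented ones are skipped together as a group.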

**Quick quiz.** What will our last listed script print to the terminal when run with our text file? (A) Nothing. (B) The first line of `CatInTheHat.txt`. (C) The first line of `CatInTheHat.txt` followed by `The End.` (D) Just `The End.` (E) All the lines of `CatInTheHat.txt` followed by `The End.`?

What would this script print to the terminal when run with an empty `CatInTheHat.txt` file? Choose from the list of the previous five possible answers.

**Step 31: Loops and breaks.** The answer to our first question is not what we want our script to do. The problem we have is that we didn't write code for the "if no" portion of our pseudocode comment. This portion says we should branch backward so that we can repeat the earlier statements. A backward jump like this is what we call a *loop* construct.

There are two compound statements in Python to code loops. We will use the first of them, a `while` loop, in our script.

In [None]:
my_open_book = open('CatInTheHat.txt')
while True:
    the_line = my_open_book.readline()
    print(the_line, end='')
    if the_line == '':
        print("The End.")

Like the `if` statement, the `while` statement tests a condition and only if that condition is `True` will it execute an(other) iteration of the indented statements. The indented statements include everything on lines 3 through 6. Line 6 is indented twice because it is part of the statements grouped under the `while` as well as dependent upon the execution of the `if` on line 5.

Because we test the literal `True` in this `while` loop, this is what is called an *infinite loop*. It runs forever unless you provide another way for the interpreter to jump out of this loop. And we haven't given any.

Shall we try executing our script anyway? Go ahead and run it.

It printed the entire contents of `CatInTheHat.txt`, but it went by too fast for you to see it. Our computers are fast.

If you're running this in Replit, you may be able to tell that `The End.` continues to print over and over again. To stop this, click the grey Stop button, which replaced the green Run button.

You'll probably execute an infinite loop at some point. Don't panic. Look for a Stop button like we were given in Replit. If you don't have anything that looks like a Stop button, Control+C will end the runaway process on most systems.

To fix this little problem, we can use the `break` statement. It is basically a `goto`, but it is constrained to take you to the exit of the nearest enclosing loop. Let's add the `break` statement and see if this eliminates our infinite loop problem.

In [None]:
my_open_book = open('CatInTheHat.txt')
while True:
    the_line = my_open_book.readline()
    print(the_line, end='')
    if the_line == '':
        print("The End.")
        break

Our infinite-loop problem is gone, and we've got a complete script for reading out one of my favorite children's books, *The Cat in the Hat* by Dr. Seuss. Problem solved! Congratulations!
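To see more precisely where `break` sends the interpreter, here is a variation that counts lines instead of printing them. It assumes an `io.StringIO` object as an in-memory stand-in for the book, and `lines_read` is a name I made up.

```python
import io

my_open_book = io.StringIO('first line\nsecond line\n')   # in-memory stand-in

lines_read = 0
while True:
    the_line = my_open_book.readline()
    if the_line == '':
        break                      # jump to the first statement after the loop
    lines_read = lines_read + 1    # skipped on the iteration that breaks

print(lines_read)   # 2
```

The statement after the `if` is skipped on the final iteration, and execution resumes with the `print` that follows the loop.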

**Step 32: Flexibility.** By the way, our script is flexible enough to handle any book (i.e., plain text file) we give it. Go ahead and download the plain text version of any book from [Project Gutenberg](https://www.gutenberg.org/) and give it a try. In particular, use their advanced search option and set "Filetype" to "Plain Text (txt)" and "Language" to English.
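Now that we have running code, we can also run the three test cases from Step 25 for real. The sketch below is one possible way to do it, not part of our script: it wraps our loop in a hypothetical helper function called `read_book`, feeds it in-memory `io.StringIO` stand-ins for files of zero, one, and several lines, and collects the output through the `file` parameter of `print`.

```python
import io

def read_book(open_book):
    """Return everything our loop would print for open_book (hypothetical helper)."""
    printed = io.StringIO()
    while True:
        the_line = open_book.readline()
        print(the_line, end='', file=printed)
        if the_line == '':
            print("The End.", file=printed)
            break
    return printed.getvalue()

no_lines = read_book(io.StringIO(''))
one_line = read_book(io.StringIO('only line\n'))
many_lines = read_book(io.StringIO('first\nsecond\nthird\n'))

print(repr(no_lines))     # 'The End.\n'
print(repr(one_line))     # 'only line\nThe End.\n'
print(repr(many_lines))   # 'first\nsecond\nthird\nThe End.\n'
```

All three cases end with `The End.` and none of them loops forever, which matches what we worked out by hand.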

[^fn1]: Remember, if you're running this in Google Colab or some other interactive Python notebook application, you will see the result of the last statement printed. You need to run the script outside of an interactive environment.