# Programming Techniques

## Objectives

- To practice documenting code for clarity and reuse
- To write basic functions to reduce redundancy
- To interpret Python exceptions (error messages) in order to fix bugs

## Introduction

By now, you have written Python code to solve various problems. Some problems could be solved in one or two lines; others (like finding the most expensive textbook in our dataset) required several lines of code. 

Given all that you have learned so far, you could go on to solve many, many more problems using Python. In other words, we've covered the bare essentials; it's one of the main advantages of Python (as a programming language) that these essentials can be covered in just a couple of days.

But the Python language offers many other tools and techniques to make code more robust, reusable, and efficient. One way to think about these tools and techniques is that they help us write _programs_ as opposed to writing code, a **program** being some code that is designed for use by different users in different contexts. Even if you are the only user of your code, applying these programming techniques can make your work more productive and your code more effective.

This notebook is intended for you to work through independently, in order to review and clarify the concepts introduced on Python Camp Day 2, and to lay the groundwork for the activities on Python Camp Day 3. However, feel free to collaborate with others in working through it. It is also intended to serve as a resource you can return to and review as necessary.

#### How to Use this Notebook

1. Read the documentation above each cell containing code and run the cell (`Ctrl+Enter` or `Cmd+Return`) to view the output.


2. Follow the prompts labeled `Try it out!` that ask you to write your own code in the provided blank cells.


3. (Hidden) solutions to these exercises follow the blank cells; click the toggle bar to expand the solution to compare with your approach.


4. Some prompts include alternative exercises (Parsons Problems) that will be linked from the prompt. These alternatives may help clarify concepts (especially if you find yourself struggling to keep up with all the syntax).


5. Optional annotations (labeled `For the curious...`) provide additional explanation and/or context for those who want them. Feel free to skip these sections if you like. As a beginner, it's important to maintain a balanced cognitive load: taking in too much information all at once can impede your progress toward understanding. This balance looks different for everyone, but we have tried to keep the main content focused on a few key concepts, tools, and techniques, while providing that additional context for those who might benefit from it.

6. Follow the instructions at the end to complete and submit a short, autograded assignment to test your knowledge. (You may submit the assignment as many times as you like.)



## I. Writing & Reading Documentation

One of the most important techniques doesn't involve writing code at all! But while good documentation won't technically make your code work better, it can make _you_ work better. Good documentation makes clear -- to you or anyone else who might want to use your code -- how code is intended to be used. 

Conversely, not having documentation in your code is a recipe for frustration. Writing code is always a matter of choosing one path over many other possible paths, and your future self is not likely to remember why you chose a particular path in every case (nor even necessarily what you were trying to accomplish). 

In the exercise below, you'll practice documenting some code that has already been written. The code uses a logical pattern that we've seen before but in a novel way.

For this exercise, we're using the bookstore dataset, so the first step is to load it from disk.

In [2]:
import json
with open('../../../data/bookstore-data-summer-2023.json') as f:
    bkst_data = json.load(f)

### I.1 Commenting on Code

The code below loops through the bookstore data and counts the total number of textbooks where the type of the text is digital (as opposed to print). 

Above each line of code is a blank line beginning with the hash symbol (`#`). This is a Python [comment](https://gwu-libraries.github.io/python-camp/glossary.html#term-comment). The Python interpreter ignores comments when executing code, so they are present purely for the programmer's benefit.
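
As a small illustration of how the interpreter skips comments, here is a sketch that is unrelated to the bookstore data. The variable names `page_count` and `total_pages` are made up for this example.

```python
# This whole line is a comment, so Python skips it.
page_count = 120  # A comment can also follow code on the same line.

# A longer explanation can span several lines,
# as long as each line starts with the hash symbol.
total_pages = page_count * 2

print(total_pages)   # 240
```

Deleting every comment from this snippet would not change what it prints.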

#### Try it out!

For each comment, write (in your own words) an explanation of what the line of code below the comment is doing. Your comment text can be anything that makes sense to you. Just make sure that your text follows the hash symbol. (If you want to make a comment that spans multiple lines, just create an extra line below the first and begin that new line with the hash symbol.)


In [6]:
#
digital_count = 0
#
for course in bkst_data:
    #
    for text in course['texts']:
        #
        if text['item_type'] == 'digital':
            #
            digital_count += 1
#
print('Number of digital textbooks:', digital_count)

Expand the cell below to see one possible way of documenting this code.

In [None]:
# Initialize the counter 
digital_count = 0
# Loop over each course in the bkst_data list
for course in bkst_data:
    # Loop over each text in the course dictionary
    for text in course['texts']:
        # Check the value of the item_type key: is it a digital book?
        if text['item_type'] == 'digital':
            # If so, increment the counter
            digital_count += 1
# Print the total number of digital texts
print('Number of digital textbooks:', digital_count)

### I.2 Using the Python documentation

Lucky for us, documentation consists of much more than lines of comments on code. Both the Python **standard library** (the set of functions, data types, methods, and other tools that are available with the basic installation of Python) and a wide array of third-party Python libraries come with extensive documentation. 

Learning how to read and navigate this documentation is a skill in itself. 

The official Python documentation -- for the core language and the standard library -- resides at https://docs.python.org/3/. This site will often appear in Google results when searching for documentation on specific functions, methods, etc.

#### Try it out!


In the previous homework, we used the `str.split()` [method](https://gwu-libraries.github.io/python-camp/glossary.html#term-method) to separate a single [string](https://gwu-libraries.github.io/python-camp/glossary.html#term-string) on its [white space](https://gwu-libraries.github.io/python-camp/glossary.html#term-white-space) into a [list](https://gwu-libraries.github.io/python-camp/glossary.html#term-list) of substrings.
The code
```
'CHEM 1001 10'.split()
```
yields the output
```
['CHEM', '1001', '10']
```
(The method is referred to as `str.split()` in the docs because it can be used on any value of the string [type](https://gwu-libraries.github.io/python-camp/glossary.html#term-type).)

Here is the [documentation](https://docs.python.org/3/library/stdtypes.html#str.split) for `str.split()`. 

Reading that documentation, can you tell how to use the `str.split()` method to separate a string on _something other_ than white space? 

For example, our bookstore dataset indicates whether a text is for sale or rental, new or used, by the following strings:

 - `BUY_NEW`
 - `BUY_USED`
 - `RENTAL_NEW`
 - `RENTAL_USED`

Write some code that will split such a string on the underscore character (`_`), so that we can separate each of these strings into two data points. 

Expand the hidden cell below to see an explanation and a solution.



In [None]:
# Your code here


The crucial line of the Python documentation for methods is the first, called the **method signature**. For `str.split()` the method signature looks like this:
```
str.split(sep=None, maxsplit=-1)
```
- As mentioned above, the `str` here refers to any Python value of type string (`str`). In other words, you can call the `split` method on anything between single or double quotes, or on any variable that has been assigned to a value surrounded by single or double quotes.

- The part between [parentheses](https://gwu-libraries.github.io/python-camp/glossary.html#term-parentheses) defines the method's **arguments**. 

- Each argument is given a **default value**, meaning that, in this case, both arguments are optional.
  - The `sep` argument defaults to `None`. 
  - The `maxsplit` argument defaults to `-1`.

If the user of the method does not supply a given argument, the default value will be used. Reading the documentation below, we see that 

> If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

That's a little dense, but basically, it describes the behavior we've seen when using `str.split()` (with nothing between the parentheses): the string is split on the white space.
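
To see what "runs of consecutive whitespace are regarded as a single separator" means in practice, the sketch below compares the default behavior with an explicit single-space separator. The string `spaced` is an invented example with two spaces between the words.

```python
spaced = 'CHEM  1001'   # note the two spaces in the middle

# With no argument, the run of two spaces counts as one separator.
default_parts = spaced.split()
print(default_parts)   # ['CHEM', '1001']

# With an explicit single space, each space is its own separator,
# so an empty string appears between the two adjacent spaces.
space_parts = spaced.split(' ')
print(space_parts)     # ['CHEM', '', '1001']
```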

To split on something else, we need to provide a value for the `sep` argument. We can do that in one of two ways:

```
'RENTAL_NEW'.split(sep='_')
```
or 
```
'RENTAL_NEW'.split('_')
```
Either of those will yield the result we want: `['RENTAL', 'NEW']`. Note that the underscore character is _enclosed in quotation marks_ when passing it as an argument to `str.split()`. 
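
The sketch below puts both arguments from the method signature to work. The variable names (`sale_type`, `parts`, `course_code`) are illustrative only. The sale-type strings come from the list above, and the course string is the one used earlier.

```python
sale_type = 'RENTAL_USED'

# Passing the separator by keyword or by position gives the same result.
by_keyword = sale_type.split(sep='_')
by_position = sale_type.split('_')
print(by_keyword)                  # ['RENTAL', 'USED']
print(by_keyword == by_position)   # True

# split() returns a list, so each piece can be accessed by its index.
parts = sale_type.split('_')
print(parts[0])   # RENTAL
print(parts[1])   # USED

# maxsplit limits the number of splits; the default of -1 means "no limit".
course_code = 'CHEM 1001 10'
first_split_only = course_code.split(maxsplit=1)
print(first_split_only)   # ['CHEM', '1001 10']
```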



## II. Writing Functions

**Functions** and **methods** may feel pretty inscrutable. Unlike lists and dictionaries, we can't "look inside" them to see what they consist of, the way we examined `bkst_data` to determine its structure. (We _can_ read the source code for functions and methods, but unless you know what you're looking for, that's not always helpful.) That's one reason documentation is so important; reading the documentation is usually the best way to understand what a function or method does.

But in what follows, we'll demystify functions a bit by writing our own. 

Practically, a function is just a way of _encapsulating_ code. It's like a shortcut on your computer: in many apps, you can type `Ctrl+S` (or `Cmd+S`) in order to save the current document, page, etc. Those keystrokes are shorthand for the series of operations involved in doing a save. 

### II.1 Defining vs. calling functions

Below is a simple Python function that prints a message to the screen.

In [10]:
def print_message():
    print("Hello from my function!")

#### Notes

When you run the above cell, you shouldn't see any output. That's because **defining a function** and using the function are two separate operations. 

Let's unpack our function definition, piece by piece:
 - The `def` [keyword](https://gwu-libraries.github.io/python-camp/glossary.html#term-keyword) tells Python that we're defining a function.
 
 
 - We have to give our function a name, following the same rules as for Python variable names (only letters, numbers, and underscores; must begin with a letter or an underscore).
 
 
 - As with variables, our function names should be unique. We _don't_ want to give a function the same name as an existing Python function. For instance, calling this function `print` would overwrite the built-in Python `print()` function, which would mean that we could no longer use the latter.
 
 
 - Immediately after the function name, we need **parentheses**. Here the parentheses are empty because this function takes no **arguments**. We'll look at arguments later.
 
 
 - Then there's a **colon**, followed by an indented [block](https://gwu-libraries.github.io/python-camp/glossary.html#term-block) (as in the `for` and `if` statements you wrote today). 
 
 
 - The code in the indented block is the **body** of the function. The function body is what will be executed when we **call** the function.
 


In [11]:
print_message()

In the cell above, we **call**ed the `print_message()` function. Calling a function, like defining a function, requires the parentheses, even when there are no arguments. The presence of the parentheses lets Python know that it should execute the code within the body of the function.
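
To make the role of the parentheses concrete, the sketch below repeats the function definition (so that it runs on its own) and then uses the function's name both with and without parentheses.

```python
def print_message():
    print("Hello from my function!")

# With parentheses: Python runs the function body and prints the message.
print_message()

# Without parentheses: the name refers to the function itself, and the
# body does not run. Printing it shows a description of the function,
# something like <function print_message at 0x...>.
print(print_message)
```

Leaving off the parentheses does not raise an error, which can make this mistake hard to spot. If a function seems to do nothing, check that you actually called it.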

### II.2 Arguments and return values

We write functions and methods in order to be able to re-use code in different contexts. Basically, it saves us from having to repeat ourselves.

But our `print_message()` always prints the same message, which is not likely to be very useful. 

Most functions are used to transform data in some way. Such functions take some input and return some (different) output. Examples you've seen so far include the following:

| function name | input | output |
| :- | :- | :- |
|`str.split()` | a string, and optionally, a separator | a list of strings |
| `open()` | the name of a file | a file handle (for working with the file's contents) |
| `float()`| a string | a float |
| `print()`| one or more values of any Python type | those values, displayed on the screen |
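
Two rows of the table can be checked directly, as in the short sketch below. The names `parts` and `price` are illustrative. `open()` is left out because it needs a file on disk.

```python
# str.split(): a string goes in, a list of strings comes out.
parts = 'CHEM 1001 10'.split()
print(parts)         # ['CHEM', '1001', '10']

# float(): a string goes in, a float comes out.
price = float('9.99')
print(price)         # 9.99
print(type(price))   # <class 'float'>
```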

The **arguments** in the function definition are variables that hold the input. These variables are used within the body of the function (in the block of code that comes after the `def` line). 

In order to produce output, a function will usually **return** a value, using the `return` keyword. `return` will usually start the last line of the function. 

For instance, the function below adds sales tax to a given price.

In [19]:
def add_sales_tax(price):
    # Price should be a numeric value
    # Returns the price + 10% for sales tax
    price_with_tax = price * 1.1
    return price_with_tax

Now we can use the `add_sales_tax` function with any price (provided we give it a float or integer for the `price` argument).

In [21]:
prices = [9.99, 11.99, 55.95, 100.19]
for p in prices:
    print('Price is', p)
    print('Price with tax is', add_sales_tax(p))

#### Notes

- In the code above, we called `add_sales_tax` inside a `for` loop, passing it the [loop variable](https://gwu-libraries.github.io/python-camp/glossary.html#term-loop-variable) `p` in parentheses.


- During the execution of the function, the value of `p` is copied into the `price` variable (which is internal to the function). 


- We are calling `add_sales_tax()` _inside_ our call to the `print()` function.
```
print('Price with tax is', add_sales_tax(p))
```
  Note the nested parentheses. Python will execute `add_sales_tax(p)` first, get the returned value, and then pass that as the second argument to `print()`. 
  
  
- Note that we documented our function inside its definition, so that others would understand how to use it.
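
The sketch below shows why the `return` keyword matters. `add_sales_tax` is repeated from above so that the example runs on its own. `show_sales_tax` is a hypothetical variant, invented for this comparison, that prints its result instead of returning it.

```python
def add_sales_tax(price):
    # Returns the price + 10% for sales tax
    return price * 1.1

def show_sales_tax(price):
    # Hypothetical variant: prints the result but has no return statement
    print(price * 1.1)

# A returned value can be stored in a variable and reused later.
total = add_sales_tax(100)
print(round(total, 2))   # 110.0

# A function without a return statement hands back None,
# so there is nothing to reuse after the call.
nothing = show_sales_tax(100)
print(nothing)           # None
```

The call to `round()` is there only to tidy up the displayed value, since floating-point arithmetic can leave a tiny fraction at the end of the result.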



#### Try it out!


Write a function, `dollars_to_float()`, to convert a string starting with a dollar sign to a float. You've used this code before -- now encapsulate it in a function so that you can re-use it in the future without retyping it.

Don't forget to `return` something, otherwise your function will have no effect!

Once you've written your function, test it by calling it on the following strings: 

  - `'$10'`
  - `'$9.99'`
  - `'$200.00'`
  


In [None]:
# Your code here

In [22]:
def dollars_to_float(dollar_amt):
    # Converts a string prefixed with a dollar sign to a float
    amt = dollar_amt[1:]
    return float(amt)

If you wrote the function as shown above (in the collapsed cell), you should be able to get the intended result for all three examples.

But an amount containing a comma should produce a `ValueError`:

In [27]:
dollars_to_float('$10,000.00')

## III. Errors & Exceptions

Errors can be jarring. For many of us, error messages on the computer inspire frustration, even anger. They may even make us feel misunderstood, or else that we lack understanding, that we just don't "get" it. 

Learning to program (in virtually any language) involves learning how to deal with errors. There's no such thing as a piece of software without "bugs," if only because for any given piece of software, someone can find a way to use it for which it was not intended.

In the example above, we didn't design the function specifically to work for strings with commas, and indeed, it doesn't. The `ValueError` -- technically, this kind of error is called an **exception** -- means simply that the Python interpreter encountered a kind of value it didn't expect and doesn't know what to do with. 

Such errors/exceptions are valuable (no pun intended) to the programmer; they tell us where the bugs are (or might be) in our programs. Python even provides mechanisms for checking for errors proactively, so that our code doesn't grind to a halt whenever it encounters one.
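
As a preview of one such mechanism, here is a minimal sketch using Python's `try` and `except` keywords, which we have not yet covered in camp. It repeats the `dollars_to_float` function from the solution above so that it runs on its own. The list names (`amounts`, `converted`, `skipped`) are illustrative only.

```python
def dollars_to_float(dollar_amt):
    # Converts a string prefixed with a dollar sign to a float
    amt = dollar_amt[1:]
    return float(amt)

amounts = ['$9.99', '$10,000.00', '$200.00']
converted = []
skipped = []
for a in amounts:
    try:
        # Attempt the conversion
        converted.append(dollars_to_float(a))
    except ValueError:
        # float() could not interpret the text, so set this string aside
        skipped.append(a)

print(converted)   # [9.99, 200.0]
print(skipped)     # ['$10,000.00']
```

Instead of stopping at the second string, the loop records the problem value and carries on with the rest.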

On Days 3 and 4, we'll work more with errors and exceptions, and we'll learn about an approach to writing code that uses errors to guide development.

#### For the curious

What's the difference between an **error** and an **exception**? 

Technically, even though they often have the word `Error` in their name, most of the errors you'll encounter when programming in Python are called _exceptions_. These describe situations where a line of code may or may not work, depending on other factors. These other factors include the rest of the code in the program, the input to the program, the environment in which the program is being run, etc.   

The example above is an exception because the line `return float(amt)` works _except_ when `amt` is a string that `float()` cannot interpret as a number, such as one that contains a comma. 

Different from exceptions are **syntax errors**. A syntax error means that the line of code will never work because Python cannot parse it (i.e., interpret it correctly). 

For example, the following line of code will produce a syntax error because I failed to close the quotation marks around the string argument to `print()`:

```
print('Hello, cruel world!)
```

But in practice, this distinction doesn't matter very much, so feel free to refer to both exceptions and syntax errors as errors!
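
One practical consequence is that Python finds a syntax error when it reads the code, before any line runs. The sketch below illustrates this with the built-in `compile()` function, which parses code without executing it. Using `compile()` this way is just a device for the demonstration, not something you would normally need, and the names `broken_code` and `outcome` are made up for the example.

```python
# Two lines of code stored as text; the second has an unclosed string.
broken_code = "print('first line')\nprint('Hello, cruel world!)"

try:
    # compile() only parses the text; it does not run it
    compile(broken_code, '<example>', 'exec')
    outcome = 'parsed'
except SyntaxError:
    outcome = 'syntax error'

print(outcome)   # syntax error
```

The first line of `broken_code` is valid but never gets a chance to run, because the whole snippet fails to parse.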



## Wrap up / Final exercise

In this homework, you practiced documenting code with **comments**. You used the official Python documentation to learn more about the `str.split()` **method**. And you wrote your first Python **function**, `dollars_to_float`. You also encountered some Python error messages, which we'll learn more about tomorrow.

You'll use your `dollars_to_float` function, as well as the skills covered in this homework, in the team exercises tomorrow.

The final part of the homework for today is an [autograded exercise](./HW_2_GR.ipynb), designed to test your grasp of the concepts covered on Days 1 and 2. 