<img align="left" src="https://ithaka-labs.s3.amazonaws.com/static-files/images/tdm/tdmdocs/CC_BY.png"><br />

Adapted by Sarah Connell, Dipa Desai, Juniper Johnson, Liam MacLean, Sara Morrell, and Emre Tapan from two notebooks created by [Nathan Kelber](https://nkelber.github.io/) and Ted Lawless for [JSTOR Labs](https://labs.jstor.org/) under [Creative Commons CC BY License](https://creativecommons.org/licenses/by/4.0/). See [here](https://github.com/ithaka/constellate-notebooks) for the original versions. Some exercises were adapted from teaching notebooks created by Laura Nelson, University of British Columbia, and from [Python for Everybody](https://www.py4e.com/). Warm thanks to Kate Kryder, Data Analysis & Visualization Specialist at Northeastern University, for helping to develop these notebooks.<br />
___

## How to Approach this Course
This is an interactive, self-paced course. Through these six notebooks, you will be asked to predict, interact with, and change code. This is meant to get you familiar with both reading and writing code.

## Notebooks and Google Colab
This notebook is divided into two sections. First, we will cover the basics of interacting with Python notebooks in Google Colab and then we will introduce some first concepts of Python code.

### Cells

Similar to the way an essay is composed of paragraphs, notebooks are composed of cells. A cell is like a container for a particular kind of content. There are essentially two kinds of content in notebooks:

1. Text Cells—These can contain text, images, video, and the other kinds of explanatory content you might find on a regular website. The cell you're reading right now is a text cell. Text cells are written in what is called **markdown**, which is a lightweight method for formatting text.
2. Code Cells—These can contain code written in a variety of languages.

A **code cell** can be distinguished from a **text cell** by the fact that it contains a pair of brackets on its left. Code cells in Google Colab also have a grey background.

In [None]:
# This is a code cell

A text cell provides information, but a code cell can be executed to perform an action. The code cell above does not contain any executable content, only a text comment. We can tell the text in the code cell is a comment because it is prefixed by a ``#``. In Python code, if a line is prefaced by a ``#`` then that line is a comment and will not be executed if the code is run. In the Google Colab default view, comments are green.

When you are learning code, commenting is *essential*; you should add comments to explain to yourself what the code is doing, to mark any questions you have or places you got stuck, and to remind yourself where you left off in your work (an important rule of coding is that your future self will not remember anything). Commenting is also a responsible practice for any code that you might produce and share in the future. Your public-facing comments should explain how the code is expected to behave, point out the rationale behind your design decisions, and mark out places where a user might want to modify the code.

### Hello World: Your First Code

It is traditional in programming education to begin with a program that prints ``Hello World``. In Python, this is a simple task using the ``print()`` function. A function is a block of code that performs some action—we will cover functions in more detail below. This function simply displays whatever is inside the parentheses. The full syntax to print ``Hello World`` looks like this:

```print("Hello World")```

The code cell below has the ``print()`` function set up to get you started, so all you need to do is write the text you want to print (in this case, "Hello World") inside the quotation marks—make sure not to delete these! We'll cover why the quotation marks are needed soon.

To **execute** (or **run**) our code, we have a couple of options:

### Option One

Mouse over the code cell you wish to run and then push the "Play" triangle button to the left of the cell.

### Option Two

Click in the code cell you wish to run and press Ctrl + Enter (Windows) or Control + Return (macOS) on your keyboard.

In [None]:
#Fill in "Hello World!" inside of the quotation marks below and then run this block of code
print("")

After your code runs, you'll receive any **output** underneath the code that you ran. In this case, that's the words displayed beneath the code. (Often if you run code and it seems like "nothing happens", it's because your code didn't include a step to output whatever happened.)

After you click out of the cell, a number will appear in the pair of brackets to the left of the code cell to show the order the cell was run. For example, assuming the code cell above is the first one you ran in this notebook, you should see a 1 in the square brackets if you click onto another cell.

If your code is complicated or takes some time to execute, you will see an ellipsis (…) in the output line while the code executes. The "Play" button will also spin and show you a "Stop" option, which you can use to interrupt the code if it gets stuck. The first cell you run in a Colab notebook will also take a bit longer than usual.


Notice that each time you run a code cell, the number in the pair of brackets increases. This keeps track of the order in which cells were run. Technically, you can run the cells in any order, but it is usually a good idea to run them sequentially from top to bottom, to avoid errors.

### Editing in Google Colab
If you want to add a cell in Google Colab, you can do so with the "+ Code" and "+ Text" buttons at the top left, under the main menu bar. Click on any cell to see your options for editing it, including: cut, copy, move up, move down, and delete.

If you want to edit a text cell, double-click on it. Google Colab will automatically show you a preview of how the markdown will display when it is run. Google Colab has some buttons that will help you fill in the markdown. There are plenty of resources online if you do want to add any formatting that isn't provided by the buttons. For more on markdown, see [this guide](https://www.markdownguide.org/cheat-sheet/) or the resource from Google Colab linked below.

When you click out of a text cell in Google Colab, it will automatically go back to displaying the formatted version.



Test editing a text cell here by double-clicking and then filling in your name.

**My name is:**

Google Colab also has some features to help keep Notebooks organized. There is an outline button on the left that will let you see each of the sections in the Notebook. There is also a search option. Google Colab will automatically collapse and "hide" groups of multiple cells in a section, as well as very long code cells. To view these, just click on the notification that the cells have been hidden.

Google has very good documentation for working in Colab. Here are a few links you might find useful:
* [Overview of Colab features](https://colab.research.google.com/notebooks/basic_features_overview.ipynb)
* [Guide to Markdown](https://colab.research.google.com/notebooks/markdown_guide.ipynb)
* [Frequently Asked Questions](https://research.google.com/colaboratory/faq.html)

Code in Google Colab is executed in a **virtual machine** private to your account. Importantly, virtual machines can be deleted by Google if they are idle for a while. Your notebooks (like this one) are secure, but it is best to store any data they generate in separate files on your personal machine, rather than using Colab for long-term data storage.

While the code you execute on your Google Colab is private to your account, you should exercise caution before running Colab notebooks that are authored by other individuals. Like emails, social media, and other digital content, Colab notebooks can be used for phishing, data privacy violations, and other malicious attacks. Review the code of unfamiliar Colab notebooks before running the cells.

While we'll be using Google Colab for this tutorial, it isn't the only way to code in Python! We will discuss other environments for Python programming in Notebook 5.


### What to do if things seem broken
Don't worry, you can't break anything in this notebook! If you need to, you can always make a new copy from the template folder. Google Colab also has a "playground" mode that will prevent you from making any permanent changes. Go to "Open in playground mode" under the "File" menu to access this (note that "playground" mode is an option only for Colab notebooks you have saved to your own Drive).

If you have a more serious issue with the notebook and you just want to start over, you can go to the "Runtime" menu at the top and hit "Restart runtime" (or "Factory reset runtime" for more serious issues). From this menu, you can also "Interrupt execution" for any processes you want to stop (this is the same functionality as hitting the "Stop" button).

From the "Edit" menu, you can choose "Clear all outputs," if you want a clean copy of the notebook without any output from the code cells.

Okay, now that we've covered the basics of Notebooks in Google Colab, let's turn to some fundamentals of the Python programming language.

# Python fundamentals

Python is a computer programming language that is widely used in data science and the digital humanities. We'll cover a few Python basics here, giving you the tools to understand some core concepts and run several pre-constructed analyses. If you'd like to learn more, there are many excellent resources online for learning Python, such as [Python for Everybody](https://www.py4e.com/) and the tutorials published by the [Programming Historian](https://programminghistorian.org/en/lessons/?topic=python).

**Making Mistakes is Important**

Every programmer at every skill level gets errors in their code. Making mistakes is how we all learn to program. Programming is a little like solving a puzzle where the goal is to get the desired outcome through a series of attempts. You won't solve the puzzle if you're afraid to test if the pieces match. An error message will not break your computer. Google Colab can also help you spot mistakes in your code by underlining them in red. Remember, you can always restart the Runtime in a notebook if it stops working properly or make a new copy if you misplace an important piece of code. To learn any skill, you need to be willing to play and experiment. Programming is no different.

## Expressions and Operators

One very simple form of Python programming is an expression using an operator. For example, you might have a simple mathematical statement like:

> 1 + 3

The operator in this case is `+`, sometimes called "plus" or "addition". This particular **expression** is a combination of two **values** (1 and 3) and an **operator** (`+`). In Python, expressions are combinations of values, operators, functions, and variables (more on these last two soon!).

In the code block below, try writing an expression that uses the addition operator.

In [None]:
# Type an expression in this code block, adding the year you are graduating to your age.

# Then, run the code block.

You can also do subtraction, multiplication, and division, among other mathematical operations. To multiply in Python, you use an asterisk (\*) and to divide, you use a forward slash (/).

In [None]:
# Now try multiplication or division in this code block


When you run, or **evaluate**, an expression in Python, the order of operations is followed. (You may remember learning the shorthand "PEMDAS".) This means that expressions are evaluated in this order:

1. Parentheses
2. Exponents
3. Multiplication and Division (from left to right)
4. Addition and Subtraction (from left to right)
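
As a brief illustration of the order listed above, the sketch below evaluates a few expressions with and without parentheses. The numbers are arbitrary examples, and `print()` is used so that every result is displayed rather than only the last one.

```python
# Multiplication happens before addition, so this is 2 + 12.
print(2 + 3 * 4)      # displays 14

# Parentheses are evaluated first, so this is 5 * 4.
print((2 + 3) * 4)    # displays 20

# Exponents come before multiplication, so this is 2 * 9.
print(2 * 3 ** 2)     # displays 18

# Operators at the same level run from left to right: (20 / 5) * 2.
print(20 / 5 * 2)     # displays 8.0
```

If you are ever unsure how an expression will be evaluated, adding parentheses makes the intended order explicit.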

Python can evaluate parentheses and exponents, as well as a number of additional operators you may not have learned in grade school. Here are the main operators that you might use, presented in order of precedence. Note that `%`, `/`, and `*` share the same precedence and are evaluated from left to right, as are `-` and `+`:

|Operator| Operation| Example | Evaluation |
|---|----|---|---|
|\*\*| Exponent/Power| 3 ** 3 | 27 |
|%| Modulus/Remainder| 34 % 6 | 4 |
|/| Division | 30 / 6 | 5.0 |
|\*| Multiplication | 7 * 8 | 56 |
|-| Subtraction | 18 - 4| 14|
|+| Addition | 4 + 3 | 7 |
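
The examples in the table can be checked directly. A minimal sketch that prints each one; the only detail worth flagging is that `/` produces a number with a decimal point in Python 3, even when the division comes out even.

```python
print(3 ** 3)    # exponent: displays 27
print(34 % 6)    # remainder after dividing 34 by 6: displays 4
print(30 / 6)    # division of two whole numbers still gives a decimal: displays 5.0
print(7 * 8)     # displays 56
print(18 - 4)    # displays 14
print(4 + 3)     # displays 7
```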

In [None]:
# Try a few more operations in this code cell.

You are probably not going to replace the calculator on your phone with Python! But, this example shows you something about how Python works: here, you are creating an **expression** by combining **values** with an **operator** and running the code to produce **output**.

## Data Types (Integers, Floats, and Strings)

In the above examples, our expressions evaluated to a single numerical value. Numerical values come in two basic forms:

* integer
* float (or floating-point number)

An integer, what we sometimes call a "whole number," is a number without a decimal point that can be positive or negative. When a value uses a decimal, it is called a float or floating-point number. Two numbers that are mathematically equivalent could be in two different data types. For example, mathematically 5 is equal to 5.0, yet the former is an integer while the latter is a float.

Python can also manipulate text. A snippet of text in Python is called a **string**. A string can be written with single or double quotes, but they need to match each other and they need to be the "straight" version, not curly/smart quotes (“ and ”).


A string can use letters, spaces, and numbers. So ```5``` is an integer and ```5.0``` is a float, but ```'5'``` and ```'5.0'``` are strings.

|Familiar Name | Programming name | Examples |
|---|---|---|
|Whole number|integer| -3, 0, 2, 534|
|Decimal|float | 6.3, -19.23, 5.0, 0.01|
|Text|string| 'Hello world', '1700 butterflies', '', '1823'|

The distinction between each of these data types may seem unimportant, but Python treats each one differently. For example, we learned the "+" operator earlier, but take a look at how "+" does different things for integers and for strings:

In [None]:
# Run this code cell to see what the "+" does
10 + 10

In [None]:
# Run this code cell to see what the "+" does
10 + 10.5

In [None]:
# Run this code cell to see what the "+" does
"hello" + "goodbye"

In [None]:
# Run this code cell to see what the "+" does
'ten' + 'ten'

In [None]:
# Run this code cell to see what the "+" does
"10" + "10"

When Python detects that we have two numbers (whether integers or floats), it uses "+" to add them. But when it detects that we have two strings, it _concatenates_ them. Python relies on the quote marks to decide whether something is a string or not.
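
Concatenation joins strings exactly as they are written, so Python does not add a space between them. A small sketch of one way to include a space, using the same example strings as above:

```python
# The two strings are joined with nothing in between.
print("hello" + "goodbye")         # displays hellogoodbye

# A space is itself a string, so it can be concatenated like any other.
print("hello" + " " + "goodbye")   # displays hello goodbye
```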

When we use the addition operator, the values must be all numbers or all strings. Mixing a number with a string will produce an error.

In [None]:
# Try adding a string to an integer
'55' + 23

Here, we receive an error because Python doesn't know how to join a string to an integer. Putting this another way, Python is unsure if we want:

>'55' + 23

to become
>'5523'

or
>78
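
Either result is available once we tell Python which one we want. A minimal sketch, assuming Python's built-in `int()` and `str()` conversion functions, which this notebook does not otherwise cover:

```python
# Convert the string to an integer, then add the two as numbers.
print(int('55') + 23)    # displays 78

# Convert the integer to a string, then join the two as text.
print('55' + str(23))    # displays 5523
```

Note that `int()` only works on strings that look like whole numbers; something like `int('fifty-five')` would raise an error.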

Because these data types operate differently, it is very useful to be able to check which type you're working with. You can do this with the `type()` function. Try running the three code blocks below to check the types for 15, 15.0 and "15".

In [None]:
# Check the type for 15
type(15)

In [None]:
# Check the type for 15.0
type(15.0)

In [None]:
# Check the type for "15"
type("15")
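
The `type()` function also works on whole expressions, which is a handy way to see what kind of value an operation produces. A short sketch reusing expressions from earlier in this notebook:

```python
print(type(10 + 10))       # two integers give an integer: displays <class 'int'>
print(type(10 + 10.5))     # an integer plus a float gives a float: displays <class 'float'>
print(type(30 / 6))        # division gives a float: displays <class 'float'>
print(type("10" + "10"))   # two strings give a string: displays <class 'str'>
```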

## Variables
We noted above that expressions are combinations of values, operators, functions, and variables, and said that we'd be returning to variables. A variable is like a container that stores information. There are many kinds of information that can be stored in a variable, including the data types we have already discussed (integers, floats, and strings). We create (or **initialize**) a variable with an assignment statement. The assignment statement gives the variable an initial value.

Variables are stored in your "working memory" during a coding session, which means that they are not saved to your hard drive but that they will persist during your session and will be usable from any cell in your notebook once you have initialized them. When you start a new session or after you clear a notebook, you will need to re-initialize any variables you will be using (that is, you will need to re-run the code with the assignment statements for any variables that you need).


In [None]:
# Initialize an integer variable
# Note that this code doesn't produce any output; it just establishes the variable
new_integer_variable = 6

In [None]:
# Running this code will let you see the value of your variable
new_integer_variable

In [None]:
# Add 22 to our new variable
new_integer_variable + 22

The value of a variable can be overwritten with a new value. You can test this by changing the value in the first code block above, and then re-running everything.

We can also overwrite the value of a variable by performing an operation using its original value. In the two cells below, we establish a variable and then add 2 to that variable. As we did above, we then run a line of code that just has the variable name to see the value of our variable.

In [None]:
# Creating a variable "cats_in_house"
cats_in_house = 1
cats_in_house

In [None]:
# Adding 2 to our initial variable
cats_in_house = cats_in_house + 2
cats_in_house

Making changes like these is one way a Python program can "save" the result of an operation. For example, see what happens if we try to add something to a variable, but we don't include the step of overwriting the original value:

In [None]:
# Creating a variable "current year"
current_year = 2024
current_year

In [None]:
# Try getting to the future!
current_year + 1
current_year

In [None]:
# This code will actually "update" the year
current_year = current_year + 1
current_year
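
By default, a notebook code cell displays only the value of its last line, which is why a line like `current_year + 1` in the middle of a cell produces no visible result. The sketch below uses `print()` to show every step. The variable name `example_year` is made up for this illustration so that it does not overwrite `current_year`.

```python
example_year = 2024

print(example_year + 1)   # displays 2025, but the result is not stored anywhere
print(example_year)       # displays 2024: the variable is unchanged

example_year = example_year + 1   # the assignment stores the new value
print(example_year)               # displays 2025
```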

Whenever you create a new variable, you can always confirm what data type it is with the `type()` function. For example:

In [None]:
#Checking the type of the variable cats_in_house
type(cats_in_house)

It can be difficult to keep track of which variables you've initialized, but, fortunately, there is a trick you can use. Running ```%whos``` will give you the basic details for the variables that are active in your current session.

**Note**: `%whos` is an IPython "magic" command rather than part of the Python language itself. It works in notebook environments such as Google Colab, but it will not work in a plain ```.py``` file.

In [None]:
%whos

Variable       Type    Data/Info
--------------------------------
current_year   int     2026


### Variable naming guidelines
You can create a variable with almost any name, but there are a few guidelines that are recommended. First, variable names should be clear and descriptive.

For example, if we create a variable that stores the day of the month, it is helpful to give it a name that makes the value stored inside it obvious, something like `day_of_month`. From the computer's perspective, we could call the variable almost anything (`potato`, `bananafish`, `flat_tire`). As long as we are consistent, the code will execute the same. When it comes time to read, modify, and understand the code, however, it will be confusing to you and others. Consider this simple program that lets us compute the full-semester pay for an employee.

In [None]:
# Compute the semesterly wages for an employee
hours_per_week = 20
rate = 24
weeks_per_semester = 13

hours_per_week * rate * weeks_per_semester

We could write a program that is logically the same, but uses unhelpful variable names.

In [None]:
hotdogs = 20
sasquatch = 24
example = 13

hotdogs * sasquatch * example

This code gives us the same answer as the first example, but it is confusing. Not only does this code use variable names that make no sense, it also does not include any comments to explain what the code does. It is not clear that we would change `hotdogs` to set a different number of hours per week. It is not even clear what the purpose of the code **is**. As code gets longer and more complex, having clear variable names and explanatory comments is very important.

Keep in mind that abbreviations and shorthand are also likely to turn into nonsense when you forget the original context. Below is a third way to write the exact same program, which might be tempting in the moment, but could cause a lot of frustration to those who read the code in the future.

In [None]:
# money money money
hpw = 20
r = 24
wps = 13

hpw * r * wps

To recap: variable names should be clear, brief, and descriptive, so that you and everyone else who uses your code can easily remember them and recognize what they are meant to represent.

### Variable naming rules

In addition to the social best practices discussed above, variable names must follow three basic rules that Python enforces:

1. Must be one word (no spaces allowed)
2. Only letters, numbers and the underscore character (\_) are allowed
3. Cannot begin with a number

Additionally, there are some "reserved words" in Python that you are not allowed to use for the names of variables (or for any other identifiers that you choose). These words are "reserved" because they are already used in the actual Python code. You can see a list of these words [here](https://www.w3schools.com/python/python_ref_keywords.asp). You should also be careful never to use Python function names (like `print`) as your variable names.
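
You can also ask Python itself for its reserved words. A minimal sketch using the standard library's `keyword` module; the `import` statement is used here ahead of its introduction in this course, and the exact list printed depends on your Python version.

```python
import keyword

# keyword.kwlist is the list of reserved words for the running Python version.
print(keyword.kwlist)

# keyword.iskeyword() reports whether a single word is reserved.
print(keyword.iskeyword("class"))     # displays True
print(keyword.iskeyword("classes"))   # displays False
```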

Finally, it's important to note that Python is case sensitive: ```new_integer``` and ```New_Integer``` are two completely different variables.
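
A quick sketch of case sensitivity in practice, using the two names mentioned above with arbitrary example values:

```python
new_integer = 1
New_Integer = 2   # a separate variable, not an update to new_integer

print(new_integer)   # displays 1
print(New_Integer)   # displays 2
```

Because it is easy to mistype capital letters, a common convention in Python is to write variable names in lowercase with underscores between words.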

In [None]:
# Which of these variable names are acceptable?
# "Comment out" the variables that are not allowed in Python by putting a # before each line with an invalid variable name
# Then run this cell to check if the variable assignment works.
# If you get an error, the variable name is not allowed in Python.

$variable = 1
a variable = 2
a_variable = 3
4variable = 4
variable5 = 5
variable-6 = 6
variAble = 7
Avariable = 8


## Functions

Many different kinds of programs often need to do very similar operations. Instead of writing the same code over and over again, you can use a function. Essentially, a function is a small snippet of code that can be quickly referenced and reused, and that does some specific task.

There are three kinds of functions:
* Native functions built into Python
* Functions others have written that you can import
* Functions you write yourself

We have already used a couple of functions, `type()` and `print()`:

In [None]:
type(5.5)

In [None]:
print(5.5)

The above example just prints a float. We could also define a variable with our chosen float and then **pass** that variable to the `print()` function. It is common for functions to take an input, called an **argument**, that is placed inside the parentheses.

In [None]:
# Define a variable with a float and then print it
hours_worked = 5.5
print(hours_worked)

As we've already seen, you can use the variables that you create to perform calculations. For example, you might want to calculate the wages you earned after working 5.5 hours this week for $15 an hour. You can do this by creating variables for your hours worked and your wage per hour (such as `hours_worked` and `wage_per_hour`), then multiplying them as in the code cell below.


In [None]:
# Write an expression to calculate the week's wages and then print it
hours_worked = 5.5
wage_per_hour = 15
wages_this_week = wage_per_hour * hours_worked
print(wages_this_week)



However, it is often more useful to have flexible code that can take different inputs. For example, you might **define a function** that can multiply any two input values.

This is a preview of writing your own functions, which we will cover in the next workshop.

In [None]:
# Write a generalized function
def total_wages(wage_per_hour, hours_worked):
  return(wage_per_hour * hours_worked)

total_wages(15, 10)
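
Because the function returns its result, that result can be stored in a variable and reused, just like any other value. A small sketch that repeats the function definition so it runs on its own; the names `wages_week_one` and `wages_week_two` are made up for this illustration.

```python
def total_wages(wage_per_hour, hours_worked):
    return wage_per_hour * hours_worked

# Store the returned values in variables.
wages_week_one = total_wages(15, 10)
wages_week_two = total_wages(15, 5.5)

print(wages_week_one)                    # displays 150
print(wages_week_two)                    # displays 82.5

# The stored results can be used in later calculations.
print(wages_week_one + wages_week_two)   # displays 232.5
```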

In [None]:
# On your own, try running this function with a different set of values as the argument
# Replace the #s with the numbers you want to multiply
total_wages(##, ##)

In this lesson, we've covered several key concepts for working in Python: expressions, operators, data types, variables, and functions. In our next session, we'll start running more advanced forms of code so we can get a better sense of the logic behind Python. There are some practice exercises below to help you reinforce the concepts we've covered so far.

# Practice exercises

As you're learning code, it's important to try varying different things to see how your results change. The quick exercises will prompt you to test some variations, but you should also be experimenting on your own. Make a change, then think about how you anticipate it will impact your results, then see what happens.

Here are a few exercises to give you some more practice with these concepts. There is a solution key at the end of this notebook.

**Exercise One: Creating and Modifying Variables**

Use the code block below to **initialize** a variable called `my_favorite_number` whose value is your favorite number.

Now, **print** your new variable using the `print()` function in the code block below.

Now, **overwrite** `my_favorite_number` by using the variable in an expression. Specifically, assign a new value to `my_favorite_number` by adding the original variable to the year you were born.

What do you think the value of `my_favorite_number` is now? Fill in your answer here to practice editing a text cell. (Remember you can double-click on the cell to edit it.)

My guess on the value of `my_favorite_number`:

Finally, use the `print()` function in the code block below to print out the value for `my_favorite_number` and confirm your answer.

**Exercise Two: Performing calculations**

Below is code that will calculate and print out the number of days in a specified number of weeks. First, run the code as it is, then try changing the number in the first line to calculate the number of days in a different number of weeks.

In [None]:
weeks = 3
days_in_week = 7
days_total = weeks * days_in_week
print(days_total)

In the block below, modify this code to instead calculate the number of minutes in the specified number of weeks.

In [None]:
# Modify this code to calculate the number of minutes in the specified number of weeks
weeks = 3
days_in_week = 7
days_total = weeks * days_in_week
print(days_total)



**Exercise Three: Writing a Generalized Function**

Earlier in this notebook, we looked at a simple program for calculating wages per semester. Run this code below to remind yourself of how the program works:

In [None]:
# Calculate pay per semester
hours_per_week = 20
rate = 24
weeks_per_semester = 13

hours_per_week * rate * weeks_per_semester

Now, let's make a generalized function that can calculate pay per semester for any three input values. Fill in the code below to define this function, then try running it for 15 hours per week, $20 per hour, and 13 weeks per semester. You will know you have the function correct if the code executes without errors and returns a value of 3900.

In [None]:
# Fill in this code to create a generalized function that calculates pay per semester

# First replace the #s with the variables you want to multiply
def pay_per_semester(##, ##, ##):
  return(## * ## * ##)

# Then, fill in the three values you want to test the function on (15, 20, and 13)
pay_per_semester(##, ##, ##)

# Solutions
Here are some solutions for the exercises in this notebook. There are many different ways to approach coding, so you might have done something different. As long as the program runs correctly and you understand the concepts at stake, you're on the right track. You can make your code more efficient as you keep learning.

Exercise One:
The exact values will vary, but the overwritten value of `my_favorite_number` should be the original value plus the year you were born.

In [None]:
# Exercise Two
weeks = 3
days_in_week = 7
hours_in_day = 24
minutes_in_hour = 60
minutes_total = weeks * days_in_week * hours_in_day * minutes_in_hour
print(minutes_total)

In [None]:
# Exercise Three

def pay_per_semester(hours_per_week, rate, weeks_per_semester):
  return(hours_per_week * rate * weeks_per_semester)


pay_per_semester(15, 20, 13)
