# Introduction to Python and JupyterLab

## Skills

1. **Understand data types and basic Python operations.**
2. **Store data in variables.**
3. **Use basic functions.**
4. Manipulate strings using string methods.
5. Use lists and tuples to store multiple pieces of data.

## Vocabulary List

**argument.** An input to a **function.**

**int** and **float.** Data types representing numbers. An `int` is an integer, while a `float` represents a decimal number (including whole numbers written with a decimal point, like 3.0).

**function.** A piece of code which takes in some input and changes it into an output. Its inputs are called **arguments** and it **returns** the output. You may have seen functions like $\sin(\theta)$ in a trigonometry course, which takes an angle as an argument and returns a value between -1 and 1.

**list.** A Python data type which can contain multiple values, not just one.

**keyword argument.** An optional argument of a function which must be named in order to be used. It is always written as `keyword=value`. If you call a function `printOut(data, color="blue")`, the function is called `printOut`, a variable named `data` is one argument, and `"blue"` is the keyword argument for `color=`.

**module.** A collection of functions, data, and possibly many other things that give Python more functionality. The Python community has written tens of thousands of modules, which do everything from working with websites to generating art.

**object.** A specialized data type which can contain anything and can also have its own associated functions, called methods. In this course, we'll be working with objects that represent large amounts of data, and others which perform specialized functions for us, like reading and transforming that data.

**return value.** The output of a **function.**

**string.** A data type representing text. Strings are made of individual characters, enclosed within either single `'` or double `"` quotation marks.

**variable.** A container that stores a piece of information, which can be an int, a float, a string, or many other kinds of objects that we'll learn about. That information can be changed or called upon later by the user. As the programmer, you get to decide what to name your variable, which acts like a label on a box to find it later.


## The Beginning

Below is a cell with code in it. If you run it, you should see the output `8`.

To run code, type Shift+Enter, or press the play button at the top of the notebook window.


In [None]:
3 + 5

8

## Numbers and Strings

In this course, we'll be working with data which represent both text and numbers. We'll use two types of numbers in Python: integers (`int`s) and decimal numbers (`float`s). Python handles them *almost* the same way. I'll point out the times when the distinction matters.

We'll also be working with text *quite a lot*. That is the point of this course, after all. Python calls texts `string`s, because they are strings of individual characters.

In [None]:
# This is an int. Note the lack of a decimal point in the number.
type(3)

int

The next two values are both floats, even though the second is a whole number.

In [None]:
type(5.2)

float

In [None]:
3.0

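When writing a string in Python, put it inside quotation marks, either single `'` or double `"` quotes. There's no difference between the two kinds of quotes, as long as the string starts and ends with the same kind. Anything inside quotation marks is a string, even if it looks like a number.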

In [None]:
type("hello")

str

In [None]:
'3'

## Basic Numeric Operations
Using the number types (int and float), we can perform typical mathematical operations between numbers: addition (`+`), subtraction (`-`), multiplication (`*`), division (`/`), integer division (`//`), and remainder or "modulo" (`%`):

In [None]:
3 + 2

In [None]:
4 * 4

In [None]:
7 / 2

In [None]:
7 // 2

3

In [None]:
7 % 2

1

These operations can be grouped inside parentheses `()` to control the order in which they are carried out:

In [None]:
(3 + 5)/4


2.0
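Notice that `(3 + 5)/4` produced `2.0` rather than `2`. This is one of the times when the difference between an `int` and a `float` shows up. The cell below is a minimal sketch of that behavior; it uses the `print()` function, which is covered later in this notebook, only so that every line shows its result.

```python
# Regular division (/) gives a float, even when the answer is a whole number.
print(type(8 / 4))

# Integer division (//) of two ints gives an int.
print(type(8 // 4))

# Mixing an int and a float in one operation gives a float.
print(type(3 + 2.0))
```

The three lines report `float`, `int`, and `float`, in that order.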

You can also add strings together:

In [None]:
"hello" + "goodbye"

'hellogoodbye'
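Adding strings joins them exactly as written, so Python does not insert a space, and a string made of digits is still treated as text. A short sketch of both points (`print()` is introduced later in this notebook and is used here only to show each result):

```python
# Python joins strings exactly as given; add the space yourself if you want one.
print("hello" + " " + "goodbye")   # hello goodbye

# Digits inside quotation marks are text, so + joins them instead of adding.
print("3" + "5")                   # 35

# Without quotation marks they are numbers, so + adds them.
print(3 + 5)                       # 8
```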

Even though this course's goal is to understand texts, performing operations on numbers will be very important. Perhaps we want to know the average length of words in a document, or what the most common words are. Those questions will require us to do simple math or counting to get an answer.

## Variables

A variable is a way to store data for usage later. You can think of a variable as a box that you have put data into, that has a name so you know where to find it later. To assign data to a variable, you use the equals sign `=` :

`variable_name = data`

The variable's name goes on the left. You can name a variable almost anything, with two restrictions: the name can only consist of letters, numbers, and the underscore `_` character, and the first character can't be a number.

After the equals sign can go any expression: a simple number or string (e.g. `"hello"` or `15`), the output of a function (which we'll get into in a moment), or even another variable.

When you assign a variable, there's no Python output, so if you see any output, then *you have not stored anything in a variable*. Let's see a few examples:

In [None]:
name = "Donovan Dutcher"

In [None]:
name

'Donovan Dutcher'

In [None]:
this = 3
that = 5

In [None]:
this + that

8

In [None]:
that = this
this + that

6

Storing individual numbers and strings in a variable may not seem particularly useful, and on its own it isn't. Variables become much more powerful when we start storing much larger, more complex data in them, like spreadsheets or entire books.

## Comparators

Comparators compare two values and tell you whether the comparison is true or false. The most common ones are:
* `>` greater than
* `<` less than
* `==` equal to (if there's only one `=`, then you're assigning to a variable)
* `!=` not equal to
* `>=` and `<=` greater than or equal to and less than or equal to.

Here are two simple examples: 3 is less than 5, and 7 is not greater than 12:

In [None]:
3 < 5

In [None]:
7 > 12

Comparators will be useful when we start doing analysis on large datasets. If I want to find all of the movies before 1990, then I'll use something like `year < 1990` to find the entries I want. It will also be important when we want to sort things from high to low; Python's sorting systems require an idea of "which of these is larger?".
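As a minimal sketch of the movie example, the cell below compares a single made-up release year against 1990 using several comparators. The variable `year` and its value are assumptions for illustration, and `print()` (introduced later in this notebook) is used so that every comparison shows its result.

```python
year = 1984  # hypothetical release year of one movie

print(year < 1990)    # True: 1984 is before 1990
print(year >= 1990)   # False
print(year == 1984)   # True: two equals signs compare, one assigns
print(year != 1984)   # False
```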

Also note that Python allows comparators to work on strings as well, using alphabetical order. The comparison is case-sensitive: every uppercase letter sorts before every lowercase letter.

In [None]:
"Alex" < "Brianna"

True
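String comparison depends on capitalization, because Python places all uppercase letters before all lowercase letters. The sketch below shows the effect and one possible workaround, converting both strings to lowercase with the `lower()` string method before comparing. The lowercase name is a made-up variation on the example above.

```python
print("Alex" < "Brianna")    # True: "A" comes before "B"
print("alex" < "Brianna")    # False: lowercase "a" sorts after uppercase "B"

# Lowercasing both sides first gives ordinary alphabetical order.
print("alex".lower() < "Brianna".lower())   # True
```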

## Functions

Below is a **function**, called `max()`, which tells us the largest of two or more pieces of data:

In [None]:
A = 5
B = 12
C = 8

max(A, B, C)

12

Functions are pieces of code that take in some data, do some work, and then give a result back to the user. What they do can range from very simple, like the `max()` function, to very complex. In the next section of the course, we'll learn functions that transform and plot large quantities of data.

The inputs to a function are called its **arguments** (like `A`, `B`, and `C` above, which hold 5, 12, and 8), and the output of the function is called its **return value** (above, the return value was 12). The arguments are put in parentheses after the name of the function, and are separated by commas. Even if there are no arguments for a function, the parentheses are still necessary.
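A return value can be stored in a variable and used like any other data. The sketch below uses the built-in functions `max()`, `min()`, and `str()`; the variable names are made up for illustration. Calling `str()` with no arguments returns an empty string, which shows that the parentheses are needed even when there is nothing inside them.

```python
# Store each return value in a variable.
largest = max(5, 12, 8)
smallest = min(5, 12, 8)

# Return values can be used in further operations.
print(largest - smallest)   # 7

# A function call with no arguments still needs its parentheses.
empty_text = str()
print(empty_text == "")     # True
```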

Another simple function is the `len()` function, which tells you the length of a **string** or a **list**, a type we'll get to in a bit. Its input is the string you want the length of, and its return value is an **int**, that string's length:

In [None]:
course = "DIDA 210 - Digital Text Analysis"

len(course)

#### The `print()` function

You may have noticed that if you have two or more lines of code in a cell, only the result of the last one gets displayed. The `print()` function gives us a way to display the others as well. If a cell has multiple `print()` calls, you will see all of their outputs, one per line:

In [None]:
print( 200 + 10 )
print( course )
print( len(course) )
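`print()` also accepts several arguments at once, and it takes keyword arguments like the ones described in the vocabulary list. The sketch below reuses the `course` variable from above (redefined here so the cell runs on its own) and the `sep=` keyword argument, which sets the text placed between the printed values. By default that separator is a single space.

```python
course = "DIDA 210 - Digital Text Analysis"

# Several arguments are printed on one line, separated by spaces.
print("Course:", course)
print("Length:", len(course), "characters")   # Length: 32 characters

# The sep= keyword argument replaces the space between values.
print("DIDA", 210, sep="-")                   # DIDA-210
```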

#### More with Strings
Here are just a few miscellaneous properties that *strings* have:
* You can add a newline to a string with the text `\n`. This counts as a single character, and is called an *escape sequence*.
* `\'` and `\"` can be used to add quotation marks, as necessary.
* If you want to add a backslash to a string, use two `\\`.
* Python will happily work with non-English Unicode characters.
* If you want to add a multiline text in your code, enclose it in triple quotes `"""`.

In [None]:
print("line one with a forward slash (/)\nline two with a backslash (\\)")

line one with a forward slash (/)
line two with a backslash (\)


In [None]:
text = "私は日本に行きます"
print(text)

In [None]:
text = """The beginning of the text.
The end of the text."""

print(text)
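The point that an escape sequence counts as a single character can be checked with `len()`. This is a small sketch with made-up example strings:

```python
# "\n" is typed as two keystrokes but stored as one character.
two_lines = "one\ntwo"
print(len(two_lines))   # 7: three letters, one newline, three letters

# The same is true of an escaped backslash.
print(len("\\"))        # 1

# Escaped quotation marks produce the same string as switching quote styles.
quote = "She said \"hi\""
print(quote == 'She said "hi"')   # True
```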