# Getting started with Python: Variables, Expressions and Statements



## Values and types

A value is one of the basic things a program works with, like a letter or a number. Examples of values are 1, 2, and "Hello, World!".

These values belong to different types: 2 is an integer, and "Hello, World!" is a string, so called because it contains a "string" of letters. You (and the interpreter) can identify strings because they are enclosed in quotation marks.

The __print__ function displays a value. It works for integers as well as strings.


In [2]:
print(4)

4


If you are not sure what type a value has, the interpreter can tell you.

In [3]:
type('Hello, world!')

str

In [4]:
type(17)

int

Not surprisingly, strings belong to the type __str__ and integers belong to the type __int__. Less obviously, numbers with a decimal point belong to a type called __float__, because these numbers are represented in a format called floating point.

In [5]:
type(3.2)

float

What about values like '17' and "3.2"? They look like numbers, but they are in quotation marks like strings.

In [6]:
type('17')

str

In [1]:
type("3.2")

str

They're strings.
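If a string holds the text of a number, it can be converted to a numeric type. The following is a minimal sketch using the built-in int and float functions; the variable names whole and decimal are chosen only for illustration.

```python
whole = int('17')        # the string '17' becomes the integer 17
decimal = float("3.2")   # the string "3.2" becomes the float 3.2

print(whole, type(whole))      # 17 <class 'int'>
print(decimal, type(decimal))  # 3.2 <class 'float'>
```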

When you type a large integer, you might be tempted to use commas between groups of three digits, as in 1,000,000. This is not a legal integer in Python, but it is a legal expression:

In [9]:
print(1,000,000)

1 0 0


Well, that's not what we expected at all! Python interprets 1,000,000 as a comma-separated sequence of integers, which it prints with spaces between.

This is the first example we have seen of a semantic error: the code runs without producing an error message, but it doesn't do the "right" thing.
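If the goal is only to make a large number easier to read, Python 3.6 and later accept underscores between digits. A small sketch follows; the variable name population is illustrative and does not come from the text above.

```python
# Underscores between digits are ignored by Python (3.6 and later),
# so this is the single integer one million.
population = 1_000_000

print(population)         # 1000000
print(type(population))   # <class 'int'>
```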

## Variables

One of the most powerful features of a programming language is the ability to manipulate variables. A variable is a name that refers to a value.

An assignment statement creates new variables and gives them values:

message = 'It is a message'
n = 17

This example makes two assignments. The first assigns a string to a new variable named message; the second assigns the integer 17 to n.

To display the value of a variable, you can use the print function:

In [3]:
print(n)

17


The type of a variable is the type of the value it refers to.

In [4]:
type(message)

str

In [5]:
type(n)

int

## Variable names and keywords

Programmers generally choose names for their variables that are meaningful and document what the variable is used for.

Variable names can be arbitrarily long. They can contain both letters and numbers, but they cannot start with a number. It is legal to use uppercase letters, but it is a good idea to begin variable names with a lowercase letter (you'll see why later).

The underscore character ( _ ) can appear in a name. It is often used in names with multiple words, such as my_name or airspeed_of_unladen_swallow. Variable names can start with an underscore character, but we generally avoid doing this unless we are writing library code for others to use.

If you give a variable an illegal name, you get a syntax error:

In [6]:
76trombones = 'big parade'

SyntaxError: invalid syntax (<ipython-input-6-ee59a172c534>, line 1)

In [7]:
more@ = 1000000

FileNotFoundError: [Errno 2] No such file or directory: '@ = 1000000'

In [8]:
class = 'Advanced Theoretical Zymurgy'

SyntaxError: invalid syntax (<ipython-input-8-73fc4ce1a15a>, line 1)

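76trombones is illegal because it begins with a number. more@ is illegal because it contains an illegal character, @. (In this IPython session the second error shows up as a FileNotFoundError, apparently because IPython treats more as one of its own commands; the standard Python interpreter reports SyntaxError: invalid syntax instead.) But what's wrong with class?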

It turns out that class is one of Python's keywords. The interpreter uses keywords to recognize the structure of the program, and they cannot be used as variable names.

Python reserves 35 keywords:

and       del       from      None      True
as        elif      global    nonlocal  try
assert    else      if        not       while
break     except    import    or        with
class     False     in        pass      yield
continue  finally   is        raise     async
def       for       lambda    return    await

You might want to keep this list handy. If the interpreter complains about one of your variable names and you don't know why, see if it is on this list.
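You can also ask Python itself whether a name is usable. The sketch below relies on the standard keyword module and the string method isidentifier; the helper is_usable_name is a hypothetical function written for this illustration, not part of Python.

```python
import keyword

def is_usable_name(name):
    """Return True if name is a legal identifier and not a keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

for candidate in ['message', '76trombones', 'more@', 'class', 'my_name']:
    print(candidate, is_usable_name(candidate))

# The keyword list for the Python version you are running;
# its length can differ between versions.
print(keyword.kwlist)
```

Only message and my_name pass the check. The other three fail for the reasons given above.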

## Statements

A statement is a unit of code that the Python interpreter can execute. We have seen two kinds of statements: expression statements, such as a call to print, and assignments.

When you type a statement in interactive mode, the interpreter executes it and displays the result, if there is one.

A script usually contains a sequence of statements. If there is more than one statement, the results appear one at a time as the statements execute.

For example, look at the output produced by this script:


In [9]:
print(1)
x=2
print(x)

1
2


The assignment statement produces no output.

## Operators and operands

Operators are special symbols that represent computations like addition and multiplication. The values the operator is applied to are called operands.

The operators +, -, *, /, and ** perform addition, subtraction, multiplication, division, and exponentiation, as in the following examples:


In [10]:
20+32

52

In [25]:
2**4

16

In [12]:
hour=1
minute=50
hour*60+minute

110

The / operator performs division, and its result is always a floating-point number:

In [13]:
minute/60

0.8333333333333334

Floor division (the // operator) divides two integers and rounds the result down to an integer:

In [14]:
minute//60

0

Compare the results of //, /, and % on the same operands:

In [15]:
5//2

2

In [16]:
5/2

2.5

In [17]:
5%2

1

The *modulus operator*, written as a percent sign (%), works on integers and yields the remainder when the first operand is divided by the second. In the example above, 5 divided by 2 is 2 with 1 left over, so 5 // 2 is 2 and 5 % 2 is 1.

The modulus operator turns out to be surprisingly useful. For example, you can check whether one number is divisible by another: if x % y is zero, then x is divisible by y.

You can also extract the right-most digit or digits from a number. For example, x % 10 yields the right-most digit of x (in base 10). Similarly, x % 100 yields the last two digits.
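The uses of % described above can be tried directly. This is a small sketch; the values 1234 and 2 are arbitrary choices. It also uses the built-in divmod function, which returns the floor-division quotient and the remainder together.

```python
x = 1234
y = 2

print(x % y == 0)   # True: 1234 is divisible by 2
print(x % 10)       # 4, the right-most digit
print(x % 100)      # 34, the last two digits

# divmod returns the // result and the % result as a pair.
quotient, remainder = divmod(5, 2)
print(quotient, remainder)   # 2 1

# Floor division rounds down, which matters for negative numbers.
print(-5 // 2)      # -3, not -2
```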

## Expressions

An expression is a combination of values, variables, and operators. 

If you type an expression in interactive mode, the interpreter evaluates it and displays the result:

In [22]:
x = 17

In [24]:
x + 7

24

But in a script, an expression all by itself doesn't do anything! __This is a common source of confusion for beginners.__
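As a sketch of the difference, suppose the following three lines are saved in a file and run as a script. The second line is evaluated, but its value is thrown away; only the call to print produces output.

```python
x = 17
x + 7          # evaluated, but in a script the result is discarded
print(x + 7)   # prints 24
```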

## Order of operations

When more than one operator appears in an expression, the order of evaluation depends on the rules of precedence. For mathematical operators, Python follows mathematical convention. The acronym PEMDAS is a useful way to remember the rules:

- __P__arentheses have the highest precedence and can be used to force an expression to evaluate in the order you want. Since expressions in parentheses are evaluated first, 2 \* (3-1) is 4, and (1+1)\*\*(5-2) is 8. You can also use parentheses to make an expression easier to read, as in (minute * 100) / 60, even if it doesn't change the result.

- __E__xponentiation has the next highest precedence, so 2\*\*1+1 is 3, not 4, and 3\*1\*\*3 is 3, not 27.

- __M__ultiplication and __D__ivision have the same precedence, which is higher than __A__ddition and __S__ubtraction, which also have the same precedence. So 2\*3-1 is 5, not 4, and 6+4/2 is 8.0, not 5.0.

- Operators with the same precedence are evaluated from left to right (except exponentiation, which is evaluated from right to left). So the expression 5-3-1 is 1, not 3, because the 5-3 happens first and then 1 is subtracted from 2.

When in doubt, always put parentheses in your expressions to make sure the computations are performed in the order you intend.
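The examples from the list above can be checked by running them. The last line is an extra illustration, not taken from the list, of how exponentiation groups from right to left.

```python
print(2 * (3 - 1))         # 4
print((1 + 1) ** (5 - 2))  # 8
print(2 ** 1 + 1)          # 3
print(3 * 1 ** 3)          # 3
print(2 * 3 - 1)           # 5
print(6 + 4 / 2)           # 8.0, because / produces a float
print(5 - 3 - 1)           # 1
print(2 ** 3 ** 2)         # 512, the same as 2 ** (3 ** 2)
```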

## String operations

The + operator works with strings, but it is not addition in the mathematical sense. Instead it performs concatenation, which means joining the strings by linking them end to end. For example:

In [26]:
first = 10
second = 15
print(first + second)

25


In [27]:
first = '10'
second = '15'
print(first + second)

1015


The \* operator also works with strings: it repeats the content of a string a given number of times. For example:

In [28]:
print(first * 3)

101010
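Mixing a string and an integer with + raises a TypeError, so one side has to be converted first. The sketch below shows both directions, using the built-in str and int functions; the names as_text, as_number, and separator are illustrative only.

```python
first = '10'
second = 15

as_text = first + str(second)     # concatenation gives '1015'
as_number = int(first) + second   # addition gives 25
separator = '-' * 10              # repetition gives '----------'

print(as_text)
print(as_number)
print(separator)
```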


## Asking the user for input

Sometimes we would like to take the value for a variable from the user via their keyboard. Python provides a built-in function called __input__ that gets input from the keyboard. When this function is called, the program stops and waits for the user to type something. When the user presses Return or Enter, the program resumes and input returns what the user typed as a string.

In [29]:
a = input()

something


In [30]:
print(a)

something


In [31]:
print(type(a))

<class 'str'>


Before getting input from the user, it is a good idea to print a prompt telling the user what to input. You can pass a string to input to be displayed to the user before pausing for input:

In [33]:
name = input('What is your name?\n')

What is your name?
Olga


In [34]:
print(name)

Olga


The sequence \n at the end of the prompt represents a newline, which is a special character that causes a line break. That's why the user's input appears below the prompt.

If you expect the user to type an integer, you can try to convert the return value to int using the int() function:

In [35]:
prompt = 'How old are you?\n'
year = input(prompt)
print(year, type(year))

How old are you?
23
23 <class 'str'>


In [39]:
y = int(year)
print(y, type(y))

23 <class 'int'>


In [40]:
year + 5

TypeError: must be str, not int

In [41]:
y + 5

28
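If the user types something that is not a whole number, int raises a ValueError. One way to handle that is sketched below. It uses try and except, which this chapter has not introduced, and parse_age is a hypothetical helper written for this example. It takes the text as a parameter, so you can pass it the result of input(prompt).

```python
def parse_age(text):
    """Return text converted to int, or None if it is not a whole number."""
    try:
        return int(text)
    except ValueError:
        return None

print(parse_age('23'))       # 23
print(parse_age('twenty'))   # None
```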

## Comments

As programs get bigger and more complicated, they get more difficult to read. Formal languages are dense, and it is often difficult to look at a piece of code and figure out what it is doing, or why.

For this reason, it is a good idea to add notes to your programs to explain in natural language what the program is doing. These notes are called comments, and in Python they start with the # symbol:


In [None]:
# compute the percentage of the hour that has elapsed
percentage = (minute * 100) / 60

In this case, the comment appears on a line by itself. You can also put comments at the end of a line:

In [None]:
percentage = (minute * 100) / 60     # percentage of an hour

Everything from the # to the end of the line is ignored; it has no effect on the program.

Comments are most useful when they document non-obvious features of the code. It is reasonable to assume that the reader can figure out what the code does; it is much more useful to explain why.

This comment is redundant with the code and useless:

In [None]:
v = 5     # assign 5 to v

This comment contains useful information that is not in the code:

In [None]:
v = 5     # velocity in meters/second

Good variable names can reduce the need for comments, but long names can make complex expressions hard to read, so there is a trade-off.

## Choosing mnemonic variable names

As long as you follow the simple rules of variable naming, and avoid reserved words, you have a lot of choice when you name your variables. In the beginning, this choice can be confusing both when you read a program and when you write your own programs. For example, the following three programs are identical in terms of what they accomplish, but very different when you read them and try to understand them.

In [None]:
a = 35.0
b = 12.50
c = a * b
print(c)

In [None]:
hours = 35.0
rate = 12.50
pay = hours * rate
print(pay)

In [None]:
x1q3z9ahd = 35.0
x1q3z9afd = 12.50
x1q3p9afd = x1q3z9ahd * x1q3z9afd
print(x1q3p9afd)

The Python interpreter sees all three of these programs as exactly the same but humans see and understand these programs quite differently. Humans will most quickly understand the intent of the second program because the programmer has chosen variable names that reflect their intent regarding what data will be stored in each variable.

We call these wisely chosen variable names "mnemonic variable names". The word *mnemonic* means "memory aid". We choose mnemonic variable names to help us remember why we created the variable in the first place.

While this all sounds great, and it is a very good idea to use mnemonic variable names, mnemonic variable names can get in the way of a beginning programmer's ability to parse and understand code. This is because beginning programmers have not yet memorized the reserved words (there are only 35 of them), and sometimes variables with names that are too descriptive start to look like part of the language and not just well-chosen variable names.

Take a quick look at the following Python sample code which loops through some data. We will cover loops soon, but for now try to just puzzle through what this means:

In [None]:
for word in words:
    print(word)

What is happening here? Which of the tokens (for, word, in, etc.) are reserved words and which are just variable names? Does Python understand at a fundamental level the notion of words? Beginning programmers have trouble separating what parts of the code must be the same as this example and what parts of the code are simply choices made by the programmer.

The following code is equivalent to the above code:

In [None]:
for slices in pizza:
    print(slices)

It is easier for the beginning programmer to look at this code and know which parts are reserved words defined by Python and which parts are simply variable names chosen by the programmer. It is pretty clear that Python has no fundamental understanding of pizza and slices and the fact that a pizza consists of a set of one or more slices.

But if our program is truly about reading data and looking for words in the data, pizza and slices are very un-mnemonic variable names. Choosing them as variable names distracts from the meaning of the program.

After a pretty short period of time, you will know the most common reserved words and you will start to see the reserved words jumping out at you. Many text editors are aware of Python syntax and will color reserved words differently to give you clues to keep your variables and reserved words separate. After a while you will begin to read Python and quickly determine what is a variable and what is a reserved word.

## Debugging

At this point, the syntax error you are most likely to make is an illegal variable name, like class and yield, which are keywords, or odd~job and US$, which contain illegal characters.

If you put a space in a variable name, Python thinks it is two operands without an operator:

In [42]:
bad name = 5

SyntaxError: invalid syntax (<ipython-input-42-67877bfd5d2a>, line 1)

In [43]:
month=09

SyntaxError: invalid token (<ipython-input-43-f8ebf3f12c93>, line 1)

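For syntax errors, the error messages don't help much. The most common messages are __SyntaxError: invalid syntax__ and __SyntaxError: invalid token__, neither of which is very informative. In the second example above, the problem is the leading zero: Python 3 does not allow a nonzero decimal integer to be written with a leading zero, as in 09.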

The runtime error you are most likely to make is a "use before def"; that is, trying to use a variable before you have assigned a value to it. This can happen if you spell a variable name wrong:

In [44]:
principal = 327.68
interest = principle * rate

NameError: name 'principle' is not defined

Variable names are case sensitive, so LaTeX is not the same as latex.

At this point, the most likely cause of a semantic error is the order of operations. For example, to evaluate $\frac{1}{2\pi}$, you might be tempted to write:

In [None]:
1.0 / 2.0 * pi

But the division happens first, so you would get $\pi$/2, which is not the same thing! There is no way for Python to know what you meant to write, so in this case you don't get an error message; you just get the wrong answer.
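A minimal sketch of the two versions side by side, assuming pi is taken from the standard math module (the text above does not say where pi comes from). The names wrong and right are illustrative only.

```python
import math

wrong = 1.0 / 2.0 * math.pi     # (1/2) * pi, about 1.5708
right = 1.0 / (2.0 * math.pi)   # 1 / (2 * pi), about 0.1592

print(wrong)
print(right)
```

The parentheses are the only difference between the two lines, and they change the result by roughly a factor of ten.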

## Glossary

__assignment__

A statement that assigns a value to a variable. 

__concatenate__

To join two operands end to end. 

__comment__

Information in a program that is meant for other programmers (or anyone reading the source code) and has no effect on the execution of the program. 

__evaluate__

To simplify an expression by performing the operations in order to yield a single value. 

__expression__

A combination of variables, operators, and values that represents a single result value. 

__floating point__

A type that represents numbers with fractional parts. 

__integer__

A type that represents whole numbers. 

__keyword__

A reserved word that is used by the interpreter to parse a program; you cannot use keywords like if, def, and while as variable names. 

__mnemonic__

A memory aid. We often give variables mnemonic names to help us remember what is stored in the variable. 

__modulus operator__

An operator, denoted with a percent sign (%), that works on integers and yields the remainder when one number is divided by another. 

__operand__

One of the values on which an operator operates. 

__operator__

A special symbol that represents a simple computation like addition, multiplication, or string concatenation. 

__rules of precedence__

The set of rules governing the order in which expressions involving multiple operators and operands are evaluated. 

__statement__

A section of code that represents a command or action. So far, the statements we have seen are assignments and expression statements such as calls to print. 

__string__

A type that represents sequences of characters. 

__type__

A category of values. The types we have seen so far are integers (type int), floating-point numbers (type float), and strings (type str). 

__value__

One of the basic units of data, like a number or string, that a program manipulates. 

__variable__

A name that refers to a value. 