# Variables and values

Elements of Data Science

by [Allen Downey](https://allendowney.com)

[MIT License](https://opensource.org/licenses/MIT)

## Numbers

This notebook introduces the most fundamental tools for working with data: representing numbers and other values, and performing arithmetic operations.

Python provides tools for working with numbers, words, dates, times, and locations (latitude and longitude).

Let's start with numbers.  Python can handle several types of numbers, but the two most common are:

* `int`, which represents integral values like `3`, and
* `float`, which represents numbers that have a fraction part, like `3.14159`.

Most often, we use `int` to represent counts and `float` to represent measurements.
Here's an example of an `int` and a `float`:

In [1]:
3

3

In [2]:
3.14159

3.14159

`float` is short for "floating-point", which is the name for the way these numbers are stored.
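To check which type Python has assigned to a value, one option is the built-in `type` function. The short sketch below is not part of the original notebook and assumes only built-in Python. It also shows that writing a decimal point is enough to make a number a `float`, even when the fraction part is zero.

```python
# type() reports how Python represents a value
print(type(3))        # <class 'int'>
print(type(3.14159))  # <class 'float'>

# A decimal point makes the value a float, even with no fraction part
print(type(3.0))      # <class 'float'>
```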

## Arithmetic

The operators that perform addition and subtraction are `+` and `-`:

In [3]:
2 + 1

3

In [4]:
2 - 1

1

The operators that perform multiplication and division are `*` and `/`:

In [5]:
2 * 3

6

In [6]:
2 / 3

0.6666666666666666

And the operator for exponentiation is `**`:

In [7]:
2**3

8
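Two details about these operators are easy to miss, so here is a small sketch you can try (an illustration added here, using only built-in arithmetic). Division with `/` produces a `float` even when both numbers are `int` values and the division comes out even, and `**` accepts exponents that are fractions or negative numbers.

```python
# Division with / returns a float even when the result is a whole number
print(6 / 3)     # 2.0

# Exponents do not have to be positive integers
print(2 ** 0.5)  # square root of 2, about 1.414
print(2 ** -1)   # 0.5
```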

Unlike math notation, Python does not allow "implicit multiplication".  For example, in math notation, if you write $3 (2 + 1)$, that's understood to be the same as $3 \times (2 + 1)$.

Python does not allow that notation:

In [8]:
3 (2 + 1)

TypeError: 'int' object is not callable

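In this example, the error message is not very helpful, which is why I am warning you now.  Python reads `3 (2 + 1)` as an attempt to call `3` as if it were a function, and an `int` is not something you can call.  If you want to multiply, you have to use the `*` operator: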

In [None]:
3 * (2 + 1)

The arithmetic operators follow the rules of precedence you might have learned as "PEMDAS":

* Parentheses before
* Exponentiation before
* Multiplication and division before
* Addition and subtraction

So in this expression:

In [None]:
1 + 2 * 3

The multiplication happens first.  If that's not what you want, you can use parentheses to make the order of operations explicit:

In [None]:
(1 + 2) * 3
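The same rules apply when exponentiation is involved. The following sketch (a few extra expressions chosen for illustration) shows that `**` happens before multiplication, and also before a leading minus sign, which sometimes surprises people.

```python
# Exponentiation happens before multiplication
print(2 * 3**2)    # 18, same as 2 * 9
print((2 * 3)**2)  # 36, parentheses force the multiplication first

# Exponentiation also happens before a leading minus sign
print(-2**2)       # -4, same as -(2**2)
print((-2)**2)     # 4
```

When in doubt, adding parentheses costs nothing and makes the intended order clear to anyone reading the expression.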

**Exercise:** Write a Python expression that raises `1+2` to the power `3*4`.  The answer should be `531441`.

In [None]:
# Solution

(1+2) ** (3*4)

## Math functions

Python provides the usual mathematical functions, like `sin` and `cos`, `exp` and `log`.

Actually, it provides two versions of these functions, in two different libraries, called `math` and `numpy`.  The version we'll use is `numpy`, which stands for "numerical python", and is pronounced "num' pie".

Before we can use `numpy`, we have to "import" it like this:

In [None]:
import numpy as np

It is conventional to import `numpy` **as** `np`, which means we can refer to it by the short name `np` rather than the longer name `numpy`.

As an example, we can use `np` to read the value `pi`, which is an approximation of the mathematical constant $\pi$.

In [None]:
np.pi

The result is a `float` with 16 digits.  As you probably know, we can't represent $\pi$ with a finite number of digits, so this result is only approximate. 
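One consequence of this approximation shows up with trigonometric functions. The sketch below uses the standard-library `math` module (the other library mentioned above) so that it runs without installing anything; `math.pi` holds the same value as `np.pi`. In exact math, $\sin(\pi) = 0$, but with the approximate value of $\pi$ the result is a tiny nonzero number.

```python
import math

print(math.pi)            # 3.141592653589793

# Not exactly zero, because math.pi is only an approximation of pi
print(math.sin(math.pi))  # a tiny number, around 1.2e-16
```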

`numpy` provides `log`, which computes the natural logarithm, and `exp`, which raises the constant `e` to a power.

In [None]:
np.exp(1)

In [None]:
np.log(100)

**Exercise:** Use these functions to confirm the mathematical identity $\log(e^x) = x$, which should be true for any value of $x$.

With floating-point values, this identity should work for values of $x$ between -700 and 700.  What happens when you try it with larger and smaller values?

In [None]:
# Solution

np.log(np.exp(-100))

As this example shows, floating-point numbers are finite approximations, which means they don't always behave like the real numbers they represent.
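The limit near 700 is not arbitrary. As a rough sketch (using the standard-library `sys` and `math` modules, which this notebook does not otherwise rely on), we can look up the largest value a `float` can hold and take its logarithm. That tells us roughly how big $x$ can get before $e^x$ no longer fits.

```python
import math
import sys

# Largest value a float can represent
print(sys.float_info.max)            # about 1.8e308

# Its natural log: exp(x) stops fitting in a float a little above this
print(math.log(sys.float_info.max))  # about 709.78

# In the other direction, exp of a very negative number rounds to zero
print(math.exp(-1000))               # 0.0
```

With `np.exp` and `np.log`, values outside this range typically show up as `inf` or `-inf`, along with a warning, rather than as a number.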

As another example, see what happens when you add up `0.1` three times:

In [None]:
0.1 + 0.1 + 0.1

The result is close to `0.3`, but not exact.  We will see other examples of floating-point approximation later, and learn some ways to deal with it.
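As a preview, here is one common way to cope with this, though not necessarily the approach used later in this material. Instead of asking whether two floats are exactly equal, we can ask whether they are close enough, using `math.isclose` from the standard library.

```python
import math

# Exact comparison fails because of the tiny rounding error
print(0.1 + 0.1 + 0.1 == 0.3)              # False

# Comparing within a tolerance succeeds
print(math.isclose(0.1 + 0.1 + 0.1, 0.3))  # True
```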

## Variables

A variable is a name that refers to a value.

The following statement assigns the `int` value 5 to a variable named `x`:

In [None]:
x = 5

The variable we just created has the name `x` and the value 5.

If a variable name appears at the end of a cell, Jupyter displays its value.

In [None]:
x

If we use `x` as part of an arithmetic operation, it represents the value 5:

In [None]:
x + 1

In [None]:
x**2

We can also use `x` with `numpy` functions:

In [None]:
np.exp(x)

Notice that the result from `exp` is a floating-point number, even though the value of `x` is an integer.
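Something similar happens in ordinary arithmetic. In the sketch below (an added illustration that uses the built-in `type` function to report how a value is represented), operations on two `int` values stay `int`, but mixing in a `float` produces a `float`.

```python
x = 5

print(type(x + 1))    # <class 'int'>: int plus int stays an int
print(type(x ** 2))   # <class 'int'>: so does an int raised to an int power
print(type(x + 1.0))  # <class 'float'>: mixing in a float gives a float
```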

## Calculation with variables

Now let's use variables to solve a problem involving mathematical calculation.

Suppose we have the following formula for computing compound interest [from Wikipedia](https://en.wikipedia.org/wiki/Compound_interest#Periodic_compounding):

"The total accumulated value, including the principal sum $P$ plus compounded interest $I$, is given by the formula:

$V=P\left(1+{\frac {r}{n}}\right)^{nt}$

where:

* $P$ is the original principal sum
* $V$ is the total accumulated value
* $r$ is the nominal annual interest rate
* $n$ is the compounding frequency
* $t$ is the overall length of time the interest is applied (expressed using the same time units as $r$, usually years).

"Suppose a principal amount of \$1,500 is deposited in a bank paying an annual interest rate of 4.3\%, compounded quarterly.
Then the balance after 6 years is found by using the formula above, with

In [None]:
P = 1500
r = 0.043
n = 4
t = 6

We can compute the total accumulated value by translating the mathematical formula into Python syntax:

In [None]:
P * (1 + r/n)**(n*t)
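If the one-line version is hard to read, it can help to build the result in steps. The sketch below is one possible breakdown; the names `rate_per_period`, `num_periods`, and `growth` are made up here for illustration and do not come from the Wikipedia example. It also uses the built-in `round` function to show the result in dollars and cents.

```python
P = 1500   # principal
r = 0.043  # nominal annual interest rate
n = 4      # compounding periods per year
t = 6      # years

rate_per_period = r / n   # interest rate applied each quarter
num_periods = n * t       # total number of compounding periods
growth = (1 + rate_per_period) ** num_periods

V = P * growth
print(round(V, 2))      # total value, about 1938.84
print(round(V - P, 2))  # interest earned, about 438.84
```

Storing intermediate results in variables gives the same answer as the single expression, and each piece can be checked on its own.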

**Exercise:** Continuing the example from Wikipedia:

"Suppose the same amount of \$1,500 is compounded biennially", so `n = 1/2`.  

What would the total value be after 6 years?  Hint: we expect the answer to be a bit less than the previous answer.

In [None]:
# Solution

n = 1/2
P * (1 + r/n)**(n*t)

**Exercise:** If interest is compounded continuously, the value after time $t$ is given by the formula:

$V=P~e^{rt}$

Translate this formula into Python and use it to compute the value of the investment in the previous example with continuous compounding.  Hint: we expect the answer to be a bit more than the previous answers.

In [None]:
# Solution

P * np.exp(r*t)
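To see how the three answers relate, we can put them side by side. This is a sketch under a few assumptions: the variable names are invented for illustration, a daily compounding case (`n = 365`) is added that does not appear in the example, and `math.exp` from the standard library stands in for `np.exp` so the code runs without NumPy.

```python
import math

P = 1500   # principal
r = 0.043  # nominal annual interest rate
t = 6      # years

biennial = P * (1 + r/0.5)**(0.5*t)   # compounded every two years
quarterly = P * (1 + r/4)**(4*t)      # compounded four times a year
daily = P * (1 + r/365)**(365*t)      # compounded every day
continuous = P * math.exp(r*t)        # compounded continuously

print(round(biennial, 2))
print(round(quarterly, 2))
print(round(daily, 2))
print(round(continuous, 2))
```

The more often interest is compounded, the larger the result, and the continuous formula is the value the others approach as `n` grows.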

**Exercise:** Applying your algebra skills, solve the previous equation for $r$.  Now use the formula you just derived to answer this question:

"Harvard's tuition in 1970 was \$4,070 (not including room, board, and fees).  

"In 2019 it is \$46,340.  What was the annual rate of increase over that period, treating it as if it had compounded continuously?"

In [None]:
# Solution

P = 4070
V = 46340
t = (2019 - 1970)
np.log(V/P) / t
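For reference, here is one way to do the algebra. Starting from $V = P~e^{rt}$, divide both sides by $P$ to get $V/P = e^{rt}$, take the natural log of both sides to get $\log(V/P) = rt$, and divide by $t$ to get $r = \log(V/P) / t$.

A quick way to check a derivation like this is to plug the result back into the original formula and confirm that it recovers $V$. The sketch below does that, using `math.log` and `math.exp` from the standard library in place of the `np` versions.

```python
import math

P = 4070   # tuition in 1970
V = 46340  # tuition in 2019
t = 2019 - 1970

r = math.log(V / P) / t
print(r)                    # a little under 0.05, or about 5% per year

# Plugging r back into the original formula should recover V
print(P * math.exp(r * t))  # very close to 46340
```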

The point of this exercise is to practice using variables.  But it is also a reminder about logarithms, which come up all the time when you do data science work.