# Data Types

In [None]:
from datascience import *
import numpy as np

%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')

## Arithmetic

In [None]:
2 + 3

In [None]:
2 * 3

In [None]:
2 ** 3

In [None]:
2 * * 3

In [None]:
(10 * 2) ** 3

### Floats

In [None]:
0.75 * 2

Computers cannot represent every real number exactly.  That would require infinite memory, since some numbers have infinitely many digits.

In [None]:
1 / 3
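To see the approximation directly, here is a small sketch using only the standard library. `Decimal(x)` reveals the exact value stored for a float; the variable names are chosen just for illustration.

```python
from decimal import Decimal

third = 1 / 3
print(third)            # 0.3333333333333333 (a rounded display)
print(Decimal(third))   # the exact value stored, which is not exactly one third

# 0.1, 0.2, and 0.3 are each stored approximately, so this comparison fails.
tenths_match = (0.1 + 0.2 == 0.3)
print(tenths_match)     # False
```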

In [None]:
2 / 0

#### Scientific Notation

Represent some numbers as $b \times 10^e$.

Examples:
* `1.23e5` is $1.23 \times 10^5$.
* `6.667e-07` is $6.667 \times 10^{-7}$.
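As a quick check of the notation, the sketch below writes both examples as Python literals. The names `big` and `small` are illustrative only.

```python
big = 1.23e5        # 1.23 times 10 to the 5th
small = 6.667e-07   # 6.667 times 10 to the -7th

print(big)            # 123000.0
print(small)          # 6.667e-07

# The ".2e" format asks for scientific notation with two decimal places.
formatted = f"{big:.2e}"
print(formatted)      # 1.23e+05
```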

In [None]:
2 / 3000

In [None]:
2 / 3000000

In [None]:
0.6666666666666666 - 0.6666666666666666123456789

In [None]:
0.000000000000000123456789

In [None]:
0.000000000000000000000000000000000000000000000000000000000000000000000123456789

#### Rounding Errors

Since numbers aren't always represented exactly, small errors may creep in when we operate on floats.  These errors are too small for us to worry about in this class.

In [None]:
2 ** 0.5

In [None]:
2 ** 0.5 * 2 ** 0.5

In [None]:
2 ** 0.5 * 2 ** 0.5 - 2
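Because of these tiny errors, comparing floats with `==` can be surprising. One common workaround, sketched here with the standard library's `math.isclose`, is to ask whether two values are close rather than identical. The variable names are illustrative.

```python
import math

root_two = 2 ** 0.5
product = root_two * root_two

exactly_two = (product == 2)
nearly_two = math.isclose(product, 2)

print(product)       # 2.0000000000000004
print(exactly_two)   # False
print(nearly_two)    # True
```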

## Strings

String values capture text data (sequences of characters).  Use single quotes or double quotes around strings.

In [None]:
'Moo'

In [None]:
"Moo"

Variables vs Strings

In [None]:
print("moo") # String value

moo = 4   # variable named moo
print(moo)

Why both single and double quotes?

In [None]:
'Don't always use single quotes'

In [None]:
"Don't always use single quotes"

In [None]:
'cs' + '104' # concatenation

In [None]:
'cs' + ' ' +  '104' # spaces aren't added for you

Concatenation only works when *both values are strings*.

In [None]:
number = 104
'cs' + number

*Convert* numbers to strings when you want to use them to build larger strings.

In [None]:
'cs' + str(number)
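For comparison, here is a short sketch of two ways to build the same string. The f-string form is not used elsewhere in this notebook; it is shown only as an alternative that does the conversion for you.

```python
number = 104

with_str = 'cs' + str(number)    # explicit conversion with str
with_fstring = f'cs{number}'     # an f-string converts the value for you

print(with_str)       # cs104
print(with_fstring)   # cs104
```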

Can convert from strings back to numbers as well.

In [None]:
int('3')

In [None]:
float('3.0')

In [None]:
int(str(number))
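Conversion only succeeds when the text actually looks like a number. The sketch below, with illustrative names, shows two conversions that work and one that raises a `ValueError`, which we catch so the cell keeps running.

```python
whole = int('42')      # 42
half = float('2.5')    # 2.5

try:
    int('moo')         # 'moo' is not a number, so this raises ValueError
except ValueError as error:
    message = str(error)
    print('Could not convert:', message)
```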

### Thought Questions

Suppose we start with these three variables:

In [None]:
x = 3
y = '4'
z = '5.6'

What happens when we run the following cells?

In [None]:
x + int(y)

In [None]:
x + y

In [None]:
x + float(y + z)

In [None]:
x + int(y + z)

In [None]:
y + float(z)

## Type

Can ask for the type of a value or variable.

In [None]:
type(3)

In [None]:
temperature = 98.6
type(temperature)
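A few more cases, as a quick sketch: the type depends on how a value is written or computed, not on how it looks. Note that division with `/` produces a float even when the result is a whole number.

```python
print(type(3))       # <class 'int'>
print(type(3.0))     # <class 'float'>
print(type('3'))     # <class 'str'>

quotient = 6 / 2
print(quotient)        # 3.0
print(type(quotient))  # <class 'float'>
```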

## Arrays

Array:  sequence of values, all the same type, "boxed up"

In [None]:
plot = Table.read_table("data/hopkins-plot-0011.csv")
plot.column("count")

Arithmetic operations are *broadcast*: they apply to each element of the array.

In [None]:
counts = plot.column("count")
counts * 2

In [None]:
counts + 5

In [None]:
other = make_array(1,2,3,4,5,6,7,8)
other

In [None]:
counts + other
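The same behavior can be sketched without the data file. The array below holds made-up values (not the Hopkins Forest counts) and is built directly with NumPy's `np.array`.

```python
import numpy as np

sample_counts = np.array([3, 0, 7, 2])   # hypothetical values for illustration

doubled = sample_counts * 2    # each element is multiplied by 2
shifted = sample_counts + 5    # 5 is added to each element
paired = sample_counts + np.array([1, 2, 3, 4])   # added position by position

print(doubled)   # [ 6  0 14  4]
print(shifted)   # [ 8  5 12  7]
print(paired)    # [4 2 10 6], printed with NumPy's own spacing
```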

In [None]:
len(counts)

In [None]:
max(counts)

In [None]:
min(counts)

In [None]:
sum(counts)
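These functions combine naturally. As a sketch on a made-up array (hypothetical values, not the real counts), dividing the sum by the length gives the same result as `np.average`.

```python
import numpy as np

sample_counts = np.array([3, 0, 7, 2])   # hypothetical values for illustration

total = sum(sample_counts)   # 12
size = len(sample_counts)    # 4
mean = total / size          # 3.0

print(total, size, mean)
print(np.average(sample_counts))   # 3.0
```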

In [None]:
counts + make_array(1,2)

*Index* into array to retrieve items.  Indices start at 0.

In [None]:
counts.item(0)

In [None]:
counts.item(1)

In [None]:
counts.item(3)

Think of `item(n)` as asking for the item that has `n` items before it.
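A small sketch of that rule on a made-up array (hypothetical values): the last valid index is always one less than the length.

```python
import numpy as np

sample_counts = np.array([3, 0, 7, 2])   # hypothetical values for illustration

first = sample_counts.item(0)    # no items before it
fourth = sample_counts.item(3)   # three items before it
last = sample_counts.item(len(sample_counts) - 1)

print(first, fourth, last)   # 3 2 2
```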

## Table Operation: column

How many red maples are in Hopkins Forest? What are the top ten plots for red maples?

In [None]:
trees = Table.read_table("data/hopkins-trees.csv")

In [None]:
trees.where("common name", are.equal_to("Maple, red"))

In [None]:
red_maples = trees.where("common name", are.equal_to("Maple, red")).select("plot", "count").sort("count", descending=True)
red_maples

In [None]:
sum(red_maples.column("count"))

In [None]:
np.average(red_maples.column("count"))

In [None]:
red_maples.barh("plot", "count")

## Table Operation: take

In [None]:
top_ten = red_maples.take(make_array(0,1,2,3,4,5,6,7,8,9))
top_ten

In [None]:
top_ten.barh("plot", "count")

### Creating Ranges

What if I wanted the top 50?  `make_array(0,1,2,...,49)`?  Ugh.
We can make an array for a *range* of numbers with `np.arange(low,high)`, which gives us the integers in the range `[low,high)`.

In [None]:
np.arange(0,10)

In [None]:
top_ten = red_maples.take(np.arange(0,10))
top_ten
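For the top 50 asked about earlier, a sketch of the index array looks like this. The name `top_fifty_indices` is illustrative; an array like it could be passed to `take` in place of a long `make_array` call.

```python
import numpy as np

top_fifty_indices = np.arange(0, 50)   # integers 0 through 49; 50 is excluded

print(len(top_fifty_indices))        # 50
print(top_fifty_indices.item(0))     # 0
print(top_fifty_indices.item(49))    # 49
```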

See other forms of ranges in the [book](https://inferentialthinking.com/chapters/05/2/Ranges.html).