# Demos for Lecture 04: Data Types

In [None]:
import datascience
import numpy as np

## Names

In [None]:
(6.862 - 6.755)/6.755

If we make a calculation without using names to document the meanings, it can get very confusing! Compare the previous cell with the following:

Indiana's population in 2020 was 6.755 million. In 2023 it was 6.862 million. What was the growth rate of Indiana's population over this 3-year period?

In [None]:
IN_pop_millions_2020 = 6.755
IN_pop_millions_2023 = 6.862
growth_rate_over_3_years = IN_pop_millions_2023 / IN_pop_millions_2020 - 1
growth_rate_over_3_years

Descriptive names are important in code, because they help us communicate our ideas.
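
The unnamed expression `(6.862 - 6.755)/6.755` and the named version compute the same quantity, because (new - old)/old equals new/old - 1. Here is a small check that reuses the names from the cell above; `growth_difference_form` and `growth_ratio_form` are illustrative names, not part of the lecture:

```python
import math

IN_pop_millions_2020 = 6.755
IN_pop_millions_2023 = 6.862

# Two algebraically equivalent ways to write the growth rate
growth_difference_form = (IN_pop_millions_2023 - IN_pop_millions_2020) / IN_pop_millions_2020
growth_ratio_form = IN_pop_millions_2023 / IN_pop_millions_2020 - 1

# Compare with a tolerance, since the two floats may differ in the last digit
print(math.isclose(growth_difference_form, growth_ratio_form))

# Format the rate as a percentage with two decimal places
print(f"{growth_ratio_form:.2%}")
```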

## Numbers

In [None]:
# 10 * 3 is an integer (data type is int)
print(type(10 * 3))
10 * 3

In [None]:
# 10 / 3 is a floating-point number (data type is float)
print(type(10 / 3))
10 / 3

Is it a mathematical fact that 10 divided by 3 equals 3.3333333333333335?

What's going on here? 

  - How many significant figures does Python use when storing a floating-point number?
  - What does "floating" mean in this context?
  - Is it a problem that Python can't represent 10 / 3 exactly? Would another programming language do a better job?
  - Do integers also suffer from finite precision and rounding errors?
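
One way to investigate these questions is sketched below with the standard library's `sys` and `fractions` modules. It assumes the usual 64-bit IEEE 754 floats, which is what Python uses on common platforms; the variable names are illustrative.

```python
import sys
from fractions import Fraction

# Precision of a float on this machine
mantissa_bits = sys.float_info.mant_dig   # binary digits of precision
decimal_digits = sys.float_info.dig       # decimal digits that are always preserved
print(mantissa_bits, decimal_digits)

# Fraction(float) recovers the exact value that is actually stored
stored = Fraction(10 / 3)
exact = Fraction(10, 3)
rounding_error = abs(stored - exact)
print(float(rounding_error))

# Python integers are not limited to a fixed number of digits,
# but converting a large integer to a float can lose information
big_int = 2**53 + 1
print(big_int)
print(float(big_int) == float(2**53))
```

The "floating" in the name refers to the position of the point, which moves with the exponent (as in scientific notation) while the number of significant digits stays fixed. Other languages that use the same 64-bit format store the same approximation of 10 / 3.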

In [None]:
# Pi is a float in Python; does Python know the EXACT value of Pi?
import math
math.pi

In [None]:
# Notice Pi Squared has the same number of significant digits as Pi
math.pi**2

In [None]:
# But when we raise an integer to an integer power, it picks up extra digits ...
12345**67

Explain what's happening in this next calculation:

In [None]:
123456789.111111111111111111111111111111111 - 123456789.1111111

In [None]:
# What about this one? Notice the use of scientific notation:
5e-30 + 5e30
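
A possible way to probe the last two cells is to compare the values directly. Near 123 million, neighboring floats are roughly 1.5e-8 apart, so digits beyond that spacing cannot be stored. The variable names below are illustrative.

```python
long_literal = 123456789.111111111111111111111111111111111
short_literal = 123456789.1111111

# Each literal is rounded to the nearest representable float when Python reads it
print(long_literal == short_literal)

big = 5e30
tiny = 5e-30

# tiny is far smaller than the gap between big and its neighboring floats,
# so adding it does not change the stored value
print(big + tiny == big)
```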

Takeaway: If we need to make **exact** calculations in a Python program, we should use integers.
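
As a rough illustration of the takeaway, the sketch below contrasts a float calculation with the same calculation done in whole units (cents instead of dollars). The dollar amounts are made up for the example.

```python
# Adding dollar amounts as floats picks up a rounding error
float_total = 0.10 + 0.20
print(float_total)
print(float_total == 0.30)

# Adding the same amounts as integer cents is exact
cents_total = 10 + 20
print(cents_total == 30)

# Integer arithmetic stays exact even for very large values
quotient = 12345**67 // 12345**66
print(quotient == 12345)
```

Note that the `/` operator always returns a float, even for two integers; `//` keeps the result an integer.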

## Strings

In [None]:
# A string represents text
print(type("Flavor"))
"Flavor"

In [None]:
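# Without quotes, Flavor is a name, not a string; no such name has been
# defined, so running this cell raises a NameError
Flavor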

In [None]:
# The text can have spaces
"Vanilla Flavor"

In [None]:
# or digits
"12345"

In [None]:
# or an apostrophe
"Don't forget to brush your teeth"

In [None]:
# or quotation marks
'She said, "For sure!"'

In [None]:
# we can assign a string value to a variable
greeting = "hello"
greeting

In [None]:
# we can concatenate strings with +
greeting + " " + greeting

In [None]:
greeting + " students"

In [None]:
# we can repeat a string with *; this gives three copies joined together
greeting * 3

In [None]:
# we can convert an integer to a string with the str() function
str(123)

In [None]:
# and we can convert a string of digits to an integer
int('123') + 456

In [None]:
# or to a float
float("3.14")
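
These conversions matter because `+` does not mix strings and numbers automatically. Here is a short sketch with made-up values (`count` and `message` are illustrative names):

```python
count = 3

# Convert the number to a string before concatenating
message = "I have " + str(count) + " apples"
print(message)

# Concatenating a string and an integer directly raises a TypeError
try:
    "I have " + count
    mixed_types_allowed = True
except TypeError:
    mixed_types_allowed = False
print(mixed_types_allowed)

# int() only accepts text that looks like an integer
try:
    int("3.14")
    parsed_as_int = True
except ValueError:
    parsed_as_int = False
print(parsed_as_int)
```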

## Types

In [None]:
# what is the data type of 10?
type(10)

In [None]:
# mathematically, 10.0 is an integer. But Python stores it as a float, 
# because it has a decimal point
type(10.0)

In [None]:
# What is the type of "hi"?
type("hi")

In [None]:
# What type of data is returned by Table.read_table(...)?

from datascience import *
t = Table.read_table('nba_salaries.csv')
type(t)

In [None]:
# A Boolean value in Python is either True or False
type(True)

In [None]:
type(False)

In [None]:
# What type of data is enclosed by square brackets?
type(['cat', 3, True])  

Notice that the values in a list do not all need to be of the same data type. The list in the previous cell contains a string, an integer, and a Boolean.
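
To see this for each element, one option is to ask for the type of every value in the list (`mixed` and `type_names` are illustrative names):

```python
mixed = ['cat', 3, True]

# Collect the name of each element's type
type_names = [type(value).__name__ for value in mixed]
print(type_names)
```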

In [None]:
# What type is abs?
type(abs)

In [None]:
# What type is datascience?
type(datascience)

## Arrays

An array is a sequence of values, all of the same type.
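
To see what "all of the same type" means in practice, the sketch below builds arrays from mixed values. It uses `np.array` directly rather than `make_array`; the assumption is that both produce an ordinary NumPy array.

```python
import numpy as np

# Mixing integers and a float: every value is stored as a float
numbers = np.array([1, 2.5, 3])
print(numbers.dtype)
print(numbers)

# Mixing a string and an integer: every value is stored as a string
labels = np.array(['cat', 3])
print(labels.dtype.kind)   # 'U' means Unicode text
print(labels)
```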

In [None]:
first_four = make_array(1, 2, 3, 4)
first_four

In [None]:
type(first_four)

When you have an array, you can perform NumPy operations on it:

In [None]:
np.average(first_four)

In [None]:
np.max(first_four)

In [None]:
np.sum(first_four)

In [None]:
first_four + 3

In [None]:
first_four * 10

In [None]:
first_four ** 2

In [None]:
second_four = make_array(5, 6, 7, 8)
print(first_four)
print(second_four)

In [None]:
first_four + second_four

In [None]:
first_four * second_four
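
Arithmetic between two arrays of the same length works element by element: the first items are combined, then the second items, and so on. The sketch below spells that out with an explicit loop for comparison. It uses `np.array` in place of `make_array` (assumed equivalent here), and `by_hand` is an illustrative name.

```python
import numpy as np

first_four = np.array([1, 2, 3, 4])
second_four = np.array([5, 6, 7, 8])

# Array arithmetic pairs up items at the same position
elementwise_sum = first_four + second_four
elementwise_product = first_four * second_four

# The same sums computed one pair at a time
by_hand = [int(a) + int(b) for a, b in zip(first_four, second_four)]

print(elementwise_sum)
print(elementwise_product)
print(by_hand)
```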

In [None]:
# Use array.item() to index into an array
# Indexing starts at 0 in Python, so the item at index 1 is 
# the SECOND item in the array
animals = make_array('cat', 'dog', 'monkey')
animals.item(1)
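
Because indexing starts at 0, the valid indices for an array of three items are 0, 1, and 2. Here is a small sketch using `np.array` in place of `make_array` (assumed equivalent here); the variable names are illustrative.

```python
import numpy as np

animals = np.array(['cat', 'dog', 'monkey'])

first_animal = animals.item(0)
last_animal = animals.item(len(animals) - 1)
print(first_animal, last_animal)

# Asking for an index past the end raises an IndexError
try:
    animals.item(3)
    index_three_exists = True
except IndexError:
    index_three_exists = False
print(index_three_exists)
```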