# Lab 2: Data Types

Welcome to Lab 2!  

Last time, we had our first look at Python and Jupyter notebooks. 

In this lab, we are going to look at a few different data types including numbers and text.  A piece of text is called a *string* in Python.

Last, you'll learn more about working with datasets in Python.

In [None]:
# Install a pip package in the current Jupyter kernel
import sys
!{sys.executable} -m pip install git+https://github.com/grading/gradememaybe.git

First, initialize the grader. Each time you come back to this site to work on the lab, you will need to run this cell again.

In [None]:
from gofer.ok import check

## 1. Numbers

Quantitative information arises everywhere in data science. In addition to representing commands to print out lines, expressions can represent numbers and methods of combining numbers. The expression `3.2500` evaluates to the number 3.25. (Run the cell and see.)

In [None]:
3.2500

Notice that we didn't have to `print`. When you run a notebook cell, if the last line has a value, then Jupyter helpfully prints out that value for you. However, it won't print out prior lines automatically. If you want to print out the value of a prior line, you need to wrap it in a call to the `print` function. Run the cell below to check.

In [None]:
print(2)
3
4

Above, you should see that 4 is the value of the last expression, 2 is printed, but 3 is lost forever because it was neither printed nor last.

### 1.1. Arithmetic
The line in the next cell subtracts.  Its value is what you'd expect.  Run it.

In [None]:
3.25 - 1.5

Many basic arithmetic operations are built in to Python.  The textbook section on [Expressions](http://www.inferentialthinking.com/chapters/03/1/expressions.html) describes all the arithmetic operators used in the course.  The common operator that differs from typical math notation is `**`, which raises one number to the power of the other. So, `2**3` stands for $2^3$ and evaluates to 8. 

The order of operations is what you learned in elementary school, and Python also has parentheses.  For example, compare the outputs of the cells below. 

In [None]:
6+6*5-6*3**2*2**3/4*7

In [None]:
6+(6*5-(6*3))**2*((2**3)/4*7)

In standard math notation, the first expression is

$$6 + 6 \times 5 - 6 \times 3^2 \times \frac{2^3}{4} \times 7,$$

while the second expression is

$$6 + (6 \times 5 - (6 \times 3))^2 \times (\frac{(2^3)}{4} \times 7).$$
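To see how much parentheses can change a result, here is a small sketch with numbers chosen only for illustration (the names `without_parens` and `with_parens` are made up for this example and are not used elsewhere in the lab). Without parentheses, `**` is applied first, then `*`, then `+`.

```python
# ** binds tightest, then *, then +.
without_parens = 2 + 3 * 4 ** 2      # 4 ** 2 is 16, 3 * 16 is 48, 2 + 48 is 50

# Parentheses force the addition and multiplication to happen first.
with_parens = ((2 + 3) * 4) ** 2     # 2 + 3 is 5, 5 * 4 is 20, 20 ** 2 is 400

print(without_parens, with_parens)
```

The same digits and operators appear in both lines; only the grouping differs.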

**Question 1** <br /> Write a Python expression in this next cell that's equal to $5 \times (4 \frac{10}{11}) - 51 \frac{1}{3} + 2^{.5 \times 22} + \frac{26}{33}$.  That's five times four and ten elevenths, minus 51 and a third, plus two to the power of half of 22, plus 26 33rds.  By "$4 \frac{10}{11}$" we mean $4+\frac{10}{11}$, not $4 \times \frac{10}{11}$.

Replace the ellipses (`...`) with your expression.  Try to use parentheses only when necessary.

*Hint:* The correct output should be a familiar number.

In [None]:
expression = ...
expression

In [None]:
# Test cell; please do not change!
check('tests/q1.py')

## 2. Names (may also be referred to as variables)
In natural language, we have terminology that lets us quickly reference very complicated concepts.  We don't say, "That's a large mammal with brown fur and sharp teeth!"  Instead, we just say, "Bear!"

Similarly, an effective strategy for writing code is to define names for data as we compute it, like a lawyer would define terms for complex ideas at the start of a legal document to simplify the rest of the writing.

In Python, we do this with *assignment statements*. An assignment statement has a name on the left side of an `=` sign and an expression to be evaluated on the right.

In [None]:
ten = 3 * 2 + 4

When you run that cell, Python first evaluates the first line.  It computes the value of the expression `3 * 2 + 4`, which is the number 10.  Then it gives that value the name `ten`.  At that point, the code in the cell is done running.

After you run that cell, the value 10 is bound to the name `ten`:

In [None]:
ten

The statement `ten = 3 * 2 + 4` is not asserting that `ten` is already equal to `3 * 2 + 4`, as we might expect by analogy with math notation.  Rather, that line of code changes what `ten` means; it now refers to the value 10, whereas before it meant nothing at all.

If the designers of Python had been ruthlessly pedantic, they might have made us write

    define the name ten to hereafter have the value of 3 * 2 + 4 

instead.  You will probably appreciate the brevity of "`=`"!  But keep in mind that this is the real meaning.
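One way to convince yourself that `=` means "define" rather than "is equal to" is the sketch below, which uses a made-up name `total`. The second line would be nonsense as a math equation, but it is a perfectly good assignment statement: Python evaluates the right side first, then rebinds the name.

```python
total = 3 * 2 + 4      # total now refers to 10
total = total + 1      # evaluate the right side (10 + 1), then rebind total
print(total)
```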

### 2.1 Syntax and Style

A common pattern in Jupyter notebooks is to assign a value to a name and then immediately evaluate the name in the last line in the cell so that the value is displayed as output. 

In [None]:
close_to_pi = 355/113
close_to_pi

Another common pattern is that a series of lines in a single cell will build up a complex computation in stages, naming the intermediate results.  This is good style particularly when beginning as it allows for easy reading and identification of steps in problem solving.  

In [None]:
bimonthly_salary = 840
monthly_salary = 2 * bimonthly_salary
number_of_months_in_a_year = 12
yearly_salary = number_of_months_in_a_year * monthly_salary
yearly_salary

Note that you could have named these variables however you wanted, as long as each name started with a letter.  

**However**, names are very important for making your code *readable* to yourself and others.  A version of the cell above with one-letter names would be shorter, but it would be totally useless without an explanation of what it does.
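For contrast, here is a sketch of the salary computation written with uninformative one-letter names (`a`, `b`, `c`, and `d` are chosen here purely for illustration). It produces the same number as the cell with descriptive names, but a reader has to reverse-engineer what each line means.

```python
# Same arithmetic as the salary cell, but with names that explain nothing.
a = 840
b = 2 * a
c = 12
d = c * b
print(d)
```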

According to a famous joke among computer scientists, naming things is one of the two hardest problems in computer science.  (The other two are cache invalidation and "off-by-one" errors.  And people say computer scientists have an odd sense of humor...)

**Question 2** <br /> Assign the name `seconds_since_2000` to the number of seconds between midnight January 1, 2000 and midnight January 1, 2022. Use Python to perform any required arithmetic.

*Hint 1:* If you're stuck, the next section shows you how to get hints.

*Hint 2:* There are 6 leap years between 2000 and 2022.  

In [None]:
# Change the next line so that it computes the number of
# seconds in the last 22 years and assigns that number the name
# seconds_since_2000
seconds_since_2000 = ...

# We've put this line in this cell so that it will print
# the value you've given to seconds_since_2000 when you
# run it.  You don't need to change this.
seconds_since_2000

Running the following cell will test whether you have assigned `seconds_since_2000` correctly in Question 2. 

Sometimes the tests will give hints about what went wrong. If the test doesn't pass, read the output, adjust your answer to the question, run the answer cell again to update the name `seconds_since_2000`, then run this test cell again.

Sometimes the tests will tell you the answer. Rather than copying the answer, try to understand how it was reached. 

In [None]:
# Test cell; please do not change!
check('tests/q2.py')

You may have noticed this line in the cell above:

    # Test cell; please do not change!

That is called a *comment*.  It doesn't make anything happen in Python; Python ignores anything on a line after a #.  Instead, it's there to communicate something about the code to you, the human reader.  Comments are extremely useful.
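Here is a short sketch of the two places comments usually appear, using made-up names. It also shows one caveat: a `#` inside quotation marks is part of a piece of text (a *string*, covered in Section 3), not the start of a comment.

```python
# A comment on its own line describes the code that follows.
seconds_per_hour = 60 * 60  # a comment can also follow code on the same line

# Inside quotation marks, # is just an ordinary character, not a comment.
tag = '#moon'
print(seconds_per_hour, tag)
```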

### 2.2 Application: A physics experiment

On the Apollo 15 mission to the Moon, astronaut David Scott famously replicated Galileo's physics experiment in which he showed that gravity accelerates objects of different mass at the same rate. Because there is no air resistance for a falling object on the surface of the Moon, even two objects with very different masses and densities should fall at the same rate. David Scott compared a feather and a hammer.

You can run the following cell to watch a video of the experiment.

In [None]:
from IPython.display import YouTubeVideo
# The original URL is:
#   https://www.youtube.com/watch?v=U7db6ZeLR5s
YouTubeVideo("U7db6ZeLR5s")

Here's the transcript of the video:

**167:22:06 Scott**: Well, in my left hand, I have a feather; in my right hand, a hammer. And I guess one of the reasons we got here today was because of a gentleman named Galileo, a long time ago, who made a rather significant discovery about falling objects in gravity fields. And we thought where would be a better place to confirm his findings than on the Moon. And so we thought we'd try it here for you. The feather happens to be, appropriately, a falcon feather for our Falcon. And I'll drop the two of them here and, hopefully, they'll hit the ground at the same time. 

**167:22:43 Scott**: How about that!

**167:22:45 Allen**: How about that! (Applause in Houston)

**167:22:46 Scott**: Which proves that Mr. Galileo was correct in his findings.

**Newton's Law.** Using this footage, we can also attempt to confirm another famous bit of physics: Newton's law of universal gravitation. Newton's laws predict that any object dropped near the surface of the Moon should fall

$$\frac{1}{2} G \frac{M}{R^2} t^2 \text{ meters}$$

after $t$ seconds, where $G$ is a universal constant, $M$ is the moon's mass in kilograms, and $R$ is the moon's radius in meters.  So if we know $G$, $M$, and $R$, then Newton's laws let us predict how far an object will fall over any amount of time.

To verify the accuracy of this law, we will calculate the difference between the predicted distance the hammer drops and the actual distance.  (If they are different, it might be because Newton's laws are wrong, or because our measurements are imprecise, or because there are other factors affecting the hammer for which we haven't accounted.)

Someone studied the video and estimated that the hammer was dropped 113 cm from the surface. By counting frames in the video, they found that the hammer falls for 1.2 seconds (36 frames).

**Question 3** <br /> Complete the code in the next cell to fill in the *data* from the experiment.

In [None]:
# t, the duration of the fall in the experiment, in seconds.
# Fill this in.
time = ...

# The estimated distance the hammer actually fell, in meters.
# Fill this in.
estimated_distance_m = ...

In [None]:
check('tests/q3.py')

**Question 4** <br /> Now, complete the code in the next cell to compute the difference between the predicted and estimated distances (in meters) that the hammer fell in this experiment.

This just means translating the formula above ($\frac{1}{2}G\frac{M}{R^2}t^2$) into Python code.  You'll have to replace each variable in the math formula with the name we gave that number in Python code.

In [None]:
# First, we've written down the values of the 3 constants
# that show up in Newton's formula.

# G, the universal constant measuring the strength of gravity.
gravity_constant = 6.674 * 10**-11

# M, the moon's mass, in kilograms.
moon_mass_kg = 7.34767309 * 10**22

# R, the radius of the moon, in meters.
moon_radius_m = 1.737 * 10**6

# The distance the hammer should have fallen over the
# duration of the fall, in meters, according to Newton's
# law of gravity.  The text above describes the formula
# for this distance given by Newton's law.
# **YOU FILL THIS PART IN.**
predicted_distance_m = ...

# Here we've computed the difference between the predicted
# fall distance and the distance we actually measured.
# If you've filled in the above code, this should just work.
difference = predicted_distance_m - estimated_distance_m
difference

In [None]:
check('tests/q4.py')

## 3. Text
Programming doesn't just concern numbers. Text is one of the most common types of values used in programs. 

A snippet of text is represented by a **string value** in Python. The word "*string*" is a programming term for a sequence of characters. A string might contain a single character, a word, a sentence, or a whole book.

To distinguish text data from actual code, we demarcate strings by putting quotation marks around them. Single quotes (`'`) and double quotes (`"`) are both valid, but the types of opening and closing quotation marks must match. The contents can be any sequence of characters, including numbers and symbols. 

Just like names can be given to numbers, names can be given to string values.  The names and strings aren't required to be similar in any way. Any name can be assigned to any string.

In [None]:
one = 'two'
plus = '*'
print(one, plus, one)
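The sketch below illustrates the quoting rules with made-up names and text. It assumes one thing not covered yet in this lab: the `==` operator, which compares two values and evaluates to `True` when they are the same.

```python
single = 'falcon'
double = "falcon"
print(single == double)          # the quote style doesn't change the value

# Use one kind of quote on the outside to put the other kind inside the text.
with_double_inside = 'The sign read "Open".'
with_single_inside = "It's a feather."
print(with_double_inside)
print(with_single_inside)
```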

**Question 5** <br/> Yuri Gagarin was the first person to travel through outer space.  When he emerged from his capsule upon landing on Earth, he [reportedly](https://en.wikiquote.org/wiki/Yuri_Gagarin) had the following conversation with a woman and girl who saw the landing:

    The woman asked: "Can it be that you have come from outer space?"
    Gagarin replied: "As a matter of fact, I have!"

The cell below contains unfinished code.  Fill in the `...`s so that it prints out this conversation *exactly* as it appears above.

In [None]:
woman_asking = ...
woman_quote = '"Can it be that you have come from outer space?"'
gagarin_reply = 'Gagarin replied:'
gagarin_quote = ...

print(woman_asking, woman_quote)
print(gagarin_reply, gagarin_quote)

In [None]:
check('tests/q5.py')

## 4. Calling functions

The most common way to combine or manipulate values in Python is by calling functions. Python comes with many built-in functions that perform common operations.

For example, the `abs` function takes a single number as its argument and returns the absolute value of that number.  The absolute value of a number is its distance from 0 on the number line, so `abs(5)` is 5 and `abs(-5)` is also 5.

### 4.1. Application: Computing walking distances
Chunhua is on the corner of 7th Avenue and 42nd Street in Midtown Manhattan, and she wants to know how far she'd have to walk to get to Gramercy School on the corner of 10th Avenue and 34th Street.

She can't cut across blocks diagonally, since there are buildings in the way.  She has to walk along the sidewalks.  Using the map below, she sees she'd have to walk 3 avenues (long blocks) and 8 streets (short blocks).  In terms of the given numbers, she computed 3 as the difference between 7 and 10, *in absolute value*, and 8 similarly.  

Chunhua also knows that blocks in Manhattan are all about 80m by 274m (avenues are farther apart than streets).  So in total, she'd have to walk $(80 \times |42 - 34| + 274 \times |7 - 10|)$ meters to get to the school.

<img src="map.jpg" alt="visual map about distance calculation"/>

**Question 6** <br /> Finish the line `num_avenues_away = ...` in the next cell so that the cell calculates the distance Chunhua must walk and gives it the name `manhattan_distance`.  Everything else has been filled in for you.  **Use the `abs` function.**

In [None]:
# Here's the number of streets away:
num_streets_away = abs(42-34)

# Compute the number of avenues away in a similar way:
num_avenues_away = ...

street_length_m = 80
avenue_length_m = 274

# Now we compute the total distance Chunhua must walk.
manhattan_distance = street_length_m*num_streets_away + avenue_length_m*num_avenues_away

# We've included this line so that you see the distance
# you've computed when you run this cell.  You don't need
# to change it, but you can if you want.
manhattan_distance

Be sure to run the next cell to test your code.

In [None]:
check('tests/q6.py')

##### Multiple arguments
Some functions take multiple arguments, separated by commas. For example, the built-in `max` function returns the maximum argument passed to it.

In [None]:
max(2, -3, 4, -5)
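A couple of other built-in functions also accept several arguments. The sketch below uses `min`, which mirrors `max`, and `round`, whose optional second argument is the number of decimal places to keep. The names on the left are made up for this example.

```python
largest = max(2, -3, 4, -5)     # the same call as in the cell above
smallest = min(2, -3, 4, -5)    # min returns the smallest argument
rounded = round(3.14159, 2)     # second argument: number of decimal places
print(largest, smallest, rounded)
```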

### 4.2 Understanding nested expressions
Function calls and arithmetic expressions can themselves contain expressions.  You saw an example in the last question:

    abs(42-34)

has 2 number expressions in a subtraction expression in a function call expression.  And you probably wrote something like `abs(7-10)` to compute `num_avenues_away`.

Nested expressions can turn into complicated-looking code. However, the way in which complicated expressions break down is very regular.
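As a sketch of that regularity, the example below evaluates one nested expression all at once and then again one step per line, working from the innermost call outward. The numbers and names are made up for illustration.

```python
nested = abs(max(-8, -2) - 5)

# The same computation, one step per line, innermost expression first.
largest = max(-8, -2)           # -2
shifted = largest - 5           # -7
unnested = abs(shifted)         # 7

print(nested, unnested)
```

Naming the intermediate values takes more lines, but it makes each step easy to check on its own.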

**Question 7** <br /> Given the heights of the Splash Triplets from the Golden State Warriors, write an expression that computes the smallest difference between any of the three heights. Your expression shouldn't have any numbers in it, only function calls and the names `klay`, `steph`, and `kevin`. Give the value of your expression the name `min_height_difference`.

In [None]:
# The three players' heights, in meters:
klay =  2.01 # Klay Thompson is 6'7"
steph = 1.91 # Steph Curry is 6'3"
kevin = 2.06 # Kevin Durant is officially 6'9", but many suspect that he is taller.
             # (Further complicating matters, membership of the "Splash Triplets" 
             #  is disputed, since it was originally used in reference to 
             #  Klay Thompson, Steph Curry, and Draymond Green.)

# We'd like to look at all 3 pairs of heights, compute the absolute
# difference between each pair, and then find the smallest of those
# 3 absolute differences.  This is left to you!  If you're stuck,
# try computing the value for each step of the process (like the
# difference between Klay's height and Steph's height) on a separate
# line and giving it a name (like klay_steph_height_diff).
min_height_difference = ...

In [None]:
check('tests/q7.py')

### 4.3 String Methods

Strings can be transformed using **methods**, which are functions that involve an existing string and some other arguments. One example is the `replace` method, which replaces all instances of some part of a string with some alternative. 

A method is invoked on a string by placing a `.` after the string value, then the name of the method, and finally parentheses containing the arguments. Here's a sketch, where the `<` and `>` symbols aren't part of the syntax; they just mark the boundaries of sub-expressions.

    <expression that evaluates to a string>.<method name>(<argument>, <argument>, ...)

Try to predict the output of this example, then execute it.

In [None]:
# Replace a sequence of letters, which appears twice
'hitchhiker'.replace('hi', 'ma')

Once a name is bound to a string value, methods can be invoked on that name as well. The name is still bound to the original string, so a new name is needed to capture the result. 

In [None]:
sharp = 'edged'
hot = sharp.replace('ed', 'ma')
print('sharp:', sharp)
print('hot:', hot)

Just as you nested function calls in Section 4.2, you can also invoke a method on the output of another method call. This is sometimes called *chaining* methods.  

In [None]:
# Calling replace on the output of another call to replace
'train'.replace('t', 'ing').replace('in', 'de')

Here's a picture of how Python evaluates a "chained" method call like that:

<img src="chaining_method_calls.jpg" alt="In 'train'.replace('t', 'ing').replace('in', 'de'), 'train'.replace('t', 'ing') is run first and evaluates to 'ingrain'. Then 'ingrain'.replace('in', 'de') is evaluated to 'degrade'."/>
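If the chained form is hard to read, the same computation can be sketched with a name for the intermediate string. The names `first_step` and `second_step` are made up for this example.

```python
chained = 'train'.replace('t', 'ing').replace('in', 'de')

# The same two calls, with the intermediate result given a name.
first_step = 'train'.replace('t', 'ing')        # 'ingrain'
second_step = first_step.replace('in', 'de')    # 'degrade'
print(chained, second_step)
```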

**Question 8** <br/> Assign strings to the names `you` and `this` so that the final expression evaluates to a 10-letter English word with three double letters in a row. Essentially we're starting with the word 'beeper' and we want to convert this to another word using the string method replace.  

*Hint:* The call to `print` is there to print out the intermediate result called `the`. This should be an English word with two double letters in a row.

*Hint 2:* Run the tests if you're stuck.  They'll give you some hints.

In [None]:
you = ...
this = ...
a = 'beeper'
the = a.replace('p', you) 
print('the:', the)
the.replace('bee', this)

In [None]:
check('tests/q8.py')

Other string methods do not take any arguments at all, because the original string is all that's needed to compute the result. In these cases, parentheses are still needed, but there's nothing in between the parentheses. Here are some methods that take no arguments:

|Method name|Value|
|-|-|
|`lower`|a lowercased version of the string|
|`upper`|an uppercased version of the string|
|`capitalize`|a version with the first letter capitalized|
|`title`|a version with the first letter of every word capitalized|
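Here is a quick sketch of all four methods applied to the same made-up string, `phrase`. Note the empty parentheses after each method name.

```python
phrase = 'one small STEP'
print(phrase.lower())        # one small step
print(phrase.upper())        # ONE SMALL STEP
print(phrase.capitalize())   # One small step
print(phrase.title())        # One Small Step
```

Notice that `capitalize` and `title` also lowercase the remaining letters, and that `phrase` itself is unchanged afterward.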

All these string methods are useful, but most programmers don't memorize their names or how to use them.  Instead, people usually just search the internet for documentation and examples. A complete [list of string methods](https://docs.python.org/3/library/stdtypes.html#string-methods) appears in the Python language documentation. [Stack Overflow](http://stackoverflow.com) has a huge database of answered questions that often demonstrate how to use these methods to achieve various ends.

### 4.3.1 Strings as function arguments

String values, like numbers, can be arguments to functions and can be returned by functions.  The function `len` takes a single string as its argument and returns the number of characters in the string: its **len**gth.  

Note that it doesn't count *words*. `len("one small step for man")` is 22, not 5.
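If you did want a word count, one possible approach (not covered in this lab, so treat it as a sketch) is the string method `split`, which breaks a string into pieces at whitespace. `len` can then count the pieces.

```python
phrase = "one small step for man"
num_characters = len(phrase)          # counts letters and spaces
num_words = len(phrase.split())       # split() breaks the string at whitespace
print(num_characters, num_words)
```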

**Question 9**  <br/> Use `len` to find out the number of characters in the very long string in the next cell.  (It's the first sentence of the English translation of the French [Declaration of the Rights of Man](http://avalon.law.yale.edu/18th_century/rightsof.asp).)  The length of a string is the total number of characters in it, including things like spaces and punctuation.  Assign `sentence_length` to that number.

In [None]:
a_very_long_sentence = "The representatives of the French people, organized as a National Assembly, believing that the ignorance, neglect, or contempt of the rights of man are the sole cause of public calamities and of the corruption of governments, have determined to set forth in a solemn declaration the natural, unalienable, and sacred rights of man, in order that this declaration, being constantly before all the members of the Social body, shall remind them continually of their rights and duties; in order that the acts of the legislative power, as well as those of the executive power, may be compared at any moment with the objects and purposes of all political institutions and may thus be more respected, and, lastly, in order that the grievances of the citizens, based hereafter upon simple and incontestable principles, shall tend to the maintenance of the constitution and redound to the happiness of all."
sentence_length = ...
sentence_length

In [None]:
check('tests/q9.py')

### 4.3.2 Converting to and from Strings

Strings and numbers are different *types* of values, even when a string contains the digits of a number. For example, evaluating the following cell causes an error because an integer cannot be added to a string.

In [None]:
8 + "8"

However, there are built-in functions to convert numbers to strings and strings to numbers. 

|Function name|Effect|Example|
|-|-|-|
|`int`  |Converts a string of digits and perhaps a negative sign to an integer (`int`) value|`int("42")`|
|`float`|Converts a string of digits and perhaps a negative sign and decimal point to a decimal (`float`) value|`float("4.2")`|
|`str`  |  Converts any value to a string (`str`) value|`str(42)`|
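The sketch below tries each conversion with made-up values. It assumes one fact not stated above: `+` between two strings joins them end to end.

```python
as_int = int("-7") + 10         # the string "-7" becomes the integer -7
as_float = float("4.2") * 2     # the string "4.2" becomes the decimal 4.2
as_text = str(42) + "!"         # the integer 42 becomes the string "42"
print(as_int, as_float, as_text)
```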


What do you think the following cell will evaluate to?

In [None]:
8 + int("8")

**Question 10** <br/> Use `replace` and `int` together to compute the time between the year 105 BCE ([Ts'ai Lun invents paper based on tree bark for the Emperor of China](https://en.wikipedia.org/wiki/Paper)) and the year 1440 AD ([Start of the Print Revolution](https://en.wikipedia.org/wiki/Printing_press)). Try not to use any numbers in your solution, but instead manipulate the strings that are provided.

*Hint*: It's ok to be off by one year. In historical calendars, there is no year zero, but astronomical calendars do include [year zero](https://en.wikipedia.org/wiki/Year_zero) to simplify calculations.

In [None]:
invented = 'BC 105'
revolution = 'AD 1440'
start = ...
end = ...
print('The time between the first invention of paper and the print revolution is', end-start, 'years from', invented, 'to', revolution)

In [None]:
check('tests/q10.py')

### 4.4 Importing code

> What has been will be again,  
> what has been done will be done again;  
> there is nothing new under the sun.

Most programming involves work that is very similar to work that has been done before.  Since writing code is time consuming, it's good to rely on others' published code when you can.  Rather than copy-pasting, Python allows us to **import** other code, creating a **module** that contains all of the names created by that code.

Python includes many useful modules that are just an `import` away.  We'll look at the `math` module as a first example. The `math` module is extremely useful in computing mathematical expressions in Python. 

Suppose we want to very accurately compute the area of a circle with radius 5 meters.  For that, we need the constant $\pi$, which is roughly 3.14.  Conveniently, the `math` module has `pi` defined for us:

In [None]:
import math
radius = 5
area_of_circle = radius**2 * math.pi
area_of_circle

`pi` is defined inside `math`, and the way that we access names that are inside modules is by writing the module's name, then a dot, then the name of the thing we want:

    <module name>.<name>
    
In order to use a module at all, we must first write the statement `import <module name>`.  That statement creates a module object with things like `pi` in it and then assigns the name `math` to that module.  Above we have done that for `math`.


**Modules** can provide other named things, including **functions**.  For example, `math` provides the name `sin` for the sine function.  Having imported `math` already, we can write `math.sin(3)` to compute the sine of 3.  (Note that this sine function considers its argument to be in [radians](https://en.wikipedia.org/wiki/Radian), not degrees.  180 degrees are equivalent to $\pi$ radians.)
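As a sketch of the `<module name>.<name>` pattern with functions, here are three other functions from `math`: `sqrt`, `cos`, and `degrees` (which converts radians to degrees). The names on the left are made up for this example.

```python
import math

root = math.sqrt(16)                            # square root
cosine_of_zero = math.cos(0)                    # argument is in radians
half_turn_in_degrees = math.degrees(math.pi)    # pi radians, in degrees
print(root, cosine_of_zero, half_turn_in_degrees)
```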

**Question 11** <br/> A $\frac{\pi}{4}$-radian (45-degree) angle forms a right triangle with equal base and height, pictured below.  If the hypotenuse (the radius of the circle in the picture) is 1, then the height is $\sin(\frac{\pi}{4})$.  Compute that using `sin` and `pi` from the `math` module.  Give the result the name `sine_of_pi_over_four`.

<img src="http://mathworld.wolfram.com/images/eps-gif/TrigonometryAnglesPi4_1000.gif">
(Source: [Wolfram MathWorld](http://mathworld.wolfram.com/images/eps-gif/TrigonometryAnglesPi4_1000.gif))

In [None]:
sine_of_pi_over_four = ...
sine_of_pi_over_four

In [None]:
check('tests/q11.py')

For your reference, here are some more examples of functions from the `math` module.

Note how different functions take different numbers of arguments. Often, a module's documentation will tell you how many arguments each function requires.

In [None]:
# Calculating factorials.
math.factorial(5)

In [None]:
# Calculating logarithms (the logarithm of 8 in base 2).
# The result is 3 because 2 to the power of 3 is 8.
math.log(8, 2)

There are many variations on how we can import names from outside sources. For example, we can import just a few specific names from a module, we can give a module a shorter nickname when we import it, and we can import every name from a module at once. 

In [None]:
# Importing just cos and pi from math.
# Now, we don't have to use "math." before these names.
from math import cos, pi
print(cos(pi))

In [None]:
# We can nickname math as something else, if we don't want to type the name math
import math as m
m.log(m.pi)

In [None]:
# Lastly, we can import everything from math and use all of its names without "math."
from math import *
log(pi)

## 5. Arrays

Up to now, we haven't done much that you couldn't do yourself by hand, without going through the trouble of learning Python.  Computers are most useful when a small amount of code performs a lot of work by *performing the same action* to *many different things*.

For example, in the time it takes you to calculate the 18% tip on a restaurant bill, a laptop can calculate 18% tips for every restaurant bill paid by every human on Earth that day.  (That's if you're pretty fast at doing arithmetic in your head!)

**Arrays** are how we put many values in one place so that we can operate on them as a group. For example, if `billions_of_numbers` is an array of numbers, the expression

    .18 * billions_of_numbers

gives a new array of numbers that's the result of multiplying each number in `billions_of_numbers` by .18 (18%).  Arrays are not limited to numbers; we can also put all the words in a book into an array of strings.

Concretely, an array is a **collection of values of the same type**, like a column in an Excel spreadsheet. 

<img src="excel_array.jpg" alt="In Excel, columns of text are like array of strings for tables.  The same can be said about numbers (ints, floats) as well">
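Here is a tiny sketch of the tip calculation on three made-up bills. It builds the array with NumPy's `np.array` directly, which is an assumption made for this example; the lab itself introduces `make_array` and NumPy in the sections that follow.

```python
import numpy as np

# Three made-up restaurant bills, in dollars.
bills = np.array([20.0, 35.5, 100.0])

# One multiplication computes the 18% tip for every bill at once.
tips = 0.18 * bills
print(tips)
```

The result is a new array with one tip per bill; `bills` itself is unchanged.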

### 5.1. Making arrays
You can type in the data that goes in an array yourself, but that's not typically how programs work. Normally, we create arrays by loading them from an external source, like a data file.

First, though, let's learn how to start from scratch. Execute the following cell so that all the names from the `datascience` module are available to you. The documentation for this module is available at [http://data8.org/datascience](http://data8.org/datascience/).

In [None]:
from datascience import *

Now, to create an array, call the function `make_array`.  Each argument you pass to `make_array` will be in the array it returns.  Run this cell to see an example:


In [None]:
make_array(0.125, 4.75, -1.3)

Each value in an array (in the above case, the numbers 0.125, 4.75, and -1.3) is called an *element* or *item* of that array.

Arrays themselves are also values, just like numbers and strings.  That means you can assign them names or use them as arguments to functions.

**Question 12** <br/> Make an array containing the numbers 0, 1, -1, $\pi$, and $e$, in that order.  Name it `interesting_numbers`.  *Hint:* How did you get the values $\pi$ and $e$ earlier?  You can refer to them in exactly the same way here.

In [None]:
interesting_numbers = ...
interesting_numbers

In [None]:
check('tests/q12.py')

### 5.2.  `np.arange`
Arrays are provided by a package called [NumPy](http://www.numpy.org/) (pronounced "NUM-pie" or, if you prefer to pronounce things incorrectly, "NUM-pee").  The package is called `numpy`, but it's standard to rename it `np` for brevity.  You can do that with:

    import numpy as np

Very often in data science, we want to work with many numbers that are evenly spaced within some range.  NumPy provides a special function for this called `arange`.  `np.arange(start, stop, space)` produces an array with all the numbers starting at `start` and counting up by `space`, stopping before `stop` is reached.

For example, the value of `np.arange(1, 6, 2)` is an array with elements 1, 3, and 5 -- it starts at 1 and counts up by 2, then stops before 6.  In other words, it's equivalent to `make_array(1, 3, 5)`.

`np.arange(4, 9, 1)` is an array with elements 4, 5, 6, 7, and 8.  (It doesn't contain 9 because `np.arange` stops *before* the stop value is reached.)
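The sketch below shows a few more calls to `np.arange`, with made-up names on the left. Two details go beyond the description above: the spacing can be a decimal, and when `np.arange` is given a single argument it starts at 0 and counts up by 1.

```python
import numpy as np

odds = np.arange(1, 6, 2)           # 1, 3, 5
quarters = np.arange(0, 1, 0.25)    # 0.0, 0.25, 0.5, 0.75
first_five = np.arange(5)           # 0, 1, 2, 3, 4
print(odds, quarters, first_five)
```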

**Question 13** <br/>Import `numpy` as `np` and then use `np.arange` to create an array with the multiples of 99 from 0 up to (**and including**) 9999.  (So its elements are 0, 99, 198, 297, etc.)

In [None]:
...
multiples_of_99 = ...
multiples_of_99

In [None]:
check('tests/q13.py')

### 5.3. Application: Temperature readings
NOAA (the US National Oceanic and Atmospheric Administration) operates weather stations that measure surface temperatures at different sites around the United States.  The hourly readings are [publicly available](http://www.ncdc.noaa.gov/qclcd/QCLCD?prior=N).

Suppose we download all the hourly data from the Oakland, California site for the month of December 2015.  To analyze the data, we want to know when each reading was taken, but we find that the data don't include the timestamps of the readings (the time at which each one was taken).

However, we know the first reading was taken at the first instant of December 2015 (midnight on December 1st) and each subsequent reading was taken exactly 1 hour after the last.

**Question 14** <br/>Create an array of the *time, in seconds, since the start of the month* at which each hourly reading was taken.  Name it `collection_times`.

*Hint 1:* There were 31 days in December, which is equivalent to ($31 \times 24$) hours or ($31 \times 24 \times 60 \times 60$) seconds.  So your array should have $31 \times 24$ elements in it.

*Hint 2:* The `len` function works on arrays, too.  If your `collection_times` isn't passing the tests, check its length and make sure it has $31 \times 24$ elements.

In [None]:
collection_times = ...
collection_times

In [None]:
check('tests/q14.py')

**Question 15** <br/>The powers of 2 ($2^0 = 1$, $2^1 = 2$, $2^2 = 4$, etc.) arise frequently in computer science.  (For example, you may have noticed that storage on smartphones or USB drives comes in powers of 2, like 16 GB, 32 GB, or 64 GB.)  Use `np.arange` and the exponentiation operator `**` to compute the first 15 powers of 2, starting from $2^0$.

In [None]:
powers_of_2 = ...
powers_of_2

In [None]:
check('tests/q15.py')

## 6. Success! 

Congratulations, you're done with lab 2!  Be sure to 
- **run all the tests and verify that they all pass** (the next cell has a shortcut for that), 
- **Save and Checkpoint** from the `File` menu,
- **Download as an HTML or ipynb file** and submit it to Canvas for credit!

In [None]:
from gofer.ok import check
for x in range(1, 16):
    print('Testing question {}: '.format(x))
    display(check('tests/q{}.py'.format(x)))