# Homework 0: Introduction To Python 🐍

## Why Python?

Python is one of the hottest languages in industry today, especially in machine learning and data science. It is widely appreciated for its clean, readable, and generally no-nonsense code that enables single-minded focus on the task at hand. It was designed to be both simple and friendly, yet still powerful and expressive.

In this course, we will be using Python 3.10.

## About This Homework

In this homework, we assume that you have taken an introductory course in computer science, or you are familiar with programming in another programming language (e.g. C++ or Java). If you are already familiar with Python, then you may skip to the [discussion of NumPy](#4.-NumPy). Here we will briefly demonstrate how to write Python code and introduce some of the tools we frequently use.

### This is a Jupyter Notebook

As you may already know, this homework is formatted as a Jupyter Notebook, which provides a Python environment that you can interact with. This means that you can run all of the code examples you find below, and even create your own code to try things out (you are highly encouraged to do so!).

Each chunk of code or text in this notebook is written in a **block**. There are three types of blocks in Jupyter Notebooks: code, markdown, and raw.

#### Code Blocks

**Code blocks** are exactly what their name implies: you write code in them. They are blocks that can be executed in the interactive Python environment we mentioned earlier. You can **run** code blocks by clicking into them so that you see a blue bar to the left of the block, and then either pressing the small ▶ button in the top bar, or pressing `Shift + Enter`.

**Try this!** Here is an example of a code block. Run it!

In [84]:
print("I'm a code block!")

I'm a code block!


You can also add new blocks by pressing the + button in the top bar.

#### Markdown Blocks

All of the text you see in this notebook was written with the second type of block we mentioned: **markdown blocks**. These blocks allow you to write formatted text with a simple markdown language called, well, [Markdown](https://en.wikipedia.org/wiki/Markdown). You can find a very simple demonstration of markdown [here](https://markdown-here.com/livedemo.html). You can also double-click any of the text blocks in this document, **like this one**, to see the markdown "source" code that produced it. Just **run** it to render it again.

#### Raw Blocks

The last type of block is a **raw block**, which allows you to write preformatted or "raw" text. Unlike markdown blocks, the contents of raw blocks are not rendered when you run them. These blocks are not very common in a notebook, but there are times when you will find them useful.

## 3. Working in Python

As mentioned before, Python is renowned for its simplicity. And, while this introduction may be long, we strongly believe that as you continue to write Python code, you will find that the language just "fades into the background," allowing you to focus on the data science that you are here to learn.

As you read through the rest of the homework, remember that all of the code blocks are interactive and that you should **run** them to see what happens. Furthermore, you are _encouraged_ to make your own blocks (with the + button) and to try experimenting with things yourself. Just keep in mind that on homeworks, we may not allow extra blocks to be used.

### Expressions

As with any programming language, complex ideas are built up from small, *primitive* expressions. More generally, anything that can be *evaluated* to a value is an expression. For example, a number (1, 2, etc.) in code is an expression because that code can be evaluated to the value of that number.

**Try this!** Execute the cells in this section and observe their output.

In [85]:
217

217

By combining simple expressions with operators, we can even express complex ideas, like those used to find patterns in data. For example,

In [86]:
217 * 9 + 8 * 7 + 10

2019

Some other types of expressions are `strings` and `booleans`:

In [87]:
"I'm an expression too!"

"I'm an expression too!"

In [88]:
True or False is not False

True

### Variables

You can create variables by assigning a value to a name (or, if you prefer, by giving a name a value). Values can be expressed as expressions, which are evaluated prior to being assigned to a name.

In [89]:
greeting = 'Hello'
greeting

'Hello'

If you are in a hurry, you can also assign several values to several names in a single statement. One caveat: the entire right-hand side is evaluated before any of the names are assigned, so you cannot use `one` and `two` in the expression for `three` unless they were already defined earlier.

In [90]:
# this won't work!
one, two, three = 1, 2, one + two

In [91]:
# this is ok!
one, two, three = 1, 2, 3
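Because the right-hand side is evaluated in full before any name is assigned, the same syntax can swap two values without a temporary variable. Here is a minimal sketch; the names `left` and `right` are illustrative and not part of the homework.

```python
left, right = 1, 2

# The tuple (right, left) is built first, then unpacked into the two names.
left, right = right, left

print(left, right)  # prints: 2 1
```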

### Strings

Working with data often means working with strings. Recall that strings are what we call words and sentences in programming languages because they are essentially a group of characters, like `a` or `b`, that have been *strung* together like `hello there!`. Being familiar with how to manipulate strings is not only important, but very useful. Many professionals love Python because of how easy it is to work with strings, especially in areas like [Natural Language Processing](https://en.wikipedia.org/wiki/Natural_language_processing).

Like in other languages, you can denote a string using the `"` symbol. For example, `"this is a string"`. Alternatively in Python you may also use a "single quote", `'`, to denote a string (e.g. `'this is also a string'`).

**Try this!** In the following block, try assigning a string containing your name to a variable called `my_name`.

In [92]:
# replace None with a string containing your name
my_name = None
my_name = "Silas Nevstad"

#### Indexing and Slicing

You can access specific characters in a string using square bracket notation, `[]`. 

In [93]:
my_name[0]

'S'

Remember that, as in Java, strings are indexed from `0`.

You can also slice strings, or get a part of a string using **slice** indexing. Slicing works by specifying the range of indices that you are interested in retrieving. The way to describe this is the same as [interval notation](http://www.mathwords.com/i/interval_notation.htm) if you remember from math:

$$[\text{begin} : \text{end}).$$

The selected string will include the character at the first index given, but will not include the character of the second index.

In [94]:
my_name[0:2]

'Si'

Often, you will be interested in either the first few or last few characters in a string, in which case you can leave out the corresponding index. In the following example, I only want the characters from index `1` to the end so I can leave off the ending index.

In [95]:
my_name[1:]

'ilas Nevstad'
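The same idea works at the other end of the string. As a small sketch (the string `word` is an illustrative example, not part of the homework), leaving out the starting index gives the first few characters, and a negative starting index counts from the end of the string:

```python
word = "notebook"  # illustrative string

first_three = word[:3]   # from the start up to, but not including, index 3
last_four = word[-4:]    # from the fourth-to-last character to the end

print(first_three, last_four)  # prints: not book
```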

**Try this!** Get all characters of the following string but the first four without counting its length: 

In [96]:
my_string = "This is a really long sting and you have now clue on how long it is!"

# your code here 
my_string[4:]

' is a really long sting and you have now clue on how long it is!'

#### More On Strings

There are many more things that you can do with strings; for example, you can concatenate them (`'hello' + ', ' + 'world'`). However, in this course, you will primarily be working with _numerical_ data instead of strings, so we will not linger on this topic. For more information on strings, please see this [article from Google](https://developers.google.com/edu/python/strings).

### Lists

Lists are like `arrays` in Java, except that you can put whatever you want into them. They are also flexible in size so you can keep adding to them as much as you'd like. Lists and strings are both *sequence* types in Python, so they share many of the same features, such as indexing and slicing.

Here we create a new list, using square bracket notation, and then fill it with `1`, the value of variable `one`, the character `'a'`, and the value of the variable `my_name`.

In [97]:
[1, one, 'a', my_name]

[1, 1, 'a', 'Silas Nevstad']

#### Adding to a List

You can add to lists using the `append` method like this.

In [98]:
a_list = [1, 2, 3, 4, 5]
a_list.append(6)
a_list

[1, 2, 3, 4, 5, 6]

It is also possible to concatenate two lists, i.e. you can add two lists end-to-end.

In [99]:
a_list + a_list

[1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]

#### Indexing

The good news is that strings and lists are both sequences, so the indexing techniques you used for strings work for lists as well.

**Try This!** Get the second to fourth elements of `a_list`.

In [100]:
# your code here 
a_list[1:4]

[2, 3, 4]

We can also do more crazy indexing things, like skipping elements:

In [101]:
a_list[0::2]

[1, 3, 5]

**Write-up!** How would you explain this line of code in words? Write your response in the box below.

**Your response here:** 
This line of code is indexing the list `a_list` starting from the first element and then skipping every other element.

Or reversing it:

In [102]:
a_list[::-1]

[6, 5, 4, 3, 2, 1]
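Both of the slices above use the general form `[start:stop:step]`, where any of the three parts may be left out. The sketch below spells the parts out on a small illustrative list (`letters` is not part of the homework):

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']  # illustrative list

every_other = letters[0::2]         # start at index 0, run to the end, step by 2
middle_backwards = letters[4:1:-1]  # start at index 4, stop before index 1, step by -1

print(every_other)       # ['a', 'c', 'e']
print(middle_backwards)  # ['e', 'd', 'c']
```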

**Try This!** How would you get all of the elements from `a_list` with slicing? _Note that this is not the same as simply evaluating `a_list` again. Instead, it will make a copy of the list._

In [103]:
# your code here 
a_list[:]

[1, 2, 3, 4, 5, 6]
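To see why the copy matters, compare it with plain assignment, which only gives the same list a second name. This is a small sketch with illustrative names (`original`, `alias`, `sliced_copy`):

```python
original = [1, 2, 3]

alias = original          # a second name for the same list object
sliced_copy = original[:] # a new list holding the same elements

original.append(4)        # modify the list through its first name

print(alias)        # [1, 2, 3, 4]  (the alias sees the change)
print(sliced_copy)  # [1, 2, 3]     (the copy does not)
```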

### Control Flow and Loops

#### If-Else Conditionals

Besides the `elif` statement, there are three more things to notice. First, the conditional statements don't have to be in parentheses, which allows for easier reading of code with much less clutter. Second, each statement is terminated with a `:`. This is essentially saying that we will "define" what this statement entails. And, third, that each "block" following a statement is merely indented with either 4 spaces or a tab. This is, again, for readability. You can almost think of Python code as _outlining_ what you want it to do, where each statement is kind of like a heading and each block is an indented "idea," so to speak.

In [104]:
if 2 < 1 and 1 > 0:
    print('Not both!')
elif 1 > 0:
    print('Just one.')
else:
    print('Everything else')

Just one.


#### Loops

The syntax for a `while` loop should look relatively similar to that of Java. The differences lie in the same places as for the `if` statements. A `while` loop will iterate "while" its condition remains `True`.

In [105]:
n = 5

while n > 0:
    print(n)
    n -= 1

5
4
3
2
1


`for` loops may look a little unfamiliar at first, but there is good reason for this. For loops in Python are designed to have greater functionality without sacrificing readability. Below is an example of how you would iterate 5 times.

In [106]:
for i in range(5):
    print(i)

0
1
2
3
4


You can also easily iterate through lists in the same way, since lists are also `iterables`.

In [107]:
a_list = ['Hello', 'Machine', 'Learning', '!!!']

for word in a_list:
    print(word)

Hello
Machine
Learning
!!!


In an earlier example, we looped over a `range`, which is "list-like". The `range` function returns an `iterable` that runs from an _optional_ starting point (`0` by default) up to, but not including, an end point.

In [108]:
range(5)

range(0, 5)
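Evaluating a `range` on its own does not show its values, so it can help to wrap it in `list` when experimenting. The sketch below (variable names are illustrative) also shows the optional starting point and an optional third argument, the step:

```python
up_to_five = list(range(5))          # start defaults to 0; the end point is excluded
two_to_five = list(range(2, 6))      # explicit start and end
countdown = list(range(10, 0, -3))   # a negative step counts downward

print(up_to_five)   # [0, 1, 2, 3, 4]
print(two_to_five)  # [2, 3, 4, 5]
print(countdown)    # [10, 7, 4, 1]
```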

**Try this!** Create a for loop to print out every even number in the interval `[1, 50]` (inclusive). Hint: create an appropriate `iterable` first.

In [109]:
# your code here 
for i in range(1, 51):
    if i % 2 == 0:
        print(i)
        
# or
for i in range(2, 51, 2):
    print(i)
    
# or
lst = list(range(1, 51))
for i in lst[1::2]:
    print(i)

2
4
6
8
10
12
14
16
18
20
22
24
26
28
30
32
34
36
38
40
42
44
46
48
50
2
4
6
8
10
12
14
16
18
20
22
24
26
28
30
32
34
36
38
40
42
44
46
48
50
2
4
6
8
10
12
14
16
18
20
22
24
26
28
30
32
34
36
38
40
42
44
46
48
50


#### List Comprehensions

A very useful feature of Python lists is called **comprehension** notation. It allows us to use for loops to create lists! This notation is very similar to set-builder notation from math and allows us to succinctly create lists from other lists.

In [110]:
squares = [x * x for x in range(5)]
squares

[0, 1, 4, 9, 16]

In [111]:
squares_plus = [x + 1 for x in squares]
squares_plus

[1, 2, 5, 10, 17]
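Just as set-builder notation can include a condition, a comprehension can end with an `if` clause that filters which elements are used. A minimal sketch, with illustrative names:

```python
numbers = range(10)

# Keep only the even numbers, then square each one.
even_squares = [n * n for n in numbers if n % 2 == 0]

print(even_squares)  # [0, 4, 16, 36, 64]
```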

### Functions

Functions in Python, and in general, can be thought of as a generalized method in Java. Where methods operate on and are attached to classes, functions are not. However, both constructs facilitate "abstraction" in our code. Abstraction is usually defined as a process by which we hide away all the little details about something in order to focus on how the thing interacts. Practically in CS this means:

> [Abstraction's] main goal is to handle complexity by hiding unnecessary details from the user. -[Stackify](https://stackify.com/oop-concept-abstraction/?utm_referrer=https%3A%2F%2Fwww.google.com%2F)

This is what allows us to think about driving a car without actually thinking about all the complex details of what happens when we press the gas pedal. In the same way, functional abstraction allows us to think about functions as a collection of code that, given a particular input, will return a particular output.

A very simple example might be a sum function, which computes the sum of a `list` of values.

In [112]:
sum([1, 2, 3, 4, 5])

15

With this function, I can retrieve the sum of a group of values without ever having to know _how_ the sum is actually computed. Another benefit of grouping code into functions is that it makes the code easy to test — just call the function with some inputs and see if the right output is returned. The last benefit I will mention here is that a function is a way of separating concerns and ideas. When you write a `sum` function, all you are responsible for is the correctness and efficiency of the computation, nothing else. Also, when writing a function, you are putting code that does a particular computation into a logical grouping, much like how in writing you would group similar ideas into a paragraph.

#### Syntax

The syntax of defining a function in Python is quite simple. You let the interpreter know that you want to define a function using the `def` statement. Then, you follow that with your function's name and the arguments. There is no need to specify the types that the function accepts as Python will figure that out itself. If it cannot, then it will let you know in the form of an error.

In [113]:
def my_function(arg1, arg2):
    output = arg1 + arg2
    return output

Notice that, again, much like the `if` statements and loops, function definitions end with a `:` and the contents are indented.

Some will miss the safety of static typing in Java (others won't). In exchange for giving that up, you avoid writing one version of a function for each type of input (e.g. `len(int[])` and `len(char[])`): a single definition works for every type that supports the operations it uses.
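To make this concrete, here is a sketch of one function being reused across several types. The name `add_values` is illustrative; its body is the same as `my_function` above. When the types cannot be combined, Python reports the problem as a `TypeError` at the moment the function is called.

```python
def add_values(arg1, arg2):
    # Works for any pair of values that support the + operator.
    return arg1 + arg2

int_result = add_values(1, 2)                 # numbers are added
str_result = add_values('data ', 'science')   # strings are concatenated
list_result = add_values([1, 2], [3])         # lists are concatenated

print(int_result, str_result, list_result)

try:
    add_values(1, 'one')  # an int and a str cannot be added
except TypeError as error:
    print('TypeError:', error)
```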

#### More Examples

In [114]:
def mean(values):
    return sum(values) / len(values)

Notice that the `len` function returns the length of any list.
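As a quick check, the sketch below repeats the definition of `mean` so that it runs on its own, then calls it. It also shows that `len` is not limited to lists; it works on strings as well.

```python
def mean(values):
    return sum(values) / len(values)

list_length = len([10, 20, 30])  # number of elements in a list
text_length = len('hello')       # number of characters in a string
average = mean([1, 2, 3, 4])     # (1 + 2 + 3 + 4) / 4

print(list_length, text_length, average)  # prints: 3 5 2.5
```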

#### Lambdas

Lambdas are a special type of function called anonymous functions. Whereas with normal functions you must name the function in the definition, you do not have to name lambda functions.

```
def named_function():
    pass
```

Lambdas are treated as expressions that evaluate to functions exactly like how `'hello'` is evaluated to a string. This means that you can store lambdas in variables, but this is _widely_ considered bad practice as a full function definition is much more readable.

```
lambda x, y: x + y
```

Here, the `lambda` keyword tells Python that we want to make a lambda function. The `x` and `y` before the colon denote the arguments of the function. The expression after the colon represents the logic of the function. In this case, we could call this function `add` since it takes two values and adds them.

Again, it is very bad practice to assign lambdas to variables. This might make you think that they are useless, but I assure you that they are not. The most common use case of lambdas is when another function takes a function as an argument. An example of this is the `max` function.

I'm sure everyone is familiar with the `max` function, but in case you are not, this function returns the largest element in a list. Typically you would simply call the `max` function with `some_list` and get the largest element.

Try evaluating this next cell.

In [115]:
some_list = [1, 2, 3, 4]
max(some_list)

4

As expected, the largest value in `some_list` is `4`. However, there are cases where you want to get the max element of a list but the elements have complex structure. For example, consider a class roster, which is a list of students. In this situation, let us represent each student as a `list` containing their name, age, and graduation year.

In [116]:
roster = [
    ['Billy',  21, 2021],
    ['Meghan', 18, 2020],
    ['Jeff',   21, 2019],
    ['Alex',   50, 2021],
    ['Cate',   21, 2020]
]

We want to find the oldest student, `Alex`, in this group of students. How can we do that? Let's try directly calling `max` with this `roster` and see what we get.

In [117]:
max(roster)

['Meghan', 18, 2020]

The `max` function returned `['Meghan', 18, 2020]`, which isn't what we are looking for.

> **Note**: this result makes sense because `max` sees a list of lists and compares them element by element, starting with the first element of each (the name). In this case, `Meghan` starts with `M`, which comes later in the alphabet than the rest of the students' initials.

In order to get the oldest student, we will need to show the `max` function which values we want it to compare. To do this we will use a `key`, which is a function. Instead of writing an entire function for this, we can just pass in a lambda that, when called with a student, will return the age of that student.

In [118]:
max(roster, key=lambda student: student[1])

['Alex', 50, 2021]

Here we see that `'Alex'`, who is 50, was returned as the oldest student.
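For comparison, the sketch below passes a named function as the `key` and shows that it gives the same result as the lambda. The names `demo_roster` and `get_age` are illustrative and not part of the homework. The same `key` argument also works with `min`.

```python
demo_roster = [
    ['Billy',  21, 2021],
    ['Meghan', 18, 2020],
    ['Alex',   50, 2021],
]

def get_age(student):
    # Each student is [name, age, graduation year]; the age is at index 1.
    return student[1]

oldest_named = max(demo_roster, key=get_age)
oldest_lambda = max(demo_roster, key=lambda student: student[1])
youngest = min(demo_roster, key=lambda student: student[1])

print(oldest_named)   # ['Alex', 50, 2021]
print(youngest)       # ['Meghan', 18, 2020]
```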

### Other Python Topics

A complete discussion of Python would include topics such as [dictionaries](https://www.programiz.com/python-programming/dictionary), [iterators](https://www.programiz.com/python-programming/iterator), and [classes](https://www.programiz.com/python-programming/object-oriented-programming), among other important topics. However, this has already been a lot of information and we will not run into these additional topics in the first few homeworks. Because of this, we will opt to introduce these other topics as they appear throughout this course.

## 4. NumPy

You can put just about anything into a Python list. They are designed to be completely agnostic to types, making them just about as flexible as a data structure can get. You want to store a string, an int, and another list? No problem. However, this versatility doesn’t come for free. In exchange for quality of life, we must give up some degree of computational efficiency — though probably not enough to tip the scales against lists in most use cases.

One case, however, where lists are not ideal is mathematical computation. Here, we don't need the flexibility that lists give us since we know upfront that we are only dealing with _numbers_ and _how many_ of these numbers we have (e.g. the dimensionality of a column vector is fixed for a particular problem). This leads us to seek an alternative data structure that is optimized for these constraints (i.e. known type and shape): the **array**.

These types of math-specialized arrays are not provided by Python itself. Instead, they can be found in the `numpy` package. Here we import `numpy` with the alias `np`. This is for the sake of both convention and convenience; $2 < 5$.

In [119]:
import numpy as np

### Creating Arrays

Arrays can be created from many "list-like" objects by calling `np.array` on the object.

In [120]:
some_list = range(10)
np.array(some_list)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

You may also create multi-dimensional arrays in this way. (NumPy's array type is called `ndarray`, short for "N-dimensional array".)

In [121]:
some_ndlist = [
    [1, 2, 3],
    [4, 5, 6]
]
np.array(some_ndlist)

array([[1, 2, 3],
       [4, 5, 6]])
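Every array keeps track of the shape and element type mentioned earlier. A small sketch (the name `grid` is illustrative) that inspects them:

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(grid.shape)  # (2, 3): two rows and three columns
print(grid.ndim)   # 2: the number of dimensions
print(grid.dtype)  # an integer type; the exact width depends on the platform
```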

#### Zeros

Sometimes you just need a _uniform_ array of some value. Here are some examples of how you can make these.

In [122]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

In the following example, we pass in a `tuple` (kind of like a list) with the shape of the array we want, i.e. `(5, 3)`. _**The resulting array has two dimensions (rows and columns)**_.

In [123]:
np.zeros((5, 3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

**Try this!** Create a three dimensional array of shape $2 \times 3 \times 4$ with zero entries.  

In [124]:
# your code here 
np.zeros((2, 3, 4))

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

This looks pretty complicated. Luckily, in this course we will mostly use one- and two-dimensional arrays, as they can represent _vectors_ and _matrices_.

#### Ones

In [125]:
np.ones(5)

array([1., 1., 1., 1., 1.])

If you need an array of `5`s, make an array of `ones` of the desired shape and multiply it by `5`.

In [126]:
np.ones(5) * 5

array([5., 5., 5., 5., 5.])
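As an alternative sketch, NumPy also provides `np.full`, which takes a shape and a fill value and builds the uniform array in one step:

```python
import numpy as np

fives = np.full(5, 5.0)  # shape 5, every entry equal to 5.0
matches_ones_trick = np.array_equal(fives, np.ones(5) * 5)

print(fives)               # [5. 5. 5. 5. 5.]
print(matches_ones_trick)  # True
```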

#### Evenly Spaced Sequences

You can also get an array of evenly spaced numbers over a specified interval using `np.linspace`.

```
np.linspace(start, stop, num)
```

In [127]:
np.linspace(0, 10, 20)

array([ 0.        ,  0.52631579,  1.05263158,  1.57894737,  2.10526316,
        2.63157895,  3.15789474,  3.68421053,  4.21052632,  4.73684211,
        5.26315789,  5.78947368,  6.31578947,  6.84210526,  7.36842105,
        7.89473684,  8.42105263,  8.94736842,  9.47368421, 10.        ])
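Since both end points are included by default, 20 points leave 19 gaps, so the spacing here is $(10 - 0) / (20 - 1) \approx 0.526$. The sketch below checks this with `np.diff`, which returns the differences between neighboring entries (variable names are illustrative):

```python
import numpy as np

points = np.linspace(0, 10, 20)

expected_step = (10 - 0) / (20 - 1)  # 20 points leave 19 gaps
gaps = np.diff(points)               # differences between neighboring points

print(expected_step)                     # roughly 0.526
print(np.allclose(gaps, expected_step))  # True
```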

### Thinking of Arrays as Vectors

One of the main benefits of using arrays rather than lists is _vectorized_ operations, which allow us to treat an entire array as a unit and operate at the array level; we don't have to concern ourselves with each individual number.
> [🐍 **Python Feature** 🐍]: With NumPy, the syntax stays amazingly simple: we can write `a+b` or `a*b` to compute the sum or product of two _scalars_ or two (same-sized) _vectors_ using exactly the same syntax.

#### Scalars:

In [128]:
a = 5
b = 1

a + b

6

In [129]:
a * b

5

#### Vectors:
Below is an example of "vectorized" addition and "vectorized" multiplication.

In [130]:
a = np.ones(3)             # a = [1, 1, 1]
b = np.array([1, 2, 3])    # b = [1, 2, 3]

a + b

array([2., 3., 4.])

In [131]:
a * b

array([1., 2., 3.])

Note that the multiplication above is carried out "element-wise". The $i$-th element of the resulting vector is computed as: $$a_i b_i \quad i=1, 2, \ldots, d$$
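This element-wise behavior is specific to arrays. As a small sketch for contrast (variable names are illustrative), the same `+` operator concatenates plain Python lists but adds arrays entry by entry:

```python
import numpy as np

list_sum = [1, 2, 3] + [10, 20, 30]                       # lists: concatenation
array_sum = np.array([1, 2, 3]) + np.array([10, 20, 30])  # arrays: element-wise sum

print(list_sum)   # [1, 2, 3, 10, 20, 30]
print(array_sum)  # [11 22 33]
```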

#### Dot Product
In many cases, _vectorized_ operations are simply _element-wise operations_ as we have seen for addition and multiplication above. 

However, we need to pay special attention to products. A product between two vectors can be defined in multiple ways. Above we have seen _element-wise_ multiplication, but most often when we deal with vectors, especially in data science and machine learning, we actually want to compute the _dot product_, or _inner product_ of two $d$-dimensional vectors (mathspeak for $d$-sized arrays). 

The dot product between two vectors $\textbf{a}$ and $\textbf{b}$ is defined as 
$$\textbf{a} \cdot \textbf{b} = \textbf{a}^T\textbf{b} = \sum_{i=1}^{d} a_i b_i.$$ 
This translates to "_multiply each element of $\textbf{a}$ by the element at the same index in $\textbf{b}$ and sum the products._" Note that the *T* stands for transpose (more info [here](https://en.wikipedia.org/wiki/Row_and_column_vectors)).

A Python function for this might look like:

In [132]:
def dot_inefficient(a, b):
    '''computes the dot product of A and B using Python built-in functions'''
    products = []
    
    for i in range(len(a)):
        p = a[i] * b[i]
        products.append(p)
        
    return sum(products)

In [133]:
dot_inefficient(a, b)                  # 1*1 + 1*2 + 1*3 = 6

6.0

#### Loops
Notice the `for` loop that is required. **Using Python loops to perform vector operations is inefficient!** Note that even simple "element-wise" products `a*b`, sums `a+b`, etc. would require the same kind of loop if written in pure Python.

Looping itself in Python is not necessarily slow, but given the constraints of this context (recall, we are only dealing with numbers and we know how many of them we have) we can do better by leveraging "super" fast C libraries to do this for us.

> **For the curious**: In this case, we use Python and NumPy as an interface for highly optimized C routines.

**Try this!** Now that you know more about what you can do with arrays, let's try to revise our `dot` function to use NumPy computations.

In [134]:
def dot_revised(a, b):
    '''computes the dot product of A and B using NumPy computations'''
    # your code here 
    # element-wise product followed by a sum (equivalent to np.dot(a, b))
    solution = np.sum(a * b)

    return solution

In [135]:
dot_revised(a, b)                  # 1*1 + 1*2 + 1*3 = 6

6.0

Or even better: 

In [136]:
np.dot(a,b)

6.0
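To get a feel for the difference in speed, you can time a loop-based dot product against `np.dot` with the standard library's `timeit` module. This is only a rough sketch: the function name `dot_loop`, the vector size, and the number of repetitions are arbitrary choices, and the measured times will vary from machine to machine.

```python
import timeit

import numpy as np

def dot_loop(u, v):
    # Pure-Python dot product: one multiplication and one addition per index.
    total = 0.0
    for i in range(len(u)):
        total += u[i] * v[i]
    return total

rng = np.random.default_rng(0)  # seeded so the vectors are reproducible
u = rng.random(10_000)
v = rng.random(10_000)

# Run each version 20 times and record the total elapsed time in seconds.
loop_time = timeit.timeit(lambda: dot_loop(u, v), number=20)
numpy_time = timeit.timeit(lambda: np.dot(u, v), number=20)

print(f"loop: {loop_time:.4f} s, NumPy: {numpy_time:.4f} s")
```

Both versions compute the same value (up to floating-point rounding); only the time taken differs.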

#### Array Operations
There are many other vectorized array operations, such as `np.min` and `np.sqrt`, that also take advantage of the `array` data structure to compute results very quickly. An extensive list of mathematical operations provided by NumPy can be found in [its documentation](https://docs.scipy.org/doc/numpy-1.14.0/reference/routines.math.html).

Furthermore, some computations can be found as methods of arrays themselves. These include `array.min()`, `array.mean()`, etc. A list of these can be found [here](https://docs.scipy.org/doc/numpy-1.14.0/reference/arrays.ndarray.html#calculation).
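A short sketch of both styles on an illustrative array (`values` is not part of the homework):

```python
import numpy as np

values = np.array([1.0, 4.0, 9.0, 16.0])

roots = np.sqrt(values)   # function style: applied to every element
smallest = values.min()   # method style: same result as np.min(values)
average = values.mean()   # (1 + 4 + 9 + 16) / 4

print(roots)              # [1. 2. 3. 4.]
print(smallest, average)  # 1.0 7.5
```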

### Arrays as Matrices

Arrays are considered to be very generic. They can be used to represent vectors, often as points in Euclidean space, but they can also just represent a collection of numbers you would like to do math with. Moving from $1$ dimension to $2$ dimensions, an array could represent both a collection of vectors aligned side-by-side or a matrix in a more traditional mathematical sense. We won't spend much time on the matrix sense as most of the applications of _ndarrays_ we will see in this class are better described as a collection of vectors.

An example of this could be a collection of vectors representing the prices of several stocks and their "earnings per share," which together are used in finance to compute a price-earnings ratio, or more commonly, a [P/E ratio](https://www.investopedia.com/university/peratio/peratio1.asp).

$$
\begin{bmatrix}
\text{prices} \\
\text{earnings per share}
\end{bmatrix}
=
\begin{bmatrix}
175 & 150 & 180 \\
1.60 & 1.03 & 2.00
\end{bmatrix}
$$

In NumPy, this could be represented by a 2D array as follows.

In [137]:
prices = [175, 150, 180]
earnings = [1.60, 1.03, 2.00]

data = np.array([prices, earnings])
data

array([[175.  , 150.  , 180.  ],
       [  1.6 ,   1.03,   2.  ]])

Here we have represented our data, $\mathcal{D}$, as a collection of three column vectors, each representing one observation. With this data, we can calculate the P/E ratios of each of these stocks.

$$\text{P/E Ratio}\; = \frac{\text{Price per Share}}{\text{Earnings per Share}}$$

In [138]:
def pe_ratio(data):
    prices = data[0, :]    # all values from the first row (index 0)
    earnings = data[1, :]  # all values from the second row (index 1)
    
    return prices / earnings

In [139]:
print("P/E ratios are:", pe_ratio(data))

P/E ratios are: [109.375      145.63106796  90.        ]


> **Notice:**  we did **not** need to _manually_ loop through the elements. 😀

### Arrays have Axes

NumPy arrays have axes that correspond to the order of numbers supplied when indexing.

![axes](utility/pics/elsp_0105.png)

It makes sense to consider axes for many operations. For example, the `min` method on arrays defaults to returning the single smallest value in the entire array. However, there are cases where you want the minimum of each column or of each row instead. For these cases, you can specify which axis `min` should collapse: `axis=0` gives one minimum per column and `axis=1` gives one minimum per row.

To demonstrate, we will use our sample stock dataset from the P/E ratio example. Here is a reminder of what the data looked like.

In [140]:
data

array([[175.  , 150.  , 180.  ],
       [  1.6 ,   1.03,   2.  ]])

If we wanted to find the minimum value in `data` we would simply use the `min` method on the array as is done in the next cell.

In [141]:
data.min()

1.03

However, oftentimes we would like to find the minimum value in a particular row or column. Depending on the data you are representing with the array, this might mean finding the minimum stock price and the minimum earnings per share, as with our P/E ratio example.

To find the minimum price and the minimum earnings (one value per row), we specify that `min` should operate along axis 1.

In [142]:
data.min(axis=1)

array([150.  ,   1.03])

Many vectorized operations that manipulate arrays can take an axis argument, allowing you more flexibility.
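As a side-by-side sketch, here are the three variants of `min` on a copy of the stock data (the name `table` is illustrative, so the notebook's `data` is left untouched):

```python
import numpy as np

table = np.array([[175, 150, 180],
                  [1.60, 1.03, 2.00]])

overall_min = table.min()       # no axis: smallest value in the whole array
column_min = table.min(axis=0)  # one value per column: 1.6, 1.03, 2.0
row_min = table.min(axis=1)     # one value per row: 150.0, 1.03

print(overall_min)
print(column_min)
print(row_min)
```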

Now, using the power of arrays, let's reattempt our goal from above to find the oldest student, Alex, in the group of students. Note that we do not want strings in our numpy arrays, so we will put the names into a list and the numeric entries into a two dimensional array (matrix). 

In [143]:
names = ['Billy','Meghan','Jeff', 'Alex','Cate']
roster = [
    [21, 2021],
    [18, 2020],
    [21, 2019],
    [50, 2021],
    [21, 2020]
]
R = np.array(roster)
R

array([[  21, 2021],
       [  18, 2020],
       [  21, 2019],
       [  50, 2021],
       [  21, 2020]])

**Try this!** Get the age and graduation year of the oldest person in our `roster` with Python built-in functions.

In [144]:
# your code here 
max_age = max(R[:,0])
grad_year = R[R[:,0].argmax(), 1]
max_age, grad_year

(50, 2021)

The array approach to doing this would be to identify the row that contains the greatest age. First, we can find the [`argmax`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.argmax.html), or the "arg with the max value" of the first column in `R` — this will return the index of the row containing the max. Then, we can simply retrieve the value in `names` at that index.

This comes together in the following cell.

In [145]:
names[R[:,0].argmax()]

'Alex'
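The one-liner above packs several steps together. The sketch below unpacks them using copies of the data (`demo_names` and `demo_R` are illustrative names). It also shows that when several entries tie for the maximum, `argmax` returns the index of the first one.

```python
import numpy as np

demo_names = ['Billy', 'Meghan', 'Jeff', 'Alex', 'Cate']
demo_R = np.array([[21, 2021], [18, 2020], [21, 2019], [50, 2021], [21, 2020]])

ages = demo_R[:, 0]                      # first column: every student's age
oldest_index = ages.argmax()             # index of the row with the largest age
oldest_name = demo_names[oldest_index]   # look up the name at that index

print(oldest_index, oldest_name)  # prints: 3 Alex

# With ties, argmax reports the first occurrence of the maximum.
tie_index = np.array([21, 50, 50]).argmax()
print(tie_index)  # prints: 1
```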

**Try this!** Adapt your code to reveal the name of the youngest person. You can use the reference on [array functions](https://docs.scipy.org/doc/numpy-1.14.0/reference/arrays.ndarray.html#calculation) linked above.

In [146]:
# your code here 
names[R[:,0].argmin()]

'Meghan'