# Performance statistics

You work at a fashion company with 100 employees. You want to start tracking the effectiveness of your tailors, so you decide to record their performance for January 2018.

To the right, you can see an example sheet of the performance metrics for a tailor called Vivienne V. Versace. There are some tables with the employee and product information and a performance table.

Finally, there's a bigger table, which contains the performance metrics:

- Finished: the number of products finished that day
- Output: the combined value of those finished products
- Cost: the cost to produce those products
- Net: the difference between output and cost
- Performance: the performance of the employee: bad, acceptable, or good

- See 02.01

# Flow control - IF

Throughout the previous chapters, there have been several occasions where you had to use logical values: TRUE or FALSE. Now here's where they get really useful: flow control functions.

These are functions that take a logical value as one of their arguments and evaluate according to that value. One is especially important in this exercise:

- IF(logical_expression, value_if_true, value_if_false): return value_if_true when logical_expression evaluates to TRUE, and value_if_false otherwise.
    - For example, = IF(TRUE, 2, -2) would evaluate to 2.

- See 02.02

# Nested logical functions - IF

We've merged the cost you calculated in the previous exercise into one column, G, to save some space. Now it's time to dig deeper: you're going to use nested IF functions.

To understand this, you can think of IF functions as parts of a decision tree. At each split in the tree, you follow a path depending on the value of a logical expression. If the expression is TRUE, you follow one branch; if it is FALSE, you follow the other. When you nest IF functions, you're just following along the branches of the decision tree.

<center><img src="images/02.03.png"  style="width: 400px; height: 300px;"/></center>

This image illustrates a decision tree: if Net is smaller than 0, it evaluates to "bad"; if Net is bigger than 150, it evaluates to "good"; and if Net is in between, it evaluates to "acceptable".
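For readers who find code easier to follow than a tree, the same decision logic can be sketched in Python. This is only an illustration of the branching, not something you type into Sheets; the function name `performance` is hypothetical, and the thresholds 0 and 150 come from the image description above.

```python
def performance(net):
    """Follow the decision tree: the first test that passes decides the result."""
    if net < 0:          # outer IF: is Net smaller than 0?
        return "bad"
    if net > 150:        # nested IF: is Net bigger than 150?
        return "good"
    return "acceptable"  # neither test passed, so Net is in between


print(performance(-20))  # bad
print(performance(80))   # acceptable
print(performance(200))  # good
```

In a spreadsheet, the second test would sit inside the value_if_false argument of the first IF, which is what "nesting" means here.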

- See 02.03

# Combining logical values - OR, WEEKDAY

It's time to revise a column you previously left untouched: weekend. The value is either TRUE or FALSE. Right now, all the values are filled in manually; in other words, they are hard-coded. We can do better using the following functions:

- OR(logical_expression1, [logical_expression2, ...]): the logical operator that returns TRUE if at least one of the expressions is TRUE, and FALSE only if all of them are FALSE.

    - For example, you can check whether a cell (e.g. A2) is equal to 21 or 22 with the following formula: =OR(A2 = 21, A2 = 22).

- WEEKDAY(date, [type]): evaluates to the day of the week of a date, as a number. type is 1, 2 or 3:

    - type = 1: Sunday is day 1 and Saturday is day 7 (default)
    - type = 2: Monday is day 1 and Sunday is day 7
    - type = 3: Monday is day 0 and Sunday is day 6
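As a rough illustration of how the three numbering schemes relate, and of how OR can combine two WEEKDAY checks into a weekend test, here is a Python sketch. The helper names `weekday` and `is_weekend` are hypothetical stand-ins that only mimic the Sheets functions described above.

```python
from datetime import date


def weekday(d, day_type=1):
    """Mimic WEEKDAY(date, [type]) for type 1, 2 or 3."""
    iso = d.isoweekday()      # Python's own scheme: Monday=1 ... Sunday=7
    if day_type == 1:
        return iso % 7 + 1    # Sunday=1 ... Saturday=7 (the default)
    if day_type == 2:
        return iso            # Monday=1 ... Sunday=7
    if day_type == 3:
        return iso - 1        # Monday=0 ... Sunday=6
    raise ValueError("type must be 1, 2 or 3")


def is_weekend(d):
    """Mimic =OR(WEEKDAY(d) = 1, WEEKDAY(d) = 7) with the default type."""
    day = weekday(d)
    return day == 1 or day == 7


print(is_weekend(date(2018, 1, 6)))  # True, a Saturday
print(is_weekend(date(2018, 1, 8)))  # False, a Monday
```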

- See 02.04

# Conditional counting - COUNTIF

Finally, you'll need to fill in the frequency table in I8:I10. You can do this by using the following function in Google Sheets:

- COUNTIF(range, criterion): count the number of times the criterion is met in the specified range.

    - range: the source data that is used. Typically, you'll need to use an absolute reference for this one.
    - criterion: a pattern to check for. It can be as simple as a string you want to match on, for example "good". You'll see more complex criteria in later exercises.
    - For example: if A1:A3 contains "good", "bad", "bad", then = COUNTIF(A1:A3, "bad") evaluates to 2.
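The counting logic can be sketched in a few lines of Python. The function `countif` below is a hypothetical stand-in that handles only the simple case described here, an equality match on a string; it does not cover the more complex criteria from later exercises.

```python
def countif(values, criterion):
    """Count how many entries in values are equal to criterion."""
    return sum(1 for value in values if value == criterion)


ratings = ["good", "bad", "bad"]   # plays the role of A1:A3
print(countif(ratings, "bad"))     # 2
```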

- See 02.05

# Conditional aggregation - COUNTIF

Let's dive a bit deeper into the world of conditional aggregation functions. These functions, such as COUNTIF, can be used to calculate summary statistics for each category of data.

The data you'll be working with is a set of payments for dinner, gas, and drinks that four friends made to you in 2017.

A refresher:

- COUNTIF(range, criterion): count the number of times the criterion is met in the specified range.
    - range: the source data that is used. Typically, you'll need to use an absolute reference for this one.
    - criterion: a pattern to check for. It can be as simple as a string you want to match on, e.g. "Dylan". You'll see more complex criteria in later exercises.
    - For example, if A1:A3 holds "Arun", "Dylan", "Dylan", then = COUNTIF(A1:A3, "Dylan") evaluates to 2.

- See 02.06

# Conditional sum - SUMIF

Things get a bit more complex when the range to check the criterion on is not the same as the range we want the statistics for.

This can be the case for SUMIF:

- SUMIF(range, criterion, sum_range): evaluates to the conditional sum across a range.
    - range: the range on which the criterion will be checked
    - criterion: the pattern that will be checked, e.g. "Dylan"
    - sum_range: the range of values that will be summed up
    - For example, if A1:A3 holds "Arun", "Dylan", "Dylan" and B1:B3 holds 3, 4, 8, then = SUMIF(A1:A3, "Dylan", B1:B3) evaluates to 12.
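To make the role of the two ranges concrete, here is a minimal Python sketch of the same idea. The function `sumif` is a hypothetical stand-in, limited to equality criteria: the criterion is checked on one list, and the matching positions of the other list are summed.

```python
def sumif(criteria_range, criterion, sum_range):
    """Sum the entries of sum_range whose partner in criteria_range matches."""
    return sum(
        amount
        for key, amount in zip(criteria_range, sum_range)
        if key == criterion
    )


names = ["Arun", "Dylan", "Dylan"]     # plays the role of A1:A3
amounts = [3, 4, 8]                    # plays the role of B1:B3
print(sumif(names, "Dylan", amounts))  # 12
```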

- See 02.07

# Conditional average - AVERAGEIF

Another interesting statistic you can calculate per category is the average.

You're going to be using AVERAGEIF:

- AVERAGEIF(range, criterion, average_range): evaluates to the conditional average across a range.
    - range: the range on which the criterion will be checked
    - criterion: the pattern that will be checked, e.g. "Dylan"
    - average_range: the range of values that will be averaged
    - For example, if A1:A3 holds "Arun", "Dylan", "Dylan" and B1:B3 holds 3, 4, 8, then = AVERAGEIF(A1:A3, "Dylan", B1:B3) evaluates to 6.
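A conditional average follows the same pattern as a conditional sum, with a division by the number of matches at the end. The sketch below is a hypothetical Python stand-in for the behavior described above, limited to equality criteria.

```python
def averageif(criteria_range, criterion, average_range):
    """Average the entries of average_range whose partner matches criterion."""
    matches = [
        amount
        for key, amount in zip(criteria_range, average_range)
        if key == criterion
    ]
    if not matches:
        # Nothing to average; this sketch raises instead of returning a number.
        raise ValueError("no entries match the criterion")
    return sum(matches) / len(matches)


names = ["Arun", "Dylan", "Dylan"]         # plays the role of A1:A3
amounts = [3, 4, 8]                        # plays the role of B1:B3
print(averageif(names, "Dylan", amounts))  # 6.0
```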

- See 02.08

# Advanced conditions - AVERAGEIF

Up until now, the condition was always an equality check on certain ranges. However, as specified in the documentation, the criterion argument can be a bit more advanced:

- Equals: 1 or "= 1"
- Greater than: "> 1"
- Greater than or equal to: ">= 1"
- Less than: "< 1"
- Less than or equal to: "<= 1"
- Not equal to: "<> 1"

Note that these are very similar to the comparison operators you saw earlier.

As a refresher, here's the signature of the AVERAGEIF function:

- AVERAGEIF(range, criterion, average_range): evaluates to the conditional average across a range.
    - range: the range on which the criterion will be checked
    - criterion: the pattern that will be checked, e.g. "Dylan"
    - average_range: the range of values that will be averaged
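One way to picture what happens with such a criterion is to split the string into an operator and a number, then test each value against it. The Python sketch below does this under simplifying assumptions: `parse_criterion` and `averageif` are hypothetical helpers, only numeric comparisons are handled, and this is not a description of how Sheets works internally.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "<>": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}


def parse_criterion(criterion):
    """Turn a criterion such as "> 1" into a function that tests one value."""
    if not isinstance(criterion, str):
        return lambda value: value == criterion      # a bare number: equality
    text = criterion.strip()
    for symbol, compare in OPERATORS.items():
        if text.startswith(symbol):
            threshold = float(text[len(symbol):])    # numeric criteria only
            return lambda value, c=compare, t=threshold: c(value, t)
    return lambda value: value == text               # a plain string: equality


def averageif(criteria_range, criterion, average_range):
    """Average the entries whose partner in criteria_range passes the test."""
    test = parse_criterion(criterion)
    picked = [v for key, v in zip(criteria_range, average_range) if test(key)]
    return sum(picked) / len(picked)


amounts = [3, 4, 8]
print(averageif(amounts, "> 3", amounts))   # 6.0
print(averageif(amounts, "<> 4", amounts))  # 5.5
```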

- See 02.09

# Filters - FILTER, DATEVALUE, MEDIAN

Finally, you'll have to find the conditional median on a range. However, there's no such function as MEDIANIF, so you'll have to find a way to generalize what you've learned previously.

You can do so using a filter. A filter takes a range, applies a condition to each of its values, and evaluates to the values for which the condition holds. Specifically, you'll be using the following:

- FILTER(range, condition1, [condition2, ...]): evaluates to a filtered version of range, based on the passed conditions. condition1 here is substantially different from the criterion argument you're used to: it is not a string, but a range of logical values, for example A1:A5 > 5.

    - For example, to calculate the average amount spent on dinners, you could use the following formula: =AVERAGE(FILTER(D3:D26, E3:E26 = "Dinner")). Here, the range of amounts spent (D3:D26) is filtered down to the rows where the corresponding cell in E3:E26 equals "Dinner". The average is then taken over this filtered range.

- DATEVALUE(date_string): converts a date written as a string into a date value that can be compared with other dates.
- MEDIAN(value1, [value2, ...]): evaluates to the median of the given values or ranges.
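The filter-then-aggregate idea can be sketched in Python as follows. The helper `filter_range` is a hypothetical stand-in for FILTER with a single condition, and the amounts and categories are made-up sample data, not the values from the exercise sheet.

```python
from statistics import median


def filter_range(values, conditions):
    """Keep the values whose matching entry in conditions is True."""
    return [value for value, keep in zip(values, conditions) if keep]


# Made-up sample data for illustration only.
amounts = [20, 35, 12, 50, 8]
categories = ["Dinner", "Gas", "Dinner", "Dinner", "Drinks"]

# The condition is a list of logical values, one per row,
# much like E3:E26 = "Dinner" in the formula above.
is_dinner = [category == "Dinner" for category in categories]

dinner_amounts = filter_range(amounts, is_dinner)
print(dinner_amounts)          # [20, 12, 50]
print(median(dinner_amounts))  # 20
```

Because the filter only produces a smaller range, any aggregation can be applied to its result, which is why no dedicated "median if" function is needed.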

- See 02.10

# Grades in class

You'll be learning about the concept of lookup in Google Sheets. Lookups work similarly to how you'd look up phone numbers in a phonebook.

You'll see some exact definitions later. First, let's have a look at the data. You'll be working with the grades that a university student achieved in her courses. She asked you to do some analysis and gave you three tables:

- The actual grades are at the top. These are given in GPA. There are some empty columns which you'll fill in throughout the exercises.
- In the middle, there's a table with some specifics on the courses.
- At the bottom, there's a lookup table to convert GPA to a letter grade.

- See 02.11

# Automating the lookup - VLOOKUP

Introducing VLOOKUP:

- `VLOOKUP(search_key, range, index, is_sorted)`: look for a match in the leftmost column of a lookup table and return the value in a certain column:
    - search_key: the value to search for
    - range: the lookup table, without the headers. You typically use an absolute reference for this.
    - index: the column number of the value to be returned, where the first column in range is numbered 1
    - is_sorted: set this to FALSE for now, which means an exact match is required
    - You can compare it to the process of looking through a phone book. The search_key would be the name of the person you want the phone number of. The range is the data in the book, with the names in the leftmost column. Finally, the index is the number of the column where you find what you need, the phone number.

- See 02.12

# More about lookup - VLOOKUP

Let's talk a bit more about VLOOKUP:

- `VLOOKUP(search_key, range, index, is_sorted)`
- The index argument is probably the most difficult to grasp. The confusing part here is that Google Sheets normally uses letters to define columns. However, index actually refers to the number of the column, relative to the range you defined. E.g. the leftmost column in the range would be defined with index 1, the third column would have index 3.

- For example, if A1:A2 holds "ML101", "CP101" and B1:B2 holds 3, 6, then =VLOOKUP("CP101", $A$1:$B$2, 2, FALSE) evaluates to 6.
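The exact-match lookup can be pictured as a scan down the leftmost column. The Python sketch below uses a hypothetical `vlookup` helper and the two-row table from the example above; it covers only the is_sorted = FALSE case.

```python
def vlookup(search_key, table, index):
    """Exact-match lookup, like VLOOKUP(search_key, range, index, FALSE)."""
    for row in table:
        if row[0] == search_key:   # compare against the leftmost column
            return row[index - 1]  # index is 1-based, Python lists are 0-based
    # Stands in for the #N/A error a spreadsheet shows when nothing matches.
    raise KeyError(search_key)


courses = [
    ["ML101", 3],
    ["CP101", 6],
]
print(vlookup("CP101", courses, 2))  # 6
```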

- See 02.13

# Horizontal lookup - HLOOKUP

Although it is far less common, it can be useful to do a lookup through horizontally organized data. Introducing HLOOKUP:

- HLOOKUP(search_key, range, index, is_sorted): similar to VLOOKUP but in a horizontal fashion. The key will be looked for in the uppermost row, and index now refers to the row number.
    - You're now going to use the last argument, is_sorted. If it is set to TRUE (the default), the function assumes that the values in the first row of range are sorted. In that case, the match doesn't have to be exact: HLOOKUP will look for the closest match less than or equal to search_key. If is_sorted is FALSE, an exact match is required.
    - For example, =HLOOKUP(0.57, $C$29:$H$30, 2, TRUE) would evaluate to E in the given spreadsheet, as the closest match less than or equal to 0.57 is 0.33.
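The "closest match less than or equal to" rule can be sketched with a binary search over the sorted first row. In the Python below, `hlookup_sorted` is a hypothetical helper and the table values are invented for illustration; only the pairing of 0.33 with E is taken from the example above.

```python
from bisect import bisect_right


def hlookup_sorted(search_key, table, index):
    """Approximate lookup, like HLOOKUP(search_key, range, index, TRUE)."""
    keys = table[0]                                # uppermost row, assumed sorted
    position = bisect_right(keys, search_key) - 1  # last key <= search_key
    if position < 0:
        raise KeyError(search_key)                 # search_key is below every key
    return table[index - 1][position]              # index is a 1-based row number


# Invented thresholds and letters, apart from 0.33 -> "E".
grade_table = [
    [0.0, 0.33, 0.67, 1.0],
    ["F", "E", "D", "C"],
]
print(hlookup_sorted(0.57, grade_table, 2))  # E
print(hlookup_sorted(0.67, grade_table, 2))  # D
```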

- See 02.14

# Weighted average - SUMPRODUCT, HLOOKUP

Next, you'll calculate an average GPA and grade. To do so, the following function might come in handy:

- SUMPRODUCT(array1, [array2, ...]): evaluates to the sum of the products of corresponding entries in two or more ranges of equal size.

    - E.g. SUMPRODUCT(A1:A3, B1:B3) evaluates to the result of (A1 * B1) + (A2 * B2) + (A3 * B3). In mathematics, this operation is called the dot product.
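A weighted average is a sum of products divided by the sum of the weights. The Python sketch below illustrates this with a hypothetical `sumproduct` helper; the grade points and credits are made-up numbers, not the data from the exercise.

```python
from math import prod


def sumproduct(*arrays):
    """Multiply corresponding entries of equally sized lists, then add up."""
    if len({len(array) for array in arrays}) != 1:
        raise ValueError("all ranges must have the same size")
    return sum(prod(entries) for entries in zip(*arrays))


# Made-up grade points and course credits for illustration only.
grade_points = [3.0, 4.0, 2.0]
credits = [2, 1, 1]

weighted_average = sumproduct(grade_points, credits) / sum(credits)
print(sumproduct(grade_points, credits))  # 12.0
print(weighted_average)                   # 3.0
```

Courses with more credits pull the average toward their grade, which is why the result here differs from a plain average of the three grade points.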

In addition, you will again need to use HLOOKUP to convert the average GPA to a grade:

- HLOOKUP(search_key, range, index, is_sorted)

- See 02.15