# Exercises

### Ex 1 - Getting started with running code in a notebook
First, let's go through some basic examples to explain how to edit and run notebook cells. In the Help menu you can find a list of keyboard shortcuts and a tour of the notebook user interface. You don't need to learn all that now, but do take a look later if you intend to use notebooks a lot.

#### Ex 1a - Moving between cells and running them

* When there's a blue box around a cell you can move up and down with arrow keys. Press `Enter` or click in the textbox to start editing the cell.
* When there's a green box around a cell you can edit the text in it. `Enter` inserts a new line, `Shift+Enter` runs the code in the cell.

Run the following cell by clicking in it then pressing `Shift+Enter`:

In [None]:
x = 7 * 6
print(x)

Try the next one as well, and see how the variable `x` defined in the previous cell is still available:

In [None]:
print(x)

Placing a variable alone on the last line displays it. We'll be using `print(x)` instead for the rest of the examples, since that can be used any place in the cell.

Try running this cell:

In [None]:
x = 3 * 1.2
x

Now go up to the `print(x)` cell above and run it again. Note that the values of variables are determined by the order in which you execute the cells, not the order of the cells in the notebook document. The execution order is shown as a number to the left of each cell, for example `In [4]`.

To clear all previously defined variables, use the menu `Kernel -> Restart & Clear output`.

#### Ex 1b - Debugging a faulty program

Programs are rarely correct on the first try. Here's an attempt at computing the total cost of a number of items.

* Try running the next cell. Is the printed total correct?
* Fix the code so the total is computed correctly and run again.

In [None]:
price = 15.0
quantity = 3

In [None]:
total = price + quantity
print(total)

### Ex 2 - Variables

#### Ex 2a - Counting with integers
Let's warm up with some basic integer variables. After each step, use `print` to inspect the variable values. Try to figure out what will be printed _before_ running the cell.

* Set a variable `i` to any integer number.
* Set another variable `j` to equal `i + 2`.
* Add another integer number to `i`. What is the value of `j` now?

In [None]:
# Implement here!

#### Ex 2b - Swap two variables
Someone has mixed up the first and last name here. Add code to swap the values of the two variables.

In [None]:
first = "Trump"
last = "Donald"

In [None]:
# Implement here!

print(first, last)

### Ex 3 - Calculations with variables

#### Ex 3a - Buying fruit
Given the price and quantity of apples and pears provided here, compute

* The number of fruits.
* The average cost of a fruit.
* The total cost of this checkout.

Store each in its own variable and print it.

In [None]:
apples = 7
apple_price = 1.0

pears = 2
pear_price = 1.2

In [None]:
# Implement here!

#### Ex 3b - Integer division and rounding of numbers
As a storage worker, you need to pack some items into crates. Naturally you sit down and write a small Python program to figure out how best to do this.

* How many crates can be filled completely with the number of items?
* Compute the remainder, how many items are left in the last crate?
* From the number of filled crates and the remainder, compute how many items you originally had. Does it match the original number of items? If not, you have a bug in your program and need to find and fix it!
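
If the `//` and `%` operators are new to you, here is a minimal sketch of how they work together. It uses a made-up example (converting minutes to hours and minutes) rather than the crate numbers, and the variable names are only for illustration:

```python
# Hypothetical example: convert a duration in minutes to hours and minutes.
total_minutes = 135

hours = total_minutes // 60      # integer division: number of whole hours
minutes = total_minutes % 60     # remainder: minutes left over

print(hours, minutes)            # prints: 2 15

# Sanity check: putting the parts back together gives the original number.
print(hours * 60 + minutes == total_minutes)   # prints: True
```

The same pattern of dividing, taking the remainder, and checking the result should carry over to the crates.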

In [None]:
items = 123
crate_size = 20
print("items: {}, crate size: {}".format(items, crate_size))

In [None]:
# Implement here!

#### Ex 3c - Investigating the behaviour of floating point numbers
Let's try a simple calculation with floating point numbers:

* Set a variable to 0.7 (pick your own variable name)
* Add 0.1 to the variable
* Print the variable

In [None]:
# Implement here!

Are you surprised by the result? Floating point numbers have high precision but are not exact.

* Print the difference between your variable and `0.8` to see approximately how precise these numbers are.

In [None]:
# Print the difference between your variable and 0.8 here

The difference you just printed should be in a format like `1.23e-9`, which means $1.23 \times 10^{-9}$. The absolute value of the exponent is approximately the number of accurate decimal digits in a float.
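
The same effect shows up with other numbers. A small sketch, using `0.1 + 0.2` instead of the numbers in the exercise, and `math.isclose` as one common way to compare floats with a tolerance:

```python
import math

total = 0.1 + 0.2
print(total)            # close to, but not exactly, 0.3
print(total == 0.3)     # prints: False
print(total - 0.3)      # a tiny number in scientific notation

# Compare floats with a tolerance instead of using ==
print(math.isclose(total, 0.3))   # prints: True
```

This is why exact comparisons with `==` are usually avoided for floating point results.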

The next cell prints the number 0.7 with only 3 decimals, which gets rounded to 0.700 in the string formatting.

* To avoid the rounding, change the number of decimals from 3 to a number larger than the number of accurate digits you found above.

In [None]:
print("{:.3f}".format(0.7))

### Ex 4 - Strings
Text string manipulation is used for things like dealing with filenames and paths, reading data from text files, and writing data to text files.

#### Ex 4a - Adding strings
Add the first and last name together with a space between to form the full name.

In [None]:
firstname = "John"
lastname = "Doe"

In [None]:
fullname = "Implement me"

print(fullname)

#### Ex 4b - Splitting and formatting strings
* Split `fullname` into two new variables `firstname` and `lastname`, make sure to remove any spaces.
* Write a string format `template` to match how James Bond likes to introduce himself.

When running the cell, the output should be _My name is Bond. James Bond._
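
As a reminder of the string methods involved, here is a sketch on a different, made-up string. The names `place`, `city`, and `country` are only for illustration:

```python
# Hypothetical input, not the string from the exercise.
place = "Oslo , Norway"

# split(",") cuts the string at the comma; strip() removes surrounding whitespace.
parts = place.split(",")
city = parts[0].strip()
country = parts[1].strip()

# Named fields in the template are filled in by keyword arguments to format().
template = "{city} is in {country}."
message = template.format(city=city, country=country)
print(message)   # prints: Oslo is in Norway.
```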

In [None]:
fullname = "Bond, James"

In [None]:
# Replace these three lines with your implementation
firstname = "..."
lastname = "..."
template = "..."

print(template.format(first=firstname, last=lastname))

### Ex 5 - Loops
Before we move on to more complex data structures, let's try some loops.

#### Ex 5a - Looping over number ranges

Using `range`,

* Print the numbers from 0 through 5 (inclusive)
* Print the numbers from 3 to 10 (not including 10)
* Print the even numbers from 4 through 8 (inclusive)
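
For reference, `range` takes up to three arguments, and the stop value is never included. A short sketch with numbers that differ from the ones in the tasks above:

```python
# range(stop): starts at 0 and stops before `stop`
print(list(range(3)))          # prints: [0, 1, 2]

# range(start, stop): starts at `start` and stops before `stop`
print(list(range(2, 6)))       # prints: [2, 3, 4, 5]

# range(start, stop, step): takes every `step`-th number
print(list(range(10, 25, 5)))  # prints: [10, 15, 20]

for n in range(1, 4):
    print(n)                   # prints 1, 2, 3 on separate lines
```

To include the last number of an interval, the stop value has to be one step beyond it.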


In [None]:
# Implement here

#### Ex 5b - Double until there are no more grains left
If you give away one grain on the first chessboard square, 2 on the second, 4 on the third, and so on by doubling the number of grains for each square, how many squares until you have given away more than a quadrillion ($10^{15}$) grains?

* Use a while loop to find the answer.

_Tip: If you get stuck in an infinite loop, the cell will show `In [*]` on its left side. If that happens, select the menu "Kernel -> Interrupt", fix your code, and try again._
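
A `while` loop keeps running as long as its condition is true, so something inside the loop has to change the condition eventually. A minimal sketch on an unrelated, made-up problem (halving a number until it drops below 1):

```python
# Hypothetical example: count how many halvings it takes to get below 1.
value = 1000.0
steps = 0

while value >= 1:
    value = value / 2   # the loop variable must change, or the loop never ends
    steps += 1

print(steps)   # prints: 10
```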

In [None]:
# Implement here

### Ex 6 - Lists
Let's get familiar with lists, which form the basis of most structured data processing.

#### Ex 6a - Accessing single values from a list
Here we define a list with the numbers from 1 up to and including 1000.

Print the first, the middle, and the last value from this list.

Keep in mind that the first index is `0` and the last is `length of list - 1`.
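
The indexing rules can be tried out on a short list first. A sketch with a made-up list `letters`:

```python
letters = ["a", "b", "c", "d", "e"]
count = len(letters)

print(letters[0])            # first item, prints: a
print(letters[count // 2])   # item halfway through, prints: c
print(letters[count - 1])    # last item, prints: e
print(letters[-1])           # negative indices count from the end, prints: e
```

Note that `//` is used to compute the middle index, since list indices must be integers.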

In [None]:
values = list(range(1, 1001))
size = len(values)

In [None]:
# Implement here

#### Ex 6b - Computing something for each value in a list
Compute $\sin(x)$ for each value $x$ in the `values` list. Print the value and the computed result together for each value.

In [None]:
from math import sin, pi
values = [0.0, 0.5*pi, pi, 1.5*pi]

In [None]:
# Implement here

#### Ex 6c - Lists of lists of lists
We have been given some `data` about the sales completed by each member of the sales team.
The `data` is a list of lines, where each line is itself a list.
Each line contains two items: the `name` of a sales team member, and a list of `sales` made by that person.

We want to build a new list from this data.
For each salesperson, the new list `summary` should contain a list with name, number of sales, total revenue from all sales made by this person, and a computed bonus that is $20\%$ of the revenue that exceeds 200.

```
summary = [
    ["name", num_sales, total, bonus],
]
```

* To warm up, print the last sale that Ida made by indexing the lists directly. _Hint: `data[1]` is the line for Ida._
* Build `summary` from `data`. _Hint: You can loop over the data with_: `for name, sales in data:`.
* For each line in `summary`, print a summary line with the computed information.
* Bonus points: find and print who made the largest single sale.
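
The looping and indexing pattern can be sketched on a smaller, made-up data set with the same shape. The names `results` and `report` and the columns chosen here are only for illustration:

```python
# Hypothetical data: each line is [name, list of test scores].
results = [
    ["Ann", [7, 9, 8]],
    ["Tom", [5, 10]],
]

report = []
for name, scores in results:
    # len, sum and max work directly on the inner list
    report.append([name, len(scores), sum(scores), max(scores)])

print(results[0][1][-1])   # last score on the first line, prints: 8
print(report)              # prints: [['Ann', 3, 24, 9], ['Tom', 2, 15, 10]]
```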

In [None]:
data = [
    ["Mark", [100, 20, 50, 60]],
    ["Ida", [300, 210]],
    ["Bob", [30, 70, 90, 160, 80]],
    ["Heidi", [10, 110, 200]],
]

In [None]:
# Implement here

### Ex 7 - Dicts
Dictionaries hold key-value pairs and allow looking up a value given a key.

#### Ex 7a - Formatting a string from dict entries
Given the dict `person` below, print a string formatted with the phone number first, followed by the full name in `lastname, firstname` format.

In [None]:
person = {
    "firstname": "Ola",
    "lastname": "Nordmann",
    "address": "Drammensveien 1",
    "phone": "+47 1234567890"
}

In [None]:
# Implement here!

#### Ex 7b - Filter products from nested dicts
Here's some information about a phone store. You want to find and display the most expensive phone variant of each model, but unfortunately the data entry was somewhat messy: the model names use different case and sometimes contain spaces, which makes it harder to compare the strings.

* Clean up the products list by normalizing the `"model"` field to be lower case without spaces.
* Build a new dict `products_by_model` where the keys are normalized model strings and the values are lists of tuples `(price, variant)`, one for each product with the specific model.
* For each model, print the number of products and all product variants sorted by price.
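
One possible way to normalize keys and group values is sketched below on made-up data. The names `entries` and `by_kind` are hypothetical, and `dict.setdefault` is just one of several ways to build a dict of lists:

```python
# Hypothetical data with inconsistent spelling of the "kind" field.
entries = [
    {"kind": " Apple", "weight": 150},
    {"kind": "PEAR", "weight": 180},
    {"kind": "apple ", "weight": 120},
]

by_kind = {}
for entry in entries:
    kind = entry["kind"].strip().lower()   # remove outer spaces, use lower case
    # setdefault returns the existing list, or stores and returns a new one
    by_kind.setdefault(kind, []).append(entry["weight"])

for kind, weights in by_kind.items():
    print(kind, len(weights), sorted(weights))
# prints:
# apple 2 [120, 150]
# pear 1 [180]
```

When the list items are tuples, `sorted` compares the first element of each tuple first, so `sorted([(2, "b"), (1, "a")])` gives `[(1, 'a'), (2, 'b')]`.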

In [None]:
products = [
    { "model": "iPhone", "variant": "8", "price": 8000 },
    { "model": "Galaxy", "variant": "S6", "price": 4000 },
    { "model": "GALAXY", "variant": "S8", "price": 6000 },
    { "model": "iphone", "variant": "X", "price": 10000 },
    { "model": " galaxy", "variant": "S7", "price": 5000 },
    { "model": "IPHONE ", "variant": "7", "price": 5000 },
]

In [None]:
# Implement here!

### Ex 8 - Conditional branching with if statements


#### Ex 8a - Odd or even numbers
Write a for loop over the numbers starting at 1 and ending with 10, and print each number labelled as odd or even.

In [None]:
# Implement here

#### Ex 8b - Pick your priorities
Some tasks are important, some are urgent, some are both, and some are neither.
Write a program that prints advice for each task topic on what to do
based on whether the task is important and/or urgent.

In [None]:
tasks = [
    { "topic": "pay overdue invoice",   "important": True,  "urgent": True },
    { "topic": "learn programming",     "important": True,  "urgent": False },
    { "topic": "reply to all emails",   "important": False, "urgent": True },
    { "topic": "sorting pens by color", "important": False, "urgent": False },
]

In [None]:
# Implement here!

### Ex 9 - Find names of text files on file system

* Get a list of all the files in the current directory using `os.listdir`.

* Print the full path of each filename.

* Split the full path into the directory name, the file extension, and the root of the filename (the root is what's left when removing directory name and file extension).

* Join the directory name, filename root, and file extension together to get back the full path.

_Hint: `os.path` contains many useful functions. See the documentation on python.org and the tab completion tip below._
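
A minimal sketch of how `os.path.split`, `os.path.splitext`, and `os.path.join` fit together, using a made-up relative path rather than files from your directory:

```python
import os

# Hypothetical path, built with os.path.join so it works on any platform.
path = os.path.join("reports", "summary.txt")

directory, filename = os.path.split(path)   # "reports" and "summary.txt"
root, ext = os.path.splitext(filename)      # "summary" and ".txt"

# Joining the pieces again gives back the original path.
rebuilt = os.path.join(directory, root + ext)
print(rebuilt == path)   # prints: True
```

Note that the extension returned by `os.path.splitext` includes the leading dot.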

In [None]:
# Run this to import the os module which is needed for the exercises below
import os

In [None]:
# Tip: Run this cell to see documentation for the function
# os.path.splitext in a window at the bottom of the page:
os.path.splitext?

In [None]:
# Tip: Place the text cursor after 'os.path' below, then
# type '.' and then 'Tab', to see a list of things available in os.path
os.path

In [None]:
# Implement here

### Ex 10 - A taste of text encoding problems
For long-winded historical reasons, there is a plethora of standards for how to represent text strings in computers.

Strings in Python (version 3+) are always _Unicode_ strings and can contain most letters from most languages, including fictional ones. You don't have to worry about how these are represented internally.

Unfortunately, strings stored as bits and bytes in files on disk can originate from different programs and must be decoded using the same encoding standard they were written with.

The `utf-8` encoding can represent the full range of unicode strings.
Older standards only included some subset of letters for a specific language.
The encodings `cp858` and `cp1252` are common on Norwegian Windows systems.
The encoding `cp1253` has Greek letters but not Norwegian ones.

* Try decoding the bytestring below using the encoding standards `cp858`, `cp1252`, `utf-8`, and `cp1253` and print the resulting strings. Which is the right one? What happens with the Greek encoding version?

_Hint: encode a text string using the notation `mystring.encode("utf-8")` and decode a bytestring using the notation `bytestring.decode("utf-8")`._
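
To see what encoding and decoding look like in practice, here is a sketch with a different, made-up string. It shows a round trip through `utf-8`, and what can happen when bytes are decoded with an encoding other than the one they were written with:

```python
text = "blåbær"

encoded = text.encode("utf-8")
print(encoded)                    # prints: b'bl\xc3\xa5b\xc3\xa6r'
print(encoded.decode("utf-8"))    # prints: blåbær

# Decoding with a different encoding than the one used for writing
# gives garbled text (or an error, depending on the bytes).
print(encoded.decode("cp1252"))   # prints: blÃ¥bÃ¦r

try:
    encoded.decode("ascii")
except UnicodeDecodeError as error:
    print("ascii cannot decode these bytes:", error.reason)
```

Each of the two non-ASCII letters takes two bytes in `utf-8`, which is why the six-letter string becomes eight bytes.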

In [None]:
# These bytes are the result of encoding a text string with one specific encoding standard
bytestring = b'\xc3\x98yvind \xc3\x85sen'

In [None]:
# Implement here

### Ex 11 - Counting overtime hours
Let's try some more exercises where you need to nest loops and branches.

You're running a shop with four employees. Because of poor planning, you end up pushing some of your employees to work very hard for part of the week. Now you need to compute the number of overtime hours for each employee to get the overtime payment correct. Your employees are entitled to overtime payment for all hours above 7.5 on any single day.

* Print name and total hours worked for each employee.
* Compute the number of overtime hours this week for each employee.
* Compute the total number of overtime hours for all employees.


In [None]:
# This dict maps employee name to number of hours worked each day for a week
hours_per_day = {
    "Bob":   [12.5, 12.5, 12.5, 0.0, 0.0],
    "Ida":   [10.0, 10.0,  9.0, 4.5, 4.0],
    "Dave":  [ 8.0,  8.5,  7.5, 7.5, 6.0],
    "Marie": [ 7.5,  7.5,  7.5, 7.5, 7.5],
}

In [None]:
# Implement here

### Ex 12 - Analysing tweets by Donald Trump
Here's a collection of tweets by the president of the United States of America.
We'll try to find Twitter-style @mentions, print tweets mentioning certain words,
and find the most common words.

In [None]:
tweets = """
Healthy young child goes to doctor, gets pumped with massive shot of many vaccines, doesn’t feel good and changes – AUTISM. Many such cases!
.@ariannahuff is unattractive both inside and out. I fully understand why her former husband left her for a man- he made a good decision.
The hatchet job in @NYMag about Roger Ailes is total bullshit. He is the ultimate winner who is surrounded by a great team. @FoxNews
Sorry losers and haters, but my I.Q. is one of the highest -and you all know it! Please don’t feel so stupid or insecure,it’s not your fault
Windmills are the greatest threat in the US to both bald and golden eagles. Media claims fictional ‘global warming’ is worse.
Let’s take a closer look at that birth certificate. @BarackObama was described in 2003 as being ‘born in Kenya.’ bit.ly/Klc9Uu
We should have gotten more of the oil in Syria, and we should have gotten more of the oil in Iraq. Dumb leaders.
Russian leaders are publicly celebrating Obama’s reelection. They can’t wait to see how flexible Obama will be now.
The Miss Universe Pageant will be broadcast live from MOSCOW, RUSSIA on November 9th. A big deal that will bring our countries together!
If the morons who killed all of those people at Charlie Hedbo would have just waited, the magazine would have folded – no money, no success!
Every time I speak of the haters and losers I do so with great love and affection. They cannot help the fact that they were born fucked up!
"""

#### Ex 12a - Split lines

* Each line in the multiline string `tweets` is one tweet. Convert it into a list of single line tweets, and print the number of tweets in the list (should be 11).
* For each tweet, print its length and an estimate of how many words it has.
* Print the tweet mentioning @BarackObama.

In [None]:
# Implement here

#### Ex 12b - Data cleaning

As is often the case, the data set here is a bit messy and needs some tidying.
For example we don't want to count "The" and "the," as different words.
Let's normalize the data to simplify further analysis:

* For each single tweet, build a list of words, in lower case and with punctuation removed.
  Make sure to split words where there's no space after comma.
* To test your data normalization, print the tweets where the normalized word list
  contains "windmills", "syria", or "insecure". There should be one of each.
  (This can also be solved without building the word list first. How?)
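
One possible approach to the normalization is sketched below on a made-up sentence with a smaller punctuation set. The names `sentence` and `marks` are hypothetical, and replacing characters in a loop is only one of several ways to do this:

```python
# Hypothetical sentence and punctuation set, smaller than the one in the exercise.
sentence = "Stop,look and listen! It's LATE."
marks = ".,!'"

cleaned = sentence.lower()
for mark in marks:
    # Replacing with a space (not an empty string) keeps "stop,look" as two words.
    cleaned = cleaned.replace(mark, " ")

words = cleaned.split()   # split() without arguments ignores repeated spaces
print(words)   # prints: ['stop', 'look', 'and', 'listen', 'it', 's', 'late']
```

Note the trade-off: replacing apostrophes with spaces splits "it's" into two words. Whether that matters depends on what you want to count.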

In [None]:
punctuation = ".,:;!'’‘-–"

In [None]:
# Implement here

#### Ex 12c - Count and analyse

Now try to answer these questions using the normalized word lists:

* Who has been mentioned with @username in all the tweets combined?
* How many unique words are there across the tweets?
* What are the 10 most common words across all the tweets?
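
The standard library has tools that may help with questions like these. A sketch on a made-up word list, using `set` for unique items and `collections.Counter` for counting (a plain dict with a loop works as well):

```python
from collections import Counter

words = ["to", "be", "or", "not", "to", "be", "to"]

print(len(set(words)))    # number of unique words, prints: 4

counts = Counter(words)
print(counts.most_common(2))   # prints: [('to', 3), ('be', 2)]

# Words starting with a given character can be picked out with startswith.
print([w for w in words if w.startswith("n")])   # prints: ['not']
```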

In [None]:
# Implement here