# Technical Tasks

* The aim of these tasks is to demonstrate a justifiable approach to common ML tasks.
* Code quality is **not** the main focus.
* Spend only a small share of your time (<20% of the total) summarising work in markdown cells; focus on the data manipulation and the model build.

# 1. Regression: Data preparation and model build
This task has two parts. In the first part you will explore and visualise some data. In the second part you will apply what you learned in part 1 while implementing a simple modelling pipeline for the same data.

After each part we will spend 5-10 minutes discussing your approach.

## 1.1 Exploration

* Explore and plot data
    * Establish distributions of variables
    * Identify any problems in the data (they will be fixed in the next section)

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

In [None]:
data = pd.read_csv("diabetes.csv")

X, y = data.iloc[:, :-1], data.iloc[:, -1]

print(X.head())
print("\n")
print(y.head())
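A first exploration pass might start with a per-column summary before any plotting. The sketch below is one possible starting point, not a prescribed solution: `summarise_columns` is a hypothetical helper, and the small `demo` frame with its `age`, `bmi` and `sex` columns is made up for illustration, since the actual contents of `diabetes.csv` are not shown here.

```python
import numpy as np
import pandas as pd


def summarise_columns(df):
    """Return one row per column with basic data-quality indicators."""
    numeric = df.select_dtypes(include="number")
    summary = pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "n_missing": df.isna().sum(),
            "n_unique": df.nunique(),
        }
    )
    # Numeric-only statistics; non-numeric columns are left as NaN.
    summary["min"] = numeric.min()
    summary["max"] = numeric.max()
    summary["skew"] = numeric.skew()
    return summary


# Hypothetical stand-in data (NOT the real dataset): one missing value,
# one exact duplicate row, and one non-numeric column.
demo = pd.DataFrame(
    {
        "age": [50, 61, np.nan, 61],
        "bmi": [22.1, 30.5, 27.3, 30.5],
        "sex": ["F", "M", "M", "M"],
    }
)

summary = summarise_columns(demo)
n_duplicates = int(demo.duplicated().sum())

print(summary)
print("duplicate rows:", n_duplicates)
```

On the real data the same helper could be called as `summarise_columns(X)`, followed by `X.hist()` or `sns.pairplot(data)` to inspect the distributions visually.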

## 1.2 Write a simple model-building pipeline including the three tasks below:

* Clean data
    * Fix any data problems that must be resolved before building a model
* Normalise data
    * Normalise the variables for use in a linear regression
* Fit and evaluate model
    * Estimate the generalisation error of your model
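The three tasks above can be sketched as a single scikit-learn pipeline. This is a minimal sketch under stated assumptions, not the expected answer: it assumes the only data problem is missing values (handled with median imputation), it uses synthetic data from `make_regression` in place of `diabetes.csv`, and `build_pipeline` is a hypothetical helper name. Fitting the imputer and scaler inside the pipeline means they only ever see the training split, which keeps the held-out error estimate honest.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_pipeline():
    """Clean (impute), normalise (standardise), then fit a linear model."""
    return make_pipeline(
        SimpleImputer(strategy="median"),  # assumption: problems are NaNs
        StandardScaler(),                  # zero mean, unit variance
        LinearRegression(),
    )


# Synthetic stand-in for the real data, with some values knocked out.
X_demo, y_demo = make_regression(
    n_samples=200, n_features=5, noise=10.0, random_state=0
)
X_demo[::10, 0] = np.nan

# Hold out 20% of the rows to estimate the generalisation error.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

model = build_pipeline().fit(X_train, y_train)
test_mae = mean_absolute_error(y_test, model.predict(X_test))

print(f"held-out MAE: {test_mae:.2f}")
```

A single hold-out split gives a noisy estimate on small datasets; `cross_val_score` with `scoring="neg_mean_absolute_error"` on the same pipeline is one alternative worth discussing.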

# 2. General ML
To be discussed verbally.

### 2.1
Highlight an interesting project from your CV and discuss which machine learning approaches worked well on that project and which did not.

### 2.2
In a regression problem with feature vectors $\mathbf{x}_1, \dots, \mathbf{x}_n \in \mathbb{R}^d$ and targets $y_1, \dots, y_n \in \mathbb{R}$, how would you adjust the following loss function on parameters $\mathbf{b} \in \mathbb{R}^d$ to achieve the sparsest solution?
$$\mathcal{L}(\mathbf{b}) = \sum_{i=1}^n (y_i - \mathbf{x}_i^\top \mathbf{b})^2 + \lambda\sum_{j=1}^d |b_j|^q$$
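One way to build intuition for the role of the exponent $q$ is to compare $q = 1$ (lasso) with $q = 2$ (ridge) empirically. The sketch below is an illustration for discussion rather than a model answer. It uses synthetic data from `make_regression` and scikit-learn's `Lasso` and `Ridge` estimators, whose objectives are scaled differently from the loss above (`Lasso` divides the squared-error term by $2n$), so `alpha` here is not numerically the same as $\lambda$.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic problem: 20 features, only 3 of which carry signal.
X_syn, y_syn = make_regression(
    n_samples=200,
    n_features=20,
    n_informative=3,
    noise=5.0,
    random_state=0,
)

# q = 1 penalty (lasso) versus q = 2 penalty (ridge).
lasso = Lasso(alpha=1.0).fit(X_syn, y_syn)
ridge = Ridge(alpha=1.0).fit(X_syn, y_syn)

# Count coefficients that are exactly zero under each penalty.
n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0))

print("exact zeros with q=1 (lasso):", n_zero_lasso)
print("exact zeros with q=2 (ridge):", n_zero_ridge)
```

The $q = 1$ penalty sets many coefficients exactly to zero, while $q = 2$ only shrinks them. Values of $q$ below 1 push further toward sparsity but make the problem non-convex, a case these two estimators do not cover.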

