# Machine Learning Engineer Exercise

Amber provides real-time wholesale electricity prices to customers. In the energy market, a rolling reverse-auction process generates a new price for electricity every five minutes. Household smart meters record the electricity consumed in each five-minute interval, and there is a 1:1 relationship between metering intervals and those prices.

Amber also integrates with a wide range of smart home devices to monitor and manage household energy consumption to minimise cost (or maximise earnings!) in real time.

We are able to use the telemetry from these devices to generate forecasts of household energy consumption. These forecasts allow us to better inform customers of their expected costs and are a key input into our automation algorithms.

Your task is to do a minimal version of the above. Use the provided set of historical instantaneous power observations for a single customer to generate a forecast of their energy consumption in our defined format.

In the real world, smart meters measure energy (in kWh), but in the interest of simplicity for this exercise we're asking you to work exclusively in power (typically in W or kW). Units are included to denote magnitude; **you will not need to do any conversion between differing types of units.**

Here's a summary of the dataset:

| Name | Description | 
| ---- | ----------- |
| CREATED_AT | Time the observation was recorded (UTC). |
| NAME | Attribute name. |
| VALUE | Instantaneous gross power consumed by the household. |
| UNIT | Unit corresponding to the observed value. |
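
As a starting point, the instantaneous observations can be averaged into five-minute intervals that line up with the forecast intervals described under Requirements. The following is a minimal sketch using pandas, not a prescribed approach. It assumes every row shares a single `NAME` and `UNIT`; the helper name `to_five_minute_means`, the `NAME` value `"power"`, and the sample rows are all hypothetical.

```python
import pandas as pd


def to_five_minute_means(observations: pd.DataFrame) -> pd.Series:
    """Average instantaneous power readings into five-minute intervals.

    Assumes all rows share one NAME and one UNIT (an assumption for this sketch).
    """
    timestamps = pd.DatetimeIndex(pd.to_datetime(observations["CREATED_AT"], utc=True))
    power = pd.Series(
        observations["VALUE"].astype(float).to_numpy(), index=timestamps
    ).sort_index()
    # Each bucket covers [start, start + 5 min) and is labelled by its start time.
    # Intervals with no observations come back as NaN and need separate handling.
    return power.resample("5min", label="left", closed="left").mean()


# Hypothetical sample rows in the dataset's column layout.
sample = pd.DataFrame(
    {
        "CREATED_AT": [
            "2022-10-26T15:05:10Z",
            "2022-10-26T15:07:40Z",
            "2022-10-26T15:11:00Z",
        ],
        "NAME": ["power"] * 3,
        "VALUE": [100.0, 200.0, 300.0],
        "UNIT": ["W"] * 3,
    }
)
five_min = to_five_minute_means(sample)
print(five_min)
```

Averaging the raw readings treats each observation equally; if the observations are irregularly spaced, a time-weighted mean may be a better fit.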


## Requirements

- Your model should predict average power consumption for each five-minute interval over the next 24 hours. For example, if you were to generate predictions at `2022-10-26T15:05:00Z`, the forecast outputs should be in the following format:

| forecast_at | forecast_interval_start | forecast_interval_end | forecast_value |
| ----------- | ----------------- | -------------- | ---- |
| 2022-10-26T15:05:00Z | 2022-10-26T15:05:00Z | 2022-10-26T15:10:00Z | 98 |
| 2022-10-26T15:05:00Z | 2022-10-26T15:10:00Z | 2022-10-26T15:15:00Z | 132 |
| 2022-10-26T15:05:00Z | 2022-10-26T15:15:00Z | 2022-10-26T15:20:00Z | 502 |
| ... | ... | ... | ... |
| 2022-10-26T15:05:00Z | 2022-10-27T15:00:00Z | 2022-10-27T15:05:00Z | 630 |
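
The table above implies 288 rows per forecast (24 hours of five-minute intervals). One way to assemble that layout from a sequence of predicted values is sketched below. The helper name `build_forecast_frame` is hypothetical, the zero-valued predictions are placeholders, and the sketch assumes `forecast_at` falls on a five-minute boundary.

```python
import pandas as pd

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def build_forecast_frame(forecast_at, values) -> pd.DataFrame:
    """Lay out predicted values in the forecast table format.

    Assumes forecast_at is aligned to a five-minute boundary (sketch assumption).
    """
    forecast_at = pd.to_datetime(forecast_at, utc=True)
    # One interval start per predicted value, five minutes apart.
    starts = pd.date_range(forecast_at, periods=len(values), freq="5min")
    ends = starts + pd.Timedelta(minutes=5)
    return pd.DataFrame(
        {
            "forecast_at": forecast_at.strftime(ISO_FORMAT),
            "forecast_interval_start": starts.strftime(ISO_FORMAT),
            "forecast_interval_end": ends.strftime(ISO_FORMAT),
            "forecast_value": list(values),
        }
    )


# 24 hours of five-minute intervals; zeros stand in for real predictions.
frame = build_forecast_frame("2022-10-26T15:05:00Z", [0.0] * 288)
print(frame.head(3))
print(frame.tail(1))
```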

- When making key data preprocessing or model selection decisions, please briefly explain your reasoning.

- Please evaluate your model using appropriate error metrics and visualisations.
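
Which metrics count as appropriate is left to you. As one illustration, mean absolute error (MAE) and root mean squared error (RMSE) are common choices for point forecasts and both are expressed in the same unit as the power values. The sketch below uses NumPy; the function names and sample numbers are hypothetical.

```python
import numpy as np


def mae(actual, predicted) -> float:
    """Mean absolute error: the average magnitude of the forecast errors."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(actual - predicted)))


def rmse(actual, predicted) -> float:
    """Root mean squared error: penalises large misses more heavily than MAE."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


# Hypothetical five-minute average power values for three intervals.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]
print(f"MAE:  {mae(actual, predicted):.2f}")
print(f"RMSE: {rmse(actual, predicted):.2f}")
```

Comparing the two can be informative: an RMSE well above the MAE suggests a few large errors, such as missed consumption spikes, dominate the total.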

**Note**: we don't expect your model to produce excellent results, just to get into the ballpark. This is primarily a conversation starter to explore how you think through the problem.

We strongly recommend you use an off-the-shelf open-source model to generate the predictions.

## Questions we will cover in review

- We need to do these forecasts for each household where we have data. How would you iterate on your solution going from 10k to 100k to 1m households?

- There are no data standards between vendors for smart home devices providing inputs into this model. As we consume data from more and more heterogeneous sources, how would you handle the inputs in a scalable and maintainable way?

- (if time permits) When a customer first signs up to Amber, we won't have any historical usage data for their site. How would you generate forecasts for these customers?