# Assumptions

## What is an assumption anyway?

### Challenge: identifying assumptions

Take 5 minutes to list out as many assumptions as you can about the
EIA 923 Puerto Rico data (`pr_gen_fuel_monthly.parquet`) in the [data directory](../data/).

Please put them in the shared Google doc that your instructor prepared for you.
This will serve as a foundation for future challenges in this lesson.

The goal is to get past the obvious ones and start thinking of some non-obvious assumptions.
There is no need to limit yourself to 'realistic' ones at this stage.

Some prompts to get you started:

* what problems have you run into in previous datasets?
* if you were here for the data exploration episode,
  what are some things you learned about the data then?
* what are other people writing? Does that trigger new thoughts about what to add?

When we return, we'll talk about which of these approaches worked.

## How to test your assumptions

In [None]:
import pandas as pd

### Example: testing assumptions

In [None]:
assert 1 == 1

In [None]:
assert 1 == 2

In [None]:
assert 1 == 2, "Expected 1 to be equal to 2."
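
The three cells above show an assertion that passes, one that fails, and one that fails with a custom message. As a minimal sketch of what happens under the hood, the snippet below catches the `AssertionError` so the message can be inspected without stopping the notebook. The helper name `check` is hypothetical and only used for illustration.

```python
def check(condition, message):
    """Run an assert; return the failure message, or None if it passed.

    `check` is a hypothetical helper for illustration only.
    """
    try:
        assert condition, message
    except AssertionError as err:
        # The text after the comma in `assert` becomes the error message.
        return str(err)
    return None


passed = check(1 == 1, "Expected 1 to be equal to 1.")
failed = check(1 == 2, "Expected 1 to be equal to 2.")

print(passed)  # None
print(failed)  # Expected 1 to be equal to 2.
```

A passing assertion does nothing at all, which is why the first cell produces no output. Note that Python skips `assert` statements entirely when run with the `-O` flag.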

In [None]:
monthly_gen_fuel = pd.read_parquet("../data/pr_gen_fuel_monthly.parquet")

In [None]:
fuel_consumed_mmbtu = monthly_gen_fuel["fuel_consumed_mmbtu"]

In [None]:
# NaN >= 0 evaluates to False, so any missing value also makes this assertion fail.
assert (fuel_consumed_mmbtu >= 0).all(), "The reported fuel consumption in MMBtu should be non-negative."

In [None]:
# Show the values that fail the check; ~(x >= 0) also selects NaN, which x < 0 would miss.
fuel_consumed_mmbtu[~(fuel_consumed_mmbtu >= 0)]

In [None]:
# Drop missing values first so that only reported values are checked.
assert (fuel_consumed_mmbtu.dropna() >= 0).all(), "If fuel consumption in MMBtu is reported at all, it should be non-negative."
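
The difference between the two assertions above comes down to how pandas compares missing values. Here is a small sketch on made-up numbers (the values in `toy_fuel` are hypothetical and not taken from the EIA 923 data) that shows why `dropna()` changes the result.

```python
import numpy as np
import pandas as pd

# Hypothetical toy values, NOT taken from the EIA 923 data.
toy_fuel = pd.Series([10.5, 0.0, np.nan, -3.2], name="fuel_consumed_mmbtu")

# Comparing NaN with anything yields False, so NaN "fails" a >= 0 check.
is_non_negative = toy_fuel >= 0

# Inverting the mask keeps both the missing value and the negative value.
failing = toy_fuel[~is_non_negative]

# Dropping missing values first leaves only values that were actually reported.
reported = toy_fuel.dropna()
negative_reported = reported[reported < 0]

print(failing)
print(negative_reported)
```

Whether a missing value should count as a failure depends on the assumption being tested, so it helps to state that choice in the assertion message.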

### Challenge: prioritizing assumptions

Now it's time to try out that prioritization framework!

Let's start by looking at the list of assumptions we came up with.

Take a few minutes to put a `+` emoji reaction next to 3-5 assumptions that feel important to test.

We'll then discuss a few assumptions with many `+` reacts and how they fit into the framework above.

### Challenge: testing an assumption

Now that we have our list of high-priority testing targets,
we can go ahead and write some tests for them!

Pick one of the assumptions identified as high-priority in the last challenge,
and write some code to test whether it's true or false.

When we finish,
we'll talk about challenges we ran into in writing these tests.

In [None]:
# monthly_gen_fuel is still available from the example above, so use it for your assertions here!
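
If you are unsure where to start, here is one possible shape for such a test, sketched on a toy table rather than the real data. It checks the assumption "there is at most one row per plant per month". The column names `plant_id` and `report_date` are assumptions made for illustration and may not match the columns in `monthly_gen_fuel`.

```python
import pandas as pd

# Hypothetical toy table; column names are illustrative and may not match the real file.
toy = pd.DataFrame(
    {
        "plant_id": [1, 1, 2, 2],
        "report_date": pd.to_datetime(
            ["2020-01-01", "2020-02-01", "2020-01-01", "2020-01-01"]
        ),
        "fuel_consumed_mmbtu": [10.0, 12.0, 7.5, 7.5],
    }
)

# Assumption under test: each (plant, month) pair appears at most once.
key_columns = ["plant_id", "report_date"]

# keep=False marks every row in a duplicated group, not just the repeats.
duplicated_rows = toy[toy.duplicated(subset=key_columns, keep=False)]

try:
    assert duplicated_rows.empty, (
        f"Found {len(duplicated_rows)} rows that share a plant and month."
    )
except AssertionError as err:
    # Print the message and the offending rows instead of stopping the notebook.
    print(err)
    print(duplicated_rows)
```

Selecting the offending rows before asserting makes a failure easier to investigate, following the same pattern as the fuel consumption example earlier.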