# Meters

Every now and then I go down into the basement and note the values on my water meter, gas meter and electricity meter. The last one has a day counter and a night counter, because electricity is cheaper at night. You can find this data in "files/Meters.ods" (a LibreOffice spreadsheet).

Start by loading the data. It's in ODS format (not CSV), so you may need to install an extra library first (pandas uses `odfpy` to read ODS files). Add an additional code block if you still need to install it.
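
If you are not sure whether the library is already there, a quick check like the one below can tell you. This is only a sketch: the variable name `has_odf` is made up for the illustration, and it relies on `odfpy` being importable under the name `odf`.

```python
import importlib.util

# odfpy is installed under the import name "odf"
has_odf = importlib.util.find_spec("odf") is not None

if has_odf:
    print("odfpy is available, pd.read_excel can open .ods files")
else:
    print("odfpy is missing, install it with: pip install odfpy")
```

Inside a notebook, running `%pip install odfpy` in its own cell is one way to do the install.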

In [None]:

import pandas as pd

# Columns A:C hold the meter name, the date and the reading; decimals are written with a comma
bad_df = pd.read_excel("files/Meters.ods", usecols="A:C", decimal=",", thousands=".", header=0, names=["What", "Date", "Value"])

# Translate the Dutch counter names to English
bad_df['What'] = bad_df['What'].replace({'EL.Dag': 'EL.Day', 'EL.Nacht': 'EL.Night'})

bad_df.head()

There is a small problem with this data. Instead of making the following spreadsheet:

![](files/2023-10-04-16-46-45.png)

I made the following sheet:

![](files/2023-10-04-16-47-16.png)

The first one would have been much nicer. Going from the good version to the bad version is called "[melt](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.melt.html)"; the operation we want, from bad to good, is "[pivot](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.pivot.html)".
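
To see the two operations side by side before applying them to the real data, here is a small sketch on made-up readings. The values and the names `long_df`, `wide_df` and `back_df` are invented for the illustration; only the column names match the sheet.

```python
import pandas as pd

# Made-up readings in the "bad" long layout: one row per meter per date
long_df = pd.DataFrame({
    "What": ["Gas", "Water", "Gas", "Water"],
    "Date": pd.to_datetime(["2021-01-01", "2021-01-01", "2021-02-01", "2021-02-01"]),
    "Value": [100.0, 10.0, 150.0, 12.0],
})

# pivot: long -> wide, one column per meter, indexed by date
wide_df = long_df.pivot(index="Date", columns="What", values="Value")
print(wide_df)

# melt: wide -> long again
back_df = wide_df.reset_index().melt(id_vars="Date", var_name="What", value_name="Value")
print(back_df)
```

Note that `pivot` raises an error when the same meter has two readings on the same date; in that case `pivot_table`, which aggregates duplicates, is the usual alternative.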

In [None]:
# DELETE
df = bad_df.pivot(index="Date", columns="What", values="Value")

df.head()

Do you feel a line graph coming up? Because I sure do!

In [None]:
water_df = bad_df[bad_df['What'] == 'Water']
print(water_df)

In [None]:
df.plot()

Can you spot the installation of my solar panels? And when I set up my pool? I'm guessing yes on the solar panels and no on the pool. If you had graphed the water separately, you would have spotted the pool too.

Could you graph water, gas and electricity in the same graph, with a [different scale](https://stackabuse.com/matplotlib-plot-multiple-line-plots-same-and-different-scales/) for water?

In [None]:
# DELETE

import matplotlib.pyplot as plt

fig, ax = plt.subplots()

df[["EL.Night", "EL.Day", "Gas"]].plot(ax=ax)
ax.tick_params(axis='y', labelcolor='red')
ax2 = ax.twinx()
df.Water.plot(ax=ax2, color='pink')
ax2.tick_params(axis='y', labelcolor='pink')

plt.show()


The good news is that the dates were automatically parsed as real dates and now form the index. That means we can simply select all measurements for 2021.

In [None]:
df[df.index.year == 2021]

And had we had more data, that would have made for some nice plots.
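
Because the index holds real dates, there is more than one way to select by year. The sketch below uses made-up readings (the frame `demo` and its values are invented) to compare the boolean mask from the cell above with label-based selection using year strings.

```python
import pandas as pd

# Made-up readings on a DatetimeIndex
idx = pd.to_datetime(["2020-12-01", "2021-03-01", "2021-09-01", "2022-01-15"])
demo = pd.DataFrame({"Gas": [90.0, 120.0, 140.0, 175.0]}, index=idx)

# Boolean mask on the year, as in the cell above
by_mask = demo[demo.index.year == 2021]

# Partial string indexing: a year string selects every row in that year
by_label = demo.loc["2021"]

# A slice of year strings includes both end years
range_2021_2022 = demo.loc["2021":"2022"]

print(by_label)
print(range_2021_2022)
```

Both approaches return the same rows here; the string form tends to be shorter to type, but it expects the index to be sorted when you slice.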

# Filling values

In the AWS course, a couple of different filling methods were used for NaN values in time series:

![](files/2023-10-05-14-29-27.png)

Let's try them out! Forward and backward fill are [easy](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.fillna.html). The differences are best observed when you only plot the gas usage.

In [None]:
# DELETE

df_forward = df.ffill()
df_forward.Gas.plot()

In [None]:
# DELETE

df_back = df.bfill()
df_back.Gas.plot()
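
On a plot the difference can be subtle, so here is a sketch on a tiny made-up series (`readings` and its values are invented) where the two fills are easy to compare by eye.

```python
import numpy as np
import pandas as pd

# Made-up counter with two missing readings in the middle
readings = pd.Series(
    [100.0, np.nan, np.nan, 130.0],
    index=pd.to_datetime(["2021-01-01", "2021-02-01", "2021-03-01", "2021-04-01"]),
)

forward = readings.ffill()   # gaps take the last known value: 100, 100, 100, 130
backward = readings.bfill()  # gaps take the next known value: 100, 130, 130, 130

print(pd.DataFrame({"raw": readings, "ffill": forward, "bfill": backward}))
```

For a meter that only counts up, forward fill underestimates the gap and backward fill overestimates it, which is why both show up as steps in the gas plot.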

Moving on to the [moving average](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rolling.html). In this case it's a bad filler, by the way. Can you say why?

In [None]:
#DELETE

df_moving = df.fillna(df.rolling(5, min_periods=1).mean())
df_moving.Gas.plot()
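
One likely reason, sketched on a made-up counter (the series `counter` and its values are invented): a trailing mean of earlier readings can never exceed the last known reading, so for a meter that only counts up the filled value drops below the reading before it.

```python
import numpy as np
import pandas as pd

# Made-up cumulative counter with one missing reading
counter = pd.Series([100.0, 110.0, np.nan, 130.0, 140.0])

# Same filler as above: trailing window of 5 rows, NaN values are skipped
rolling_mean = counter.rolling(5, min_periods=1).mean()
filled = counter.fillna(rolling_mean)

print(filled.tolist())                 # the gap becomes the mean of 100 and 110
print(filled.is_monotonic_increasing)  # False: the counter appears to run backwards
```

A moving average is a better fit for values that fluctuate around a level, such as usage per month, than for a running total.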

And finally [interpolating](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.interpolate.html). This requires you to know your data, because you need to choose whether a polynomial is required and, if so, of what order. Let's compare a straight line with polynomials of the second and fifth order. (Interpolate three times and plot all the lines on one graph.)

In [None]:
#DELETE

df_inter_linear = df.interpolate(method="linear")
df_inter_poly_2 = df.interpolate(method="polynomial", order=2)
df_inter_poly_5 = df.interpolate(method="polynomial", order=5)

ax = df_inter_linear.Gas.plot()
df_inter_poly_2.Gas.plot(ax=ax)
df_inter_poly_5.Gas.plot(ax=ax)
ax.legend(["linear", "poly 2", "poly 5"])
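
One detail worth knowing when comparing these lines: `method="linear"` ignores the index and treats the rows as equally spaced, while `method="polynomial"` uses the numerical values of the index (and needs SciPy). For irregularly spaced readings, `method="time"` is the straight-line variant that respects the dates. The sketch below shows the difference on a made-up series (`irregular` and its values are invented).

```python
import numpy as np
import pandas as pd

# Made-up readings: the gap sits 1 day after the first reading
# and 9 days before the last one
irregular = pd.Series(
    [0.0, np.nan, 10.0],
    index=pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-11"]),
)

by_position = irregular.interpolate(method="linear")  # halfway between the rows: 5.0
by_time = irregular.interpolate(method="time")        # 1 day out of 10: 1.0

print(pd.DataFrame({"linear": by_position, "time": by_time}))
```

Whether this matters for the meter data depends on how unevenly the readings are spread, so treat it as something to check rather than a correction.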

Maybe zoom in on the years 2021-2023?

In [None]:
#DELETE

# Boolean mask for 2021 through 2023; all three frames share df's index
zoom = (2021 <= df.index.year) & (df.index.year <= 2023)

ax = df_inter_linear[zoom].Gas.plot()
df_inter_poly_2[zoom].Gas.plot(ax=ax)
df_inter_poly_5[zoom].Gas.plot(ax=ax)
ax.legend(["linear", "poly 2", "poly 5"])

Which is better? Difficult to say. The real problem is that there isn't enough data to distinguish between winter and summer. The gas is only used for heating the house and making hot water, so there should be a difference: we heat the house only in winter, but we use hot water throughout the year. The house is well insulated, so I'm pretty pleased with the linearity of this line: it shows that we don't use a lot of extra gas in winter, meaning the cost of heating is low.

But as we said, there simply isn't enough data to make this distinction. With enough data, we would have been able to see when I was on holiday (no hot water usage).

So let's leave the dataset as a good example of the four methods of filling in blanks in time series.