# Demo caching

This notebook shows how the caching of daily results is organised. We first walk through the low-level approach and then use a high-level function that wraps it.

## Low-level approach

In [1]:
import pandas as pd
from opengrid.library import misc
from opengrid.library import houseprint
from opengrid.library import caching
from opengrid.library import analysis  # needed for DailyAgg further down
import charts
hp = houseprint.Houseprint()

Server running in the folder /Users/Jan/opengrid/notebooks/Demo at 127.0.0.1:60033
Opening connection to Houseprint sheet
Opening spreadsheets
Parsing spreadsheets
24 Sites created
24 Devices created
75 sensors created
Houseprint parsing complete


We demonstrate the caching with the minimal daily water consumption (which should be close to zero unless there is a water leak). We create a cache object by specifying which variable we want to store and retrieve through it. The cached data is saved as a single csv file per sensor, in a folder specified in opengrid.cfg. Add the path to the folder where you want these csv files to be stored to your opengrid.cfg as follows:

    [data]
    folder: path_to_folder
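
As a quick sanity check, the snippet below parses such a `[data]` section with Python's standard `configparser`. It is only a sketch: it assumes opengrid.cfg is a regular INI-style file, and it does not show how opengrid itself reads the file. The config text is embedded as a string, and `path_to_folder` is the same placeholder as above.

```python
import configparser

# Placeholder content; replace path_to_folder with a real directory.
CONFIG_TEXT = """
[data]
folder: path_to_folder
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# configparser accepts both "key: value" and "key = value".
cache_folder = parser.get("data", "folder")
print(cache_folder)
```

If the section or the key is missing, `parser.get` raises an error, which is a quick way to spot a typo in the file.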

In [None]:
cache_water = caching.Cache(variable='water_daily_min')
df_cache = cache_water.get(sensors=hp.get_sensors(sensortype='water'))
charts.plot(df_cache.iloc[-8:], stock=True, show='inline')  # last 8 rows; .ix was removed from pandas

If this is the first time you run this demo, no cached data will be found, and you get an empty graph. 

Let's store some results in this cache.  We start from the water consumption of last week.

In [None]:
hp.sync_tmpos()

In [None]:
start = pd.Timestamp('now') - pd.Timedelta(weeks=1)
df_water = hp.get_data(sensortype='water', head=start)
df_water.info()

We use the *DailyAgg* class from the analysis module, with *agg='min'*, to obtain a dataframe with the daily minimum for each sensor.

In [None]:
daily_min = analysis.DailyAgg(df_water, agg='min').result
daily_min.info()
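
To make the aggregation concrete, here is a minimal sketch of a daily minimum in plain Python, without opengrid or pandas. The readings are made up and the helper `daily_min_per_day` is hypothetical; it illustrates the idea and is not how `DailyAgg` is implemented.

```python
from collections import defaultdict
from datetime import datetime

# Made-up readings for one hypothetical sensor: (timestamp, litres per minute).
readings = [
    (datetime(2016, 10, 24, 3, 0), 0.0),
    (datetime(2016, 10, 24, 8, 0), 4.5),
    (datetime(2016, 10, 25, 3, 0), 0.2),
    (datetime(2016, 10, 25, 19, 0), 6.0),
]


def daily_min_per_day(samples):
    """Return {date: minimum value observed on that date}."""
    per_day = defaultdict(list)
    for timestamp, value in samples:
        per_day[timestamp.date()].append(value)
    return {day: min(values) for day, values in sorted(per_day.items())}


result = daily_min_per_day(readings)
print(result)
```

In this toy data the second day never drops to zero, which is the kind of pattern that could hint at a leak.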

In [None]:
daily_min

In [None]:
cache_water.update(daily_min)

Now we can get the daily water minima from the cache directly. Pass a *start* or *end* date to limit the returned dataframe.

In [None]:
sensors = hp.get_sensors(sensortype='water') # sensor objects
charts.plot(cache_water.get(sensors=sensors, start=start, end=None), show='inline', stock=True)
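
The get/update cycle above can be imitated with a toy cache that keeps one csv file per sensor. This is a sketch under stated assumptions, not opengrid's `Cache` class: the class `ToyCache`, the file naming scheme and the rule that new values overwrite old ones for the same day are all hypothetical.

```python
import csv
import os
import tempfile
from datetime import date


class ToyCache:
    """Hypothetical stand-in for a per-sensor csv cache (not opengrid's code)."""

    def __init__(self, folder, variable):
        self.folder = folder
        self.variable = variable

    def _path(self, sensor_key):
        # Assumed naming scheme: <variable>_<sensor key>.csv
        return os.path.join(self.folder, "{}_{}.csv".format(self.variable, sensor_key))

    def get(self, sensor_key, start=None, end=None):
        """Return {date: value} for one sensor, optionally limited to [start, end]."""
        path = self._path(sensor_key)
        if not os.path.exists(path):
            return {}  # nothing cached yet
        with open(path, newline="") as handle:
            rows = {date.fromisoformat(day): float(value)
                    for day, value in csv.reader(handle)}
        return {day: value for day, value in rows.items()
                if (start is None or day >= start) and (end is None or day <= end)}

    def update(self, sensor_key, new_values):
        """Merge new daily values into the file; new values win on conflicts."""
        merged = self.get(sensor_key)
        merged.update(new_values)
        with open(self._path(sensor_key), "w", newline="") as handle:
            csv.writer(handle).writerows(
                (day.isoformat(), value) for day, value in sorted(merged.items()))


cache = ToyCache(tempfile.mkdtemp(), "water_daily_min")
empty = cache.get("sensor_a")  # first run: no file yet
cache.update("sensor_a", {date(2016, 10, 24): 0.0, date(2016, 10, 25): 0.2})
cache.update("sensor_a", {date(2016, 10, 25): 0.1, date(2016, 10, 26): 0.0})
recent = cache.get("sensor_a", start=date(2016, 10, 25))
print(empty, recent)
```

As in the demo, the first `get` returns nothing, and passing `start` limits what comes back.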

## A high-level cache function

The caching of daily results works much the same way for all kinds of results. A high-level function, `caching.cache_results`, can therefore be parametrised to cache many different results.

In [None]:
import pandas as pd
from opengrid.library import misc
from opengrid.library import houseprint
from opengrid.library import caching
from opengrid.library import analysis
import charts

In [None]:
hp = houseprint.Houseprint()
#hp.sync_tmpos()

In [None]:
sensors = hp.get_sensors(sensortype='water')
caching.cache_results(hp=hp, sensors=sensors, resultname='water_daily_min', AnalysisClass=analysis.DailyAgg, agg='min')

In [None]:
cache = caching.Cache('water_daily_min')
daily_min = cache.get(sensors=sensors, start='20151201')
charts.plot(daily_min, stock=True, show='inline')

## Caching Daily Weather Data

You only get 1000 free requests per day from ForecastIO (DarkSky), so it makes sense to cache the results.

In [7]:
from opengrid.library.forecastwrapper import Weather
from opengrid.config import Config
from collections import namedtuple

In [3]:
c = Config()
api_key = c.get('Forecast.io', 'apikey')

First, define your request: the start and end dates and the location.

In [4]:
end = pd.Timestamp.utcnow()
start = end - pd.Timedelta(days=5)
location = 'Ukkel'

We create a cache object and ask whether it already has some data for us.

In [5]:
cache = caching.Cache('weather_ukkel')

Cache object created for variable: weather_ukkel


The `get` method expects a list of sensor objects. Each sensor needs a `key` attribute. We can mimic this behaviour with a namedtuple:

In [10]:
WeatherSensor = namedtuple('WeatherSensor', ['key'])

In [11]:
temperature_sensor = WeatherSensor(key='temperature')
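
The trick works because of duck typing: code that only reads `sensor.key` does not care what type the sensor is. The sketch below illustrates this with a hypothetical consumer function, `cache_filenames`, and an assumed file naming scheme; neither is taken from opengrid.

```python
from collections import namedtuple

WeatherSensor = namedtuple('WeatherSensor', ['key'])


def cache_filenames(variable, sensors):
    """Hypothetical consumer: it only ever reads the `key` attribute."""
    return ["{}_{}.csv".format(variable, sensor.key) for sensor in sensors]


names = cache_filenames('weather_ukkel', [WeatherSensor(key='temperature')])
print(names)
```

Any other object with a `key` attribute, such as a real sensor object, could be passed in the same list.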

In [16]:
data = cache.get(sensors=[temperature_sensor], start=start, end=end)

In [17]:
data

Let's extract the dates that are already in the frame, so we know what we still need to fetch.

In [32]:
cached_dates = {i.date() for i in data.index}
requested_dates = set(misc.dayset(start, end))
dates_to_fetch = requested_dates - cached_dates

In [33]:
dates_to_fetch

{datetime.date(2016, 10, 24),
 datetime.date(2016, 10, 25),
 datetime.date(2016, 10, 26),
 datetime.date(2016, 10, 27),
 datetime.date(2016, 10, 28),
 datetime.date(2016, 10, 29)}
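
The same set difference can be tried in plain Python. This is a minimal sketch: `days_between` is a hypothetical helper standing in for `misc.dayset` (whose exact behaviour is not shown here), and the cache contents are made up.

```python
from datetime import date, timedelta


def days_between(first, last):
    """Hypothetical helper: every calendar day from first to last, inclusive."""
    return [first + timedelta(days=n) for n in range((last - first).days + 1)]


requested = set(days_between(date(2016, 10, 24), date(2016, 10, 29)))
cached = {date(2016, 10, 24), date(2016, 10, 25)}  # made-up cache contents

# Only the days missing from the cache need a new API request.
to_fetch = sorted(requested - cached)
print(to_fetch)
```

With two of the six days already cached, only four requests would count against the daily quota.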

In [15]:
len(data)

0

In [34]:
misc.dayset(None, end)

TypeError: can't compare offset-naive and offset-aware datetimes
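
Python raises this error whenever a timezone-aware datetime is compared with a naive one. Here `end` comes from `pd.Timestamp.utcnow()`, which is timezone-aware. A plausible cause, not verified against the opengrid source, is that `dayset` substitutes a naive default when `start` is `None`. The sketch below reproduces the error with the standard library and shows one way around it.

```python
from datetime import datetime, timezone

aware = datetime(2016, 10, 29, 12, 0, tzinfo=timezone.utc)
naive = datetime(2016, 10, 24, 12, 0)

try:
    naive < aware
    comparison_failed = False
except TypeError as error:
    comparison_failed = True
    message = str(error)

# Giving both sides the same timezone makes them comparable.
fixed = naive.replace(tzinfo=timezone.utc)
is_earlier = fixed < aware
print(comparison_failed, is_earlier)
```

In the notebook, passing an explicit `start` that carries the same timezone as `end`, as in the earlier cells, avoids the mixed comparison.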