# Getting started with OpenGrid

The OpenGrid project has the ambition to provide open source algorithms to extract more knowledge and insights from building monitoring data. The algorithms are available in the `opengrid` python module (<a href="https://pypi.python.org/pypi/opengrid/" target="_blank">available on pypi</a>) and we have created a set of jupyter notebooks to illustrate their use. 

In this notebook we show how to install `opengrid`, and then we introduce some general concepts such as loading sample data and making your first plots.

## Installation as user

If you have a version of Python installed (Python 3.3 or greater, or Python 2.7), you can install the module with

    pip install opengrid
    
Please note that on some systems with Python 3.x, you need to replace `pip` with `pip3`.  If you also want to run the notebooks, you will have to install `jupyter notebook` separately because it is not included in the requirements of `opengrid`.  Just do

    pip install jupyter
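
To confirm that the installation worked, you can ask Python whether it can locate the module. The snippet below is a minimal sketch, assuming Python 3.4 or later; the helper name `is_installed` is made up for this illustration and is not part of `opengrid`.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


# "json" is a standard-library module used as a known-good reference;
# "opengrid" is the module installed with pip above.
for name in ("json", "opengrid"):
    status = "found" if is_installed(name) else "missing"
    print("{}: {}".format(name, status))
```

If `opengrid` is reported as missing, the most likely cause is that `pip` installed it for a different Python interpreter than the one running this code.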
    
## Installation as developer

As a developer, you'll want the source code.  The detailed instructions for installation and setup can be found <a href="https://github.com/opengridcc/opengrid/#installation" target="_blank">here</a>. In short: fork the <a href="https://github.com/opengridcc/opengrid/" target="_blank">opengrid repository on github</a> and then clone your fork. Enter your cloned copy of the code and install the requirements with

    pip install -r requirements.txt
    
Also install `jupyter notebook` as indicated above. Finally, add the path to your opengrid clone to your `PYTHONPATH`.  Test your setup by running a unit test as described in the detailed instructions.
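
If you prefer not to change `PYTHONPATH` permanently, a per-session alternative is to extend `sys.path` at the top of a notebook or script. This is only a sketch: the helper `add_to_path` and the directory `~/code/opengrid` are hypothetical, so replace the path with the location of your own clone.

```python
import os
import sys


def add_to_path(directory):
    """Prepend `directory` to sys.path unless it is already listed."""
    directory = os.path.abspath(os.path.expanduser(directory))
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# Hypothetical location of the clone; replace it with your own path.
clone_dir = add_to_path("~/code/opengrid")
print(clone_dir in sys.path)
```

Unlike `PYTHONPATH`, this change only lasts for the current Python session.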

# Imports and module structure

By convention, import opengrid as follows:

In [None]:
import opengrid as og

The structure of the opengrid module is as follows:

    opengrid
    | -datasets
    | -library
      | -analysis
      | -weather
      | -regression
      | -plotting
      | -exceptions
    | -recipes
    | -tests

In `datasets` we have collected some sample data and an easy method to load the data (see below). 
All algorithms are available in the `library`, the core of the module. The most-used methods are also made available directly under opengrid. Type `og.<TAB>` in a notebook cell or python console to see what's available. 
In `recipes` we collect python scripts that can be run as jobs and all tests are in the `tests` folder. 
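
Outside an interactive session, where tab completion is not available, the same overview can be obtained with the built-in `dir()`. The sketch below uses the standard-library module `json` as a stand-in so that it runs without `opengrid`; the helper `public_names` is a hypothetical name for this illustration.

```python
import json


def public_names(module):
    """Return the sorted names in `module` that do not start with an underscore."""
    return sorted(name for name in dir(module) if not name.startswith("_"))


# `json` stands in for any imported module; with opengrid installed you could
# pass `og` after `import opengrid as og`.
print(public_names(json))
```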

The jupyter notebooks (like this one) are not included in the python module, but in the `notebooks` folder, next to the `opengrid` folder that forms the python module. 

To set some **plot style defaults**, we have a simple one-liner that returns `matplotlib.pyplot`. By convention, use it as follows to obtain the commonly used `plt` object:


In [None]:
plt = og.plot_style()

# Load OpenGrid sample data and make simple plots

### List available datasets

We have included a few sample datasets in the opengrid module. They are stored as zipped pandas dataframe pickles and can be listed and loaded automatically with the `opengrid.datasets` module.

In [None]:
og.datasets.list_available()

In [None]:
# load the hourly gas consumption for 2016 and see what's inside
df = og.datasets.get('gas_2016_hour')
df.info()

The data is loaded into a `pandas.DataFrame` object called `df`.  With `df.info()` we can see what's inside: 3 columns of data with 8785 datapoints each.  The index is an hourly `pandas.DatetimeIndex` for 2016 in local time (UTC+1).
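
One way to arrive at 8785 hourly datapoints is an index that covers all of 2016 and also includes the closing timestamp at midnight on 1 January 2017. The sketch below builds such an index with pandas; it is a synthetic index created for illustration (without time zone information), not the index of the sample dataset itself.

```python
import pandas as pd

# Hourly timestamps from the first hour of 2016 up to and including
# midnight on 1 January 2017 (both endpoints are included by default).
index = pd.date_range("2016-01-01", "2017-01-01", freq=pd.Timedelta(hours=1))

# 2016 is a leap year: 366 days * 24 hours = 8784, plus the closing timestamp.
print(len(index))  # 8785
```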

Below, we will use some other pandas functionality for plotting and resampling the data to monthly values.  See the <a href="https://pandas.pydata.org" target="_blank">documentation of pandas</a> for more info.  

### Make a few plots

In [None]:
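# First we use the pandas wrapper for matplotlib plots to generate two different time series plots.
# Note that df.plot() returns matplotlib Axes (one object, or an array with subplots=True), not a Figure.
ax = df.plot()
axes = df.plot(subplots=True)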

In [None]:
# now we resample the data to monthly values and create a bar chart with pandas
df_month = df.resample(rule='MS').sum()
# conversion from Wh to kWh
df_month = df_month/1000 
# create the plot
ax = df_month.plot(kind='bar')
# add the ylabel. The '_ = ' in front of the command is only to hide the output of this notebook cell.
_ = ax.set_ylabel('kWh')
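
To see what the resampling step does, it can help to run it on data where the answer is known in advance. The sketch below uses a synthetic, hypothetical sensor (not the sample dataset) that reports exactly 1000 Wh for every hour of 2016, so each monthly total in kWh equals the number of hours in that month.

```python
import pandas as pd

# Synthetic data (an assumption for illustration): one hypothetical sensor
# that reports exactly 1000 Wh for every hour of 2016.
index = pd.date_range("2016-01-01", periods=366 * 24, freq=pd.Timedelta(hours=1))
hourly = pd.DataFrame({"sensor": 1000.0}, index=index)

# 'MS' groups the rows by calendar month and labels each group with the
# first day of that month; dividing by 1000 converts Wh to kWh.
monthly_kwh = hourly.resample("MS").sum() / 1000

print(monthly_kwh["sensor"].iloc[0])  # 744.0 (January: 31 days * 24 hours)
print(monthly_kwh["sensor"].iloc[1])  # 696.0 (February 2016: 29 days * 24 hours)
print(monthly_kwh["sensor"].sum())    # 8784.0 (kWh for the whole year)
```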

In [None]:
# compute the total consumption of each of the sensors (in kWh/year)
df_month.sum()

That's it for this introduction.  You can look at more notebooks with demos on the <a href="https://opengridcc.github.io" target="_blank">opengrid homepage</a>. 