KB-74-OPSCHALER

All up-to-date models are found here.

The research paper is found here.
The (final) presentation given at the symposium is found here.

This repository contains code for the KB-74-OPSCHALER project. KB-74 stands for the minor Applied Data Science at The Hague University of Applied Sciences, with Opschaler being the project name. The goal of this project is to predict the energy usage of houses, 1 week ahead with a 10 second resolution. More information about Opschaler can be found at their website.
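To give a feel for the size of the prediction target, the short calculation below counts how many 10 second samples fit in a one week horizon. It is only arithmetic on the numbers stated above; the variable names are illustrative.

```python
# Number of 10-second samples in a one-week forecast horizon.
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604800 seconds
RESOLUTION_SECONDS = 10

samples_per_week = SECONDS_PER_WEEK // RESOLUTION_SECONDS
print(samples_per_week)  # 60480
```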

Personal portfolios

Links to the personal portfolios of the KB-74-OPSCHALER group members are listed below.

About the data

Sensor data from within the dwellings (occupancy, CO2 values, humidity, temperature and more) is also available, but it is not described in this file.

Smart meter data

Parameter	Unit	Sample rate	Description
Timestamp	-	10 s	Timestamp of data telegram (set by smart meter) in local time
eMeter	kWh	10 s	Meter reading electricity delivered to client, normal tariff
eMeterReturn	kWh	10 s	Meter reading electricity delivered by client, normal tariff
eMeterLow	kWh	10 s	Meter reading electricity delivered to client, low tariff
eMeterLowReturn	kWh	10 s	Meter reading electricity delivered by client, low tariff
ePower	kW	10 s	Actual electricity power delivered to client
ePowerReturn	kW	10 s	Actual electricity power delivered by client
gasTimestamp	-	1 h	Timestamp of the gasMeter reading (set by smart meter) in local time
gasMeter	m3	1 h	Last hourly value (temperature converted), gas delivered to client
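The four cumulative electricity registers in the table can be combined into a single net reading. The sketch below assumes that net consumption is simply everything delivered to the client minus everything delivered by the client; the function name and the synthetic numbers are illustrative and do not come from the project code.

```python
import pandas as pd

def net_consumption_kwh(df: pd.DataFrame) -> pd.Series:
    """Cumulative net electricity taken from the grid, in kWh (assumed definition)."""
    delivered = df["eMeter"] + df["eMeterLow"]
    returned = df["eMeterReturn"] + df["eMeterLowReturn"]
    return delivered - returned

# Tiny synthetic example, not real dwelling data.
example = pd.DataFrame(
    {
        "eMeter": [100.0, 100.5, 101.0],
        "eMeterLow": [50.0, 50.0, 50.25],
        "eMeterReturn": [10.0, 10.25, 10.25],
        "eMeterLowReturn": [5.0, 5.0, 5.0],
    }
)
net = net_consumption_kwh(example)
print(net.tolist())  # [135.0, 135.25, 136.0]
```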

Weather data

This is weather data from the KNMI weather station in Rotterdam with a sample rate of 10 minutes.
According to a representative from OPSCHALER, this is the weather station closest to all the dwellings. The exact dwelling locations are unknown, however.
They are probably within a 25 km radius of this weather station.

Parameter	Unit	Description
DD	degrees	Wind direction
DR	s	Precipitation time
FX	m/s	Maximum gust of wind at 10 m
FF	m/s	Windspeed at 10 m
N	okta	Cloud coverage
P	hPa	Outside pressure
Q	W/m2	Global radiation
RG	mm/h	Rain intensity
SQ	min	Sunshine duration
T	deg C	Temperature at 1.5 m (1 minute mean)
T10	deg C	Minimum temperature at 10 cm
TD	deg C	Dew point temperature
U	%	Relative humidity at 1.5 m
VV	m	Horizontal visibility
WW	-	Weather- and station-code

----- Notes for the group members are listed below -----

All (sub)chapters below are meant for the KB74-Opschaler group members.

Setting up GitHub on JupyterHub

Log in to JupyterHub on the datascience server.
In the top right press 'New -> Terminal'. An SSH terminal should pop up in a new window.
Next, follow this tutorial: link.
When you have done this, add the SSH key to your GitHub account: link. Note that step 1 will not work because 'clip' is not recognized. Work around this by using FileZilla to browse to ~/.ssh/id_rsa.pub (where ~ is your home folder) and download the file. Then open the file with a text editor, copy the contents and continue with the tutorial.
Test your connection: link.
You are now ready to clone repositories.

Basic SSH commands

ls Lists directory contents
cd directory_name Moves into directory_name
cd .. Moves up one directory, to the parent folder
cp source destination Copies a file to destination (use cp -r to copy a directory)
Press tab to finish a word automatically.
Note that ~ represents your home folder. More info on Linux commands: link

Cloning the KB-74-OPSCHALER repository

Once GitHub has been set up correctly, you can clone this repository by pressing the green Clone or download button and copying the link (https://github.com/deKeijzer/KB-74-OPSCHALER.git).
In the Jupyter terminal window you should see the line studentnumber@datascience:~$. Move to the 'notebooks' folder by typing cd notebooks. The directory you are in now should be ~/notebooks.
While in here, type git clone <the link you copied from this repository>.
Once this is done, move to the 'KB-74-OPSCHALER' folder by typing cd KB-74-OPSCHALER.
Once in here, type git status. This gives you additional information and shows that you have cloned successfully.

Git push & pull

Before you start working on code in Jupyter, make sure that you have the latest version of this repository by typing git pull. Once you have written some code and want to upload it to this repository, do so as follows.

git add . (this will select all files)
git commit -m 'commit message, for example the changes that you made to the code'
git push

More push & pull information can be found in this notebook.

Important data locations

Below is a list of the most important data locations for the Opschaler project. Make sure to not modify or add any files in the folders listed below. Some notebooks have been programmed in such a way that they expect all files in a folder to have a certain file structure. For example: in the smartmeter_data folder the only files in there should be smartmeter files in the format dwelling_id.csv. Any other file in there will crash the notebook which uses this folder to process the files.

Only read files, do not write to them.
Use the Processed dwelling_id dataframes files for EDA.
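Because a stray file in a data folder can crash a notebook that processes every file it finds, a notebook can defensively select only the files whose names match the expected pattern. The sketch below is one way to do that; the regular expression is a guess based on the example id 'P01S01W0373' and the export_ prefix mentioned for the raw files, and the helper name is hypothetical.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming pattern, modelled on the example id 'P01S01W0373'.
DWELLING_FILE = re.compile(r"^(?:export_)?(P\d+S\d+W\d+)\.csv$")

def list_dwelling_files(folder):
    """Return {dwelling_id: path} for files that match the expected name."""
    matches = {}
    for path in sorted(Path(folder).iterdir()):
        m = DWELLING_FILE.match(path.name)
        if path.is_file() and m:
            matches[m.group(1)] = path
    return matches

# Demonstration on a temporary folder, so nothing under /datc is touched.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["P01S01W0373.csv", "export_P01S01W0001.csv", "notes.txt"]:
        Path(tmp, name).touch()
    found = sorted(list_dwelling_files(tmp))

print(found)  # ['P01S01W0001', 'P01S01W0373']
```

Files such as notes.txt are skipped instead of being passed to the CSV reader.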

KNMI

The KNMI data consists of two datasets. One is the raw data, exactly as provided by the KNMI. The other is the processed dataframe, which has been cleaned and prepared so that it can be used for EDA.

KNMI Raw data

Location: /datc/opschaler/weather_data/knmi_10_min_raw_data
This is the raw 10 minute interval data from 2015 till 2018 as provided by the KNMI (by mail).

KNMI preprocessed dataframe

Location: /datc/opschaler/weather_data/weather.csv
The KNMI dataframe (1.82 GB) contains weather data from 2015 to 2018, with a 10 minute resolution. More information can be found in this notebook.
Reading in the data is done as follows:

import pandas as pd

# Tab-separated file; lines starting with '#' are treated as comments.
weather = pd.read_csv('/datc/opschaler/weather_data/weather.csv', delimiter='\t', comment='#', parse_dates=['datetime'])
weather = weather.set_index(['datetime'])
weather.head()

Smartmeter data (from the TU Delft server)

This is the smartmeter data as downloaded from the TU Delft server.

Raw smartmeter data (from the TU Delft server)

Location: /datc/opschaler/smartmeter_data
These are the raw smartmeter dataframes from the TU Delft server.
They should be in the format export_dwelling_id.csv.
These files contain the raw electricity and raw gas data.

Preprocessed dwelling_id dataframes

Location: /datc/opschaler/combined_gas_smart_weather_dfs/unprocessed
The smartmeter, gasmeter and weather dataframes merged into one dataframe.
Files ending in _hour have a one hour sample rate, files ending in _10s have a 10 second sample rate.
NaNs have not been removed. The following steps have been applied, in order.
For _hour files:

1. gasPower calculated by using .diff() on the gas column.
2. Smartmeter and weather data downsampled to 1 hour, using the mean.
3. Smartmeter, gas and weather data merged.
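The three _hour steps can be sketched with pandas as shown below. This is an illustration on synthetic data, not the project's notebook: the function name, the choice of an outer join on the datetime index and the sample values are all assumptions.

```python
import numpy as np
import pandas as pd

def build_hourly(smart, gas, weather):
    """Sketch of the _hour pipeline; all inputs are indexed by datetime."""
    gas = gas.copy()
    gas["gasPower"] = gas["gasMeter"].diff()      # step 1: hourly gas usage
    smart_h = smart.resample("1h").mean()         # step 2: downsample with mean
    weather_h = weather.resample("1h").mean()
    # step 3: merge on the datetime index (outer join is an assumption)
    return pd.concat([smart_h, gas, weather_h], axis=1)

# Two hours of synthetic data at the documented sample rates.
smart = pd.DataFrame(
    {"ePower": np.arange(720, dtype=float)},
    index=pd.date_range("2018-01-01", periods=720, freq="10s"),
)
gas = pd.DataFrame(
    {"gasMeter": [1000.0, 1000.5]},
    index=pd.date_range("2018-01-01", periods=2, freq="1h"),
)
weather = pd.DataFrame(
    {"T": np.arange(12, dtype=float)},
    index=pd.date_range("2018-01-01", periods=12, freq="10min"),
)

hourly = build_hourly(smart, gas, weather)
print(hourly)
```

The first gasPower value is NaN because .diff() has no earlier reading to subtract, which is one reason the unprocessed files still contain NaNs.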

For _10s files:

1. Gas upsampled to 10 s by using forward fill (.ffill()).
2. gasPower calculated by using .diff() on the gas column.
3. Weather upsampled to 10 s by using forward fill.
4. Smartmeter, gas and weather data merged.
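A comparable sketch for the _10s steps is given below, again on synthetic data and with an assumed function name and join. It also shows a consequence of the step order: because gas is forward filled before .diff() is applied, gasPower is zero everywhere except at the rows where a new hourly reading arrives.

```python
import numpy as np
import pandas as pd

def build_10s(smart, gas, weather):
    """Sketch of the _10s pipeline; all inputs are indexed by datetime."""
    gas_10s = gas.resample("10s").ffill()               # step 1: upsample gas
    gas_10s["gasPower"] = gas_10s["gasMeter"].diff()    # step 2: diff on gas
    weather_10s = weather.resample("10s").ffill()       # step 3: upsample weather
    # step 4: merge on the datetime index (outer join is an assumption)
    return pd.concat([smart, gas_10s, weather_10s], axis=1)

# Synthetic data covering 00:00:00 up to and including 02:00:00.
smart = pd.DataFrame(
    {"ePower": np.ones(721)},
    index=pd.date_range("2018-01-01", periods=721, freq="10s"),
)
gas = pd.DataFrame(
    {"gasMeter": [1000.0, 1000.5, 1001.5]},
    index=pd.date_range("2018-01-01", periods=3, freq="1h"),
)
weather = pd.DataFrame(
    {"T": np.arange(13, dtype=float)},
    index=pd.date_range("2018-01-01", periods=13, freq="10min"),
)

df_10s = build_10s(smart, gas, weather)
nonzero_gas_rows = int((df_10s["gasPower"] > 0).sum())
print(len(df_10s), nonzero_gas_rows)
```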

Processed dwelling_id dataframes (Use these for analysis)

Location: /datc/opschaler/combined_gas_smart_weather_dfs/processed
The smartmeter, gasmeter and weather dataframes merged into one dataframe. Rows belonging to a NaN streak that is longer than the accepted threshold have been dropped. NaNs in the weather data have been forward filled. NaNs in 'eMeter', 'eMeterReturn', 'eMeterLowReturn' and 'gasMeter' have been interpolated. ePower, ePowerReturn and gasPower might still contain NaNs; drop these after reading in the files if required. More information can be found here.

import pandas as pd

data_dir = '/datc/opschaler/combined_gas_smart_weather_dfs/processed/'
dwelling_id = 'P01S01W0373'  # example dwelling
df = pd.read_csv(data_dir + dwelling_id + '.csv', delimiter='\t', parse_dates=['datetime'])
df = df.set_index(['datetime'])
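Since ePower, ePowerReturn and gasPower may still contain NaNs in the processed files, a small helper like the one below can drop the affected rows after loading. It is a minimal sketch on a synthetic dataframe; the helper name is hypothetical and only the three column names come from the description above.

```python
import numpy as np
import pandas as pd

POWER_COLUMNS = ["ePower", "ePowerReturn", "gasPower"]

def drop_power_nans(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows that have a NaN in any of the power columns."""
    return df.dropna(subset=POWER_COLUMNS)

# Synthetic stand-in for a processed dwelling dataframe.
example = pd.DataFrame(
    {
        "ePower": [0.3, np.nan, 0.5, 0.4],
        "ePowerReturn": [0.0, 0.0, 0.0, np.nan],
        "gasPower": [0.0, 0.1, 0.0, 0.2],
        "T": [5.0, 5.1, np.nan, 5.3],
    }
)
clean = drop_power_nans(example)
print(clean.index.tolist())  # [0, 2]
```

Only the power columns are checked, so a row with a NaN in another column (T in row 2 here) is kept.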

Honeywell sensor data

Location: /datc/opschaler/honeywell_sensors_per_dwelling_combined/honeywell_all_dwellings_combined.csv
Processed Honeywell sensor data.
All sensor data in one dataframe with dwelling labels.
Note that the serial data in this file has not yet been converted to the room labels. The serial to room datafile honeywell_serial_to_room.xlsx can be found in the same folder.

NaN information of unprocessed dataframes

Location: /datc/opschaler/nan_information
This folder contains dwelling_id_threshold_percentage.csv files together with corresponding plots, which give in-depth knowledge about the NaNs in all used data. The notebook in which dwelling_id_threshold_percentage.csv is created can be found here.
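To illustrate what a NaN streak is, the sketch below measures the longest run of consecutive NaNs in a column. It is not the code from the notebook mentioned above, and how the project derives its threshold percentages is not specified here; the function name and sample values are assumptions.

```python
import numpy as np
import pandas as pd

def longest_nan_streak(series: pd.Series) -> int:
    """Length of the longest run of consecutive NaNs in a series."""
    is_nan = series.isna()
    # A new run starts wherever the NaN flag changes value.
    run_id = is_nan.astype(int).diff().ne(0).cumsum()
    # Summing the flags per run gives the run length for NaN runs, 0 otherwise.
    streaks = is_nan.groupby(run_id).sum()
    return int(streaks.max()) if len(streaks) else 0

values = pd.Series([1.0, np.nan, np.nan, 3.0, np.nan, np.nan, np.nan, 5.0])
print(longest_nan_streak(values))  # 3
```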

EDA results locations

Location: /datc/opschaler/EDA/
The EDA results, saved per dwelling.
For example, correlation coefficient matrices are saved in /datc/opschaler/EDA/correlation_matrices
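A correlation coefficient matrix of the kind mentioned above can be computed with DataFrame.corr(). The sketch below assumes Pearson correlation, which is the pandas default; which method the project notebooks actually use is not stated here, and the data is synthetic.

```python
import pandas as pd

# Synthetic example: gas usage falls as the outside temperature rises.
example = pd.DataFrame(
    {
        "T": [0.0, 5.0, 10.0, 15.0],
        "gasPower": [3.0, 2.0, 1.0, 0.0],
    }
)
corr = example.corr()  # Pearson correlation by default
print(corr)
```

In this constructed case T and gasPower are perfectly anti-correlated, so the off-diagonal entries are -1 up to floating point rounding.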

Useful terminal commands

In Linux:

top to see CPU & RAM usage.
nvidia-smi -l 1 to see GPU usage and refresh this information every second.

On Windows:

To use nvidia-smi, first move to its folder:

cd C:\Program Files\NVIDIA Corporation\NVSMI

Then run nvidia-smi with:

.\nvidia-smi -l 1

To see the CPU usage:

wmic cpu get loadpercentage
