<a href="https://colab.research.google.com/github/arangodb/interactive_tutorials/blob/master/notebooks/Window.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# WINDOW AQL Function
Aggregate adjacent documents or value ranges with a sliding window to calculate running totals, rolling averages, and other statistical properties.

In [None]:
%%capture
!pip3 install pyarango
!pip3 install "python-arango>=5.0"

In [None]:
%%capture
!git clone https://github.com/arangodb/interactive_tutorials.git -b oasis_connector --single-branch
!rsync -av interactive_tutorials/ ./ --exclude=.git

Here we import the oasis package along with a Python driver. Both pyArango and python-arango are imported, but only one is necessary.

In [None]:
import oasis

from pyArango.connection import *
from arango import ArangoClient

## Connecting

Be sure to update the `tutorialName` variable with your tutorial's name.

In [None]:
# Retrieve temporary credentials from the ArangoDB Tutorial Service

# ** UPDATE THE FOLLOWING VARIABLE  **
tutorialName = "Window"
login = oasis.getTempCredentials(tutorialName=tutorialName, credentialProvider="https://tutorials.arangodb.cloud:8529/_db/_system/tutorialDB/tutorialDB")

# Here is an example of connecting with python-arango
database = oasis.connect_python_arango(login)

# These are the credentials
print("https://"+login["hostname"]+":"+str(login["port"]))
print("Username: " + login["username"])
print("Password: " + login["password"])
print("Database: " + login["dbName"])
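
For readers who want to see what a connection looks like without the helper, the following is a minimal sketch that uses python-arango directly with the same `login` dictionary. The function names `endpoint_url` and `connect_directly` are made up for this illustration, and whether `oasis.connect_python_arango` does exactly the same thing internally is an assumption.

```python
def endpoint_url(login):
    """Build the HTTPS endpoint from the temporary credentials dictionary."""
    return "https://" + login["hostname"] + ":" + str(login["port"])


def connect_directly(login):
    """Open the tutorial database with python-arango, without the oasis helper."""
    # Imported here so that endpoint_url() also works without the driver installed.
    from arango import ArangoClient

    client = ArangoClient(hosts=endpoint_url(login))
    return client.db(
        login["dbName"],
        username=login["username"],
        password=login["password"],
    )


# Placeholder values for illustration only; real values come from oasis.getTempCredentials.
example_login = {"hostname": "example.invalid", "port": 8529}
print(endpoint_url(example_login))
```

With real credentials, `connect_directly(login)` could be used in place of the helper call above.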

# Importing Data
You are free to parse and import your data however you choose, but some simple options for those already familiar with ArangoDB are:
* [arangorestore](https://www.arangodb.com/docs/stable/programs-arangorestore.html)
* [arangoimport](https://www.arangodb.com/docs/stable/programs-arangoimport.html)

If you use any of the tools in the tools folder, you may first need to adjust the folder's permissions.

In [None]:
!chmod -R 755 ./tools/*
!mkdir -p data
!curl -o ./data/sensor_data.csv https://raw.githubusercontent.com/arangodb/interactive_tutorials/master/notebooks/data/2017-07_bme280sof_smaller.csv
# Complete data located here: https://www.kaggle.com/hmavrodiev/sofia-air-quality-dataset?select=2017-07_bme280sof.csv

In [None]:
%%capture
! ./tools/arangoimport -c none --server.endpoint http+ssl://{login["hostname"]}:{login["port"]} --server.username {login["username"]} --server.database {login["dbName"]} --server.password {login["password"]} --file "data/sensor_data.csv" --type "csv" --collection "sensor_data" --create-collection true
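
Since you are free to import the data however you choose, here is a hedged sketch of doing the same import from Python instead of `arangoimport`. It assumes python-arango's `has_collection`, `create_collection`, and `import_bulk` methods. The helper names and the sample rows are made up for illustration and only mimic the column names used in the queries below. Unlike `arangoimport`, `csv.DictReader` returns every field as a string, so numeric fields are converted explicitly.

```python
import csv
import io


def parse_value(text):
    """Convert a CSV field to int or float when possible, else keep it unchanged."""
    for cast in (int, float):
        try:
            return cast(text)
        except (TypeError, ValueError):
            continue
    return text


def csv_to_documents(handle):
    """Turn CSV rows into dictionaries suitable for a document collection."""
    return [
        {key: parse_value(value) for key, value in row.items()}
        for row in csv.DictReader(handle)
    ]


def import_csv(db, path, collection_name):
    """Load a CSV file into a collection, creating the collection if needed."""
    if not db.has_collection(collection_name):
        db.create_collection(collection_name)
    with open(path, newline="") as handle:
        documents = csv_to_documents(handle)
    return db.collection(collection_name).import_bulk(documents)


# Made-up rows, not taken from the real dataset.
sample = io.StringIO(
    "sensor_id,timestamp,temperature\n"
    "1,2017-07-01T00:00:07,23.7\n"
    "2,2017-07-01T00:00:08,24.1\n"
)
for doc in csv_to_documents(sample):
    print(doc)
```

With the connection from earlier, a call such as `import_csv(database, "data/sensor_data.csv", "sensor_data_py")` would load the file. The collection name `sensor_data_py` is hypothetical and chosen so it does not clash with the collection created above.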

# Row-Based Aggregation
* Aggregates over a fixed number of rows following or preceding the current row.
* Setting `preceding` or `following` to `"unbounded"` aggregates over all preceding or all following rows.


In [None]:
aql = database.aql
results = aql.execute(
    """
    FOR t IN sensor_data
      SORT t.timestamp
      WINDOW { preceding: 1, following: 1 }
      AGGREGATE rollingAvg = AVG(t.temperature), rollingSum = SUM(t.temperature)
      WINDOW { preceding: "unbounded", following: 0 }
      AGGREGATE cumulativeSum = SUM(t.temperature)
      LIMIT 10
      RETURN {
        time: t.timestamp,
        temp: t.temperature,
        sensor: t.sensor_id,
        rollingAvg,
        rollingSum,
        cumulativeSum
      }
    """
)
for res in results:
  print(res)
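
To make the row-based semantics concrete, the sketch below reproduces the two windows from the query above in plain Python on a handful of made-up temperatures. It is an illustration, not how ArangoDB implements `WINDOW`. The helper name `row_window` is hypothetical, and the sketch assumes that a window is simply truncated at the start and end of the data.

```python
def row_window(values, preceding, following):
    """Yield, for each position, the values inside the row-based window.

    preceding=None stands in for "unbounded".
    """
    for i in range(len(values)):
        start = 0 if preceding is None else max(0, i - preceding)
        end = min(len(values), i + following + 1)
        yield values[start:end]


# Made-up temperatures, assumed to be already sorted by timestamp.
sample_temps = [20, 22, 24, 26]

# Equivalent of WINDOW { preceding: 1, following: 1 }
rolling_avg = [sum(w) / len(w) for w in row_window(sample_temps, 1, 1)]
rolling_sum = [sum(w) for w in row_window(sample_temps, 1, 1)]

# Equivalent of WINDOW { preceding: "unbounded", following: 0 }
cumulative_sum = [sum(w) for w in row_window(sample_temps, None, 0)]

print(rolling_avg)     # [21.0, 22.0, 24.0, 25.0]
print(rolling_sum)     # [42, 66, 72, 50]
print(cumulative_sum)  # [20, 42, 66, 92]
```

The first and last rows have only one neighbor, so their rolling values cover two rows instead of three.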

# Duration-Based Aggregation
* Aggregates over all documents whose timestamp falls within a time interval relative to the current document.
* Timestamp offsets are given as positive ISO 8601 duration strings, such as `P1Y6M` (1 year and 6 months) or `PT30M` (30 minutes).


In [None]:
results = aql.execute(
    """
    FOR t IN sensor_data
      WINDOW DATE_TIMESTAMP(t.timestamp) WITH { preceding: "PT30M" }
      AGGREGATE rollingAverage = AVG(t.temperature), rollingSum = SUM(t.temperature)
      LIMIT 10
      RETURN {
        time: t.timestamp,
        temperature: t.temperature,
        sensor: t.sensor_id,
        rollingAverage,
        rollingSum
      }
    """
)
times = []
temps = []
rollingAverages = []
for res in results:
  times.append(res['time'])
  temps.append(res['temperature'])
  rollingAverages.append(res['rollingAverage'])
  print(res)
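
The duration-based window can be sketched in plain Python as well. The example below uses made-up readings and a `timedelta` of 30 minutes in place of the `"PT30M"` duration string. The helper name `duration_window` is hypothetical, and the sketch assumes that both ends of the interval are inclusive.

```python
from datetime import datetime, timedelta


def duration_window(rows, preceding):
    """For each (timestamp, value) row, collect the values of all rows whose
    timestamp lies between `timestamp - preceding` and `timestamp`, inclusive."""
    ordered = sorted(rows)  # tuples sort by timestamp first
    for stamp, _ in ordered:
        yield stamp, [v for other, v in ordered if stamp - preceding <= other <= stamp]


# Made-up readings, not taken from the sensor dataset.
readings = [
    (datetime(2017, 7, 1, 0, 0), 20.0),
    (datetime(2017, 7, 1, 0, 20), 22.0),
    (datetime(2017, 7, 1, 0, 40), 24.0),
    (datetime(2017, 7, 1, 1, 30), 30.0),
]

window_averages = []
for stamp, window in duration_window(readings, timedelta(minutes=30)):
    average = sum(window) / len(window)
    window_averages.append(average)
    print(stamp.isoformat(), average, sum(window))

# window_averages is [20.0, 21.0, 23.0, 30.0]
```

The last reading has no other reading in the preceding 30 minutes, so its rolling average equals its own value. This differs from a row-based window, which would always include the previous row.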

In [None]:
import matplotlib.pyplot as plt

# Create figure for plotting
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(times, temps, label="Temperatures")
plt.ylabel('Temperature',fontsize=18)
plt.xlabel('Dates',fontsize=18)
plt.legend(loc="upper left")



# Draw plot
# ax.annotate("Original Temps", (5,5), color='red', size=20)
ax2 = ax.twinx()
ax2.plot(times, rollingAverages, 'b-', label="Rolling")
plt.ylabel('rollingAverage',fontsize=18)
plt.legend(loc="upper right")

# Format plot
# plt.xticks(rotation=45, ha='right')
plt.subplots_adjust(bottom=0.30)
fig.set_size_inches(20, 15)
plt.title('Temperature over Time', fontsize=20)

# Draw the graph
plt.show()


If you would like to share your notebook, simply place it in the `community_notebooks` folder of the interactive_tutorials repository and make a pull request.

Good luck and we are excited to see what you are working on!