# Time Series Database with InfluxDB

#### Links
- Read the Docs: https://influxdb-python.readthedocs.io/en/latest/api-documentation.html
- PyPI: https://pypi.org/project/influxdb/
- GitHub: https://github.com/influxdata/influxdb-python

This Notebook is for InfluxDB 1.8.
November 2021.

## Contents
0. Install packages
1. Some basic scripts with InfluxDBClient
2. Some basic scripts with DataFrameClient

## 0. Install packages

In [4]:
%pip install influxdb

Collecting influxdb
  Downloading influxdb-5.3.1-py2.py3-none-any.whl (77 kB)
Installing collected packages: influxdb
Successfully installed influxdb-5.3.1
Note: you may need to restart the kernel to use updated packages.


In [7]:
%pip install influxdb-client

Collecting influxdb-client
  Downloading influxdb_client-1.23.0-py3-none-any.whl (522 kB)
Collecting rx>=3.0.1
  Downloading Rx-3.2.0-py3-none-any.whl (199 kB)
Installing collected packages: rx, influxdb-client
Successfully installed influxdb-client-1.23.0 rx-3.2.0


## 1. Some basic scripts with InfluxDBClient

In [1]:
import notebooks_config
red_rpi_ip = notebooks_config.red_rpi_ip
print(red_rpi_ip)

192.168.178.50


In [2]:
# Check the connection to InfluxDB on the Raspberry Pi
from influxdb import InfluxDBClient
client = InfluxDBClient(host=red_rpi_ip, port=8086)  # Raspberry Pi IP on the local network
pong = client.ping()  # returns the server version string
print(pong)

1.8.10


### 1.1 Getting, creating and deleting users

In [3]:
# Get a list of users (reusing the address from notebooks_config instead of a hard-coded IP)
from influxdb import InfluxDBClient
client = InfluxDBClient(host=red_rpi_ip, port=8086)
users = client.get_list_users()
print(users)

[]


In [4]:
# create a new user called Henk
client.create_user('Henk', 'henkspw', admin=False)
users = client.get_list_users()
print(users)

[{'user': 'Henk', 'admin': False}]


In [5]:
# drop Henk again
client.drop_user('Henk')
users = client.get_list_users()
print(users)

[]


### 1.2 Getting, creating and deleting databases

In [6]:
# List all databases on the server
dbs = client.get_list_database()
print(dbs)

[{'name': '_internal'}, {'name': 'SNORING'}]


In [26]:
#create a new database and check it 
client.create_database('example2')
dbs = client.get_list_database()
print(dbs)

[{'name': '_internal'}, {'name': 'SNORING2'}, {'name': 'SNORING'}, {'name': 'example'}, {'name': 'example2'}]


In [32]:
# delete a database and check it
client.drop_database('example2')
dbs = client.get_list_database()
print(dbs)

[{'name': '_internal'}, {'name': 'SNORING2'}, {'name': 'SNORING'}]


In [8]:
# switch to database
client.switch_database('SNORING')

### 1.3 Get measurements and series

In [9]:
# After switching to a database, you can list its measurements
msmts = client.get_list_measurements()
print(msmts)

[{'name': 'my_snoring'}]


In [10]:
series = client.get_list_series(database=None, measurement=None, tags=None)
series

['my_snoring']

### 1.4 Queries

In [17]:
my_query = client.query("SELECT snoring FROM my_snoring WHERE time > now() - 12h")  # limited to 12 hours: not enough memory for a full night
print(my_query)

ResultSet({'('my_snoring', None)': [{'time': '2022-01-11T20:30:16.545278Z', 'snoring': 0.06364712864160538}, {'time': '2022-01-11T20:30:17.089403Z', 'snoring': 0.06481613218784332}, {'time': '2022-01-11T20:30:17.632948Z', 'snoring': 0.06085388362407684}, {'time': '2022-01-11T20:30:18.174472Z', 'snoring': 0.07503601908683777}, {'time': '2022-01-11T20:30:18.718526Z', 'snoring': 0.09147290885448456}, {'time': '2022-01-11T20:30:19.263252Z', 'snoring': 0.08799783140420914}, {'time': '2022-01-11T20:30:19.807354Z', 'snoring': 0.1255640834569931}, {'time': '2022-01-11T20:30:20.351558Z', 'snoring': 0.07267340272665024}, {'time': '2022-01-11T20:30:20.894120Z', 'snoring': 0.07314488291740417}, {'time': '2022-01-11T20:30:21.437782Z', 'snoring': 0.07205292582511902}, {'time': '2022-01-11T20:30:21.982186Z', 'snoring': 0.07616283744573593}, {'time': '2022-01-11T20:30:22.525461Z', 'snoring': 0.10176752507686615}, {'time': '2022-01-11T20:30:23.070807Z', 'snoring': 0.06448861211538315}, {'time': '2022-0

In [13]:
type(my_query)

influxdb.resultset.ResultSet
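
A `ResultSet` can be iterated with `get_points()`, which yields one dictionary per row in the shape shown in the printed output above. The following is a minimal sketch of post-processing such rows with the standard library only. The `points` list is copied from the first rows of that output and stands in for `list(my_query.get_points())`; the helper names are hypothetical, and the parser assumes the microsecond-precision timestamps printed above.

```python
from datetime import datetime, timezone
from statistics import mean

# Stand-in for list(my_query.get_points()); rows copied from the output above.
points = [
    {'time': '2022-01-11T20:30:16.545278Z', 'snoring': 0.06364712864160538},
    {'time': '2022-01-11T20:30:17.089403Z', 'snoring': 0.06481613218784332},
    {'time': '2022-01-11T20:30:17.632948Z', 'snoring': 0.06085388362407684},
    {'time': '2022-01-11T20:30:18.174472Z', 'snoring': 0.07503601908683777},
]

def parse_influx_time(stamp):
    """Parse a UTC timestamp with microseconds, as printed in the output above."""
    parsed = datetime.strptime(stamp, '%Y-%m-%dT%H:%M:%S.%fZ')
    return parsed.replace(tzinfo=timezone.utc)

def mean_sample_interval(rows):
    """Average spacing between consecutive rows, in seconds."""
    times = [parse_influx_time(row['time']) for row in rows]
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
    return mean(gaps)

interval = mean_sample_interval(points)
average_snoring = mean(row['snoring'] for row in points)
print(f"mean interval: {interval:.3f} s, mean snoring: {average_snoring:.4f}")
```

For these four rows the sensor reports roughly one value every 0.54 seconds, which helps explain why a full night of data is large.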

### 1.5 Retention policies

In [18]:
# Create a retention policy that keeps data for 4 weeks
client.create_retention_policy('my_retention_policy', '4w', replication=1, database='SNORING', default=False)

In [19]:
#get a list of retention policies
client.get_list_retention_policies(database='SNORING')

[{'name': 'autogen',
  'duration': '0s',
  'shardGroupDuration': '168h0m0s',
  'replicaN': 1,
  'default': True},
 {'name': 'my_retention_policy',
  'duration': '672h0m0s',
  'shardGroupDuration': '24h0m0s',
  'replicaN': 1,
  'default': False}]
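
The policy was created with a duration of `4w`, and InfluxDB reports it back as `672h0m0s`; likewise, the `168h0m0s` shard group duration of `autogen` is exactly one week. A small sketch of that arithmetic follows. `duration_to_hours` is a hypothetical helper that only handles single-unit literals ending in `h`, `d` or `w`.

```python
# Hours per unit for the single-unit duration literals used in this notebook.
HOURS_PER_UNIT = {'h': 1, 'd': 24, 'w': 7 * 24}

def duration_to_hours(literal):
    """Convert a literal such as '4w' or '1d' to whole hours (hypothetical helper)."""
    value, unit = int(literal[:-1]), literal[-1]
    return value * HOURS_PER_UNIT[unit]

print(duration_to_hours('4w'))    # 4 * 7 * 24 = 672, matching '672h0m0s'
print(duration_to_hours('168h'))  # 168 hours, i.e. one week
```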

In [9]:
# drop retention policy
client.drop_retention_policy('my_retention_policy', database='SNORING')

In [20]:
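# Check by getting the list again.
# Judging by the cell counters, this cell (In [20]) ran after the create cell (In [18]),
# while the drop cell is In [9], so the output below still lists my_retention_policy.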
client.get_list_retention_policies(database='SNORING')

[{'name': 'autogen',
  'duration': '0s',
  'shardGroupDuration': '168h0m0s',
  'replicaN': 1,
  'default': True},
 {'name': 'my_retention_policy',
  'duration': '672h0m0s',
  'shardGroupDuration': '24h0m0s',
  'replicaN': 1,
  'default': False}]

## 2. Some basic scripts with DataFrameClient
Documentation: https://influxdb-python.readthedocs.io/en/latest/api-documentation.html#dataframeclient

In [1]:
import notebooks_config
red_rpi_ip = notebooks_config.red_rpi_ip
print(red_rpi_ip)

192.168.178.50


In [2]:
# Instantiate a client and bind it to the Raspberry Pi and the database
import pandas as pd
from influxdb import DataFrameClient
mydf_client = DataFrameClient(host=red_rpi_ip, port=8086, database='SNORING')  # Raspberry Pi IP on the local network
pang = mydf_client.ping()
print(pang)

ConnectionError: HTTPConnectionPool(host='192.168.178.50', port=8086): Max retries exceeded with url: /ping (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x0000016CAF49CE80>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'))
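
The `ConnectionError` above means the notebook could not open a TCP connection to port 8086 on the Raspberry Pi. A minimal sketch of a pre-flight check with the standard library follows, so that an unreachable host surfaces quickly and with a clearer message. `is_port_open` is a hypothetical helper, and the 3-second timeout is an arbitrary assumption.

```python
import socket

def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

# Example with the address used in this notebook:
# if not is_port_open('192.168.178.50', 8086):
#     print('InfluxDB is not reachable; check the network and the influxdb service')
```

This only confirms that something is listening on the port; `ping()` is still needed to confirm that it is InfluxDB.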

In [26]:
my_df = mydf_client.query('SELECT * FROM my_snoring WHERE time > now() - 36h')
my_df

defaultdict(list,
            {'my_snoring':                                         sheets   silence   snoring  \
             2022-01-10 20:35:08.377425+00:00  1.651487e-03  0.934139  0.064199   
             2022-01-10 20:35:08.926313+00:00  1.482377e-03  0.936084  0.062425   
             2022-01-10 20:35:09.481355+00:00  1.522833e-03  0.932783  0.065684   
             2022-01-10 20:35:10.005872+00:00  1.588519e-03  0.935053  0.063349   
             2022-01-10 20:35:10.549029+00:00  1.626174e-03  0.933614  0.064749   
             ...                                        ...       ...       ...   
             2022-01-12 01:00:25.762287+00:00  2.746958e-08  0.074458  0.925533   
             2022-01-12 01:00:26.294059+00:00  6.762272e-06  0.844603  0.155386   
             2022-01-12 01:00:26.844680+00:00  1.053335e-07  0.126094  0.873903   
             2022-01-12 01:00:27.393015+00:00  9.624481e-08  0.126924  0.873065   
             2022-01-12 01:00:27.938871+00:00  2.884231

In [33]:
my_df2 = mydf_client.query("SELECT * FROM my_snoring WHERE time >= '2021-12-23 22:58:38' AND time <='2021-12-24 07:42:38'")
my_df2

{}
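
A query that matches no rows comes back as an empty dictionary, as the `{}` above shows, so indexing it with `['my_snoring']` would raise a `KeyError`. A minimal sketch of a guard follows, using a hypothetical helper `frame_for`. Plain strings stand in for the DataFrames so that the example runs without a database.

```python
from collections import defaultdict

def frame_for(result, measurement):
    """Return the entry for measurement, or None when the query matched nothing."""
    # .get() never inserts a key, even on a defaultdict.
    return result.get(measurement)

empty_result = {}                  # what the query above returned
filled_result = defaultdict(list)  # shape of a non-empty result
filled_result['my_snoring'] = '<DataFrame placeholder>'

print(frame_for(empty_result, 'my_snoring'))   # None
print(frame_for(filled_result, 'my_snoring'))  # <DataFrame placeholder>
```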

In [29]:
my_df['my_snoring']

|  | sheets | silence | snoring | softtalking |
|---|---|---|---|---|
| 2022-01-10 20:35:08.377425+00:00 | 1.651487e-03 | 0.934139 | 0.064199 | 0.000011 |
| 2022-01-10 20:35:08.926313+00:00 | 1.482377e-03 | 0.936084 | 0.062425 | 0.000009 |
| 2022-01-10 20:35:09.481355+00:00 | 1.522833e-03 | 0.932783 | 0.065684 | 0.000010 |
| 2022-01-10 20:35:10.005872+00:00 | 1.588519e-03 | 0.935053 | 0.063349 | 0.000010 |
| 2022-01-10 20:35:10.549029+00:00 | 1.626174e-03 | 0.933614 | 0.064749 | 0.000011 |
| ... | ... | ... | ... | ... |
| 2022-01-12 01:00:25.762287+00:00 | 2.746958e-08 | 0.074458 | 0.925533 | 0.000008 |
| 2022-01-12 01:00:26.294059+00:00 | 6.762272e-06 | 0.844603 | 0.155386 | 0.000004 |
| 2022-01-12 01:00:26.844680+00:00 | 1.053335e-07 | 0.126094 | 0.873903 | 0.000003 |
| 2022-01-12 01:00:27.393015+00:00 | 9.624481e-08 | 0.126924 | 0.873065 | 0.000011 |


In [24]:
snoring_df = my_df['my_snoring']
snoring_df

|  | sheets | silence | snoring | softtalking |
|---|---|---|---|---|
| 2022-01-10 20:35:08.377425+00:00 | 1.651487e-03 | 0.934139 | 0.064199 | 0.000011 |
| 2022-01-10 20:35:08.926313+00:00 | 1.482377e-03 | 0.936084 | 0.062425 | 0.000009 |
| 2022-01-10 20:35:09.481355+00:00 | 1.522833e-03 | 0.932783 | 0.065684 | 0.000010 |
| 2022-01-10 20:35:10.005872+00:00 | 1.588519e-03 | 0.935053 | 0.063349 | 0.000010 |
| 2022-01-10 20:35:10.549029+00:00 | 1.626174e-03 | 0.933614 | 0.064749 | 0.000011 |
| ... | ... | ... | ... | ... |
| 2022-01-12 01:00:25.762287+00:00 | 2.746958e-08 | 0.074458 | 0.925533 | 0.000008 |
| 2022-01-12 01:00:26.294059+00:00 | 6.762272e-06 | 0.844603 | 0.155386 | 0.000004 |
| 2022-01-12 01:00:26.844680+00:00 | 1.053335e-07 | 0.126094 | 0.873903 | 0.000003 |
| 2022-01-12 01:00:27.393015+00:00 | 9.624481e-08 | 0.126924 | 0.873065 | 0.000011 |


In [7]:
snoring_df.describe()

|  | sheets | silence | snoring | softtalking |
|---|---|---|---|---|
| count | 85027.0 | 85027.0 | 85027.0 | 85027.0 |
| mean | 0.001944689 | 0.8800199 | 0.118016 | 1.920559e-05 |
| std | 0.004878714 | 0.1657497 | 0.16557 | 0.0001295645 |
| min | 3.613211e-23 | 4.187816e-07 | 0.025557 | 1.87257e-10 |
| 25% | 0.001207991 | 0.9285924 | 0.064717 | 8.333343e-06 |
| 50% | 0.001813529 | 0.9321837 | 0.06595 | 1.18519e-05 |
| 75% | 0.001948026 | 0.9335327 | 0.070025 | 1.274839e-05 |
| max | 0.3389907 | 0.9666468 | 1.0 | 0.01228169 |


### 2.1 Create a query string and call the database

In [28]:
base_string = "SELECT * FROM my_snoring WHERE time >="
middle_string = " AND time <="
base_string+middle_string

'SELECT * FROM my_snoring WHERE time >= AND time <='

In [33]:
start_time = "'2021-12-11 23:02:31'"  # adding a time zone did not work here; keep the single quotes around the timestamp inside the string
end_time = "'2021-12-12 09:00:31'"
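
The quoted timestamp literals above can also be produced from `datetime` objects, which keeps the single quotes and the timestamp format consistent. The following minimal sketch formats both bounds and assembles the full range query in one step. `build_time_range_query` is a hypothetical helper; the measurement name is interpolated as-is, so it should only be used with trusted input.

```python
from datetime import datetime

def build_time_range_query(measurement, start, end):
    """Build an InfluxQL SELECT for rows between two naive datetimes (inclusive)."""
    fmt = '%Y-%m-%d %H:%M:%S'  # same format as the literals above, no time zone
    return (
        f"SELECT * FROM {measurement} "
        f"WHERE time >= '{start.strftime(fmt)}' AND time <= '{end.strftime(fmt)}'"
    )

generated_query = build_time_range_query(
    'my_snoring',
    datetime(2021, 12, 11, 23, 2, 31),
    datetime(2021, 12, 12, 9, 0, 31),
)
print(generated_query)
```

The resulting string can be passed to `mydf_client.query()` in the same way as a hand-built one.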

In [34]:
query_string = base_string + start_time + middle_string + end_time
query_string

"SELECT * FROM my_snoring WHERE time >='2021-12-11 23:02:31' AND time <='2021-12-12 09:00:31'"

In [35]:
my_df = mydf_client.query(query_string)
my_df

defaultdict(list,
            {'my_snoring':                                     sheets   silence   snoring  softtalking
             2021-12-11 23:02:31.386149+00:00  0.001840  0.932856  0.065292     0.000012
             2021-12-11 23:02:31.929059+00:00  0.001868  0.933797  0.064324     0.000012
             2021-12-11 23:02:32.467791+00:00  0.001810  0.933317  0.064862     0.000011
             2021-12-11 23:02:33.014969+00:00  0.001759  0.933197  0.065032     0.000011
             2021-12-11 23:02:33.559208+00:00  0.001639  0.932970  0.065380     0.000011
             ...                                    ...       ...       ...          ...
             2021-12-12 09:00:28.677956+00:00  0.000047  0.325029  0.674875     0.000049
             2021-12-12 09:00:29.227086+00:00  0.001690  0.928522  0.069777     0.000011
             2021-12-12 09:00:29.777596+00:00  0.001856  0.931699  0.066433     0.000012
             2021-12-12 09:00:30.325116+00:00  0.001994  0.932072  0.065921   

In [27]:
my_df = mydf_client.query("SELECT * FROM my_snoring WHERE time >='2021-12-11 23:02:31' AND time <= '2021-12-12 09:00:31'")
my_df

defaultdict(list,
            {'my_snoring':                                     sheets   silence   snoring  softtalking
             2021-12-11 23:02:31.386149+00:00  0.001840  0.932856  0.065292     0.000012
             2021-12-11 23:02:31.929059+00:00  0.001868  0.933797  0.064324     0.000012
             2021-12-11 23:02:32.467791+00:00  0.001810  0.933317  0.064862     0.000011
             2021-12-11 23:02:33.014969+00:00  0.001759  0.933197  0.065032     0.000011
             2021-12-11 23:02:33.559208+00:00  0.001639  0.932970  0.065380     0.000011
             ...                                    ...       ...       ...          ...
             2021-12-12 09:00:28.677956+00:00  0.000047  0.325029  0.674875     0.000049
             2021-12-12 09:00:29.227086+00:00  0.001690  0.928522  0.069777     0.000011
             2021-12-12 09:00:29.777596+00:00  0.001856  0.931699  0.066433     0.000012
             2021-12-12 09:00:30.325116+00:00  0.001994  0.932072  0.065921   