## My Snoring 

Analyse and visualise the data from the snoring sound sensor.

- Record the snoring sensor on the RPi and store the readings in InfluxDB.
- Use Grafana (hass.io) to query the data from the device.
- Download the data as CSV from Grafana and import it here.

## Contents
0. Get the data and load into Pandas df
1. Data inspection, cleaning and visualisation
2. Statistical analysis
3. Store the results as JSON

## 0. Get the data and load into Pandas df

In [33]:
pwd

'C:\\Users\\31653\\Documents\\GitHub\\Notebooks'

In [34]:
import glob
my_csvs = glob.glob('*.csv')
my_csvs

['2021-12-5_Features_.csv',
 'foo.csv',
 'OEtest_210505_youtube_traffic.csv',
 'OEtest_210506_youtubetraffic.csv',
 'Snoring-data-2021-12-03 09_15_24.csv',
 'Snoring-data-2021-12-04 11_41_48.csv',
 'Snoring-data-2021-12-04 11_51_51.csv',
 'Snoring-data-2021-12-06 19_03_36.csv',
 'Snoring-data-2021-12-10 13_27_41.csv',
 'Snoring-data-2021-12-12 09_38_24.csv',
 'snoring_2021_12_3-4b.csv',
 'UMTMVS.csv']

In [36]:
import pandas as pd
snoring_df=pd.read_csv('Snoring-data-2021-12-12 09_38_24.csv')
snoring_df.tail()

|       | Time                | my_snoring.mean |
|-------|---------------------|-----------------|
| 43196 | 2021-12-12 09:37:36 | 0.0636          |
| 43197 | 2021-12-12 09:37:37 | 0.0627          |
| 43198 | 2021-12-12 09:37:38 | 0.0657          |
| 43199 | 2021-12-12 09:37:39 | 0.0671          |
| 43200 | 2021-12-12 09:37:40 | 0.0653          |
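The export stores `Time` as plain text, so it may help to parse it into real timestamps at load time. The following is a minimal sketch, assuming the two column names shown above. The helper `load_snoring_csv` is hypothetical, and the small in-memory sample only stands in for the real export file.

```python
import io

import pandas as pd


def load_snoring_csv(source):
    """Read a Grafana CSV export and parse the Time column as datetimes."""
    df = pd.read_csv(source, parse_dates=["Time"])
    # Sort by timestamp so later time-based steps see the rows in order.
    return df.sort_values("Time").reset_index(drop=True)


# Tiny stand-in for 'Snoring-data-2021-12-12 09_38_24.csv',
# using three of the readings shown in the tail above.
sample_csv = io.StringIO(
    "Time,my_snoring.mean\n"
    "2021-12-12 09:37:36,0.0636\n"
    "2021-12-12 09:37:37,0.0627\n"
    "2021-12-12 09:37:38,0.0657\n"
)
sample_df = load_snoring_csv(sample_csv)
print(sample_df.dtypes)
```

In the notebook itself the same helper could be called with the file name instead of the in-memory sample.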


## 1. Data inspection, cleaning and visualisation


In [37]:
# show columns
snoring_df.columns

Index(['Time', 'my_snoring.mean'], dtype='object')

In [38]:
#describe data
snoring_df.describe()

|       | my_snoring.mean |
|-------|-----------------|
| count | 42976.0         |
| mean  | 0.120823        |
| std   | 0.162814        |
| min   | 0.0406          |
| 25%   | 0.0649          |
| 50%   | 0.0659          |
| 75%   | 0.0756          |
| max   | 1.0             |


In [None]:
#remove NaN
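
The cell above is still empty. `describe()` counts 42976 readings while the frame has 43201 rows, so 225 rows have no reading. One possible way to drop them is sketched below, assuming that rows without a reading can simply be discarded. The frame `example_df` is a stand-in with made-up neighbouring readings; in the notebook the same call would be applied to `snoring_df`.

```python
import numpy as np
import pandas as pd

# Small stand-in frame; the readings next to the NaN row are made up.
example_df = pd.DataFrame(
    {
        "Time": ["2021-12-11 21:41:19", "2021-12-11 21:41:20", "2021-12-11 21:41:21"],
        "my_snoring.mean": [0.0649, np.nan, 0.0659],
    }
)

# Count the missing readings before dropping them.
n_missing = int(example_df["my_snoring.mean"].isna().sum())

# Keep only the rows that have a sensor reading.
clean_df = example_df.dropna(subset=["my_snoring.mean"]).reset_index(drop=True)
print(f"dropped {n_missing} of {len(example_df)} rows")
```

The later cells already ignore NaN implicitly, because `pd.cut` leaves missing values unbinned and `count()` skips them, so this step mainly makes the cleaning explicit.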

## 2. Statistical analysis

In [40]:
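# 'ascending' must be a real boolean: the string 'False' is truthy, so the output below is sorted ascending
print(snoring_df.sort_values('my_snoring.mean', ascending=True))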

                      Time  my_snoring.mean
9618   2021-12-12 00:17:58           0.0406
8026   2021-12-11 23:51:26           0.0443
35705  2021-12-12 07:32:45           0.0457
6649   2021-12-11 23:28:29           0.0462
9815   2021-12-12 00:21:15           0.0473
...                    ...              ...
220    2021-12-11 21:41:20              NaN
10006  2021-12-12 00:24:26              NaN
21479  2021-12-12 03:35:39              NaN
21480  2021-12-12 03:35:40              NaN
21481  2021-12-12 03:35:41              NaN

[43201 rows x 2 columns]


In [41]:
# add a column 'bins' to the df and create bins with labels

snoring_df['bins'] = pd.cut(snoring_df['my_snoring.mean'], bins=[0.0, 0.10, 0.50, 1.00],
                    labels=['no snoring', 'light snoring', 'loud snoring']) 
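
By default `pd.cut` uses right-inclusive intervals, so the bins here are (0.0, 0.10], (0.10, 0.50] and (0.50, 1.00]. The sketch below shows how values that fall exactly on an edge are labelled; the sample values are chosen only for illustration.

```python
import pandas as pd

# Values that sit exactly on the bin edges.
edge_values = pd.Series([0.0, 0.10, 0.50, 1.00])

edge_bins = pd.cut(
    edge_values,
    bins=[0.0, 0.10, 0.50, 1.00],
    labels=["no snoring", "light snoring", "loud snoring"],
)

# 0.0 falls outside the first interval (it is open on the left) and becomes NaN;
# passing include_lowest=True would put it in 'no snoring' instead.
print(edge_bins.tolist())
```

With the data above this does not matter in practice, since the minimum reading is 0.0406, but it is worth knowing if the sensor can report exactly 0.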


In [42]:
snoring_df.tail()

|       | Time                | my_snoring.mean | bins       |
|-------|---------------------|-----------------|------------|
| 43196 | 2021-12-12 09:37:36 | 0.0636          | no snoring |
| 43197 | 2021-12-12 09:37:37 | 0.0627          | no snoring |
| 43198 | 2021-12-12 09:37:38 | 0.0657          | no snoring |
| 43199 | 2021-12-12 09:37:39 | 0.0671          | no snoring |
| 43200 | 2021-12-12 09:37:40 | 0.0653          | no snoring |


In [43]:
total_length = snoring_df['bins'].count()
total_length

42976

In [44]:
stats_df = pd.DataFrame()
stats_df['class'] =snoring_df['bins'].value_counts()
stats_df

|               | class |
|---------------|-------|
| no snoring    | 34974 |
| light snoring | 6004  |
| loud snoring  | 1998  |


In [46]:
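# define parameters; look each class up by label instead of by position,
# so the result does not depend on the sort order of value_counts()
no_snoring = stats_df.loc['no snoring', 'class']
light_snoring = stats_df.loc['light snoring', 'class']
loud_snoring = stats_df.loc['loud snoring', 'class']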

In [47]:
# the sensor logs one reading per second, so the sample count divided by 60 gives minutes
time_snoring = round((light_snoring + loud_snoring) / 60, 2)
print("you've snored: " + str(time_snoring) + " minutes")

you've snored: 133.37 minutes


In [48]:
percentage = ((loud_snoring + light_snoring) / total_length) * 100
print("that is " + str(round(percentage, 2)) + "% of the measured time")

that is 18.62% of the measured time


In [49]:
loud_perc = round(loud_snoring/total_length*100,2)
print ("loud snoring was " + str(loud_perc) + "% of the measured time")

loud snoring was 4.65% of the measured time
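
The three figures above could also be computed in one place. The following is a sketch only, assuming the same bin edges and labels as in cell 41. The function name `snoring_summary` and the parameter `seconds_per_sample` are hypothetical, and the input values are made up.

```python
import pandas as pd


def snoring_summary(values, seconds_per_sample=1.0):
    """Summarise snoring time from raw readings (NaN readings are ignored)."""
    bins = pd.cut(
        values,
        bins=[0.0, 0.10, 0.50, 1.00],
        labels=["no snoring", "light snoring", "loud snoring"],
    )
    counts = bins.value_counts()
    measured = int(counts.sum())
    loud = int(counts["loud snoring"])
    snoring = int(counts["light snoring"]) + loud
    return {
        "minutes_snoring": round(snoring * seconds_per_sample / 60, 2),
        "percent_snoring": round(snoring / measured * 100, 2),
        "percent_loud": round(loud / measured * 100, 2),
    }


# Made-up readings: two quiet, one light, one loud, one missing.
demo = pd.Series([0.05, 0.06, 0.2, 0.7, None])
summary = snoring_summary(demo)
print(summary)
```

Passing `seconds_per_sample` explicitly keeps the minutes calculation valid if the sensor is ever logged at a different rate.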


### 2b. Make a graph

In [None]:
#visualise the data. To do: larger plot and mean.values
#snoring_df.plot.scatter(x="Time", y="my_snoring.mean")
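
One possible way to address the to-do above (a larger plot plus mean values) is sketched here with matplotlib. It assumes `Time` has been parsed as datetimes and uses synthetic readings rather than the real data. The one-minute averaging window and the 0.10 threshold line are illustrative choices, the latter taken from the bin edges used earlier.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the sketch also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in: one reading per second for ten minutes.
times = pd.date_range("2021-12-12 00:00:00", periods=600, freq="s")
rng = np.random.default_rng(0)
plot_df = pd.DataFrame(
    {"Time": times, "my_snoring.mean": rng.uniform(0.04, 1.0, size=600)}
)

# Average per minute to smooth the per-second noise.
per_minute = plot_df.set_index("Time")["my_snoring.mean"].resample("1min").mean()

fig, ax = plt.subplots(figsize=(14, 5))  # larger than the default figure size
ax.plot(plot_df["Time"], plot_df["my_snoring.mean"], linewidth=0.5, alpha=0.4, label="raw")
ax.plot(per_minute.index, per_minute.values, label="1-minute mean")
ax.axhline(0.10, linestyle="--", linewidth=0.8, label="snoring threshold (0.10)")
ax.set_xlabel("Time")
ax.set_ylabel("my_snoring.mean")
ax.legend()
```

In a notebook the figure is displayed inline without the `matplotlib.use("Agg")` line; `fig.savefig(...)` can be used to keep a copy.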

## 3. Store the results as JSON
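
This section has no code yet. A minimal sketch of storing the results with the standard `json` module follows; the key names and the output file name are assumptions, and the numbers are copied from the outputs above. Values taken directly from pandas are NumPy integers, which `json` cannot serialise, so in the notebook they would need `int(...)` or `float(...)` first.

```python
import json
import tempfile
from pathlib import Path

# Results copied from the cells above; the key names are hypothetical.
results = {
    "source_file": "Snoring-data-2021-12-12 09_38_24.csv",
    "samples_measured": 42976,
    "no_snoring": 34974,
    "light_snoring": 6004,
    "loud_snoring": 1998,
    "minutes_snoring": 133.37,
    "percent_snoring": 18.62,
    "percent_loud": 4.65,
}

# Hypothetical output location; a temp directory is used so the sketch runs anywhere.
out_path = Path(tempfile.gettempdir()) / "snoring_results.json"
with open(out_path, "w", encoding="utf-8") as fh:
    json.dump(results, fh, indent=2)

# Read the file back to confirm the round trip.
with open(out_path, encoding="utf-8") as fh:
    loaded = json.load(fh)
print(loaded["minutes_snoring"])
```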