Improve reporting and commandline argument flexibility

hiveeyes · Jul 1, 2017 · 508925f
1 parent f0d912e
Showing 3 changed files with 173 additions and 41 deletions.
diff --git a/README.rst b/README.rst
@@ -1,23 +1,79 @@
-###########
-Audiohealth
-###########
+########################################################
+Activity/health status of a bee colony by audio analysis
+########################################################
 
 
 ************
 Introduction
 ************
 The people around `Open Source Beehives (OSBH) <https://opensourcebeehives.com/>`_ have focused on audio from the very beginning and have presented coefficients which were obtained via `machine learning <https://github.com/opensourcebeehives/MachineLearning-Local>`_, trained on the `audio-samples <https://www.dropbox.com/sh/us1633xi4cmtecl/AAA6hplscuDR7aS_f73oRNyha?dl=0>`_ they had so far.
 
-The promising output is simply the activity/health status of a bee colony! So far it can tell whether they are dormant, active, pre-, post- or swarming, if the queen is missing or hatching. For more background information about the audio processing, please follow up reading
+The output is simply the activity/health status of a bee colony. So far, the algorithm can tell whether the colony is dormant, active, pre-swarm, swarming or post-swarm, and whether the queen is missing or hatching. For more background information about the audio processing, see the
 `current work status thread in the OSBH Forum <https://community.akerkits.com/t/main-thread-current-work-status/326>`_.
+So far, the results are promising.
 
 
 *******
 Details
 *******
+We forked the "`OSBH machine learning <https://github.com/opensourcebeehives/MachineLearning-Local>`_" repository as `osbh-audioanalyzer <https://github.com/hiveeyes/osbh-audioanalyzer>`_ and added an option for reading input from a file. The wrapper script resides in the ``audiohealth`` repository.
+
 For more information, see also `Rate vitality of bee colony via analysing its sound <https://community.hiveeyes.org/t/rate-vitality-of-bee-colony-via-analysing-its-sound/357/6>`_.
 
 
+*****
+Usage
+*****
+Synopsis::
+
+    audiohealth --audiofile ~/audio/samples/beehive_before_25_to_15.ogg --analyzer tools/osbh-audioanalyzer/bin/test
+
+Output::
+
+    ==================
+    Sequence of states
+    ==================
+    pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, active, active, active, active, active, active, active, active, active, active, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, pre-swarm, active, active, active, active, active, active, active, pre-swarm, active, pre-swarm, pre-swarm, pre-swarm, pre-swarm, active, active, active, active, pre-swarm, pre-swarm, active, active, pre-swarm, active, pre-swarm, active, active, active, pre-swarm,
+
+    ===================
+    Compressed timeline
+    ===================
+      0s -  80s   pre-swarm       ========
+     90s - 180s   active          =========
+    190s - 310s   pre-swarm       ============
+    320s - 380s   active          ======
+    390s - 400s   pre-swarm       =
+    400s - 410s   active          =
+    410s - 440s   pre-swarm       ===
+    450s - 480s   active          ===
+    490s - 500s   pre-swarm       =
+    510s - 520s   active          =
+    530s - 540s   pre-swarm       =
+    540s - 550s   active          =
+    550s - 560s   pre-swarm       =
+    560s - 580s   active          ==
+    590s - 600s   pre-swarm       =
+
+    ==============
+    Total duration
+    ==============
+           320s   pre-swarm       ================================
+           280s   active          ============================
+
+    ======
+    Result
+    ======
+    The most common events (i.e. the events with the highest total duration) are:
+
+         The colony is mostly in »PRE-SWARM« state, which is going on for 320 seconds.
+         Sometimes, the state oscillates to »ACTIVE«, for 280 seconds in total.
+
+    ==========
+    Disclaimer
+    ==========
+    THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. NO LIABILITY FOR ANY DAMAGES WHATSOEVER.
+
+
 *****
 Setup
 *****
@@ -32,9 +88,12 @@ Repository
 
 Prerequisites
 =============
+To spare your machine from compiling SciPy and NumPy, install these Python libraries from your distribution's packages. ``audiohealth`` also relies on `sox <http://sox.sourceforge.net/Docs/Documentation>`_ for audio resampling.
+We also recommend `youtube-dl <http://youtube-dl.org/>`_ for downloading audio samples from YouTube.
+
 Install some distribution software packages::
 
-    apt install python-scipy python-numpy sox youtube-dl
+    apt install python-scipy python-numpy sox libsox-fmt-mp3 youtube-dl
 
 Build the `osbh-audioanalyzer <https://github.com/hiveeyes/osbh-audioanalyzer>`_::
 
@@ -47,20 +106,13 @@ Main program
 ============
 ::
 
-    virtualenv --system-site-packages venv27
-    source venv27/bin/activate
-    python setup.py install
-
-
-*****
-Usage
-*****
-::
-
-    audiohealth --file ~/audio/samples/swarm_-15_-5.ogg --analyzer tools/osbh-audioanalyzer/bin/test
+    virtualenv --system-site-packages .venv27
+    source .venv27/bin/activate
+    python setup.py develop
 
 
 *******
 Credits
 *******
 The driving force behind the audio signal processing at OSBH is `Javier Andrés Calvo <https://github.com/Jabors>`_, so we want to send a big thank you to him and the whole OSBH team - this program is really standing on the shoulders of giants. Keep up the good work!
+
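Note on the sample output in the README above: the "Total duration" and "Result" sections can be derived from the sequence of states alone. The following is a minimal Python sketch of that aggregation, not the code from this commit. It assumes that every detected state stands for one 10-second window (the README does not say so; the value is taken from the comment in audiohealth.py), and the names ``total_durations`` and ``WINDOW_SECONDS`` are hypothetical.

```python
from collections import Counter

WINDOW_SECONDS = 10  # assumption: one detected state per 10-second window


def total_durations(states, window_seconds=WINDOW_SECONDS):
    """Return (state, seconds) pairs, longest total duration first."""
    # Ignore blank entries, e.g. the empty string left over after split('\n').
    counts = Counter(s.strip() for s in states if s.strip())
    return [(state, n * window_seconds) for state, n in counts.most_common()]


if __name__ == "__main__":
    sample = ["pre-swarm"] * 32 + ["active"] * 28
    for state, seconds in total_durations(sample):
        # One '=' per window, as in the README sample output.
        bar = "=" * (seconds // WINDOW_SECONDS)
        print("{:>10}s   {:15} {}".format(seconds, state, bar))
```

The first pair returned corresponds to the "mostly in" state of the Result section, the second pair (if any) to the state the colony "oscillates to".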
diff --git a/audiohealth.py b/audiohealth.py
@@ -1,12 +1,18 @@
+# -*- coding: utf-8 -*-
+# (c) 2017 Richard Pobering <richard@hiveeyes.org>
+# (c) 2017 Andreas Motl <andreas@hiveeyes.org>
+import os
 import sys
 import shlex
 import subprocess
 from docopt import docopt
 from tempfile import NamedTemporaryFile
+from operator import itemgetter
+from colors import color
 import scipy.io.wavfile as wav
 
 
-VERSION  = '0.1.0'
+VERSION  = '0.2.0'
 APP_NAME = 'audiohealth ' + VERSION
 
 def resample(audiofile):
@@ -45,59 +51,122 @@ def analyze(datfile, analyzer=None):
     stdout, stderr = process.communicate()
     #print(stdout)
     states = stdout.decode('utf-8').split('\n')
-    #print(states)
     return states
 
 def report(states):
-    # see tools/osbh-audioanalyzer/params.h and main.cpp: DetectedStates.size()==5
+
+    # The audio is chunked into segments of 10 seconds each, see:
+    #   - tools/osbh-audioanalyzer/params.h: float windowLength=2; //Window Length in s
+    #   - tools/osbh-audioanalyzer/main.cpp: DetectedStates.size()==5
     window_length = 2 * 5
+
     chronology = []
     aggregated = {}
     current = None
     for i, state in enumerate(states):
         state = state.strip()
         if not state: continue
 
         aggregated.setdefault(state, 0)
         aggregated[state] += window_length
 
-        if state != current:
-            time = (i + 1) * window_length
-            entry = {'time': time, 'state': state}
-            #line = '{time}s {state}'.format(time=time, state=state)
-            #print(line)
+        time = i * window_length
+        if state == current:
+            # Same state as before: extend the running segment to the end of this window
+            chronology[-1]['time_end'] = time + window_length
+        else:
+            # State changed: start a new segment covering this window
+            time_begin = time
+            time_end   = time_begin + window_length
+            entry = {'time_begin': time_begin, 'time_end': time_end, 'state': state}
             chronology.append(entry)
             current = state
+
 
-    print('Timeline:')
+    print('==================')
+    print('Sequence of states')
+    print('==================')
+    print(', '.join(states))
+    print
+
+    print('===================')
+    print('Compressed timeline')
+    print('===================')
     for i, entry in enumerate(chronology):
         duration = None
         try:
-            duration = chronology[i+1]['time'] - chronology[i]['time']
+            #duration = chronology[i+1]['time'] - chronology[i]['time']
+            duration = entry['time_end'] - entry['time_begin']
         except IndexError:
             pass
         entry['duration'] = duration
         entry['duration_vis'] = None
         if duration:
             entry['duration_vis'] = int(duration / window_length) * "="
-        line = '{time:3}s {state:15} {duration_vis}'.format(**entry)
+
+        #line = '{time:3}t {state:15} {duration_vis}'.format(**entry)
+        line = '{time_begin:3}s - {time_end:3}s   {state:15} {duration_vis}'.format(**entry)
         print(line)
-    print()
+    print
+
+    print('==============')
+    print('Total duration')
+    print('==============')
+    aggregated_sorted = sorted(aggregated.items(), key=itemgetter(1), reverse=True)
+    for state, duration in aggregated_sorted:
+        duration_vis = int(duration / window_length) * "="
+        line = '{duration:10}s   {state:15} {duration_vis}'.format(**locals())
+        print(line)
+    print
+
+    print('======')
+    print('Result')
+    print('======')
+    print('The most common events (i.e. the events with the highest total duration) are:')
+    print
 
-    print('Aggregated:')
-    print(aggregated)
+    try:
+        winner_state, winner_duration = aggregated_sorted[0]
+        print('     The colony is mostly in »{state}« state, which is going on for {duration} seconds.'.format(state=emphasize(winner_state.upper()), duration=emphasize(winner_duration)))
+    except IndexError:
+        pass
 
+    try:
+        second_state, second_duration = aggregated_sorted[1]
+        print('     Sometimes, the state oscillates to »{state}«, for {duration} seconds in total.'.format(state=emphasize(second_state.upper()), duration=emphasize(second_duration)))
+    except IndexError:
+        pass
+
+    print
+
+    print('==========')
+    print('Disclaimer')
+    print('==========')
+    print('THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. NO LIABILITY FOR ANY DAMAGES WHATSOEVER.')
+
+    print
+
+def emphasize(text):
+    return color(text, fg='yellow', style='bold')
 
 def main():
     """
     Usage:
-      audiohealth --file audiofile --analyzer /path/to/osbh-audioanalyzer [--debug]
+      audiohealth --audiofile audiofile --analyzer /path/to/osbh-audioanalyzer [--debug] [--keep]
+      audiohealth --datfile datfile --analyzer /path/to/osbh-audioanalyzer [--debug]
       audiohealth --version
       audiohealth (-h | --help)
 
     Options:
-      --file=<audiofile>        Process audiofile. Please use sox-compatible input formats.
+      --audiofile=<audiofile>   Process audiofile. Please use sox-compatible input formats.
+      --datfile=<datfile>       Process datfile.
       --analyzer=<analyzer>     Path to OSBH audioanalyzer binary
+      --keep                    Keep (don't delete) downsampled and .dat file
       --debug                   Enable debug messages
       -h --help                 Show this screen
 
@@ -107,16 +176,27 @@ def main():
     options = docopt(main.__doc__, version=APP_NAME)
     #print options
 
-    inputfile = options.get('--file')
+    audiofile = options.get('--audiofile')
     analyzer = options.get('--analyzer')
     #print inputfile
 
-    tmpfile = resample(inputfile)
-    if tmpfile:
+    if audiofile:
+        tmpfile = resample(audiofile)
+        if not tmpfile:
+            print("Error while downsampling: Did you install sox?")
+            sys.exit(2)
+
         datfile = wav_to_dat(tmpfile)
-        #print(datfile)
-        states = analyze(datfile, analyzer=analyzer)
-        report(states)
+
     else:
-        print("did you install sox?")
+        datfile = options.get('--datfile')
+
+    #print(datfile)
+    states = analyze(datfile, analyzer=analyzer)
+    report(states)
 
+    # Cleanup
+    if not options.get('--keep'):
+        if audiofile:
+            os.unlink(tmpfile)
+            os.unlink(datfile)
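Note on the "Compressed timeline" built by ``report()`` in audiohealth.py above: collapsing consecutive equal states into segments is a run-length encoding, which can also be written with ``itertools.groupby``. The following is a minimal sketch under stated assumptions, not the implementation from this commit: the helper name ``compress_timeline`` is hypothetical, blank lines are dropped before indexing (so they are assumed to occur only at the end of the analyzer output), and each segment is taken to end at the end of its last 10-second window.

```python
from itertools import groupby

WINDOW_SECONDS = 10  # assumption: 2 s analysis window x 5 windows per detected state


def compress_timeline(states, window_seconds=WINDOW_SECONDS):
    """Collapse consecutive equal states into (begin, end, state) segments."""
    cleaned = [s.strip() for s in states if s.strip()]
    timeline = []
    position = 0  # index of the first window of the current run
    for state, run in groupby(cleaned):
        length = sum(1 for _ in run)  # number of windows in this run
        begin = position * window_seconds
        end = (position + length) * window_seconds
        timeline.append((begin, end, state))
        position += length
    return timeline


if __name__ == "__main__":
    sample = ["pre-swarm", "pre-swarm", "active", "pre-swarm", ""]
    for begin, end, state in compress_timeline(sample):
        # One '=' per window covered by the segment.
        bar = "=" * ((end - begin) // WINDOW_SECONDS)
        print("{:3}s - {:3}s   {:15} {}".format(begin, end, state, bar))
```

With this convention every segment begins exactly where the previous one ends, so the segment durations add up to the totals per state.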
diff --git a/setup.py b/setup.py
@@ -4,6 +4,7 @@
 
 requires = [
     'docopt==0.6.2',
+    'ansicolors==1.1.8',
     'scipy',
     'numpy',
 ]
@@ -12,7 +13,7 @@
 }
 
 setup(name='audiohealth',
-      version='0.1.1',
+      version='0.2.0',
       description='',
       long_description='',
       license="AGPL 3",
@@ -23,7 +24,6 @@
         "Environment :: Console",
         "Intended Audience :: Education",
         "Intended Audience :: Information Technology",
-        "Intended Audience :: Manufacturing",
         "Intended Audience :: Science/Research",
         "Topic :: Communications",
         "Topic :: Scientific/Engineering :: Information Analysis",