## Status 03/22/2024
- no existing library fulfills all requirements 
- core problem: each of them hard-codes its model throughout the code base
- we had to start from a very low level:
  - only able to analyze files
  - core necessary parts:
    - abstractions for recording, preprocessor, model 
    - model, preprocessing disentangled from rest => allows runtime replacement of model
  
- currently there: 
  - [x] model, preprocessor abstractions 
  - [x] disentanglement of data and analysis => allows exchange of analysis part 
  - [x] integration of core birdnet functionality back 
  - runs in python from prerecorded files 
  
- not yet there: 
  - user interface 
  - integrated system that would 'run alone'  
    - `birdnetlib` provides at the very least starting points 
  - documentation
  - support for pytorch
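
The separation of data and analysis listed above can be sketched roughly as follows. This is a minimal illustration, not iSparrow's actual implementation: the classes `Preprocessor`, `Model`, `FixedChunker`, `LoudnessModel` and `Recording` are hypothetical stand-ins, and only the method name `set_analyzer` is borrowed from the demo further down.

```python
from abc import ABC, abstractmethod


class Preprocessor(ABC):
    """Hypothetical base: turns raw samples into model-ready chunks."""

    @abstractmethod
    def process(self, samples):
        ...


class Model(ABC):
    """Hypothetical base: maps one chunk to a (label, confidence) pair."""

    @abstractmethod
    def predict(self, chunk):
        ...


class FixedChunker(Preprocessor):
    def __init__(self, chunk_size):
        self.chunk_size = chunk_size

    def process(self, samples):
        # split into consecutive chunks, dropping an incomplete tail
        n = len(samples) // self.chunk_size
        size = self.chunk_size
        return [samples[i * size:(i + 1) * size] for i in range(n)]


class LoudnessModel(Model):
    def predict(self, chunk):
        # toy "classifier": label a chunk by its mean absolute amplitude
        mean_abs = sum(abs(x) for x in chunk) / len(chunk)
        return ("loud" if mean_abs > 0.5 else "quiet", mean_abs)


class Recording:
    """Holds the data only; the analysis parts are injected and replaceable."""

    def __init__(self, preprocessor, model, samples):
        self.samples = samples
        self.set_analyzer(model, preprocessor)

    def set_analyzer(self, model, preprocessor):
        # swap the analysis parts at runtime and discard stale results
        self.model = model
        self.preprocessor = preprocessor
        self.detections = []

    def analyze(self):
        chunks = self.preprocessor.process(self.samples)
        self.detections = [self.model.predict(c) for c in chunks]
        return self.detections
```

Because `Recording` only talks to the two base interfaces, any model/preprocessor pair can be exchanged without touching the data handling.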

### Code example

Clone the repo first, then install it from the repo directory with `python3 -m pip install .`

In [None]:
from pathlib import Path

In [None]:
import sys

sys.path.append(str(Path.home() / Path("Development") / "iSparrow"))

In [None]:
from iSparrow import PreprocessorBase
from iSparrow import ModelBase
from iSparrow import SparrowRecording
from iSparrow import SpeciesPredictorBase
from iSparrow import SparrowWatcher
import iSparrow.utils as utils

import tests.set_up_sparrow_env as sp

In [None]:
import pandas as pd
import yaml


In [None]:
# make a mock install of sparrow; this step will be hidden from the user in the future
sp.install()

### Demo

- load model, preprocessor you want 
- add recording to be analyzed and assign it the model, preprocessor to use 
- analyze 

In [None]:
# variables for analysis
sigmoid_sensitivity = 1.0
num_threads = 12
min_conf = 0.25

# variables for recording
recording_path = sp.EXAMPLES / "soundscape.wav"
sample_rate = 48000
overlap = 0.0
sample_secs = 3.0
resample_type = "kaiser_fast"

In [None]:
ppd = utils.load_module("ppm", sp.MODELS / Path("birdnet_default") / "preprocessor.py")

In [None]:
md = utils.load_module("md", sp.MODELS / Path("birdnet_default") / "model.py")

In [None]:
preprocessor = ppd.Preprocessor(sample_rate=sample_rate, overlap=overlap, sample_secs=sample_secs, resample_type=resample_type)

model = md.Model(sp.MODELS / "birdnet_default", num_threads=5, sigmoid_sensitivity=1.)

In [None]:
# SparrowRecording is imported directly from iSparrow above
recording = SparrowRecording(preprocessor, model, sp.EXAMPLES / "soundscape.wav", min_conf=min_conf)

In [None]:
recording.analyze()

In [None]:
import IPython.display

# play back the analyzed recording inline
IPython.display.Audio(filename=str(sp.EXAMPLES / "soundscape.wav"))

In [None]:
pd.DataFrame(recording.detections)

In [None]:
ppc = utils.load_module("ppm", sp.MODELS / Path("birdnet_custom") / "preprocessor.py")
mc = utils.load_module("md", sp.MODELS / Path("birdnet_custom") / "model.py")

### Support for Birdnet's transfer learning approach 

- train a new classifier in birdnet itself (port of training logic to sparrow possible, but low priority atm)

- use in conjunction with the default model 

- **does allow for appending new species to existing list, including mammals**

In [None]:
preprocessor = ppc.Preprocessor(sample_rate=sample_rate, overlap=overlap, sample_secs=sample_secs, resample_type=resample_type)

model = mc.Model(default_model_path=sp.MODELS / "birdnet_default", model_path=sp.MODELS / "birdnet_custom", num_threads=5, sigmoid_sensitivity=1.)

In [None]:
# SparrowRecording is imported directly from iSparrow above
recording = SparrowRecording(preprocessor, model, sp.EXAMPLES / "soundscape.wav", min_conf=min_conf)

In [None]:
recording.analyze()

In [None]:
pd.DataFrame(recording.detections)

### Load a different model into an existing recording instance

- read module that contains the model, preprocessor during program execution
- build preprocessor
- build model
- change model, preprocessor to new one, reset recording
- analyze
- change the model again upon request
- analyze again
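
Reading a module from a file during program execution, the first step in the list above, can be sketched with the standard library's `importlib`. This is an assumption about how such a loader could work, not a copy of `utils.load_module`; the function name `load_module_from_path` is hypothetical.

```python
import importlib.util
from pathlib import Path
from types import ModuleType


def load_module_from_path(alias: str, path: Path) -> ModuleType:
    """Import the Python file at `path` as a module named `alias`."""
    spec = importlib.util.spec_from_file_location(alias, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load module from {path}")
    module = importlib.util.module_from_spec(spec)
    # executes the file's top-level code, which defines its classes
    spec.loader.exec_module(module)
    return module
```

The returned module object then exposes whatever the file defines, e.g. a `Model` or `Preprocessor` class, so the caller never needs to know the concrete implementation at import time.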

In [None]:
ppc = utils.load_module("ppm", sp.MODELS / Path("google_perch") / "preprocessor.py")
mc = utils.load_module("md", sp.MODELS / Path("google_perch") / "model.py")

In [None]:
preprocessor = ppc.Preprocessor(sample_rate=32000, sample_secs=5., resample_type=resample_type)
model = mc.Model(model_path=sp.MODELS / "google_perch", num_threads=5)

In [None]:
# SparrowRecording is imported directly from iSparrow above
recording = SparrowRecording(preprocessor, model, sp.EXAMPLES / "soundscape.wav", min_conf=min_conf)

In [None]:
recording.analyze()

In [None]:
pd.DataFrame(recording.detections)

In [None]:
# reload the default model's modules under the names used below
ppd = utils.load_module("ppm", sp.MODELS / Path("birdnet_default") / "preprocessor.py")
md = utils.load_module("md", sp.MODELS / Path("birdnet_default") / "model.py")

preprocessor = ppd.Preprocessor(sample_rate=sample_rate, overlap=overlap, sample_secs=sample_secs, resample_type=resample_type)

model = md.Model(sp.MODELS / "birdnet_default", num_threads=5, sigmoid_sensitivity=1.)

recording.set_analyzer(model, preprocessor)

print(recording.analyzer.name)
print(recording.processor.name, recording.processor.sample_rate)

In [None]:
recording.analyze()

In [None]:
pd.DataFrame(recording.detections)

### Current concept for usage in final deployment 

- bundle the model file with implementations of `model` and `preprocessor` derived from base classes provided by Sparrow (task for the scientist who provides the model)
- upload models to huggingface
- give url or model name to Sparrow
  - Sparrow handles caching of models so they aren't downloaded again
- execute procedure above in an encapsulated way 
- shouldn't create gap in data acquisition?
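
The caching step in this concept could look roughly like the sketch below. It is only an illustration under stated assumptions: `fetch_model` and the `.complete` marker file are hypothetical, and the actual download (e.g. from Hugging Face) is passed in as a callable so that no specific client API is assumed.

```python
from pathlib import Path
from typing import Callable


def fetch_model(
    name: str, cache_dir: Path, download: Callable[[str, Path], None]
) -> Path:
    """Return the local folder for `name`, downloading only on a cache miss."""
    target = Path(cache_dir) / name
    marker = target / ".complete"
    if marker.exists():
        return target  # cache hit: nothing is downloaded again
    target.mkdir(parents=True, exist_ok=True)
    download(name, target)
    # written last, so an interrupted download is retried next time
    marker.touch()
    return target
```

Writing the marker only after the download finishes means a half-downloaded model is never mistaken for a cached one.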

## Status 04/18/2024

- separate repositories for the recorder, [iSparrowRecord](https://github.com/ssciwr/iSparrowRecord), and for the module that sends data, [iSparrowChirp](https://github.com/ssciwr/iSparrowChirp) (TODO)
  - keeps the recorder, sender and analyzer processes isolated from each other to limit interference
  
- multiprocessing approach to the analyzer: continually watch for incoming data and analyze it in the background while staying responsive to user input (sending data, changing the model on the fly, doing other tasks, ...)
- final deployment: docker? 
- currently working on: sending data
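
The multiprocessing approach mentioned above can be sketched as a polling loop that runs in a child process and is stopped through a shared event. This is a simplified illustration, not the `SparrowWatcher` implementation: the functions `find_unprocessed`, `watch` and `start_watcher`, the `*.wav` pattern, and the `.done` placeholder output are all assumptions.

```python
import multiprocessing as mp
from pathlib import Path


def find_unprocessed(input_dir, seen, pattern="*.wav"):
    """Files in `input_dir` matching `pattern` that were not handled yet."""
    return sorted(p for p in Path(input_dir).glob(pattern) if p.name not in seen)


def watch(input_dir, output_dir, stop_event, poll_secs=0.1):
    """Poll `input_dir` until `stop_event` is set, 'analyzing' each new file."""
    seen = set()
    while not stop_event.is_set():
        for path in find_unprocessed(input_dir, seen):
            # placeholder for the real analysis step
            (Path(output_dir) / f"{path.stem}.done").write_text("analyzed")
            seen.add(path.name)
        # sleep, but wake up early if a stop was requested
        stop_event.wait(poll_secs)


def start_watcher(input_dir, output_dir):
    """Run `watch` in a background process; the caller stays responsive."""
    stop_event = mp.Event()
    proc = mp.Process(
        target=watch, args=(input_dir, output_dir, stop_event), daemon=True
    )
    proc.start()
    return proc, stop_event
```

To stop the watcher, the parent calls `stop_event.set()` followed by `proc.join()`. Files that arrive while the process is stopped are simply picked up by the next run, or by a later cleanup pass.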

In [None]:
from pathlib import Path
import sys

sys.path.append(str(Path.home() / Path("Development") / "iSparrow"))

In [None]:
from iSparrow import PreprocessorBase
from iSparrow import ModelBase
from iSparrow import SparrowRecording
from iSparrow import SpeciesPredictorBase
from iSparrow import SparrowWatcher
import iSparrow.utils as utils

import tests.set_up_sparrow_env as sp

In [None]:
import pandas as pd
import yaml
from datetime import datetime

In [None]:
# make a mock install of sparrow; this step will be hidden from the user in the future
sp.install()

### Watcher process 
- runs in the background  
- can be started and stopped upon request

In [None]:
preprocessor_cfg = {
    "sample_rate": 48000,
    "overlap": 0.0,
    "sample_secs": 3.0,
    "resample_type": "kaiser_fast",
}

model_cfg = {
    "num_threads": 1,
    "sigmoid_sensitivity": 1.0,
    "species_list_file": None,
}

recording_cfg = {
    "date": datetime(year=2022, month=5, day=10),
    "lat": 35.4244,
    "lon": -120.7463,
    "species_presence_threshold": 0.03,
    "min_conf": 0.25,
}

species_predictor_cfg = {
    "use_cache": True,
    "num_threads": 1,
}

runner = SparrowWatcher(
    Path.home() / "iSparrow_data",
    Path.home() / "iSparrow_output",
    Path.home() / "iSparrow/models",
    "birdnet_default",
    preprocessor_config=preprocessor_cfg,
    model_config=model_cfg,
    recording_config=recording_cfg,
    species_predictor_config=species_predictor_cfg,
)

In [None]:
runner.start()

In [None]:
runner.pause()

In [None]:
runner.go_on()

In [None]:
runner.stop()

In [None]:
runner.is_running

### Switch model on the fly 
- creates a new results folder 
- stores the parameters as a .yml file with the results 
- can leave some files unanalyzed, because the watcher process is stopped and started again

In [None]:
preprocessor_cfg = {
    "sample_rate": 48000,
    "overlap": 0.0,
    "sample_secs": 3.0,
    "resample_type": "kaiser_fast",
}

model_cfg = {
    "num_threads": 1,
    "sigmoid_sensitivity": 1.0,
    "default_model_path": str(Path.home() / "iSparrow/models/birdnet_default"),
}

recording_cfg = {
    "species_presence_threshold": 0.03,
    "min_conf": 0.5,
}

runner.change_analyzer(
    "birdnet_custom", preprocessor_config=preprocessor_cfg, model_config=model_cfg
)

In [None]:
runner.stop()

### Run cleanup occasionally

- find if a model change or other task has caused data loss 
- fix data loss 
- remove recordings upon request
- record which data has been re-analyzed
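
The idea behind the cleanup steps above can be sketched as follows. This is not the logic of `runner.clean_up()`. It assumes, purely for illustration, that each recording `<stem>.wav` yields a result file `<stem>.csv` somewhere below the results folder; `find_unanalyzed` and `clean_up` are hypothetical helpers, and only the option values `"never"` and `"on_cleanup"` are taken from the cells below.

```python
from pathlib import Path


def find_unanalyzed(data_dir, results_dir):
    """Recordings with no result file of the same stem below `results_dir`."""
    analyzed = {p.stem for p in Path(results_dir).rglob("*.csv")}
    return sorted(p for p in Path(data_dir).glob("*.wav") if p.stem not in analyzed)


def clean_up(data_dir, results_dir, analyze, delete_recordings="never"):
    """Re-analyze missed recordings, optionally delete all recordings."""
    missing = find_unanalyzed(data_dir, results_dir)
    for path in missing:
        analyze(path)  # fix the data loss
    if delete_recordings == "on_cleanup":
        for path in Path(data_dir).glob("*.wav"):
            path.unlink()
    # the returned list records which files were re-analyzed
    return missing
```

Searching the results folder recursively accounts for the separate results folder that each model switch creates.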

In [None]:
runner.delete_recordings = "never"

In [None]:
runner.reanalyze_on_cleanup = False

In [None]:
runner.clean_up()

In [None]:
runner.delete_recordings = "on_cleanup"

In [None]:
runner.clean_up()

In [None]:
import shutil

# remove the data and output folders that were passed to SparrowWatcher above
shutil.rmtree(Path.home() / "iSparrow_data")
shutil.rmtree(Path.home() / "iSparrow_output")