### Status (03/22/2024)
- no existing library fulfills all requirements
- core problem: in every one of them, the model is hard-wired into the code
- we had to start from a very low level:
  - only able to analyze files
  - core necessary parts:
    - abstractions for recording, preprocessor, model 
    - model, preprocessing disentangled from rest => allows runtime replacement of model
  
- currently there: 
  - [x] model, preprocessor abstractions 
  - [x] disentanglement of data and analysis => allows exchange of analysis part 
  - [x] core BirdNET functionality reintegrated
  - runs in Python on prerecorded files
  
- not yet there: 
  - user interface 
  - integrated system that would 'run alone'  
    - `birdnetlib` provides at the very least starting points 
  - documentation
  - PyTorch support
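
The separation described above (recording, preprocessor, and model as independent parts, with the model replaceable at runtime) can be pictured with a minimal sketch. This is not Sparrow's actual code: `PreprocessorBase`, `ModelBase`, `Recording`, and the methods `process` and `predict` are hypothetical names chosen for illustration. Only `analyze`, `set_analyzer`, `detections`, `analyzer`, and `processor` mirror names used later in this notebook, and their behavior here is an assumption.

```python
from abc import ABC, abstractmethod


class PreprocessorBase(ABC):
    """Hypothetical base: turns raw samples into chunks a model can consume."""

    def __init__(self, name, sample_rate, sample_secs):
        self.name = name
        self.sample_rate = sample_rate
        self.sample_secs = sample_secs

    @abstractmethod
    def process(self, samples):
        """Return a list of chunks."""


class ModelBase(ABC):
    """Hypothetical base: maps one chunk to {label: confidence}."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def predict(self, chunk):
        """Return a dict of label -> confidence for one chunk."""


class Recording:
    """Holds the data; knows nothing about any concrete model."""

    def __init__(self, preprocessor, model, samples, min_conf=0.25):
        self.samples = samples
        self.min_conf = min_conf
        self.set_analyzer(model, preprocessor)

    def set_analyzer(self, model, preprocessor):
        # Swap the analysis part at runtime and discard stale results.
        self.analyzer = model
        self.processor = preprocessor
        self.detections = []

    def analyze(self):
        self.detections = []
        for index, chunk in enumerate(self.processor.process(self.samples)):
            for label, confidence in self.analyzer.predict(chunk).items():
                if confidence >= self.min_conf:
                    self.detections.append(
                        {"chunk": index, "label": label, "confidence": confidence}
                    )


# Toy implementations, only to show that the recording does not care which one it gets.
class FixedChunkPreprocessor(PreprocessorBase):
    def process(self, samples):
        size = int(self.sample_rate * self.sample_secs)
        return [samples[i:i + size] for i in range(0, len(samples), size)]


class PeakModel(ModelBase):
    def __init__(self, name, label, threshold):
        super().__init__(name)
        self.label = label
        self.threshold = threshold

    def predict(self, chunk):
        peak = max((abs(s) for s in chunk), default=0.0)
        return {self.label: 1.0 if peak >= self.threshold else 0.0}


samples = [0.0, 0.1, 0.9, 0.2, 0.0, 0.0]
pre = FixedChunkPreprocessor("toy_pre", sample_rate=2, sample_secs=1.0)
rec = Recording(pre, PeakModel("loud", "loud_event", 0.5), samples)
rec.analyze()
first_run = list(rec.detections)

# Replace the model on the same recording instance and analyze again.
rec.set_analyzer(PeakModel("sensitive", "any_sound", 0.05), pre)
rec.analyze()
second_run = list(rec.detections)
```

The design point is that `Recording` depends only on the two base interfaces, so exchanging the model never requires touching the data-handling code.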

### Code example

First clone the repo, then install it from the repo directory with `python3 -m pip install .`

In [None]:
import sys 
sys.path.append("..")
import iSparrow.sparrow_model_base as spm
import iSparrow.sparrow_recording as spr
import iSparrow.preprocessor_base as spb
import iSparrow.utils as utils
import IPython

import tests.set_up_sparrow_env as sp

In [None]:
from pathlib import Path
import pandas as pd

In [None]:
# Make a mock install of Sparrow. In the future this step will happen behind the scenes.
sp.install()

### Demo

- load the model and preprocessor you want
- add the recording to be analyzed and assign it the model and preprocessor to use
- analyze

In [None]:
# variables for analysis
sigmoid_sensitivity = 1.0
num_threads = 12
min_conf = 0.25

# variables for recording
recording_path = sp.EXAMPLES / "soundscape.wav"
sample_rate = 48000
overlap = 0.0
sample_secs = 3.0
resample_type = "kaiser_fast"

In [None]:
ppd = utils.load_module("ppm", sp.MODELS / "birdnet_default" / "preprocessor.py")

In [None]:
md = utils.load_module("md", sp.MODELS / "birdnet_default" / "model.py")

In [None]:
preprocessor = ppd.Preprocessor(sample_rate=sample_rate, overlap=overlap, sample_secs=sample_secs, resample_type=resample_type)

# use the analysis variables defined above instead of hard-coded values
model = md.Model(sp.MODELS / "birdnet_default", num_threads=num_threads, sigmoid_sensitivity=sigmoid_sensitivity)

In [None]:
recording = spr.SparrowRecording(preprocessor, model, recording_path, min_conf=min_conf)

In [None]:
recording.analyze()

In [None]:
# listen to the recording that was just analyzed
IPython.display.Audio(str(recording_path))

In [None]:
pd.DataFrame(recording.detections)

In [None]:
ppc = utils.load_module("ppm", sp.MODELS / "birdnet_custom" / "preprocessor.py")
mc = utils.load_module("md", sp.MODELS / "birdnet_custom" / "model.py")

### Support for BirdNET's transfer learning approach

- train a new classifier in BirdNET itself (porting the training logic to Sparrow is possible, but low priority at the moment)

- use it in conjunction with the default model

- **this allows appending new species to the existing list, including mammals**
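
How the custom classifier's output is combined with the default model's output is not spelled out here. One plausible reading of "appending" is that the custom labels and scores are concatenated onto the default ones before the confidence threshold is applied. The sketch below illustrates only that idea; `append_species`, `detections_above`, and all labels and scores are made up for illustration and are not taken from BirdNET or Sparrow.

```python
def append_species(default_labels, default_scores, custom_labels, custom_scores):
    """Concatenate the custom classifier's outputs onto the default model's outputs."""
    if len(default_labels) != len(default_scores):
        raise ValueError("default labels and scores must have the same length")
    if len(custom_labels) != len(custom_scores):
        raise ValueError("custom labels and scores must have the same length")
    labels = list(default_labels) + list(custom_labels)
    scores = list(default_scores) + list(custom_scores)
    return labels, scores


def detections_above(labels, scores, min_conf):
    """Keep only the labels whose score reaches the confidence threshold."""
    return [
        {"label": label, "confidence": score}
        for label, score in zip(labels, scores)
        if score >= min_conf
    ]


# Made-up scores for one audio chunk: two default bird species plus one appended mammal.
labels, scores = append_species(
    ["Parus major", "Turdus merula"], [0.8, 0.1],
    ["Vulpes vulpes"], [0.6],
)
hits = detections_above(labels, scores, min_conf=0.25)
```

With this layout the downstream code sees a single, longer species list and does not need to know which entries came from the custom classifier.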

In [None]:
preprocessor = ppc.Preprocessor(sample_rate=sample_rate, overlap=overlap, sample_secs=sample_secs, resample_type=resample_type)

# the custom model needs both the default model and the custom classifier
model = mc.Model(default_model_path=sp.MODELS / "birdnet_default", model_path=sp.MODELS / "birdnet_custom", num_threads=num_threads, sigmoid_sensitivity=sigmoid_sensitivity)

In [None]:
recording = spr.SparrowRecording(preprocessor, model, recording_path, min_conf=min_conf)

In [None]:
recording.analyze()

In [None]:
pd.DataFrame(recording.detections)

### Load a different model into an existing recording instance

- read the module that contains the model and preprocessor during program execution
- build the preprocessor
- build the model
- switch the recording to the new model and preprocessor, and reset the recording
- analyze
- change the model again upon request
- analyze again
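
The first step, reading a module from a file while the program is running, can be done with the standard library's `importlib`. The sketch below shows that general technique. It is not the implementation of `utils.load_module`; the function name `load_module_from_path`, the alias `md_demo`, and the toy `Model` class are assumptions made for this example.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def load_module_from_path(alias, path):
    """Import the Python file at `path` and return it as a module named `alias`."""
    spec = importlib.util.spec_from_file_location(alias, str(path))
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load a module from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register the module before executing it so code inside the file
    # can look itself up under its module name.
    sys.modules[alias] = module
    spec.loader.exec_module(module)
    return module


# Demo with a throwaway file standing in for a bundled model.py.
with tempfile.TemporaryDirectory() as tmp:
    model_file = Path(tmp) / "model.py"
    model_file.write_text("class Model:\n    name = 'toy_model'\n")
    loaded = load_module_from_path("md_demo", model_file)
    toy_model = loaded.Model()
```

Because the file path is only known at runtime, any model directory that follows the same layout can be loaded without changing the calling code.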

In [None]:
ppc = utils.load_module("ppm", sp.MODELS / "google_perch" / "preprocessor.py")
mc = utils.load_module("md", sp.MODELS / "google_perch" / "model.py")

In [None]:
# Perch uses a different sample rate and chunk length than the BirdNET default
preprocessor = ppc.Preprocessor(sample_rate=32000, sample_secs=5., resample_type=resample_type)
model = mc.Model(model_path=sp.MODELS / "google_perch", num_threads=num_threads)

In [None]:
recording = spr.SparrowRecording(preprocessor, model, recording_path, min_conf=min_conf)

In [None]:
recording.analyze()

In [None]:
pd.DataFrame(recording.detections)

In [None]:
# reload the BirdNET default modules and use exactly these to build the new analyzer
ppd = utils.load_module("ppm", sp.MODELS / "birdnet_default" / "preprocessor.py")
md = utils.load_module("md", sp.MODELS / "birdnet_default" / "model.py")

preprocessor = ppd.Preprocessor(sample_rate=sample_rate, overlap=overlap, sample_secs=sample_secs, resample_type=resample_type)

model = md.Model(sp.MODELS / "birdnet_default", num_threads=num_threads, sigmoid_sensitivity=sigmoid_sensitivity)

# swap model and preprocessor on the existing recording instance
recording.set_analyzer(model, preprocessor)

print(recording.analyzer.name)
print(recording.processor.name, recording.processor.sample_rate)

In [None]:
recording.analyze()

In [None]:
pd.DataFrame(recording.detections)

### Current concept for usage in final deployment 

- the scientist bundles the model file with implementations of `model` and `preprocessor` derived from a base provided by Sparrow
- upload models to Hugging Face
- give the URL or model name to Sparrow
  - Sparrow handles caching of models so they aren't downloaded again
- execute the procedure above in an encapsulated way
- this shouldn't create a gap in data acquisition (still to be verified)
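
The caching step can be sketched independently of any particular hosting service. The example below is a minimal illustration under stated assumptions: `ModelCache`, its `get` method, and the `fetch` callable are hypothetical names, and the fake downloader stands in for a real download from a model hub.

```python
import shutil
import tempfile
from pathlib import Path


class ModelCache:
    """Fetch a model bundle once and reuse the local copy afterwards."""

    def __init__(self, cache_dir, fetch):
        self.cache_dir = Path(cache_dir)
        self.fetch = fetch  # callable(model_name, target_dir) that writes the files

    def get(self, model_name):
        target = self.cache_dir / model_name
        if target.is_dir():
            return target  # cache hit: no download
        # Download into a staging directory first so that an interrupted
        # download never looks like a complete cache entry.
        staging = self.cache_dir / (model_name + ".partial")
        if staging.exists():
            shutil.rmtree(staging)
        staging.mkdir(parents=True)
        self.fetch(model_name, staging)
        staging.rename(target)
        return target


# Demo with a fake downloader that records how often it is called.
calls = []


def fake_fetch(model_name, target_dir):
    calls.append(model_name)
    (target_dir / "model.py").write_text("class Model:\n    pass\n")


with tempfile.TemporaryDirectory() as tmp:
    cache = ModelCache(tmp, fake_fetch)
    first = cache.get("birdnet_default")
    second = cache.get("birdnet_default")
    same_location = first == second
    bundle_present = (second / "model.py").is_file()
```

In a real deployment the hub's own client library may already provide such a cache, in which case Sparrow would only need to pass it the model name.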