# Analyzing audio recordings using a CNN

## Set up

Here we create a conda environment for running TensorFlow and the VGGish audio embedding model. Background reading and resources:

https://www.pnas.org/doi/10.1073/pnas.2004702117

https://onlinelibrary.wiley.com/doi/10.1111/oik.08525

https://github.com/tensorflow/models/tree/master/research/audioset/vggish

https://zenodo.org/records/3907296


To create the environment for this workflow, follow these steps:
1) Install Anaconda (https://www.anaconda.com/download/)
2) Create and activate the environment, using PowerShell
```
conda create --name soundScape python=3.10
conda activate soundScape
```

3) Install base libraries
```
conda install -c conda-forge mamba
mamba install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0 cudatoolkit-dev ipykernel nbformat numpy scipy
python -m pip install "tensorflow<2.11" tf-slim resampy soundfile
```

4) Check the TensorFlow install (an empty list means no GPU was detected)
```
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
```

5) Copy the vggish folder from the shared drive
```
# "G:\Shared drives\Projets\Actif\2023_ECCC4_Biodiv\3-Analyses\2-Analyses\vggish"
```
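
Once the steps above are done, it can help to confirm that the libraries from step 3 are importable and that the copied vggish folder is complete before running anything. The sketch below is one possible check using only the standard library. The helper names `missing_modules` and `missing_files` and the relative folder `vggish` are hypothetical; the import names (for example `tf_slim` for the tf-slim package) are assumed to match the packages installed above, and the file list only covers files that this notebook refers to.

```python
import importlib.util
from pathlib import Path

# Import names assumed to correspond to the libraries installed in step 3
REQUIRED_MODULES = ["tensorflow", "tf_slim", "resampy", "soundfile", "numpy", "scipy"]

# Files in the vggish folder that the cells in this notebook refer to
REQUIRED_FILES = [
    "vggish_smoke_test.py",
    "vggish_input.py",
    "vggish_params.py",
    "vggish_postprocess.py",
    "vggish_slim.py",
    "vggish_pca_params.npz",
    "vggish_model.ckpt",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_files(folder, filenames):
    """Return the file names that are not present directly inside `folder`."""
    folder = Path(folder)
    return [name for name in filenames if not (folder / name).is_file()]


if __name__ == "__main__":
    # Hypothetical location: replace with wherever the vggish folder was copied
    vggish_folder = Path("vggish")
    print("Missing modules:", missing_modules(REQUIRED_MODULES))
    print("Missing files:", missing_files(vggish_folder, REQUIRED_FILES))
```

Two empty lists suggest the environment is ready for the smoke test; anything listed points to a step that should be repeated.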

## Test install

In [1]:
# Load package
import os

# Specify parameters
script="\"P:/Projets/Actif/2023_ECCC4_Biodiv/3-Analyses/2-Analyses/vggish/vggish_smoke_test.py\""

# Change directory
os.chdir('P:\\Projets\\Actif\\2023_ECCC4_Biodiv\\3-Analyses\\2-Analyses\\vggish')

# Run terminal command
!python {script}

# Change back directory
os.chdir('C:\\Projects\\2023_ECCC4_Biodiv\\1.SoundEmbedding_VGGish')

# Clean environment
del script


Testing your install of VGGish

Resampling via resampy works!
Log Mel Spectrogram example:  [[-4.48313252 -4.27083405 -4.17064267 ... -4.60069383 -4.60098887
  -4.60116305]
 [-4.48313252 -4.27083405 -4.17064267 ... -4.60069383 -4.60098887
  -4.60116305]
 [-4.48313252 -4.27083405 -4.17064267 ... -4.60069383 -4.60098887
  -4.60116305]
 ...
 [-4.48313252 -4.27083405 -4.17064267 ... -4.60069383 -4.60098887
  -4.60116305]
 [-4.48313252 -4.27083405 -4.17064267 ... -4.60069383 -4.60098887
  -4.60116305]
 [-4.48313252 -4.27083405 -4.17064267 ... -4.60069383 -4.60098887
  -4.60116305]]
VGGish embedding:  [-2.73169428e-01 -1.80366933e-01  5.19998372e-02 -1.43594891e-01
 -1.04789361e-01 -4.96687800e-01 -1.75353721e-01  4.23048645e-01
 -8.22081447e-01 -2.16846049e-01 -1.17590576e-01 -6.70083344e-01
  1.43157810e-01 -1.44236773e-01  8.80868733e-03 -8.71441662e-02
 -1.84470892e-01  5.96522272e-01 -3.43975008e-01 -5.78861833e-02
 -1.64907858e-01  4.22914624e-02 -2.55291790e-01 -2.36321270e-01
  1.80

2024-04-18 09:12:53.154741: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-18 09:12:54.573515: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1616] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 5450 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 3070 Ti Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6
2024-04-18 09:12:54.898285: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled
2024-04-18 09:13:17.624593: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8100
2024-04-18 09:13:20.196874: I tensorflow/stream_executor/cuda/cuda_blas.cc:1614] TensorFloat-32 will be used for the matrix multiplication. This
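
The test cell above changes the working directory by hand and changes it back at the end, so an error in between would leave the notebook in the vggish folder. A minimal sketch of an alternative is shown below: a small context manager that always restores the previous directory, combined with `subprocess.run` in place of the `!python` notebook shortcut. The names `working_directory` and `run_script` are hypothetical helpers, not part of VGGish.

```python
import contextlib
import os
import subprocess
import sys


@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch to `path`, then return to the previous directory."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Runs even if the body raises, so the session never stays in `path`
        os.chdir(previous)


def run_script(script, folder):
    """Run `script` with the current Python interpreter from inside `folder`."""
    with working_directory(folder):
        return subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,  # collect stdout and stderr instead of printing them
            text=True,            # decode the captured output as str
        )
```

Calling `run_script` with the smoke test path and the vggish folder returns a result object whose `stdout`, `stderr`, and `returncode` can then be inspected or printed.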

## Code

https://colab.research.google.com/drive/1E3CaPAqCai9P9QhJ3WYPNCVmrJU4lAhF#scrollTo=O1YVQb-MBiUx

https://github.com/tensorflow/models/tree/master/research/audioset/vggish

In [2]:
# Load packages
from __future__ import print_function

# Change directory
import os
os.chdir('P:\\Projets\\Actif\\2023_ECCC4_Biodiv\\3-Analyses\\2-Analyses\\vggish')

import tensorflow.compat.v1 as tf
import vggish_input
import vggish_params
import vggish_postprocess
import vggish_slim

# Specify file locations
waveFile=r"P:/Projets/Actif/2023_ECCC4_Biodiv/3-Analyses/2-Analyses/vggish/Testing/test.wav"
pcaParms=r"P:\\Projets\\Actif\\2023_ECCC4_Biodiv\\3-Analyses\\2-Analyses\\vggish\\vggish_pca_params.npz"
checkP=r"P:\\Projets\\Actif\\2023_ECCC4_Biodiv\\3-Analyses\\2-Analyses\\vggish\\vggish_model.ckpt"

# Prepare wave file
examples_batch = vggish_input.wavfile_to_examples(waveFile)

# Prepare a postprocessor to munge the model embeddings.
pproc = vggish_postprocess.Postprocessor(pcaParms)

# Define the model in inference mode, load the checkpoint, 
# and locate input and output tensors.
with tf.Graph().as_default(), tf.Session() as sess:    
    vggish_slim.define_vggish_slim(training=False)
    vggish_slim.load_vggish_slim_checkpoint(sess,checkP)
    features_tensor = sess.graph.get_tensor_by_name(
        vggish_params.INPUT_TENSOR_NAME)
    embedding_tensor = sess.graph.get_tensor_by_name(
        vggish_params.OUTPUT_TENSOR_NAME)

    # Run inference and postprocessing.
    [embedding_batch] = sess.run([embedding_tensor],
                                 feed_dict={features_tensor: examples_batch})
    postprocessed_batch = pproc.postprocess(embedding_batch)

# Clean environment
del print_function, waveFile, pcaParms, checkP, examples_batch, pproc, sess
del features_tensor, embedding_tensor, embedding_batch



INFO:tensorflow:Restoring parameters from P:\\Projets\\Actif\\2023_ECCC4_Biodiv\\3-Analyses\\2-Analyses\\vggish\\vggish_model.ckpt
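
The batch returned by `wavfile_to_examples` holds one example per slice of audio, so the number of rows in `postprocessed_batch` depends on the length of the recording. The sketch below estimates that number from the duration. All constants are assumptions recalled from `vggish_params.py` and the framing logic of the reference implementation (16 kHz audio, 25 ms window, 10 ms hop, 96 frames per example with no overlap), and the function names are hypothetical; the values should be checked against the local copy of the vggish folder.

```python
# Assumed values; verify against vggish_params.py in the copied vggish folder
SAMPLE_RATE = 16000        # Hz, rate the audio is resampled to
STFT_WINDOW_SAMPLES = 400  # 25 ms analysis window at 16 kHz
STFT_HOP_SAMPLES = 160     # 10 ms hop between spectrogram frames
EXAMPLE_FRAMES = 96        # spectrogram frames per example (0.96 s)
EXAMPLE_HOP_FRAMES = 96    # examples do not overlap


def count_frames(num_items, window, hop):
    """Number of complete windows of length `window` taken every `hop` items."""
    if num_items < window:
        return 0
    return 1 + (num_items - window) // hop


def expected_examples(duration_seconds):
    """Estimate how many VGGish examples a recording of this length yields."""
    num_samples = int(round(duration_seconds * SAMPLE_RATE))
    stft_frames = count_frames(num_samples, STFT_WINDOW_SAMPLES, STFT_HOP_SAMPLES)
    return count_frames(stft_frames, EXAMPLE_FRAMES, EXAMPLE_HOP_FRAMES)


if __name__ == "__main__":
    for seconds in (1.0, 3.0, 60.0):
        print(seconds, "s ->", expected_examples(seconds), "example(s)")
```

Under these assumptions a recording yields roughly one embedding per 0.96 s of audio, with any incomplete final slice dropped.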


In [4]:
# Print the postprocessed embedding of the second example (index 1)
print(postprocessed_batch[1])

[137  87 114 104  80  15 235 110 208 125 164  86 191 134 186  97 185  70
  91 207 160 129 218 108  83 180  58 136 255 137   0  21 178  80 170 146
 127  75  41  71 107 126 117 115 124  23  50 160 153 116 110 255 166  60
 205 185   9 122   0 141 140  57  79 114 106  49  60  96 173  37 126 143
 232 148  63  62  19 221 255  57 157  55 165 255 217  15 229 165 181  90
 221 189 159 180 113  98 158 130  39 186  86  95  13  34 255 179 197  62
  35   0  94 173  86 109   4 126 133  61 117   0 150 188 170  92 235  68
  35 138]
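
The printed embedding contains 128 integers between 0 and 255, whereas the smoke test earlier printed floating-point values. The difference comes from the postprocessor, which, as far as the reference implementation goes, applies a PCA transform, clips the result to a fixed range, and quantizes it to 8 bits. The sketch below illustrates only the clipping and quantization step in plain Python. The range of -2.0 to 2.0 is an assumption recalled from `vggish_params.py`, and `quantize` and `dequantize` are hypothetical helpers; `dequantize` is an approximate inverse added for illustration and is not part of VGGish.

```python
# Assumed clipping range; verify against vggish_params.py
QUANTIZE_MIN_VAL = -2.0
QUANTIZE_MAX_VAL = 2.0


def quantize(value, low=QUANTIZE_MIN_VAL, high=QUANTIZE_MAX_VAL):
    """Clip `value` to [low, high] and map it to an integer in 0..255."""
    clipped = min(max(value, low), high)
    scaled = (clipped - low) * (255.0 / (high - low))
    return int(scaled)  # truncates, like casting a non-negative float to uint8


def dequantize(code, low=QUANTIZE_MIN_VAL, high=QUANTIZE_MAX_VAL):
    """Map an 8-bit code back to an approximate value in [low, high]."""
    return low + code * (high - low) / 255.0


if __name__ == "__main__":
    for value in (-3.0, -2.0, 0.0, 1.0, 2.0, 5.0):
        code = quantize(value)
        print(value, "->", code, "->", round(dequantize(code), 3))
```

Values of 0 and 255 in the output therefore likely correspond to components that fell at or beyond the clipping range.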
