# Automatically create a stimuli database

When building a stimuli database, you often have to repeat the same mechanical tasks: converting sound files to mono, cutting a sound or video file into several smaller files, naming them properly, and so on.

In this tutorial you will learn how to do all this automatically, without spending hours cutting and naming files.

The transform_audio.py file contains several useful functions. Here are some of them:

In [13]:
from transform_audio import *

## Format conversion functions

In [19]:
help(aif_to_wav)

Help on function aif_to_wav in module transform_audio:

aif_to_wav(source, target)
    source : source audio file
    target : target audio file



In [20]:
help(mono_to_stereo)

Help on function mono_to_stereo in module transform_audio:

mono_to_stereo(source, target)



## Indexing functions

In [16]:
# A function to get the time tags of the sentences inside a file
help(extract_sentences_tags)

Help on function extract_sentences_tags in module transform_audio:

extract_sentences_tags(source, rms_threshold=-50, WndSize=16384, overlap=8192)
    source : fsource audio file
    This function separates all the sentences inside an audiofile.
    It takes each sentence and put it into one audio file inside target_folder with the name target_nb
    The default parameters were tested with notmal speech.
    Only works if file is at least 500 ms long, which can be tuned
    You can change the rms threshold to tune the algorithm
    
    returns tags in pairs of [begining end]
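
To make the role of `rms_threshold`, `WndSize` and `overlap` more concrete, here is a minimal sketch of how this kind of RMS-based tagging can work. It is not the code of `extract_sentences_tags`: the names `rms_db` and `sketch_sentence_tags` are hypothetical, samples are assumed to be floats in [-1, 1] (so levels are in dB relative to full scale), and the real function may handle window edges and minimum durations differently.

```python
import math


def rms_db(frame):
    """RMS level of a list of samples, in dB relative to full scale."""
    if not frame:
        return -200.0
    mean_square = sum(x * x for x in frame) / float(len(frame))
    # Floor the value so that digital silence does not produce log10(0).
    return 10 * math.log10(max(mean_square, 1e-20))


def sketch_sentence_tags(samples, sample_rate, rms_threshold=-50,
                         wnd_size=16384, overlap=8192):
    """Return [begin, end] pairs (in seconds) of the regions above the threshold."""
    hop = wnd_size - overlap
    if hop <= 0:
        raise ValueError("overlap must be smaller than wnd_size")
    tags = []
    begin = None
    last_start = max(len(samples) - wnd_size, 0)
    for start in range(0, last_start + 1, hop):
        loud = rms_db(samples[start:start + wnd_size]) > rms_threshold
        if loud and begin is None:
            # First loud window after silence: a region starts here.
            begin = start
        elif not loud and begin is not None:
            # First quiet window after a loud region: the region ends here.
            tags.append([begin / float(sample_rate), start / float(sample_rate)])
            begin = None
    if begin is not None:
        # The file ends while the signal is still above the threshold.
        tags.append([begin / float(sample_rate), len(samples) / float(sample_rate)])
    return tags
```

In this sketch, raising `rms_threshold` makes the detection stricter (quiet passages count as silence), while a larger window smooths over short pauses inside a sentence.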



In [25]:
# Function to cut a wav file containing several sentences into one file per sentence
help(index_wav_file)

Help on function index_wav_file in module transform_audio:

index_wav_file(source, rms_threshold=-50, WndSize=16384, target_folder='Indexed')
    source : source audio file
    This function separates all the sentences inside an audiofile.
    It takes each sentence and put it into one audio file inside target_folder with the name target_nb
    The default parameters were tested with notmal speech.
    Only works if file is at least 500 ms long, which can be tuned
    You can change the rms threshold to tune the algorithm
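
Once the time tags are known, the cutting step itself is simple. The sketch below shows one way to do it with the standard `wave` module, under stated assumptions: `sketch_cut_wav` and the `sentence_000.wav` naming scheme are hypothetical and are not taken from `index_wav_file`, and `tags` is a list of `[begin, end]` pairs in seconds.

```python
import os
import wave


def sketch_cut_wav(source, tags, target_folder="Indexed"):
    """Write one wav file per [begin, end] pair (in seconds) found in tags."""
    if not os.path.isdir(target_folder):
        os.makedirs(target_folder)
    written = []
    reader = wave.open(source, "rb")
    try:
        n_channels = reader.getnchannels()
        sample_width = reader.getsampwidth()
        frame_rate = reader.getframerate()
        total = reader.getnframes()
        for nb, (begin, end) in enumerate(tags):
            # Convert seconds to frame indices, clamped to the file length.
            first = min(max(int(round(begin * frame_rate)), 0), total)
            last = min(max(int(round(end * frame_rate)), first), total)
            reader.setpos(first)
            frames = reader.readframes(last - first)
            # Hypothetical naming scheme: zero-padded sentence number.
            target = os.path.join(target_folder, "sentence_%03d.wav" % nb)
            writer = wave.open(target, "wb")
            try:
                writer.setnchannels(n_channels)
                writer.setsampwidth(sample_width)
                writer.setframerate(frame_rate)
                writer.writeframes(frames)
            finally:
                writer.close()
            written.append(target)
    finally:
        reader.close()
    return written
```

Zero-padded numbers keep the files in the right order when they are listed alphabetically, which is convenient when the stimuli are loaded later.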



In [22]:
# Function to cut the silence at the beginning and at the end of a sound
help(cut_silence_in_sound)

Help on function cut_silence_in_sound in module transform_audio:

cut_silence_in_sound(source, target, rmsTreshhold=-40, WndSize=128)
    source : fsource audio file
    target : output sound
    This function cuts the silence at the begining and at the end of an audio file in order. 
    It's usefull for normalizing the length of the audio stimuli in an experiment.
    The default parameters were tested with notmal speech.



In [23]:
# Function to get the time tags of the sound without the silence at its beginning and end
help(get_sound_without_silence)

Help on function get_sound_without_silence in module transform_audio:

get_sound_without_silence(source, rmsTreshhold=-40, WndSize=128)
    source : source audio file
    This function returns a begining and end time tags for the begining and the end of audio in a file



## Dynamically create a stimuli database

The STIM module provides some useful functions to cut audio/video files that contain several sentences.
All the important functions can be found in the transform_audio script.

For instance, let's take an example sound file that contains several sentences:

In [1]:
import IPython
long_file = "sounds/several_sentences.wav"
IPython.display.Audio(long_file)

This audio file is stereo, as its channel count shows:

In [5]:
from audio_analysis import get_nb_channels
print(get_nb_channels(long_file))

2
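
If the `audio_analysis` module is not at hand, the same check can be done with Python's standard `wave` module. This is only a sketch of an equivalent: `sketch_get_nb_channels` is a hypothetical name, and nothing here is taken from the actual `get_nb_channels` implementation.

```python
import wave


def sketch_get_nb_channels(path):
    """Return the number of channels of a PCM wav file (1 = mono, 2 = stereo)."""
    reader = wave.open(path, "rb")
    try:
        return reader.getnchannels()
    finally:
        reader.close()
```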


So let's transform it to mono:

In [7]:
help(wav_to_mono)

Help on function wav_to_mono in module transform_audio:

wav_to_mono(source, target)
    source : source audio file
    target : target audio file



In [8]:
long_file_mono = "sounds/several_sentences_mono.wav"
wav_to_mono(long_file, long_file_mono)

Now the new sound is in mono:

In [9]:
get_nb_channels(long_file_mono)

1
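
For reference, a stereo-to-mono conversion can be sketched with the standard library alone. The version below averages the channels of each frame. This is an assumption: the tutorial does not say whether `wav_to_mono` averages the channels or keeps only one of them. The name `sketch_wav_to_mono` is hypothetical and the sketch only handles 16-bit PCM files.

```python
import struct
import wave


def sketch_wav_to_mono(source, target):
    """Average the channels of a 16-bit PCM wav file into a mono file."""
    reader = wave.open(source, "rb")
    try:
        n_channels = reader.getnchannels()
        sample_width = reader.getsampwidth()
        frame_rate = reader.getframerate()
        raw = reader.readframes(reader.getnframes())
    finally:
        reader.close()
    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM files")
    # Samples are interleaved: left, right, left, right, ...
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    mono = [
        int(round(sum(samples[i:i + n_channels]) / float(n_channels)))
        for i in range(0, len(samples), n_channels)
    ]
    writer = wave.open(target, "wb")
    try:
        writer.setnchannels(1)
        writer.setsampwidth(2)
        writer.setframerate(frame_rate)
        writer.writeframes(struct.pack("<%dh" % len(mono), *mono))
    finally:
        writer.close()
```

Averaging keeps the output within the 16-bit range, whereas summing the channels could clip.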

Now we can cut the file into its sentences with the index_wav_file function:

In [14]:
# Function to cut a wav file containing several sentences into one file per sentence
help(index_wav_file)

Help on function index_wav_file in module transform_audio:

index_wav_file(source, rms_threshold=-50, WndSize=16384, target_folder='Indexed')
    input:
            source : fsource audio file
            rms_threshold : this is the threshold
            WndSize : window size to compue the RMS on
            target_folder : folder to save the extracted sounds in
    
    This function separates all the sentences inside an audiofile.
    It takes each sentence and put it into one audio file inside target_folder with the name target_nb
    The default parameters were tested with notmal speech.
    Only works if file is at least 500 ms long, which can be tuned
    You can change the rms threshold to tune the algorithm



In [21]:
# Create a folder to save the sounds in (skipped if it already exists)
import os
folder_db = "sounds/data_base/"
if not os.path.isdir(folder_db):
    os.mkdir(folder_db)

In [26]:
index_wav_file(long_file_mono, target_folder = folder_db)

duree  :  1.11455782313
<type 'file'>
duree  :  1.48607709751
<type 'file'>
duree  :  1.48607709751
<type 'file'>
duree  :  1.48607709751
<type 'file'>


The sound has now been indexed, and the extracted sentences are in the database folder.

In [36]:
# Here is an example: listen to the first extracted sentence
import glob
list_of_files = sorted(glob.glob("sounds/data_base/*.wav"))
IPython.display.Audio(list_of_files[0])
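
The same pattern extends to whole folders. The sketch below applies any `process(source, target)` function to every wav file of a folder, for example to trim the silence of every extracted sentence. `batch_apply` is a hypothetical helper, not part of transform_audio. It assumes the processing function takes a source path and a target path, as `cut_silence_in_sound` and `wav_to_mono` do according to their help text.

```python
import glob
import os


def batch_apply(process, source_folder, target_folder, pattern="*.wav"):
    """Call process(source, target) for every matching file, keeping file names."""
    if not os.path.isdir(target_folder):
        os.makedirs(target_folder)
    outputs = []
    # Sort the paths so that the files are always processed in the same order.
    for source in sorted(glob.glob(os.path.join(source_folder, pattern))):
        target = os.path.join(target_folder, os.path.basename(source))
        process(source, target)
        outputs.append(target)
    return outputs
```

With the functions of this tutorial, a call could look like `batch_apply(cut_silence_in_sound, "sounds/data_base/", "sounds/data_base_trimmed/")`, where the second folder name is only an example.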