# Sample Preparation

This notebook is part of the project 'Continuous State Modeling for Statistical Spectral Synthesis'.



## Imports
Import the relevant modules and functions.

In [3]:
import CSM_functions as csm
import numpy as np
import os


## Paths
All relevant paths should be declared here.
You might have to reassign some of these.

In [4]:
# path of notebook
path_nb = os.getcwd()

# path to the list_single.txt file
path_frequency = '../TU-Note_Violin/File_Lists/list_Single.txt'

#path to the segmentation file
path_annotations = '../TU-Note_Violin/Segments/SingleSounds/SampLib_DPA_'

# path to audio files (the TU-Note Violin Sample Library is in 96 kHz)
path_soundfile_96k = '../TU-Note_Violin/WAV/SingleSounds/BuK/SampLib_BuK_'

#path to audio files in 44.1k
#path_soundfile = '/Users/tim-tarek/Desktop/TU-Note_Violin_41kHz/WAV/SingleSounds/BuK/SampLib_BuK_'
path_soundfile = os.path.join(path_nb,"input44100/sounds/")

#output path to extracted parameters 
path_extracted = os.path.join(path_nb,"extracted_parameters")


## Convert Samples to 44.1kHz


In [5]:
#file_indices = np.linspace(1,84*4, 84*4)
file_indices = np.array([1])
csm.batch_convert_to_44100(file_indices, path_in=path_soundfile_96k, path_out= path_soundfile)

1
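The internals of `csm.batch_convert_to_44100` are not shown in this notebook, so the following is only a sketch of the arithmetic behind a 96 kHz to 44.1 kHz conversion. The helper names `resample_ratio` and `resampled_length` are hypothetical. The resulting integer pair is the kind of up/down factor a polyphase resampler such as `scipy.signal.resample_poly(x, up, down)` expects.

```python
from fractions import Fraction


def resample_ratio(fs_in, fs_out):
    """Return (up, down), the smallest integers with fs_out / fs_in == up / down."""
    ratio = Fraction(fs_out, fs_in)
    return ratio.numerator, ratio.denominator


def resampled_length(n_samples, fs_in, fs_out):
    """Number of output samples (rounded up) for n_samples input samples."""
    up, down = resample_ratio(fs_in, fs_out)
    return -(-n_samples * up // down)  # ceiling division on integers


up, down = resample_ratio(96000, 44100)
print(up, down)                                # 147 320
print(resampled_length(96000, 96000, 44100))   # 44100
```

One second of audio at 96 kHz therefore maps to exactly 44100 samples, since 44100 / 96000 reduces to 147 / 320.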


## Extract relevant information from TU-Note Violin Sample Library
The frequencies contained in the list_Single.txt file will help us narrow down the range of the fundamental frequency.

Here is how you can extract the fundamental frequency.

In [10]:
list_single = csm.read_list_single_TU_VSL(path_frequency)
frequency_test = csm.read_frequency_TU_VSL(list_single, index = 1)
print(frequency_test)

197.33
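As a small illustration of how a listed frequency can narrow the f0 search range, the sketch below derives a symmetric range around the nominal frequency. The helper names `f0_search_range` and `factor_in_semitones` are hypothetical; the factor 1.2 is the one used for `minf0` and `maxf0` later in this notebook.

```python
import math


def f0_search_range(f0, factor=1.2):
    """Return (minf0, maxf0) as f0 / factor and f0 * factor."""
    if f0 <= 0 or factor <= 1:
        raise ValueError("f0 must be positive and factor must be greater than 1")
    return f0 / factor, f0 * factor


def factor_in_semitones(factor):
    """Express a frequency ratio as a distance in equal-tempered semitones."""
    return 12 * math.log2(factor)


minf0, maxf0 = f0_search_range(197.33)
print(round(minf0, 2), round(maxf0, 2))
print(round(factor_in_semitones(1.2), 2))
```

A factor of 1.2 corresponds to roughly three semitones on either side of the nominal pitch.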


Now we can extract the envelope annotations (start and stop of the sustain part) from the TU-Note Violin Sample Library.


In [11]:
start_sec, stop_sec = csm.read_annotations_TU_VSL(path_annotations, index = 96)
start_samp, stop_samp = int(start_sec*44100), int(stop_sec*44100)
print("Start: ", start_sec, " sec, ", start_samp, " samples. \nStop: ", stop_sec, " sec,", stop_samp, " samples.")

Start:  0.205333  sec,  9055  samples. 
Stop:  4.898667  sec, 216031  samples.
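The conversion from seconds to sample indices depends on the sampling rate of the file being analyzed. The sketch below wraps it in a hypothetical helper, `seconds_to_samples`, and shows the indices for both rates used in this notebook. Whether `csm.write_parameters` expects indices at 44.1 kHz or at the rate of the file it reads is not shown here, so that is worth checking before pointing it at the 96 kHz files.

```python
def seconds_to_samples(t_sec, fs):
    """Convert a time in seconds to a sample index, truncating toward zero."""
    return int(t_sec * fs)


start_sec, stop_sec = 0.205333, 4.898667  # annotation values printed above

for fs in (44100, 96000):
    start_samp = seconds_to_samples(start_sec, fs)
    stop_samp = seconds_to_samples(stop_sec, fs)
    print(fs, start_samp, stop_samp)
# 44100 9055 216031
# 96000 19711 470272
```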


## Extract Parameters
Now we can put everything together:

We read the list_single file into memory.

Then we create a list of sound item indices, through which we iterate.

For every sound item we now read the frequency as well as start and stop annotations for the sustain part.

The `csm.write_parameters()` function then writes the statistical parameters of the partial trajectories (mean amplitude, standard deviation of amplitude, standard deviation of frequency) into text files.

In [4]:
#only needed to be read once, so out of loop
list_single = csm.read_list_single_TU_VSL(path_frequency)

#put indices of sound items you want to read in list
#list_items = [1]
list_items = np.arange(1, 84*4 + 1)

for index in list_items:
    index = int(index)
    #read frequency from list_single.txt
    frequency_f0 = csm.read_frequency_TU_VSL(list_single, index)

    #read start and stop sample indices from annotation file 
    start_sec, stop_sec = csm.read_annotations_TU_VSL(path_annotations, index)
    start_samp, stop_samp = int(start_sec*44100), int(stop_sec*44100)

    #enter sms-tools parameters
    window='blackman'
    M=1201
    N=2048
    t=-110
    minSineDur=0.05
    nH=100
    minf0= 1/1.2 * frequency_f0
    maxf0= 1.2 * frequency_f0
    f0et=7 
    harmDevSlope=0.01
    Ns = 512
    H = 128

    #write to file
    csm.write_parameters(path_soundfile_96k, 
                         path_extracted,  
                         frequency_f0, 
                         index, 
                         start_samp, 
                         stop_samp, 
                         window, 
                         M, 
                         N, 
                         t, 
                         minSineDur, 
                         nH, 
                         minf0, 
                         maxf0, 
                         f0et, 
                         harmDevSlope, 
                         Ns, 
                         H, 
                         verbose = False)

  (fs, x) = read(input_path)
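To make the three statistics concrete, here is a minimal sketch that computes them for toy partial trajectories. It is not the implementation in `CSM_functions`. The function name `partial_statistics`, the data layout (one list per frame, one value per partial) and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def partial_statistics(amps, freqs):
    """Per-partial mean amplitude, amplitude std and frequency std.

    amps and freqs are lists of frames; each frame holds one value per partial.
    """
    stats = []
    # zip(*frames) transposes frames x partials into one track per partial
    for amp_track, freq_track in zip(zip(*amps), zip(*freqs)):
        stats.append({
            "mean_amp": mean(amp_track),
            "std_amp": pstdev(amp_track),    # population std (assumption)
            "std_freq": pstdev(freq_track),
        })
    return stats


# Toy data: 3 frames, 2 partials (amplitudes in dB, frequencies in Hz)
amps = [[-20.0, -30.0], [-22.0, -30.0], [-21.0, -30.0]]
freqs = [[196.0, 392.0], [198.0, 396.0], [197.0, 394.0]]

stats = partial_statistics(amps, freqs)
for k, s in enumerate(stats, start=1):
    print(k, s)
```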


## Refine parameters

Some settings do not work for all 336 items. High and quiet sounds in particular tend to fail with parameters tuned to lower and louder items.
Quieter sound items also tend to have an unstable pitch trajectory. For these, the maximum f0 error threshold (f0et) can be raised.
For such items I suggest the following parameters:

- M: 201 (high), 2001 (low)
- N: 512 (high), 4096 (low)
- t: -110
- minSineDur: 0.01
- nH: 100
- minf0: 1/1.2 * frequency_f0
- maxf0: 1.2 * frequency_f0
- f0et: 40
- harmDevSlope: 0.01

It could also help to switch to the 96 kHz version of the sound files. In that case it makes sense to double the FFT size Ns and the hop size H.
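The factor of 2 can be read as the sample-rate ratio 96000 / 44100 (about 2.18) rounded to a power of two. The sketch below makes that step explicit. The helper names are hypothetical, and keeping Ns a power of two and preserving the original H / Ns ratio are assumptions made here, not requirements stated in this notebook.

```python
import math


def nearest_power_of_two(x):
    """Round a positive number to the nearest power of two (on a log scale)."""
    return 2 ** round(math.log2(x))


def scale_for_sample_rate(Ns, H, fs_old=44100, fs_new=96000):
    """Scale FFT size and hop size so they cover a similar time span at fs_new."""
    ratio = fs_new / fs_old
    Ns_new = nearest_power_of_two(Ns * ratio)
    H_new = Ns_new * H // Ns  # keep the original H / Ns ratio
    return Ns_new, H_new


print(scale_for_sample_rate(512, 128))  # (1024, 256)
```

With the defaults used above (Ns = 512, H = 128), this yields the same values as `512*2` and `128*2`.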





In [62]:
#only needed to be read once, so out of loop
list_single = csm.read_list_single_TU_VSL(path_frequency)

#put indices of sound items you want to read in list
list_bad_items = np.arange(62, 63 )

#list_bad_items = np.arange(175,337)

for index in list_bad_items:
    index = int(index)
    
    #read frequency from list_single.txt
    frequency_f0 = csm.read_frequency_TU_VSL(list_single, index)

    #read start and stop sample indices from annotation file 
    start_sec, stop_sec = csm.read_annotations_TU_VSL(path_annotations, index)
    start_samp, stop_samp = int(start_sec*44100), int(stop_sec*44100)

    #enter sms-tools parameters
    window='blackman'
    M=2001
    N=4096
    t=-110
    minSineDur=0.01
    nH=100
    minf0= 1/1.7 * frequency_f0
    maxf0= 1.7 * frequency_f0
    f0et=80
    harmDevSlope=0.01 
    Ns = 512*2
    H = 128*2

    #write to file
    csm.write_parameters(path_soundfile_96k, 
                         path_extracted,  
                         frequency_f0, 
                         index, 
                         start_samp, 
                         stop_samp, 
                         window, 
                         M, 
                         N, 
                         t, 
                         minSineDur, 
                         nH, 
                         minf0, 
                         maxf0, 
                         f0et, 
                         harmDevSlope, 
                         Ns, 
                         H, 
                         verbose = False)