# Fastai Migration Notes (v1)

This notebook contains notes and experiments using fastai v1. Unfortunately, 
I cannot combine notes for both v1 and v2 in the same notebook because they
need to be run using different Python environments.

For each step that touches fastai or fastai_audio, I will create equivalent
code in the corresponding `fastai-migration-notes-v2.ipynb` notebook.

In [None]:
from pathlib import Path

from audio import (
    AudioList, 
    AudioConfig, 
    ClassificationInterpretation,
    SpectrogramConfig,
    get_spectro_transforms,
    audio_learner,
)

## Data Check

I'm following the rough structure of `2_FastAI_v2_Script.py`. I'll keep some text from that notebook as quotes for reference.

> This step just checks data and provide some summary statistics like sampling rate of different audio clips and length distribution of each waveFile

In [None]:
## Path to the modeling data (contains two folders: positive and negative)
data_folder = Path("data/sample/")
audios = AudioList.from_folder(data_folder)

len(audios)

In [None]:
len_dict = audios.stats(prec=1)

## Load Data

As far as I can tell, everything that's being done by AudioConfig and AudioList can be replicated using torchaudio transforms. I will preview the spectrograms without any masking and make sure they're identical to the output from torchaudio's equivalent code.

In [None]:
## Defining the audio config needed to create mel spectrograms on the fly
config = AudioConfig(standardize=False, 
                     sg_cfg=SpectrogramConfig(
                         f_min=0.0,  ## Minimum frequency to Display
                         f_max=10000, ## Maximum Frequency to Display
                         hop_length=256,
                         n_fft=2560, ## Number of Samples for Fourier
                         n_mels=256, ## Mel bins
                         pad=0, 
                         to_db_scale=True, ## Convert to dB scale
                         top_db=100,  ## Top decibel (dB) cut-off
                         win_length=None, 
                         n_mfcc=20)
                    )
config.duration = 4000 ## Pad or trim every clip to 4 seconds (4000 ms)
config.resample_to = 20000 ## Resample every clip to 20,000 Hz
config
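As a rough sanity check on the config above, the expected spectrogram size can be worked out by hand. The sketch below assumes a torchaudio-style centered STFT (frame count = samples // hop_length + 1), which I have not verified against fastai_audio; the helper name `expected_spec_shape` is made up for illustration.

```python
def expected_spec_shape(duration_ms, sample_rate, hop_length, n_mels):
    """Return (n_mels, n_frames) for a clip padded/trimmed to duration_ms.

    Assumption: the STFT is centered, so n_frames = n_samples // hop_length + 1.
    """
    n_samples = duration_ms * sample_rate // 1000
    n_frames = n_samples // hop_length + 1
    return n_mels, n_frames


# Values copied from the config cell above.
shape = expected_spec_shape(duration_ms=4000, sample_rate=20000, hop_length=256, n_mels=256)
print(shape)  # (256, 313)
```

If the shape printed by `tensor.shape` further down differs, the centering assumption (or the padding behaviour) is the first thing to check.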

In [None]:
print(f"{config.max_to_pad=}")
print(f"{config.segment_size=}")
print(f"{config.sg_cfg.to_db_scale=}")
print(f"{config.mfcc=}")
print(f"{config.standardize=}")
print(f"{config.delta=}")
print(f"{config.duration=}")
print(f"{config._processed=}")
print(f"{config._sr=}")
print(f"{config.sg_cfg.hop_length=}")
print(f"{config.pad_mode=}")
print(f"{config.sg_cfg.top_db=}")

config.sg_cfg.mel_args()

In [None]:
audios = AudioList.from_folder(data_folder, config=config)

In [None]:
from fastai.vision import ItemList

ItemList.get(audios, 0)

In [None]:
from torchvision.transforms import functional as F

tensor = audios.get(0).get_spec_images()[0].px
print(tensor.shape)
image = F.to_pil_image(tensor)
image

In [None]:
audios.get(0).show()

This code creates an AudioDataBunch, which applies the defined transformations (in our case, frequency masking) on the fly and feeds the resulting spectrograms to the model in batches of the defined size (64).

In [None]:
data_folder = Path("./data/train/mldata/all/") 
audios = AudioList.from_folder(data_folder, config=config).split_by_rand_pct(.2, seed=4).label_from_folder()

## Defining Transformation
tfms = None

## Frequency masking: ON
tfms = get_spectro_transforms(mask_time=False, mask_freq=True, roll=False) 

## Creating a databunch
db = audios.transform(tfms).databunch(bs=64)

## Let's inspect some data
db.show_batch(1)
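To make "frequency masking" concrete, here is a minimal pure-Python sketch of the idea: a band of adjacent mel bins is overwritten with a constant across all time frames. This is not the fastai_audio implementation; `mask_frequency`, its parameters, and the fill value are assumptions for illustration.

```python
import random


def mask_frequency(spec, max_width, fill=0.0, rng=None):
    """Return a copy of spec (list of mel-bin rows) with one random band of rows masked."""
    rng = rng or random.Random()
    n_bins = len(spec)
    width = rng.randint(1, min(max_width, n_bins))
    start = rng.randint(0, n_bins - width)
    masked = [row[:] for row in spec]  # copy so the input is left untouched
    for i in range(start, start + width):
        masked[i] = [fill] * len(masked[i])
    return masked


# Toy "spectrogram": 6 mel bins x 4 time frames, all ones.
spec = [[1.0] * 4 for _ in range(6)]
masked = mask_frequency(spec, max_width=2, rng=random.Random(0))
for row in masked:
    print(row)
```

Because the mask is drawn at random each time, the model sees a slightly different version of every clip in each epoch.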

## Model Training

The code below creates a ResNet18 model with weights pretrained on ImageNet, removes its last two layers (the final pooling and fully connected layers, as far as I can tell), and adds new fully connected layers in their place.

In [None]:
## Default learner is ResNet 18 
learn = audio_learner(db)

This is a key feature of the FastAI library: it helps us find a good learning rate by training the model on a few mini-batches while increasing the learning rate and watching how the loss progresses. The output of this step is a learning rate curve (choose a learning rate a little before the point where the loss starts climbing again).

In [None]:
## Find ideal learning rate
learn.lr_find()
learn.recorder.plot()
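The learning rate finder is essentially a range test: train for a small number of iterations while increasing the learning rate exponentially and record the loss at each step. The sketch below only shows the schedule; the defaults (1e-7 to 10 over 100 iterations) are what I believe fastai v1's `lr_find` uses, and `lr_range_test_schedule` is a made-up name.

```python
def lr_range_test_schedule(start_lr=1e-7, end_lr=10.0, num_it=100):
    """Exponentially spaced learning rates from start_lr to end_lr (inclusive)."""
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_it - 1)) for i in range(num_it)]


lrs = lr_range_test_schedule()
print(lrs[0], lrs[len(lrs) // 2], lrs[-1])
```

The exponential spacing is why the x-axis of the resulting plot is logarithmic.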

Training the model. Two cool things to highlight:
- **This model is trained using the [1cycle learning policy](https://arxiv.org/abs/1803.09820)**, which leads to faster convergence. Here is a [cool blog post](https://towardsdatascience.com/finding-good-learning-rate-and-the-one-cycle-policy-7159fe1db5d6) explaining the same idea if you are not a paper person.
- **Differential learning rates**: You want different learning rates for different layers of the model. In transfer learning, you don't want the early layers to change as fast as the later layers of the network. (The `slice` function lets us pass that information to FastAI.)
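As I understand it, fastai v1 expands `slice(lo, hi)` into one learning rate per layer group, spaced geometrically between the two ends. The sketch below reimplements that spacing in plain Python; the function name `spread_lrs` and the assumption of three layer groups are mine.

```python
def spread_lrs(lo, hi, n_groups):
    """Geometrically spaced learning rates: lo for the earliest group, hi for the last."""
    if n_groups == 1:
        return [hi]
    step = (hi / lo) ** (1 / (n_groups - 1))
    return [lo * step ** i for i in range(n_groups)]


# slice(2e-3, 2e-2) as used with fit_one_cycle in this notebook, assuming 3 layer groups.
group_lrs = spread_lrs(2e-3, 2e-2, 3)
for group, lr in enumerate(group_lrs):
    print(f"layer group {group}: lr = {lr:.4f}")
```

The earliest (most generic) layers get the smallest rate, so the pretrained features there are disturbed the least.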

In [None]:
## 1-cycle learning (5 epochs and variable learning rate)
learn.fit_one_cycle(5, slice(2e-3, 2e-2))
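For reference, a sketch of the learning rate schedule behind the 1cycle policy: warm up from a low rate to the maximum, then anneal down to a much lower rate. The cosine shape and the constants (`div_factor=25`, `pct_start=0.3`, final divisor `div_factor * 1e4`) are my recollection of fastai v1's defaults, so treat them as assumptions; `one_cycle_lr` is a hypothetical helper.

```python
import math


def cosine_anneal(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2 * (math.cos(math.pi * pct) + 1)


def one_cycle_lr(step, total_steps, lr_max, div_factor=25.0, pct_start=0.3):
    """Learning rate at a given step of a one-cycle schedule."""
    warmup_steps = int(total_steps * pct_start)
    if step < warmup_steps:
        return cosine_anneal(lr_max / div_factor, lr_max, step / warmup_steps)
    pct = (step - warmup_steps) / max(1, total_steps - warmup_steps - 1)
    return cosine_anneal(lr_max, lr_max / (div_factor * 1e4), pct)


schedule = [one_cycle_lr(s, total_steps=100, lr_max=2e-2) for s in range(100)]
print(f"start={schedule[0]:.2e} peak={max(schedule):.2e} end={schedule[-1]:.2e}")
```

With a `slice`, the same shape would apply to each layer group, scaled to that group's maximum rate.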

FastAI outputs the model training progress per epoch. Note that the accuracy is only calculated on the validation set (the 20% holdout set created while building the AudioDataBunch).
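The holdout split can be pictured as a seeded shuffle followed by a cut. This is only an illustration of the idea behind `split_by_rand_pct(.2, seed=4)`, not fastai's actual code, and it will not reproduce the same split; `split_by_random_pct` and the clip names are made up.

```python
import random


def split_by_random_pct(items, valid_pct=0.2, seed=None):
    """Shuffle indices with a fixed seed and hold out the first valid_pct of them."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    cut = int(valid_pct * len(items))
    valid = [items[i] for i in indices[:cut]]
    train = [items[i] for i in indices[cut:]]
    return train, valid


clips = [f"clip_{i}.wav" for i in range(10)]
train, valid = split_by_random_pct(clips, valid_pct=0.2, seed=4)
print(len(train), len(valid))  # 8 2
```

Fixing the seed matters here: it keeps the validation set identical across runs, so accuracy numbers stay comparable.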

In [None]:
## Find ideal learning rate
learn.lr_find()
learn.recorder.plot()

In [None]:
## 1-cycle learning (5 epochs and variable learning rate)
learn.fit_one_cycle(5, slice(1e-5, 1e-3))

In [None]:
## Exporting the model
learn.export('models/stg2-rn18.pkl')

With just 15 minutes of training, we got our accuracy up to ~93.7% on the 20% holdout set, which was not used for training!

## Model Evaluation

A cool function in fastAI to plot different evaluation measures

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(5,5))

`plot_top_losses` plots the 10 predictions the model got most wrong and lets you listen to and visualize the corresponding sounds. This helps you understand where the model performs worst and provides key insights. As we can hear in the examples below, some of these clips don't contain an orca call even though the labeling process marked them positive, and in other cases the model thinks there is an orca call but nobody tagged it as positive.

In [None]:
interp.plot_top_losses(10, heatmap = False)
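The idea behind "top losses" is simply to rank validation items by their individual loss. A minimal sketch, assuming binary cross-entropy on the predicted probability of the positive class; the data below is made up and `top_losses` here is not the fastai function.

```python
import math


def top_losses(probs, labels, k):
    """Return (index, loss) pairs for the k items with the highest cross-entropy loss."""
    eps = 1e-7  # avoid log(0)
    losses = []
    for i, (p, y) in enumerate(zip(probs, labels)):
        p = min(max(p, eps), 1 - eps)
        losses.append((i, -math.log(p if y == 1 else 1 - p)))
    return sorted(losses, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical predicted probabilities of "positive" and the true labels.
probs = [0.95, 0.10, 0.40, 0.85]
labels = [1, 1, 0, 0]
worst = top_losses(probs, labels, k=2)
for idx, loss in worst:
    print(f"item {idx}: loss = {loss:.2f}")
```

Confident mistakes (item 1, a positive scored at 0.10) rank highest, which is also where mislabeled clips tend to show up.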

## Model Evaluation on testing set

Defining the data folder

In [None]:
test_data_folder = Path("./data/test/all/")
test_data_folder

Creating an AudioDataBunch

In [None]:
import pandas as pd

test = AudioList.from_folder(test_data_folder, config=config).split_none().label_from_folder()
testdb = test.transform(tfms).databunch(bs=64)

## Also extracting true labels
true_value = pd.Series(list(testdb.train_ds.y.items))

Generating predictions:
- **To-Do**: There should be a better way to do batch scoring; right now we have to score the clips one by one.

In [None]:
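from tqdm import tqdm_notebook

## learn.predict returns (category, class index, probabilities); keep the probability of class 1
predictions = []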
for item in tqdm_notebook(testdb.x):
    predictions.append(learn.predict(item)[2][1])

Calculating performance measures

In [None]:
from sklearn.metrics import accuracy_score, average_precision_score, f1_score, roc_auc_score

print("AUC Score :{0:.2f} \nF-1 Score :{1:.2f} \nAccuracy Score :{2:.2f} \nAveragePrecisionScore :{3:.2f}".format(
    roc_auc_score(true_value,pd.Series(predictions)), 
    f1_score(true_value,pd.Series(predictions)>0.5), 
    accuracy_score(true_value,pd.Series(predictions)>0.5),
    average_precision_score(true_value,pd.Series(predictions) )
))

Woohoo, the model seems to be performing in line with our initial model training process on this test set. Let's plot a confusion matrix.

In [None]:
plot_confusion_matrix(true_value, pd.Series(predictions)>0.5, classes=["No Orca","Orca"])
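`plot_confusion_matrix` is not imported in this notebook (it presumably comes from the original script). The counts it displays can be reproduced in a few lines; the sketch below uses the same 0.5 threshold, with made-up labels and scores and a hypothetical helper name.

```python
def confusion_counts(y_true, scores, threshold=0.5):
    """Return [[TN, FP], [FN, TP]] for binary labels and predicted scores."""
    matrix = [[0, 0], [0, 0]]
    for truth, score in zip(y_true, scores):
        predicted = int(score > threshold)
        matrix[truth][predicted] += 1
    return matrix


# Made-up labels (1 = Orca) and scores, for illustration only.
y_true = [0, 0, 1, 1, 1]
scores = [0.2, 0.7, 0.9, 0.4, 0.8]
(tn, fp), (fn, tp) = confusion_counts(y_true, scores)
accuracy = (tp + tn) / len(y_true)
f1 = 2 * tp / (2 * tp + fp + fn)
print(f"TN={tn} FP={fp} FN={fn} TP={tp} accuracy={accuracy:.2f} F1={f1:.2f}")
```

Accuracy and F1 both follow directly from these four counts, whereas AUC and average precision depend on the ranking of the raw scores.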

## Scoring for official evaluation

Loading the trained model

In [None]:
from fastai.basic_train import load_learner

learn = load_learner("./data/train/mldata/all/models/", 'stg2-rn18.pkl')

Loading the 2-second audio clips generated in the data preparation step for evaluation

In [None]:
test_data_folder = Path("./data/test/OrcasoundLab07052019_Test/test2Sec/")
tfms=None
test = AudioList.from_folder(test_data_folder, config=config).split_none().label_empty()
testdb = test.transform(tfms).databunch(bs=64)

Running the clips through the model and generating predictions

In [None]:
predictions = []
pathList = [] 
for item in tqdm_notebook(testdb.x):
    predictions.append(learn.predict(item)[2][1])
    pathList.append(str(item.path))

Exporting the predictions

In [None]:
prediction = pd.DataFrame({'FilePath': pathList, 'pred': predictions})
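## Take the file name from the path (rather than a hard-coded path depth) and keep the part before the first "-"
prediction['FileName'] = prediction.FilePath.apply(lambda x: Path(x).name.split("-")[0])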
prediction.loc[:,['FileName','pred']].to_csv('./test2Sec.csv', index=False)

Converting the predictions to the standard evaluation format

In [None]:
## Load predictions
test2secDF = pd.read_csv("./test2Sec.csv") 

## Clean the predictions (they were written to the CSV as strings)
test2secDF['pred'] = test2secDF.pred.apply(lambda x: float(x.split('(')[1].split(')')[0])) 
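Assuming the stored strings look like `tensor(0.9312)` (an assumption based on the split logic above), a regular expression is a slightly more defensive alternative to chained `split` calls. `parse_tensor_string` is a hypothetical helper, not something from fastai or pandas.

```python
import re

_TENSOR_RE = re.compile(r"\(\s*([-+0-9.eE]+)")


def parse_tensor_string(text):
    """Extract the number from a string such as 'tensor(0.9312)'."""
    match = _TENSOR_RE.search(text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return float(match.group(1))


print(parse_tensor_string("tensor(0.9312)"))  # 0.9312
print(parse_tensor_string("tensor(1.2000e-05)"))  # 1.2e-05
```

Unlike the `split` version, this raises a clear error when a value does not have the expected shape.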

In [None]:
## Extracting Start time from file name
test2secDF['startTime'] = test2secDF.FileName.apply(lambda x: int(x.split('__')[1].split('.')[0].split('_')[0]))

## Sorting the file based on startTime
test2secDF = test2secDF.sort_values(['startTime']).reset_index(drop=True)

In [None]:
test2secDF.head()

In [None]:
## Rolling Window (to average at per second level)
submission = pd.DataFrame({'pred': list(test2secDF.rolling(2)['pred'].mean().values)}).reset_index().rename(columns={'index':'StartTime'})

## Updating first row
submission.loc[0,'pred'] = test2secDF.pred[0]

## Adding the last row
lastLine = pd.DataFrame({'StartTime':[submission.StartTime.max()+1],'pred':[test2secDF.pred[test2secDF.shape[0]-1]]})
submission = pd.concat([submission, lastLine], ignore_index=True)  ## DataFrame.append was removed in pandas 2.0

finalSubmission = submission.loc[submission.pred > 0.5,:].reset_index(drop=True)
finalSubmission['Duration'] = 1
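The rolling-window cell above is easier to follow on a toy example. Assuming the 2-second clips start one second apart (so consecutive clips overlap by one second), each second's score is the mean of the clips covering it, with the first and last seconds covered by a single clip. A plain-Python sketch of the same arithmetic, with made-up scores and a hypothetical helper name:

```python
def per_second_scores(clip_preds):
    """Average overlapping 2-second clip scores down to one score per second."""
    scores = [clip_preds[0]]  # second 0 is covered only by the first clip
    for prev, curr in zip(clip_preds, clip_preds[1:]):
        scores.append((prev + curr) / 2)  # covered by two adjacent clips
    scores.append(clip_preds[-1])  # last second is covered only by the last clip
    return scores


clip_preds = [0.25, 0.75, 0.5]  # made-up predictions for clips starting at 0 s, 1 s, 2 s
scores = per_second_scores(clip_preds)
print(scores)  # [0.25, 0.5, 0.625, 0.5]
print([second for second, s in enumerate(scores) if s > 0.5])  # [2]
```

Three clips cover four seconds, which is why the notebook appends one extra row after the rolling mean.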

In [None]:
## Final submission file
finalSubmission.loc[:,['StartTime','Duration']].to_csv('../evaluation/submission/submission2SecFastAI.csv', index=False)