<div class="alert alert-success">  
</div>

<div>
    <h1 align="center">"AutoML in Kaggle Kernels"</h1>
    <h1 align="center">Rainforest Connection Species Audio Detection</h1>
    <h4 align="center">By: Somayyeh Gholami & Mehran Kazeminia</h4>
</div>

<div class="alert alert-success">  
</div>

### Description:
- This method works only for some challenges; it is not a general method.

- If we could test our answers without limit (that is, if there were no limit of five submissions per day, or if we could build a Kaggle simulator), we would extend this notebook to optimize the results of all kernels automatically. In that case, even non-experts could improve on the experts' results, and the notebook would approach the concept of "AutoML". For now, however, our method is purely empirical and relies on trial and error.
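
To make the idea of automatic optimization concrete, here is a minimal sketch of what one step could look like if a local scoring function were available. Everything in it is an assumption for illustration: the helper names `blend` and `sweep_coeff`, the toy data, the column names, and above all `toy_score`, which is a stand-in and not the competition metric. No such scorer exists in this notebook, which is why the real procedure remains trial and error.

```python
import numpy as np
import pandas as pd


def blend(main, support, coeff):
    """Weighted average of two submissions; the first column (the ID) is kept from `main`."""
    out = main.copy()
    cols = main.columns[1:]
    out[cols] = coeff * main[cols].to_numpy() + (1.0 - coeff) * support[cols].to_numpy()
    return out


def sweep_coeff(main, support, score_fn, coeffs):
    """Score every candidate coefficient and return (best_coeff, best_score)."""
    scored = [(c, score_fn(blend(main, support, c))) for c in coeffs]
    return max(scored, key=lambda pair: pair[1])


# Toy data standing in for two submissions and a hidden ground truth (hypothetical).
rng = np.random.default_rng(0)
ids = [f"rec{i}" for i in range(50)]
label_cols = ["s0", "s1", "s2"]
truth = rng.integers(0, 2, size=(50, 3)).astype(float)

main = pd.DataFrame(truth + rng.normal(0, 0.4, truth.shape), columns=label_cols)
main.insert(0, "recording_id", ids)
support = pd.DataFrame(truth + rng.normal(0, 0.4, truth.shape), columns=label_cols)
support.insert(0, "recording_id", ids)


def toy_score(sub):
    """Stand-in scorer (negative mean squared error); NOT the competition metric."""
    return -float(np.mean((sub[label_cols].to_numpy() - truth) ** 2))


best_coeff, best_score = sweep_coeff(main, support, toy_score, np.linspace(0.0, 1.0, 21))
print(f"best coeff: {best_coeff:.2f}, toy score: {best_score:.4f}")
```

Because the grid includes 0.0 and 1.0, the best blend found this way can never score below either input under the chosen scorer. On Kaggle, each evaluation of `score_fn` would cost one of the five daily submissions.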

<div class="alert alert-success">  
</div>

### Import & Data Set

In [None]:
import numpy as np
import pandas as pd 
import matplotlib.pyplot as plt

%matplotlib inline

# _______________________________________

# Kernels Data (Public Score & File Path)

ked = pd.DataFrame({      
    'Kernel ID': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'Symbol':    ['SoliSet', '[Inference] ResNest RFCX Audio Detection',  'notebookba481ef16a', 'All-in-one RFCX baseline for beginners', 'RFCX: train resnet50 with TPU',  'RFCX Resnet50 TPU', 'ResNet34 More Augmentations+Mixup+TTA (Inference)', '[Inference][TPU] RFCX Audio Detection Fast++', 'RFCX Bagging'],
    'Score':     [ 0.589 , 0.594 , 0.613 , 0.748 , 0.793 , 0.824 , 0.845 , 0.861 , 0.871 ],
    'File Path': ['../input/audio-detection-soliset-201/submission.csv', '../input/inference-resnest-rfcx-audio-detection/submission.csv', '../input/minimal-fastai-solution-score-0-61/submission.csv', '../input/all-in-one-rfcx-baseline-for-beginners/submission.csv', '../input/rfcx-train-resnet50-with-tpu/submission.csv', '../input/rfcx-resnet50-tpu/submission.csv', '../input/resnet34-more-augmentations-mixup-tta-inference/submission.csv', '../input/inference-tpu-rfcx-audio-detection-fast/submission.csv', '../input/rfcx-bagging-with-different-weights-0871-score/submission.csv'],        
    'Note'     : ['xgboost & cuml(https://rapids.ai)', 'torch & resnest50', 'fastai.vision & torchaudio', 'torch & resnest50', 'tensorflow & tf.keras.Sequential', 'tensorflow & tf.keras.Sequential', 'tensorflow & classification_models.keras', 'torch & resnest50', 'To sort the scores and use their ranks.']                                                  
})    
    
ked    

<div class="alert alert-success">  
</div>

### Kernel Class & Instances

In [None]:
class Kernel():    
    '''
       Class Kernel V 1.0
       Input Argument:       
       - symbol      (kernel name OR author)       
       - score       (Score for the kernel)
       - file_path   (CSV file address)
    ''' 
      
    def __init__(self, symbol, score, file_path):  
        
        self.symbol = symbol
        self.score = score
        
        self.file_path = file_path
        self.sub = pd.read_csv(self.file_path)
        
            
    def __str__(self):
        return f'Kernel: {self.symbol}\t| Score: {self.score}'

    
    def __repr__(self):
        return f'Class: {self.__class__.__name__}\nName: {repr(self.symbol)}\t| Score: {self.score}'   

        
    def print_head(self):
        print(self)
        print(f'\nHead:\n')
        print(self.sub.head())        
    
    
    def print_description(self):
        print(self)      
        print(f'\nDescription:\n')
        print(self.sub.describe())
        
        
    def generation(self, other, coeff):
        g1 = self.sub.copy()
        g2 = self.sub.copy()
        g3 = self.sub.copy()
        g4 = self.sub.copy() 
        
        if isinstance(other, Kernel):             
            for i in self.sub.columns[1:]: 
                
                lm, ls = [], []                
                lm = self.sub[i].tolist()
                ls = other.sub[i].tolist()        
                res1, res2, res3, res4 = [], [], [], []  
                
                for j in range(len(self.sub)): 
                    
                    res1.append(max(lm[j] , ls[j]))
                    res2.append(min(lm[j] , ls[j]))
                    res3.append((lm[j] + ls[j]) / 2)
                    res4.append((lm[j] * coeff) + (ls[j] * (1.- coeff)))        
        
                g1[i] = res1
                g2[i] = res2
                g3[i] = res3
                g4[i] = res4
                
        return g1,g2,g3,g4   
    
# ____________________________________________
    
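# Nine instances of the "Kernel" class are created, one per row of `ked`.
# Each instance replaces the letter stored in the 'Kernel ID' column.

for i in range(len(ked)):   
    ked.iloc[i, 0] = Kernel(ked.iloc[i, 1], ked.iloc[i, 2], ked.iloc[i, 3])     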
#    ked.iloc[i, 0].print_head() 
#    ked.iloc[i, 0].print_description() 


In [None]:
# print(ked.iloc[0, 0])
# ked.iloc[0, 0].sub.describe()

In [None]:
# print(ked.iloc[1, 0])
# ked.iloc[1, 0].sub.describe()

In [None]:
# print(ked.iloc[2, 0])
# ked.iloc[2, 0].sub.describe()

In [None]:
# print(ked.iloc[3, 0])
# ked.iloc[3, 0].sub.describe()

In [None]:
# print(ked.iloc[4, 0])
# ked.iloc[4, 0].sub.describe()

In [None]:
print(ked.iloc[5, 0])
ked.iloc[5, 0].sub.describe()

In [None]:
print(ked.iloc[6, 0])
ked.iloc[6, 0].sub.describe()

In [None]:
print(ked.iloc[7, 0])
ked.iloc[7, 0].sub.describe()

In [None]:
print(ked.iloc[8, 0])
ked.iloc[8, 0].sub.describe()

<div class="alert alert-success">  
</div>

### Increase the best score.
Can the results of the better kernels support each other? YES:)

In [None]:
# Auxiliary function
def generate(main, support, coeff):
    g1 = main.copy()
    g2 = main.copy()
    g3 = main.copy()
    g4 = main.copy()
    
    for i in main.columns[1:]:
        lm, ls = [], []                
        lm = main[i].tolist()
        ls = support[i].tolist() 
        
        res1, res2, res3, res4 = [], [], [], []          
        for j in range(len(main)):
            res1.append(max(lm[j] , ls[j]))
            res2.append(min(lm[j] , ls[j]))
            res3.append((lm[j] + ls[j]) / 2)
            res4.append((lm[j] * coeff) + (ls[j] * (1.- coeff)))
            
        g1[i] = res1
        g2[i] = res2
        g3[i] = res3
        g4[i] = res4
        
    return g1,g2,g3,g4
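
The auxiliary function above combines two submissions element by element in four ways: maximum, minimum, mean, and a weighted average with weights `coeff` and `1 - coeff`. As a sketch only, the same four results can be computed with NumPy array operations instead of nested Python loops. The name `generate_vectorized` and the toy frames (including the `recording_id`, `s0`, and `s1` column names) are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd


def generate_vectorized(main, support, coeff):
    """Return (max, min, mean, weighted) blends of two submission frames.

    Column 0 is treated as the ID and copied from `main`; rows are paired by position.
    """
    cols = main.columns[1:]
    m = main[cols].to_numpy(dtype=float)
    s = support[cols].to_numpy(dtype=float)

    blends = (
        np.maximum(m, s),               # element-wise maximum
        np.minimum(m, s),               # element-wise minimum
        (m + s) / 2,                    # plain mean
        m * coeff + s * (1.0 - coeff),  # weighted average
    )

    results = []
    for values in blends:
        out = main.copy()
        out[cols] = values
        results.append(out)
    return tuple(results)


# Toy frames standing in for two submission files (hypothetical values).
main = pd.DataFrame({"recording_id": ["a", "b"], "s0": [0.2, 0.9], "s1": [0.5, 0.1]})
support = pd.DataFrame({"recording_id": ["a", "b"], "s0": [0.6, 0.7], "s1": [0.3, 0.4]})

g1, g2, g3, g4 = generate_vectorized(main, support, 0.8)
print(g2)
```

Like the loop version, this pairs rows by position, so both frames must list the same recordings in the same order.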


<div class="alert alert-success">  
</div>

## Example: 1
To increase the score of kernel G (Score: 0.845), we get help from kernel F (Score: 0.824).

In [None]:
g1,g2,g3,g4 = generate(ked.iloc[6, 0].sub, ked.iloc[5, 0].sub, 0.8)

# g1,g2,g3,g4 = ked.iloc[6, 0].generation(ked.iloc[5, 0], 0.8)
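
Both `generate` and `Kernel.generation` pair rows by position, so the two submissions must list the same recordings in the same order. The following is a minimal sketch of a check that reorders the supporting frame before blending. The helper name `align_to` and the ID column name `recording_id` are assumptions, not part of the notebook.

```python
import pandas as pd


def align_to(main, support, id_col="recording_id"):
    """Reorder `support` so its rows follow the ID order of `main`.

    Raises ValueError if the two frames do not cover the same set of unique IDs.
    """
    if main[id_col].duplicated().any() or support[id_col].duplicated().any():
        raise ValueError("IDs must be unique in both frames")
    if set(main[id_col]) != set(support[id_col]):
        raise ValueError("the two frames list different IDs")
    # A left merge keeps the row order of the left frame (the IDs of `main`).
    return main[[id_col]].merge(support, on=id_col, how="left")


# Toy frames: same recordings, different row order (hypothetical values).
main = pd.DataFrame({"recording_id": ["a", "b", "c"], "s0": [0.9, 0.8, 0.7]})
support = pd.DataFrame({"recording_id": ["c", "a", "b"], "s0": [0.3, 0.1, 0.2]})

aligned = align_to(main, support)
print(aligned)
```

If the files already share the same order, the function returns the supporting frame unchanged apart from a fresh index.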

In [None]:
# print('Maximum function    | Score: 0.828')
# g1.describe()

In [None]:
print('Minimum function    | Score: 0.848')
g2.describe()

In [None]:
# print('Mean function    | Score: 0.845')
# g3.describe()

In [None]:
# print('Coefficient function (Coeff: 0.8, 0.2)    | Score: 0.847')
# g4.describe()

In [None]:
# Version 1
# We have now selected the minimum function.

# sub = g2

## Version-1 results
### We have now selected the minimum function (g2).
### [G: (Score: 0.845), F: (Score: 0.824)] >>> g2: (Score: 0.848)

## Visualization
### We plot the values of one column (for example, column 17).
### For a fuller picture, you can plot the remaining 23 columns in the same way.

In [None]:
main = ked.iloc[6, 0].sub
X  = main.iloc[:, 17]

support = ked.iloc[5, 0].sub
Y1 = support.iloc[:, 17]

Y2 = g2.iloc[:, 17]

plt.style.use('seaborn-whitegrid')    
plt.figure(figsize=(8, 8), facecolor='lightgray')
plt.title(f'<<< S17 >>>\n\nOn the X axis >>> G: (Score: 0.845)\nOn the Y axis >>> F: (Score: 0.824)')
for k in range(1992):            
    plt.scatter(X[k], Y1[k], s=20, alpha=0.8)
plt.show() 

plt.style.use('seaborn-whitegrid')    
plt.figure(figsize=(8, 8), facecolor='lightgray')
plt.title(f'<<< S17 >>>\n\nOn the X axis >>> G: (Score: 0.845)\nOn the Y axis >>> g2: (Score: 0.848)')          
for k in range(1992):            
    plt.scatter(X[k], Y2[k], s=20, alpha=0.8)
plt.show() 
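
The plots above show a single column. As a sketch of how all prediction columns could be drawn at once, the function below puts one scatter panel per column into a grid. The name `scatter_grid`, the synthetic frames, and the column names `s0` to `s23` are assumptions for illustration. In the notebook, `main` and `other` would be two submission frames such as `ked.iloc[6, 0].sub` and `g2`.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def scatter_grid(main, other, ncols=6):
    """Scatter every prediction column of `main` (x) against the same column of `other` (y)."""
    cols = list(main.columns[1:])
    nrows = -(-len(cols) // ncols)  # ceiling division
    fig, axes = plt.subplots(nrows, ncols, figsize=(3 * ncols, 3 * nrows), squeeze=False)
    for ax, col in zip(axes.ravel(), cols):
        ax.scatter(main[col], other[col], s=5, alpha=0.5)
        ax.set_title(col)
    for ax in axes.ravel()[len(cols):]:
        ax.set_visible(False)  # hide unused panels
    fig.tight_layout()
    return fig


# Synthetic stand-ins for two submission frames with 24 prediction columns.
rng = np.random.default_rng(0)
names = [f"s{i}" for i in range(24)]
main = pd.DataFrame(rng.random((100, 24)), columns=names)
main.insert(0, "recording_id", [f"rec{i}" for i in range(100)])
other = main.copy()
other[names] = np.minimum(main[names].to_numpy(), rng.random((100, 24)))

fig = scatter_grid(main, other)
```

With `%matplotlib inline` active, the figure is displayed when the cell finishes. Each panel uses one `scatter` call per column, so all points in a panel share one color.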


<div class="alert alert-success">  
</div>

## Example: 2
To increase the score of kernel H (Score: 0.861), we get help from the g2 result (Score: 0.848).

In [None]:
f1,f2,f3,f4 = generate(ked.iloc[7, 0].sub, g2, 0.8)


In [None]:
# print('Maximum function    | Score: ----')
# f1.describe()

In [None]:
print('Minimum function    | Score: 0.866')
f2.describe()

In [None]:
# print('Mean function    | Score: ----')
# f3.describe()

In [None]:
# print('Coefficient function (Coeff: 0.8, 0.2)    | Score: ----')
# f4.describe()

In [None]:
# Version 2
# We have now selected the minimum function.

# sub = f2

## Version-2 results
### We have now selected the minimum function (f2).
### [H: (Score: 0.861), g2: (Score: 0.848)] >>> f2: (Score: 0.866)

## Visualization
### We plot the values of one column (for example, column 11).
### For a fuller picture, you can plot the remaining 23 columns in the same way.

In [None]:
main = ked.iloc[7, 0].sub
X  = main.iloc[:, 11]

support = g2
Y1 = support.iloc[:, 11]

Y2 = f2.iloc[:, 11]

plt.style.use('seaborn-whitegrid')    
plt.figure(figsize=(8, 8), facecolor='lightgray')
plt.title(f'<<< S11 >>>\n\nOn the X axis >>> H: (Score: 0.861)\nOn the Y axis >>> g2: (Score: 0.848)')
for k in range(1992):            
    plt.scatter(X[k], Y1[k], s=20, alpha=0.8)
plt.show() 

plt.style.use('seaborn-whitegrid')    
plt.figure(figsize=(8, 8), facecolor='lightgray')
plt.title(f'<<< S11 >>>\n\nOn the X axis >>> H: (Score: 0.861)\nOn the Y axis >>> f2: (Score: 0.866)')          
for k in range(1992):            
    plt.scatter(X[k], Y2[k], s=20, alpha=0.8)
plt.show() 


<div class="alert alert-success">  
</div>

## Example: 3
To increase the score of kernel I, the best kernel (Score: 0.871), we get help from the f2 result (Score: 0.866).

In [None]:
e1,e2,e3,e4 = generate(ked.iloc[8, 0].sub, f2, 0.7)


In [None]:
# print('Maximum function    | Score: ----')
# e1.describe()

In [None]:
# print('Minimum function    | Score: ----')
# e2.describe()

In [None]:
print('Mean function    | Score: 0.874')
e3.describe()

In [None]:
# print('Coefficient function (Coeff: 0.7, 0.3)    | Score: ----')
# e4.describe()

In [None]:
# Version 3
# We have selected the mean function.

# sub = e3

## Version-3 results
### We have now selected the mean function (e3).
### [I: (Score: 0.871), f2: (Score: 0.866)] >>> e3: (Score: 0.874)

## Visualization
### We plot the values of one column (for example, column 21).
### For a fuller picture, you can plot the remaining 23 columns in the same way.

In [None]:
main = ked.iloc[8, 0].sub
X  = main.iloc[:, 21]

support = f2
Y1 = support.iloc[:, 21]

Y2 = e3.iloc[:, 21]

plt.style.use('seaborn-whitegrid')    
plt.figure(figsize=(8, 8), facecolor='lightgray')
plt.title(f'<<< S21 >>>\n\nOn the X axis >>> I: (Score: 0.871)\nOn the Y axis >>> f2: (Score: 0.866)')
for k in range(1992):            
    plt.scatter(X[k], Y1[k], s=20, alpha=0.8)
plt.show() 

plt.style.use('seaborn-whitegrid')    
plt.figure(figsize=(8, 8), facecolor='lightgray')
plt.title(f'<<< S21 >>>\n\nOn the X axis >>> I: (Score: 0.871)\nOn the Y axis >>> e3: (Score: 0.874)')          
for k in range(1992):            
    plt.scatter(X[k], Y2[k], s=20, alpha=0.8)
plt.show() 


<div class="alert alert-success">  
</div>

## Example: 4
We want to improve on the previous example (Example 3), because no notebook with a higher score has been released yet. That is:

To increase the score of kernel I, the best kernel (Score: 0.871), we again get help from the f2 result (Score: 0.866), this time with a different coefficient.

In [None]:
d1,d2,d3,d4 = generate(ked.iloc[8, 0].sub, f2, 0.45)


In [None]:
# print('Maximum function    | Score: ----')
# d1.describe()

In [None]:
# print('Minimum function    | Score: ----')
# d2.describe()

In [None]:
# print('Mean function    | Score: 0.874')
# d3.describe()

In [None]:
print('Coefficient function (Coeff: 0.45, 0.55)    | Score: 0.876')
d4.describe()

In [None]:
# Version 4
# We have selected the Coefficient function.

# sub = d4

## Version-4 results
### We have now selected the Coefficient function (d4).
### [I: (Score: 0.871), f2: (Score: 0.866)] >>> d4: (Score: 0.876)

## Visualization
### We plot the values of one column (for example, column 21).
### For a fuller picture, you can plot the remaining 23 columns in the same way.

In [None]:
main = ked.iloc[8, 0].sub
X  = main.iloc[:, 21]

support = f2
Y1 = support.iloc[:, 21]

Y2 = d4.iloc[:, 21]

plt.style.use('seaborn-whitegrid')    
plt.figure(figsize=(8, 8), facecolor='lightgray')
plt.title(f'<<< S21 >>>\n\nOn the X axis >>> I: (Score: 0.871)\nOn the Y axis >>> f2: (Score: 0.866)')
for k in range(1992):            
    plt.scatter(X[k], Y1[k], s=20, alpha=0.8)
plt.show() 

plt.style.use('seaborn-whitegrid')    
plt.figure(figsize=(8, 8), facecolor='lightgray')
plt.title(f'<<< S21 >>>\n\nOn the X axis >>> I: (Score: 0.871)\nOn the Y axis >>> d4: (Score: 0.876)')          
for k in range(1992):            
    plt.scatter(X[k], Y2[k], s=20, alpha=0.8)
plt.show() 


<div class="alert alert-success">  
</div>

## Example: 5
Again, we want to improve on Example 3, because no notebook with a higher score has been published yet. That is:

To increase the score of kernel I, the best kernel (Score: 0.871), we get help from the f2 result (Score: 0.866), with yet another coefficient.

In [None]:
c1,c2,c3,c4 = generate(ked.iloc[8, 0].sub, f2, 0.475)

In [None]:
# print('Maximum function    | Score: ----')
# c1.describe()

In [None]:
# print('Minimum function    | Score: ----')
# c2.describe()

In [None]:
# print('Mean function    | Score: 0.874')
# c3.describe()

In [None]:
print('Coefficient function (Coeff: 0.475, 0.525)    | Score: 0.876')
c4.describe()

In [None]:
# Version 5
# We have selected the coefficient function.

sub = c4

In [None]:
sub.to_csv("submission.csv", index=False)

c1.to_csv("submission1.csv", index=False)
c2.to_csv("submission2.csv", index=False)
c3.to_csv("submission3.csv", index=False)
c4.to_csv("submission4.csv", index=False)

!ls

<div class="alert alert-success">  
</div>