This is a streamlined version of our solution, without any data augmentation. The model has two layers that build on each other: one runs on a CPU platform and the other on a GPU platform. We found it easiest to split the work into three kernels, one per step. Please run the three kernels in sequence, as described below.
 

(1) Liverpool_LGBM_oofs

(1.a) This kernel has to be run first.

(1.b) This kernel uses the competition data and the public dataset Data Without Drift as input.

(1.c) This kernel uses an LGBM-based model to make baseline predictions.

(1.d) This kernel runs on a CPU platform.

(1.e) The runtime for this kernel is approximately 1.7 hrs (it took us 6133.8 sec).

(2) Liverpool_NoiseRemoval

(2.a) This kernel has to be run second.

(2.b) This kernel takes the output of the previous kernel as input.

(2.c) This kernel removes 50 Hz noise from the signal; we use the cleaned signal in the next step.

(2.d) This kernel runs on a CPU platform.

(2.e) The runtime for this kernel is approximately 1.2 min (it took us 72.9 sec).

(3) Liverpool_Wavenet

(3.a) This last kernel outputs two CSV files. The one named final_submission_wavenet.csv can be submitted to the competition to get a score close to our final private score.

(3.b) This kernel uses a Wavenet-based model, which is trained on the clean signal and makes the final predictions.

(3.c) This kernel runs on a GPU platform.

(3.d) The runtime for this kernel is approximately 2.5 hrs (it took us 8908.9 sec).

This iteration captures the basic strategy of our work: generate better predictions, use them to obtain a cleaner signal, and then use that cleaner signal to make even better predictions.
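
As a rough illustration of the noise-removal idea, here is a minimal sketch of suppressing a 50 Hz component with a notch filter. It is not the code from the Liverpool_NoiseRemoval kernel: the use of `scipy.signal.iirnotch`, the quality factor, the 10 kHz sampling rate, the helper names, and the synthetic signal are all assumptions made for this example.

```python
import numpy as np
from scipy import signal

FS = 10_000.0   # assumed sampling rate in Hz
F_NOISE = 50.0  # frequency to suppress
Q = 30.0        # assumed quality factor (controls the notch width)

def remove_50hz(x, fs=FS, f0=F_NOISE, q=Q):
    """Hypothetical helper: zero-phase notch filter around f0."""
    b, a = signal.iirnotch(f0, q, fs=fs)
    return signal.filtfilt(b, a, x)

def amplitude_at(x, f, fs=FS):
    """Amplitude of the FFT bin closest to frequency f."""
    spectrum = np.abs(np.fft.rfft(x)) * 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    return spectrum[np.argmin(np.abs(freqs - f))]

# Synthetic demo: a step-like "channel" signal plus 50 Hz hum.
rng = np.random.default_rng(0)
t = np.arange(50_000) / FS
levels = np.repeat(rng.integers(0, 4, size=50), 1000).astype(float)
hum = 0.3 * np.sin(2 * np.pi * F_NOISE * t)
noisy = levels + hum
cleaned = remove_50hz(noisy)

hum_before = amplitude_at(noisy, F_NOISE)
hum_after = amplitude_at(cleaned, F_NOISE)
print(hum_before, hum_after)
```

In this toy setup the 50 Hz amplitude should drop substantially after filtering, while the step levels are left largely intact.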

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.

In [None]:
train_data = pd.read_csv('../input/data-without-drift/train_clean.csv')

In [None]:
test_data = pd.read_csv('/kaggle/input/data-without-drift/test_clean.csv')

In [None]:
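# Compare every 100,000-sample test batch with every 100,000-sample train batch.
# When the two signals are strongly positively correlated, subtract the train
# batch from the test batch.
batch = 100000
for i in range(0, 20):
    for j in range(0, 50):
        test1 = test_data[batch*i:(i+1)*batch].reset_index(drop=True)
        train1 = train_data[batch*j:(j+1)*batch].reset_index(drop=True)
        corr = train1.signal.corr(test1.signal)
        if corr > 0.5:
            # Write back with .loc; chained indexing (df[a:b]['signal'] = ...)
            # may assign to a temporary copy and leave test_data unchanged.
            rows = test_data.index[i*batch:(i+1)*batch]
            test_data.loc[rows, 'signal'] = test_data.loc[rows, 'signal'].values - train_data[j*batch:(j+1)*batch]['signal'].values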
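
The batch comparison in the cell above can be tried on toy data. This is only a sketch: the helper `subtract_correlated_batches`, the toy batch size, and the synthetic frames are hypothetical and exist only for this example.

```python
import numpy as np
import pandas as pd

BATCH = 1000  # toy batch size; the notebook uses 100000

def subtract_correlated_batches(test_df, train_df, batch=BATCH, threshold=0.5):
    """Hypothetical helper: subtract a train batch from a test batch
    whenever their Pearson correlation exceeds the threshold."""
    test_df = test_df.copy()
    train_sig_all = train_df['signal'].to_numpy()
    for i in range(len(test_df) // batch):
        rows = test_df.index[i * batch:(i + 1) * batch]
        for j in range(len(train_df) // batch):
            test_sig = test_df.loc[rows, 'signal'].to_numpy()
            train_sig = train_sig_all[j * batch:(j + 1) * batch]
            corr = np.corrcoef(train_sig, test_sig)[0, 1]
            if corr > threshold:
                # .loc writes into test_df itself rather than a temporary copy
                test_df.loc[rows, 'signal'] = test_sig - train_sig
    return test_df

rng = np.random.default_rng(0)
shared = np.sin(np.linspace(0, 20, BATCH))  # pattern present in both sets
train_toy = pd.DataFrame(
    {'signal': np.concatenate([shared, rng.normal(size=BATCH)])})
test_toy = pd.DataFrame(
    {'signal': np.concatenate([rng.normal(size=BATCH),
                               shared + 0.1 * rng.normal(size=BATCH)])})

result = subtract_correlated_batches(test_toy, train_toy)
std_before = test_toy['signal'][BATCH:].std()
std_after = result['signal'][BATCH:].std()
print(std_before, std_after)
```

Only the second test batch shares a pattern with a train batch, so only that batch is modified; the first batch is left untouched.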

In [None]:
train = pd.DataFrame()
train['signal']=train_data['signal'].values
train['open_channels']=train_data['open_channels'].values

In [None]:
from tqdm import tqdm
from scipy import signal
# signal processing features

def calc_power(s):
    
    power = pd.DataFrame()
    power['s2'] = s**2
    s2m = (s**2).mean()
    power['s2_mean'] = s**2 - s2m
    power = power.fillna(value=0)
    return power
    
def calc_gradients(s, n_grads=4):
    '''
    Calculate gradients for a pandas series. Returns the same number of samples
    '''
    grads = pd.DataFrame()
    
    g = s.values
    for i in range(n_grads):
        g = np.gradient(g)
        grads['grad_' + str(i+1)] = g
        
    return grads

def calc_low_pass(s, n_filts=10):
    '''
    Applies low-pass filters to the signal: causal (lfilter, delayed) and zero-phase (filtfilt, no delay)
    '''
    wns = np.logspace(-2, -0.3, n_filts)
    
    low_pass = pd.DataFrame()
    x = s.values
    for wn in wns:
        b, a = signal.butter(1, Wn=wn, btype='low')
        zi = signal.lfilter_zi(b, a)
        low_pass['lowpass_lf_' + str('%.4f' %wn)] = signal.lfilter(b, a, x, zi=zi*x[0])[0]
        low_pass['lowpass_ff_' + str('%.4f' %wn)] = signal.filtfilt(b, a, x)
        
    return low_pass

def calc_high_pass(s, n_filts=10):
    '''
    Applies high-pass filters to the signal: causal (lfilter, delayed) and zero-phase (filtfilt, no delay)
    '''
    wns = np.logspace(-2, -0.1, n_filts)
    
    high_pass = pd.DataFrame()
    x = s.values
    for wn in wns:
        b, a = signal.butter(1, Wn=wn, btype='high')
        zi = signal.lfilter_zi(b, a)
        high_pass['highpass_lf_' + str('%.4f' %wn)] = signal.lfilter(b, a, x, zi=zi*x[0])[0]
        high_pass['highpass_ff_' + str('%.4f' %wn)] = signal.filtfilt(b, a, x)
        
    return high_pass

def calc_roll_stats(s, windows=[10, 50, 100, 500, 1000]):
#def calc_roll_stats(s, windows=[4]):

    '''
    Calculates rolling stats like mean, std, min, max...
    '''
    roll_stats = pd.DataFrame()
    for window in windows:
        roll_stats['roll_mean_' + str(window)] = s.rolling(window=window, min_periods=1, center=True).mean()
#        roll_stats['roll_mean_cen' + str(window)] = s.rolling(window=window, min_periods=1, center=True).mean()        
        roll_stats['roll_std_' + str(window)] = s.rolling(window=window, min_periods=1, center=True).std()
        roll_stats['roll_min_' + str(window)] = s.rolling(window=window, min_periods=1, center=True).min()
        roll_stats['roll_max_' + str(window)] = s.rolling(window=window, min_periods=1, center=True).max()
        roll_stats['roll_range_' + str(window)] = roll_stats['roll_max_' + str(window)] - roll_stats['roll_min_' + str(window)]
#        roll_stats['roll_q10_' + str(window)] = s.rolling(window=window, min_periods=1, center=True).quantile(0.10)
#        roll_stats['roll_q25_' + str(window)] = s.rolling(window=window, min_periods=1, center=True).quantile(0.25)
#        roll_stats['roll_q50_' + str(window)] = s.rolling(window=window, min_periods=1, center=True).quantile(0.50)
#        roll_stats['roll_q75_' + str(window)] = s.rolling(window=window, min_periods=1, center=True).quantile(0.75)
#        roll_stats['roll_q90_' + str(window)] = s.rolling(window=window, min_periods=1, center=True).quantile(0.90)
    
    # add zeros when na values (std)
    roll_stats = roll_stats.fillna(value=0)
             
    return roll_stats

def calc_roll_stats_back(s, windows=[10, 50, 100, 500, 1000]):
#def calc_roll_stats(s, windows=[4]):

    '''
    Calculates trailing-window rolling stats (center=False): mean, std, min, max, range
    '''
    roll_stats_b = pd.DataFrame()
    for window in windows:
        roll_stats_b['roll_mean_b' + str(window)] = s.rolling(window=window, min_periods=1, center=False).mean()
#        roll_stats_b['roll_mean_cen' + str(window)] = s.rolling(window=window, min_periods=1, center=True).mean()        
        roll_stats_b['roll_std_b' + str(window)] = s.rolling(window=window, min_periods=1, center=False).std()
        roll_stats_b['roll_min_b' + str(window)] = s.rolling(window=window, min_periods=1, center=False).min()
        roll_stats_b['roll_max_b' + str(window)] = s.rolling(window=window, min_periods=1, center=False).max()
        roll_stats_b['roll_range_b' + str(window)] = roll_stats_b['roll_max_b' + str(window)] - roll_stats_b['roll_min_b' + str(window)]
#        roll_stats_b['roll_q10_' + str(window)] = s.rolling(window=window, min_periods=1).quantile(0.10)
#        roll_stats_b['roll_q25_' + str(window)] = s.rolling(window=window, min_periods=1).quantile(0.25)
#        roll_stats_b['roll_q50_' + str(window)] = s.rolling(window=window, min_periods=1).quantile(0.50)
#        roll_stats_b['roll_q75_' + str(window)] = s.rolling(window=window, min_periods=1).quantile(0.75)
#        roll_stats_b['roll_q90_' + str(window)] = s.rolling(window=window, min_periods=1).quantile(0.90)
    
    # add zeros when na values (std)
    roll_stats_b = roll_stats_b.fillna(value=0)
             
    return roll_stats_b

def calc_roll_stats_forward(s, windows=[10, 50, 100, 500, 1000]):
#def calc_roll_stats(s, windows=[4]):

    '''
    Calculates trailing-window rolling stats on the signal shifted by `window` samples (s.shift(periods=window)): mean, std, min, max, range
    '''
    roll_stats_f = pd.DataFrame()
    for window in windows:
        roll_stats_f['roll_mean_f' + str(window)] = s.shift(periods=window, fill_value=0).rolling(window=window, min_periods=1, center=False).mean()
#        roll_stats_f['roll_mean_cen' + str(window)] = s.rolling(window=window, min_periods=1, center=True).mean()        
        roll_stats_f['roll_std_f' + str(window)] = s.shift(periods=window, fill_value=0).rolling(window=window, min_periods=1, center=False).std()
        roll_stats_f['roll_min_f' + str(window)] = s.shift(periods=window, fill_value=0).rolling(window=window, min_periods=1, center=False).min()
        roll_stats_f['roll_max_f' + str(window)] = s.shift(periods=window, fill_value=0).rolling(window=window, min_periods=1, center=False).max()
        roll_stats_f['roll_range_f' + str(window)] = roll_stats_f['roll_max_f' + str(window)] - roll_stats_f['roll_min_f' + str(window)]
#        roll_stats_f['roll_q10_' + str(window)] = s.rolling(window=window, min_periods=1).quantile(0.10)
#        roll_stats_f['roll_q25_' + str(window)] = s.rolling(window=window, min_periods=1).quantile(0.25)
#        roll_stats_f['roll_q50_' + str(window)] = s.rolling(window=window, min_periods=1).quantile(0.50)
#        roll_stats_f['roll_q75_' + str(window)] = s.rolling(window=window, min_periods=1).quantile(0.75)
#        roll_stats_f['roll_q90_' + str(window)] = s.rolling(window=window, min_periods=1).quantile(0.90)
    
    # add zeros when na values (std)
    roll_stats_f = roll_stats_f.fillna(value=0)
             
    return roll_stats_f

def calc_ewm(s, windows=[10, 50, 100, 500, 1000]):
    '''
    Calculates exponential weighted functions
    '''
    ewm = pd.DataFrame()
    for w in windows:
        ewm['ewm_mean_' + str(w)] = s.ewm(span=w, min_periods=1).mean()
        ewm['ewm_std_' + str(w)] = s.ewm(span=w, min_periods=1).std()
        
    # add zeros when na values (std)
    ewm = ewm.fillna(value=0)
        
    return ewm

def calc_grad_times_means(gradients, roll_stats, ewm, windows=[10, 50, 100, 500, 1000]):
    '''
    Calculates the gradient times means features
    '''
    grad_times = pd.DataFrame()
    for w in windows:
        grad_times['grad_1_times_roll_mean_' + str(w)] = gradients['grad_1']*roll_stats['roll_mean_' + str(w)]
        grad_times['grad_2_times_roll_mean_' + str(w)] = gradients['grad_2']*roll_stats['roll_mean_' + str(w)]
        grad_times['grad_3_times_roll_mean_' + str(w)] = gradients['grad_3']*roll_stats['roll_mean_' + str(w)]
        grad_times['grad_4_times_roll_mean_' + str(w)] = gradients['grad_4']*roll_stats['roll_mean_' + str(w)]
        grad_times['grad_1_times_ewm_mean_' + str(w)] = gradients['grad_1']*ewm['ewm_mean_' + str(w)]
        grad_times['grad_2_times_ewm_mean_' + str(w)] = gradients['grad_2']*ewm['ewm_mean_' + str(w)]
        grad_times['grad_3_times_ewm_mean_' + str(w)] = gradients['grad_3']*ewm['ewm_mean_' + str(w)]
        grad_times['grad_4_times_ewm_mean_' + str(w)] = gradients['grad_4']*ewm['ewm_mean_' + str(w)]
        grad_times['grad_1_times_ewm_std_' + str(w)] = gradients['grad_1']*ewm['ewm_std_' + str(w)]
        grad_times['grad_2_times_ewm_std_' + str(w)] = gradients['grad_2']*ewm['ewm_std_' + str(w)]
        grad_times['grad_3_times_ewm_std_' + str(w)] = gradients['grad_3']*ewm['ewm_std_' + str(w)]
        grad_times['grad_4_times_ewm_std_' + str(w)] = gradients['grad_4']*ewm['ewm_std_' + str(w)]        
        
    grad_times = grad_times.fillna(value=0)
    
    return grad_times

def add_features(s):
    '''
    All calculations together
    '''
    power = calc_power(s)
    gradients = calc_gradients(s)
    low_pass = calc_low_pass(s)
    high_pass = calc_high_pass(s)
    roll_stats = calc_roll_stats(s)
#    roll_stats_b = calc_roll_stats_back(s)
#    roll_stats_f = calc_roll_stats_forward(s)
    ewm = calc_ewm(s)
#    grad_times_means = calc_grad_times_means(gradients, roll_stats, ewm)
    return pd.concat([s, power, gradients, low_pass, high_pass, roll_stats, ewm], axis=1)

def divide_and_add_features(s, signal_size=100000):
    '''
    Divide the signal in bags of "signal_size".
    Normalize the data dividing it by 15.0
    '''
    # normalize
    s = s/15.0
    
    ls = []
    for i in tqdm(range(int(s.shape[0]/signal_size))):
        sig = s[i*signal_size:(i+1)*signal_size].copy().reset_index(drop=True)
        sig_featured = add_features(sig)
        ls.append(sig_featured)
    
    return pd.concat(ls, axis=0)
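
To make the difference between the two filter outputs in `calc_low_pass` concrete, the sketch below applies both to a unit step. The step signal and the single cutoff `Wn=0.05` (inside the range the notebook sweeps) are choices made for this example only.

```python
import numpy as np
from scipy import signal

# Unit step at sample 500.
x = np.concatenate([np.zeros(500), np.ones(500)])

b, a = signal.butter(1, Wn=0.05, btype='low')

# Causal filter, initialised as in calc_low_pass so the output starts settled at x[0].
zi = signal.lfilter_zi(b, a)
causal, _ = signal.lfilter(b, a, x, zi=zi * x[0])

# Zero-phase filter: runs forward and backward, so it introduces no delay.
zero_phase = signal.filtfilt(b, a, x)

# First sample at which each output reaches half of the step height.
causal_cross = int(np.argmax(causal >= 0.5))
zero_phase_cross = int(np.argmax(zero_phase >= 0.5))
print(causal_cross, zero_phase_cross)
```

The causal output reaches the half-height later than the zero-phase output, which stays centred on the step. Presumably this is why both variants are kept as separate features.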

In [None]:
def reduce_mem_usage(df, verbose=True):
    numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']
    start_mem = df.memory_usage().sum() / 1024**2    
    for col in df.columns:
        if col!='open_channels':
            col_type = df[col].dtypes
            if col_type in numerics:
                c_min = df[col].min()
                c_max = df[col].max()
                if str(col_type)[:3] == 'int':
                    if c_min > np.iinfo(np.int8).min and c_max < np.iinfo(np.int8).max:
                        df[col] = df[col].astype(np.int8)
                    elif c_min > np.iinfo(np.int16).min and c_max < np.iinfo(np.int16).max:
                        df[col] = df[col].astype(np.int16)
                    elif c_min > np.iinfo(np.int32).min and c_max < np.iinfo(np.int32).max:
                        df[col] = df[col].astype(np.int32)
                    elif c_min > np.iinfo(np.int64).min and c_max < np.iinfo(np.int64).max:
                        df[col] = df[col].astype(np.int64)  
                else:
                    if c_min > np.finfo(np.float16).min and c_max < np.finfo(np.float16).max:
                        df[col] = df[col].astype(np.float16)
                    elif c_min > np.finfo(np.float32).min and c_max < np.finfo(np.float32).max:
                        df[col] = df[col].astype(np.float32)
                    else:
                        df[col] = df[col].astype(np.float64)    
    end_mem = df.memory_usage().sum() / 1024**2
    if verbose: print('Mem. usage decreased to {:5.2f} Mb ({:.1f}% reduction)'.format(end_mem, 100 * (start_mem - end_mem) / start_mem))
    return df
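
`reduce_mem_usage` casts float columns to `float16` whenever their range fits. The sketch below, on synthetic data chosen only for illustration, shows the trade-off: a quarter of the memory of `float64` in exchange for reduced precision.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
toy = pd.DataFrame({'signal': rng.normal(size=100_000)})  # float64 by default

mem_before = toy.memory_usage(index=False).sum()
toy16 = toy.astype(np.float16)
mem_after = toy16.memory_usage(index=False).sum()

# float16 keeps only about 3 significant decimal digits.
max_abs_error = float(
    (toy['signal'] - toy16['signal'].astype(np.float64)).abs().max())
print(mem_before / mem_after, max_abs_error)
```

If the rounding error matters for a given feature, stopping at `float32` is a possible alternative.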

In [None]:
dfs = []
for i in range(0,50):
    df0 = divide_and_add_features(train[i*100000:(i+1)*100000]['signal'])
    df0 = reduce_mem_usage(df0)
    dfs.append(df0)
df = pd.concat(dfs)    

In [None]:
dfs = []
for i in range(0,20):
    df0 = divide_and_add_features(test_data[i*100000:(i+1)*100000]['signal'])
    df0 = reduce_mem_usage(df0)
    dfs.append(df0)
df_test = pd.concat(dfs)
df_test

In [None]:
from sklearn.model_selection import GroupKFold, StratifiedKFold, train_test_split
from sklearn import metrics
import lightgbm as lgb
import time
print(time.ctime())
kf = StratifiedKFold(n_splits = 5, shuffle = True, random_state = 42)
target = 'open_channels'
#features = ['signal', 'grad_1', 'grad_2', 'grad_3', 'grad_4']
features = [col for col in df.columns if col not in ['roll_mean_b10', 'roll_std_b10', 'roll_min_b10', 'roll_max_b10', 'roll_range_b10', 'roll_mean_b50', 'roll_std_b50', 'roll_min_b50', 'roll_max_b50', 'roll_range_b50', 'roll_mean_b100', 'roll_std_b100', 'roll_min_b100', 'roll_max_b100', 'roll_range_b100', 'roll_mean_b500', 'roll_std_b500', 'roll_min_b500', 'roll_max_b500', 'roll_range_b500', 'roll_mean_b1000', 'roll_std_b1000', 'roll_min_b1000', 'roll_max_b1000', 'roll_range_b1000', 'roll_mean_f10', 'roll_std_f10', 'roll_min_f10', 'roll_max_f10', 'roll_range_f10', 'roll_mean_f50', 'roll_std_f50', 'roll_min_f50', 'roll_max_f50', 'roll_range_f50', 'roll_mean_f100', 'roll_std_f100', 'roll_min_f100', 'roll_max_f100', 'roll_range_f100', 'roll_mean_f500', 'roll_std_f500', 'roll_min_f500', 'roll_max_f500', 'roll_range_f500', 'roll_mean_f1000', 'roll_std_f1000', 'roll_min_f1000', 'roll_max_f1000', 'roll_range_f1000']]
#features = [col for col in df.columns]
oof_pred = np.zeros(len(df))
y_pred = np.zeros(len(df_test))
#y_pred_b = np.zeros(len(df_test_b))
#y_pred_c = np.zeros(len(df_test_c))
#y_pred_d = np.zeros(len(df_test_d))
for fold, (tr_ind, val_ind) in enumerate(kf.split(df[features], train[target])):
    x_train, x_val = df[features].iloc[tr_ind], df[features].iloc[val_ind]
    y_train, y_val = train[target][tr_ind], train[target][val_ind]
    train_set = lgb.Dataset(x_train, y_train)
    val_set = lgb.Dataset(x_val, y_val)
    params = {'boosting_type': 'gbdt',
              'metric': 'rmse',
              'objective': 'regression',
              'n_jobs': -1,
              'seed': 42,
              'num_leaves': 280,
              'learning_rate': 0.026623466966581126,
              'max_depth': 73,
              'lambda_l1': 2.959759088169741,
              'lambda_l2': 1.331172832164913,
              'bagging_fraction': 0.9655406551472153,
              'bagging_freq': 9,
              'colsample_bytree': 0.6867118652742716}
        
    model = lgb.train(params, train_set, num_boost_round = 10000, early_stopping_rounds = 50, 
                        valid_sets = [train_set, val_set], verbose_eval = 100)
        
    oof_pred[val_ind] = model.predict(x_val)
        
    y_pred += model.predict(df_test[features]) / kf.n_splits
#    y_pred_b += model.predict(df_test_b[features]) / kf.n_splits
#    y_pred_c += model.predict(df_test_c[features]) / kf.n_splits
#    y_pred_d += model.predict(df_test_d[features]) / kf.n_splits
        
rmse_score = np.sqrt(metrics.mean_squared_error(train[target], oof_pred))
# clip and then round the predictions (you can get better performance by optimizing the rounding cut points)
oof_pred2 = np.round(np.clip(oof_pred, 0, 10)).astype(int)
round_y_pred = np.round(np.clip(y_pred, 0, 10)).astype(int)
#round_y_pred_b = np.round(np.clip(y_pred_b, 0, 10)).astype(int)
#round_y_pred_c = np.round(np.clip(y_pred_c, 0, 5)).astype(int)
#round_y_pred_d = np.round(np.clip(y_pred_d, 0, 5)).astype(int)
f1 = metrics.f1_score(train[target], oof_pred2, average = 'macro')

pre_train = pd.DataFrame()
pre_train['oofs']=oof_pred
pre_train['oofs2']=oof_pred2
pre_train[['oofs','oofs2']].to_csv('oofs_train.csv', index = False)

print(f'Our oof rmse score is {rmse_score}')
print(f'Our oof  f1 score is {f1}')
#f1_new = (f1[6]+f1[7]+f1[8]+f1[9]+f1[10])/5.0
#print(f'Our oof  f1 score for channels 6-10 is {f1_new}')
print(time.ctime())
#submission = pd.read_csv('/kaggle/input/liverpool-ion-switching/sample_submission.csv', dtype={'time':str})
#submission['open_channels'] = round_y_pred
#submission['oofs'] = y_pred

#submission[['time','open_channels']].to_csv('submission.csv', index = False)
#submission.to_csv('oofs_test.csv', index = False)
sub = pd.DataFrame()
#sub_b = pd.DataFrame()
#sub_c = pd.DataFrame()
#sub_d = pd.DataFrame()
sub['oofs']=y_pred
#sub_b['oofs']=y_pred_b
sub['oofs2']=round_y_pred
#sub_b['oofs2']=round_y_pred_b
#sub_c['open_channels']=round_y_pred_c
#sub_d['open_channels']=round_y_pred_d
sub.to_csv('oofs_test.csv', index = False)
#sub_b.to_csv('oofs_test_b.csv', index = False)
#sub_c.to_csv('sub_i.csv', index = False)
#sub_d.to_csv('sub_k.csv', index = False)
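
The out-of-fold (OOF) pattern used in the cell above can be sketched on toy data. To keep the example light, `LinearRegression` stands in for LightGBM; the synthetic data and the `seen` counter are likewise assumptions added for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(42)
n_train, n_test = 2000, 500
y = rng.integers(0, 4, size=n_train)                      # toy "open_channels"
X = (y + 0.2 * rng.normal(size=n_train)).reshape(-1, 1)   # toy noisy signal
X_test = rng.uniform(0, 3, size=(n_test, 1))

kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
oof_pred = np.zeros(n_train)
test_pred = np.zeros(n_test)
seen = np.zeros(n_train, dtype=int)  # how often each row was in a validation fold

for tr_ind, val_ind in kf.split(X, y):
    model = LinearRegression().fit(X[tr_ind], y[tr_ind])  # stand-in for LightGBM
    oof_pred[val_ind] = model.predict(X[val_ind])         # each row is predicted once
    test_pred += model.predict(X_test) / kf.n_splits      # average over the folds
    seen[val_ind] += 1

# Clip to the valid label range, round to integers, and score with macro F1.
oof_rounded = np.round(np.clip(oof_pred, 0, 3)).astype(int)
macro_f1 = f1_score(y, oof_rounded, average='macro')
print(macro_f1)
```

Every training row receives exactly one prediction from a model that never saw it, which is what makes the saved `oofs` columns safe to use as inputs for the next kernel.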