# About
* This notebook placed in the top 11% by ensembling efn-b4-5folds and cassava-resnext50-32x4d-weights.
* The resnext50 weights are public; the others were trained on Kaggle by me.
    
# Source Kernels
* This notebook draws on the great kernels below, all of which I upvoted. Thanks to their authors for the great work, and best regards.
    
    1. [EfficientNet CV](https://www.kaggle.com/ajaykumar7778/efficientnet-cv#Training)
    2. [Pytorch Efficientnet Baseline ](https://www.kaggle.com/zzy990106/pytorch-efficientnet-baseline/notebook?select=weight.pt)
    3. [EfficientNet and CutMixUp with TPU Predict Phase](https://www.kaggle.com/itsuki9180/efficientnet-and-cutmixup-with-tpu-predict-phase)
    4. [Ensemble: Resnext50_32x4d + Efficientnet = 0.903](https://www.kaggle.com/japandata509/ensemble-resnext50-32x4d-efficientnet-0-903)
    5. [ViT: CUDA as usual [Ensemble Inference]](https://www.kaggle.com/szuzhangzhi/vit-cuda-as-usual-ensemble-inference/data)
    6. [LEAF CLASSIFICATION RESNEXT 50_32*4D](https://www.kaggle.com/manojprabhaakr/leaf-classification-resnext-50-32-4d/data)

# Background

## The Challenge

As the second-largest provider of carbohydrates in Africa, cassava is a key food security crop grown by smallholder farmers because it can withstand harsh conditions. At least 80% of household farms in Sub-Saharan Africa grow this starchy root, but viral diseases are major sources of poor yields. With the help of data science, it may be possible to identify common diseases so they can be treated.

Existing methods of disease detection require farmers to solicit the help of government-funded agricultural experts to visually inspect and diagnose the plants. This suffers from being labor-intensive, low-supply and costly. As an added challenge, effective solutions for farmers must perform well under significant constraints, since African farmers may only have access to mobile-quality cameras with low-bandwidth.

In this competition, we introduce a dataset of 21,367 labeled images collected during a regular survey in Uganda. Most images were crowdsourced from farmers taking photos of their gardens, and annotated by experts at the National Crops Resources Research Institute (NaCRRI) in collaboration with the AI lab at Makerere University, Kampala. This is in a format that most realistically represents what farmers would need to diagnose in real life.

Your task is to classify each cassava image into four disease categories or a fifth category indicating a healthy leaf. With your help, farmers may be able to quickly identify diseased plants, potentially saving their crops before they inflict irreparable damage.

## Files

**[train/test]_images** the image files. The full set of test images will only be available to your notebook when it is submitted for scoring. Expect to see roughly 15,000 images in the test set.

**train.csv**
* `image_id`: the image file name.
* `label`: the ID code for the disease.

**sample_submission.csv** A properly formatted sample submission, given the disclosed test set content.
* `image_id`: the image file name.
* `label`: the predicted ID code for the disease.

**[train/test]_tfrecords** the image files in tfrecord format.

**label_num_to_disease_map.json** The mapping between each disease code and the real disease name.
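
As a quick orientation to these files, the sketch below parses a label map shaped like `label_num_to_disease_map.json` (string code to disease name) and counts labels per class. The helper name `class_counts`, the inline JSON text, and the class names in it are placeholders for illustration only; the real names and counts come from the competition files.

```python
import json
from collections import Counter


def class_counts(labels, code_to_name):
    """Count labels per class and key the result by readable class name."""
    counts = Counter(labels)
    # JSON object keys are always strings, so convert each integer code before lookup.
    return {code_to_name[str(code)]: n for code, n in sorted(counts.items())}


# Toy stand-in for the JSON file's text; these names are placeholders.
toy_json = '{"0": "disease A", "1": "disease B", "2": "healthy"}'
code_to_name = json.loads(toy_json)

# Toy stand-in for the `label` column of train.csv.
toy_labels = [0, 2, 1, 2, 2, 0]
print(class_counts(toy_labels, code_to_name))
```

With the real data, `code_to_name` would be loaded from the JSON file and `toy_labels` replaced by the `label` column of `train.csv`.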

# Load Data

In [None]:
import pandas as pd

train_path = '../input/cassava-leaf-disease-classification/train_images/'
test_path = '../input/cassava-leaf-disease-classification/test_images/'

train_csv = pd.read_csv('../input/cassava-leaf-disease-classification/train.csv')
test_csv = pd.read_csv('../input/cassava-leaf-disease-classification/sample_submission.csv')

In [None]:
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

In [None]:
import cv2
from torch.utils.data import Dataset

class MyDataset(Dataset):
    
    def __init__(self, dataframe, transform=None, test=False):
        self.df = dataframe
        self.transform = transform
        self.test = test
    
    def __len__(self):
        return len(self.df)
    
    def __getitem__(self, idx):
        
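        label = self.df.label.values[idx]
        p = self.df.image_id.values[idx]

        # Pick the image directory that matches the split.
        p_path = (test_path if self.test else train_path) + p

        # cv2.imread returns None instead of raising when a file cannot be read.
        image = cv2.imread(p_path)
        if image is None:
            raise FileNotFoundError(f'Could not read image: {p_path}')
        # OpenCV loads images as BGR; convert to RGB before augmentation.
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)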
        
        if self.transform:
            image = self.transform(image=image)['image']
        
        return image, label

# Process Data

In [None]:
import albumentations
from albumentations.pytorch import ToTensorV2
from torch.utils.data import DataLoader
from sklearn.model_selection import train_test_split, StratifiedKFold

IMG_SIZE = 512
BATCH_SIZE = 12

train_transform = albumentations.Compose([
    albumentations.RandomResizedCrop(IMG_SIZE, IMG_SIZE),
    albumentations.Transpose(p=0.5),
    albumentations.HorizontalFlip(p=0.5),
    albumentations.VerticalFlip(p=0.5),
    albumentations.ShiftScaleRotate(p=0.5),
    albumentations.HueSaturationValue(hue_shift_limit=0.2, sat_shift_limit=0.2, val_shift_limit=0.2, p=0.5),
    albumentations.RandomBrightnessContrast(brightness_limit=(-0.1,0.1), contrast_limit=(-0.1, 0.1), p=0.5),
    albumentations.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], max_pixel_value=255.0, p=1.0),
    albumentations.CoarseDropout(p=0.5),
    albumentations.Cutout(p=0.5),
    ToTensorV2(p=1.0),
], p=1.)

val_transform = albumentations.Compose([
     albumentations.CenterCrop(IMG_SIZE, IMG_SIZE, p=1.),
     albumentations.Resize(IMG_SIZE, IMG_SIZE),
     albumentations.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], max_pixel_value=255.0, p=1.0),
     ToTensorV2(p=1.0),
], p=1.)

kFold = StratifiedKFold(n_splits=10, shuffle=True, random_state=2021).split(train_csv.image_id, train_csv.label)

testset = MyDataset(test_csv, transform=val_transform, test=True)
test_loader = DataLoader(testset, batch_size=BATCH_SIZE, shuffle=False, num_workers=4)

In [None]:
print(testset[0][0].shape)
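
To make the `Normalize` and `ToTensorV2` steps concrete, here is a minimal NumPy sketch of the arithmetic they perform with the parameters used above: scale pixels by `max_pixel_value`, standardize each channel with the given mean and std, then move channels first. The helper names `normalize` and `to_chw` are hypothetical, and the sketch returns a NumPy array rather than a torch tensor.

```python
import numpy as np

# Per-channel statistics copied from the transforms above.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize(image_uint8, mean=MEAN, std=STD, max_pixel_value=255.0):
    """Scale an HxWx3 uint8 RGB image to [0, 1], then standardize per channel."""
    scaled = image_uint8.astype(np.float32) / max_pixel_value
    return (scaled - mean) / std


def to_chw(image_hwc):
    """Move channels first (HWC to CHW), the layout PyTorch models expect."""
    return np.transpose(image_hwc, (2, 0, 1))


# An all-white dummy image stands in for a real photo.
dummy = np.full((512, 512, 3), 255, dtype=np.uint8)
out = to_chw(normalize(dummy))
print(out.shape)
print(out[:, 0, 0])
```

This is why the shape printed for `testset[0][0]` has the channel dimension first, followed by the 512 x 512 spatial dimensions.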

# Create Model

In [None]:
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from transformers import get_cosine_schedule_with_warmup

!pip install ../input/timm-install/timm-0.1.30-py3-none-any.whl
import timm

import sys

package_path = '../input/visiontransformer-pakage/VisionTransformer-Pytorch'
sys.path.append(package_path)
from vision_transformer_pytorch import VisionTransformer

LR = 1e-3

# Train Model

In [None]:
def train_epoch(net, data_loader, device):
    """Train for one epoch; relies on the global `optimizer` and `criterion`."""
    net.train()
    train_batch_num = len(data_loader)
    total_loss = 0
    correct = 0
    sample_num = 0

    for batch_idx, (data, target) in enumerate(data_loader):
        data = data.to(device).float()
        target = target.to(device).long()

        optimizer.zero_grad()
        output = net(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()

        total_loss += loss.item()
        prediction = torch.argmax(output, dim=1)
        correct += (prediction == target).sum().item()
        sample_num += len(data)
    
    loss = total_loss / train_batch_num
    acc = correct / sample_num
    return loss, acc

def test_epoch(net, data_loader, device):
    """Evaluate without gradients; relies on the global `criterion`."""
    net.eval()
    test_batch_num = len(data_loader)
    total_loss = 0
    correct = 0
    sample_num = 0

    with torch.no_grad():
        for batch_idx, (data, target) in enumerate(data_loader):
            data = data.to(device).float()
            target = target.to(device).long()

            output = net(data)
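            # target is already a 1-D long tensor; squeezing it would turn a
            # final batch of size 1 into a 0-d tensor and break the loss call.
            loss = criterion(output, target)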
            total_loss += loss.item()
            prediction = torch.argmax(output, dim=1)
            correct += (prediction == target).sum().item()
            sample_num += len(data)
        
    loss = total_loss / test_batch_num
    acc = correct / sample_num
    return loss, acc
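
Both functions above divide the summed batch losses by the number of batches, so a short final batch counts as much as a full one. Assuming `criterion` returns the mean loss over each batch (the default for `nn.CrossEntropyLoss`), a sample-weighted average is a small change. The sketch below uses plain Python with made-up numbers; the helper names are hypothetical.

```python
def batch_mean(batch_losses):
    """Average of per-batch mean losses, as computed in train_epoch/test_epoch."""
    return sum(batch_losses) / len(batch_losses)


def sample_weighted_mean(batch_losses, batch_sizes):
    """Average loss per sample: weight each batch mean by its batch size."""
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)


# Made-up example: two full batches of 12 and a final batch with one hard sample.
losses = [0.5, 0.5, 2.0]
sizes = [12, 12, 1]
print(batch_mean(losses))
print(sample_weighted_mean(losses, sizes))
```

In this example the single hard sample moves the per-batch average much more than the per-sample average. With `BATCH_SIZE = 12` and thousands of images, the difference in practice is likely small.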

In [None]:
# from tqdm import tqdm
# fold_train_loss_list = []
# fold_train_acc_list = []
# fold_val_loss_list = []
# fold_val_acc_list = []

# for n_fold,(train_idx, val_idx) in enumerate(kFold):
#     if n_fold != 9: continue
#     # get i-fold
#     train_df, val_df = train_csv.iloc[train_idx], train_csv.iloc[val_idx]
    
#     trainset = MyDataset(train_df, transform=train_transform)
#     train_loader = DataLoader(trainset, batch_size=BATCH_SIZE, shuffle=True, num_workers=4)
    
#     valset = MyDataset(val_df, transform=val_transform)
#     val_loader = DataLoader(valset, batch_size=BATCH_SIZE, shuffle=False, num_workers=4)
    
#     # define a new net
#     net = timm.create_model('tf_efficientnet_b4_ns', pretrained=True, num_classes=5).to(device)
#     #net.load_state_dict(torch.load('../input/efn-b4-ap-fold3-1/efn_b4_ap_fold3_weight_0.8841.pt'))
#     # net = VisionTransformer.from_name('ViT-B_16', num_classes=5).to(device)
#     criterion = nn.CrossEntropyLoss()
#     optimizer = torch.optim.Adam(net.parameters(), lr=LR)
#     scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
#         optimizer, mode='max', factor=0.5, patience=1, verbose=True, min_lr=1e-9)

#     # train at this fold
#     train_loss_list = []
#     train_acc_list = []
#     val_loss_list = []
#     val_acc_list = []

#     torch.cuda.empty_cache()

#     EPOCHS = 50
#     BEST_ACC = 0
    
#     print('Fold %d:' % n_fold)
#     for epoch in range(EPOCHS):
#         tk = tqdm(train_loader, total=len(train_loader), position=0, leave=True)
#         train_loss, train_acc = train_epoch(net, tk, device)

#         tk = tqdm(val_loader, total=len(val_loader), position=0, leave=True)
#         val_loss, val_acc = test_epoch(net, tk, device)

#         train_loss_list.append(train_loss)
#         train_acc_list.append(train_acc)
#         val_loss_list.append(val_loss)
#         val_acc_list.append(val_acc)

#         if val_acc > BEST_ACC:
#             BEST_ACC = val_acc
#             dirr = 'tf_efficientnet_b4_ns_fold' + str(n_fold) + '_weight_' + str(round(BEST_ACC, 4)) + '.pt'
#             torch.save(net.state_dict(), dirr)

#         print('epoch %d, train loss %.4f, train acc %.4f, val loss %.4f, val acc %.4f, best acc %.4f' %
#         (epoch + 1, train_loss, train_acc, val_loss, val_acc, BEST_ACC))

#         scheduler.step(val_acc)
    
#     fold_train_loss_list.append(train_loss_list)
#     fold_train_acc_list.append(train_acc_list)
#     fold_val_loss_list.append(val_loss_list)
#     fold_val_acc_list.append(val_acc_list)

In [None]:
# import matplotlib.pyplot as plt

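# # Plot only the folds that were actually trained (the loop above keeps a single fold).
# for i in range(len(fold_train_loss_list)):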
#     plt.figure(figsize=[20,9])
#     plt.plot(fold_train_loss_list[i],label="train_loss",color="red")
#     plt.plot(fold_train_acc_list[i],label="train_acc",color="orange")
#     plt.plot(fold_val_loss_list[i],label="val_loss",color="blue")
#     plt.plot(fold_val_acc_list[i],label="val_acc",color="green")
#     plt.legend()

# Predict Data

In [None]:
# ====================================================
# MODEL
# ====================================================
class CustomResNext(nn.Module):
    def __init__(self, model_name='resnext50_32x4d', pretrained=False):
        super().__init__()
        self.model = timm.create_model(model_name, pretrained=pretrained)
        n_features = self.model.fc.in_features
        self.model.fc = nn.Linear(n_features, 5)

    def forward(self, x):
        x = self.model(x)
        return x

In [None]:
# ====================================================
# Helper functions
# ====================================================
def load_state(model_path):
    # Load the checkpoint once, on CPU, so this also works without a GPU.
    state_dict = torch.load(model_path, map_location='cpu')['model']
    # Checkpoints saved from a multi-GPU (nn.DataParallel) model prefix every
    # key with 'module.'; strip it so the keys match a plain CustomResNext.
    return {k[7:] if k.startswith('module.') else k: v for k, v in state_dict.items()}
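
The key renaming in `load_state` can be tried in isolation. Wrapping a model in `nn.DataParallel` stores it under a `module` attribute, so every saved parameter name gains a `module.` prefix. The sketch below applies the same stripping rule to a toy dictionary; the helper name `strip_module_prefix` and the toy keys and values are assumptions for illustration.

```python
def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from each key."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Toy stand-in for a checkpoint; real values would be tensors.
toy_state = {
    "module.model.fc.weight": 1,
    "module.model.fc.bias": 2,
    "model.conv1.weight": 3,  # already clean, left untouched
}
cleaned = strip_module_prefix(toy_state)
print(sorted(cleaned))
```

Only a leading prefix is removed, so a key that merely contains `module.` further along is kept as-is.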

In [None]:
import numpy as np
from tqdm import tqdm

# Note: model_vit is loaded here but is not used in the ensemble loop below.
model_vit = VisionTransformer.from_name('ViT-B_16', num_classes=5).to(device)
model_vit.load_state_dict(torch.load('../input/vit-b-16/ViT-B_16.pt'))

model1 = CustomResNext('resnext50_32x4d', pretrained=False).to(device)
states1 = [load_state(f'../input/cassava-resnext50-32x4d-weights/resnext50_32x4d_fold{fold}.pth') for fold in range(5)]

model2 = timm.create_model('tf_efficientnet_b3_ns', pretrained=False, num_classes=5).to(device)
states2 = [torch.load(f'../input/efn-b3-5folds/efn_b3_fold{fold}_weight.pt') for fold in range(5)]

model3 = timm.create_model('tf_efficientnet_b4_ns', pretrained=False, num_classes=5).to(device)
states3 = [torch.load(f'../input/efn-b4-5folds/efn_b4_fold{fold}_weight.pt') for fold in range(5)]

model4 = timm.create_model('tf_efficientnet_b4_ns', pretrained=False, num_classes=5).to(device)
states4 = [torch.load(f'../input/cassava-efn-b4-ns-10folds-weights/efn_b4_ns_fold{fold}_weight.pt') for fold in range(10)]

model5 = timm.create_model('resnet50', pretrained=False, num_classes=5).to(device)
states5 = [torch.load(f'../input/resnet50-5folds/resNet50_fold{fold}_weight.pt') for fold in range(5)]

test_pred = []
with torch.no_grad():
    for i, data in enumerate(tqdm(test_loader, position=0, leave=True)):
        images, _ = data
        images = images.to(device)
        
        avg_preds = []
            
        # Run every fold checkpoint of every architecture on this batch and
        # collect the raw outputs (logits) for averaging.
        for model, states in ((model1, states1), (model2, states2), (model3, states3),
                              (model4, states4), (model5, states5)):
            model.eval()
            for state in states:
                model.load_state_dict(state)
                pred = model(images)
                avg_preds.append(pred.to('cpu').numpy())

        avg_preds = np.mean(avg_preds, axis=0)
        
        test_pred.append(avg_preds)

test_pred = np.concatenate(test_pred)
test_csv.label = test_pred.argmax(1)
test_csv.to_csv('submission.csv',index=False)
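
The loop above averages raw model outputs (logits) before taking the argmax. A common alternative, shown here only as a hedged sketch and not as what this notebook submitted, is to convert each model's logits to probabilities with softmax and average those, so that one model with unusually large logits cannot dominate the vote. The helper names and toy logits below are assumptions for illustration.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def average_probabilities(per_model_logits):
    """Average class probabilities over models; each item has shape (batch, classes)."""
    probs = [softmax(np.asarray(logits, dtype=np.float64)) for logits in per_model_logits]
    return np.mean(probs, axis=0)


# Toy example: one image, three classes, three "models".
# One model is very confident in class 0; the other two prefer class 1.
per_model_logits = [
    np.array([[10.0, 0.0, 0.0]]),
    np.array([[0.0, 4.0, 0.0]]),
    np.array([[0.0, 4.0, 0.0]]),
]

logit_vote = int(np.mean(per_model_logits, axis=0).argmax(1)[0])
prob_vote = int(average_probabilities(per_model_logits).argmax(1)[0])
print(logit_vote, prob_vote)
```

In this toy case the two schemes disagree: averaging logits picks class 0, while averaging probabilities picks class 1. Which works better on the real test set is an empirical question that cross-validation would have to answer.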