
Question about the Experiment of Image Multi-label Classification #10

Closed
mymuli opened this issue Aug 4, 2019 · 10 comments

@mymuli

mymuli commented Aug 4, 2019

In the paper "DiCENet: Dimension-wise Convolutions for Efficient Networks", the network width scaling parameter s can be selected. In the image multi-label classification experiment, however, running with s=0.1 produces an error (when I run with s=0.2, my machine cannot handle it, even though the s=0.1 network has less than half the parameters). Could you provide a working configuration for s = 0.1?

@sacmehta
Owner

sacmehta commented Aug 4, 2019

Could you please elaborate on the error details?

@mymuli
Author

mymuli commented Aug 4, 2019

@sacmehta
```
python train_classification.py --model dicenet \
    --scheduler hybrid --clr-max 61 --lr 0.1 \
    --data ./coco-image --dataset coco \
    --epochs 100 --batch-size 64 --s 0.2
```

When I execute the command above with s set to 0.2, the program gets stuck because my graphics card does not have enough memory.

[screenshot: console output from the stalled training run]


[screenshot: Table 1 from the paper]

In Table 1, the FLOPs are 12M when s = 0.2 and 6.5M when s = 0.1, so I want to set s to 0.1.

```
python train_classification.py --model dicenet \
    --scheduler hybrid --clr-max 61 --lr 0.1 \
    --data ./coco-image --dataset coco \
    --epochs 100 --batch-size 64 --s 0.1
```

For the image multi-label classification experiment, when I execute the command above with s set to 0.1, the program reports the error shown in the figure below.

[screenshot: error traceback]

Could you tell me how to fix this? Thank you very much.

@sacmehta
Owner

sacmehta commented Aug 4, 2019

Uncomment this line and it should work. Note that we have not provided pretrained ImageNet weights for this configuration, so you might want to train on ImageNet first.

#0.1 : [8, 8, 16, 32, 64, 512],
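For context, that line sits in a dictionary in the DiCENet model definition that maps the width scale s to per-stage channel counts. A minimal sketch of the relevant portion after uncommenting; the dictionary name and the other entries are placeholders, not copied from the repository:

```python
# Hypothetical excerpt of the scale-to-channels mapping in the DiCENet model
# definition. Only the 0.1 entry comes from the maintainer's comment above;
# the dictionary name and the remaining entries are illustrative placeholders.
sc_ch_dict = {
    # ... entries for the other supported values of s ...
    0.1: [8, 8, 16, 32, 64, 512],  # uncommented so that --s 0.1 is accepted
}
```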

@mymuli
Author

mymuli commented Aug 6, 2019

@sacmehta
Thank you very much for your reply. And I have another question.

In the paper "ELASTIC: Improving CNNs with Dynamic Scaling Policies", the dataset used is MS-COCO 2014, with a training/validation split of 82783/40504 images. Both "DiCENet: Dimension-wise Convolutions for Efficient Networks" and "ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network" cite MS-COCO 2014, but the corresponding DiCENet experiment uses MS-COCO 2017, with a training/validation split of 118287/5000 images.
Given that the training and validation sets differ, isn't the comparison between DiCENet and ELASTIC a problem?

Code segment corresponding to ELASTIC:
[screenshot: ELASTIC data-loading code (COCO 2014)]

Code segment corresponding to DiCENet:
[screenshot: DiCENet data-loading code (COCO 2017)]

@sacmehta
Owner

sacmehta commented Aug 6, 2019

In our paper, we used the same split as ELASTIC.

@sacmehta
Owner

sacmehta commented Aug 6, 2019

Since COCO 2017 is the latest version, we set the defaults to the 2017 split. You can change them if you want.
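If you do want the 2014 split, the repository's COCO loader appears to accept a year argument (see the commented-out calls in the script posted later in this thread). A rough sketch of the change, not verified against the current code:

```python
# Sketch only: the argument names follow the commented-out calls quoted
# further down in this thread, and may differ in the current repository.
from data_loader.classification.coco import COCOClassification

train_dataset = COCOClassification(root=args.data, split='train', year='2014',
                                   inp_size=args.inpSize, scale=args.scale,
                                   is_training=True)
val_dataset = COCOClassification(root=args.data, split='val', year='2014',
                                 inp_size=args.inpSize, is_training=False)
```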

@mymuli
Author

mymuli commented Aug 6, 2019

@sacmehta

Code modified to use the 2014 version of the dataset:

```python
# -*- coding: utf-8 -*-
import torch
import argparse
import torch.optim
import torch.utils.data
import torch.utils.data.distributed
from data_loader.classification import imagenet as img_loader
import random
import os
from tensorboardX import SummaryWriter
import time
from utilities.utils import model_parameters, compute_flops
from utilities.utils import save_checkpoint
import numpy as np
from utilities.print_utils import *
from torch import nn
from PIL import Image
import torchvision.transforms as transforms
import torchvision.datasets as datasets

"""
__author__ = "Sachin Mehta"
__maintainer__ = "Sachin Mehta"
"""

class CocoDetection(datasets.coco.CocoDetection):
    def __init__(self, root, annFile, transform=None, target_transform=None):
        from pycocotools.coco import COCO
        self.root = root
        self.coco = COCO(annFile)
        self.ids = list(self.coco.imgs.keys())
        self.transform = transform
        self.target_transform = target_transform
        # map the non-contiguous COCO category ids to contiguous label indices
        self.cat2cat = dict()
        for cat in self.coco.cats.keys():
            self.cat2cat[cat] = len(self.cat2cat)
        # print(self.cat2cat)

    def __getitem__(self, index):
        coco = self.coco
        img_id = self.ids[index]
        ann_ids = coco.getAnnIds(imgIds=img_id)
        target = coco.loadAnns(ann_ids)

        # one row per object-size bucket (small/medium/large), 80 COCO classes
        output = torch.zeros((3, 80), dtype=torch.long)
        for obj in target:
            if obj['area'] < 32 * 32:
                output[0][self.cat2cat[obj['category_id']]] = 1
            elif obj['area'] < 96 * 96:
                output[1][self.cat2cat[obj['category_id']]] = 1
            else:
                output[2][self.cat2cat[obj['category_id']]] = 1
        target = output

        path = coco.loadImgs(img_id)[0]['file_name']
        img = Image.open(os.path.join(self.root, path)).convert('RGB')
        if self.transform is not None:
            img = self.transform(img)

        if self.target_transform is not None:
            target = self.target_transform(target)
        return img, target

def main(args):
    # -----------------------------------------------------------------------------
    # Create model
    # -----------------------------------------------------------------------------
    if args.model == 'dicenet':
        from model.classification import dicenet as net
        model = net.CNNModel(args)
    elif args.model == 'espnetv2':
        from model.classification import espnetv2 as net
        model = net.EESPNet(args)
    elif args.model == 'shufflenetv2':
        from model.classification import shufflenetv2 as net
        model = net.CNNModel(args)
    else:
        print_error_message('Model {} not yet implemented'.format(args.model))
        exit()

    if args.finetune:
        # load the weights for finetuning
        if os.path.isfile(args.weights_ft):
            pretrained_dict = torch.load(args.weights_ft, map_location=torch.device('cpu'))
            print_info_message('Loading pretrained basenet model weights')
            model_dict = model.state_dict()

            overlap_dict = {k: v for k, v in model_dict.items() if k in pretrained_dict}

            total_size_overlap = 0
            for k, v in enumerate(overlap_dict):
                total_size_overlap += torch.numel(overlap_dict[v])

            total_size_pretrain = 0
            for k, v in enumerate(pretrained_dict):
                total_size_pretrain += torch.numel(pretrained_dict[v])

            if len(overlap_dict) == 0:
                print_error_message('No overlapping weights between model file and pretrained weight file. Please check')

            print_info_message('Overlap ratio of weights: {:.2f} %'.format(
                (total_size_overlap * 100.0) / total_size_pretrain))

            model_dict.update(overlap_dict)
            model.load_state_dict(model_dict, strict=False)
            print_info_message('Pretrained basenet model loaded!!')
        else:
            print_error_message('Unable to find the weights: {}'.format(args.weights_ft))

    # -----------------------------------------------------------------------------
    # Writer for logging
    # -----------------------------------------------------------------------------
    if not os.path.isdir(args.savedir):
        os.makedirs(args.savedir)
    writer = SummaryWriter(log_dir=args.savedir, comment='Training and Validation logs')
    try:
        writer.add_graph(model, input_to_model=torch.randn(1, 3, args.inpSize, args.inpSize))
    except:
        print_log_message("Not able to generate the graph. Likely because your model is not supported by ONNX")

    # network properties
    num_params = model_parameters(model)
    flops = compute_flops(model)
    print_info_message('FLOPs: {:.2f} million'.format(flops))
    print_info_message('Network Parameters: {:.2f} million'.format(num_params))

    # -----------------------------------------------------------------------------
    # Optimizer
    # -----------------------------------------------------------------------------

    optimizer = torch.optim.SGD(model.parameters(), args.lr, momentum=args.momentum, weight_decay=args.weight_decay)

    # optionally resume from a checkpoint
    best_acc = 0.0
    num_gpus = torch.cuda.device_count()
    print("num_gpus: ", num_gpus)
    device = 'cuda' if num_gpus >= 1 else 'cpu'
    print("device: ", device)
    print("***********************************")
    if args.resume:
        if os.path.isfile(args.resume):
            print_info_message("=> loading checkpoint '{}'".format(args.resume))
            # map_location belongs to torch.load, not to load_state_dict
            checkpoint = torch.load(args.resume, map_location=torch.device(device))
            args.start_epoch = checkpoint['epoch']
            best_acc = checkpoint['best_prec1']
            model.load_state_dict(checkpoint['state_dict'])
            optimizer.load_state_dict(checkpoint['optimizer'])
            print_info_message("=> loaded checkpoint '{}' (epoch {})"
                               .format(args.resume, checkpoint['epoch']))
        else:
            print_warning_message("=> no checkpoint found at '{}'".format(args.resume))

    # -----------------------------------------------------------------------------
    # Loss Fn
    # -----------------------------------------------------------------------------
    if args.dataset == 'imagenet':
        criterion = nn.CrossEntropyLoss()
        acc_metric = 'Top-1'
    elif args.dataset == 'coco':
        criterion = nn.BCEWithLogitsLoss()
        acc_metric = 'F1'
    else:
        print_error_message('{} dataset not yet supported'.format(args.dataset))

    if num_gpus >= 1:
        model = torch.nn.DataParallel(model)
        model = model.cuda()
        criterion = criterion.cuda()
        if torch.backends.cudnn.is_available():
            import torch.backends.cudnn as cudnn
            cudnn.benchmark = True
            cudnn.deterministic = True

    # -----------------------------------------------------------------------------
    # Data Loaders
    # -----------------------------------------------------------------------------
    # Data loading code
    if args.dataset == 'imagenet':
        train_loader, val_loader = img_loader.data_loaders(args)
        # import the loaders too
        from utilities.train_eval_classification import train, validate
    elif args.dataset == 'coco':
        # from data_loader.classification.coco import COCOClassification
        # train_dataset = COCOClassification(root=args.data, split='train', year='2014', inp_size=args.inpSize, scale=args.scale, is_training=True)  # 2017
        # val_dataset = COCOClassification(root=args.data, split='val', year='2014', inp_size=args.inpSize, is_training=False)  # 2017
        # dataset handling
        # train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=args.batch_size, shuffle=True,
        #                                            pin_memory=True, num_workers=args.workers)
        # val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=args.batch_size, shuffle=False,
        #                                          pin_memory=True, num_workers=args.workers)

        # Data loading code (taken from the ELASTIC paper)
        normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                         std=[0.229, 0.224, 0.225])

        train_dataset = CocoDetection(os.path.join(args.data, 'images/train2014'),
                                      os.path.join(args.data, 'annotations/instances_train2014.json'),
                                      transforms.Compose([
                                          transforms.RandomResizedCrop(224),
                                          transforms.RandomHorizontalFlip(),
                                          transforms.ToTensor(),
                                          normalize,
                                      ]))
        val_dataset = CocoDetection(os.path.join(args.data, 'images/val2014'),
                                    os.path.join(args.data, 'annotations/instances_val2014.json'),
                                    transforms.Compose([
                                        transforms.Resize((224, 224)),
                                        transforms.ToTensor(),
                                        normalize,
                                    ]))

        train_sampler = torch.utils.data.sampler.RandomSampler(train_dataset)
        # training set
        train_loader = torch.utils.data.DataLoader(
            train_dataset, batch_size=args.batch_size, shuffle=(train_sampler is None),
            num_workers=args.workers, pin_memory=True, sampler=train_sampler, drop_last=True)
        # validation set
        val_loader = torch.utils.data.DataLoader(
            val_dataset, batch_size=args.batch_size, shuffle=False,
            num_workers=args.workers, pin_memory=True)

        # import the loaders too
        from utilities.train_eval_classification import train_multi as train
        from utilities.train_eval_classification import validate_multi as validate
    else:
        print_error_message('{} dataset not yet supported'.format(args.dataset))

    # -----------------------------------------------------------------------------
    # LR schedulers
    # -----------------------------------------------------------------------------
    if args.scheduler == 'fixed':
        step_sizes = args.steps
        from utilities.lr_scheduler import FixedMultiStepLR
        lr_scheduler = FixedMultiStepLR(base_lr=args.lr, steps=step_sizes, gamma=args.lr_decay)
    elif args.scheduler == 'clr':
        from utilities.lr_scheduler import CyclicLR
        step_sizes = args.steps
        lr_scheduler = CyclicLR(min_lr=args.lr, cycle_len=5, steps=step_sizes, gamma=args.lr_decay)
    elif args.scheduler == 'poly':
        from utilities.lr_scheduler import PolyLR
        lr_scheduler = PolyLR(base_lr=args.lr, max_epochs=args.epochs)
    elif args.scheduler == 'linear':
        from utilities.lr_scheduler import LinearLR
        lr_scheduler = LinearLR(base_lr=args.lr, max_epochs=args.epochs)
    elif args.scheduler == 'hybrid':
        from utilities.lr_scheduler import HybirdLR
        lr_scheduler = HybirdLR(base_lr=args.lr, max_epochs=args.epochs, clr_max=args.clr_max)
    else:
        print_error_message('Scheduler ({}) not yet implemented'.format(args.scheduler))
        exit()
    print("LR scheduler: ", args.scheduler)
    print_info_message(lr_scheduler)

    # set up the epoch variable in case resuming training
    if args.start_epoch != 0:
        for epoch in range(args.start_epoch):
            lr_scheduler.step(epoch)

    with open(args.savedir + os.sep + 'arguments.json', 'w') as outfile:
        import json
        arg_dict = vars(args)
        arg_dict['model_params'] = '{} '.format(num_params)
        arg_dict['flops'] = '{} '.format(flops)
        json.dump(arg_dict, outfile)

    # -----------------------------------------------------------------------------
    # Training and Val Loop
    # -----------------------------------------------------------------------------

    extra_info_ckpt = args.model + '_' + str(args.s)
    for epoch in range(args.start_epoch, args.epochs):
        lr_log = lr_scheduler.step(epoch)
        # set the optimizer with the learning rate
        # This can be done inside the MyLRScheduler
        for param_group in optimizer.param_groups:
            param_group['lr'] = lr_log
        print_info_message("LR for epoch {} = {:.5f}".format(epoch, lr_log))
        train_acc, train_loss = train(data_loader=train_loader, model=model, criteria=criterion,
                                      optimizer=optimizer, epoch=epoch, device=device)
        # evaluate on validation set
        val_acc, val_loss = validate(data_loader=val_loader, model=model, criteria=criterion, device=device)

        # remember best prec@1 and save checkpoint
        is_best = val_acc > best_acc
        best_acc = max(val_acc, best_acc)

        weights_dict = model.module.state_dict() if device == 'cuda' else model.state_dict()
        save_checkpoint({
            'epoch': epoch + 1,
            'state_dict': weights_dict,
            'best_prec1': best_acc,
            'optimizer': optimizer.state_dict(),
        }, is_best, args.savedir, extra_info_ckpt)

        writer.add_scalar('Classification/LR/learning_rate', lr_log, epoch)
        writer.add_scalar('Classification/Loss/Train', train_loss, epoch)
        writer.add_scalar('Classification/Loss/Val', val_loss, epoch)
        writer.add_scalar('Classification/{}/Train'.format(acc_metric), train_acc, epoch)
        writer.add_scalar('Classification/{}/Val'.format(acc_metric), val_acc, epoch)
        writer.add_scalar('Classification/Complexity/Top1_vs_flops', best_acc, round(flops, 2))
        writer.add_scalar('Classification/Complexity/Top1_vs_params', best_acc, round(num_params, 2))

    writer.close()

if __name__ == '__main__':
    from commons.general_details import classification_models, classification_datasets, \
        classification_exp_choices, classification_schedulers

    parser = argparse.ArgumentParser(description='Training efficient networks')
    # General settings
    parser.add_argument('--workers', default=4, type=int, help='number of data loading workers (default: 4)')  # 12
    parser.add_argument('--batch-size', default=64, type=int, help='mini-batch size (default: 512)')  # 512

    # Dataset related settings
    parser.add_argument('--data', default='./coco-image', help='path to dataset')
    parser.add_argument('--dataset', default='coco', help='Name of the dataset', choices=classification_datasets)

    # LR scheduler settings
    parser.add_argument('--epochs', default=10, type=int, help='number of total epochs to run')  # 300
    parser.add_argument('--start-epoch', default=0, type=int, help='manual epoch number (useful on restarts)')
    parser.add_argument('--clr-max', default=61, type=int, help='Max. epochs for CLR in Hybrid scheduler')
    parser.add_argument('--steps', default=[51, 101, 131, 161, 191, 221, 251, 281], type=int, nargs="+",
                        help='steps at which lr should be decreased. Only used for Cyclic and Fixed LR')
    parser.add_argument('--scheduler', default='clr', choices=classification_schedulers,  # cyclic learning rate
                        help='Learning rate scheduler')
    parser.add_argument('--lr', default=0.1, type=float, help='initial learning rate')
    parser.add_argument('--lr-decay', default=0.5, type=float, help='factor by which lr should be decreased')

    # optimizer settings
    parser.add_argument('--momentum', default=0.9, type=float, help='momentum')
    parser.add_argument('--weight-decay', default=4e-5, type=float, help='weight decay (default: 4e-5)')
    parser.add_argument('--resume', default='', type=str, help='path to latest checkpoint (default: none)')
    parser.add_argument('--savedir', type=str, default='results_classification', help='Location to save the results')

    # Model settings
    parser.add_argument('--s', default=1.0, type=float,
                        help='Factor by which output channels should be scaled (s > 1 for increasing the dims while < 1 for decreasing)')
    parser.add_argument('--inpSize', default=224, type=int, help='Input image size (default: 224 x 224)')
    parser.add_argument('--scale', default=[0.2, 1.0], type=float, nargs="+", help='Scale for data augmentation')
    parser.add_argument('--model', default='shuffle_vw', choices=classification_models,
                        help='Which model? basic= basic CNN model, res=resnet style)')
    parser.add_argument('--channels', default=3, type=int, help='Input channels')
    # DiceNet related settings
    parser.add_argument('--model-width', default=224, type=int, help='Model width')
    parser.add_argument('--model-height', default=224, type=int, help='Model height')

    ## Experiment related settings
    parser.add_argument('--exp-type', type=str, choices=classification_exp_choices, default='main',
                        help='Experiment type')
    parser.add_argument('--finetune', action='store_true', default=False, help='Finetune the model')

    args = parser.parse_args()

    assert len(args.scale) == 2
    args.scale = tuple(args.scale)

    random.seed(1882)
    torch.manual_seed(1882)

    timestr = time.strftime("%Y%m%d-%H%M%S")
    args.savedir = '{}_{}/model_{}_{}/aug_{}_{}/s_{}_inp_{}_sch_{}/{}/'.format(args.savedir, args.exp_type, args.model,
                                                                               args.dataset, args.scale[0],
                                                                               args.scale[1],
                                                                               args.s, args.inpSize, args.scheduler,
                                                                               timestr)

    # if you want to finetune ImageNet model on other dataset, say MS-COCO classification
    if args.finetune:
        print_info_message('Grabbing location of the ImageNet weights from the weight dictionary')
        from model.weight_locations.classification import model_weight_map

        weight_file_key = '{}_{}'.format(args.model, args.s)
        assert weight_file_key in model_weight_map.keys(), '{} does not exist'.format(weight_file_key)
        args.weights_ft = model_weight_map[weight_file_key]

    if args.dataset == 'imagenet':
        args.num_classes = 1000
    elif args.dataset == 'coco':
        from data_loader.classification.coco import COCO_CLASS_LIST
        args.num_classes = len(COCO_CLASS_LIST)

    main(args)
```

Dear Sir,
I changed the program to the 2014 version of the dataset, but after the first epoch the results were very bad. At the beginning of the second epoch, the precision and recall values were 0. How can I solve this problem?
If possible, could you provide a program for the 2014 version of the dataset?
Thank you very much.

[screenshot: training log on the COCO 2014 split]

@sacmehta
Owner

sacmehta commented Aug 6, 2019

You need to use a lower learning rate. Try 0.005.
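For example, the command from earlier in this thread with only the learning rate changed:

```
python train_classification.py --model dicenet \
    --scheduler hybrid --clr-max 61 --lr 0.005 \
    --data ./coco-image --dataset coco \
    --epochs 100 --batch-size 64 --s 0.2
```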

@mymuli
Author

mymuli commented Aug 6, 2019

@sacmehta
Dear Sir,

I used the dicenet_s_0.2_imagenet_224x224.pth file to fine-tune on MS-COCO with the following command:

```
python train_classification.py --model dicenet \
    --scheduler fixed --lr 0.005 \
    --data 1-mscoco-image --dataset coco \
    --epochs 4 --batch-size 64 --s 0.2 --finetune
```

The experimental process is as follows:
[screenshot: training output]

The results of each epoch are as follows:
| Epoch | P_C | R_C | F_C | P_O | R_O | F_O |
|-------|-------|------|------|-------|-------|-------|
| 1 | 7.34 | 1.55 | 1.50 | 57.24 | 18.15 | 27.56 |
| 2 | 14.77 | 2.39 | 3.39 | 71.41 | 12.82 | 21.74 |
| 3 | 21.93 | 4.68 | 6.16 | 66.84 | 17.82 | 28.14 |
| 4 | 25.78 | 4.75 | 6.29 | 69.06 | 18.25 | 28.87 |
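(For reference: metrics with the `_C` suffix are per-class averages and those with the `_O` suffix are overall values pooled across all classes. Below is a generic sketch of how such metrics are commonly computed for multi-label predictions; it is not the exact implementation in `utilities/train_eval_classification.py`.)

```python
import numpy as np

def multilabel_prf(preds, targets):
    """Per-class (C) and overall (O) precision/recall/F1 for 0/1 matrices
    of shape (num_samples, num_classes)."""
    preds = preds.astype(float)
    targets = targets.astype(float)
    tp = (preds * targets).sum(axis=0)   # true positives per class
    pred_pos = preds.sum(axis=0)         # predicted positives per class
    true_pos = targets.sum(axis=0)       # ground-truth positives per class

    # Per-class (macro): average precision/recall over classes, then F1
    p_c = np.mean(tp / np.maximum(pred_pos, 1))
    r_c = np.mean(tp / np.maximum(true_pos, 1))
    f_c = 2 * p_c * r_c / max(p_c + r_c, 1e-12)

    # Overall (micro): pool counts over all classes, then compute once
    p_o = tp.sum() / max(pred_pos.sum(), 1)
    r_o = tp.sum() / max(true_pos.sum(), 1)
    f_o = 2 * p_o * r_o / max(p_o + r_o, 1e-12)
    return p_c, r_c, f_c, p_o, r_o, f_o
```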

Although there is improvement in each epoch, the results are not very good.

#------------------------------------------------------------------------------------------

I ran another set of experiments with the clr scheduler:

```
python train_classification.py --model dicenet \
    --scheduler clr --lr 0.005 \
    --data 1-mscoco-image --dataset coco \
    --epochs 2 --batch-size 64 --s 0.2 --finetune
```

The experimental results are as follows:

[screenshots: training logs for the clr run]

In your training runs, do R_C and F_C also rise slowly from very small values (in my experiment, for example, 1.21/0.88)?

On my desktop (GTX 970, batch size 64), each epoch takes about 40 minutes, so 100 epochs would take roughly 4000 minutes, about 2.7 days, and I don't know whether the F_C value can exceed 71.08 (ELASTIC).

Did my experiment go wrong? Can you give me some advice?

Thank you very much for your patient reply!

@sacmehta
Owner

sacmehta commented Aug 6, 2019

With s=0.2 you cannot reach values close to the ELASTIC paper; you need to use the best DiCENet model.

Also, a batch size of 64 is too small. Try something like 512 for best results.

Since you are able to run the code, and the rest is hyper-parameter tuning for your machine setup, which is beyond the scope of this repo, I am closing this issue.
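For example (hardware permitting), the finetuning command with the suggested larger batch size; the `--s` value here is only illustrative, since a larger s is needed for the stronger DiCENet configurations, and ImageNet weights for that s must exist in the weight map when `--finetune` is used:

```
python train_classification.py --model dicenet \
    --scheduler hybrid --clr-max 61 --lr 0.005 \
    --data ./coco-image --dataset coco \
    --epochs 100 --batch-size 512 --s 1.0 --finetune
```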

sacmehta closed this as completed Aug 6, 2019