
Homura

Documentation

homura is a library for prototyping DL research.

🔥🔥🔥🔥 homura (焰) is flame or blaze in Japanese. 🔥🔥🔥🔥

Requirements

minimal requirements

Python>=3.7
PyTorch>=1.1.0
torchvision>=0.3.0
tqdm # automatically installed
tensorboard # automatically installed

optional

miniargs
colorlog

To enable distributed training with automatic mixed precision (AMP), install apex.

git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cuda_ext --cpp_ext

test

pytest .

install

pip install git+https://github.com/moskomule/homura

or

git clone https://github.com/moskomule/homura
cd homura; pip install -e .

APIs

basics

  • Device Agnostic
  • Useful features
from homura import optim, lr_scheduler
from homura import trainers, callbacks, reporters
from torchvision.models import resnet50
from torch.nn import functional as F

# model will be registered in the trainer
resnet = resnet50()
# optimizer and scheduler will be registered in the trainer, too
optimizer = optim.SGD(lr=0.1, momentum=0.9)
scheduler = lr_scheduler.MultiStepLR(milestones=[30,80], gamma=0.1)

# list of callbacks or reporters can be registered in the trainer
with reporters.TensorboardReporter([callbacks.AccuracyCallback(), 
                                    callbacks.LossCallback()]) as reporter:
    trainer = trainers.SupervisedTrainer(resnet, optimizer, loss_f=F.cross_entropy, 
                                         callbacks=reporter, scheduler=scheduler)
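
With the trainer set up, a typical loop simply alternates training and evaluation epochs. The snippet below is a minimal sketch, assuming the trainer exposes train and test methods that take data loaders (train_loader and test_loader are ordinary DataLoaders and are not defined here):

# minimal sketch: trainer.train / trainer.test are assumed to run one epoch each
epochs = 90  # arbitrary number of epochs for illustration
for _ in range(epochs):
    trainer.train(train_loader)
    trainer.test(test_loader)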

The trainer's iteration can be overridden as follows:

from typing import Mapping, Tuple

import torch

from homura import trainers
from homura.utils.containers import Map

def iteration(trainer: trainers.SupervisedTrainer,
              data: Tuple[torch.Tensor, torch.Tensor]) -> Mapping[str, torch.Tensor]:
    input, target = data
    output = trainer.model(input)
    loss = trainer.loss_f(output, target)
    results = Map(loss=loss, output=output)
    if trainer.is_train:
        trainer.optimizer.zero_grad()
        loss.backward()
        trainer.optimizer.step()
    # values registered here (user_value is a placeholder for any extra value
    # you want to expose) can be accessed in callbacks
    results.user_value = user_value
    return results

trainers.SupervisedTrainer.iteration = iteration
# or
trainer.update_iteration(iteration)
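
To consume a registered value such as results.user_value, a callback can read it from the results it receives. The sketch below only illustrates the idea and is not the documented API: the base class callbacks.Callback and an after_iteration hook receiving the results mapping are assumptions, so check homura.callbacks for the real interface.

from homura import callbacks

class UserValueLogger(callbacks.Callback):
    # assumption: callbacks receive the iteration results through an after_iteration hook
    def after_iteration(self, data):
        user_value = data.get("user_value")
        if user_value is not None:
            print(f"user_value: {user_value}")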

Dicts of models, optimizers, and loss functions are also supported (a usage sketch follows the snippet below).

trainer = CustomTrainer({"generator": generator, "discriminator": discriminator},
                        {"generator": gen_opt, "discriminator": dis_opt},
                        {"reconstruction": recon_loss, "generator": gen_loss},
                        **kwargs)
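
Inside a custom iteration, the registered components can then be looked up by the same keys. This is an illustrative sketch only; accessing the dicts as trainer.model[...], trainer.optimizer[...], and trainer.loss_f[...] is an assumption, as is the simplified GAN-style flow.

from homura.utils.containers import Map

def gan_generator_iteration(trainer, data):
    # assumption: dict-registered models, optimizers, and losses are indexable by key
    input, _ = data
    fake = trainer.model["generator"](input)
    d_out = trainer.model["discriminator"](fake)
    g_loss = trainer.loss_f["generator"](d_out)
    if trainer.is_train:
        trainer.optimizer["generator"].zero_grad()
        g_loss.backward()
        trainer.optimizer["generator"].step()
    return Map(loss=g_loss, output=fake)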

reproductivity

from homura.reproductivity import set_deterministic
with set_deterministic(seed):
    something()
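
For example, two runs under the same seed should give identical results. The snippet below is a minimal sketch, assuming set_deterministic seeds the global RNGs (including PyTorch's) for the duration of the with block:

import torch
from torch import nn

from homura.reproductivity import set_deterministic

def run(seed: int) -> torch.Tensor:
    # build a small model and run one forward pass under a fixed seed
    with set_deterministic(seed):
        model = nn.Linear(10, 2)
        x = torch.randn(4, 10)
        return model(x)

# under the seeding assumption above, the two runs match exactly
assert torch.equal(run(0), run(0))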

debugger

>>> import torch
>>> from torch import nn
>>> from homura import debug  # assuming the module shown in the log below, homura.debug
>>> debug.module_debugger(nn.Sequential(nn.Linear(10, 5),
                                        nn.Linear(5, 1)),
                          torch.randn(4, 10))
[homura.debug|2019-02-25 17:57:06|DEBUG] Start forward calculation
[homura.debug|2019-02-25 17:57:06|DEBUG] forward> name=Sequential(1)
[homura.debug|2019-02-25 17:57:06|DEBUG] forward>   name=Linear(2)
[homura.debug|2019-02-25 17:57:06|DEBUG] forward>   name=Linear(3)
[homura.debug|2019-02-25 17:57:06|DEBUG] Start backward calculation
[homura.debug|2019-02-25 17:57:06|DEBUG] backward>   name=Linear(3)
[homura.debug|2019-02-25 17:57:06|DEBUG] backward> name=Sequential(1)
[homura.debug|2019-02-25 17:57:06|DEBUG] backward>   name=Linear(2)
[homura.debug|2019-02-25 17:57:06|INFO] Finish debugging mode

Examples

See examples.

  • cifar10.py: trains ResNet-20 or WideResNet-28-10 with random crop on CIFAR-10
  • imagenet.py: trains a CNN on ImageNet with multiple GPUs (single- and multi-process)
  • gap.py: a better implementation of generative adversarial perturbation

For imagenet.py, if you want

  • single node, single GPU
  • single node, multiple GPUs

run python imagenet.py /path/to/imagenet/root.

If you want

  • single node, multiple GPUs with multiple processes

run python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS imagenet.py /path/to/imagenet/root --distributed.

If you want

  • multiple nodes, multiple GPUs with multiple processes,

run

  • python -m torch.distributed.launch --nnodes=$NUM_NODES --node_rank=0 --master_addr=$MASTER_IP --master_port=$MASTER_PORT --nproc_per_node=$NUM_GPUS imagenet.py /path/to/imagenet/root --distributed on the master node
  • python -m torch.distributed.launch --nnodes=$NUM_NODES --node_rank=$RANK --master_addr=$MASTER_IP --master_port=$MASTER_PORT --nproc_per_node=$NUM_GPUS imagenet.py /path/to/imagenet/root --distributed on the other nodes

Here, 0 < $RANK < $NUM_NODES.
