How to use pytorch multi gpu in detectron2? not training but inference #2473

Closed
leesangjoon1 opened this issue Jan 11, 2021 · 3 comments

@leesangjoon1

Instructions To Reproduce the Issue:

How do I run detectron2 inference on multiple GPUs with PyTorch?

For training I use multiple GPUs like this:
python train_net.py --num-gpus 4 --configs~~~ MODEL.WEIGHTS ~~
That works for training,

but it does not work for inference, so evaluating the model takes a very long time.

I would like to know how to run inference on multiple GPUs as well.

Thank you!

  1. Full runnable code or full changes you made:
import torch, torchvision
torch.__version__
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import itertools
import json
import os
import random

import cv2
import matplotlib.pyplot as plt
import numpy as np
from detectron2.structures import BoxMode

# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog
from detectron2.evaluation import COCOEvaluator, inference_on_dataset



from detectron2.data.datasets import register_coco_instances
register_coco_instances("data_train", {}, "/home/sangjoon/detectron2/sangjoon/white_train_real.json", "/home/sangjoon/detectron2/sangjoon/white_train")
register_coco_instances("data_val", {}, "/home/sangjoon/detectron2/sangjoon/white_val_real.json", "/home/sangjoon/detectron2/sangjoon/white_val")




# train
from detectron2.engine import DefaultTrainer, default_argument_parser, default_setup, hooks, launch
import detectron2.data.transforms as T
from detectron2.data import DatasetMapper   # the default mapper
from detectron2.data import build_detection_train_loader
cfg = get_cfg()
cfg.merge_from_file("/home/sangjoon/detectron2/configs/sangjoon.yaml")
cfg.DATASETS.TRAIN = ("data_train",)
cfg.DATASETS.TEST = ("data_val",)

# # # Size of the smallest side of the image during training
# cfg.INPUT.MIN_SIZE_TRAIN = (3000,)
# # # Sample size of smallest side by choice or random selection from range give by
# # # INPUT.MIN_SIZE_TRAIN
# # # Maximum size of the side of the image during training
# cfg.INPUT.MAX_SIZE_TRAIN = 4000
# # # Size of the smallest side of the image during testing. Set to zero to disable resize in testing.
# cfg.INPUT.MIN_SIZE_TEST = 3000
# # # Maximum size of the side of the image during testing
# cfg.INPUT.MAX_SIZE_TEST = 4000

# cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = "/home/sangjoon/detectron2/sangjoon/training_mosquito_real_train_r101_DC5/model_0010499.pth"  # load the trained checkpoint (a local file, not a model zoo entry)
# cfg.SOLVER.IMS_PER_BATCH = 1
# cfg.SOLVER.BASE_LR = 0.00025
# cfg.SOLVER.MAX_ITER = 10000    
# cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 4
# cfg.MODEL.ROI_HEADS.NUM_CLASSES = 8
# cfg.OUTPUT_DIR= "/home/ubuntu/detectron2/0908_uv_training"

# cfg.MODEL.WEIGHTS =os.path.join(cfg.OUTPUT_DIR, "model_0001149.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5   # set the testing threshold for this model


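# DefaultTrainer is used here only to build the model and load cfg.MODEL.WEIGHTS; training is skipped
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
# trainer.train()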

# inference

cfg.DATASETS.TEST = ("data_val", )
predictor = DefaultPredictor(cfg)
from detectron2.utils.visualizer import ColorMode

# DefaultPredictor expects a BGR image (as returned by cv2.imread); plt.imread would return RGB
im = cv2.imread("/home/sangjoon/detectron2/sangjoon/white_val/white_33.bmp")
outputs = predictor(im)
balloon_metadata = MetadataCatalog.get("data_train")
# Visualizer expects RGB, so reverse the channel order of the BGR image
v = Visualizer(im[:, :, ::-1],
               metadata=balloon_metadata,
               scale=0.8)

v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
plt.figure(figsize = (20, 16))
# get_image() returns RGB, which is what plt.imshow expects
plt.imshow(v.get_image())
# cv2.imwrite("/home/ubuntu/detectron2/image.jpg",v.get_image()[:, :, ::-1])

##evaluate

from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader
evaluator = COCOEvaluator("data_val", cfg, False, output_dir="/home/sangjoon/detectron2/sangjoon/training_mosquito_real_train_r101_DC5")
val_loader = build_detection_test_loader(cfg, "data_val")
inference_on_dataset(trainer.model, val_loader, evaluator)

# # another equivalent way is to use trainer.test
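
The evaluation step above runs in a single process, so it uses one GPU. Below is a minimal sketch of how the same evaluation might be spread over several GPUs with `detectron2.engine.launch`, which starts one process per GPU. The function names `evaluate_worker` and `run_multi_gpu_eval` are hypothetical, the keyword form of `COCOEvaluator` assumes a reasonably recent detectron2 release, and the datasets are assumed to be registered inside each worker because spawned processes do not inherit notebook state. It is meant to be run as a script rather than from a notebook cell.

```python
def evaluate_worker(config_file, weights, output_dir):
    """Runs in every GPU process started by launch(); hypothetical helper."""
    # Imports are kept inside the function so each spawned process loads them itself.
    from detectron2.checkpoint import DetectionCheckpointer
    from detectron2.config import get_cfg
    from detectron2.data import build_detection_test_loader
    from detectron2.evaluation import COCOEvaluator, inference_on_dataset
    from detectron2.modeling import build_model
    from detectron2.utils import comm

    # Assumption: register_coco_instances("data_val", ...) is called here,
    # with the same arguments as in the script above.

    cfg = get_cfg()
    cfg.merge_from_file(config_file)
    cfg.MODEL.WEIGHTS = weights
    cfg.DATASETS.TEST = ("data_val",)

    # Build the model directly; no trainer, optimizer, or training loader is needed.
    model = build_model(cfg)
    DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)

    # In a distributed run the test loader gives each process its own part of the dataset.
    loader = build_detection_test_loader(cfg, "data_val")
    # distributed=True lets the evaluator gather predictions from all processes.
    evaluator = COCOEvaluator("data_val", distributed=True, output_dir=output_dir)
    results = inference_on_dataset(model, loader, evaluator)

    if comm.is_main_process():
        print(results)
    return results


def run_multi_gpu_eval(num_gpus, config_file, weights, output_dir):
    """Start one evaluation process per GPU on a single machine; hypothetical helper."""
    from detectron2.engine import launch

    launch(
        evaluate_worker,
        num_gpus,
        num_machines=1,
        machine_rank=0,
        dist_url="auto",
        args=(config_file, weights, output_dir),
    )


# Example call (the paths are placeholders, not taken from the post):
# run_multi_gpu_eval(4, "path/to/config.yaml", "path/to/model.pth", "path/to/output")
```

Building the model with `build_model` instead of `DefaultTrainer` is a design choice of this sketch: it avoids constructing the training data loader and optimizer when only evaluation is wanted.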

@github-actions

You've chosen to report an unexpected problem or bug. Please include details about it by filling the issue template.
The following information is missing: "Your Environment";

@github-actions github-actions bot added the needs-more-info More info is needed to complete the issue label Jan 11, 2021
@ppwwyyxx
Contributor

@leesangjoon1
Author

https://detectron2.readthedocs.io/tutorials/getting_started.html#training-evaluation-in-command-line works for inference and supports multiple GPUs.

I tried it with a command like python detectron2.py --nums-gpu 4 --config-file ~~ MODEL.WEIGHTS ~~

but it doesn't work.
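
For reference, the linked documentation section describes evaluation through `train_net.py` with the `--eval-only` flag. The sketch below only assembles such a command so the flag spelling can be checked; the flags `--num-gpus`, `--config-file` and `--eval-only` are the ones defined by detectron2's `default_argument_parser`, while the helper name `build_eval_command`, the script location `tools/train_net.py` and the two paths are placeholders assumed for illustration. Note that the parser's flag is spelled `--num-gpus`, which differs from the spelling in the command quoted above.

```python
import shlex
import sys


def build_eval_command(config_file, weights, num_gpus, script="tools/train_net.py"):
    """Assemble an evaluation-only command line (hypothetical helper)."""
    return [
        sys.executable, script,
        "--num-gpus", str(num_gpus),      # one process per GPU
        "--config-file", config_file,
        "--eval-only",                    # skip training, run evaluation only
        "MODEL.WEIGHTS", weights,         # trailing KEY VALUE pairs override the config
    ]


# Placeholder paths; substitute the real config and checkpoint.
cmd = build_eval_command("path/to/config.yaml", "path/to/model.pth", num_gpus=4)
print(shlex.join(cmd[1:]))
# To actually run it: subprocess.run(cmd, check=True)
```

The command is built as a list rather than a single string so that paths containing spaces are passed through unchanged if it is later handed to `subprocess.run`.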

@github-actions github-actions bot removed the needs-more-info More info is needed to complete the issue label Jan 12, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 9, 2021