
Adding more data augmentation methods to the config causes training errors and training hangs #110

Closed
shuxsu opened this issue Dec 12, 2019 · 4 comments

Comments


shuxsu commented Dec 12, 2019

Training a Mask R-CNN model (ResNet50 + FPN) hangs and makes no progress. Console output:

Done (t=0.07s)
creating index...
index created!
{1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6}
{'sky': 1, 'building': 2, 'terrain': 3, 'person': 4, 'vegetation': 5, 'car': 6}
2019-12-12 11:44:38,129-INFO: 139 samples in file /home/aistudio/data/data17467/cococo/annotations/instance_train.json
2019-12-12 11:44:38,131-INFO: places would be ommited when DataLoader is not iterable
I1212 11:44:38.156224 4327 parallel_executor.cc:421] The number of CUDAPlace, which is used in ParallelExecutor, is 1. And the Program will be copied 1 copies
I1212 11:44:38.196641 4327 graph_pattern_detector.cc:96] --- detected 28 subgraphs
I1212 11:44:38.215721 4327 graph_pattern_detector.cc:96] --- detected 25 subgraphs
I1212 11:44:38.257105 4327 build_strategy.cc:363] SeqOnlyAllReduceOps:0, num_trainers:1
I1212 11:44:38.307612 4327 parallel_executor.cc:285] Inplace strategy is enabled, when build_strategy.enable_inplace = True
I1212 11:44:38.331254 4327 parallel_executor.cc:368] Garbage collection strategy is enabled, when FLAGS_eager_delete_tensor_gb = 0

Author

shuxsu commented Dec 12, 2019

The data augmentation part of the config is as follows:

sample_transforms:
- !DecodeImage
  to_rgb: true
  with_mixup: false
- !RandomFlipImage
  is_mask_flip: true
  is_normalized: false
  prob: 0.5 # default: 0.5
- !NormalizeImage
  is_channel_first: false
  is_scale: true
  mean:
  - 0.485
  - 0.456
  - 0.406
  std:
  - 0.229
  - 0.224
  - 0.225
- !CropImage
  avoid_no_bbox: false
  batch_sampler:
  - [1, 1, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
  - [1, 50, 0.3, 1.0, 0.5, 2.0, 0.1, 0.0]
  - [1, 50, 0.3, 1.0, 0.5, 2.0, 0.3, 0.0]
  - [1, 50, 0.3, 1.0, 0.5, 2.0, 0.5, 0.0]
  - [1, 50, 0.3, 1.0, 0.5, 2.0, 0.7, 0.0]
  - [1, 50, 0.3, 1.0, 0.5, 2.0, 0.9, 0.0]
  - [1, 50, 0.3, 1.0, 0.5, 2.0, 0.0, 1.0]
  satisfy_all: false
- !ResizeImage
  interp: 1
  max_size: 1333
  target_size: 800
  use_cv2: true
- !Permute
  channel_first: true
  to_bgr: false # default: true
batch_transforms:
- !PadBatch
  pad_to_stride: 32
num_workers: 2
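
For readers trying to follow the `batch_sampler` rows under `!CropImage`, here is a rough sketch that gives each of the eight numbers a name. The field order is an assumption based on the SSD-style sampler convention (max sample, max trial, min/max scale, min/max aspect ratio, min/max overlap) and should be checked against the `CropImage` docstring in the installed PaddleDetection version. `CropSampler` and `parse_samplers` are hypothetical helper names, not part of any library.

```python
from typing import List, NamedTuple, Sequence


class CropSampler(NamedTuple):
    """One batch_sampler row; field order is an ASSUMPTION (SSD-style convention)."""
    max_sample: int
    max_trial: int
    min_scale: float
    max_scale: float
    min_aspect_ratio: float
    max_aspect_ratio: float
    min_overlap: float
    max_overlap: float


def parse_samplers(rows: Sequence[Sequence[float]]) -> List[CropSampler]:
    """Turn raw 8-element rows into named records (hypothetical helper)."""
    samplers = []
    for row in rows:
        if len(row) != 8:
            raise ValueError(f"expected 8 values per sampler, got {len(row)}: {row}")
        samplers.append(CropSampler(*row))
    return samplers


if __name__ == "__main__":
    # Two rows copied from the config in this issue.
    rows = [
        [1, 1, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
        [1, 50, 0.3, 1.0, 0.5, 2.0, 0.1, 0.0],
    ]
    for sampler in parse_samplers(rows):
        print(sampler)
```

Under this reading, most rows allow up to 50 crop attempts per image, so the crop step is a natural place to look when a single sample appears to take unusually long.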

Collaborator

FDInSky commented Dec 12, 2019

Not even a single iteration ran through? You can add print statements inside the reader to see where it goes wrong.
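
The print-based debugging suggested above could look roughly like the following sketch. It does not use PaddleDetection's reader API. `traced` and `run_pipeline` are hypothetical helpers that wrap plain Python callables standing in for the transforms, and print when each one starts and finishes. Output is flushed immediately so the last line is still visible if the process hangs.

```python
import time
from typing import Any, Callable, List, Tuple


def traced(name: str, fn: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Wrap one transform so it reports entry, exit, and elapsed time."""
    def wrapper(sample: Any) -> Any:
        # flush=True matters: buffered output can be lost when a process hangs.
        print(f"[reader] -> {name}", flush=True)
        start = time.perf_counter()
        out = fn(sample)
        elapsed = time.perf_counter() - start
        print(f"[reader] <- {name} ({elapsed:.3f}s)", flush=True)
        return out
    return wrapper


def run_pipeline(sample: Any, transforms: List[Tuple[str, Callable[[Any], Any]]]) -> Any:
    """Apply (name, transform) pairs in order, tracing each step."""
    for name, fn in transforms:
        sample = traced(name, fn)(sample)
    return sample


if __name__ == "__main__":
    # Stand-in transforms; in practice these would be the real reader ops.
    steps = [
        ("double", lambda x: x * 2),
        ("increment", lambda x: x + 1),
    ]
    print(run_pipeline(1, steps))
```

If the last line printed is an entry marker (`->`) with no matching exit marker, that step is the one that never returned.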

Author

shuxsu commented Dec 15, 2019

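> Not even a single iteration ran through? You can add print statements inside the reader to see where it goes wrong.

It runs fine locally, but it does not work on AI Studio.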

Author

shuxsu commented Dec 22, 2019

Resolved after investigating it myself.

shuxsu closed this as completed Dec 22, 2019