
error during retraining of the custom dataset (config file) #76

Closed
caixiaoniweimar opened this issue Jan 18, 2022 · 1 comment
caixiaoniweimar commented Jan 18, 2022

Hello, thanks for sharing the code. I'm currently trying to retrain the model on a custom dataset, but I got stuck on the error shown below. Could you please help me figure out what the problem is? It seems that something goes wrong when parsing the config file.

python trainer/train.py \ --config_file=/home/caixiaoni/Desktop/project/deeplab2/configs/metal_part/panoptic_deeplab/resnet50_os32_semseg.textproto \ --mode=eval \ --model_dir=/home/caixiaoni/Desktop/project/metal_part_retrain_1 \ --num_gpus=0

2022-01-18 23:36:30.240123: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
I0118 23:36:31.325175 139693994784576 train.py:65] Reading the config file.
Traceback (most recent call last):
  File "trainer/train.py", line 76, in <module>
    app.run(main)
  File "/home/caixiaoni/anaconda3/lib/python3.8/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/caixiaoni/anaconda3/lib/python3.8/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "trainer/train.py", line 67, in main
    config = text_format.ParseLines(proto_file, config_pb2.ExperimentOptions())
  File "/home/caixiaoni/anaconda3/lib/python3.8/site-packages/google/protobuf/text_format.py", line 759, in ParseLines
    return parser.ParseLines(lines, message)
  File "/home/caixiaoni/anaconda3/lib/python3.8/site-packages/google/protobuf/text_format.py", line 812, in ParseLines
    self._ParseOrMerge(lines, message)
  File "/home/caixiaoni/anaconda3/lib/python3.8/site-packages/google/protobuf/text_format.py", line 835, in _ParseOrMerge
    tokenizer = Tokenizer(str_lines)
  File "/home/caixiaoni/anaconda3/lib/python3.8/site-packages/google/protobuf/text_format.py", line 1255, in __init__
    self._SkipWhitespace()
  File "/home/caixiaoni/anaconda3/lib/python3.8/site-packages/google/protobuf/text_format.py", line 1283, in _SkipWhitespace
    self._PopLine()
  File "/home/caixiaoni/anaconda3/lib/python3.8/site-packages/google/protobuf/text_format.py", line 1272, in _PopLine
    self._current_line = next(self._lines)
  File "/home/caixiaoni/anaconda3/lib/python3.8/site-packages/google/protobuf/text_format.py", line 832, in <genexpr>
    str_lines = (
  File "/home/caixiaoni/anaconda3/lib/python3.8/site-packages/tensorflow/python/lib/io/file_io.py", line 206, in __next__
    retval = self.readline()
  File "/home/caixiaoni/anaconda3/lib/python3.8/site-packages/tensorflow/python/lib/io/file_io.py", line 170, in readline
    self._preread_check()
  File "/home/caixiaoni/anaconda3/lib/python3.8/site-packages/tensorflow/python/lib/io/file_io.py", line 79, in _preread_check
    self._read_buf = _pywrap_file_io.BufferedInputStream(
TypeError: __init__(): incompatible constructor arguments. The following argument types are supported:
    1. tensorflow.python.lib.io._pywrap_file_io.BufferedInputStream(filename: str, buffer_size: int, token: tensorflow.python.lib.io._pywrap_file_io.TransactionToken = None)

Invoked with: None, 524288
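The last traceback line is the key: `Invoked with: None, 524288` means the filename handed to TensorFlow's `BufferedInputStream` was `None`, i.e. the `--config_file` flag was never actually parsed and kept its default. A minimal stdlib sketch of that mechanism (`argparse` stands in for absl flags here; the token with a leading space is an assumed reproduction of what a mid-line `\ ` in the shell produces):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--config_file", default=None)

# A clean token is parsed as expected.
ok, _ = parser.parse_known_args(["--config_file=cfg.textproto"])

# A token with a leading space does not start with "-", so it is not
# recognized as a flag and --config_file silently stays None.
bad, extras = parser.parse_known_args([" --config_file=cfg.textproto"])

print(ok.config_file)   # cfg.textproto
print(bad.config_file)  # None
print(extras)           # [' --config_file=cfg.textproto']
```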

The config file looks like:

experiment_name: "metal_part_retrain_1"
model_options {
  # Update the path to the initial checkpoint (e.g., ImageNet
  # pretrained checkpoint).
  initial_checkpoint: "/home/caixiaoni/Desktop/project/resnet50_imagenet1k/ckpt-100"
  backbone {
    name: "resnet50"
    output_stride: 32
  }
  decoder {
    feature_key: "res5"
    decoder_channels: 256
    aspp_channels: 256
    atrous_rates: 3
    atrous_rates: 6
    atrous_rates: 9
  }
  panoptic_deeplab {
    low_level {
      feature_key: "res3"
      channels_project: 64
    }
    low_level {
      feature_key: "res2"
      channels_project: 32
    }
    instance {
      enable: false
    }
    semantic_head {
      output_channels: 19
      head_channels: 256
    }
  }
}
trainer_options {
  save_checkpoints_steps: 1000
  save_summaries_steps: 100
  steps_per_loop: 100
  loss_options {
    semantic_loss {
      name: "softmax_cross_entropy"
      weight: 1.0
      top_k_percent: 0.2
    }
  }
  solver_options {
    base_learning_rate: 0.0005
    training_number_of_steps: 60000
  }
}
train_dataset_options {
  dataset: "metal_part"
  # Update the path to training set.
  file_pattern: "/home/caixiaoni/Desktop/project/part-TFRecord/train*.tfrecord"
  # Adjust the batch_size accordingly to better fit your GPU/TPU memory.
  # Also see Q1 in g3doc/faq.md.
  batch_size: 8
  crop_size: 513
  crop_size: 513
  # Skip resizing.
  min_resize_value: 0
  max_resize_value: 0
  augmentations {
    min_scale_factor: 0.5
    max_scale_factor: 2.0
    scale_factor_step_size: 0.1
  }
}
eval_dataset_options {
  dataset: "metal_part"
  # Update the path to validation set.
  file_pattern: "/home/caixiaoni/Desktop/project/part-TFRecord/val*.tfrecord"
  batch_size: 1
  crop_size: 513
  crop_size: 513
  # Skip resizing.
  min_resize_value: 0
  max_resize_value: 0
}
evaluator_options {
  continuous_eval_timeout: -1
  save_predictions: true
  save_raw_predictions: false
}

In dataset.py, the dataset is registered as follows:

_METAL_PART = 'metal_part'

METAL_PART_INFORMATION = DatasetDescriptor(
    dataset_name=_METAL_PART,
    splits_to_sizes={'train': 1200, 'val': 300, 'test': 255},
    num_classes=2,  # Only two classes: background + metal_part.
    ignore_label=255,
    # Should be larger than the max number of instances that could appear
    # per image in the dataset.
    panoptic_label_divisor=20,
    # Specifies which classes are "thing" classes (i.e., countable objects
    # such as people or cars).
    class_has_instances_list=(0,),
    colormap=_COCO_COLORMAP,
    is_video_dataset=False,
    is_depth_dataset=False,
    ignore_depth=None,
)

MAP_NAME_TO_DATASET_INFO = {
    _CITYSCAPES_PANOPTIC: CITYSCAPES_PANOPTIC_INFORMATION,
    _KITTI_STEP: KITTI_STEP_INFORMATION,
    _MOTCHALLENGE_STEP: MOTCHALLENGE_STEP_INFORMATION,
    _CITYSCAPES_DVPS: CITYSCAPES_DVPS_INFORMATION,
    _COCO_PANOPTIC: COCO_PANOPTIC_INFORMATION,
    _SEMKITTI_DVPS: SEMKITTI_DVPS_INFORMATION,
    _METAL_PART: METAL_PART_INFORMATION,
}
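As an aside on the `panoptic_label_divisor` comment: in deeplab2-style datasets the panoptic label packs the semantic class and the per-image instance id into one integer, which is why the divisor must exceed the maximum instance count per image. A small sketch of that encoding (the helper names are hypothetical; the divisor value is taken from the descriptor above):

```python
PANOPTIC_LABEL_DIVISOR = 20  # from METAL_PART_INFORMATION above

def encode_panoptic(semantic_id: int, instance_id: int) -> int:
    # The instance id must fit below the divisor, hence the requirement
    # that the divisor exceed the max instances per image.
    assert 0 <= instance_id < PANOPTIC_LABEL_DIVISOR
    return semantic_id * PANOPTIC_LABEL_DIVISOR + instance_id

def decode_panoptic(panoptic_label: int) -> tuple[int, int]:
    # Inverse of the encoding: quotient is the semantic id, remainder
    # is the instance id.
    return divmod(panoptic_label, PANOPTIC_LABEL_DIVISOR)

print(encode_panoptic(1, 3))  # 23
print(decode_panoptic(23))    # (1, 3)
```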

I would highly appreciate any hints! Thanks in advance!

caixiaoniweimar changed the title from "error during the" to "error during retraining of the custom dataset (config file)" on Jan 19, 2022

caixiaoniweimar (Author) commented:
Already solved. The error was in the command line used to run train.py (a formatting issue).
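For what it's worth, the formatting issue is likely the backslash placement: in the command quoted above, each `\` is followed by more text on the same line, and in bash a mid-line `\ ` escapes the space, gluing it onto the next token, so absl never sees a valid `--config_file` flag (matching the `Invoked with: None` in the traceback). A minimal bash sketch of the difference, using a hypothetical `show_args` helper in place of the Python entry point:

```shell
# show_args prints each argv entry it receives, bracketed.
show_args() { printf '[%s]\n' "$@"; }

# Correct: the backslash is the LAST character on the line, continuing
# the command onto the next line.
show_args --config_file=resnet50_os32_semseg.textproto \
          --mode=eval
# prints [--config_file=resnet50_os32_semseg.textproto] then [--mode=eval]

# Broken: "\ " mid-line escapes the space, so the next token arrives as
# a single argument with a leading space and is not recognized as a flag.
show_args \ --config_file=resnet50_os32_semseg.textproto
# prints [ --config_file=resnet50_os32_semseg.textproto]
```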
