Guidelines on how to train the model on your own dataset. #14

IemProg · 2020-05-10T05:42:56Z

Could you please improve the documentation on how to use the library with a pre-trained model?

I would like to use it on my own dataset if possible.
Thanks

PkuRainBow · 2020-05-10T06:04:04Z

@IemProg Thanks for your advice. We will improve the documentation.

Do you mean the details on how to train the models on your own dataset?

IemProg · 2020-05-10T07:08:27Z

Yes, please, especially for datasets that are not in "Yaml" format. My dataset is in JPG format.

Thanks !

PkuRainBow · 2020-05-11T04:18:26Z

@IemProg In fact, the dataset does not need to be in "Yaml" format; JPG is totally OK.

Below are some rough, overall guidelines on how to train the model on your own dataset. We hope they help.

First, create a set of config files under the folder openseg.pytorch/configs/your_dataset_name, following the configs of the other datasets. Take the coco_stuff dataset as an example (see below):

openseg.pytorch/configs/coco_stuff/R_101_D_8.json

Lines 2 to 49 in db0d389

    "dataset": "coco_stuff",
    "method": "fcn_segmentor",
    "data": {
      "image_tool": "cv2",
      "input_mode": "BGR",
      "num_classes": 171,
      "label_list": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 17, 18, 19, 20,
                    21, 22, 23, 24, 25, 27, 28, 31, 32, 33, 34, 35, 36, 37, 38, 39,
                    40, 41, 42, 43, 44, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58,
                    59, 60, 61, 62, 63, 64, 65, 67, 70, 72, 73, 74, 75, 76, 77,
                    78, 79, 80, 81, 82, 84, 85, 86, 87, 88, 89, 90, 92, 93, 94, 95, 96,
                    97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112,
                    113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128,
                    129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144,
                    145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160,
                    161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176,
                    177, 178, 179, 180, 181, 182],
      "reduce_zero_label": true,
      "data_dir": "~/DataSet/pascal_context",
      "workers": 8
    },
    "train": {
      "batch_size": 16,
      "data_transformer": {
        "size_mode": "fix_size",
        "input_size": [520, 520],
        "align_method": "only_pad",
        "pad_mode": "random"
      }
    },
    "val": {
      "batch_size": 4,
      "mode": "ss_test",
      "data_transformer": {
        "size_mode": "diverse_size",
        "align_method": "only_pad",
        "pad_mode": "pad_right_down"
      }
    },
    "test": {
      "mode": "ss_test",
      "batch_size": 4,
      "crop_size": [520, 520],
      "scale_search": [0.5, 0.75, 1, 1.25, 1.5, 1.75, 2],
      "data_transformer": {
        "size_mode": "diverse_size"
      }
    },

You need to change a set of keys in the JSON file, including "dataset", "num_classes", "label_list", "reduce_zero_label", "input_size", "crop_size", "base_lr", and so on. You can also set these parameters in the training script (shown below):

openseg.pytorch/scripts/coco_stuff/run_h_48_d_4_ocr_train.sh

Lines 31 to 48 in db0d389

    if [ "$1"x == "train"x ]; then
      ${PYTHON} -u main.py --configs ${CONFIGS} \
                           --drop_last y \
                           --nbb_mult 10 \
                           --phase train \
                           --gathered n \
                           --loss_balance y \
                           --log_to_file n \
                           --backbone ${BACKBONE} \
                           --model_name ${MODEL_NAME} \
                           --gpu 0 1 2 3 \
                           --data_dir ${DATA_DIR} \
                           --loss_type ${LOSS_TYPE} \
                           --max_iters ${MAX_ITERS} \
                           --checkpoints_name ${CHECKPOINTS_NAME} \
                           --pretrained ${PRETRAINED_MODEL} \
                           2>&1 | tee ${LOG_FILE}
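
As an illustration of the config changes described above, here is a minimal Python sketch that derives a config for a new dataset by overriding the dataset-specific keys. It is not part of openseg.pytorch: `make_dataset_config` is a hypothetical helper, `BASE_CONFIG` is a trimmed stand-in for the real JSON file, and all values passed in are placeholders. "base_lr" does not appear in the excerpt above, so it is left out here.

```python
import copy
import json

# Trimmed stand-in for a base config such as configs/coco_stuff/R_101_D_8.json;
# in practice you would json.load() the real file instead.
BASE_CONFIG = {
    "dataset": "coco_stuff",
    "data": {
        "num_classes": 171,
        "label_list": [0, 1, 2],  # shortened for the example
        "reduce_zero_label": True,
        "data_dir": "~/DataSet/pascal_context",
    },
    "train": {"data_transformer": {"input_size": [520, 520]}},
    "test": {"crop_size": [520, 520]},
}


def make_dataset_config(base, dataset, num_classes, label_list,
                        reduce_zero_label, data_dir, size):
    """Return a copy of `base` with the dataset-specific keys replaced."""
    cfg = copy.deepcopy(base)  # leave the base config untouched
    cfg["dataset"] = dataset
    cfg["data"]["num_classes"] = num_classes
    cfg["data"]["label_list"] = list(label_list)
    cfg["data"]["reduce_zero_label"] = reduce_zero_label
    cfg["data"]["data_dir"] = data_dir
    cfg["train"]["data_transformer"]["input_size"] = list(size)
    cfg["test"]["crop_size"] = list(size)
    return cfg


# Placeholder values for a hypothetical 3-class dataset.
my_config = make_dataset_config(
    BASE_CONFIG,
    dataset="your_dataset_name",
    num_classes=3,
    label_list=[0, 1, 2],
    reduce_zero_label=False,
    data_dir="~/DataSet/your_dataset_name",
    size=(512, 512),
)
print(json.dumps(my_config, indent=2))
```

The result could then be written with `json.dump` to a file under openseg.pytorch/configs/your_dataset_name. The right values for "label_list" and "reduce_zero_label" depend on how your label maps are encoded.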

Second, organize your training/validation dataset in a folder structure like the one below:

├── your_dataset_name
│   ├── train
│   │   ├── image
│   │   └── label
│   ├── val
│   │   ├── image
│   │   └── label
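
If your images and label maps currently sit in two flat folders, a small script can produce the layout above. The following is only a sketch under stated assumptions: `build_dataset_tree` is a hypothetical helper, not a repository function. It assumes every image `name.jpg` has a label map `name.png` with the same stem, and it uses a random 80/20 train/val split. Adjust the suffixes and the split to your data.

```python
import random
import shutil
from pathlib import Path


def build_dataset_tree(image_dir, label_dir, out_root, val_fraction=0.2, seed=0):
    """Copy image/label pairs into <out_root>/{train,val}/{image,label}.

    Assumes (hypothetically) that every image `name.jpg` has a label map
    `name.png` with the same stem.
    """
    image_dir, label_dir, out_root = Path(image_dir), Path(label_dir), Path(out_root)
    images = sorted(image_dir.glob("*.jpg"))
    pairs = [(img, label_dir / (img.stem + ".png")) for img in images]
    missing = [img.name for img, lab in pairs if not lab.exists()]
    if missing:
        raise FileNotFoundError(f"no label found for: {missing}")

    random.Random(seed).shuffle(pairs)  # reproducible split
    n_val = int(len(pairs) * val_fraction)
    splits = {"val": pairs[:n_val], "train": pairs[n_val:]}

    for split, items in splits.items():
        for sub in ("image", "label"):
            (out_root / split / sub).mkdir(parents=True, exist_ok=True)
        for img, lab in items:
            shutil.copy2(img, out_root / split / "image" / img.name)
            shutil.copy2(lab, out_root / split / "label" / lab.name)
    # Number of pairs placed in each split
    return {split: len(items) for split, items in splits.items()}
```

Calling `build_dataset_tree("raw/images", "raw/labels", "your_dataset_name")` (paths are placeholders) copies rather than moves the files, so the original folders stay intact.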

Third, prepare a training script following the example below, and change DATA_DIR, SAVE_DIR, CONFIGS, and all of the other settings accordingly.

https://github.com/openseg-group/openseg.pytorch/blob/db0d3894673015e9350881db2d02175b0a263368/scripts/coco_stuff/run_h_48_d_4_ocr_train.sh
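
To make it explicit which settings the training call depends on, the sketch below assembles the same `main.py` invocation as the script excerpt earlier in this comment, as a Python argument list. The flags are copied from that excerpt. `build_train_command` is a hypothetical helper, and every value passed to it is a placeholder rather than a setting taken from the repository.

```python
import shlex


def build_train_command(configs, data_dir, backbone, model_name, loss_type,
                        max_iters, checkpoints_name, pretrained,
                        gpus=(0, 1, 2, 3), python="python"):
    """Assemble the `main.py` training call with the flags from the example script."""
    return [
        python, "-u", "main.py",
        "--configs", configs,
        "--drop_last", "y",
        "--nbb_mult", "10",
        "--phase", "train",
        "--gathered", "n",
        "--loss_balance", "y",
        "--log_to_file", "n",
        "--backbone", backbone,
        "--model_name", model_name,
        "--gpu", *[str(g) for g in gpus],
        "--data_dir", data_dir,
        "--loss_type", loss_type,
        "--max_iters", str(max_iters),
        "--checkpoints_name", checkpoints_name,
        "--pretrained", pretrained,
    ]


# Every value below is a placeholder; take the real ones from the example script.
cmd = build_train_command(
    configs="configs/your_dataset_name/YOUR_CONFIG.json",
    data_dir="/path/to/your_dataset_name",
    backbone="BACKBONE_NAME",
    model_name="MODEL_NAME",
    loss_type="LOSS_TYPE",
    max_iters=40000,
    checkpoints_name="your_dataset_name_run1",
    pretrained="/path/to/pretrained_model.pth",
)
print(shlex.join(cmd))  # shell-quoted command line (Python 3.8+)
```

The printed line can be compared against your edited copy of the shell script, or passed to `subprocess.run(cmd)` from the repository root.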

jhyin12 · 2022-10-09T02:24:39Z

It seems this is not suitable for training segfix on my own dataset

PkuRainBow added the good first issue Good for newcomers label May 11, 2020

PkuRainBow pinned this issue May 11, 2020

PkuRainBow changed the title ~~Documentation~~ Guidelines on how to train the model your own dataset. May 11, 2020

IemProg closed this as completed May 14, 2020
