Training YOWO on a customized dataset #98

Open · tanthinhdt opened this issue Apr 11, 2024 · 2 comments

@tanthinhdt
Hi, I have a dataset for action recognition. I organized it following the UCF dataset's format and tried training YOWO on it with the UCF settings, but I keep getting the error below. Can anyone help me?

The error

[screenshot of the error message — not transcribed]

Configurations

TRAIN:
  DATASET: vsl
  BATCH_SIZE: 1
  TOTAL_BATCH_SIZE: 128
  LEARNING_RATE: 1e-4
  EVALUATE: False
  FINE_TUNE: False
  BEGIN_EPOCH: 1
  END_EPOCH: 10
SOLVER:
  MOMENTUM: 0.9
  WEIGHT_DECAY: 5e-4
  STEPS: [2, 3, 4, 5]
  LR_DECAY_RATE: 0.5
  ANCHORS:
    [
      0.70458,
      1.18803,
      1.26654,
      2.55121,
      1.59382,
      4.08321,
      2.30548,
      4.94180,
      3.52332,
      5.91979,
    ]
  NUM_ANCHORS: 5
  OBJECT_SCALE: 5
  NOOBJECT_SCALE: 1
  CLASS_SCALE: 1
  COORD_SCALE: 1
DATA:
  NUM_FRAMES: 16
  SAMPLING_RATE: 1
  TRAIN_JITTER_SCALES: [256, 320]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 224
  MEAN: [0.4345, 0.4051, 0.3775]
  STD: [0.2768, 0.2713, 0.2737]
MODEL:
  NUM_CLASSES: 98
  BACKBONE_3D: resnext101
  BACKBONE_2D: darknet
WEIGHTS:
  BACKBONE_3D: "weights/resnext-101-kinetics.pth"
  BACKBONE_2D: "weights/yolo.weights"
  FREEZE_BACKBONE_3D: False
  FREEZE_BACKBONE_2D: False
BACKUP_DIR: "backup/vsl"
RNG_SEED: 1
LISTDATA:
  BASE_PTH: "data/vsl/yowo_vsl"
  TRAIN_FILE: "data/vsl/yowo_vsl/trainlist.txt"
  TEST_FILE: "data/vsl/yowo_vsl/testlist.txt"
  TEST_VIDEO_FILE: "data/vsl/yowo_vsl/testlist.txt"
  MAX_OBJS: 1
  CLASS_NAMES: [
    "Con chó",
    "Con mèo",
    "Con gà",
    "Con vịt",
    "Con rùa",
    "Con thỏ",
    "Con trâu",
    "Con bò",
    "Con dê",
    "Con heo",
    "Màu đen",
    "Màu trắng",
    "Màu đỏ",
    "Màu cam",
    "Màu vàng",
    "Màu hồng",
    "Màu tím",
    "Màu nâu",
    "Quả dâu",
    "Quả mận",
    "Quả dứa",
    "Quả đào",
    "Quả đu đủ",
    "Quả cam",
    "Quả bơ",
    "Quả chuối",
    "Quả xoài",
    "Quả dừa",
    "Bố",
    "Mẹ",
    "Con trai",
    "Con gái",
    "Vợ",
    "Chồng",
    "Ông nội",
    "Bà nội",
    "Ông ngoại",
    "Bà ngoại",
    "Ăn",
    "Uống",
    "Xem",
    "Thèm",
    "Mách",
    "Khóc",
    "Cười",
    "Học",
    "Dỗi",
    "Chết",
    "Đi",
    "Chạy",
    "Bận",
    "Hát",
    "Múa",
    "Nấu",
    "Nướng",
    "Nhầm lẫn",
    "Quan sát",
    "Cắm trại",
    "Cung cấp",
    "Bắt chước",
    "Bắt buộc",
    "Báo cáo",
    "Mua bán",
    "Không quen",
    "Không nên",
    "Không cần",
    "Không cho",
    "Không nghe lời",
    "Mặn",
    "Đắng",
    "Cay",
    "Ngọt",
    "Đậm",
    "Nhạt",
    "Ngon miệng",
    "Xấu",
    "Đẹp",
    "Chật",
    "Hẹp",
    "Rộng",
    "Dài",
    "Cao",
    "Lùn",
    "Ốm",
    "Mập",
    "Ngoan",
    "",
    "Khỏe",
    "Mệt",
    "Đau",
    "Giỏi",
    "Chăm chỉ",
    "Lười biếng",
    "Tốt bụng",
    "Thú vị",
    "Hài hước",
    "Dũng cảm",
    "Sáng tạo",
  ]
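As a quick sanity check on the settings above, a minimal sketch (it assumes the config is saved as cfg/vsl.yaml and parses with PyYAML; the path and script name are illustrative) that verifies NUM_CLASSES matches the length of CLASS_NAMES and that ANCHORS holds NUM_ANCHORS (w, h) pairs — mismatches there are a common source of shape errors in a YOLO-style head. It also flags the empty entry currently in CLASS_NAMES:

```python
# sanity_check_cfg.py -- minimal sketch; the path "cfg/vsl.yaml" is an assumption.
import yaml

with open("cfg/vsl.yaml") as f:
    cfg = yaml.safe_load(f)

class_names = cfg["LISTDATA"]["CLASS_NAMES"]
num_classes = cfg["MODEL"]["NUM_CLASSES"]
anchors = cfg["SOLVER"]["ANCHORS"]
num_anchors = cfg["SOLVER"]["NUM_ANCHORS"]

# NUM_CLASSES must match the label list used by the loss/decoder.
assert len(class_names) == num_classes, (
    f"NUM_CLASSES={num_classes} but CLASS_NAMES has {len(class_names)} entries"
)

# ANCHORS is a flat list of (w, h) pairs, so it must hold 2 * NUM_ANCHORS values.
assert len(anchors) == 2 * num_anchors, (
    f"expected {2 * num_anchors} anchor values, got {len(anchors)}"
)

# Flag suspicious entries such as empty class names.
for i, name in enumerate(class_names):
    if not name.strip():
        print(f"warning: CLASS_NAMES[{i}] is empty")
```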
@jing-house

The source code lacks usage instructions. It's not worth putting more effort into it.

@lamnguyenvu98

You should describe the structure of your dataset. This algorithm performs spatio-temporal action localization: it both classifies and localizes actions in video. That means you also need bounding-box annotations for every video frame. Have you prepared those?

The annotation files are .txt files, in exactly the same format as YOLO annotations.
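For reference, a standard YOLO annotation line is `class_id x_center y_center width height`, with coordinates normalized to [0, 1]. A minimal sketch of reading one frame's annotation file in that format (the file names and helper are illustrative, not part of YOWO):

```python
# parse_yolo_label.py -- illustrative sketch of the YOLO-style .txt annotation
# format described above; the example file is hypothetical.
from pathlib import Path

def read_yolo_boxes(label_path):
    """Each line: 'class_id x_center y_center width height', normalized to [0, 1]."""
    boxes = []
    for line in Path(label_path).read_text().splitlines():
        if not line.strip():
            continue
        cls, xc, yc, w, h = line.split()
        boxes.append((int(cls), float(xc), float(yc), float(w), float(h)))
    return boxes

# One annotation file per frame: here, a single box of class 42 centred in the frame.
example = Path("example_frame.txt")
example.write_text("42 0.50 0.50 0.40 0.60\n")
print(read_yolo_boxes(example))   # -> [(42, 0.5, 0.5, 0.4, 0.6)]
```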
