Composite Caption LineList Creation ? #38

nmonet · 2023-08-17T03:01:35Z

Hi! Thanks for the DisCo paper and the explanation of the TSV file preparation.

The composite YAML file references a 'caption linelist' file:
caption_linelist: train_TiktokDance-coco-single_person-Lindsey_0411_youtube-SHHQ-1.0-deepfashion2-laion_human-masks-single_cap.caption.linelist.tsv
Could you explain how this file is created?

uk9921 · 2023-10-19T11:39:55Z

I looked into the xxx.caption.linelist.tsv file, which has only two columns.

Please refer to the code here.
https://github.com/Wangt-CN/DisCo/blob/main/dataset/tsv_dataset.py#L490

if self.is_composite:
    assert op.isfile(self.cap_linelist_file)
    self.cap_line_list = [
        int(row[2]) for row in tsv_reader(self.cap_linelist_file)]
    self.img_line_list = [i for i in range(len(self.cap_line_list))]
elif self.cap_linelist_file:
    line_list = load_box_linelist_file(self.cap_linelist_file)
    self.img_line_list = line_list[0]
    self.cap_line_list = line_list[1]
else:
    # one caption per image/video
    self.img_line_list = [i for i in range(self.cap_tsv.num_rows())]
    self.cap_line_list = [0 for i in range(self.cap_tsv.num_rows())]

The left column corresponds to img_line_list and appears to count up from 0 across the data rows. The right column corresponds to cap_line_list and is always 0.
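
As a minimal sketch of how such a two-column file could be produced, assuming the layout really is `image row index<TAB>caption index` as observed above: the helper names `write_caption_linelist` and `read_caption_linelist` are hypothetical and not part of DisCo. This covers only the two-column, non-composite layout; the composite branch reads `row[2]`, so its file has a different, undocumented layout.

```python
import csv


def write_caption_linelist(num_rows, out_path):
    """Write `num_rows` lines of `img_idx<TAB>0` (hypothetical helper, not from DisCo)."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        for img_idx in range(num_rows):
            # Assumption: column 0 is the image row index, column 1 the caption index.
            writer.writerow([img_idx, 0])


def read_caption_linelist(path):
    """Read the file back into (img_line_list, cap_line_list)."""
    with open(path, newline="") as f:
        rows = [row for row in csv.reader(f, delimiter="\t") if row]
    img_line_list = [int(row[0]) for row in rows]
    cap_line_list = [int(row[1]) for row in rows]
    return img_line_list, cap_line_list
```

Calling `write_caption_linelist(3, path)` writes three lines (`0<TAB>0`, `1<TAB>0`, `2<TAB>0`), where `num_rows` would be the number of rows in your own image TSV.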

Since there is no official documentation for xxx.caption.linelist, it seems that a simple modification here is all that is needed:

if self.is_composite:  # False in this (non-composite) setup
    assert op.isfile(self.cap_linelist_file)
    self.cap_line_list = [
        int(row[2]) for row in tsv_reader(self.cap_linelist_file)]
    self.img_line_list = [i for i in range(len(self.cap_line_list))]
elif self.cap_linelist_file:  # branch taken when training on the official data
    line_list = load_box_linelist_file(self.cap_linelist_file)
    self.img_line_list = line_list[0]
    self.cap_line_list = line_list[1]
elif self.cap_tsv is not None:
    # one caption per image/video
    self.img_line_list = [i for i in range(self.cap_tsv.num_rows())]
    self.cap_line_list = [0 for i in range(self.cap_tsv.num_rows())]
else:  # no cap_linelist_file and no cap_tsv: branch taken when training on user data
    # fall back to one entry per row of the visual TSV, caption index fixed at 0
    self.img_line_list = [i for i in range(self.visual_tsv.num_rows())]
    self.cap_line_list = [0 for i in range(self.visual_tsv.num_rows())]
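
The fallback order of the non-composite branches can also be tried in isolation. The following is only a sketch: `resolve_line_lists` and its parameters are hypothetical stand-ins for the class attributes above, with plain row counts in place of the TSV objects.

```python
def resolve_line_lists(linelist_rows=None, num_caption_rows=None, num_visual_rows=0):
    """Return (img_line_list, cap_line_list) following the fallback order above.

    Hypothetical standalone version, not DisCo code:
    - linelist_rows: (img_idx, cap_idx) pairs from a caption linelist file, or None
    - num_caption_rows: row count of the caption TSV, or None if there is none
    - num_visual_rows: row count of the visual (image) TSV
    """
    if linelist_rows is not None:
        # A linelist file exists: use its two columns directly.
        img_line_list = [int(img_idx) for img_idx, _ in linelist_rows]
        cap_line_list = [int(cap_idx) for _, cap_idx in linelist_rows]
        return img_line_list, cap_line_list
    if num_caption_rows is not None:
        # One caption per image/video.
        return list(range(num_caption_rows)), [0] * num_caption_rows
    # Neither a linelist nor a caption TSV: index by the visual TSV rows.
    return list(range(num_visual_rows)), [0] * num_visual_rows
```

With no linelist and no caption TSV, `resolve_line_lists(num_visual_rows=3)` returns `([0, 1, 2], [0, 0, 0])`, which matches what a generated two-column linelist of all-zero caption indices would give.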

I will give it a try first, and if there is any progress, I will update it here.
