Ego4d dataset split unavailable #86

Closed
LiJiaqi96 opened this issue Dec 18, 2023 · 31 comments
Labels
question Further information is requested

Comments

@LiJiaqi96

Hi, thanks for your great work on VideoChat2!
I tried to organize the Ego4D dataset used in the paper, but I found that each video has several splits, and the split information is available neither on the Ego4D website nor in this repo.
Is there any information about how the splits were made? Thanks!

An example (the question is how to obtain "split_0.mp4"):
d250521e-5197-44aa-8baa-2f42b24444d2/split_0.mp4

@Andy1621
Collaborator

Please check the original JSON here. You may need to download the videos from Ego4D and split them yourself.

@LiJiaqi96
Author

Thanks for your quick reply!!
BTW, the same issue occurs with the YouCook2 dataset. I observed that in YouCook2 the split follows the "segment" field in the original JSON file. Is it an index of frames? Thanks :)

@Andy1621
Collaborator

Yes, the split follows "segment", which gives the start second and end second of each clip rather than frame indices.

@LiJiaqi96
Author

Thanks again for your helpful information!

@cathyxl

cathyxl commented Jan 4, 2024

Hi @Andy1621, I found many splits for one video_uid in ego4d_nlp_qa.json. I'm wondering how you index the splits. Do splits with an earlier start second get smaller index numbers?

@Andy1621
Collaborator

Andy1621 commented Jan 5, 2024

@cathyxl I simply split the videos according to the annotations. For the same video_uid, different clip_start_sec and clip_end_sec values lead to different splits, producing split_0, split_1, and so on.

@cathyxl

cathyxl commented Jan 5, 2024

> @cathyxl I just simply split the video according to the annotations. For the same video_uid, different clip_start_sec and clip_end_sec will lead to different splits, thus generating split0, split1 and so on.

Does that mean the index of each clip is determined by its order of appearance in the annotation file?

@Andy1621
Collaborator

Andy1621 commented Jan 6, 2024

Yes, but actually you can split the clips by yourself and make up the JSON file.
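
A minimal sketch of the indexing described above, not the authors' script. It assumes the annotations have been flattened into a list of dicts carrying the video_uid, clip_start_sec, and clip_end_sec fields named in this thread; the flat layout, the function name assign_split_indices, and the choice to let identical (start, end) pairs share one index are all assumptions.

```python
from collections import defaultdict


def assign_split_indices(records):
    """Return {video_uid: {(start, end): split_index}} by order of first appearance."""
    per_video = defaultdict(dict)
    for rec in records:
        key = (rec["clip_start_sec"], rec["clip_end_sec"])
        splits = per_video[rec["video_uid"]]
        # A new (start, end) pair takes the next free index for this video.
        if key not in splits:
            splits[key] = len(splits)
    return {uid: dict(splits) for uid, splits in per_video.items()}


# Hypothetical records; real ones would come from the annotation JSON.
records = [
    {"video_uid": "vid_a", "clip_start_sec": 55.83, "clip_end_sec": 60.28},
    {"video_uid": "vid_a", "clip_start_sec": 62.73, "clip_end_sec": 72.23},
    {"video_uid": "vid_a", "clip_start_sec": 55.83, "clip_end_sec": 60.28},  # repeat, reuses index 0
    {"video_uid": "vid_b", "clip_start_sec": 7.18, "clip_end_sec": 8.54},
]
indices = assign_split_indices(records)
for uid, splits in indices.items():
    for (start, end), idx in splits.items():
        print(f"{uid}/split_{idx}.mp4: {start} -> {end}")
```

Keys are compared exactly, so two clips whose times differ only slightly would still receive separate indices under this sketch.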

@cathyxl

cathyxl commented Jan 7, 2024

> Yes, but actually you can split the clips by yourself and make up the JSON file.

Could you kindly provide the script used to split the Ego4D videos? I ran into some errors when splitting these videos. Performance will suffer a lot if the split videos do not match the instruction data samples.

@Andy1621
Collaborator

Andy1621 commented Jan 8, 2024

I'm sorry that I cannot find the full scripts. However, I found some of the ffmpeg commands, as follows:

mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 55.8300286 -t 4.4510000000000005 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_0.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 62.7295786 -t 9.501449999999984 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_1.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 150.5177086 -t 3.9923200000000065 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_2.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 7.1810286 -t 1.3579999999999997 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_3.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 214.81002859999998 -t 11.640000000000015 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_4.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 227.0350286 -t 14.85499999999999 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_5.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 254.8062886 -t 8.893740000000008 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_6.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 7.5185686 -t 5.502459999999999 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_7.mp4
mkdir -p your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2
ffmpeg -ss 120.70256859999999 -t 2.3184600000000017 -accurate_seek -i your_path/EgoQA/raw_videos/d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4 -c:v libx264 -c:a aac -strict experimental -an your_path/EgoQA/split_videos/d250521e-5197-44aa-8baa-2f42b24444d2/split_8.mp4

@cathyxl

cathyxl commented Jan 8, 2024

In your command, is -ss the start second and -t the duration? There are no durations in ego4d_nlp_qa.json. Did you use clip_end_sec - clip_start_sec to get the duration?

My problem is that some clips in ego4d_nlp_qa.json have almost the same clip_start_sec and clip_end_sec. Did you include these clips?

@Andy1621
Collaborator

Andy1621 commented Jan 8, 2024

Yes, I just use the diff as duration. For your second problem, I have not checked the overlap between different clips, but I think it's normal for one clip to match multiple QAs.
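
As a sketch of how such a command could be built from the annotation times, assuming the duration is simply the end second minus the start second as confirmed above. The function name build_split_command and the example paths are hypothetical. The flags mirror the commands posted earlier, except that the audio codec options are left out here because -an already drops the audio stream.

```python
import shlex


def build_split_command(src_path, dst_path, start_sec, end_sec):
    """Return an ffmpeg argument list that cuts start_sec..end_sec out of src_path."""
    duration = end_sec - start_sec
    if duration <= 0:
        raise ValueError(f"end_sec ({end_sec}) must be greater than start_sec ({start_sec})")
    return [
        "ffmpeg",
        "-ss", str(start_sec),   # seek to the clip start (input option, before -i)
        "-t", str(duration),     # clip length in seconds
        "-accurate_seek",
        "-i", src_path,
        "-c:v", "libx264",       # re-encode video so the cut is frame accurate
        "-an",                   # drop audio
        dst_path,
    ]


# Hypothetical paths and times, for illustration only.
cmd = build_split_command("raw_videos/example.mp4", "split_videos/example/split_0.mp4", 10.0, 14.5)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.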

@cathyxl

cathyxl commented Jan 8, 2024

I find that my downloaded Ego4D videos are named d250521e-5197-44aa-8baa-2f42b24444d2.mp4 instead of d250521e-5197-44aa-8baa-2f42b24444d2/0.mp4. Is that a problem?
image

@Andy1621
Collaborator

Andy1621 commented Jan 8, 2024

@cathyxl I'm compressing the videos and will upload them later~

@cathyxl

cathyxl commented Jan 8, 2024

That would be great! Thanks a lot. Btw, will you also upload the videos of other datasets? I found that the downloaded video paths of InternVid are not the same as those in the 1.9M instruction data. Can you also show how to process the InternVid files?

@Andy1621
Collaborator

Andy1621 commented Jan 8, 2024

Yes, I can also upload the part of VideoChat2 conversation~

@Andy1621
Collaborator

Andy1621 commented Jan 9, 2024

@cathyxl For EgoQA videos, download them from this link. For VideoChat2 conversation videos, download them from this link.

Besides, for splitting YouCook data, please follow the code:

import os
import subprocess
import json


def change_time(segment):
    duration = segment[1] - segment[0]
    hour = segment[0] // 3600
    minute = (segment[0] - 3600 * hour) // 60
    second = segment[0] % 60
    start = f"{hour}:{minute}:{second}"
    return start, duration


def process_video(src_path, des_path, start, duration):
    # des_path is already the full output file path, so check it directly
    # and skip clips that have been written before.
    if not os.path.exists(des_path):
        cmd = f"ffmpeg -ss {start} -t {duration} -accurate_seek -i {src_path} -c:v libx264 -c:a aac -strict experimental -b:a 98k {des_path}"
        subprocess.call(cmd, shell=True)



path = "user/youcook2/raw_videos"
split_lst = ['training', 'validation', 'testing']
total_file = {}
for split in split_lst:
    dir_list = os.listdir(os.path.join(path, split))
    for dir in dir_list:
        file_list = os.listdir(os.path.join(path, split, dir))
        for file in file_list:
            name = file.split('.')[0]
            total_file[name] = os.path.join(path, split, dir, file)

json_data = json.load(open("user/youcook2/youcookii_annotations_trainval.json", "r"))

des = "user/youcook2/split_videos"
caption_dict = {
    "training": [],
    "validation": [],
    "testing": []
}
for name, src_path in total_file.items():
    suffix = '/'.join(src_path.split('/')[-3:]).split('.')[0]
    des_dir = os.path.join(des, suffix)
    print(des_dir)
    if not os.path.exists(des_dir):
        os.makedirs(des_dir)
    for anno in json_data['database'][name]['annotations']:
        split = json_data['database'][name]['subset']
        idx = anno['id']
        caption = anno['sentence']
        segment = anno['segment']
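        # Originally change_time([74, 83]), a hard-coded segment that cut every
        # clip at the same time; see the correction later in this thread.
        start, duration = change_time(segment)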
        des_path = os.path.join(des_dir, f"split_{idx}.mp4")
        process_video(src_path, des_path, start, duration)
        caption_dict[split].append({
            "video": suffix + '/' + f"split_{idx}.mp4",
            "caption": caption
        })
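
The script above fills caption_dict but does not show how it is written out. A minimal sketch of one way to do that, assuming one JSON file per split; the function name save_captions, the output directory, and the file naming are hypothetical and may differ from the JSON files the authors released.

```python
import json
import os
import tempfile


def save_captions(caption_dict, out_dir):
    """Write one <split>.json per non-empty split and return the written paths."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    for split, items in caption_dict.items():
        if not items:
            continue  # e.g. no annotations were found for this split
        out_path = os.path.join(out_dir, f"{split}.json")
        with open(out_path, "w", encoding="utf-8") as f:
            json.dump(items, f, indent=2, ensure_ascii=False)
        written.append(out_path)
    return written


# Hypothetical entries in the same shape the script above appends.
example_captions = {
    "training": [{"video": "training/101/abc/split_0.mp4", "caption": "add oil to the pan"}],
    "validation": [{"video": "validation/102/def/split_3.mp4", "caption": "stir the soup"}],
    "testing": [],
}
out_dir = tempfile.mkdtemp()
written_paths = save_captions(example_captions, out_dir)
print(written_paths)
```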

@cathyxl

cathyxl commented Jan 9, 2024

Thanks a lot, @Andy1621! Btw, I have the same problem with kinetics710. The paths of my downloaded Kinetics 400, 600, and 700 videos do not match the paths in the 1.9M instruction data. Can you also provide the preprocessing scripts?

@Andy1621
Collaborator

Andy1621 commented Jan 9, 2024

@cathyxl Hi! Please check our raw Kinetics annotation files here. As for the raw videos, I think you may need to find the related links on the official websites, cvdfoundation, or Open DataLab. It may be illegal for us to share the Kinetics videos directly.

BTW, it's normal for some videos to be missing, since their YouTube links are no longer available.
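
A quick way to count such gaps is to compare the paths referenced by an annotation file with what is on disk. This is a minimal sketch; the function name find_missing and the demo paths are hypothetical, and the list of relative paths would in practice be read from the instruction JSON.

```python
import os
import tempfile


def find_missing(video_paths, root):
    """Return the relative paths that do not exist as files under root."""
    return [p for p in video_paths if not os.path.isfile(os.path.join(root, p))]


# Demo in a temporary directory: only one of the two referenced clips exists.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "k400"))
with open(os.path.join(root, "k400", "clip_0.mp4"), "wb") as f:
    f.write(b"")  # empty placeholder file

referenced = ["k400/clip_0.mp4", "k400/clip_1.mp4"]
missing = find_missing(referenced, root)
print(f"{len(missing)} of {len(referenced)} videos missing: {missing}")
```

Samples whose videos are missing can then be dropped from the annotation list before training.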

@cathyxl

cathyxl commented Jan 10, 2024

@Andy1621 I see~ I find 51 videos missing from my downloaded files, which I think might be OK. Besides, I am also looking into the image paths. I found that vqav2, vqav2_chinese, st_vqa, okvqa, okvqa_chinese, aokvqa, and imagenet have some or all of their data paths in the pattern train/xxxx.jpg (where each x is a digit), which are neither COCO image paths nor ImageNet paths. Can you share how these image paths are organized?

I noticed that M3IT provides the images as base64 strings. Are these paths related to those base64 strings?

@Andy1621
Collaborator

Andy1621 commented Jan 10, 2024

Yes. Most of the image files are from M3IT. And we transform the base64 (image_str) to an image using img_id.

As for some files that do not have img_id, we use the line_id, which is generated by enumerate(line).
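
A minimal sketch of that conversion, not the authors' script. It assumes the data is stored as JSON Lines with the image_str field named above and an optional img_id field, and that the id (or the line index as a fallback) becomes the file name with a .jpg extension; the function name dump_images and these layout details are assumptions.

```python
import base64
import json
import os
import tempfile


def dump_images(lines, out_dir):
    """Decode each sample's base64 image_str into <img_id or line index>.jpg."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for line_id, line in enumerate(lines):
        sample = json.loads(line)
        name = sample.get("img_id")
        if name is None:
            name = line_id  # fall back to the line index when there is no img_id
        out_path = os.path.join(out_dir, f"{name}.jpg")
        with open(out_path, "wb") as f:
            f.write(base64.b64decode(sample["image_str"]))
        paths.append(out_path)
    return paths


# Placeholder bytes stand in for real image data in this demo.
payload = base64.b64encode(b"not a real image").decode("ascii")
demo_lines = [
    json.dumps({"img_id": 1234, "image_str": payload}),
    json.dumps({"image_str": payload}),  # no img_id, so the line index 1 is used
]
image_dir = os.path.join(tempfile.mkdtemp(), "train")
image_paths = dump_images(demo_lines, image_dir)
print([os.path.basename(p) for p in image_paths])
```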

@Andy1621
Collaborator

And thanks for pointing this out. I had uploaded vqav2_chinese and okvqa_chinese, which were not used. I will remove them from HF later.

@Andy1621
Collaborator

@cathyxl I have found an error in the YouCook2 videos: every clip was cut with the same hard-coded segment, start, duration = change_time([74, 83]), instead of the annotated segment. I will split the videos again and update them~~

@cathyxl

cathyxl commented Jan 23, 2024

Hi~ @Andy1621, have you uploaded the YouCook2 videos?

@Andy1621
Collaborator

@cathyxl I have updated the YouCook2 videos at the same link. Besides, train.json has been updated, since some videos cannot be read.

Furthermore, I have uploaded randomly sampled subsets, train_80k.json for webvid_caption and train_100k.json for coco_caption, which are smaller and lead to similar results. Check them on HF.

@Andy1621 Andy1621 pinned this issue Jan 23, 2024
@Andy1621 Andy1621 added the question Further information is requested label Jan 23, 2024
@cathyxl

cathyxl commented Jan 23, 2024

@Andy1621 Can you post the link to the YouCook2 videos here? I cannot find it.

@yinanhe
Member

yinanhe commented Jan 23, 2024

@cathyxl aliyun

@cathyxl

cathyxl commented Jan 23, 2024

@yinanhe This seems to be a link to the Ego4D videos. What about YouCook2?

@yinanhe
Member

yinanhe commented Jan 24, 2024

@cathyxl If you downloaded the zip file named "egoqa_split_videos.zip" between 11:00 AM on January 23, 2024 (UTC+8) and 11:00 AM on January 24, 2024 (UTC+8), there's no need to re-download it: the videos inside it are the YouCook ones. I'm sorry for this mistake; the link is correct now. From now on, the videos in egoqa_split_videos.zip are the ones for Ego4D.

@yinanhe
Member

yinanhe commented Jan 30, 2024

It seems that the issue has been fixed. If you still have any problems, please feel free to reopen this issue.

@yinanhe yinanhe closed this as completed Jan 30, 2024
@Andy1621
Collaborator

Andy1621 commented Apr 6, 2024

For those who are interested in YouCook2, I have updated the JSON files in HF.
