id2nframe.json file not found #2

Closed
antoyang opened this issue Jul 8, 2021 · 8 comments


antoyang commented Jul 8, 2021

Hi,

I downloaded the data for a dataset (VLEP) with the corresponding script, but when running train_qa.py with the base config, data loading fails: the file loaded in data.py lines 60-65, {img_dir}/id2nframe_{frame_interval:g}.json or {img_dir}/id2nframe.json, seems to be missing (I couldn't find it in the downloaded folders). Would you mind providing it, please?
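For reference, the lookup described here amounts to roughly the following; a minimal sketch reconstructed from the paths above, not the actual code in data.py:

import json
import os

def load_id2nframe(img_dir, frame_interval):
    # Prefer the interval-specific mapping; fall back to the generic one.
    for path in (f"{img_dir}/id2nframe_{frame_interval:g}.json",
                 f"{img_dir}/id2nframe.json"):
        if os.path.exists(path):
            with open(path) as f:
                return json.load(f)
    # Neither file exists; the caller then computes nframe from the LMDB
    # (see PS1 below).
    return None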

PS1: when replacing name2nframe with None, the _compute_nframe function then fails because the downloaded database appears corrupted: fnames = json.loads(self.txn.get(key=b'keys').decode('utf-8')) raises lmdb.CorruptedError: mdb_get: MDB_CORRUPTED: Located page was wrong type.
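To check whether the LMDB itself is readable independently of the training code, a minimal sketch using the lmdb Python package (the path follows the folder layout shown later in this thread) should reproduce the error:

import json
import lmdb

# Open the video db read-only; lock=False avoids writing lock files.
env = lmdb.open("video_db/vlep/slowfast_1.5", readonly=True,
                create=False, lock=False)
with env.begin() as txn:
    raw = txn.get(key=b'keys')  # the same read that fails in _compute_nframe
    fnames = json.loads(raw.decode('utf-8'))
print(len(fnames), "keys found")

On a corrupted download, this raises the same lmdb.CorruptedError.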

PS2: it also seems that the default config uses "vfeat_version": "resnet_slowfast", but the download script only downloads a "slowfast" folder in video_db, not a "resnet_slowfast" one.

Best,
Antoine Yang

@linjieli222 (Contributor)

@antoyang

I just tried the download script for VLEP and was able to download the complete version of the vlep video db. Here is a screenshot of the folder structure of the decompressed video db for VLEP:
[screenshot of the decompressed vlep video db folder structure]

Could you try the following to download the video db only and paste the stdout/stderr here?

DOWNLOAD=$1

for FOLDER in 'video_db' 'txt_db' 'pretrained' 'finetune'; do
    if [ ! -d "$DOWNLOAD/$FOLDER" ]; then
        mkdir -p "$DOWNLOAD/$FOLDER"
    fi
done

BLOB='https://datarelease.blob.core.windows.net/value-leaderboard/starter_code_data'

# Use azcopy for video db downloading
if [ -f ~/azcopy/azcopy ]; then
    echo "azcopy exists, skip downloading"
else
    echo "azcopy does not exist, start downloading"
    wget -P ~/azcopy/ https://convaisharables.blob.core.windows.net/azcopy/azcopy
fi
chmod +x ~/azcopy/azcopy

# video dbs
if [ ! -d "$DOWNLOAD/video_db/vlep/" ]; then
    ~/azcopy/azcopy cp "$BLOB/video_db/vlep.tar" "$DOWNLOAD/video_db/vlep.tar"
    tar -xvf "$DOWNLOAD/video_db/vlep.tar" -C "$DOWNLOAD/video_db"
    rm "$DOWNLOAD/video_db/vlep.tar"
fi
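The script expects the target data directory as its first argument (DOWNLOAD=$1), so the invocation would be along the lines of bash scripts/download_vlep_videodb.sh /path/to/data; the script path here is taken from the log output later in the thread.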


antoyang commented Jul 9, 2021

Here are the logs I got (with only the slowfast_1.5 folder downloaded in video_db):
31224336-98f2-0448-63a0-9aed98538eb7.log


linjieli222 commented Jul 9, 2021

I see you shared the azcopy log file. Can you also share the stdout from running the command? It should look something like this:

Job 5b27a274-1000-a544-7344-4edc3aedba81 summary
Elapsed Time (Minutes): 1.3001
Number of File Transfers: 1094
Number of Folder Property Transfers: 0
Total Number of Transfers: 1094
Number of Transfers Completed: 1094
Number of Transfers Failed: 0
Number of Transfers Skipped: 0
TotalBytesTransferred: 67136498861
Final Job Status: Completed

I want to check the total bytes transferred to see if it matches what's on the cloud.
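If the stdout has scrolled away, a rough substitute is to compare the size of what actually landed on disk against the expected total. A minimal sketch, assuming the db was extracted to video_db/vlep as in the download script above (note the summary shown here is a sample job, so its TotalBytesTransferred is not necessarily VLEP's expected figure):

import os

def dir_bytes(root):
    # Recursively sum the sizes of all files under root.
    return sum(os.path.getsize(os.path.join(d, f))
               for d, _, files in os.walk(root) for f in files)

print(dir_bytes("video_db/vlep"))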

@antoyang (Author)

Actually ~/azcopy/azcopy cp $BLOB/video_db/vlep.tar $DOWNLOAD/video_db/vlep.tar gets killed:

0 Done, 0 Failed, 1 Pending, 0 Skipped, 1 Total, 2-sec Throughput (Mb/s): 2770.2691
scripts/download_vlep_videodb.sh: line 25: 3710 Killed ~/azcopy/azcopy cp $BLOB/video_db/vlep.tar $DOWNLOAD/video_db/vlep.tar
vlep/
vlep/slowfast_1.5/
vlep/slowfast_1.5/data.mdb


linjieli222 commented Jul 13, 2021

I suspect the azcopy command is occupying too many CPU resources and hence got killed automatically. Unfortunately, this error is not reproducible on our end; it may need more investigation into the official azcopy documentation.

An alternative is to use wget instead.

DOWNLOAD=$1

for FOLDER in 'video_db' 'txt_db' 'pretrained' 'finetune'; do
    if [ ! -d "$DOWNLOAD/$FOLDER" ]; then
        mkdir -p "$DOWNLOAD/$FOLDER"
    fi
done

BLOB='https://datarelease.blob.core.windows.net/value-leaderboard/starter_code_data'

# video dbs
if [ ! -d "$DOWNLOAD/video_db/vlep/" ]; then
    wget -P "$DOWNLOAD/video_db/" "$BLOB/video_db/vlep.tar"
    tar -xvf "$DOWNLOAD/video_db/vlep.tar" -C "$DOWNLOAD/video_db"
    rm "$DOWNLOAD/video_db/vlep.tar"
fi

But wget may be slower than azcopy.

@antoyang (Author)

Actually, the download script worked fine on another cluster, probably due to different CPU resources. Thanks for helping!
I also have questions regarding VALUE: what do ST and AT->ST stand for in the leaderboard? Also, would you mind sharing dev results?

@linjieli222 (Contributor)

Good to know!

All dev/test results (with CLIP-ViT + SlowFast or ResNet + SlowFast) are included in our paper. ST means the model is trained on a single task; AT->ST means all-task training is performed first, followed by finetuning on the single task.

@linjieli222 (Contributor)

Closed due to inactivity.
