Retrieval on MSVD #11

Closed
sharontaozi opened this issue Nov 9, 2019 · 4 comments
Comments

@sharontaozi

Sorry to bother you again. I am running retrieval experiments on MSVD and have run into some trouble, so I would like to ask the following questions:

  1. The number of descriptions per video in MSVD varies. How did you handle this in your experiments? For example, how many descriptions per video did you use for training, validation, and testing, and were there any other processing details?
  2. The paper says that Otani's protocol randomly selects 5 sentences for each test video. However, Xu's paper, "Joint Modeling Deep Video and Compositional Text to Bridge Vision and Language in a Unified Framework", states: "Firstly, for each testing video we select 5 sentences, so totally we have 3350 sentences and 670 videos." What is the difference between these two protocols?

I would really like to know how to process MSVD. Thank you very much!

@danieljf24
Owner

Sorry for the late reply. We used all the sentences for training, validation, and testing.

@xixiareone

I have a question about the MSVD dataset: in the test phase, do you evaluate on all sentences, or only on 5 randomly selected sentences per video?

@xixiareone

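Sorry to bother you again. I am running some retrieval experiments on MSVD, but I have run into some trouble and would like to ask the following questions:

  1. The number of descriptions corresponding to each video in MSVD varies. How should this be handled in the experiments (for example, how many descriptions per video are used for training, testing, and validation, or other processing details)?
  2. The paper says that Otani's protocol randomly selects 5 sentences for each test video. However, I read Xu's paper, "Joint Modeling Deep Video and Compositional Text to Bridge Vision and Language in a Unified Framework". It states: "Firstly, for each testing video we select 5 sentences, so totally we have 3350 sentences and 670 videos." What is the difference between these two protocols?

I would really like to know how to process MSVD. Thank you very much!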

I have a question about the MSVD dataset: in the test phase, do you evaluate on all sentences, or only on 5 randomly selected sentences per video?

@danieljf24
Owner

In the previous version of our w2vv paper, Table 5, the results using the data partition from Xu et al. [40] were computed on all the corresponding sentences for each test video, not on 5 randomly sampled sentences. For the results using the data partition from Otani et al. [24], we used the 5 sentences per test video provided by Otani et al., together with all the training sentences.
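
For readers comparing the two test protocols, here is a minimal sketch of the difference between evaluating on all sentences and on 5 sentences per test video. It is not the code used for the paper: the function name `build_test_queries`, the `captions` dictionary layout, and the toy data are assumptions made for illustration. Note also that the Otani et al. partition uses a fixed set of 5 sentences supplied with that split, so the random sampling below only mimics the idea.

```python
import random


def build_test_queries(captions, per_video=None, seed=0):
    """Build (video_id, sentence) query pairs for retrieval evaluation.

    captions:  dict mapping video_id -> list of sentences (hypothetical layout).
    per_video: None keeps every sentence; an int keeps at most that many
               randomly chosen sentences per video.
    """
    rng = random.Random(seed)  # fixed seed so the sampled subset is reproducible
    queries = []
    for vid in sorted(captions):
        sents = captions[vid]
        if per_video is not None and len(sents) > per_video:
            sents = rng.sample(sents, per_video)
        queries.extend((vid, s) for s in sents)
    return queries


# Toy stand-in for a test split: 3 videos with 5, 6, and 7 sentences.
toy = {f"vid{i}": [f"sentence {j} for vid{i}" for j in range(5 + i)] for i in range(3)}

all_queries = build_test_queries(toy)               # "all sentences" protocol
five_each = build_test_queries(toy, per_video=5)    # "5 sentences per video" protocol

print(len(all_queries), len(five_each))  # 18 15
```

With 670 test videos and 5 sentences each, the second protocol gives 670 × 5 = 3350 query sentences, which matches the count quoted from Xu et al.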
