question about testing AUC-shuffled #9

Closed
hkkevinhf opened this issue Jul 4, 2019 · 8 comments

Comments

@hkkevinhf

When using the evaluation code in this package, the AUC-shuffled score is much lower than the one reported in the paper on the UCF dataset. I was wondering whether there is anything wrong with the evaluation code, or whether I missed some important details.

@wenguanwang
Owner

@hkkevinhf Thanks for your information. Could you please be more specific, or let us re-run your experiments? "AUC-shuffled score is much lower than that reported in the paper on UCF dataset." Which paper do you mean? It's hard to figure out the reason; we have not encountered a similar issue before.

@hkkevinhf
Author

@wenguanwang Hi, the paper concerned is "Revisiting Video Saliency: A Large-scale Benchmark and a New Model".
I used the test code and model weights in 'ACL' to generate the results. (The code was downloaded from
https://drive.google.com/open?id=1sW0tf9RQMO4RR7SyKhU8Kmbm4jwkFGpQ )
I evaluated the results using the evaluation code in 'ACL-evaluation.rar'.

The detailed information is below:
test code: ACL/main.py
model weight for test: ACL/ACL.h5
evaluation code: ACL-evaluation/demo_ours.m

I added a line to demo_ours.m in order to see the overall metrics on a dataset; all other files remain unchanged. The demo_ours.m I used is shown below:

```matlab
%% Demo.m
% All the codes in "code_forMetrics" are from the MIT Saliency Benchmark (https://github.com/cvzoya/saliency). Please refer to their webpage for more details.

% Load global parameters; you should set up the "ROOT_DIR" to your own path for data.
clear all
METRIC_DIR = 'code_forMetrics';
addpath(genpath(METRIC_DIR));

CACHE = ['./cache/'];
Path = '/data/Paper_code/ACL/';
Datasets = 'UCF';

Metrics{1} = 'AUC_Judd';
Metrics{2} = 'similarity';
Metrics{3} = 'AUC_shuffled';
Metrics{4} = 'CC';
Metrics{5} = 'NSS';

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Results and all_results should have the same number of cells
Results{1} = 'saliency'; % our method
% Results{2} = 'NIPS08';

results = zeros(300,1);
all_results{1} = zeros(300,5);
all_results{2} = zeros(300,5);

mean_results{1} = zeros(1,5);

for k = 1:1 % indexing methods
    Results{k}
    for j = 3:3 % indexing metrics

        if ~exist([CACHE 'ourdataset_' Results{k} '_' Metrics{j} '.mat'], 'file')

            videos = dir([Path Datasets '/test/']);

            for i = 1:length(videos)-2 % loop over videos (skip '.' and '..')
                disp(i);
                options.SALIENCY_DIR   = [Path Datasets '/test/' videos(i+2).name '/' Results{k} '/'];
                options.GTSALIENCY_DIR = [Path Datasets '/test/' videos(i+2).name '/maps/'];
                options.GTFIXATION_DIR = [Path Datasets '/test/' videos(i+2).name '/fixation/maps/'];
                [results(i), all] = readAllFrames(options, Metrics{j});
            end

            all_results{k}(:,j) = results;

            % The added line: average the per-video scores to get the overall metric on the dataset.
            mean_results{k} = sum(all_results{k}) / (length(videos)-2);

            % save([CACHE Datasets '_' Results{k} '_' Metrics{j} '.mat'], 'mean_results');
        else
            load([CACHE 'ourdataset_' Results{k} '_' Metrics{j} '.mat']);
        end

    end
end
```
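For completeness, here is a minimal sketch (my own addition, not part of the released demo) of how the dataset-level numbers stored in `mean_results{1}` could be printed after the loops finish, using the same metric order as the `Metrics` cell array above:

```matlab
% Hypothetical display snippet: print the per-metric dataset means computed above.
% Assumes Metrics and mean_results are still in the workspace after running demo_ours.m;
% metrics that were skipped in the j loop will simply print as zero.
for j = 1:numel(Metrics)
    fprintf('%s: %.4f\n', Metrics{j}, mean_results{1}(j));
end
```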

@wenguanwang
Owner

@hkkevinhf Many thanks for your detailed information and quick response. I will have my intern check this issue carefully; it will take some time. Thanks for your understanding!

@hkkevinhf
Author

@wenguanwang Thanks. Looking forward to your reply.

@wenguanwang
Owner

wenguanwang commented Jul 8, 2019

@hkkevinhf Could you please provide all five scores for the output saliency maps?

@hkkevinhf
Author

@wenguanwang Yes. For the UCF test set, the five scores (AUC-J, SIM, S-AUC, CC, NSS) are 0.8977, 0.4058, 0.5619, 0.5070, and 2.5413, respectively. The S-AUC scores for the Hollywood2 test set and the DHF1K validation set also seem strange, but I didn't record them. If you need them, I will evaluate them again.

@wenguanwang
Owner

@hkkevinhf, we rechecked our evaluation code and found that the S-AUC inconsistency is caused by the sampling strategy for the reference fixation map (it only uses the fixations of the same video). This only affects the released evaluation code; not to worry, the evaluation code on the server is still the correct version. We have uploaded an updated version in "code_for_Metrics.zip". Note that the S-AUC will show some variation due to the sampling strategy. Many thanks for the reminder.
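To illustrate the point about the reference fixation map, here is a minimal sketch (my own, not the released code), assuming the `AUC_shuffled(saliencyMap, fixationMap, otherMap, Nsplits, stepSize)` interface from code_forMetrics. `fixMapsSameVideo` and `fixMapsOtherVideos` are hypothetical cell arrays of binary fixation maps, already loaded and resized to the size of `fixationMap`, and the Nsplits/stepSize values are just example settings:

```matlab
% Reference ("other") map built as in the released code: fixations pooled from the same video only.
otherMapSame = false(size(fixationMap));
for f = 1:numel(fixMapsSameVideo)
    otherMapSame = otherMapSame | fixMapsSameVideo{f};
end

% Reference map built from fixations of other videos in the dataset
% (presumably what the updated code does).
otherMapCross = false(size(fixationMap));
for f = 1:numel(fixMapsOtherVideos)
    otherMapCross = otherMapCross | fixMapsOtherVideos{f};
end

% The shuffled AUC score depends on which reference map the negatives are sampled from,
% which is why the two code versions can give different S-AUC numbers.
sAUC_same  = AUC_shuffled(saliencyMap, fixationMap, double(otherMapSame),  100, 0.1);
sAUC_cross = AUC_shuffled(saliencyMap, fixationMap, double(otherMapCross), 100, 0.1);
```

Since the negatives are drawn from `otherMap`, a same-video-only reference shares the video's own spatial bias with the positive fixations, which would be consistent with the lower S-AUC reported above.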

@hkkevinhf
Author

@wenguanwang Received, thanks for your effort and kind reply.
