evaluate on VOT benchmark #33
Hi,
Thanks for pointing it out.
On a first look, your script seems fine, but to be honest I have some trouble remembering the exact setup, as more than a year has passed.
Are you sure you are using the batch norms in eval mode? Their behaviour changes between training and evaluation, and that might affect results significantly.
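For what it's worth, a minimal sketch of how test mode can be forced and checked on a MatConvNet DagNN network; it assumes `net` is the dagnn.DagNN object returned by load_pretrained, which may already handle this internally:

% Minimal sketch (assumption: `net` is a dagnn.DagNN object).
% In test mode, batch-norm layers use their stored moving averages
% instead of per-batch statistics.
net.mode = 'test';
% Optional sanity check: list the batch-norm layers affected by the switch.
for l = 1:numel(net.layers)
    if isa(net.layers(l).block, 'dagnn.BatchNorm')
        fprintf('batch-norm layer %s will run in test mode\n', net.layers(l).name);
    end
end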
Unfortunately, I have noticed that results may change significantly between two different commits of the VOT toolkit. The MatConvNet and CUDA versions might also be involved.
Did you try with OTB? That's usually much more stable. Does your code match the website results?
On 8 Oct 2017 16:31, "Bi Li" <notifications@github.com> wrote:
Hi Luca,
Recently, I am trying to reproduce your results in the VOT2016 benchmark.
By using the provided result files <https://www.robots.ox.ac.uk/%7Eluca/stuff/siam-fc_results/vot16.zip>, the computed EAO is 0.2905.
However, evaluating the pretrained color model <https://www.robots.ox.ac.uk/%7Eluca/stuff/siam-fc_nets/2016-08-17.net.mat> with 3 search scales, using the VOT benchmark toolkit myself, I only get EAO 0.2247.
Below is the evaluation script. Is there anything I am missing?
function sfc_vot
% *************************************************************
% VOT: Always call exit command at the end to terminate Matlab!
% *************************************************************
cleanup = onCleanup(@() exit() );
% *************************************************************
% VOT: Set random seed to a different value every time.
% *************************************************************
RandStream.setGlobalStream(RandStream('mt19937ar', 'Seed', sum(clock)));
% *************************************************************
% SFC: Set tracking parameters
% *************************************************************
p.numScale = 3;
p.scaleStep = 1.0375;
p.scalePenalty = 0.9745;
p.scaleLR = 0.59; % damping factor for scale update
p.responseUp = 16; % upsampling the small 17x17 response helps with the accuracy
p.windowing = 'cosine'; % to penalize large displacements
p.wInfluence = 0.176; % windowing influence (in convex sum)
p.net = '2016-08-17.net.mat';
%% execution, visualization, benchmark
p.gpus = 1;
p.fout = -1;
%% Params from the network architecture, have to be consistent with the training
p.exemplarSize = 127; % input z size
p.instanceSize = 255; % input x size (search region)
p.scoreSize = 17;
p.totalStride = 8;
p.contextAmount = 0.5; % context amount for the exemplar
p.subMean = false;
%% SiamFC prefix and ids
p.prefix_z = 'a_'; % used to identify the layers of the exemplar
p.prefix_x = 'b_'; % used to identify the layers of the instance
p.prefix_join = 'xcorr';
p.prefix_adj = 'adjust';
p.id_feat_z = 'a_feat';
p.id_score = 'score';
% -------------------------------------------------------------------------------------------------
startup;
% Get environment-specific default paths.
p = env_paths_tracking(p);
% Load ImageNet Video statistics
if exist(p.stats_path,'file')
stats = load(p.stats_path);
else
warning('No stats found at %s', p.stats_path);
stats = [];
end
% Load two copies of the pre-trained network
net_z = load_pretrained([p.net_base_path p.net], p.gpus);
net_x = load_pretrained([p.net_base_path p.net], []);
% Divide the net in 2
% exemplar branch (used only once per video) computes features for the target
remove_layers_from_prefix(net_z, p.prefix_x);
remove_layers_from_prefix(net_z, p.prefix_join);
remove_layers_from_prefix(net_z, p.prefix_adj);
% instance branch computes features for search region x and cross-correlates with z features
remove_layers_from_prefix(net_x, p.prefix_z);
zFeatId = net_z.getVarIndex(p.id_feat_z);
scoreId = net_x.getVarIndex(p.id_score);
% **********************************
% VOT: Get initialization data
% **********************************
[handle, first_image, region] = vot('rectangle');
% If the provided region is a polygon ...
if numel(region) > 4
x1 = round(min(region(1:2:end)));
x2 = round(max(region(1:2:end)));
y1 = round(min(region(2:2:end)));
y2 = round(max(region(2:2:end)));
region = round([x1, y1, x2 - x1, y2 - y1]);
else
region = round([round(region(1)), round(region(2)), ...
round(region(1) + region(3)) - round(region(1)), ...
round(region(2) + region(4)) - round(region(2))]);
end;
irect = region;
targetPosition = [irect(2) + (1 + irect(4)) / 2, irect(1) + (1 + irect(3)) / 2];
targetSize = [irect(4) irect(3)];
startFrame = 1;
% get the first frame of the video
im = gpuArray(single(imread(first_image)));
% if grayscale repeat one channel to match filters size
if(size(im, 3)==1)
im = repmat(im, [1 1 3]);
end
% get avg for padding
avgChans = gather([mean(mean(im(:,:,1))) mean(mean(im(:,:,2))) mean(mean(im(:,:,3)))]);
wc_z = targetSize(2) + p.contextAmount*sum(targetSize);
hc_z = targetSize(1) + p.contextAmount*sum(targetSize);
s_z = sqrt(wc_z*hc_z);
scale_z = p.exemplarSize / s_z;
% initialize the exemplar
[z_crop, ~] = get_subwindow_tracking(im, targetPosition, [p.exemplarSize p.exemplarSize], [round(s_z) round(s_z)], avgChans);
if p.subMean
z_crop = bsxfun(@minus, z_crop, reshape(stats.z.rgbMean, [1 1 3]));
end
d_search = (p.instanceSize - p.exemplarSize)/2;
pad = d_search/scale_z;
s_x = s_z + 2*pad;
% arbitrary scale saturation
min_s_x = 0.2*s_x;
max_s_x = 5*s_x;
switch p.windowing
case 'cosine'
window = single(hann(p.scoreSize*p.responseUp) * hann(p.scoreSize*p.responseUp)');
case 'uniform'
window = single(ones(p.scoreSize*p.responseUp, p.scoreSize*p.responseUp));
end
% make the window sum 1
window = window / sum(window(:));
scales = (p.scaleStep .^ ((ceil(p.numScale/2)-p.numScale) : floor(p.numScale/2)));
% evaluate the offline-trained network for exemplar z features
net_z.eval({'exemplar', z_crop});
z_features = net_z.vars(zFeatId).value;
z_features = repmat(z_features, [1 1 1 p.numScale]);
% start tracking
i = startFrame;
while true
% **********************************
% VOT: Get next frame
% **********************************
[handle, image] = handle.frame(handle);
if isempty(image)
break;
end;
if i>startFrame
% load new frame on GPU
im = gpuArray(single(imread(image)));
% if grayscale repeat one channel to match filters size
if(size(im, 3)==1)
im = repmat(im, [1 1 3]);
end
scaledInstance = s_x .* scales;
scaledTarget = [targetSize(1) .* scales; targetSize(2) .* scales];
% extract scaled crops for search region x at previous target position
x_crops = make_scale_pyramid(im, targetPosition, scaledInstance, p.instanceSize, avgChans, stats, p);
% evaluate the offline-trained network for instance (search region) x features
[newTargetPosition, newScale] = tracker_eval(net_x, round(s_x), scoreId, z_features, x_crops, targetPosition, window, p);
targetPosition = gather(newTargetPosition);
% scale damping and saturation
s_x = max(min_s_x, min(max_s_x, (1-p.scaleLR)*s_x + p.scaleLR*scaledInstance(newScale)));
targetSize = (1-p.scaleLR)*targetSize + p.scaleLR*[scaledTarget(1,newScale) scaledTarget(2,newScale)];
else
% at the first frame, output position and size are the ones passed as input (ground truth)
end
i = i + 1;
rectPosition = [targetPosition([2,1]) - targetSize([2,1])/2, targetSize([2,1])];
% output bbox in the original frame coordinates
oTargetPosition = targetPosition; % .* frameSize ./ newFrameSize;
oTargetSize = targetSize; % .* frameSize ./ newFrameSize;
region = [oTargetPosition([2,1]) - oTargetSize([2,1])/2, oTargetSize([2,1])];
% **********************************
% VOT: Report position for frame
% **********************************
handle = handle.report(handle, region);
end
% **********************************
% VOT: Output the results
% **********************************
handle.quit(handle);
end
@gongbudaizhe @bertinetto I ran into the problem "Tracker has not passed the TraX support test." when evaluating on VOT. May I ask for a solution?
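A common cause of that error is a tracker launch command that fails before the TraX handshake (wrong paths, MATLAB not found, or the tracker script erroring at startup). As a rough sketch, a VOT tracker description file for the MATLAB toolkit looks like the following; the file name tracker_SiamFC.m, the label, and all paths are placeholders to adapt:

% tracker_SiamFC.m -- minimal sketch of a VOT toolkit tracker description.
% All paths below are placeholders; point them at your own checkouts.
tracker_label = 'SiamFC';
% generate_matlab_command builds the command line the toolkit uses to
% spawn MATLAB; the cell array lists directories to put on the path.
tracker_command = generate_matlab_command('sfc_vot', ...
    {'/path/to/siamese-fc/tracking', '/path/to/matconvnet/matlab'});
tracker_interpreter = 'matlab';

If your toolkit version generates a run_test.m in the workspace, running it exercises the tracker interactively and usually surfaces the underlying startup error.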
Initializing workspace ... Error in run_experiments (line 8)
Hi @KengChiLiu,