Christos Bampis authored and Christos Bampis committed Sep 3, 2017
1 parent be4611c, commit 12a0e90
Showing 13 changed files with 319 additions and 0 deletions.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@@ -0,0 +1,30 @@
clear
close all
clc

%%%% add to path necessary files
addpath(genpath('functions'))
addpath(genpath('Demo_Images'))

%%%% SpEED parameters
blk_speed = 3;
Nscales = 5;
sigma_nsq = 0.1;
down_size_ss = 2;
window = fspecial('gaussian', 7, 7/6);
window = window/sum(sum(window));

%%%% input reference and distorted images
img_ref = double(rgb2gray(imread('bikes.bmp')));
img_dis = double(rgb2gray(imread('bikes_distorted.bmp')));

%%%% Single-Scale SpEED
[SPEED_ss, SPEED_ss_SN] = ...
    Single_Scale_SpEED(img_ref, img_dis, ...
    down_size_ss, blk_speed, window, sigma_nsq);

%%%% Multi-Scale SpEED
[SPEED_ms, SPEED_ms_SN] = ...
    Multi_Scale_SpEED(img_ref, img_dis, ...
    Nscales, blk_speed, window, sigma_nsq);
@@ -0,0 +1,75 @@
clear
close all
clc

%%%% add to path necessary files
addpath(genpath('functions'))

%%%% SpEED parameters
blk_speed = 5;
sigma_nsq = 0.1;
down_size = 4;
window = fspecial('gaussian', 7, 7/6);
window = window/sum(sum(window));

%%%% load a reference and a distorted video here
ref_video = 'E:\LiveVQA\pa1_25fps.yuv';
dis_video = 'E:\LiveVQA\pa2_25fps.yuv';
Nframes = 250;
frame_height = 432;
frame_width = 768;

%%%% memory allocation
speed_s = zeros(1, Nframes);
speed_t = zeros(1, Nframes);
speed_s_sn = zeros(1, Nframes);
speed_t_sn = zeros(1, Nframes);

tic

for frame_ind = 1 : Nframes

    if frame_ind < Nframes

        %%%% read frames i and i+1 of the reference and distorted videos
        ref_frame = read_single_frame(ref_video, frame_ind, ...
            frame_height, frame_width);
        ref_frame_next = read_single_frame(ref_video, frame_ind + 1, ...
            frame_height, frame_width);

        dis_frame = read_single_frame(dis_video, frame_ind, ...
            frame_height, frame_width);
        dis_frame_next = read_single_frame(dis_video, frame_ind + 1, ...
            frame_height, frame_width);

        %%%% calculate SpEED
        [speed_s(frame_ind), speed_s_sn(frame_ind), ...
            speed_t(frame_ind), speed_t_sn(frame_ind)] = ...
            Single_Scale_Video_SPEED(ref_frame, ref_frame_next, ...
            dis_frame, dis_frame_next, ...
            down_size, window, blk_speed, sigma_nsq);

    else

        %%%% no next frame to read, reuse the previous values
        speed_s(frame_ind) = speed_s(frame_ind - 1);
        speed_t(frame_ind) = speed_t(frame_ind - 1);
        speed_s_sn(frame_ind) = speed_s_sn(frame_ind - 1);
        speed_t_sn(frame_ind) = speed_t_sn(frame_ind - 1);

    end;

end;

time_took = toc;

non_nan_inds = intersect(find(~isnan(speed_s)), ...
    find(~isnan(speed_t)));
VideoSpEED = mean(speed_s(non_nan_inds)) * mean(speed_t(non_nan_inds));

disp(['Video SpEED: ' num2str(VideoSpEED)])
disp(['Took ' num2str(time_took) ' sec. for ' ...
    num2str(Nframes) ' frames of size' ' ' num2str(frame_width) 'x' ...
    num2str(frame_height)])
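The pooling step at the end of this script can be checked on toy numbers. The sketch below uses made-up per-frame scores (hypothetical values, not results from the release) and a logical mask in place of `intersect(find(...))`, which selects the same frames: any frame where either score is NaN is dropped before the spatial and temporal means are multiplied.

```matlab
% Hypothetical per-frame scores; NaN marks a frame where a score is undefined.
speed_s = [1 2 NaN 3];
speed_t = [2 NaN 4 6];

% Keep only frames where both the spatial and the temporal score are valid.
valid = ~isnan(speed_s) & ~isnan(speed_t);

% Pool: product of the mean spatial and the mean temporal score.
video_speed = mean(speed_s(valid)) * mean(speed_t(valid));
disp(video_speed)   % 8
```

Here only frames 1 and 4 survive, so the spatial mean is 2, the temporal mean is 4, and the pooled score is 8.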
Binary file not shown.
@@ -0,0 +1,54 @@
function [speed_ms, speed_sn_ms] = ...
    Multi_Scale_SpEED(img1r, img2r, Nscales, ...
    blk, window, sigma_nsq)

weights_msssim = [0.0448 0.2856 0.3001 0.2363 0.1333]; % five MS-SSIM weights; assumes Nscales = 5
speed_now = zeros(1, Nscales);
speed_sn_now = zeros(1, Nscales);

band_ind = 1;

%%%% calculate local averages
mu1r = imfilter(img1r, window, 'replicate');
mu2r = imfilter(img2r, window, 'replicate');

%%%% estimate local variances and conditional entropies in the spatial
%%%% domain
[ss_ref, q_ref] = est_params(img1r - mu1r,blk,sigma_nsq);
spatial_ref = q_ref.*log2(1+ss_ref);
[ss_dis, q_dis] = est_params(img2r - mu2r,blk,sigma_nsq);
spatial_dis = q_dis.*log2(1+ss_dis);

%%%% calculate SpEED for the finest scale
speed_now(band_ind) = mean2(abs(spatial_ref-spatial_dis));
speed_sn_now(band_ind) = abs(mean2(spatial_ref-spatial_dis));

for band_ind = 2 : Nscales

    %%%% resize all frames
    img1r = imresize(img1r, 0.5);
    img2r = imresize(img2r, 0.5);

    %%%% calculate local averages
    mu1r = imfilter(img1r, window, 'replicate');
    mu2r = imfilter(img2r, window, 'replicate');

    %%%% estimate local variances and conditional entropies in the spatial
    %%%% domain
    [ss_ref, q_ref] = est_params(img1r - mu1r,blk,sigma_nsq);
    spatial_ref = q_ref.*log2(1+ss_ref);
    [ss_dis, q_dis] = est_params(img2r - mu2r,blk,sigma_nsq);
    spatial_dis = q_dis.*log2(1+ss_dis);

    %%%% calculate SpEED for this scale
    speed_now(band_ind) = mean2(abs(spatial_ref-spatial_dis));
    speed_sn_now(band_ind) = abs(mean2(spatial_ref-spatial_dis));

end;

%%%% apply MS-SSIM weights
speed_ms = mean(speed_now .* weights_msssim);
speed_sn_ms = mean(speed_sn_now .* weights_msssim);

end
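The final pooling line takes the mean of the weighted per-scale values, not their sum. A small sketch with hypothetical per-scale scores (illustrative numbers only, not output of the release) shows that this equals the weighted sum divided by the number of scales.

```matlab
% Hypothetical per-scale SpEED values, finest scale first (illustrative only).
speed_now = [10 8 6 4 2];
weights_msssim = [0.0448 0.2856 0.3001 0.2363 0.1333];

% Pooling as written in Multi_Scale_SpEED: mean of the weighted values.
speed_ms = mean(speed_now .* weights_msssim);

% The same number expressed as a weighted sum divided by the number of scales.
speed_ms_alt = sum(speed_now .* weights_msssim) / numel(weights_msssim);

fprintf('%.5f %.5f\n', speed_ms, speed_ms_alt);
```

The division by the number of scales is a constant factor, so it changes the magnitude of the score but not how images are ranked by it.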
@@ -0,0 +1,27 @@
function [speed, speed_sn] = ...
    Single_Scale_SpEED(img_ref, img_dis, times_to_down_size, ...
    blk, window, sigma_nsq)

%%%% resize reference and distorted image frames
for band_ind = 1 : times_to_down_size
    img_ref = imresize(img_ref, 0.5);
    img_dis = imresize(img_dis, 0.5);
end;

%%%% calculate local averages
mu_ref = imfilter(img_ref, window, 'replicate');
mu_dis = imfilter(img_dis, window, 'replicate');

%%%% estimate local variances and conditional entropies in the spatial
%%%% domain
[ss_ref, q_ref] = est_params(img_ref - mu_ref, blk, sigma_nsq);
spatial_ref = q_ref .* log2(1+ss_ref);
[ss_dis, q_dis] = est_params(img_dis - mu_dis, blk, sigma_nsq);
spatial_dis = q_dis .* log2(1+ss_dis);

%%%% calculate single-scale SpEED index
speed = mean2(abs(spatial_ref - spatial_dis));
speed_sn = abs(mean2(spatial_ref - spatial_dis));

end
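Read directly from the code of `Single_Scale_SpEED`, the two returned values can be written compactly. The notation is introduced here for illustration only and is not taken from the release: $k$ indexes the $K$ blocks, $s_k$ is the local variance estimate `ss`, and $q_k$ is the entropy term `ent` returned by `est_params`, both computed after downsampling and local mean subtraction.

```latex
\begin{align}
  \epsilon_k^{\mathrm{ref}} &= q_k^{\mathrm{ref}} \log_2\!\left(1 + s_k^{\mathrm{ref}}\right), &
  \epsilon_k^{\mathrm{dis}} &= q_k^{\mathrm{dis}} \log_2\!\left(1 + s_k^{\mathrm{dis}}\right), \\
  \mathtt{speed} &= \frac{1}{K} \sum_{k=1}^{K} \left| \epsilon_k^{\mathrm{ref}} - \epsilon_k^{\mathrm{dis}} \right|, &
  \mathtt{speed\_sn} &= \left| \frac{1}{K} \sum_{k=1}^{K} \left( \epsilon_k^{\mathrm{ref}} - \epsilon_k^{\mathrm{dis}} \right) \right|.
\end{align}
```

By the triangle inequality, `speed_sn` can never exceed `speed`: it takes the absolute value after averaging, so per-block differences of opposite sign cancel.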
@@ -0,0 +1,48 @@
function [speed_s, speed_s_sn, speed_t, speed_t_sn] = ...
    Single_Scale_Video_SPEED(ref, ref_next, dis, dis_next, times_to_down_size, window, blk, sigma_nsq)

%%%% resize all frames
for band_ind = 1 : times_to_down_size
    ref = imresize(ref, 0.5);
    ref_next = imresize(ref_next, 0.5);
    dis = imresize(dis, 0.5);
    dis_next = imresize(dis_next, 0.5);
end;

%%%% calculate local averages
mu_ref = imfilter(ref, window, 'replicate');
mu_dis = imfilter(dis, window, 'replicate');

%%%% Spatial SpEED

%%%% estimate local variances and conditional entropies in the spatial
%%%% domain for ith reference and distorted frames
[ss_ref, q_ref] = est_params(ref - mu_ref, blk, sigma_nsq);
spatial_ref = q_ref.*log2(1+ss_ref);
[ss_dis, q_dis] = est_params(dis - mu_dis, blk, sigma_nsq);
spatial_dis = q_dis.*log2(1+ss_dis);

speed_s = nanmean(abs(spatial_ref(:) - spatial_dis(:)));
speed_s_sn = abs(nanmean(spatial_ref(:) - spatial_dis(:)));

%%%% frame differencing
ref_diff = ref_next - ref;
dis_diff = dis_next - dis;

%%%% calculate local averages of frame differences
mu_ref_diff = imfilter(ref_diff, window, 'replicate');
mu_dis_diff = imfilter(dis_diff, window, 'replicate');

%%%% Temporal SpEED
%%%% estimate local variances and conditional entropies in the spatial
%%%% domain for the reference and distorted frame differences
[ss_ref_diff, q_ref] = est_params(ref_diff - mu_ref_diff, blk, sigma_nsq);
temporal_ref = q_ref.*log2(1+ss_ref).*log2(1+ss_ref_diff);
[ss_dis_diff, q_dis] = est_params(dis_diff - mu_dis_diff, blk, sigma_nsq);
temporal_dis = q_dis.*log2(1+ss_dis).*log2(1+ss_dis_diff);

speed_t = nanmean(abs(temporal_ref(:) - temporal_dis(:)));
speed_t_sn = abs(nanmean(temporal_ref(:) - temporal_dis(:)));

end
@@ -0,0 +1,36 @@
function [ss, ent] = est_params(y, blk, sigma)
% 'ss' and 'ent' refer to the local variance parameter and the
% entropy at different locations of the subband
% y is a subband of the decomposition, 'blk' is the block size, 'sigma' is
% the neural noise variance

sizeim=floor(size(y)./blk)*blk; % crop to exact multiple size
y=y(1:sizeim(1),1:sizeim(2));

temp = im2col(y, [blk blk], 'sliding');

mcu=mean(temp,2);
cu=((temp-repmat(mcu,1,size(temp,2)))*(temp-repmat(mcu,1,size(temp,2)))')./size(temp,2);

[Q,L] = eig(cu);
L = diag(diag(L).*(diag(L)>0))*sum(diag(L))/(sum(diag(L).*(diag(L)>0))+(sum(diag(L).*(diag(L)>0))==0));
cu = Q*L*Q';

temp = im2col(y, [blk blk], 'distinct');

%Estimate local variance parameters
if max(eig(cu)) > 0
    ss=(cu\temp);
    ss=sum(ss.*temp)./(blk*blk);
    ss = reshape(ss,sizeim/blk);
else ss=zeros(sizeim(1)/blk,sizeim(2)/blk);
end

[V,d]=eig(cu); d = d(d>0);

%Compute entropy
temp=zeros(size(ss));
for u=1:length(d)
    temp = temp+log2(ss*d(u)+sigma)+log(2*pi*exp(1));
end
ent = temp;
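The eigenvalue correction applied to `cu` in `est_params` is dense on one line, so here is a minimal sketch of the same step on a small hypothetical matrix. The names `lambda`, `pos`, `scale` and `cu_fixed` are introduced for illustration and do not appear in the release. Negative eigenvalues are set to zero and the remaining ones are rescaled so that their sum, and therefore the trace, is unchanged.

```matlab
% Hypothetical symmetric matrix with one negative eigenvalue (-1 and 3).
cu = [1 2; 2 1];

[Q, L] = eig(cu);
lambda = diag(L);
pos = lambda .* (lambda > 0);           % negative eigenvalues set to zero

% Rescale so the eigenvalue sum (the trace) is unchanged; the extra term
% avoids a division by zero when no eigenvalue is positive.
scale = sum(lambda) / (sum(pos) + (sum(pos) == 0));
cu_fixed = Q * diag(pos * scale) * Q';

disp(cu_fixed)
```

For this input the corrected matrix is all ones up to rounding: the eigenvalue 3 is scaled to 2, so the trace stays at 2 while the negative eigenvalue is removed.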
@@ -0,0 +1,11 @@
function y1 = read_single_frame(vidfilename, framenum, height, width)

fid = fopen(vidfilename);

fseek(fid,(framenum-1)*width*height*1.5,'bof');

y1 = fread(fid,width*height, 'uchar')';
y1 = reshape(y1,[width height]);
y1 = y1';

fclose(fid);
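The factor 1.5 in the seek offset is consistent with 8-bit planar YUV 4:2:0 storage, where each frame holds a full-size luma plane followed by two quarter-size chroma planes; that layout is an assumption here, since the file format is not stated in the code. The sketch below writes a tiny synthetic clip (hypothetical sizes and pixel values) to a temporary file and reads back the luma plane of the second frame with the same arithmetic.

```matlab
% Tiny synthetic YUV 4:2:0 clip: 2 frames of 6x4 pixels (hypothetical sizes).
width = 6; height = 4; n_frames = 2;
luma_bytes = width * height;          % one byte per luma sample
frame_bytes = luma_bytes * 3 / 2;     % luma plus two quarter-size chroma planes

fname = [tempname '.yuv'];            % temporary file, deleted below
fid = fopen(fname, 'w');
for k = 1 : n_frames
    fwrite(fid, k * ones(1, luma_bytes), 'uint8');        % luma filled with k
    fwrite(fid, 128 * ones(1, luma_bytes / 2), 'uint8');  % neutral chroma
end
fclose(fid);

% Read the luma plane of frame 2 with the same seek arithmetic.
framenum = 2;
fid = fopen(fname, 'r');
fseek(fid, (framenum - 1) * frame_bytes, 'bof');
y = fread(fid, luma_bytes, 'uint8')';
fclose(fid);
delete(fname);

% Samples are stored row by row, so reshape to [width height] and transpose.
y = reshape(y, [width height])';
disp(size(y))   % height-by-width
```

With the demo settings of 768x432 frames, the same arithmetic gives 497664 bytes per frame, so 250 frames would occupy 124416000 bytes under this layout.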
@@ -0,0 +1,38 @@
Spatial Efficient Entropic Differencing (SpEED-QA) Software release.
=================================================================
-----------COPYRIGHT NOTICE STARTS WITH THIS LINE------------
Copyright (c) 2017 The University of Texas at Austin
All rights reserved.
Permission is hereby granted, without written agreement and without license or royalty fees, to use, copy, modify, and distribute this code (the source files) and its documentation for any purpose, provided that the copyright notice in its entirety appear in all copies of this code, and the original source of this code, Laboratory for Image and Video Engineering (LIVE, http://live.ece.utexas.edu) and Center for Perceptual Systems (CPS, http://www.cps.utexas.edu) at the University of Texas at Austin (UT Austin, http://www.utexas.edu), is acknowledged in any publication that reports research using this code. The research is to be cited in the bibliography as:
1) C. G. Bampis, Praful Gupta, Rajiv Soundararajan and A. C. Bovik, "SpEED-QA: Spatial Efficient Entropic Differencing for Image and Video Quality," Signal Processing Letters, under review
2) C. G. Bampis, Praful Gupta, Rajiv Soundararajan and A. C. Bovik, "SpEED-QA Software Release"
URL: http://live.ece.utexas.edu/research/quality/ SpEED_QA_release.zip, 2017
IN NO EVENT SHALL THE UNIVERSITY OF TEXAS AT AUSTIN BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS DATABASE AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF TEXAS AT AUSTIN HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THE UNIVERSITY OF TEXAS AT AUSTIN SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE DATABASE PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND THE UNIVERSITY OF TEXAS AT AUSTIN HAS NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
-----------COPYRIGHT NOTICE ENDS WITH THIS LINE------------%
Authors : Christos Bampis and Praful Gupta
Version : 1.0
The authors are with the Laboratory for Image and Video Engineering (LIVE), Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX.
Kindly report any suggestions or corrections to cbampis@gmail.com or praful_gupta@utexas.edu
=================================================================
The current release implements SpEED-QA, an efficient reduced-reference image and video quality predictor that operates in the spatial domain. SpEED-QA is based on local operations in the spatial domain and entropic differencing. It calculates the conditional entropies of mean-subtracted coefficients and applies entropic differencing between the reference and distorted image or video. For videos, the same process is also repeated on the frame differences, yielding a temporal quality score. This score is then combined with the spatial (still picture) quality indicator, yielding the video version of SpEED: SpEED-VQA.
The attached code also implements multi-scale SpEED for Image Quality Assessment (IQA), where the still picture SpEED is applied on 5 scales, then combined into a single score using MS-SSIM weights.
The current release contains demo images for testing. For videos, you need to include a reference and a distorted video and call them from the code.
For further questions, feel free to e-mail cbampis@gmail.com or praful_gupta@utexas.edu
=================================================================
Further details:
SpEED for Image Quality Assessment (IQA):
blk_speed: size of the wavelet block used in the GSM model, set to 3x3 for images
Nscales: number of scales used to decompose the image. This parameter applies only to the multi-scale version of SpEED.
sigma_nsq: neural noise term, set to 0.1 (default value)
down_size_ss: number of times to downscale the original image. For IQA, SpEED-IQA performs best for down_size_ss = 2
window: Gaussian window, same as in BRISQUE

SpEED for Video Quality Assessment (VQA):
blk_speed: size of the wavelet block used in the GSM model, set to 5x5 for videos
sigma_nsq: neural noise term, set to 0.1 (default value)
down_size: number of times to downscale the original image and the frame difference. For VQA, SpEED-VQA performs best for down_size = 4
window: Gaussian window, same as in BRISQUE

For videos, you need to supply the full path of the reference and the distorted video.
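For readers without the Image Processing Toolbox, the Gaussian `window` parameter described above can be approximated with base functions. This is a sketch that assumes the usual sampled-Gaussian definition on a 7x7 grid with standard deviation 7/6; it is expected to agree with `fspecial('gaussian', 7, 7/6)` up to rounding, but that has not been checked against the toolbox here. The names `win_size`, `sigma` and `half` are introduced for illustration.

```matlab
% Build a 7x7 Gaussian window with standard deviation 7/6 using base functions only.
win_size = 7;
sigma = 7/6;
half = (win_size - 1) / 2;

% Integer sample coordinates centred on the middle of the window.
[x, y] = meshgrid(-half:half, -half:half);

window = exp(-(x.^2 + y.^2) / (2 * sigma^2));
window = window / sum(window(:));   % normalize so the weights sum to 1
```

The normalization mirrors the `window/sum(sum(window))` line in the demo scripts, so filtering with this window computes a local weighted average.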