This repository has been archived by the owner on Jan 7, 2018. It is now read-only.

6.823 lab 2 writeup bits.
Victor Costan committed Nov 2, 2009
1 parent 58e9527 commit 5f962df
Showing 18 changed files with 338 additions and 3 deletions.
122 changes: 122 additions & 0 deletions src/6.823/lab2/all.tex
Original file line number Original file line Diff line number Diff line change
@@ -0,0 +1,122 @@
\section{Question 3}
The fourth cache model (physically-indexed, virtually-tagged) doesn't make any
sense. The different cache models in this lab are motivated by the desire to
parallelize cache access and address translation.

If the cache is physically indexed, the cache access cannot start until
after address translation. Since we waited for address translation anyway, we
might as well use the physical address for the tag. There's no reason to deal
with the additional complexities of virtual tagging.

\section{Question 4}
Having more than one process at a time means that each process will have its
own page tables, and therefore the virtual-to-physical mapping can change
during cache operation.

The physically-indexed physically-tagged cache isn't impacted by this change at
all, since it's oblivious to address translation.

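The virtually-indexed virtually-tagged cache is impacted. The same virtual
address can map to different physical addresses in different processes, so the
cache must either be flushed on a context switch or have its tags extended
with a process identifier.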

The virtually-indexed physically-tagged cache is equipped to deal with this
change. If the mapping changes, the tag will not match, so the cache will not
give a bad answer by mistake. The cache's design has to deal with aliasing
(different virtual addresses pointing to the same physical address) for the
single-process case, so multiple processes don't introduce a change.
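
As a rough sketch of when aliasing can arise at all (the symbols $C$, $A$ and
$P$ are introduced here for illustration and do not come from the lab
handout), consider a virtually-indexed physically-tagged cache with capacity
$C$ bytes, associativity $A$ and page size $P$ bytes. The index and
block-offset bits select one of $C/A$ bytes within a way, and translation
leaves the low $\log_2 P$ address bits untouched, so the index is identical
for every alias of a physical address whenever
\[
  \frac{C}{A} \le P .
\]
For instance, assuming 4~KB pages, a 32~KB 8-way cache satisfies the bound
($32/8 = 4$~KB), while a 32~KB 2-way cache does not ($32/2 = 16$~KB) and needs
extra hardware or OS support to keep aliases coherent.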

The TAs would have to modify the address translation routine in our source code
to simulate different processes, so I doubt that will be part of the tests.

\section{Question 5}

I stripped the supplied \texttt{caches} pintool down to one that counts the
read and write memory accesses, as well as the aligned reads and writes. The
summaries are presented in Figure~\ref{q5:unaligned_accesses}.

\begin{figure}[htb]
\centering
\input{6.823/lab2/figs/unaligned_accesses.tex}
\caption{The percentage of unaligned memory accesses out of the total memory
accesses for the SPEC 2000 benchmark suite. }
\label{q5:unaligned_accesses}
\end{figure}

The bulk of unaligned accesses come from \texttt{bzip2}, which is a
compression tool, and therefore works with streams of bytes. Other
significant sources of unaligned accesses also process streams
of bytes -- \texttt{cc1} compiles C code, and \texttt{parser} presumably builds
an AST out of some language. The one I didn't expect was
\texttt{mesa}, which is a software OpenGL implementation, and therefore has to
work with a software framebuffer.

The assumption of aligned accesses seems to make sense. The numbers reported
above are an upper bound on the problematic accesses because, from a cache
perspective, byte accesses can be satisfied even if they're unaligned, and
multibyte accesses are only problematic if they span two cache blocks.
Furthermore, the applications which are most impacted by unaligned access don't
really run on consumer computers -- most people don't compile their code, and a
vast majority of desktop and mobile platforms have accelerated graphics
nowadays.
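
As a worked example of how each table entry is derived, take the
\texttt{swim} counters, assuming the four values in the data file are total
reads, total writes, aligned reads and aligned writes, in that order (the
symbols $R$, $W$, $R_a$ and $W_a$ are shorthand introduced here). The
unaligned fraction of all accesses is
\[
  u = \frac{(R + W) - (R_a + W_a)}{R + W} .
\]
With $R = 587422298$, $W = 55983171$, $R_a = 587403253$ and $W_a = 55979430$,
this gives
\[
  u = \frac{643405469 - 643382683}{643405469}
    = \frac{22786}{643405469} \approx 3.5 \times 10^{-5} ,
\]
that is, roughly $0.0035\%$ of all accesses.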


\section{X 1}
Figure \ref{q1:frequencies} shows the dependencies obtained as suggested in the
lab handout.

\begin{figure}[htb]
\includegraphics[width=6.8in]{6.823/lab1/figs/frequencies.png}
\caption{Instruction dependency frequencies for the 11 benchmarks. }
\label{q1:frequencies}
\end{figure}

The graph is not too helpful, except to tell us that most instructions depend
on registers written in the past 4-5 instructions. Therefore, we use figure
\ref{q1:frequencies_zoom} to take a better look at the dependencies at most 5
instructions apart.

\begin{figure}[htb]
\includegraphics[width=6.8in]{6.823/lab1/figs/frequencies_zoom.png}
\caption{Instruction dependency frequencies for the 11 benchmarks, zoomed in
on short dependency distances. }
\label{q1:frequencies_zoom}
\end{figure}

Given the dependency statistics, it seems like architecture A would be
significantly faster than architecture B. In each benchmark, between 40\% and
80\% of the instructions depend on the previous instruction. In architecture B,
all instructions with dependencies on the previous instruction would have to be
stalled for 1 cycle, while the previous instruction writes its registers. This
means 40\%--80\% pipeline bubbles, which could be avoided by the forwarding
circuitry.
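
As a back-of-the-envelope sketch, assume an ideal CPI of 1 for architecture A,
equal clock rates, and exactly one stall cycle in architecture B for every
instruction that depends on its predecessor (simplifications introduced here,
not taken from the lab handout). If $f$ is the fraction of such instructions,
then
\[
  \mathit{CPI}_B = 1 + f , \qquad
  \frac{\mathit{CPI}_B}{\mathit{CPI}_A} = 1 + f \in [1.4,\, 1.8]
  \quad \mbox{for } f \in [0.4,\, 0.8] .
\]
Under these assumptions, architecture B would take 40\% to 80\% more cycles
than architecture A on the same instruction stream.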

\section{X 2}

Figure \ref{q2:reg_frequencies} shows the per-register breakdown of the
instruction dependencies. For each register and instruction distance, we
computed the proportion of dependencies that are owed to that register. We
averaged the values over the benchmarks. We only plotted registers which
contributed a dependency of at least 0.005\% for at least one distance.
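
One way to write this computation down, as a sketch (the symbols are
introduced here, and the exact normalization used by the plotting scripts is
an assumption): let $n_{b,r,d}$ be the number of dependencies in benchmark $b$
at distance $d$ that go through register $r$, and let $B$ be the number of
benchmarks. The plotted value is then
\[
  p_{r,d} = \frac{1}{B} \sum_{b=1}^{B}
            \frac{n_{b,r,d}}{\sum_{r'} n_{b,r',d}} ,
\]
and a register $r$ is kept in the plot only if $\max_d p_{r,d}$ reaches the
cut-off.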

\begin{figure}[htb]
\includegraphics[width=6.8in]{6.823/lab1/figs/reg_frequencies.png}
\caption{Instruction dependency frequencies broken down by register. }
\label{q2:reg_frequencies}
\end{figure}

The graph suggests that a few registers are responsible for most dependencies.
This is not very surprising, given the x86 architecture's predilection to use
the \texttt{eax} register as an accumulator, \texttt{esi} and \texttt{edi}
as pointers, and the presence of a condition flags register.

The best (but most difficult to implement) suggestion would be to give the
instruction set a makeover so it looks more like a RISC instruction set. It's
not cool to have 8 registers, and instructions with limitations on the registers
they can use. However, as Alpha found out, this strategy might not work
commercially.\footnote{One could argue that, even though the strategy failed in
the 1990s, we're living in a different landscape right now, where many servers
are running some flavor of UNIX/Linux. So, maybe the instruction set will be
redesigned one day.}

Assuming we're stuck with the ISA, it seems that (limited) forwarding is very
worthwhile. The dependencies seem to taper off after 4 cycles, so it's
probably not worth forwarding for more than 4 cycles. This is important in
super-pipelined architectures, like the latest Pentiums and Core processors.
38 changes: 38 additions & 0 deletions src/6.823/lab2/code/aligned_accesses.rb
@@ -0,0 +1,38 @@
#!/usr/bin/env ruby
#
# Author:: Victor Costan
# Copyright:: none
# License:: Public Domain

require 'rubygems'
require 'gnuplot'
require '6.823/lab2/code/lab2common.rb'

bench_cases_fix_names 'accesses'
cases = bench_cases
stats = bench_values(cases)
stats.keys.each do |name|
  total_r, total_w, aligned_r, aligned_w = *stats[name]
  # Unaligned fractions for reads, writes, and all accesses.
  stats[name] = [(total_r - aligned_r) / total_r.to_f,
                 (total_w - aligned_w) / total_w.to_f,
                 (total_r + total_w - aligned_r - aligned_w) /
                     (total_r + total_w).to_f]
end
sums = stats.values.inject([0, 0, 0]) do |acc, stat|
  acc.zip(stat).map { |a, b| a + b }
end
avgs = sums.map { |s| s / stats.length.to_f }
stats['{}Averages'] = avgs

File.open('6.823/lab2/figs/unaligned_accesses.tex', 'w') do |f|
  f.write "\\begin{tabular}{lrrr}\n\\hline\n"
  f.write "Test & \\% unaligned reads & \\% unaligned writes & "
  f.write "\\% unaligned accesses"
  f.write " \\\\\n\\hline\n"
  stats.keys.sort.each do |name|
    f.write "#{name} "
    # The stats are fractions; scale by 100 so the \% label is accurate.
    f.write stats[name].map { |number| "& #{'%.3f' % (number * 100)}\\% " }.join
    f.write " \\\\\n\\hline\n"
  end
  f.write "\\end{tabular}\n"
end
38 changes: 38 additions & 0 deletions src/6.823/lab2/code/lab2common.rb
@@ -0,0 +1,38 @@
#!/usr/bin/env ruby
#
# Author:: Victor Costan
# Copyright:: none
# License:: Public Domain


def bench_cases_fix_names(dir_name = 'accesses')
  files = Dir.glob("6.823/lab2/data/#{dir_name}/*")

  files.each_with_index do |file, i|
    ['.out', '.o'].each do |suffix|
      len = suffix.length
      next unless file[-len, len] == suffix
      File.rename file, file[0...-len]
      files[i] = file[0...-len]
    end
  end
end

def bench_cases
  files = Dir.glob('6.823/lab2/data/accesses/*')

  names = files.map { |file| File.basename(file) }.
                map { |file| file[0...file.index('_base')] }
  Hash[*names.zip(files).flatten]
end

def bench_values(cases, base_dir = 'original')
  {}.tap do |values|
    cases.each do |name, file|
      # total_reads, total_writes, aligned_reads, aligned_writes
      numbers = File.read(file.gsub('/original/', "/#{base_dir}/")).split(',').
          select { |token| !token.empty? }.map { |token| token.to_i }
      values[name] = numbers
    end
  end
end
54 changes: 54 additions & 0 deletions src/6.823/lab2/code/original_plot.rb
@@ -0,0 +1,54 @@
#!/usr/bin/env ruby
#
# Author:: Victor Costan
# Copyright:: none
# License:: Public Domain

# This program needs the gnuplot gem to run. Install with the following command:
# gem install gnuplot

require 'rubygems'
require 'gnuplot'
require '6.823/lab1/code/lab1common.rb'

bench_cases_fix_names 'original'
bench_cases_fix_names 'accurate'
cases = bench_cases
originals = bench_values(bench_cases)


Gnuplot.open do |gp|
  Gnuplot::Plot.new gp do |plot|
    plot.terminal 'png small size 1024,768'
    plot.output '6.823/lab1/figs/frequencies.png'
    plot.ylabel '% Instructions'
    plot.xlabel 'Distance'

    originals.keys.sort.each do |name|
      data = originals[name]
      plot.data << Gnuplot::DataSet.new([(1..data.length).to_a, data]) do |ds|
        ds.title = name
        ds.with = 'lines'
        ds.linewidth = 1
      end
    end
  end

  maxpoints = 4
  Gnuplot::Plot.new gp do |plot|
    plot.terminal 'png small size 1024,768'
    plot.output '6.823/lab1/figs/frequencies_zoom.png'
    plot.ylabel '% Instructions'
    plot.xlabel 'Distance'

    originals.keys.sort.each do |name|
      data = originals[name]
      plot.data << Gnuplot::DataSet.new([(1..maxpoints).to_a,
                                         data[0, maxpoints]]) do |ds|
        ds.title = name
        ds.with = 'lines'
        ds.linewidth = 1
      end
    end
  end
end
43 changes: 43 additions & 0 deletions src/6.823/lab2/code/per_register_plot.rb
@@ -0,0 +1,43 @@
#!/usr/bin/env ruby
#
# Author:: Victor Costan
# Copyright:: none
# License:: Public Domain

# This program needs the gnuplot gem to run. Install with the following command:
# gem install gnuplot

require 'rubygems'
require 'gnuplot'
require '6.823/lab1/code/lab1common.rb'

bench_cases_fix_names 'detailed'
cases = bench_cases
details = bench_detailed_values bench_cases, 'detailed'

reg_count = details.values.first[:reg_stats].length
stat_count = details.values.first[:numbers].length
register_stats = (0...reg_count).map do |reg|
  (0...stat_count).map do |i|
    details.map { |name, detail| detail[:reg_stats][reg][i] }.
            inject(0) { |acc, n| acc + n } / details.length
  end
end

Gnuplot.open do |gp|
  Gnuplot::Plot.new gp do |plot|
    plot.terminal 'png small size 1024,768'
    plot.output '6.823/lab1/figs/reg_frequencies.png'
    plot.ylabel '% Instructions'
    plot.xlabel 'Distance'

    register_stats.each_with_index do |stats, i|
      next if stats.max < 0.005
      plot.data << Gnuplot::DataSet.new([(1..stat_count).to_a, stats]) do |ds|
        ds.title = "Reg #{i}"
        ds.with = 'lines'
        ds.linewidth = 1
      end
    end
  end
end
@@ -0,0 +1 @@
588075207,105538835,587962226,105489276
@@ -0,0 +1 @@
2540542959,484303033,2540219288,484062351
@@ -0,0 +1 @@
3950342815,2252567492,2982880055,1423015226
@@ -0,0 +1 @@
588959732,275712743,559546386,271817090
@@ -0,0 +1 @@
3064732952,1859867327,2989898421,1840480311
1 change: 1 addition & 0 deletions src/6.823/lab2/data/accesses/equake_base.none_______inp.in
@@ -0,0 +1 @@
939578800,192647960,929113784,189081176
@@ -0,0 +1 @@
345388928,162808145,315185468,155618601
@@ -0,0 +1 @@
862235709,457216222,741334115,399995683
@@ -0,0 +1 @@
1574529275,779095614,1451143066,711838243
@@ -0,0 +1 @@
1276070662,723113942,1186585565,716996086
1 change: 1 addition & 0 deletions src/6.823/lab2/data/accesses/swim_base.none_______swim.in
@@ -0,0 +1 @@
587422298,55983171,587403253,55979430
29 changes: 29 additions & 0 deletions src/6.823/lab2/figs/unaligned_accesses.tex
@@ -0,0 +1,29 @@
\begin{tabular}{lrrr}
\hline
Test & \% unaligned reads & \% unaligned writes & \% unaligned accesses \\
\hline
applu & 0.019\% & 0.047\% & 0.023\% \\
\hline
art & 0.013\% & 0.050\% & 0.019\% \\
\hline
bzip2 & 24.491\% & 36.827\% & 28.971\% \\
\hline
cc1 & 4.994\% & 1.413\% & 3.852\% \\
\hline
crafty & 2.442\% & 1.042\% & 1.913\% \\
\hline
equake & 1.114\% & 1.851\% & 1.239\% \\
\hline
gap & 8.745\% & 4.416\% & 7.358\% \\
\hline
gzip & 14.022\% & 12.515\% & 13.500\% \\
\hline
mesa & 7.836\% & 8.633\% & 8.100\% \\
\hline
parser & 7.013\% & 0.846\% & 4.782\% \\
\hline
swim & 0.003\% & 0.007\% & 0.004\% \\
\hline
{}Averages & 6.426\% & 6.150\% & 6.342\% \\
\hline
\end{tabular}
6 changes: 3 additions & 3 deletions src/master.tex
@@ -5,9 +5,9 @@
 \newcommand{\PsetAuthorEmail}{costan@mit.edu}

 %% Pset class and instance information.
-\newcommand{\PsetTitle}{Lab 1}
-\newcommand{\PsetDueDate}{October 7, 2009}
-\newcommand{\PsetMainFile}{6.823/lab1/all.tex}
+\newcommand{\PsetTitle}{Lab 2}
+\newcommand{\PsetDueDate}{October 28, 2009}
+\newcommand{\PsetMainFile}{6.823/lab2/all.tex}
 \newcommand{\PsetClassMetadata}{6.823/metadata.tex}

 %%
