6.823 lab 2 writeup bits.

1 parent 58e9527 commit 5f962dfeba514371b7f04ba72d7b16050e795c89 Victor Costan committed Nov 2, 2009
@@ -0,0 +1,122 @@
+\section{Question 3}
+The fourth cache model (physically indexed, virtually tagged) doesn't make any
+sense. The different cache models in this lab are motivated by the desire to
+parallelize cache access and address translation.
+
+If the cache is physically indexed, the cache access cannot start until
+after address translation. Since we have to wait for address translation anyway,
+we might as well use the physical address for the tag. There's no reason to deal
+with the additional complexity of virtual tagging.
+
+\section{Question 4}
+Having more than one process at a time means that each process will have its
+own page tables, and therefore the virtual-to-physical mapping can change
+during cache operation.
+
+The physically-indexed physically-tagged cache isn't impacted by this change at
+all, since it's oblivious to address translation.
+
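+The virtually-indexed virtually-tagged cache is affected, because the same
+virtual address can map to different physical addresses in different
+processes. Unless the cache is flushed on a context switch, or the tags are
+extended with a process identifier, a tag match can return another process's
+data.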
+
+The virtually-indexed physically-tagged cache is equipped to deal with this
+change. If the mapping changes, the tag will not match, so the cache will not
+give a bad answer by mistake. The cache's design has to deal with aliasing
+(different virtual addresses pointing to the same physical address) for the
+single-process case, so multiple processes don't introduce a change.
+
+The TAs would have to modify the address translation routine in our source code
+to simulate different processes, so I doubt that will be part of the tests.
+
+\section{Question 5}
+
+I stripped the supplied \texttt{caches} pintool down to a pintool that counts
+the read and write memory accesses, as well as the aligned reads and writes. The
+summaries are presented in Figure \ref{q5:unaligned_accesses}.
+
+\begin{figure}[htb]
+\centering
+\input{6.823/lab2/figs/unaligned_accesses.tex}
+\caption{The percentage of unaligned memory accesses out of the total memory
+accesses for the SPEC 2000 benchmark suite. }
+\label{q5:unaligned_accesses}
+\end{figure}
+
+The bulk of unaligned accesses come from \texttt{bzip2}, which is a
+compression tool, and therefore works with streams of bytes. Other
+significant sources of unaligned accesses also process streams
+of bytes -- \texttt{cc1} compiles C code, and \texttt{parser} presumably builds
+an AST out of some language. The one I did not expect was
+\texttt{mesa}, which is a software OpenGL implementation, and therefore has to
+work with a software framebuffer.
+
+The assumption of aligned accesses seems to make sense. The numbers reported
+above are an upper bound on the unaligned accesses that matter because, from a
+cache perspective, byte accesses can be satisfied even if they're unaligned, and
+multibyte accesses are only problematic if they span two cache blocks.
+Furthermore, the applications that are most impacted by unaligned accesses don't
+really run on consumer computers -- most people don't compile their code, and
+the vast majority of desktop and mobile platforms have accelerated graphics
+nowadays.
+
+
+\section{X 1}
+Figure \ref{q1:frequencies} shows the dependencies obtained as suggested in the
+lab handout.
+
+\begin{figure}[htb]
+ \includegraphics[width=6.8in]{6.823/lab1/figs/frequencies.png}
+ \caption{Instruction dependency frequencies for the 11 benchmarks. }
+ \label{q1:frequencies}
+\end{figure}
+
+The graph is not too helpful, except to tell us that most instructions depend
+on registers written in the past 4-5 instructions. Therefore, we use figure
+\ref{q1:frequencies_zoom} to take a better look at the dependencies at most 5
+instructions apart.
+
+\begin{figure}[htb]
+ \includegraphics[width=6.8in]{6.823/lab1/figs/frequencies_zoom.png}
+ \caption{Instruction dependency frequencies for the 11 benchmarks, zoomed in on the shortest dependency distances. }
+ \label{q1:frequencies_zoom}
+\end{figure}
+
+Given the dependency statistics, it seems like architecture A would be
+significantly faster than architecture B. In each benchmark, between 40\% and
+80\% of the instructions depend on the previous instruction. In architecture B,
+every instruction that depends on the previous instruction would have to
+stall for 1 cycle while the previous instruction writes its registers. This
+means a pipeline bubble for 40\%--80\% of the instructions, which could be
+avoided by the forwarding circuitry.
+
+\section{X 2}
+
+Figure \ref{q2:reg_frequencies} shows the per-register breakdown of the
+instruction dependencies. For each register and instruction distance, we
+computed the proportion of dependencies that are due to that register. We
+averaged the values across the benchmarks. We only plotted registers that
+account for at least 0.005\% of the dependencies for at least one distance.
+
+\begin{figure}[htb]
+ \includegraphics[width=6.8in]{6.823/lab1/figs/reg_frequencies.png}
+ \caption{Instruction dependency frequencies broken down by register. }
+ \label{q2:reg_frequencies}
+\end{figure}
+
+The graph suggests that a few registers are responsible for most dependencies.
+This is not very surprising, given the x86 architecture's predilection for using
+the \texttt{eax} register as an accumulator and \texttt{esi} and \texttt{edi}
+as pointers, and the presence of a condition flags register.
+
+The best (but most difficult to implement) suggestion would be to give the
+instruction set a makeover so it looks more like a RISC instruction set. It's
+not cool to have 8 registers, and instructions with limitations on the registers
+they can use. However, as Alpha found out, this strategy might not work
+commercially.\footnote{One could argue that, even though the strategy failed in
+the 1990s, we're living in a different landscape right now, where many servers
+are running some flavor of UNIX/Linux. So, maybe the instruction set will be
+redesigned one day.}
+
+Assuming we're stuck with the ISA, it seems that (limited) forwarding is very
+worthwhile. The dependencies seem to taper off after 4 cycles, so it's
+probably not worth forwarding for more than 4 cycles. This is important in
+super-pipelined architectures, like the latest Pentiums and Core processors.
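
The stall argument in the writeup (40\%-80\% of instructions waiting one cycle on architecture B) can be turned into a rough cycles-per-instruction estimate. The sketch below is not part of the lab code: `estimated_cpi` is a hypothetical helper, and it assumes an ideal base CPI of 1, a one-cycle stall for every instruction that depends on its predecessor, and that forwarding removes all of those stalls.

```ruby
# Hypothetical helper (not part of the lab code): estimate the average cycles
# per instruction when every instruction that depends on its immediate
# predecessor stalls for a fixed number of cycles.
# Assumptions: ideal base CPI of 1.0 and no other sources of stalls.
def estimated_cpi(dependent_fraction, stall_cycles = 1, base_cpi = 1.0)
  base_cpi + dependent_fraction * stall_cycles
end

# The two ends of the 40%-80% range quoted in the writeup.
[0.4, 0.8].each do |fraction|
  cpi = estimated_cpi(fraction)
  slowdown = cpi / estimated_cpi(0.0)
  puts format('dependent fraction %.2f: CPI %.2f, slowdown %.2fx',
              fraction, cpi, slowdown)
end
```

Under these assumptions, architecture B would need between 1.4 and 1.8 cycles per instruction where a fully forwarded pipeline needs 1.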
@@ -0,0 +1,38 @@
+#!/usr/bin/env ruby
+#
+# Author:: Victor Costan
+# Copyright:: none
+# License:: Public Domain
+
+require 'rubygems'
+require 'gnuplot'
+require '6.823/lab2/code/lab2common.rb'
+
+bench_cases_fix_names 'accesses'
+cases = bench_cases
+stats = bench_values(cases)
+stats.keys.each do |name|
+ total_r, total_w, aligned_r, aligned_w = *stats[name]
+ stats[name] = [(total_r - aligned_r) / total_r.to_f,
+ (total_w - aligned_w) / total_w.to_f,
+ (total_r + total_w - aligned_r - aligned_w) /
+ (total_r + total_w).to_f]
+end
+sums = stats.values.inject([0, 0, 0]) do |acc, stat|
+ acc.zip(stat).map { |a, b| a + b }
+end
+avgs = sums.map { |s| s / stats.length.to_f }
+stats['{}Averages'] = avgs
+
+File.open('6.823/lab2/figs/unaligned_accesses.tex', 'w') do |f|
+ f.write "\\begin{tabular}{lrrr}\n\\hline\n"
+ f.write "Test & \\% unaligned reads & \\% unaligned writes & "
+ f.write "\\% unaligned accesses"
+ f.write " \\\\\n\\hline\n"
+ stats.keys.sort.each do |name|
+ f.write "#{name} "
+ # stats holds fractions in [0, 1]; scale by 100 so the \% columns are percentages.
+ f.write stats[name].map { |number| "& #{'%.3f' % (number * 100)}\\% " }.join
+ f.write " \\\\\n\\hline\n"
+ end
+ f.write "\\end{tabular}\n"
+end
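
As a worked example of the arithmetic in this script, the sketch below applies it to one of the data rows committed in this diff. `unaligned_percent` is a hypothetical helper introduced only for illustration; the field order follows the comment in `lab2common.rb` (total reads, total writes, aligned reads, aligned writes).

```ruby
# One data row copied from this commit, in the order documented in
# lab2common.rb: total_reads, total_writes, aligned_reads, aligned_writes.
line = '3950342815,2252567492,2982880055,1423015226'
total_r, total_w, aligned_r, aligned_w = line.split(',').map { |t| t.to_i }

# Hypothetical helper: share of unaligned accesses, expressed as a percentage.
def unaligned_percent(total, aligned)
  100.0 * (total - aligned) / total
end

reads  = unaligned_percent(total_r, aligned_r)
writes = unaligned_percent(total_w, aligned_w)
both   = unaligned_percent(total_r + total_w, aligned_r + aligned_w)
puts format('reads %.3f%%, writes %.3f%%, all accesses %.3f%%',
            reads, writes, both)
```

For this row, roughly a quarter of the reads and a bit over a third of the writes are unaligned. The multiplication by 100 is what turns the fraction into a percentage.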
@@ -0,0 +1,38 @@
+#!/usr/bin/env ruby
+#
+# Author:: Victor Costan
+# Copyright:: none
+# License:: Public Domain
+
+
+def bench_cases_fix_names(dir_name = 'accesses')
+ files = Dir.glob("6.823/lab2/data/#{dir_name}/*")
+
+ files.each_with_index do |file, i|
+ ['.out', '.o'].each do |suffix|
+ len = suffix.length
+ next unless file[-len, len] == suffix
+ File.rename file, file[0...-len]
+ files[i] = file[0...-len]
+ end
+ end
+end
+
+def bench_cases
+ files = Dir.glob('6.823/lab2/data/accesses/*')
+
+ names = files.map { |file| File.basename(file) }.
+ map { |file| file[0...file.index('_base')] }
+ Hash[*names.zip(files).flatten]
+end
+
+def bench_values(cases, base_dir = 'original')
+ {}.tap do |values|
+ cases.each do |name, file|
+ # total_reads, total_writes, aligned_reads, aligned_writes
+ numbers = File.read(file.gsub('/original/', "/#{base_dir}/")).split(',').
+ select { |token| !token.empty? }.map { |token| token.to_i }
+ values[name] = numbers
+ end
+ end
+end
@@ -0,0 +1,54 @@
+#!/usr/bin/env ruby
+#
+# Author:: Victor Costan
+# Copyright:: none
+# License:: Public Domain
+
+# This program needs the gnuplot gem to run. Install with the following command:
+# gem install gnuplot
+
+require 'rubygems'
+require 'gnuplot'
+require '6.823/lab1/code/lab1common.rb'
+
+bench_cases_fix_names 'original'
+bench_cases_fix_names 'accurate'
+cases = bench_cases
+originals = bench_values(bench_cases)
+
+
+Gnuplot.open do |gp|
+ Gnuplot::Plot.new gp do |plot|
+ plot.terminal 'png small size 1024,768'
+ plot.output '6.823/lab1/figs/frequencies.png'
+ plot.ylabel '% Instructions'
+ plot.xlabel 'Distance'
+
+ originals.keys.sort.each do |name|
+ data = originals[name]
+ plot.data << Gnuplot::DataSet.new([(1..data.length).to_a, data]) do |ds|
+ ds.title = name
+ ds.with = 'lines'
+ ds.linewidth = 1
+ end
+ end
+ end
+
+ maxpoints = 4
+ Gnuplot::Plot.new gp do |plot|
+ plot.terminal 'png small size 1024,768'
+ plot.output '6.823/lab1/figs/frequencies_zoom.png'
+ plot.ylabel '% Instructions'
+ plot.xlabel 'Distance'
+
+ originals.keys.sort.each do |name|
+ data = originals[name]
+ plot.data << Gnuplot::DataSet.new([(1..maxpoints).to_a,
+ data[0, maxpoints]]) do |ds|
+ ds.title = name
+ ds.with = 'lines'
+ ds.linewidth = 1
+ end
+ end
+ end
+end
@@ -0,0 +1,43 @@
+#!/usr/bin/env ruby
+#
+# Author:: Victor Costan
+# Copyright:: none
+# License:: Public Domain
+
+# This program needs the gnuplot gem to run. Install with the following command:
+# gem install gnuplot
+
+require 'rubygems'
+require 'gnuplot'
+require '6.823/lab1/code/lab1common.rb'
+
+bench_cases_fix_names 'detailed'
+cases = bench_cases
+details = bench_detailed_values bench_cases, 'detailed'
+
+reg_count = details.values.first[:reg_stats].length
+stat_count = details.values.first[:numbers].length
+register_stats = (0...reg_count).map do |reg|
+ (0...stat_count).map do |i|
+ details.map { |name, detail| detail[:reg_stats][reg][i] }.
+ inject(0) { |acc, n| acc + n} / details.length
+ end
+end
+
+Gnuplot.open do |gp|
+ Gnuplot::Plot.new gp do |plot|
+ plot.terminal 'png small size 1024,768'
+ plot.output '6.823/lab1/figs/reg_frequencies.png'
+ plot.ylabel '% Instructions'
+ plot.xlabel 'Distance'
+
+ register_stats.each_with_index do |stats, i|
+ next if stats.max < 0.005
+ plot.data << Gnuplot::DataSet.new([(1..stat_count).to_a, stats]) do |ds|
+ ds.title = "Reg #{i}"
+ ds.with = 'lines'
+ ds.linewidth = 1
+ end
+ end
+ end
+end
@@ -0,0 +1 @@
+588075207,105538835,587962226,105489276
@@ -0,0 +1 @@
+2540542959,484303033,2540219288,484062351
@@ -0,0 +1 @@
+3950342815,2252567492,2982880055,1423015226
@@ -0,0 +1 @@
+588959732,275712743,559546386,271817090
@@ -0,0 +1 @@
+3064732952,1859867327,2989898421,1840480311
@@ -0,0 +1 @@
+939578800,192647960,929113784,189081176
@@ -0,0 +1 @@
+345388928,162808145,315185468,155618601
@@ -0,0 +1 @@
+862235709,457216222,741334115,399995683
@@ -0,0 +1 @@
+1574529275,779095614,1451143066,711838243
@@ -0,0 +1 @@
+1276070662,723113942,1186585565,716996086
@@ -0,0 +1 @@
+587422298,55983171,587403253,55979430
@@ -0,0 +1,29 @@
+\begin{tabular}{lrrr}
+\hline
+Test & \% unaligned reads & \% unaligned writes & \% unaligned accesses \\
+\hline
+applu & 0.019\% & 0.047\% & 0.023\% \\
+\hline
+art & 0.013\% & 0.050\% & 0.019\% \\
+\hline
+bzip2 & 24.491\% & 36.827\% & 28.971\% \\
+\hline
+cc1 & 4.994\% & 1.413\% & 3.852\% \\
+\hline
+crafty & 2.442\% & 1.042\% & 1.913\% \\
+\hline
+equake & 1.114\% & 1.851\% & 1.239\% \\
+\hline
+gap & 8.745\% & 4.416\% & 7.358\% \\
+\hline
+gzip & 14.022\% & 12.515\% & 13.500\% \\
+\hline
+mesa & 7.836\% & 8.633\% & 8.100\% \\
+\hline
+parser & 7.013\% & 0.846\% & 4.782\% \\
+\hline
+swim & 0.003\% & 0.007\% & 0.004\% \\
+\hline
+{}Averages & 6.426\% & 6.150\% & 6.342\% \\
+\hline
+\end{tabular}
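
The table counts every unaligned access. The writeup observes that, from the cache's point of view, a multibyte access is only a problem when it spans two cache blocks. A minimal sketch of that check follows. `spans_two_blocks?` is a hypothetical helper, the 64-byte block size is an assumption rather than a value taken from the lab, and the access is assumed to be no larger than one block.

```ruby
# Hypothetical helper: does an access of `size` bytes starting at `address`
# touch more than one cache block of `block_size` bytes?
# Assumes size <= block_size; the 64-byte default is an illustrative choice.
def spans_two_blocks?(address, size, block_size = 64)
  (address % block_size) + size > block_size
end

# A 4-byte access at offset 62 in a block crosses into the next block,
# while the same access at offset 60 ends exactly at the block boundary.
puts spans_two_blocks?(0x103e, 4)
puts spans_two_blocks?(0x103c, 4)
```

An unaligned 4-byte access at offset 1 in a block would be counted as unaligned but stays inside one block, which is why the unaligned counts are only an upper bound.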
@@ -5,9 +5,9 @@
\newcommand{\PsetAuthorEmail}{costan@mit.edu}
%% Pset class and instance information.
-\newcommand{\PsetTitle}{Lab 1}
-\newcommand{\PsetDueDate}{October 7, 2009}
-\newcommand{\PsetMainFile}{6.823/lab1/all.tex}
+\newcommand{\PsetTitle}{Lab 2}
+\newcommand{\PsetDueDate}{October 28, 2009}
+\newcommand{\PsetMainFile}{6.823/lab2/all.tex}
\newcommand{\PsetClassMetadata}{6.823/metadata.tex}
%%
