6.823 lab 2 writeup bits.

pwnall · Nov 2, 2009 · 5f962df · 5f962df
1 parent 58e9527
commit 5f962df

Showing 18 changed files with 338 additions and 3 deletions.
diff --git a/src/6.823/lab2/all.tex b/src/6.823/lab2/all.tex
@@ -0,0 +1,122 @@
+\section{Question 3}
+The fourth cache model doesn't make any sense. The different cache models in
+this lab are motivated by the desire of parallelizing cache access and address
+translation.
+
+If the cache is physically indexed, it means the cache access cannot start until
+after address translation. Since we waited for address translation anyway, we
+might as well use the physical address for the tag. There's no reason to deal
+with the additional complexities of using virtual tagging.
+
+\section{Question 4}
+Having more than one process at a time means that each process will have its
+own page tables, and therefore the virtual-to-physical mapping can change
+during cache operation.
+
+The physically-indexed physically-tagged cache isn't impacted by this change at
+all, since it's oblivious to address translation.
+
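+The virtually-indexed virtually-tagged cache is impacted: two processes can map the same virtual address to different physical addresses, so the cache must be flushed on a context switch (or tagged with a process ID).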
+
+The virtually-indexed physically-tagged cache is equipped to deal with this
+change. If the mapping changes, the tag will not match, so the cache will not
+give a bad answer by mistake. The cache's design has to deal with aliasing
+(different virtual addresses pointing to the same physical address) for the
+single-process case, so multiple processes don't introduce a change.
+
+The TAs would have to modify the address translation routine in our source code
+to simulate different processes, so I doubt that will be part of the tests. 
+
+\section{Question 5}
+
+I stripped the supplied \texttt{caches} pintool into a pintool that counts the
+read and write memory accesses, as well as the aligned reads and writes. The
+summaries are presented in figure \ref{q5:unaligned_accesses}.
+
+\begin{figure}[htb]
+\centering
+\input{6.823/lab2/figs/unaligned_accesses.tex}
+\caption{The percentage of unaligned memory accesses out of the total memory
+accesses for the SPEC 2000 benchmark suite. }
+\label{q5:unaligned_accesses}
+\end{figure}
+
+The bulk of unaligned accesses come from \texttt{bzip2}, which is a
+compression tool, and therefore it works with streams of bytes. Other
+significant sources of unaligned accesses are also processing streams
+of bytes -- \texttt{cc1} compiles C code, and \texttt{parser} presumably builds an
+AST out of some language. The one that I did not expect was
+\texttt{mesa}, which is a software OpenGL implementation, and therefore has to
+work with a software framebuffer.
+
+The assumption of aligned accesses seems to make sense. The numbers reported
+above are an upper bound of unaligned accesses because, from a cache
+perspective, byte accesses can be satisfied even if they're unaligned, and
+multibyte accesses are only problematic if they span across 2 cache blocks.
+Furthermore, the applications which are most impacted by unaligned access don't
+really run on consumer computers -- most people don't compile their code, and a
+vast majority of desktop and mobile platforms have accelerated graphics
+nowadays.
+
+
+\section{X 1}
+Figure \ref{q1:frequencies} shows the dependencies obtained as suggested in the
+lab handout.
+
+\begin{figure}[htb]
+  \includegraphics[width=6.8in]{6.823/lab1/figs/frequencies.png}
+  \caption{Instruction dependency frequencies for the 11 benchmarks. }
+  \label{q1:frequencies}
+\end{figure}
+
+The graph is not too helpful, except to tell us that most instructions depend
+on registers written in the past 4-5 instructions. Therefore, we use figure
+\ref{q1:frequencies_zoom} to take a better look at the dependencies at most 5
+instructions apart.
+
+\begin{figure}[htb]
+  \includegraphics[width=6.8in]{6.823/lab1/figs/frequencies_zoom.png}
+  \caption{Instruction dependency frequencies for the 11 benchmarks, zoomed in on short distances. }
+  \label{q1:frequencies_zoom}
+\end{figure}
+
+Given the dependency statistics, it seems like architecture A would be
+significantly faster than architecture B. In each benchmark, between 40\% and
+80\% of the instructions depend on the previous instruction. In architecture B,
+all instructions with dependencies on the previous instruction would have to be
+stalled for 1 cycle, while the previous instruction writes its registers. This
+means 40\%--80\% pipeline bubbles, which could be avoided by the forwarding
+circuitry.
+
+\section{X 2}
+
+Figure \ref{q2:reg_frequencies} shows the per-register breakdown for the
+instruction dependency. For each register and instruction distance, we computed
+the proportion of dependencies that are owed to that register. We averaged the
+values across the benchmarks. We only plotted registers which contributed at
+least 0.005\% of the dependencies for at least one distance.
+
+\begin{figure}[htb]
+  \includegraphics[width=6.8in]{6.823/lab1/figs/reg_frequencies.png}
+  \caption{Instruction dependency frequencies broken down by register. }
+  \label{q2:reg_frequencies}
+\end{figure}
+
+The graph suggests that a few registers are responsible for most dependencies.
+This is not very surprising, given the x86 architecture's predilection for using
+the \texttt{eax} register as an accumulator, \texttt{esi} and \texttt{edi}
+as pointers, and the presence of a condition flags register.
+
+The best (but most difficult to implement) suggestion would be to give the
+instruction set a makeover so it looks more like a RISC instruction set. It's
+not cool to have 8 registers, and instructions with limitations on the registers
+they can use. However, as Alpha found out, this strategy might not work
+commercially.\footnote{One could argue that, even though the strategy failed in
+the 1990s, we're living in a different landscape right now, where many servers
+are running some flavor of UNIX/Linux. So, maybe the instruction set will be
+redesigned one day.}
+
+Assuming we're stuck with the ISA, it seems that (limited) forwarding is very
+worthwhile. The dependencies seem to taper off after 4 cycles, so it's
+probably not worth forwarding for more than 4 cycles. This is important in
+super-pipelined architectures, like the latest Pentiums and Core processors.
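
The stall argument from X 1 can be put into numbers with a small Ruby sketch. It assumes an ideal base CPI of 1, a 1-cycle stall in architecture B for every instruction that depends on its predecessor, and that the forwarding in architecture A removes those stalls entirely. The helper `estimated_cpi` is hypothetical and not part of the lab code.

```ruby
# Hypothetical helper: estimated cycles per instruction when a fraction
# `dependent_fraction` of instructions stalls for `stall_cycles` cycles each.
# Assumes no other hazards and an ideal base CPI (both are simplifications).
def estimated_cpi(dependent_fraction, stall_cycles = 1, base_cpi = 1.0)
  base_cpi + dependent_fraction * stall_cycles
end

# The writeup reports that 40% to 80% of instructions depend on the
# previous instruction, so evaluate both ends of that range.
[0.4, 0.8].each do |fraction|
  cpi_b = estimated_cpi(fraction)     # architecture B: dependent instructions stall
  cpi_a = estimated_cpi(fraction, 0)  # architecture A: forwarding, no stall
  puts format('dependent=%.0f%%  CPI(B)=%.2f  speedup of A over B=%.2fx',
              fraction * 100, cpi_b, cpi_b / cpi_a)
end
```

Under these assumptions the speedup of A over B is simply 1 plus the fraction of back-to-back dependent instructions; real pipelines have other stall sources, so this is only a rough bound.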
diff --git a/src/6.823/lab2/code/aligned_accesses.rb b/src/6.823/lab2/code/aligned_accesses.rb
@@ -0,0 +1,38 @@
+#!/usr/bin/env ruby
+#
+# Author:: Victor Costan
+# Copyright:: none
+# License:: Public Domain
+
+require 'rubygems'
+require 'gnuplot'
+require '6.823/lab2/code/lab2common.rb'
+
+bench_cases_fix_names 'accesses'
+cases = bench_cases
+stats = bench_values(cases)
+stats.keys.each do |name|
+  total_r, total_w, aligned_r, aligned_w = *stats[name]
+  stats[name] =  [100.0 * (total_r - aligned_r) / total_r,
+                  100.0 * (total_w - aligned_w) / total_w,
+                  100.0 * (total_r + total_w - aligned_r - aligned_w) /
+                  (total_r + total_w)]
+end
+sums = stats.values.inject([0, 0, 0]) do |acc, stat|
+  acc.zip(stat).map { |a, b| a + b }
+end
+avgs = sums.map { |s| s / stats.length.to_f }
+stats['{}Averages'] = avgs
+
+File.open('6.823/lab2/figs/unaligned_accesses.tex', 'w') do |f|
+  f.write "\\begin{tabular}{lrrr}\n\\hline\n"
+  f.write "Test & \\% unaligned reads & \\% unaligned writes & "
+  f.write "\\% unaligned accesses"
+  f.write " \\\\\n\\hline\n"
+  stats.keys.sort.each do |name|
+    f.write "#{name} "
+    f.write stats[name].map { |number| "& #{'%.3f' % number}\\% " }.join
+    f.write " \\\\\n\\hline\n"
+  end
+  f.write "\\end{tabular}\n"
+end
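
Question 5 argues that the unaligned counts are an upper bound, because an access only causes trouble for the cache when it crosses a block boundary. A minimal Ruby sketch of that check follows. The 64-byte block size, the 4-byte word size, and both helper names are assumptions made for illustration; they are not taken from the pintool.

```ruby
# Hypothetical helper: true if an access of `size` bytes starting at `addr`
# touches more than one cache block of `block_size` bytes.
def spans_blocks?(addr, size, block_size = 64)
  (addr / block_size) != ((addr + size - 1) / block_size)
end

# Hypothetical helper: true if the address is not a multiple of the word size.
# This is one plausible definition of "unaligned"; the pintool may differ.
def word_unaligned?(addr, word_size = 4)
  addr % word_size != 0
end

# (address, size) pairs: a byte access, a misaligned word inside one block,
# and a misaligned word that straddles two blocks.
[[0x1001, 1], [0x1002, 4], [0x103e, 4]].each do |addr, size|
  puts format('addr=0x%x size=%d unaligned=%s spans_blocks=%s',
              addr, size, word_unaligned?(addr), spans_blocks?(addr, size))
end
```

All three accesses count as unaligned under the word-based definition, but only the last one would need two cache blocks.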
diff --git a/src/6.823/lab2/code/lab2common.rb b/src/6.823/lab2/code/lab2common.rb
@@ -0,0 +1,38 @@
+#!/usr/bin/env ruby
+#
+# Author:: Victor Costan
+# Copyright:: none
+# License:: Public Domain
+
+
+def bench_cases_fix_names(dir_name = 'accesses')
+  files = Dir.glob("6.823/lab2/data/#{dir_name}/*")
+
+  files.each_with_index do |file, i|
+    ['.out', '.o'].each do |suffix|
+      len = suffix.length
+      next unless file[-len, len] == suffix
+      File.rename file, file[0...-len]
+      files[i] = file[0...-len]
+    end
+  end  
+end
+
+def bench_cases
+  files = Dir.glob('6.823/lab2/data/accesses/*')  
+
+  names = files.map { |file| File.basename(file) }.
+                map { |file| file[0...file.index('_base')] }
+  Hash[*names.zip(files).flatten]
+end
+
+def bench_values(cases, base_dir = 'original')  
+  {}.tap do |values|
+    cases.each do |name, file|
+      # total_reads, total_writes, aligned_reads, aligned_writes
+      numbers = File.read(file.gsub('/original/', "/#{base_dir}/")).split(',').
+                     select { |token| !token.empty? }.map { |token| token.to_i }
+      values[name] = numbers
+    end
+  end
+end
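
To make the data flow through `bench_cases` and `bench_values` concrete, here is a rough sketch that applies the same name extraction and parsing to strings instead of files, so it runs without the data directory. The helpers `benchmark_name` and `parse_counts` are hypothetical stand-ins, not functions from lab2common.rb; the file name and counts are the bzip2 entry from this commit.

```ruby
# Hypothetical stand-in for the name extraction in bench_cases:
# keep the part of the file's base name that precedes '_base'.
def benchmark_name(file_name)
  base = File.basename(file_name)
  base[0...base.index('_base')]
end

# Hypothetical stand-in for the parsing in bench_values: a line holds
# total_reads, total_writes, aligned_reads, aligned_writes.
def parse_counts(line)
  line.strip.split(',').reject(&:empty?).map(&:to_i)
end

name = benchmark_name(
  '6.823/lab2/data/accesses/bzip2_base.x86_linux___input.random___2'
)
total_r, total_w, aligned_r, aligned_w =
  parse_counts("3950342815,2252567492,2982880055,1423015226\n")

# Fraction of reads that the pintool did not count as aligned.
unaligned_read_fraction = (total_r - aligned_r) / total_r.to_f
puts "#{name}: unaligned read fraction #{'%.5f' % unaligned_read_fraction}"
```

Note that the result is a fraction between 0 and 1; it has to be multiplied by 100 before being labeled as a percentage.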
diff --git a/src/6.823/lab2/code/original_plot.rb b/src/6.823/lab2/code/original_plot.rb
@@ -0,0 +1,54 @@
+#!/usr/bin/env ruby
+#
+# Author:: Victor Costan
+# Copyright:: none
+# License:: Public Domain
+
+# This program needs the gnuplot gem to run. Install with the following command:
+# gem install gnuplot
+
+require 'rubygems'
+require 'gnuplot'
+require '6.823/lab1/code/lab1common.rb'
+
+bench_cases_fix_names 'original'
+bench_cases_fix_names 'accurate'
+cases = bench_cases
+originals = bench_values(cases)
+
+
+Gnuplot.open do |gp|
+  Gnuplot::Plot.new gp do |plot|
+    plot.terminal 'png small size 1024,768'
+    plot.output '6.823/lab1/figs/frequencies.png'
+    plot.ylabel '% Instructions'
+    plot.xlabel 'Distance'
+
+    originals.keys.sort.each do |name|
+      data = originals[name]
+      plot.data << Gnuplot::DataSet.new([(1..data.length).to_a, data]) do |ds|
+        ds.title = name
+        ds.with = 'lines'
+        ds.linewidth = 1
+      end
+    end
+  end
+
+  maxpoints = 4
+  Gnuplot::Plot.new gp do |plot|
+    plot.terminal 'png small size 1024,768'
+    plot.output '6.823/lab1/figs/frequencies_zoom.png'
+    plot.ylabel '% Instructions'
+    plot.xlabel 'Distance'
+
+    originals.keys.sort.each do |name|
+      data = originals[name]
+      plot.data << Gnuplot::DataSet.new([(1..maxpoints).to_a,
+                                        data[0, maxpoints]]) do |ds|
+        ds.title = name
+        ds.with = 'lines'
+        ds.linewidth = 1
+      end
+    end
+  end
+end
diff --git a/src/6.823/lab2/code/per_register_plot.rb b/src/6.823/lab2/code/per_register_plot.rb
@@ -0,0 +1,43 @@
+#!/usr/bin/env ruby
+#
+# Author:: Victor Costan
+# Copyright:: none
+# License:: Public Domain
+
+# This program needs the gnuplot gem to run. Install with the following command:
+# gem install gnuplot
+
+require 'rubygems'
+require 'gnuplot'
+require '6.823/lab1/code/lab1common.rb'
+
+bench_cases_fix_names 'detailed'
+cases = bench_cases
+details = bench_detailed_values bench_cases, 'detailed'
+
+reg_count = details.values.first[:reg_stats].length
+stat_count = details.values.first[:numbers].length
+register_stats = (0...reg_count).map do |reg|
+  (0...stat_count).map do |i|    
+    details.map { |name, detail| detail[:reg_stats][reg][i] }.
+            inject(0) { |acc, n| acc + n} / details.length
+  end
+end
+
+Gnuplot.open do |gp|
+  Gnuplot::Plot.new gp do |plot|
+    plot.terminal 'png small size 1024,768'
+    plot.output '6.823/lab1/figs/reg_frequencies.png'
+    plot.ylabel '% Instructions'
+    plot.xlabel 'Distance'
+
+    register_stats.each_with_index do |stats, i|
+      next if stats.max < 0.005
+      plot.data << Gnuplot::DataSet.new([(1..stat_count).to_a, stats]) do |ds|
+        ds.title = "Reg #{i}"
+        ds.with = 'lines'
+        ds.linewidth = 1
+      end
+    end
+  end
+end
diff --git a/src/6.823/lab2/data/accesses/applu_base.none_______applu.in b/src/6.823/lab2/data/accesses/applu_base.none_______applu.in
@@ -0,0 +1 @@
+588075207,105538835,587962226,105489276
diff --git a/...___-stride___2___-startx___134___-starty___220___-endx___139___-endy___225___-objects___1 b/...___-stride___2___-startx___134___-starty___220___-endx___139___-endy___225___-objects___1
@@ -0,0 +1 @@
+2540542959,484303033,2540219288,484062351
diff --git a/src/6.823/lab2/data/accesses/bzip2_base.x86_linux___input.random___2 b/src/6.823/lab2/data/accesses/bzip2_base.x86_linux___input.random___2
@@ -0,0 +1 @@
+3950342815,2252567492,2982880055,1423015226
diff --git a/src/6.823/lab2/data/accesses/cc1_base.x86_linux___cccp.i___-o___foo b/src/6.823/lab2/data/accesses/cc1_base.x86_linux___cccp.i___-o___foo
@@ -0,0 +1 @@
+588959732,275712743,559546386,271817090
diff --git a/src/6.823/lab2/data/accesses/crafty_base.x86_linux_______crafty.in b/src/6.823/lab2/data/accesses/crafty_base.x86_linux_______crafty.in
@@ -0,0 +1 @@
+3064732952,1859867327,2989898421,1840480311
diff --git a/src/6.823/lab2/data/accesses/equake_base.none_______inp.in b/src/6.823/lab2/data/accesses/equake_base.none_______inp.in
@@ -0,0 +1 @@
+939578800,192647960,929113784,189081176
diff --git a/src/6.823/lab2/data/accesses/gap_base.x86_linux___-l___input___-q___-m___64M_______test.in b/src/6.823/lab2/data/accesses/gap_base.x86_linux___-l___input___-q___-m___64M_______test.in
@@ -0,0 +1 @@
+345388928,162808145,315185468,155618601
diff --git a/src/6.823/lab2/data/accesses/gzip_base.x86_linux___input.compressed___2 b/src/6.823/lab2/data/accesses/gzip_base.x86_linux___input.compressed___2
@@ -0,0 +1 @@
+862235709,457216222,741334115,399995683
diff --git a/src/6.823/lab2/data/accesses/mesa_base.none___-frames___10___-meshfile___mesa.in b/src/6.823/lab2/data/accesses/mesa_base.none___-frames___10___-meshfile___mesa.in
@@ -0,0 +1 @@
+1574529275,779095614,1451143066,711838243
diff --git a/src/6.823/lab2/data/accesses/parser_base.x86_linux___2.1.dict___-batch_______test.in b/src/6.823/lab2/data/accesses/parser_base.x86_linux___2.1.dict___-batch_______test.in
@@ -0,0 +1 @@
+1276070662,723113942,1186585565,716996086
diff --git a/src/6.823/lab2/data/accesses/swim_base.none_______swim.in b/src/6.823/lab2/data/accesses/swim_base.none_______swim.in
@@ -0,0 +1 @@
+587422298,55983171,587403253,55979430
diff --git a/src/6.823/lab2/figs/unaligned_accesses.tex b/src/6.823/lab2/figs/unaligned_accesses.tex
@@ -0,0 +1,29 @@
+\begin{tabular}{lrrr}
+\hline
+Test & \% unaligned reads & \% unaligned writes & \% unaligned accesses \\
+\hline
+applu & 0.019\% & 0.047\% & 0.023\%  \\
+\hline
+art & 0.013\% & 0.050\% & 0.019\%  \\
+\hline
+bzip2 & 24.491\% & 36.827\% & 28.971\%  \\
+\hline
+cc1 & 4.994\% & 1.413\% & 3.852\%  \\
+\hline
+crafty & 2.442\% & 1.042\% & 1.913\%  \\
+\hline
+equake & 1.114\% & 1.851\% & 1.239\%  \\
+\hline
+gap & 8.745\% & 4.416\% & 7.358\%  \\
+\hline
+gzip & 14.022\% & 12.515\% & 13.500\%  \\
+\hline
+mesa & 7.836\% & 8.633\% & 8.100\%  \\
+\hline
+parser & 7.013\% & 0.846\% & 4.782\%  \\
+\hline
+swim & 0.003\% & 0.007\% & 0.004\%  \\
+\hline
+{}Averages & 6.426\% & 6.150\% & 6.342\%  \\
+\hline
+\end{tabular}
diff --git a/src/master.tex b/src/master.tex
@@ -5,9 +5,9 @@
 \newcommand{\PsetAuthorEmail}{costan@mit.edu}
 
 %% Pset class and instance information.
-\newcommand{\PsetTitle}{Lab 1}
+\newcommand{\PsetTitle}{Lab 2}
-\newcommand{\PsetDueDate}{October 7, 2009}
+\newcommand{\PsetDueDate}{October 28, 2009}
-\newcommand{\PsetMainFile}{6.823/lab1/all.tex}
+\newcommand{\PsetMainFile}{6.823/lab2/all.tex}
 \newcommand{\PsetClassMetadata}{6.823/metadata.tex}
 
 %%