Idiot

mejibyte · Oct 16, 2011 · 0d00389
1 parent c1f6308
commit 0d00389

Showing 3 changed files with 43 additions and 1 deletion.
diff --git a/public/colombian-national-programming-contests/2011/index.html b/public/colombian-national-programming-contests/2011/index.html
@@ -28,7 +28,7 @@ <h1>Solutions to problems from Colombian National Programming Contest 2011</h1>
 
 <div class="author">
   <p class="name">by Andrés Mejía</p>
-  <p class="date">October 15, 2011</p>
+  <p class="date">October 16, 2011</p>
 </div>
 
 <p><a name="contents"></a></p>
@@ -60,6 +60,48 @@ <h2>Solution to problem B - Sewing Buttons with Grandma</h2>
 
 <h2>Solution to problem C - Document Compression</h2>
 
+<p>The key to solving this problem is noticing that there are at most 16 different terms among all possible documents. This is good news because there are only <code>2^16 = 65536</code> different subsets of terms, and every document we want to encode is one of these subsets. Since this is a small number, we can simply precompute how many basis documents we need to form each of the 65536 possible subsets and then look up the answer for each document we want to encode in <code>O(1)</code>.</p>
+
+<p>When working with subsets, it&#39;s usually very helpful to use bitwise operations. We will represent a subset of terms <code>S</code> with a single integer <code>x</code>, where the <code>i</code>-th bit of <code>x</code> (counting from 0 at the least significant bit) is 1 if term <code>i+1</code> is present in <code>S</code> and 0 otherwise. For example, if <code>S = {2, 3, 7, 15, 16}</code> then <code>x = 1100000001000110</code> in binary:</p>
+
+<p><img src="../images/binary_representation_of_sets.png" alt="Binary representation of sets"></p>
+
+<p>This is useful because we can find the union of two subsets with a single bitwise <code>or</code>, which is blazingly fast and simple (we can also find the intersection with a bitwise <code>and</code>, but we don&#39;t need that in this problem).</p>
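+<p>As a rough illustration of this representation, here is a minimal sketch. The helper name <code>toMask</code> is hypothetical (it is not part of the original solution), and it assumes terms are numbered from 1 to 16 as described above.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the original solution): packs a set of
// 1-based term numbers into a bitmask, so that term t sets bit t-1.
// Assumes every term is in the range 1..16.
std::uint32_t toMask(const std::vector<int>& terms) {
    std::uint32_t mask = 0;
    for (int t : terms) {
        assert(t >= 1 && t <= 16);
        mask |= 1u << (t - 1);
    }
    return mask;
}

// The union of two subsets is a single bitwise OR of their masks.
std::uint32_t unionOf(std::uint32_t a, std::uint32_t b) {
    return a | b;
}
```

+<p>With this sketch, <code>toMask({2, 3, 7, 15, 16})</code> yields the integer whose binary form is <code>1100000001000110</code>, and <code>unionOf(toMask({1, 3}), toMask({2, 4}))</code> yields the mask for <code>{1, 2, 3, 4}</code>.</p>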
+
+<p>Now, how do we find the minimum number of basis documents needed to form each possible subset of terms? Let&#39;s consider a graph where each node is a subset of terms and there&#39;s an edge from node <code>u</code> to node <code>v</code> if we can mix <code>u</code> with a basis document and get <code>v</code>. It&#39;s easier to explain with an example, so let&#39;s imagine we have the following basis documents:</p>
+
+<pre>
+  b[0] = {1}
+  b[1] = {1, 3}
+  b[2] = {2, 4}
+  b[3] = {1, 2, 3}
+</pre>
+
+<p>The graph we&#39;re talking about would look something like this (the index of the actual basis document that was used on each edge is shown in red):</p>
+
+<p><img src="../images/document_graph.png" alt="Document graph"></p>
+
+<p>Every path in this graph from <code>0000</code> to any node <code>v</code> represents a subset of basis documents that were chosen and mixed together to encode document <code>v</code>. Since we want to use the least possible number of basis documents, the answer is simply the length of the shortest path in this graph (starting from <code>0000</code>).</p>
+
+<p>For example, in the graph above we can see that the shortest path from <code>0000</code> to <code>0001</code> has length 1. This means that we can form the document <code>{1}</code> using a single basis document (indeed, we just need <code>b[0]</code>). There are three different paths to <code>0111</code>; the shortest one has length 1. This means that we can form the document <code>{1, 2, 3}</code> with a single basis document (indeed, we just need <code>b[3]</code>). There are several paths to <code>1111</code>; the shortest one has length 2 (indeed, we can form <code>{1, 2, 3, 4}</code> by mixing two basis documents, <code>b[2]</code> and <code>b[3]</code>, or <code>b[1]</code> and <code>b[2]</code>). There is no path to <code>0110</code>. This means we cannot form <code>{2, 3}</code> no matter how hard we try.</p>
+
+<p>Since we have a directed graph where all edges have the same length, we can use a classical algorithm known as Breadth First Search (BFS) to find the shortest path from the starting node <code>0000</code> to all others.</p>
+
+<p>It&#39;s worth noting that we don&#39;t really need to explicitly build the graph above. We can just build it on the fly as we traverse it.</p>
+
+<p>Here&#39;s a sample implementation in C++:</p>
+
+<pre class="brush: cpp">
+  Coming soon
+</pre>
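+<p>While the implementation above is pending, the BFS described in this section could be sketched roughly as follows. This is not the author&#39;s code: the function name <code>minBasisDocuments</code>, its parameters, and the use of <code>-1</code> for unreachable subsets are all assumptions made for illustration, and input parsing is left out. The graph is never stored; each edge <code>u → u | b</code> is generated on the fly.</p>

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Illustrative sketch only (names and signature are assumptions).
// basis:    bitmasks of the basis documents, each within numTerms bits
// numTerms: number of distinct terms, at most 16
// Returns dist, where dist[s] is the minimum number of basis documents
// whose union is exactly subset s, or -1 if s cannot be formed.
std::vector<int> minBasisDocuments(const std::vector<int>& basis, int numTerms) {
    assert(numTerms >= 0 && numTerms <= 16);
    const int total = 1 << numTerms;
    std::vector<int> dist(total, -1);
    std::queue<int> pending;

    dist[0] = 0;      // the empty subset needs no basis documents
    pending.push(0);

    while (!pending.empty()) {
        int u = pending.front();
        pending.pop();
        for (int b : basis) {
            assert(b >= 0 && b < total);
            int v = u | b;            // mix subset u with basis document b
            if (dist[v] == -1) {      // first visit is the shortest in BFS
                dist[v] = dist[u] + 1;
                pending.push(v);
            }
        }
    }
    return dist;
}
```

+<p>For the four example basis documents above, encoded as <code>0x1</code>, <code>0x5</code>, <code>0xA</code> and <code>0x7</code>, this sketch gives 1 for <code>0001</code> and <code>0111</code>, 2 for <code>1111</code>, and <code>-1</code> for <code>0110</code>, matching the walkthrough of the graph.</p>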
+
+<h3>Exercises</h3>
+
+<ul>
+<li>Modify the algorithm above so that it reports not only the minimum number of basis documents needed, but also which basis documents are used.</li>
+<li>Modify the algorithm above to calculate in how many different ways you can form a given document.</li>
+</ul>
+
 <p><a name="solution-d"></a></p>
 
 <h2>Solution to problem D - Digital Roulette</h2>

diff --git a/...olombian-national-programming-contests/images/binary_representation_of_sets.png b/...olombian-national-programming-contests/images/binary_representation_of_sets.png
diff --git a/public/colombian-national-programming-contests/images/document_graph.png b/public/colombian-national-programming-contests/images/document_graph.png