<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head profile="http://gmpg.org/xfn/11">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<title>Martin Kleppmann&rsquo;s blog</title>
<link rel="stylesheet" type="text/css" media="screen, print, handheld" href="/css/typography.css" />
<link rel="stylesheet" type="text/css" media="screen" href="/css/style.css" />
<link rel="stylesheet" type="text/css" media="all" href="/css/pygments-default.css" />
<link rel="stylesheet" type="text/css" media="all" href="/css/ansi2html.css" />
<link rel="stylesheet" type="text/css" media="all" href="/css/customizations.css?2" />
<!--[if lt IE 8]>
<link rel="stylesheet" href="/css/ie.css" type="text/css" media="screen" charset="utf-8" />
<![endif]-->
<link rel="alternate" type="application/rss+xml" title="RSS" href="http://feeds.feedburner.com/martinkl?format=xml" />
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.4/jquery.min.js" type="text/javascript"></script>
<script src="http://downloads.mailchimp.com/js/jquery.form-n-validate.js" type="text/javascript"></script>
<script src="/form.js" type="text/javascript"></script>
</head>
<body class="wordpress">
<div id="page">
<p id="top">
<a id="to-content" href="#content" title="Skip to content">Skip to content</a>
</p>
<div id="header">
<div class="wrapper">
<strong id="blog-title">
<a href="/" title="Home" rel="home">Martin Kleppmann</a>
</strong>
<p id="blog-description">Entrepreneurship, web technology and the user experience</p>
</div>
</div>
<div id="sub-header">
<div class="wrapper">
<div id="navigation">
<ul>
<li class="page_item"><a href="/contact.html" title="About/Contact">About/Contact</a></li>
</ul>
</div>
</div>
</div>
<hr class="divider" />
<div class="wrapper">
<div id="content">
<div class="hentry full p1 post publish">
<h1 class="entry-title full-title">
<a href="/2012/10/08/complexity-of-user-experience.html" rel="bookmark">The complexity of user experience</a>
</h1>
<div class="entry-content full-content">
<p>The problem of overly complex software is nothing new; it is almost as old as software itself. Over and over again, software systems become so complex that they become very difficult to maintain and very time-consuming and expensive to modify. Most developers hate working on such systems, yet nevertheless we keep creating new, overly complex systems all the time.</p>
<p>Much has been written about this, including classic papers by Fred Brooks (<a href='http://people.eecs.ku.edu/~saiedian/Teaching/Sp08/816/Papers/Background-Papers/no-silver-bullet.pdf'>No Silver Bullet</a>), and Ben Moseley and Peter Marks (<a href='http://shaffner.us/cs/papers/tarpit.pdf'>Out of the Tar Pit</a>). They are much more worth reading than this post, and it is presumptuous of me to think I could add anything significant to this debate. But I will try nevertheless.</p>
<p>Pretty much everyone agrees that if you have a choice between a simpler software design and a more complex design, all else being equal, simpler is better. It is also widely thought to be worthwhile to deliberately invest in simplicity &#8212; for example, to spend effort refactoring existing code into a cleaner design &#8212; because the one-off cost of refactoring today is easily offset by the benefits of easier maintenance tomorrow. Also, much thought by many smart people has gone into finding ways of breaking down complex systems into manageable parts with manageable dependencies. I don&#8217;t wish to dispute any of that.</p>
<p>But there is a subtlety that I have been missing in discussions about software complexity, that I feel somewhat ambivalent about, and that I think is worth discussing. It concerns the points where external humans (people outside of the team maintaining the system) touch the system &#8212; as developers using an API exposed by the system, or as end users interacting with a user interface. I will concentrate mostly on user interfaces, but much of this discussion applies to APIs too.</p>
<h2 id='examples'>Examples</h2>
<p>Let me first give a few examples, and then try to extract a pattern from them. They are examples of situations where, if you want, you can go to substantial engineering effort in order to make a user interface a little bit nicer. (Each example is based on a true story!)</p>
<ul>
<li>You have an e-commerce site, and need to send out order confirmation emails that explain next steps to the customer. Those next steps differ depending on availability, the tax status of the product, the location of the customer, the type of account they have, and a myriad other parameters. You want the emails to only include the information that is applicable to this particular customer&#8217;s situation, and not burden them with edge cases that don&#8217;t apply to them. You also want the emails to read as coherent prose, not as a bunch of fragmented bullet points generated by <code>if</code> statements based on the order parameters. So you go and build a natural language grammar model for constructing emails based on sentence snippets (providing pluralisation, agreement, declension in languages that have it, etc), in such a way that for any one out of 100 million possible parameter combinations, the resulting email is grammatically correct and easy to understand.</li>
<li>You have a multi-step user flow that is used in various different contexts, but ultimately achieves the same thing in each context. (For example, <a href='http://rapportive.com/'>Rapportive</a> has several OAuth flows for connecting your account with various social networks, and there are several different buttons in different places that all lead into the same user flow.) The simple solution is to make the flow generic, and not care how the user got there. But if you want to make the user feel good, you need to imagine what state their mind was in when they entered the flow, and customise the images, text and structure of the flow in order to match their goal. This means you have to keep track of where the user came from, what they were trying to do, and thread that context through every step of the flow. This is not fundamentally hard, but it is fiddly, time-consuming and error-prone.</li>
<li>You have an application that requires some arcane configuration. You could take the stance that you will give the user a help page and they will have to figure it out from there. Or you could write a sophisticated auto-configuration tool that inspects the user&#8217;s environment, analyses thousands of possible software combinations and configurations (and updates this database as new versions of other products in the environment are released), and automatically chooses the correct settings &#8212; hopefully without having to ask the user for help. With auto-configuration, the users never even know that they were spared a confusing configuration dialog. But somehow, word gets around that the product &#8220;just works&#8221;.</li>
</ul>
<h2 id='whats_a_user_requirement'>What&#8217;s a user requirement?</h2>
<p>We said above that simplicity is good. However, taking simplicity to an exaggerated extreme, you end up with software that does nothing. This implies that there are aspects of software complexity that are <strong>essential</strong> to the user&#8217;s problem that is being solved. (Note that I don&#8217;t mean complexity of the user interface, but complexity of the actual code that implements the solution to the user&#8217;s problem.)</p>
<p>Unfortunately, there is a lot of additional complexity introduced by stuff that is not directly visible or useful to users: stuff that is only required to &#8220;grease the wheels&#8221;, for example to make legacy components work or to improve performance. Moseley and Marks call this latter type <strong>accidental</strong> complexity, and argue that it should be removed or abstracted away as much as possible. (Other authors define essential and accidental complexity slightly differently, but the exact definition is not important for the purpose of this post.)</p>
<p>This suggests that it is important to understand what <strong>user problem</strong> is being solved, and that&#8217;s where things start getting tricky. When you say that something is essential because it fulfils a <strong>user requirement</strong> (as opposed to an implementation constraint or a performance optimisation), that presupposes a very utilitarian view of software. It assumes that the user is trying to get a job done, and that they are a rational actor. But what if, say, you are taking an emotional approach and optimising for <strong>user delight</strong>?</p>
<p>What if the user didn&#8217;t know they had a problem, but you solve it anyway? If you introduce complexity in the system for the sake of making things a little nicer for the user (but without providing new core functionality), is that complexity really essential? What if you add a little detail that is surprising but delightful?</p>
<p>You can try to reduce an emotional decision down to a rational one &#8212; for example, you can say that when a user plays a game, it is solving the user&#8217;s problem of boredom by providing distraction. Thus any feature which substantially contributes towards alleviating boredom may be considered essential. Such reductionism can sometimes provide useful angles of insight, but I think a lot would be lost by ignoring the emotional angle.</p>
<p>You can state categorically that &#8220;great user experience is an essential feature&#8221;. But what does that mean? By itself, that statement is so general that it could be used to argue for anything or nothing. User experience is subjective. What&#8217;s preferable for one user may be an annoyance for another user, even if both users are in the application&#8217;s target segment. Sometimes it just comes down to taste or fashion. User experience tends to have an emotional angle that makes it hard to fit into a rational reasoning framework.</p>
<p>What I am trying to get at: there are things in software that introduce a lot of complexity (and that we should consequently be wary of), and that can&#8217;t be directly mapped to a bullet point on a list of user requirements, but that are nevertheless important and valuable. These things do not necessarily provide important functionality, but they contribute to how the user <strong>feels</strong> about the application. Their effect may be invisible or subconscious, but that doesn&#8217;t make them any less essential.</p>
<h2 id='datadriven_vs_emotional_design'>Data-driven vs. emotional design</h2>
<p>Returning to the examples above: as an application developer, you can choose whether to take on substantial additional complexity in the software in order to simplify or improve the experience for the user. The increased software complexity actually <strong>reduces</strong> the complexity from the user&#8217;s point of view. These examples also illustrate how user experience concerns are not just a matter of graphic design, but can also have a big impact on how things are engineered.</p>
<p>The features described above arguably do not contribute to the utility of the software &#8212; in the e-commerce example, orders will be fulfilled whether or not the confirmation emails are grammatical. In that sense, the complexity is unnecessary. But I would argue that these kinds of user experience improvements are just as important as the utility of the product, because they determine how users <strong>feel</strong> about it. And how they feel ultimately determines whether they come back, and thus the success or failure of the product.</p>
<p>One could even argue that the utility of a product is a subset of its user experience: if the software doesn&#8217;t do the job that it&#8217;s supposed to, then that&#8217;s one way of creating a pretty bad experience; however, there are also many other ways of creating a bad experience, while remaining fully functional from a utilitarian point of view.</p>
<p>The emotional side of user experience can be a difficult thing for organisations to grapple with, because it doesn&#8217;t easily map to metrics. You can measure things like how long a user stayed on your site, how many things they clicked on, conversion rates, funnels, repeat purchase rates, lifetime values&#8230; but those numbers tell you very little about how happy you made a user. So you can take a &#8220;data-driven&#8221; approach to design decisions and say that a feature is worthwhile if and only if it makes the metrics go up &#8212; but I fear that an important side of the story is missed if you go solely by the numbers.</p>
<h2 id='questions'>Questions</h2>
<p>This is as far as my thinking has got: believing that a great user experience is essential for many products; and recognising that building a great UX is hard, can require substantial additional complexity in engineering, and can be hard to justify in terms of logical arguments and metrics. Which leaves me with some unanswered questions:</p>
<ul>
<li>Every budget is finite, so you have to prioritise things, and not everything will get done. When you consider building something that improves user experience without strictly adding utility, it has to be traded off against features that do add utility (is it better to shave a day off the delivery time than to have a nice confirmation email?), and the cost of the increased complexity (will that clever email generator be a nightmare to localise when we translate the site into other languages?). How do you decide on that kind of trade-off?</li>
<li>User experience choices are often emotional and <a href='http://martin.kleppmann.com/2010/10/30/intuition-has-no-transfer-encoding.html'>intuitive</a> (no number of focus groups and usability tests can replace good taste). That doesn&#8217;t make them any more or less important than rational arguments, but combining emotional and rational arguments can be tricky. Emotionally-driven people tend to let emotional choices overrule rational arguments, and rationally-driven people vice versa. How do you find the healthy middle ground?</li>
<li>If you&#8217;re aiming for a minimum viable product in order to test out a market (as opposed to improving a mature product), does that change how you prioritise core utility relative to &#8220;icing on the cake&#8221;?</li>
</ul>
<p>I suspect that the answers to the questions above are <em>&#8220;it depends&#8221;</em>. More precisely, <em>&#8220;how one thing is valued relative to another is an aspect of your particular organisation&#8217;s culture, and there&#8217;s no one right answer&#8221;</em>. That would imply that each of us should think about it; you should have your own personal answers for how you decide these things in your own projects, and be able to articulate them. But it&#8217;s difficult &#8212; I don&#8217;t think hard-and-fast rules have a chance of working here.</p>
<p>I&#8217;d love to hear your thoughts in the comments below. If you liked this post, you can <a href='http://eepurl.com/csJmf'>subscribe to email notifications</a> when I write something new :)</p>
</div>
<div class="by-line">
<address class="author vcard full-author">
<span class="by">By</span> <a class="url fn" href="http://martin.kleppmann.com/">Martin Kleppmann</a>
</address>
<span class="date full-date">
<abbr class="published" title="2012-10-08T00:00:00-07:00">08 Oct 2012</abbr>
</span>
</div>
<p class="comments-link">
<a href="/2012/10/08/complexity-of-user-experience.html#disqus_thread">Comments</a>
</p>
<div class="clear"></div>
</div>
<div class="hentry full p1 post publish">
<h1 class="entry-title full-title">
<a href="/2012/10/01/rethinking-caching-in-web-apps.html" rel="bookmark">Rethinking caching in web apps</a>
</h1>
<div class="entry-content full-content">
<p>Having spent a lot of the last few years worrying about the scalability of data-heavy applications like <a href='http://rapportive.com/'>Rapportive</a>, I have started to get the feeling that maybe we have all been &#8220;doing it wrong&#8221;. Maybe what we consider to be &#8220;state of the art&#8221; application architecture is actually holding us back.</p>
<p>I don&#8217;t have a definitive answer for how we should be architecting things differently, but in this post I&#8217;d like to outline a few ideas that I have been fascinated by recently. My hope is that we can develop ways of better managing scale (in terms of complexity, volume of data and volume of traffic) while keeping our applications nimble, easy and safe to modify, test and iterate.</p>
<p>My biggest problem with web application architecture is how <strong>network communication concerns</strong> are often intermingled with <strong>business logic concerns</strong>. This makes it hard to rearrange the logic into new architectures, such as the precomputed cache architecture described below. In this post I explore why it is important to be able to try new architectures for things like caching, and what it would take to achieve that flexibility.</p>
<h2 id='an_example'>An example</h2>
<p>To illustrate, consider the clichéd Rails blogging engine example:</p>
<div class='highlight'><pre><code class='ruby'><span class='k'>class</span> <span class='nc'>Post</span> <span class='o'>&lt;</span> <span class='no'>ActiveRecord</span><span class='o'>::</span><span class='no'>Base</span>
<span class='n'>attr_accessible</span> <span class='ss'>:title</span><span class='p'>,</span> <span class='ss'>:content</span><span class='p'>,</span> <span class='ss'>:author</span>
<span class='n'>has_many</span> <span class='ss'>:comments</span>
<span class='k'>end</span>
<span class='k'>class</span> <span class='nc'>Comment</span> <span class='o'>&lt;</span> <span class='no'>ActiveRecord</span><span class='o'>::</span><span class='no'>Base</span>
<span class='n'>attr_accessible</span> <span class='ss'>:content</span><span class='p'>,</span> <span class='ss'>:author</span>
<span class='n'>belongs_to</span> <span class='ss'>:post</span>
<span class='k'>end</span>
<span class='k'>class</span> <span class='nc'>PostsController</span> <span class='o'>&lt;</span> <span class='no'>ApplicationController</span>
<span class='k'>def</span> <span class='nf'>show</span>
<span class='vi'>@post</span> <span class='o'>=</span> <span class='no'>Post</span><span class='o'>.</span><span class='n'>find</span><span class='p'>(</span><span class='n'>params</span><span class='o'>[</span><span class='ss'>:id</span><span class='o'>]</span><span class='p'>)</span>
<span class='n'>respond_to</span> <span class='k'>do</span> <span class='o'>|</span><span class='nb'>format</span><span class='o'>|</span>
<span class='nb'>format</span><span class='o'>.</span><span class='n'>html</span> <span class='c1'># show.html.erb</span>
<span class='nb'>format</span><span class='o'>.</span><span class='n'>json</span> <span class='p'>{</span> <span class='n'>render</span> <span class='ss'>:json</span> <span class='o'>=&gt;</span> <span class='vi'>@post</span> <span class='p'>}</span>
<span class='k'>end</span>
<span class='k'>end</span>
<span class='k'>end</span>
<span class='c1'># posts/show.html.erb:</span>
</code></pre>
</div>
<p><div class='highlight'><pre><code class='rhtml'><span class='nt'>&lt;h1&gt;</span><span class='cp'>&lt;%=</span> <span class='vi'>@post</span><span class='o'>.</span><span class='n'>title</span> <span class='cp'>%&gt;</span><span class='nt'>&lt;/h1&gt;</span>
<span class='nt'>&lt;p</span> <span class='na'>class=</span><span class='s'>&quot;author&quot;</span><span class='nt'>&gt;</span>By <span class='cp'>&lt;%=</span> <span class='vi'>@post</span><span class='o'>.</span><span class='n'>author</span> <span class='cp'>%&gt;</span><span class='nt'>&lt;/p&gt;</span>
<span class='nt'>&lt;div</span> <span class='na'>class=</span><span class='s'>&quot;content&quot;</span><span class='nt'>&gt;</span>
<span class='cp'>&lt;%=</span> <span class='n'>simple_format</span><span class='p'>(</span><span class='vi'>@post</span><span class='o'>.</span><span class='n'>content</span><span class='p'>)</span> <span class='cp'>%&gt;</span>
<span class='nt'>&lt;/div&gt;</span>
<span class='nt'>&lt;h2&gt;</span>Comments<span class='nt'>&lt;/h2&gt;</span>
<span class='nt'>&lt;ul</span> <span class='na'>class=</span><span class='s'>&quot;comments&quot;</span><span class='nt'>&gt;</span>
<span class='cp'>&lt;%</span> <span class='vi'>@post</span><span class='o'>.</span><span class='n'>comments</span><span class='o'>.</span><span class='n'>each</span> <span class='k'>do</span> <span class='o'>|</span><span class='n'>comment</span><span class='o'>|</span> <span class='cp'>%&gt;</span>
<span class='nt'>&lt;li&gt;</span>
<span class='nt'>&lt;blockquote&gt;</span><span class='cp'>&lt;%=</span> <span class='n'>simple_format</span><span class='p'>(</span><span class='n'>comment</span><span class='o'>.</span><span class='n'>content</span><span class='p'>)</span> <span class='cp'>%&gt;</span><span class='nt'>&lt;/blockquote&gt;</span>
<span class='nt'>&lt;p</span> <span class='na'>class=</span><span class='s'>&quot;author&quot;</span><span class='nt'>&gt;</span><span class='cp'>&lt;%=</span> <span class='n'>comment</span><span class='o'>.</span><span class='n'>author</span> <span class='cp'>%&gt;</span><span class='nt'>&lt;/p&gt;</span>
<span class='nt'>&lt;/li&gt;</span>
<span class='cp'>&lt;%</span> <span class='k'>end</span> <span class='cp'>%&gt;</span>
<span class='nt'>&lt;/ul&gt;</span>
</code></pre>
</div></p>
<p>Pretty good code by various standards, but it has always irked me a bit that I can&#8217;t see where the network communication (i.e. making database queries) is happening. When I look at that <code>Post.find</code> in the controller, I can guess that it probably translates into a <code>SELECT * FROM posts WHERE id = ?</code> internally &#8211; unless the same query was already made recently, and ActiveRecord cached the result. And another database query of the form <code>SELECT * FROM comments WHERE post_id = ?</code> might be made as a result of the <code>@post.comments</code> call in the template. Or maybe the comments were already previously loaded by some model logic, and then cached? Or someone decided to eagerly load comments with the original post? Who knows.</p>
<p>The execution flow for an MVC framework request like <code>PostsController#show</code> probably looks something like this:</p>
<p><a href='/2012/10/architecture-high-01.png'><img alt='Typical MVC request flow' height='119' src='/2012/10/architecture-01.png' width='550' /></a></p>
<p>Of course it is deliberately designed that way. Your template and your controller shouldn&#8217;t have to worry about database queries &#8212; those are encapsulated by the model for many good reasons. I am violating abstraction by even thinking about the database whilst I&#8217;m in the template code! I should just think of my models as pure, beautiful pieces of application state. How that state gets loaded from a database is a matter that only the models need to worry about.</p>
<h2 id='adding_complexity'>Adding complexity</h2>
<p>In the example above, the amount of logic in the model is minimal, but it typically doesn&#8217;t stay that way for long. As the application becomes popular (say, the blogging engine morphs to become Twitter, Tumblr, Reddit or Pinterest), all sorts of stuff gets added: memcache to stop the database from falling over, spam filtering, analytics features, email sending, notifications, A/B testing, more memcache, premium features, ads, upsells for viral loops, more analytics, even more memcache. As the application inevitably grows in complexity, the big monolithic beast is split into several smaller services, and different services end up being maintained by different teams.</p>
<p>As all of this is happening, the programming model typically stays the same: each service in the architecture (which may be a user-facing web server, or an internal service e.g. for user authentication) communicates over the network with a bunch of other nodes (memcached instances, database servers, other application services), processes and combines the data in some way, and then serves it out to a client.</p>
<p>That processing and combining of data we can abstractly call &#8220;business logic&#8221;. It might be trivially simple, or it might involve half a million lines of parsing, rendering or machine learning code. It might behave differently depending on which A/B test bucket the user is in. It might deal with hundreds of hairy edge cases. Whatever.</p>
<p>At the root of the matter, business logic should be a <a href='http://en.wikipedia.org/wiki/Pure_function'>pure function</a>. It takes a bunch of inputs (request parameters from the client, data stored in various databases and caches, responses from various other services) and produces a bunch of outputs (data to return to the client, data to write back to various databases and caches). It is usually deterministic: given the same inputs, the business logic should produce exactly the same output again. It is also stateless: any data that is required to produce the output or to make a decision has to be provided as an input.</p>
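<p>As a rough sketch of what this could look like for the blog example: the function below takes plain hashes as input and returns a string, without touching a database or cache. The name <code>render_post</code> and the hash layout are assumptions made for illustration; they are not part of Rails.</p>

```ruby
require 'erb'

# Hypothetical pure rendering function: every input is passed in explicitly,
# and the only output is the returned string. No network access happens here.
def render_post(post, comments)
  items = comments.map do |c|
    "<li><blockquote>#{ERB::Util.html_escape(c[:content])}</blockquote>" \
      "<p class=\"author\">#{ERB::Util.html_escape(c[:author])}</p></li>"
  end
  "<h1>#{ERB::Util.html_escape(post[:title])}</h1>" \
    "<p class=\"author\">By #{ERB::Util.html_escape(post[:author])}</p>" \
    "<ul class=\"comments\">#{items.join}</ul>"
end

# Example inputs (made up for illustration).
post = { title: 'Hello', author: 'Alice' }
comments = [{ content: 'Nice & short', author: 'Bob' }]
html = render_post(post, comments)
```

<p>Because the function depends only on its arguments, calling it again with the same hashes yields the same string, whether it runs inside a web request or a batch job.</p>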
<p>By contrast, the network communication logic is all about &#8216;wiring&#8217;. It may end up having a lot of complexity in its own right: sending requests to the right node of a sharded database, retrying failed requests with exponential back-off, making requests to different services in parallel, cross-datacenter failover, service authentication, etc. But the network communication logic ought to be general-purpose and completely independent of your application&#8217;s business logic.</p>
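<p>To illustrate how such wiring can stay independent of the application, here is a minimal sketch of a retry helper with exponential back-off. The name <code>with_retries</code>, its defaults and the injectable <code>sleeper</code> are assumptions for this example, not an existing library API.</p>

```ruby
# Generic retry helper with exponential back-off. It knows nothing about
# posts or comments; the caller supplies the operation as a block.
# `sleeper` is injectable so the helper can be exercised without waiting.
def with_retries(max_attempts: 3, base_delay: 0.1, sleeper: ->(s) { sleep(s) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    raise if attempt >= max_attempts                # give up: re-raise last error
    sleeper.call(base_delay * (2**(attempt - 1)))   # 0.1, 0.2, 0.4, ...
    retry
  end
end

# Usage: simulate an operation that fails twice, then succeeds.
delays = []
calls = 0
result = with_retries(sleeper: ->(s) { delays << s }) do |n|
  calls += 1
  raise IOError, 'simulated network failure' if n < 3
  'ok'
end
```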
<p>Both business logic and network communication logic are needed to build a service. But how do you combine the two into a single process? Most commonly, we build abstractions for each type of logic, hiding the gory implementation details. Much like in the blog example above, you end up calling a method somewhere inside the business logic, not really knowing or caring whether it will immediately return a value that the object has already computed, or whether it will talk to another process on the same machine, or load the value from some remote cache, or make a query on a database cluster somewhere.</p>
<p>It&#8217;s good that the business logic doesn&#8217;t need to worry about how and when the communication happens. And it&#8217;s good that the communication logic is general-purpose and not polluted with application-specific concerns. But I think it&#8217;s problematic that network communication may happen somewhere deeply inside a business logic call stack. Let me try to explain why.</p>
<h2 id='precomputed_caches'>Precomputed caches</h2>
<p>As your volume of data and your number of users grow, database access often becomes a bottleneck (there are more queries competing for I/O, and each query takes longer when there&#8217;s more data). The standard answer to the problem is of course caching. You can cache at many different levels: an individual database row, or a model object generated by combining several sources, or even an entire HTML page ready to serve to a client. I will focus on the mid-to-high-level caches, where the raw data has gone through some sort of business logic before it ends up in the cache.</p>
<p>Most commonly, caches are set up in read-through style: on every query, you first check the cache, and return the value from the cache if it&#8217;s a hit; otherwise it&#8217;s a miss, so you do whatever is required to generate the value (query databases, apply business logic, perform voodoo), and return it to the client whilst also storing it in the cache for next time. As long as you can generate the value on the fly in a reasonable time, this works pretty well.</p>
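<p>The read-through pattern can be sketched in a few lines. This is a simplified in-memory version in which a Hash stands in for memcached or similar; the class name <code>ReadThroughCache</code> is made up for this example.</p>

```ruby
# Minimal in-memory read-through cache. The loader block stands in for
# "whatever is required to generate the value" on a miss.
class ReadThroughCache
  def initialize(&loader)
    @loader = loader
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)  # hit: serve from the cache
    @store[key] = @loader.call(key)         # miss: generate, store, return
  end
end

# Usage: the loader records each call so we can see when it runs.
loads = []
cache = ReadThroughCache.new do |id|
  loads << id
  "rendered post #{id}"
end
first = cache.fetch(42)   # miss, runs the loader
second = cache.fetch(42)  # hit, loader is not called again
```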
<p>I will gloss over cache invalidation and expiry for now, and return to it below.</p>
<p>The most apparent problem with a read-through cache is that the first time a value is requested, it&#8217;s always slow. (And if your cache is too small to hold the entire dataset, rarely accessed values will get evicted and thus be slow every time.) That may or may not be a problem for you. One reason why it may be a problem is that on many sites, the first client to request a given page is typically the Googlebot, and Google <a href='http://www.mattcutts.com/blog/site-speed/'>penalises</a> slow sites in rankings. So if you have the kind of site where Google juice is lifeblood, then your SEO guys may tell you that a read-through cache is not good enough.</p>
<p>So, can you make sure that the data is in the cache even before it is requested for the first time? Well, if your dataset isn&#8217;t too huge, you can actually <strong>precompute every possible cache entry</strong>, put them in a big distributed key-value store and serve them with minimal latency. That has a great advantage: cache misses no longer exist. If you&#8217;ve precomputed every possible cache entry, and a key isn&#8217;t in the cache, you can be sure that there&#8217;s no data for that key.</p>
<p>If that sounds crazy to you, consider these points:</p>
<ul>
<li>A database index is a special case of a precomputed cache. For every value you might want to search for, the index tells you where to find occurrences of that value. If it&#8217;s not in the index, it&#8217;s not in the database. The initial index creation is a one-off batch job, and thereafter the database automatically keeps it in sync with the raw data. Yes, databases have been doing this for a long time.</li>
<li>With Hadoop you can process terabytes of data without breaking a sweat. That is truly awesome power.</li>
<li>There are several datastores that allow you to precompute their files in Hadoop, which makes them very well suited for serving the cache that you precomputed. We are currently using <a href='http://www.project-voldemort.com/voldemort/'>Voldemort</a> in read-only mode (<a href='http://static.usenix.org/events/fast12/tech/full_papers/Sumbaly.pdf'>research paper</a>), but <a href='http://hbase.apache.org/book/arch.bulk.load.html'>HBase</a> and <a href='https://github.com/nathanmarz/elephantdb'>ElephantDB</a> can do this too.</li>
<li>If you&#8217;re currently storing data in denormalized form (to avoid joins on read queries), you can stop doing that. You can keep your primary database in a very clean, normalized schema, and any caches you derive from it can denormalize the data to your heart&#8217;s content. This gives you the best of both worlds.</li>
</ul>
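<p>By way of contrast with a read-through cache, a precomputed cache could be sketched as follows. The batch step and the serving step are separate, and the serving step never falls back to the database. The function names and the <code>:not_found</code> marker are assumptions for illustration; a real setup would build the entries in Hadoop and serve them from a key-value store.</p>

```ruby
# Batch step: compute an entry for every key up front.
def precompute(posts)
  posts.each_with_object({}) do |post, cache|
    cache[post[:id]] = "page for #{post[:title]}"
  end.freeze
end

# Serving step: a plain lookup. A missing key means there is no data,
# so nothing is generated on the fly.
def serve(cache, id)
  cache.fetch(id) { :not_found }
end

cache = precompute([{ id: 1, title: 'First' }, { id: 2, title: 'Second' }])
found = serve(cache, 2)
missing = serve(cache, 99)
```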
<h2 id='separating_communication_from_business_logic'>Separating communication from business logic</h2>
<p>Ok, say you&#8217;ve decided that you want to precompute a cache in Hadoop. As we&#8217;ve not yet addressed cache invalidation (see below), let&#8217;s just say you&#8217;re going to rebuild the entire cache once a day. That means the data you serve out of the cache will be stale, out of date by up to a day, but that&#8217;s still acceptable for some applications.</p>
<p>The first step is to get your raw data into HDFS. That&#8217;s not hard, assuming you have daily database backups: you can take your existing backup, transform it into a more MapReduce-friendly format such as <a href='http://avro.apache.org/'>Avro</a>, and write it straight to HDFS. Do that with all your production databases and you&#8217;ve got a fantastic resource to work with in Hadoop.</p>
<p>Now, to build your precomputed cache, you need to apply the same business logic to the same data as you would in an uncached service that does it on the fly. As described above, your business logic takes as input the request parameters from the user and any data that is loaded from databases or services in order to serve that request. If you have all that data in HDFS, and you can work out all possible request parameters, then in theory, you should be able to take your existing business logic implementation and run it in Hadoop.</p>
<p>Business logic can be very complex, so you should probably aim to reuse the existing implementation rather than rewriting it. But doing so requires untangling the real business logic from all the network communication logic.</p>
<p>When your business logic is running as a service processing individual requests, you&#8217;re used to making several small requests to databases, caches or other services as part of generating a response (see the blog example above). Those small requests constitute gathering all the inputs needed by the business logic in order to produce its output (e.g. a rendered HTML page).</p>
<p>But when you&#8217;re running in Hadoop, this is all turned on its head. You don&#8217;t want to be making individual random-access requests to data, because that would be an order of magnitude too slow. Instead you need to use MapReduce to gather all the inputs for one particular evaluation of the business logic into one place, and then run the business logic given those inputs without any network communication. Rather than the business logic <em>pulling</em> together all the bits of data it needs in order to produce a response, the MapReduce job has already gathered all the data it knows the business logic is going to need, and <em>pushes</em> it into the business logic function.</p>
<p>Let&#8217;s use the blog example to make this more concrete. The data dependency is fairly simple: when the blog post <code>params[:id]</code> is requested, we require the row in the <code>posts</code> table whose <code>id</code> column matches the requested post, and we require all the rows in the <code>comments</code> table whose <code>post_id</code> column matches the requested post. If the <code>posts</code> and <code>comments</code> tables are in HDFS, it&#8217;s a very simple MapReduce job to group together the post with <code>id = x</code> and all the comments with <code>post_id = x</code>.</p>
<p>We can then use a stub database implementation to feed those database rows into the existing <code>Post</code> and <code>Comment</code> model objects. That way we can make the models think that they loaded the data from a database, even though we had actually already gathered all the data we knew they were going to need. The model objects can keep doing their job as normal, and the output they produce can be written straight to the cache.</p>
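<p>The grouping step can be sketched in plain Ruby, without Hadoop. This is only an in-memory illustration of the map and reduce phases; the method names <code>map_phase</code> and <code>reduce_phase</code> and the row layout are assumptions made for the example, not part of any real job.</p>

```ruby
# Map phase: tag every row and key it by the ID of the post it belongs to.
def map_phase(posts, comments)
  posts.map { |row| [row[:id], [:post, row]] } +
    comments.map { |row| [row[:post_id], [:comment, row]] }
end

# Shuffle and reduce phase: bring all rows with the same key into one place.
def reduce_phase(pairs)
  grouped = Hash.new { |hash, key| hash[key] = { post: nil, comments: [] } }
  pairs.each do |post_id, (kind, row)|
    if kind == :post
      grouped[post_id][:post] = row
    else
      grouped[post_id][:comments].push(row)
    end
  end
  grouped
end

# Hypothetical sample rows standing in for the tables stored in HDFS.
posts = [{ id: 1, title: 'Hello world' }]
comments = [{ id: 10, post_id: 1, body: 'First!' }]
inputs = reduce_phase(map_phase(posts, comments))
# inputs[1] now holds the post and its comments, ready for the business logic.
```

<p>In a real job the shuffle would be done by the framework; the point is only that each post ends up next to its comments before any business logic runs.</p>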
<p>By this point, two problems should be painfully clear:</p>
<ul>
<li>How does the MapReduce job know what inputs the business logic is going to need in order to work?</li>
<li>OMG, implementing stub database drivers, isn&#8217;t that a bit too much pain for limited gain? (Note that in testing frameworks it&#8217;s not unusual to stub out your database, so that you can run your unit tests without a real database. Still, it&#8217;s non-trivial and annoying.)</li>
</ul>
<p>Both problems have the same cause, namely that the network communication logic is triggered from deep inside the business logic.</p>
<h2 id='data_dependencies'>Data dependencies</h2>
<p>When you look at the business logic in the light of precomputing a cache, it seems like the following pattern would make more sense:</p>
<ol>
<li>Declare your data dependencies: &#8220;if you want me to render the blog post with ID <code>x</code>, I&#8217;m going to need the row in the <code>posts</code> table with <code>id = x</code>, and also all the rows in the <code>comments</code> table with <code>post_id = x</code>&#8221;.</li>
<li>Let the communication logic deal with resolving those dependencies. If you&#8217;re running as a normal web app, that means making database (or memcache) queries to one or more databases, and maybe talking to other services. If you&#8217;re running in Hadoop, it means configuring the MapReduce job to group together all the pieces of data on which the business logic depends.</li>
<li>Once all the dependencies have been loaded, the business logic is now a pure function, deterministic and side-effect-free, that produces our desired output. It can perform whatever complicated computation it needs to, but it&#8217;s not allowed access to the network or data stores that weren&#8217;t declared as dependencies up front.</li>
</ol>
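<p>The three steps above might look roughly like this in Ruby. It is a minimal sketch, not Flowquery: the declaration format and the names <code>post_page_dependencies</code>, <code>resolve_in_memory</code> and <code>render_post_page</code> are invented for illustration, and the resolver works on in-memory arrays where a real one would talk to a database or configure a MapReduce job.</p>

```ruby
# Step 1: declare the data dependencies as plain data.
def post_page_dependencies(post_id)
  {
    post:     { table: :posts,    where: { id: post_id } },
    comments: { table: :comments, where: { post_id: post_id } }
  }
end

# Step 2: a toy resolver. It is the only code that touches the "database".
def resolve_in_memory(tables, dependencies)
  dependencies.each_with_object({}) do |(name, dep), inputs|
    inputs[name] = tables.fetch(dep[:table]).select do |row|
      dep[:where].all? { |column, value| row[column] == value }
    end
  end
end

# Step 3: pure business logic. It sees only the resolved inputs.
def render_post_page(inputs)
  post = inputs[:post].first
  lines = ["# #{post[:title]}"]
  inputs[:comments].each { |comment| lines.push("- #{comment[:body]}") }
  lines.join("\n")
end

# Hypothetical sample data.
tables = {
  posts: [{ id: 1, title: 'Hello world' }],
  comments: [{ id: 10, post_id: 1, body: 'First!' }]
}
page = render_post_page(resolve_in_memory(tables, post_page_dependencies(1)))
```

<p>Because <code>render_post_page</code> never reaches out to a data store, the same function could be fed by a different resolver without changing a line of it.</p>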
<p>This separation would make application architecture very different from the way it is commonly done today. I think this new style would have several big advantages:</p>
<ul>
<li>By removing the assumption that the business logic is handling one request at a time, it becomes much easier to run the business logic in completely different contexts, such as in a batch job to precompute a cache. (No more stubbing out database drivers.)</li>
<li>Testing becomes much easier. All the tricky business logic for which you want to write unit tests is now just a function with a bunch of inputs and a bunch of outputs. You can easily vary what you put in, and easily check that the right thing comes out. Again, no more stubbing out the database.</li>
<li>The network communication logic can become a lot more helpful. For example, it can make several queries in parallel without burdening the business logic with a lot of complicated concurrency stuff, and it can deduplicate similar requests.</li>
<li>Because the data dependencies are very clearly and explicitly modelled, the system becomes easier to understand, and it becomes easier to move modules around, split a big monolithic beast into smaller services, or combine smaller services into bigger, logical units.</li>
</ul>
<p>I hope you agree that this is a very exciting prospect. But is it practical?</p>
<p>In most cases, I think it would not be very hard to make business logic pure (i.e. stop making database queries from deep within) &#8212; it&#8217;s mostly a matter of refactoring. I have done it to substantial chunks of the Rapportive code base, and it was a bit tedious but perfectly doable. And the network communication logic wouldn&#8217;t have to change much at all.</p>
<p>The problem of making this architecture practical hinges on having a good mechanism for declaring data dependencies. The idea is not new &#8212; for instance, LinkedIn have an internal framework for resolving data dependencies that queries several services in parallel &#8212; but I&#8217;ve not yet seen a language or framework that really gets to the heart of the problem.</p>
<p>Adapting the blog example above, this is what I imagine such an architecture would look like:</p>
<p><a href='/2012/10/architecture-high-02.png'><img alt='Concept for using a dependency resolver' height='119' src='/2012/10/architecture-02.png' width='550' /></a></p>
<p>We still have models, and they are still used as encapsulations of state, but they are no longer wrappers around a database connection. Instead, the dependency resolver can take care of the messy business of talking to the database; the models are pure and can focus on the business logic. The models don&#8217;t care whether they are instantiated in a web app or in a Hadoop cluster, and they don&#8217;t care whether the data was loaded from a SQL database or from HDFS. That&#8217;s the way it should be.</p>
<p>In my spare time I have started working on a language called <strong>Flowquery</strong> (don&#8217;t bother searching, there&#8217;s nothing online yet) to solve the problem of declaring data dependencies. If I can figure it out, it should make precomputed caches and all the good things above very easy. But it&#8217;s not there yet, so I don&#8217;t want to oversell it.</p>
<p>But wait, there is one more thing&#8230;</p>
<h2 id='cache_invalidation'>Cache invalidation</h2>
<blockquote>
<p>There are only two hard things in Computer Science: cache invalidation and naming things. &#8212; <a href='http://martinfowler.com/bliki/TwoHardThings.html'>Phil Karlton</a></p>
</blockquote>
<p>How important is it that the data in your cache is up-to-date and consistent with your &#8220;source of truth&#8221; database? The answer depends on the application and the circumstances. For example, if the user edits their own data, you almost certainly want to show them an up-to-date version of their own data post-editing, otherwise they will assume that your app is broken. But you might be able to get away with showing stale data to other users for a while. For data that is not directly edited by users, stale data may always be ok.</p>
<p>If staleness is acceptable, caching is fairly simple: on a read-through cache you set an expiry time on a cache key, and when that time is reached, the entry falls out of the cache. On a precomputed cache you do nothing, and just wait until the next time you recompute the entire thing.</p>
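<p>As a rough illustration of the read-through case, here is a toy in-process cache with an expiry time. The <code>read_through</code> helper and its parameters are assumptions for this sketch; a real deployment would more likely use memcached or similar.</p>

```ruby
# Return the cached value for key, recomputing it with the given block
# when the entry is missing or its expiry time has passed.
def read_through(cache, key, ttl_seconds, now = Time.now)
  entry = cache[key]
  if entry.nil? || now >= entry[:expires_at]
    entry = { value: yield, expires_at: now + ttl_seconds }
    cache[key] = entry
  end
  entry[:value]
end

cache = {}
page = read_through(cache, 'post:1', 60) { 'rendered page' }
```

<p>Until the expiry time is reached, readers may see a value that no longer matches the database, which is exactly the staleness trade-off described above.</p>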
<p>In cases where greater consistency is required, you have to explicitly invalidate cache entries when the original data changes. If just one cache key is affected by a change, you can write-through to that cache key when the &#8220;source of truth&#8221; database is updated. If many keys may be affected, you can use <a href='http://37signals.com/svn/posts/3113-how-key-based-cache-expiration-works'>generational caching</a> and <a href='https://groups.google.com/forum/#!msg/memcached/OiScvRbGaU8/C1vny7DiGakJ'>clever generalisations thereof</a>. Whatever technique you use, it usually ends up being a lot of manually written, fiddly and error-prone code. Not a great joy to work with, hence the terribly clichéd quote above.</p>
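<p>One way to picture the key-based approach from the linked post: the cache key embeds a version of the record, so a change to the record produces a new key and the old entry is simply never read again. The sketch below assumes each row carries an <code>updated_at</code> value; <code>cache_key</code> and <code>cached_render</code> are hypothetical names.</p>

```ruby
# Build a key that changes whenever the record changes.
def cache_key(kind, record)
  "#{kind}/#{record[:id]}-#{record[:updated_at]}"
end

# Look up the rendered output under the versioned key, computing it on a miss.
def cached_render(cache, record)
  cache[cache_key('post', record)] ||= yield(record)
end

cache = {}
post = { id: 1, updated_at: 100, title: 'Old title' }
cached_render(cache, post) { |row| "page: #{row[:title]}" }

# After an edit, the key differs, so the stale entry is bypassed.
edited = post.merge(updated_at: 200, title: 'New title')
cached_render(cache, edited) { |row| "page: #{row[:title]}" }
```

<p>Even in this tiny form, the application code has to know which records feed into which keys, which is where the fiddly, hand-written part comes from.</p>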
<p>But&#8230; observe the following: in our efforts to separate pure business logic from network communication logic, we decided that we needed to explicitly model the data dependencies, and only data sources declared there are permitted as inputs to the business logic. In other words, the data dependency framework knows exactly which pieces of data are required in order to generate a particular piece of output &#8212; and conversely, when a piece of (input) data changes, it can know exactly which outputs (cache entries) may be affected by the change!</p>
<p>This means that if we have a real-time feed of changes to the underlying databases, we can feed it into a stream processing framework like <a href='http://storm-project.net/'>Storm</a>, run the data dependency analysis in reverse on every change, recompute the business logic for each output affected by the change in input, and write the results to another datastore. This store sits alongside the precomputed cache we generated in a batch process in Hadoop. When you want to query the cache, check both the output of the batch process and the output of the stream process. If the stream process has generated more recent data, use that, otherwise use the batch process output.</p>
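<p>To make the idea concrete, here is a small in-memory sketch of both halves: running the dependencies in reverse when a row changes, and merging batch and stream output at read time. Nothing here uses Storm or Hadoop; the stores are plain hashes, and <code>affected_post_ids</code>, <code>apply_change</code> and <code>lookup</code> are names made up for the example.</p>

```ruby
# Reverse dependency lookup for the blog example: a changed posts row affects
# the page for its own id, a changed comments row the page for its post_id.
def affected_post_ids(table, row)
  case table
  when :posts then [row[:id]]
  when :comments then [row[:post_id]]
  else []
  end
end

# Stream side: recompute each affected entry and store it with a timestamp.
# The block stands in for the pure business logic.
def apply_change(stream_store, table, row, now)
  affected_post_ids(table, row).each do |post_id|
    stream_store["post_page:#{post_id}"] = { value: yield(post_id), computed_at: now }
  end
end

# Read side: consult both stores and prefer whichever entry is more recent.
def lookup(key, batch_store, stream_store)
  candidates = [batch_store[key], stream_store[key]].compact
  newest = candidates.max_by { |entry| entry[:computed_at] }
  newest ? newest[:value] : nil
end
```

<p>If both stores hold an entry with the same timestamp, this sketch falls back to the batch output; how ties and clock differences should really be handled is left open here.</p>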
<p>If you&#8217;ve been following recent news in Big Data, you may recognise this as an application of Nathan Marz&#8217; <a href='http://nathanmarz.com/blog/how-to-beat-the-cap-theorem.html'>lambda architecture</a> (described in detail in his <a href='http://www.manning.com/marz/'>upcoming book</a>). I cannot thank Nathan enough for his amazing work in this area.</p>
<p>In this architecture, you get the benefits of a precomputed cache (every request is fast, including the first one), it keeps itself up-to-date with the underlying data, and because you have already declared your data dependencies, you don&#8217;t need to manually write cache invalidation code! The same dependency declaration can be used in three different ways:</p>
<ol>
<li>In &#8216;online&#8217; mode in a service or web app, to drive the network communication logic so that it makes all the queries and requests required to serve an incoming request, and to help with read-through caching.</li>
<li>In &#8216;offline&#8217; mode in Hadoop, to configure a MapReduce pipeline that brings together all the required data in order to run it through the business logic and generate a precomputed cache of all possible queries.</li>
<li>In &#8216;nearline&#8217; mode in Storm, to configure a stream processing topology that tracks changes to the underlying data, determines which cache keys need to be invalidated, and recomputes the cache values for those keys using the business logic.</li>
</ol>
<p>I am designing Flowquery so that it can be used in all three modes &#8212; you should be able to write your data dependencies just once, and let the framework take care of bringing all the necessary data together so that the business logic can act on it.</p>
<p>My hope is to make caching and cache invalidation as simple as database indexes. You declare an index once, the database runs a one-off batch job to build the index, and thereafter automatically keeps it up-to-date as the table contents change. It&#8217;s so simple to use that we don&#8217;t even think about it, and that&#8217;s what we should be aiming for in the realm of caching.</p>
<p>The project is still at a very early stage, but hopefully I&#8217;ll be posting more about it as it progresses. If you&#8217;d like to hear more, please <a href='http://eepurl.com/csJmf'>leave your email address</a> and I&#8217;ll send you a brief note when I post more. Or you can follow me on <a href='https://twitter.com/martinkl'>Twitter</a> or <a href='https://alpha.app.net/martinkl'>App.net</a>.</p>
<p><em>Thanks to Nathan Marz, Pete Warden, Conrad Irwin, Rahul Vohra and Sam Stokes for feedback on drafts of this post.</em></p>
</div>
<div class="by-line">
<address class="author vcard full-author">
<span class="by">By</span> <a class="url fn" href="http://martin.kleppmann.com/">Martin Kleppmann</a>
</address>
<span class="date full-date">
<abbr class="published" title="2012-10-01T00:00:00-07:00">01 Oct 2012</abbr>
</span>
</div>
<p class="comments-link">
<a href="/2012/10/01/rethinking-caching-in-web-apps.html#disqus_thread">Comments</a>
</p>
<div class="clear"></div>
</div>
<div class="hentry full p1 post publish">
<h1 class="entry-title full-title">
<a href="/2012/06/18/java-hashcode-unsafe-for-distributed-systems.html" rel="bookmark">Java's hashCode is not safe for distributed systems</a>
</h1>
<div class="entry-content full-content">
<p>As you probably know, hash functions serve many different purposes:</p>
<ol>
<li>Network and storage systems use them (in the guise of checksums) to detect accidental corruption of data.</li>
<li>Cryptographic systems use them to detect malicious corruption of data and to implement signatures.</li>
<li>Password authentication systems use them to make it harder to extract plaintext passwords from a database.</li>
<li>Programming languages use them for hash maps, to determine in which hash bucket a key is placed.</li>
<li>Distributed systems use them to determine which worker in a cluster should handle a part of a large job.</li>
</ol>
<p>All those purposes have different requirements, and different hash functions exist for the various purposes. For example, <a href='http://en.wikipedia.org/wiki/Cyclic_redundancy_check'>CRC32</a> is fine for detecting bit corruption in Ethernet, as it&#8217;s really fast and easy to implement in hardware, but it&#8217;s useless for cryptographic purposes. <a href='http://tools.ietf.org/html/rfc3174'>SHA-1</a> is fine for protecting the integrity of a message against attackers, as it&#8217;s cryptographically secure and also reasonably fast to compute; but if you&#8217;re storing passwords, you&#8217;re probably better off with something like <a href='http://codahale.com/how-to-safely-store-a-password/'>bcrypt</a>, which is <em>deliberately</em> slow in order to make brute-force attacks harder.</p>
<p>Anyway, that&#8217;s all old news. Today I want to talk about points 4 and 5, and why they are also very different from each other.</p>
<p><strong>Hashes for hash tables</strong></p>
<p>We use hash tables (dictionaries) in programming languages all the time without thinking twice. When you insert an item into a hash table, the language computes a hash code (an integer) for the key, uses that number to choose a bucket in the hash table (typically <code>mod n</code> for a table of size <code>n</code>), and then puts the key and value in that bucket in the table. If there&#8217;s already a value there (a hash collision), a linked list typically takes care of storing the keys and values within the same hash bucket. In Ruby, for example:</p>
<pre><span class='ansi1 ansi31'>$</span> ruby --version
ruby 1.8.7 (2011-06-30 patchlevel 352) [i686-darwin11.0.0]
<span class='ansi1 ansi31'>$</span> pry
[1] pry(main)&gt; hash_table = {<span class='ansi1 ansi32'>'</span><span class='ansi32'>answer</span><span class='ansi1 ansi32'>'</span> =&gt; <span class='ansi1 ansi34'>42</span>}
=&gt; {<span class='ansi1 ansi32'>&quot;</span><span class='ansi32'>answer</span><span class='ansi1 ansi32'>&quot;</span>=&gt;<span class='ansi1 ansi34'>42</span>}
[2] pry(main)&gt; <span class='ansi1 ansi32'>'</span><span class='ansi32'>answer</span><span class='ansi1 ansi32'>'</span>.hash
=&gt; <span class='ansi1 ansi34'>-1246806696</span>
[3] pry(main)&gt; <span class='ansi1 ansi32'>'</span><span class='ansi32'>answer</span><span class='ansi1 ansi32'>'</span>.hash
=&gt; <span class='ansi1 ansi34'>-1246806696</span>
[4] pry(main)&gt; ^D
<span class='ansi1 ansi31'>$</span> pry
[1] pry(main)&gt; <span class='ansi1 ansi32'>'</span><span class='ansi32'>answer</span><span class='ansi1 ansi32'>'</span>.hash
=&gt; <span class='ansi1 ansi34'>-1246806696</span>
[2] pry(main)&gt; <span class='ansi1 ansi32'>&quot;</span><span class='ansi32'>don't panic</span><span class='ansi1 ansi32'>&quot;</span>.hash
=&gt; <span class='ansi1 ansi34'>-464783873</span>
[3] pry(main)&gt; ^D
</pre>
<p>When you add the key <code>&#39;answer&#39;</code> to the hash table, Ruby internally calls the <code>#hash</code> method on that string object. The method returns an arbitrary number, and as you see above, the number is always the same for the same string. A different string usually has a different hash code. Occasionally you might get two keys with the same hash code, but it&#8217;s extremely unlikely that you get a large number of collisions in normal operation.</p>
<p>The problem with the example above: when I quit Ruby (<code>^D</code>) and start it again, and compute the hash for the same string, I still get the same result. <em>But why is that a problem,</em> you say, <em>isn&#8217;t that what a hash function is supposed to do?</em> &#8211; Well, the problem is that I can now put on my evil genius hat, and generate a list of strings that all have the same hash code:</p>
<pre><span class='ansi1 ansi31'>$</span> pry
[1] pry(main)&gt; <span class='ansi1 ansi32'>&quot;</span><span class='ansi32'>a</span><span class='ansi1 ansi32'>&quot;</span>.hash
=&gt; <span class='ansi1 ansi34'>100</span>
[2] pry(main)&gt; <span class='ansi1 ansi32'>&quot;\0</span><span class='ansi32'>a</span><span class='ansi1 ansi32'>&quot;</span>.hash
=&gt; <span class='ansi1 ansi34'>100</span>
[3] pry(main)&gt; <span class='ansi1 ansi32'>&quot;\0\0</span><span class='ansi32'>a</span><span class='ansi1 ansi32'>&quot;</span>.hash
=&gt; <span class='ansi1 ansi34'>100</span>
[4] pry(main)&gt; <span class='ansi1 ansi32'>&quot;\0\0\0</span><span class='ansi32'>a</span><span class='ansi1 ansi32'>&quot;</span>.hash
=&gt; <span class='ansi1 ansi34'>100</span>
[5] pry(main)&gt; <span class='ansi1 ansi32'>&quot;\0\0\0\0</span><span class='ansi32'>a</span><span class='ansi1 ansi32'>&quot;</span>.hash
=&gt; <span class='ansi1 ansi34'>100</span>
[6] pry(main)&gt; <span class='ansi1 ansi32'>&quot;\0\0\0\0\0</span><span class='ansi32'>a</span><span class='ansi1 ansi32'>&quot;</span>.hash
=&gt; <span class='ansi1 ansi34'>100</span>
</pre>
<p>Any server in the world running the same version of Ruby will get the same hash values. This means that I can send a specially crafted web request to your server, in which the request parameters contain lots of those strings with the same hash code. Your web framework will probably parse the parameters into a hash table, and they will all end up in the same hash bucket, no matter how big you make the hash table. Whenever you want to access the parameters, you now have to iterate over a long list of hash collisions, and your swift O(1) hash table lookup is suddenly a crawlingly slow O(n).</p>
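<p>The effect is easy to reproduce with a toy model. The <code>weak_hash</code> function below is deliberately simplistic and is <em>not</em> Ruby&#8217;s real <code>String#hash</code>; it just shares the property that leading zero bytes do not change the result. <code>bucket_sizes</code> is likewise a made-up helper that counts how many keys land in each bucket.</p>

```ruby
# A deliberately weak multiply-and-add hash. Starting from 0, any number of
# leading zero bytes leaves the running total at 0, so they are ignored.
def weak_hash(string)
  string.each_byte.reduce(0) { |acc, byte| (acc * 31 + byte) % (2**32) }
end

# Count how many keys fall into each of n buckets.
def bucket_sizes(keys, n)
  sizes = Array.new(n, 0)
  keys.each { |key| sizes[weak_hash(key) % n] += 1 }
  sizes
end

# 1000 distinct keys: "a", "\0a", "\0\0a", ...
evil_keys = (0...1000).map { |i| ("\0" * i) + 'a' }
largest_chain = bucket_sizes(evil_keys, 64).max
# All keys share one hash code, so they share one bucket for any table size.
```

<p>Making the table bigger does not help, because the keys agree on the full hash code and not merely on the bucket index.</p>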
<p>I just need to make a small number of these evil requests to your server and I&#8217;ve brought it to its knees. This type of denial of service attack was already <a href='http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003.pdf'>described</a> back in 2003, but it only became widely known last year, when Java, Ruby, Python, PHP and Node.js all suddenly <a href='http://www.ocert.org/advisories/ocert-2011-003.html'>scrambled</a> to fix the issue.</p>
<p>The solution is for the hash code to be consistent within one process, but to be different for different processes. For example, here is a more recent version in Ruby, in which the flaw is fixed:</p>
<pre><span class='ansi1 ansi31'>$</span> ruby --version
ruby 1.9.3p125 (2012-02-16 revision 34643) [x86_64-darwin11.3.0]
<span class='ansi1 ansi31'>$</span> pry
[1] pry(main)&gt; <span class='ansi1 ansi32'>'</span><span class='ansi32'>answer</span><span class='ansi1 ansi32'>'</span>.hash
=&gt; <span class='ansi1 ansi34'>968518855724416885</span>
[2] pry(main)&gt; <span class='ansi1 ansi32'>'</span><span class='ansi32'>answer</span><span class='ansi1 ansi32'>'</span>.hash
=&gt; <span class='ansi1 ansi34'>968518855724416885</span>
[3] pry(main)&gt; ^D
<span class='ansi1 ansi31'>$</span> pry
[1] pry(main)&gt; <span class='ansi1 ansi32'>'</span><span class='ansi32'>answer</span><span class='ansi1 ansi32'>'</span>.hash
=&gt; <span class='ansi1 ansi34'>-150894376904371785</span>
[2] pry(main)&gt; ^D
</pre>
<p>When I quit Ruby and start it again, and ask for the hash code of the same string, I get a completely different answer. This is obviously not what you want for cryptographic hashes or checksums, since it would render them useless &#8212; but for hash tables, it&#8217;s exactly right.</p>
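<p>The principle can be sketched with a seeded toy hash. This is an FNV-1a-style loop with the seed mixed into the starting state, chosen only because it is short; it is not the algorithm Ruby uses, and a simple seeded function like this should not be assumed to resist a determined attacker. <code>seeded_hash</code> and <code>new_process_seed</code> are hypothetical names.</p>

```ruby
require 'securerandom'

# Toy seeded hash: the same string and seed always give the same result,
# while a different seed gives a different result.
def seeded_hash(string, seed)
  string.each_byte.reduce(2_166_136_261 ^ seed) do |acc, byte|
    ((acc ^ byte) * 16_777_619) % (2**32)
  end
end

# A runtime would draw a fresh seed once, when the process starts.
def new_process_seed
  SecureRandom.random_number(2**32)
end

seed = new_process_seed
first = seeded_hash('answer', seed)
second = seeded_hash('answer', seed)
# first and second are equal; another process would draw its own seed.
```

<p>Within one process the seed never changes, so hash tables keep working; an attacker outside the process cannot predict which strings collide without knowing the seed.</p>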
<p><strong>Hashes for distributed systems</strong></p>
<p>Now let&#8217;s talk about distributed systems: systems in which you have more than one process, probably on more than one machine, and they are talking to each other. If you have something that&#8217;s too big to fit on one machine (too much data to fit on one machine&#8217;s disks, too many requests to be handled by one machine&#8217;s CPUs, etc.), you need to spread it across multiple machines.</p>
<p>How do you know which machine to use for a given request? Unless you have some application-specific partitioning that makes more sense, a hash function is a simple and effective solution: hash the name of the thing you&#8217;re requesting, take the result modulo the number of servers, and that&#8217;s your server number. (Though if you ever want to change the number of machines, <a href='http://michaelnielsen.org/blog/consistent-hashing/'>consistent hashing</a> is probably a better bet.)</p>
<p>For this setup you obviously don&#8217;t want a hash function in which different processes may compute different hash codes for the same value, because you&#8217;d end up routing requests to the wrong server. You can&#8217;t use the same hash function as the programming language uses for hash tables.</p>
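<p>A minimal sketch of such routing in Ruby, using a digest from the standard library so that every process computes the same number. The <code>server_for</code> helper and the key format are assumptions for illustration; MD5 is used here only as a stable function over bytes, not for security.</p>

```ruby
require 'digest/md5'

# Map a key to a server number in the range 0...server_count.
# The digest depends only on the bytes of the key, never on the process.
def server_for(key, server_count)
  Digest::MD5.hexdigest(key).to_i(16) % server_count
end

target = server_for('user:42', 8)
```

<p>With plain modulo, changing <code>server_count</code> moves most keys to a different server, which is the situation consistent hashing is meant to improve.</p>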
<p>Unfortunately, this is <a href='http://squarecog.wordpress.com/2011/02/20/hadoop-requires-stable-hashcode-implementations/'>exactly</a> what Hadoop does. <a href='http://storm-project.net/'>Storm</a>, a stream processing framework, <a href='https://github.com/nathanmarz/storm/blob/33a2ea5/src/clj/backtype/storm/tuple.clj#L7-8'>does too</a>. Both use the Java Virtual Machine&#8217;s <a href='http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode()'>Object.hashCode()</a> method.</p>
<p>I understand the use of <code>hashCode()</code> &#8212; it&#8217;s very tempting. On strings, numbers and collection classes, <code>hashCode()</code> always returns a consistent value, apparently even across different JVM vendors. It&#8217;s like that despite the <a href='http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode()'>documentation</a> for <code>hashCode()</code> explicitly <em>not</em> guaranteeing consistency across different processes:</p>
<blockquote>
<p>Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. <em>This integer need not remain consistent from one execution of an application to another execution of the same application.</em></p>
</blockquote>
<p>And once in a while, a bold library comes along that actually returns different <code>hashCode()</code> values in different processes &#8211; <a href='http://code.google.com/p/protobuf/'>Protocol Buffers</a>, for example &#8211; and <a href='https://groups.google.com/forum/?fromgroups#!topic/protobuf/MCk1moyWgIk'>people get quite confused</a>.</p>
<p>The problem is that although the documentation says <code>hashCode()</code> doesn&#8217;t provide a consistency guarantee, the Java standard library behaves as if it <em>did</em> provide the guarantee. People start relying on it, and since backwards-compatibility is rated so highly in the Java community, it will probably never ever be changed, even though the documentation would allow it to be changed. So the JVM gets the worst of both worlds: a hash table implementation that is open to DoS attacks, but also a hash function that can&#8217;t always safely be used for communication between processes. :(</p>
<p><strong>Therefore&#8230;</strong></p>
<p>So what I&#8217;d like to ask for is this: if you&#8217;re building a distributed framework based on the JVM, <strong>please don&#8217;t</strong> use Java&#8217;s <code>hashCode()</code> for anything that needs to work across different processes. Because it&#8217;ll look like it works fine when you use it with strings and numbers, and then someday a brave soul will use (e.g.) a protocol buffers object, and then spend days banging their head against a wall trying to figure out why messages are getting sent to the wrong servers.</p>
<p>What should you use instead? First, you probably need to serialize the object to a byte stream (which you need to do anyway if you&#8217;re going to send it over the network). If you&#8217;re using a serialization that always maps the same values to the same sequence of bytes, you can just hash that byte stream. A cryptographic hash such as MD5 or SHA-1 would be ok for many cases, but might be a bit heavyweight if you&#8217;re dealing with a really high-throughput service. I&#8217;ve heard good things about <a href='http://code.google.com/p/smhasher/'>MurmurHash</a>, which is non-cryptographic but lightweight and claims to be well-behaved.</p>
<p>If your serialization doesn&#8217;t always produce the same sequence of bytes for a given value, then you can still define a hash function on the objects themselves. Just please don&#8217;t use <code>hashCode()</code>. It&#8217;s ok for in-process hash tables, but distributed systems are a different matter.</p>
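<p>As a sketch of the &#8220;serialize, then hash the bytes&#8221; approach: the code below first puts a value into a canonical form (hash keys converted to strings and sorted recursively) so that equal values always serialize to the same JSON bytes, then hashes those bytes. The helpers <code>canonical</code> and <code>stable_hash</code> are invented for this example, and JSON with SHA-1 is just one convenient choice from Ruby&#8217;s standard library.</p>

```ruby
require 'digest/sha1'
require 'json'

# Recursively sort hash keys so that insertion order cannot affect the output.
def canonical(value)
  case value
  when Hash
    sorted = value.sort_by { |key, _| key.to_s }
    sorted.map { |key, val| [key.to_s, canonical(val)] }.to_h
  when Array
    value.map { |item| canonical(item) }
  else
    value
  end
end

# Hash the canonical byte representation of the value.
def stable_hash(value)
  Digest::SHA1.hexdigest(JSON.generate(canonical(value)))
end

left = stable_hash({ id: 7, tags: %w[a b] })
right = stable_hash({ tags: %w[a b], id: 7 })
# left and right are equal, in this process and in any other.
```

<p>The canonicalization step matters: without it, two processes that build the same logical value in a different order could produce different bytes and therefore different hashes.</p>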
<p>(Oh, and in case you were wondering: it looks like the web servers affected by Java&#8217;s hashCode collisions fixed the problem not by changing to a different hash function, but simply by limiting the number of parameters: <a href='http://svn.apache.org/viewvc/tomcat/tc7.0.x/trunk/java/org/apache/tomcat/util/http/Parameters.java?r1=1195977&amp;r2=1195976&amp;pathrev=1195977'>Tomcat</a>, <a href='https://github.com/eclipse/jetty.project/commit/085c79d7d6cfbccc02821ffdb64968593df3e0bf'>Jetty</a>.)</p>
</div>
<div class="by-line">
<address class="author vcard full-author">
<span class="by">By</span> <a class="url fn" href="http://martin.kleppmann.com/">Martin Kleppmann</a>
</address>
<span class="date full-date">
<abbr class="published" title="2012-06-18T00:00:00-07:00">18 Jun 2012</abbr>
</span>
</div>
<p class="comments-link">
<a href="/2012/06/18/java-hashcode-unsafe-for-distributed-systems.html#disqus_thread">Comments</a>
</p>
<div class="clear"></div>
</div>
<div class="hentry full p1 post publish">
<h1 class="entry-title full-title">
<a href="/2011/08/16/founderly-interview.html" rel="bookmark">My FounderLY interview</a>
</h1>
<div class="entry-content full-content">
<p>Matthew from <a href='http://www.founderly.com/'>FounderLY</a> wondered what it would have been like to watch raw video footage of Steve Jobs, Bill Gates, and other tech founders during their formative years. So he&#8217;s been going around interviewing young startup founders, for posterity and for other founders&#8217; inspiration. A pretty interesting effort.</p>
<p>A few weeks ago he asked whether he could <a href='http://www.founderly.com/2011/07/martin-kleppmann-rapportive-1-of-2/'>interview me</a> for the site. Although it would be rather presumptuous to put myself in the category of potential future Steve Jobses, I agreed.</p>
<p>So here you go &#8211; a tidily scripted set of questions from Matthew, and some chaotically unscripted stream-of-consciousness replies from me. The video comes in two parts, about 22 minutes in total, and a transcript is below.</p>
<iframe frameborder='0' height='309' src='http://player.vimeo.com/video/25790273?title=0&amp;byline=0&amp;portrait=0' width='549'>
<a href='http://vimeo.com/25790273'>View on Vimeo</a>
</iframe>
<p><a href='http://vimeo.com/25790273'>Martin Kleppmann interview, part 1</a> from <a href='http://vimeo.com/founderly'>FounderLY</a> on <a href='http://vimeo.com'>Vimeo</a></p>
<iframe frameborder='0' height='309' src='http://player.vimeo.com/video/25790604?title=0&amp;byline=0&amp;portrait=0' width='549'>
<a href='http://vimeo.com/25790604'>View on Vimeo</a>
</iframe>
<p><a href='http://vimeo.com/25790604'>Martin Kleppmann interview, part 2</a> from <a href='http://vimeo.com/founderly'>FounderLY</a> on <a href='http://vimeo.com'>Vimeo</a></p>
<p><strong>Transcript</strong></p>
<p><strong>Matthew Wise:</strong> Hi this is Matthew Wise with FounderLY.com. We empower entrepreneurs to have a voice and share their story with the world, enabling others to learn about building products and starting companies.</p>
<p>I&#8217;m really excited today because I&#8217;m here with Martin Kleppmann, founder of Rapportive. Rapportive shows you everything about your contacts inside your email box, enabling you to see who people are and where they are based, so that you can connect and collaborate over shared interests. So, Martin, we&#8217;d love you to give our audience a brief bio.</p>
<p><strong>Martin:</strong> Sure. I&#8217;m originally from Germany, which explains my weird accent, and then I went to the UK for several years to study computer science. That was in Cambridge. After that, I started a startup; it was called Go Test It, we made a tool for automated cross-browser testing of websites. That was pretty cool, and it was acquired a few years ago. After that, I was looking around for something new to do, and together with two friends we started Rapportive.</p>
<p>What we do now is to pull photos, job details from LinkedIn, recent tweets and all of this stuff into Gmail, and show it right there.</p>
<p><strong>Matthew:</strong> What makes Rapportive unique, who is it for and why are you so passionate about it?</p>
<p><strong>Martin:</strong> It&#8217;s really for people who do a lot of email, particularly emailing with people who you don&#8217;t really know well. If you only ever email with ten different people, then you wouldn&#8217;t need it — but most of us, particularly startup founders, are constantly dealing with investors, outside advisors, users emailing us, potential customers, potential partners, people on emailing lists… all of these people, we vaguely know who they are, but not really. And actually, it is really important that you build this personal contact with them, and get to know them personally.</p>
<p>Previously, when people got an email from someone, they would go and search Google, try to find their Twitter account, try to find them on LinkedIn, and this just takes a lot of time. And we&#8217;ve just automated all of that. The idea is that now, you can actually respond to people personally and build up that personal connection. It&#8217;s little things: even just being able to see the photo of someone in your email… firstly, that&#8217;s a deep visceral connection: you connect much more with them than if you&#8217;re just looking at a wall of text; and also, if you meet them in real life, well, you&#8217;re much more likely to be able to recognise them. I think that makes your email a better place; it&#8217;s really excellent.</p>
<p><strong>Matthew:</strong> What are some of the technology and market trends that currently exist, and how do you see things developing in the future in your space?</p>
<p><strong>Martin:</strong> I&#8217;m not sure about the big trends. There are a lot of things, but they are all very subtle things. For example, people care a lot about user experience, and we take that really seriously. We put ridiculous amounts of effort into making sure that stuff works really nicely.</p>
<p>Other things that are happening: we are having to deal with more and more people, and people expect that you don&#8217;t just get an automated stock reply, but that people actually engage with you personally. That&#8217;s the future, I think. We&#8217;ve already got that in one-to-one communication between individuals, but the big trend is that companies as a whole are starting to be more personal with the outside. They are no longer this corporate brand, this cold, anonymous thing, but you actually expect to be able to see the people behind that brand, and be able to engage with them directly and build a relationship. And those relationships are what matter, because… if you&#8217;re just competing on price, your customers can just go somewhere else, but if you can build up a relationship with your customers, that&#8217;s really really powerful.</p>
<p>We think that&#8217;s what we are enabling, by giving you this social substrate for your communications.</p>
<p><strong>Matthew:</strong> Can you tell us what inspired you to start Rapportive? Was there an &#8220;aha&#8221; moment, or did market research lead you to the opportunity? What’s the story behind it?</p>
<p><strong>Martin:</strong> It really came from something we wanted ourselves. I think everyone says this! In my previous startup (and my cofounders also had a previous startup), we were all trying to do a lot of engaging with people personally, getting out there, learning a lot from people, really understanding where they were coming from. And that was so much effort! I’d keep lists of people in a custom database or in spreadsheets or in CRM systems like Highrise, and I&#8217;d have to keep them up to date by hand. I’d make a lot of notes about people, even just for myself, just so that I could remember when I came back to them six months later: what interactions I’d had with them, what we’d talked about.</p>
<p>But I then found that all of this information would go stale: for example, I had entered someone&#8217;s job details and then they’d change job… and I’m not going to go and re-enter all of this stuff! It’s already out there on the web — really, software should just do this stuff automatically; there’s no reason why I should have to type this in again.</p>
<p>And then, also, why should I have to always change over to another browser tab in order to search for something, and have five tabs open with different searches for stuff? It’s just ridiculous, this stuff should be in the tool which I use all the time anyway, which is email.</p>
<p>And so, those are the two premises we started with. We wanted something which keeps itself up-to-date automatically from all the data which is already out there; you shouldn’t have to re-enter anything. And secondly, it should be in the workflow of the tool you already use, which, for most of us is primarily email. And on that premise we said: what can we build? Oh, well, let’s just stick something on the side of Gmail, see how it works. And people loved it.</p>
<p><strong>Matthew:</strong> Excellent! Who is your cofounder, how did you meet, what qualities were you looking for in a cofounder, and how did you know they’d be a good fit?</p>
<p><strong>Martin:</strong> I have two cofounders, Rahul and Sam; there are three of us. They are both really excellent people. I had known them for a while before starting: we were together in an office space, a kind of co-working space in Cambridge, UK. They were working on their previous startup, and I was working on my previous startup; we worked together a bit, we had lunch together every day, and just ended up talking about a lot of things.</p>
<p>We found that partly we thought the same in a lot of ways, and partly we also had different but nicely complementary ways of thinking. We had a shared culture but often different perspectives, which helped us to together find the best way of doing stuff. And that’s really the basis on which we work. I think we have a very strong sense of a culture and making sure we work together very well, so we are constantly getting better at what we do.</p>
<p><strong>Matthew:</strong> From idea to product launch, how long did it take, and when did you actually launch?</p>
<p><strong>Martin:</strong> It was pretty quick actually: from first UI mockups to launch it was less than two months. We weren’t actually intending to launch: we had just put up this little website. We were applying for Y Combinator at the time and we also had some other people who were interested, so we wanted to show some potential investors what we were doing. So we put up a little website; it wasn’t protected, but just at an unknown URL.</p>
<p>And then somehow the press got hold of this, and within a day we found ourselves with 10,000 users on our hands, because it just went wild through all of the blogs. That was a totally crazy experience: we had thought, &#8220;well, we’ve built this little thing, let’s give it to 10 people and see how it works&#8221;, and suddenly we have this massive load of people coming in. And we were working, working very hard, firstly trying to keep the servers up, but fortunately they held up quite nicely. Then also responding to all of the tweets, responding to all of the emails that were coming in. There was lots and lots of stuff happening very quickly; at that point we knew that we were on to something pretty exciting.</p>
<p><strong>Matthew:</strong> And then you formally launched when?</p>
<p><strong>Martin:</strong> We considered that our launch after the fact; we then said, &#8220;Well, OK, I guess we’ve launched now. Oh well, we’ve launched.&#8221; And then since then we’ve, at times, launched new features but that original bit of press we regard as our real launch.</p>
<p><strong>Matthew:</strong> Are there any unique metrics or social proof about Rapportive that you’d like to share with our audience?</p>
<p><strong>Martin:</strong> I think the thing I find most exciting: we always have a Twitter search going on — we have a big screen in the office, showing what people are saying about Rapportive on Twitter — and there’s just this constant stream of people loving it. I’m really humbled every time I see this. Every hour there’s stuff coming in from people saying things like, &#8220;This product has changed our life.&#8221;</p>
<p>And that’s just amazing: when people will actually go out of their way to say something like that, and we’re not even particularly prompting them. So yeah, we have hundreds of thousands of users at the moment, but the important thing is really how much people care about it.</p>
<p><strong>Matthew:</strong> We know founders face unique challenges when they start a company. What was the hardest part about launching or starting Rapportive, and how did you overcome this obstacle?</p>
<p><strong>Martin:</strong> So we had a bit of a frustrating phase over the last summer. We were working very, very hard and there was lots going on, but our product was making very little visible progress, because we were spending all of our time firefighting, scaling our database because we had so much stuff coming in that we had to do a lot of work to re-architect it. We were doing a lot of groundwork for features which are just coming out now, but in technical groundwork there are months of work which is just invisible. We were moving country because we were all coming from the UK, moving to San Francisco, and we were fighting with US immigration. We were also spending a lot of time on support — which is good, it’s really valuable, because we learn a lot about the problems that people have, but again it’s very time consuming.</p>
<p>So, with all of those things, it’s all useful stuff; there’s nothing really wasteful there. But on the other hand, our product wasn’t making progress, and people were starting to ask, &#8220;Well, you’ve been around for six months now, nine months now, and you’ve not really released any exciting new features. What’s going on?&#8221; And we were just saying, &#8220;Yeah, we’re trying to get to it, we’re doing what we can!&#8221;</p>
<p>And then I was so happy when, towards the end of 2010, we got over this big hump of stuff, and now we’re putting out features again and there is much more visible progress. So that was a fairly hard phase to go through, but I’m really glad we got over it. In the end you just have to work through it. You just have to not give up, just keep on going, keep on going, even if it’s getting tough.</p>
<p><strong>Matthew:</strong> Since you’ve been in operation, what have you learned about your business and your users that you didn’t realize before you launched?</p>
<p><strong>Martin:</strong> When we first launched I was a bit cautious. I was wondering: &#8220;are people going to be really freaked out by seeing how much information is actually publicly available about them on the web?&#8221; You know, when you think about it rationally, it’s obvious: you can just search for someone on Google, and for most people you’ll actually get a pretty good idea of who this person is just by looking at the search results. And we’ve just taken away a step by automating a lot of that search, making it more convenient by putting it in email.</p>
<p>And so I was expecting that there’d be a lot of people who would go, &#8220;Oh my God, no, privacy is dead!&#8221; But we tried to manage that very carefully: whenever anyone was concerned, we listened to them, responded to any concerns very quickly, and explained what we’re doing, why we’re doing it and why we think it’s absolutely fine. We are all very privacy conscious and we make that very clear as well. We don’t mess around with people’s private data; we only show information which people actually want to be public.</p>
<p>And that is something I found surprising: just how quickly we can defuse any situation. If anyone was upset we’d just talk to them quietly, patiently, and explain what’s going on. If there was any problem, fix it quickly — and all the problems suddenly go away. And that’s really encouraging, because it means that we seem to actually be doing the right thing: pushing the envelope a bit. But yes, it works.</p>
<p><strong>Matthew:</strong> What is it that you make look easy? What skill or talent comes easy or intuitively to you, and what has been difficult and how do you manage that?</p>
<p><strong>Martin:</strong> I’d say: what we, as a team, are particularly good at is product design. Making something which is very neat, stays out of your way, but is still powerful; which does exactly the stuff you need, not more, not less; and just behaves the way people expect it to behave, without running into a weird corner where you don’t know what to do.</p>
<p>And that is actually really hard to achieve. The amount of time we spend on optimizing the workflows for different users, depending on which starting state they’re coming from, which screens they have to go through and exactly what button we can show in which place, exactly what copy we use, what words we use to describe things, then taking them through the flow… and then, to the user, all it looks like is: &#8220;oh, I clicked a button, a pop-up appeared, I clicked another button and it worked.&#8221;</p>
<p>That’s something we really enjoy: making that look easy, but a lot of work goes into it. In the end people just appreciate it as a product which is really nicely designed, which just works and which gives them a kind of warm, fuzzy feeling.</p>
<p><strong>Matthew:</strong> What’s the most important lesson you’ve learned since launching Rapportive?</p>
<p><strong>Martin:</strong> The most important lesson? I’ve not really graded them in a particular priority.</p>
<p>I’d say, off the top of my head: caring about user experience and caring about users was something we thought from the start was really important — and that has really been validated. People appreciate us for having a product which just works nicely, and which has the little details thought out.</p>
<p>People appreciate that we get back to them quickly, that we’re always very friendly when responding to them, that we’re trying to be personal where we can.</p>
<p><strong>Matthew:</strong> Martin, what bit of advice do you wish you would have known before starting Rapportive?</p>
<p><strong>Martin:</strong> I think what’s really interesting is that in a startup everything is magnified. If you have any issue early on, that will just continue, continue, get bigger and bigger, so if you have any issue early on then make sure you fix it early on. I think we’ve generally done a pretty good job of that. But it’s worth doing that really consciously.</p>
<p>Certain things are really hard, but you need to get good at them. For example, communicating and sharing intuitions, that’s a topic that I’ve been thinking about a lot. We find that, since we’re three cofounders, we often have similar ideas about things, but then often find that they differ in subtle ways. Really what we want to do is to combine our three intuitions into one, so that together, we have a really good broad and also deep insight into what people want. That requires that you find ways of explaining to the others not just <em>what</em> you think, but <em>why</em> you think it.</p>
<p>And that’s really hard to learn, and we’ve gradually been getting better at that. As you go about things, just be conscious of the fact that it’s going to take a lot of effort and time, even just to learn to speak the same language. You think you all speak English, but then you find, of course, that you make up your own words to describe the domain you’re working in. A lot of things are just completely non-obvious.</p>
<p>You get a lot of conflicting advice from outside mentors. We have a lot of really good investors, advisors, mentors, and often they say completely contradictory things — and that’s fine. You just need to learn to absorb those things into your own intuition, and within the team work out how you can share those intuitions. Then you can have a coherent vision, all together, for what you’re going to build, why it’s important, how you’re going to go forward.</p>
<p><strong>Matthew:</strong> What bit of advice would you like to share with our audience about launching a startup? If you have to distill it, what are the key elements?</p>
<p><strong>Martin:</strong> One thing, which worked in our favor but is not necessarily particularly replicable: if your product works well for journalists, then journalists will write about it quite a lot. We didn’t realize this initially, but it happens that Rapportive works really well for people who deal with a lot of incoming weird stuff from lots of people they don’t know, and need to assess very quickly whether the sources are reliable. And, well, that’s pretty much what journalists do.</p>
<p>It was also the case that when we started Rapportive, a lot of the data we had about people was not particularly great, but bloggers tend to be the kind of people who are very present on social media, so we had really great data for them! And that worked in our favor. Since then we’ve got a lot better at data for everyone else, and now we’ve got a pretty high coverage rate for everyone. But for that initial launch, just working well for reporters and bloggers was pretty good.</p>
<p>But of course, you can’t choose your startup based on the fact that it’s going to be useful for bloggers, so that’s not very useful advice.</p>
<p>There are lots of different schools of thought for launching and they all kind of make sense. There’s the &#8220;launch small and make sure that you’re continuously learning&#8221; school, and that makes a lot of sense. And then there’s also the school which observes that, if you can get a lot of very quick press that generates a lot of excitement and a lot of buzz, that’s also valuable. In the end, with these things there’s never a right answer; you just have to take in all of the bits of advice you hear and create your personal conglomerate of what makes sense.</p>
<p><strong>Matthew:</strong> Before we close, I would love for you to give our audience your vision of Rapportive and how you hope it will change the world.</p>
<p><strong>Martin:</strong> We’ve got a lot of really exciting things coming. I don’t want to talk about them in too much detail, but to give a rough outline:</p>
<p>I think, firstly, the inbox is a really, really interesting place, because that’s where all of your communications come together. Email is the primary one we use at the moment; I don’t know, maybe it’ll be Facebook mail within two years’ time, but that doesn’t really matter, that’s beside the point.</p>
<p>The point is that people are really, really opinionated about which tool they want to use, and getting people to change tool is really, really hard. So we’re building Rapportive with the philosophy that we don’t want people to change behavior; we just want people to continue doing what they’re doing already, and just make it better.</p>
<p>Just add those little magic touches, add little things which either save you time, or which take something which was previously laborious (and required switching to other browser tabs and required re-entering of data), and make all of that go away. Just make it be there, and make common tasks feel natural.</p>
<p>That’s the philosophy with which we’re going about things, and that seems to be working pretty well.</p>
<p><strong>Matthew:</strong> Excellent. Well, Martin, it’s been a pleasure having you as a guest on FounderLY. We’re rooting for your success at Rapportive. For those in our audience who’d like to learn more you can visit their website at www.rapportive.com and register to become a user and join their community. This is Matthew Wise with FounderLY. Thanks so much, Martin.</p>
<p><strong>Martin:</strong> Thank you, Matthew.</p>
</div>
<div class="by-line">
<address class="author vcard full-author">
<span class="by">By</span> <a class="url fn" href="http://martin.kleppmann.com/">Martin Kleppmann</a>
</address>
<span class="date full-date">
<abbr class="published" title="2011-08-16T00:00:00-07:00">16 Aug 2011</abbr>
</span>
</div>
<p class="comments-link">
<a href="/2011/08/16/founderly-interview.html#disqus_thread">Comments</a>
</p>
<div class="clear"></div>
</div>
<div class="hentry full p1 post publish">
<h1 class="entry-title full-title">
<a href="/2011/05/24/evolution-of-rapportive-new-design.html" rel="bookmark">Evolution of Rapportive's new design</a>
</h1>
<div class="entry-content full-content">
<p><em>This is a re-post from the <a href='http://blog.rapportive.com/53827077'>Rapportive blog</a>.</em></p>
<p>Today we are launching a new design for Rapportive. We put a huge amount of effort into this design, because we all believe deeply in great user experience, and we know that our users really appreciate it too.</p>
<p>Here it is (best viewed fullscreen):</p>
<iframe allowfullscreen='allowfullscreen' frameborder='0' height='309' src='http://www.youtube.com/embed/DPaSxa2vopU' width='549'>
<a href='http://www.youtube.com/watch?v=DPaSxa2vopU'>View on YouTube</a>
</iframe>
<p>In this post I&#8217;d like to explain some of our process and thinking in the creation of the new design.</p>
<p>Our old design, which we had <a href='http://blog.rapportive.com/the-accidental-launch'>since launch</a>, has served us very well. It was subtle, simple, effective, and blended in well with Gmail. Unfortunately, it was beginning to show some limitations:</p>
<ol>
<li>If you&#8217;re using Gmail on a small screen, on a laptop or even a netbook, the Rapportive sidebar would often be too tall to fit on screen. Of course you can scroll down, but the main Gmail scrollbar is already used to scroll down in the conversation. We hooked into the page scrolling, but if you were on a long conversation, you had to scroll all the way to the bottom of the conversation if you wanted to see the rest of the Rapportive sidebar. That was really annoying.</li>
<li>Over the months we&#8217;ve been adding more and more useful stuff to the sidebar: <a href='http://blog.rapportive.com/address-book-inbox-together-at-last'>phone numbers</a>, <a href='http://blog.rapportive.com/40551428'>Facebook status updates</a>, <a href='http://blog.rapportive.com/grow-your-network-with-rapportive'>LinkedIn invitations</a>, and a choice of <a href='http://raplets.com/'>Raplets</a> for a <a href='http://thenextweb.com/apps/2010/04/29/rapportive/'>variety of</a> <a href='http://blog.rapportive.com/get-out-of-your-inbox-and-meet-people-in-pers'>different</a> <a href='http://blog.rapportive.com/rapportive-for-developers-bitbucket-github-st'>purposes</a>. This has pushed the previous design to its limits: some profiles would become unmanageably tall.</li>
<li>Different people find different parts of the sidebar useful. Some find <a href='http://twitter.com/GraemeF/status/25286282993213440'>recent tweets</a> most useful, others swear by our <a href='http://twitter.com/smsbnyc/status/40436337681104896'>CRM raplets</a>, others again leave <a href='http://twitter.com/nickcernis/status/15001395635691520'>lots of notes</a> about their contacts. But it sucks if the thing you&#8217;re most interested in is often scrolled off the bottom of the screen because the things higher up in the sidebar are taking all the space.</li>
</ol>
<p>These three points have a common theme: we do not handle long sidebars well.</p>
<p>How long can a sidebar get? Well, here&#8217;s my sidebar, with the CrunchBase and the MailChimp Raplets:</p>
<p style='text-align: center'>
<img alt='Example of very tall Rapportive profile' height='1361' src='/2011/05/tall_profile.png' width='230' />
</p>
<p>As we added more features to Rapportive, these problems would only get worse, so we decided that it was time to rethink our design.</p>
<h2 id='our_design_principles'>Our design principles</h2>
<p>We had several guiding principles for the redesign. Rapportive should:</p>
<ul>
<li>Remain very subtle and unobtrusive: Rapportive should be there for you when you want it, but should not try to grab your attention or use more space than necessary. Your <em>email</em> is what&#8217;s important, not the sidebar!</li>
<li>Allow you to serendipitously discover things about your contacts: the information should simply be there when you glance at the sidebar, and shouldn&#8217;t require a lot of clicking or scrolling.</li>
<li>Look good on both large and small screens, both with lots of data and little data. That means it has to make efficient use of screen space.</li>
<li>Avoid configuration dialogs. The interface should just do the right thing.</li>
<li>Be clear, obvious to use, beautiful and <em>enjoyable</em> to interact with :)</li>
</ul>
<h2 id='buttons'>Buttons</h2>
<p>First of all, I started with some graphical tweaking. Here are some nice buttons:</p>
<p style='text-align:center'>
<a href='/2011/05/connect_buttons.png'><img alt='Different stylings for &apos;add friend&apos;/&apos;connect&apos; buttons' src='/2011/05/connect_buttons.png' /></a>
</p>
<h2 id='how_do_we_handle_long_sidebars'>How do we handle long sidebars?</h2>
<p>Our first thought: we could simply give the Rapportive sidebar its own scrollbar. That would avoid having to scroll down to the bottom of a long conversation, because you could scroll the sidebar separately. But that approach is pretty bad. A large part of the sidebar may still end up being hidden off-screen, which means you have to go out of your way to scroll down, making it unlikely that you&#8217;ll discover things serendipitously.</p>
<p>Another idea that we ruled out quickly was a &#8216;tabbed&#8217; interface. Tabs work well in a browser, where you have exactly one web page per tab, and each tab is independent. Rapportive isn&#8217;t like that: we might have several pages of information for one person (i.e. the tabs aren&#8217;t independent), and while we have several full tabs for one person, the information for another person might be a lot more sparse. That means that you either have to waste a lot of space (e.g. always have a tab for tweets, even if the person doesn&#8217;t have a Twitter account), or you have things appearing in different places for different people (which is confusing). Finally, tabs require a lot of laborious clicking: you can&#8217;t see what&#8217;s in a tab without clicking it, and you can&#8217;t see the contents of two tabs at the same time. Tabs would have been an unpleasant, clunky interface.</p>
<p>An &#8216;accordion&#8217; interface, like you find in Outlook for example, seemed like a step in a more promising direction:</p>
<p style='text-align:center'>
<a href='/2011/05/accordion.png'><img alt='Accordion interface example' src='/2011/05/accordion.png' /></a>
</p>
<p>You can have several sections, and each section can expand to show additional information when you click it. When you expand one section, another one collapses to make room. (In the screenshot above, I could click &#8216;Contacts&#8217; to see my list of contacts; the list of mailboxes under the &#8216;Mail&#8217; heading would be hidden to make room for the list of contacts.)</p>
<p>Accordions are fairly efficient when space is very limited, but they suck if you have a large screen. If you have enough space that all your sections could be comfortably expanded side-by-side, why limit it to only one expanded section at a time? I don&#8217;t want to have to click a section to see what&#8217;s inside. It&#8217;s the same problem as with tabs.</p>
<p>So I started experimenting with a kind of adaptive accordion which could have several sections expanded at the same time, if there was enough screen space available. Here are some early design ideas:</p>
<p style='text-align:center'>
<a href='/2011/05/section_headings.png'><img alt='Three designs for separators/headings between sidebar sections' src='/2011/05/section_headings.png' width='550' /></a>
</p>
<p>The designs were fairly ugly, but I could see an algorithm emerging here. I figured that we needed an accordion with the following improvements:</p>
<ul>
<li>It should be possible for several sections to be expanded at the same time, up to a maximum of what will fit on screen without scrolling.</li>
<li>When you expand a section, other sections may need to collapse in order to make space on screen. The application should be intelligent about which sections to collapse &#8211; for example, if you haven&#8217;t clicked a section in a long time, you probably find it less useful than a section which you clicked just now. So we should keep the recently-used sections expanded, if possible.</li>
<li>The collapsed version of a section should be useful too; for example, the collapsed Twitter section could show just the username, whereas the expanded version could show the username and three most recent tweets. If the contact doesn&#8217;t have a Twitter account, we shouldn&#8217;t show the section at all, since it would just be a waste of space.</li>
<li>Sometimes a section gets very tall, for example a Facebook status with lots of comments. In that case, we need to limit the section&#8217;s height and give it a scrollbar, to avoid it dominating the entire sidebar. But if we can avoid scrollbars, we should do without.</li>
</ul>
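<p>To make the rules above concrete, here is a minimal sketch of how such a layout decision could work. It is an illustration only, not Rapportive&#8217;s actual code: the <code>Section</code> fields, the <code>layout</code> function and the pixel values are all hypothetical, and it assumes every section&#8217;s expanded height is at least its collapsed height. Sections without data are dropped, everything starts collapsed, and the most recently used sections are expanded first for as long as they fit on screen. Very tall sections are capped at a maximum height (where a scrollbar would appear).</p>

```python
from dataclasses import dataclass


@dataclass
class Section:
    """One sidebar section (all field names are hypothetical)."""
    name: str
    collapsed_height: int   # pixels needed when collapsed
    expanded_height: int    # pixels needed when fully expanded
    last_used: int          # larger value = clicked more recently
    has_data: bool = True   # False: nothing to show for this contact


def layout(sections, available_height, max_section_height):
    """Return (name, height, is_expanded) for each visible section.

    Assumes expanded_height >= collapsed_height for every section.
    """
    # Sections with no data are not shown at all.
    visible = [s for s in sections if s.has_data]

    # Start with every visible section collapsed.
    used = sum(s.collapsed_height for s in visible)
    expanded = {}

    # Try to expand the most recently used sections first.
    for s in sorted(visible, key=lambda s: s.last_used, reverse=True):
        # Cap very tall sections; beyond this they would need a scrollbar.
        height = min(s.expanded_height, max_section_height)
        extra = height - s.collapsed_height
        if used + extra <= available_height:
            used += extra
            expanded[s.name] = height

    # Keep the original top-to-bottom order of the sidebar.
    return [
        (s.name, expanded.get(s.name, s.collapsed_height), s.name in expanded)
        for s in visible
    ]


sidebar = [
    Section("twitter", 20, 120, last_used=3),
    Section("facebook", 20, 400, last_used=2),
    Section("notes", 20, 80, last_used=1),
    Section("linkedin", 20, 100, last_used=5, has_data=False),
]

small_screen = layout(sidebar, available_height=300, max_section_height=200)
large_screen = layout(sidebar, available_height=1000, max_section_height=200)
```

<p>One design choice in this sketch: if a recently used section is too tall to fit, the loop still tries the less recently used ones, so leftover space is not wasted. A stricter variant could stop at the first section that does not fit.</p>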
<p>Even scrollbars, despite being such a standard part of user interface design, have their problems:</p>
<p style='text-align:center'>
<a href='/2011/05/scrolling.png'><img alt='Misalignment of text due to scrollbar spacing' src='/2011/05/scrolling.png' /></a>
</p>
<p>Fortunately such spacing issues are easy to iron out. A harder question is: how do we communicate to the user that they can expand a section?</p>
<h2 id='expandable_sections'>Expandable sections</h2>
<p>In the old design, if you clicked someone&#8217;s Twitter username, we would open up a new browser tab to show their tweet stream. In an accordion design, however, you&#8217;d expect that clicking the collapsed section will cause it to expand (i.e. show recent tweets, not open a new browser tab). Do we break the old interaction and force users to learn a new behaviour, or do we add an extra button for expanding a section?</p>
<p style='text-align:center'>
<a href='/2011/05/expand1.png'><img alt='Up/down arrow button to trigger expansion of a section' src='/2011/05/expand1.png' /></a>
</p>
<p>That was my first attempt. The arrows serve two purposes: to indicate that the section can be expanded, and to act as a button to trigger the actual expanding. But I didn&#8217;t like it. Some sections would have arrows and others wouldn&#8217;t (because we don&#8217;t always have additional information), so the &#8220;connect&#8221;/&#8220;add friend&#8221; buttons could become strangely misaligned. It also made the interface look more cluttered and complicated.</p>
<p>Surely we could do better? For example, could we show the button for expanding a section only when you hover over it? That&#8217;s an interesting idea. What&#8217;s more, we could then calculate how tall the section would be if it was expanded, and indicate it with the height of the arrow:</p>
<p style='text-align:center'>
<a href='/2011/05/expand2.png'><img alt='Large arrow when hovering mouse over expanded section; click the arrow to expand' src='/2011/05/expand2.png' width='550' /></a>
</p>
<p>At this point we&#8217;re really moving away from established user interface patterns. Will users notice the arrow, and understand what it means? Will they figure out that you can click it? It contains a lot of information: the fact that the section can be expanded, how big it will be and where it will be placed when expanded. It&#8217;s also a much bigger click target than the previous arrow button, which is good. But it still has the problem of not looking particularly like a button. (It goes grey when you hover over it, to indicate that you can interact with it, but still it&#8217;s not exactly obvious.)</p>
<p>I was starting to get sick of this redesign, but fortunately, inspiration struck again. If we&#8217;re already using an arrow on hover to indicate how tall the expanded section will be, well&#8230; why don&#8217;t we use <em>the expanded section itself</em> to indicate how tall it will be? Yes, we can just show the expanded section itself on hover!</p>
<p style='text-align:center'>
<a href='/2011/05/expand3.png'><img alt='Showing the expanded version when hovering over a collapsed section' src='/2011/05/expand3.png' width='550' /></a>
</p>
<p>It&#8217;s obvious in retrospect, but it took a surprisingly long time to come up with this design. I call it the <a href='http://en.wikipedia.org/wiki/Jinn_in_popular_culture'>&#8220;genie&#8221;</a> because it looks like a ghost that has come out of an oil lamp. Although the version above still isn&#8217;t pretty, it really got to the bottom of many of the challenges I discussed at the beginning:</p>
<ul>
<li>When you&#8217;re not hovering your mouse over the sidebar, Rapportive remains minimalistic and uncluttered. No more buttons than necessary.</li>
<li>If you&#8217;re on a large screen, the sections of the sidebar expand to fill the screen space, so you can see everything at a glance. No need to click anything (unless you want to add them as a friend, of course).</li>
<li>If you&#8217;re on a smaller screen, we collapse your lesser-used sections to make them more compact; in most cases, this allows us to fit everything on screen without need for scrolling.</li>
<li>When you&#8217;re interacting with the sidebar, we allow Rapportive to pop up additional information outside of the sidebar. But we don&#8217;t encroach on your email space when you&#8217;re not interacting with the sidebar.</li>
<li>The visual cues make it pretty obvious that the expanded section is a bigger, more verbose version of the collapsed section. (I think it&#8217;s a bit like a magnifying glass.)</li>
<li>If there are several collapsed sections next to each other in the sidebar, you can skim the content of all of them by slowly moving the mouse across each of the collapsed sections. As the mouse moves from one section to another, the genie for the previous section disappears and the one for the new section appears. Wonderful for getting a quick overview of someone&#8217;s online activity before you reply to their email.</li>
</ul>
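<p>The fitting behaviour described in the list above can be sketched roughly as follows. This is only an illustration, not Rapportive&#8217;s actual code: the function name <code>chooseCollapsed</code>, the shape of the section objects and the <code>usage</code> score are all assumptions made for the example. The idea is to start with everything expanded and collapse the least-used sections one by one until the sidebar fits the available height.</p>

```javascript
// Illustrative sketch only; not Rapportive's actual code.
// Each section is assumed to look like:
//   { name: 'twitter', expandedHeight: 300, collapsedHeight: 40, usage: 2 }
// where `usage` is a hypothetical score (higher means used more often).
function chooseCollapsed(sections, availableHeight) {
  // Start from the height needed when every section is expanded.
  var total = sections.reduce(function (sum, s) {
    return sum + s.expandedHeight;
  }, 0);

  // Consider the least-used sections first, without mutating the input.
  var leastUsedFirst = sections.slice().sort(function (a, b) {
    return a.usage - b.usage;
  });

  var collapsed = [];
  leastUsedFirst.forEach(function (s) {
    if (availableHeight >= total) {
      return; // everything already fits, so leave the rest expanded
    }
    // Collapsing a section saves the difference between its two heights.
    total -= s.expandedHeight - s.collapsedHeight;
    collapsed.push(s.name);
  });
  return collapsed; // names of the sections to show in collapsed form
}

// Hypothetical example data (heights in pixels).
var exampleSections = [
  { name: 'profile', expandedHeight: 200, collapsedHeight: 40, usage: 10 },
  { name: 'twitter', expandedHeight: 300, collapsedHeight: 40, usage: 2 },
  { name: 'linkedin', expandedHeight: 250, collapsedHeight: 40, usage: 5 }
];
console.log(chooseCollapsed(exampleSections, 1000)); // large screen
console.log(chooseCollapsed(exampleSections, 500));  // smaller screen
```

<p>On a tall enough screen nothing is collapsed; as the available height shrinks, the least-used sections are collapsed first. If even the fully collapsed sidebar is too tall, every section ends up collapsed and the sidebar has to scroll.</p>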
<p>Not all is great though. If there is space, we&#8217;d still like to show you the expanded section in the sidebar right away, without you having to hover the mouse. But how do we make clear to users that sections can be expanded?</p>
<p>In the screenshot above, I tried using a button with an arrow pointing right. If you clicked it, the expanded section would slide into the sidebar, and the other sections would move out of the way to make room. I liked the animation, but the button was ugly. Fortunately, Rahul had the idea that we could use a right-pointing mouse cursor to indicate the same thing. So now you can click anywhere on the genie and it will slide into the sidebar.</p>
<p>After some graphical tweaking, this is what the final design looks like:</p>
<p style='text-align:center'>
<a href='/2011/05/expand4.png'><img alt='End result: Rapportive&apos;s new sidebar design with collapsible section' src='/2011/05/expand4.png' width='550' /></a>
</p>
<p>I hope you&#8217;ll agree that it is <em>gorgeous</em>. We have added very little new user interface (and we&#8217;ve taken a lot away, compared to earlier designs), but the result is very effective. It fits neatly on both big and small screens, it is easy to use, and it is actually lots of fun to play with. Give it a try for yourself, and let us know what you think!</p>
<p>Usability without design is dreary. Design without usability is pretentious. Design and usability together are delightful.</p>
</div>
<div class="by-line">
<address class="author vcard full-author">
<span class="by">By</span> <a class="url fn" href="http://martin.kleppmann.com/">Martin Kleppmann</a>
</address>
<span class="date full-date">
<abbr class="published" title="2011-05-24T00:00:00-07:00">24 May 2011</abbr>
</span>
</div>
<p class="comments-link">
<a href="/2011/05/24/evolution-of-rapportive-new-design.html#disqus_thread">Comments</a>
</p>
<div class="clear"></div>
</div>
</div>
<div id="sidebar">
<div id="carrington-subscribe" class="widget">
<h2 class="widget-title">Subscribe</h2>
<a class="feed" title="RSS 2.0 feed for posts" rel="alternate" href="http://feeds.feedburner.com/martinkl">
Site <acronym title="Really Simple Syndication">RSS</acronym> feed
</a>
<div id="mc_embed_signup">
<p>
Enjoyed this? To get notified when I write something new,
<a href="http://twitter.com/martinkl">follow me</a> on Twitter,
<a href="http://feeds.feedburner.com/martinkl">subscribe</a> to the RSS feed,
or type in your email address:
</p>
<form action="http://rapportive.us2.list-manage.com/subscribe/post?u=9a1adaf549282981a96e171d1&amp;id=4543b695f6"
method="post" id="mc-embedded-subscribe-form" name="mc-embedded-subscribe-form" class="validate" target="_blank">
<fieldset>
<div class="mc-field-group">
<label for="mce-EMAIL">Email:</label>
<input type="text" value="" name="EMAIL" class="required email" id="mce-EMAIL" />
<input type="submit" value="Subscribe" name="subscribe" id="mc-embedded-subscribe" class="btn" />
</div>
<div id="mce-responses">
<div class="response" id="mce-error-response" style="display:none"></div>
<div class="response" id="mce-success-response" style="display:none"></div>
</div>
</fieldset>
</form>
<p class="disclaimer">
I won't give your address to anyone else, won't send you any spam, and you can unsubscribe at any time.
</p>
</div>
</div>
<div id="carrington-about" class="widget">
<div class="about">
<h2 class="title">About</h2>
<p>Hello! I'm Martin Kleppmann, entrepreneur and software craftsman.
I co-founded <a href="http://rapportive.com/">Rapportive</a>
(<a href="http://blog.rapportive.com/rapportive-acquired-by-linkedin">acquired</a>
by <a href="http://www.linkedin.com/">LinkedIn</a> in 2012) and Go Test It (acquired by
<a href="http://www.red-gate.com/">Red Gate Software</a> in 2009).</p>
<p>I care about making stuff that people want, great people and culture, the web and
its future, marvellous user experiences, maintainable code and scalable architectures.</p>
<p>I'd love to hear from you, so please leave comments, or feel free to
<a rel="author" href="/contact.html">contact me directly</a>.</p>
</div>
</div>
<div id="primary-sidebar">
</div>
<div id="secondary-sidebar">
<div id="carrington-archives" class="widget">
<h2 class="title">Recent posts</h2>
<ul>
<li>08 Oct 2012: <a href="/2012/10/08/complexity-of-user-experience.html">The complexity of user experience</a></li>
<li>01 Oct 2012: <a href="/2012/10/01/rethinking-caching-in-web-apps.html">Rethinking caching in web apps</a></li>
<li>18 Jun 2012: <a href="/2012/06/18/java-hashcode-unsafe-for-distributed-systems.html">Java's hashCode is not safe for distributed systems</a></li>
<li>16 Aug 2011: <a href="/2011/08/16/founderly-interview.html">My FounderLY interview</a></li>
<li>24 May 2011: <a href="/2011/05/24/evolution-of-rapportive-new-design.html">Evolution of Rapportive's new design</a></li>
<li><a href="/archive.html">Full archive</a></li>
</ul>
</div>
</div>
</div>
</div> <!-- div.wrapper, started in 'before.html' -->
<hr class="divider" />
<div id="footer">
<div class="wrapper">
<p id="generator-link">
<a rel="license" href="http://creativecommons.org/licenses/by/3.0/"
style="float: left; padding: 0.3em 1em 0 0;"><img alt="Creative Commons License"
src="http://i.creativecommons.org/l/by/3.0/88x31.png" /></a>
Unless otherwise specified, all content on this site is licensed under a
<a rel="license" href="http://creativecommons.org/licenses/by/3.0/">Creative Commons
Attribution 3.0 Unported License</a>.
Theme borrowed from
<span id="theme-link"><a href="http://carringtontheme.com" title="Carrington theme for WordPress">Carrington</a></span>,
ported to <a href="https://github.com/mojombo/jekyll">Jekyll</a> by Martin Kleppmann.
</p>
</div>
</div>
</div>
<script type="text/javascript">
var disqus_shortname = 'martinkl';
(function () {
var s = document.createElement('script'); s.async = true;
s.type = 'text/javascript';
s.src = 'http://disqus.com/forums/' + disqus_shortname + '/count.js';
(document.getElementsByTagName('HEAD')[0] || document.getElementsByTagName('BODY')[0]).appendChild(s);
}());
</script>
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
var pageTracker = _gat._getTracker("UA-7958895-1");
pageTracker._trackPageview();
} catch (err) {}
</script>
</body>
</html>