<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title><![CDATA[Category: Business | Bear Metal]]></title>
<link href="http://bearmetal.eu/theden/categories/business/atom.xml" rel="self"/>
<link href="http://bearmetal.eu/"/>
<updated>2014-01-15T11:43:00+02:00</updated>
<id>http://bearmetal.eu/</id>
<author>
<name><![CDATA[Bear Metal OÜ]]></name>
</author>
<generator uri="http://octopress.org/">Octopress</generator>
<entry>
<title type="html"><![CDATA[Do You Know the Biggest Reason Why Enterprise Software Sucks?]]></title>
<link href="http://bearmetal.eu/theden/do-you-know-the-biggest-reason-why-enterprise-software-sucks/"/>
<updated>2014-01-15T11:43:00+02:00</updated>
<id>http://bearmetal.eu/theden/do-you-know-the-biggest-reason-why-enterprise-software-sucks</id>
<content type="html"><![CDATA[<p><em>We all know the story. Your company was going to get this big new shiny ERP software. It was going to replace a third of the workforce in the company, cut the costs in half and make everyone happy. In reality the project went two years over schedule, cost three times as much as envisioned, and the end result was a steaming pile of shit.</em></p>
<p><a href="http://www.flickr.com/photos/53326337@N00/8043877054/"><img src="https://farm9.staticflickr.com/8453/8043877054_883963cf80_c.jpg" alt="" /></a></p>
<p><small>Photo by <a href="http://www.flickr.com/photos/53326337@N00/8043877054/">Quinn Dombrowski</a>, used under the Creative Commons license.</small></p>
<p>At this point the blame-throwing started. The provider duped the client with waterfall and exorbitant change fees. The buyer didn’t know how to act as a client in an information system project. The specs weren’t good/detailed/strict/loose enough. The consultants just weren’t that good in the first place. On and on and on.</p>
<p>While one or more of the above is invariably true in failed software projects, there’s one issue that almost every failed enterprise software project has in common: <em>the buyers were not (going to be) the users of the software</em>.</p>
<p>This simple fact has huge implications. Ever heard that “the client didn’t really know what they wanted”? Well, that’s because they didn’t. Thus, most such software projects are built with something completely different than the end user in mind, be it the ego of the CTO, his debt to his mason brothers who happen to be in the software business<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>, or just the cheapest initial bid<sup id="fnref:2"><a href="#fn:2" rel="footnote">2</a></sup>. In any case, it’s in the software provider’s best interest to appeal to the decision maker, not the people actually using the system.</p>
<p>Of course, not every software buyer is as bad as described above. Many truly care about the success of the system and even its users. If for no other reason, at least because it has a direct effect on the company’s bottom line. But even then, they just don’t have the first-hand experience of working in the daily churn. They simply can’t know what’s best for the users. Of course, this gets even worse in the design-by-committee, big-spec-upfront projects.</p>
<p>Since it’s not very likely that we could change the process of making large software project purchases any time soon, what can we as software vendors do? One word: <em>empathy</em>. If you just take a spec and implement it with no questions asked, shame on you. You deserve all the blame. Your job is not to implement what the spec says. Heck, your job isn’t even to create what the client wants. Your job is to build what the client – no, the end users – need. For this – no matter how blasphemous it might sound to an engineer – you have to actually <em>talk</em> to the people who will be using your software.</p>
<p>This is why it’s so important to have the software developers actually do what the end users do. <strong>If you’re building call-center software, make the developers work in the call center for a day or a week. If you’re building web apps, make the developers and designers work the support queue, don’t just outsource it to India.</strong></p>
<p>There is no better way to understand the needs for the software you’re building than to talk directly to its users or use it yourself for real, in a real-life situation. While there aren’t that many opportunities for dogfooding when building (perhaps internal) enterprise software for a client, there’s nothing preventing you from sending your people to the actual cost center. Nothing will give as much insight into the needs and pains of the actual users. No spec will ever give you as broad a picture. No technical brilliance will ever make up for lacking domain knowledge. And no client will ever love you as much as the one in the project where you threw yourself (even without being asked) into the line of fire. That’s what we here at Bear Metal insist on doing at the start of every project. I think you should, too.</p>
<hr />
<p><em>We at Bear Metal have some availability open for short and mid-term projects. If you’re looking for help building, running, scaling or marketing your web app, <a href="mailto:info@bearmetal.eu">get in touch</a>.</em></p>
<div class="footnotes">
<hr/>
<ol>
<li id="fn:1">
<p>It’s surprising how often the same people actually represent both the buyer and the seller. This happens all the time e.g. in the patient care systems projects.<a href="#fnref:1" rev="footnote">↩</a></p></li>
<li id="fn:2">
<p>Never mind that the cheapest initial bid almost always balloons to something completely different in the end.<a href="#fnref:2" rev="footnote">↩</a></p></li>
</ol>
</div>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[Are You Flying Blind – How to Regain Control of Production Systems With the Help of Situation Awareness?]]></title>
<link href="http://bearmetal.eu/theden/situation-awareness/"/>
<updated>2013-09-03T16:57:00+03:00</updated>
<id>http://bearmetal.eu/theden/situation-awareness</id>
<content type="html"><![CDATA[<p><figure markdown="1">
<a href="http://www.flickr.com/photos/robbn1/3391187126/">
<img src="https://farm4.staticflickr.com/3454/3391187126_4e62f6a374_b.jpg">
</a></p>
<p> <figcaption>
<p>
Photo by <a href="http://www.flickr.com/photos/robbn1/3391187126/">Robb North</a>
</p>
</figcaption>
</figure></p>
<p>A few months of work during a sabbatical yielded a product that nailed a problem in the preventative healthcare space. After a freemium window, the product gained good market traction and you spun off a new company with three coworkers. Customers are raving, sales trends are on the up, the engineering team is growing and there are conceptual products in the pipeline.</p>
<p>Three months down the line, there are 4 production applications, hordes of paying customers, a few big contracts with strict SLAs (service-level agreements) and enough resources to spin off a presence in Europe. A new feature that combines these products into a suite is slated for release. Engineering hauled ass for 2 months and sales is super stoked to be able to pitch it to customers.</p>
<h2>Shipping</h2>
<p>A few days before the feature release, a set of new servers is provisioned to buffer against the upcoming marketing push. Due diligence on various fronts is completed, mostly through static analysis of the current production stack by various individuals. On Saturday morning at 1am PST the team deploys during a window with a historically low transaction volume. Representatives of a few departments sign off on the release, although admittedly there are still dark corners and the OK is mostly based on a few QA passes. Champagne pops, drinks are had and everyone calls it a day. But then…</p>
<h2>When things go south</h2>
<p>At 9am PST various alerts flood the European operations team: only 25% of the platform is available, support is overwhelmed and stress levels go up across the board. Some public-facing pages load intermittently, MySQL read load is sky high and application log streams are blank. This deployment, as with most naive releases, was flying blind. A snapshot of a working system prior to release isn’t of much value if it can’t be easily reproduced after rollout for comparison.</p>
<p>The release rested on assumptions about time, space and other variables. There was a total lack of <strong>situation awareness</strong> and thus no visibility into the expected impact of these changes. Keeping the software that pays the bills running is more important today than a flashy new feature. However, one must move forward and there are processes and tools available for mitigating risk.</p>
<h2>What is situation awareness?</h2>
<p>Situation awareness can be defined as an engineering team’s knowledge of both the internal and external states of their production systems, as well as the environment in which they are operating. Internal states refer to health checks, statistics and other monitoring info. The external environment refers to things we generally can’t directly control: humans and their reactions; hosting providers and their networks; acts of god and other environmental issues.</p>
<p>It’s thus <em>a snapshot in time of system status that provides the primary basis for decision making and operation of complex systems</em>. Experience with a given system gives team members the ability to remain aware of everything that is happening concurrently and to integrate that sense of awareness into what they’re doing at any moment.</p>
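<p>As a rough illustration of such a snapshot, the Python sketch below captures a handful of internal metrics before and after a release and reports the ones that moved sharply. The <code>Snapshot</code> fields, the <code>diff_snapshots</code> helper, the 50% tolerance and all sample numbers are assumptions made up for this example, not measurements from a real system.</p>

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical internal-state snapshot; the field names are illustrative."""
    redis_connected_clients: int
    mysql_reads_per_sec: float
    log_lines_per_min: int
    healthy_nodes: int


def diff_snapshots(before, after, tolerance=0.5):
    """Return the metrics whose relative change exceeds `tolerance`."""
    drifted = {}
    after_values = asdict(after)
    for name, old in asdict(before).items():
        new = after_values[name]
        baseline = max(abs(old), 1)  # avoid dividing by zero for idle metrics
        if abs(new - old) / baseline > tolerance:
            drifted[name] = (old, new)
    return drifted


# Made-up sample values for a pre-release and a post-release snapshot.
before = Snapshot(redis_connected_clients=300, mysql_reads_per_sec=1200.0,
                  log_lines_per_min=5000, healthy_nodes=12)
after = Snapshot(redis_connected_clients=1024, mysql_reads_per_sec=9800.0,
                 log_lines_per_min=0, healthy_nodes=12)

for name, (old, new) in diff_snapshots(before, after).items():
    print(f"{name}: {old} -> {new}")
```

<p>Comparing a snapshot against itself returns an empty result, which makes the same helper usable as a quick sanity check that the collection step is stable.</p>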
<h2>How could situation awareness have helped?</h2>
<p>The new feature created a dependency tree between 4 existing applications, a lightweight data synchronization service (Redis) and the new nodes that were spun up. Initial investigation and root cause analysis revealed that the following went wrong:</p>
<ul>
<li>The Redis server was configured for only 1024 connections and it tanked when the backends warmed up, because client connections were lazily initialized.</li>
<li>Initial data synchronization (cache warmup) put excessive load on MySQL and other data stores also used for customer-facing reporting.</li>
<li>The data payloads used for synchronization were often very large for outlier customers, effectively blocking the Redis server’s event loop and causing memory pressure.</li>
<li>The new nodes were spun up with the wrong Ruby major version and were also missing critical packages required for normal operations.</li>
<li>A new feature that rolls the “logger” utility into some core init scripts piggybacked on this release. A syntax error fubar’ed output redirection and thus there weren’t any log streams.</li>
</ul>
<p>Without much runtime introspection in place, it was very difficult to predict what the release impact would be. Although not everything could be covered ahead of time for this release, even with basic runtime analysis, monitoring and good logging it would have been possible to spot trends and avoid issues bubbling up systematically many hours later.</p>
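<p>A basic pre-release check along these lines might have flagged the Redis connection cap before customers did. The sketch below is plain Python arithmetic, not a call into Redis itself; the node and worker counts and the 20% headroom are hypothetical, and only the 1024 connection limit comes from the incident above.</p>

```python
def expected_redis_connections(nodes, workers_per_node, connections_per_worker=1):
    """Worst-case client connections once every lazily connecting worker is warm."""
    return nodes * workers_per_node * connections_per_worker


def has_connection_headroom(maxclients, expected, headroom=0.2):
    """True if `expected` fits under `maxclients` with a safety margin left over."""
    usable = maxclients * (1 - headroom)
    return usable >= expected


REDIS_MAXCLIENTS = 1024  # the limit from the incident described above

# Hypothetical fleet sizes before and after the new nodes were added.
for label, nodes in (("before", 20), ("after", 40)):
    expected = expected_redis_connections(nodes, workers_per_node=32)
    verdict = "ok" if has_connection_headroom(REDIS_MAXCLIENTS, expected) else "OVER LIMIT"
    print(f"{label}: {expected} expected connections, {verdict}")
```

<p>Because the connections were opened lazily, a count taken right after deploy would have looked healthy. Estimating the warm worst case up front sidesteps that trap.</p>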
<p>Another core issue here is the “low traffic” release window. It’s often considered good practice to release during such times to minimize fallout in the worst case. However, it’s sort of akin to commercial Boeing pilots only training on Cessnas. Any residual and overlooked issues also tend to surface only hours later, when traffic ramps up again. This divide between cause and effect complicates root cause analysis immensely. You want to be able to infer errors from the system state or, in the worst case, hear about them from QA or an employee, and most definitely not from customers interacting with your product at 9am.</p>
<p>One also cannot overlook the fact that each team suddenly had a direct link with at least 3 other applications, new (misconfigured) backends and Redis. Each team, however, still mostly had a mental model of a single isolated application.</p>
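<p>One way to make that wider mental model explicit is a small dependency map that answers what breaks when a given component fails. The component names and edges below are hypothetical, loosely modelled on the four applications, Redis and MySQL described above; treat it as a sketch, not the actual topology.</p>

```python
from collections import deque

# Hypothetical dependency map: each key depends on the listed components.
DEPENDS_ON = {
    "app_a": ["redis", "mysql"],
    "app_b": ["redis", "mysql"],
    "app_c": ["redis"],
    "app_d": ["redis", "app_a"],
    "reporting": ["mysql"],
}


def affected_by(failed, depends_on):
    """Return every component that directly or transitively depends on `failed`."""
    # Invert the map so we can walk from a component to its dependents.
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    seen = set()
    queue = deque([failed])
    while queue:  # breadth-first walk over the inverted edges
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


print(sorted(affected_by("redis", DEPENDS_ON)))
print(sorted(affected_by("mysql", DEPENDS_ON)))
```

<p>Even a map this small shows that a data store failure reaches applications that never talk to it directly, which is exactly the kind of link a single-application mental model misses.</p>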
<h2>Why is situation awareness so important?</h2>
<p>We at Bear Metal have been through a few technology stacks in thriving businesses and noticed a recurring theme and problem. Three boxes become fifty, ad-hoc nodes are spun up for testing, special slaves are provisioned for data analysis, applications are careless with resources and a new service quickly becomes a platform-wide single point of failure. Moving parts increase exponentially and so do potential points of failure.</p>
<p>Engineering, operations and support teams often have no clue what runs where, or what the dependencies are between them. This is especially true for fast growing businesses that reach a critical mass - teams tend to become more specialized, information silos are common and thus total system visibility is also quite narrow. Having good knowledge of your runtime (or even just a perception) is instrumental in making informed decisions for releases, maintenance, capacity planning and discovering potential problems ahead of time. Prediction only makes sense once there’s a good perception of “current state” in place to minimize the rendering of fail whales.</p>
<h2>Web operations and awareness</h2>
<p>Operations isn’t about individuals, but teams. The goal is to make information exchange between team members and other teams as passive as possible. Monitoring, alerting and other push-based systems help a lot with passive learning about deployments. It’s mostly effortless and easy for individuals to build up knowledge and trends over time.</p>
<p>However, when we actively need to search for information, we can only search for what we already know exists. It’s impossible to find anything we’re not aware of. Given the primary goal of an operations team is platform stability in the face of changes, time to resolution (TTR) is always critical and actively seeking out information when under pressure is a luxury.</p>
<p>Historically a system-wide view has always been the territory of the CTO, operations team and perhaps a handful of platform or integration engineers. In line with devops culture, we need to acknowledge this disconnect and explore solutions for raising situation awareness of critical systems for all concerned.</p>
<h2>And now</h2>
<p>Take a minute and ponder the following:</p>
<ul>
<li>How well do you think you know your systems?</li>
<li>Are developers able to infer potential release risks themselves?</li>
<li>When things go south, how well informed is your support team and what information can they give customers?</li>
<li>Are you comfortable releasing at any time?</li>
</ul>
<p>In our next post, we’ll explore some common components, variables and events required for being “on top” of your stack. In the meantime, what causes you the most pain when trying to keep up with your production systems? What would you write a blank cheque for? :-)</p>
<p><a href="https://news.ycombinator.com/item?id=6332734">Discuss on Hacker News</a>.</p>
]]></content>
</entry>
<entry>
<title type="html"><![CDATA[It’s Not About Us, It’s About You (and Not Really About You, Either)]]></title>
<link href="http://bearmetal.eu/theden/its-not-about-us-its-about-you/"/>
<updated>2013-08-14T15:51:00+03:00</updated>
<id>http://bearmetal.eu/theden/its-not-about-us-its-about-you</id>
<content type="html"><![CDATA[<p><figure markdown="1">
<a href="http://www.flickr.com/photos/22711505@N05/8987537991/in/photolist-eGctVz-Rw9Az-RVB46-d1oXub-3od3r2-5maP4G-d1pEjS-2z4E45-7zg9zD-dk9TPx-dk9ViL-62gdUK-cbhisA-dk9VoE-6sjCDX-6sjCQ4-8EcAxc-d1pPAm-aSuUEX-7WR5Va-9SMEDS-mzFjV-9SrEU9-5bvnMg-bbvqpn-6S8uV4-6ScRvQ-8ypVzY-bbvqoz-7YzNHn-eecQ8W-d1pDv9-6QYGRg-62QwZR-fbZJU2-bXZ3iw-hZ9cW-hZ9cX-hZa5p-hZa5q-5iHQyg-4SnuGn-g5GYQ-9dbkFa-b5vRQc-d1pF7C-5oG5br-bcHysn-c3QYZW-c3QYiN-749XSZ/"><img src="https://farm3.staticflickr.com/2859/8987537991_795f7568ca_h.jpg" alt="" /></a></p>
<p> <figcaption markdown="1">
<p>Photo by <a href="http://www.flickr.com/photos/22711505@N05/8987537991/in/photolist-eGctVz-Rw9Az-RVB46-d1oXub-3od3r2-5maP4G-d1pEjS-2z4E45-7zg9zD-dk9TPx-dk9ViL-62gdUK-cbhisA-dk9VoE-6sjCDX-6sjCQ4-8EcAxc-d1pPAm-aSuUEX-7WR5Va-9SMEDS-mzFjV-9SrEU9-5bvnMg-bbvqpn-6S8uV4-6ScRvQ-8ypVzY-bbvqoz-7YzNHn-eecQ8W-d1pDv9-6QYGRg-62QwZR-fbZJU2-bXZ3iw-hZ9cW-hZ9cX-hZa5p-hZa5q-5iHQyg-4SnuGn-g5GYQ-9dbkFa-b5vRQc-d1pF7C-5oG5br-bcHysn-c3QYZW-c3QYiN-749XSZ/">Ron Cogswell</a></p>
</figcaption>
</figure></p>
<p>One of the saddest things to happen online was in 2007 when my all-time favorite author and presenter, Kathy Sierra, received death threats and thus retreated from the public web. It also meant that she stopped writing her <a href="http://headrush.typepad.com">Creating Passionate Users</a> weblog, which had been a great inspiration for me for quite some time. Thank god she didn’t <a href="http://ejohn.org/blog/eulogy-to-_why/">pull a _why</a> on it.</p>
<p>While it has been more than six years since Kathy’s last blog post (is it really that long?), there is no reason we shouldn’t apply her lessons even in today’s online world.</p>
<p>Maybe the most famous mantra of Sierra was that in order to create passionate users you should make <em>them</em> kick ass. Sure, it’s nice if your UI boasts übercool 3D CSS transformations but if it doesn’t help your users shine, no one (well, except for some web geeks) will give a flying fuck.</p>
<p>She demonstrated this with the fact that very often companies spend a huge amount of effort and money to hone the living daylights out of their marketing materials but don’t really put that much time into what actually helps their users: tutorials and user manuals. Of course this helped her immensely by creating a market for the visual <a href="http://www.headfirstlabs.com">Head First</a> book series at O’Reilly that she curated.</p>
<p>Apple has for a long time been a good example of helping its users kick ass. The user manual of the old Final Cut Pro 7 was also a great introduction to the art of video editing<sup id="fnref:1"><a href="#fn:1" rel="footnote">1</a></sup>. Likewise, most of Apple’s ads show things you can do and <em>create</em> with their products, not just random people dancing around the pool.</p>
<p>People care about how they can kick ass <em>themselves</em> and they need to be able to learn it to capitalize on it. Nowadays it seems that companies are much more interested in giving people free apps and then using psychological tricks to milk money out of them than helping them shine. Which, coincidentally, brings us back to Kathy Sierra.</p>
<p>To my pleasant surprise, I learned last week that Kathy is back under the pseudonym <a href="https://twitter.com/seriouspony">Serious Pony</a>, with <a href="http://seriouspony.com/blog/">a new blog</a> of the same name. The first article, <a href="http://seriouspony.com/blog/2013/7/24/your-app-makes-me-fat">Your app makes me fat</a>, is of the same awesome quality as her old pieces. In it, she tackles head-on the aforementioned gamification trend and the <a href="http://youarenotsosmart.com/2012/04/17/ego-depletion/">ego depletion</a> tax it puts on us as app users.</p>
<p>To honor Kathy, we wanted to start this blog off by not talking about ourselves, because <a href="http://bearmetal.eu/">Bear Metal</a> isn’t really about us, but you. And – assuming you are a developer, entrepreneur or content provider – not really about you either. It’s about who we (you and us) serve. Because without them there is no market, no audience, no need, no problems to solve, no pains to relieve. Your customers should be the ones that matter to you. And they don’t care about you or us. They care about whether your product can make them shine.</p>
<p>Can your product help <em>them kick ass</em>? <em>Does it</em>? Are you <em>communicating that effectively</em> to your current and potential customers? That is all that should matter.</p>
<div class="footnotes">
<hr/>
<ol>
<li id="fn:1">
<p>Unfortunately this can’t be said about the manual of the new version, Final Cut Pro X.<a href="#fnref:1" rev="footnote">↩</a></p></li>
</ol>
</div>
]]></content>
</entry>
</feed>