http://codekata.com/kata/kata08-conflicting-objectives/

For this kata, we’re going to write a program to solve a simple problem, and we’re going to write it with three different sub-objectives. Our program is going to process the dictionary we used in previous kata, this time looking for all six-letter words that are composed of two concatenated smaller words. For example:


~~~
  al + bums => albums
  bar + ely => barely
  be + foul => befoul
  con + vex => convex
  here + by => hereby
  jig + saw => jigsaw
  tail + or => tailor
  we + aver => weaver
~~~
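
The core check can be sketched in a few lines: split a word at every interior position and keep the splits whose two halves are both known words. This is only an illustration under stated assumptions, not the solution developed below. The tiny `words` set and the helper names `splits` and `joined_pairs` are hypothetical stand-ins for the real word list.

```python
# Hypothetical miniature word list, standing in for the real dictionary file.
words = {"jig", "saw", "jigs", "aw", "jigsaw", "con", "vex", "convex", "twenty"}

def splits(word):
    """Yield every (left, right) split of word into two non-empty parts."""
    for i in range(1, len(word)):
        yield word[:i], word[i:]

def joined_pairs(word, words):
    """Return the splits of word whose halves are both present in words."""
    return [(left, right) for left, right in splits(word)
            if left in words and right in words]

print(joined_pairs("jigsaw", words))   # [('jig', 'saw'), ('jigs', 'aw')]
print(joined_pairs("twenty", words))   # []
```

Note that a word can have more than one valid split, so the example list above need not be exhaustive for any given dictionary.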

Write the program three times.

    The first time, make the program as readable as you can make it.
    The second time, optimize the program to run as fast as you can make it.
    The third time, write as extendible a program as you can.

Now look back at the three programs and think about how each of the three sub-objectives interacts with the others. For example, does making the program as fast as possible make it more or less readable? Does it make it easier to extend? Does making the program readable make it slower or faster, flexible or rigid? And does making it extendible make it more or less readable, slower or faster? Are any of these correlations stronger than others? What does this mean in terms of optimizations you may perform on the code you write?

In [45]:
!! head -13 ../data/wordlist.txt

['A',
 "A'asia",
 "A's",
 'AA',
 "AA's",
 'AAA',
 'AAM',
 'AB',
 "AB's",
 'ABA',
 'ABC',
 "ABC's",
 'ABCs']

In [44]:
!! egrep '^bar.?.?.?$' ../data/wordlist.txt | sort

['bar',
 "bar's",
 'baraza',
 'barb',
 "barb's",
 'barbal',
 'barbe',
 'barbed',
 'barbel',
 'barber',
 'barbes',
 'barbet',
 'barbie',
 'barbs',
 'barbut',
 'barca',
 'barcas',
 'bard',
 "bard's",
 'barde',
 'barded',
 'bardes',
 'bardic',
 'bardo',
 'bards',
 'bardy',
 'bare',
 'bared',
 'barege',
 'barely',
 'barer',
 'bares',
 'barest',
 'barf',
 "barf's",
 'barfed',
 'barfly',
 'barfs',
 'barful',
 'barfy',
 'barge',
 'barged',
 'bargee',
 'barges',
 'barhop',
 'baric',
 'baring',
 'barish',
 'barit',
 'barite',
 'barium',
 'bark',
 "bark's",
 'barkan',
 'barked',
 'barken',
 'barker',
 'barks',
 'barky',
 'barley',
 'barlow',
 'barm',
 "barm's",
 'barman',
 'barmen',
 'barmie',
 'barms',
 'barmy',
 'barn',
 "barn's",
 'barned',
 'barney',
 'barns',
 'barny',
 'barock',
 'baron',
 'barong',
 'barons',
 'barony',
 'barque',
 'barra',
 'barras',
 'barrat',
 'barre',
 'barred',
 'barrel',
 'barren',
 'barres',
 'barret',
 'barrio',
 'barrow',
 'bars',
 'barter',
 'barton',
 'barye',


<h1>Solution 1</h1>

In [31]:
import codecs

class JoinedWords():
    """
    Reads a dictionary file and finds the six-letter words made of two shorter words.
    e.g. 
        jig + saw => jigsaw
    """
    def __init__(self, dictionary_file, encoding='utf-8'):
        self.shortWords = set()   # all words shorter than six letters
        self.longWordsDict = {}   # six-letter word -> list of [left, right] shortWords pairs that join to form it
        
        with codecs.open(dictionary_file, 'r', encoding) as f:
            for word in f.read().split():
                if len(word) < 6:
                    self.shortWords.add(word)
                elif len(word) == 6:
                    self.longWordsDict.setdefault(word,list())
                    
        for keyWord in self.longWordsDict.keys():
            for i in range(1, 6):          # every split point inside a six-letter word
                leftWord = keyWord[:i]     # leftmost i chars
                rightWord = keyWord[i:]    # remaining 6 - i chars
                if leftWord in self.shortWords and rightWord in self.shortWords:
                    self.longWordsDict[keyWord].append([leftWord, rightWord])
                    
    def getJoinedWords(self, longWord):
        """
        Returns the list of shortWords pairs that combine to make longWord
        """
        return self.longWordsDict.get(longWord, [])
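
As a pointer toward the third objective (extendibility), the same idea can be sketched as a function that takes the target length as a parameter. This is a hedged sketch, not part of Solution 1: the function name `find_joined_words` and the `length` and `min_part` parameters are assumptions introduced here, and unlike `JoinedWords` it omits words that have no valid split.

```python
def find_joined_words(words, length=6, min_part=1):
    """Map each word of the given length to its [left, right] pairs.

    min_part is the shortest fragment allowed on either side; raising it
    to 2 drops splits such as album + s.
    """
    words = set(words)
    joined = {}
    for word in sorted(words):
        if len(word) != length:
            continue
        pairs = [
            [word[:i], word[i:]]
            for i in range(min_part, length - min_part + 1)
            if word[:i] in words and word[i:] in words
        ]
        if pairs:
            joined[word] = pairs
    return joined

# Hypothetical sample data, standing in for the real word list.
sample = ["al", "bums", "alb", "ums", "album", "s", "albums", "up", "on", "upon"]
print(find_joined_words(sample))              # {'albums': [['al', 'bums'], ['alb', 'ums'], ['album', 's']]}
print(find_joined_words(sample, min_part=2))  # {'albums': [['al', 'bums'], ['alb', 'ums']]}
print(find_joined_words(sample, length=4))    # {'upon': [['up', 'on']]}
```

Whether the extra parameters make the code easier or harder to read than the fixed six-letter version is exactly the kind of trade-off the kata asks about.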

                

<h1>Unit Tests</h1>

In [39]:
from unittest import *

class JoinedWordsTests(TestCase):
    
    @classmethod
    def setUpClass(cls):
        # Build the word index once for the whole test class; it is slow to construct.
        cls.jw = JoinedWords('../data/wordlist.txt', 'iso-8859-1')
        
    def setUp(self):
        pass
        
        
    def test_joinedWords_bulk1(self):
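        # Check expected results against the kata's example words.
        # "Example incomplete": the kata lists only one of several valid splits.
        # "Example wrong": the kata's split (bar + ely) is not found with this word list.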
        self.testWords = {
              'albums' :[['al', 'bums'], ['alb', 'ums'], ['album', 's']], # Example incomplete
              'barely' :[['ba','rely'],],                                 # Example wrong
              'befoul' :[['be','foul'],],
              'convex' :[['con','vex'],],
              'hereby' :[['here','by'],],
              'jigsaw' :[['jig', 'saw'], ['jigs', 'aw']],                 # Example incomplete
              'tailor' :[['tai', 'lor'], ['tail', 'or']],                 # Example incomplete
              'weaver' :[['we', 'aver'], ['weave', 'r']],                 # Example incomplete
        }
        for tk in self.testWords.keys():
            joinedWords = self.jw.getJoinedWords(tk)
            self.assertEqual(self.testWords[tk], joinedWords)
            

# Load the tests from the TestCase class itself rather than from an instance.
suite = TestLoader().loadTestsFromTestCase(JoinedWordsTests)
TextTestRunner().run(suite)

.
----------------------------------------------------------------------
Ran 1 test in 0.537s

OK


<unittest.runner.TextTestResult run=1 errors=0 failures=0>

In [33]:
jw = JoinedWords('../data/wordlist.txt', 'iso-8859-1')

In [34]:
jw.longWordsDict

{'hondle': [],
 'matzas': [['mat', 'zas'], ['matza', 's']],
 'Hydras': [['Hydra', 's']],
 'gurjun': [['gur', 'jun']],
 'balsam': [['bal', 'sam'], ['bals', 'am'], ['balsa', 'm']],
 'pollen': [['poll', 'en']],
 'wanier': [],
 'Harley': [],
 'ambery': [['amber', 'y']],
 'Seaton': [],
 'thorny': [['t', 'horny'], ['thorn', 'y']],
 'Canute': [['Can', 'ute']],
 'lacets': [['lace', 'ts'], ['lacet', 's']],
 'twenty': [],
 'modify': [['modi', 'fy']],
 'admits': [['admit', 's']],
 'begums': [['be', 'gums'], ['beg', 'ums'], ['begum', 's']],
 'emmews': [['em', 'mews'], ['emmew', 's']],
 'précis': [],
 'Nimitz': [],
 'Burley': [['Bur', 'ley']],
 'mentos': [['ment', 'os'], ['mento', 's']],
 'dismay': [['dis', 'may']],
 'layman': [['lay', 'man']],
 'snitch': [['snit', 'ch']],
 'Maglev': [],
 "week's": [],
 'uraris': [['ur', 'aris'], ['urari', 's']],
 "dude's": [],
 'caking': [['c', 'aking'], ['ca', 'king']],
 'Megrez': [['Meg', 'rez']],
 'clarty': [['cl', 'arty'], ['clart', 'y']],
 'poshed': [['po', '