Skip to content

HSA Lawnmower Drones

Paco Nathan edited this page Jan 7, 2014 · 30 revisions

Matt Damon directs this low-budget, direct-to-BitTorrent movie (scheduled for an early 2025 release) as a sequel of one of the most quintessentially B-rate movies of all times, Lawnmower Man. Stephen King had to file lawsuits three times to get his name removed from the 1992 original.

The sequel, however, concentrates on real-life drama circa 2023. After significant political jousting, the Homeland Security Agency has finally seized the day, wresting control of the United States from any prior, deprecated electoral system or other illegal processes. Freedom reigns. Drones fly a myriad of domestic missions daily without encumbrance. The public is safe from terror.

Step 1: Problem statement… In some of the fleeting dramatic moments of the sequel, the HSA mandates that each municipality must structure its central commons area into a park-like formation, arranged as a 10 x 10 square grid. Through a series of under-the-table deals shrouded by public SBIRs, the HSA procures a new line of aerial drones to help coordinate public safety in these central squares. These drones nominally get used as lawnmowers for the park-like areas. However, in times of crises they can be "repurposed" to disperse rioters and perform terrorist renditions. Each drone is equipped with a transporter beam, which allegedly relocates the perpetrator being subdued by "instant arrest" to an offshore detention facility in a totally humane way -- all based on quantum physics. Or something.

Also, the HSA funds and expertly trains the emergency first responders in each municipality to equip their central squares with perimeter transporter beams, especially for assisting the lawnmower drones. If a drone passes the border of the square, it instantly becomes transported to the corresponding position on the opposite side of the square. Again, via some kind of Hollywood quantum physics stuff.

Step 2: Representation, without taxation… Our assignment is to employ a machine learning practice called genetic programming to help create software for the lawnmower drones, so that they can detain perpetrators more efficiently. To help conserve taxpayer dollars. In other words, instead of using a GA to generate feature sets to solve some kind of problem, in a subtle twist we make them generate code which we will evaluate in a simulation. As a result, we get GPs. If you'd like to learn loads more about GPs in theory and practice, check out the excellent and comprehensive Genetic Programming in Theory and Practice series, edited by Rick Riolo, Bill Worzel, et al., and for that matter check out the GPTP conference from which these delightful encodings evolved.

As luck would have it, during times of crises the terrorists accommodate the authorities by arranging themselves one per tile in that 10 x 10 grid. This is, after all, the future that we're making a documentary about, circa 2023 thank you very much. Stranger things have happened. So the code required by our example GP is to fly a lawnmower drone through the grid, rendering one terrorist per tile via transporter. Humanely. If a drone flies off the edge of the square, the perimeter transporter beams send it to the opposite side of the square. Again, this is the future.

Take a look at sample_lmd.LMDFactory, which subclasses UnitOfWorkFactory. We'll use this new UnitOfWork definition to evolve software to fly the drones. And catch bad guys, or at least some illegal aliens.

class LMDFactory (UnitOfWorkFactory):
    """UnitOfWork definition for Lawnmower Drone GP"""

    def __init__ (self):
        self.n_pop = 300
        self.n_gen = 200
        self.max_indiv = 20000
        self.selection_rate = 0.3
        self.mutation_rate = 0.3
        self.term_limit = 5.0e-02
        self.hist_granularity = 3

        self.grid = [
            [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ],
            [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ],
            [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ],
            [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ],
            [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ],
            [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ],
            [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ],
            [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ],
            [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ],
            [ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ],
        ]

        # sampling parameters
        self.length = len(self.grid) ** 2
        self.min = 0
        self.max = len(OPS) - 1

The self.grid member provides a matrix that represents the municipal square grid, and the self.length member represents the total number of tiles.

Note that the population size increased to 300 and the selection rate and mutation rate are both boosted to 30% now. Yeah, that's how we roll. To evolve GPs we need a larger, more robust Population with less elitism and a higher rate of mutation -- sort of like New Jersey. In any case, much more so than in the relatively simpler GA examples shown earlier.

The code for each software "program" is a Python list of integers, with the integers representing operations and their parameters, based on the instruction set for the drone's onboard computer. These are effectively the "chromosomes" in our GP example. We also introduce the notion of a stack pointer, representing an index into a program.

Since these drones were procured by agents of the federal government, their onboard computers are only capable of executing an extremely limited instruction set:

  • rend -- fly the drone to the next tile ahead, and render a terrorist there (if any)
  • turn -- turn the drone 90 degrees to the left... that's your military left
  • sup(x, y) -- perform a quantum superposition to move the drone automagically into another grid tile, based on x, y offsets, and render a terrorist there (if any)
  • loop(n) -- loop the stack pointer back n operation steps

Operations in the special HSA instruction set are encoded using a Python list, which the software indexes into:

OPS = ( "rend", "turn", "sup", "loop" )

Orientation on the map is based on a cartesian grid, with the origin (0, 0) located in the right, upper corner. Tile coordinates are represented in Python using a namedtuple collection:

Point = namedtuple('Point', 'x y')

DIR_W = Point(1, 0)
DIR_S = Point(0, 1)
DIR_E = Point(-1, 0)
DIR_N = Point(0, -1)

An additional class Drone represents a simulation of the HSA's new lawnmower drone and its orientation on the grid.

class Drone (object):
    def __init__ (self, x, y):
        self.pos = Point(x, y)
        self.dir = Point(1, 0)

The Drone class also defines methods to perform the simulation steps required for each of the operations in the instruction set:

def exec_op_sup (self, mod, sup):
    x = self._mod_math(self.pos.x, sup.x, mod)
    y = self._mod_math(self.pos.y, sup.y, mod)
    self.pos = Point(x, y)
    return x, y

def exec_op_move (self, mod):
    x = self._mod_math(self.pos.x, self.dir.x, mod)
    y = self._mod_math(self.pos.y, self.dir.y, mod)
    self.pos = Point(x, y)
    return x, y

def exec_op_turn (self):
    if self.dir.x == DIR_W.x and self.dir.y == DIR_W.y:
        self.dir = DIR_N
    elif self.dir.x == DIR_S.x and self.dir.y == DIR_W.y:
        self.dir = DIR_W
    elif self.dir.x == DIR_E.x and self.dir.y == DIR_E.y:
        self.dir = DIR_S
    elif self.dir.x == DIR_N.x and self.dir.y == DIR_N.y:
        self.dir = DIR_E

Also, due to the ineffable nuances of quantum physics -- and, to be fair, probably also due to the inherent chaos of a riot circa 2023, given all their cool DIY MakerBot stuff by then -- a lawnmower drone must always initiate its terrorist rendition operations at a randomly selected tile in the grid. Plus, the executive producers at the studio thought it would be really cool for special effects. That's easy to do:

drone = Drone(randint(0, len(self.grid)), randint(0, len(self.grid)))

Step 3: To generate a feature set for a lawnmower drone Individual, first we generate a non-zero random number rand_len which is less than the total number of tiles in the grid. Then we generate rand_len number of operations from the OPS list, sampled at random with replacement. Note that we take care to generate parameters which make sense in their context. That gives our in vitro simulated process of evolution here a gentle push. We've thought about applying for a software patent, actually, and considered naming this approach Intelligent Design -- if we can secure the trademark. Some of the top brass at the HSA seemed fond of that name. Weird.

def generate_features (self):
    """generate a new feature set for a lawnmower drone"""
    rand_len = randint(1, self.length)
    feature_set = []

    while len(feature_set) < rand_len:
        op = randint(self.min, self.max)

        if op == OPS.index("sup"):
            feature_set.append(op)
            feature_set.append(randint(0, len(self.grid) - 1))
            feature_set.append(randint(0, len(self.grid) - 1))
            
        elif op == OPS.index("loop"):
            if len(feature_set) > 2:
                offset = randint(1, len(feature_set) - 1)
                feature_set.append(op)
                feature_set.append(offset)
        
        else:
            feature_set.append(op)

    return feature_set

We handle the LMPFactory.mutate_features() and LMPFactory.breed_features() methods with a slight twist... Mutation of any particular "gene" is limited to values within the range of op codes in the instruction set -- yeah, so that happened. Many GP implementations use trees to represent code. Our representation could be handled in a better way, but it works for now.

Meanwhile, effective drone software breeding requires that we pick a point at random at which slice the mommy/daddy DNA, as shown in the split variable:

def mutate_features (self, feature_set):
    """mutate a copy of the given GP program"""
    pos_to_mutate = randint(0, len(feature_set) - 1)
    mutated_feature_set = list(feature_set)
    mutated_feature_set[pos_to_mutate] = randint(self.min, self.max)
    return mutated_feature_set

def breed_features (self, f_feature_set, m_feature_set):
    """breed two GP programs to produce a little toddler GP program"""
    split = randint(1, min(len(f_feature_set), len(m_feature_set)))
    return f_feature_set[split:] + m_feature_set[:split]

Great. We can evolve programs with that logic. But how do we evaluate them?

Step 4: Regulations which are strictly enforced by the HSA insist that all lawnmower drone software be evaluated by running a simulation of 100 instruction calls. No more, no less. Well, unless a great program can be evolved in less than 100 instructions. Or unless the software performs an #epicfail before its 100th step. Close enough for government work.

This next part is the truly fun part of GPs, where one gets to simulate software running. Which is why, circa 2023, that we all enlisted in the Computer Science branch of the military to begin with. It's vaguely like simulating the terrorists running away from the lawnmower drones, except that they don't run, they just stand there, one per tile, then get beamed away.

The LMDFactory._simulate() method takes a copy of the grid as a parameter, along with the code to be evaluated, and the drone being simulated, as shown below. First the method sets up a simulation environment, with the sp stack pointer variable set at the start of the program:

def _simulate (self, grid, code, drone):
    """simulate the lawnmower grid"""
    sp = 0
    mod = len(self.grid)
    num_ops = 0
    max_ops = self.length
    result = None

Then the simulation iterates until reaching the end of the program, or executing the max_ops = 100 maximum number of instructions allowed by HSA regulations, or until the simulation hits some kind of programming fault:

    try:
        while sp < len(code) and num_ops < max_ops:
            num_ops += 1
            op = code[sp]

            if op == OPS.index("rend"):
                x, y = drone.exec_op_move(mod)
                grid[y][x] = 0

            elif op == OPS.index("turn"):
                drone.exec_op_turn()

            elif op == OPS.index("sup"):
                sup = Point(code[sp + 1], code[sp + 2])
                sp += 2

                if sup.x == 0 and sup.y == 0:
                    return None

                x, y = drone.exec_op_sup(mod, sup)
                grid[y][x] = 0

            elif op == OPS.index("loop"):
                offset = code[sp + 1]

                if offset == 0 or offset > sp:
                    return None
                sp -= offset

            else:
                return None

            sp += 1
        result = grid

    finally:
        return result

If all goes well, the method returns a simulated grid to use for tallying the count of terrorists which have been rendered via "instant arrest" and transporter beam. However, if the simulation reaches some fault condition in the code, then the method returns None so that we can evaluate another approach. Granted, we stated previously that there's no such thing as a bad candidate solution… For the record, we're holding with that statement. Off the record, some candidate solutions just totally suck. And we can neither confirm nor deny that any of these fault conditions may or may not have been introduced by a foreign power. So we dispense with them.

Assuming that there are 100 terrorists instigating a riot, the fitness function for our GP becomes a simple matter: execute the generated code per HSA regulations, then divide the number of rendered terrorists by 100. Done and done.

def get_fitness (self, feature_set):
    """determine the fitness ranging [0.0, 1.0]; higher is better"""
    drone = Drone(randint(0, len(self.grid)), randint(0, len(self.grid)))
    grid = self._simulate(deepcopy(self.grid), feature_set, drone)
    fitness = 0.0

    if grid:
        terrorists = 0

        for row in grid:
            terrorists += sum(row)

        fitness = (self.length - terrorists) / float(self.length)

        if len(feature_set) > 5:
            penalty = len(feature_set) / 10.0
            fitness /= penalty

    return fitness

Again, if the grid returned from simulation is set to None, then the fitness score is forced to zero. Otherwise, we test the length of the evolved program… If it's longer than some threshold length -- which in this case is set to 5 -- then we penalize the fitness score. That helps avoid lots of DNA junk. Cut to the chase, in other words. And think of the taxpayers.

Step 5: It's time to kick the tires and light the fires… Let's run this GP for lawnmower drones in Exelixi and see what evolves?

ubuntu@ec2-23-69-11-93:~/exelixi-master$ ./src/exelixi.py -m localhost:5050 -w 2 --uow sample_lmd.LMDFactory
gen	0	size	299	total	299	mse	9.66e-01	max	1.83e-01	med	5.40e-03	avg	1.06e-02
gen	1	size	300	total	496	mse	9.64e-01	max	1.83e-01	med	2.60e-03	avg	1.12e-02
gen	2	size	300	total	694	mse	9.61e-01	max	1.83e-01	med	4.35e-05	avg	1.24e-02
gen	3	size	300	total	887	mse	9.54e-01	max	1.83e-01	med	1.89e-05	avg	1.38e-02
# ... intervening log was redacted for matters of national security ...
gen	106	size	300	total	19780	mse	7.43e-01	max	5.09e-01	med	1.20e-02	avg	2.04e-02
gen	107	size	300	total	19948	mse	7.42e-01	max	5.09e-01	med	8.00e-03	avg	2.05e-02
gen	108	size	299	total	20118	mse	7.39e-01	max	5.09e-01	med	1.48e-02	avg	1.89e-02
indiv	0.5091	42	[0, 0, 0, 1, 2, 3, 4, 0, 0, 3, 9]
indiv	0.4143	51	[0, 0, 0, 0, 2, 1, 1, 2, 3, 5, 0, 1, 3, 11]
indiv	0.4071	96	[0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 3, 9]
indiv	0.4071	93	[0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 3, 9, 2]
indiv	0.4071	92	[0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 3, 9, 0]
indiv	0.4071	72	[0, 0, 0, 2, 1, 1, 2, 3, 5, 0, 1, 3, 11, 4]
indiv	0.4071	70	[0, 0, 0, 2, 1, 1, 2, 3, 5, 0, 1, 3, 11, 2]
indiv	0.4071	40	[0, 0, 0, 2, 1, 1, 2, 3, 5, 0, 1, 3, 11, 3]
indiv	0.4071	37	[0, 0, 0, 2, 1, 1, 2, 3, 5, 0, 1, 3, 11, 0]
indiv	0.4071	32	[0, 0, 0, 2, 1, 1, 2, 3, 5, 0, 1, 3, 11, 1]
indiv	0.4071	103	[2, 1, 0, 0, 2, 1, 1, 2, 3, 5, 0, 1, 3, 11]

Not bad. After 42 generations the answer from our best GP solution for the lawnmower drone software prevailed with a fitness score of 0.5091 due to its astoundingly patriotic [0, 0, 0, 1, 2, 3, 4, 0, 0, 3, 9] program. Rougthly speaking, 51% effectiveness is a helluva lot better odds for terrorist rendition than the HSA gets from those idiot contractors out of Halliburton. Return on investment, we'll drink to that.

The prize-winning ubercode executes a brilliant software strategy -- all things considered, given the limited instruction set:

rend
rend
rend
turn
sup(3, 4)
rend
rend
loop(9)

In other words, it renders a few terrorists in a row, wiggles a bit to the left, performs a quantum superposition to automagically move 3 tiles over and 4 down… Then renders a couple more terrorists for good measure, and finally loops back to the beginning. Other leading candidate solutions follow variations on this theme for their freedom strategies. All as American as Ma and apple pie.

Movie Spoiler Alert!! The studio execs in Hollywood would not let us flee the scene, so to speak, without leaving the door slightly ajar for yet another sequel… When one considers the path of a lawnmower drone collecting terrorists from the city grid -- courtesy of our GP-evolved software -- it resembles a bee waggling directions to the other bees in the hive, about where to score the best pollen schwag. This reveals precisely the ultrasecret and supersensitive function of the HSA's lawnmower drone program: a Bee Killer, to alleviate an escalating glut of immigrant farm workers ending-around the GMO patents from Monsanto. Why else would the HSA give a rat's ass about mowing grass in a park as a matter of federal policy? Get rid of the bees, and these criminal agrarians will have no choice but to capitulate. To paraphrase William S. Burroughs, "We have to stay ahead of ourselves and the Ivans, lest some joker endanger national security by braying out, 'You still have bees!'"

Ruins of Hiroshima on screen. Pull back to show the Technician at a switchboard. Behind him, Robert Oppenheimer flanked by three middle-aged men in dark suits, with the cold dead look of heavy power. The Technician twiddles his knobs. He gives the O.K. sign. "All clear." "Are you sure?" The Technician shrugs. "The instruments say so." Oppy says: "Thank God it wasn't a dud." "Oh, uh, hurry with those printouts, Joe." "Yes, sir." He looked after them sourly, thinking: Thank Joe it wasn't a dud. God doesn't know what buttons to push. -- Seven Souls by WSB.

What would really be awesome would be more productive kinds of applications for GP technology. No, wait, what would really be awesome would be a future history in which the HSA never existed… Aside from that, imagine if we could apply GP technology to fix software? That'd be great. Imagine data mining at scale using the public archives of GitHub and other public code repositories. Imagine applying a myriad of machine learning techniques to generalize about bug fixes by examining the commit histories for thousands of popular open source software projects. Turn the deltas into "operations" and evolve ways to apply common patches to other OSS projects. Then perhaps, just maybe, we could apply GPs to evolve bug fixes. Oh snap! A researcher named Claire Le Goues at CMU has been busy as a bee already doing that. She's amazingly brilliant.

Enjoy.