/
dataflow.txt
executable file
·398 lines (300 loc) · 14.9 KB
/
dataflow.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
========
Dataflow
========
-----------
The Factory
-----------
Let's pretend we want to write a simulation of a doll factory that has the
following requirements:
+ One storeroom that contains plastic pellets that will be molded.
+ One storeroom that contains rivets used to attach parts.
+ An injection molder that takes 0.2 pounds of plastic pellets and creates
a pair of arms in 6 seconds.
+ An injection molder that takes 0.2 pounds of plastic pellets and creates
a pair of legs in 5 seconds.
+ An injection molder that takes 0.1 pounds of plastic pellets and creates
a head in 4 seconds.
+ An injection molder that takes 0.5 pounds of plastic pellets and creates
a torso in 10 seconds.
+ An assembly station that takes a completed torso, a completed pair of
legs, a rivet, and puts them together in 2 seconds.
+ An assembly station that takes the combined part from above, a pair of
arms, and a rivet and puts them together in 2 seconds.
+ An assembly station that takes the completed part from above, a head,
and a rivet and puts them together in 3 seconds.
+ Each station keeps on doing this forever.
--------------------
The 'normal' version
--------------------
I'll take a stab at writing this 'normally' without using stackless. After we
go through the 'normal' example, we'll build one with stackless and compare the
resulting code. If you think the example is contrived, and you have the time,
you may want to take a break, write a factory with the above requirements
yourself, and compare your resulting code to the stackless version.
Here is the code:
.. include:: code/assemblyline.py
:literal:
Analysis
========
We start off with a class that represents the storerooms. It is initialized
with a product name, a unit measurement (such as pounds or number of parts),
and an initial quantity. There is also a run method that does nothing; its
usage will become clear a little later. We create two storeroom instances
based on this class.
Next up is an injectionMolder class. It is initialized with the name of a
finished part, a storeroom that acts a plastic source, the quantity required to
build one part, and the time required to create the part. There is a get()
method that can be used to get retrieve a finished part and adjust inventory if
one exists. For this class the run() method actually does something:
+ While the timer is above 0, it continues molding and decrements the counter.
+ When the time-to-mold reaches 0, a finished part is created and the time
counter is set to -1.
+ When the time counter is -1, the molder checks to see if it has enough
plastic to mold another part, retrieves it if necessary, and begins
molding.
We create four injection molder instances with this class.
Next up is an assembler class. It is initialized with the name of the finished
part, the source for part 1, the source for part 2, a storeroom that contains
rivets, and the time required to assemble the parts in question. There is a get()
method that can be used to get retrieve a finished part and adjust inventory if
one exists. For this class the run() method:
+ If the timer is greater than one, the assembler has the required parts and
continues to assemble.
+ If the timer equals 0, a part is completed and inventory is adjusted.
+ If the timer is less than one, the assembler tries to grab one of each part
required to start building again. It may have to wait here if a part hasn't
finished molding yet.
Assembler instances are created to attach the leg, arm, and head.
.. note::
You'll notice a lot of similarities between the storeroom, injectionMolder,
and assember classes. If I was writing a production system, I would
probably create a base class and use inheritance. In this case, I thought
setting up a class heirarchy would confuse the code, so I kept it
intentionally simple.
All instances from all three classes are loaded into an array called
compenents. We then create an event loop that repeatedly calls the run()
method for each component.
--------------
Enter Dataflow
--------------
If you're familiar with unix systems, you've probably used dataflow techniques
whether you knew it or not. Take the following shell command::
cat README | more
In fairness, the Windows equivilent is::
type readme.txt | more
Although dataflow techniques aren't as pervasive in the Windows world as they
are in the Unix world.
For those who are unfamilar with the *more* program, it recieves input from an
external source, displays a page's worth of output, pauses until the user hits
a key, and shows another page. The '|' operator takes the output from one
program, and pipes it into the input of another command. In this case, either
*cat* or *type* sends the text of a document to standard output, and *more*
recieves it.
The *more* program just sits there until data flows into it from another
program. Once enough data flows in, it displays a page on the screen and
pauses. The user hits a key, and more lets more data flow in. Once again,
more waits until enough data flows in, displays it, and pauses. Hence the term
*dataflow*.
Using the stackless round-robin scheduler in addition to channels, we can use
dataflow techniques to write the factory simulation.
---------------------------------
The Stackless Version of the Code
---------------------------------
.. include:: code/assemblyline-stackless.py
:literal:
Analysis
========
Sleeper utilities
-----------------
First, we create some helper functions that allow our classes to 'sleep'. When
a tasklet calls Sleep(), it creates a channel, calculates a time to wake up,
and attaches this information to the global sleepingTasklets array. After
that, channel.receive() is called. This causes the calling tasklet to pause
until reawakened.
Then we create a function to manage the sleeping tasklets. It checks the
global sleepingTasklets array, finds any items that are ready to wake up, and
reactivates them via the channel. This function is added to the tasklet
scheduler.
The classes
-----------
These classes are similar to the original classes with a few notable
exceptions. The first is that they spawn their *run()* methods at
instantiation. We no longer need to manually create a components array and an
external *run()* function to process the event loop; stackless handles this
implicitly. The second difference is that tasklets will sleep while waiting
for a part to be built instead of explictly maintaining a counter to track
time. The third difference is that calls to *get()* are more natural. If a
material or part is not available, the tasklet will simply reschedule itself
until it is available.
-----------------------
So what have we gained?
-----------------------
Okay, so what's the big deal? Both programs run and basically generate the same
results. Lets examine the run method from the original version of the
factory::
def run(self):
if self.time == 0:
self.items += 1
print "%s finished assembling part" % self.name
self.time -= 1
elif self.time < 0:
print "%s starts assembling new part" % self.name
if self.itemA < 1:
print "%s Getting item A" % self.name
self.itemA += self.partAsource.get(1)
if self.itemA < 1:
print "%s waiting for item A" % self.name
elif self.itemB < 1:
print "%s Getting item B" % self.name
self.itemB += self.partBsource.get(1)
if self.itemB < 1:
print "%s waiting for item B" % self.name
print "%s starting to assemble" % self.name
self.time = self.timeToAssemble
else:
print "%s assembling for %s more seconds" % (self.name, self.time)
self.time -= 1
And then the stackless version::
def run(self):
while 1:
print "%s starts assembling new part" % self.name
self.itemA += self.partAsource.get(1)
self.itemB += self.partBsource.get(1)
print "%s starting to assemble" % self.name
Sleep(self.timeToAssemble)
print "%s done assembling after %s" % (self.name, self.timeToAssemble)
self.items += 1
print "%s finished assembling part" % self.name
stackless.schedule()
The stackless version is much simpler, clearer, and intuitive than the original
version. It doesn't wrap the event loop infrastructure into the run method.
This has been decoupled from the run() method. The run() method only expresses
what it is doing, and doesn't worry about the details of how it gets done. It
lets the software developer concentrate on how the factory works, and not how
the event loop and program works.
------------
Pushing Data
------------
.. note::
The completed program from this section is listed as digitalCircuit.py in
both the code .zip archive and at the end of this document.
In the case of the factory, we were 'pulling' data. Each component asked for
the parts it needed and waited until they arrived. We could also 'push'
information so that one component in a system propigates its changes forward to
another component. The 'pull' approach is called **lazy data flow** and the 'push' approach is called **eager data flow**.
To demonstrate the push approach, we'll build a digital circuit simulator. The simulator will consist of various parts that that have a state of 0 or 1, and can be interconnected in various ways. In this case, we will take an object oriented approach and define an EventHandler base class that implements most of the functionality:
.. include:: code/digitalCircuit.py
:literal:
:start-line: 6
:end-line: 33
The EventHandler class core functionality performs three things:
+ It continously listens for messages on a channel on the listen method.
+ Then it processes any message it recieves via the processMessage method.
+ Then it notifies any registered outputs of the results via the nofify method.
In addition there are two helper methods:
+ registerOutput allows you to register additional outputs after instantiation.
+ __call__ is overridden as a convienence so that we can send messages in
the form of::
event(1)
instead of::
event.channel.send(1)
Using the EventHandler class as a building block, we can start to implement our
digital circuit simulator, starting with a Switch. This is a representation of
a user-togglible switch that can be send to either a 0 or 1 value:
.. include:: code/digitalCircuit.py
:literal:
:start-line: 35
:end-line: 45
When initialized, the switch stores its original state. processMessage is
overridden to store the received message as the current state. The notify
method is overridden to send a tuple containing a reference to the instance
itself and the state. As we will see a little further on, we need to send the
instance so that components with multiple inputs can tell what the message
source was.
.. note::
If you're typing in the code as you go along, please note that we're using the
debugPrint() function originally defined in the **Lightweight Threads** section
to provide diagnostics.
The next class we'll create is the Reporter() class. Instances of this class simply display their current state. I suppose we could imagine that these were LED's in a real digital circuit:
.. include:: code/digitalCircuit.py
:literal:
:start-line: 47
:end-line: 54
The initializer accepts an optional format string for subsequent output.
Everything else should be self explanitory.
Now we have a good enough framework to test the initial functionality::
C:\Documents and Settings\grant\Desktop\why_stackless\code>c:\Python24\python.ex
e
Python 2.4.3 Stackless 3.1b3 060516 (#69, May 3 2006, 11:46:11) [MSC v.1310 32
bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import stackless
>>> from digitalCircuit import *
>>>
>>> reporter = Reporter()
>>> switch = Switch(0,reporter) #create switch and attach reporter as output.
>>>
>>> switch(1)
<digitalCircuit.Switch instance at 0x00A46828> send message 1
>>>
>>> switch(0)
<digitalCircuit.Switch instance at 0x00A46828> send message 0
>>>
>>> switch(1)
<digitalCircuit.Switch instance at 0x00A46828> send message 1
>>>
Unlike the factory we created earlier, toggling a switch instantly pushes the
results to it's output and the results are displayed.
Now lets create some digital logic components. The first is an inverter. It
takes an input and pushes the logical inverse. Inputs of 0 or 1 will push 1 or
0 respectively:
.. include:: code/digitalCircuit.py
:literal:
:start-line: 56
:end-line: 69
The initializer for the inverter accepts an input, which is some sort of
EventHandler, stores it, and registers itself as an output. processMessage()
sets the state to the logical opposite of the message recieved. Like the
Switch class, the notify event sends a tuple containing itself and its state.
We could chain this between the Switch and Reporter from the example above.
Feel free to try it if you like, but I don't think it's necessary to show the
interactive session here.
Next up is an AndGate. This is the first class we have that accepts multiple
inputs. It has two inputs. If both are set to 1, it sends the message 1,
otherwise it sends the message 0:
.. include:: code/digitalCircuit.py
:literal:
:start-line: 74
:end-line: 106
In the AndGate's process message, we need to determine what input sent the
message and assign the state accordingly. This is why we needed to send the
self object from the other components.
Lastly, we have the OrGate. It is identical to the AndGate, with the execption
that it sends the message 1 if ether input is 1, and 0 only if both inputs are
0:
.. include:: code/digitalCircuit.py
:literal:
:start-line: 108
:end-line: 141
Half Adder
==========
To finish things up, we'll use the components we have created to build a half-adder. A half-adder performs addition on two bits.
::
Simple half-adder circuit
_____ ____
/ / <--- inputA ---> | \
+< ( ( | ) >-+-> carry
| \____\ <--- inputB ---> |____/ |
| |
| +-----------------------------+
| | _
| | | \ ____
| +--> | )O >------> | \
| |_/ | ) >---> result
| +---> |____/
+---------------------+
We chain together several of our components and flip the switches. Flipping the switches changes their state, and propigates these changes through the system via dataflow:
.. include:: code/digitalCircuit.py
:literal:
:start-line: 143
:end-line: 158