Abstract
A list of POE ideas and projects to be considered for future development.
Some of these ideas need a lot more thinking before implementing.
Potential Google Summer of Code (GSoC) projects.
[_] 33% Overall Progress
[X] 100% Procedural Interface
my $input = non_blocking_readline();
Completed in Reflex, to the extent that a Perl program is able without ugly, deep hacks inside Perl's interpreter.
[_] 0% Make POE::Loop and subclasses more generic.
Rationale
There are 16 different POE::Loop classes on CPAN.
They are only available for direct use by POE.
There are use cases where generic, non-sessioned events are required.
In many cases, such as Reflex, it's very useful to also allow POE modules to run alongside.
Make POE::Loop more appropriate for direct use by other clients, such as Reflex.
Tasks
[_] 0% Revise the POE::Loop API for generic callbacks.
[_] 0% Document the new API.
[_] 0% Figure out how to version POE::Loop APIs.
Updating all loops at once will be hard work.
Coordination must happen with multiple module authors.
Versioned APIs would allow incremental migration (see the sketch after this task list).
[_] 0% Update POE::Test::Loops to use the new API.
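As a strawman for the versioning task above, loops could advertise which API revision they implement. Nothing like this exists yet; loop_api_version() and _verify_loop_api() are invented names:
  package POE::Loop::Select;

  sub loop_api_version { 2 }   # this loop speaks the revised callback API

  package POE::Kernel;

  sub _verify_loop_api {
    my ($class, $loop_class) = @_;
    # Loops that predate the hook are assumed to implement version 1.
    my $version = $loop_class->can('loop_api_version')
                ? $loop_class->loop_api_version()
                : 1;
    die "$loop_class implements POE::Loop API version $version; need 2"
      if $version < 2;
  }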
[_] % Native OS event support.
Rationale
[_] % Why is this particularly good?
Tasks
[_] % How will it be implemented?
[_] 0% Enable CPAN testers support for development releases of POE.
Rationale
CPAN Testers are a wonderful QA resource.
Unfortunately it's impossible to depend upon development versions of dependencies.
So it's impossible to release development POE and POE::Test::Loops for testing.
The wrong tests---the previous production version---are loaded.
Tasks
[_] % Research how this might work.
[_] % Add new tasks to implement a solution, once one is found.
[_] % Migrate POE traces to a proper log dispatcher.
Rationale
POE has a lot of logging information that only a developer can love.
The tracing code is custom made from a time before decent logging modules existed.
Replacing tracing with off-the-shelf code would make it more accessible to more people.
Tasks
[_] % Decide upon a log dispatching module.
I'm partial to Log::Log4perl because Rocco uses it at work.
Nick Perez suggested Log::Dispatch but didn't offer rationale for its use.
[_] % Determine and document new logging classes.
[_] % Implement environment variable overrides.
[_] % Avoid making the log dispatcher module a hard dependency.
[_] % Wrap all logging in a single constant.
[_] % Only enable logging when the constant is set, either programmatically or through the environment.
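To make the last two tasks concrete, here's a minimal sketch. POE::Trace, the POE_TRACE_DEFAULT environment variable, and trace() are all invented names:
  package POE::Trace;

  # Resolve the switch once, at compile time, from the environment.
  use constant TRACE_DEFAULT => $ENV{POE_TRACE_DEFAULT} || 0;

  sub trace {
    # Returns immediately when tracing is off; call sites can also guard
    # with "trace(...) if TRACE_DEFAULT" so Perl folds the call away.
    return unless TRACE_DEFAULT;
    my @message = @_;
    # Hand off to whichever dispatcher we pick (Log::Log4perl,
    # Log::Dispatch, ...); plain STDERR stands in for it here.
    print STDERR "POE: @message\n";
  }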
[_] % Migrate POE::Resource mixin classes to proper objects.
Rationale
POE::Resource modules mainly implement private interfaces for private data.
The OO equivalent would be roles, but they're really discrete objects in their own right.
Defining the interaction between POE::Kernel and POE::Resource::* may expose more interfaces.
Parts of POE::API::Peek may become obsolete.
Tasks
[_] % Define new encapsulation of resource data.
As mixed-in roles, resources are given access to things they ought not be given.
This isn't good form; it leads to actions at a distance.
[_] % Define new APIs.
As mixed-in roles, resource classes tend to call each other in a tightly coupled way.
While the resource classes are separate, they behave monolithically.
Make them more of a toolkit for POE::Kernel to coordinate.
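As a strawman, one resource as a discrete object might look like this. The methods are invented, and POE::Kernel would hold the instance and coordinate calls instead of mixing the code in:
  package POE::Resource::Aliases;   # object version; methods are made up

  sub new { bless { alias_to_session => {} }, shift }

  sub set {
    my ($self, $alias, $session) = @_;
    $self->{alias_to_session}{$alias} = $session;
  }

  sub remove {
    my ($self, $alias) = @_;
    delete $self->{alias_to_session}{$alias};
  }

  sub resolve {
    my ($self, $alias) = @_;
    return $self->{alias_to_session}{$alias};
  }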
Merge POE::API::Peek into POE.
Rationale
It's useful.
It's tightly coupled to POE's internals, so it must be maintained in lockstep with POE.
Drawbacks
It publishes interfaces to private implementation details.
Inclusion in POE would make these details official.
Rocco would need to maintain it.
He's morally opposed to the idea.
He's not going to maintain it.
If it ever occurs and gets in his way, he's liable to delete it.
Tasks
Find a rationale that makes the design, implementation and maintenance costs worth the grief.
[_] % Replace POE explicit reference counting with Perl's references.
Rationale
POE can use Scalar::Util::weaken() now.
Perl reference counting is more efficient than rolling it ourselves.
Drawbacks
SIGIDLE becomes difficult or impossible.
Tasks
[_] % Determine whether SIGIDLE can be saved.
Pushing references down into Perl makes them opaque.
Removing reference introspection eliminates the ability to detect deadlocks.
Eliminating deadlock detection means we can't fire SIGIDLE at appropriate times.
[_] % Determine an implementation strategy that doesn't suck.
Document it here, and make it part of the tasks for this item.
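For the rationale above, here's roughly what Scalar::Util::weaken() buys us. The registry is a made-up stand-in for POE's internal session table:
  use Scalar::Util qw(weaken);

  my %session_by_id;

  sub register_session {
    my ($id, $session) = @_;
    $session_by_id{$id} = $session;
    # The kernel's entry no longer keeps the session alive; when the
    # last strong reference elsewhere goes away, Perl reaps the session
    # and this slot becomes undef.
    weaken($session_by_id{$id});
  }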
[_] % Convert the build process to Dist::Zilla
See notes at the start of Makefile.PL.
poe-docs
See http://github.com/rcaputo/poe-docs
-><- Old old old stuff after here.
** Tutorials
POE can be difficult for people to pick up, especially if they're not
familiar with Perl references and objects. A "gradual" introduction
to state machines and POE is half written. It discusses things
through Kernel and Session, but it doesn't go into Wheels or the
Object Layer yet.
Freeside suggests:
"here's an event-driven application domain."
"here's how people usually do it with select(,,,)."
"here's how you might want to do it in POE."
"here are some other cool things POE makes easy."
** Papers
*** Coroutines
It is possible to model coroutines in POE. Coroutines are a hot
topic, and this would make a nice paper topic.
*** Call With Current Continuation (callcc)
Likewise, it was once lamented that Perl has no callcc (call with
current continuation) function. POE can simulate callcc, which in
turn can simulate coroutines, threads, and all sorts of things.
** Other Documentation
Crimson suggested that I write a book about POE, using the PCB Q&A
format. Maybe when POE's design settles down?
Golly, I *hope* POE's design settles down.
* SUPPORT OTHER SYSTEMS
POE uses a lot of POSIX to be portable, but some systems don't seem to
be POSIX enough.
** Win32 support
Native support for Windows events would make POE work much better on that platform.
** OS X support
Native support for OSX events might make POE work better there, but OSX has good multiplexer support at the moment.
What would native OSX events give us that we can't already get?
* PROGRAMS USING POE
** tail + grep
Someone (I've forgotten who) wants a program that combines the
features of /usr/bin/grep and /usr/bin/tail.
** POE::Simple
Philip Gwyn would like to see a POE::Simple::(Client|Server|Proxy)
that combines a lot of default behavior into an extremely-high-level
module for writing quick network programs.
He says, "Contrast LWP::Simple with the other modules." Fair enough.
* DEVELOPMENT METHODOLOGY CHANGES
** Event
Event.pm is destined to become Perl's standard event queue. POE
should probably cooperate with or use it at some point.
While I've heard that Event cooperates with Tk, there doesn't seem to
be anything in the source or documentation about this. Perhaps Tk
will use it if it's present?
Event is supposed to make perl's signals safe. If true, that's a big
win.
Event has multiple queues and priorities. That could be useful.
Event has a C API. While it's subject to incompatible changes every
time Event or perl is upgraded, it's supposed to be faster than the
Perl API. No wonder, there's one less level of indirection.
Migration to Event. It may be better to wait for Event to reach 1.00
before going ahead with this, but it's good to keep in mind.
*** Wrap Event's C API
Some of the Event things don't work the way I'd like. Function calls
are slow, so fetching event attributes with them is extra overhead.
Passing them to states as another parameter (array reference) and
accessing them with offset constants might be faster.
*** Replace POE::Kernel Event Guts
This will lighten POE a lot, at the expense of requiring a bundle of
modules including some C. Plus POE will have XS to interface Event's,
and then POE may be incompatibly outdated whenever perl or Event
changes. Blargh.
* DESIGN CONSIDERATIONS
** Minor Interface Nits
Some functions only check for undef. These should really check for
undef or @_==1, since the second parameter may be missing.
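For example (set_option() is a made-up accessor):
  sub set_option {
    my ($self, $value) = @_;
    return $self->{option} if @_ == 1;   # no second parameter at all
    $self->{option} = $value;            # undef is a legitimate value
    return $value;
  }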
If a wheel has no ErrorState, the default behavior should be to issue
a warning with $!.
Consolidate FailureState and ErrorState, so people don't have to
remember when to use one or the other.
** Major Nits
Wheels need to be able to stack things.
Swapping wheels requires that old ones be deleted before new ones are
created, otherwise selects are cleared improperly. This is bad, I
don't like it, and I don't have time to fix it right now. Grrrr!
** Cascaded States
Cherem joeo@suninternet.com wants this for NIRC.
I think building a cascading mechanism can be useful for non-POE
programs, so building it into POE would limit its use. Since POE
states are just Perl subs, the mechanism should focus on cascading
subs instead of states.
As I interpret cascading, it is the ability to call a sequence of subs
to handle a single call. Each sub in the sequence can decide whether
or not to continue the cascade to the next sub. Here are a couple of
cascading sub examples to start things:
  sub user_kick {
    &do_cascade(@_) if ($needs_to_cascade_before_work);
    # ... do some work ...
    &do_cascade(@_) if ($needs_to_cascade_amidst_work);
    # ... do more work ...
    &do_cascade(@_) if ($needs_to_cascade_after_work);
  }

  sub user_join {
    # ... do some work ...
    &stop_cascade() if ($this_should_be_the_last_state_called);
    # ... do some work ...
  }
The cascader continues the sequence of subs by default. However,
&do_cascade() and &stop_cascade() will set a flag indicating that the
current sub has already cascaded. This will prevent the cascader from
calling the next sub once the current one returns.
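A rough sketch of that driver, with every name invented, just to pin the semantics down:
  my (@pending, $handled);

  sub run_cascade {
    my ($subs, @args) = @_;
    @pending = @$subs;
    while (my $sub = shift @pending) {
      $handled = 0;
      $sub->(@args);
      last if $handled;          # the sub cascaded (or stopped) itself
    }
  }

  sub do_cascade {               # continue the cascade immediately
    $handled = 1;
    while (my $sub = shift @pending) { $sub->(@_) }
  }

  sub stop_cascade {             # abandon the rest of the cascade
    $handled = 1;
    @pending = ();
  }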
Now for cascader instantiation:
  my $cascader = new Cascader(
    kick => [ \&default_kick ],
    join => [ \&default_join ],
  );
The cascader uses AUTOLOAD to resolve cascades from their names. For
example, $cascader->kick() is mapped to the 'kick' sequence through
AUTOLOAD. Manipulating the Cascader package's symbol table is not
recommended, because the symbol table is shared among all instances.
Cascader lists can be manipulated with push and pop methods. To add a
new sub to the 'kick' cascade:
$cascader->push( 'kick', \&user_kick );
To remove the last pushed sub:
$cascader->pop( 'kick' );
Push would allow multiple subs. If one of the subs is \&stop_cascade,
then the cascade is stopped unconditionally at that point:
$cascader->push( 'join', \&user_join, \&stop_cascade );
And finally, integration with POE via object states:
  new POE::Session( _start    => \&state_start,
                    $cascader => [ 'kick', 'join' ]
                  );
The 'kick' event will go to $cascader->kick(), and 'join' is sent to
$cascader->join(). AUTOLOAD fires off the proper cascade for each
meta-method.
This has evolved out of the scope of POE proper, and so I've delegated
any further work to Cherem.
** Reorganize Wheels
I think the Wheels abstraction should be reorganized into a new
taxonomy:
*** Stream
Streams are read and written in whatever chunks they receive.
POE::Filter::Lecks would probably exist here.
*** Block
Blocks would be subdivided:
**** Separated
Blocks that are identified by separators. POE::Filter::Line goes
here.
**** Enumerated
Blocks that are identified by length. This includes fixed-width
records and variable-width records that begin with a length marker.
POE::Filter::Reference belongs here.
*** Form
Forms are read/written as anonymous hashrefs:
  { Headers => { $header_1 => $value_1,
                 $header_2 => $value_2,
                 ...
               },
    Fields  => { $field_1 => $value_1,
                 $field_2 => $value_2,
                 ...
               },
  }
It would be up to the specific Form filters to figure out a way to
render forms. Perhaps one of the Headers could contain meta-layout
information.
POE::Filter::HTTPD goes here. As would a new filter for screen
widgets. The nice thing about this scheme is that sessions wouldn't
need to know what sort of form interface they're using. It could be
Tk widgets, Face widgets or CGI.
** Self-Knowledgeable Components
Fairytale stuff. It's here so I don't forget it.
What if the Filter::Reference is chained off Filter::Line instead of
Filter::Block?
The "chaining" could be made internal to Filter::Reference, so it
automagically brings in Filter::Block. That would prevent people from
chaining incompatible filters together, but it would limit their
flexibility.
Or components (wheels, filters and drivers) could be made aware of the
other components they're working with. That would give them an
opportunity to verify that they're working with compatible components.
It might also allow components to alter their default behaviors. For
example, Filter::Reference might make sure to escape/unescape newline
characters if it's working with a Filter::Line. It might even go so
far as to query the Filter::Line about the newline characters it needs
to escape.
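Purely hypothetical sketch; none of these hooks or methods exist in the real filters, and the names are invented:
  package POE::Filter::Reference;   # imaginary neighbor-awareness hook

  sub downstream_filter_is {
    my ($self, $neighbor) = @_;
    if ($neighbor->isa('POE::Filter::Line')) {
      # Ask the line filter what it frames records with, and escape
      # that sequence in our serialized output.  get_terminator() is
      # an invented method.
      $self->{escape} = $neighbor->get_terminator();
    }
  }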
** Break Session Encapsulation
There is a growing need to manipulate remote sessions with
Kernel::signal(), ::alarm(), and ::state(). This seems to break the
barriers between sessions in uncomfortable ways, but it appears to
have its uses.
While I can't justify remote signal() and state(), remote alarm() is
harmless enough. But who gets the $alarm_id when the alarm semantics
change?
** Daemonify a Process
Add a Kernel function to make the whole process go daemon. From
_Advanced Programming in the UNIX Environment_ by W. Richard Stevens:
  use POSIX qw(setsid);                          # setsid() lives in POSIX

  fork() && exit;                                # parent exits, child carries on
  setsid();                                      # detach from the controlling tty
  chdir('/');
  umask(0);
  close(STDIN); close(STDOUT); close(STDERR);    # if necessary
** Kernel fork()
POE::Kernel has some standard requirements for forking.
Implement them as POE::Kernel::fork()?
** "Morphing" SocketFactory
SocketFactory is dead weight after it's made a connection, and its
socket is useless while it's trying to make the connection. Randal
Schwartz suggests a SocketFactory::put() method that queues data and
delivers it to the socket when it finally connects.
A (UNIX) streams-based approach would handle this nicely. In the
meantime, two other ideas have been considered:
*** Morphing SocketFactory: First Sortie
After some debate, the idea evolved into a SocketFactory that can
"morph" into another wheel after it successfully connects.
Furthermore, it was suggested that an option to automatically retry
connections might be useful.
Since most of the work is done in the SocketFactory constructor, here
is a sample of what such a beast might look like:
  new POE::Wheel::SocketFactory
    # regular SocketFactory things
    ( SocketDomain    => AF_INET,
      SocketType      => SOCK_STREAM,
      SocketProtocol  => 'tcp',
      RemoteAddress   => $remote_address,
      RemotePort      => $remote_port,
      FailureState    => 'io_error',
      # here's where it gets strange
      MorphInto       => 'Wheel::ReadWrite',
      MorphParameters => { Driver     => new POE::Driver::SysRW,
                           Filter     => new POE::Filter::Line,
                           InputState => 'got_a_line',
                           ErrorState => 'io_error',
                         },
    );
You sort of need to know what you want to do ahead of time, since
almost all the work is done during initialization. Some folks will find
this concept very strange.
*** Morphing SocketFactory: Second Sortie
I just like saying "sortie". It comes from long afternoons playing
Choplifter on the Apple ][. Anyway:
POE::Wheel::SocketFactory would have a put() method, and it would
enqueue output just like the original idea. However, instead of
playing with the SocketFactory's blessing with the OO equivalent of
goto, provide another method (maybe dequeue_to()) that returns the
queued data as a list. The list would be suitable for feeding back
into another wheel's put() method.
dequeue_to() might look like this inside:
  sub SocketFactory::dequeue_to {
    my ($self, $new_wheel) = @_;
    if (@{$self->{queue}}) {
      $new_wheel->put(@{$self->{queue}});
      $self->{queue} = [];
    }
    $new_wheel;
  }
You would create a new wheel and pass its reference through the
dequeue_to() function. This would let you create the new wheel,
dequeue to it, and store it in one statement:
  sub SomeSession::connect_handler {
    my $heap = $_[HEAP];
    $heap->{wheel} =
      $heap->{wheel}->dequeue_to( new Wheel::ReadWrite( ... ) );
  }
This seems cleaner than playing with the SocketFactory's blessing.
** Kernel Statistics
A function to list active sessions. A function to acquire details
about a session. A function to acquire the kernel's and OS's load
averages, uptime, etc.
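None of these methods exist yet; this is just the flavor of interface I have in mind ($poe_kernel is the usual POE::Kernel singleton):
  my @sessions = $poe_kernel->session_list();                 # invented
  my %details  = $poe_kernel->session_details($sessions[0]);  # invented
  my %machine  = $poe_kernel->machine_statistics();           # load, uptime, ...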
** Port Face from Serv to POE
Face is a text widgets library using Curses. It's hierarchical, and
it uses a funky slow tied-hash message passing thing. It would be
enormously cool to revise this and make it work with POE.
* RESEARCH AND DEVELOPMENT
** KST
Fmh suggests looking at the Knowledge Server Toolkit at http://lambda.gsfc.nasa.gov/kst/kst.html
Apparently it is a socket-based application for monitoring and control.
** Java Application Servers
Fmh says POE needs Java Application Server functionality.
There's one, with explanation and overview and stuff, at http://www.enhydra.org/
** Object System Architecture
Artur Bergman emphatically recommended the book _Object Database Development, Concepts and Principles_, by David W. Embley.
More information about it may be found at http://cseng.awl.com/bookdetail.qry?ISBN=0-201-25829-3&ptype=0
http://osm7.cs.byu.edu/~eric/allegro.html talks about some of the topics in the book. Namely "OSA":
OSA has concepts to formalize just about everything one needs to
model a real world situation. Although OSA is an "integrated"
modeling scheme, that is all the parts work together, it can
conveniently be seen as consisting of three parts: ORM, OBM, and
OIM.
ORM = Object-Relationship Model
OBM = Object-Behavior Model
OIM = Object-Interaction Model
** Serial and Terminal Device
This amounts to creating pre-conditioned filehandles that the regular
select logic can deal with. The code is mostly written already; it
just needs to be cleaned up and made available.
Stacked filters would encompass this quite easily.
*** General STREAMS Notes
Streams may be symmetric. That is, put in both directions and no
service routines. This is because the stream head is event driven,
instead of blocking and synchronous.
POE::Wheel::ReadWrite is a stream head.
POE::Filter is a stream module.
POE::Driver is a stream end. In some stream implementations, more
modules can be stacked onto the stream after the end. That's how they
support multiplexing. I'm not sure this is good.
User calls into the Wheel (stream head) start a chain of calls down
the stream towards the driver.
Select events start a chain of calls up the stream towards the user.
General usage:
SocketFactory to create the socket.
Or open() or whatever to create the file.
Create a Stream, and give it the file handle.
*** Notes From DIGITAL's Network Programmer's Guide
http://www.partner.digital.com/www-swdev/pages/Home/TECH/documents/Digital_UNIX/V4.0/AA-PS2WD-TET1_html/netprog6.html
**** Module Data Structures
queue init structure
  put routine (takes outgoing data)
  service routine (handles incoming data)
  open routine (called on each open)
  close routine (called on last close)
  information structure
  statistics structure
module info structure
  module id number
  module name
  minimum packet size (developer use)
  maximum packet size (developer use)
  high-water mark (flow control)
  low-water mark (flow control)
streamtab structure
  read queue init structure
  write queue init structure
  mux read queue init structure (for mux drivers)
  mux write queue init structure (for mux drivers)
**** Message Data Structures
Data Buffer, containing the message's binary data.
message block:
  data block
  message priority (band)
  message flags
**** Processing Routines for Drivers and modules
Open - Similar to Filter::new
Close - Similar to Filter::DESTROY
Configuration
Read side put
  Stream ends don't have this, since they receive data from the
  kernel.
  Downstream modules call putnext() to send to this.
  Takes pointer to read queue and message pointer.
Write side put
  Called when the upstream module calls putnext().
  Takes pointer to write queue and message pointer.
Read side service
Write side service
  Service routines pull messages off the read/write queues and
  try to process them. I'm still unclear on this concept.
**** Queue Synchronization
Queue level. One thread can access any instance of the module or
driver's write queue at the same time another thread is accessing a
module's read queue. Read and write queues don't share common data
and don't need to be synchronized.
Queue pair. Read and write queues share data. Only one thread at a
time can access them together.
Module level synchronization. All code within the module or driver is
single-threaded.
Elsewhere. Something else is synchronizing things.
Global. All drivers and modules in this synchronization level run in
the same thread. No concurrency at all.
*** DMR's Streams Paper
A Stream Input-Output System, by Dennis M. Ritchie: http://cm.bell-labs.com/cm/cs/who/dmr/st.html
**** Overview
Streams overview.
***** Streams
Streams are full-duplex connections between a user's process and a
device or pseudo device. A stream consists of a strand of modules,
similar to a shell pipeline, but with data going in both directions. Flow
control is something exceptional.
On the process end there is a stream head. It provides a programming
interface to the stream. On the device end is a device driver
module-- it's not a driver, it's a "driver module". Okay! On the
driver module end, data is sent to the device. In the receiving
direction, data and state transitions are composed into messages and
sent towards the user's program.
On open, you get two modules: a head module and a driver module.
Intermediate modules are attached and dropped dynamically.
***** Queues
Each stream module has a pair of queues, read and write. A stream
queue is a queue, plus two routines and some status information. One
routine is the "put procedure" which is called by the neighboring
module to send data along. The other is the "service procedure" which
is scheduled to execute whenever there is work to do.
So, put() doesn't immediately cause anything to happen. It merely
signals the next module in the line to process something when it
can.
The status information includes a pointer to the next module, some
flags, and some state information. Both queues know about each-other,
so they can implement "echo" and stuff.
***** Message Blocks
Message blocks are passed between queues. They are obtained from an
allocator. Each contains a read pointer, a write pointer, and a limit
pointer. Respectively, they are the beginning, end and growth limit
of the block's data.
The block header specifies its type. The most common type is data.
There are also control blocks of various kinds: data delimiters, I/O
control requests, and special conditions (line break, or carrier
loss).
Data blocks arrive one at a time. Boundaries between them are
insignificant. Data blocks may be coalesced; control blocks always
remain separate.
***** Scheduling
Each queue module behaves like a separate process, but they're not
real processes. The system saves no state information for a
non-running queue module. Queue modules don't block when they can't
continue; they must return control. When queues are enabled, the
system will (as soon as convenient) call its service procedure. The
service procedure removes successive blocks from the data queue,
processes them, and places them on the next queue by calling its put
procedure. When there are no more blocks to process, or when the next
queue becomes full, the service procedure returns to the system.
Queue enabling is mostly automatic. Like, when a block is put on the
queue, it enables like magic.
***** Flow Control
Each queue has a pair of numbers for flow control. A high-water mark
limits the amount of data that may be outstanding in the queue. By
convention, modules don't place data on a queue above the high-water
limit. A low-water mark is used for scheduling: When a queue has
exceeded its high-water mark, a flag is set. Then, when the routine
that takes blocks from a data queue notices this flag is set and that
the queue has dropped below the low-water mark, the upstream queue is
enabled.
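The rule, restated as code; the field names and enable_queue() are invented:
  sub queue_put {
    my ($queue, $block) = @_;
    push @{$queue->{blocks}}, $block;
    $queue->{count} += length $block;
    $queue->{full} = 1 if $queue->{count} > $queue->{hi_water};
  }

  sub queue_take {
    my ($queue) = @_;
    my $block = shift @{$queue->{blocks}};
    $queue->{count} -= length $block;
    if ($queue->{full} and $queue->{count} < $queue->{lo_water}) {
      $queue->{full} = 0;
      enable_queue($queue->{upstream});   # reschedule the blocked writer
    }
    return $block;
  }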
***** Examples
A newly-opened stream device:
  user write ---> | stream device | ---> device out
  user read  <--- |               | <--- device in
The top-level routines are invoked by users' read and write calls.
The writer routine sends messages to the device driver on the right.
Data arriving from the device is composed into messages sent to the
top-level reader, which returns the data to the user process when it
executes read.
Configuration after the device is open:
user wr ---> tty out ---> device out
user rd <--- tty in <--- device in
An ordinary terminal connected by RS-232. Here a processing module
(in the middle) is interposed. It performs the services necessary to
make terminals usable; for example echoing, character-erase and
line-kill, tab expansion as required, translation between CR and \n.
It's possible to use one of several terminal handling modules. The
standard one provides services like those of the Seventh Edition
system. Another resembles the Berkeley "new tty" driver.
The processing modules in a stream are thought of as a stack whose top
(shown on the left) is next to the user program. To install the
terminal processing module after opening a terminal device, the
program that makes such connections executes a "push" I/O control call
naming the stream and the desired module. Other primitives pop a
module and determine the topmost module's name.
Here is a case where terminal processing is stacked on a network
protocol (a network terminal):
user wr --> tty out --> proto out --> device out
user rd <-- tty in <-- proto in <-- device in
Then there is a common configuration (not illustrated) that's used
when the network is used for file transfers or other purposes. It
simply omits the "tty" module and uses only the protocol module.
**** Messages
Most of the messages between modules contain data. The allocator that
dispenses message blocks takes an argument specifying the smallest
block its caller is willing to accept. The current allocator
maintains an inventory of blocks that are 4, 16, 64 and 1024
characters long. Modules allocate blocks using a best guess of the
needed size-- for example, the top level write routine requests 64- or
1024-character blocks. The network input routine allocates 16-byte
blocks because that's the data packet size. The smallest blocks are
used only to carry arguments to control messages.
Besides data blocks, there are also several kinds of control messages.
***** Synchronous Messages
Synchronous messages are queued in the stream, so they occur at the
appropriate time in the stream.
****** BREAK
Break is generated by a terminal device on detection of a line break
signal. The standard terminal input processor turns this message into
an interrupt request. It may also be sent to a terminal device driver
to cause it to generate a break on the output line.
****** HANGUP
Generated by a device when its remote connection drops. It also marks
the stream so further use causes an error.
****** DELIM
This is a data delimiter. Most of the stream I/O is prepared to
provide true streams, in which record boundaries are insignificant,
but there are various situations in which it is desirable to delimit
the data. For example, terminal input is read a line at a time; a
DELIM is generated by the terminal input processor to demarcate lines.
****** DELAY
Tells terminal drivers to generate a real-time delay on output. It
allows time for slow terminals to react to characters previously sent.
****** IOCTL
These are generated by users' ioctl() calls. The parameters are
gathered at the top level, and if the request is not understood there,
it and the parameters are passed down the stream as a message. The
first module that understands a particular request acts on it and
returns a positive ACK. Intermediate modules that don't recognize a
particular request pass it on; stream-end modules return a negative
NAK. The top-level routine blocks until acknowledgement, and passes
the info to the user.
***** Asynchronous Messages
These are expedited up or down the stream, bypassing the queue.
****** IOCACK and IOCNAK
Acknowledgement messages for IOCTL. The stream head may time out if
one of these isn't returned quickly enough.
****** SIGNAL
Signals are generated by the terminal processing module and cause the
top level to generate process signals such as SIGQUIT and SIGINT.
****** FLUSH
Flush messages throw away data from input and output queues after a
signal or on user request.
****** STOP and START
These messages are used by the terminal processor to halt and restart
output by a device. For example, to implement the traditional
XON/XOFF flow control mechanism.
**** Queue Mechanisms and Interfaces
queue structure
  flags
  put procedure
  service procedure
  link to next downstream module
  pointer to the first block on the queue
  pointer to the last block on the queue
  hi-water value
  lo-water value
  count of characters now on the queue
  pointer to private storage
The flags contain several bits used by low-level routines to control
scheduling. They show whether the downstream module wishes to read
data, or the upstream module wishes to write, or the queue is already
enabled. One bit is examined by the upstream module; it tells whether
this queue is full.
The first and last block pointers point to a singly-linked list of
data that implements something of a ring buffer. Hi-water and
lo-water are initialized when the queue is created, and they are
compared against the character count to decide how to control flow.
The private storage is used to keep characteristics governed by the
queue module. For example, all the stty(1) things for a terminal.
Stream processing modules are written in one of two flavors. In the
simpler flavor, the queue module acts almost like a classical
coroutine. When it's instantiated, it sets its put procedure to a
system-supplied default routine, and supplies a service procedure.
Its upstream module disposes of blocks by calling this module's put
procedure, which moves a block downstream. The standard put procedure
also enables the current module; a short time later, the current
module's service procedure is called by the scheduler. In
pseudo-code, a typical service routine is:
  while (my queue is not empty and the next queue is not full) {
    get a message block from my queue
    process the message block
    call the next queue's put() procedure to move the block along
  }
That's appropriate in cases where messages can be processed
independently of each-other. For example, it's used by the terminal
output module. All the scheduling details are taken care of by
standard routines.
More complicated modules need finer control over scheduling. A good
example is terminal input. Here the device module upstream produces
characters, usually one at a time, that must be gathered into a line
to allow for character erase and kill processing. Therefore the
stream input module provides a put procedure to be called by the
device driver or other module downstream from it; here is an outline
of this routine and its accompanying service procedure:
  put procedure(queue, block):
    put block on my queue
    if (block contains new-line or carriage return) {
      enable my queue
    }

  service procedure(queue):
    take data from queue until new-line or carriage return, processing
      erase and kill characters
    call the next queue's put() procedure to move the line along
    call the next queue's put() procedure with DELIM to signify a line
The put procedure generates the echo characters as promptly as
possible; when the terminal module is attached to a device handler,
they are created during the input interrupt from the device because
the put procedure is called as a subroutine of the handler. On the
other hand, line-gathering, erase and kill processing, which can be
lengthy, are done during the service procedure at a lower priority.
**** Other Stuff
Pseudo-terminals act like pipe drivers. Messages that they receive
from a master or slave PTY are passed unchanged to the other end.
This means that streams on both sides are kept in synch.
device process <--> message module <--> pty slave <--> \\
user process <----> tty module <------> pty master <--> //
The "message" module translates between messages and data, so the pty
drivers can pass them around. In one direction, the message processor
takes control and data messages and transforms them into data blocks.
Data blocks start with a header giving the message type and contain
data with the message (or data) in it. In the other direction, it
parses the structured data messages and creates the corresponding
control blocks. The "device process" simulates a device driver.
**** Evaluation
Design decisions:
1. Messages hold references to data blocks, to minimize data copying.
Modules must be prepared to coalesce and break up data.
2. Modules are not proper processes (sessions) because there can be
thousands of them. Coroutines or threads would rock.
3. Put and service were necessary because of asynchrony, flow control,
etc.
Other notes:
No multiplexing, fan-in or fan-out, are provided by the original
streams interface.
Streams are good for controlling opened channels, but they lack a
general way to establish channels between processes.
*** Unread Sources
SUN's STREAMS Programming Guide: http://docs.sun.com/ab2/coll.156.1/STREAMS/@Ab2TocView?
Mentat Performance Networking: http://www.mentat.com/
** New Modules
*** New Filters
Stacked filters would let you do neat things like IMAP through SSH.
The mind boggles. Here are some useful filter ideas:
POE::Filter::SSH
For example:
Create a raw socket.
Push "tcp socket" on it.
Push "SSH stream protocol" on it.
Push "line oriented protocol" on it.
Push "imap client protocol" on it.
And go to town!
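Strawman syntax for the stack itself; POE::Filter::Stack, POE::Filter::SSH and POE::Filter::IMAP are all invented names:
  my $filter = POE::Filter::Stack->new(
    Filters => [
      POE::Filter::SSH->new(),    # encrypt/decrypt the raw byte stream
      POE::Filter::Line->new(),   # then frame it into lines
      POE::Filter::IMAP->new(),   # then speak IMAP client protocol
    ],
  );
  # Hand $filter to a ReadWrite wheel as usual, and go to town.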
*** New Components
Some stand-alone components that might be useful:
**** POE::Component::Authenticate
Artur suggests a generic API for authentication servers. I'm
interpreting this as a new component. This component would implement
a standard public interface. Plug-ins for various authentication
servers would do most of the hard work.
Security issues: Kernel hooks can accidentally or maliciously peek at
authentication requests, possibly logging them or otherwise displaying
cleartext passwords.
**** Network Servers and Clients
POE::Component::IRCD, SMTP, SMTPD, NNTP, NNTPD, IMAP, IMAPD, etc.
Sean Puckett has a working prototype of a multi-protocol chat server.
It accepts IRC and MUD style connections, and he spoke of adding web
support. Everyone shares the same chat stream.
Fimmtiu is the author of Net::IRC. He has been working on
POE::Component::IRC, and the new module has entered limited beta
testing (as of 6/4/1999).
Filter::HTTP (user agent). This is the client side. Randal thinks
there's a non-blocking way to use LWP, possibly by supplying the select
loop logic for it, which would work nicely.
**** POE::Component::VNC
VNC's home is http://www.uk.research.att.com/vnc/
Something on the VNC site caught my eye, but I don't remember exactly
where it is. The site mentions that the VNC protocol doesn't really
need a desktop behind it. Instead, stand-alone VNC servers can
provide graphical interfaces to one or more remote clients.
It sure would be interesting if POE could serve virtual desktops or
dialogs with VNC. Interesting and scary, like a huge, mutant,
radioactive lizard staring in your car window, wondering if maybe it
wants a crunchy snack with a soft, gooey center. Or something.
** Concurrent POE (Distributed Queue)
How does POE transparently thread or fork if these concurrency methods
are available? How does it make concurrent and single-threaded modes
compatible with each-other?
Enqueue events in sessions instead of the Kernel. This change should
also allow sessions to be frozen and thawed, since freezing them will
also freeze their event queues. Other resources, such as files, may
be lost, however. A work-around might be to have special _freeze and
_thaw states that are called at appropriate times. They can clean up
and reallocate resources.
Function name conventions used in the following code snippets:
sub name is public
sub _name is friend
sub __name is private
*** Kernel Changes
  sub _enqueue_event {
    my ($self, $session_id, $source_id, $state, $priority, $time, $etc) = @_;
  }

  #----------------------------------------------------------------------------
  # Dispatch an event to the next session in the round-robin queue.
  # This also has the side-effect of testing sessions for activity; they
  # can be checked for resource starvation only when they've run out of
  # events. This should eliminate a lot of checks.

  sub _dispatch_next_event {
    my ($self) = @_;
    my $next_session = shift @{$self->[KR_SQUEUE]};
    if ($next_session->_dispatch_event()) {
      push @{$self->[KR_SQUEUE]}, $next_session;
    }
  }
# Theory of operation:
#
# Kernel keeps a master queue. This holds active sessions in a
# time-based "priority" queue.
#
# The queue's "key" is the time that a session will next need
# attention. This is the time of the next event in the session's
# queue.
#
# The queue's "value" is a reference to the session (or perhaps
# session ID) that points to the session for dispatching.
#
# So:
  my $kernel = bless [], 'POE::Kernel';   # array-based, like the real kernel
  $kernel->[KR_MASTER_QUEUE] =
    [ [ $time, $session ],
      [ $time, $session ],
      ...
    ];
# Sorted in $time order.
#
# Inser
*** Session Changes
#----------------------------------------------------------------------------
# Enqueue an event.
  sub _enqueue_event {
    my ($self, $sender, $state, $priority, $time, $etc) = @_;

    # Place the event in the session's queue.
    #
    # If "concurrent" POE:
    #   Start or unblock the session's dispatch thread.
    # End
    #
    # Return the number of events in the session's queues.
  }
  sub _dispatch_event {
    my ($self) = @_;

    # If "concurrent" POE:
    #   Return 1 if there are no events but the session has resources
    #   to keep it active, or return 0 if there are no events and the
    #   session is "stalled".
    # Otherwise, "regular" POE:
    #   Dispatch an event, and return the number of events left in the
    #   queue.
    # End
  }
* WILD IDEAS
** Scheme
Write a virtual scheme machine that compiles into POE sessions.
Sessions can emulate callcc (call-with-current-continuation), which in
turn emulates everything, so it's theoretically possible.
** "Organic" FSA
POE::Session instances are finite state automata. They are permitted
to modify themselves at runtime. Is there an elegant way to encode
decisions in POE::Session instances, perhaps in a way that can modify
itself over time?
For example, a neural network could be encoded in a POE::Session, one
neuron per state. Neurons would fire events at each-other, in massive
simulated parallelism.
Neural networks implemented this way are also dynamic, because
sessions may add, remove and redefine their states. There just isn't
a handy way to define this particular POE use yet.
** "AutoPOE"
AutoPOE would implement Philip Gwyn's idea to auto-split and translate
Perl code into POE event handlers as part of an installation process,
rather than at runtime. Sort of an autoloader/selfloader that also
transforms code into POE sessions.
** POE::Perl
POE::Perl would take a different approach. It would try to manipulate
Perl into cooperating with POE's event model. So far, the idea
includes:
*** Overriding Perl's built-in functions
CORE::GLOBAL::sleep, for example, would register an alarm handler, and
set an alarm for it.
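The override half might look like this; 'wake_up' is an arbitrary event name, and the commented gap is exactly the call-stack problem described next:
  use POE;   # exports $poe_kernel

  BEGIN {
    *CORE::GLOBAL::sleep = sub {
      my ($seconds) = @_;
      # Register an alarm with the running kernel instead of blocking.
      $poe_kernel->delay(wake_up => $seconds);
      # ... here we'd have to unwind to the kernel and resume later,
      # which is the call-stack manipulation described below.
    };
  }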
*** Manipulating Perl's call stack
CORE::GLOBAL::sleep (again, for example) would save the call stack
going into the sleep() function, and simulate a return back to the
kernel. When the alarm handler is called, the stack is reconstituted,
and CORE::GLOBAL::sleep returns back to the session.
That's the idea, anyway. Perl may not like it.
** Automate State Creation
*** Translate plain procedural perl into POE event handlers
There is an easy way to translate blocking code into non-blocking
code. You just need to identify all the jump/return destinations, and
start new states at them.
For example, consider this perl code:
  my $count = 1_000_000_000;
  while ($count--) {
    print "Hello, world! Enter some text: ";
    my $line = <STDIN>;
    last if ($line =~ /^quit$/i);
  }
  print "Goodbye, world!\n";
This could be represented by the following states:
  sub _start {
    $_[HEAP]->{count} = 1_000_000_000;
    $_[KERNEL]->yield('while_test_1');
  }

  sub while_test_1 {
    unless ($_[HEAP]->{count}--) {
      $_[KERNEL]->yield('end_of_while_1');
      return;
    }
    print "Hello, world! Enter some text: ";
    $_[KERNEL]->block('line input', 'resume_state');
  }

  sub resume_state {
    if ($_[ARG0] =~ /^quit$/i) {
      $_[KERNEL]->yield('end_of_while_1');
    }
    else {
      $_[KERNEL]->yield('while_test_1');
    }
  }

  sub end_of_while_1 {
    print "Goodbye, world!\n";
  }
B::Deparse returns "normalized" Perl based on the bytecodes in perl's
compiled parse trees. It may be easier to convert normalized Perl
into states than to try and parse everyone's different coding styles.
Here's the original Perl code, run through `perl -MO=Deparse,-p`:
  (my($count) = 1000000000);
  while (($count--)) {
      print('Hello, world! Enter some text: ');
      (my($line) = <STDIN>);
      (($line =~ /^quit$/i) and last);
  }
  print("Goodbye, world!\n");
More research is needed here. Must test more code constructs; maybe
run a large program through B::Deparse and see how it treats things.
** Load Balancing Among Distributed POE Kernels
Some ueber-kernel directory service would keep track of everything,
maybe, and, uh, make sure stuff works. :)
** Input Parsing (Factoid Style)
*** The Problem
Input parsing is hard. I have some prototype code for Infocom-style
command parsing, but it doesn't scale well when you consider that
existing infobots (see http://www.cs.cmu.edu/~infobot ) already
manage around 150,000 factoids.
So consider the need to look up responses in a 150,000 factoid
database. This isn't so far-fetched; there's an IRC 'bot that does
this now with a tied hash (I think; haven't looked). Its limitation,
however, is that it can't perform "fuzzy" matches on user input.
Consider the example:
can someone help me?
There are a number of different ways to represent it, and they can be
summarized succinctly as a regexp:
/^can some(one|body) help( me)?$/
So the question is: how do I key a hash by regexp? The answer is: I
can't. But there seems to be a limited regexp style that might work.
Instead of using the full regexp syntax, I limit it to just (|).
Begin and end anchors are implied, and I leave out some bits.
(can|would) (someone|somebody|anyone|anybody) (help|assist) (|me)
Pretty rough. How do we make that pattern a factoid key, so that someone
asking the original question triggers the response?
*** Factoid Storage
First we treat the factoid key as a series of words instead of a
string of letters. This removes the whitespace dependency:
(can|would), (someone|somebody|anyone|anybody), (help|assist), (|me)
Second, we sort options lexically, so that (can|would) and (would|can)
are treated the same:
(can|would), (anybody|anyone|somebody|someone), (assist|help), (|me)
Third, we build a hash of discrete words to the option groups they
occur in. I have some doubts about this, but it's my best idea so
far:
  $alias{'can'}      = [ 'can',      'can|would' ];
  $alias{'would'}    = [ 'would',    'can|would' ];
  $alias{'someone'}  = [ 'someone',  'anybody|anyone|somebody|someone' ];
  $alias{'anyone'}   = [ 'anyone',   'anybody|anyone|somebody|someone' ];
  $alias{'somebody'} = [ 'somebody', 'anybody|anyone|somebody|someone' ];
  $alias{'anybody'}  = [ 'anybody',  'anybody|anyone|somebody|someone' ];
  $alias{'help'}     = [ 'help',     'assist|help' ];
  $alias{'assist'}   = [ 'assist',   'assist|help' ];
  $alias{'me'}       = [ 'me',       '|me' ];
Fourth, the factoid is stored under a list of words and sorted options
(the results from the second step):
  $factoid{ 'can|would', 'anybody|anyone|somebody|someone',
            'assist|help', '|me'
          } = 'No! Go away!';
*** Information Retrieval
Retrieving factoids is a little harder. Let's review the target
input, preprocessed as a list of words (punctuation handily
discarded):
'can', 'someone', 'help', 'me'
Here's the query (pseudo-SQL):
select factoid from factoids
where key.word_1 is in
( select alias from aliases where key=='can' )
and key.word_2 is in
( select alias from aliases where key=='someone' )
and key.word_3 is in
( select alias from aliases where key=='help' )
and key.word_4 is in
( select alias from aliases where key=='me' )
And then pick the factoid with the shortest/longest key?
Actually, this fails for (|me) things, because the select will look
for four words. So I suppose what's really needed is an index that
works in the same vein as regexps do for matching. Somehow.
So, the query:
qw(can someone help me)
somehow matches a factoid stored by:
qw(can|would anybody|anyone|somebody|someone assist|help |me)
Hrm. I've decided it's going to take some experimentation, which
means having a framework for tinkering. And the best framework for
tinkering with NLP/factoid stuff is a 'bot! And the best way to have
a 'bot that you don't have to reset all the time is (IMO) the fabled
Object Layer.
So, this to-do is on hold 'til I can build an IRC 'bot.
* IN CLOSING
I welcome all ideas, suggestions and research pointers.