Tasklist: IO Subsystem

Whiteknight edited this page May 22, 2012 · 2 revisions

This page lists the improvements that Parrot's I/O subsystem still needs. If you complete one of these tasks, remove it from the list when you're done.

Major Refactor

Here are some details of a major refactor proposed by Whiteknight which will probably be pursued soon:

  • Refactor IO-related codepaths to use a new io_vtable for dispatching operations to the subsystem API
  • Refactor out a buffering API, which can take vtable function pointers for type-agnostic buffering on separate input/output streams.
  • Start separating out Pipe and FileHandle logic, though do not modify the user-facing interfaces at all.
  • Possibly create a new Pipe PMC, to expose the new pipe interface without removing those features from FileHandle (yet).
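
The vtable dispatch idea in the first item can be sketched as follows. This is only an illustration under stated assumptions: the names io_vtable, io_handle, io_read(), io_write(), and io_init_memory() are hypothetical and are not Parrot's actual API, and an in-memory handle stands in for FileHandle, Pipe, or Socket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout: none of these are Parrot's real types. */
typedef struct io_handle io_handle;

/* One table of function pointers per handle type. */
typedef struct io_vtable {
    const char *name;
    size_t (*read)(io_handle *h, char *dst, size_t len);
    size_t (*write)(io_handle *h, const char *src, size_t len);
} io_vtable;

struct io_handle {
    const io_vtable *vtable; /* selects the type-specific implementation */
    char   data[64];         /* backing store for the in-memory example type */
    size_t used;             /* bytes currently stored */
    size_t pos;              /* read cursor */
};

/* Type-specific implementation: copy out up to len unread bytes. */
static size_t mem_read(io_handle *h, char *dst, size_t len)
{
    size_t avail = h->used - h->pos;
    if (len > avail)
        len = avail;
    memcpy(dst, h->data + h->pos, len);
    h->pos += len;
    return len;
}

/* Type-specific implementation: append up to len bytes, truncating if full. */
static size_t mem_write(io_handle *h, const char *src, size_t len)
{
    size_t room = sizeof h->data - h->used;
    if (len > room)
        len = room;
    memcpy(h->data + h->used, src, len);
    h->used += len;
    return len;
}

static const io_vtable mem_vtable = { "memory", mem_read, mem_write };

void io_init_memory(io_handle *h)
{
    h->vtable = &mem_vtable;
    h->used = 0;
    h->pos = 0;
}

/* Generic entry points: no per-type branching, only a vtable lookup. */
size_t io_read(io_handle *h, char *dst, size_t len)
{
    assert(h != NULL && h->vtable != NULL);
    return h->vtable->read(h, dst, len);
}

size_t io_write(io_handle *h, const char *src, size_t len)
{
    assert(h != NULL && h->vtable != NULL);
    return h->vtable->write(h, src, len);
}
```

With this shape, supporting a new handle type means supplying another io_vtable instance; the generic entry points stay unchanged.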

Related TO-DO items

These are some older TODO items which will be covered by this refactor:

  • Make it possible to use separate input and output buffering. Input buffering is needed for readline() and peek(), but sockets or pipes usually don't use output buffering.
  • Separate pipe-related logic out of FileHandle. Create a Pipe PMC type. Alternatively, we could deprecate opening pipes to external commands via FileHandle PMC's.
  • Move buffering logic from FileHandle and StringHandle to Handle, so that all other PMC types can inherit it. Include buffering-related ATTR's, METHOD_buffer_type(), and METHOD_buffer_size(). We should keep the buffering logic at the C level. All that is needed for now is to pass the buffering functions a pair of read/write callbacks for files and sockets (e.g. recv() and send()). (Also, maybe add a flag to the buffer_type and buffer_size methods to specify whether the setting applies to input only, output only, or both, with both as the default.)
  • Modify I/O API so that it doesn't call PCCINVOKE(). Method calls on I/O PMC's should call the API functions, not the other way around.
  • Unify logic wherever possible so multiple I/O types can share.
  • Unify the codepaths for Socket and Pipe into the I/O API. Refactor the I/O API to be more unified.
  • Add externally-visible socket functions src/io/socket_*.c to src/io/api.c.
  • Extract a sane buffering API.
  • Cleanup Parrot_io_write(), Parrot_io_puts(), Parrot_io_putps(). Unify where possible and remove duplicate code.
  • Add src/io/pipe.c and src/io/stringhandle.c files for type-specific logic similar to src/io/filehandle.c.
  • Make sure the functions in src/io/buffer.c will work with all buffer-enabled PMC's.
  • Move src/io/filehandle.c:Parrot_io_clear_buffer() to src/io/buffer.c.
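
The callback-based buffering described above can be sketched as an input buffer that knows nothing about its source except a read callback. This is a minimal sketch, not the planned implementation: io_read_fn, io_buffer, buffer_peek(), buffer_readline(), and the string_source test source are all hypothetical names. An output buffer would be a second, independent io_buffer paired with a write callback, which is what lets sockets and pipes leave output unbuffered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: for a socket this could wrap recv(). */
typedef size_t (*io_read_fn)(void *source, char *dst, size_t len);

/* Deliberately small so that refills are exercised. */
typedef struct io_buffer {
    char   data[16];
    size_t start; /* index of the first unread byte */
    size_t end;   /* one past the last valid byte */
} io_buffer;

/* Move unread bytes to the front, then read more. Returns bytes added. */
static size_t buffer_fill(io_buffer *b, io_read_fn read, void *source)
{
    size_t unread = b->end - b->start;
    size_t got;

    assert(b != NULL && read != NULL);
    memmove(b->data, b->data + b->start, unread);
    b->start = 0;
    b->end = unread;
    got = read(source, b->data + b->end, sizeof b->data - b->end);
    b->end += got;
    return got;
}

/* Return the next byte without consuming it, or -1 at end of input. */
int buffer_peek(io_buffer *b, io_read_fn read, void *source)
{
    if (b->start == b->end && buffer_fill(b, read, source) == 0)
        return -1;
    return (unsigned char)b->data[b->start];
}

/* Copy one line (including '\n' when present) into dst, NUL-terminated.
   Stops early if dst is full. Returns the number of bytes copied. */
size_t buffer_readline(io_buffer *b, io_read_fn read, void *source,
                       char *dst, size_t cap)
{
    size_t n = 0;

    while (n + 1 < cap) {
        int c = buffer_peek(b, read, source);
        if (c < 0)
            break;
        dst[n++] = (char)c;
        b->start++;
        if (c == '\n')
            break;
    }
    if (cap > 0)
        dst[n] = '\0';
    return n;
}

/* Example source for demonstration: reads from a C string. */
typedef struct string_source {
    const char *text;
    size_t      pos;
} string_source;

size_t string_source_read(void *source, char *dst, size_t len)
{
    string_source *s = source;
    size_t avail = strlen(s->text) - s->pos;

    if (len > avail)
        len = avail;
    memcpy(dst, s->text + s->pos, len);
    s->pos += len;
    return len;
}
```

Because the buffer only sees the callback, the same functions would serve files, sockets, and string handles alike.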

Unnecessary TODO Items

Here is a list of previously mentioned TODO items which become unnecessary in this refactor:

  • Add Pipe and Socket logic to Parrot_io_open(). Call Parrot_io_open() from the open() method of each
  • Add Pipe and Socket logic to Parrot_io_close().
  • Add Pipe and Socket logic to Parrot_io_is_closed().
  • Add Pipe and Socket logic to Parrot_io_flush().
  • Add Pipe and Socket logic to Parrot_io_reads().
  • Add Pipe and Socket logic to Parrot_io_readline().
  • Error handling: figure out what happens if an unsuitable PMC is passed to Parrot_io_seek(), Parrot_io_tell(), Parrot_io_eof(), and Parrot_io_is_tty().
  • Use GET_ATTR macros in src/io/filehandle.c so all functions will be able to handle subclasses.


  • Move encoding logic (METHOD_encoding(), ATTR_encoding(), etc.) from StringHandle and FileHandle to Handle, so that all other PMC types can inherit it. The encoding logic is encapsulated in the string code quite well now, so we should have the StringHandle PMC handle encodings on its own. Only the encoding attribute should be shared.


  • Rename Parrot_io_new_pmc to Parrot_io_new_filehandle_pmc().

Misc Cleanup

  • Evaluate Parrot_io_make_offset(), Parrot_io_make_offset32(), and Parrot_io_make_offset_pmc(). If they do not need to be in src/io/api.c, move them elsewhere, possibly src/io/filehandle.c.
  • Make print() and say() stringify the same way (see http://rt.perl.org/rt3/Ticket/Display.html?id=55196).
  • Change fprintf() to Parrot_io_fprintf() where relevant.
  • The %s conversion specification in the printf()-like functions does not handle null C strings well.
  • Move src/io/filehandle.c:Parrot_io_is_encoding() to src/io/core.c.
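
For the %s item above, one possible approach is to substitute a visible marker before the pointer reaches the formatter, since passing a null pointer for %s to the standard printf() family is undefined behavior. The sketch below is an assumption, not a decided fix: the helper name cstr_or_marker() and the "(null)" marker are hypothetical choices, and standard snprintf() stands in for Parrot's own formatting functions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: never hands a null pointer to the formatter. */
static const char *cstr_or_marker(const char *s)
{
    return s != NULL ? s : "(null)";
}

/* Example caller: formats "name=<value>" into dst, tolerating a null name.
   Returns what snprintf() returns: the length the full output would have. */
int format_name(char *dst, size_t cap, const char *name)
{
    return snprintf(dst, cap, "name=%s", cstr_or_marker(name));
}
```

Whether a null string should print as a marker, as an empty string, or raise an exception is a design decision this sketch does not settle.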

Asynchronous I/O

The following are various I/O-related RT tickets:


  • Create a StreamBuffer PMC to abstract away buffering details. This would allow FileHandle and Socket PMC's to be subclassed more easily and give all I/O types easy access to buffering.
  • Parrot_io_printf() should output to Parrot's standard output PMC, not stdout.
  • Parrot_io_eprintf() should output to Parrot's standard error PMC, not stderr.
  • Deprecate the current pipe API and create an OSProcess PMC that works like this (see IPC::Open3):
    proc = new 'OSProcess'

    # The 'flags' argument is something like EXEC_STDIN | EXEC_STDOUT | EXEC_STDERR
    proc.'exec'(command, args, flags)

    # Get stdin of exec()'ed process for writing
    w_handle = proc.'stdin'()

    # Get stdout of exec()'ed process for reading
    r_handle = proc.'stdout'()


  • Created an abstract IOHandle class.
  • Abstracted away relevant API code from FileHandle and Socket into IOHandle.
  • Fixed StringHandle to be a proper subclass of IOHandle.
  • Removed the layers structures and macros after the migration is complete.
  • Removed src/io/io_mmap.c as it's unused and not useful.
  • Converted I/O layers to I/O objects.
  • src/io/io_unix.c is the guts of most I/O on most platforms. src/io/io_win32.c is Windows. src/io/io_stdio.c is STDIN, STDOUT, and STDERR. These three need to be ported to the new system.
  • src/io/io_utf8.c is really the wrong way to go about implementing UTF-8 encoding. Filehandle PMC's should be marked with their character set and encoding, similar to strings.
  • Created a FileHandle PMC as a core filehandle object which can be subclassed by various HLL's.
  • Continued to support different I/O operations on different platforms, using the #ifdef directive on platform-specific sections.
  • Renamed all PIO_* functions to 'Parrot_io_*'. Since the implementation is completely changing, it's better to create new functions with the new names than to change the names of existing functions.
  • Removed the deprecated pioctl opcode and fixed related documentation (see http://rt.perl.org/rt3/Ticket/Display.html?id=48589).
  • Decided if we plan to use AIO before the 1.0 release (see http://rt.perl.org/rt3/Ticket/Display.html?id=57920).
  • Removed src/io/io_passdown.c and src/io/io_layers.c since they're purely implementation artifacts of the I/O layers implementation.
  • Changed src/io/io_string.c to a subclass of FileHandle that provides the same interface but to a string instead of a filehandle.
  • Changed all I/O PMC's to inherit from Handle.
  • Added close() and disconnect() methods to the Socket PMC.
  • Added an improved file-like API to the Socket PMC (open(), close(), readline(), etc). Internally, these methods should direct to similar functions in src/io/api.c.
  • Moved METHOD_get_fd() from FileHandle to Handle so that it can be inherited by Socket, Pipe, etc.
  • Create an OSHandle PMC that FileHandle and Socket can inherit from (but not StringHandle).
  • Create a Select PMC, maybe have a look at libevent. (It's a dynpmc)
  • Create StreamDescriptor PMC to abstract away system-dependent I/O descriptors. This would allow FileHandle and Socket PMC's to be subclassed more easily. (Mostly done as 'Handle' PMC)