Shared memory substrings of large Perl buffers

Name

String::Slice - Shared Memory Slices of Bigger Strings

Version

This document describes String::Slice version 0.08.

Synopsis

    # Exports 'slice':
    use String::Slice;

    # String::Slice is for parsing across huge strings:
    my $buffer = get_large_buffer;

    # Make an SV to turn into a slice:
    my $slice = '';

    # Move (parse) forward across the $buffer in small increments:
    for (my $pos = 0; $pos < length $buffer;) {

      # $pos is the current position in buffer.
      # $slice retains info from previous call:
      slice($slice, $buffer, $pos);

      # front-anchored means /\A…/
      my $regex = get_small_front_anchored_regex;

      # Match may fail. Stay at same $pos if fail:
      $slice =~ $regex or next;

      # Add length of match:
      $pos += $+[0];
    }

Description

Processing large strings in Perl is inefficient because accessing any smaller portion of a buffer requires making a copy of that portion. Also, finding substr offsets in large utf8 strings requires scanning the string, since characters vary in byte length.
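Both costs can be seen with core Perl alone. The following is a minimal sketch, not part of String::Slice; the sample string and variable names are illustrative only:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# Illustrative text with two non-ASCII characters (U+00EF and U+00E9).
my $text  = "na\x{EF}ve caf\x{E9}";
my $bytes = encode_utf8($text);    # the same text as UTF-8 octets

# The character count and the byte count differ, so a character offset
# cannot be mapped to a byte offset without scanning the string.
printf "chars=%d bytes=%d\n", length($text), length($bytes);

# substr returns a copy: changing the copy leaves the original untouched.
my $piece = substr($text, 6, 4);
$piece =~ tr/a-z/A-Z/;
print "original unchanged\n" if substr($text, 6, 3) eq 'caf';
```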

This module lets you make a string scalar (a "slice") point to a portion of the content of another string scalar. Finding the next slice is based on the position of the previous slice, so hopping over utf8 is much faster.

The primary goal of this module is to make parsing large data much faster in Perl.

API

String::Slice exports one function: slice. It can be called in a few different ways:

    slice($slice_variable, $big_buffer_variable, $char_offset, $char_length)

This effectively makes $slice_variable a substr of the buffer, providing a faster and more memory-efficient way of doing:

    $slice_variable = substr($big_buffer_variable, $char_offset, $char_length);

The offset defaults to 0, and if no length is given, the slice goes to the end of the buffer.
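The call forms described above might look like the following sketch. It assumes String::Slice is installed from CPAN; the buffer contents are made up for illustration, and each result is compared against the substr call that the text above describes as equivalent rather than against a fixed string:

```perl
use strict;
use warnings;
use String::Slice;    # non-core module; exports 'slice'

my $buffer = "The quick brown fox";    # illustrative data
my $slice  = '';

# No offset, no length: the slice should cover the whole buffer.
slice($slice, $buffer);
print "whole buffer matches\n" if $slice eq substr($buffer, 0);

# Offset only: the slice should run from character 4 to the end.
slice($slice, $buffer, 4);
print "offset-only matches\n" if $slice eq substr($buffer, 4);

# Offset and length: five characters starting at character 4.
slice($slice, $buffer, 4, 5);
print "offset+length matches\n" if $slice eq substr($buffer, 4, 5);
```

The same $slice variable is reused for every call, as in the Synopsis, rather than being assigned to directly.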

If $slice is already a slice of $buffer then this call:

    slice($slice, $buffer, $offset, $length)

will subtract the previous offset (stashed in the slice internally) from the current offset and hop the difference. The hop may be forwards or backwards.

If length is too long, the slice will go to the end of the buffer.

One side effect of this function is that both strings will become readonly, and the memory will not be freed until they both go out of scope.

The slice function returns 1 on success and 0 on failure. Failure occurs if the requested offset is invalid (less than the start or greater than or equal to the end of the buffer).
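Because failure is reported through the return value, callers will usually want to check it. A minimal sketch, assuming String::Slice is installed and using an illustrative buffer and an offset chosen to be well past its end:

```perl
use strict;
use warnings;
use String::Slice;    # non-core module; exports 'slice'

my $buffer = "short buffer";    # illustrative data, 12 characters
my $slice  = '';

# An offset inside the buffer is documented to succeed.
my $ok = slice($slice, $buffer, 6);

# An offset far beyond the end is documented to fail.
my $bad = slice($slice, $buffer, 1_000);

print "in range: $ok, out of range: $bad\n";

# Typical use: stop parsing when the slice cannot be positioned.
slice($slice, $buffer, 6) or die "offset out of range";
```

The documentation above does not say what $slice holds after a failed call, so the sketch re-slices before using it again.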

Credit

These people provided invaluable help:

  • Jan Dubois
  • Florian Ragwitz

Author

Ingy döt Net <ingy@cpan.org>

Copyright and License

Copyright 2012-2015. Ingy döt Net.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

See http://www.perl.com/perl/misc/Artistic.html