Skip to content

mschilli/uri-find

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

92 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NAME
    URI::Find - Find URIs in arbitrary text

SYNOPSIS
      require URI::Find;

      my $finder = URI::Find->new(\&callback);

      $how_many_found = $finder->find(\$text);

DESCRIPTION
    This module does one thing: Finds URIs and URLs in plain text. It finds
    them quickly and it finds them all (or what URI::URL considers a URI to
    be.) It only finds URIs which include a scheme (http:// or the like),
    for something a bit less strict have a look at URI::Find::Schemeless.

    For a command-line interface, see Darren Chamberlain's "urifind" script.
    It's available from his CPAN directory,
    <http://www.cpan.org/authors/id/D/DA/DARREN/>.

EXAMPLES
    Store a list of all URIs (normalized) in the document.

      my @uris;
      my $finder = URI::Find->new(sub {
          my($uri) = shift;
          push @uris, $uri;
      });
      $finder->find(\$text);

    Print the original URI text found and the normalized representation.

      my $finder = URI::Find->new(sub {
          my($uri, $orig_uri) = @_;
          print "The text '$orig_uri' represents '$uri'\n";
          return $orig_uri;
      });
      $finder->find(\$text);

    Check each URI in document to see if it exists.

      use LWP::Simple;

      my $finder = URI::Find->new(sub {
          my($uri, $orig_uri) = @_;
          if( head $uri ) {
              print "$orig_uri is okay\n";
          }
          else {
              print "$orig_uri cannot be found\n";
          }
          return $orig_uri;
      });
      $finder->find(\$text);

    Turn plain text into HTML, with each URI found wrapped in an HTML
    anchor.

      use CGI qw(escapeHTML);
      use URI::Find;

      my $finder = URI::Find->new(sub {
          my($uri, $orig_uri) = @_;
          return qq|<a href="$uri">$orig_uri</a>|;
      });
      $finder->find(\$text, \&escapeHTML);
      print "<pre>$text</pre>";

AUTHOR
    Michael G Schwern <schwern@pobox.com> with insight from Uri Gutman, Greg
    Bacon, Jeff Pinyan, Roderick Schertler and others.

    Roderick Schertler <roderick@argon.org> maintained versions 0.11 to
    0.16.

LICENSE
    Copyright 2000, 2009 by Michael G Schwern <schwern@pobox.com>.

    This program is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

    See http://www.perlfoundation.org/artistic_license_1_0

SEE ALSO
    URI::Find::Schemeless, URI::URL, URI, RFC 3986 Appendix C

About

Perl module to find URIs in arbitrary text

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Languages

  • Perl 100.0%