Unofficial Github repo for Snoopy. Snoopy is a PHP class that simulates a web browser. It automates the task of retrieving web page content and posting forms, for example.
PHP
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
AUTHORS
COPYING.lib
ChangeLog
FAQ
INSTALL
LICENSE
NEWS
README.md
Snoopy.class.php
TODO

README.md

NAME:

Snoopy - the PHP net client v1.2.4

SYNOPSIS:

include "Snoopy.class.php";
$snoopy = new Snoopy;

$snoopy->fetchtext("http://www.php.net/");
print $snoopy->results;

$snoopy->fetchlinks("http://www.phpbuilder.com/");
print $snoopy->results;

$submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";

$submit_vars["q"] = "amiga";
$submit_vars["submit"] = "Search!";
$submit_vars["searchhost"] = "Altavista";

$snoopy->submit($submit_url,$submit_vars);
print $snoopy->results;

$snoopy->maxframes=5;
$snoopy->fetch("http://www.ispi.net/");
echo "<PRE>\n";
echo htmlentities($snoopy->results[0]); 
echo htmlentities($snoopy->results[1]); 
echo htmlentities($snoopy->results[2]); 
echo "</PRE>\n";
$snoopy->fetchform("http://www.altavista.com");
print $snoopy->results;

DESCRIPTION:

What is Snoopy?

Snoopy is a PHP class that simulates a web browser. It automates the task of retrieving web page content and posting forms, for example.

Some of Snoopy's features:

  • easily fetch the contents of a web page
  • easily fetch the text from a web page (strip html tags)
  • easily fetch the the links from a web page
  • supports proxy hosts
  • supports basic user/pass authentication
  • supports setting user_agent, referer, cookies and header content
  • supports browser redirects, and controlled depth of redirects
  • expands fetched links to fully qualified URLs (default)
  • easily submit form data and retrieve the results
  • supports following html frames (added v0.92)
  • supports passing cookies on redirects (added v0.92)

REQUIREMENTS:

Snoopy requires PHP with PCRE (Perl Compatible Regular Expressions), which should be PHP 3.0.9 and up. For read timeout support, it requires PHP 4 Beta 4 or later. Snoopy was developed and tested with PHP 3.0.12.

CLASS METHODS:

fetch($URI)

This is the method used for fetching the contents of a web page. $URI is the fully qualified URL of the page to fetch. The results of the fetch are stored in $this->results. If you are fetching frames, then $this->results contains each frame fetched in an array.

fetchtext($URI)

This behaves exactly like fetch() except that it only returns the text from the page, stripping out html tags and other irrelevant data.

fetchform($URI)

This behaves exactly like fetch() except that it only returns the form elements from the page, stripping out html tags and other irrelevant data.

fetchlinks($URI)

This behaves exactly like fetch() except that it only returns the links from the page. By default, relative links are converted to their fully qualified URL form.

submit($URI,$formvars)

This submits a form to the specified $URI. $formvars is an array of the form variables to pass.

submittext($URI,$formvars)

This behaves exactly like submit() except that it only returns the text from the page, stripping out html tags and other irrelevant data.

submitlinks($URI)

This behaves exactly like submit() except that it only returns the links from the page. By default, relative links are converted to their fully qualified URL form.

CLASS VARIABLES: (default value in parenthesis)

    $host           the host to connect to
    $port           the port to connect to
    $proxy_host     the proxy host to use, if any
    $proxy_port     the proxy port to use, if any
    $agent          the user agent to masqerade as (Snoopy v0.1)
    $referer        referer information to pass, if any
    $cookies        cookies to pass if any
    $rawheaders     other header info to pass, if any
    $maxredirs      maximum redirects to allow. 0=none allowed. (5)
    $offsiteok      whether or not to allow redirects off-site. (true)
    $expandlinks    whether or not to expand links to fully qualified URLs (true)
    $user           authentication username, if any
    $pass           authentication password, if any
    $accept         http accept types (image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*)
    $error          where errors are sent, if any
    $response_code  responde code returned from server
    $headers        headers returned from server
    $maxlength      max return data length
    $read_timeout   timeout on read operations (requires PHP 4 Beta 4+)
                    set to 0 to disallow timeouts
    $timed_out      true if a read operation timed out (requires PHP 4 Beta 4+)
    $maxframes      number of frames we will follow
    $status         http status of fetch
    $temp_dir       temp directory that the webserver can write to. (/tmp)
    $curl_path      system path to cURL binary, set to false if none

EXAMPLES:

Fetch a web page and display the return headers and the contents of the page (html-escaped):

    include "Snoopy.class.php";
    $snoopy = new Snoopy;

    $snoopy->user = "joe";
    $snoopy->pass = "bloe";

    if($snoopy->fetch("http://www.slashdot.org/"))
    {
        echo "response code: ".$snoopy->response_code."<br>\n";
        while(list($key,$val) = each($snoopy->headers))
            echo $key.": ".$val."<br>\n";
        echo "<p>\n";

        echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";

submit a form and print out the result headers and html-escaped page:

    include "Snoopy.class.php";
    $snoopy = new Snoopy;

    $submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";

    $submit_vars["q"] = "amiga";
    $submit_vars["submit"] = "Search!";
    $submit_vars["searchhost"] = "Altavista";


    if($snoopy->submit($submit_url,$submit_vars))
    {
        while(list($key,$val) = each($snoopy->headers))
            echo $key.": ".$val."<br>\n";
        echo "<p>\n";

        echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";

Showing functionality of all the variables:

    include "Snoopy.class.php";
    $snoopy = new Snoopy;

    $snoopy->proxy_host = "my.proxy.host";
    $snoopy->proxy_port = "8080";

    $snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)";
    $snoopy->referer = "http://www.microsnot.com/";

    $snoopy->cookies["SessionID"] = 238472834723489l;
    $snoopy->cookies["favoriteColor"] = "RED";

    $snoopy->rawheaders["Pragma"] = "no-cache";

    $snoopy->maxredirs = 2;
    $snoopy->offsiteok = false;
    $snoopy->expandlinks = false;

    $snoopy->user = "joe";
    $snoopy->pass = "bloe";

    if($snoopy->fetchtext("http://www.phpbuilder.com"))
    {
        while(list($key,$val) = each($snoopy->headers))
            echo $key.": ".$val."<br>\n";
        echo "<p>\n";

        echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";

Fetched framed content and display the results

    include "Snoopy.class.php";
    $snoopy = new Snoopy;

    $snoopy->maxframes = 5;

    if($snoopy->fetch("http://www.ispi.net/"))
    {
        echo "<PRE>".htmlspecialchars($snoopy->results[0])."</PRE>\n";
        echo "<PRE>".htmlspecialchars($snoopy->results[1])."</PRE>\n";
        echo "<PRE>".htmlspecialchars($snoopy->results[2])."</PRE>\n";
    }
    else
        echo "error fetching document: ".$snoopy->error."\n";

COPYRIGHT:

Copyright(c) 1999,2000 ispi. All rights reserved. This software is released under the GNU General Public License. Please read the disclaimer at the top of the Snoopy.class.php file.

THANKS:

Special Thanks to: