Merge CVS history #1
I ran a new CVS export using https://github.com/rcls/crap, which recovered the full commit history from sourceforge.net/snoopy, and then merged it with your modified README.md.
During the merge, the SourceForge copy appeared to be newer, both in its version number (1.2.5-dev) and in some author email addresses.
Although this class sees little use now that curl is available, I think keeping a copy here is worthwhile for older projects.
added some commits
Feb 3, 2000
added carriage return along with newline for headers. Some servers had problems with this, such as slashdot.org
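As a rough illustration of this change, the sketch below terminates every header line with CRLF rather than a bare newline. `build_request_headers` is a hypothetical helper written for this example, not a function from Snoopy.

```php
<?php
// Hypothetical helper: joins request and header lines with CRLF ("\r\n").
function build_request_headers(string $method, string $path, array $headers): string
{
    $out = "$method $path HTTP/1.0\r\n";
    foreach ($headers as $name => $value) {
        $out .= "$name: $value\r\n";
    }
    // A line containing only CRLF ends the header section.
    return $out . "\r\n";
}

$request = build_request_headers('GET', '/', ['Host' => 'slashdot.org', 'User-Agent' => 'Snoopy']);
```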
added some commits
Jul 25, 2004
updated to reflect new naming of snoopy class file from Snoopy.class.inc to Snoopy.class.php
Fixing BUG # 912060 Undefined variable: postdata
The postdata variable isn't initialized in the _prepare_post_body() function causing a "Notice" error.
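A minimal sketch of the pattern behind this fix, assuming the body is built by appending to a string. `prepare_post_body` here is a simplified, hypothetical stand-in for `_prepare_post_body()`, not the real method.

```php
<?php
// Hypothetical, simplified stand-in for _prepare_post_body().
function prepare_post_body(array $formvars): string
{
    // Initialize first, so neither ".=" nor the return touches an undefined variable.
    $postdata = '';
    foreach ($formvars as $key => $val) {
        $postdata .= urlencode($key) . '=' . urlencode($val) . '&';
    }
    return $postdata;
}
```

With an empty `$formvars` the loop never runs, which is the case where an uninitialized `$postdata` would raise the notice.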
Fix for BUG 626849, submitted by Tajh Leitso. Corrects the redirect function under the submit functions. When submitting (POST), if a redirect is encountered, it is assumed to also be a POST. This fix allows for a redirect from a POST to lead to a page that should be pulled with a GET.
Fix for BUG 626849, submitted by Tajh Leitso. Removes quotes from an https url to prevent escaping from the curl command line and executing commands. Also added a temp_dir variable instead of hardcoding the temporary file with the curl command in it to /tmp.
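The commit itself strips quotes from the URL. For comparison, the sketch below uses a different technique, PHP's built-in `escapeshellarg()`, to quote each argument before it reaches a shell. `build_curl_command` and the curl path are assumptions made for this example, and the command is only built, never executed.

```php
<?php
// Hypothetical helper: builds (but does not run) a curl command line.
function build_curl_command(string $curl_path, string $url): string
{
    // escapeshellarg() wraps the value in quotes and escapes embedded quotes,
    // so the URL cannot terminate the argument and start a new command.
    return escapeshellarg($curl_path) . ' -s -D - ' . escapeshellarg($url);
}

// A URL crafted to break out of a double-quoted argument.
$cmd = build_curl_command('/usr/bin/curl', 'https://example.com/"; id; echo "');
```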
Fixes BUG 642958 and 912060. Checks to see if $URI_PARTS["query"] is set and, if it isn't, sets it to ''
Fixed bug 999079. This is caused by the preg_replace functions in the _expandlinks function. I tried to work out which regex was causing this but was unable to, so I just added an additional regex to remove the trailing slash from the URI before the page is concatenated onto the end of it. My new regex is: $match = preg_replace("|/$|","",$match); This isn't elegant but it works. If anyone wants to determine which of the 6 other regexes being run is causing the double slashes, I'd be happy to fix it and take out my additional line of code.
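To show what the added regex does in isolation, here is a short sketch. The sample URL and file name are made up for the example.

```php
<?php
$match = 'http://example.com/dir/';

// Strip a single trailing slash so the later concatenation cannot produce "//".
$match = preg_replace('|/$|', '', $match);

$link = $match . '/' . 'page.html';

// A URI without a trailing slash is left unchanged.
$unchanged = preg_replace('|/$|', '', 'http://example.com/dir');
```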
Fixed BUG 1014823
Meta redirect regex inaccurate. The original regex expected one or more whitespace characters between the semicolon and the URL in the HTTP refresh: ;[\s]+ This is not always the case. The new regex expects zero or more whitespace characters between the semicolon and the URL: ;[\s]*
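The difference between the two quantifiers can be seen with a pair of simplified patterns. These are illustrative only and are not the full regexes used in Snoopy; only the `;[\s]+` and `;[\s]*` fragments come from the commit message.

```php
<?php
// Simplified patterns for illustration.
$old = '/content="\d+;[\s]+URL=([^"]*)"/i';  // requires at least one whitespace character
$new = '/content="\d+;[\s]*URL=([^"]*)"/i';  // also accepts none

$tag = '<meta http-equiv="refresh" content="0;URL=http://example.com/next">';

$old_hit = preg_match($old, $tag);
$new_hit = preg_match($new, $tag, $m);
$target  = $new_hit ? $m[1] : null;
```

A tag written as `content="0;   URL=..."` matches both patterns, so the relaxed form only widens what is accepted.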
Fixed BUG # 1097134 $URI_PARTS["path"] can be undefined, generating a "notice" level error.
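This fix and the earlier `$URI_PARTS["query"]` one follow the same pattern: `parse_url()` omits keys for components that are absent from the URL. A minimal sketch, with `parse_uri_with_defaults` as a hypothetical wrapper and empty strings as the assumed defaults:

```php
<?php
// Hypothetical wrapper: fills in the keys parse_url() leaves out.
function parse_uri_with_defaults(string $uri): array
{
    // parse_url() returns false for seriously malformed URLs.
    $URI_PARTS = parse_url($uri) ?: [];
    if (!isset($URI_PARTS['path'])) {
        $URI_PARTS['path'] = '';
    }
    if (!isset($URI_PARTS['query'])) {
        $URI_PARTS['query'] = '';
    }
    return $URI_PARTS;
}

$bare = parse_uri_with_defaults('http://example.com');
$full = parse_uri_with_defaults('http://example.com/a?b=1');
```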
Bug fix of Bug # 864047 pertaining to root relative links and the _expandlinks function. Snoopy is treating root relative links as relative. When a page at domain.com/foo/bar/page1.htm has a link like /foo/bar/page2.htm, Snoopy returns the link to page 2 as domain.com/foo/bar/foo/bar/page2.htm instead of domain.com/foo/bar/page2.htm
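The intended behaviour can be sketched as follows. `expand_link` is a hypothetical, simplified resolver (it ignores ports, queries and `../` segments) and is not the `_expandlinks` implementation.

```php
<?php
// Hypothetical, simplified link resolver.
function expand_link(string $page_url, string $link): string
{
    $parts = parse_url($page_url);
    $root  = $parts['scheme'] . '://' . $parts['host'];

    // Already absolute: leave as is.
    if (preg_match('|^[a-z][a-z0-9+.-]*://|i', $link)) {
        return $link;
    }
    // Root relative: resolve against the host, not the current directory.
    if (strpos($link, '/') === 0) {
        return $root . $link;
    }
    // Plain relative: resolve against the directory of the current page.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $link;
}
```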
Fixed Bug # 1077870
Snoopy now allows a meta refresh tag to have any number of spaces between the semicolon following the refresh delay and the URL= value.
Bug Fix # 1086830
Added: if($this->lastredirectaddr) $URI = $this->lastredirectaddr; to the fetchlinks, submitlinks and submittext functions so that links are expanded properly after a redirect. Also modified the documentation at the beginning of the file to indicate which functions use expandlinks.
Security fix for potential arbitrary command exploit. The http headers in the https curl request weren't being checked for double quotes (the URI was, but not the headers). Here's the description of the exploit from SEC.

SEC-CONSULT Security Advisory < 2005xxxx-0 >
======================================================================
title: Snoopy Remote Code Execution Vulnerability
program: Snoopy PHP Webclient
vulnerable version: 1.2 and earlier
homepage: http://snoopy.sourceforge.net
found: 2005-10-10
by: D. Fabian / SEC-CONSULT / www.sec-consult.com
======================================================================

vendor description:
---------------
Snoopy is a PHP class that simulates a web browser. It automates the task of retrieving web page content and posting forms, for example. Snoopy is used by various RSS parser, which are in turn used in a whole bunch of applications like weblogs, content management systems, and many more.

vulnerability overview:
---------------
Whenever an SSL protected webpage is requested with one of the many Snoopy API calls, it calls the function _httpsrequest which takes the URL as argument. This function in turn calls the PHP-function exec with unchecked user-input. Using a specially crafted URL, an attacker can supply arbitrary commands that are executed on the web server with privileges of the web user. While the vulnerability can not be exploited using the Snoopy class file itself, there may exist implementations which hand unchecked URLs from users to snoopy.

proof of concept:
---------------
Consider the following code on a webserver:
The escaping done on the https headers had a bug (missing curly braces).
Thanks zaruba and Kellan
patch # 985470 : http 1.1 Host header not passing port information for http or https see : http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.23
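A minimal sketch of a Host header that carries the port, assuming the port is omitted only when it is the default for the scheme (which RFC 2616 section 14.23 permits). `host_header` is a hypothetical helper, not the code from the patch.

```php
<?php
// Hypothetical helper: builds the Host header line without the trailing CRLF.
function host_header(string $host, int $port, string $scheme): string
{
    $default = ($scheme === 'https') ? 443 : 80;
    // Append ":port" only for non-default ports.
    return 'Host: ' . $host . ($port !== $default ? ':' . $port : '');
}
```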
fixed bug # 1328793 : fetch is case sensitive when it comes to the scheme (http / https). Fixed a typo that I introduced in 1.2.2 (the first character of the file is a "z"). Updated the version variable in the code to reflect the new version.
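For the scheme fix, one way to make the comparison case insensitive is to lowercase the scheme before switching on it. `pick_transport` and its return labels are hypothetical names used only for this sketch.

```php
<?php
// Hypothetical dispatcher: normalizes the scheme before comparing.
function pick_transport(string $uri): string
{
    $parts = parse_url($uri);
    // parse_url() keeps the scheme's original case, so lowercase it first.
    switch (strtolower($parts['scheme'] ?? '')) {
        case 'http':
            return 'socket';
        case 'https':
            return 'curl';
        default:
            return 'unsupported';
    }
}
```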
Merge remote-tracking branch 'upstream/master' into hurrycaner
Conflicts: AUTHORS Snoopy.class.php