A mechanize-inspired screenscraping utility for Google App Engine.

Public Attributes:
url -- the current page url
response -- the raw response from Google's urlfetch
page -- a BeautifulSoup object for the current page
cookies -- a string representing the current cookies
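Since cookies is kept as a plain string, the following is a minimal sketch of how such a string could be maintained from Set-Cookie headers. It is not taken from SoupToNuts.py: merge_cookies is a hypothetical helper, and it uses the Python 3 standard library so it can run outside App Engine.

```python
from http.cookies import SimpleCookie


def merge_cookies(current, set_cookie_headers):
    """Fold Set-Cookie header values into a 'name=value; ...' string.

    Hypothetical helper for illustration; not part of SoupToNuts.
    """
    jar = SimpleCookie()
    if current:
        jar.load(current)  # parse the existing "a=1; b=2" string
    for header in set_cookie_headers:
        jar.load(header)  # a later cookie replaces an earlier one of the same name
    # Keep only name=value pairs; attributes such as Path are dropped.
    return "; ".join("%s=%s" % (name, morsel.value) for name, morsel in jar.items())
```

The resulting string has the shape expected by a Cookie request header, which is one plausible reason to store the cookies as a string.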

Example:
nuts = SoupToNuts('http://www.google.com')
print(nuts.page.prettify())


class SoupToNuts(__builtin__.object)
  __init__(self, url=None)
    Initializer

    Arguments:
    url -- an initial url to fetch

  fetch(self, url, data=None, raw=False)
    Fetch a url using Google's urlfetch library.  Does a GET by default
    and a POST if data is supplied.

    Arguments:
    url -- the url to visit; may be a partial (relative) url
    data -- the data to send (via POST)
    raw -- if True, skip parsing the incoming response with BeautifulSoup
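The GET-versus-POST rule described for fetch can be sketched as below. This is an assumption about the behaviour, not the code in SoupToNuts.py: build_request is a hypothetical name, and encoding a dict with urlencode is a guess at how POST data might be prepared.

```python
from urllib.parse import urlencode


def build_request(data=None):
    """Return (method, payload) following the rule 'GET unless data is supplied'.

    Hypothetical helper for illustration; not part of SoupToNuts.
    """
    if data is None:
        return "GET", None
    if isinstance(data, dict):
        data = urlencode(data)  # assumed: dicts are form-encoded before sending
    return "POST", data
```

For example, build_request() selects a GET with no payload, while build_request({'q': 'soup'}) selects a POST with a form-encoded body.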

  follow(self, link)
    Follow a link on the page.

    Arguments:
    link -- a BeautifulSoup link element from the page

    Example:
    nuts.follow(nuts.page.a)
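Following a link generally means resolving its href against the current page url before fetching it. The sketch below shows that step with the Python 3 standard library. resolve_link is a hypothetical helper, and a plain dict stands in for the BeautifulSoup link element (both support link['href']).

```python
from urllib.parse import urljoin


def resolve_link(current_url, link):
    """Resolve a link's href against the url of the page it came from.

    `link` is any mapping with an 'href' key. Hypothetical helper for
    illustration; not part of SoupToNuts.
    """
    return urljoin(current_url, link["href"])
```

An absolute href is returned unchanged, while a relative one is resolved against the directory of the current url.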

  submit(self, form, values)
    Submit a form on the page with the passed in values.

    Arguments:
    form -- a BeautifulSoup form element on the page
    values -- a dict of key-value pairs for the form items

    Example:
    nuts.submit(nuts.page.find('form', id='login_form'),
      {'email': 'steve@apple.com', 'password': 'i!<3flash'})
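A form submission usually combines the form's existing input values (hidden fields, for instance) with the values passed in. The sketch below illustrates that merge. It is an assumption about how submit might work, not the actual implementation: it swaps in the standard library's html.parser for BeautifulSoup so it runs on its own, and FormFields and build_submission are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class FormFields(HTMLParser):
    """Collect the form's action and the name/value pairs of its <input> tags."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.action = attrs.get("action")
        elif tag == "input" and attrs.get("name"):
            # Inputs without a value attribute default to an empty string.
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_submission(form_html, values):
    """Return (action, encoded_body), with `values` overriding form defaults."""
    parser = FormFields()
    parser.feed(form_html)
    fields = dict(parser.fields)
    fields.update(values)  # caller-supplied values win over page defaults
    return parser.action, urlencode(fields)
```

The returned action could then be resolved against the current url, and the encoded body sent as POST data.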