
A Python library for automating interaction with websites.


crowd42/MechanicalSoup

 
 


MechanicalSoup

Home page

https://mechanicalsoup.readthedocs.io/en/latest/

Overview

A Python library for automating interaction with websites. MechanicalSoup automatically stores and sends cookies, follows redirects, and can follow links and submit forms. It does not execute JavaScript.

MechanicalSoup was created by M Hickford, who was a fond user of the Mechanize library. Unfortunately, Mechanize is incompatible with Python 3 and its development stalled for several years. MechanicalSoup provides a similar API, built on Python giants Requests (for HTTP sessions) and BeautifulSoup (for document navigation). Since 2017 it has been actively maintained by a small team including @hemberger and @moy.


Installation


PyPy and PyPy3 are also supported (and tested against).

Download and install the latest released version from PyPI:

pip install MechanicalSoup

Download and install the development version from GitHub:

pip install git+https://github.com/MechanicalSoup/MechanicalSoup

Install from source (installs the version in the current working directory):

python setup.py install

(In all cases, add --user to the install command to install in the current user's home directory.)

Documentation

The full documentation is available at https://mechanicalsoup.readthedocs.io/. You may want to jump directly to the automatically generated API documentation.

Example

From examples/expl_duck_duck_go.py, code to get the results from a DuckDuckGo search:

"""Example usage of MechanicalSoup to get the results from
DuckDuckGo."""

import mechanicalsoup

# Connect to DuckDuckGo
browser = mechanicalsoup.StatefulBrowser()
browser.open("https://duckduckgo.com/")

# Fill in the search form
browser.select_form('#search_form_homepage')
browser["q"] = "MechanicalSoup"
browser.submit_selected()

# Display the results
for link in browser.get_current_page().select('a.result__a'):
    print(link.text, '->', link.attrs['href'])

More examples are available in examples/.

For an example with a more complex form (checkboxes, radio buttons and textareas), read tests/test_browser.py and tests/test_form.py.

Development


For instructions on building, testing, and contributing to MechanicalSoup, see CONTRIBUTING.rst.

Common problems

Read the FAQ.

See also
