This repository has been archived by the owner on Dec 28, 2020. It is now read-only.

Commit

Updated documentation for new tricks
palewire committed Jul 14, 2014
1 parent 0e6f269 commit baba335
Showing 3 changed files with 23 additions and 6 deletions.
16 changes: 16 additions & 0 deletions docs/analysis.rst
@@ -21,6 +21,14 @@ An URL's archived HTML with tools for analysis.
 The HTML archived

+.. py:attribute:: gzip
+Returns the archived HTML as a stream of gzipped data
+
+.. py:attribute:: archive_filename
+Returns a file name for this archive using the conventions of :py:func:`storytracker.create_archive_filename`.
+
 .. py:attribute:: soup
 The archived HTML passed into a `BeautifulSoup <http://www.crummy.com/software/BeautifulSoup/bs4/doc/#>`_ parser
@@ -29,6 +37,14 @@ An URL's archived HTML with tools for analysis.
 A list of all the hyperlinks extracted from the HTML

+.. py:method:: write_gzip_to_directory(path)
+Writes gzipped HTML data to a file in the provided directory path
+
+.. py:method:: write_html_to_directory(path)
+Writes HTML data to a file in the provided directory path
+
 Example usage:

 .. code-block:: python
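As a rough illustration of what the new `gzip` attribute and the two `write_*_to_directory` methods documented in this hunk might do, here is a minimal sketch using only the Python 3 standard library. `ArchiveSketch`, its constructor arguments, and the `.gz` and `.html` suffixes are assumptions made for this example; it is not storytracker's actual implementation.

```python
import gzip as gzip_module
import io
import os


class ArchiveSketch(object):
    """Hypothetical stand-in for ArchivedURL (not the library's code)."""

    def __init__(self, html, archive_filename):
        self.html = html
        # Assumed to be produced elsewhere, e.g. by a naming helper.
        self.archive_filename = archive_filename

    @property
    def gzip(self):
        """Return the HTML as gzipped bytes."""
        buf = io.BytesIO()
        with gzip_module.GzipFile(fileobj=buf, mode="wb") as f:
            f.write(self.html.encode("utf-8"))
        return buf.getvalue()

    def write_gzip_to_directory(self, path):
        """Write gzipped HTML into the directory and return the file path."""
        target = os.path.join(path, self.archive_filename + ".gz")
        with open(target, "wb") as f:
            f.write(self.gzip)
        return target

    def write_html_to_directory(self, path):
        """Write plain HTML into the directory and return the file path."""
        target = os.path.join(path, self.archive_filename + ".html")
        with io.open(target, "w", encoding="utf-8") as f:
            f.write(self.html)
        return target
```

Keeping the file name on the object means both writers only need a directory, which mirrors the single `path` argument in the documented method signatures.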
12 changes: 6 additions & 6 deletions docs/archiving.rst
@@ -14,11 +14,11 @@ Archive the HTML from the provided URLs
 :param bool verify: Verify that HTML is in the response's content-type header
 :param bool minify: Minify the HTML response to reduce its size
 :param bool extend_urls: Extend relative URLs discovered in the HTML response to be absolute
-:param bool compress: Compress the HTML response using gzip
+:param bool compress: Compress the HTML response using gzip if an ``output_dir`` is provided
 :param output_dir: Provide a directory for the archived data to be stored
 :type output_dir: str or None
-:return: The content of the HTML response, unless an output directory is provided when it will return the path to the created file
-:rtype: ``str``
+:return: An :py:class:`ArchivedURL` object
+:rtype: :py:class:`ArchivedURL`
 :raises ValueError: If the response is not verified as HTML

 Example usage:
@@ -28,13 +28,13 @@ Example usage:
 >>> import storytracker
 >>> # This will return gzipped content of the page to the variable
->>> data = storytracker.archive("http://www.latimes.com")
+>>> obj = storytracker.archive("http://www.latimes.com")
 >>> # You can save it to an automatically named file in a directory you provide
->>> path = storytracker.archive("http://www.latimes.com", output_dir="./")
+>>> obj = storytracker.archive("http://www.latimes.com", output_dir="./")
 >>> # If you'd prefer to have the HTML without compression
->>> data = storytracker.archive("http://www.latimes.com", compress=False)
+>>> obj = storytracker.archive("http://www.latimes.com", compress=False)
 Command-line interface
 ~~~~~~~~~~~~~~~~~~~~~~
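The `verify` and `extend_urls` parameters documented in the docs/archiving.rst changes can be pictured with a small standard-library sketch. The helper names `verify_html` and `extend_url` are hypothetical and exist only for this illustration; how storytracker actually performs these steps is not shown in this commit.

```python
from urllib.parse import urljoin


def verify_html(content_type):
    """Raise ValueError unless the content-type header mentions HTML."""
    if "html" not in (content_type or "").lower():
        raise ValueError("Response does not appear to be HTML")
    return True


def extend_url(base_url, href):
    """Resolve a possibly relative link against the page URL."""
    return urljoin(base_url, href)
```

For example, `extend_url("http://www.latimes.com/news/", "/local/")` resolves to `http://www.latimes.com/local/`, while a header such as `application/json` would trigger the `ValueError` described in the `:raises:` field.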
1 change: 1 addition & 0 deletions docs/filehandling.rst
@@ -56,6 +56,7 @@ Accepts a file path and returns an ``ArchivedURL`` object
 :param str path: The path to the archived file. Its file name must conform to the conventions of :py:func:`storytracker.create_archive_filename`.
 :return: An :py:class:`ArchivedURL` object
 :rtype: :py:class:`ArchivedURL`
+:raises ArchiveFileNameError: If the file's name cannot be parsed using the conventions of :py:func:`storytracker.create_archive_filename`.

 Example usage:

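The new `:raises ArchiveFileNameError:` entry describes a parse-or-fail contract on the file name. The sketch below shows that pattern under a made-up convention of `<label>@<ISO timestamp>.<extension>`; the real convention used by `storytracker.create_archive_filename` is not shown in this commit, and both `parse_archive_filename` and the exception class defined here are hypothetical stand-ins.

```python
import os
from datetime import datetime


class ArchiveFileNameError(Exception):
    """Stand-in for the library's exception of the same name."""


def parse_archive_filename(path):
    """Split a file name of the assumed form label@timestamp.ext."""
    stem, _ext = os.path.splitext(os.path.basename(path))
    label, sep, stamp = stem.rpartition("@")
    if not sep or not label:
        raise ArchiveFileNameError("No '@' separator in %r" % stem)
    try:
        # Assumed timestamp layout for this example only.
        timestamp = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        raise ArchiveFileNameError("Unparseable timestamp %r" % stamp)
    return label, timestamp
```

Raising a dedicated exception, rather than letting a generic `ValueError` escape, lets callers distinguish a badly named file from other failures when loading an archive.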