Adding support for rebuilding the planet on a new system.

commit ac80f2c2095d69bf48e25920a901cc89a108c0f0 1 parent 1512eae
@pedersen pedersen authored
Showing with 11,149 additions and 0 deletions.
  1. +2 −0  planet/AUTHORS
  2. +151 −0 planet/INSTALL
  3. +84 −0 planet/LICENCE
  4. +4 −0 planet/NEWS
  5. +10 −0 planet/PKG-INFO
  6. +12 −0 planet/README
  7. +18 −0 planet/THANKS
  8. +22 −0 planet/TODO
  9. +61 −0 planet/examples/atom.xml.tmpl
  10. +88 −0 planet/examples/basic/config.ini
  11. +88 −0 planet/examples/basic/index.html.tmpl
  12. 0  planet/examples/cache/.empty
  13. +106 −0 planet/examples/fancy/config.ini
  14. +125 −0 planet/examples/fancy/index.html.tmpl
  15. +31 −0 planet/examples/foafroll.xml.tmpl
  16. +15 −0 planet/examples/opml.xml.tmpl
  17. BIN  planet/examples/output/images/edd.png
  18. BIN  planet/examples/output/images/evolution.png
  19. BIN  planet/examples/output/images/feed-icon-10x10.png
  20. BIN  planet/examples/output/images/jdub.png
  21. BIN  planet/examples/output/images/keybuk.png
  22. BIN  planet/examples/output/images/logo.png
  23. BIN  planet/examples/output/images/opml.png
  24. BIN  planet/examples/output/images/planet.png
  25. BIN  planet/examples/output/images/thom.png
  26. +146 −0 planet/examples/output/planet.css
  27. +37 −0 planet/examples/rss10.xml.tmpl
  28. +30 −0 planet/examples/rss20.xml.tmpl
  29. +194 −0 planet/planet-cache.py
  30. +168 −0 planet/planet.py
  31. +953 −0 planet/planet/__init__.py
  32. +124 −0 planet/planet/atomstyler.py
  33. +306 −0 planet/planet/cache.py
  34. +1,196 −0 planet/planet/compat_logging/__init__.py
  35. +299 −0 planet/planet/compat_logging/config.py
  36. +728 −0 planet/planet/compat_logging/handlers.py
  37. +2,931 −0 planet/planet/feedparser.py
  38. +1,480 −0 planet/planet/htmltmpl.py
  39. +354 −0 planet/planet/sanitize.py
  40. 0  planet/planet/tests/__init__.py
  41. +4 −0 planet/planet/tests/data/simple.tmpl
  42. +4 −0 planet/planet/tests/data/simple2.tmpl
  43. +38 −0 planet/planet/tests/test_channel.py
  44. +71 −0 planet/planet/tests/test_main.py
  45. +125 −0 planet/planet/tests/test_sanitize.py
  46. +79 −0 planet/planet/tests/test_sub.py
  47. +424 −0 planet/planet/timeoutsocket.py
  48. +11 −0 planet/runtests.py
  49. +22 −0 planet/setup.py
  50. 0  planet/static/.hidden
  51. +61 −0 planet/tgplanet/atom.xml.tmpl
  52. 0  planet/tgplanet/cache/.hidden
  53. +149 −0 planet/tgplanet/config.ini
  54. +31 −0 planet/tgplanet/foafroll.xml.tmpl
  55. BIN  planet/tgplanet/images/edd.png
  56. BIN  planet/tgplanet/images/evolution.png
  57. BIN  planet/tgplanet/images/feed-icon-10x10.png
  58. BIN  planet/tgplanet/images/jdub.png
  59. BIN  planet/tgplanet/images/keybuk.png
  60. BIN  planet/tgplanet/images/logo.png
  61. BIN  planet/tgplanet/images/opml.png
  62. BIN  planet/tgplanet/images/planet.png
  63. BIN  planet/tgplanet/images/thom.png
  64. +125 −0 planet/tgplanet/index.html.tmpl
  65. +15 −0 planet/tgplanet/opml.xml.tmpl
  66. +146 −0 planet/tgplanet/planet.css
  67. +37 −0 planet/tgplanet/rss10.xml.tmpl
  68. +30 −0 planet/tgplanet/rss20.xml.tmpl
  69. +14 −0 planet/updplanet
2  planet/AUTHORS
@@ -0,0 +1,2 @@
+Scott James Remnant <scott@netsplit.com>
+Jeff Waugh <jdub@perkypants.org>
151 planet/INSTALL
@@ -0,0 +1,151 @@
+Installing Planet
+-----------------
+
+You'll need at least Python 2.1 installed on your system; we recommend
+Python 2.3, though, as there may be bugs in the earlier libraries.
+
+Everything Pythonesque Planet needs should be included in the
+distribution.
+
+ i.
+ First you'll need to extract the files into a folder somewhere.
+ I expect you've already done this, after all, you're reading this
+ file. You can place this wherever you like, ~/planet is a good
+ choice, but so's anywhere else you prefer.
+
+ ii.
+ Make a copy of the files in the 'examples' subdirectory, together
+ with those in either its 'basic' or 'fancy' subdirectory, and put
+ them wherever you like; I like to use the Planet's name (so
+ ~/planet/debian), but it's really up to you.
+
+ The 'basic' index.html and associated config.ini are pretty plain
+ and boring, if you're after less documentation and more instant
+ gratification you may wish to use the 'fancy' ones instead. You'll
+ want the stylesheet and images from the 'output' directory if you
+ use it.
+
+ iii.
+ Edit the config.ini file in this directory to taste, it's pretty
+ well documented so you shouldn't have any problems here. Pay
+ particular attention to the 'output_dir' option, which should be
+ readable by your web server and especially the 'template_files'
+ option where you'll want to change "examples" to wherever you just
+ placed your copies.
+
+ iv.
+ Edit the various template (*.tmpl) files to taste, a complete list
+ of available variables is at the bottom of this file.
+
+ v.
+ Run it: planet.py pathto/config.ini
+
+ You'll want to add this to cron, make sure you run it from the
+ right directory.
+
+ vi.
+ Tell us about it! We'd love to link to you on planetplanet.org :-)
+
+
+Template files
+--------------
+
+The template files used are given as a space separated list in the
+'template_files' option in config.ini. They are named ending in '.tmpl'
+which is removed to form the name of the file placed in the output
+directory.
+
+Reading through the example templates is recommended, they're designed to
+pretty much drop straight into your site with little modification
+anyway.
+
+Inside these template files, <TMPL_VAR xxx> is replaced with the content
+of the 'xxx' variable. The variables available are:
+
+ name ......... } the values of the equivalent options
+ link ......... } from the [Planet] section of your
+ owner_name ... } Planet's config.ini file
+ owner_email .. }
+
+ url .... link with the output filename appended
+ generator .. version of planet being used
+
+ date ....... current date and time in your date format
+ date_iso ... current date and time in ISO date format
+ date_822 ... current date and time in RFC822 date format
+
+
+There are also two loops, 'Items' and 'Channels'. All of the lines of
+the template and variable substitutions are available for each item or
+channel. Loops are created using <TMPL_LOOP LoopName>...</TMPL_LOOP>
+and may be used as many times as you wish.
+
+The 'Channels' loop iterates all of the channels (feeds) defined in the
+configuration file, within it the following variables are available:
+
+ name .... value of the 'name' option in config.ini, or title
+ title .... title retrieved from the channel's feed
+ tagline .... description retrieved from the channel's feed
+ link .... link for the human-readable content (from the feed)
+ url .... url of the channel's feed itself
+
+ Additionally the value of any other option specified in config.ini
+ for the feed, or in the [DEFAULT] section, is available as a
+ variable of the same name.
+
+ Depending on the feed, a huge variety of other variables
+ may be available; the best way to find out what you have
+ is to use the 'planet-cache' tool to examine your cache files.
+
+The 'Items' loop iterates all of the blog entries from all of the channels,
+you do not place it inside a 'Channels' loop. Within it, the following
+variables are available:
+
+ id .... unique id for this entry (sometimes just the link)
+ link .... link to a human-readable version at the origin site
+
+ title .... title of the entry
+ summary .... a short "first page" summary
+ content .... the full content of the entry
+
+ date ....... date and time of the entry in your date format
+ date_iso ... date and time of the entry in ISO date format
+ date_822 ... date and time of the entry in RFC822 date format
+
+ If the entry falls on a date on which no prior entry has
+ taken place, the 'new_date' variable is set to that date.
+ This allows you to break up the page by day.
+
+ If the entry is from a different channel to the previous entry,
+ or is the first entry from this channel on this day,
+ the 'new_channel' variable is set to the same value as the
+ 'channel_url' variable. This allows you to collate multiple
+ entries from the same person under the same banner.
+
+ Additionally the value of any variable that would be defined
+ for the channel is available, with 'channel_' prepended to the
+ name (e.g. 'channel_name' and 'channel_link').
+
+ Depending on the feed, a huge variety of other variables
+ may be available; the best way to find out what you have
+ is to use the 'planet-cache' tool to examine your cache files.
+
+
+There are also a couple of other special things you can do in a template.
+
+ - If you want HTML escaping applied to the value of a variable, use the
+ <TMPL_VAR xxx ESCAPE="HTML"> form.
+
+ - If you want URI escaping applied to the value of a variable, use the
+ <TMPL_VAR xxx ESCAPE="URI"> form.
+
+ - To only include a section of the template if the variable has a
+ non-empty value, you can use <TMPL_IF xxx>....</TMPL_IF>. e.g.
+
+ <TMPL_IF new_date>
+ <h1><TMPL_VAR new_date></h1>
+ </TMPL_IF>
+
+ You may place a <TMPL_ELSE> within this block to specify an
+ alternative, or may use <TMPL_UNLESS xxx>...</TMPL_UNLESS> to
+ perform the opposite.
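
The 'new_date' and 'new_channel' rules described in INSTALL can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the code Planet itself runs: the function name `mark_boundaries`, the dictionary layout, and the assumption that each item already carries a formatted day string under 'date' are all hypothetical.

```python
def mark_boundaries(items):
    """Return copies of items annotated with new_date / new_channel.

    Assumes (hypothetically) that items are already in display order and
    that each one is a dict with a 'date' day string and a 'channel_url'.
    """
    annotated = []
    prev_day = None
    prev_channel = None
    for item in items:
        row = dict(item)
        day = item["date"]
        if day != prev_day:
            # First entry seen for this day: expose the date to the template.
            row["new_date"] = day
            # A new day also restarts the channel grouping.
            prev_channel = None
        if item["channel_url"] != prev_channel:
            # Different channel from the previous entry, or first of the day.
            row["new_channel"] = item["channel_url"]
        prev_day = day
        prev_channel = item["channel_url"]
        annotated.append(row)
    return annotated
```

A template can then test these flags with <TMPL_IF new_date> and <TMPL_IF new_channel> to print a day heading or an author banner only where the grouping changes.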
84 planet/LICENCE
@@ -0,0 +1,84 @@
+Planet is released under the same licence as Python, here it is:
+
+
+A. HISTORY OF THE SOFTWARE
+==========================
+
+Python was created in the early 1990s by Guido van Rossum at Stichting Mathematisch Centrum (CWI) in the Netherlands as a successor of a language called ABC. Guido is Python's principal author, although it includes many contributions from others. The last version released from CWI was Python 1.2. In 1995, Guido continued his work on Python at the Corporation for National Research Initiatives (CNRI) in Reston, Virginia where he released several versions of the software. Python 1.6 was the last of the versions released by CNRI. In 2000, Guido and the Python core development team moved to BeOpen.com to form the BeOpen PythonLabs team. Python 2.0 was the first and only release from BeOpen.com.
+
+Following the release of Python 1.6, and after Guido van Rossum left CNRI to work with commercial software developers, it became clear that the ability to use Python with software available under the GNU Public License (GPL) was very desirable. CNRI and the Free Software Foundation (FSF) interacted to develop enabling wording changes to the Python license. Python 1.6.1 is essentially the same as Python 1.6, with a few minor bug fixes, and with a different license that enables later versions to be GPL-compatible. Python 2.1 is a derivative work of Python 1.6.1, as well as of Python 2.0.
+
+After Python 2.0 was released by BeOpen.com, Guido van Rossum and the other PythonLabs developers joined Digital Creations. All intellectual property added from this point on, starting with Python 2.1 and its alpha and beta releases, is owned by the Python Software Foundation (PSF), a non-profit modeled after the Apache Software Foundation. See http://www.python.org/psf/ for more information about the PSF.
+
+Thanks to the many outside volunteers who have worked under Guido's direction to make these releases possible.
+
+B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON
+===============================================================
+
+PSF LICENSE AGREEMENT
+---------------------
+
+1. This LICENSE AGREEMENT is between the Python Software Foundation ("PSF"), and the Individual or Organization ("Licensee") accessing and otherwise using Python 2.1.1 software in source or binary form and its associated documentation.
+
+2. Subject to the terms and conditions of this License Agreement, PSF hereby grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, analyze, test, perform and/or display publicly, prepare derivative works, distribute, and otherwise use Python 2.1.1 alone or in any derivative version, provided, however, that PSF's License Agreement and PSF's notice of copyright, i.e., "Copyright (c) 2001 Python Software Foundation; All Rights Reserved" are retained in Python 2.1.1 alone or in any derivative version prepared by Licensee.
+
+3. In the event Licensee prepares a derivative work that is based on or incorporates Python 2.1.1 or any part thereof, and wants to make the derivative work available to others as provided herein, then Licensee hereby agrees to include in any such work a brief summary of the changes made to Python 2.1.1.
+
+4. PSF is making Python 2.1.1 available to Licensee on an "AS IS" basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 2.1.1 WILL NOT INFRINGE ANY THIRD PARTY RIGHTS.
+
+5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON 2.1.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 2.1.1, OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
+
+6. This License Agreement will automatically terminate upon a material breach of its terms and conditions.
+
+7. Nothing in this License Agreement shall be deemed to create any relationship of agency, partnership, or joint venture between PSF and Licensee. This License Agreement does not grant permission to use PSF trademarks or trade name in a trademark sense to endorse or promote products or services of Licensee, or any third party.
+
+8. By copying, installing or otherwise using Python 2.1.1, Licensee agrees to be bound by the terms and conditions of this License Agreement.
+
+BEOPEN.COM TERMS AND CONDITIONS FOR PYTHON 2.0
+----------------------------------------------
+
+BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1
+
+1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the Individual or Organization ("Licensee") accessing and otherwise using this software in source or binary form and its associated documentation ("the Software").
+
+2. Subject to the terms and conditions of this BeOpen Python License Agreement, BeOpen hereby grants Licensee a non-exclusive, royalty-free, world-wide license to reproduce, analyze, test, perform and/or display publicly, prepare derivative works, distribute, and otherwise use the Software alone or in any derivative version, provided, however, that the BeOpen Python License is retained in the Software, alone or in any derivative version prepared by Licensee.
+
+3. BeOpen is making the Software available to Licensee on an "AS IS" basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT INFRINGE ANY THIRD PARTY RIGHTS.
+
+4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
+
+5. This License Agreement will automatically terminate upon a material breach of its terms and conditions.
+
+6. This License Agreement shall be governed by and interpreted in all respects by the law of the State of California, excluding conflict of law provisions. Nothing in this License Agreement shall be deemed to create any relationship of agency, partnership, or joint venture between BeOpen and Licensee. This License Agreement does not grant permission to use BeOpen trademarks or trade names in a trademark sense to endorse or promote products or services of Licensee, or any third party. As an exception, the "BeOpen Python" logos available at http://www.pythonlabs.com/logos.html may be used according to the permissions granted on that web page.
+
+7. By copying, installing or otherwise using the software, Licensee agrees to be bound by the terms and conditions of this License Agreement.
+
+CNRI OPEN SOURCE GPL-COMPATIBLE LICENSE AGREEMENT
+-------------------------------------------------
+
+1. This LICENSE AGREEMENT is between the Corporation for National Research Initiatives, having an office at 1895 Preston White Drive, Reston, VA 20191 ("CNRI"), and the Individual or Organization ("Licensee") accessing and otherwise using Python 1.6.1 software in source or binary form and its associated documentation.
+
+2. Subject to the terms and conditions of this License Agreement, CNRI hereby grants Licensee a nonexclusive, royalty-free, world-wide license to reproduce, analyze, test, perform and/or display publicly, prepare derivative works, distribute, and otherwise use Python 1.6.1 alone or in any derivative version, provided, however, that CNRI's License Agreement and CNRI's notice of copyright, i.e., "Copyright (c) 1995-2001 Corporation for National Research Initiatives; All Rights Reserved" are retained in Python 1.6.1 alone or in any derivative version prepared by Licensee. Alternately, in lieu of CNRI's License Agreement, Licensee may substitute the following text (omitting the quotes): "Python 1.6.1 is made available subject to the terms and conditions in CNRI's License Agreement. This Agreement together with Python 1.6.1 may be located on the Internet using the following unique, persistent identifier (known as a handle): 1895.22/1013. This Agreement may also be obtained from a proxy server on the Internet using the following URL: http://hdl.handle.net/1895.22/1013".
+
+3. In the event Licensee prepares a derivative work that is based on or incorporates Python 1.6.1 or any part thereof, and wants to make the derivative work available to others as provided herein, then Licensee hereby agrees to include in any such work a brief summary of the changes made to Python 1.6.1.
+
+4. CNRI is making Python 1.6.1 available to Licensee on an "AS IS" basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT INFRINGE ANY THIRD PARTY RIGHTS.
+
+5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON 1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1, OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
+
+6. This License Agreement will automatically terminate upon a material breach of its terms and conditions.
+
+7. This License Agreement shall be governed by the federal intellectual property law of the United States, including without limitation the federal copyright law, and, to the extent such U.S. federal law does not apply, by the law of the Commonwealth of Virginia, excluding Virginia's conflict of law provisions. Notwithstanding the foregoing, with regard to derivative works based on Python 1.6.1 that incorporate non-separable material that was previously distributed under the GNU General Public License (GPL), the law of the Commonwealth of Virginia shall govern this License Agreement only as to issues arising under or with respect to Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this License Agreement shall be deemed to create any relationship of agency, partnership, or joint venture between CNRI and Licensee. This License Agreement does not grant permission to use CNRI trademarks or trade name in a trademark sense to endorse or promote products or services of Licensee, or any third party.
+
+8. By clicking on the "ACCEPT" button where indicated, or by copying, installing or otherwise using Python 1.6.1, Licensee agrees to be bound by the terms and conditions of this License Agreement.
+
+ ACCEPT
+
+CWI PERMISSIONS STATEMENT AND DISCLAIMER
+----------------------------------------
+
+Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam, The Netherlands. All rights reserved.
+
+Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both that copyright notice and this permission notice appear in supporting documentation, and that the name of Stichting Mathematisch Centrum or CWI not be used in advertising or publicity pertaining to distribution of the software without specific, written prior permission.
+
+STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
4 planet/NEWS
@@ -0,0 +1,4 @@
+Planet 1.0
+----------
+
+ * First release!
10 planet/PKG-INFO
@@ -0,0 +1,10 @@
+Metadata-Version: 1.0
+Name: planet
+Version: 2.0
+Summary: The Planet Feed Aggregator
+Home-page: http://www.planetplanet.org/
+Author: Planet Developers
+Author-email: devel@lists.planetplanet.org
+License: Python
+Description: UNKNOWN
+Platform: UNKNOWN
12 planet/README
@@ -0,0 +1,12 @@
+Planet
+------
+
+Planet is a flexible feed aggregator. It downloads news feeds published by
+web sites and aggregates their content together into a single combined feed,
+latest news first.
+
+It uses Mark Pilgrim's Universal Feed Parser to read from RDF, RSS and Atom
+feeds; and Tomas Styblo's templating engine to output static files in any
+format you can dream up.
+
+Keywords: feed, blog, aggregator, RSS, RDF, Atom, OPML, Python
18 planet/THANKS
@@ -0,0 +1,18 @@
+Patches and Bug Fixes
+---------------------
+
+Chris Dolan - fixes, exclude filtering, duplicate culling
+David Edmondson - filtering
+Lucas Nussbaum - locale configuration
+David Pashley - cache code profiling and recursion fixing
+Gediminas Paulauskas - days per page
+
+
+Spycyroll Maintainers
+---------------------
+
+Vattekkat Satheesh Babu
+Richard Jones
+Garth Kidd
+Eliot Landrum
+Bryan Richard
22 planet/TODO
@@ -0,0 +1,22 @@
+TODO
+====
+
+ * Expire feed history
+
+ The feed cache doesn't currently expire old entries, so could get
+ large quite rapidly. We should probably have a config setting for
+ the cache expiry, the trouble is some channels might need a longer
+ or shorter one than others.
+
+ * Allow display normalisation to specified timezone
+
+ Some Planet admins would like their feed to be displayed in the local
+ timezone, instead of UTC.
+
+ * Support OPML and foaf subscriptions
+
+ This might be a bit invasive, but I want to be able to subscribe to OPML
+ and FOAF files, and see each feed as if it were subscribed individually.
+ Perhaps we can do this with a two-pass configuration scheme, first to pull
+ the static configs, second to go fetch and generate the dynamic configs.
+ The more I think about it, the less invasive it sounds. Hmm.
61 planet/examples/atom.xml.tmpl
@@ -0,0 +1,61 @@
+<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
+<feed xmlns="http://www.w3.org/2005/Atom">
+
+ <title><TMPL_VAR name></title>
+ <link rel="self" href="<TMPL_VAR feed ESCAPE="HTML">"/>
+ <link href="<TMPL_VAR link ESCAPE="HTML">"/>
+ <id><TMPL_VAR feed ESCAPE="HTML"></id>
+ <updated><TMPL_VAR date_iso></updated>
+ <generator uri="http://www.planetplanet.org/"><TMPL_VAR generator ESCAPE="HTML"></generator>
+
+<TMPL_LOOP Items>
+ <entry<TMPL_IF channel_language> xml:lang="<TMPL_VAR channel_language>"</TMPL_IF>>
+ <title type="html"<TMPL_IF title_language> xml:lang="<TMPL_VAR title_language>"</TMPL_IF>><TMPL_VAR title ESCAPE="HTML"></title>
+ <link href="<TMPL_VAR link ESCAPE="HTML">"/>
+ <id><TMPL_VAR id ESCAPE="HTML"></id>
+ <updated><TMPL_VAR date_iso></updated>
+ <content type="html"<TMPL_IF content_language> xml:lang="<TMPL_VAR content_language>"</TMPL_IF>><TMPL_VAR content ESCAPE="HTML"></content>
+ <author>
+<TMPL_IF author_name>
+ <name><TMPL_VAR author_name ESCAPE="HTML"></name>
+<TMPL_IF author_email>
+ <email><TMPL_VAR author_email ESCAPE="HTML"></email>
+</TMPL_IF author_email>
+<TMPL_ELSE>
+<TMPL_IF channel_author_name>
+ <name><TMPL_VAR channel_author_name ESCAPE="HTML"></name>
+<TMPL_IF channel_author_email>
+ <email><TMPL_VAR channel_author_email ESCAPE="HTML"></email>
+</TMPL_IF channel_author_email>
+<TMPL_ELSE>
+ <name><TMPL_VAR channel_name ESCAPE="HTML"></name>
+</TMPL_IF>
+</TMPL_IF>
+ <uri><TMPL_VAR channel_link ESCAPE="HTML"></uri>
+ </author>
+ <source>
+<TMPL_IF channel_title>
+ <title type="html"><TMPL_VAR channel_title ESCAPE="HTML"></title>
+<TMPL_ELSE>
+ <title type="html"><TMPL_VAR channel_name ESCAPE="HTML"></title>
+</TMPL_IF>
+<TMPL_IF channel_subtitle>
+ <subtitle type="html"><TMPL_VAR channel_subtitle ESCAPE="HTML"></subtitle>
+</TMPL_IF>
+ <link rel="self" href="<TMPL_VAR channel_url ESCAPE="HTML">"/>
+<TMPL_IF channel_id>
+ <id><TMPL_VAR channel_id ESCAPE="HTML"></id>
+<TMPL_ELSE>
+ <id><TMPL_VAR channel_url ESCAPE="HTML"></id>
+</TMPL_IF>
+<TMPL_IF channel_updated_iso>
+ <updated><TMPL_VAR channel_updated_iso></updated>
+</TMPL_IF>
+<TMPL_IF channel_rights>
+ <rights type="html"><TMPL_VAR channel_rights ESCAPE="HTML"></rights>
+</TMPL_IF>
+ </source>
+ </entry>
+
+</TMPL_LOOP>
+</feed>
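
The nested <TMPL_IF>/<TMPL_ELSE> blocks inside <author> in atom.xml.tmpl implement a three-step fallback: the entry's own author, then the channel's author, then the channel name. The same decision can be written out as plain Python, which may be easier to follow than the template nesting. The function name `pick_author` and the dict-based item are assumptions made for this sketch; the variable names are the ones used in the template.

```python
def pick_author(item):
    """Return (name, email) following the fallback order of atom.xml.tmpl.

    'item' is assumed to be a dict of template variables for one entry;
    missing or empty values are treated as absent, as <TMPL_IF> does.
    """
    if item.get("author_name"):
        return item["author_name"], item.get("author_email") or None
    if item.get("channel_author_name"):
        return (item["channel_author_name"],
                item.get("channel_author_email") or None)
    # Last resort: the channel's configured name, with no e-mail address.
    return item.get("channel_name"), None
```

Note that, as in the template, an e-mail address is only emitted alongside the name from the same level; an entry with 'author_name' but no 'author_email' does not borrow 'channel_author_email'.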
88 planet/examples/basic/config.ini
@@ -0,0 +1,88 @@
+# Planet configuration file
+
+# Every planet needs a [Planet] section
+[Planet]
+# name: Your planet's name
+# link: Link to the main page
+# owner_name: Your name
+# owner_email: Your e-mail address
+name = Planet Zog
+link = http://www.planet.zog/
+owner_name = Zig The Alien
+owner_email = zig@planet.zog
+
+# cache_directory: Where cached feeds are stored
+# new_feed_items: Number of items to take from new feeds
+# log_level: One of DEBUG, INFO, WARNING, ERROR or CRITICAL
+cache_directory = examples/cache
+new_feed_items = 2
+log_level = DEBUG
+
+# template_files: Space-separated list of output template files
+template_files = examples/basic/index.html.tmpl examples/atom.xml.tmpl examples/rss20.xml.tmpl examples/rss10.xml.tmpl examples/opml.xml.tmpl examples/foafroll.xml.tmpl
+
+# The following provide defaults for each template:
+# output_dir: Directory to place output files
+# items_per_page: How many items to put on each page
+# days_per_page: How many complete days of posts to put on each page
+# This is the absolute, hard limit (over the item limit)
+# date_format: strftime format for the default 'date' template variable
+# new_date_format: strftime format for the 'new_date' template variable
+# encoding: output encoding for the file, Python 2.3+ users can use the
+# special "xml" value to output ASCII with XML character references
+# locale: locale to use for (e.g.) strings in dates, default is taken from your
+# system. You can specify more locales separated by ':', planet will
+# use the first available one
+output_dir = examples/output
+items_per_page = 60
+days_per_page = 0
+date_format = %B %d, %Y %I:%M %p
+new_date_format = %B %d, %Y
+encoding = utf-8
+# locale = C
+
+
+# To define a different value for a particular template you may create
+# a section with the same name as the template file's filename (as given
+# in template_files).
+#
+# [examples/rss10.xml.tmpl]
+# items_per_page = 30
+# encoding = xml
+
+
+# Any other section defines a feed to subscribe to. The section title
+# (in the []s) is the URI of the feed itself. A section can also
+# have any of the following options:
+#
+# name: Name of the feed (defaults to the title found in the feed)
+#
+# Additionally any other option placed here will be available in
+# the template (prefixed with channel_ for the Items loop). You can
+# define defaults for these in a [DEFAULT] section, for example
+# Planet Debian uses the following to define faces:
+#
+# [DEFAULT]
+# facewidth = 64
+# faceheight = 64
+#
+# [http://www.blog.com/rss]
+# face = foo.png
+# faceheight = 32
+#
+# The facewidth of the defined blog defaults to 64.
+
+[http://www.netsplit.com/blog/index.rss]
+name = Scott James Remnant
+
+[http://www.gnome.org/~jdub/blog/?flav=rss]
+name = Jeff Waugh
+
+[http://usefulinc.com/edd/blog/rss91]
+name = Edd Dumbill
+
+[http://blog.clearairturbulence.org/?flav=rss]
+name = Thom May
+
+[http://www.hadess.net/diary.rss]
+name = Bastien Nocera
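
The config comments above describe two behaviours that are easy to check by hand: options in [DEFAULT] are inherited by every feed section, and each entry in 'template_files' loses its '.tmpl' suffix to form the output file name. The sketch below reproduces both with Python 3's standard `configparser`, which is an assumption of this example (Planet targets Python 2.x and has its own loading code). The helper names `load`, `output_paths` and `feeds`, the trimmed sample config, and the choice to keep only the template's base name under 'output_dir' are all hypothetical.

```python
import configparser
import os

SAMPLE = """
[Planet]
name = Planet Zog
output_dir = examples/output
template_files = examples/basic/index.html.tmpl examples/atom.xml.tmpl

[DEFAULT]
facewidth = 64
faceheight = 64

[http://www.blog.com/rss]
face = foo.png
faceheight = 32
"""


def load(text):
    # interpolation=None keeps strftime strings such as "%B %d, %Y" intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def output_paths(parser):
    """Map each template in template_files to its assumed output path."""
    out_dir = parser.get("Planet", "output_dir")
    paths = []
    for tmpl in parser.get("Planet", "template_files").split():
        name = os.path.basename(tmpl)
        if name.endswith(".tmpl"):
            name = name[: -len(".tmpl")]
        paths.append(out_dir + "/" + name)
    return paths


def feeds(parser):
    """Return {feed_url: options}; [DEFAULT] values are merged in."""
    return {s: dict(parser[s]) for s in parser.sections() if s != "Planet"}
```

With the sample above, the feed section keeps its own 'faceheight' of 32 while inheriting 'facewidth' from [DEFAULT], which matches the note that the facewidth of the defined blog defaults to 64.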
88 planet/examples/basic/index.html.tmpl
@@ -0,0 +1,88 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+
+### Planet HTML template.
+###
+### This is intended to demonstrate and document Planet's templating
+### facilities, and at the same time provide a good base for you to
+### modify into your own design.
+###
+### The output's a bit boring though, if you're after less documentation
+### and more instant gratification, there's an example with a much
+### prettier output in the examples/fancy/ directory of the Planet source.
+
+### Lines like this are comments, and are automatically removed by the
+### templating engine before processing.
+
+
+### Planet makes a large number of variables available for your templates.
+### See INSTALL for the complete list. The raw value can be placed in your
+### output file using <TMPL_VAR varname>. We'll put the name of our
+### Planet in the page title and again in an h1.
+
+<head>
+<title><TMPL_VAR name></title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<meta name="generator" content="<TMPL_VAR generator ESCAPE="HTML">">
+</head>
+
+<body>
+<h1><TMPL_VAR name></h1>
+
+### One of the two loops available is the Channels loop. This allows you
+### to easily create a list of subscriptions, which is exactly what we'll do
+### here.
+
+### Note that we can also expand variables inside HTML tags, but we need
+### to be cautious and HTML-escape any illegal characters using the form
+### <TMPL_VAR varname ESCAPE="HTML">
+
+<div style="float: right">
+<h2>Subscriptions</h2>
+<ul>
+<TMPL_LOOP Channels>
+<li><a href="<TMPL_VAR link ESCAPE="HTML">" title="<TMPL_VAR title ESCAPE="HTML">"><TMPL_VAR name></a> <a href="<TMPL_VAR url ESCAPE="HTML">">(feed)</a></li>
+</TMPL_LOOP>
+</ul>
+</div>
+
+### The other loop is the Items loop, which will get iterated for each
+### news item.
+
+<TMPL_LOOP Items>
+
+### Visually distinguish articles from different days by checking for
+### the new_date flag. This demonstrates the <TMPL_IF varname> ... </TMPL_IF>
+### check.
+
+<TMPL_IF new_date>
+<h2><TMPL_VAR new_date></h2>
+</TMPL_IF>
+
+### Group consecutive articles by the same author together by checking
+### for the new_channel flag.
+
+<TMPL_IF new_channel>
+<h3><a href="<TMPL_VAR channel_link ESCAPE="HTML">" title="<TMPL_VAR channel_title ESCAPE="HTML">"><TMPL_VAR channel_name></a></h3>
+</TMPL_IF>
+
+
+<TMPL_IF title>
+<h4><a href="<TMPL_VAR link ESCAPE="HTML">"><TMPL_VAR title></a></h4>
+</TMPL_IF>
+<p>
+<TMPL_VAR content>
+</p>
+<p>
+<em><a href="<TMPL_VAR link ESCAPE="HTML">"><TMPL_IF author>by <TMPL_VAR author> at </TMPL_IF><TMPL_VAR date></a></em>
+</p>
+</TMPL_LOOP>
+
+<hr>
+<p>
+<a href="http://www.planetplanet.org/">Powered by Planet!</a><br>
+<em>Last updated: <TMPL_VAR date></em>
+</p>
+</body>
+
+</html>
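
The template comments warn that variables expanded inside HTML tags should use ESCAPE="HTML". The short Python sketch below shows why, using the standard library's `html.escape` as a stand-in; it is not the escaping routine of the bundled templating engine, and the exact set of characters that engine escapes may differ. The function name `render_link` is hypothetical.

```python
import html


def render_link(href, label):
    """Build an anchor tag, escaping the URL before it enters the attribute."""
    # Without escaping, a '"' in href would end the attribute early and a
    # bare '&' would produce invalid HTML.
    return '<a href="%s">%s</a>' % (html.escape(href, quote=True), label)
```

Feed URLs with query strings, such as the '?flav=rss' ones in the sample config, are a common source of '&' characters once a second parameter is added, which is the case this guards against.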
0  planet/examples/cache/.empty
No changes.
106 planet/examples/fancy/config.ini
@@ -0,0 +1,106 @@
+# Planet configuration file
+#
+# This illustrates some of Planet's fancier features with examples.
+
+# Every planet needs a [Planet] section
+[Planet]
+# name: Your planet's name
+# link: Link to the main page
+# owner_name: Your name
+# owner_email: Your e-mail address
+name = Planet Schmanet
+link = http://planet.schmanet.janet/
+owner_name = Janet Weiss
+owner_email = janet@slut.sex
+
+# cache_directory: Where cached feeds are stored
+# new_feed_items: Number of items to take from new feeds
+# log_level: One of DEBUG, INFO, WARNING, ERROR or CRITICAL
+# feed_timeout: number of seconds to wait for any given feed
+cache_directory = examples/cache
+new_feed_items = 2
+log_level = DEBUG
+feed_timeout = 20
+
+# template_files: Space-separated list of output template files
+template_files = examples/fancy/index.html.tmpl examples/atom.xml.tmpl examples/rss20.xml.tmpl examples/rss10.xml.tmpl examples/opml.xml.tmpl examples/foafroll.xml.tmpl
+
+# The following provide defaults for each template:
+# output_dir: Directory to place output files
+# items_per_page: How many items to put on each page
+# days_per_page: How many complete days of posts to put on each page
+# This is the absolute, hard limit (over the item limit)
+# date_format: strftime format for the default 'date' template variable
+# new_date_format: strftime format for the 'new_date' template variable
+# encoding: output encoding for the file, Python 2.3+ users can use the
+# special "xml" value to output ASCII with XML character references
+# locale: locale to use for (e.g.) strings in dates, default is taken from your
+# system. You can specify more locales separated by ':', planet will
+# use the first available one
+output_dir = examples/output
+items_per_page = 60
+days_per_page = 0
+date_format = %B %d, %Y %I:%M %p
+new_date_format = %B %d, %Y
+encoding = utf-8
+# locale = C
+
+
+# To define a different value for a particular template you may create
+# a section with the same name as the template file's filename (as given
+# in template_files).
+
+# Provide no more than 7 days of articles on the front page
+[examples/fancy/index.html.tmpl]
+days_per_page = 7
+
+# If non-zero, all feeds which have not been updated in the indicated
+# number of days will be marked as inactive
+activity_threshold = 0
+
+
+# Options placed in the [DEFAULT] section provide defaults for the feed
+# sections. Placing a default here means you only need to override the
+# special cases later.
+[DEFAULT]
+# Hackergotchi default size.
+# If we want to put a face alongside a feed, and it's this size, we
+# can omit these variables.
+facewidth = 65
+faceheight = 85
+
+
+# Any other section defines a feed to subscribe to. The section title
+# (in the []s) is the URI of the feed itself. A section can also
+# have any of the following options:
+#
+# name: Name of the feed (defaults to the title found in the feed)
+#
+# Additionally any other option placed here will be available in
+# the template (prefixed with channel_ for the Items loop). We use
+# this trick to make the faces work -- this isn't something Planet
+# "natively" knows about. Look at examples/fancy/index.html.tmpl
+# for the flip-side of this.
+
+[http://www.netsplit.com/blog/index.rss]
+name = Scott James Remnant
+face = keybuk.png
+# pick up the default facewidth and faceheight
+
+[http://www.gnome.org/~jdub/blog/?flav=rss]
+name = Jeff Waugh
+face = jdub.png
+facewidth = 70
+faceheight = 74
+
+[http://usefulinc.com/edd/blog/rss91]
+name = Edd Dumbill
+face = edd.png
+facewidth = 62
+faceheight = 80
+
+[http://blog.clearairturbulence.org/?flav=rss]
+name = Thom May
+face = thom.png
+# pick up the default faceheight only
+facewidth = 59
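The interplay between `[DEFAULT]`, the per-feed sections and the `channel_` prefix can be tried outside Planet. The sketch below uses Python 3's `configparser` (Planet itself targets the Python 2 `ConfigParser` module) on a trimmed copy of the config above. The helper `channel_vars` is hypothetical and only mimics how feed options are exposed to the Items loop.

```python
import configparser

SAMPLE = """
[DEFAULT]
facewidth = 65
faceheight = 85

[http://www.netsplit.com/blog/index.rss]
name = Scott James Remnant
face = keybuk.png

[http://blog.clearairturbulence.org/?flav=rss]
name = Thom May
face = thom.png
facewidth = 59
"""

# interpolation=None keeps "%" literal, which matters for strftime-style
# options such as date_format in a full Planet config.
config = configparser.ConfigParser(interpolation=None)
config.read_string(SAMPLE)


def channel_vars(config, feed_url):
    """Return one feed section's options, prefixed with channel_ (hypothetical helper)."""
    # config.items() merges the [DEFAULT] values into the section's own.
    return {"channel_" + key: value for key, value in config.items(feed_url)}


for feed_url in config.sections():
    print(feed_url, channel_vars(config, feed_url))
```

The first feed inherits both face dimensions from `[DEFAULT]`, while the second overrides only `facewidth` and inherits `faceheight`.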
125 planet/examples/fancy/index.html.tmpl
@@ -0,0 +1,125 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
+<html>
+
+### Fancy Planet HTML template.
+###
+### When combined with the stylesheet and images in the output/ directory
+### of the Planet source, this gives you a much prettier result than the
+### default examples template and demonstrates how to use the config file
+### to support things like faces
+###
+### For documentation on the more boring template elements, see
+### examples/basic/config.ini and examples/basic/index.html.tmpl in the Planet source.
+
+<head>
+<title><TMPL_VAR name></title>
+<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+<meta name="generator" content="<TMPL_VAR generator ESCAPE="HTML">">
+<link rel="stylesheet" href="planet.css" type="text/css">
+<TMPL_IF feedtype>
+<link rel="alternate" href="<TMPL_VAR feed ESCAPE="HTML">" title="<TMPL_VAR channel_title_plain ESCAPE="HTML">" type="application/<TMPL_VAR feedtype>+xml">
+</TMPL_IF>
+</head>
+
+<body>
+<h1><TMPL_VAR name></h1>
+
+<TMPL_LOOP Items>
+<TMPL_IF new_date>
+<TMPL_UNLESS __FIRST__>
+### End <div class="channelgroup">
+</div>
+### End <div class="daygroup">
+</div>
+</TMPL_UNLESS>
+<div class="daygroup">
+<h2><TMPL_VAR new_date></h2>
+</TMPL_IF>
+
+<TMPL_IF new_channel>
+<TMPL_UNLESS new_date>
+### End <div class="channelgroup">
+</div>
+</TMPL_UNLESS>
+<div class="channelgroup">
+
+### Planet provides template variables for *all* configuration options for
+### the channel (and defaults), even if it doesn't know about them. We
+### exploit this here to add hackergotchi faces to our channels. Planet
+### doesn't know about the "face", "facewidth" and "faceheight" configuration
+### variables, but makes them available to us anyway.
+
+<h3><a href="<TMPL_VAR channel_link ESCAPE="HTML">" title="<TMPL_VAR channel_title_plain ESCAPE="HTML">"><TMPL_VAR channel_name></a></h3>
+<TMPL_IF channel_face>
+<img class="face" src="images/<TMPL_VAR channel_face ESCAPE="HTML">" width="<TMPL_VAR channel_facewidth ESCAPE="HTML">" height="<TMPL_VAR channel_faceheight ESCAPE="HTML">" alt="">
+</TMPL_IF>
+</TMPL_IF>
+
+
+<div class="entrygroup" id="<TMPL_VAR id>"<TMPL_IF channel_language> lang="<TMPL_VAR channel_language>"</TMPL_IF>>
+<TMPL_IF title>
+<h4<TMPL_IF title_language> lang="<TMPL_VAR title_language>"</TMPL_IF>><a href="<TMPL_VAR link ESCAPE="HTML">"><TMPL_VAR title></a></h4>
+</TMPL_IF>
+<div class="entry">
+<div class="content"<TMPL_IF content_language> lang="<TMPL_VAR content_language>"</TMPL_IF>>
+<TMPL_VAR content>
+</div>
+
+### Planet also makes available all of the information from the feed
+### that it can. Use the 'planet-cache' tool on the cache file for
+### a particular feed to find out what additional keys it supports.
+### Common extra fields are 'author' and 'category' which we
+### demonstrate below.
+
+<p class="date">
+<a href="<TMPL_VAR link ESCAPE="HTML">"><TMPL_IF author>by <TMPL_VAR author> at </TMPL_IF><TMPL_VAR date><TMPL_IF category> under <TMPL_VAR category></TMPL_IF></a>
+</p>
+</div>
+</div>
+
+<TMPL_IF __LAST__>
+### End <div class="channelgroup">
+</div>
+### End <div class="daygroup">
+</div>
+</TMPL_IF>
+</TMPL_LOOP>
+
+
+<div class="sidebar">
+<img src="images/logo.png" width="136" height="136" alt="">
+
+<h2>Subscriptions</h2>
+<ul>
+<TMPL_LOOP Channels>
+<li>
+<a href="<TMPL_VAR url ESCAPE="HTML">" title="subscribe"><img src="images/feed-icon-10x10.png" alt="(feed)"></a> <a <TMPL_IF link>href="<TMPL_VAR link ESCAPE="HTML">" </TMPL_IF><TMPL_IF message>class="message" title="<TMPL_VAR message ESCAPE="HTML">"</TMPL_IF><TMPL_UNLESS message>title="<TMPL_VAR title_plain ESCAPE="HTML">"</TMPL_UNLESS>><TMPL_VAR name></a>
+</li>
+</TMPL_LOOP>
+</ul>
+
+<p>
+<strong>Last updated:</strong><br>
+<TMPL_VAR date><br>
+<em>All times are UTC.</em><br>
+<br>
+Powered by:<br>
+<a href="http://www.planetplanet.org/"><img src="images/planet.png" width="80" height="15" alt="Planet" border="0"></a>
+</p>
+
+<p>
+<h2>Planetarium:</h2>
+<ul>
+<li><a href="http://www.planetapache.org/">Planet Apache</a></li>
+<li><a href="http://planet.debian.net/">Planet Debian</a></li>
+<li><a href="http://planet.freedesktop.org/">Planet freedesktop.org</a></li>
+<li><a href="http://planet.gnome.org/">Planet GNOME</a></li>
+<li><a href="http://planetsun.org/">Planet Sun</a></li>
+<li><a href="http://fedora.linux.duke.edu/fedorapeople/">Fedora People</a></li>
+<li><a href="http://www.planetplanet.org/">more...</a></li>
+</ul>
+</p>
+</div>
+</body>
+
+</html>
31 planet/examples/foafroll.xml.tmpl
@@ -0,0 +1,31 @@
+<?xml version="1.0"?>
+<rdf:RDF
+ xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
+ xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
+ xmlns:foaf="http://xmlns.com/foaf/0.1/"
+ xmlns:rss="http://purl.org/rss/1.0/"
+ xmlns:dc="http://purl.org/dc/elements/1.1/"
+>
+<foaf:Group>
+ <foaf:name><TMPL_VAR name ESCAPE="HTML"></foaf:name>
+ <foaf:homepage><TMPL_VAR link ESCAPE="HTML"></foaf:homepage>
+ <rdfs:seeAlso rdf:resource="<TMPL_VAR url ESCAPE="HTML">" />
+
+<TMPL_LOOP Channels>
+ <foaf:member>
+ <foaf:Agent>
+ <foaf:name><TMPL_VAR name ESCAPE="HTML"></foaf:name>
+ <foaf:weblog>
+ <foaf:Document rdf:about="<TMPL_VAR link ESCAPE="HTML">">
+ <dc:title><TMPL_VAR title_plain ESCAPE="HTML"></dc:title>
+ <rdfs:seeAlso>
+ <rss:channel rdf:about="<TMPL_VAR url ESCAPE="HTML">" />
+ </rdfs:seeAlso>
+ </foaf:Document>
+ </foaf:weblog>
+ </foaf:Agent>
+ </foaf:member>
+</TMPL_LOOP>
+
+</foaf:Group>
+</rdf:RDF>
15 planet/examples/opml.xml.tmpl
@@ -0,0 +1,15 @@
+<?xml version="1.0"?>
+<opml version="1.1">
+ <head>
+ <title><TMPL_VAR name ESCAPE="HTML"></title>
+ <dateModified><TMPL_VAR date_822></dateModified>
+ <ownerName><TMPL_VAR owner_name ESCAPE="HTML"></ownerName>
+ <ownerEmail><TMPL_VAR owner_email ESCAPE="HTML"></ownerEmail>
+ </head>
+
+ <body>
+ <TMPL_LOOP Channels>
+ <outline type="rss" text="<TMPL_VAR name ESCAPE="HTML">" xmlUrl="<TMPL_VAR url ESCAPE="HTML">" title="<TMPL_IF title><TMPL_VAR title ESCAPE="HTML"></TMPL_IF><TMPL_UNLESS title><TMPL_VAR name ESCAPE="HTML"></TMPL_UNLESS>"<TMPL_IF link> htmlUrl="<TMPL_VAR link ESCAPE="HTML">"</TMPL_IF> />
+ </TMPL_LOOP>
+ </body>
+</opml>
BIN  planet/examples/output/images/edd.png
BIN  planet/examples/output/images/evolution.png
BIN  planet/examples/output/images/feed-icon-10x10.png
BIN  planet/examples/output/images/jdub.png
BIN  planet/examples/output/images/keybuk.png
BIN  planet/examples/output/images/logo.png
BIN  planet/examples/output/images/opml.png
BIN  planet/examples/output/images/planet.png
BIN  planet/examples/output/images/thom.png
146 planet/examples/output/planet.css
@@ -0,0 +1,146 @@
+body {
+ border-right: 1px solid black;
+ margin-right: 200px;
+
+ padding-left: 20px;
+ padding-right: 20px;
+}
+
+h1 {
+ margin-top: 0px;
+ padding-top: 20px;
+
+ font-family: "Bitstream Vera Sans", sans-serif;
+ font-weight: normal;
+ letter-spacing: -2px;
+ text-transform: lowercase;
+ text-align: right;
+
+ color: grey;
+}
+
+h2 {
+ font-family: "Bitstream Vera Sans", sans-serif;
+ font-weight: normal;
+ color: #200080;
+
+ margin-left: -20px;
+}
+
+h3 {
+ font-family: "Bitstream Vera Sans", sans-serif;
+ font-weight: normal;
+
+ background-color: #a0c0ff;
+ border: 1px solid #5080b0;
+
+ padding: 4px;
+}
+
+h3 a {
+ text-decoration: none;
+ color: inherit;
+}
+
+h4 {
+ font-family: "Bitstream Vera Sans", sans-serif;
+ font-weight: bold;
+}
+
+h4 a {
+ text-decoration: none;
+ color: inherit;
+}
+
+img.face {
+ float: right;
+ margin-top: -3em;
+}
+
+.entry {
+ margin-bottom: 2em;
+}
+
+.entry .date {
+ font-family: "Bitstream Vera Sans", sans-serif;
+ color: grey;
+}
+
+.entry .date a {
+ text-decoration: none;
+ color: inherit;
+}
+
+.sidebar {
+ position: absolute;
+ top: 0px;
+ right: 0px;
+ width: 200px;
+
+ margin-left: 0px;
+ margin-right: 0px;
+ padding-right: 0px;
+
+ padding-top: 20px;
+ padding-left: 0px;
+
+ font-family: "Bitstream Vera Sans", sans-serif;
+ font-size: 85%;
+}
+
+.sidebar h2 {
+ font-size: 110%;
+ font-weight: bold;
+ color: black;
+
+ padding-left: 5px;
+ margin-left: 0px;
+}
+
+.sidebar ul {
+ padding-left: 1em;
+ margin-left: 0px;
+
+ list-style-type: none;
+}
+
+.sidebar ul li:hover {
+ color: grey;
+}
+
+.sidebar ul li a {
+ text-decoration: none;
+}
+
+.sidebar ul li a:hover {
+ text-decoration: underline;
+}
+
+.sidebar ul li a img {
+ border: 0;
+}
+
+.sidebar p {
+ border-top: 1px solid grey;
+ margin-top: 30px;
+ padding-top: 10px;
+
+ padding-left: 5px;
+}
+
+.sidebar .message {
+ cursor: help;
+ border-bottom: 1px dashed red;
+}
+
+.sidebar a.message:hover {
+ cursor: help;
+ background-color: #ff0000;
+ color: #ffffff !important;
+ text-decoration: none !important;
+}
+
+a:hover {
+ text-decoration: underline !important;
+ color: blue !important;
+}
37 planet/examples/rss10.xml.tmpl
@@ -0,0 +1,37 @@
+<?xml version="1.0"?>
+<rdf:RDF
+ xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
+ xmlns:dc="http://purl.org/dc/elements/1.1/"
+ xmlns:foaf="http://xmlns.com/foaf/0.1/"
+ xmlns:content="http://purl.org/rss/1.0/modules/content/"
+ xmlns="http://purl.org/rss/1.0/"
+>
+<channel rdf:about="<TMPL_VAR link ESCAPE="HTML">">
+ <title><TMPL_VAR name ESCAPE="HTML"></title>
+ <link><TMPL_VAR link ESCAPE="HTML"></link>
+ <description><TMPL_VAR name ESCAPE="HTML"> - <TMPL_VAR link ESCAPE="HTML"></description>
+
+ <items>
+ <rdf:Seq>
+<TMPL_LOOP Items>
+ <rdf:li rdf:resource="<TMPL_VAR id ESCAPE="HTML">" />
+</TMPL_LOOP>
+ </rdf:Seq>
+ </items>
+</channel>
+
+<TMPL_LOOP Items>
+<item rdf:about="<TMPL_VAR id ESCAPE="HTML">">
+ <title><TMPL_VAR channel_name ESCAPE="HTML"><TMPL_IF title>: <TMPL_VAR title_plain ESCAPE="HTML"></TMPL_IF></title>
+ <link><TMPL_VAR link ESCAPE="HTML"></link>
+ <TMPL_IF content>
+ <content:encoded><TMPL_VAR content ESCAPE="HTML"></content:encoded>
+ </TMPL_IF>
+ <dc:date><TMPL_VAR date_iso></dc:date>
+ <TMPL_IF author_name>
+ <dc:creator><TMPL_VAR author_name></dc:creator>
+ </TMPL_IF>
+</item>
+</TMPL_LOOP>
+
+</rdf:RDF>
30 planet/examples/rss20.xml.tmpl
@@ -0,0 +1,30 @@
+<?xml version="1.0"?>
+<rss version="2.0">
+
+<channel>
+ <title><TMPL_VAR name ESCAPE="HTML"></title>
+ <link><TMPL_VAR link ESCAPE="HTML"></link>
+ <language>en</language>
+ <description><TMPL_VAR name ESCAPE="HTML"> - <TMPL_VAR link ESCAPE="HTML"></description>
+
+<TMPL_LOOP Items>
+<item>
+ <title><TMPL_VAR channel_name ESCAPE="HTML"><TMPL_IF title>: <TMPL_VAR title_plain ESCAPE="HTML"></TMPL_IF></title>
+ <guid><TMPL_VAR id ESCAPE="HTML"></guid>
+ <link><TMPL_VAR link ESCAPE="HTML"></link>
+ <TMPL_IF content>
+ <description><TMPL_VAR content ESCAPE="HTML"></description>
+ </TMPL_IF>
+ <pubDate><TMPL_VAR date_822></pubDate>
+ <TMPL_IF author_email>
+ <TMPL_IF author_name>
+ <author><TMPL_VAR author_email> (<TMPL_VAR author_name>)</author>
+ <TMPL_ELSE>
+ <author><TMPL_VAR author_email></author>
+ </TMPL_IF>
+ </TMPL_IF>
+</item>
+</TMPL_LOOP>
+
+</channel>
+</rss>
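The `date_822` variable used for `<pubDate>` above, and the `date_iso` variable used in the RSS 1.0 template, are derived from each date key alongside the plain `date` value. The sketch below reproduces that expansion in Python 3 using the two format strings defined in `planet/__init__.py`. The function name `date_variants` is an assumption made for illustration, not part of Planet.

```python
import time

# Format strings as defined in planet/__init__.py
TIMEFMT_ISO = "%Y-%m-%dT%H:%M:%S+00:00"
TIMEFMT_822 = "%a, %d %b %Y %H:%M:%S +0000"


def date_variants(key, date, date_format="%B %d, %Y %I:%M %p"):
    """Expand one date value into the three template variables (hypothetical helper)."""
    return {
        key: time.strftime(date_format, date),           # human-readable form
        key + "_iso": time.strftime(TIMEFMT_ISO, date),  # e.g. for <dc:date>
        key + "_822": time.strftime(TIMEFMT_822, date),  # e.g. for <pubDate>
    }


info = date_variants("date", time.gmtime(0))
for name in sorted(info):
    print(name, "=", info[name])
```

Both fixed formats hard-code a zero UTC offset, so the `struct_time` passed in is expected to be in UTC already.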
194 planet/planet-cache.py
@@ -0,0 +1,194 @@
+#!/usr/bin/env python
+# -*- coding: UTF-8 -*-
+"""Planet cache tool.
+
+"""
+
+__authors__ = [ "Scott James Remnant <scott@netsplit.com>",
+ "Jeff Waugh <jdub@perkypants.org>" ]
+__license__ = "Python"
+
+
+import os
+import sys
+import time
+import dbhash
+import ConfigParser
+
+import planet
+
+
+def usage():
+ print "Usage: planet-cache [options] CACHEFILE [ITEMID]..."
+ print
+ print "Examine and modify information in the Planet cache."
+ print
+ print "Channel Commands:"
+ print " -C, --channel Display known information on the channel"
+ print " -L, --list List items in the channel"
+ print " -K, --keys List all keys found in channel items"
+ print
+ print "Item Commands (need ITEMID):"
+ print " -I, --item Display known information about the item(s)"
+ print " -H, --hide Mark the item(s) as hidden"
+ print " -U, --unhide Mark the item(s) as not hidden"
+ print
+ print "Other Options:"
+ print " -h, --help Display this help message and exit"
+ sys.exit(0)
+
+def usage_error(msg, *args):
+ print >>sys.stderr, msg, " ".join(args)
+ print >>sys.stderr, "Perhaps you need --help ?"
+ sys.exit(1)
+
+def print_keys(item, title):
+ keys = item.keys()
+ keys.sort()
+ key_len = max([ len(k) for k in keys ])
+
+ print title + ":"
+ for key in keys:
+ if item.key_type(key) == item.DATE:
+ value = time.strftime(planet.TIMEFMT_ISO, item[key])
+ else:
+ value = str(item[key])
+ print " %-*s %s" % (key_len, key, fit_str(value, 74 - key_len))
+
+def fit_str(string, length):
+ if len(string) <= length:
+ return string
+ else:
+ return string[:length-4] + " ..."
+
+
+if __name__ == "__main__":
+ cache_file = None
+ want_ids = 0
+ ids = []
+
+ command = None
+
+ for arg in sys.argv[1:]:
+ if arg == "-h" or arg == "--help":
+ usage()
+ elif arg == "-C" or arg == "--channel":
+ if command is not None:
+ usage_error("Only one command option may be supplied")
+ command = "channel"
+ elif arg == "-L" or arg == "--list":
+ if command is not None:
+ usage_error("Only one command option may be supplied")
+ command = "list"
+ elif arg == "-K" or arg == "--keys":
+ if command is not None:
+ usage_error("Only one command option may be supplied")
+ command = "keys"
+ elif arg == "-I" or arg == "--item":
+ if command is not None:
+ usage_error("Only one command option may be supplied")
+ command = "item"
+ want_ids = 1
+ elif arg == "-H" or arg == "--hide":
+ if command is not None:
+ usage_error("Only one command option may be supplied")
+ command = "hide"
+ want_ids = 1
+ elif arg == "-U" or arg == "--unhide":
+ if command is not None:
+ usage_error("Only one command option may be supplied")
+ command = "unhide"
+ want_ids = 1
+ elif arg.startswith("-"):
+ usage_error("Unknown option:", arg)
+ else:
+ if cache_file is None:
+ cache_file = arg
+ elif want_ids:
+ ids.append(arg)
+ else:
+ usage_error("Unexpected extra argument:", arg)
+
+ if cache_file is None:
+ usage_error("Missing expected cache filename")
+ elif want_ids and not len(ids):
+ usage_error("Missing expected entry ids")
+
+ # Open the cache file directly to get the URL it represents
+ try:
+ db = dbhash.open(cache_file)
+ url = db["url"]
+ db.close()
+ except dbhash.bsddb._db.DBError, e:
+ print >>sys.stderr, cache_file + ":", e.args[1]
+ sys.exit(1)
+ except KeyError:
+ print >>sys.stderr, cache_file + ": Probably not a cache file"
+ sys.exit(1)
+
+ # Now do it the right way :-)
+ my_planet = planet.Planet(ConfigParser.ConfigParser())
+ my_planet.cache_directory = os.path.dirname(cache_file)
+ channel = planet.Channel(my_planet, url)
+
+ for item_id in ids:
+ if not channel.has_item(item_id):
+ print >>sys.stderr, item_id + ": Not in channel"
+ sys.exit(1)
+
+ # Do the user's bidding
+ if command == "channel":
+ print_keys(channel, "Channel Keys")
+
+ elif command == "item":
+ for item_id in ids:
+ item = channel.get_item(item_id)
+ print_keys(item, "Item Keys for %s" % item_id)
+
+ elif command == "list":
+ print "Items in Channel:"
+ for item in channel.items(hidden=1, sorted=1):
+ print " " + item.id
+ print " " + time.strftime(planet.TIMEFMT_ISO, item.date)
+ if hasattr(item, "title"):
+ print " " + fit_str(item.title, 70)
+ if hasattr(item, "hidden"):
+ print " (hidden)"
+
+ elif command == "keys":
+ keys = {}
+ for item in channel.items():
+ for key in item.keys():
+ keys[key] = 1
+
+ keys = keys.keys()
+ keys.sort()
+
+ print "Keys used in Channel:"
+ for key in keys:
+ print " " + key
+ print
+
+ print "Use --item to output values of particular items."
+
+ elif command == "hide":
+ for item_id in ids:
+ item = channel.get_item(item_id)
+ if hasattr(item, "hidden"):
+ print item_id + ": Already hidden."
+ else:
+ item.hidden = "yes"
+
+ channel.cache_write()
+ print "Done."
+
+ elif command == "unhide":
+ for item_id in ids:
+ item = channel.get_item(item_id)
+ if hasattr(item, "hidden"):
+ del(item.hidden)
+ else:
+ print item_id + ": Not hidden."
+
+ channel.cache_write()
+ print "Done."
168 planet/planet.py
@@ -0,0 +1,168 @@
+#!/usr/bin/env python
+"""The Planet aggregator.
+
+A flexible and easy-to-use aggregator for generating websites.
+
+Visit http://www.planetplanet.org/ for more information and to download
+the latest version.
+
+Requires Python 2.1, recommends 2.3.
+"""
+
+__authors__ = [ "Scott James Remnant <scott@netsplit.com>",
+ "Jeff Waugh <jdub@perkypants.org>" ]
+__license__ = "Python"
+
+
+import os
+import sys
+import time
+import locale
+import urlparse
+
+import planet
+
+from ConfigParser import ConfigParser
+
+# Default configuration file path
+CONFIG_FILE = "config.ini"
+
+# Defaults for the [Planet] config section
+PLANET_NAME = "Unconfigured Planet"
+PLANET_LINK = "Unconfigured Planet"
+PLANET_FEED = None
+OWNER_NAME = "Anonymous Coward"
+OWNER_EMAIL = ""
+LOG_LEVEL = "WARNING"
+FEED_TIMEOUT = 20 # seconds
+
+# Default template file list
+TEMPLATE_FILES = "examples/basic/index.html.tmpl"
+
+
+
+def config_get(config, section, option, default=None, raw=0, vars=None):
+ """Get a value from the configuration, with a default."""
+ if config.has_option(section, option):
+ return config.get(section, option, raw=raw, vars=vars)
+ else:
+ return default
+
+def main():
+ config_file = CONFIG_FILE
+ offline = 0
+ verbose = 0
+
+ for arg in sys.argv[1:]:
+ if arg == "-h" or arg == "--help":
+ print "Usage: planet [options] [CONFIGFILE]"
+ print
+ print "Options:"
+ print " -v, --verbose DEBUG level logging during update"
+ print " -o, --offline Update the Planet from the cache only"
+ print " -h, --help Display this help message and exit"
+ print
+ sys.exit(0)
+ elif arg == "-v" or arg == "--verbose":
+ verbose = 1
+ elif arg == "-o" or arg == "--offline":
+ offline = 1
+ elif arg.startswith("-"):
+ print >>sys.stderr, "Unknown option:", arg
+ sys.exit(1)
+ else:
+ config_file = arg
+
+ # Read the configuration file
+ config = ConfigParser()
+ config.read(config_file)
+ if not config.has_section("Planet"):
+ print >>sys.stderr, "Configuration missing [Planet] section."
+ sys.exit(1)
+
+ # Read the [Planet] config section
+ planet_name = config_get(config, "Planet", "name", PLANET_NAME)
+ planet_link = config_get(config, "Planet", "link", PLANET_LINK)
+ planet_feed = config_get(config, "Planet", "feed", PLANET_FEED)
+ owner_name = config_get(config, "Planet", "owner_name", OWNER_NAME)
+ owner_email = config_get(config, "Planet", "owner_email", OWNER_EMAIL)
+ if verbose:
+ log_level = "DEBUG"
+ else:
+ log_level = config_get(config, "Planet", "log_level", LOG_LEVEL)
+ feed_timeout = config_get(config, "Planet", "feed_timeout", FEED_TIMEOUT)
+ template_files = config_get(config, "Planet", "template_files",
+ TEMPLATE_FILES).split(" ")
+
+ # Default feed to the first feed for which there is a template
+ if not planet_feed:
+ for template_file in template_files:
+ name = os.path.splitext(os.path.basename(template_file))[0]
+ if name.find('atom')>=0 or name.find('rss')>=0:
+ planet_feed = urlparse.urljoin(planet_link, name)
+ break
+
+ # Define locale
+ if config.has_option("Planet", "locale"):
+ # The user can specify more than one locale (separated by ":") as
+ # fallbacks.
+ locale_ok = False
+ for user_locale in config.get("Planet", "locale").split(':'):
+ user_locale = user_locale.strip()
+ try:
+ locale.setlocale(locale.LC_ALL, user_locale)
+ except locale.Error:
+ pass
+ else:
+ locale_ok = True
+ break
+ if not locale_ok:
+ print >>sys.stderr, "Unsupported locale setting."
+ sys.exit(1)
+
+ # Activate logging
+ planet.logging.basicConfig()
+ planet.logging.getLogger().setLevel(planet.logging.getLevelName(log_level))
+ log = planet.logging.getLogger("planet.runner")
+ try:
+ log.warning
+ except:
+ log.warning = log.warn
+
+ # timeoutsocket allows feedparser to time out rather than hang forever on
+ # ultra-slow servers. Python 2.3 now has this functionality available in
+ # the standard socket library, so under 2.3 you don't need to install
+ # anything. But you probably should anyway, because the socket module is
+ # buggy and timeoutsocket is better.
+ if feed_timeout:
+ try:
+ feed_timeout = float(feed_timeout)
+ except:
+ log.warning("Feed timeout set to invalid value '%s', skipping", feed_timeout)
+ feed_timeout = None
+
+ if feed_timeout and not offline:
+ try:
+ from planet import timeoutsocket
+ timeoutsocket.setDefaultSocketTimeout(feed_timeout)
+ log.debug("Socket timeout set to %d seconds", feed_timeout)
+ except ImportError:
+ import socket
+ if hasattr(socket, 'setdefaulttimeout'):
+ log.debug("timeoutsocket not found, using python function")
+ socket.setdefaulttimeout(feed_timeout)
+ log.debug("Socket timeout set to %d seconds", feed_timeout)
+ else:
+ log.error("Unable to set timeout to %d seconds", feed_timeout)
+
+ # run the planet
+ my_planet = planet.Planet(config)
+ my_planet.run(planet_name, planet_link, template_files, offline)
+
+ my_planet.generate_all_files(template_files, planet_name,
+ planet_link, planet_feed, owner_name, owner_email)
+
+
+if __name__ == "__main__":
+ main()
+
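When no `feed` option is configured, `main()` derives the planet's feed URL from the first template whose name contains `atom` or `rss`. A minimal Python 3 sketch of that fallback follows. The function name `default_feed` is an assumption, and `urllib.parse.urljoin` stands in for the Python 2 `urlparse.urljoin` used in the script.

```python
import os
from urllib.parse import urljoin


def default_feed(planet_link, template_files):
    """Return the URL of the first Atom/RSS output, or None if there is none."""
    for template_file in template_files:
        # "examples/atom.xml.tmpl" -> "atom.xml" (the output file name)
        name = os.path.splitext(os.path.basename(template_file))[0]
        if "atom" in name or "rss" in name:
            return urljoin(planet_link, name)
    return None


templates = [
    "examples/fancy/index.html.tmpl",
    "examples/atom.xml.tmpl",
    "examples/rss20.xml.tmpl",
]
print(default_feed("http://planet.schmanet.janet/", templates))
```

The order of `template_files` therefore decides which feed is advertised: listing the Atom template first makes it the default.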
953 planet/planet/__init__.py
@@ -0,0 +1,953 @@
+#!/usr/bin/env python
+# -*- coding: UTF-8 -*-
+"""Planet aggregator library.
+
+This package is a library for developing web sites or software that
+aggregate RSS, CDF and Atom feeds taken from elsewhere into a single,
+combined feed.
+"""
+
+__version__ = "2.0"
+__authors__ = [ "Scott James Remnant <scott@netsplit.com>",
+ "Jeff Waugh <jdub@perkypants.org>" ]
+__license__ = "Python"
+
+
+# Modules available without separate import
+import cache
+import feedparser
+import sanitize
+import htmltmpl
+import sgmllib
+try:
+ import logging
+except:
+ import compat_logging as logging
+
+# Limit the effect of "from planet import *"
+__all__ = ("cache", "feedparser", "htmltmpl", "logging",
+ "Planet", "Channel", "NewsItem")
+
+
+import os
+import md5
+import time
+import dbhash
+import re
+
+try:
+ from xml.sax.saxutils import escape
+except:
+ def escape(data):
+ return data.replace("&","&amp;").replace(">","&gt;").replace("<","&lt;")
+
+# Version information (for generator headers)
+VERSION = ("Planet/%s +http://www.planetplanet.org" % __version__)
+
+# Default User-Agent header to send when retrieving feeds
+USER_AGENT = VERSION + " " + feedparser.USER_AGENT
+
+# Default cache directory
+CACHE_DIRECTORY = "cache"
+
+# Default number of items to display from a new feed
+NEW_FEED_ITEMS = 10
+
+# Useful common date/time formats
+TIMEFMT_ISO = "%Y-%m-%dT%H:%M:%S+00:00"
+TIMEFMT_822 = "%a, %d %b %Y %H:%M:%S +0000"
+
+
+# Log instance to use here
+log = logging.getLogger("planet")
+try:
+ log.warning
+except:
+ log.warning = log.warn
+
+# Defaults for the template file config sections
+ENCODING = "utf-8"
+ITEMS_PER_PAGE = 60
+DAYS_PER_PAGE = 0
+OUTPUT_DIR = "output"
+DATE_FORMAT = "%B %d, %Y %I:%M %p"
+NEW_DATE_FORMAT = "%B %d, %Y"
+ACTIVITY_THRESHOLD = 0
+
+class stripHtml(sgmllib.SGMLParser):
+ "remove all tags from the data"
+ def __init__(self, data):
+ sgmllib.SGMLParser.__init__(self)
+ self.result=''
+ self.feed(data)
+ self.close()
+ def handle_data(self, data):
+ if data: self.result+=data
+
+def template_info(item, date_format):
+ """Produce a dictionary of template information."""
+ info = {}
+ for key in item.keys():
+ if item.key_type(key) == item.DATE:
+ date = item.get_as_date(key)
+ info[key] = time.strftime(date_format, date)
+ info[key + "_iso"] = time.strftime(TIMEFMT_ISO, date)
+ info[key + "_822"] = time.strftime(TIMEFMT_822, date)
+ else:
+ info[key] = item[key]
+ if 'title' in item.keys():
+ info['title_plain'] = stripHtml(info['title']).result
+
+ return info
+
+
+class Planet:
+ """A set of channels.
+
+ This class represents a set of channels for which the items will
+ be aggregated together into one combined feed.
+
+ Properties:
+ user_agent User-Agent header to fetch feeds with.
+ cache_directory Directory to store cached channels in.
+ new_feed_items Number of items to display from a new feed.
+ filter A regular expression that articles must match.
+ exclude A regular expression that articles must not match.
+ """
+ def __init__(self, config):
+ self.config = config
+
+ self._channels = []
+
+ self.user_agent = USER_AGENT
+ self.cache_directory = CACHE_DIRECTORY
+ self.new_feed_items = NEW_FEED_ITEMS
+ self.filter = None
+ self.exclude = None
+
+ def tmpl_config_get(self, template, option, default=None, raw=0, vars=None):
+ """Get a template value from the configuration, with a default."""
+ if self.config.has_option(template, option):
+ return self.config.get(template, option, raw=raw, vars=vars)
+ elif self.config.has_option("Planet", option):
+ return self.config.get("Planet", option, raw=raw, vars=vars)
+ else:
+ return default
+
+ def gather_channel_info(self, template_file="Planet"):
+ date_format = self.tmpl_config_get(template_file,
+ "date_format", DATE_FORMAT, raw=1)
+
+ activity_threshold = int(self.tmpl_config_get(template_file,
+ "activity_threshold",
+ ACTIVITY_THRESHOLD))
+
+ if activity_threshold:
+ activity_horizon = \
+ time.gmtime(time.time()-86400*activity_threshold)
+ else:
+ activity_horizon = 0
+
+ channels = {}
+ channels_list = []
+ for channel in self.channels(hidden=1):
+ channels[channel] = template_info(channel, date_format)
+ channels_list.append(channels[channel])
+
+ # identify inactive feeds
+ if activity_horizon:
+ latest = channel.items(sorted=1)
+ if len(latest)==0 or latest[0].date < activity_horizon:
+ channels[channel]["message"] = \
+ "no activity in %d days" % activity_threshold
+
+ # report channel level errors
+ if not channel.url_status: continue
+ status = int(channel.url_status)
+ if status == 403:
+ channels[channel]["message"] = "403: forbidden"
+ elif status == 404:
+ channels[channel]["message"] = "404: not found"
+ elif status == 408:
+ channels[channel]["message"] = "408: request timeout"
+ elif status == 410:
+ channels[channel]["message"] = "410: gone"
+ elif status == 500:
+ channels[channel]["message"] = "internal server error"
+ elif status >= 400:
+ channels[channel]["message"] = "http status %s" % status
+
+ return channels, channels_list
+
+ def gather_items_info(self, channels, template_file="Planet", channel_list=None):
+ items_list = []
+ prev_date = []
+ prev_channel = None
+
+ date_format = self.tmpl_config_get(template_file,
+ "date_format", DATE_FORMAT, raw=1)
+ items_per_page = int(self.tmpl_config_get(template_file,
+ "items_per_page", ITEMS_PER_PAGE))
+ days_per_page = int(self.tmpl_config_get(template_file,
+ "days_per_page", DAYS_PER_PAGE))
+ new_date_format = self.tmpl_config_get(template_file,
+ "new_date_format", NEW_DATE_FORMAT, raw=1)
+
+ for newsitem in self.items(max_items=items_per_page,
+ max_days=days_per_page,
+ channels=channel_list):
+ item_info = template_info(newsitem, date_format)
+ chan_info = channels[newsitem._channel]
+ for k, v in chan_info.items():
+ item_info["channel_" + k] = v
+
+ # Check for the start of a new day
+ if prev_date[:3] != newsitem.date[:3]:
+ prev_date = newsitem.date
+ item_info["new_date"] = time.strftime(new_date_format,
+ newsitem.date)
+
+ # Check for the start of a new channel
+ if item_info.has_key("new_date") \
+ or prev_channel != newsitem._channel:
+ prev_channel = newsitem._channel
+ item_info["new_channel"] = newsitem._channel.url
+
+ items_list.append(item_info)
+
+ return items_list
+
+ def run(self, planet_name, planet_link, template_files, offline = False):
+ log = logging.getLogger("planet.runner")
+
+ # Create a planet
+ log.info("Loading cached data")
+ if self.config.has_option("Planet", "cache_directory"):
+ self.cache_directory = self.config.get("Planet", "cache_directory")
+ if self.config.has_option("Planet", "new_feed_items"):
+ self.new_feed_items = int(self.config.get("Planet", "new_feed_items"))
+ self.user_agent = "%s +%s %s" % (planet_name, planet_link,
+ self.user_agent)
+ if self.config.has_option("Planet", "filter"):
+ self.filter = self.config.get("Planet", "filter")
+
+ # The other configuration blocks are channels to subscribe to
+ for feed_url in self.config.sections():
+ if feed_url == "Planet" or feed_url in template_files:
+ continue
+
+ # Create a channel, configure it and subscribe it
+ channel = Channel(self, feed_url)
+ self.subscribe(channel)
+
+ # Update it
+ try:
+ if not offline and channel.url_status != '410':
+ channel.update()
+ except KeyboardInterrupt:
+ raise
+ except:
+ log.exception("Update of <%s> failed", feed_url)
+
+ def generate_all_files(self, template_files, planet_name,
+ planet_link, planet_feed, owner_name, owner_email):
+
+ log = logging.getLogger("planet.runner")
+ # Go-go-gadget-template
+ for template_file in template_files:
+ manager = htmltmpl.TemplateManager()
+ log.info("Processing template %s", template_file)
+ try:
+ template = manager.prepare(template_file)
+ except htmltmpl.TemplateError:
+ template = manager.prepare(os.path.basename(template_file))
+ # Read the configuration
+ output_dir = self.tmpl_config_get(template_file,
+ "output_dir", OUTPUT_DIR)
+ date_format = self.tmpl_config_get(template_file,
+ "date_format", DATE_FORMAT, raw=1)
+ encoding = self.tmpl_config_get(template_file, "encoding", ENCODING)
+
+ # We treat each template individually
+ base = os.path.splitext(os.path.basename(template_file))[0]
+ url = os.path.join(planet_link, base)
+ output_file = os.path.join(output_dir, base)
+
+ # Gather channel and item information
+ channels, channels_list = self.gather_channel_info(template_file)
+ items_list = self.gather_items_info(channels, template_file)
+
+ # Process the template
+ tp = htmltmpl.TemplateProcessor(html_escape=0)
+ tp.set("Items", items_list)
+ tp.set("Channels", channels_list)
+
+ # Generic information
+ tp.set("generator", VERSION)
+ tp.set("name", planet_name)
+ tp.set("link", planet_link)
+ tp.set("owner_name", owner_name)
+ tp.set("owner_email", owner_email)
+ tp.set("url", url)
+
+ if planet_feed:
+ tp.set("feed", planet_feed)
+ tp.set("feedtype", planet_feed.find('rss')>=0 and 'rss' or 'atom')
+
+ # Update time
+ date = time.gmtime()
+ tp.set("date", time.strftime(date_format, date))
+ tp.set("date_iso", time.strftime(TIMEFMT_ISO, date))
+ tp.set("date_822", time.strftime(TIMEFMT_822, date))
+
+ try:
+ log.info("Writing %s", output_file)
+ output_fd = open(output_file, "w")
+ if encoding.lower() in ("utf-8", "utf8"):
+ # UTF-8 output is the default because we use that internally
+ output_fd.write(tp.process(template))
+ elif encoding.lower() in ("xml", "html", "sgml"):
+ # Magic for Python 2.3 users
+ output = tp.process(template).decode("utf-8")
+ output_fd.write(output.encode("ascii", "xmlcharrefreplace"))
+ else:
+ # Must be a "known" encoding
+ output = tp.process(template).decode("utf-8")
+ output_fd.write(output.encode(encoding, "replace"))
+ output_fd.close()
+ except KeyboardInterrupt:
+ raise
+ except:
+ log.exception("Write of %s failed", output_file)
+
+ def channels(self, hidden=0, sorted=1):
+ """Return the list of channels."""
+ channels = []
+ for channel in self._channels:
+ if hidden or not channel.has_key("hidden"):
+ channels.append((channel.name, channel))
+
+ if sorted:
+ channels.sort()
+
+ return [ c[-1] for c in channels ]
+
+ def find_by_basename(self, basename):
+ for channel in self._channels:
+ if basename == channel.cache_basename(): return channel
+
+ def subscribe(self, channel):
+ """Subscribe the planet to the channel."""
+ self._channels.append(channel)
+
+ def unsubscribe(self, channel):
+ """Unsubscribe the planet from the channel."""
+ self._channels.remove(channel)
+
+ def items(self, hidden=0, sorted=1, max_items=0, max_days=0, channels=None):
+ """Return an optionally filtered list of items in the planet.
+
+ The filters are applied in the following order:
+
+ If hidden is true then items in hidden channels and hidden items
+ will be returned.
+
+ If sorted is true then the item list will be sorted with the newest
+ first.
+
+ If max_items is non-zero then this number of items, at most, will
+ be returned.
+
+ If max_days is non-zero then any items older than the newest by
+ this number of days won't be returned. Requires sorted=1 to work.
+
+
+ The sharp-eyed will note that this looks a little strange code-wise.
+ It turns out that Python gets *really* slow if we try to sort the
+ actual items themselves, so we sort (timestamp, order, item) tuples
+ instead. Also we use mktime here, but it's ok because we discard the
+ numbers and just need them to be relatively consistent with each
+ other.
+ """
+ planet_filter_re = None
+ if self.filter:
+ planet_filter_re = re.compile(self.filter, re.I)
+ planet_exclude_re = None
+ if self.exclude:
+ planet_exclude_re = re.compile(self.exclude, re.I)
+
+ items = []
+ seen_guids = {}
+ if not channels: channels=self.channels(hidden=hidden, sorted=0)
+ for channel in channels:
+ for item in channel._items.values():
+ if hidden or not item.has_key("hidden"):
+
+ channel_filter_re = None
+ if channel.filter:
+ channel_filter_re = re.compile(channel.filter,
+ re.I)
+ channel_exclude_re = None
+ if channel.exclude:
+ channel_exclude_re = re.compile(channel.exclude,
+ re.I)
+ if (planet_filter_re or planet_exclude_re \
+ or channel_filter_re or channel_exclude_re):
+ title = ""
+ if item.has_key("title"):
+ title = item.title
+ content = item.get_content("content")
+
+ if planet_filter_re:
+ if not (planet_filter_re.search(title) \
+ or planet_filter_re.search(content)):
+ continue
+
+ if planet_exclude_re:
+ if (planet_exclude_re.search(title) \
+ or planet_exclude_re.search(content)):
+ continue
+
+ if channel_filter_re:
+ if not (channel_filter_re.search(title) \
+ or channel_filter_re.search(content)):
+ continue
+
+ if channel_exclude_re:
+ if (channel_exclude_re.search(title) \
+ or channel_exclude_re.search(content)):
+ continue
+
+ if not seen_guids.has_key(item.id):
+ seen_guids[item.id] = 1
+ items.append((time.mktime(item.date), item.order, item))
+
+ # Sort the list
+ if sorted:
+ items.sort()
+ items.reverse()
+
+ # Apply max_items filter
+ if len(items) and max_items:
+ items = items[:max_items]
+
+ # Apply max_days filter
+ if len(items) and max_days:
+ max_count = 0
+ max_time = items[0][0] - max_days * 86400  # 86400 seconds per day
+ for item in items:
+ if item[0] > max_time:
+ max_count += 1
+ else:
+ items = items[:max_count]
+ break
+
+ return [ i[-1] for i in items ]
+
+class Channel(cache.CachedInfo):
+ """A list of news items.
+
+ This class represents a list of news items taken from the feed of
+ a website or other source.
+
+ Properties:
+ url URL of the feed.
+ url_etag E-Tag of the feed URL.
+ url_modified Last modified time of the feed URL.
+ url_status Last HTTP status of the feed URL.
+ hidden Channel should be hidden (True if exists).
+ name Name of the feed owner, or feed title.
+ next_order Next order number to be assigned to a NewsItem.
+
+ updated Correct UTC-Normalised update time of the feed.
+ last_updated Correct UTC-Normalised time the feed was last updated.
+
+ id An identifier the feed claims is unique (*).
+ title One-line title (*).
+ link Link to the original format feed (*).
+ tagline Short description of the feed (*).
+ info Longer description of the feed (*).
+
+ modified Date the feed claims to have been modified (*).
+
+ author Name of the author (*).
+ publisher Name of the publisher (*).
+ generator Name of the feed generator (*).
+ category Category name (*).
+ copyright Copyright information for humans to read (*).
+ license Link to the licence for the content (*).
+ docs Link to the specification of the feed format (*).
+ language Primary language (*).
+ errorreportsto E-Mail address to send error reports to (*).
+
+ image_url URL of an associated image (*).
+ image_link Link to go with the associated image (*).
+ image_title Alternative text of the associated image (*).
+ image_width Width of the associated image (*).
+ image_height Height of the associated image (*).
+
+ filter A regular expression that articles must match.
+ exclude A regular expression that articles must not match.
+
+ Properties marked (*) will only be present if the original feed
+ contained them. Note that the optional 'modified' date field is simply
+ a claim made by the feed and parsed from the information given;
+ 'updated' (and 'last_updated') are far more reliable sources of
+ information.
+
+ Some feeds may define additional properties to those above.
+ """
+ IGNORE_KEYS = ("links", "contributors", "textinput", "cloud", "categories",
+ "url", "href", "url_etag", "url_modified", "tags", "itunes_explicit")
+
+ def __init__(self, planet, url):
+ if not os.path.isdir(planet.cache_directory):
+ os.makedirs(planet.cache_directory)
+ cache_filename = cache.filename(planet.cache_directory, url)
+ cache_file = dbhash.open(cache_filename, "c", 0666)
+
+ cache.CachedInfo.__init__(self, cache_file, url, root=1)
+
+ self._items = {}
+ self._planet = planet
+ self._expired = []
+ self.url = url
+ # retain the original URL for error reporting
+ self.configured_url = url
+ self.url_etag = None
+ self.url_status = None
+ self.url_modified = None
+ self.name = None
+ self.updated = None
+ self.last_updated = None
+ self.filter = None
+ self.exclude = None
+ self.next_order = "0"
+ self.cache_read()
+ self.cache_read_entries()
+
+ if planet.config.has_section(url):
+ for option in planet.config.options(url):
+ value = planet.config.get(url, option)
+ self.set_as_string(option, value, cached=0)
+
+ def has_item(self, id_):
+ """Check whether the item exists in the channel."""
+ return self._items.has_key(id_)
+
+ def get_item(self, id_):
+ """Return the item from the channel."""
+ return self._items[id_]
+
+ # Special methods
+ __contains__ = has_item
+
+ def items(self, hidden=0, sorted=0):
+ """Return the item list."""
+ items = []
+ for item in self._items.values():
+ if hidden or not item.has_key("hidden"):
+ items.append((time.mktime(item.date), item.order, item))
+
+ if sorted:
+ items.sort()
+ items.reverse()
+
+ return [ i[-1] for i in items ]
+
+ def __iter__(self):
+ """Iterate the sorted item list."""
+ return iter(self.items(sorted=1))
+
+ def cache_read_entries(self):
+ """Read entry information from the cache."""
+ keys = self._cache.keys()
+ for key in keys:
+ if key.find(" ") != -1: continue
+ if self.has_key(key): continue
+
+ item = NewsItem(self, key)
+ self._items[key] = item
+
+ def cache_basename(self):
+ return cache.filename('',self._id)
+
+ def cache_write(self, sync=1):
+ """Write channel and item information to the cache."""
+ for item in self._items.values():
+ item.cache_write(sync=0)
+ for item in self._expired:
+ item.cache_clear(sync=0)
+ cache.CachedInfo.cache_write(self, sync)
+
+ self._expired = []
+
+ def feed_information(self):
+ """
+ Returns a description string for the feed embedded in this channel.
+
+ This will usually simply be the feed URL enclosed in <>, but in the
+ case where the current self.url has changed from the original
+ self.configured_url the string will contain both pieces of information.
+ This is so that the URL in question is easier to find in logging
+ output: getting an error about a URL that doesn't appear in your config
+ file is annoying.
+ """
+ if self.url == self.configured_url:
+ return "<%s>" % self.url
+ else:
+ return "<%s> (formerly <%s>)" % (self.url, self.configured_url)
+
+ def update(self):
+ """Download the feed to refresh the information.
+
+ This does the actual work of pulling down the feed and if it changes
+ updates the cached information about the feed and entries within it.
+ """
+ info = feedparser.parse(self.url,
+ etag=self.url_etag, modified=self.url_modified,
+ agent=self._planet.user_agent)
+ if info.has_key("status"):
+ self.url_status = str(info.status)
+ elif info.has_key("entries") and len(info.entries)>0:
+ self.url_status = str(200)
+ elif info.bozo and info.bozo_exception.__class__.__name__=='Timeout':
+ self.url_status = str(408)
+ else:
+ self.url_status = str(500)
+
+ if self.url_status == '301' and \
+ (info.has_key("entries") and len(info.entries)>0):
+ log.warning("Feed has moved from <%s> to <%s>", self.url, info.url)
+ try:
+ os.link(cache.filename(self._planet.cache_directory, self.url),
+ cache.filename(self._planet.cache_directory, info.url))
+ except:
+ pass
+ self.url = info.url
+ elif self.url_status == '304':
+ log.info("Feed %s unchanged", self.feed_information())
+ return
+ elif self.url_status == '410':
+ log.info("Feed %s gone", self.feed_information())
+ self.cache_write()
+ return
+ elif self.url_status == '408':
+ log.warning("Feed %s timed out", self.feed_information())
+ return
+ elif int(self.url_status) >= 400:
+ log.error("Error %s while updating feed %s",
+ self.url_status, self.feed_information())
+ return
+ else:
+ log.info("Updating feed %s", self.feed_information())
+
+ self.url_etag = info.has_key("etag") and info.etag or None
+ self.url_modified = info.has_key("modified") and info.modified or None
+ if self.url_etag is not None:
+ log.debug("E-Tag: %s", self.url_etag)
+ if self.url_modified is not None:
+ log.debug("Last Modified: %s",
+ time.strftime(TIMEFMT_ISO, self.url_modified))
+
+ self.update_info(info.feed)
+ self.update_entries(info.entries)
+ self.cache_write()
+
+ def update_info(self, feed):
+ """Update information from the feed.
+
+ This reads the feed information supplied by feedparser and updates
+ the cached information about the feed. These are the various
+ potentially interesting properties that you might care about.
+ """
+ for key in feed.keys():
+ if key in self.IGNORE_KEYS or key + "_parsed" in self.IGNORE_KEYS:
+ # Ignored fields
+ pass
+ elif feed.has_key(key + "_parsed"):
+ # Ignore unparsed date fields
+ pass
+ elif key.endswith("_detail"):
+ # retain name and email sub-fields
+ if feed[key].has_key('name') and feed[key].name:
+ self.set_as_string(key.replace("_detail","_name"), \
+ feed[key].name)
+ if feed[key].has_key('email') and feed[key].email:
+ self.set_as_string(key.replace("_detail","_email"), \
+ feed[key].email)
+ elif key == "items":
+ # Ignore items field
+ pass
+ elif key.endswith("_parsed"):
+ # Date fields
+ if feed[key] is not None:
+ self.set_as_date(key[:-len("_parsed")], feed[key])
+ elif key == "image":
+ # Image field: save all the information
+ if feed[key].has_key("url"):
+ self.set_as_string(key + "_url", feed[key].url)
+ if feed[key].has_key("link"):
+ self.set_as_string(key + "_link", feed[key].link)
+ if feed[key].has_key("title"):
+ self.set_as_string(key + "_title", feed[key].title)
+ if feed[key].has_key("width"):
+ self.set_as_string(key + "_width", str(feed[key].width))
+ if feed[key].has_key("height"):
+ self.set_as_string(key + "_height", str(feed[key].height))
+ elif isinstance(feed[key], (str, unicode)):
+ # String fields
+ try:
+ detail = key + '_detail'
+ if feed.has_key(detail) and feed[detail].has_key('type'):
+ if feed[detail].type == 'text/html':
+ feed[key] = sanitize.HTML(feed[key])
+ elif feed[detail].type == 'text/plain':
+ feed[key] = escape(feed[key])
+ self.set_as_string(key, feed[key])
+ except KeyboardInterrupt:
+ raise
+ except:
+ log.exception("Ignored '%s' of <%s>, unknown format",
+ key, self.url)
+
+ def update_entries(self, entries):
+ """Update entries from the feed.
+
+ This reads the entries supplied by feedparser and updates the
+ cached information about them. It's at this point we update
+ the 'updated' timestamp and keep the old one in 'last_updated';
+ these provide boundaries for acceptable entry times.
+
+ If this is the first time a feed has been updated then most of the
+ items will be marked as hidden, according to Planet.new_feed_items.
+
+ If the feed does not contain items which, according to the sort order,
+ should be there, those items are assumed to have been expired from
+ the feed or replaced and are removed from the cache.
+ """
+ if not len(entries):
+ return
+
+ self.last_updated = self.updated
+ self.updated = time.gmtime()
+
+ new_items = []
+ feed_items = []
+ for entry in entries:
+ # Try really hard to find some kind of unique identifier
+ if entry.has_key("id"):
+ entry_id = cache.utf8(entry.id)
+ elif entry.has_key("link"):
+ entry_id = cache.utf8(entry.link)
+ elif entry.has_key("title"):
+ entry_id = (self.url + "/"
+ + md5.new(cache.utf8(entry.title)).hexdigest())
+ elif entry.has_key("summary"):
+ entry_id = (self.url + "/"
+ + md5.new(cache.utf8(entry.summary)).hexdigest())
+ else:
+ log.error("Unable to find or generate id, entry ignored")
+ continue
+
+ # Create the item if necessary and update
+ if self.has_item(entry_id):
+ item = self._items[entry_id]
+ else:
+ item = NewsItem(self, entry_id)
+ self._items[entry_id] = item
+ new_items.append(item)
+ item.update(entry)
+ feed_items.append(entry_id)
+
+ # Hide excess items the first time through
+ if self.last_updated is None and self._planet.new_feed_items \
+ and len(feed_items) > self._planet.new_feed_items:
+ item.hidden = "yes"
+ log.debug("Marked <%s> as hidden (new feed)", entry_id)
+
+ # Assign order numbers in reverse
+ new_items.reverse()
+ for item in new_items:
+ item.order = self.next_order = str(int(self.next_order) + 1)
+
+ # Check for expired or replaced items
+ feed_count = len(feed_items)
+ log.debug("Items in Feed: %d", feed_count)
+ for item in self.items(sorted=1):
+ if feed_count < 1:
+ break
+ elif item.id in feed_items:
+ feed_count -= 1
+ elif item._channel.url_status != '226':
+ del self._items[item.id]
+ self._expired.append(item)
+ log.debug("Removed expired or replaced item <%s>", item.id)
+
+ def get_name(self, key):
+ """Return the key containing the name."""
+ for key in ("name", "title"):
+ if self.has_key(key) and self.key_type(key) != self.NULL:
+ return self.get_as_string(key)
+
+ return ""
+
+class NewsItem(cache.CachedInfo):
+ """An item of news.
+
+ This class represents a single item of news on a channel. They're
+ created by members of the Channel class and accessible through it.
+
+ Properties:
+ id Channel-unique identifier for this item.
+ id_hash Relatively short, printable cryptographic hash of id.
+ date Corrected UTC-Normalised update time, for sorting.
+ order Order in which items on the same date can be sorted.
+ hidden Item should be hidden (True if exists).
+
+ title One-line title (*).
+ link Link to the original format text (*).
+ summary Short first-page summary (*).
+ content Full HTML content.
+
+ modified Date the item claims to have been modified (*).
+ issued Date the item claims to have been issued (*).
+ created Date the item claims to have been created (*).
+ expired Date the item claims to expire (*).
+
+ author Name of the author (*).
+ publisher Name of the publisher (*).
+ category Category name (*).
+ comments Link to a page to enter comments (*).
+ license Link to the licence for the content (*).
+ source_name Name of the original source of this item (*).
+ source_link Link to the original source of this item (*).
+
+ Properties marked (*) will only be present if the original feed
+ contained them. Note that the various optional date fields are
+ simply claims made by the item and parsed from the information
+ given; 'date' is a far more reliable source of information.
+
+ Some feeds may define additional properties to those above.
+ """
+ IGNORE_KEYS = ("categories", "contributors", "enclosures", "links",
+ "guidislink", "date", "tags"<