Replace lede-project with openwrt spider
Fixes #113
Florian Preinstorfer committed Feb 15, 2018
1 parent 89e244e commit 561861c
Showing 4 changed files with 25 additions and 24 deletions.
2 changes: 1 addition & 1 deletion README.rst
@@ -35,7 +35,7 @@ Feeds is currently able to create Atom feeds for the following sites:
 * `falter.at <http://www.falter.at>`_: Newest articles and restaurant reviews
 * `HELP.gv.at <https://help.gv.at>`_: News and changes in Austrian law
 * `KONSUMENT.AT <http://www.konsument.at>`_: Newest articles
-* `lede-project.org <https://lede-project.org>`_: Newest LEDE releases
+* `openwrt.org <https://openwrt.org>`_: Newest OpenWRT releases
 * `LWN.net <https://lwn.net>`_: Newest articles; special treatment
   of Weekly Editions
 * `Oberösterreichische Nachrichten <https://www.nachrichten.at>`_:
16 changes: 0 additions & 16 deletions docs/spiders/lede-project.org.rst

This file was deleted.

16 changes: 16 additions & 0 deletions docs/spiders/openwrt.org.rst
@@ -0,0 +1,16 @@
+.. _spider_openwrt.org:
+
+openwrt.org
+-----------
+Newest releases from `OpenWRT <https://openwrt.org>`_.
+
+Configuration
+~~~~~~~~~~~~~
+Add ``openwrt.org`` to the list of spiders:
+
+.. code-block:: ini
+
+   # List of spiders to run by default, one per line.
+   spiders =
+     openwrt.org
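
Review note: the documented `spiders =` option is a multi-line INI value with one spider name per indented line. Below is a minimal sketch of how such a value can be read with Python's standard `configparser`. The `[feeds]` section name and the second entry `lwn.net` are assumptions made for illustration; neither appears in this commit, and this is not necessarily how Feeds itself loads its configuration.

```python
import configparser

# Hypothetical config text; the [feeds] section name and the lwn.net
# entry are assumptions, only the "spiders =" layout is from the docs.
CONFIG_TEXT = """
[feeds]
# List of spiders to run by default, one per line.
spiders =
  openwrt.org
  lwn.net
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Indented continuation lines are joined with newlines into one value,
# so splitting on whitespace yields one entry per spider name.
spiders = parser.get("feeds", "spiders").split()
print(spiders)
```

Because the value is split on whitespace, the exact indentation of each spider name does not matter as long as the line is indented at all.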
@@ -6,17 +6,17 @@
 from feeds.spiders import FeedsCrawlSpider
 
 
-class LedeProjectOrgSpider(FeedsCrawlSpider):
-    name = 'lede-project.org'
-    allowed_domains = ['lede-project.org']
-    start_urls = ['https://lede-project.org/releases/start']
+class OpenwrtOrgSpider(FeedsCrawlSpider):
+    name = 'openwrt.org'
+    allowed_domains = ['openwrt.org']
+    start_urls = ['https://openwrt.org/releases/start']
     rules = (
         Rule(LinkExtractor(
             allow=('releases/(.*)/start',)), callback='parse_release'),
     )
 
-    _title = 'New LEDE Release Builds',
-    _subtitle = 'Newest release builds from the LEDE project.'
+    _title = 'New OpenWRT Release Builds',
+    _subtitle = 'Newest release builds from OpenWRT.'
     _timezone = 'Europe/Berlin'
     _base_url = 'https://{}'.format(name)
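
Review note on the hunk above: Scrapy's `LinkExtractor` treats each `allow` entry as a regular expression searched against the absolute URL, so the rule can be sanity-checked with plain `re`. The sketch below does that for a few sample URLs. The versioned URL and the `toh` URL are illustrative assumptions, and `is_followed` is a hypothetical helper, not part of the spider. The last lines also show that the trailing comma on the `_title` line makes the value a one-element tuple rather than a string, which may not be what was intended.

```python
import re

# Same pattern as the spider's LinkExtractor(allow=...) entry.
ALLOW = re.compile(r'releases/(.*)/start')


def is_followed(url):
    """Hypothetical helper: True if the allow pattern matches the URL."""
    return ALLOW.search(url) is not None


# Illustrative URLs; only the start URL is taken from the commit.
release_page = 'https://openwrt.org/releases/17.01/start'
start_page = 'https://openwrt.org/releases/start'
other_page = 'https://openwrt.org/toh/start'

print(is_followed(release_page))  # True: "17.01" fills the (.*) group
print(is_followed(start_page))    # False: nothing between releases/ and start
print(is_followed(other_page))    # False: no "releases/" segment

# A trailing comma after a string literal creates a tuple.
title = 'New OpenWRT Release Builds',
print(type(title).__name__)       # tuple
```

Dropping the trailing comma would make `_title` a plain string, matching `_subtitle` on the next line.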

@@ -38,7 +38,8 @@ def parse_release_notes(self, response):
         )
         il.add_xpath('title', '//h1/text()')
         il.add_value('link', response.url)
-        il.add_xpath('updated', '//span[@class="docInfo"]/ul/li/span/text()')
+        il.add_xpath(
+            'updated', '//div[@class="docInfo"]', re='Last modified: (.*) by')
         il.add_value('content_html', '<h1>Release Notes</h1>')
         il.add_xpath('content_html', '//h1/following-sibling::*')
         yield scrapy.Request(
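
Review note on the `updated` change: the new call selects the whole `docInfo` element and lets the `re` argument of `add_xpath` cut the timestamp out of its text. The sketch below applies the same regular expression with plain `re` to show what the capture group returns. The sample footer string and the `extract_last_modified` helper are assumptions for illustration; the real page markup may differ.

```python
import re

# Same regular expression as the re= argument in the diff.
LAST_MODIFIED = re.compile(r'Last modified: (.*) by')


def extract_last_modified(text):
    """Hypothetical helper: return the captured timestamp, or None."""
    match = LAST_MODIFIED.search(text)
    return match.group(1) if match else None


# Assumed footer text, modelled on a typical wiki "docInfo" line.
sample = 'releases/17.01/start.txt · Last modified: 2018/01/18 13:07 by jow'

print(extract_last_modified(sample))             # 2018/01/18 13:07
print(extract_last_modified('no footer here'))   # None
```

Since `(.*)` is greedy, the capture runs up to the last ` by` in the text; that is harmless for a footer shaped like the sample, but it would over-capture if the text after the timestamp contained another ` by`.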
