Full-Text RSS

NOTE

This is our public version of Full-Text RSS, available to download for free from http://code.fivefilters.org.

This version has been enhanced by Pulipuli Chen with:

  • Multi-page integration
  • Partial Traditional Chinese support in the interface

For best extraction results, and to help us sustain the project, you can purchase the most up-to-date version at http://fivefilters.org/content-only/#download. If you like this free version, please consider supporting us by purchasing the latest release.

If you have no need for the latest release, but would still like to contribute something, you can donate via Gittip or Flattr.

ERROR REPORT

If extraction consistently fails for an RSS feed, please report the feed's URL in a GitHub Issue (free registration required).

About

See http://fivefilters.org/content-only/ for a description of the code.

Installation

  1. Extract the files in this ZIP archive to a folder on your computer.

  2. Upload the files to your server via FTP.

  3. Access index.php through your browser, e.g. http://example.org/full-text-rss/index.php

  4. Enter a URL in the form field to test the code.

  5. If you get an RSS feed with full-text content, all is working well. :)

Configuration (optional)

  1. Save a copy of config.php as custom_config.php and edit custom_config.php

  2. If you decide to enable caching, make sure the cache folder (and its 2 subfolders) is writable. (You might need to change the permissions of these folders to 777 through your FTP client.)

  3. If extraction keeps failing, try increasing "max_execution_time", "max_input_time", and "memory_limit" in your server's php.ini, then restart the server.

Site-specific extraction rules

This free version does not contain the site config files we include with purchased copies, but these are now all available online. If you'd like to keep yours up to date using Git, follow the steps below:

  1. Change into the site_config/standard/ folder
  2. Delete everything in there
  3. Using the command line, enter: git clone https://github.com/pulipulichen/full-text-rss.git .
  4. Git should now download the latest site config files for you.
  5. To update the site config files again, you can simply run git pull from the directory.

Code example

If you're developing an application which requires content extraction, you can call Full-Text RSS as a web service from within your application. Here's how to do it in PHP:

<?php
// $ftr should be the URL where you installed this application
$ftr = 'http://example.org/full-text-rss/';
$article = 'http://www.bbc.co.uk/news/world-europe-21936308';

$request = $ftr.'makefulltextfeed.php?format=json&url='.urlencode($article);

// Send HTTP request and get response
$result = @file_get_contents($request);

if (!$result) die('Failed to fetch content');

$json = @json_decode($result);

if (!$json) die('Failed to parse JSON');

// What do we have?
// var_dump($json);

// Items?
// var_dump($json->rss->channel->item);

$title = $json->rss->channel->item->title;
// Note: this works when you're processing an article.
// If the input URL is a feed, ->item will be an array.

echo $title;

Different language?

Although we don't have examples in other programming languages, the essential steps should be:

  1. Construct the request URL using the URL where you installed Full-Text RSS and the article or feed URL (see $ftr, $article, $request in the example above).

  2. Fetch the resulting URL using an HTTP GET request.

  3. Parse the HTTP response body as JSON and grab what you need.
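
As an illustration only, the three steps above might look like this in Python. This is a minimal sketch, not an official client: FTR_BASE and the function names build_request_url, extract_titles, and fetch_titles are hypothetical, and the JSON layout (rss, channel, item, title) is assumed to match what the PHP example reads.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed install location, mirroring $ftr in the PHP example.
FTR_BASE = "http://example.org/full-text-rss/"


def build_request_url(ftr_base, target_url):
    """Step 1: build the makefulltextfeed.php URL with format=json."""
    query = urlencode({"format": "json", "url": target_url})
    return ftr_base.rstrip("/") + "/makefulltextfeed.php?" + query


def extract_titles(payload):
    """Step 3: return the item titles from the parsed JSON.

    "item" is a single object for an article and a list for a feed,
    so both shapes are normalised to a list here.
    """
    items = payload["rss"]["channel"]["item"]
    if isinstance(items, dict):
        items = [items]
    return [item["title"] for item in items]


def fetch_titles(ftr_base, target_url, timeout=30):
    """Steps 1 to 3 together; this performs a real HTTP GET (step 2)."""
    request_url = build_request_url(ftr_base, target_url)
    with urlopen(request_url, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return extract_titles(payload)
```

Calling fetch_titles(FTR_BASE, article_url) against a working installation should return a list of titles: one entry for a single article, several for a feed. Returning a list in both cases sidesteps the single-item versus array difference noted in the PHP example.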

MEMO

  • Default URL used for detection: /config.php
  • Next-page settings: /config.php
  • Selection scope settings: /libraries/content-extractor/SiteConfig.php

OpenVZ Template

debian-7-full-text-rss_i386.tar.gz https://drive.google.com/open?id=0B5UXWzdIPpm0VzJyV29GRklubkE

ATOM to RSS

Install the PHP XSL extension: apt-get install php5-xsl -y

Endpoint: /atom2rss/?atom=[URL]
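
A small sketch of filling in the /atom2rss/?atom=[URL] pattern above from Python. The function name build_atom2rss_url is hypothetical, and two things are assumed rather than confirmed by this README: that atom2rss sits under the same install URL as the rest of the application, and that it accepts a percent-encoded feed address in the atom parameter.

```python
from urllib.parse import quote


def build_atom2rss_url(ftr_base, atom_url):
    """Return the atom2rss URL with the Atom feed address percent-encoded."""
    # safe="" also encodes "/" so the feed URL stays a single query value.
    return ftr_base.rstrip("/") + "/atom2rss/?atom=" + quote(atom_url, safe="")
```

Encoding the feed address keeps any "?" or "&" inside it from being read as part of the atom2rss query string.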
