preparing 0.3.4

commit 87ee49bec81fa783b6cc78c63ed6f75921236456 1 parent 807960c
@invisiblellama authored
11 History.txt
@@ -1,3 +1,14 @@
+== 0.3.4 / 2009-07-17
+
+* Bug fixes
+
+ * Pre- and post processing filters moved to separate modules.
+ * Non-conformant element IDs are now fixed automaticly
+ * Regardless of the source settings, doctype now is always set to XHTML 1.0 Transitional
+ * -F (disable fixups) option removed, fixups are always on
+ * Documentation updates
+ * More tests
+
== 0.3.3 / 2009-07-05
* New features
6 README.rdoc
@@ -29,7 +29,7 @@ Few examples:
* Project Gutenberg's The Adventures Of Sherlock Holmes (with proper table of contents)
- repub -x 'title:div[@class='book']//h1' \
+ repub -x 'title:div[@class="book"]//h1' \
-x 'toc://table' \
-x 'toc_item://tr' \
http://www.gutenberg.org/dirs/etext99/advsh12h.htm
@@ -38,7 +38,7 @@ This tells Repub to look for title in the first found H1 in the DIV of class "bo
located in the first TABLE and TOC item can be found inside TR.
The above will produce readable ePub which can be further enhanced by removing some "noise" content:
- repub -x 'title:div[@class='book']//h1' \
+ repub -x 'title:div[@class="book"]//h1' \
-x 'toc://table' \
-x 'toc_item://tr' \
-X '//pre' -X '//hr' -X '//body/h1' -X '//body/h2' \
@@ -150,7 +150,7 @@ Currently, only "everything-on-one-page" HTML sources are supported. Repub will
Encoding auto-detection is slow.
-Chardet 0.9.0 is broken under Ruby 1.9.
+Chardet 0.9.0 is broken under Ruby 1.9 so if you want to use Ruby 1.9 you have to set encoding manually with -e.
Bugs: probably. If you find any, please report them to dg at invisiblellama dot net.
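
The quoting change in the README examples above is easier to see by looking at how a POSIX-style shell tokenizes the two variants. The following is a minimal sketch that uses Ruby's standard `Shellwords` module as a stand-in for the shell; it does not call Repub itself, and the variable names are illustrative only.

```ruby
require 'shellwords'

# Old README form: single quotes nested inside single quotes.
# The shell closes the quoted string at the second quote, so the
# quotes around "book" are consumed and never reach the program.
old_arg = %q{'title:div[@class='book']//h1'}

# New README form: double quotes inside single quotes survive intact.
new_arg = %q{'title:div[@class="book"]//h1'}

old_tokens = Shellwords.split(old_arg)
new_tokens = Shellwords.split(new_arg)

puts old_tokens.first  # the XPath predicate has lost its quotes
puts new_tokens.first  # the XPath predicate keeps class="book"
```

With the old form the selector arrives as `@class=book`, which XPath reads as a comparison against a child element named `book` rather than against the string "book".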
2  TODO
@@ -1,3 +1 @@
-* add support for rx cleaning/modifying source doc
-* make -q/-v actually do something
more parser tokens: author(s) etc ?
2  lib/repub.rb
@@ -1,7 +1,7 @@
module Repub
# :stopdoc:
- VERSION = '0.3.3'
+ VERSION = '0.3.4'
LIBPATH = File.expand_path(File.dirname(__FILE__)) + File::SEPARATOR
PATH = File.dirname(LIBPATH) + File::SEPARATOR
# :startdoc:
18 lib/repub/app/pre_filters.rb
@@ -21,15 +21,25 @@ class PreFilters
s
end
- # Find and fix all elements with id or name attributes beginning with digit
- # ADE wont follow links referencing such ids
+ # Convert line endings to LF
+ #
+ filter :fix_line_endings do |s|
+ s.gsub(/\r\n/, "\n")
+ end
+
+ # Fix all elements with broken id attribute
+ # In XHTML id must match [A-Za-z][A-Za-z0-9:_.-]*
+ # TODO: currently only testing for non-alpha first char...
#
filter :fix_ids do |s|
- match = s.scan(/\s+(?:id|name)\s*?=\s*?['"](\d+[^'"]*)['"]/im)
+ match = s.scan(/\s+((?:id|name)\s*?=\s*?['"])(\d+[^'"]*)['"]/im)
unless match.empty?
log.debug "-- Fixing broken element IDs"
match.each do |m|
- s.gsub!(m[0], "x#{m[0]}")
+ # fix id so it starts with alpha char
+ s.gsub!(m.join(''), m.join('x'))
+ # update fragment references
+ s.gsub!(/##{m[1]}(['"])/, "#x#{m[1]}\\1")
end
end
s
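
For readers who want to try the ID fixup outside of Repub, here is a standalone sketch of the same idea as the `fix_ids` hunk above: prefix digit-leading `id`/`name` values and update the fragment references that point at them. The method name, the `prefix` parameter, and the `XHTML_ID` constant are hypothetical and are not part of Repub's filter API. Unlike the hunk above, this sketch passes each value through `Regexp.escape` before building a pattern, on the assumption that IDs may contain characters such as `.` that are special in a regular expression.

```ruby
# Pattern quoted in the commit's comment for a conformant XHTML id.
XHTML_ID = /\A[A-Za-z][A-Za-z0-9:_.-]*\z/

# Hypothetical helper: returns a copy of +html+ with digit-leading
# id/name values prefixed, and matching "#fragment" references updated.
def fix_ids(html, prefix = 'x')
  out = html.dup
  # Collect the distinct id/name values that start with a digit.
  bad = out.scan(/\s(?:id|name)\s*=\s*['"](\d[^'"]*)['"]/i).flatten.uniq
  bad.each do |id|
    escaped = Regexp.escape(id)
    # Rewrite the attribute itself, keeping the original quote style.
    attr_re = Regexp.new(%q{(\s(?:id|name)\s*=\s*['"])} + escaped + %q{(['"])}, Regexp::IGNORECASE)
    out.gsub!(attr_re) { $1 + prefix + id + $2 }
    # Rewrite fragment references such as href="#1st".
    frag_re = Regexp.new('#' + escaped + %q{(['"])})
    out.gsub!(frag_re) { '#' + prefix + id + $1 }
  end
  out
end

sample = %q{<a name="1st">top</a> <a href="#1st">go</a> <p id='ok'>text</p>}
fixed  = fix_ids(sample)
puts fixed
```

As in the commit's own TODO, only a non-alphabetic first character is handled here; values that break the pattern in other ways are left untouched.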