
Initial import

bashu committed Sep 30, 2007
0 parents commit 526aec890a1712c5e607597c361370ff4fb43d66
Showing with 1,486 additions and 0 deletions.
  1. +2 −0 AUTHORS
  2. +340 −0 COPYING
  3. +35 −0 INSTALL
  4. +11 −0 NEWS
  5. +88 −0 README
  6. +4 −0 TODO
  7. +48 −0 arch.conf
  8. +147 −0 archmage
  9. BIN archmage.1.gz
  10. +414 −0 archmod/CHM.py
  11. +15 −0 archmod/__init__.py
  12. +16 −0 archmod/htmltotext.py
  13. +44 −0 archmod/mod_chm.py
  14. +72 −0 setup.py
  15. +171 −0 templates/arch_contents.html
  16. +2 −0 templates/arch_css.css
  17. +26 −0 templates/arch_frameset.html
  18. +12 −0 templates/arch_header.html
  19. BIN templates/icons/0.gif
  20. BIN templates/icons/1.gif
  21. BIN templates/icons/10.gif
  22. BIN templates/icons/11.gif
  23. BIN templates/icons/12.gif
  24. BIN templates/icons/13.gif
  25. BIN templates/icons/14.gif
  26. BIN templates/icons/15.gif
  27. BIN templates/icons/16.gif
  28. BIN templates/icons/17.gif
  29. BIN templates/icons/18.gif
  30. BIN templates/icons/19.gif
  31. BIN templates/icons/2.gif
  32. BIN templates/icons/20.gif
  33. BIN templates/icons/21.gif
  34. BIN templates/icons/22.gif
  35. BIN templates/icons/23.gif
  36. BIN templates/icons/24.gif
  37. BIN templates/icons/25.gif
  38. BIN templates/icons/26.gif
  39. BIN templates/icons/27.gif
  40. BIN templates/icons/3.gif
  41. BIN templates/icons/35.gif
  42. BIN templates/icons/37.gif
  43. BIN templates/icons/39.gif
  44. BIN templates/icons/4.gif
  45. BIN templates/icons/5.gif
  46. BIN templates/icons/6.gif
  47. BIN templates/icons/7.gif
  48. BIN templates/icons/8.gif
  49. BIN templates/icons/9.gif
  50. BIN templates/icons/90.gif
  51. BIN templates/icons/91.gif
  52. BIN templates/icons/92.gif
  53. BIN templates/icons/93.gif
  54. BIN templates/icons/94.gif
  55. BIN templates/icons/95.gif
  56. BIN templates/icons/96.gif
  57. BIN templates/icons/97.gif
  58. BIN templates/icons/98.gif
  59. BIN templates/icons/99.gif
  60. BIN templates/icons/next.gif
  61. BIN templates/icons/prev.gif
  62. +39 −0 templates/index.html
2 AUTHORS
@@ -0,0 +1,2 @@
Copyright (c) 2003 Eugeny Korekin <az@ftc.ru>
Copyright (c) 2005-2007 Basil Shubin <basil.shubin@gmail.com>
340 COPYING

Large diffs are not rendered by default.

35 INSTALL
@@ -0,0 +1,35 @@
Source Tarball
==============
First unpack the source tarball:
# tar xzvf archmage-0.1.9.tar.gz
change directory:
# cd archmage-0.1.9
to install arCHMage, run the following command:
# python setup.py install
Debian / Ubuntu
===============
You can install the prepackaged version from the archive:
# apt-get install archmage
if you want to use it with apache:
# apt-get install libapache-mod-python
or to use it with apache2:
# apt-get install libapache2-mod-python
To be able to dump HTML data from a CHM file as plain text:
# apt-get install lynx
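Since the dump feature depends on an external text browser, it can help to check for one before running a dump. The following is only a sketch, not part of arCHMage: the helper name `find_text_browser` is hypothetical, and it uses `shutil.which`, which exists in Python 3.3 and later rather than the Python 2 this release targets.

```python
import shutil


def find_text_browser(candidates=("lynx", "elinks")):
    """Return the full path of the first candidate found on PATH, or None.

    Hypothetical helper; arCHMage itself takes the command from arch.conf.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    browser = find_text_browser()
    if browser is None:
        print("Neither lynx nor elinks was found on PATH")
    else:
        print("Text browser available at", browser)
```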
11 NEWS
@@ -0,0 +1,11 @@
arCHMage 0.2
============
Bug fixes:
* [SF #1767529] IOError exception using CHM dump option
arCHMage 0.1.9
==============
Changes/New features:
* New option - 'dump' for dumping HTML data from CHM file as Plain Text
88 README
@@ -0,0 +1,88 @@
About arCHMage
==============
arCHMage is an extensible reader and decompiler for files in the CHM
format. This is the format used by Microsoft HTML Help, and is also known as
Compiled HTML.
arCHMage is written in the Python programming language and uses PyCHM, the
Python bindings for CHMLib from the GnoCHM project.
Originally this utility was written by Eugeny Korekin, but since 2005 it has
been maintained and developed by Basil Shubin.
Features List
=============
* Extracting CHM content
* Dumping HTML data from CHM file as plain text (using external tools)
* Running as standalone http-server
* Extension for Apache Web Server - mod_chm
System Requirements
===================
arCHMage requires the following libraries:
* Python 2.3 or later
* PyCHM
* CHMLib
Other (optional) dependencies:
* Lynx or ELinks - dumping HTML as plain text
* mod_python - Apache/Python Integration
Installation
============
See INSTALL file for more details.
Simple Usage HOWTO
==================
There are four ways to use the arCHMage package now:
1) Extract CHM file content into a directory (the directory will be created):
archmage -x <chmfile> <directory>
Note: Decompilation will fail if the destination directory already exists.
2) Dump HTML data from CHM file as plain text:
archmage -d <chmfile>
Note: All data is dumped to standard output. To use this feature you must
have the lynx or elinks text browser installed. See arch.conf for details.
3) Run as an http-server, which will publish the chm file contents on the specified port:
archmage -p <port> <chmfile>
Note: You can first decompress the chm file into a directory and use this
directory instead of the chm file, i.e.: archmage -p <port> <chmdir>
4) Configure your apache to publish chm file contents if there is a trailing
slash in the request to that file (you will need a working mod_python for that):
Add these lines to your httpd.conf:
AddHandler python-program .chm
PythonHandler archmod.mod_chm
Restart apache.
Suppose you have the file sample.chm in the DocumentRoot of your apache.
With this configuration you receive the raw chm file if you point your
browser to
http://yourserver/sample.chm
or you can view chm file on the fly if you point your browser to
http://yourserver/sample.chm/ (note trailing slash)
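The raw-versus-rendered behaviour in item 4 comes down to whether anything follows `.chm` in the request path. The sketch below illustrates that decision only. The contents of archmod/mod_chm.py are not shown on this page, so this is not its implementation, and the function name `classify_request` and its return shape are assumptions made for the example.

```python
def classify_request(path):
    """Classify a request path as a raw download or an on-the-fly view.

    Returns a tuple (kind, chm_path, inner_path), or None when the path does
    not refer to a .chm file. Hypothetical helper for illustration only.
    """
    marker = '.chm'
    lower = path.lower()
    idx = lower.find(marker + '/')
    if idx != -1:
        # Something (at least the slash) follows ".chm": view the contents.
        end = idx + len(marker)
        return ('view', path[:end], path[end:])
    if lower.endswith(marker):
        # No trailing slash: hand back the file itself.
        return ('raw', path, '')
    return None


if __name__ == '__main__':
    for url_path in ('/sample.chm', '/sample.chm/', '/sample.chm/intro.html'):
        print(url_path, '->', classify_request(url_path))
```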
4 TODO
@@ -0,0 +1,4 @@
* Update manpage
* CHM to PDF or ODT converter !!!
* PDF or ODT to CHM converter ???
* Any other ideas?
48 arch.conf
@@ -0,0 +1,48 @@
# Directory for templates. All files in that directory will be parsed
# and <%.+%> occurrences will be replaced with values from this
# file. For example, <%title%> will be substituted by the value of the
# title variable.
# There are also some special variables, which have default values:
# contents - a list which represents the chm file contents, and
# deftopic - the name of the default page.
templates_dir='/usr/share/archmage/templates/'
# Directory for icons
icons_dir='/usr/share/archmage/templates/icons/'
# List of auxiliary files stored inside the chm.
# These files will not be extracted.
auxes=('/#IDXHDR', '/#ITBITS', '/#STRINGS', '/#SYSTEM', '/#TOPICS',
'/#URLSTR', '/#URLTBL', '/#WINDOWS', '/$FIftiMain', '/$OBJINST',
'/$WWAssociativeLinks', '/$WWKeywordLinks', ':')
# Title. This is the value you want to see in the browser title.
# 'sourcename' is the name of the source file.
from os.path import basename
title=basename(sourcename)
# Background and foreground colors for header.
bcolor='#63baff'
fcolor='white'
# Filenames inside the chm are stored in utf-8, but links can be in some
# national codepage. If you set fs_encoding, such links will be
# converted to it.
fs_encoding='utf-8'
# If your filesystem is case-sensitive, links in the html can point to
# files whose names differ only in case. Set filename_case to 1 in
# that case :-)
filename_case=1
# If you want to add javascript code that restores framing to every
# page, set restore_framing.
restore_framing=1
# CHM2TEXT conversion. Use the following command to convert CHM content to
# plain text and dump the results to stdout. Make sure that the apps below
# are installed on your PC and are accessible through $PATH
#htmltotext='lynx -dump -stdin'
htmltotext='elinks -dump'
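Because arch.conf contains Python statements and refers to a `sourcename` variable, it appears to be evaluated as Python with that name already defined, after which the resulting values feed the `<%name%>` template substitution. The sketch below shows one way this could work. It is an illustration under those assumptions, not the code in archmod/CHM.py: `CONF_TEXT`, `load_config`, and `render` are hypothetical names, and the pattern matches simple word-like names rather than the broader `<%.+%>` mentioned in the comments.

```python
import re

# Hypothetical stand-in for a fragment of arch.conf.
CONF_TEXT = """
from os.path import basename
title = basename(sourcename)
bcolor = '#63baff'
fcolor = 'white'
"""


def load_config(conf_text, sourcename):
    """Run config text as Python with `sourcename` predefined.

    Returns the resulting names as a dict, skipping dunder entries such as
    the __builtins__ key that exec() adds.
    """
    namespace = {'sourcename': sourcename}
    exec(conf_text, namespace)
    return dict((k, v) for k, v in namespace.items() if not k.startswith('__'))


def render(template, values):
    """Replace each <%name%> with str(values[name]); leave unknown names as-is."""
    def substitute(match):
        name = match.group(1)
        if name in values:
            return str(values[name])
        return match.group(0)
    return re.sub(r'<%(\w+)%>', substitute, template)


config = load_config(CONF_TEXT, '/tmp/docs/manual.chm')
page = render('<title><%title%></title><body bgcolor="<%bcolor%>"> <%unknown%>', config)

if __name__ == '__main__':
    print(page)
```

Running a config file as code is convenient but means the file must be trusted, since anything in it gets executed.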
147 archmage
@@ -0,0 +1,147 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# arCHMage -- extensible reader and decompiler for files in the CHM format.
#
# Originally written by Eugeny Korekin <aaaz@users.sourceforge.net>
# Significantly modified by Basil Shubin <bashu@users.sourceforge.net>
#
# Copyright (c) 2003 Eugeny Korekin <aaaz@users.sourceforge.net>
# Copyright (c) 2005-2007 Basil Shubin <bashu@users.sourceforge.net>
"""arCHMage -- extensible reader and decompiler for files in the CHM format.

Usage: %(program)s [options] <chmfile> [destdir]
Where:

    -x / --extract
        Extracts the CHM file into the specified directory. If the
        destination directory is omitted, then a new one will be created
        based on the name of the CHM file. This option is the default.

    -p number
    --port=number
        Acts as an HTTP server on the specified port number, so you can read
        the CHM file with your favourite browser. You can also specify a
        directory with decompressed content.

    -d / --dump
        Dump HTML data as plain text to standard output.

    -V / --version
        Print version number and exit.

    -h / --help
        Print this text and exit.
"""
import os
import sys
import getopt
from archmod import __version__, message, error_msg
from archmod.CHM import *
program = sys.argv[0]
EXTRACT = 1
HTTPSERVER = 2
DUMPHTML = 3
COMMASPACE = ', '
def usage(code=0, msg=''):
    message(code, __doc__ % globals())
    message(code, msg)
    sys.exit(code)

def file2dir(filename):
    """ Convert filename.chm to filename_html """
    dirname = filename.rsplit('.', 1)[0] + '_' + 'html'
    return dirname
def parseargs():
    try:
        opts, args = getopt.getopt(sys.argv[1:], 'xdp:Vh',
                                   ['extract', 'dump', 'port=', 'version', 'help'])
    except getopt.error, msg:
        usage(1, msg)

    class Options:
        mode = None        # EXTRACT or HTTPSERVER or other
        port = None        # HTTP port number
        chmfile = None     # CHM File to view/extract
        dirname = None     # Destination directory

    options = Options()

    for opt, arg in opts:
        if opt in ('-h', '--help'):
            usage()
        elif opt in ('-V', '--version'):
            message(0, __version__)
            sys.exit(0)
        elif opt in ('-p', '--port'):
            if options.mode is not None:
                usage(1, '-x and -p are mutually exclusive')
            options.mode = HTTPSERVER
            try:
                options.port = int(arg)
            except ValueError, msg:
                usage(1, 'Invalid port number: %s' % msg)
        elif opt in ('-x', '--extract'):
            if options.mode is not None:
                usage(1, '-x and -p are mutually exclusive')
            options.mode = EXTRACT
        elif opt in ('-d', '--dump'):
            if options.mode is not None:
                usage(1, '-d should be used without any other options')
            options.mode = DUMPHTML
        else:
            assert False, (opt, arg)

    # Sanity checks
    if options.mode is None:
        options.mode = EXTRACT

    if not args:
        usage(1, 'No CHM file was specified!')
    else:
        options.chmfile = args.pop(0)

    # CHM content should be extracted
    if options.mode == EXTRACT:
        if not args:
            options.dirname = file2dir(options.chmfile)
        else:
            options.dirname = args.pop(0)

    # Any other arguments are invalid
    if args:
        usage(1, 'Invalid arguments: ' + COMMASPACE.join(args))

    return options
def main():
    options = parseargs()
    if not os.path.exists(options.chmfile):
        error_msg('No such file: %s' % options.chmfile)

    # Check whether the argument is a CHM file or a directory with
    # decompressed content. Depending on the result, make 'source' an
    # instance of the CHMFile or CHMDir class.
    source = os.path.isfile(options.chmfile) and \
        CHMFile(options.chmfile) or CHMDir(options.chmfile)

    if options.mode == HTTPSERVER:
        CHMServer(source, port=options.port).run()
    elif options.mode == DUMPHTML:
        source.dump_html()
    else:
        source.extract(options.dirname)

if __name__ == '__main__':
    main()
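When no destination directory is given, the script derives one from the CHM file name with `file2dir`. The sketch below repeats that one-line rule so its behaviour can be tried in isolation, and compares it with an `os.path.splitext` variant. The variant, named `file2dir_splitext` here, is a hypothetical alternative and not part of this commit; it differs only when the last dot sits in a directory name rather than in the file name.

```python
import os.path


def file2dir(filename):
    """Same rule as in the script: cut at the last dot, append '_html'."""
    return filename.rsplit('.', 1)[0] + '_' + 'html'


def file2dir_splitext(filename):
    """Hypothetical variant: strip only a real file extension."""
    return os.path.splitext(filename)[0] + '_html'


if __name__ == '__main__':
    for name in ('manual.chm', 'docs/v1.2/manual.chm', 'docs.d/manual'):
        print(name, '->', file2dir(name), '|', file2dir_splitext(name))
```

For `docs.d/manual` the original rule cuts at the dot in the directory name and yields `docs_html`, while the `splitext` variant yields `docs.d/manual_html`. For ordinary `name.chm` inputs the two agree.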
BIN +1.31 KB archmage.1.gz
Binary file not shown.