Commit
0 parents
commit 37af16c
Showing
108 changed files
with
22,677 additions
and
0 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
CATDOC CODING STANDARD | ||
~~~~~~~~~~~~~~~~~~~~~~ | ||
0. CATDOC ISN'T WRITTEN IN C++!!! | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
C and C++ are different languages. | ||
No // comments, no references, no declarations in the middle of a block. | ||
|
||
1. Catdoc is a portable program. | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
Please never make the following assumptions: | ||
1. That int is more than 16 bits wide | ||
(consequently, that signed int can hold a Unicode character) | ||
2. That sizeof(int)>=sizeof(int *) | ||
3. That int is always 16-bit (it can be 32-bit as well) | ||
4. That long is 32-bit | ||
5. That char (and int and short as well) is either signed or unsigned. | ||
Always use an explicit signedness specifier | ||
6. That integer arithmetic is 32 bits wide. | ||
7. That input is always seekable. Catdoc is often used as a filter | ||
8. That filenames are either case-sensitive or case-insensitive | ||
9. That there is no difference between binary and text file opening modes | ||
10. That opening a file in text mode will do something reasonable. | ||
Always open files in binary mode. This is the only way to produce | ||
results that are consistent on all platforms. | ||
11. That you can rely on the compiler's POSIX or C99 compliance. If you need | ||
to use some function defined by these standards, write a configure test | ||
and provide a fallback. | ||
12. That you can allocate a chunk of memory larger than 64K. | ||
13. That filenames can be longer than 8+3. | ||
|
||
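A minimal sketch of how assumptions 1, 5 and 6 above can be avoided when combining two bytes into one 16-bit value. The name get_u16le is hypothetical and is not taken from the catdoc sources. The bytes are declared unsigned explicitly, and each one is widened to unsigned long before shifting, so the result does not depend on int being wider than 16 bits.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Combines two bytes stored in little-endian order into one 16-bit
 * value (hypothetical helper, for illustration only).
 * buf - must point to at least two readable bytes
 */
unsigned long get_u16le(const unsigned char *buf) {
	assert(buf != NULL);
	/* Widen first: shifting a 16-bit int left by 8 could overflow */
	return ((unsigned long)buf[1] << 8) | (unsigned long)buf[0];
}
```

For example, the two bytes 0xAD 0x00 yield 0x00AD, the Unicode soft hyphen. The result is kept in an unsigned long because a 16-bit signed int cannot hold every 16-bit code.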
2. Catdoc is used world-wide | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
1. Never write comments in languages other than English. | ||
2. Never assume that you can output a character without passing it through | ||
the convert_char function. | ||
|
||
3. Code formatting | ||
~~~~~~~~~~~~~~~~~ | ||
1. Use <Tab> for indentation. If your text editor insists on <Tab> being | ||
8 chars wide, consider using some other editor. vim is at least a bit more | ||
portable than catdoc. | ||
2. Put the opening curly bracket on the same line as the statement it belongs to: | ||
if (condition) { | ||
code | ||
} | ||
rather than | ||
if (condition) | ||
{ | ||
code | ||
} | ||
|
||
3. The only exception to rule 2 is blocks in a switch statement: | ||
switch (var) { | ||
case value: | ||
{ | ||
code | ||
} | ||
} | ||
rather than | ||
switch (var) { | ||
case value: { | ||
code | ||
} | ||
} | ||
|
||
4. Write comments at the start of each function describing its purpose | ||
and arguments. | ||
|
||
5. If you use some potentially dangerous construct, such as sprintf on | ||
a static buffer, comment why it is safe in this particular case. | ||
|
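Rules 4 and 5 of the formatting section might look like this in practice. This is only a sketch: format_code and CODE_TEXT_SIZE are hypothetical names, not part of catdoc. It shows a function header comment describing purpose and arguments, and a comment explaining why sprintf into a fixed-size buffer cannot overflow here.

```c
#include <assert.h>
#include <stdio.h>

#define CODE_TEXT_SIZE 8

/*
 * Writes a 16-bit code as text of the form "U+XXXX" (hypothetical helper).
 * out  - caller's buffer of at least CODE_TEXT_SIZE bytes
 * code - value to format; only the low 16 bits are used
 */
void format_code(char *out, unsigned long code) {
	assert(out != NULL);
	/*
	 * sprintf is safe here: the value is masked to 16 bits, so %04lX
	 * always prints exactly 4 digits. "U+", 4 digits and the NUL
	 * take 7 bytes, which fits into CODE_TEXT_SIZE.
	 */
	sprintf(out, "U+%04lX", code & 0xFFFFUL);
}
```

The safety argument lives next to the call it justifies, so a later audit can check the arithmetic without reading the whole file.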
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
Note: people listed in this file are listed in arbitrary order. | ||
Kawai Takanori (Hippo2000) kwitknr@cpan.org | ||
Author of the Perl module Spreadsheet::ParseExcel, which I use as | ||
a reference manual for the Excel format | ||
Alex Ott <ott@jet.msk.su> | ||
Fixed handling of long SST, contributed handling of RK records, | ||
wrote RTF and OLE parsers | ||
Pawel Wiecek <coven@debian.org> | ||
Current maintainer of the Debian catdoc package | ||
Peter Novodvosky <nidd@debian.org> | ||
Maintained the Debian package for catdoc. | ||
Bjorn Brenander <bjorn@debian.org> | ||
Maintained the Debian package for catdoc. | ||
Eugene B. Byrganov <E.B.Byrganov@inp.nsk.su> | ||
Suggested the -l switch, found me an example of a partly 8-bit/partly | ||
16-bit file and some typos in the builtin docs. Fixed some long-standing | ||
bugs in the config-parsing code. | ||
Artem Chuprina <ran@ran.pp.ru> | ||
Provided a lot of bugfixes and suggestions. Also maintained some | ||
unofficial packaged versions of catdoc. | ||
Stephen Farrell <stephen@farrell.org> | ||
Maintains the FreeBSD port, and persuaded me to write the autoconf | ||
configuration | ||
Martin Kraemer <martin.kraemer@mch.sni.de> | ||
Contributed some fixes for ascii.rpl and noted a typo in catdoc.h | ||
Arfst Ludwig <Arfst.Ludwig@LHSystems.COM> | ||
Gave me the idea of creating README.charset | ||
Dmitry Potapov <dpotapov@capitalsoft.com> | ||
Contributed rtf-parsing code | ||
David Rysdam | ||
Wrote the program biffview, which parses XLS files and was used as the base | ||
for xls2csv. | ||
Duncan Simpson <dps@io.stargate.co.uk> | ||
Audited catdoc code for possible buffer overruns (and found many more | ||
of them than actually existed) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,71 @@ | ||
INSTALLING catdoc 0.91.x | ||
|
||
Starting with patchlevel alpha 3, catdoc version 0.90 has autoconf | ||
configuration. Thanks to Stephen Farrell for convincing me. | ||
|
||
So typically you should run | ||
./configure | ||
make | ||
make install | ||
|
||
to compile and install catdoc. | ||
|
||
NOTE for HPUX users. If you want to compile catdoc with aCC, | ||
use CC="aCC -Ae" ./configure | ||
|
||
The configure script for catdoc recognizes the following options (apart from | ||
the standard --prefix, --exec-prefix and so on) | ||
|
||
--disable-wordview - disables building of the Tcl/Tk viewer wordview, | ||
which requires X11. (Note that it is disabled automatically | ||
if you don't have an appropriate version of Tcl/Tk.) You may | ||
wish to use this if you don't have X installed. | ||
|
||
--with-wish=path - specifies the path to the wish interpreter. This option has | ||
two uses: | ||
1. If the executable named wish found in your PATH is old, and | ||
you have a newer wish installed as wish4.2 or wish8.0, | ||
you should specify this in order to build the wordview viewer | ||
2. If you are compiling catdoc from a telnet connection or | ||
a text console, you can specify this option to skip the Tcl | ||
version check, which would run wish and fail if it couldn't | ||
find an X display (which would lead configure to assume that | ||
you don't have a good wish) | ||
|
||
--with-input=charset | ||
--with-output=charset | ||
Allow you to specify the charset names to expect in an 8-bit Word | ||
file and to produce in the output text file. Run ls ./charsets/*.txt | ||
to find out which charsets are provided in the distribution. | ||
Additional charsets can be obtained from | ||
ftp.unicode.org | ||
Note that make will fail if you specify a charset which | ||
doesn't exist in the charsets directory. | ||
|
||
--disable-charset-check | ||
By default, make in the charsets directory fails if it is unable | ||
to find the *.txt files corresponding to the default input and output | ||
charsets. This option allows you to disable this check. Make | ||
in the charsets directory will then always succeed, but it is your | ||
responsibility to provide charset files in the catdoc library | ||
directory after make install. | ||
--disable-langinfo | ||
By default, catdoc tries to use your current locale charset | ||
as its output charset. It can, of course, always be overridden | ||
by a command line switch. But the charset from the locale takes | ||
precedence over the charset in the configuration file, unless | ||
you put use_locale=no into this file. | ||
|
||
If your C library is not XPG4-compatible, and configure fails | ||
to detect it, you can completely disable langinfo support | ||
using this switch. | ||
|
||
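The precedence described for the output charset (command line first, then locale, then configuration file) can be sketched as follows. This is an illustration only: choose_output_charset and its parameters are hypothetical names, and the compiled-in fallback is an assumption based on the --with-output option, not a description of catdoc's actual code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Picks the output charset name (hypothetical helper).
 * cmdline    - charset given by a command line switch, or NULL
 * locale     - charset of the current locale, or NULL if unknown
 * config     - charset from the configuration file, or NULL
 * use_locale - 0 if the configuration file says use_locale=no
 * fallback   - compiled-in default (assumed), must not be NULL
 */
const char *choose_output_charset(const char *cmdline, const char *locale,
		const char *config, int use_locale, const char *fallback) {
	assert(fallback != NULL);
	if (cmdline != NULL) {
		return cmdline;
	}
	if (use_locale && locale != NULL) {
		return locale;
	}
	if (config != NULL) {
		return config;
	}
	return fallback;
}
```

With --disable-langinfo the locale argument would simply never be available, which in this sketch corresponds to passing NULL for it.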
If you experience strange and unexpected behavior of catdoc, try to | ||
remove the optimization flag (-O2) from FLAGS in src/Makefile. | ||
If you can write an autoconf test to check for this problem, please send it | ||
to me. | ||
|
||
It was a known problem with version 0.35 on HP/UX 9, and I have scarcely changed | ||
my style of writing since. | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
INSTALLING catdoc 0.90a on MS-DOS system. | ||
|
||
Surprisingly, MS-DOS is the native platform for this version of catdoc. | ||
Unlike the previous version, which was a UNIX program ported to | ||
DOS, this one was developed under DOS on a nine-year-old 286 laptop | ||
with Turbo C 2.0. | ||
|
||
So, catdoc works perfectly well on MS-DOS systems. | ||
|
||
Documentation can be found in files CATDOC.TXT and CATDOC.PS | ||
(both produced by UNIX man command) | ||
|
||
If you've fetched the BINARY DISTRIBUTION, note the following: | ||
|
||
1. catdoc expects to find its system-wide configuration file | ||
in the same directory as the executable (and therefore requires DOS | ||
version 3 or above). If you wish to move the charset and special char | ||
maps to a location other than the default (the charsets subdirectory of | ||
the directory containing the executable), you must have this configuration | ||
file. | ||
|
||
2. Any file name in the configuration file can contain a %s escape, which | ||
is replaced by the directory of the executable. | ||
|
||
3. All configuration files can use either DOS or UNIX end-of-line | ||
convention. | ||
|
||
4. Per-user configuration probably won't work. But try to define | ||
the environment variable HOME and put the catdoc.rc file in the directory | ||
it points to. | ||
|
||
5. Catdoc uses the DOS country information, as specified by the COUNTRY statement | ||
in your configuration file, to determine the output encoding. These | ||
settings take priority over settings in configuration files (either | ||
per-user or system-wide). If this is not what you want, set | ||
use_locale = no in the configuration file. | ||
|
||
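The %s escape from item 2 above could be expanded along these lines. This is a sketch under stated assumptions: expand_exe_dir is a hypothetical name, only the first %s is replaced, and the executable's directory is assumed to be passed without a trailing path separator. The buffer size is checked before anything is copied.

```c
#include <assert.h>
#include <string.h>

/*
 * Replaces the first "%s" in tmpl with exe_dir (hypothetical helper).
 * out      - destination buffer
 * out_size - size of out in bytes
 * tmpl     - file name from the configuration file
 * exe_dir  - directory of the executable
 * Returns 1 on success, 0 if the result would not fit into out.
 */
int expand_exe_dir(char *out, size_t out_size, const char *tmpl,
		const char *exe_dir) {
	const char *mark;
	size_t head, dir_len, tail;

	assert(out != NULL && tmpl != NULL && exe_dir != NULL);
	mark = strstr(tmpl, "%s");
	if (mark == NULL) {
		if (strlen(tmpl) + 1 > out_size) {
			return 0;
		}
		/* strcpy is safe: the length was checked just above */
		strcpy(out, tmpl);
		return 1;
	}
	head = (size_t)(mark - tmpl);
	dir_len = strlen(exe_dir);
	tail = strlen(mark + 2);
	if (head + dir_len + tail + 1 > out_size) {
		return 0;
	}
	memcpy(out, tmpl, head);
	memcpy(out + head, exe_dir, dir_len);
	/* strcpy is safe: total length including the NUL was checked above */
	strcpy(out + head + dir_len, mark + 2);
	return 1;
}
```

With exe_dir set to C:\CATDOC, the template %s\charsets would expand to C:\CATDOC\charsets.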
If you insist on COMPILING catdoc YOURSELF: | ||
Please note that catdoc was compiled under DOS using Turbo C 2.01, | ||
downloaded from http://community.borland.com/museum. You can get the | ||
same one. | ||
|
||
I've made some attempts to compile catdoc with Watcom C (16-bit), | ||
but haven't completely succeeded. If you do, let me know. | ||
|
||
1. With a 16-bit compiler, use the COMPACT memory model. | ||
If you are using Turbo C, make -fmakefile.tc in the src directory | ||
should be enough. If you have to change anything in | ||
makefile.tc, please let me know. | ||
|
||
2. If you are using a compiler other than Turbo C/Borland C or | ||
Watcom, you should take a look at the fileutil.c file and possibly | ||
add a couple of #ifdefs there. If you succeed with it, send me a | ||
patch (or the entire modified file, if you don't know how to make | ||
a good unix-like patch). | ||
|
||
|
||
3. With a 32-bit compiler you are on your own. I don't think that | ||
small utilities like catdoc should require an extender or DPMI host, | ||
so I've never tried to build a 32-bit version of catdoc for DOS. | ||
But if you mix the buffer sizes from the UNIX version and the file-name | ||
dependent defines from DOS, you should probably achieve good | ||
results. | ||
|
||
4. With Turbo C you'll need the file getopt.c, which comes with Turbo C, | ||
and unistd.h, which is provided in the compat directory. | ||
Compile getopt.c and add it to cc.lib and put unistd.h in | ||
your include directory. Later it might help you to port other | ||
unix software. With another compiler you can also make use | ||
of getopt.c in the compat directory (which is from GNU), but I was | ||
unable to make it work with Watcom 10.0 | ||
|
||
5. It is probably a good idea to link wildargs.obj (or wildargv.obj) | ||
with catdoc. I didn't do it myself because I use the Korn shell on | ||
the machine where I developed catdoc, so I don't need to include | ||
parameter expansion in the program. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
|
||
# Your C compiler and flags | ||
SHELL = /bin/sh | ||
|
||
|
||
all: | ||
for i in src doc charsets; do\ | ||
(cd $$i; $(MAKE) all);\ | ||
done | ||
|
||
install: | ||
for i in src doc charsets; do\ | ||
(cd $$i; $(MAKE) install);\ | ||
done | ||
clean: | ||
for i in src doc charsets; do\ | ||
(cd $$i; $(MAKE) clean);\ | ||
done | ||
distclean: | ||
for i in src doc charsets; do\ | ||
(cd $$i; $(MAKE) distclean);\ | ||
done | ||
rm Makefile config.* | ||
dist: | ||
$(MAKE) -C doc dosdoc | ||
$(MAKE) distclean |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
0.90.1 Nov 26 1998 | ||
Top-level Makefile now uses $MAKE instead of make | ||
fixed missing end-line escaping in wordview.tcl | ||
All occurrences of strcpy, strcat and sprintf investigated | ||
to avoid buffer overflows. | ||
0.90 Oct 29 1998 | ||
Fixed bug with charset names redeclared locally in main() | ||
Fixed problem in configure with wish 8.0.3 | ||
Catdoc considered to be stable enough for release | ||
0.90b5 Oct 14 1998 | ||
Fixed handling of 0x1F char (soft hyphen in Word 6.0), | ||
now it is translated to 0x00AD (unicode soft hyphen) | ||
Fixed permissions for manual page | ||
Added --with-install-root configure arg to simplify | ||
building of binary packages. | ||
0.90b4 September 17 1998 | ||
Added proper configuration of library dir in wordview. | ||
Added --disable-charset-check config option | ||
Added 0x2026 symbol in ascii.rpl | ||
Added more Windows codepages in distribution | ||
0.90b3 September 11 1998 | ||
Added -x switch to simplify debugging of substitution maps | ||
0.90b2 September 10 1998 | ||
Added some symbols in the 0x2000-0x20FF range to substitution maps. | ||
These symbols occur in cp1251, so they are frequently found | ||
in Word files. Fixed some filename-handling problems in | ||
wordview.tcl | ||
|
||
0.90b1 September 8 1998 | ||
Added us-ascii.charset, fixed small bugs in configure, | ||
install is used for all installation files. Code is | ||
considered stable enough to be beta. | ||
|
||
0.90a3 September 7 1998 | ||
Fixed small bug in table handling, which caused catdoc to | ||
output extra column delimiter just before row delimiter. Added | ||
autoconf configuration. install is back, although not for | ||
charsets | ||
|
||
0.90a2 August 18 1998 | ||
version 0.90 was tested on BSDI and Solaris platform. Makefile | ||
was rewritten to avoid use of highly incompatible | ||
/usr/{ucb,bin}/install | ||
|
||
0.90a1 August 13 1998 | ||
Catdoc has undergone a major rewrite. Now it has proper charset | ||
handling, including UNICODE and runtime configurability. | ||
|
||
0.35 - June 5 1998 | ||
Fixed a bug with the -s switch which prevented catdoc from returning | ||
a non-zero code when invoked on a UNIX text file | ||
|
||
0.34 - Apr 28 1998 | ||
Files now opened in binary mode thus allowing catdoc to work on | ||
DOS and similar systems. All specs arrays now have terminating | ||
NULL | ||
|
||
0.33 - October 1997 | ||
Fixed missing terminating NUL in specs array, which caused | ||
random segfaults on Linux and many other systems, because | ||
_specs_ is searched by the _strchr_ function | ||
|
||
0.32 - August 1997 | ||
First major public release, uploaded to CTAN. Tk interface | ||
appeared, manual page was written. Unfortunately, this release | ||
was buggy. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
CATDOC version 0.93 | ||
|
||
CATDOC is a program which reads an MS-Word file and prints readable | ||
ASCII text to stdout, just like the Unix cat command. | ||
It is also able to produce correct escape sequences if some UNICODE | ||
characters have to be represented specially in your typesetting system, | ||
such as (La)TeX. | ||
|
||
This is a completely new version of catdoc, rewritten from scratch. | ||
It features runtime configuration, proper charset handling, | ||
user-definable output formats and support | ||
for Word97 files, which contain UNICODE internally. | ||
|
||
Since 0.93.0 catdoc parses the OLE structure and extracts the WordDocument | ||
stream, but doesn't parse its internal structure. | ||
|
||
This rough approach inevitably results in some garbage in the output file, | ||
especially near the end of the file and if the file contains embedded OLE objects, | ||
such as pictures or equations. | ||
|
||
So, if you are looking for a purely automatic way to convert Word to LaTeX, | ||
you had better investigate word2x, wvware or LAOLA. | ||
|
||
|
||
Catdoc is distributed under the GNU General Public License version 2 or above. | ||
|
||
|
||
Your bug reports and suggestions are welcome. | ||
|
||
There is also major work to do - define correct TeX commands | ||
for accented Latin letters in the tex.specchars file, and commands | ||
for mathematical symbols (Unicode 20xx-25xx). | ||
|
||
|
||
Contributions are welcome. | ||
|
||
See files INSTALL and INSTALL.dos for information about compiling and | ||
installing catdoc. | ||
|
||
Catdoc is documented in its UNIX-style manual page. For those who don't | ||
have the man command (i.e. MS-DOS users), plain text and PostScript versions | ||
of the manual are provided in the doc directory. | ||
Victor Wagner <vitus@45.free.net> | ||
|
||
|