tlwg/swath
SWATH (Smart Word Analysis for THai)
====================================

Thai script has no word delimiter. Although human readers recognize word
boundaries easily, a machine needs some knowledge to do the same when
wrapping lines, moving the cursor word-wise, etc. Applications normally need
such a feature to support Thai text processing.

Swath is a general-purpose utility to work around the lack of such capability
in applications. It analyzes the given Thai text by consulting a Thai word
list for word boundaries, then outputs the same text with the predefined word
delimiters inserted.

It can read many kinds of input, including plain text and structured
documents like HTML, RTF, LaTeX and Lambda (Unicode version of LaTeX with
Omega typesetter kernel). [See -f option.]

For the known document formats, it inserts the word delimiter commonly used
in the corresponding format, and pipes (|) for plain text. The user can
always override this with a preferred delimiter. [See -b option.]

Swath can also be configured to use different algorithms for the analysis.
Currently, it supports two schemes: longest (greedy) matching and maximal
(least words) matching. [See -m option.]

EXAMPLES
========

- For LaTeX (to be used with babel-thai package):

    $ swath -f latex < mydoc.tex > mydoc.ttex
    $ latex mydoc.ttex

  Or if you composed your LaTeX source in UTF-8:

    $ swath -f latex -u u,t mydoc.tex > mydoc.ttex
    $ latex mydoc.ttex

  This is equivalent to filtering with iconv(1):

    $ iconv -f UTF-8 -t TIS-620 mydoc.tex | swath -f latex > mydoc.ttex
    $ latex mydoc.ttex

- For HTML (to provide web pages to web browsers that cannot wrap Thai lines
  properly, but support the <wbr> tag):

    $ swath -f html < mydoc.html > mydoc-wbr.html
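The difference between the two matching schemes named above (longest and maximal matching) can be illustrated with a small sketch. This is not Swath's actual implementation: the function names, the toy Latin word list, and the rule that an unknown character becomes a one-character token are all assumptions made for illustration only.

```python
def longest_matching(text, words):
    """Greedy scheme: at each position take the longest dictionary word.

    Assumption: a character that starts no dictionary word is emitted
    as a one-character token.
    """
    tokens = []
    i = 0
    while i < len(text):
        match = text[i]  # fallback: unknown single character
        for j in range(len(text), i, -1):  # try the longest candidate first
            if text[i:j] in words:
                match = text[i:j]
                break
        tokens.append(match)
        i += len(match)
    return tokens


def maximal_matching(text, words):
    """Least-words scheme: pick the segmentation with the fewest tokens.

    Dynamic programming from the end of the text: best[i] holds the
    fewest-token segmentation of text[i:].
    """
    n = len(text)
    best = [None] * (n + 1)
    best[n] = []
    for i in range(n - 1, -1, -1):
        # One-character fallback plus every dictionary word starting at i.
        candidates = [text[i:i + 1]]
        candidates += [text[i:j] for j in range(i + 2, n + 1)
                       if text[i:j] in words]
        best[i] = min(([c] + best[i + len(c)] for c in candidates), key=len)
    return best[0]


# Hypothetical toy word list; a real run would consult a Thai word list.
WORDS = {"ab", "abc", "cde", "d", "e"}

greedy = "|".join(longest_matching("abcde", WORDS))
fewest = "|".join(maximal_matching("abcde", WORDS))
print(greedy)  # abc|d|e
print(fewest)  # ab|cde
```

The greedy scheme commits to "abc" and is then forced into two short tokens, while the least-words scheme accepts a shorter first word to get a two-word result. The pipe is used as the delimiter here because it is the plain-text default described above.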