A liblouis table parser based on instaparse. Converts tables to Clojure data structures.

rewrite-louis

An experimental parser for liblouis tables based on instaparse. Given a liblouis table the parser returns plain old Clojure data. This can be used for example to rewrite the table into any other format.

Usage

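(require '[clojure.java.io :as io]
         '[instaparse.core :as insta])
;; Build the parser from the grammar file on the classpath
(def louis-parser (insta/parser (io/resource "louis.bnf")))
;; slurp does not expand "~", so build the path from the user's home directory
(louis-parser (slurp (str (System/getProperty "user.home") "/src/liblouis/tables/ar-ar-comp8.utb")))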

which returns the parse tree as nested vectors. The excerpt below comes from an Afrikaans table:

[:table
 [:comment "afr#1#Afrikaans Uncontracted#za#Afrikaans onverkort"]
 ,,,
 [:comment " <http://www.gnu.org/licenses/>."]
 [:include "en-ueb-g1.ctb"]]
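
Because the result is plain Clojure data, rewriting a table amounts to walking nested vectors. The following is a minimal sketch, not part of the project: it assumes only the three tags visible in the excerpt above (`:table`, `:comment`, `:include`), and the helper names `nodes-of`, `includes`, `emit-line` and `emit` are hypothetical. Real parse trees contain many more node types, which this sketch silently skips.

```clojure
(require '[clojure.string :as str])

;; Sample parse result, shaped like the excerpt above.
(def parsed
  [:table
   [:comment "afr#1#Afrikaans Uncontracted#za#Afrikaans onverkort"]
   [:comment " <http://www.gnu.org/licenses/>."]
   [:include "en-ueb-g1.ctb"]])

(defn nodes-of
  "Returns the direct children of a [:table ...] tree that carry `tag`."
  [tree tag]
  (filter #(and (vector? %) (= tag (first %))) (rest tree)))

(defn includes
  "Lists the file names of all included tables."
  [tree]
  (map second (nodes-of tree :include)))

(defn emit-line
  "Renders one node as a line of text; returns nil for tags it does not know."
  [[tag arg]]
  (case tag
    :comment (str "#" arg)
    :include (str "include " arg)
    nil))

(defn emit
  "Renders the whole tree, one line per known node."
  [tree]
  (str/join "\n" (keep emit-line (rest tree))))
```

Calling `(includes parsed)` yields the included file names, and `(emit parsed)` produces one line of text per node. Replacing `emit-line` with a different rendering function is one way to target another output format.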

Command line

find ~/src/liblouis/tables -type f -print | grep -v -e '\.dic' -e 'Makefile' -e 'maketablelist.sh' -e 'README' | sort | xargs lein run

Status

The parser currently handles an estimated 95% of the tables in the liblouis distribution. Known limitations:

  • Continuation lines are not supported (da-dk-g16-lit.ctb, da-dk-g26-lit.ctb, da-dk-g26l-lit.ctb).
  • Huge tables cause an OutOfMemoryError (GC overhead limit exceeded), for example zh-chn.ctb, zhcn-g1.ctb, zhcn-g2.ctb, ko-chars.cti and zh-tw.ctb.
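
One possible workaround for the missing continuation-line support is to join such lines before handing the text to the parser. The sketch below is an assumption, not something the project provides: it treats a backslash directly before a line break as a continuation marker, and the function name `join-continuation-lines` is hypothetical. It has not been checked against the affected tables, and it would also join a line that legitimately ends in an escaped backslash.

```clojure
(require '[clojure.string :as str])

(defn join-continuation-lines
  "Removes every backslash + line-break pair so that a continued entry
  becomes a single line. Assumption: a continuation is marked by a
  backslash immediately before the line break (LF or CRLF)."
  [table-text]
  (str/replace table-text #"\\\r?\n" ""))
```

The cleaned text could then be passed to the parser in place of the raw `slurp` result.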

Acknowledgements

Much of the EBNF grammar was adapted from louis-parser and its liblouis table grammar, which is defined as a parsing expression grammar (PEG).

License

Copyright © 2021 Swiss Library for the Blind, Visually Impaired and Print Disabled

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.

This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.
