Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.Sign up
A Lojban parser written in python3.0 http://jbotcan.org/zirsam/
Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Failed to load latest commit information.|
INSTALLATION ============ 1) There must be a 'python3' command in the $PATH. The easiest way to do this sudo ln /usr/bin/python3.0 /usr/bin/python3 The other versions in the py3k series will work, of course. 2) Update the BNF data. The following command should be run as a user ./setup.py build 3) Then install the program, with admin privlidges sudo ./setup.py install If you are unable to install it, you may be able to access the demo on my box by: ssh email@example.com The password is "lojban". USAGE ===== Installing the library also installs a command for running the parser from the command line: zirsam This command will output a condensed parse tree. To start parsing something using the actual library, first pick what depth you want. Do you want orthography (letters), morphology (words), thaumatology (metasyntax/pre-processing) or dendrography (grammar parsing)? If all you want are valsi (morphology), then a program using this might be: import zirsam.morphology import zirsam.tokens stream = zirsam.morphology.Stream() #In general, module.Stream() returns an iter object valsi = list(_ for _ in stream if not isinstance(_, tokens.BORING)) print("There are", len(valsi), "tokens.") The valsi variable will now have a list of Token objects. Options can be set using the Configuration object, which is located in the config module. import zirsam.orthography import zirsam.config conf = zirsam.config.Configuration() #The empty list prevents Configuration from parsing arguments from sys.argv conf.alphabet = zirsam.config.alphabets.liberal #This could also be passed in the list to Configuration(), similiar to the 'zirsam' program for c in zirsam.orthography.Stream(conf): print(c, end='') print() DOCUMENTATION NOT REALLY ======================== docs/BRKWORDS.TXT has my comments regarding morphology, I've also included the original file in this directory. docs/unfinished_business is the todo list Many directories have README files. TESTING ======= I've got two scripts to test zirsam: one to test morphology, the other to test grammar. They are under tools/test_* . The test lines for the grammar (zirsam/data/gram_test_sentences [TODO I moved this]) are from rlpowell's website.