This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Failed to load latest commit information.
================ ``langid.c`` readme ================ Introduction ------------ `langid.c` is an experimental implementation of the language identifier described by  in pure C. It is largely based on the design of `langid.py`, and uses `langid.py` to train models. Planned features ---------------- See TODO Speed ----- Initial comparisons against Google's cld2 suggest that `langid.c` is about twice as fast. (langid.c) @mlui langid.c git:[master] wc -l wikifiles 28600 wikifiles (langid.c) @mlui langid.c git:[master] time cat wikifiles | ./compact_lang_det_batch > xxx cat wikifiles 0.00s user 0.00s system 0% cpu 7.989 total ./compact_lang_det_batch > xxx 7.77s user 0.60s system 98% cpu 8.479 total (langid.c) @mlui langid.c git:[master] time cat wikifiles | ./langidOs -b > xxx cat wikifiles 0.00s user 0.00s system 0% cpu 3.577 total ./langidOs -b > xxx 3.44s user 0.24s system 97% cpu 3.759 total (langid.c) @mlui langid.c git:[master] wc -l rcv2files 20000 rcv2files (langid.c) @mlui langid.c git:[master] time cat rcv2files | ./langidO2 -b > xxx cat rcv2files 0.00s user 0.00s system 0% cpu 31.702 total ./langidO2 -b > xxx 8.23s user 0.54s system 22% cpu 38.644 total (langid.c) @mlui langid.c git:[master] time cat rcv2files | ./compact_lang_det_batch > xxx cat rcv2files 0.00s user 0.00s system 0% cpu 18.343 total ./compact_lang_det_batch > xxx 18.14s user 0.53s system 97% cpu 19.155 total Model Training -------------- Google's protocol buffers  are used to transfer models between languages. The Python program `ldpy2ldc.py` can convert a model produced by langid.py  into the protocol-buffer format, and also the C source format used to compile an in-built model directly into executable. Dependencies ------------ Protocol buffers  protobuf-c  Contact ------- Marco Lui <firstname.lastname@example.org> References ----------  http://aclweb.org/anthology-new/I/I11/I11-1062.pdf  https://github.com/saffsd/langid.py  https://code.google.com/p/cld2/  https://github.com/google/protobuf/  https://github.com/protobuf-c/protobuf-c