lpla released this
14 Jun 08:25
Hi there! Here is v6.0.0-rc.1 of Bitextor. This release accompanies the code release of the Paracrawl project. There are many changes since Bitextor v5.0, and it is the first release since we moved to GitHub.
How do I install Bitextor?
How do I run Bitextor?
Any example to check if it is working?
- Updated documentation and `README.md` with new dependencies, commands and troubleshooting
- Added original repositories for most of the compiled dependencies (mgiza, clustercat, bicleaner...)
- Fixed encoding errors in
- Added option to use `nltk` as sentence splitter
- Added many parameters and options for `bitextor` to control most parts of the pipeline, along with long-named versions of them (see
- Added option for a config file in `bitextor`. See README.md.
- Added ELRC metrics and filters
- `zipporah` classifiers and thresholds for filtering
- `httrack` as alternative crawler
- Added a JHU processing script for processing crawler content (option
- Added an alternative translation-based document aligner (Paracrawl) (option
- Minor changes and bugfixes
The `bitextor-v6.0.0-rc.1.zip` tarball does not include the submodules' code. If you compile the project from this tarball, you first need to run `git submodule update --init --recursive`. This command cannot be run on the source code `.zip` packages, because they contain no Git metadata, so we recommend using the `bitextor-v6.0.0-rc.1.zip` tarball or cloning the repo.