Merge branch 'main' of https://github.com/giellalt/lang-sme
Trondtr committed Jun 12, 2024
2 parents 1296108 + 57a42ec commit e5e7402
Showing 200 changed files with 1,913 additions and 744 deletions.
37 changes: 21 additions & 16 deletions .gitignore
@@ -21,6 +21,12 @@
*.zhfst
*.zip
*.zpipe
*.py.log
*.py.trs
*.sh.log
*.sh.trs
*.yaml.log
*.yaml.trs
.DS_Store
.bundle
.~lock.*#
@@ -77,22 +83,20 @@
/src/fst/phonetics/tests/tests/*.sh
/test/run-morph-tester.sh
/test/run-yaml-testcases.sh
/test/src/morphology/all*.txt
/test/src/morphology/analysed*.txt
/test/src/morphology/filtered*
/test/src/morphology/generate-*-lemmas.sh
/test/src/morphology/generated*.txt
/test/src/morphology/missing_*.txt
/test/src/phonology/negative-*.txt
/test/src/phonology/hfst-twolc-error-messages.txt
/test/src/phonology/pair-*.txt
/test/src/phonology/pair-test-*.sh
/test/src/phonology/positive-*.txt
/test/src/phonology/twolcscript.sh
/test/tools/spellcheckers/fstbased/desktop/hfst/*.txt
/test/tools/spellcheckers/fstbased/desktop/hfst/accept-all-lemmas.sh
/test/tools/spellcheckers/fstbased/desktop/hfst/test-zhfst-basic-sugg-speed.sh
/test/tools/spellcheckers/test-zhfst-file.sh
/src/fst/morphology/test/*-adjective.txt
/src/fst/morphology/test/*.txt
/src/fst/morphology/test/filtered*
/src/fst/morphology/test/generate-*-lemmas.sh
/src/fst/morphology/test/phonology/negative-*.txt
/src/fst/morphology/test/phonology/hfst-twolc-error-messages.txt
/src/fst/morphology/test/phonology/pair-*.txt
/src/fst/morphology/test/phonology/pair-test-*.sh
/src/fst/morphology/test/phonology/positive-*.txt
/src/fst/morphology/test/phonology/twolcscript.sh
/tools/spellcheckers/test/fstbased/desktop/hfst/*.txt
/tools/spellcheckers/test/fstbased/desktop/hfst/accept-all-lemmas.sh
/tools/spellcheckers/test/fstbased/desktop/hfst/test-zhfst-basic-sugg-speed.sh
/tools/spellcheckers/test/test-zhfst-file.sh
/tools/analysers/*.cg3
/tools/analysers/*.pmhfst
/tools/analysers/*.zcheck
@@ -149,5 +153,6 @@ Makefile.in
build
bygg
generated*
test-suite.log
.deps
.generated
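
The ignore patterns above follow the relocation of the test directories into the source tree. A minimal sketch of that mapping, assuming only the renames visible in this diff; the helper name `new_test_path` is hypothetical and not part of the repository:

```shell
#!/bin/sh
# Hypothetical helper: map a pre-commit test path to its post-commit location.
# The mapping is inferred from the renamed paths shown in this diff only.
new_test_path() {
    case "$1" in
        test/src/morphology/*)
            printf 'src/fst/morphology/test/%s\n' "${1#test/src/morphology/}" ;;
        test/src/phonology/*)
            printf 'src/fst/morphology/test/phonology/%s\n' "${1#test/src/phonology/}" ;;
        test/tools/spellcheckers/*)
            printf 'tools/spellcheckers/test/%s\n' "${1#test/tools/spellcheckers/}" ;;
        *)
            # Paths not renamed in this diff (e.g. test/run-morph-tester.sh) pass through.
            printf '%s\n' "$1" ;;
    esac
}

new_test_path test/src/phonology/pair-test-hfst.sh
new_test_path test/tools/spellcheckers/test-zhfst-file.sh
new_test_path test/run-morph-tester.sh
```

The fall-through branch keeps paths such as `test/run-morph-tester.sh` unchanged, matching the entries that this commit leaves in place.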
4 changes: 2 additions & 2 deletions .gut/delta.toml
@@ -1,6 +1,6 @@
template = "https://github.com/giellalt/template-lang-und"
rev_id = 175
template_sha = "bf3ac2ead0081366d7a999df6f804fc6662bbe30"
rev_id = 183
template_sha = "d0a0bae6ad6b62a4cd75ab409a3666d7362152b1"

[replacements]
__REPO__ = "lang-sme"
1 change: 1 addition & 0 deletions Makefile.am
@@ -29,3 +29,4 @@ dev:
# Remove html tables created by some of the developer tools:
clean-local:
rm -f *.html
include $(top_srcdir)/../giella-core/am-shared/devtest-include.am
15 changes: 11 additions & 4 deletions README.md
@@ -5,10 +5,17 @@ The North Sami morphology and tools
[![Maturity](https://img.shields.io/endpoint?url=https%3A%2F%2Fraw.githubusercontent.com%2Fgiellalt%2Flang-sme%2Fgh-pages%2Fmaturity.json)](https://giellalt.github.io/MaturityClassification.html)
![Lemma count](https://img.shields.io/endpoint?url=https%3A%2F%2Fraw.githubusercontent.com%2Fgiellalt%2Flang-sme%2Fgh-pages%2Flemmacount.json)
[![GitHub issues](https://img.shields.io/github/issues-raw/giellalt/lang-sme)](https://github.com/giellalt/lang-sme/issues)
[![Build Status](https://divvun-tc.giellalt.org/api/github/v1/repository/giellalt/lang-sme/main/badge.svg)](https://github.com/giellalt/lang-sme/actions)
[![License](https://img.shields.io/github/license/giellalt/lang-sme)](https://github.com/giellalt/lang-sme/blob/main/LICENSE)
[![Desktop speller download](https://img.shields.io/badge/download%40latest-desktop--bhfst-brightgreen)](https://pahkat.uit.no/main/download/speller-sme?platform=desktop&channel=nightly)
[![Mobile speller download](https://img.shields.io/badge/download%40latest-mobile--bhfst-brightgreen)](https://pahkat.uit.no/main/download/speller-sme?platform=mbile&channel=nightly)
[![Doc Build Status](https://github.com/giellalt/lang-sme/workflows/Docs/badge.svg)](https://github.com/giellalt/lang-sme/actions)
[![CI/CD Build Status](https://divvun-tc.giellalt.org/api/github/v1/repository/giellalt/lang-sme/main/badge.svg)](https://divvun-tc.giellalt.org/api/github/v1/repository/giellalt/lang-sme/main/latest)

Download nightly / CI/CD installation packages for testing (contains the core zhfst file(s)):

[![Windows](https://img.shields.io/badge/download%40latest-Windows--bhfst-brightgreen)](https://pahkat.uit.no/main/download/speller-sme?platform=windows&channel=nightly)
[![MacOS](https://img.shields.io/badge/download%40latest-macOS--bhfst-brightgreen)](https://pahkat.uit.no/main/download/speller-sme?platform=macos&channel=nightly)
[![Mobile](https://img.shields.io/badge/download%40latest-mobile--bhfst-brightgreen)](https://pahkat.uit.no/main/download/speller-sme?platform=mobile&channel=nightly)

__NB!!__ Note that the nightly / CI/CD installation packages are not tested for language quality, and might contain regressions and errors.

This repository contains finite state source files for the North Sami language,
for building morphological analysers, proofing tools
@@ -55,7 +62,7 @@ dictionaries, you need:
- an FST compiler: [HFST](https://github.com/hfst/hfst), [Foma](https://github.com/mhulden/foma) or [Xerox Xfst](https://web.stanford.edu/~laurik/fsmbook/home.html)
- [VislCG3](https://visl.sdu.dk/svn/visl/tools/vislcg3/trunk) Constraint Grammar tools

To install VislCG3 and HFST, just copy/paste this into your Terminal on **Mac OS X**:
To install VislCG3 and HFST, just copy/paste this into your Terminal on **macOS**:

```
curl https://apertium.projectjj.com/osx/install-nightly.sh | sudo bash
18 changes: 5 additions & 13 deletions configure.ac
@@ -34,14 +34,6 @@ AC_CONFIG_MACRO_DIR([m4])
AM_INIT_AUTOMAKE(
1.11.6 tar-pax -Wall -Werror
foreign -Wno-portability
dnl Automake versions before 1.13 (when the serial-tests option was
dnl still the default) still defined the badly obsolete macro
dnl 'AM_PROG_CC_STDC'. By checking for the non-existence of this macro,
dnl we can now force serial testing for newer automakes (with prettier
dnl output) and at the same time work reasonably with older automakes.
dnl Code based on:
dnl https://lists.gnu.org/archive/html/automake/2013-01/msg00060.html
m4_ifndef([AM_PROG_CC_STDC], [serial-tests])
)
m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES([yes])])

@@ -70,7 +62,7 @@ AC_SUBST([GTLANGUAGE], $GLANGUAGE)
### The AC variables SPELLER_NAME_xxx and SPELLER_DESC_xxx are used in:
### - manifest.toml.in
### - tools/spellcheckers/index.*.xml.in
AC_SUBST([SPELLERVERSION], [4.3.2])
AC_SUBST([SPELLERVERSION], [4.4.3])
AC_SUBST([SPELLER_NAME_ENG], ["$GLANGUAGE spellchecker"])
AC_SUBST([SPELLER_NAME_NATIVE], ["Autonym spellchecker"])
AC_SUBST([SPELLER_DESC_ENG], ["A spellchecker for $GLANGUAGE, made by members of the language community, and by the Divvun and Giellatekno groups at UiT The Arctic University of Norway"])
@@ -216,10 +208,10 @@ gt_CONFIG_FILES
##### BEGIN: Add language-specific list of files to ######
########## be processed by autoconf below here: ##########

AC_CONFIG_FILES([test/src/morphology/generate-adverb-lemmas.sh], \
[chmod a+x test/src/morphology/generate-adverb-lemmas.sh])
AC_CONFIG_FILES([test/src/morphology/generate-adpos-lemmas.sh], \
[chmod a+x test/src/morphology/generate-adpos-lemmas.sh])
AC_CONFIG_FILES([src/fst/morphology/test/generate-adverb-lemmas.sh], \
[chmod a+x src/fst/morphology/test/generate-adverb-lemmas.sh])
AC_CONFIG_FILES([src/fst/morphology/test/generate-adpos-lemmas.sh], \
[chmod a+x src/fst/morphology/test/generate-adpos-lemmas.sh])
AC_CONFIG_FILES([tools/shellscripts/generate-ipa-lexicon.sh], \
[chmod a+x tools/shellscripts/generate-ipa-lexicon.sh])
AC_CONFIG_FILES([src/fst/phonetics/tests/tests/test-ipa-conversion.sh], \
2 changes: 1 addition & 1 deletion devtools/tag_test.sh
@@ -1,7 +1,7 @@
#shell script to see if there are tags which are not declared in root.lexc or if tags are misspelled

echo 'Possible tags not declared in root.lexc or misspelled:'
cat src/fst/morphology/clitics.lexc src/fst/morphology/compounding.lexc src/fst/morphology/affixes/*lexc src/fst/morphology/stems/*lexc ../shared-smi/src/fst/morphology/stems/*lexc |cut -d '!' -f1 |grep ' ;' | cut -d ':' -f1 |rev |cut -d ' ' -f1 |rev |sed 's/+/¢+/g' |sed 's/@/¢@/g'|tr '¢' '\n' | tr '#"' '\n'| egrep '(\+|@)' |sort -u | egrep -v '^(\+|\+%|\+\/\-|\+Cmp\-|\+Cmp%\-|\@0|\@%)$' > lexctags
cat src/fst/morphology/clitics.lexc src/fst/morphology/compounding.lexc src/fst/morphology/affixes/*lexc src/fst/morphology/stems/*lexc ../shared-smi/src/fst/stems/*lexc |cut -d '!' -f1 |grep ' ;' | cut -d ':' -f1 |rev |cut -d ' ' -f1 |rev |sed 's/+/¢+/g' |sed 's/@/¢@/g'|tr '¢' '\n' | tr '#"' '\n'| egrep '(\+|@)' |sort -u | egrep -v '^(\+|\+%|\+\/\-|\+Cmp\-|\+Cmp%\-|\@0|\@%)$' > lexctags

cat src/fst/morphology/root.lexc |cut -d '!' -f1 |cut -d ':' -f1 |sed 's/+/¢+/g'|sed 's/@/¢@/g' |tr '¢' '\n' | egrep '(\+|@)' |tr -d ' ' | tr -d '\t'|sort -u > roottags

1 change: 1 addition & 0 deletions docs/Makefile.am
@@ -5,3 +5,4 @@
# The generated docs are automatically detected by the automake script

include $(top_srcdir)/../giella-core/am-shared/docs-dir-include.am
include $(top_srcdir)/../giella-core/am-shared/devtest-include.am
70 changes: 34 additions & 36 deletions m4/giella-config-files.m4
@@ -12,30 +12,19 @@ AC_CONFIG_FILES([Makefile \
src/fst/syllabification/Makefile \
src/fst/Makefile \
src/fst/morphology/Makefile \
src/fst/morphology/test/Makefile \
src/fst/morphology/test/phonology/Makefile \
src/fst/orthography/Makefile \
src/fst/orthography/test/Makefile \
src/fst/phonetics/Makefile \
src/fst/phonetics/tests/Makefile \
src/fst/test/Makefile \
src/cg3/Makefile \
src/cg3/test/Makefile \
src/fst/tagsets/Makefile \
src/fst/transcriptions/Makefile \
docs/Makefile \
test/Makefile \
test/tools/Makefile \
test/tools/hyphenators/Makefile \
test/tools/hyphenators/fstbased/Makefile \
test/tools/hyphenators/patternbased/Makefile \
test/tools/mt/Makefile \
test/tools/mt/apertium/Makefile \
test/tools/spellcheckers/Makefile \
test/tools/spellcheckers/fstbased/Makefile \
test/tools/spellcheckers/fstbased/desktop/Makefile \
test/tools/spellcheckers/fstbased/desktop/hfst/Makefile \
test/tools/spellcheckers/fstbased/mobile/Makefile \
test/src/Makefile \
test/src/morphology/Makefile \
test/src/orthography/Makefile \
test/src/phonology/Makefile \
test/src/syntax/Makefile \
tools/Makefile \
tools/analysers/Makefile \
tools/analysers/pipespec.xml \
@@ -45,11 +34,15 @@ AC_CONFIG_FILES([Makefile \
tools/grammarcheckers/tests/Makefile \
tools/hyphenators/Makefile \
tools/hyphenators/filters/Makefile \
tools/hyphenators/test/Makefile \
tools/hyphenators/test/fstbased/Makefile \
tools/hyphenators/test/patternbased/Makefile \
tools/mt/Makefile \
tools/mt/filters/Makefile \
tools/mt/apertium/Makefile \
tools/mt/apertium/filters/Makefile \
tools/mt/apertium/tagsets/Makefile \
tools/mt/apertium/test/Makefile \
tools/mt/cgbased/Makefile \
tools/tokenisers/Makefile \
tools/tokenisers/filters/Makefile \
@@ -60,6 +53,11 @@ AC_CONFIG_FILES([Makefile \
tools/spellcheckers/index.mobile.xml \
tools/spellcheckers/filters/Makefile \
tools/spellcheckers/neural/Makefile \
tools/spellcheckers/test/Makefile \
tools/spellcheckers/test/fstbased/Makefile \
tools/spellcheckers/test/fstbased/desktop/Makefile \
tools/spellcheckers/test/fstbased/desktop/hfst/Makefile \
tools/spellcheckers/test/fstbased/mobile/Makefile \
tools/spellcheckers/weights/Makefile \
tools/tts/Makefile \
tools/tts/pipespec.xml \
@@ -69,33 +67,33 @@ AC_CONFIG_FILES([Makefile \
# Spell checker tests, all languages:
AC_CONFIG_FILES([src/fst/phonetics/tests/run_tests.sh],
[chmod a+x src/fst/phonetics/tests/run_tests.sh])
AC_CONFIG_FILES([test/tools/spellcheckers/test-zhfst-file.sh], \
[chmod a+x test/tools/spellcheckers/test-zhfst-file.sh])
AC_CONFIG_FILES([test/tools/spellcheckers/fstbased/desktop/hfst/test-zhfst-basic-sugg-speed.sh], \
[chmod a+x test/tools/spellcheckers/fstbased/desktop/hfst/test-zhfst-basic-sugg-speed.sh])
AC_CONFIG_FILES([tools/spellcheckers/test/test-zhfst-file.sh], \
[chmod a+x tools/spellcheckers/test/test-zhfst-file.sh])
AC_CONFIG_FILES([tools/spellcheckers/test/fstbased/desktop/hfst/test-zhfst-basic-sugg-speed.sh], \
[chmod a+x tools/spellcheckers/test/fstbased/desktop/hfst/test-zhfst-basic-sugg-speed.sh])
AC_CONFIG_FILES([test/run-yaml-testcases.sh], \
[chmod a+x test/run-yaml-testcases.sh])
AC_CONFIG_FILES([test/run-morph-tester.sh], \
[chmod a+x test/run-morph-tester.sh])
# Phonology tests, all languages:
AC_CONFIG_FILES([test/src/phonology/pair-test-positive.sh], \
[chmod a+x test/src/phonology/pair-test-positive.sh])
AC_CONFIG_FILES([test/src/phonology/pair-test-negative.sh], \
[chmod a+x test/src/phonology/pair-test-negative.sh])
AC_CONFIG_FILES([test/src/phonology/pair-test-hfst.sh], \
[chmod a+x test/src/phonology/pair-test-hfst.sh])
AC_CONFIG_FILES([src/fst/morphology/test/phonology/pair-test-positive.sh], \
[chmod a+x src/fst/morphology/test/phonology/pair-test-positive.sh])
AC_CONFIG_FILES([src/fst/morphology/test/phonology/pair-test-negative.sh], \
[chmod a+x src/fst/morphology/test/phonology/pair-test-negative.sh])
AC_CONFIG_FILES([src/fst/morphology/test/phonology/pair-test-hfst.sh], \
[chmod a+x src/fst/morphology/test/phonology/pair-test-hfst.sh])
# Lemma generation tests, all languages:
AC_CONFIG_FILES([test/src/morphology/generate-adjective-lemmas.sh], \
[chmod a+x test/src/morphology/generate-adjective-lemmas.sh])
AC_CONFIG_FILES([test/src/morphology/generate-noun-lemmas.sh], \
[chmod a+x test/src/morphology/generate-noun-lemmas.sh])
AC_CONFIG_FILES([test/src/morphology/generate-propernoun-lemmas.sh], \
[chmod a+x test/src/morphology/generate-propernoun-lemmas.sh])
AC_CONFIG_FILES([test/src/morphology/generate-verb-lemmas.sh], \
[chmod a+x test/src/morphology/generate-verb-lemmas.sh])
AC_CONFIG_FILES([src/fst/morphology/test/generate-adjective-lemmas.sh], \
[chmod a+x src/fst/morphology/test/generate-adjective-lemmas.sh])
AC_CONFIG_FILES([src/fst/morphology/test/generate-noun-lemmas.sh], \
[chmod a+x src/fst/morphology/test/generate-noun-lemmas.sh])
AC_CONFIG_FILES([src/fst/morphology/test/generate-propernoun-lemmas.sh], \
[chmod a+x src/fst/morphology/test/generate-propernoun-lemmas.sh])
AC_CONFIG_FILES([src/fst/morphology/test/generate-verb-lemmas.sh], \
[chmod a+x src/fst/morphology/test/generate-verb-lemmas.sh])
# Lemma acceptance test for spellers, all languages:
AC_CONFIG_FILES([test/tools/spellcheckers/fstbased/desktop/hfst/accept-all-lemmas.sh], \
[chmod a+x test/tools/spellcheckers/fstbased/desktop/hfst/accept-all-lemmas.sh])
AC_CONFIG_FILES([tools/spellcheckers/test/fstbased/desktop/hfst/accept-all-lemmas.sh], \
[chmod a+x tools/spellcheckers/test/fstbased/desktop/hfst/accept-all-lemmas.sh])
# Shorthand shell scripts instead of the old-type aliases - all languages:
AC_CONFIG_FILES([tools/shellscripts/usme-gt.sh], \
[chmod a+x tools/shellscripts/usme-gt.sh])
22 changes: 21 additions & 1 deletion m4/giella-macros.m4
@@ -88,7 +88,7 @@ AC_MSG_RESULT([$GIELLA_CORE])
###############################################################
### This is the version of the Giella Core that we require. ###
### UPDATE AS NEEDED.
_giella_core_min_version=0.23.0
_giella_core_min_version=1.0.0
# GIELLA_CORE/GTCORE env. variable, required by the infrastructure to find scripts:
AC_ARG_VAR([GIELLA_CORE], [directory for the Giella infra core scripts and other required resources])
@@ -674,6 +674,23 @@ AS_IF([test "x$enable_grammarchecker" != "xno"],
then: pipx install git+https://github.com/divvun/giellaltgramtools
])]),
AC_MSG_RESULT(yes))
_gtgramtool_min_version=0.7.0
gtgramtool_too_old_message="gtgramtool needs to be updated.
If you installed it with pipx, run:
pipx upgrade GiellaLTGramTools"
AC_MSG_CHECKING([the version of gtgramtool])
AS_IF([test "x${GTGRAMTOOL}" != xno],
[_gtgramtool_version=$( "${GTGRAMTOOL}" --version | sed -e 's/^.*version //')],
[_gtgramtool_version=0])
AC_MSG_RESULT([$_gtgramtool_version])
AS_IF([test "x$enable_grammarchecker" != "xno"],
AC_MSG_CHECKING([whether the gtgramtool version is at least $_gtgramtool_min_version])
AX_COMPARE_VERSION([$_gtgramtool_version], [ge], [$_gtgramtool_min_version],
[gtgramtool_version_ok=yes], [gtgramtool_version_ok=no])
AS_IF([test "x${gtgramtool_version_ok}" != xno],
[AC_MSG_RESULT([$gtgramtool_version_ok])],
[AC_MSG_ERROR([$gtgramtool_too_old_message])]))
# Enable all spellers - default is 'no'
AC_ARG_ENABLE([spellers],
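
The hunk above extracts the gtgramtool version with `sed` and requires it to be at least 0.7.0 via `AX_COMPARE_VERSION`. A minimal shell sketch of the same "greater or equal" test, assuming purely numeric dotted versions; `version_ge` and the sample `--version` output are hypothetical and not taken from the commit:

```shell
#!/bin/sh
# Hypothetical helper: succeed when dotted numeric version $1 >= $2.
# Assumes every component is a plain integer (no "rc1"-style suffixes).
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}                      # leading component of each version
        y=${b%%.*}
        [ "$a" = "$x" ] && a= || a=${a#*.}   # drop the consumed component
        [ "$b" = "$y" ] && b= || b=${b#*.}
        x=${x:-0}                       # a missing component counts as 0
        y=${y:-0}
        [ "$x" -gt "$y" ] && return 0
        [ "$x" -lt "$y" ] && return 1
    done
    return 0                            # all components equal
}

# Assumed sample output; the real format of `gtgramtool --version` may differ.
sample='gtgramtool, version 0.7.2'
found=$(printf '%s\n' "$sample" | sed -e 's/^.*version //')

if version_ge "$found" 0.7.0; then
    echo "ok: $found"
else
    echo "too old: $found"
fi
```

The `sed` expression is the one used in the macro; everything else only illustrates the comparison and is not how `AX_COMPARE_VERSION` is implemented.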
@@ -998,6 +1015,9 @@ To build, test and install:
make
make check
make install
The developers’ version of the test suite is available under:
make devtest
this version does not halt on errors and should be useful when fixing bugs
EOF
AS_IF([test x$gt_prog_xslt = xno -a \
"$(find ${srcdir}/src/fst/morphology/stems -name "*.xml" | head -n 1)" != "" ],
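
The new configure message describes `make devtest` as a test run that does not halt on errors. The actual rules live in giella-core's `devtest-include.am`, which is not shown in this diff; the following is only a sketch of the "keep going and summarise" behaviour, with the hypothetical function `run_all_tests` and placeholder commands standing in for real test scripts:

```shell
#!/bin/sh
# Hypothetical illustration of a non-halting test run: every command is
# executed even if an earlier one fails, and failures are counted at the end.
run_all_tests() {
    failed=0
    for t in "$@"; do
        if sh -c "$t"; then
            echo "PASS: $t"
        else
            echo "FAIL: $t"
            failed=$((failed + 1))
        fi
    done
    echo "$failed test(s) failed"
}

# Placeholder commands; real test scripts would be listed here instead.
run_all_tests true false true
```

A run that stops at the first failure hides later failures, which is why a keep-going variant can be more convenient while fixing several bugs at once.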
2 changes: 1 addition & 1 deletion manifest.toml
@@ -1,5 +1,5 @@
spellername = "North Sami"
spellerversion = "4.3.2"
spellerversion = "4.4.3"

# Name of the speller package in various languages.
# The default is English + autonym:
1 change: 1 addition & 0 deletions src/Makefile.am
@@ -12,3 +12,4 @@ SUBDIRS = fst cg3
##################################################################

include $(top_srcdir)/../giella-core/am-shared/src-dir-include.am
include $(top_srcdir)/../giella-core/am-shared/devtest-include.am
4 changes: 4 additions & 0 deletions src/cg3/Makefile.am
@@ -2,6 +2,9 @@
## Copyright: Sámediggi/Divvun/UiT
## Licence: GPL v3+

# build before test
SUBDIRS=. test

##################################################################
#### BEGIN: Add local processing instructions BELOW this line ####
##################################################################
@@ -46,3 +49,4 @@ clean-local:
####### Build rules via include: ########

include $(top_srcdir)/../giella-core/am-shared/src-syntax-dir-include.am
include $(top_srcdir)/../giella-core/am-shared/devtest-include.am