Glossary: Match by glossary terms already suffixed #1795

Draft
wants to merge 28 commits into
base: develop

pedro-mendonca
Member

@pedro-mendonca pedro-mendonca commented Feb 21, 2024

Identify whether a glossary term is already in a suffixed form according to the existing suffix rules, and revert it to its unsuffixed base form.
Afterwards, run the regular matching for terms with suffixes.

This improves glossary matching for cases where the terms aren't in the base form, for example nouns in the plural and verbs in the past tense.
For a verb, it first reverts the term to the infinitive, which then allows matching all the other known regular forms covered by the rules.

Problem

Since #1373 it's possible to match terms to similar words, depending on the rules set for each part_of_speech.
Currently the match only works if the glossary entry is the base term:

  • nouns must be in the singular form
  • verbs must be in infinitive

If a noun glossary term is entered in the plural form, or a verb is not in the infinitive, it only matches that exact term and cannot match the other known regular forms.

Solution

Before trying to match suffixed terms, check whether the glossary term itself is in a suffixed form and, if so, use the same rules to revert it to the base form. Then proceed as before, matching the base term and any of its possible suffixed forms.
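
The two-step idea can be sketched roughly as follows. This is a minimal Python illustration, not the code from this PR: `SUFFIX_RULES`, `revert_to_base()`, `suffixed_forms()` and `find_matches()` are hypothetical names, and the rule table is a heavily simplified assumption that only appends suffixes, unlike the project's real rules per part_of_speech.

```python
import re

# Hypothetical, simplified rules: part of speech -> suffixes a base term may take.
# The real suffix rules are richer; this table is an assumption for illustration.
SUFFIX_RULES = {
    "noun": ["s"],
    "verb": ["s", "ed", "ing"],
}


def revert_to_base(term, part_of_speech):
    """Strip a known suffix to get a candidate base form; return the term unchanged otherwise."""
    # Try the longest suffixes first so "ing" is checked before "s".
    for suffix in sorted(SUFFIX_RULES.get(part_of_speech, []), key=len, reverse=True):
        if term.endswith(suffix) and len(term) > len(suffix) + 1:
            return term[: -len(suffix)]
    return term


def suffixed_forms(base, part_of_speech):
    """Return the base term plus every form produced by the rules."""
    return [base] + [base + suffix for suffix in SUFFIX_RULES.get(part_of_speech, [])]


def find_matches(term, part_of_speech, original):
    """Step 1: revert the glossary term to its base. Step 2: match all known forms."""
    base = revert_to_base(term, part_of_speech)
    # Keep the term as entered too, in case the reverted base was a wrong guess.
    forms = set(suffixed_forms(base, part_of_speech)) | {term}
    alternation = "|".join(re.escape(f) for f in sorted(forms, key=len, reverse=True))
    pattern = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return [m.group(0) for m in pattern.finditer(original)]


matches = find_matches("added", "verb", "Add the file; it was added while adding others.")
```

Naive suffix stripping like this can guess a wrong base (for example, it would turn "pass" into "pas"), which is why the sketch also keeps the term exactly as entered among the forms to match.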

Added a test check_map_glossary_from_suffixed() that takes each suffixed form of a glossary term and uses it for the test instead of the base term, then runs the regular check_map_glossary() glossary match test to check that all the terms are matched as previously.
All the current test cases are passing.
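
The property this test checks can be illustrated with a small self-contained Python sketch. It is not the project's test: `VERB_SUFFIXES`, `base_form()`, `matched_words()` and `check_from_suffixed()` are hypothetical stand-ins, and the single verb rule is an assumption. The point is only that swapping the base term for any of its suffixed forms should leave the set of matches unchanged.

```python
import re

# Hypothetical rule for illustration: suffixes a verb may take (longest first).
VERB_SUFFIXES = ("ing", "ed", "s")


def base_form(term):
    """Revert a suffixed verb to its candidate base form."""
    for suffix in VERB_SUFFIXES:
        if term.endswith(suffix) and len(term) > len(suffix) + 1:
            return term[: -len(suffix)]
    return term


def matched_words(term, original):
    """Return the words in the original that match any known form of the term."""
    base = base_form(term)
    forms = {base, *(base + suffix for suffix in VERB_SUFFIXES)}
    words = re.findall(r"[a-z]+", original.lower())
    return [word for word in words if word in forms]


def check_from_suffixed(base, original):
    """Use each suffixed form as the glossary term and expect the same matches as the base."""
    expected = matched_words(base, original)
    for suffix in VERB_SUFFIXES:
        assert matched_words(base + suffix, original) == expected
    return expected


result = check_from_suffixed("add", "Add it, she added it, he adds it, keep adding.")
```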

Testing Instructions

  1. Add a glossary term in a suffixed form, like "added".
  2. Add an original with the exact "added" and other currently working matches like "add", "adding".
  3. It should now match not only the exact term "added" but also the infinitive "add" and all the other known regular forms.

Fixes #1794

@pedro-mendonca pedro-mendonca marked this pull request as draft February 21, 2024 12:43