Deprecating gff.ti.gff_tigrinya #218

Open
dyacob opened this issue Jul 4, 2023 · 11 comments

Comments

@dyacob
Contributor

dyacob commented Jul 4, 2023

The gff.ti.gff_tigrinya lexical model is based on the Unilex project's wordlist for Tigrinya. Unfortunately, the wordlist contains many misspellings and non-Tigrinya words that come from a corpus of unknown provenance and pedigree. It also combines the conflicting spelling conventions of Eritrea and Ethiopia, which distorts the frequency counts.

An approach that would better meet user expectations is to have separate wordlists for each region. PRs #216 and #217 address this directly. The gff.ti.gff_tigrinya lexicon can then be deleted from the repository or moved into a legacy directory if there is interest in preserving it.

@DavidLRowe
Contributor

@mcdurdin What is the best way to proceed?

@DavidLRowe
Contributor

Note that the PRs for the new models have not yet been merged.

@mcdurdin
Member

mcdurdin commented Jul 6, 2023

We can use the deprecates field in the .model_info for the new models to mark the existing model as deprecated. Then we should be able to remove it -- there is probably not a lot of value in keeping the older model around, unlike with keyboards. (The old version will still be on the download server, but it won't be in the catalogue so users won't be offered it.)

@dyacob
Contributor Author

dyacob commented Jul 7, 2023

@mcdurdin, my main concern is to keep the keyboards from downloading both dictionaries. Will the dictionary server be aware of the deprecates property and prevent the older wordlist from being downloaded? If not, then deletion from the repository might be the better option.

@mcdurdin
Member

Will the dictionary server be aware of the deprecates property and prevent the older wordlist from being downloaded?

Yes, the only model to be downloaded should be the new one, as deprecated models are not offered in the search by language API.
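
The filtering described here can be pictured with a minimal Python sketch. This is not the actual Keyman server code: the catalogue layout, the shape of the `deprecates` list, the `example.*` model ids, and the `search_by_language` helper are all assumptions made for illustration. Only the id `gff.ti.gff_tigrinya` comes from this issue.

```python
# Hypothetical in-memory catalogue. The real .model_info schema and the
# server's search implementation may look quite different.
CATALOGUE = [
    {"id": "gff.ti.gff_tigrinya", "language": "ti", "deprecates": []},
    # The two ids below are placeholders for the new regional models.
    {"id": "example.ti.tigrinya_eritrea", "language": "ti",
     "deprecates": ["gff.ti.gff_tigrinya"]},
    {"id": "example.ti.tigrinya_ethiopia", "language": "ti",
     "deprecates": ["gff.ti.gff_tigrinya"]},
]


def deprecated_ids(catalogue):
    """Collect every model id named in some other model's `deprecates` list."""
    return {old for model in catalogue for old in model["deprecates"]}


def search_by_language(catalogue, language):
    """Return the ids of models for `language` that no other model deprecates."""
    hidden = deprecated_ids(catalogue)
    return [
        model["id"]
        for model in catalogue
        if model["language"] == language and model["id"] not in hidden
    ]


# The old combined model is still in the catalogue but is never offered.
print(search_by_language(CATALOGUE, "ti"))
```

Under these assumptions, a search for `ti` returns only the two regional placeholders, so a keyboard would not be offered the older wordlist.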

@dyacob
Contributor Author

dyacob commented Jul 11, 2023

With the v1.0.1 versions of the respective lexical models, the deprecates statements are in place.

I can create a new PR that deletes the older Tigrinya lexical model, or leave it to the Keyman team to delete older resources at the appropriate time.

@DavidLRowe
Contributor

@mcdurdin Will it be a problem to have statements that deprecate a lexical model that is no longer present in the repository?

Or maybe it would just be tidier to remove the deprecations in the same PR that deletes the lexical model?

@mcdurdin
Member

It's best to leave the deprecation references in, because this allows for a clean upgrade path for users who are on the older model -- a query for updates for the deprecated model reports that a replacement model is present. (Although perhaps we need to go through and re-test this at some point, because it's been a long time since we've done anything with it, and at that point we didn't have any deprecated models...)
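
As a rough illustration of why the references still matter after the old model is deleted, here is a small Python sketch of an update query. It is an assumption-laden sketch rather than the real update API: the catalogue layout, the `example.*` ids, and the `find_replacements` helper are hypothetical.

```python
# Hypothetical catalogue after gff.ti.gff_tigrinya has been deleted from the
# repository. Only the new models remain, each still naming the old id.
CATALOGUE = [
    {"id": "example.ti.tigrinya_eritrea",
     "deprecates": ["gff.ti.gff_tigrinya"]},
    {"id": "example.ti.tigrinya_ethiopia",
     "deprecates": ["gff.ti.gff_tigrinya"]},
]


def find_replacements(catalogue, installed_id):
    """Return ids of models that declare `installed_id` as deprecated."""
    return [
        model["id"]
        for model in catalogue
        if installed_id in model["deprecates"]
    ]


# A user still on the old model asks for updates; the lookup works even
# though the old model itself is gone from the catalogue.
print(find_replacements(CATALOGUE, "gff.ti.gff_tigrinya"))
```

If the `deprecates` entries were removed along with the old model, the same lookup would return an empty list and the user would have no pointer to a replacement.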

@dyacob
Contributor Author

dyacob commented Jul 12, 2023

I suppose a DEPRECATED.md file is in order then?

@DavidLRowe
Contributor

@mcdurdin Should a DEPRECATED.md file be added to the deprecated lexical model? Or just delete it?

@mcdurdin
Member

mcdurdin commented Aug 3, 2023

@mcdurdin Should a DEPRECATED.md file be added to the deprecated lexical model? Or just delete it?

I think it would be fine to delete it, as the new model lists the deprecation path. The old model is available in the git history for archival/history purposes, and as far as I can tell, we'll never want to serve it to users. So go ahead and delete it.

3 participants