Replacing analysis code with uralicNLP #2

Open
nikopartanen opened this issue Jun 10, 2019 · 1 comment

@nikopartanen
Member

Ideally, each speaker's tier would be sent to the analyser in one go.
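
A minimal sketch of what "one go per tier" could look like, assuming the annotations have already been read from the ELAN file as `(tier_id, text)` pairs and that tokens are split on whitespace. The names `group_by_tier`, `analyse_tiers` and `analyse_batch` are hypothetical and are not taken from `elan_fst.py`; the analyser is passed in as a plain callable so that the sketch does not depend on any particular backend.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Assumption: an annotation is a (tier_id, annotation_text) pair.
Annotation = Tuple[str, str]


def group_by_tier(annotations: Iterable[Annotation]) -> Dict[str, List[str]]:
    """Collect all tokens of each tier into one list, keeping their order."""
    tiers: Dict[str, List[str]] = defaultdict(list)
    for tier_id, text in annotations:
        # Assumption: whitespace tokenisation is good enough for the sketch.
        tiers[tier_id].extend(text.split())
    return dict(tiers)


def analyse_tiers(
    annotations: Iterable[Annotation],
    analyse_batch: Callable[[List[str]], list],
) -> Dict[str, list]:
    """Call the analyser exactly once per tier, with all of its tokens."""
    grouped = group_by_tier(annotations)
    return {tier_id: analyse_batch(tokens) for tier_id, tokens in grouped.items()}
```

The point of the design is only that the analyser sees a whole tier at a time instead of one annotation at a time; mapping the results back to individual annotations is left out here.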

@nikopartanen nikopartanen added this to the elan-fst 2.0 milestone Jun 10, 2019
@nikopartanen nikopartanen self-assigned this Jun 10, 2019
@nikopartanen
Member Author

@meehkal and @jeutzsch, you probably have some thoughts on this. I recently implemented this change within the file:

https://github.com/langdoc/elan-fst/blob/master/elan_fst.py

It seems to work very well, and it has the following advantages:

  • We can run multiple FSTs and then put their output through one CG, e.g. analysis with kpv + koi and CG with kpv.
  • This also gives us the language tags, should we want to insert them.
  • Local installations are not needed.
  • It does not use xfst anywhere.
  • We also get syntactic readings now (for Komi there are so few that inserting them is perhaps not necessary).
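
The first two points can be sketched roughly as follows: every word is run through one analyser per language, and each reading keeps the code of the language that produced it. This is only an illustration under assumptions. `analyse_with_languages` and the `Analyser` type are hypothetical names, and the analysers are plain callables; in practice each one could wrap a call such as `uralicApi.analyze(word, lang)`, which as far as I know returns `(analysis, weight)` pairs. The CG disambiguation step is not shown.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: an analyser maps a word form to a list of reading strings.
Analyser = Callable[[str], List[str]]


def analyse_with_languages(
    word: str, analysers: Dict[str, Analyser]
) -> List[Tuple[str, str]]:
    """Return (reading, language_code) pairs from every analyser, in order."""
    readings: List[Tuple[str, str]] = []
    for lang, analyse in analysers.items():
        for reading in analyse(word):
            readings.append((reading, lang))
    return readings


# Toy stand-ins for the kpv and koi transducers; the readings are invented
# for the example and are not real analyser output.
toy_analysers: Dict[str, Analyser] = {
    "kpv": lambda word: [word + "+N+Sg+Nom"] if word == "керка" else [],
    "koi": lambda word: [word + "+N+Sg+Nom"] if word == "керка" else [],
}

tagged = analyse_with_languages("керка", toy_analysers)
```

Keeping the language code next to each reading is what would make it possible to write the language tags into the ELAN file later, or to drop them if they are not wanted.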

Disadvantages:

  • It can be used for languages not supported by uralicNLP, but this is not well documented and involves placing compiled transducers in the locations where uralicNLP looks for them. In principle, the old setup accepted any compiled FST.

Things to do:

  • Connecting to lexical resources is a more complicated question, but it needs to be addressed if we want to get translations of the lemmas into the ELAN file. This is also a very wide topic on which many people are already working, so rather than actively building something here, it is probably better to wait until that work has progressed to the point where it can be integrated.
