Wrong split for SrIrapi #23

Closed
kmadathil opened this issue Jul 16, 2017 · 3 comments

Comments

@kmadathil
Owner

Sandhi module:

(integ)*$ python SanskritLexicalAnalyzer.py --split SrIrapi --input-encoding SLP1
Parsing of XMLs started at 2017-07-16 11:50:23.374043
666994 forms cached for quick search
Parsing of XMLs completed at 2017-07-16 11:50:28.310012
Input String: SrIrapi
Input String in SLP1: SrIrapi
Start Split: 2017-07-16 11:50:34.029952
End DAG generation: 2017-07-16 11:50:34.032363
End pathfinding: 2017-07-16 11:50:34.033762
Splits:
[u'Sri', u'Ira', u'pi']
[u'SrI', u'Ira', u'pi']

Internal splitter:

(integ)*$ python SanskritLexicalAnalyzer.py --split SrIrapi --input-encoding SLP1 --use-internal-sandhi-splitter
Parsing of XMLs started at 2017-07-16 11:50:45.124203
666994 forms cached for quick search
Parsing of XMLs completed at 2017-07-16 11:50:50.126418
Input String: SrIrapi
Input String in SLP1: SrIrapi
Start Split: 2017-07-16 11:50:55.797431
End DAG generation: 2017-07-16 11:50:55.799320
End pathfinding: 2017-07-16 11:50:55.803311
Splits:
[u'SrIs', u'api']
[u'SrI', u'ras', u'pi']
[u'Sri', u'Iras', u'pi']
[u'SrI', u'Iras', u'pi']
[u'Sri', u'Ira', u'pi']
[u'SrI', u'iras', u'pi']
[u'Sri', u'iras', u'pi']
[u'SrI', u'Ira', u'pi']
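
To make the discrepancy between the two runs explicit, here is a small sketch that diffs the two split lists copied from the output above. The variable names are illustrative only and are not part of the analyzer.

```python
# Splits copied verbatim from the two runs above (SLP1 encoding).
sandhi_module = [
    ("Sri", "Ira", "pi"),
    ("SrI", "Ira", "pi"),
]
internal_splitter = [
    ("SrIs", "api"),
    ("SrI", "ras", "pi"),
    ("Sri", "Iras", "pi"),
    ("SrI", "Iras", "pi"),
    ("Sri", "Ira", "pi"),
    ("SrI", "iras", "pi"),
    ("Sri", "iras", "pi"),
    ("SrI", "Ira", "pi"),
]

# Splits the internal splitter finds that the Sandhi module does not.
missing = [s for s in internal_splitter if s not in sandhi_module]

for split in missing:
    print(split)
```

The Sandhi module run is missing every split whose first or second word ends in `s`, including `SrIs` + `api`.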

@avinashvarna
Collaborator

I don't see it on my branch:
(avinashvarna) $ python SanskritLexicalAnalyzer.py --split --input-encoding SLP1 --max-paths 25 SrIrapi
666994 forms cached for quick search
Parsing of XMLs completed at 2017-07-16 10:21:37.869000
Input String: SrIrapi
Input String in SLP1: SrIrapi
Start Split: 2017-07-16 10:21:43.024000
End DAG generation: 2017-07-16 10:21:43.027000
End pathfinding: 2017-07-16 10:21:43.038000
Splits:
[u'SrIs', u'api']
[u'SrI', u'Ira', u'pi']
[u'Sri', u'Ira', u'pi']
[u'SrI', u'Iras', u'pi']
[u'Sri', u'Iras', u'pi']
[u'SrI', u'iras', u'pi']
[u'Sri', u'iras', u'pi']
[u'SrI', u'ras', u'pi']

Same for #24. Will take a look at the integ branch.

@avinashvarna
Collaborator

My apologies, it looks like the internal splitter was the default on my branch. This has to do with the issue we discussed about accepting visarga for the forward sandhi but returning s for the backward split. I had forgotten to write the rule to take this into account. I have updated it, and it seems to be working. I will test it and check it in.
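
The asymmetry described above can be sketched roughly as follows. This is a toy illustration, not the project's actual rule format or API: the function names, the `finals` parameter, and the simplified phonological condition (final `s` or visarga `H` after a vowel other than `a`/`A` becomes `r` before a vowel or voiced consonant; the special case of a following `r` is left out) are all assumptions made for the example.

```python
# Toy sketch in SLP1; hypothetical names, heavily simplified rule.
VOWELS = set("aAiIuUfFxXeEoO")
NON_A_VOWELS = VOWELS - {"a", "A"}
# Vowels plus voiced consonants; a following 'r' is deliberately excluded
# because it triggers a different rule that this sketch does not model.
VOICED_INITIALS = VOWELS | set("gGNjJYqQRdDnbBmylvh")


def join_visarga(left, right):
    """Forward sandhi: accept either final s or visarga (H) on the left word."""
    if (
        len(left) >= 2
        and left[-1] in "sH"
        and left[-2] in NON_A_VOWELS
        and right
        and right[0] in VOICED_INITIALS
    ):
        return left[:-1] + "r" + right
    return left + right


def split_visarga(word, finals=("s", "H")):
    """Backward split: undo the r, proposing each final in `finals`."""
    results = []
    for i in range(1, len(word) - 1):
        if (
            word[i] == "r"
            and word[i - 1] in NON_A_VOWELS
            and word[i + 1] in VOICED_INITIALS
        ):
            for final in finals:
                results.append((word[:i] + final, word[i + 1:]))
    return results


print(join_visarga("SrIs", "api"))
print(join_visarga("SrIH", "api"))
print(split_visarga("SrIrapi"))
print(split_visarga("SrIrapi", finals=("H",)))
```

With `finals=("H",)` the backward split never proposes `SrIs`, so a lexicon that stores the form with final `s` would not match. Returning `s` as well restores the `SrIs` + `api` candidate.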

avinashvarna added a commit that referenced this issue Jul 16, 2017
Modified visarga sandhi rule to return s also in backward split
@avinashvarna
Collaborator

Please check if it is fixed now.
