Orthography profile: "nd" as a single segment, vs "n d" in transnewguineaorg #13

Closed
XachaB opened this issue Feb 24, 2021 · 4 comments

Comments

@XachaB
Contributor

XachaB commented Feb 24, 2021

In this dataset, the sequence "nd" is treated as a single segment:

https://github.com/lexibank/joophonosemantic/blob/master/etc/orthography.tsv#L192

Example:

Enga-33_one-1,,Enga,33_one,m.e.nd.ɑ.i,m.e.nd.ɑ.i,m e nd ɑ i,,,,,^ m . e . nd . ɑ . i $,default

However, the same sequence is segmented as "n d" in transnewguineaorg:

enga-wapi-one-1,164073,enga-wapi,one,mendai,mendai,m e n d a i,,davies_and_comrie1985,,,^ m e n d a i $,default

Is it possible to normalize to one or the other?
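
To illustrate the difference, here is a minimal sketch of greedy longest-match segmentation, which is roughly how a grapheme inventory decides between "nd" and "n d". It is not the code used by either dataset: the function `segment` and the two grapheme sets are hypothetical and only cover the form "mendai" from the example above.

```python
def segment(form, graphemes):
    """Greedy longest-match segmentation of `form` against a grapheme set."""
    longest = max(len(g) for g in graphemes)
    out = []
    i = 0
    while i < len(form):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(longest, len(form) - i), 0, -1):
            chunk = form[i:i + size]
            if chunk in graphemes:
                out.append(chunk)
                i += size
                break
        else:
            # Character not in the inventory: keep it as its own segment.
            out.append(form[i])
            i += 1
    return out


# Hypothetical inventories: one lists "nd" as a grapheme, the other does not.
profile_with_nd = {"m", "e", "nd", "n", "d", "a", "i"}
profile_without_nd = {"m", "e", "n", "d", "a", "i"}

print(" ".join(segment("mendai", profile_with_nd)))     # m e nd a i
print(" ".join(segment("mendai", profile_without_nd)))  # m e n d a i
```

With "nd" in the inventory the longest match wins, so the same form comes out with five segments instead of six.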

@XachaB
Contributor Author

XachaB commented Mar 2, 2021

pinging @LinguList

@LinguList
Contributor

Look, @XachaB, this is not my idea but the source's, right? The source is already segmented. So you should bring this up in transnewguineaorg, where one should then discuss merging all nd instances into prenasalized n + d, and the same for mb, ng, etc. But again, this should also be discussed with @SimonGreenhill.
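
A merge of this kind could be sketched as a post-processing step over already segmented forms. This is only an illustration under assumptions: the helper `merge_prenasalized` and the pair table are hypothetical, the output symbols are placeholders, and whether every n + d sequence in the data is really prenasalized is a decision for the dataset maintainers.

```python
# Hypothetical mapping from nasal + stop pairs to a single merged segment.
PRENASALIZED = {("n", "d"): "nd", ("m", "b"): "mb", ("n", "g"): "ng"}


def merge_prenasalized(segments, pairs=PRENASALIZED):
    """Merge adjacent nasal + stop segments listed in `pairs` into one segment."""
    merged = []
    i = 0
    while i < len(segments):
        pair = tuple(segments[i:i + 2])
        if pair in pairs:
            merged.append(pairs[pair])
            i += 2
        else:
            merged.append(segments[i])
            i += 1
    return merged


print(" ".join(merge_prenasalized("m e n d a i".split())))  # m e nd a i
```

Forms that already use the merged segment pass through unchanged, so the step can be applied to both datasets without harm.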

@XachaB
Contributor Author

XachaB commented Mar 2, 2021

Noted, thanks

@MuffinLinwist
Collaborator

I'm closing this since in transnewguineaorg the issue is already resolved.
