
Urum basic lexicon #62

Open
LinguList opened this issue Dec 12, 2015 · 7 comments

Comments

@LinguList
Contributor

http://urum.lili.uni-bielefeld.de/download/docs/uum-lexicon.pdf

This list draws from WOLD, adds 90 more concepts, and provides alternative categories. It is long, and it is a PDF, so there is no way to quickly extract a linking to Concepticon. The semantic categories would be interesting, though. This is probably a long-term rather than a short-term list to map.

@xrotwang
Contributor

Maybe this is of interest for the Dictionaria project.

@AnnikaTjuka
Collaborator

@LinguList and @xrotwang, should we keep the basic lexicon as an issue in Concepticon, or should I open a new issue on the Dictionaria GitHub page?

By the way, the link no longer works, but I found this one: http://projects.turkmas.uoa.gr/urum/download/docs/uum-lexicon.pdf

@LinguList
Contributor Author

This is a dataset for one language in Lexibank, or even more than one, given the glossing languages. One would need to see to what degree the concept list can be extracted from the data (using a tool like Adobe Pro). One could also contact the authors and ask whether they would be interested in sharing the concept list in the form of an Excel sheet.

@LinguList
Contributor Author

I think it may also be interesting for @ilchec. Maybe he even knows the authors. And yes, @xrotwang, if we ask them whether they want to publish through Dictionaria, they might be interested.

@LinguList
Contributor Author

But we should then ask them directly, maybe now?

@ilchec
Collaborator

ilchec commented Jun 19, 2020

It looks like the PDF that you've linked can be parsed relatively easily: the entries are all organized similarly (there are no optional notes), and each piece of information is preceded by a keyword, so it won't be that hard with PDFMiner. Unfortunately, I don't know the authors =(
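
A minimal sketch of how such keyword-based parsing could look, assuming the text is extracted with pdfminer.six and that every field sits on its own line in the form `Keyword: value`. The field names in `KEYWORDS` and the helpers `parse_entries` and `parse_pdf` are hypothetical and only illustrate the idea; the actual labels and layout in the PDF would need to be checked first.

```python
import re
from typing import Dict, List

# Hypothetical field keywords: the real labels used in uum-lexicon.pdf
# have to be looked up and substituted here.
KEYWORDS = ("Concept", "Form", "Gloss", "Category")

# Matches a line such as "Form: some value" for any of the keywords above.
FIELD_RE = re.compile(r"^(%s):\s*(.*)$" % "|".join(map(re.escape, KEYWORDS)))


def parse_entries(text: str) -> List[Dict[str, str]]:
    """Split extracted text into entries; the first keyword starts a new entry."""
    entries: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for raw in text.splitlines():
        match = FIELD_RE.match(raw.strip())
        if not match:
            continue  # skip page headers, blank lines and other noise
        key, value = match.group(1), match.group(2).strip()
        if key == KEYWORDS[0] and current:
            entries.append(current)
            current = {}
        current[key] = value
    if current:
        entries.append(current)
    return entries


def parse_pdf(path: str) -> List[Dict[str, str]]:
    """Extract the text layer with pdfminer.six, then parse it."""
    # Imported lazily so parse_entries can be tried out without pdfminer.six.
    from pdfminer.high_level import extract_text
    return parse_entries(extract_text(path))
```

Calling `parse_pdf("uum-lexicon.pdf")` on a local copy would then give one dictionary per entry, which could be written out as a TSV file. Fields that wrap over several lines in the extracted text would need extra handling that this sketch does not attempt.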

@LinguList
Contributor Author

So we can already prepare the data with Adobe Pro (this works even better). I think @MacyL has it; otherwise I'll ask Nathan. Then we have the concept list, which is nice anyway. In the meantime, shall we ask the authors whether they are interested in submitting their data to Dictionaria?
