Commit

Merge pull request #491 from LinguList/master
update data, fixes #479
xrotwang committed Jul 17, 2018
2 parents f432626 + 4ad5570 commit 5f06b44
Showing 11 changed files with 805 additions and 509 deletions.
6 changes: 4 additions & 2 deletions CONTRIBUTORS.md
@@ -52,7 +52,8 @@ Anthony Grant | MS | 1.1
Ben Yackley | P | 1.1
Christoph Rzymski | CLPG | 1.1
Claire Bowern | D | 1.1
-Cormac Anderson | C | 1.1
+Christoph Rzymski | CTLG | 1.2
+Cormac Anderson | CP | 1.1, 1.2
Damian Satterthwaite-Phillips | D | 1.0
Doug Cooper | DCSM | 1.1
Evgeniya Korovina | MSCLPA | 1.1
@@ -66,6 +67,7 @@ Lars Borin | DL | 1.0
Magdalena Łuniewska | D | 1.1
Martin Haspelmath | AD | 1.0
Maurizio Serva | MD | 1.1
+Mei-Shin Wu | CTLG | 1.2
Michael Dunn | G | 1.1
Nathan Hill | MSDP | 1.1
Nicholas Evans | A | 1.0
@@ -75,7 +77,7 @@ Robert Blust | D | 1.0
Sean Lee | D | 1.0
Sebastian Nicolai | CMS | 1.0
Thiago Chacon | MSD | 1.0, 1.1
-Tiago Tresoldi | CTLG | 1.1
+Tiago Tresoldi | CTLG | 1.1, 1.2
Viola Kirchhoff da Cruz | CLS | 1.0
Wang Feng | D | 1.0
Yunfan Lai | CTLG | 1.1
82 changes: 41 additions & 41 deletions concepticondata/README.md

Large diffs are not rendered by default.

3 changes: 2 additions & 1 deletion concepticondata/conceptlists.tsv
@@ -1,4 +1,5 @@
ID AUTHOR YEAR LIST_SUFFIX ITEMS TAGS SOURCE_LANGUAGE TARGET_LANGUAGE URL REFS PDF NOTE PAGES ALIAS
+Perrin-2010-110 Perrin, Loïc-Michel 2010 110 annotated English,German,French Global https://journals.dartmouth.edu/cgi-bin/WebObjects/Journals.woa/xmlpage/1/article/353?htmlOnce=yes Perrin2010 Perrin2010 This list was used as an initial questionnaire for colexification studies on a world-wide sample of languages. 276f
Sun-1991-1004 Sun, Hongkai et al. 1991 1004 questionnaire English Tibeto-Burman languages http://stedt.berkeley.edu/~stedt-cgi/rootcanal.pl Sun1991 This concept list originally served as a questionnaire for a large-scale investigation on Tibeto-Burman languages. The original questionnaire was in Chinese with no English translation. Later the STEDT project (:bib:Matisoff2015) digitized the data, translating Chinese concept labels to English, but not listing the original Chinese forms. As a result, some concept labels are identical, although they are different in the Chinese version. We are still trying to add the original Chinese concept labels to this resource, but for the moment, we only link the STEDT version, occasionally adding Chinese concept labels, where we figure they are important to distinguish the original meaning.
Robinson-2012-398 Robinson, Laura C and Holton, Gary 2012 398 basic English Alor-Pantar languages http://booksandjournals.brillonline.com/content/journals/10.1163/22105832-20120201 Robinson2012 The authors inform that the list is built upon the about 400 items of [Holton (2012)](:bib:Holton2012), but this source did not submit the dataset.
Heggarty-2017-200 Heggarty, Paul and Anderson, Cormac and Scarborough, Matthew 2017 200 basic English Indo-European languages http://cobl.clld.org Heggarty2017 This is the basic list of 200 items which were selected for the Cognates in Basic Lexicon project. The compilers selected the items following various different criteria, such as ease of elicitation, non-fuzziness of the meaning, representation in Indo-European languages, and more. The numbers are at times higher than 200. This results from the earlier selection which containted more concepts which were no longer used in the official version.
@@ -221,5 +222,5 @@ Kibrik-2012-122 Kibrik, Andrej D. 2012 122 basic English Global Kibrik2012 A
Marrison-1967-917 Marrison, Geoffrey Edward 1967 917 basic,areal English Naga A list of 917 lexical items of Ao Naga (Chungli). But the document only listed 911 items.
Kaiping-2018-591 Kaiping, Gereon A. AND Klamer, Marian 2018 591 basic,areal English,Indonesian Languages of eastern Indonesia http://www.model-ling.eu/lexirumah/parameters Kaiping2018 The LexiRumah list was specifically designed for the study of Austronesian and Timor-Alor-Pantar families, and their contact history, in eastern Indonesia. The list expands Swadesh's [200 item](:ref:Swadesh-1952-200) list by concepts which were found to be subject to frequent borrowing (cf. [Tadmor 2009)](:ref:Tadmor2009), as well as concepts of strong cultural importance in the area (eg. betel nut, lontar palm, ritual ground). LexiRumah
Marrison-1967-909 Marrison, Geoffrey Edward 1967 909 basic,areal English Naga http://stedt.berkeley.edu/~stedt-cgi/rootcanal.pl/gnis?lexicon.lgid=270 Marrison1967 A list of 909 lexical items of Ao Naga (Chungli).
-Kolipakam-2018-100 Kolipakam, Vishnupriya and Jordan, Fiona M. and Dunn, Michael and Greenhill, Simon J. and Bouckaert, Remco and Gray, Russell D. and Verkerk, Annemarie 2018 100 basic,areal English Dravidian Kolipakam2018 Kolipakam2018 A 100 item wordlist for Dravidian languages based on [Kassian et al's list](:ref:Kassian-2010-116)
+Kolipakam-2018-100 Kolipakam, Vishnupriya and Jordan, Fiona M. and Dunn, Michael and Greenhill, Simon J. and Bouckaert, Remco and Gray, Russell D. and Verkerk, Annemarie 2018 100 basic,areal English Dravidian Kolipakam2018 A 100 item wordlist for Dravidian languages based on [Kassian et al's list](:ref:Kassian-2010-116)
Reinhard-1970-337 Reinhard, Johan and Toba, Tim 1970 337 basic,areal English Kusunda Reinhard1970 There are 337 items in the book.
111 changes: 111 additions & 0 deletions concepticondata/conceptlists/Perrin-2010-110.tsv
@@ -0,0 +1,111 @@
ID NUMBER ENGLISH FRENCH GERMAN CONCEPTICON_ID CONCEPTICON_GLOSS SIMILARITY
Perrin-2010-110-1 1 ACID acide sauer 1906 SOUR
Perrin-2010-110-2 2 BAD mauvais schlecht 1292 BAD
Perrin-2010-110-3 3 BEAUTIFUL beau schön 1427 BEAUTIFUL
Perrin-2010-110-4 4 BENT courbé krumm 297 CROOKED
Perrin-2010-110-5 5 BIG gros groß 1202 BIG
Perrin-2010-110-6 6 BITTER amer bitter 887 BITTER
Perrin-2010-110-7 7 BLUNT émoussé stumpf 379 BLUNT
Perrin-2010-110-8 8 BOILED cuit gekocht 269 COOKED
Perrin-2010-110-9 9 BRAVE courageux mutig 3 BRAVE
Perrin-2010-110-10 10 CALM calme ruhig 258 CALM
Perrin-2010-110-11 11 CHEAP bon-marché billig 1887 CHEAP
Perrin-2010-110-12 12 light CLAIR. hell 679 BRIGHT
Perrin-2010-110-13 13 CLEAN propre sauber 704 CLEAN
Perrin-2010-110-14 14 CLEVER malin schlau 1310 CLEVER
Perrin-2010-110-15 15 COLD froid kalt 1287 COLD
Perrin-2010-110-16 16 CONSTANT constant beständig
Perrin-2010-110-17 17 COOL frais frisch 2483 COLD (OF WEATHER)
Perrin-2010-110-18 18 COWARDLY lâche feige
Perrin-2010-110-19 19 DEEP profond tief 1593 DEEP
Perrin-2010-110-20 20 DIFFICULT difficile schwierig 584 DIFFICULT
Perrin-2010-110-21 21 DELICIOUS délicieux schmackhaft 1813 TASTY
Perrin-2010-110-22 22 DENSE dense dicht 2239 DENSE
Perrin-2010-110-23 23 thick épais DICK 1244 THICK
Perrin-2010-110-24 24 thick épais (non-liquide) DICKFLÜSSIG 2239 DENSE
Perrin-2010-110-25 25 DIRTY sale schmutzig 1230 DIRTY
Perrin-2010-110-26 26 soft DOUX sanft 1856 SOFT
Perrin-2010-110-27 27 DRESSED habillé angezogen
Perrin-2010-110-28 28 right DROITE rechts 1019 RIGHT
Perrin-2010-110-29 29 DRY sec trocken 1398 DRY
Perrin-2010-110-30 30 EASY facile einfach 686 EASY
Perrin-2010-110-31 31 EMPTY vide leer 1624 EMPTY
Perrin-2010-110-32 32 EXPENSIVE cher teuer 1426 EXPENSIVE
Perrin-2010-110-33 33 FAR loin fern 1406 FAR
Perrin-2010-110-34 34 FAST rapide schnell 1631 FAST
Perrin-2010-110-35 35 FAT gras fett 1279 FAT (OBESE)
Perrin-2010-110-36 36 FEARFUL peureux ängstlich 528 DREADFUL
Perrin-2010-110-37 37 FLAT plat flach 1633 FLAT
Perrin-2010-110-38 38 FRAGILE fragile zerbrechlich
Perrin-2010-110-39 39 FREQUENT fréquent häufig
Perrin-2010-110-40 40 FULL plein voll 1429 FULL
Perrin-2010-110-41 41 GAY joyeux fröhlich 1495 HAPPY
Perrin-2010-110-42 42 GENEROUS généreux freigiebig
Perrin-2010-110-43 43 HEALTHY en bonne santé gesund 1364 HEALTHY
Perrin-2010-110-44 44 GOOD bon gut 1035 GOOD
Perrin-2010-110-45 45 HANDICAPPED infirme behindert
Perrin-2010-110-46 46 HARD dur hart 1884 HARD
Perrin-2010-110-47 47 HEAVY lourd schwer 1210 HEAVY
Perrin-2010-110-48 48 HOT chaud (brûlant) heiß 1286 HOT
Perrin-2010-110-49 49 FOOLISH idiot dumm 1518 STUPID
Perrin-2010-110-50 50 ILL malade krank 1847 SICK
Perrin-2010-110-51 51 LARGE grand, vaste groß 1202 BIG
Perrin-2010-110-52 52 LAZY paresseux faul 1564 LAZY
Perrin-2010-110-53 53 LIGHT léger leicht 1052 LIGHT (WEIGHT)
Perrin-2010-110-54 54 NOT DENSE espacé licht(er).
Perrin-2010-110-55 55 TIED UP lié festgebunden
Perrin-2010-110-56 56 LIMPING boiteux hinkend
Perrin-2010-110-57 57 LITTLE petit, jeune klein 1246 SMALL
Perrin-2010-110-58 58 LONG long lang 1203 LONG
Perrin-2010-110-59 59 untied détaché LOSE 2506 LOOSE
Perrin-2010-110-60 60 LOUD bruyant laut 377 LOUD
Perrin-2010-110-61 61 thin MAIGRE mager 2219 LEAN (MEAT)
Perrin-2010-110-62 62 soft MOU weich 1856 SOFT
Perrin-2010-110-63 63 NASTY méchant boshaft 419 SEVERE
Perrin-2010-110-64 64 NARROW étroit eng 1267 NARROW
Perrin-2010-110-65 65 NEAR proche nah 1942 NEAR
Perrin-2010-110-66 66 NEW nouveau neu 1231 NEW
Perrin-2010-110-67 67 NUMEROUS nombreux zahlreich 1198 MANY
Perrin-2010-110-68 68 OLD vieux alt 1229 OLD
Perrin-2010-110-69 69 OPEN ouvert offen 1156 OPEN
Perrin-2010-110-70 70 PAINFUL douloureux schmerzhaft 1129 PAINFUL
Perrin-2010-110-71 71 POOR pauvre arm 1674 POOR
Perrin-2010-110-72 72 POLITE poli höflich 583 KIND OR POLITE
Perrin-2010-110-73 73 PROUD fier stolz 174 PROUD
Perrin-2010-110-74 74 PURE pur rein 1147 PURE
Perrin-2010-110-75 75 RAW cru roh 1959 RAW
Perrin-2010-110-76 76 RIPE mûr reif 178 RIPE
Perrin-2010-110-77 77 ROTTEN pourri verdorben 1728 ROTTEN
Perrin-2010-110-78 78 ROUGH rugueux rauh 1923 ROUGH
Perrin-2010-110-79 79 ROUND rond rund 1395 ROUND
Perrin-2010-110-80 80 RUDE impoli unhöflich 1412 RUDE
Perrin-2010-110-81 81 SHALLOW peu profond seicht 193 SHALLOW
Perrin-2010-110-82 82 SALT salé salzig 1274 SALT
Perrin-2010-110-83 83 SHARP tranchant scharf 1396 SHARP
Perrin-2010-110-84 84 SHORT court kurz 1645 SHORT
Perrin-2010-110-85 85 SHY timide schüchtern 487 SHY
Perrin-2010-110-86 86 SILENT silencieux still 48 BE SILENT
Perrin-2010-110-87 87 SLOW lent langsam 701 SLOW
Perrin-2010-110-88 88 SMALL petit (de taille) klein 1246 SMALL
Perrin-2010-110-89 89 SMOOTH lisse glatt 1234 SMOOTH
Perrin-2010-110-90 90 SOLID solide fest 3003 SOLID
Perrin-2010-110-91 91 SOUR aigre sauer 1906 SOUR
Perrin-2010-110-92 92 pointed pointu SPITZ 372 POINTED
Perrin-2010-110-93 93 STINGY avare geizig 1774 STINGY
Perrin-2010-110-94 94 STINKY malodorant stinkend 42 STINKING
Perrin-2010-110-95 95 STRAIGHT droit gerade 1404 STRAIGHT
Perrin-2010-110-96 96 STRONG fort stark 785 STRONG
Perrin-2010-110-97 97 STUBBORN têtu stur
Perrin-2010-110-98 98 SUPERFICIAL superficiel oberflächlich
Perrin-2010-110-99 99 SWEET sucré süß 717 SWEET
Perrin-2010-110-100 100 TIGHT serré eng 3053 TIGHT
Perrin-2010-110-101 101 THIN mince dünn 1400 THIN (SLIM)
Perrin-2010-110-102 102 TRUE vrai wahr 1657 TRUE
Perrin-2010-110-103 103 WARM chaud warm 1232 WARM
Perrin-2010-110-104 104 WEAK faible schwach 1601 WEAK
Perrin-2010-110-105 105 WET humide feucht 1726 WET
Perrin-2010-110-106 106 WHITE blanc weiß 1335 WHITE
Perrin-2010-110-107 107 WIDE large weit 1243 WIDE
Perrin-2010-110-108 108 WISE sage weise 698 WISE
Perrin-2010-110-109 109 WRONG faux falsch 1390 WRONG
Perrin-2010-110-110 110 YOUNG jeune jung 1207 YOUNG
7 changes: 4 additions & 3 deletions concepticondata/conceptlists/README.md
@@ -99,7 +99,7 @@
| [Kibrik-2012-122](Kibrik-2012-122.tsv) | 103 | 84 | 0 |
| [Kingsada-1999-303](Kingsada-1999-303.tsv) | 297 | 98 | 1 |
| [Kitchen-2009-95](Kitchen-2009-95.tsv) | 95 | 100 | 0 |
-| [Kolipakam-2018-100](Kolipakam-2018-100.tsv) | 0 | 0 | 0 |
+| [Kolipakam-2018-100](Kolipakam-2018-100.tsv) | 100 | 100 | 0 |
| [Kraft-1981-434](Kraft-1981-434.tsv) | 434 | 100 | 6 |
| [Krisadawan-2000-210](Krisadawan-2000-210.tsv) | 211 | 100 | 1 |
| [Larson-1972-68](Larson-1972-68.tsv) | 66 | 100 | 0 |
@@ -150,6 +150,7 @@
| [Pallas-1789-285](Pallas-1789-285.tsv) | 282 | 98 | 10 |
| [Payne-1991-202](Payne-1991-202.tsv) | 202 | 100 | 39 |
| [Peiros-1999-100](Peiros-1999-100.tsv) | 100 | 100 | 0 |
+| [Perrin-2010-110](Perrin-2010-110.tsv) | 98 | 89 | 5 |
| [Peust-2013-54](Peust-2013-54.tsv) | 54 | 100 | 0 |
| [Pozdniakov-2014-100a](Pozdniakov-2014-100a.tsv) | 100 | 100 | 0 |
| [Pozdniakov-2014-100b](Pozdniakov-2014-100b.tsv) | 100 | 100 | 0 |
@@ -158,7 +159,7 @@
| [Reinhard-1970-337](Reinhard-1970-337.tsv) | 295 | 87 | 27 |
| [Ringe-1992-100](Ringe-1992-100.tsv) | 100 | 100 | 0 |
| [Ringe-2002-333](Ringe-2002-333.tsv) | 333 | 100 | 0 |
-| [Robinson-2012-398](Robinson-2012-398.tsv) | 386 | 96 | 0 |
+| [Robinson-2012-398](Robinson-2012-398.tsv) | 393 | 98 | 0 |
| [SIL-1980-281](SIL-1980-281.tsv) | 282 | 100 | 1 |
| [SIL-2002-436](SIL-2002-436.tsv) | 442 | 100 | 5 |
| [Samarin-1967-100](Samarin-1967-100.tsv) | 100 | 100 | 0 |
@@ -227,5 +228,5 @@
| [Zorc-1974-100](Zorc-1974-100.tsv) | 100 | 100 | 0 |
| [vanderWouw-1974-28](vanderWouw-1974-28.tsv) | 28 | 100 | 0 |

-(224 rows)
+(225 rows)
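
The new README row for Perrin-2010-110 (98, 89, 5) can be cross-checked against the list added in this commit. The column headers of that table are not visible in this diff, so the reading used here is an assumption: the first number is the count of rows with a CONCEPTICON_ID, the second is that count as a rounded percentage of all rows, and the third is the number of concept sets linked from more than one row. Under that reading the listing is consistent: 12 of the 110 rows have no CONCEPTICON_ID, 98/110 rounds to 89, and five concept sets (SOUR, BIG, DENSE, SOFT, SMALL) each occur twice. A minimal sketch of the check follows; `summarize` and `SAMPLE` are hypothetical names, not part of pyconcepticon, and the sample is a four-row excerpt that assumes the file is tab-separated.

```python
import csv
import io
from collections import Counter

# Hypothetical four-row excerpt of Perrin-2010-110.tsv (assumed tab-separated);
# the real file added in this commit has 110 rows.
SAMPLE = (
    "ID\tNUMBER\tENGLISH\tFRENCH\tGERMAN\tCONCEPTICON_ID\tCONCEPTICON_GLOSS\tSIMILARITY\n"
    "Perrin-2010-110-1\t1\tACID\tacide\tsauer\t1906\tSOUR\t\n"
    "Perrin-2010-110-16\t16\tCONSTANT\tconstant\tbeständig\t\t\t\n"
    "Perrin-2010-110-91\t91\tSOUR\taigre\tsauer\t1906\tSOUR\t\n"
    "Perrin-2010-110-110\t110\tYOUNG\tjeune\tjung\t1207\tYOUNG\t\n"
)


def summarize(tsv_text):
    """Return (mapped rows, rounded percent mapped, concept sets used more than once)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = list(reader)
    # Keep only rows that carry a non-empty CONCEPTICON_ID.
    ids = [row["CONCEPTICON_ID"] for row in rows if row["CONCEPTICON_ID"]]
    mapped = len(ids)
    percent = round(100 * mapped / len(rows)) if rows else 0
    # Count concept sets that are the target of two or more rows.
    repeated = sum(1 for count in Counter(ids).values() if count > 1)
    return mapped, percent, repeated


print(summarize(SAMPLE))
```

To check the full list, the same function could be given the contents of `concepticondata/conceptlists/Perrin-2010-110.tsv`, read as UTF-8.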

11 changes: 11 additions & 0 deletions concepticondata/references/references.bib
@@ -1,6 +1,17 @@
% This file was created with JabRef 2.10.
% Encoding: UTF8
+@Article{Perrin2010,
+Title = {{P}olysemous qualities and universal networks, invariance and diversity},
+Author = {Perrin, Loïc-Michel},
+Journal = {Linguistic Discovery},
+Number = {1},
+Pages = {259-280},
+Volume = {8},
+Year = {2010},
+}
+
+
@inproceedings{List2016a,
Publisher = {European Language Resources Association (ELRA)},
Author = {List, Johann-Mattis and Cysouw, Michael and Forkel, Robert},
Binary file added concepticondata/sources/Perrin2010.pdf
Binary file not shown.
2 changes: 1 addition & 1 deletion pyconcepticon/data/map-af.tsv
@@ -260,5 +260,5 @@ ID GLOSS PRIORITY
55 WHISPER///fluister 1
3107 WHISTLE (INSTRUMENT)///fluitjie 1
1025 WHISTLE///blaas 1
-958 WILD ANIMAL////Az/ 1
+958 WILD ANIMAL////\A\z/ 1
1672 WRITE///skryf 1
