Information about the data set
KonstantinHoffmann committed Apr 23, 2019
1 parent f3aa49f commit 53483fa
Showing 2 changed files with 17 additions and 1 deletion.
5 changes: 4 additions & 1 deletion README.md
@@ -15,6 +15,10 @@ tracerversion: 1.7.1

Bayesian phylogenies are widely used in comparative linguistics. They not only provide information about the relationships among languages, but can also be used to test hypotheses about the ages of language families {% cite Bouckaert2012 --file LanguagePhylogenies/master_refs.bib %}. The data set can contain grammatical features such as "has productive plural marking on nouns" or records of which cognates are present in each language. Each language is then encoded as a binary string representing the presence or absence of these features. Together these strings form the "sequence alignment matrix" on which the phylogeny is built.
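
The binary encoding described above can be illustrated with a minimal Python sketch. It is not part of the tutorial's workflow; the function name `encode_binary`, the language labels and the cognate labels are all hypothetical placeholders chosen for illustration, and real data sets are considerably larger.

```python
from typing import Dict, List, Set


def encode_binary(features_by_language: Dict[str, Set[str]]) -> Dict[str, str]:
    """Encode each language as a binary string over the sorted union of features.

    Position i of every string refers to the same feature, so the strings
    line up as the rows of an alignment matrix.
    """
    # Collect every feature seen in any language and fix a column order.
    all_features: List[str] = sorted(set().union(*features_by_language.values()))
    return {
        language: "".join("1" if feature in present else "0" for feature in all_features)
        for language, present in features_by_language.items()
    }


# Hypothetical toy data: which cognate sets are attested in which language.
toy_data = {
    "LanguageA": {"cognate_1", "cognate_3"},
    "LanguageB": {"cognate_1", "cognate_2"},
    "LanguageC": {"cognate_2", "cognate_3", "cognate_4"},
}

matrix = encode_binary(toy_data)
for language, row in matrix.items():
    print(language, row)
```

Each row has one character per feature, in the same column order for every language, which is what allows the rows to be treated like an aligned sequence matrix.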

In this tutorial we will analyse a small selection of Central Pacific languages and focus on the steps that are required in a linguistic analysis. The linguistic data come from 20 languages spoken in Polynesia in the Pacific, including Hawaiian, Fijian, Tahitian, Tongan, Samoan, Rapanui (spoken on Easter Island) and Maori (spoken in New Zealand). This group of languages forms the "Central Pacific" clade of the great Austronesian language family, which originated in Taiwan around 5000 years ago and spread throughout the Pacific from Madagascar to Easter Island ({% cite Gray2009 --file LanguagePhylogenies/master_refs.bib %}).

This dataset contains binary data marking the presence or absence of ‘cognates’, which are homologous word forms inherited from the common ancestor of these languages (“Proto-Central Pacific”). The raw linguistic data are available on the Austronesian Basic Vocabulary Database website ([https://abvd.shh.mpg.de/austronesian/](https://abvd.shh.mpg.de/austronesian/)). More information about the relationships of these languages and the research behind this classification can be found at Glottolog, with links to the primary research: [https://glottolog.org/resource/languoid/id/cent2060](https://glottolog.org/resource/languoid/id/cent2060).

----

# Programs used in this Exercise
@@ -44,7 +48,6 @@ IcyTree ([https://icytree.org](https://icytree.org)) is a browser-based phylogen

# Practical: Creating a language phylogeny

In this tutorial we will analyse a minor selection of Central Pacific languages and focus on the steps that are required in a linguistic analysis.

## Installing necessary packages
First we need to install the `babel` package (v. 0.2.1 or above) for linguistic analyses. We will also use a Birth-Death model, which requires the `BDSKY` package (v. 1.4.5 or above). `BDSKY` can be installed directly through the package manager of BEAUti. For `babel`, we first need to add extra repositories to the package manager.
13 changes: 13 additions & 0 deletions master_refs.bib
@@ -99,3 +99,16 @@ @Article{Bouckaert2012
publisher = {American Association for the Advancement of Science ({AAAS})},
}

@Article{Gray2009,
author = {R. D. Gray and A. J. Drummond and S. J. Greenhill},
title = {Language Phylogenies Reveal Expansion Pulses and Pauses in Pacific Settlement},
journal = {Science},
year = {2009},
volume = {323},
number = {5913},
pages = {479--483},
month = {jan},
doi = {10.1126/science.1166858},
publisher = {American Association for the Advancement of Science ({AAAS})},
}
