Initial release including the R code, the input data, the outputs, as well as the paper describing the process and the release v1.0
990,724 additions and 13 deletions.
- +45 −11 README.md
- +2,147 −0 code/FamilyTrees.R
- +2 −0 code/README.md
- +1,084 −0 code/StandardizedTrees.R
- +10 −0 input/autotyp/README.txt
- +2,927 −0 input/autotyp/autotyp-trees.csv
- BIN input/distances/ASJP/ASJPSoftware003.zip
- +48 −0 input/distances/ASJP/ReadMe.txt
- +576 −0 input/distances/ASJP/conversion.log
- +339 −0 input/distances/ASJP/gpl-2.0.txt
- BIN input/distances/ASJP/listss15.txt.tar.xz
- BIN input/distances/ASJP/listss16-and-listss16dd.tar.xz
- +138 −0 input/distances/ASJP/process-asjp15-distances.R
- +154 −0 input/distances/ASJP/process-asjp16-distances.R
- +4 −0 input/distances/AUTOTYP/ReadMe.txt
- BIN input/distances/AUTOTYP/autotyp-dist.RData
- +512 −0 input/distances/MG2015/.Rhistory
- BIN input/distances/MG2015/MG2015-autotyp-alpha=0.69.RData
- BIN input/distances/MG2015/MG2015-ethnologue-alpha=0.69.RData
- BIN input/distances/MG2015/MG2015-glottolog-alpha=0.69.RData
- BIN input/distances/MG2015/MG2015-wals-alpha=0.69.RData
- +6 −0 input/distances/MG2015/ReadMe.txt
- +161 −0 input/distances/MG2015/compute MG2015.R
- +13 −0 input/distances/WALS/ReadMe.txt
- +339 −0 input/distances/WALS/gpl-2.0.txt
- +144 −0 input/distances/WALS/process-wals-distances.R
- +7,480 −0 input/ethnologue/LanguageCodes.tab
- BIN input/ethnologue/Language_Code_Data_20140425.zip
- +7,880 −0 input/ethnologue/iso-639-3_20140320.tab
- +4 −0 input/glottolog/ReadMe.txt
- +1 −0 input/glottolog/glottocodes2iso.json
- +435 −0 input/glottolog/tree-glottolog-newick.txt
- +14 −0 input/wals/README.txt
- +2,680 −0 input/wals/language.csv
- +404 −0 output/autotyp/autotyp-newick-constant=1.00.csv
- +404 −0 output/autotyp/autotyp-newick-ga+asjp16.csv
- +404 −0 output/autotyp/autotyp-newick-ga+autotyp.csv
- +404 −0 output/autotyp/autotyp-newick-ga+geo.csv
- +404 −0 output/autotyp/autotyp-newick-ga+mg2015(autotyp).csv
- +404 −0 output/autotyp/autotyp-newick-ga+wals(euclidean).csv
- +404 −0 output/autotyp/autotyp-newick-ga+wals(euclidean,mode).csv
- +404 −0 output/autotyp/autotyp-newick-ga+wals(gower).csv
- +404 −0 output/autotyp/autotyp-newick-ga+wals(gower,mode).csv
- +404 −0 output/autotyp/autotyp-newick-grafen.csv
- +404 −0 output/autotyp/autotyp-newick-nj+asjp16.csv
- +404 −0 output/autotyp/autotyp-newick-nj+autotyp.csv
- +404 −0 output/autotyp/autotyp-newick-nj+geo.csv
- +404 −0 output/autotyp/autotyp-newick-nj+mg2015(autotyp).csv
- +404 −0 output/autotyp/autotyp-newick-nj+wals(euclidean).csv
- +404 −0 output/autotyp/autotyp-newick-nj+wals(euclidean,mode).csv
- +404 −0 output/autotyp/autotyp-newick-nj+wals(gower).csv
- +404 −0 output/autotyp/autotyp-newick-nj+wals(gower,mode).csv
- +404 −0 output/autotyp/autotyp-newick-nnls+asjp16.csv
- +404 −0 output/autotyp/autotyp-newick-nnls+autotyp.csv
- +404 −0 output/autotyp/autotyp-newick-nnls+geo.csv
- +404 −0 output/autotyp/autotyp-newick-nnls+mg2015(autotyp).csv
- +404 −0 output/autotyp/autotyp-newick-nnls+wals(euclidean).csv
- +404 −0 output/autotyp/autotyp-newick-nnls+wals(euclidean,mode).csv
- +404 −0 output/autotyp/autotyp-newick-nnls+wals(gower).csv
- +404 −0 output/autotyp/autotyp-newick-nnls+wals(gower,mode).csv
- +404 −0 output/autotyp/autotyp-newick-proportional=1.00.csv
- +404 −0 output/autotyp/autotyp-newick.csv
- +4,277 −0 output/autotyp/autotyp-nexus-constant=1.00.nex
- +2,736 −0 output/autotyp/autotyp-nexus-ga+asjp16.nex
- +3,426 −0 output/autotyp/autotyp-nexus-ga+autotyp.nex
- +3,430 −0 output/autotyp/autotyp-nexus-ga+geo.nex
- +3,588 −0 output/autotyp/autotyp-nexus-ga+mg2015(autotyp).nex
- +2,888 −0 output/autotyp/autotyp-nexus-ga+wals(euclidean).nex
- +3,014 −0 output/autotyp/autotyp-nexus-ga+wals(euclidean,mode).nex
- +2,888 −0 output/autotyp/autotyp-nexus-ga+wals(gower).nex
- +3,014 −0 output/autotyp/autotyp-nexus-ga+wals(gower,mode).nex
- +4,277 −0 output/autotyp/autotyp-nexus-grafen.nex
- +2,152 −0 output/autotyp/autotyp-nexus-nj+asjp16.nex
- +237 −0 output/autotyp/autotyp-nexus-nj+autotyp.nex
- +2,684 −0 output/autotyp/autotyp-nexus-nj+geo.nex
- +2,742 −0 output/autotyp/autotyp-nexus-nj+mg2015(autotyp).nex
- +696 −0 output/autotyp/autotyp-nexus-nj+wals(euclidean).nex
- +2,359 −0 output/autotyp/autotyp-nexus-nj+wals(euclidean,mode).nex
- +696 −0 output/autotyp/autotyp-nexus-nj+wals(gower).nex
- +2,359 −0 output/autotyp/autotyp-nexus-nj+wals(gower,mode).nex
- +2,758 −0 output/autotyp/autotyp-nexus-nnls+asjp16.nex
- +1,979 −0 output/autotyp/autotyp-nexus-nnls+autotyp.nex
- +3,452 −0 output/autotyp/autotyp-nexus-nnls+geo.nex
- +3,588 −0 output/autotyp/autotyp-nexus-nnls+mg2015(autotyp).nex
- +1,704 −0 output/autotyp/autotyp-nexus-nnls+wals(euclidean).nex
- +3,042 −0 output/autotyp/autotyp-nexus-nnls+wals(euclidean,mode).nex
- +1,704 −0 output/autotyp/autotyp-nexus-nnls+wals(gower).nex
- +3,042 −0 output/autotyp/autotyp-nexus-nnls+wals(gower,mode).nex
- +4,277 −0 output/autotyp/autotyp-nexus-proportional=1.00.nex
- +4,277 −0 output/autotyp/autotyp-nexus.nex
- +17,257 −0 output/code_mappings_iso_wals_autotyp_glottolog.csv
- +148 −0 output/ethnologue/ethnologue-newick-constant=1.00.csv
- +148 −0 output/ethnologue/ethnologue-newick-ga+asjp16.csv
- +148 −0 output/ethnologue/ethnologue-newick-ga+autotyp.csv
- +148 −0 output/ethnologue/ethnologue-newick-ga+geo.csv
- +148 −0 output/ethnologue/ethnologue-newick-ga+mg2015(ethnologue).csv
- +148 −0 output/ethnologue/ethnologue-newick-ga+wals(euclidean).csv
- +148 −0 output/ethnologue/ethnologue-newick-ga+wals(euclidean,mode).csv
- +148 −0 output/ethnologue/ethnologue-newick-ga+wals(gower).csv
- +148 −0 output/ethnologue/ethnologue-newick-ga+wals(gower,mode).csv
- +148 −0 output/ethnologue/ethnologue-newick-grafen.csv
- +148 −0 output/ethnologue/ethnologue-newick-nj+asjp16.csv
- +148 −0 output/ethnologue/ethnologue-newick-nj+autotyp.csv
- +148 −0 output/ethnologue/ethnologue-newick-nj+geo.csv
- +148 −0 output/ethnologue/ethnologue-newick-nj+mg2015(ethnologue).csv
- +148 −0 output/ethnologue/ethnologue-newick-nj+wals(euclidean).csv
- +148 −0 output/ethnologue/ethnologue-newick-nj+wals(euclidean,mode).csv
- +148 −0 output/ethnologue/ethnologue-newick-nj+wals(gower).csv
- +148 −0 output/ethnologue/ethnologue-newick-nj+wals(gower,mode).csv
- +148 −0 output/ethnologue/ethnologue-newick-nnls+asjp16.csv
- +148 −0 output/ethnologue/ethnologue-newick-nnls+autotyp.csv
- +148 −0 output/ethnologue/ethnologue-newick-nnls+geo.csv
- +148 −0 output/ethnologue/ethnologue-newick-nnls+mg2015(ethnologue).csv
- +148 −0 output/ethnologue/ethnologue-newick-nnls+wals(euclidean).csv
- +148 −0 output/ethnologue/ethnologue-newick-nnls+wals(euclidean,mode).csv
- +148 −0 output/ethnologue/ethnologue-newick-nnls+wals(gower).csv
- +148 −0 output/ethnologue/ethnologue-newick-nnls+wals(gower,mode).csv
- +148 −0 output/ethnologue/ethnologue-newick-proportional=1.00.csv
- +148 −0 output/ethnologue/ethnologue-newick.csv
- +10,495 −0 output/ethnologue/ethnologue-nexus-constant=1.00.nex
- +5,321 −0 output/ethnologue/ethnologue-nexus-ga+asjp16.nex
- +3,526 −0 output/ethnologue/ethnologue-nexus-ga+autotyp.nex
- +9,271 −0 output/ethnologue/ethnologue-nexus-ga+geo.nex
- +10,456 −0 output/ethnologue/ethnologue-nexus-ga+mg2015(ethnologue).nex
- +3,260 −0 output/ethnologue/ethnologue-nexus-ga+wals(euclidean).nex
- +3,309 −0 output/ethnologue/ethnologue-nexus-ga+wals(euclidean,mode).nex
- +3,260 −0 output/ethnologue/ethnologue-nexus-ga+wals(gower).nex
- +3,309 −0 output/ethnologue/ethnologue-nexus-ga+wals(gower,mode).nex
- +10,495 −0 output/ethnologue/ethnologue-nexus-grafen.nex
- +3,901 −0 output/ethnologue/ethnologue-nexus-nj+asjp16.nex
- +223 −0 output/ethnologue/ethnologue-nexus-nj+autotyp.nex
- +7,235 −0 output/ethnologue/ethnologue-nexus-nj+geo.nex
- +7,532 −0 output/ethnologue/ethnologue-nexus-nj+mg2015(ethnologue).nex
- +487 −0 output/ethnologue/ethnologue-nexus-nj+wals(euclidean).nex
- +2,316 −0 output/ethnologue/ethnologue-nexus-nj+wals(euclidean,mode).nex
- +487 −0 output/ethnologue/ethnologue-nexus-nj+wals(gower).nex
- +2,316 −0 output/ethnologue/ethnologue-nexus-nj+wals(gower,mode).nex
- +5,321 −0 output/ethnologue/ethnologue-nexus-nnls+asjp16.nex
- +718 −0 output/ethnologue/ethnologue-nexus-nnls+autotyp.nex
- +9,271 −0 output/ethnologue/ethnologue-nexus-nnls+geo.nex
- +10,456 −0 output/ethnologue/ethnologue-nexus-nnls+mg2015(ethnologue).nex
- +832 −0 output/ethnologue/ethnologue-nexus-nnls+wals(euclidean).nex
- +3,309 −0 output/ethnologue/ethnologue-nexus-nnls+wals(euclidean,mode).nex
- +832 −0 output/ethnologue/ethnologue-nexus-nnls+wals(gower).nex
- +3,309 −0 output/ethnologue/ethnologue-nexus-nnls+wals(gower,mode).nex
- +10,495 −0 output/ethnologue/ethnologue-nexus-proportional=1.00.nex
- +10,495 −0 output/ethnologue/ethnologue-nexus.nex
- +436 −0 output/glottolog/glottolog-newick-constant=1.00.csv
- +436 −0 output/glottolog/glottolog-newick-ga+asjp16.csv
- +436 −0 output/glottolog/glottolog-newick-ga+autotyp.csv
- +436 −0 output/glottolog/glottolog-newick-ga+geo.csv
- +436 −0 output/glottolog/glottolog-newick-ga+mg2015(glottolog).csv
- +436 −0 output/glottolog/glottolog-newick-ga+wals(euclidean).csv
- +436 −0 output/glottolog/glottolog-newick-ga+wals(euclidean,mode).csv
- +436 −0 output/glottolog/glottolog-newick-ga+wals(gower).csv
- +436 −0 output/glottolog/glottolog-newick-ga+wals(gower,mode).csv
- +436 −0 output/glottolog/glottolog-newick-grafen.csv
- +436 −0 output/glottolog/glottolog-newick-nj+asjp16.csv
- +436 −0 output/glottolog/glottolog-newick-nj+autotyp.csv
- +436 −0 output/glottolog/glottolog-newick-nj+geo.csv
- +436 −0 output/glottolog/glottolog-newick-nj+mg2015(glottolog).csv
- +436 −0 output/glottolog/glottolog-newick-nj+wals(euclidean).csv
- +436 −0 output/glottolog/glottolog-newick-nj+wals(euclidean,mode).csv
- +436 −0 output/glottolog/glottolog-newick-nj+wals(gower).csv
- +436 −0 output/glottolog/glottolog-newick-nj+wals(gower,mode).csv
- +436 −0 output/glottolog/glottolog-newick-nnls+asjp16.csv
- +436 −0 output/glottolog/glottolog-newick-nnls+autotyp.csv
- +436 −0 output/glottolog/glottolog-newick-nnls+geo.csv
- +436 −0 output/glottolog/glottolog-newick-nnls+mg2015(glottolog).csv
- +436 −0 output/glottolog/glottolog-newick-nnls+wals(euclidean).csv
- +436 −0 output/glottolog/glottolog-newick-nnls+wals(euclidean,mode).csv
- +436 −0 output/glottolog/glottolog-newick-nnls+wals(gower).csv
- +436 −0 output/glottolog/glottolog-newick-nnls+wals(gower,mode).csv
- +436 −0 output/glottolog/glottolog-newick-proportional=1.00.csv
- +436 −0 output/glottolog/glottolog-newick.csv
- +23,017 −0 output/glottolog/glottolog-nexus-constant=1.00.nex
- +3,331 −0 output/glottolog/glottolog-nexus-ga+asjp16.nex
- +1,820 −0 output/glottolog/glottolog-nexus-ga+autotyp.nex
- +7,232 −0 output/glottolog/glottolog-nexus-ga+geo.nex
- +22,534 −0 output/glottolog/glottolog-nexus-ga+mg2015(glottolog).nex
- +1,629 −0 output/glottolog/glottolog-nexus-ga+wals(euclidean).nex
- +1,721 −0 output/glottolog/glottolog-nexus-ga+wals(euclidean,mode).nex
- +1,629 −0 output/glottolog/glottolog-nexus-ga+wals(gower).nex
- +1,721 −0 output/glottolog/glottolog-nexus-ga+wals(gower,mode).nex
- +23,017 −0 output/glottolog/glottolog-nexus-grafen.nex
- +2,021 −0 output/glottolog/glottolog-nexus-nj+asjp16.nex
- +192 −0 output/glottolog/glottolog-nexus-nj+autotyp.nex
- +4,651 −0 output/glottolog/glottolog-nexus-nj+geo.nex
- +15,738 −0 output/glottolog/glottolog-nexus-nj+mg2015(glottolog).nex
- +235 −0 output/glottolog/glottolog-nexus-nj+wals(euclidean).nex
- +1,018 −0 output/glottolog/glottolog-nexus-nj+wals(euclidean,mode).nex
- +235 −0 output/glottolog/glottolog-nexus-nj+wals(gower).nex
- +1,018 −0 output/glottolog/glottolog-nexus-nj+wals(gower,mode).nex
- +3,331 −0 output/glottolog/glottolog-nexus-nnls+asjp16.nex
- +509 −0 output/glottolog/glottolog-nexus-nnls+autotyp.nex
- +7,232 −0 output/glottolog/glottolog-nexus-nnls+geo.nex
- +22,534 −0 output/glottolog/glottolog-nexus-nnls+mg2015(glottolog).nex
- +558 −0 output/glottolog/glottolog-nexus-nnls+wals(euclidean).nex
- +1,734 −0 output/glottolog/glottolog-nexus-nnls+wals(euclidean,mode).nex
- +558 −0 output/glottolog/glottolog-nexus-nnls+wals(gower).nex
- +1,734 −0 output/glottolog/glottolog-nexus-nnls+wals(gower,mode).nex
- +23,017 −0 output/glottolog/glottolog-nexus-proportional=1.00.nex
- +23,017 −0 output/glottolog/glottolog-nexus.nex
- +420,850 −0 output/tree_comparisons_between_methods.csv
- +215 −0 output/wals/wals-newick-constant=1.00.csv
- +215 −0 output/wals/wals-newick-ga+asjp16.csv
- +215 −0 output/wals/wals-newick-ga+autotyp.csv
- +215 −0 output/wals/wals-newick-ga+geo.csv
- +215 −0 output/wals/wals-newick-ga+mg2015(wals).csv
- +215 −0 output/wals/wals-newick-ga+wals(euclidean).csv
- +215 −0 output/wals/wals-newick-ga+wals(euclidean,mode).csv
- +215 −0 output/wals/wals-newick-ga+wals(gower).csv
- +215 −0 output/wals/wals-newick-ga+wals(gower,mode).csv
- +215 −0 output/wals/wals-newick-grafen.csv
- +215 −0 output/wals/wals-newick-nj+asjp16.csv
- +215 −0 output/wals/wals-newick-nj+autotyp.csv
- +215 −0 output/wals/wals-newick-nj+geo.csv
- +215 −0 output/wals/wals-newick-nj+mg2015(wals).csv
- +215 −0 output/wals/wals-newick-nj+wals(euclidean).csv
- +215 −0 output/wals/wals-newick-nj+wals(euclidean,mode).csv
- +215 −0 output/wals/wals-newick-nj+wals(gower).csv
- +215 −0 output/wals/wals-newick-nj+wals(gower,mode).csv
- +215 −0 output/wals/wals-newick-nnls+asjp16.csv
- +215 −0 output/wals/wals-newick-nnls+autotyp.csv
- +215 −0 output/wals/wals-newick-nnls+geo.csv
- +215 −0 output/wals/wals-newick-nnls+mg2015(wals).csv
- +215 −0 output/wals/wals-newick-nnls+wals(euclidean).csv
- +215 −0 output/wals/wals-newick-nnls+wals(euclidean,mode).csv
- +215 −0 output/wals/wals-newick-nnls+wals(gower).csv
- +215 −0 output/wals/wals-newick-nnls+wals(gower,mode).csv
- +215 −0 output/wals/wals-newick-proportional=1.00.csv
- +215 −0 output/wals/wals-newick.csv
- +3,404 −0 output/wals/wals-nexus-constant=1.00.nex
- +2,407 −0 output/wals/wals-nexus-ga+asjp16.nex
- +2,637 −0 output/wals/wals-nexus-ga+autotyp.nex
- +3,012 −0 output/wals/wals-nexus-ga+geo.nex
- +3,089 −0 output/wals/wals-nexus-ga+mg2015(wals).nex
- +2,964 −0 output/wals/wals-nexus-ga+wals(euclidean).nex
- +3,089 −0 output/wals/wals-nexus-ga+wals(euclidean,mode).nex
- +2,964 −0 output/wals/wals-nexus-ga+wals(gower).nex
- +3,089 −0 output/wals/wals-nexus-ga+wals(gower,mode).nex
- +3,404 −0 output/wals/wals-nexus-grafen.nex
- +2,052 −0 output/wals/wals-nexus-nj+asjp16.nex
- +291 −0 output/wals/wals-nexus-nj+autotyp.nex
- +2,513 −0 output/wals/wals-nexus-nj+geo.nex
- +2,530 −0 output/wals/wals-nexus-nj+mg2015(wals).nex
- +310 −0 output/wals/wals-nexus-nj+wals(euclidean).nex
- +2,530 −0 output/wals/wals-nexus-nj+wals(euclidean,mode).nex
- +310 −0 output/wals/wals-nexus-nj+wals(gower).nex
- +2,530 −0 output/wals/wals-nexus-nj+wals(gower,mode).nex
- +2,429 −0 output/wals/wals-nexus-nnls+asjp16.nex
- +1,466 −0 output/wals/wals-nexus-nnls+autotyp.nex
- +3,016 −0 output/wals/wals-nexus-nnls+geo.nex
- +3,089 −0 output/wals/wals-nexus-nnls+mg2015(wals).nex
- +1,637 −0 output/wals/wals-nexus-nnls+wals(euclidean).nex
- +3,089 −0 output/wals/wals-nexus-nnls+wals(euclidean,mode).nex
- +1,637 −0 output/wals/wals-nexus-nnls+wals(gower).nex
- +3,089 −0 output/wals/wals-nexus-nnls+wals(gower,mode).nex
- +3,404 −0 output/wals/wals-nexus-proportional=1.00.nex
- +3,404 −0 output/wals/wals-nexus.nex
- +1 −1 paper/README.md
- +751 −0 paper/family-trees-with-brlength.Rmd
- +344 −0 paper/family-trees-with-brlength.bib
- +4,802 −0 paper/family-trees-with-brlength.html
- BIN paper/family-trees-with-brlength.pdf
- +1 −1 releases/README.md
- BIN releases/v1.0.tar.xz
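
The output file names listed above appear to follow the pattern `<classification>-<format>[-<method>[+<distances> | =<value>]]`, followed by a `.csv` or `.nex` extension. The following is a minimal sketch in `R` of how such a name could be split into its components. The helper `parse_tree_filename` and the component names it returns are hypothetical (they are not part of the repository's code), and the pattern is inferred from the listing only, not from the paper.

```r
# Hypothetical helper (NOT part of the repository): split an output file name
# such as "autotyp-newick-ga+wals(euclidean,mode).csv" into its components.
# The naming pattern is an assumption inferred from the file listing.
parse_tree_filename <- function(path) {
  # Drop the directory and the .csv / .nex extension
  stem  <- sub("\\.(csv|nex)$", "", basename(path))
  # Components are separated by "-": classification, format, then an optional spec
  parts <- strsplit(stem, "-", fixed = TRUE)[[1]]
  spec  <- if (length(parts) >= 3) paste(parts[-(1:2)], collapse = "-") else NA_character_

  method <- spec
  detail <- NA_character_
  if (!is.na(spec) && grepl("[+=]", spec)) {
    # "method+distances" or "method=value": split at the first "+" or "="
    method <- sub("[+=].*$", "", spec)
    detail <- sub("^[^+=]*[+=]", "", spec)
  }

  list(classification = parts[1], format = parts[2],
       method = method, detail = detail)
}

# Usage on three names taken from the listing
str(parse_tree_filename("output/autotyp/autotyp-newick-ga+wals(euclidean,mode).csv"))
str(parse_tree_filename("output/wals/wals-nexus-constant=1.00.nex"))
str(parse_tree_filename("output/glottolog/glottolog-newick.csv"))
```

Files without a method component (such as `glottolog-newick.csv`) yield `NA` for both `method` and `detail` in this sketch.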
| @@ -1,27 +1,61 @@ | ||
| -# lgfam-newick | ||
| -Language family classifications as Newick trees | ||
| +# lgfam-newick: Language family classifications as Newick trees | ||
| -This repository contains the data and code associated with the **PAPER**. | ||
| -The code is released under GPL v2, but the various pieces of input data might be govered by different licenses (specified in the respective folders). | ||
| +## Summary | ||
| + | ||
| +This repository contains the data, R code, outputs and description of a flexible method for generating standardized [Newick](http://evolution.genetics.washington.edu/phylip/newicktree.html) language family trees with branch lengths from the four most used language classification databases: [Ethnologue](http://www.ethnologue.com/), [WALS](http://wals.info/), [AUTOTYP](http://www.autotyp.uzh.ch/) and [Glottolog](http://glottolog.org/). | ||
| +The code is released under [GPL v2](http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html), but the various pieces of input data might be governed by different licenses (specified in the respective folders). | ||
| The aims of this project are to: | ||
| a) provide several well-known linguistic (genealogical) classifications (currently [WALS](http://wals.info/), [Ethnologue](http://www.ethnologue.com/), [Glottolog](http://glottolog.org/) and [AUTOTYP](http://www.autotyp.uzh.ch/)) in the *de facto* standard [Newick format](https://en.wikipedia.org/wiki/Newick_format), and | ||
| b) offer a set of [`R`](http://www.r-project.org/) `S3` classes and functions for reading, converting, writing and working with language family trees. | ||
| -The accompanying **PAPER** describes in detail the data sources and conversion process. | ||
| +## Accompanying paper, outputs and acknowledging this work | ||
| + | ||
| +The **accompanying paper** (in the `./paper/` directory) describes in detail the data sources and the conversion process. | ||
| +The paper itself is written in [`R Markdown`](http://rmarkdown.rstudio.com/) and can be compiled to PDF (the primary output in the `family-trees-with-brlength.pdf` file) or HTML (the `family-trees-with-brlength.html` file). | ||
| + | ||
| +The actual Newick trees with branch lengths are in the `./output/` directory and can be used directly (the file formats are described in the **accompanying paper** but briefly they come as **CSV TAB-separated files** and equivalent **Nexus files** that contain the language family trees in the **Newick format**; the file name gives details about the classification, method and parameters used to compute the topology and branch lengths). | ||
| + | ||
| +If you use (parts of) the `R` scripts and/or the generated Newick trees, please do cite this in your work and provide links to this repository ([https://github.com/ddediu/lgfam-newick](https://github.com/ddediu/lgfam-newick))! | ||
| + | ||
| + | ||
| +## Releases | ||
| + | ||
| +"Official" releases can be found in the `./releases` directory. | |
| -This repository contains: | ||
| -a) the original data and code associated with the **PAPER**, but also | ||
| -b) updates and bugfixes concerning the data and code. | ||
| +## Running the `R` code | ||
| -If you find this useful please cite the **PAPER** in your work! | ||
| +If you are **trying to run the `R` code yourself**, please note that I have removed some of the large cached intermediary results (in order to save space). | ||
| +Thus, you must first generate these cached data, as follows. | ||
| -Thank you, | ||
| +Run the `./input/distances/WALS/process-wals-distances.R` script to generate the WALS-based distance matrices. | ||
| + | ||
| +Run the `./input/distances/ASJP/process-asjp16-distances.R` script to generate the ASJP16-based distance matrix. | ||
| + | ||
| +Run the `./code/StandardizedTrees.R` main `R` script with the following parameters set to `TRUE`: `MATCH_CODES` (compute the equivalences between the ISO, WALS, AUTOTYP and GLOTTOLOG codes and generate the UULIDs), `PREOPTIMIZE_DISTS` (pre-optimize the distance matrices for fast loading when required) and `COMPUTE_GEO_DISTS` (compute the geographic distances between languages). | |
| +For later runs (after these data have been generated and cached) these parameters can be safely set to `FALSE` (this pre-processing is computationally very expensive). | |
| +The parameters `TRANSFORM_TREES` (transform the trees from their original specific representation to the Newick notation without branch lengths), `EXPORT_NEXUS` (export the trees to a NEXUS file), `EXPORT_NEXUS_TRANSLATE_BLOCK` (when exporting NEXUS files, generate a TRANSLATE block; useful when using programs such as BayesTraits that have issues parsing complicated taxa names) and `EXPORT_CSV` (export the trees to a CSV file) can be left set to `TRUE` (except perhaps the first, as the tree topologies will probably not change very often in the original databases). | |
| +Please note that the first time the Ethnologue tree topologies are transformed to Newick, these will be downloaded from the Ethnologue website and cached locally. | ||
| +The last two parameters are `COMPUTE_BRLEN` (apply the various branch length methods to the Newick topologies) and `COMPARE_TREES` (compute the distance between equivalent trees). | ||
| +Finally, `CPU_CORES` controls multi-core processing (using `mclapply` -- might not work on Windows!). | ||
| +(It is a good idea to leave `quotes="'"`). | ||
| +Parameters `CLASSIFICATIONS`, `METHODS`, `CONSTANT` and `DISTS.CODES` control which classification, methods and parameters to use for generating the Newick trees. | ||
| +These are very specific to the current implementation but can be used to extend this work to other classifications or branch length methods. | |
| + | ||
| +## Possible bugs! Please report them! | ||
| + | ||
| +Please note that even if the `R` code is relatively well-tested there might be bugs or other issues! | ||
| +So, please use these with caution and any comments, suggestions or bug reports are welcome, either through GitHub's own issue reporting facilities or by e-mail to <Dan.Dediu@mpi.nl>. | ||
| + | ||
| + | ||
| +## Thank you | ||
| Dan Dediu | ||
| -August 2015 | ||
| +The Netherlands | ||
| + | ||
| +October 2015 | ||
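
The README's warning that `mclapply` might not work on Windows reflects a documented limitation of `parallel::mclapply`: it relies on forking, so on Windows it only accepts `mc.cores = 1`. The following is a minimal sketch of one way to guard against this. The helper `safe_cores` is hypothetical and is not how `StandardizedTrees.R` handles `CPU_CORES`.

```r
library(parallel)

# Hypothetical helper (NOT from the repository): fall back to a single core
# on Windows, where mclapply() cannot fork and rejects mc.cores > 1.
safe_cores <- function(requested) {
  if (.Platform$OS.type == "windows") 1L else as.integer(requested)
}

# Toy workload standing in for the per-tree computations
squares <- mclapply(1:4, function(i) i^2, mc.cores = safe_cores(2))
print(unlist(squares))
```

With such a guard the same call runs serially on Windows and in parallel elsewhere, at the cost of losing any speed-up on Windows.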
Commit c8c92f0