Initial release including the R code, the input data, the outputs, as well as the paper describing the process and the release v1.0
1 parent 052ade4 commit c8c92f03dfbbc7969a5c96129e48134190ff513c @ddediu committed Oct 23, 2015
Showing with 990,724 additions and 13 deletions.
  1. +45 −11 README.md
  2. +2,147 −0 code/FamilyTrees.R
  3. +2 −0 code/README.md
  4. +1,084 −0 code/StandardizedTrees.R
  5. +10 −0 input/autotyp/README.txt
  6. +2,927 −0 input/autotyp/autotyp-trees.csv
  7. BIN input/distances/ASJP/ASJPSoftware003.zip
  8. +48 −0 input/distances/ASJP/ReadMe.txt
  9. +576 −0 input/distances/ASJP/conversion.log
  10. +339 −0 input/distances/ASJP/gpl-2.0.txt
  11. BIN input/distances/ASJP/listss15.txt.tar.xz
  12. BIN input/distances/ASJP/listss16-and-listss16dd.tar.xz
  13. +138 −0 input/distances/ASJP/process-asjp15-distances.R
  14. +154 −0 input/distances/ASJP/process-asjp16-distances.R
  15. +4 −0 input/distances/AUTOTYP/ReadMe.txt
  16. BIN input/distances/AUTOTYP/autotyp-dist.RData
  17. +512 −0 input/distances/MG2015/.Rhistory
  18. BIN input/distances/MG2015/MG2015-autotyp-alpha=0.69.RData
  19. BIN input/distances/MG2015/MG2015-ethnologue-alpha=0.69.RData
  20. BIN input/distances/MG2015/MG2015-glottolog-alpha=0.69.RData
  21. BIN input/distances/MG2015/MG2015-wals-alpha=0.69.RData
  22. +6 −0 input/distances/MG2015/ReadMe.txt
  23. +161 −0 input/distances/MG2015/compute MG2015.R
  24. +13 −0 input/distances/WALS/ReadMe.txt
  25. +339 −0 input/distances/WALS/gpl-2.0.txt
  26. +144 −0 input/distances/WALS/process-wals-distances.R
  27. +7,480 −0 input/ethnologue/LanguageCodes.tab
  28. BIN input/ethnologue/Language_Code_Data_20140425.zip
  29. +7,880 −0 input/ethnologue/iso-639-3_20140320.tab
  30. +4 −0 input/glottolog/ReadMe.txt
  31. +1 −0 input/glottolog/glottocodes2iso.json
  32. +435 −0 input/glottolog/tree-glottolog-newick.txt
  33. +14 −0 input/wals/README.txt
  34. +2,680 −0 input/wals/language.csv
  35. +404 −0 output/autotyp/autotyp-newick-constant=1.00.csv
  36. +404 −0 output/autotyp/autotyp-newick-ga+asjp16.csv
  37. +404 −0 output/autotyp/autotyp-newick-ga+autotyp.csv
  38. +404 −0 output/autotyp/autotyp-newick-ga+geo.csv
  39. +404 −0 output/autotyp/autotyp-newick-ga+mg2015(autotyp).csv
  40. +404 −0 output/autotyp/autotyp-newick-ga+wals(euclidean).csv
  41. +404 −0 output/autotyp/autotyp-newick-ga+wals(euclidean,mode).csv
  42. +404 −0 output/autotyp/autotyp-newick-ga+wals(gower).csv
  43. +404 −0 output/autotyp/autotyp-newick-ga+wals(gower,mode).csv
  44. +404 −0 output/autotyp/autotyp-newick-grafen.csv
  45. +404 −0 output/autotyp/autotyp-newick-nj+asjp16.csv
  46. +404 −0 output/autotyp/autotyp-newick-nj+autotyp.csv
  47. +404 −0 output/autotyp/autotyp-newick-nj+geo.csv
  48. +404 −0 output/autotyp/autotyp-newick-nj+mg2015(autotyp).csv
  49. +404 −0 output/autotyp/autotyp-newick-nj+wals(euclidean).csv
  50. +404 −0 output/autotyp/autotyp-newick-nj+wals(euclidean,mode).csv
  51. +404 −0 output/autotyp/autotyp-newick-nj+wals(gower).csv
  52. +404 −0 output/autotyp/autotyp-newick-nj+wals(gower,mode).csv
  53. +404 −0 output/autotyp/autotyp-newick-nnls+asjp16.csv
  54. +404 −0 output/autotyp/autotyp-newick-nnls+autotyp.csv
  55. +404 −0 output/autotyp/autotyp-newick-nnls+geo.csv
  56. +404 −0 output/autotyp/autotyp-newick-nnls+mg2015(autotyp).csv
  57. +404 −0 output/autotyp/autotyp-newick-nnls+wals(euclidean).csv
  58. +404 −0 output/autotyp/autotyp-newick-nnls+wals(euclidean,mode).csv
  59. +404 −0 output/autotyp/autotyp-newick-nnls+wals(gower).csv
  60. +404 −0 output/autotyp/autotyp-newick-nnls+wals(gower,mode).csv
  61. +404 −0 output/autotyp/autotyp-newick-proportional=1.00.csv
  62. +404 −0 output/autotyp/autotyp-newick.csv
  63. +4,277 −0 output/autotyp/autotyp-nexus-constant=1.00.nex
  64. +2,736 −0 output/autotyp/autotyp-nexus-ga+asjp16.nex
  65. +3,426 −0 output/autotyp/autotyp-nexus-ga+autotyp.nex
  66. +3,430 −0 output/autotyp/autotyp-nexus-ga+geo.nex
  67. +3,588 −0 output/autotyp/autotyp-nexus-ga+mg2015(autotyp).nex
  68. +2,888 −0 output/autotyp/autotyp-nexus-ga+wals(euclidean).nex
  69. +3,014 −0 output/autotyp/autotyp-nexus-ga+wals(euclidean,mode).nex
  70. +2,888 −0 output/autotyp/autotyp-nexus-ga+wals(gower).nex
  71. +3,014 −0 output/autotyp/autotyp-nexus-ga+wals(gower,mode).nex
  72. +4,277 −0 output/autotyp/autotyp-nexus-grafen.nex
  73. +2,152 −0 output/autotyp/autotyp-nexus-nj+asjp16.nex
  74. +237 −0 output/autotyp/autotyp-nexus-nj+autotyp.nex
  75. +2,684 −0 output/autotyp/autotyp-nexus-nj+geo.nex
  76. +2,742 −0 output/autotyp/autotyp-nexus-nj+mg2015(autotyp).nex
  77. +696 −0 output/autotyp/autotyp-nexus-nj+wals(euclidean).nex
  78. +2,359 −0 output/autotyp/autotyp-nexus-nj+wals(euclidean,mode).nex
  79. +696 −0 output/autotyp/autotyp-nexus-nj+wals(gower).nex
  80. +2,359 −0 output/autotyp/autotyp-nexus-nj+wals(gower,mode).nex
  81. +2,758 −0 output/autotyp/autotyp-nexus-nnls+asjp16.nex
  82. +1,979 −0 output/autotyp/autotyp-nexus-nnls+autotyp.nex
  83. +3,452 −0 output/autotyp/autotyp-nexus-nnls+geo.nex
  84. +3,588 −0 output/autotyp/autotyp-nexus-nnls+mg2015(autotyp).nex
  85. +1,704 −0 output/autotyp/autotyp-nexus-nnls+wals(euclidean).nex
  86. +3,042 −0 output/autotyp/autotyp-nexus-nnls+wals(euclidean,mode).nex
  87. +1,704 −0 output/autotyp/autotyp-nexus-nnls+wals(gower).nex
  88. +3,042 −0 output/autotyp/autotyp-nexus-nnls+wals(gower,mode).nex
  89. +4,277 −0 output/autotyp/autotyp-nexus-proportional=1.00.nex
  90. +4,277 −0 output/autotyp/autotyp-nexus.nex
  91. +17,257 −0 output/code_mappings_iso_wals_autotyp_glottolog.csv
  92. +148 −0 output/ethnologue/ethnologue-newick-constant=1.00.csv
  93. +148 −0 output/ethnologue/ethnologue-newick-ga+asjp16.csv
  94. +148 −0 output/ethnologue/ethnologue-newick-ga+autotyp.csv
  95. +148 −0 output/ethnologue/ethnologue-newick-ga+geo.csv
  96. +148 −0 output/ethnologue/ethnologue-newick-ga+mg2015(ethnologue).csv
  97. +148 −0 output/ethnologue/ethnologue-newick-ga+wals(euclidean).csv
  98. +148 −0 output/ethnologue/ethnologue-newick-ga+wals(euclidean,mode).csv
  99. +148 −0 output/ethnologue/ethnologue-newick-ga+wals(gower).csv
  100. +148 −0 output/ethnologue/ethnologue-newick-ga+wals(gower,mode).csv
  101. +148 −0 output/ethnologue/ethnologue-newick-grafen.csv
  102. +148 −0 output/ethnologue/ethnologue-newick-nj+asjp16.csv
  103. +148 −0 output/ethnologue/ethnologue-newick-nj+autotyp.csv
  104. +148 −0 output/ethnologue/ethnologue-newick-nj+geo.csv
  105. +148 −0 output/ethnologue/ethnologue-newick-nj+mg2015(ethnologue).csv
  106. +148 −0 output/ethnologue/ethnologue-newick-nj+wals(euclidean).csv
  107. +148 −0 output/ethnologue/ethnologue-newick-nj+wals(euclidean,mode).csv
  108. +148 −0 output/ethnologue/ethnologue-newick-nj+wals(gower).csv
  109. +148 −0 output/ethnologue/ethnologue-newick-nj+wals(gower,mode).csv
  110. +148 −0 output/ethnologue/ethnologue-newick-nnls+asjp16.csv
  111. +148 −0 output/ethnologue/ethnologue-newick-nnls+autotyp.csv
  112. +148 −0 output/ethnologue/ethnologue-newick-nnls+geo.csv
  113. +148 −0 output/ethnologue/ethnologue-newick-nnls+mg2015(ethnologue).csv
  114. +148 −0 output/ethnologue/ethnologue-newick-nnls+wals(euclidean).csv
  115. +148 −0 output/ethnologue/ethnologue-newick-nnls+wals(euclidean,mode).csv
  116. +148 −0 output/ethnologue/ethnologue-newick-nnls+wals(gower).csv
  117. +148 −0 output/ethnologue/ethnologue-newick-nnls+wals(gower,mode).csv
  118. +148 −0 output/ethnologue/ethnologue-newick-proportional=1.00.csv
  119. +148 −0 output/ethnologue/ethnologue-newick.csv
  120. +10,495 −0 output/ethnologue/ethnologue-nexus-constant=1.00.nex
  121. +5,321 −0 output/ethnologue/ethnologue-nexus-ga+asjp16.nex
  122. +3,526 −0 output/ethnologue/ethnologue-nexus-ga+autotyp.nex
  123. +9,271 −0 output/ethnologue/ethnologue-nexus-ga+geo.nex
  124. +10,456 −0 output/ethnologue/ethnologue-nexus-ga+mg2015(ethnologue).nex
  125. +3,260 −0 output/ethnologue/ethnologue-nexus-ga+wals(euclidean).nex
  126. +3,309 −0 output/ethnologue/ethnologue-nexus-ga+wals(euclidean,mode).nex
  127. +3,260 −0 output/ethnologue/ethnologue-nexus-ga+wals(gower).nex
  128. +3,309 −0 output/ethnologue/ethnologue-nexus-ga+wals(gower,mode).nex
  129. +10,495 −0 output/ethnologue/ethnologue-nexus-grafen.nex
  130. +3,901 −0 output/ethnologue/ethnologue-nexus-nj+asjp16.nex
  131. +223 −0 output/ethnologue/ethnologue-nexus-nj+autotyp.nex
  132. +7,235 −0 output/ethnologue/ethnologue-nexus-nj+geo.nex
  133. +7,532 −0 output/ethnologue/ethnologue-nexus-nj+mg2015(ethnologue).nex
  134. +487 −0 output/ethnologue/ethnologue-nexus-nj+wals(euclidean).nex
  135. +2,316 −0 output/ethnologue/ethnologue-nexus-nj+wals(euclidean,mode).nex
  136. +487 −0 output/ethnologue/ethnologue-nexus-nj+wals(gower).nex
  137. +2,316 −0 output/ethnologue/ethnologue-nexus-nj+wals(gower,mode).nex
  138. +5,321 −0 output/ethnologue/ethnologue-nexus-nnls+asjp16.nex
  139. +718 −0 output/ethnologue/ethnologue-nexus-nnls+autotyp.nex
  140. +9,271 −0 output/ethnologue/ethnologue-nexus-nnls+geo.nex
  141. +10,456 −0 output/ethnologue/ethnologue-nexus-nnls+mg2015(ethnologue).nex
  142. +832 −0 output/ethnologue/ethnologue-nexus-nnls+wals(euclidean).nex
  143. +3,309 −0 output/ethnologue/ethnologue-nexus-nnls+wals(euclidean,mode).nex
  144. +832 −0 output/ethnologue/ethnologue-nexus-nnls+wals(gower).nex
  145. +3,309 −0 output/ethnologue/ethnologue-nexus-nnls+wals(gower,mode).nex
  146. +10,495 −0 output/ethnologue/ethnologue-nexus-proportional=1.00.nex
  147. +10,495 −0 output/ethnologue/ethnologue-nexus.nex
  148. +436 −0 output/glottolog/glottolog-newick-constant=1.00.csv
  149. +436 −0 output/glottolog/glottolog-newick-ga+asjp16.csv
  150. +436 −0 output/glottolog/glottolog-newick-ga+autotyp.csv
  151. +436 −0 output/glottolog/glottolog-newick-ga+geo.csv
  152. +436 −0 output/glottolog/glottolog-newick-ga+mg2015(glottolog).csv
  153. +436 −0 output/glottolog/glottolog-newick-ga+wals(euclidean).csv
  154. +436 −0 output/glottolog/glottolog-newick-ga+wals(euclidean,mode).csv
  155. +436 −0 output/glottolog/glottolog-newick-ga+wals(gower).csv
  156. +436 −0 output/glottolog/glottolog-newick-ga+wals(gower,mode).csv
  157. +436 −0 output/glottolog/glottolog-newick-grafen.csv
  158. +436 −0 output/glottolog/glottolog-newick-nj+asjp16.csv
  159. +436 −0 output/glottolog/glottolog-newick-nj+autotyp.csv
  160. +436 −0 output/glottolog/glottolog-newick-nj+geo.csv
  161. +436 −0 output/glottolog/glottolog-newick-nj+mg2015(glottolog).csv
  162. +436 −0 output/glottolog/glottolog-newick-nj+wals(euclidean).csv
  163. +436 −0 output/glottolog/glottolog-newick-nj+wals(euclidean,mode).csv
  164. +436 −0 output/glottolog/glottolog-newick-nj+wals(gower).csv
  165. +436 −0 output/glottolog/glottolog-newick-nj+wals(gower,mode).csv
  166. +436 −0 output/glottolog/glottolog-newick-nnls+asjp16.csv
  167. +436 −0 output/glottolog/glottolog-newick-nnls+autotyp.csv
  168. +436 −0 output/glottolog/glottolog-newick-nnls+geo.csv
  169. +436 −0 output/glottolog/glottolog-newick-nnls+mg2015(glottolog).csv
  170. +436 −0 output/glottolog/glottolog-newick-nnls+wals(euclidean).csv
  171. +436 −0 output/glottolog/glottolog-newick-nnls+wals(euclidean,mode).csv
  172. +436 −0 output/glottolog/glottolog-newick-nnls+wals(gower).csv
  173. +436 −0 output/glottolog/glottolog-newick-nnls+wals(gower,mode).csv
  174. +436 −0 output/glottolog/glottolog-newick-proportional=1.00.csv
  175. +436 −0 output/glottolog/glottolog-newick.csv
  176. +23,017 −0 output/glottolog/glottolog-nexus-constant=1.00.nex
  177. +3,331 −0 output/glottolog/glottolog-nexus-ga+asjp16.nex
  178. +1,820 −0 output/glottolog/glottolog-nexus-ga+autotyp.nex
  179. +7,232 −0 output/glottolog/glottolog-nexus-ga+geo.nex
  180. +22,534 −0 output/glottolog/glottolog-nexus-ga+mg2015(glottolog).nex
  181. +1,629 −0 output/glottolog/glottolog-nexus-ga+wals(euclidean).nex
  182. +1,721 −0 output/glottolog/glottolog-nexus-ga+wals(euclidean,mode).nex
  183. +1,629 −0 output/glottolog/glottolog-nexus-ga+wals(gower).nex
  184. +1,721 −0 output/glottolog/glottolog-nexus-ga+wals(gower,mode).nex
  185. +23,017 −0 output/glottolog/glottolog-nexus-grafen.nex
  186. +2,021 −0 output/glottolog/glottolog-nexus-nj+asjp16.nex
  187. +192 −0 output/glottolog/glottolog-nexus-nj+autotyp.nex
  188. +4,651 −0 output/glottolog/glottolog-nexus-nj+geo.nex
  189. +15,738 −0 output/glottolog/glottolog-nexus-nj+mg2015(glottolog).nex
  190. +235 −0 output/glottolog/glottolog-nexus-nj+wals(euclidean).nex
  191. +1,018 −0 output/glottolog/glottolog-nexus-nj+wals(euclidean,mode).nex
  192. +235 −0 output/glottolog/glottolog-nexus-nj+wals(gower).nex
  193. +1,018 −0 output/glottolog/glottolog-nexus-nj+wals(gower,mode).nex
  194. +3,331 −0 output/glottolog/glottolog-nexus-nnls+asjp16.nex
  195. +509 −0 output/glottolog/glottolog-nexus-nnls+autotyp.nex
  196. +7,232 −0 output/glottolog/glottolog-nexus-nnls+geo.nex
  197. +22,534 −0 output/glottolog/glottolog-nexus-nnls+mg2015(glottolog).nex
  198. +558 −0 output/glottolog/glottolog-nexus-nnls+wals(euclidean).nex
  199. +1,734 −0 output/glottolog/glottolog-nexus-nnls+wals(euclidean,mode).nex
  200. +558 −0 output/glottolog/glottolog-nexus-nnls+wals(gower).nex
  201. +1,734 −0 output/glottolog/glottolog-nexus-nnls+wals(gower,mode).nex
  202. +23,017 −0 output/glottolog/glottolog-nexus-proportional=1.00.nex
  203. +23,017 −0 output/glottolog/glottolog-nexus.nex
  204. +420,850 −0 output/tree_comparisons_between_methods.csv
  205. +215 −0 output/wals/wals-newick-constant=1.00.csv
  206. +215 −0 output/wals/wals-newick-ga+asjp16.csv
  207. +215 −0 output/wals/wals-newick-ga+autotyp.csv
  208. +215 −0 output/wals/wals-newick-ga+geo.csv
  209. +215 −0 output/wals/wals-newick-ga+mg2015(wals).csv
  210. +215 −0 output/wals/wals-newick-ga+wals(euclidean).csv
  211. +215 −0 output/wals/wals-newick-ga+wals(euclidean,mode).csv
  212. +215 −0 output/wals/wals-newick-ga+wals(gower).csv
  213. +215 −0 output/wals/wals-newick-ga+wals(gower,mode).csv
  214. +215 −0 output/wals/wals-newick-grafen.csv
  215. +215 −0 output/wals/wals-newick-nj+asjp16.csv
  216. +215 −0 output/wals/wals-newick-nj+autotyp.csv
  217. +215 −0 output/wals/wals-newick-nj+geo.csv
  218. +215 −0 output/wals/wals-newick-nj+mg2015(wals).csv
  219. +215 −0 output/wals/wals-newick-nj+wals(euclidean).csv
  220. +215 −0 output/wals/wals-newick-nj+wals(euclidean,mode).csv
  221. +215 −0 output/wals/wals-newick-nj+wals(gower).csv
  222. +215 −0 output/wals/wals-newick-nj+wals(gower,mode).csv
  223. +215 −0 output/wals/wals-newick-nnls+asjp16.csv
  224. +215 −0 output/wals/wals-newick-nnls+autotyp.csv
  225. +215 −0 output/wals/wals-newick-nnls+geo.csv
  226. +215 −0 output/wals/wals-newick-nnls+mg2015(wals).csv
  227. +215 −0 output/wals/wals-newick-nnls+wals(euclidean).csv
  228. +215 −0 output/wals/wals-newick-nnls+wals(euclidean,mode).csv
  229. +215 −0 output/wals/wals-newick-nnls+wals(gower).csv
  230. +215 −0 output/wals/wals-newick-nnls+wals(gower,mode).csv
  231. +215 −0 output/wals/wals-newick-proportional=1.00.csv
  232. +215 −0 output/wals/wals-newick.csv
  233. +3,404 −0 output/wals/wals-nexus-constant=1.00.nex
  234. +2,407 −0 output/wals/wals-nexus-ga+asjp16.nex
  235. +2,637 −0 output/wals/wals-nexus-ga+autotyp.nex
  236. +3,012 −0 output/wals/wals-nexus-ga+geo.nex
  237. +3,089 −0 output/wals/wals-nexus-ga+mg2015(wals).nex
  238. +2,964 −0 output/wals/wals-nexus-ga+wals(euclidean).nex
  239. +3,089 −0 output/wals/wals-nexus-ga+wals(euclidean,mode).nex
  240. +2,964 −0 output/wals/wals-nexus-ga+wals(gower).nex
  241. +3,089 −0 output/wals/wals-nexus-ga+wals(gower,mode).nex
  242. +3,404 −0 output/wals/wals-nexus-grafen.nex
  243. +2,052 −0 output/wals/wals-nexus-nj+asjp16.nex
  244. +291 −0 output/wals/wals-nexus-nj+autotyp.nex
  245. +2,513 −0 output/wals/wals-nexus-nj+geo.nex
  246. +2,530 −0 output/wals/wals-nexus-nj+mg2015(wals).nex
  247. +310 −0 output/wals/wals-nexus-nj+wals(euclidean).nex
  248. +2,530 −0 output/wals/wals-nexus-nj+wals(euclidean,mode).nex
  249. +310 −0 output/wals/wals-nexus-nj+wals(gower).nex
  250. +2,530 −0 output/wals/wals-nexus-nj+wals(gower,mode).nex
  251. +2,429 −0 output/wals/wals-nexus-nnls+asjp16.nex
  252. +1,466 −0 output/wals/wals-nexus-nnls+autotyp.nex
  253. +3,016 −0 output/wals/wals-nexus-nnls+geo.nex
  254. +3,089 −0 output/wals/wals-nexus-nnls+mg2015(wals).nex
  255. +1,637 −0 output/wals/wals-nexus-nnls+wals(euclidean).nex
  256. +3,089 −0 output/wals/wals-nexus-nnls+wals(euclidean,mode).nex
  257. +1,637 −0 output/wals/wals-nexus-nnls+wals(gower).nex
  258. +3,089 −0 output/wals/wals-nexus-nnls+wals(gower,mode).nex
  259. +3,404 −0 output/wals/wals-nexus-proportional=1.00.nex
  260. +3,404 −0 output/wals/wals-nexus.nex
  261. +1 −1 paper/README.md
  262. +751 −0 paper/family-trees-with-brlength.Rmd
  263. +344 −0 paper/family-trees-with-brlength.bib
  264. +4,802 −0 paper/family-trees-with-brlength.html
  265. BIN paper/family-trees-with-brlength.pdf
  266. +1 −1 releases/README.md
  267. BIN releases/v1.0.tar.xz
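The output file names listed above appear to follow the pattern `<classification>-<format>[-<method>[+<distances> or =<constant>]].<ext>`. The following is a minimal sketch in `R` of how such a name could be split into its parts; `parse_output_name` is a hypothetical helper written for illustration (it is not part of the repository), and the pattern is inferred only from the names in this commit.

```r
# Hypothetical helper, not part of the repository: split an output file name
# such as "wals-newick-nj+wals(gower,mode).csv" into its components.
# The naming pattern is inferred from the file list, not from the R scripts.
parse_output_name <- function(path) {
  # Drop the directory and the .csv / .nex extension
  base <- sub("\\.(csv|nex)$", "", basename(path))
  fields <- strsplit(base, "-", fixed = TRUE)[[1]]

  # Everything after "<classification>-<format>-" describes the method
  spec <- if (length(fields) > 2) {
    paste(fields[-(1:2)], collapse = "-")
  } else {
    NA_character_
  }

  # The method name comes before "+" (distance source) or "=" (constant value)
  method <- sub("[+=].*$", "", spec)
  detail <- if (!is.na(spec) && grepl("[+=]", spec)) {
    sub("^[^+=]*[+=]", "", spec)
  } else {
    NA_character_
  }

  c(classification = fields[1], format = fields[2],
    method = method, detail = detail)
}

print(parse_output_name("output/wals/wals-newick-nj+wals(gower,mode).csv"))
print(parse_output_name("output/autotyp/autotyp-nexus-constant=1.00.nex"))
print(parse_output_name("output/glottolog/glottolog-newick.csv"))
```

Files without a method part (for example `glottolog-newick.csv`) yield `NA` for both `method` and `detail`, which presumably corresponds to the topology without branch lengths.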
@@ -1,27 +1,61 @@
-# lgfam-newick
-Language family classifications as Newick trees
+# lgfam-newick: Language family classifications as Newick trees
-This repository contains the data and code associated with the **PAPER**.
-The code is released under GPL v2, but the various pieces of input data might be govered by different licenses (specified in the respective folders).
+## Summary
+
+This repository contains the data, R code, outputs and description of a flexible method for generating standardized [Newick](http://evolution.genetics.washington.edu/phylip/newicktree.html) language family trees with branch lengths from the four most used language classification databases: [Ethnologue](http://www.ethnologue.com/), [WALS](http://wals.info/), [AUTOTYP](http://www.autotyp.uzh.ch/) and [Glottolog](http://glottolog.org/).
+The code is released under [GPL v2](http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html), but the various pieces of input data might be governed by different licenses (specified in the respective folders).
The aims of this project are to:
a) provide several well-known linguistic (genealogical) classifications (currently [WALS](http://wals.info/), [Ethnologue](http://www.ethnologue.com/), [Glottolog](http://glottolog.org/) and [AUTOTYP](http://www.autotyp.uzh.ch/)) in the *de facto* standard [Newick format](https://en.wikipedia.org/wiki/Newick_format), and
b) offer a set of [`R`](http://www.r-project.org/) `S3` classes and functions for reading, converting, writing and working with language family trees.
-The accompanying **PAPER** describes in detail the data sources and conversion process.
+## Accompanying paper, outputs and acknowledging this work
+
+The **accompanying paper** (in the `./paper/` directory) describes in detail the data sources and the conversion process.
+The paper itself is written in [`R Markdown`](http://rmarkdown.rstudio.com/) and can be compiled to PDF (the primary output in the `family-trees-with-brlength.pdf` file) or HTML (the `family-trees-with-brlength.html` file).
+
+The actual Newick trees with branch lengths are in the `./output/` directory and can be used directly (the file formats are described in the **accompanying paper** but briefly they come as **CSV TAB-separated files** and equivalent **Nexus files** that contain the language family trees in the **Newick format**; the file name gives details about the classification, method and parameters used to compute the topology and branch lengths).
+
+If you use (parts of) the `R` scripts and/or the generated Newick trees, please do cite this in your work and provide links to this repository ([https://github.com/ddediu/lgfam-newick](https://github.com/ddediu/lgfam-newick))!
+
+
+## Releases
+
+"Official" releases can be found in the `./releases` directory.
-This repository contains:
-a) the original data and code associated with the **PAPER**, but also
-b) updates and bugfixes concerning the data and code.
+## Running the `R` code
-If you find this useful please cite the **PAPER** in your work!
+If you are **trying to run the `R` code yourself**, please note that I have removed some of the large cached intermediary results (in order to save space).
+Thus, you must first generate these cached data, as follows.
-Thank you,
+Run the `./input/distances/WALS/process-wals-distances.R` script to generate the WALS-based distance matrices.
+
+Run the `./input/distances/ASJP/process-asjp16-distances.R` script to generate the ASJP16-based distance matrix.
+
+Run the `./code/StandardizedTrees.R` main `R` script with the following parameters set to `TRUE`: `MATCH_CODES` (compute the equivalences between the ISO, WALS, AUTOTYP and GLOTTOLOG codes and generate the UULIDs), `PREOPTIMIZE_DISTS` (pre-optimize the distance matrices for fast loading when required), `COMPUTE_GEO_DISTS` (compute the geographic distances between languages).
+For later runs (after these data have been generated and cached) these parameters can be safely set to `FALSE` (this pre-processing is computationally very expensive).
+The parameters `TRANSFORM_TREES` (transform the trees from their original specific representation to the Newick notation with no branch lengths), `EXPORT_NEXUS` (export the trees to a NEXUS file), `EXPORT_NEXUS_TRANSLATE_BLOCK` (when exporting NEXUS files, generate a TRANSLATE block; useful when using programs such as BayesTraits that have issues parsing complicated taxa names), `EXPORT_CSV` (export the trees to a CSV file) can be left on `TRUE` (except perhaps the first as the tree topologies will probably not change very often in the original databases).
+Please note that the first time the Ethnologue tree topologies are transformed to Newick, these will be downloaded from the Ethnologue website and cached locally.
+The last two parameters are `COMPUTE_BRLEN` (apply the various branch length methods to the Newick topologies) and `COMPARE_TREES` (compute the distance between equivalent trees).
+Finally, `CPU_CORES` controls multi-core processing (using `mclapply` -- might not work on Windows!).
+(It is a good idea to leave `quotes="'"`).
+Parameters `CLASSIFICATIONS`, `METHODS`, `CONSTANT` and `DISTS.CODES` control which classification, methods and parameters to use for generating the Newick trees.
+These are very specific to the current implementation but can be used to extend this work to other classifications or branch length methods.
+
+## Possible bugs! Please report them!
+
+Please note that even if the `R` code is relatively well-tested there might be bugs or other issues!
+So, please use these with caution and any comments, suggestions or bug reports are welcome, either through GitHub's own issue reporting facilities or by e-mail to <Dan.Dediu@mpi.nl>.
+
+
+## Thank you
Dan Dediu
-August 2015
+The Netherlands
+
+October 2015
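
The "Running the `R` code" section of the new README distinguishes a first run (which builds the cached data) from later runs. The sketch below only illustrates which flags differ between the two; it assumes the parameters are plain logical values set in `code/StandardizedTrees.R`, which the README implies but does not show. `run_flags` is a hypothetical helper, and the `TRUE` values for `COMPUTE_BRLEN` and `COMPARE_TREES` are an assumption, since the README gives no recommended setting for them.

```r
# Sketch only: `run_flags` is a hypothetical helper, not part of the repository.
# Flag names are taken from the README; how StandardizedTrees.R actually reads
# them is an assumption.
run_flags <- function(first_run) {
  list(
    # Expensive pre-processing, needed only until the caches exist
    MATCH_CODES                  = first_run,
    PREOPTIMIZE_DISTS            = first_run,
    COMPUTE_GEO_DISTS            = first_run,
    # The README says these can be left on TRUE
    TRANSFORM_TREES              = TRUE,
    EXPORT_NEXUS                 = TRUE,
    EXPORT_NEXUS_TRANSLATE_BLOCK = TRUE,
    EXPORT_CSV                   = TRUE,
    # Assumed values: the README gives no recommendation for these two
    COMPUTE_BRLEN                = TRUE,
    COMPARE_TREES                = TRUE
  )
}

first <- run_flags(first_run = TRUE)
later <- run_flags(first_run = FALSE)

# Names of the flags whose value differs between the first and later runs
changed <- names(first)[unlist(first) != unlist(later)]
print(changed)
```

Before a first run of the main script, the two distance-processing scripts named in the README (`process-wals-distances.R` and `process-asjp16-distances.R`) still have to be run on their own.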