This repository contains online material from the AfricanNeo Project (Fortes-Lima et al. Nature 2023).
The genetic legacy of the expansion of Bantu-speaking peoples in Africa. Cesar A. Fortes-Lima, Concetta Burgarella, Rickard Hammarén, Anders Eriksson, Mário Vicente, Cecile Jolly, Armando Semo, Hilde Gunnink, Sara Pacchiarotti, Leon Mundeke, Igor Matonda, Joseph Koni Muluwa, Peter Coutros, Terry S. Nyambe, Cirhuza Justin Cikomola, Vinet Coetzee, Minique de Castro, Peter Ebbesen, Joris Delanghe, Mark Stoneking, Larry Barham, Marlize Lombard, Anja Meyer, Maryna Steyn, Helena Malmström, Jorge Rocha, Himla Soodyall, Brigitte Pakendorf, Koen Bostoen, and Carina M. Schlebusch. 2023. doi: 10.1038/s41586-023-06770-6.
Acronyms: BSP means “Bantu-speaking populations” and RHG means “African rainforest hunter-gatherers”.
Note that some figures are plotted together in the same file (e.g., different projections of the same PCA results). To view each panel of such a figure, click on the tabs at the top-left corner.
On the right side of each interactive plot there are tools to scroll and zoom with the mouse (Box Zoom, Wheel Zoom, Wheel Zoom on the y-axis, Zoom in on the x-axis, Undo, and Reset) and to save the plot as a high-resolution figure (*.svg or *.png format).
Supplementary Fig. 2a | New and previously reported modern-day populations included in the AfricanNeo dataset
Supplementary Fig. 6a | UMAP approach on the basis of genotype data of the groups included in the AfricanNeo dataset
Supplementary Fig. 6b | UMAP approach on the basis of genotype data of the populations included in the AfricanNeo dataset
Supplementary Fig. 7 | PCA plots for all the Bantu-speaking populations included in the Only-BSP dataset
Supplementary Fig. 8 | PCA plots for selected sub-Saharan African groups included in the Only-African dataset
Supplementary Fig. 8 | PCA plots for selected sub-Saharan African populations included in the Only-African dataset
Supplementary Fig. 12a | PCA-UMAP approach on the basis of genotype data included in the AfricanNeo dataset
Supplementary Fig. 12b | PCA-UMAP approach on the basis of genotype data included in the AfricanNeo dataset
Supplementary Fig. 40 | All categories of ROH length for all the populations included in the AfricanNeo dataset
Supplementary Fig. 41 | All categories of ROH length only for BSP included in the AfricanNeo dataset
Supplementary Fig. 49b | Shaded areas of Suppl. Fig. 49a between the lower and the upper 95% confidence interval
Supplementary Fig. 83b | FST map for BSP included in the AfricanNeo dataset except for the Lozi population
Supplementary Fig. 96_right column | PCA highlighting with colors present-day groups and aDNA individuals.
Supplementary Fig. 97c | PCA-UMAP highlighting with colors present-day populations and aDNA individuals.
Cognates are lexical roots originating in one and the same ancestral lexeme which related languages share through inheritance from a common ancestor. Here, we provide updated cognacy judgments (5,355 in total) and a binary-coded root-meaning association matrix using Lexedata. This lexical dataset is available in Cross-Linguistic Data Format (CLDF) here: cognates.csv, cognatesets.csv, forms.csv, languages.csv, parameters.csv, sources.bib, and Wordlist-metadata.json.
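As an illustration of how a binary-coded root-meaning association matrix can be derived from CLDF tables, here is a minimal Python sketch. It is not the Lexedata export used in the study. It assumes that forms.csv and cognates.csv use the default CLDF Wordlist column names (`ID`, `Language_ID`, `Parameter_ID`, `Form`, `Form_ID`, `Cognateset_ID`); the actual column names are declared in Wordlist-metadata.json and may differ. The rows below are invented toy data, not entries from this dataset.

```python
import csv
import io

# Hypothetical toy stand-ins for forms.csv and cognates.csv (not project data).
FORMS_CSV = """ID,Language_ID,Parameter_ID,Form
f1,lang_a,water,mai
f2,lang_b,water,maji
f3,lang_a,fire,moto
f4,lang_b,fire,tiya
"""

COGNATES_CSV = """ID,Form_ID,Cognateset_ID
c1,f1,set1
c2,f2,set1
c3,f3,set2
c4,f4,set3
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def binary_matrix(forms, cognates):
    """Return (columns, matrix): one column per (cognate set, meaning) pair,
    one row per language, with 1 where the language attests that pair."""
    form_by_id = {row["ID"]: row for row in forms}
    present = set()
    for judgment in cognates:
        form = form_by_id[judgment["Form_ID"]]
        column = (judgment["Cognateset_ID"], form["Parameter_ID"])
        present.add((form["Language_ID"], column))
    languages = sorted({lang for lang, _ in present})
    columns = sorted({col for _, col in present})
    matrix = {
        lang: [1 if (lang, col) in present else 0 for col in columns]
        for lang in languages
    }
    return columns, matrix


columns, matrix = binary_matrix(read_rows(FORMS_CSV), read_rows(COGNATES_CSV))
for lang, row in matrix.items():
    print(lang, row)
```

To run this on the real files, replace the toy strings with the contents of forms.csv and cognates.csv after checking the column names against Wordlist-metadata.json.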
-
This study was funded by the European Research Council (ERC) under the grant AfricanNeo (“The African Neolithic: A genetic perspective”; grant agreement ID: 759933), granted to Prof. Carina Schlebusch.
-
SNP array genotype data of modern-day African populations and whole-genome data of ancient DNA individuals generated in this project were made available through the European Genome-phenome Archive (EGA) data repository.
AfricanNeo aDNA Study EGA ID: EGAD00001011320. See https://ega-archive.org/datasets/EGAD00001011320
Whole-genome sequencing data from 12 Late Iron Age individuals (6 from Zambia and 6 from South Africa). Encrypted raw data are available for direct download in *.bam format.
AfricanNeo ModernDNA Study EGA ID: EGAS50000000006. See https://ega-archive.org/studies/EGAS50000000006
Genome-wide SNP data of 1,763 participants, including 1,526 Bantu speakers from 147 populations across 14 African countries. Encrypted raw autosomal data are available in *.tped and *.tfam format.
- AfricanNeo_A dataset, or Soodyall's Collection, 1,027 samples: EGAD50000000008. See https://ega-archive.org/datasets/EGAD50000000008
- AfricanNeo_B dataset, or Pakendorf's Collection, 156 samples: EGAD50000000006. See https://ega-archive.org/datasets/EGAD50000000006
- AfricanNeo_C dataset, or Bostoen's Collection, 300 samples: EGAD50000000009. See https://ega-archive.org/datasets/EGAD50000000009
- AfricanNeo_D dataset, or Delanghe's Collection, 151 samples: EGAD50000000010. See https://ega-archive.org/datasets/EGAD50000000010
- AfricanNeo_E dataset, or Coetzee's Collection, 100 samples: EGAD50000000011. See https://ega-archive.org/datasets/EGAD50000000011
- AfricanNeo_F dataset, or Ebbesen's Collection, 29 samples: EGAD50000000007. See https://ega-archive.org/datasets/EGAD50000000007
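For readers unfamiliar with the *.tped and *.tfam files mentioned above, the following minimal Python sketch reads the PLINK transposed text layout: each .tfam line describes one sample (family ID, individual ID, father, mother, sex, phenotype), and each .tped line describes one SNP (chromosome, SNP ID, genetic distance, base-pair position) followed by two alleles per sample. The file contents here are invented toy data, and the function names are hypothetical. The real files are encrypted and can only be read after access is granted and the data are decrypted.

```python
import io

# Hypothetical toy contents in PLINK transposed format (not project data).
TFAM = "fam1 ind1 0 0 1 -9\nfam2 ind2 0 0 2 -9\n"
TPED = "1 rs1 0 1000 A A A G\n1 rs2 0 2000 C T 0 0\n"


def read_tfam(handle):
    """Return the individual IDs (second column), in file order."""
    return [line.split()[1] for line in handle if line.strip()]


def read_tped(handle, n_samples):
    """Return {snp_id: [(allele1, allele2), ...]} with one pair per sample."""
    genotypes = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        alleles = fields[4:]  # first four columns are chr, SNP ID, cM, bp
        if len(alleles) != 2 * n_samples:
            raise ValueError(f"unexpected allele count for {fields[1]}")
        genotypes[fields[1]] = [
            (alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)
        ]
    return genotypes


samples = read_tfam(io.StringIO(TFAM))
genotypes = read_tped(io.StringIO(TPED), len(samples))
print(len(samples), "samples,", len(genotypes), "SNPs")
```

In the toy data the `0 0` pair marks a missing genotype, which is the usual PLINK default; for real analyses a dedicated tool such as PLINK is a better choice than hand-written parsing.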
AfricanNeo Data Access Committee ID: EGAC00001003398. See https://ega-archive.org/dacs/EGAC00001003398
Data Access Policy EGA ID: EGAP00001003469. See https://i-am-an-african.net/documents
Prof. Carina Schlebusch. Email: carina.schlebusch@ebc.uu.se
Human Evolution Program, Department of Organismal Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18C, SE-752 36 Uppsala, Sweden
Cesar Fortes-Lima, PhD. Uppsala University, Sweden. Email: cesar.fortes-lima@ebc.uu.se
More scripts and figures are available upon request.