Fossils and open science: Modeling extinction risk of amphibians
I am a biologist working on extinction risk in amphibians at the Museum für Naturkunde in Berlin. In my PhD project, I am trying to figure out which species are doing well, which are not, and what factors determine survival or extinction. I focus on amphibians because they are currently the most endangered group of land vertebrates and therefore of great interest for conservation. To do so, I work with the fossil record of extinct species as well as with data from living species. Understanding the mechanisms behind extinctions, and identifying the key factors that are detrimental to survival, helps decision makers allocate labor and money when shaping conservation policies. In my fellow-program project, I am building a model based on the fossil record of amphibians that identifies important traits influencing extinction risk. This model will then be applied to living amphibians to compare the predicted extinction risk with the risk category assigned to them by the IUCN Red List. I aim to make my research more accessible to other researchers and the interested public by integrating two things in my work: reproducibility and community feedback.
Using the fossil record of amphibians (but also other taxa) has the advantage that we actually know how long a species with certain traits survived. We simply measure the time between the oldest and the youngest occurrence of the species in the fossil record. The longer a species survived, the lower its extinction risk was. For living species we need another way to determine extinction risk, because their life span is obviously incomplete: they are not extinct yet. Traits that have been found to influence extinction risk [2,3], like morphological characteristics (e.g., body size) or geographic range size, are taken into account when estimating their extinction risk. However, until a species eventually becomes extinct, one naturally cannot be sure whether the estimate is right. Looking at the fossil record of species with similar traits might improve our knowledge about the effect of those traits on living species. My model connects the duration of a species in the fossil record with its traits to find out how they influence extinction risk. In a next step, this model will be applied to living species to predict their extinction risk. To find the right model for my data, I am experimenting with different modeling techniques, such as simple generalized linear models and zero-inflated models, as well as methodologically different approaches like Random Forest, a machine learning method.
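The duration measurement described above can be sketched in a few lines. This is only a minimal illustration in Python, not the R analysis used in the project: the species names, the ages, and the function name species_durations are made up for the example, and each occurrence is assumed to carry a single age in million years.

```python
from collections import defaultdict

# Hypothetical occurrence records: (species, age in million years ago).
# Names and ages are invented for illustration; they are not real data.
occurrences = [
    ("Species A", 55.0),
    ("Species A", 48.5),
    ("Species A", 50.2),
    ("Species B", 23.0),
    ("Species B", 21.5),
    ("Species C", 12.0),
]


def species_durations(records):
    """Return the time between oldest and youngest occurrence per species."""
    ages = defaultdict(list)
    for species, age_ma in records:
        ages[species].append(age_ma)
    # Oldest occurrence has the largest age, youngest the smallest.
    return {species: max(a) - min(a) for species, a in ages.items()}


durations = species_durations(occurrences)
for species, duration in sorted(durations.items()):
    print(f"{species}: {duration:.1f} million years")
```

In this sketch, a species known from a single occurrence ends up with a duration of zero. Many such zeros in a data set may be one situation in which zero-inflated models become worth trying.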
Open science hands-on experience
I am collecting a lot of data for my work, and part of it comes from openly accessible databases like the Paleobiology Database, which has recently been assigned the CC BY 4.0 license. As I am using lots of freely available data, I want to keep my work accessible as well. That includes using free open source software like R for analysis, and eventually publishing the results in an open access journal. Another open science aspect that I want to explore in my project is community feedback in the early stages of manuscript writing, which will most likely yield interesting ideas and might even help me make decisions on technical aspects of my work. The first thing I did in my project was to figure out how scientific journals actually deal with preprint servers and similar platforms. In my working environment preprints are still very uncommon, so I was surprised that many of the journals I read on a regular basis not only allow but even encourage authors to deposit early versions of articles in repositories or on preprint servers.
The next step of my project was to figure out a convenient way to incorporate community feedback into my workflow. An initial step involved making my R scripts and some data freely available. As GitHub is probably the most prominent platform for code sharing, I decided to make parts of my data, and the R scripts I am working on, accessible in an online repository. Whenever I work on these files, the newest version is uploaded to the webpage, where people can read, comment on, or download them and try the analysis for themselves. While getting familiar with Git for version control and with GitHub, I also discovered GitHub Pages. GitHub Pages hosts websites and is connected to your GitHub account, which makes it a nice way to introduce a small project. The setup is quite simple and does not necessarily require advanced HTML knowledge, as the pages can be written using RMarkdown syntax. The code for the webpage is stored inside a Git repository and, because it is openly accessible, can simply be copied and modified. In a later phase of my project, I want to publish a first draft of my manuscript on a preprint server like bioRxiv to benefit from feedback even before peer review takes place. As the biological sciences are lagging behind in the use of preprints, this is likely going to be an interesting experience.
An encouragement for open science
I have already gained a lot of insight into the ins and outs of open science through the fellow-program. At first, however, I felt a bit overwhelmed by the huge number of things you can integrate into your workflow to open up your research to a broader audience. Publish or perish is the uncomfortable reality for most natural scientists, and the constant time pressure that most of us experience easily leads to a feeling that you cannot add anything else to your work schedule. Competition adds another aspect to the debate. Scientists compete directly with each other for scarce grant money, and some of my colleagues expressed concerns about theft of ideas or data. Besides the time required, these issues might be a reason for scientists to hold back from making more use of some aspects of open science.
While there is probably no way to integrate something new into the workflow without spending some time learning it first, there is always the option to start small, with steps that take less time. One step might be choosing a journal that is open access or at least offers open access publication, so that your work can be read by everyone, which adds to the availability and distribution of your research. Ensuring reproducibility, another aspect of open science, is not only good scientific practice but also adds to the distribution and likely impact of your research and enables other scientists to build upon your work. A good way to start this process might be to choose one aspect of your work and improve it. I found the Vienna Principles to be a good inspiration.
Making research more widely accessible likely increases its visibility, which in turn creates more impact and can lead to new ideas and collaborations. With preprints, early-stage feedback can even speed up the review process. If you are interested in what a project webpage can look like, or in the progress of the modeling, you can visit my page here.
[1] Baillie, J. E. M., Griffiths, J., Turvey, S. T., Loh, J. & Collen, B. 2010 Evolution Lost: Status and trends of the world’s vertebrates. Zoological Society of London.
[2] Sodhi, N. S., Bickford, D., Diesmos, A. C., Lee, T. M., Koh, L. P., Brook, B. W., Sekercioglu, C. H. & Bradshaw, C. J. A. 2008 Measuring the meltdown: drivers of global amphibian extinction and decline. PLoS One 3, e1636. (doi:10.1371/journal.pone.0001636)
[3] Harnik, P. G. 2011 Direct and indirect effects of biological factors on extinction risk in fossil bivalves. Proc. Natl. Acad. Sci. U. S. A. 108, 13594–13599. (doi:10.1073/pnas.1100572108)