Reproducible Research Resources
This repository contains information to help you make your research reproducible.
This is a public repository and all resources are available under a CC-BY licence. This means you're free to share them in any medium or format and to adapt them to your needs (even commercially) so long as you acknowledge Kirstie Whitaker, Martin O'Reilly and the Turing Reproducible Research contributing team.
Table of contents
- Announcement: Turing Reproducible Research Champions
- FAQ: Open Access
- Reproducible Research Lunches [Every other Monday at 1pm, usually in Mary Shelley/Isaac Asimov meeting rooms]
Announcement: Turing Reproducible Research Champions
Thursday 24th May 2018
We are delighted to announce our three Turing Reproducible Research Champions: Theo Damoulas, Elena Kochkina and Terry Lyons.
Each of our Champions has proposed an example of their work that they want to “level up” in terms of reproducibility. Over the coming months, Kirstie Whitaker and the Research Engineering Group will build up a set of resources that can be used to reproduce the work in the selected papers.
We’re really excited to be working on such a range of projects, each of which presents a different challenge in terms of reproducibility.
Congratulations to all our Champions!
Regular updates about each of the Champions will be posted on their separate pages and on the #reproducible-research Slack channel. Further details about the Champions programme can be found here.
Spatio-temporal Bayesian on-line changepoint detection with model selection
In collaboration with Jeremias Knoblauch
We develop probabilistic algorithms for modelling and predicting non-stationary processes (such as air pollution, financial or other urban processes) across spatio-temporal domains. This specific family of algorithms can be thought of as segmenting a complex dynamic process via more manageable local models. Furthermore, we better exploit spatio-temporal correlation and retain multiple models and multiple potential segmentations in a formal probabilistic manner.
Turing at SemEval-2017 Task 8: Sequential approach to rumour stance classification with branch-LSTM
In collaboration with Maria Liakata and Isabelle Augenstein
In this paper we deal with rumour stance classification, the task of determining the attitude of the users discussing a rumour towards the truthfulness of the rumour. Stance classification is considered to be an important step towards rumour verification, so performing well in this task is expected to be useful in debunking false rumours. We propose an LSTM-based sequential model that, through modelling the conversational structure of tweets, outperforms other systems submitted to the SemEval-2017 Task 8.
A signature-based machine learning model for distinguishing bipolar disorder and borderline personality disorder
In collaboration with Imanol Perez Arribas, Guy Goodwin, John Geddes and Kate Saunders
The diagnosis and provision of feedback for psychiatric disorders is hampered by the dependency on narrative recall and the difficulty of defining their persistence over time. The shortcomings of current diagnostic approaches have motivated a ‘bottom up’ approach to monitoring using more objective data streams. However, analysis of these data streams is very challenging. This paper demonstrates that it is possible to place people on a spectrum using simple mood zoom information (daily scores for several emotions). The dimension reduction and restructuring achieved by signatures meant that, even with the small samples available, one was able to use second order information (the order of different events: anger before depression, …) in addition to first order information (intensity, longevity of depression). Together these gave considerable classification power and allowed the differing diagnoses of the communities participating in this trial to be well separated.
Given the size of the trial, and the complex noisy nature of the data, this was an excellent outcome clinically and from a data science point of view. Specifically, we sought, and succeeded to a good degree, to classify the diagnosis of participants on the basis of their evolving mood and to predict their mood the following day. The generality of the signature-based machine learning model allows these problems to be treated with a similar, generic methodology that can be shared with other contexts where one is analysing complex multimodal data.
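To give a rough feel for how second order signature terms can encode the order of events, here is a minimal sketch. It is not the code used in the paper: the function name `signature_depth2` and the two toy mood paths are hypothetical and chosen only for illustration. It computes the first two levels of the signature of a piecewise linear path by adding one straight segment at a time (Chen's identity).

```python
import numpy as np


def signature_depth2(path):
    """Return the level-1 and level-2 signature terms of a piecewise linear path.

    `path` is a sequence of points, one row per time step, one column per
    channel (here: one column per emotion score). Hypothetical helper.
    """
    path = np.asarray(path, dtype=float)
    dim = path.shape[1]
    level1 = np.zeros(dim)          # total increment of each channel
    level2 = np.zeros((dim, dim))   # iterated integrals over pairs of channels
    for step in np.diff(path, axis=0):
        # Chen's identity for appending one straight-line segment:
        # cross terms with everything seen so far, plus the segment's own term.
        level2 += np.outer(level1, step) + 0.5 * np.outer(step, step)
        level1 += step
    return level1, level2


# Toy data (made up): columns are (anger, depression).
anger_first = [(0, 0), (1, 0), (1, 1)]        # anger rises, then depression
depression_first = [(0, 0), (0, 1), (1, 1)]   # depression rises, then anger

s1_a, s2_a = signature_depth2(anger_first)
s1_d, s2_d = signature_depth2(depression_first)

# Level 1 only sees the overall change, so it is the same for both paths.
print(s1_a, s1_d)
# The antisymmetric part of level 2 changes sign when the order is swapped.
print(0.5 * (s2_a[0, 1] - s2_a[1, 0]), 0.5 * (s2_d[0, 1] - s2_d[1, 0]))
```

Both toy paths start and end at the same point, so their first order terms agree; only the second order terms distinguish "anger before depression" from the reverse.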
Thank you to all the members of the Turing Reproducible Research community.