Interview Summary Report #23

LydiaFrance · 2022-03-02T16:50:12Z

We spoke to multiple group leaders at the Crick about their experiences in running projects with computational biology, using specific tools, and their opinions of projects and work within the Crick Institute.

Short notable points from each meeting:

Radoslav Enchev

Structural biologist working in a wet lab, excited about future tools that can predict or help delineate structures which are experimentally soft.
Currently no machine learning methods in his group
Generating data sets from wet lab experiments that will help develop and train machine learning tools
Data generated at the Crick has no metadata or anything that could make it usable for machine learning in the future; the data is effectively lost
5 PB of data generated during the Crick's lifetime (5 years); this would be very valuable for training models if it were standardised with future-proofing in mind
Having reproducible pipelines and anticipating future advances
Younger researchers don't go to textbooks, they contact other researchers directly when faced with a problem in a computational method, diving into the unknown
Separation between the bioinformatics services and experimentalists, who don't see the steps behind the computational methods

Victor Tybulewicz

Works with bulk RNA sequencing and single cell RNA sequencing methods
Outsources the computational analysis to a bioinformatics team outside of his group
He took non-technical training to get a better view of the projects he was supervising
He said that more junior and younger scientists don't see the divide between computational and non-computational projects
Lots of ECRs and PhDs trying to teach themselves tools and needing help; one of his students is trying to start a computational project soon
He was not very familiar with version control, specific methods, or how to supervise computational work directly; to him, AI is just buzzwords
The computational models in papers are inaccessible, as well as tools like computer vision solutions

Evangeline Corcoran

Quantitative ecologist applying machine learning and advanced statistical models to environmental science and ecology fields
Issues with onboarding new members with different backgrounds
Best practices for computational projects are hard to learn in theory without having experienced the project work
Making shared resources and databases more accessible for biologists without technical backgrounds
Code reviewing and testing the assumptions of an analysis pipeline
Her colleagues fear releasing code that isn't perfect when they are not experts in computational biology
Wanting to share code but can't share data, and so using notebooks to at least show the data and steps

Francesca Ciccarelli

Using statistics to infer signals from genomic (big) data
Self-trained computational biologist, brought machine learning expertise into the group
Lab group is a mix of different expertise and disciplines
An ecosystem where people lean on each other based on their own expertise, because it's just not possible to be an expert in every single dimension; you have to have that reliance
Getting different groups to work together and lots of discussions

Jim Maas

40 years research experience in computational biology
Can't just shortcut people into expertise in computational methods, there needs to be a lot of time and energy over a long period of time
You need to be realistic, and the people need to be realistic, about the volume of knowledge and information they have to pick up if they want to become even competent at commenting on many of these areas in machine learning or computational biology.

Florencia Iacaruso

Neurobiologist working with computational methods to measure and analyse neurone signals
I have not developed any tools or new methods to analyze it, but I'm always searching for new methods on how to analyze my data.
Bandwidth problem for directly helping members of the lab to code, especially in a large group (11 people)
Keeping on top of all the evolving landscape gets really, really complicated
Senior researchers are not used to being able to contact people through GitHub with pull requests and reporting bugs
Releasing data with publication, but not before.

Important points:

Future-proofing data is a particular problem at the Crick
Generational divide where younger scientists don't separate computational from non-computational work, and are more comfortable with tools and open communities
In groups with people of different expertise, communication and supervision are difficult, with a lack of bandwidth and know-how to help directly
Groups without computational experts outsource the work, do not know what is happening in the pipeline, and rely on collaborations and on people who are not integrated with the project as a whole
Groups without computational methods miss out on new techniques and do not understand the changing landscape
There are no shortcuts to becoming an expert and no course can get someone up to speed with machine learning or computational biology expertise. But tools can be used by anyone.
Lessons about best practices can be too theoretical until actually trying out collaborations and projects.

malvikasharan · 2022-03-02T19:13:44Z

Lydia, this looks so fantastic <3 thank you so much for putting this together. I have added the link to this document in the report and will crosslink to the training material repo as well.

LydiaFrance assigned malvikasharan Mar 2, 2022

malvikasharan mentioned this issue Mar 2, 2022

Include feedback and examples from external experts in the lesson carpentries-incubator/managing-computational-projects#19
