03 Call Digest [ 2023-fred-hutch ] #11

stefaniebutland · 2023-09-28T16:47:25Z

Hi @Openscapes/2023-fred-hutch-cohort !

Our Cohort Call 03 on Tuesday felt great. You all shared so many valuable reflections in the doc and discussions. It's clear that as a group you have ideas and approaches that can help each other! Below is a light digest of Call 03 with a reminder of tasks for next time.

Here's our gorgeous Zoomie. We'll take another in Call 4 so Sean Kross can join.

Monica asked if/how people use Fred Hutch's Biomedical Data Science Wiki and we screenshared an example from the Data Science Lab Resource Library on De-identification of Structured Data. Sita is looking for communal resources (Fred Hutch or other public sources) on good practices and rules for different types of data, like single-cell or genomic data, and the two of them have agreed to connect about getting something on this into the Wiki! This is a great example of something that anyone can join in on during Coworking.

We're moving our optional Coworking time to Thursdays (Oct 5 & 19) 10:00 - 11:00 am, based on a vote during the Call. Look for a Google Calendar invite today.

Need to review material? Agendas and recordings for every call, plus a folder with a Pathway template, can all be found in Openscapes_CohortCalls [ 2023-fred-hutch ]. Hard to access with your Hutch email? Let us know an alternative address and we'll share the folder with it.

Happy Thursday,

Stef, Julie, Sean, Monica, Liz

Digest: Cohort Call 03 [ 2023-fred-hutch ]

Openscapes_CohortCalls [ 2023-fred-hutch ] Google folder - contains agendas, recordings, pathways

https://openscapes.github.io/2023-fred-hutch - Cohort website

Goals: We discussed team culture and data strategies for future us

Tasks: please see the Agenda doc (under Closing) for details and links

Have a Seaside Chat (meet with your team) & continue your Pathway work, shifting to "Next Steps" and thinking about onboarding/offboarding
(optional) Attend Coworking. We're moving our optional Coworking time to Thursdays (Oct 5 & 19) 10:00 - 11:00 am. Come prepared to get your own work done, ask questions, or listen in to other conversations. You could also use the time to hold your team Seaside Chat.

Slide Decks:

Team culture (slides)
Data strategies for Future Us (slides)

A few lines from shared notes in the Agenda doc

Colleagues say "building from what xx person said..." to show listening and amplifying each other +1
My boss would always introduce me to others as a "colleague" rather than an employee/post doc.
Having a guiding principles document that discusses shared values
Having a team approach with people from different backgrounds and experiences helps people feel more comfortable.
Public copy of data organization and naming README template for sequencing expt: CITE-seq of HSPC and Mature bone marrow mononuclear cells in Scl-Cre SRSF2 P95H mutant vs WT mice (thanks Elana!)
I have shared out the data organization in spreadsheets paper many, many, times! 🥳 🥳
This is important for me because I've been working with data from other labs, where the formatting is variable from project to project. Especially for metadata!
The point that stood out most to me during the discussion of tidy data was on slide 11, when talking about 'one-off approaches.' I often find myself jumping to write a script to do something, even though I have a few other scripts that do effectively the same thing but with slightly different formats for many different pipelines. Perhaps standardizing inputs to our pipelines would mean fewer one-off scripts, and improve the ease of access to our computational tools for others in the future.
File naming conventions are a big issue
My first computational mentor placed a strong emphasis on readability, reproducibility, and compartmentalization of every data type and analysis done. Some examples include creating readmes for other lab members to know what is going on in a directory and creating web pages with data from newly annotated genomes.
Sometimes data is organized and perfect, other times it's messy in its own way. When it's ugly, I try to educate (share the data organization in spreadsheets article), and try to receive the data in the SAME ugly format if possible until we begin the next project with tidy data.
Generally, by the time I see our data it is minimally processed alignment files, which are tidy in the sense that their format is almost always standardized and ready to go through a standard data processing pipeline. However, the non-tidy part of this is that the changes made to this data upstream of us receiving it are, at times, not properly disclosed or documented, which can be a headache.
We point our new employees to many of the pages [in Fred Hutch's Biomedical Data Science Wiki] describing the Hutch resources and how-tos.
We have a wiki within our lab that describes our common workflows
There's a balance between having a good CSV and having documentation about that specific data set. But a documentation tab is not ideal for someone who just wants to ingest the data
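
Several of the notes above mention file naming conventions and standardizing inputs. As one possible illustration (not something shown in the call), here is a minimal Python sketch that checks file names against a convention. The convention itself (`YYYY-MM-DD_project_description.ext`, lowercase with hyphens inside each part), the `parse_filename` helper, and the example file names are all hypothetical; a team would swap in whatever pattern it agrees on.

```python
import re

# Hypothetical convention (an assumption, not from the call):
#   YYYY-MM-DD_project_description.ext
# Underscores separate the parts; hyphens separate words within a part.
PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})"
    r"_(?P<project>[a-z0-9-]+)"
    r"_(?P<description>[a-z0-9-]+)"
    r"\.(?P<ext>[a-z0-9]+)$"
)


def parse_filename(name):
    """Return the parts of a file name as a dict, or None if it breaks the convention."""
    match = PATTERN.match(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for name in ["2023-09-26_hspc-citeseq_sample-metadata.csv", "final data v2 (1).xlsx"]:
        parts = parse_filename(name)
        print(name, "->", parts if parts else "does not follow the convention")
```

Because names that follow the pattern can be split back into their parts, the same check could run at the start of a pipeline so that unexpected inputs are caught early instead of being handled by another one-off script.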

stefaniebutland added the digest label Sep 28, 2023
