03 Call Digest [ 2023-fred-hutch ] #11

Open
stefaniebutland opened this issue Sep 28, 2023 · 0 comments

Hi @Openscapes/2023-fred-hutch-cohort !

Our Cohort Call 03 on Tuesday felt great. You all shared so many valuable reflections in the doc and discussions. It's clear that as a group you have ideas and approaches that can help each other! Below is a light digest of Call 03 with a reminder of tasks for next time.

Here's our gorgeous Zoomie. We'll take another in Call 4 so Sean Kross can join. 

[Image: ChampionsCohort_Zoom 2023-fred-hutch]

Monica asked if/how people use Fred Hutch's Biomedical Data Science Wiki, and we screenshared an example from the Data Science Lab Resource Library on De-identification of Structured Data. Sita is looking for communal resources (from Fred Hutch or other public sources) on good practices and rules for different types of data, like single-cell or genomic data. They've agreed to connect about getting something on this into the Wiki! This is a great example of something that anyone can join in on during Coworking.

We're moving our optional Coworking time to Thursdays (Oct 5 & 19) 10:00 - 11:00 am, based on a vote during the Call. Look for a Google Calendar invite today.

Need to review material? Agendas and recordings for every call, plus a folder with a Pathway template, can all be found in Openscapes_CohortCalls [ 2023-fred-hutch ]. Hard to access with your Hutch email? Let us know an alternative email address and we'll share the folder with it.

Happy Thursday,

Stef, Julie, Sean, Monica, Liz

Digest: Cohort Call 03 [ 2023-fred-hutch ]

Openscapes_CohortCalls [ 2023-fred-hutch ] Google folder - contains agendas, recordings, pathways 

https://openscapes.github.io/2023-fred-hutch - Cohort website

Goals: We discussed team culture and data strategies for future us

Tasks: please see the Agenda doc (under Closing) for details and links

  1. Have a Seaside Chat (meet with your team) & continue your Pathway work, shifting to "Next Steps" and thinking about onboarding/offboarding
  2. (optional) Attend Coworking. We're moving our optional Coworking time to Thursdays (Oct 5 & 19) 10:00 - 11:00 am. Come prepared to get your own work done, ask questions, or listen in to other conversations. You could also use the time to hold your team Seaside Chat.

Slide Decks:

  • Team culture (slides)
  • Data strategies for Future Us (slides)

A few lines from shared notes in the Agenda doc

  • Colleagues say "building from what xx person said..." to show listening and amplifying each other +1
  • My boss would always introduce me to others as a "colleague" rather than an employee/post doc.
  • Having a guiding principles document that discusses shared values
  • Having a team approach with people from different backgrounds and experiences helps people feel more comfortable.
  • Public copy of data organization and naming README template for sequencing expt: CITE-seq of HSPC and Mature bone marrow mononuclear cells in Scl-Cre SRSF2 P95H mutant vs WT mice (thanks Elana!)
  • I have shared out the data organization in spreadsheets paper many, many times! 🥳 🥳
  • This is important for me because I've been working with data from other labs, where the formatting is variable from project to project. Especially for metadata! 
  • The point that stood out most to me during the discussion of tidy data was on slide 11, when talking about 'one-off approaches.' I often find myself jumping to write a script to do something, even though I have a few other scripts that do effectively the same thing but with slightly different formats for many different pipelines. Perhaps standardizing inputs to our pipelines would mean fewer one-off scripts, and improve the ease of access to our computational tools for others in the future.
  • File naming conventions are a big issue
  • My first computational mentor placed a strong emphasis on readability, reproducibility, and compartmentalization of every data type and analysis done. Some examples include creating readmes for other lab members to know what is going on in a directory and creating web pages with data from newly annotated genomes. 
  • Sometimes data is organized and perfect, other times it's messy in its own way. When it's ugly, I try to educate (share the data organization in spreadsheets article), and try to receive the data in the SAME ugly format if possible until we begin the next project with tidy data.
  • Generally, by the time I see our data it is minimally processed alignment files, which are tidy in the sense that their format is almost always standardized and ready to go through a standard data processing pipeline. However, the non-tidy part of this is that the changes made to this data upstream of us receiving it are, at times, not properly disclosed or documented, which can be a headache. 
  • We point our new employees to many of the pages [in Fred Hutch's Biomedical Data Science Wiki] describing the Hutch resources and how-tos. 
  • We have a wiki within our lab that describes our common workflows
  • There is a balance between having a good csv and having documentation about that specific data set. But having a documentation tab is not ideal for someone who just wants to ingest the data
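
One of the notes above suggests that standardizing inputs to pipelines could mean fewer one-off scripts. Here's a minimal sketch of that idea in Python: a single helper that maps differently formatted metadata column names onto one standard set before anything else reads the data. The column names, the alias table, and the function names are all hypothetical and only for illustration; a real lab would agree on its own list.

```python
import csv
import io

# Hypothetical alias table: normalized raw column name -> standard column name.
# A real project would define its own agreed-upon names.
COLUMN_ALIASES = {
    "sample": "sample_id",
    "sample id": "sample_id",
    "sampleid": "sample_id",
    "geno": "genotype",
    "genotype": "genotype",
}


def standardize_header(name):
    """Map one raw column name to a standard form.

    Known variants are looked up in COLUMN_ALIASES; unknown names are
    lowercased and joined with underscores.
    """
    key = name.strip().lower().replace("-", " ").replace("_", " ")
    key = " ".join(key.split())  # collapse repeated whitespace
    return COLUMN_ALIASES.get(key, key.replace(" ", "_"))


def read_standardized(text):
    """Parse CSV text and return rows as dicts keyed by standard column names."""
    reader = csv.reader(io.StringIO(text))
    header = [standardize_header(h) for h in next(reader)]
    return [dict(zip(header, row)) for row in reader]


if __name__ == "__main__":
    # Two files from different (made-up) sources with different headers.
    lab_a = "Sample ID,Geno\nA1,WT\n"
    lab_b = "sample_id,genotype\nB7,mutant\n"
    for row in read_standardized(lab_a) + read_standardized(lab_b):
        print(row["sample_id"], row["genotype"])
```

With something like this at the front of a pipeline, downstream scripts only ever need to handle one set of column names, whatever format the data arrived in.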