Example participants in a workshop #14

Open
tracykteal opened this issue Jan 8, 2017 · 8 comments

Comments

@tracykteal

There has been discussion about the example audience for this workshop. We are thinking of someone like an intermediate beginner - someone who has some exposure to programming, but knows they could be more effective and is concerned about reproducibility, for themselves and others. In particular, the focus is on people who are programming to work with data. This is where Jupyter notebooks are particularly effective. As a framework, these are some example users who would attend this workshop:

Population with potentially limited computational experience, but faced with complex computational analysis tasks in their research.

Alice is a biologist who is studying the effects of diet on mouse activity. She has run several studies in which she manipulated the diet, and her hypothesis is that the Cheetos-only diet will decrease activity over time. She has data on the food fed to each mouse, their measured activity levels in a maze once a day, various measures from their blood each day, and sequencing data from each mouse at the beginning of the experiment. First she wants to analyze the data to see if there is any relationship between diet and activity over time. She would also like to analyze the genomic data to see if there are correlations between the genomes of the mice and their responses to diet, which would suggest a combined genetic and environmental relationship. While she is a whiz in the lab and with handling mice, she hasn’t had much computational experience. She has done some work at the command line to use bioinformatics programs and some work in R for a statistics class. She’s looking for a better way to work with all this data. She finds herself doing analyses and not remembering what she’s done or where the data is stored. She wants to make her work reproducible for herself as well as to share with her collaborator who is conducting a similar study in Australia. She also knows she’ll be required to release the data and analysis when she submits the paper, so she wants to be prepared.

People with some programming experience who want to figure out how to use the Jupyter notebook in their workflow and share and publish their data and code.

Bob is a digital humanist. He is analyzing the meter in poems by different European poets of the 1800s, with the idea that meter is linked to the ancestral background of the poet. He is relatively new to programming, but excited about these approaches to linking historical context and poetry. He mainly programs in Python with NLTK and is learning more about machine learning. He has been programming at the command line, but wants to be able to explore his data more effectively and share and communicate his results. As the digital humanities field is new, he’d also like to be able to show that work done this way is reproducible, in order to show the power of this approach, be clear about his work and mentor others.

People with more significant programming experience looking to better share their code and results with collaborators and potentially the public.

Eve is an earth and climate sciences graduate student. She started programming in C++ in undergrad and now she’s using mainly Python in her work. Up to now she’s done most of her work on her own, but now she’s working on a collaborative project with people across the globe. She’s using data from Earthshine as well as some Earth satellite data. The data sets are large, but she can link to them on an Amazon instance. She wants to find ways to work more collaboratively and share the results of her work with her collaborators, but also ultimately the public who will be interested in this data. Her work has some climate implications, so she is particularly concerned that her work be clear and reproducible.

@raynamharris
Contributor

Thanks @tracykteal. During instructor training, one of the activities we do is ask the participants to spend a few minutes writing learning profiles. It might be useful for participants to take 5-10 minutes this week to think about the learners they have in mind for this curriculum.

@hlapp
Contributor

hlapp commented Jan 9, 2017

Lifted text and moved to wiki, where it's easier to reference and collaboratively develop.

@raynamharris
Contributor

We wrote some more learner profiles during our brain dump. We could probably refine them over the next two days and add them to the wiki persona file.

https://docs.google.com/document/d/1OO3bNMTDOH8EzXb-l-dMqT8a7RadVL8mIRYxWn-EC6g/edit

@raynamharris
Contributor

These could be refined into really good personas:

My learner is a graduate student in the natural sciences. They have lots of experience at the bench. They are teaching themselves to code by reading R vignettes and rewriting them to fit their dataset. Their R scripts look EXACTLY like the vignette, and they barely have their own comments. They want to make the workflow reproducible. They want to create output files for their boss that have just figures and text, but they also want to create output files that show the code so they can get feedback on the workflow. Then, they need to publish the data to get a postdoc fellowship. They edit their figures in Illustrator. They think they might want to post a preprint but they really want to get into Science, Nature, or Cell. One challenge this student faces is that her research is collaborative. She has her piece in an R notebook, but the dataframe that she got from her collaborator was generated by MATLAB. How does she combine pieces from other researchers into one reproducible workflow?

My student is a library science master’s student interested in publishing their digital humanities text mining project. She has code in a variety of scripts stored in a folder. Her data was pulled in from HathiTrust (a data source that openly provides in-copyright book text data). She wants to construct a Jupyter notebook with the Python code to publish on her online portfolio and highlight on her resume in her upcoming job search. She has some notes in a physical notebook about what she did, and some of the work was done as part of her text mining course, so it only exists in assignments. She’d like to submit it to a regional Python conference as a talk as well. She used Sublime to write her scripts, but some of the analysis and modeling was done in an Oracle database per her assignment instructions, and the data was extracted from (the data source) using a script written by a consultant at (the data source). She doesn’t care so much about writing a journal article for publication, as she is mostly interested in getting the results and work into the public ASAP so she can mention it in her cover letters.

Alice is a postdoc changing fields somewhat. She has learned one set of skills but her new lab is expecting a whole new set of skills. Her PI is busy and expects her to be able to learn these things on her own. She learns of Jupyter from a Data Carpentry bootcamp and wants to see how her existing document editing workflow (using XeLaTeX, .bib files, glossaries, extensive cross referencing) fits in. The group she has joined is known for its exceptional data visualisations but she has not been trained in basic data visualisation (though she knows a good deal about modern machine learning methods). She is used to version controlling documents (with LaTeX) but now has a PI who wants to write in docx.

Angel has some preliminary work that she would like feedback on from local and remote collaborators. How can she publish her work privately and get specific feedback on her methods and directions?

My learner has a manuscript that is very near completion. She’s currently working on responding to final reviewer comments. She has her submission in the form of a Word doc and supplementary PDFs and Excel files. She did all of her analysis in R, but can barely read her own scripts, much less get others to understand them. She also doesn’t know whether she can release her data, because it is governed by an IRB. She wants to know if there is anything she can do at this stage in the game to make this manuscript more open and the analyses more reproducible. She also wants to learn skills so that she can do things better in the future. What does she need to know about how publishing this manuscript will impact her career if she wants to be part of the open/reproducible research community? She’s also fighting with a reviewer who wants her to remove methodological details and supplementary files (that would make it /less/ reproducible). What can she do to convince her community that these things are important? How can she be an advocate for reproducibility and openness?

@hlapp
Contributor

hlapp commented Jan 12, 2017

@raynamharris do you want to add them to the Personas wiki page?

@raynamharris
Contributor

@hlapp I think it would be good to have them on the wiki, but they need to be refined to fit the format (Eve, Bob, Alice...) and be more concise. So, let's keep this issue open until I (or someone else) get the chance to do that.

@hlapp
Contributor

hlapp commented Jan 12, 2017

I guess what I was trying to say is that this refinement would more easily happen on the wiki. Issue comments don't lend themselves well to collaborative editing, whereas the wiki does.

@raynamharris
Contributor

raynamharris commented Jan 13, 2017

Got it. They've been added to the wiki for easy editing. https://github.com/Reproducible-Science-Curriculum/RR-Jupyter-Hackathon-Jan-2017/wiki/Personas
