New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add draft of getting review doc #17
Conversation
This needs a lot of organization, but most of the content is there.
@vitillo - can you PTAL when you have a chance? Not currently urgent. |
Drive-by suggestion that you can use the "save notebook as gist" button in the jupyter notebook to create a gist with both the |
|
||
# Finding a Reviewer | ||
|
||
There are two major types of review for most analyses: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should mention "code review" as well? There have been cases in the past where the code review was done by an engineer and the data review by an analyst.
@mreid-moz, what do you think about this article? Is it close enough to merge with small improvements or should we abandon and try again in 2018? |
I don't see anything wrong with what's here, let's land it and iterate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great start to a very very very desperately needed process! I've given some general comments. Please feel free to loop me in for more discussion at any time.
|
||
**Methodology review** primarily covers **statistical concepts**. | ||
For example, if you're training regressions, forecasting data, or doing complex | ||
statistical tests, you should probably get Methodology review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can make a slightly more explicit (possibly almost exhaustive) list of types of statistical analyses that are common in Mozilla so that people are not wondering whether their particular analysis requires methodological review or not.
I.e. I'm doing a clustering analysis on some categorical feature space. Do i need to get this reviewed for methodology?
Your data reviewer will tell you if you should get methodology review | ||
and will help you find a reviewer. | ||
|
||
## Data Reviewers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose a similar table for methodological review. Perhaps we may even find an overlap between reviewers who have been working with particular datasets or particular anlaytical methods long enough to be in both tables.
|
||
# Jupyter Notebooks | ||
|
||
It's difficult to review Jupyter notebooks on Github. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jupyter notebooks can be exported as MD files and the github gist function allows a nice view in a separate tab. Normally when reviewing other people's notebooks that's my workflow. It isnt prefect but it allows a pretty comprehensive review using github's existing tooling.
## RTMO | ||
|
||
For simple notebooks, the best way to get review is through the | ||
[knowledge repo](https://github.com/mozilla/mozilla-reports). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Step-by-step instructions on how to go from a runnning ipython notebook to a review request via RTMO would be very helpful here.
|
||
If you are still prototyping your job but want to move out of a Jupyter notebook, | ||
take a look at | ||
[cookiecutter-python-etl](https://github.com/harterrt/cookiecutter-python-etl). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a repo for in-progress (still hacking) notebooks?
Just brainstorming here so feel free to disregard, but perhaps that would satisfy the desired work flow using existing tools. I'm imagining a process where a notebook is in progress in a sort of scratchpad branch and then a PR is initiated to move it to a post-review repo. The review process could require a methodological and data review and the triage owner could waive one or the other if inappropriate.
When the analysis lands in the master branch of the data analysis repo it is considered reviewed.
Thanks, All. @mlopatka, thanks for the review and suggestions. I'm going to submit this as-is and capture your feedback in the bug. I won't have time to make edits until November. Feel free to file a PR if you'd like to make edits! Review status: 0 of 1 files reviewed at latest revision, 6 unresolved discussions. concepts/review.md, line 12 at r1 (raw file): Previously, vitillo (Roberto Agostino Vitillo) wrote…
OK concepts/review.md, line 22 at r1 (raw file): Previously, mlopatka wrote…
This sounds like a great idea. I don't think I'm the right person to write this section though. Can you suggest a data scientist who may be a good fit? concepts/review.md, line 30 at r1 (raw file): Previously, mlopatka wrote…
SGTM concepts/review.md, line 49 at r1 (raw file): Previously, mlopatka wrote…
This works for simple notebooks well, but is difficult to work with for more complicated notebooks. I'm working on a revision to this workflow in Q4 that will hopefully make RTMO more usable and keep us from storing too much analysis in notebooks. concepts/review.md, line 58 at r1 (raw file): Previously, mlopatka wrote…
There's a comprehensive guide to this in RTMO's README concepts/review.md, line 86 at r1 (raw file):
Not yet, but I'd like to have one. Mark it as a Q4 goal. I like this workflow. I think we may be able to build this as a Jupyter addon, which would make it fairly seemless. Comments from Reviewable |
This document isn't ready to be submitted yet, but I want to get some early feedback. This doc keeps getting pre-empted by competing work and I have a suspicion that this document may trigger some larger process discussions. I'd like to get those conversations started now.
This needs a lot of organization, but most of the content is there. High level commentary welcome! Let's be sure to use reviewable since I'll likely revise this document a few times.
You can preview the rendered documentation here
This change is