# Module 6: Code Review


## Lesson Overview

By the end of this lesson you should understand:

* What is a code review? How do we use code reviews as part of the development and CI/CD process to create high-quality code?

* What makes a good code review?

* How code reviews are compatible with testing infrastructure and tools like code linters.

* The process of reviewing code using a related tutorial, [Code Review](https://code-review.org/docs/welcome/introduction/), developed by Helen Kershaw of NCAR.

## Related Training Video

Video link will be posted after the session.

## Important Terminology

- Changelist (CL) - a set of code updates a developer is looking to merge into a larger codebase. Can usually be tied to a single commit or patch.

  - Sometimes also called a [changeset](https://en.wikipedia.org/wiki/Changeset).

- Peer review - another name for a code review

- LGTM - Looks Good To Me, a typical message used by some code reviewers to approve a changelist / changeset





## What is a Code Review?

A code review is a manual process whereby a developer and a set of their peers review proposed changes to a codebase. Reviewers examine the changelist for code to be committed and determine (along with output from automated tests) whether the code improves the quality of the codebase.  

### When and How do Code Reviews Happen?

Code reviews can happen anytime that code is pushed to a production-related branch or repository. Some larger projects have specific rules that require a certain number of code reviews (usually 1-3 reviewers) in addition to automated testing and checks for code style. As an example, some companies might require two reviews for user-facing changes (i.e. to a web app that is publicly available) or even three reviews (e.g., two peer reviews and one manager) for critical internal applications.

In general the process is:

1. Developer submits a pull request (PR) to merge new code. They can request specific reviewers.
2. Automated tests check to see if code is functionally correct.
3. Linters check for newly introduced errors and style issues.
4. Reviewers look over the code and possibly test it in their own environment.
5. Reviewers suggest any changes for the developer to make.
6. Developer updates their PR (if needed).
7. Code is merged, or the iterative process goes back to step 4.

As an example from Open Liberty, we see that two automated tests completed correctly but one manual review is needed to merge the new code.

![openliberty_example_review](https://raw.githubusercontent.com/gt-ospo/oss-training/main/img/lesson-05/openliberty-github-pr-review.png)
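
The merge decision in this process can be sketched in a few lines of Python. This is only an illustration: the `PullRequest` class, its fields, and the `next_step` function are hypothetical names chosen for this lesson, not part of GitHub or any other tool, and real projects configure these rules in their hosting platform rather than in code like this.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    """Hypothetical, simplified model of a pull request under review."""
    tests_passed: bool = False
    lint_passed: bool = False
    approvals: int = 0
    changes_requested: bool = False


def next_step(pr: PullRequest, required_approvals: int = 1) -> str:
    """Return what should happen next for this pull request."""
    if not pr.tests_passed:
        return "fix failing tests"
    if not pr.lint_passed:
        return "fix lint errors"
    if pr.changes_requested:
        return "author updates PR"
    if pr.approvals < required_approvals:
        return "wait for review"
    return "merge"


# Automated checks are green, but no reviewer has approved yet.
pr = PullRequest(tests_passed=True, lint_passed=True)
print(next_step(pr))  # wait for review

# One reviewer approves, which satisfies the one-review requirement.
pr.approvals = 1
print(next_step(pr))  # merge
```

The first call mirrors the situation in the screenshot above: the automated checks have completed, yet the code cannot be merged until a human reviewer approves it.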

## Why Do We Need Code Reviews?

As discussed in previous lectures, automated testing can be a very valuable tool to check if code is correct before it is merged into a larger project. However, it's very tough to get 100% test coverage, and tests can be flawed and might miss some key corner case that is not covered by a test.

**Code review provides an important human element to double-check that updates to a codebase are valid and useful!**

Code reviews also provide good opportunities for collaboration, mentoring, and team building in addition to the benefits of integrating "correct" code.

## What Makes a Good Code Review?

There are many good resources on how to do a good code review, like Google's [Standard of Code Review](https://google.github.io/eng-practices/review/reviewer/standard.html).

In general a good code review process will follow these practices:

* A good pull request from the programmer should be complete and easy to parse
  * Include a description of what the changes do, testing instructions, and screenshots for GUI-related behaviors
* Code reviews should encompass small changelists. That is, a programmer should submit changelists that can be easily reviewed and reviewers should be prompt in their review of submitted code.
  * The right size varies, but many developers suggest a changelist that can be reviewed in under an hour, or roughly 200 lines of code
  * Consider splitting CLs along function or file boundaries to make code easier to review.
* Automate where possible!
  * Automating functionality tests and style checks saves both the developer and reviewer time in the review process.
* Communication is important for both authors and reviewers
  * Authors should clearly communicate changes and address comments in a timely fashion
     * Review guides for [writing good CL descriptions for commits](https://google.github.io/eng-practices/review/developer/cl-descriptions.html)
  * Reviewers should review code in a timely fashion and be considerate of the author's effort and time
     * Code examples or references to guides can be used to convey how to modify or update CLs
* Issues of code style (e.g., how many spaces to use for indentation) should be resolved using an agreed-upon [style guide](https://en.wikipedia.org/wiki/Style_guide), not by personal preference.
  * [Google Style Guides](https://google.github.io/styleguide/)
  * [PEP 8 Python Style Guide](https://peps.python.org/pep-0008/)
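
As a rough illustration of the "automate where possible" and "small changelists" practices, a project could check the size of a changelist before asking for review. The sketch below parses the tab-separated output of `git diff --numstat` (added lines, deleted lines, path). The function names, the sample data, and the 200-line default are assumptions for this lesson, not a standard tool.

```python
def changed_lines(numstat: str) -> int:
    """Sum added and deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary files are reported as "-" and have no line counts
        total += int(added) + int(deleted)
    return total


def is_reviewable(numstat: str, limit: int = 200) -> bool:
    """Return True if the changelist stays within the line limit."""
    return changed_lines(numstat) <= limit


# Hypothetical numstat output for a changelist touching three files.
sample = "120\t30\tsrc/app.py\n-\t-\tdocs/logo.png\n45\t10\ttests/test_app.py\n"

print(changed_lines(sample))  # 205
print(is_reviewable(sample))  # False
```

A changelist slightly over the limit, as here, is a prompt to consider splitting it, not a hard rule.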


For more information, we suggest that you review the resources below in the `Learn More` section with guides for authors and reviewers.


## Creating Effective Pull Requests

With the advent of GenAI tools like GitHub Copilot, ChatGPT, and specialized code review assistants, the ability to create clear, well-structured pull requests has become more important than ever. While GenAI can assist maintainers in understanding and reviewing code, **training in how to craft effective PRs is essential** to ensure that both human reviewers and AI-assisted review tools can easily understand your changes.

### Key Tips for Creating PRs That Are Easy to Review

Here are critical best practices for writing PRs that maintainers can review efficiently:

**1. Write a Clear and Descriptive Title**
   - Use a concise title that summarizes the change at a glance (e.g., "Fix authentication timeout in login flow" rather than "Bug fix")
   - This helps reviewers (and AI tools) immediately understand the purpose of the PR

**2. Provide a Detailed PR Description**
   - Explain *what* you changed and *why* you made those changes
   - Include context about the problem being solved, any related issues or tickets, and the intended impact
   - Mention any breaking changes or migrations that need to be made
   - This narrative context is crucial for AI review tools to assess the appropriateness of your changes

**3. Keep Your PR Focused and Within Reasonable Size**
   - Aim for PRs that can be reviewed in under an hour and contain no more than roughly 200-400 lines of changed code
   - If your changes are larger, split them into multiple PRs along logical boundaries (features, files, or functions)
   - Smaller PRs are easier for both humans and AI to review thoroughly and accurately

**4. Include Test Coverage**
   - Add or update tests that cover your changes
   - Explain any new test cases and why they're necessary
   - This gives reviewers confidence that your code works correctly and helps AI tools understand the intended behavior

**5. Add Examples and Context**
   - For UI changes, include screenshots or videos
   - For API changes, provide usage examples
   - For complex logic, add comments explaining the "why" behind the code
   - Visual and narrative context helps AI tools and reviewers understand the bigger picture

**6. Reference Related Work**
   - Link to relevant issues, design documents, or previous discussions
   - Mention any dependencies on other PRs or branches
   - This helps reviewers (and AI) understand the full context of your changes

**7. Be Responsive to Feedback**
   - Address review comments promptly and thoroughly
   - Explain your reasoning if you disagree with a suggestion
   - Update your PR with changes and let reviewers know when you've made updates
   - Clear communication accelerates the review process for both human and AI-assisted reviews
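
Some of the tips above lend themselves to a simple automated pre-check. The following is a minimal sketch, assuming hypothetical thresholds and a hypothetical list of vague titles. It only flags obvious gaps and cannot judge whether a description is actually clear.

```python
import re
from typing import List

# Hypothetical examples of titles that say nothing about the change.
VAGUE_TITLES = {"bug fix", "fix", "update", "updates", "changes", "wip"}


def pr_warnings(title: str, description: str) -> List[str]:
    """Return a list of warnings about a PR title and description."""
    warnings = []
    if title.strip().lower() in VAGUE_TITLES or len(title.split()) < 3:
        warnings.append("title is too vague")
    if len(description.split()) < 20:
        warnings.append("description is too short to explain what and why")
    if not re.search(r"#\d+", description):
        warnings.append("no linked issue or PR")
    if "test" not in description.lower():
        warnings.append("no mention of tests")
    return warnings


# The issue number #123 below is a placeholder.
good_description = (
    "Sessions expired after 5 minutes because the token refresh timer was "
    "never restarted. This change restarts the timer on each request and "
    "adds a regression test. Closes #123."
)

print(pr_warnings("Bug fix", ""))
print(pr_warnings("Fix authentication timeout in login flow", good_description))  # []
```

The first call reports all four warnings, while the second passes every check.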

**Why This Matters with GenAI:** GenAI review tools can better assist maintainers when they have clear context about your changes. Well-written PRs provide the information needed for AI to make accurate suggestions about code quality, security, and best practices, ultimately making the review process faster and more effective for everyone involved.

# Hands On Practice

Fork the codebase from https://code-review.org/ and follow the `Python` reviewing path.

Bonus Question:

- Do you see any other issues with the GitHub Action workflows as they run? How might you fix this issue?


# Using GitHub Project Management

Effective project management is crucial for the success of open source projects, especially when coordinating multiple contributors, tracking issues, and prioritizing work. GitHub provides several built-in features that make it easier to organize, plan, and track progress on your project.

## Why Prioritize Issues in Open Source Projects?

In an active open source project, issues can quickly accumulate. Without proper prioritization, contributors may:
- Work on low-priority tasks while critical bugs remain unfixed
- Duplicate efforts by working on similar issues
- Feel overwhelmed by the number of open issues
- Struggle to find good first issues to tackle

**Benefits of prioritization:**
- Helps maintainers focus contributor efforts on the most impactful work
- Makes it easier for new contributors to find appropriate tasks
- Provides transparency about project roadmap and priorities
- Enables better resource allocation across the team
- Improves project velocity and code quality

## GitHub Project Management Features

GitHub offers several tools to help manage your open source project:

### 1. **Labels**
- Categorize issues and PRs (e.g., `bug`, `enhancement`, `good first issue`, `help wanted`)
- Prioritize work (e.g., `priority: high`, `priority: low`)
- Indicate status (e.g., `in progress`, `needs review`, `blocked`)
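
To show how labels can drive prioritization, here is a small sketch that orders a list of issues by their priority label and skips blocked ones. The issue data, the dictionary layout, and the ranking of unlabeled issues between high and low are assumptions made for illustration. In practice the same filtering is usually done with label filters in the GitHub interface.

```python
# Lower rank means the issue should be picked up sooner.
PRIORITY_ORDER = {"priority: high": 0, "priority: low": 2}
UNLABELED_RANK = 1  # assumption: issues without a priority label sit in the middle


def priority_rank(labels):
    """Return the best (lowest) rank among an issue's priority labels."""
    ranks = [PRIORITY_ORDER[label] for label in labels if label in PRIORITY_ORDER]
    return min(ranks) if ranks else UNLABELED_RANK


def triage(issues):
    """Drop blocked issues, then sort by priority and issue number."""
    workable = [issue for issue in issues if "blocked" not in issue["labels"]]
    return sorted(workable, key=lambda i: (priority_rank(i["labels"]), i["number"]))


# Hypothetical issues for illustration.
issues = [
    {"number": 12, "title": "Typo in README", "labels": ["priority: low", "good first issue"]},
    {"number": 15, "title": "Crash on login", "labels": ["bug", "priority: high"]},
    {"number": 9, "title": "Add CSV export", "labels": ["enhancement"]},
    {"number": 20, "title": "Upgrade database driver", "labels": ["priority: high", "blocked"]},
]

print([issue["number"] for issue in triage(issues)])  # [15, 9, 12]
```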

### 2. **Milestones**
- Group related issues and PRs together
- Track progress toward specific releases or goals
- Set due dates to keep the team on schedule

### 3. **Projects (GitHub Projects)**
- Create Kanban-style boards to visualize workflow
- Organize issues and PRs across multiple repositories
- Create custom views and filters for different perspectives
- Track progress with automated status updates

### 4. **Issue Templates**
- Standardize how contributors report bugs or request features
- Ensure all necessary information is collected upfront
- Speed up triage and review processes

### 5. **Discussions**
- Facilitate community conversations separate from code issues
- Gather feedback on proposals before creating issues
- Build community knowledge base

## Creating Tasks and Managing Sub-Issues

Complex features or bug fixes often require breaking down work into smaller, manageable pieces. GitHub supports this through:

### Task Lists in Issues
You can create task lists directly in issue descriptions using markdown:

```markdown
- [ ] Research authentication libraries
- [ ] Design API endpoints
- [ ] Implement user login
- [ ] Add tests for authentication flow
- [ ] Update documentation
```
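
Because task lists are plain markdown, progress can also be computed outside of GitHub. The sketch below counts checked (`[x]`) and unchecked (`[ ]`) items in an issue body. The function name and the sample text are illustrative assumptions.

```python
import re

# Matches markdown task list items such as "- [ ] todo" or "- [x] done".
TASK_PATTERN = re.compile(r"^\s*[-*+] \[([ xX])\] (.+)$")


def task_progress(markdown: str):
    """Return (completed, total) for the task list items in the text."""
    done = 0
    total = 0
    for line in markdown.splitlines():
        match = TASK_PATTERN.match(line)
        if match:
            total += 1
            if match.group(1).lower() == "x":
                done += 1
    return done, total


issue_body = """\
- [x] Research authentication libraries
- [x] Design API endpoints
- [ ] Implement user login
- [ ] Add tests for authentication flow
- [ ] Update documentation
"""

print(task_progress(issue_body))  # (2, 5)
```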

### Creating Sub-Issues
For larger efforts, create separate issues for each sub-task and link them:
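- Reference related issues by number (e.g., "Relates to #123") in issue descriptions; in a pull request description, a closing keyword such as "Closes #123" links the PR to the issue and closes it when the PR is merged into the default branch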
- Create a parent tracking issue that lists all related sub-issues
- Use labels to connect related work

### Using Chunking to Break Down R&D Efforts

When tackling complex research and development work, apply the principle of "chunking" to break tasks into smaller pieces:

**Why Chunking Matters:**
- Reduces cognitive load and makes work less overwhelming
- Creates natural checkpoints for progress tracking
- Enables parallel work by multiple contributors
- Makes it easier to estimate effort and timelines
- Provides opportunities for early feedback and course correction

**How to Chunk Effectively:**

1. **Start with the End Goal** - Clearly define what success looks like
2. **Identify Major Components** - Break the work into 3-7 major pieces
3. **Decompose Each Component** - Further break down each piece into tasks that take 1-4 hours
4. **Define Dependencies** - Identify which tasks must be done in sequence vs. parallel
5. **Create Clear Acceptance Criteria** - Each chunk should have measurable completion criteria

**Example: Adding OAuth Authentication**

Instead of one large issue "Add OAuth authentication," break it down:

1. **Research Phase**
   - [ ] Research OAuth 2.0 providers and select best option
   - [ ] Document security requirements and compliance needs
   
2. **Design Phase**
   - [ ] Design authentication flow and user experience
   - [ ] Create API endpoint specifications
   - [ ] Design database schema for OAuth tokens
   
3. **Implementation Phase**
   - [ ] Set up OAuth provider credentials
   - [ ] Implement OAuth callback handler
   - [ ] Add token storage and refresh logic
   - [ ] Create user profile sync functionality
   
4. **Testing & Documentation**
   - [ ] Write unit tests for OAuth flows
   - [ ] Add integration tests
   - [ ] Update user documentation
   - [ ] Create administrator setup guide

Each of these sub-tasks can be its own issue, making progress trackable and allowing multiple contributors to work in parallel.
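
The "Define Dependencies" step can be made concrete with a dependency graph. The sketch below uses Python's standard `graphlib` module (Python 3.9+) to group chunks into batches that can be worked on in parallel. The dependency mapping is an assumption for illustration: it loosely follows the OAuth example above, but treats testing and documentation as separate chunks and assumes documentation only needs the design to be finished.

```python
from graphlib import TopologicalSorter

# Each chunk maps to the set of chunks that must be finished first.
dependencies = {
    "research": set(),
    "design": {"research"},
    "implementation": {"design"},
    "testing": {"implementation"},
    "documentation": {"design"},
}


def parallel_batches(deps):
    """Group chunks into batches; chunks in the same batch can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if dependencies are circular
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


for number, batch in enumerate(parallel_batches(dependencies), start=1):
    print(number, batch)
# 1 ['research']
# 2 ['design']
# 3 ['documentation', 'implementation']
# 4 ['testing']
```

Under these assumptions, documentation and implementation can proceed at the same time once the design is done, which is the kind of parallel work chunking is meant to expose.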

**Resource:** [How to Break Down Tasks](https://activecollab.com/blog/project-management/break-down-tasks) - ActiveCollab provides excellent guidance on effective task decomposition.

## Working with Student Contributors

Student contributors bring fresh perspectives and energy to open source projects, but they also have unique needs and constraints. Here are best practices for supporting student contributors effectively:

### Understanding Student Constraints

Students often have:
- **Limited time** - Balancing coursework, exams, and other commitments
- **Academic schedules** - Availability varies by semester, with breaks during finals and holidays
- **Learning curves** - May need more guidance on tools, workflows, and domain knowledge
- **Variable experience** - Skills range from beginners to advanced developers

### Best Practices for Maintainers

**1. Create Clear "Good First Issues"**
- Tag issues with `good first issue` or `student-friendly`
- Provide detailed descriptions with clear acceptance criteria
- Include links to relevant code sections and documentation
- Estimate difficulty and time required (e.g., "2-4 hours")

**2. Provide Comprehensive Documentation**
- Write detailed CONTRIBUTING.md guides
- Create video tutorials for common workflows
- Document your development environment setup process
- Maintain an FAQ for common questions

**3. Offer Structured Mentorship**
- Assign a mentor or primary point of contact for each student
- Schedule regular check-ins (weekly or bi-weekly)
- Be responsive to questions (aim for <24 hour response time on workdays)
- Provide constructive, educational feedback on PRs

**4. Set Realistic Expectations**
- Communicate expected timelines and deadlines clearly
- Be flexible around academic calendars (midterms, finals)
- Break large tasks into smaller milestones with intermediate check-ins
- Celebrate small wins and progress

**5. Create a Welcoming Environment**
- Foster a psychologically safe space to ask questions
- Recognize that mistakes are learning opportunities
- Publicly celebrate contributions (e.g., in release notes, blog posts)
- Connect students with the broader project community

**6. Facilitate Learning Opportunities**
- Pair students with more experienced contributors for code review
- Encourage students to attend project meetings or community calls
- Provide opportunities to present their work
- Share resources for relevant skills (Git, testing, CI/CD)

**7. Plan for Transitions**
- Recognize that student availability may change semester-to-semester
- Document work thoroughly so others can continue it
- Create handoff processes for ongoing work
- Maintain relationships with students who move on

### Example: Structuring a Student-Friendly Issue

```markdown
## Title: Add dark mode toggle to settings page

**Labels:** `good first issue`, `frontend`, `student-friendly`, `help wanted`

**Description:**
We want to add a dark mode toggle button to the user settings page.

**What needs to be done:**
1. Add a toggle switch component to `src/components/Settings.js`
2. Connect it to the theme state in our Redux store
3. Add a test for the new toggle in `src/components/__tests__/Settings.test.js`

**Relevant files:**
- Settings component: `src/components/Settings.js` 
- Theme reducer: `src/store/slices/themeSlice.js`
- Existing tests: `src/components/__tests__/Settings.test.js`

**Resources:**
- Our component library docs: [link]
- Redux documentation: [link]
- Example of similar toggle: [link to code]

**Estimated time:** 2-3 hours

**Need help?** Feel free to ask questions in this issue or reach out on our Discord channel!
```

### Benefits of Student Contributors

When supported well, student contributors:
- Bring fresh perspectives and innovative ideas
- Help identify documentation gaps and usability issues
- Become long-term community members and maintainers
- Spread awareness of your project in academic circles
- Contribute diverse skills from their coursework and research

By investing in student contributors through thoughtful project management and mentorship, you build a stronger, more sustainable open source community.


# Learn More

To learn more about the code review process and review best practices, please see the following resources:

## Resources for Developers / Code Authors

- [How to Make Your Code Reviewer Fall in Love with You](https://mtlynch.io/code-review-love/) - Michael Lynch; 2020
- [The Change Authors Guide](https://google.github.io/eng-practices/review/developer/) - Google Engineering Best Practices

## Resources for Reviewers

- [How to Do Code Reviews Like a Human](https://mtlynch.io/human-code-reviews-1/) [Part 2](https://mtlynch.io/human-code-reviews-2/) - Michael Lynch; 2017
- [What to Look For In a Code Review](https://google.github.io/eng-practices/review/reviewer/looking-for.html) - Google Engineering Best Practices

## General Code Review Resources
- [What is a Code Review?](https://about.gitlab.com/topics/version-control/what-is-code-review/) - GitLab
- [5 Code Review Best Practices](https://www.atlassian.com/blog/add-ons/code-review-best-practices) - Atlassian
    - Provides a link to the Cisco study recommending reviewing fewer than 200 lines of code.
- [Code Review Best Practices](https://roadmap.sh/best-practices/code-review)
    - A detailed flow chart showing all the possible steps of a code review process. Note that not all of these steps may apply to you!
- [30 Proven Code Review Best Practices from Microsoft](https://www.michaelagreiler.com/code-review-best-practices/) - Michaela Greiler
    - This site has some additional ideas for authors and reviewers and notes that reviewers should avoid unintentional bias in their reviews.

## Videos

- [How to Do Code Reviews Like a Human](https://www.youtube.com/watch?v=0t4_MfHgb_A) - Michael Lynch, PyGotham 2018
- [Code Review Best Practices](https://www.youtube.com/watch?v=a9_0UUUNt-Y) - Trisha Gee, Upsource Webinars; 2018
- [Investing in Code Reviews for Better Research Software](https://ideas-productivity.org/events/hpcbp-068-codereview)
    - HPC Best Practices Webinar - Thibault Lestang, Dominik Krzemiński, Valerio Maggio; October 2022