Add project selection processing script #3466

Merged
35 changes: 35 additions & 0 deletions utilities/project_planning/README.md
@@ -82,3 +82,38 @@ The input file is an Excel spreadsheet which looks like the following:
![Excel spreadsheet](_docs/example_weeks_spreadsheet.png)

The output CSVs will have two columns: the project name and the computed weeks.

## Project Selection

As the last step in project planning, maintainers use the data compiled in the
steps above to vote on which projects should be included in the next year.
Together, the estimated weeks of work for each project and the total weeks of
work available determine how many projects can be selected during voting.
The instructions provided to the maintainers are as follows:

> **Instructions**:
>
> Select which projects should be included in the year, based on the number of
> development weeks available and the number of weeks each project is expected
> to take.
>
> Number of weeks has been calculated two ways:
>
> 1. Straight average across all votes
> 2. Weighted average based on the confidence provided for each vote
>
> You may go slightly over the total available weeks (by either count), but try
> to stay within 5 weeks of the total! The weeks left will indicate how many
> remain while you are voting.
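
The two averages and the remaining budget mentioned in the instructions can be sketched roughly as follows. This is a minimal illustration, not the calculation used in the spreadsheet: it assumes confidence is recorded as a numeric weight, and the function names and sample numbers are hypothetical.

```python
from typing import Sequence


def straight_average(weeks: Sequence[float]) -> float:
    """Plain mean of every estimate, ignoring confidence."""
    return sum(weeks) / len(weeks)


def weighted_average(weeks: Sequence[float], confidence: Sequence[float]) -> float:
    """Mean of the estimates, weighted by each voter's confidence."""
    total_weight = sum(confidence)
    if total_weight == 0:
        raise ValueError("at least one vote needs a non-zero confidence")
    return sum(w * c for w, c in zip(weeks, confidence)) / total_weight


def weeks_left(available: float, selected: Sequence[float]) -> float:
    """Remaining budget; negative when the selection runs over."""
    return available - sum(selected)


# Hypothetical votes for one project: estimates in weeks, and an assumed
# numeric confidence scale where a larger number means more confident.
estimates = [4, 6, 8]
confidence = [1, 1, 2]

print(straight_average(estimates))              # 6.0
print(weighted_average(estimates, confidence))  # 6.5
print(weeks_left(40, [6.5, 12, 20, 4]))         # -2.5
```

In this example the most confident voter gave the highest estimate, so the weighted figure is pulled above the straight average. The selection also runs 2.5 weeks over a 40 week budget, which is still inside the 5 week tolerance.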

This final script takes all the selection votes and groups them into 4
categories:

- Projects everyone voted for
- Projects most voted for
- Projects only some voted for
- Projects one or none voted for
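
As a rough illustration of the grouping, the sketch below bins per-project "Yes" counts with plain dictionaries. It assumes seven voters and the cut-offs the script prints (4-6, 2-3, 0-1); `categorize`, `TOTAL_VOTERS`, and the project names are hypothetical and are not part of the script itself.

```python
from typing import Dict, List, Tuple

# Assumption: seven voting members, matching the cut-offs used by the script.
TOTAL_VOTERS = 7


def categorize(yes_votes: Dict[str, int]) -> Dict[str, List[Tuple[str, int]]]:
    """Group projects by how many members voted "Yes" for them."""
    groups: Dict[str, List[Tuple[str, int]]] = {
        "everyone": [],
        "most": [],
        "some": [],
        "one_or_none": [],
    }
    # Visit projects from the highest count to the lowest so that each
    # group ends up sorted in descending order.
    for name, count in sorted(yes_votes.items(), key=lambda item: -item[1]):
        if count == TOTAL_VOTERS:
            groups["everyone"].append((name, count))
        elif count >= 4:
            groups["most"].append((name, count))
        elif count >= 2:
            groups["some"].append((name, count))
        else:
            groups["one_or_none"].append((name, count))
    return groups


# Hypothetical vote counts
result = categorize(
    {
        "Docs overhaul": 7,
        "Plugin API": 5,
        "New CLI": 3,
        "Dark mode": 1,
        "Windows installer": 4,
    }
)
for group, projects in result.items():
    print(group, projects)
```

If the number of voters changes, the cut-offs would need to be adjusted along with `TOTAL_VOTERS`.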

The input file is an Excel spreadsheet which looks like the following:

![Excel spreadsheet](_docs/example_selection_spreadsheet.png)
72 changes: 72 additions & 0 deletions utilities/project_planning/process_selection_votes.py
@@ -0,0 +1,72 @@
"""
Script for gathering project selection votes and sharing the results

See the README for more information.
"""
from datetime import datetime
from pathlib import Path

import click
import pandas as pd
import sheet_utils


INPUT_FILE = Path(__file__).parent / "data" / "selection_votes.xlsx"

COLUMN_VOTED = "Included"
SKIP_SHEETS = {
    # Ignore the template sheet
    "Template",
    # Ignore the reference
    "_ref",
    # If present, ignore the team's final decisions
    "Team",
}


def _print_series(s: pd.Series) -> None:
    """Shorthand for printing a series"""
    for name, count in s.items():
        print(f"{name} ({count})")


@click.command()
@click.option(
    "--input-file",
    help="Input Excel document to use",
    type=click.Path(path_type=Path),
    default=INPUT_FILE,
)
def main(input_file: Path):
    """Group the project selection votes by how many members voted "Yes"."""
    frames, projects = sheet_utils.read_file(input_file)
    members = list(set(frames.keys()) - SKIP_SHEETS)

    # Get the "voted for" column
    included = sheet_utils.get_columns_by_members(
        frames, members, projects, COLUMN_VOTED
    )
    # This is planning for the *next* year, i.e. one beyond the current one
    planning_year = datetime.now().year + 1

    # Get the sum of all the "Yes" votes for each project
    votes = included.eq("Yes").sum(axis=1)

    # Bin by certain ranges (the cut-offs assume seven voting members)
    all_voted = votes[votes == 7]
    most_voted = votes[(votes >= 4) & (votes < 7)].sort_values(ascending=False)
    some_voted = votes[(votes >= 2) & (votes < 4)].sort_values(ascending=False)
    few_no_voted = votes[votes <= 1].sort_values(ascending=False)

    # Print results
    print(f"\n## Project selection votes for {planning_year}")
    print("\n### Projects everyone voted for")
    _print_series(all_voted)
    print("\n### Projects most voted for (4-6)")
    _print_series(most_voted)
    print("\n### Projects some voted for (2-3)")
    _print_series(some_voted)
    print("\n### Projects 1 or nobody voted for (0-1)")
    _print_series(few_no_voted)


if __name__ == "__main__":
    main()