# Map each priority level to a common class
This notebook maps the different priority labels used across projects onto a common set of classes.
Only projects that use 5 different priority levels are selected.

In [None]:
import pandas as pd
# Import dataset from csv file
df = pd.read_csv('all_projects.csv')

In [37]:
# rename columns using rename function
df.rename(columns={'fields.priority.name': 'priority', 'fields.description': 'description', 'fields.project.name': 'project', 'fields.issuetype.name': 'issuetype', 'fields.labels': 'labels'}, inplace=True)
df

Unnamed: 0,priority,description,project,labels,issuetype,collection
0,Blocker,We tried upgrading from Spring Boot 2.0.6 to S...,Spring XD,[],Bug,Spring
1,Major,The jobs that appear under Executions section ...,Spring XD,[],Bug,Spring
2,Trivial,Working with Spring-XD version 1.3.2.RELEASE\n...,Spring XD,[],Bug,Spring
3,Major,My project 7 node cluster and in that 2 node a...,Spring XD,"['Spring', 'xd']",Bug,Spring
4,Minor,See https://github.com/spring-projects/spring-...,Spring XD,[],Story,Spring
...,...,...,...,...,...,...
2655048,Major,it is very beautiful.,Community Support - Open Source Project Reposi...,[],New Project,Sonatype
2655049,Major,library,Community Support - Open Source Project Reposi...,[],New Project,Sonatype
2655050,Major,What is reactive-gremlin\r\n\r\nreactive-greml...,Community Support - Open Source Project Reposi...,[],New Project,Sonatype
2655051,Major,"Android view for a swipeable, weekly calendar.",Community Support - Open Source Project Reposi...,[],New Project,Sonatype


In [38]:
# drop priority rows with NaN
df = df.dropna(subset=['priority'])
# reset index
df = df.reset_index(drop=True)
df

Unnamed: 0,priority,description,project,labels,issuetype,collection
0,Blocker,We tried upgrading from Spring Boot 2.0.6 to S...,Spring XD,[],Bug,Spring
1,Major,The jobs that appear under Executions section ...,Spring XD,[],Bug,Spring
2,Trivial,Working with Spring-XD version 1.3.2.RELEASE\n...,Spring XD,[],Bug,Spring
3,Major,My project 7 node cluster and in that 2 node a...,Spring XD,"['Spring', 'xd']",Bug,Spring
4,Minor,See https://github.com/spring-projects/spring-...,Spring XD,[],Story,Spring
...,...,...,...,...,...,...
2071438,Major,it is very beautiful.,Community Support - Open Source Project Reposi...,[],New Project,Sonatype
2071439,Major,library,Community Support - Open Source Project Reposi...,[],New Project,Sonatype
2071440,Major,What is reactive-gremlin\r\n\r\nreactive-greml...,Community Support - Open Source Project Reposi...,[],New Project,Sonatype
2071441,Major,"Android view for a swipeable, weekly calendar.",Community Support - Open Source Project Reposi...,[],New Project,Sonatype


In [39]:
# count priority
df['priority'].value_counts().to_frame()[:50]


Unnamed: 0_level_0,count
priority,Unnamed: 1_level_1
Major,1080149
Minor,287654
Major - P3,109573
Low,86032
Critical,77997
Medium,72888
Blocker,57974
P2: Important,46926
Not Evaluated,42453
Trivial,34313


In [40]:
# Remove issues with no priority level set.
unset_labels = ['Unset', 'TBD', 'Undefined', 'Unprioritized', 'Not Evaluated', 'Unknown']
df = df[~df['priority'].isin(unset_labels)]

# reset index
df = df.reset_index(drop=True)
df

Unnamed: 0,priority,description,project,labels,issuetype,collection
0,Blocker,We tried upgrading from Spring Boot 2.0.6 to S...,Spring XD,[],Bug,Spring
1,Major,The jobs that appear under Executions section ...,Spring XD,[],Bug,Spring
2,Trivial,Working with Spring-XD version 1.3.2.RELEASE\n...,Spring XD,[],Bug,Spring
3,Major,My project 7 node cluster and in that 2 node a...,Spring XD,"['Spring', 'xd']",Bug,Spring
4,Minor,See https://github.com/spring-projects/spring-...,Spring XD,[],Story,Spring
...,...,...,...,...,...,...
2018905,Major,it is very beautiful.,Community Support - Open Source Project Reposi...,[],New Project,Sonatype
2018906,Major,library,Community Support - Open Source Project Reposi...,[],New Project,Sonatype
2018907,Major,What is reactive-gremlin\r\n\r\nreactive-greml...,Community Support - Open Source Project Reposi...,[],New Project,Sonatype
2018908,Major,"Android view for a swipeable, weekly calendar.",Community Support - Open Source Project Reposi...,[],New Project,Sonatype


In [41]:

df['priority'].value_counts().to_frame()[:50]

Unnamed: 0_level_0,count
priority,Unnamed: 1_level_1
Major,1080149
Minor,287654
Major - P3,109573
Low,86032
Critical,77997
Medium,72888
Blocker,57974
P2: Important,46926
Trivial,34313
P3: Somewhat important,28075


## Create a dataset from projects with 5 different priority levels
The class names follow the priority definitions in the Jira docs:
https://support.atlassian.com/jira-service-management-cloud/docs/what-are-priority-levels-in-jira-service-management/

Example:
* 4: Trivial = Lowest
* 3: Minor = Low
* 2: Major  = Medium
* 1: Critical = High
* 0: Blocker = Highest
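
The numbering in the example above can be sketched as an integer encoding of the five classes. This is only an illustration: the names `CLASS_ORDER`, `CLASS_TO_RANK`, and `add_rank` are hypothetical and do not appear elsewhere in this notebook, and the `class` column is assumed to hold the five class names.

```python
import pandas as pd

# Hypothetical helper data: class names ordered from most to least urgent,
# matching the 0-4 numbering in the example list above.
CLASS_ORDER = ["Highest", "High", "Medium", "Low", "Lowest"]
CLASS_TO_RANK = {name: rank for rank, name in enumerate(CLASS_ORDER)}


def add_rank(frame: pd.DataFrame, class_col: str = "class") -> pd.DataFrame:
    """Return a copy with an integer 'rank' column (0 = Highest, 4 = Lowest)."""
    out = frame.copy()
    # Missing or unknown classes become <NA> instead of raising an error.
    out["rank"] = out[class_col].map(CLASS_TO_RANK).astype("Int64")
    return out


# Small made-up frame, used only to show the call.
demo = pd.DataFrame({"class": ["Highest", "Medium", "Lowest", None]})
ranked = add_rank(demo)
print(ranked)
```

A nullable `Int64` column is used so that rows without a class keep a missing value rather than forcing the whole column to float.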


In [42]:
# Remove projects that do not use 5 priority levels
# JFrog, Mindville, Jira
df = df[df['project'] != 'JFrog']
df = df[df['project'] != 'Mindville']
df = df[df['project'] != 'Jira']
# Project with few issues with priority or bad use of priority levels
df = df[df['project'] != 'Mojang']

# Remove some other labels that are hard to fit into 5 priority levels
# In the Apache collection, remove rows with priority: Low, Normal, High, Urgent
priorities_to_remove = ['Low', 'Normal', 'High', 'Urgent']
collection_to_match = 'Apache'

# Filter the DataFrame to keep rows that don't match both conditions
df = df[~((df['priority'].isin(priorities_to_remove)) & (df['collection'] == collection_to_match))]
df





Unnamed: 0,priority,description,project,labels,issuetype,collection
0,Blocker,We tried upgrading from Spring Boot 2.0.6 to S...,Spring XD,[],Bug,Spring
1,Major,The jobs that appear under Executions section ...,Spring XD,[],Bug,Spring
2,Trivial,Working with Spring-XD version 1.3.2.RELEASE\n...,Spring XD,[],Bug,Spring
3,Major,My project 7 node cluster and in that 2 node a...,Spring XD,"['Spring', 'xd']",Bug,Spring
4,Minor,See https://github.com/spring-projects/spring-...,Spring XD,[],Story,Spring
...,...,...,...,...,...,...
2018905,Major,it is very beautiful.,Community Support - Open Source Project Reposi...,[],New Project,Sonatype
2018906,Major,library,Community Support - Open Source Project Reposi...,[],New Project,Sonatype
2018907,Major,What is reactive-gremlin\r\n\r\nreactive-greml...,Community Support - Open Source Project Reposi...,[],New Project,Sonatype
2018908,Major,"Android view for a swipeable, weekly calendar.",Community Support - Open Source Project Reposi...,[],New Project,Sonatype


In [43]:
# show priority in the apache collection
apache = df[df['collection'] == 'Apache']
apache['priority'].value_counts().to_frame()[:50]

Unnamed: 0_level_0,count
priority,Unnamed: 1_level_1
Major,667276
Minor,197747
Critical,43553
Blocker,31854
Trivial,28761
P2,6259
P3,6077
Not a Priority,1671
P1,701
P4,296


In [44]:
# count priority
df['priority'].value_counts().to_frame()[:50]

Unnamed: 0_level_0,count
priority,Unnamed: 1_level_1
Major,1080149
Minor,287654
Major - P3,109573
Low,79317
Critical,77997
Medium,72888
Blocker,57974
P2: Important,46926
Trivial,34313
P3: Somewhat important,28075


In [45]:
# In the RedHat and Mojang collections, treat priority 'Low' as 'Lowest'
df.loc[df['collection'].isin(['RedHat', 'Mojang']) & (df['priority'] == 'Low'), 'priority'] = 'Lowest'
# Priority counts in the Mojang collection
mojang = df[df['collection'] == 'Mojang']
mojang['priority'].value_counts().to_frame()[:50]

Unnamed: 0_level_0,count
priority,Unnamed: 1_level_1
Normal,3356
Lowest,53
Important,22
Critical,5
Blocker,2


In [46]:
# Map each raw priority label to one of the five common classes
priority_mapping = {
    # Highest priority
    'Blocker': 'Highest',
    'P0': 'Highest',
    'Urgent': 'Highest',
    'Showstopper': 'Highest',
    'P0: Blocker': 'Highest', # Note: some projects use P0 as the highest priority level,
    'P1-Urgent': 'Highest', # others use P1 as the highest priority level
    'Highest': 'Highest',
    'Blocker - P1': 'Highest',
    # High
    'High': 'High',
    'Critical': 'High',
    'P1': 'High',
    'Severe': 'High',
    'P2-High': 'High',
    'P1: Critical': 'High',
    'Critical - P2': 'High',
    # Medium
    'Medium': 'Medium',
    'Major': 'Medium',
    'P2': 'Medium',
    'P3-Medium': 'Medium', 
    'Important': 'Medium',
    'P2: Important': 'Medium',
    'Major - P3': 'Medium',
    # Low
    'Low': 'Low',
    'Minor': 'Low',
    'P3': 'Low',
    'Normal': 'Low',
    'P4-Low': 'Low',
    'P3: Somewhat important': 'Low',
    'Minor - P4': 'Low',
    # Lowest
    'Lowest': 'Lowest',
    'P4': 'Lowest',
    'Trivial': 'Lowest',
    'P5-Trivial': 'Lowest',
    'P4: Low': 'Lowest',
    'Trivial - P5': 'Lowest',


}

# Apply the mapping
df['class'] = df['priority'].map(priority_mapping)


In [47]:

# Find the priorities that are not mapped to any class by checking for nulls in 'class'
unmapped_priorities = df[df['class'].isnull()]['priority'].unique()

print("Priorities not mapped to a new class:")
for priority in unmapped_priorities:
    print(f"- {priority}")



Priorities not mapped to a new class:
- Optional
- Should Have
- Must Have
- Not a Priority
- P5: Not important
- 2 - Critical
- 4 - Normal
- 3 - High
- 1 - Blocker
- 5 - Minor
- 6 - Trivial
- Level 3
- Level 4
- Level 1
- Level 2
- Complex Fast-Track


In [48]:
# Value counts of class
df['class'].value_counts().to_frame()[:50]

Unnamed: 0_level_0,count
class,Unnamed: 1_level_1
Medium,1318342
Low,420707
High,124104
Highest,69884
Lowest,46258


In [49]:
df

Unnamed: 0,priority,description,project,labels,issuetype,collection,class
0,Blocker,We tried upgrading from Spring Boot 2.0.6 to S...,Spring XD,[],Bug,Spring,Highest
1,Major,The jobs that appear under Executions section ...,Spring XD,[],Bug,Spring,Medium
2,Trivial,Working with Spring-XD version 1.3.2.RELEASE\n...,Spring XD,[],Bug,Spring,Lowest
3,Major,My project 7 node cluster and in that 2 node a...,Spring XD,"['Spring', 'xd']",Bug,Spring,Medium
4,Minor,See https://github.com/spring-projects/spring-...,Spring XD,[],Story,Spring,Low
...,...,...,...,...,...,...,...
2018905,Major,it is very beautiful.,Community Support - Open Source Project Reposi...,[],New Project,Sonatype,Medium
2018906,Major,library,Community Support - Open Source Project Reposi...,[],New Project,Sonatype,Medium
2018907,Major,What is reactive-gremlin\r\n\r\nreactive-greml...,Community Support - Open Source Project Reposi...,[],New Project,Sonatype,Medium
2018908,Major,"Android view for a swipeable, weekly calendar.",Community Support - Open Source Project Reposi...,[],New Project,Sonatype,Medium


In [50]:
# Show count issuetype
df['issuetype'].value_counts().to_frame()[:50]

Unnamed: 0_level_0,count
issuetype,Unnamed: 1_level_1
Bug,1027418
Improvement,300288
Task,202922
Sub-task,125833
New Feature,83062
New Project,65487
Feature Request,42491
Story,28674
Enhancement,28333
Test,10147


In [54]:
# Make new df with only issuetype: Technical Debt
technical_debt = df[df['issuetype'] == 'Technical Debt']
# To csv
technical_debt.to_csv('priority_with_TD.csv', index=False)
technical_debt

Unnamed: 0,priority,description,project,labels,issuetype,collection,class


In [53]:
technical_debt['class'].value_counts().to_frame()[:50]

Unnamed: 0_level_0,count
class,Unnamed: 1_level_1
Medium,205
Low,128
Lowest,15
Highest,12
High,10


In [55]:
# Remove issues with Technical Debt issue type
df = df[df['issuetype'] != 'Technical Debt']
df = df.reset_index(drop=True)
df

Unnamed: 0,priority,description,project,labels,issuetype,collection,class
0,Blocker,We tried upgrading from Spring Boot 2.0.6 to S...,Spring XD,[],Bug,Spring,Highest
1,Major,The jobs that appear under Executions section ...,Spring XD,[],Bug,Spring,Medium
2,Trivial,Working with Spring-XD version 1.3.2.RELEASE\n...,Spring XD,[],Bug,Spring,Lowest
3,Major,My project 7 node cluster and in that 2 node a...,Spring XD,"['Spring', 'xd']",Bug,Spring,Medium
4,Minor,See https://github.com/spring-projects/spring-...,Spring XD,[],Story,Spring,Low
...,...,...,...,...,...,...,...
2000661,Major,it is very beautiful.,Community Support - Open Source Project Reposi...,[],New Project,Sonatype,Medium
2000662,Major,library,Community Support - Open Source Project Reposi...,[],New Project,Sonatype,Medium
2000663,Major,What is reactive-gremlin\r\n\r\nreactive-greml...,Community Support - Open Source Project Reposi...,[],New Project,Sonatype,Medium
2000664,Major,"Android view for a swipeable, weekly calendar.",Community Support - Open Source Project Reposi...,[],New Project,Sonatype,Medium


In [59]:
df["class"].value_counts().to_frame()[:50]

Unnamed: 0_level_0,count
class,Unnamed: 1_level_1
Medium,1318137
Low,420579
High,124094
Highest,69872
Lowest,46243


In [60]:
# Save to csv
df.to_csv('all_priority_group_in_classes.csv', index=False)

In [61]:
# Read csv to check if file is saved correctly
df = pd.read_csv('all_priority_group_in_classes.csv')

In [65]:
import os
priority_levels = ['Highest', 'High', 'Medium', 'Low', 'Lowest']

for level in priority_levels:
    try:
        # Make dir with level
        os.makedirs(f'{level}', exist_ok=True)
        # df with level class
        df_level = df[df['class'] == level]
        # Save to csv
        df_level.to_csv(f'{level}/{level}.csv', index=False)
        print(f"Saved {level}.csv")
    except Exception as e:
        print(f"An error occurred for level {level}: {str(e)}")


Saved Highest.csv
Saved High.csv
Saved Medium.csv
Saved Low.csv
Saved Lowest.csv


In [67]:
# Read each csv to check that the file was saved correctly
for level in priority_levels:
    try:
        df_level = pd.read_csv(f'{level}/{level}.csv')
        print(f"Read {level}.csv")
        print(df_level.head())
    except Exception as e:
        print(f"An error occurred while reading {level}.csv: {str(e)}")


Read Highest.csv
  priority                                        description    project  \
0  Blocker  We tried upgrading from Spring Boot 2.0.6 to S...  Spring XD   
1  Blocker  I download the spring XD example projects, and...  Spring XD   
2  Blocker  See https://github.com/spring-projects/spring-...  Spring XD   
3  Blocker  Several issues with 1.3.0.M1 staged version\n\...  Spring XD   
4  Blocker  Building and then running spring-cloud-data-re...  Spring XD   

            labels issuetype collection    class  
0               []       Bug     Spring  Highest  
1               []     Story     Spring  Highest  
2  ['PullRequest']       Bug     Spring  Highest  
3               []       Bug     Spring  Highest  
4               []       Bug     Spring  Highest  
Read High.csv
   priority                                        description  \
0  Critical  According to the documentation we can load jar...   
1  Critical  as stated in https://jira.spring.io/browse/INT...   
2  Criti

## Definition of each priority level according to the Atlassian documentation
* Lowest - Trivial problem with little or no impact on progress. Color: Light grey.
* Low - Minor problem or easily worked around. Color: Dark grey.
* Medium - Has the potential to affect progress. Color: Yellow.
* High - Serious problem that could block progress. Color: Orange.
* Highest - The problem will block progress. Color: Dark red.

#### Compared to GitHub

* High == High and Highest
* Medium == Medium
* Low == Low and Lowest
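
The three-level comparison above could be expressed as a lookup table. This is a sketch only: `FIVE_TO_THREE` and `to_three_levels` are hypothetical names, and this notebook does not itself produce a three-level column.

```python
# Hypothetical lookup that collapses the five classes into three levels,
# following the comparison list above.
FIVE_TO_THREE = {
    "Highest": "High",
    "High": "High",
    "Medium": "Medium",
    "Low": "Low",
    "Lowest": "Low",
}


def to_three_levels(label):
    """Collapse a five-level class label; returns None for unknown labels."""
    return FIVE_TO_THREE.get(label)


collapsed = [to_three_levels(c) for c in ["Highest", "Medium", "Lowest", "Optional"]]
print(collapsed)
```

On a DataFrame with a `class` column, the same table could be applied with `df['class'].map(FIVE_TO_THREE)`.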

# Ranking of priority levels
The collections in this dataset use between 4 and 6 priority levels; 5 levels is the most common.
The lists below show the most commonly used schemes, ordered from low to high.
## 5 priority levels
### Sonatype, MongoDB, Apache, RedHat, Spring, Sakai (no Trivial), JiraEcosystem
* Trivial
* Minor
* Major
* Critical
* Blocker
### Apache
* P4
* P3
* P2
* P1
* P0

### Hyperledger, Mindville
* Lowest
* Low 
* Medium
* High
* Highest

### RedHat
* Low
* Normal
* Medium
* High
* Urgent

### IntelDAOS
* P5-Trivial
* P4-Low
* P3-Medium
* P2-High
* P1-Urgent

### SecondLife
* Trivial
* Minor
* Major
* Severe
* Showstopper

### Mojang
* Low
* Normal
* Important
* Critical
* Blocker

## 6 levels
### QT
* P5: Not important (removing this)
* P4: Low
* P3: Somewhat important
* P2: Important
* P1: Critical
* P0: Blocker

### JFrog
* Trivial
* Minor
* Normal
* High
* Critical
* Blocker


## 4 levels
### Apache
* Low
* Normal
* High
* Urgent

### Jira (the org)
* Low
* Medium
* High
* Highest
### Mindville (few issues) can ignore.
* Level 4
* Level 3
* Level 2
* Level 1
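
One way to check how many distinct priority labels each collection uses is a grouped count. The sketch below assumes the `collection` and `priority` columns used in this notebook; `levels_per_collection` and the demo data are hypothetical.

```python
import pandas as pd


def levels_per_collection(frame):
    """Number of distinct priority labels used in each collection."""
    counts = frame.groupby("collection")["priority"].nunique()
    return counts.sort_values(ascending=False)


# Made-up data, used only to show the call.
demo = pd.DataFrame({
    "collection": ["Spring", "Spring", "Spring", "Jira", "Jira"],
    "priority": ["Blocker", "Major", "Minor", "Low", "High"],
})
counts = levels_per_collection(demo)
print(counts)
```

A collection that mixes several schemes, as Apache does in the lists above, would show more labels than any single scheme has, so the count is a starting point for inspection rather than a definitive level count.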