# Bug Report Collection from Bugzilla API

<u><b>Full Procedure</b></u>

1. Import Dependencies

- Essential libraries are imported, including requests for interacting with the Bugzilla REST API, pandas for data structuring, and tqdm for progress visualization

2. Define API Parameters

- Parameters for the API request are set to filter for: Client Software classification, Firefox product, RESOLVED bugs with a FIXED resolution
- Pagination is controlled using limit and offset
- A dictionary is initialized to store bug metadata (e.g., ID, type, summary, description), and a set is used to avoid duplicate bug entries

3. Fetch Bug Reports

- The script makes repeated requests to the Bugzilla API, collecting bugs in batches
- Bugs whose priority is unset ("--") are skipped
- Each remaining bug’s metadata is stored
- For each bug, the script retrieves the first comment from the comments endpoint; this comment serves as the bug’s description and is added to the dataset
- The script advances the offset by one batch after each request and stops when the maximum number of bugs (MAX_BUGS) is reached

4. Output

- Once the desired number of bugs is collected, the total count is printed and the full dataset is ready for export or further processing
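
The batching logic in step 3 can be sketched without network access. The following is a minimal illustration rather than the notebook's own code: `collect_paginated`, `fake_page`, and `FAKE_BUGS` are hypothetical names, and `fake_page` stands in for a request to the Bugzilla REST API.

```python
from typing import Callable, Dict, List


def collect_paginated(
    fetch_page: Callable[[int, int], List[Dict]],
    limit: int,
    max_items: int,
) -> List[Dict]:
    """Collect unique records batch by batch until max_items or an empty page."""
    collected: List[Dict] = []
    seen_ids = set()
    offset = 0
    while len(collected) < max_items:
        page = fetch_page(offset, limit)
        if not page:  # An empty batch means there is nothing left to fetch
            break
        for record in page:
            if record["priority"] == "--":  # Skip records with no priority set
                continue
            if record["id"] in seen_ids:  # Skip duplicates
                continue
            seen_ids.add(record["id"])
            collected.append(record)
            if len(collected) >= max_items:
                break
        offset += limit  # Advance once per batch, not once per record
    return collected


# Hypothetical stand-in data: 25 fake bugs, every fifth one has no priority
FAKE_BUGS = [{"id": i, "priority": "--" if i % 5 == 0 else "P1"} for i in range(25)]


def fake_page(offset: int, limit: int) -> List[Dict]:
    """Return one slice of the fake data, mimicking limit/offset pagination."""
    return FAKE_BUGS[offset:offset + limit]


bugs = collect_paginated(fake_page, limit=10, max_items=12)
print(len(bugs))
```

Passing the page fetcher as an argument keeps the loop testable; in the real script the fetcher would wrap the `requests.get` call.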

### Imports and Request Data

In [2]:
import requests
import pandas as pd
from tqdm import tqdm

tqdm.pandas()

In [None]:
bugzilla_url = "https://bugzilla.mozilla.org/rest/bug"
params = {
    "classification": "Client Software", 
    "product": "Firefox",
    "status": "RESOLVED",
    "resolution": "FIXED",
    "limit": 100,  # Fetch 100 bugs at a time
    "offset": 0,   # Start from first bug
}

MAX_BUGS = 10000

# Store bug reports
bug_reports = {"id": [], "type": [], "product": [], "component": [], "status": [],
               "summary": [], "priority": [], "description": []
}

# Set to track seen IDs and avoid duplicates
seen_bug_ids = set()

while True:
    try:
        response = requests.get(bugzilla_url, params=params, timeout=30)  # 30-second timeout
        response.raise_for_status()  # Treat HTTP error statuses as request failures
        bugs = response.json()
        
        for bug in bugs["bugs"]:

            if bug["priority"] == "--":
                continue  # Skip bugs whose priority is unset ("--")
            
            bug_id = bug["id"] 

            if bug_id in seen_bug_ids:
                continue  # Skip duplicates
                
            seen_bug_ids.add(bug_id)
            
            bug_reports["id"].append(bug["id"])
            bug_reports["type"].append(bug["type"])
            bug_reports["product"].append(bug["product"])
            bug_reports["component"].append(bug["component"])
            bug_reports["status"].append(bug["status"])
            bug_reports["summary"].append(bug["summary"])
            bug_reports["priority"].append(bug["priority"])

            # Fetch the first comment (bug description)
            comments_url = f"https://bugzilla.mozilla.org/rest/bug/{bug_id}/comment"
            
            try:
                comments_response = requests.get(comments_url, timeout=30)
                comments_data = comments_response.json()

                # Extract the first comment as the bug description
                first_comment = comments_data["bugs"][str(bug_id)]["comments"][0]["text"]
            except (requests.exceptions.RequestException, IndexError, KeyError):
                first_comment = "--"  # Placeholder when no description can be retrieved
                
            bug_reports["description"].append(first_comment)

            # Stop processing this batch once the cap is reached
            if len(bug_reports["id"]) >= MAX_BUGS:
                break

        # Stop when the cap is reached or the API returns no more bugs
        if len(bug_reports["id"]) >= MAX_BUGS or not bugs["bugs"]:
            break

        # Advance to the next batch once per request, not once per bug
        params["offset"] += params["limit"]
    
    except requests.exceptions.RequestException as e:
        print(f"Error: {e}")
        break

print(f"Total bugs fetched: {len(bug_reports['id'])}")
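
The collected dictionary of lists can be turned into a table for export. The sketch below is one possible way to do that and is not taken from the notebook: `reports_to_frame` is a hypothetical helper, and `sample_reports` holds placeholder values that only mirror the keys of `bug_reports`.

```python
import pandas as pd

# Placeholder data with the same keys as bug_reports (values are illustrative)
sample_reports = {
    "id": [1, 2],
    "type": ["defect", "task"],
    "product": ["Firefox", "Firefox"],
    "component": ["Example A", "Example B"],
    "status": ["RESOLVED", "RESOLVED"],
    "summary": ["Example summary one", "Example summary two"],
    "priority": ["P1", "P2"],
    "description": ["Example description one", "Example description two"],
}


def reports_to_frame(reports):
    """Build a DataFrame, failing early if the per-column lists differ in length."""
    lengths = {key: len(values) for key, values in reports.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Column lengths differ: {lengths}")
    return pd.DataFrame(reports)


frame = reports_to_frame(sample_reports)

# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = frame.to_csv(index=False)
print(frame.shape)
```

The length check guards against a column falling out of step with the others, which would otherwise surface as a less specific error inside `pd.DataFrame`.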

### Retrieve comments (description)

In [3]:
data = pd.read_csv("bug_reports_mozilla_firefox_resolved_fixed.csv")

In [5]:
data.head()

| | Bug ID | Type | Summary | Product | Component | Status | Resolution | Priority | Severity |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1955715 | enhancement | Update addonsInfo asrouter targeting to allow ... | Firefox | Messaging System | RESOLVED | FIXED | P1 | -- |
| 1 | 1951788 | enhancement | Retrieve custom wallpaper from profile and set... | Firefox | New Tab Page | RESOLVED | FIXED | P1 | -- |
| 2 | 1953155 | task | Enable expand on hover and remove coming soon ... | Firefox | Sidebar | RESOLVED | FIXED | P1 | -- |
| 3 | 1953560 | task | Add strings for Firefox Labs | Firefox | New Tab Page | RESOLVED | FIXED | P1 | -- |
| 4 | 1953857 | enhancement | Add support for picker style tiles in the Abou... | Firefox | Messaging System | RESOLVED | FIXED | P1 | -- |


In [6]:
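def fetch_first_comment(bug_id):
    """Return the text of a bug's first comment, or "--" if it cannot be retrieved."""
    url = f"https://bugzilla.mozilla.org/rest/bug/{bug_id}/comment"

    try:
        response = requests.get(url, timeout=30)  # Avoid hanging on a stalled request
        comments_data = response.json()
        return comments_data["bugs"][str(bug_id)]["comments"][0]["text"]

    except Exception:
        # Any failure (network, JSON, missing keys) yields the placeholder
        return "--"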
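
The nested lookup used above can be checked offline by separating the parsing from the request. This is a small sketch under that assumption: `extract_first_comment` and `sample_payload` are hypothetical, and the payload only mimics the shape that the lookup in `fetch_first_comment` expects.

```python
from typing import Any, Dict

PLACEHOLDER = "--"  # Same placeholder the notebook uses for missing descriptions


def extract_first_comment(payload: Dict[str, Any], bug_id: int) -> str:
    """Pull the first comment's text out of a comments payload."""
    try:
        return payload["bugs"][str(bug_id)]["comments"][0]["text"]
    except (KeyError, IndexError, TypeError):
        return PLACEHOLDER


# Illustrative payload shaped like the structure the lookup relies on
sample_payload = {
    "bugs": {
        "123": {"comments": [{"text": "First comment"}, {"text": "Reply"}]}
    }
}

print(extract_first_comment(sample_payload, 123))   # First comment
print(extract_first_comment({"bugs": {}}, 123))     # --
```

Note that the bug ID is an integer in the DataFrame but a string key in the payload, which is why the lookup converts it with `str(bug_id)`.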

In [11]:
data["Description"] = data["Bug ID"].progress_apply(fetch_first_comment)

100%|███████████████████████████████████████████████████████████████████████████| 10000/10000 [1:52:10<00:00,  1.49it/s]
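
The sequential run above took about 1 hour 52 minutes for 10,000 requests. One possible way to shorten it is to issue the requests from a small thread pool. The sketch below is not what the notebook does: `fetch_all` and `fake_fetch` are hypothetical names, and it assumes the server tolerates a modest number of concurrent requests, which is not established here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def fetch_all(
    bug_ids: Iterable[int],
    fetch: Callable[[int], str],
    max_workers: int = 8,
) -> List[str]:
    """Apply fetch to every bug ID in parallel, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs
        return list(pool.map(fetch, bug_ids))


def fake_fetch(bug_id: int) -> str:
    """Stand-in for a network call so the sketch runs offline."""
    return f"description of bug {bug_id}"


descriptions = fetch_all([3, 1, 2], fake_fetch, max_workers=2)
print(descriptions)
```

In the notebook, a call such as `fetch_all(data["Bug ID"], fetch_first_comment)` could replace `progress_apply`, at the cost of losing the progress bar unless one is added separately.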


In [28]:
# Drop rows without content
data.drop(data[data["Description"] == "--"].index, inplace=True)

In [35]:
data.drop(data[data["Description"] == ""].index, inplace=True)
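
The two filtering cells can also be combined into a single mask. The following is a sketch on made-up rows: `drop_empty_descriptions` is a hypothetical helper, and unlike the cells above it additionally treats whitespace-only and missing values as empty, which is an assumption about the desired behavior.

```python
import pandas as pd

# Made-up rows covering each kind of empty description
sample = pd.DataFrame(
    {
        "Bug ID": [1, 2, 3, 4, 5],
        "Description": ["Steps to reproduce", "--", "", "   ", None],
    }
)


def drop_empty_descriptions(frame, column="Description", placeholder="--"):
    """Return a copy without rows whose description is missing, blank, or the placeholder."""
    text = frame[column].astype("string").fillna("").str.strip()
    keep = ~text.isin(["", placeholder])
    return frame[keep].copy()


cleaned = drop_empty_descriptions(sample)
print(cleaned["Bug ID"].tolist())
```

Returning a filtered copy instead of using `inplace=True` leaves the original frame available for comparison.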

In [36]:
data.Description.isna().sum()

np.int64(0)

In [37]:
data.shape

(9157, 10)

In [40]:
data.head()

| | Bug ID | Type | Summary | Product | Component | Status | Resolution | Priority | Severity | Description |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1955715 | enhancement | Update addonsInfo asrouter targeting to allow ... | Firefox | Messaging System | RESOLVED | FIXED | P1 | -- | Currently, the addonsInfo targeting returns an... |
| 2 | 1953155 | task | Enable expand on hover and remove coming soon ... | Firefox | Sidebar | RESOLVED | FIXED | P1 | -- | When expand on hover is enabled, the message s... |
| 4 | 1953857 | enhancement | Add support for picker style tiles in the Abou... | Firefox | Messaging System | RESOLVED | FIXED | P1 | -- | In bug 1910633 we added support for a single s... |
| 5 | 1945526 | task | [SPIKE] What’s New Notification: Windows Toast... | Firefox | Messaging System | RESOLVED | FIXED | P1 | -- | Spike to understand how the Windows Toast Noti... |
| 6 | 1945564 | enhancement | Add new callout for Create Tab Group action &&... | Firefox | Messaging System | RESOLVED | FIXED | P1 | -- | Scope is to update && add to the onboarding ca... |


In [45]:
# Check for duplicates
data["Description"].duplicated().sum()

np.int64(27)
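
The count above says how many descriptions repeat an earlier one but not which rows they are. The sketch below, on made-up rows, shows one way to inspect and optionally remove them. The notebook itself only counts duplicates, so dropping them is an assumption about a possible next step.

```python
import pandas as pd

# Made-up rows: bugs 10 and 12 share the same description
sample = pd.DataFrame(
    {
        "Bug ID": [10, 11, 12, 13],
        "Description": ["crash on start", "tab closes", "crash on start", "slow scroll"],
    }
)

# Same check as in the notebook: rows that repeat an earlier description
dup_count = int(sample["Description"].duplicated().sum())

# keep=False marks every copy, which helps when inspecting the groups
all_copies = sample[sample["Description"].duplicated(keep=False)]

# Keep only the first row for each distinct description
deduplicated = sample.drop_duplicates(subset="Description", keep="first")

print(dup_count)
print(all_copies["Bug ID"].tolist())
print(deduplicated["Bug ID"].tolist())
```

Whether identical descriptions on different bug IDs should be dropped depends on the downstream use; inspecting `all_copies` first makes that decision easier.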

In [46]:
# Save to CSV
data.to_csv("bug_reports_mozilla_firefox_resolved_fixed_comments.csv", index=False)