## Generate a synthetic dataset

- Model used: Qwen3-1.7B (`Qwen/Qwen3-1.7B`, the checkpoint loaded in the code below)
- The model uses a few-shot prompt to understand the customer's query and convert it into a structured support ticket for the customer support process.
- ChatGPT was used to create 40 customer queries; these queries are then passed to the Qwen3 model to generate a support ticket in a default format.
- Few-shot prompts are used to generate each support ticket via the LLM
- The string returned by the LLM is parsed and converted into a structured, JSON-serializable object
- The queries and the support tickets are stored in a CSV file for later use as a vector DB

In [1]:
"""Generating a synthetic dataset"""

import numpy as np
import pandas as pd
import json


In [2]:
# Sample customer query
customer_query = """Hi there,
We're experiencing issues with file uploads in your web app. Whenever we try to upload a PDF larger than 5MB, it fails silently—no error message, just nothing happens. This occurs on both Edge (v120) and Chrome (v121) on Windows 11. Smaller files seem to upload fine. Can you look into this?

Best,
Ayesha (IT Admin, Acme Corp)"""

In [3]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import re
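def generate_support_ticket(query):
    """Return the model's text response for one customer query.

    The response is expected to be a JSON-formatted support ticket.
    Note: the tokenizer and model are loaded on every call.
    """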
    model_name = "Qwen/Qwen3-1.7B"

    # load the tokenizer and the model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype="auto",
        device_map="auto"
    )

    customer_query = query
    prompt = f"""
    You are a structured support assistant. Your role is to extract key information from customer queries and generate a suggested resolution. Return everything as a properly formatted JSON object.

    Instructions:
    - Output only the JSON object — no additional text, labels, or explanations.
    - Ensure the JSON is syntactically valid: use double quotes, proper commas, and braces.
    - The "Resolution" field must contain a reasonable fix or suggestion based on the described problem.
    - The "Resolution" field is mandatory — infer a likely troubleshooting step or recommendation, even if it’s not explicitly stated in the query.
    - Try to fill in as many relevant values as possible based on the query. If you are unsure about a value for any key, set it to null.
    - If the customer value is null, make it 'Individual'

    Fields to extract:
    {{
        "Query": "Full query with all greetings and names removed",
        "Title": "One-line summary of the issue",
        "Browser": "Browser name and version if mentioned",
        "OS": "Operating system if mentioned",
        "Customer Type": "Individual, Enterprise, Developer, etc.",
        "Issue": "Short description of the core problem",
        "Resolution": "Recommended troubleshooting step or fix"
    }}

    Examples:

    Example 1:
    Query:
    "
    Hi,
    I'm having trouble logging into our account using Single Sign-On (SSO) on Safari (version 16.3) on macOS Ventura. When I try to log in, I'm stuck in a redirect loop and never reach the dashboard. This issue doesn't occur on other browsers. Could you please help me resolve this?

    Thanks,
    John
    "

    Response:
    {{
        "Query": "I'm having trouble logging into our account using Single Sign-On (SSO) on Safari (version 16.3) on macOS Ventura. When I try to log in, I'm stuck in a redirect loop and never reach the dashboard. This issue doesn't occur on other browsers.",
        "Title": "Login failure on Safari for SSO users",
        "Browser": "Safari 16.3",
        "OS": "macOS Ventura",
        "Customer Type": "Enterprise",
        "Issue": "Redirect loop during SSO login",
        "Resolution": "Clear cookies and enable cross-site tracking in Safari settings"
    }}

    Example 2:
    Query:
    "
    Hello team,
    Our developer portal isn't loading correctly in Chrome. We see a blank white screen after logging in. No errors appear, but the console shows some CORS-related messages. We're accessing from Windows 10.

    Thanks,
    Kira (DevOps Engineer)
    "

    Response:
    {{
        "Query": "Our developer portal isn't loading correctly in Chrome. We see a blank white screen after logging in. No errors appear, but the console shows some CORS-related messages. We're accessing from Windows 10.",
        "Title": "Developer portal fails to load in Chrome",
        "Browser": "Chrome (version not specified)",
        "OS": "Windows 10",
        "Customer Type": "Developer",
        "Issue": "Blank screen and CORS warnings post-login",
        "Resolution": "Check CORS headers on the server and suggest using the latest browser version"
    }}

    Now process the following query and return only a valid JSON object:

    Query:
    {customer_query}
    """


    messages = [
        {"role": "user", "content": prompt}
    ]
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=True # Switches between thinking and non-thinking modes. Default is True.
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # conduct text completion
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=32768
    )
    output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist() 

    # parsing thinking content
    try:
        # rindex finding 151668 (</think>)
        index = len(output_ids) - output_ids[::-1].index(151668)
    except ValueError:
        index = 0

    thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
    content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")
    
    return content
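
The last step of `generate_support_ticket` separates the model's reasoning from its final answer by locating the last `</think>` token (id 151668 in the code above). The sketch below isolates that logic on plain lists of token ids so it can be checked without loading a model. `split_at_last_marker` is a hypothetical helper name, not part of transformers, and the toy ids are made up for illustration.

```python
def split_at_last_marker(output_ids, marker_id=151668):
    """Split token ids into (thinking_ids, answer_ids) at the last marker.

    If the marker is absent, everything is treated as the answer,
    mirroring the `index = 0` fallback in the function above.
    """
    try:
        # Searching the reversed list yields the position just after the last marker
        index = len(output_ids) - output_ids[::-1].index(marker_id)
    except ValueError:
        index = 0
    return output_ids[:index], output_ids[index:]


# Toy ids: 1 and 2 stand in for reasoning tokens, 7 and 8 for the answer
thinking, answer = split_at_last_marker([1, 2, 151668, 7, 8])
no_marker = split_at_last_marker([7, 8])
```

Note that the marker itself ends up in the thinking part, which is why decoding the answer part alone gives clean content.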



### Import customer queries CSV file
- Load the customer queries from `./datasets/customer_queries.csv` into a dataset variable using pandas
- Convert the `Query` column of the pandas DataFrame into a list for easy iteration

In [4]:

queries_dataset = pd.read_csv("./datasets/customer_queries.csv")
queries_dataset = queries_dataset['Query'].to_list()


In [5]:
"""Structure the LLM response, since responses may be unstructured.
- A class SupportCase deriving from Pydantic BaseModel
- A parser that converts the response into a well-defined, structured Support Ticket item.
"""

import re
from pydantic import BaseModel, Field


class SupportCase(BaseModel):
    Query: str
    Title: str
    Browser: str
    OS: str
    Customer_Type: str
    Issue: str
    Resolution: str


def parse_unstructured_response(response: str) -> SupportCase:
    """Extract each ticket field from the raw LLM response with a regex.

    Each pattern captures the text after the opening double quote of a field
    and stops at the first double quote ("), comma (,) or newline, whichever
    comes first. A value that contains a comma is therefore cut off at that
    comma, and a field that is not found is returned as an empty string.
    """
    patterns = {
        "Query": r'"Query":\s*"([^"\n,]*)(?:(?<=,)[^"\n]*|(?<=\n)[^"]*)?',
        "Title": r'"Title":\s*"([^"\n,]*)(?:(?<=,)[^"\n]*|(?<=\n)[^"]*)?',
        "Browser": r'"Browser":\s*"([^"\n,]*)(?:(?<=,)[^"\n]*|(?<=\n)[^"]*)?',
        "OS": r'"OS":\s*"([^"\n,]*)(?:(?<=,)[^"\n]*|(?<=\n)[^"]*)?',
        "Customer_Type": r'"Customer Type":\s*"([^"\n,]*)(?:(?<=,)[^"\n]*|(?<=\n)[^"]*)?',
        "Issue": r'"Issue":\s*"([^"\n,]*)(?:(?<=,)[^"\n]*|(?<=\n)[^"]*)?',
        "Resolution": r'"Resolution":\s*"([^"\n,]*)(?:(?<=,)[^"\n]*|(?<=\n)[^"]*)?'
    }
    parsed_data = {}
    for key, pattern in patterns.items():
        # Search once per field; fall back to an empty string when the field is missing
        match = re.search(pattern, response, re.DOTALL)
        parsed_data[key] = match.group(1).strip() if match else ""
    return SupportCase(**parsed_data)
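
Because the prompt asks the model for a syntactically valid JSON object, one alternative to field-by-field regexes is to try `json.loads` on the `{...}` span of the response first. The sketch below is an illustration under that assumption, not the parser used in this notebook: `parse_ticket_json` and `FIELDS` are hypothetical names, and the function returns a plain dict that could be passed to `SupportCase(**...)`. When the response is not valid JSON it returns empty strings, at which point the regex parser above could serve as a fallback.

```python
import json
import re

# Keys as they appear in the prompt; "Customer Type" is renamed to match SupportCase
FIELDS = ["Query", "Title", "Browser", "OS", "Customer Type", "Issue", "Resolution"]


def parse_ticket_json(response: str) -> dict:
    """Parse the {...} span as JSON; return empty strings if that fails."""
    ticket = {}
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if match:
        try:
            loaded = json.loads(match.group(0))
            if isinstance(loaded, dict):
                ticket = loaded
        except json.JSONDecodeError:
            ticket = {}
    # Map null or missing values to "" so every field is a string
    return {
        field.replace(" ", "_"): str(ticket.get(field) or "")
        for field in FIELDS
    }


sample = '{"Query": "Export fails, charts are cut off", "Title": "PDF export", "Browser": null}'
parsed = parse_ticket_json(sample)
```

Unlike the regex patterns, this approach keeps commas that appear inside a value, since the JSON parser handles quoting.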


In [6]:

from tqdm import tqdm
def generate_synthetic_dataset(query):
    """Generate one synthetic record for a single customer query.
    - Generate the raw support ticket text with the LLM
    - Parse the response into a structured SupportCase object
    - Convert the SupportCase object into a dictionary
    - Return the dictionary
    """

    response = generate_support_ticket(query)
    support_ticket = parse_unstructured_response(response)
    support_ticket = support_ticket.model_dump()
    return support_ticket
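
`generate_support_ticket` loads the tokenizer and model on every call, so each query processed through this function pays the loading cost again. A minimal sketch of one way to avoid that is to cache the loader with `functools.lru_cache`. To keep the example runnable without downloading weights, `load_model` below is a hypothetical stand-in that returns a placeholder tuple; in the notebook it would wrap the `AutoTokenizer.from_pretrained` and `AutoModelForCausalLM.from_pretrained` calls. `generate_ticket_stub` is likewise a placeholder for the real generation step.

```python
from functools import lru_cache

load_calls = []  # records how many times the expensive load actually runs


@lru_cache(maxsize=1)
def load_model(model_name: str):
    """Hypothetical loader: stands in for the from_pretrained calls."""
    load_calls.append(model_name)
    return ("tokenizer-for-" + model_name, "model-for-" + model_name)


def generate_ticket_stub(query: str, model_name: str = "Qwen/Qwen3-1.7B") -> str:
    """Placeholder for generation: reuses the cached tokenizer/model pair."""
    tokenizer, model = load_model(model_name)
    return f"ticket for: {query}"


tickets = [generate_ticket_stub(q) for q in ["query one", "query two", "query three"]]
```

With the cache in place, `load_calls` holds a single entry after the three calls, because only the first call triggers the loader.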



In [7]:
synthetic_data = []
for query in tqdm(queries_dataset, desc = "Generating Synthetic Data .. "):
    data = generate_synthetic_dataset(query)
    print(data)
    synthetic_data.append(data)  

Generating Synthetic Data .. :   0%|          | 0/40 [00:00<?, ?it/s]

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :   2%|▎         | 1/40 [00:39<25:25, 39.12s/it]

{'Query': "I'm unable to generate reports from the dashboard. The button remains greyed out even after selecting all the required filters. This happens on Chrome on macOS Sonoma.", 'Title': 'Dashboard report generation button is greyed out after applying filters', 'Browser': 'Chrome', 'OS': 'macOS Sonoma', 'Customer_Type': 'Individual', 'Issue': 'Dashboard report generation button remains greyed out after applying filters', 'Resolution': 'Clear browser cache and cookies'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :   5%|▌         | 2/40 [01:08<21:05, 33.29s/it]

{'Query': "Our team can't access shared folders anymore. We keep getting a 'Permission Denied' error", 'Title': 'Shared folder access denied error on Firefox on Windows 10', 'Browser': 'Firefox', 'OS': 'Windows 10', 'Customer_Type': 'Enterprise', 'Issue': 'Permission denied error when accessing shared folders despite no role changes', 'Resolution': 'Check firewall settings and ensure the shared folder permissions are correctly configured'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :   8%|▊         | 3/40 [02:06<27:37, 44.81s/it]

{'Query': "The mobile app crashes on launch for Android 14 users. It briefly shows the splash screen then closes. We've tried reinstalling. Help? – Dev Team", 'Title': 'Mobile app crashes on launch for Android 14 users', 'Browser': '', 'OS': 'Android 14', 'Customer_Type': 'Developer', 'Issue': 'Mobile app crashes on launch for Android 14 users', 'Resolution': 'Check for app updates and ensure the device is compatible with Android 14. If the issue persists'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  10%|█         | 4/40 [02:30<21:59, 36.65s/it]

{'Query': "Emails from your system are being flagged as spam by Gmail. We're missing important notifications. Can you adjust the sending domain or SPF records? Thanks", 'Title': 'Email spam flagging and notification issues', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Emails flagged as spam by Gmail leading to missing notifications', 'Resolution': "Check and update SPF records or adjust the sending domain settings in Gmail's spam filtering settings"}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  12%|█▎        | 5/40 [03:05<21:01, 36.04s/it]

{'Query': 'I’m seeing a 502 Bad Gateway error when trying to access our billing page. This happens in Safari on iOS 17.', 'Title': 'Billing page fails with 502 Bad Gateway error in Safari on iOS 17', 'Browser': 'Safari iOS 17', 'OS': 'iOS 17', 'Customer_Type': 'Individual', 'Issue': '502 Bad Gateway error when accessing billing page in Safari on iOS 17', 'Resolution': 'Check server logs and ensure the billing service is running correctly. Verify if there are any known issues with the billing page or server configuration.'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  15%|█▌        | 6/40 [03:27<17:34, 31.01s/it]

{'Query': "we tried importing our user data via CSV but keep getting 'Invalid file format' even though it follows the template.", 'Title': 'CSV import failure with invalid format', 'Browser': 'Edge', 'OS': 'Windows 11', 'Customer_Type': 'Individual', 'Issue': 'CSV file format error despite following the template', 'Resolution': 'Verify file encoding (UTF-8 or UTF-16) and check for missing commas or incorrect delimiters in the CSV'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  18%|█▊        | 7/40 [03:50<15:39, 28.46s/it]

{'Query': "I noticed the time tracking widget is not syncing with our calendar anymore. We're using Google Calendar integration. This started last week. Help appreciated! – Camille", 'Title': 'Time tracking widget not syncing with Google Calendar', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Time tracking widget not syncing with Google Calendar', 'Resolution': "Check the Google Calendar integration settings in the time tracking tool and ensure it's properly configured"}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  20%|██        | 8/40 [04:28<16:48, 31.52s/it]

{'Query': "we just onboarded a new employee but they’re stuck at the 'Verify your email' step. They never get the email. We've checked spam and firewall.", 'Title': 'Email verification failure for new employee', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Email verification failure for new employee', 'Resolution': "Check if the email was sent to the correct address and ensure the employee is checking their inbox. Verify email settings and confirm the employee's account is properly configured."}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  22%|██▎       | 9/40 [04:53<15:12, 29.45s/it]

{'Query': "Our API calls are failing with a 403 error since yesterday. Our API key hasn't changed. Could it be rate limiting or permission-related?", 'Title': 'API 403 error due to rate limiting or permissions', 'Browser': '', 'OS': '', 'Customer_Type': 'Developer', 'Issue': '403 error during API calls', 'Resolution': 'Check API rate limits and verify API key permissions. Ensure the API key is valid and has the necessary access rights.'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  25%|██▌       | 10/40 [05:25<15:11, 30.37s/it]

{'Query': 'The new dashboard layout doesn’t load on Safari 15. It shows a blank screen. Other browsers work fine. Is this a known issue? – Michael', 'Title': 'Dashboard layout fails to load on Safari 15 with blank screen', 'Browser': 'Safari 15', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Dashboard layout fails to load on Safari 15 with blank screen', 'Resolution': 'Check browser compatibility for Safari 15 and consider updating to a newer version or checking for known issues in the browser support documentation.'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  28%|██▊       | 11/40 [05:52<14:06, 29.20s/it]

{'Query': "the dark mode setting resets every time I log in. I’ve tried clearing cookies but no luck. I'm on Firefox on Linux.", 'Title': 'Dark mode resets on login', 'Browser': 'Firefox', 'OS': 'Linux', 'Customer_Type': 'Individual', 'Issue': 'Dark mode resets after login', 'Resolution': "Check if system-wide dark mode settings are overriding Firefox's theme and disable any browser extensions that may affect the theme"}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  30%|███       | 12/40 [06:22<13:43, 29.43s/it]

{'Query': "our invoices are showing incorrect tax calculations for European customers. VAT isn't being added.", 'Title': 'Incorrect VAT calculation in invoices for European customers', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'VAT not being added to invoices for European customers', 'Resolution': 'Check and update tax settings to include VAT for European customers'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  32%|███▎      | 13/40 [06:55<13:45, 30.58s/it]

{'Query': "I'm trying to upgrade our subscription but the payment form keeps timing out. I've tried multiple cards. Using Chrome on Windows 10.", 'Title': 'Subscription upgrade payment form timeout', 'Browser': 'Chrome (version not specified)', 'OS': 'Windows 10', 'Customer_Type': 'Individual', 'Issue': 'Payment form timeouts during subscription upgrade', 'Resolution': 'Clear browser cache and cookies'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  35%|███▌      | 14/40 [07:26<13:21, 30.84s/it]

{'Query': 'we’re seeing delays in real-time notifications—up to 10 minutes sometimes. This impacts our workflow significantly. Any updates? – Operations Team', 'Title': 'Delays in real-time notifications', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Delays in real-time notifications', 'Resolution': 'Check system logs for real-time notification delays or contact support for further details'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  38%|███▊      | 15/40 [07:49<11:45, 28.22s/it]

{'Query': 'when we export our dashboard to PDF', 'Title': 'PDF export issue: charts missing or cut off', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Charts missing or cut off during PDF export', 'Resolution': 'Check PDF export settings and ensure that charts are included in the export. Try exporting in a different PDF viewer to see if the issue persists.'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  40%|████      | 16/40 [08:14<10:55, 27.32s/it]

{'Query': "I’m using the CLI tool and suddenly getting an 'Authentication failed' message. Token hasn’t changed. What should I do? – Greg", 'Title': 'Authentication failed with unchanged token in CLI tool', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Authentication failure with unchanged token in CLI tool', 'Resolution': 'Check if the token is expired or invalid. Try re-authenticating or clearing cached credentials. Verify server-side authentication settings.'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  42%|████▎     | 17/40 [08:51<11:33, 30.15s/it]

{'Query': "attachments aren't downloading properly in Safari—they appear as .txt instead of PDFs. This is only happening for some users.", 'Title': 'Attachments not downloading correctly in Safari', 'Browser': 'Safari', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Attachments are displayed as .txt instead of PDFs in Safari.', 'Resolution': "Check Safari's download settings to ensure PDFs are allowed and try clearing the cache."}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  45%|████▌     | 18/40 [09:17<10:36, 28.91s/it]

{'Query': 'every time I try to reset my password', 'Title': 'Password reset token expired error', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Password reset token expired error', 'Resolution': 'Check if the password reset link was clicked immediately and ensure the token is valid. If the issue persists'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  48%|████▊     | 19/40 [09:57<11:20, 32.38s/it]

{'Query': 'we recently switched to SAML authentication', 'Title': "SAML Authentication Issue: New Users Getting 'Invalid Assertion' Errors", 'Browser': '', 'OS': '', 'Customer_Type': 'Enterprise', 'Issue': "New users are encountering 'invalid assertion' errors after switching to SAML authentication", 'Resolution': 'Verify SAML configuration'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  50%|█████     | 20/40 [10:35<11:18, 33.94s/it]

{'Query': 'the drag-and-drop upload feature doesn’t work in Firefox but works in Chrome. It just opens the file instead of uploading. Any workaround? – Daniel', 'Title': 'Drag-and-Drop Upload Issue in Firefox', 'Browser': 'Firefox', 'OS': '', 'Customer_Type': 'Developer', 'Issue': 'Drag-and-drop upload feature fails in Firefox', 'Resolution': 'Try clearing browser cache and cookies in Firefox. If issue persists'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  52%|█████▎    | 21/40 [11:09<10:46, 34.02s/it]

{'Query': 'we can’t delete old projects—clicking delete does nothing. No error message either. Using Edge on Windows', 'Title': 'Project deletion fails in Edge on Windows', 'Browser': 'Edge', 'OS': 'Windows', 'Customer_Type': 'Individual', 'Issue': 'Project deletion fails when clicking delete button', 'Resolution': 'Clear browser cache and try again'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  55%|█████▌    | 22/40 [11:35<09:28, 31.57s/it]

{'Query': "I'm getting logged out randomly while working. It’s disrupting my work. Session timeout setting? – Omar", 'Title': 'Random login logout due to session timeout', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Random login logout due to session timeout', 'Resolution': 'Check and adjust the session timeout settings in your account or system configuration'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  57%|█████▊    | 23/40 [12:15<09:41, 34.23s/it]

{'Query': 'the app is stuck on a loading spinner after login. Happens on Android 13. Works fine on iOS. Any updates? – Beta Tester', 'Title': 'App loading spinner stuck after login on Android 13', 'Browser': 'Android 13', 'OS': 'Android', 'Customer_Type': 'Beta Tester', 'Issue': 'App is stuck on loading spinner after login on Android 13', 'Resolution': 'Check for app updates and ensure the device is running the latest Android OS version'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  60%|██████    | 24/40 [12:39<08:17, 31.10s/it]

{'Query': 'Our webhook endpoint isn’t receiving payloads. Nothing in our logs. Can you check if they’re being sent from your side? Thanks', 'Title': 'Webhook endpoint not receiving payloads', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Webhook endpoint not receiving payloads', 'Resolution': 'Verify webhook URL configuration and check server logs on the other side for payload transmission details'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  62%|██████▎   | 25/40 [13:04<07:18, 29.25s/it]

{'Query': "scheduled reports aren't being sent via email. They show as 'Delivered' in the UI but never arrive.", 'Title': 'Scheduled reports not arriving via email despite marked as delivered', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Scheduled reports not arriving via email despite marked as delivered', 'Resolution': 'Check email server settings and verify that the report delivery configuration is correct'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  65%|██████▌   | 26/40 [13:36<06:59, 30.00s/it]

{'Query': "I just tried uploading a logo in our profile settings and received 'Unsupported file format'—but it's a PNG.", 'Title': "Logo upload fails with 'Unsupported file format' error", 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': "Logo upload fails with 'Unsupported file format' error", 'Resolution': 'Check if the server settings allow PNG file uploads and ensure the file is correctly formatted.'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  68%|██████▊   | 27/40 [14:10<06:46, 31.27s/it]

{'Query': 'Hey', 'Title': 'E-signature signing fails for legal documents', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'E-signature signing fails for legal documents', 'Resolution': 'Verify e-signature settings and ensure the document is correctly formatted for signing'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  70%|███████   | 28/40 [15:02<07:30, 37.51s/it]

{'Query': 'auto-save is not working in the editor anymore. We’ve lost work due to this. Using Chrome on Windows 10.', 'Title': 'Auto-save functionality not working in editor causing data loss', 'Browser': 'Chrome', 'OS': 'Windows 10', 'Customer_Type': 'Individual', 'Issue': 'Auto-save is not working in the editor anymore. We’ve lost work due to this.', 'Resolution': 'Clear browser cache and cookies'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  72%|███████▎  | 29/40 [15:30<06:21, 34.68s/it]

{'Query': "we're seeing duplicated records after importing contacts. Can you prevent duplicates based on email ID? – CRM Admin", 'Title': 'Duplicate records during contact import based on email ID', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Duplicate records after importing contacts based on email ID', 'Resolution': 'Configure the import settings to check for existing email IDs and prevent duplicates'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  75%|███████▌  | 30/40 [15:56<05:19, 31.96s/it]

{'Query': 'the analytics widget isn’t updating in real-time—it shows data from yesterday only. Any known issue?', 'Title': 'Analytics widget not updating in real-time', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Analytics widget shows data from yesterday instead of real-time updates', 'Resolution': 'Check if the widget is updated or contact support for further assistance'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  78%|███████▊  | 31/40 [16:17<04:19, 28.87s/it]

{'Query': 'the custom domain setup for our portal fails at the SSL verification step. Using Cloudflare for DNS. Please help. – IT Admin', 'Title': 'SSL verification failure during custom domain setup with Cloudflare', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'SSL verification fails during custom domain setup with Cloudflare', 'Resolution': 'Verify Cloudflare SSL/TLS settings'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  80%|████████  | 32/40 [17:03<04:30, 33.86s/it]

{'Query': "We’re getting 'Unknown error' when trying to submit forms on iPads running iPadOS 17. Worked before. Fix? – Front Desk", 'Title': 'Unknown error on iPadOS 17 form submission', 'Browser': 'iPadOS 17', 'OS': 'iPadOS 17', 'Customer_Type': 'Individual', 'Issue': 'Unknown error when submitting forms on iPadOS 17', 'Resolution': 'Check for iPadOS updates and clear browser cache if using Safari'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  82%|████████▎ | 33/40 [17:49<04:22, 37.55s/it]

{'Query': 'when I create a new user group', 'Title': 'User group permissions not saving correctly leading to full access', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Permissions not saving correctly leading to full access', 'Resolution': 'Check and verify that the user group permissions are correctly assigned and saved in the system. If not'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  85%|████████▌ | 34/40 [18:18<03:29, 34.87s/it]

{'Query': 'your widget causes layout shifts on our site when it loads. Can we delay or async-load it? – Frontend Dev', 'Title': 'Layout shifts caused by widget loading', 'Browser': '', 'OS': '', 'Customer_Type': 'Developer', 'Issue': 'Layout shifts occur when widget is loaded', 'Resolution': "Implement async or defer the widget's loading to prevent layout shifts"}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  88%|████████▊ | 35/40 [18:44<02:41, 32.26s/it]

{'Query': "our backups to Dropbox failed with a 'Token revoked' error. We didn’t change anything. How can we reconnect safely? – IT Ops", 'Title': 'Dropbox Backup Token Revoked Error', 'Browser': '', 'OS': '', 'Customer_Type': 'Enterprise', 'Issue': "Backup failed with 'Token revoked' error", 'Resolution': 'Re-authenticate with Dropbox to resolve the token revoked error'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  90%|█████████ | 36/40 [19:08<01:59, 29.96s/it]

{'Query': 'we integrated your app with Slack but messages are delayed or missing. Webhook retries? – Product Team', 'Title': 'Slack integration messages delayed or missing with webhook retries', 'Browser': '', 'OS': '', 'Customer_Type': 'Enterprise', 'Issue': 'Delayed or missing Slack messages with webhook retries', 'Resolution': 'Verify webhook retry settings in Slack app configuration and ensure proper integration'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  92%|█████████▎| 37/40 [19:35<01:27, 29.12s/it]

{'Query': 'I’m using Brave browser and the login page doesn’t load—it’s just blank. Any compatibility issues?', 'Title': 'Login page not loading in Brave browser', 'Browser': 'Brave', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Login page fails to load with blank screen', 'Resolution': 'Check for browser updates and disable any conflicting extensions'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  95%|█████████▌| 38/40 [20:07<00:59, 29.82s/it]

{'Query': 'when exporting a CSV report', 'Title': 'CSV export formatting issue with commas in text fields', 'Browser': '', 'OS': '', 'Customer_Type': 'Developer', 'Issue': 'Commas in text fields cause CSV export formatting issues', 'Resolution': 'Wrap fields in quotes when exporting CSV to prevent comma issues'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. :  98%|█████████▊| 39/40 [20:43<00:31, 31.67s/it]

{'Query': 'We tried sending a bulk invite to users but only some received the email. We’re using the admin console. Why?', 'Title': 'Bulk invite not reaching all users in admin console', 'Browser': '', 'OS': '', 'Customer_Type': 'Enterprise', 'Issue': 'Bulk invite not reaching all users in admin console', 'Resolution': 'Check email server settings and verify recipient list for bulk invites'}


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Generating Synthetic Data .. : 100%|██████████| 40/40 [21:07<00:00, 31.68s/it]

{'Query': 'I accidentally deleted an important workspace. Is there a way to restore it? It happened an hour ago. – Maria', 'Title': 'Workspace deletion and restoration request', 'Browser': '', 'OS': '', 'Customer_Type': 'Individual', 'Issue': 'Accidental deletion of workspace and need to restore it', 'Resolution': 'Check if a backup exists and restore it'}





In [8]:
synthetic_data

[{'Query': "I'm unable to generate reports from the dashboard. The button remains greyed out even after selecting all the required filters. This happens on Chrome on macOS Sonoma.",
  'Title': 'Dashboard report generation button is greyed out after applying filters',
  'Browser': 'Chrome',
  'OS': 'macOS Sonoma',
  'Customer_Type': 'Individual',
  'Issue': 'Dashboard report generation button remains greyed out after applying filters',
  'Resolution': 'Clear browser cache and cookies'},
 {'Query': "Our team can't access shared folders anymore. We keep getting a 'Permission Denied' error",
  'Title': 'Shared folder access denied error on Firefox on Windows 10',
  'Browser': 'Firefox',
  'OS': 'Windows 10',
  'Customer_Type': 'Enterprise',
  'Issue': 'Permission denied error when accessing shared folders despite no role changes',
  'Resolution': 'Check firewall settings and ensure the shared folder permissions are correctly configured'},
 {'Query': "The mobile app crashes on launch for An

In [9]:
# Check that the synthetic data is JSON-serializable
import json

def is_json_serializable(data):
    """Check whether the synthetic data is JSON-serializable, so that it can be converted easily to a CSV file using pandas."""

    try:
        json.dumps(data)
        return True
    except (TypeError, OverflowError):
        return False
    
is_json_serializable(synthetic_data)

True

In [10]:
import pandas as pd

data = synthetic_data
csv_filename = "./datasets/query_support_ticket.csv"

df = pd.DataFrame(data)
df.to_csv(csv_filename, index=False)