This notebook is what I wrote to generate ~500k tickets of somewhat random contents using the [Zendesk bulk import API](https://developer.zendesk.com/api-reference/ticketing/tickets/ticket_import/#ticket-bulk-import) endpoint when working on getting good sample data for testing export ticket APIs.

The goal was to get tickets with some semi-random and different text assigned to some set of requesters and assignees, so that searching for some arbitrary words/tags and users would produce a nice subset of tickets for export searching.
I added attachments to some tickets to test how ticket attachments were exported.



I used a combination of [wonderwords](https://pypi.org/project/wonderwords/) as well as semi random presets to get somewhat convincing tickets with a decent variance for some simple text searching later on for search exports.



In [1]:
import json
import requests
from datetime import datetime, timedelta
import random
from wonderwords import RandomSentence
from wonderwords import RandomWord
import names
import os
import threading


Basic auth was used here: input your own username and password as well as subdomain. The subdomain is the preceding portion of your Zendesk instance, eg if your instance is d3v-example.zendesk.com, replace the subdomain with "d3v-example".

In [2]:
username = "user@example.com"
password = "password"
subdomain = "d3v-example"

I grabbed the list of end-users so I can randomly assign tickets to requesters later on. If you want to have many users to assign tickets to, 
[Create many users](https://developer.zendesk.com/api-reference/ticketing/users/users/#create-many-users) is a good option to create multiple users.

following which, I get the list of agents to have users to assign tickets to.


In [None]:
url = "https://"+subdomain+".zendesk.com/api/v2/users.json?page=1&role=end-user"

user_ids = []
while url:
    # Send an HTTP GET request to the URL
    response = requests.get(url, auth=(username, password))

    # Check if the request was successful
    if response.status_code == 200:
        # Parse the JSON data from the response
        data = response.json()

        # Extract user IDs and add them to the user_ids list
        user_ids.extend([user["id"] for user in data["users"]])

        # Get the next page URL
        url = data.get("next_page")

        # Print some information (you can remove this if not needed)
        print(f"Downloaded {len(user_ids)} user IDs from {url}")

    else:
        print(f"Failed to retrieve data from {url}. Status code: {response.status_code}")
        break

In [None]:
url = "https://"+subdomain+".zendesk.com/api/v2/users.json?page=1&role=agent"
agent_ids = []
while url:
    # Send an HTTP GET request to the URL
    response = requests.get(url, auth=(username, password))

    # Check if the request was successful
    if response.status_code == 200:
        # Parse the JSON data from the response
        data = response.json()

        # Extract user IDs and add them to the user_ids list
        agent_ids.extend([user["id"] for user in data["users"]])

        # Get the next page URL
        url = data.get("next_page")

        # Print some information (you can remove this if not needed)
        print(f"Downloaded {len(agent_ids)} agent IDs from {url}")

    else:
        print(f"Failed to retrieve data from {url}. Status code: {response.status_code}")
        break


agent_ids

Status, type, priority only have a few options, so I can randomly choose these. I also set up some boilerplate descriptions with placeholders for somewhat similar tickets with some variance in content.

In [None]:
tags_list = ["foo","bar","baz","urgent","feature request","billing issues","refund","product","sales","marketing"]
status = ["hold", "solved", "closed"]
type = ["question","incident","problem","task"]
priority = ["low","normal","high","urgent"]
descriptions_list = [
    "Hello, I'm experiencing issues with {issue}. Can you help me?",
    "I'm having trouble with {feature}. Please assist.",
    "I need help with {software} version {version}.",
    "My {account} has a problem. Can you look into it?",
    "I'd like to report a bug in {software} ({version}).",
    "Is there a way to {request}?",
]
subject_templates = [
    "Issue with {issue}",
    "Payment Processing Problem for {user}",
    "Website Down for Maintenance on {date}",
    "Password Reset Request for {account}",
    "Feature Request: {feature}",
    "Billing Inquiry for {user}",
    "Bug Report: {software} - {version}",
]
placeholders = {
    "{issue}": ["Account Login", "Payment", "Website", "Password", "Feature", "Billing", "Software", "Service"],
    "{feature}": ["Login", "Payments", "User Registration", "Dashboard", "Search", "Billing"],
    "{software}": ["ApplicationX", "SoftwareY", "ProductZ","Everlaw","Zendesk","Google","Facebook","Youtube"],
    "{version}": ["v1.0", "v2.1", "v3.5"],
    "{account}": ["User1's Account", "CustomerB's Account", "ClientC's Account"],
    "{request}": ["reset my password", "get a refund", "change my account settings"],
}


In [None]:
subject = random.choice(subject_templates)
for placeholder, values in placeholders.items():
    print(placeholder,values)
    value = random.choice(values)
    if random.random() > 0.8:
        value = RandomWord().word()

    subject = subject.replace(placeholder, value)
        # if placeholder == "{user}":
subject = subject.replace("{user}", names.get_full_name())
subject = subject.replace("{date}", datetime(2019, 10, 1).strftime("%Y-%m-%d"))
subject = subject.replace("{version}", "v"+str(round(random.random()*10,1)))
    # else:
subject

{issue} ['Account Login', 'Payment', 'Website', 'Password', 'Feature', 'Billing', 'Software', 'Service']
{feature} ['Login', 'Payments', 'User Registration', 'Dashboard', 'Search', 'Billing']
{software} ['ApplicationX', 'SoftwareY', 'ProductZ', 'Everlaw', 'Zendesk', 'Google', 'Facebook', 'Youtube']
{version} ['v1.0', 'v2.1', 'v3.5']
{account} ["User1's Account", "CustomerB's Account", "ClientC's Account"]
{request} ['reset my password', 'get a refund', 'change my account settings']


"Password Reset Request for CustomerB's Account"

The main ticket generation json function: I fixed the date range because Zendesk allows for creating tickets in the past, which is helpful for making tickets over a span of time rather than being clustered at creation time. This looks more realistic than having 100 tickets clustured at the same time.
The bulk import API supports up to 100 ticket objects/request.

In [24]:

def generate_ticket_json(num_jsons=1,num_tickets=100):
    start_date = datetime(2005, 1, 1)
    end_date = datetime(2005, 12, 31)


# Generate a number of tickets
    ticket_json = {
        "tickets": []
    }

    for i in range(num_tickets):
        # Generate random "created_at" timestamps within a specific range
        created_at = start_date + timedelta(seconds=random.randint(0, int((end_date - start_date).total_seconds())))
        comment_1 = created_at + timedelta(days=random.randint(1,3))
        comment_2 = created_at + timedelta(days=random.randint(3,5))
        # Incremental values
        description = random.choice(descriptions_list)
        description = description.replace("{version}", "v"+str(round(random.random()*10,1)))
        for placeholder, values in placeholders.items():
            value = random.choice(values)
            if random.random() > 0.8:
                value = RandomWord().word()
                
            description = description.replace(placeholder, value)
        description += f" ticket_description_{i}"
        if random.random() > 0.8:
            description += " " + RandomSentence().sentence()
        subject = random.choice(subject_templates)
        subject = subject.replace("{user}", names.get_full_name())
        subject = subject.replace("{date}", start_date.strftime("%Y-%m-%d"))
        subject = subject.replace("{version}", "v"+str(round(random.random()*10,1)))    
        for placeholder, values in placeholders.items():
            value = random.choice(values)
            if random.random() > 0.8:
                value = RandomWord().word()
            subject = subject.replace(placeholder, value)

        subject += f" ticket_subject_{i}"
        num_tags = random.randint(1,3)

        # Randomly select IDs
        requester_id = random.choice(user_ids)


        # Create a ticket dictionary
        ticket = {
            "assignee_id": random.choice(agent_ids),
            "created_at": created_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "status": random.choice(status),
            "type": random.choice(type),
            "priority": random.choice(priority),
            "comments": [
                {
                    "author_id": requester_id,
                    "created_at": comment_1.strftime("%Y-%m-%dT%H:%M:%SZ"),
                    "value": f"This is a comment on ticket_{i}"
                },
                {
                    "author_id": random.choice(agent_ids),
                    "public": False,
                    "created_at": comment_2.strftime("%Y-%m-%dT%H:%M:%SZ"),
                    "value": f"This is a private comment on ticket{i}"
                }
            ],
            "description": description,
            "requester_id": requester_id,
            "subject": subject,
            "tags": random.sample(tags_list,num_tags)
        }
        # 20% of the time add some extra comments with a random sentence
        if random.random() >0.8:
            ticket['comments'] +={
                    "author_id": requester_id,
                    "created_at": comment_1.strftime("%Y-%m-%dT%H:%M:%SZ"),
                    "value": RandomSentence().sentence()
                },{
                    "author_id": random.choice(agent_ids),
                    "public": False,
                    "created_at": comment_2.strftime("%Y-%m-%dT%H:%M:%SZ"),
                    "value": RandomSentence().sentence()
                }

        # Append the ticket to the list
        ticket_json["tickets"].append(ticket)
    return ticket_json

generate_ticket_json(num_jsons=1,num_tickets=1)

{'tickets': [{'assignee_id': 18673514369179,
   'created_at': '2005-05-07T03:15:47Z',
   'status': 'solved',
   'type': 'question',
   'priority': 'high',
   'comments': [{'author_id': 18673490435099,
     'created_at': '2005-05-08T03:15:47Z',
     'value': 'This is a comment on ticket_0'},
    {'author_id': 18673514369179,
     'public': False,
     'created_at': '2005-05-11T03:15:47Z',
     'value': 'This is a private comment on ticket0'}],
   'description': "Hello, I'm experiencing issues with Payment. Can you help me? ticket_description_0",
   'requester_id': 18673490435099,
   'subject': 'Website Down for Maintenance on 2005-01-01 ticket_subject_0',
   'tags': ['foo', 'sales', 'refund']}]}

If we want tickets with attachments, attachments need to first be uploaded to Zendesk, and then Zendesk returns a single use upload token which can be used in a ticket body json.
I found a [sample pokemon image dataset from kaggl](https://www.kaggle.com/datasets/vishalsubbiah/pokemon-images-and-types) which worked fine for my purposes of randomly selecting images. Bonus: the images were named after the pokemon, so I could extract the image name to use in tickets.

In [None]:
#helper function to create an attachment and upload to Zendesk so we can use the image in a ticket
def create_attachment():
    '''
        creates a random pokemon image attachment on Zendesk for adding to comments
        returns tuple of pokemon img filename and token
    '''
    # change to your own image directory
    image_directory = 'dir/images/'
    url = 'https://'+subdomain+'.zendesk.com/api/v2/uploads.json'
    

    # Check if the directory exists
    if not os.path.exists(image_directory):
        print("Directory not found.")
    else:
        # List all files in the directory
        image_files = [f for f in os.listdir(image_directory) if os.path.isfile(os.path.join(image_directory, f))]
        # Check if there are any image files
        if not image_files:
            print("No files found in the directory.")
        else:
            # Choose a random image file
            random_image = random.choice(image_files)

            # Print the name of the random image
            pokemon_filename = str(random_image)
            pokemon_name = os.path.splitext(pokemon_filename)[0]
            # print(pokemon_name)


    # upload pokemon image and get token
    # upload file
    params = {'filename': pokemon_filename}
    headers = {'Content-Type': 'image/png'}
    with open(image_directory+pokemon_filename, 'rb') as f:
        response = requests.post(url, params=params, data=f, headers=headers, auth=(username,password)).json()
    upload_token = response['upload']['token']

    return (pokemon_name,upload_token)
create_attachment()

('genesect', 'Ie84HrD31Kgz0ru7xwT2D6fD9')

In [None]:

def generate_ticket_with_attachments_json(num_tickets=100):
# start/end date variables
	start_date = datetime(2010, 1, 1)
	end_date = datetime(2011, 10, 1)
# init placeholders
	descriptions_list = [
		"What is the best pokemon for {scenario}?",
		"Should I go to {city}?",
		"I challenge you to a battle for my {pokemon}!",
		"What type is {pokemon} weak to?"
	]
	subject_templates = [
		"I wanna be the very best, like {user} never was",
		"Challenging gym leader {user}",
		"I can't find {pokemon}",
		"Best {pokemon} ever!"
	]
	tags_list = ["battle", "gym","level","bug","story","legendary","mewtwo"]
	placeholders = {
	"{scenario}" :  ["Pokémon Battle", "Catching a Legendary Pokémon", "Evolving a Pokémon", "Visiting a Pokémon Center", "Navigating a Gym Puzzle", "Pokémon Contest", "Breeding Pokémon", "Shiny Pokémon Encounter", "Team Rocket Encounter", "Exploring a Legendary Pokémon's Lair"],
	"{city}": ["Pallet Town", "Viridian City", "Pewter City", "Cerulean City", "Vermilion City", "Lavender Town", "Celadon City", "Fuchsia City", "Saffron City", "Cinnabar Island", "Viridian City", "Goldenrod City", "Violet City", "Ecruteak City", "Olivine City", "Cianwood City", "Mahogany Town", "Blackthorn City", "New Bark Town", "Cherrygrove City", "Violet City", "Azalea Town", "Goldenrod City", "Ecruteak City", "Olivine City", "Cianwood City", "Mahogany Town", "Blackthorn City", "Littleroot Town", "Oldale Town", "Petalburg City", "Rustboro City", "Dewford Town", "Slateport City", "Mauville City", "Fallarbor Town", "Verdanturf Town", "Lavaridge Town", "Fortree City", "Lilycove City", "Mossdeep City", "Sootopolis City", "Pacifidlog Town", "Ever Grande City", "Twinleaf Town", "Sandgem Town", "Jubilife City", "Oreburgh City", "Floaroma Town", "Eterna City"]
	
	}

# Generate tickets
	ticket_json = {
		"tickets": []
	}
# loop for number of tickets in payload
	for i in range(num_tickets):
		# Generate random "created_at" timestamps within a specific range
		created_at = start_date + timedelta(seconds=random.randint(0, int((end_date - start_date).total_seconds())))
		comment_1 = created_at + timedelta(days=random.randint(1,3))
		comment_2 = created_at + timedelta(days=random.randint(3,5))
		# upload attachments
		pokemons = []
		for i in range(3):
			pokemons.append(create_attachment())
		

		# Incremental values
		description = random.choice(descriptions_list)
		subject = random.choice(subject_templates)
		description = description.replace("{pokemon}", random.choice(pokemons)[0])
		subject = subject.replace("{pokemon}", random.choice(pokemons)[0])
		for placeholder, values in placeholders.items():
			value = random.choice(values)
			description = description.replace(placeholder, value)
		description += f" ticket_description_{i}"
		subject = subject.replace("{user}", names.get_full_name()) 
		for placeholder, values in placeholders.items():
			value = random.choice(values)
			subject = subject.replace(placeholder, value)

		subject += f" ticket_subject_{i}"
		num_tags = random.randint(1,3)
		tags = random.sample(tags_list,num_tags)
		tags.append("pokemon")

		# Randomly select IDs
		requester_id = random.choice(user_ids)


		# Create a ticket dictionary
		ticket = {
			"assignee_id": random.choice(agent_ids),
			"created_at": created_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
			"status": random.choice(status),
			"type": random.choice(type),
			"priority": random.choice(priority),
			"comments": [
				{
					"author_id": requester_id,
					"created_at": comment_1.strftime("%Y-%m-%dT%H:%M:%SZ"),
					"value": f"{pokemons[0][0]} is the best pokemon! This is a comment on ticket_{i}",
					"uploads": [pokemons[0][1]]
				},
				{
					"author_id": random.choice(agent_ids),
					"public": False,
					"created_at": comment_2.strftime("%Y-%m-%dT%H:%M:%SZ"),
					"value": f"Obviously {pokemons[1][0]} is better This is a private comment on ticket{i}",
					"uploads": [pokemons[1][1]]
				}
			],
			"description": description,
			"requester_id": requester_id,
			"subject": subject,
			"tags": tags
		}
		if random.random() >0.8:
			ticket['comments'] +={
					"author_id": requester_id,
					"created_at": comment_1.strftime("%Y-%m-%dT%H:%M:%SZ"),
					"value": f"Actually, {pokemons[2][0]} is the best! I love my pokemon!",
					"uploads": [pokemons[2][1]]

				},{
					"author_id": random.choice(agent_ids),
					"public": False,
					"created_at": comment_2.strftime("%Y-%m-%dT%H:%M:%SZ"),
					"value": "Mewtwo is the best, and then pikachu"
				}

		# Append the ticket to the list
		ticket_json["tickets"].append(ticket)
	return ticket_json

generate_ticket_with_attachments_json(num_tickets=1)

{'tickets': [{'assignee_id': 18673514369179,
   'created_at': '2010-09-19T06:57:09Z',
   'status': 'closed',
   'type': 'question',
   'priority': 'normal',
   'comments': [{'author_id': 18722212564251,
     'created_at': '2010-09-20T06:57:09Z',
     'value': 'servine is the best pokemon! This is a comment on ticket_2',
     'uploads': ['QGTJqRebst9OhyV6jCc9n9R7z']},
    {'author_id': 18673514369435,
     'public': False,
     'created_at': '2010-09-23T06:57:09Z',
     'value': 'Obviously spritzee is better This is a private comment on ticket2',
     'uploads': ['1wcPlxPmZyhpYzPLDSQQf2b7I']}],
   'description': 'Should I go to Ecruteak City? ticket_description_2',
   'requester_id': 18722212564251,
   'subject': 'Best carvanha ever! ticket_subject_2',
   'tags': ['legendary', 'bug', 'pokemon']}]}

Finally, call the bulk import endpoint with the ticket jsons: I don't handle rate limiting, since the contents are effectively random I don't care if some are dropped. If you have specific contents that needs to be included, retry logic can be added to handle rate limiting scenarios.

In [None]:

def post_import_random_tickets(num_jsons=3000,attachments=False,tickets_per_json=100):
	for i in range(num_jsons):
		if attachments:
			json1 = generate_ticket_with_attachments_json(num_tickets=tickets_per_json)
		else:
			json1 = generate_ticket_json(num_jsons=1,num_tickets=tickets_per_json)
		url = "https://"+subdomain+".zendesk.com/api/v2/imports/tickets/create_many?archive_immediately=true"

		headers = {
			"Content-Type": "application/json",
		}

		response = requests.request(
			"POST",
			url,
			auth=(username, password),
			headers=headers,
			json=json1
		)

		print(i,response.text)

In [None]:
post_import_random_tickets(100,attachments=False,tickets_per_json=100)

For ticket attachments, the time it takes to create 100 tickets is non-trivial, so I changed the number of tickets/request to a lower number of 20, and then threaded the process 10x to saturate the API requests limit. I was testing on a instance with the maximum requests/min of 2500, so if you have a smaller ZD instance you may want to reduce the number of threads.

In [None]:
# Create and start 10 threads
threads = []
for i in range(10):
    thread = threading.Thread(target=post_import_random_tickets, args=(50,True,20))
    threads.append(thread)
    thread.start()

# Wait for all threads to finish
for thread in threads:
    thread.join()

print("All threads have finished.")