Improve default system prompt for OpenAI and make it configurable
The default prompt now includes instructions to **never** translate HTML tags (e.g. `<strong>`) or Ruby I18n variables (e.g. `%{count}`).

This commit also makes the system prompt passed to OpenAI configurable directly from `i18n-tasks.yml`.
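
The prompt is rendered with `Kernel#format` and named references, which is why a literal `%{` has to be written as `%%{` inside it. A minimal sketch of that behaviour, using a shortened, hypothetical prompt rather than the actual default:

```ruby
# Hypothetical, shortened prompt used only for illustration;
# it is not the gem's actual default prompt.
prompt = 'Translate from the %{from} locale to the %{to} locale. ' \
         'Variables (starting with %%{ and ending with }) must not be changed.'

# Kernel#format substitutes %{from} and %{to} from the hash
# and turns the escaped %% into a single literal %.
rendered = format(prompt, from: 'en', to: 'fr')
puts rendered
```

A custom prompt that mentions something like `%{count}` without doubling the percent sign would presumably make `format` raise a `KeyError`, since only `from` and `to` are supplied.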
michaelbaudino authored and glebm committed Nov 20, 2023
1 parent 7985e14 commit 77e5372
Showing 2 changed files with 31 additions and 5 deletions.
24 changes: 19 additions & 5 deletions lib/i18n/tasks/translators/openai_translator.rb
@@ -1,11 +1,24 @@
# frozen_string_literal: true

require 'i18n/tasks/translators/base_translator'
require 'active_support/core_ext/string/filters'

module I18n::Tasks::Translators
class OpenAiTranslator < BaseTranslator
# max allowed texts per request
BATCH_SIZE = 50
DEFAULT_SYSTEM_PROMPT = <<~PROMPT.squish
You are a professional translator that translates content from the %{from} locale
to the %{to} locale in an i18n locale array.
The array has a structured format and contains multiple strings. Your task is to translate
each of these strings and create a new array with the translated strings.
HTML markups (enclosed in < and > characters) must not be changed under any circumstance.
Variables (starting with %%{ and ending with }) must not be changed under any circumstance.
Keep in mind the context of all the strings for a more accurate translation.
PROMPT

def initialize(*)
begin
@@ -54,6 +67,10 @@ def model
@model ||= @i18n_tasks.translation_config[:openai_model].presence || 'gpt-3.5-turbo'
end

def system_prompt
@system_prompt ||= @i18n_tasks.translation_config[:openai_system_prompt].presence || DEFAULT_SYSTEM_PROMPT
end

def translate_values(list, from:, to:)
results = []

@@ -66,14 +83,11 @@ def translate_values(list, from:, to:)
results.flatten
end

-def translate(values, from, to) # rubocop:disable Metrics/MethodLength
+def translate(values, from, to)
 messages = [
 {
 role: 'system',
-content: "You are a helpful assistant that translates content from the #{from} to #{to} locale in an i18n
-locale array. The array has a structured format and contains multiple strings. Your task is to translate
-each of these strings and create a new array with the translated strings. Keep in mind the context of all
-the strings for a more accurate translation.\n"
+content: format(system_prompt, from: from, to: to)
},
{
role: 'user',
12 changes: 12 additions & 0 deletions templates/config/i18n-tasks.yml
@@ -116,6 +116,18 @@ search:
# # OpenAI
# openai_api_key: "sk-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
# # openai_model: "gpt-3.5-turbo" # see https://platform.openai.com/docs/models
# # may contain `%{from}` and `%{to}`, which will be replaced by source and target locale codes, respectively (using `Kernel.format`)
# # openai_system_prompt: >-
# # You are a professional translator that translates content from the %{from} locale
# # to the %{to} locale in an i18n locale array.
# #
# # The array has a structured format and contains multiple strings. Your task is to translate
# # each of these strings and create a new array with the translated strings.
# #
# # HTML markups (enclosed in < and > characters) must not be changed under any circumstance.
# # Variables (starting with %%{ and ending with }) must not be changed under any circumstance.
# #
# # Keep in mind the context of all the strings for a more accurate translation.

## Do not consider these keys missing:
# ignore_missing:
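
For illustration, here is how a folded (`>-`) YAML value like the commented template above might turn into the final prompt. This is only a sketch: the flat config layout and the shortened prompt text are assumptions, and the YAML is parsed directly with Ruby's standard library rather than through i18n-tasks itself.

```ruby
require 'yaml'

# Hypothetical, simplified config; the real i18n-tasks.yml nests this
# option alongside the other translation settings.
config = YAML.safe_load(<<~CONFIG)
  openai_system_prompt: >-
    Translate from %{from}
    to %{to}.

    Keep %%{count} as is.
CONFIG

# With a folded scalar, single line breaks become spaces and a blank
# line becomes one newline; the trailing "-" drops the final newline.
prompt = config['openai_system_prompt']

# Same call shape as in the diff: named references for both locales.
rendered = format(prompt, from: 'en', to: 'fr')
puts rendered
```

The blank lines in the template therefore survive as paragraph breaks in a custom prompt, while the wrapped lines are joined with spaces.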