<a href="https://colab.research.google.com/github/deep-diver/auto-data-fountain/blob/main/notebooks/pilot_derivation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install google-generativeai

In [2]:
GEMINI_API_KEY="..."

In [13]:
import json

def find_json_code_snippet(raw_code_snippet):
	json_parsed_string = None

	json_start_index = raw_code_snippet.find('{')
	json_end_index = raw_code_snippet.rfind('}')

	if json_start_index >= 0 and json_end_index >= 0:
		json_code_snippet = raw_code_snippet[json_start_index:json_end_index+1]
		try:
			json_parsed_string = json.loads(json_code_snippet, strict=False)
		except:
			raise ValueError('failed to parse string into JSON format')
	else:
		raise ValueError('No JSON code snippet found in string.')

	return json_parsed_string

def parse_first_json_code_snippet(code_snippet):
	json_parsed_string = None

	if isinstance(code_snippet, list):
		for code_snippet_piece in code_snippet:
			try:
				json_parsed_string = find_json_code_snippet(code_snippet_piece)
				return json_parsed_string
			except:
				pass
	else:
		try:
			json_parsed_string = find_json_code_snippet(code_snippet)
		except Exception as e:
			print(e)
			raise ValueError()

	return json_parsed_string

In [45]:
def determine_model_name(given_image=None):
  if given_image is None:
    return "gemini-pro"
  else:
    return "gemini-pro-vision"

def construct_image_part(given_image):
  return {
    "mime_type": "image/jpeg",
    "data": given_image
  }

def call_gemini(prompt="", API_KEY=None, given_text=None, given_image=None, generation_config=None, safety_settings=None):
  import google.generativeai as genai
  genai.configure(api_key=API_KEY)

  if generation_config is None:
    generation_config = {
      "temperature": 0.9,
      "top_p": 1,
      "top_k": 32,
      "max_output_tokens": 8192,
    }

  if safety_settings is None:
    safety_settings = [
      {
        "category": "HARM_CATEGORY_HARASSMENT",
        "threshold": "BLOCK_ONLY_HIGH"
      },
      {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "threshold": "BLOCK_ONLY_HIGH"
      },
      {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "threshold": "BLOCK_ONLY_HIGH"
      },
      {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "threshold": "BLOCK_ONLY_HIGH"
      },
    ]

  model_name = determine_model_name(given_image)
  model = genai.GenerativeModel(model_name=model_name,
                                generation_config=generation_config,
                                safety_settings=safety_settings)

  USER_PROMPT = prompt
  if given_text is not None:
    USER_PROMPT = f"""
------------------------------------------------
{given_text}
"""
  prompt_parts = [USER_PROMPT]
  if given_image is not None:
    prompt_parts.append(construct_image_part(given_image))

  response = model.generate_content(prompt_parts)
  return response.text

In [46]:
initial_prompt = """
The below erDiagram describes the basic setup of a certain scene.

Generate possible conversations between a user and an assistant.
The conversations should sound natural and logical.
The conversations should be occured without exposuring the underlying information of the erDiagram.

The user should play the role of "COUNSELEE" appeared in the erDiagram.
The user should start conversations focused on its role, "COUNSELEE".
The assistant should play the role of "COUNSELOR" appeared in the erDiagram.
The assistant should start conversations focused on its role, "COUNSELOR".
Based on the words that the user say, the assistant gives appropriate, detailed, and long answers.

The generated conversations are recorded in a valid JSON as
{"conversations":[{"user": text, "assistant": text},...]}.
------------------------------
erDiagram
    COUNSELOR ||--|{ COUNSELEE : "provides counseling to"

    %% Comments for relationship attributes
    %% Start date: 2024-02-14
    %% Frequency: Weekly
    %% Topic: marriage guidance
"""

In [47]:
derived_prompt = """
The below erDiagram describes the basic setup of a certain scene.

Here is the first few conversations
%s

Generate possible conversations between a user and an assistant after the first few conversations.
The conversations should sound natural and logical.
The conversations should be occured without exposuring the underlying information of the erDiagram.

The user should play the role of "COUNSELEE" appeared in the erDiagram.
The user should start conversations focused on its role, "COUNSELEE".
The assistant should play the role of "COUNSELOR" appeared in the erDiagram.
The assistant should start conversations focused on its role, "COUNSELOR".
Based on the words that the user say, the assistant gives appropriate, detailed, and long answers.

The generated conversations are recorded in a valid JSON as
{"conversations":[{"user": text, "assistant": text},...]}.
------------------------------
erDiagram
    COUNSELOR ||--|{ COUNSELEE : "provides counseling to"

    %% Comments for relationship attributes
    %% Start date: 2024-02-14
    %% Frequency: Weekly
    %% Topic: marriage guidance
"""

In [56]:
retry_num = 4
test_json = None
test = None

In [57]:
cur_retry = 0

while test_json is None and cur_retry <= retry_num:
  try:
    test_json = call_gemini(
      prompt=initial_prompt,
      API_KEY=GEMINI_API_KEY
    )
  except ValueError as e:
    print(e)
    continue

  try:
    test = parse_first_json_code_snippet(test_json)
  except:
    continue

  cur_retry = cur_retry + 1

In [58]:
test

{'conversations': [{'user': 'As a client, what kind of assistance do I get from you?',
   'assistant': 'As your counselor, I provide you with marriage guidance and counseling services. This includes helping you and your spouse identify the issues that are causing problems in your marriage, developing strategies for resolving those issues, and providing you with the tools and support you need to implement those strategies.'},
  {'user': 'What are the benefits of seeking guidance from you?',
   'assistant': "There are many benefits to seeking guidance from a counselor, including: \n\n* Improved communication and understanding between you and your spouse\n* Increased empathy and compassion for each other's needs and perspectives\n* Enhanced conflict resolution skills\n* Improved ability to manage difficult emotions and behaviors\n* Greater resilience in the face of challenges\n* A stronger and more fulfilling marriage"},
  {'user': 'What are some of the things we will be doing in our sess

In [80]:
len(test['conversations'])

5

In [85]:
d_factor = 3
retry_num = 4

results = []
base_conversation = {'conversations': []}

for conversation in test['conversations']:
  base_conversation['conversations'].append(conversation)

  for _ in range(d_factor):
    generated_conversation = None
    generated_conversation_json = None

    base_prompt = derived_prompt % json.dumps(base_conversation)

    cur_retry = 0

    while generated_conversation_json is None and \
          generated_conversation is None and \
          cur_retry <= retry_num:
      try:
        generated_conversation_json = call_gemini(
          prompt=initial_prompt,
          API_KEY=GEMINI_API_KEY
        )

        generated_conversation = parse_first_json_code_snippet(generated_conversation_json)
      except:
        cur_retry = cur_retry + 1
        print(f"RETRY... {cur_retry}")
        continue

    if generated_conversation is not None:
      base_conversation_copy = base_conversation.copy()
      base_conversation_copy['conversations'] = base_conversation_copy['conversations'] + generated_conversation['conversations']
      results.append(base_conversation_copy)


RETRY... 1
RETRY... 1
RETRY... 2
RETRY... 1
failed to parse string into JSON format
RETRY... 2
RETRY... 1
RETRY... 1
RETRY... 2
RETRY... 1
RETRY... 2
RETRY... 3
failed to parse string into JSON format
RETRY... 4
RETRY... 1
RETRY... 1


In [86]:
len(results)

13

In [87]:
results[0]

{'conversations': [{'user': 'As a client, what kind of assistance do I get from you?',
   'assistant': 'As your counselor, I provide you with marriage guidance and counseling services. This includes helping you and your spouse identify the issues that are causing problems in your marriage, developing strategies for resolving those issues, and providing you with the tools and support you need to implement those strategies.'},
  {'user': "I'm in a difficult relationship, and I think I need counseling. Can you help me?",
   'assistant': 'Of course, I can. I can provide you with a list of licensed counselors in your area who specialize in relationship guidance. Would you like me to do that?'},
  {'user': 'Yes, that would be great. Thank you.',
   'assistant': "You're welcome. Let me just gather some information from you so that I can provide you with the best possible list of counselors. What is your ZIP code?"},
  {'user': 'My ZIP code is 92109.',
   'assistant': 'What is your current rel

In [84]:
results[1]

{'conversations': [{'user': 'As a client, what kind of assistance do I get from you?',
   'assistant': 'As your counselor, I provide you with marriage guidance and counseling services. This includes helping you and your spouse identify the issues that are causing problems in your marriage, developing strategies for resolving those issues, and providing you with the tools and support you need to implement those strategies.'},
  {'user': 'What are the benefits of seeking guidance from you?',
   'assistant': "There are many benefits to seeking guidance from a counselor, including: \n\n* Improved communication and understanding between you and your spouse\n* Increased empathy and compassion for each other's needs and perspectives\n* Enhanced conflict resolution skills\n* Improved ability to manage difficult emotions and behaviors\n* Greater resilience in the face of challenges\n* A stronger and more fulfilling marriage"},
  {'user': "I'm feeling overwhelmed by my marriage and I need some g