# Game Categorization and Analysis in the Context of Generative AI

## Session Objective

The objective of this session is to understand how **generative AI** can be applied to analyze large volumes of content, such as games, and extract key insights. To do so, we will use the OpenAI APIs and other tools to automate and optimize game analysis on large platforms such as Roblox. We will learn to classify games and evaluate their content, giving parents a better understanding of the types of games their children are exposed to.

---

### What Is an API?

An **API** (Application Programming Interface) is a set of definitions and protocols that lets one piece of software communicate with another. In this session we will work with the APIs of tools such as OpenAI and Google Translate, as well as with databases. We will use these APIs to process large amounts of information efficiently and to analyze games.

**API examples:**
- **OpenAI API**: To obtain text embeddings that help us categorize and compare content.
- **Google Translate API**: To translate text when a game's language does not match the desired output language.
- **Roblox API**: To retrieve detailed information about games, such as genre, name, and description.
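
To make the embedding idea concrete, here is a minimal sketch of categorizing a game by comparing vectors with cosine similarity. The three-dimensional vectors and the category names are made-up stand-ins (real embeddings from an API have far more dimensions), and `closest_category` is a hypothetical helper, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings"; in practice these would come from an embeddings API.
CATEGORY_VECTORS = {
    "adventure": [0.9, 0.1, 0.0],
    "simulation": [0.1, 0.9, 0.1],
    "shooter": [0.0, 0.2, 0.9],
}


def closest_category(game_vector, category_vectors=CATEGORY_VECTORS):
    """Return the category whose vector is most similar to the game's vector."""
    return max(
        category_vectors,
        key=lambda name: cosine_similarity(game_vector, category_vectors[name]),
    )


print(closest_category([0.8, 0.3, 0.1]))  # adventure
```

The same comparison works unchanged on real embeddings, since cosine similarity only depends on the direction of the vectors, not on their length or dimension count.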

---

### Case Study: Gabu and Game Analysis on Roblox

**Case context**:
- **Problem**: Gabu, an online game monitoring company, wants to give parents information about the games their children play on Roblox, a platform with more than 40 million games.
- **Business objective**: Let parents know the content and category of their children's games, detecting possible risks such as inappropriate content, violence, or excessive social interaction.
- **Constraints**:
  - Classification must be fast and accurate.
  - Protect the privacy of user data.
  - Minimize the cost and processing time of obtaining results.

#### Specific Objectives
1. **Game Classification**: Use embeddings to categorize games according to their content.
2. **Content Risk Analysis**: Evaluate and tag games against criteria such as "Violent Content", "Social Interaction", and "Monetization".
3. **Game Summary Generation**: Provide a summary of each game that is easy to understand and relevant to parents.
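
One possible way to hold the three outputs listed above (category, risk labels, summary) in a single record is sketched below. The `GameReport` class, the `low`/`medium`/`high` scale, and the sample game are all assumptions made for illustration; the session does not prescribe a rating scale or a schema.

```python
from dataclasses import dataclass, field

# Hypothetical rating scale, ordered from least to most concerning.
RISK_LEVELS = ("low", "medium", "high")


@dataclass
class GameReport:
    name: str
    genre: str
    summary: str
    # Maps a criterion such as "Violent Content" to one of RISK_LEVELS.
    risks: dict = field(default_factory=dict)

    def flagged(self, threshold="high"):
        """Return the criteria rated at or above the threshold, sorted by name."""
        minimum = RISK_LEVELS.index(threshold)
        return sorted(
            criterion
            for criterion, level in self.risks.items()
            if RISK_LEVELS.index(level) >= minimum
        )


# Made-up sample data, only to show how the record is used.
report = GameReport(
    name="Example Game",
    genre="Adventure",
    summary="Players explore islands and fight bosses.",
    risks={"Violent Content": "medium", "Social Interaction": "high", "Monetization": "high"},
)
print(report.flagged())          # ['Monetization', 'Social Interaction']
print(report.flagged("medium"))  # all three criteria
```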

---

### Case Study: Gabu and Audio Analysis

**Case context**:
- **Problem**: Gabu, an online game monitoring company, wants to give parents information about their children's online sessions.
- **Business objective**: Give parents conclusions about the gaming sessions, in order to increase parents' use of the application.
- **Constraints**:
  - The analysis must run in batch and, at first, results must not be sent out automatically, so that they can be reviewed.
  - Protect the children's privacy.
  - Minimize the cost and processing time of obtaining results.
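
The batch constraint above could be sketched as follows: sessions are processed in fixed-size chunks and every result is held in a review state instead of being sent to parents. `make_batches`, `analyze_batch`, the `pending_review` status, and `fake_analyze` are hypothetical names chosen for this illustration; the real audio analysis step is not shown in this session.

```python
def make_batches(items, batch_size):
    """Split a list into consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def analyze_batch(session_ids, analyze):
    """Run analyze on each session and hold every result for manual review."""
    return [
        {"session_id": sid, "result": analyze(sid), "status": "pending_review"}
        for sid in session_ids
    ]


# Hypothetical stand-in for the real audio analysis step.
def fake_analyze(session_id):
    return f"summary for {session_id}"


sessions = ["s1", "s2", "s3", "s4", "s5"]
for batch in make_batches(sessions, batch_size=2):
    for record in analyze_batch(batch, fake_analyze):
        print(record["session_id"], record["status"])
```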

---

In [1]:
!pip install openai

Collecting openai
  Obtaining dependency information for openai from https://files.pythonhosted.org/packages/b5/49/020190de3190ec9532c387e90d70aa8185fb7df4282b5e8b79addeaa4d99/openai-1.54.1-py3-none-any.whl.metadata
  Downloading openai-1.54.1-py3-none-any.whl.metadata (24 kB)
Collecting distro<2,>=1.7.0 (from openai)
  Obtaining dependency information for distro<2,>=1.7.0 from https://files.pythonhosted.org/packages/12/b3/231ffd4ab1fc9d679809f356cebee130ac7daa00d6d6f3206dd4fd137e9e/distro-1.9.0-py3-none-any.whl.metadata
  Downloading distro-1.9.0-py3-none-any.whl.metadata (6.8 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Obtaining dependency information for httpx<1,>=0.23.0 from https://files.pythonhosted.org/packages/56/95/9377bcb415797e44274b51d46e3249eba641711cf3348050f76ee7b15ffc/httpx-0.27.2-py3-none-any.whl.metadata
  Downloading httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)
Collecting jiter<1,>=0.4.0 (from openai)
  Obtaining dependency information for jiter<1,>=0.4.0 from

In [14]:
from openai import OpenAI

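# With no arguments, the client reads the key from the OPENAI_API_KEY
# environment variable, so the secret never has to appear in the notebook.
client = OpenAI()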

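def generate_response(prompt, model="gpt-3.5-turbo", temperature=0.7, max_tokens=256):
    """Send a single-turn prompt and return the full ChatCompletion object."""
    try:
        response = client.chat.completions.create(
            messages=[
                {
                    "role": "user",
                    "content": prompt,
                }
            ],
            model=model,
            temperature=temperature,
            max_tokens=max_tokens,  # cap on the length of the reply
        )

        return response
    except Exception as e:
        # On failure the caller receives a string instead of a ChatCompletion.
        return f"Error: {e}"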

prompt = """
Rolbox issnfdasoidfhoajsdiofjoaisdf


What is Roblox
"""
response = generate_response(prompt)
print(response)

ChatCompletion(id='chatcmpl-AQOz2dtVC7jzTOn4r0Ee3iDsEZFmr', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Roblox is a popular online platform and game creation system that allows users to design their own games and play games created by other users. It was created by the company Roblox Corporation and was officially launched in 2006. Users can create their own virtual worlds using the Roblox Studio, a game development tool provided by the platform. The games on Roblox are created using the programming language Lua. Players can customize their avatar, purchase virtual items, and interact with other players in a social environment. Roblox has become a global phenomenon and has a large and active community of players.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None))], created=1730855928, model='gpt-3.5-turbo-0125', object='chat.completion', service_tier=None, system_fingerprint=None, usage=Completion
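
The printed object is the whole `ChatCompletion`; the assistant's text lives at `choices[0].message.content`. The sketch below pulls it out, using a `SimpleNamespace` stand-in with the same attribute layout so it runs offline. `reply_text` and `fake_response` are hypothetical names introduced here.

```python
from types import SimpleNamespace


def reply_text(response):
    """Return the assistant's text, or None if the call failed."""
    if isinstance(response, str):  # generate_response returns "Error: ..." on failure
        return None
    return response.choices[0].message.content


# Offline stand-in that mimics the attribute layout of a ChatCompletion.
fake_response = SimpleNamespace(
    choices=[
        SimpleNamespace(message=SimpleNamespace(content="Roblox is an online game platform."))
    ]
)

print(reply_text(fake_response))
print(reply_text("Error: invalid api key"))
```

With a real call, `reply_text(generate_response(prompt))` would return only the answer text rather than the full response object.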

In [None]:
import os
from dotenv import load_dotenv
import openai
import psycopg2
import json
from opensearchpy import OpenSearch
import logging
from datetime import datetime
import numpy as np

logger = logging.getLogger()
logger.setLevel(logging.INFO)

load_dotenv()

openai.api_key = os.getenv("OPENAI_API_KEY")
user, password, host = os.getenv('OPENSEARCH_USER'), os.getenv('OPENSEARCH_PASSWORD'), os.getenv('OPENSEARCH_HOST')
port = os.getenv('OPENSEARCH_PORT')
auth = (user, password)

client = OpenSearch(
    hosts=[{'host': host, 'port': port}],
    http_compress=True,
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    ssl_assert_hostname=False,
    ssl_show_warn=True,
    timeout=120
)


def cosine_similarity(vec1, vec2):
    dot_product = np.dot(vec1, vec2)
    norm_a = np.linalg.norm(vec1)
    norm_b = np.linalg.norm(vec2)
    similarity = dot_product / (norm_a * norm_b)
    return similarity


def mmr(docs, query_vector, lambda_param=0.5, k=5):
    """Pick up to k hits by Maximal Marginal Relevance.

    Each step takes the hit that maximizes
    lambda_param * sim(query, hit) - (1 - lambda_param) * max sim(hit, already selected).
    """
    candidates = list(docs)  # work on a copy so the caller's list is not mutated
    selected = []
    while candidates and len(selected) < k:
        next_selected = max(
            candidates,
            key=lambda x: lambda_param * cosine_similarity(query_vector, x['_source']['content_vector']) -
                          (1 - lambda_param) * max([cosine_similarity(x['_source']['content_vector'],
                                                                      y['_source']['content_vector']
                                                                      ) for y in selected] or [0])
        )
        selected.append(next_selected)
        candidates.remove(next_selected)
    return selected


def fetch_similar_docs(instruction_embedded_vector, roblox_universe_id, index_name="web_content"):
    query_body = {
        "_source": ["text_content", "title", "content_vector"],
        "query": {
            "bool": {
                "must": {
                    "script_score": {
                        "query": {
                            "bool": {
                                "filter": [
                                    {"term": {"category": "game"}},
                                    {"term": {"subcategory": roblox_universe_id}}
                                ]
                            }
                        },
                        "script": {
                            "source": "knn_score",
                            "lang": "knn",
                            "params": {
                                "field": "content_vector",
                                "query_value": instruction_embedded_vector,
                                "space_type": "cosinesimil"
                            }
                        }
                    }
                }
            }
        },
        "size": 30
    }

    response = client.search(index=index_name, body=query_body)['hits']['hits']
    diverse_docs = mmr(response, instruction_embedded_vector)

    formatted_output = ""
    for i, hit in enumerate(diverse_docs, start=1):
        title = hit['_source']['title']
        text_content = hit['_source']['text_content']
        formatted_output += f"Result {i} - {title} \n\"\"\"\n{text_content}\n\"\"\"\n\n"

    return formatted_output


def connect_to_postgres():
    dbhost = os.getenv("PG_DATABASE_HOST")
    dbname = os.getenv("PG_DATABASE_NAME")
    user = os.getenv("PG_USER")
    password = os.getenv("PG_PASSWORD")
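    try:
        db = psycopg2.connect(dbname=dbname, user=user, host=dbhost, password=password)
    except psycopg2.OperationalError as e:
        # Raising a plain string is a TypeError in Python 3; raise a real exception.
        raise ConnectionError("Unable to connect to the database") from e
    return db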


def save_to_postgres(db, roblox_universe_id, completion_type, json_data, metadata):
    with db.cursor() as cursor:
        try:
            metadata_json = json.dumps(metadata)
            if isinstance(json_data, list):
                for item in json_data:
                    item_json = json.dumps(item)
                    cursor.execute(
                        "INSERT INTO ai_game_completions(roblox_universe_id, completion_type, content, metadata) "
                        "VALUES (%s, %s, %s, %s)",
                        (roblox_universe_id, completion_type, item_json, metadata_json))
            else:
                json_string = json.dumps(json_data)
                cursor.execute(
                    "INSERT INTO ai_game_completions(roblox_universe_id, completion_type, content, metadata) "
                    "VALUES (%s, %s, %s, %s)",
                    (roblox_universe_id, completion_type, json_string, metadata_json))
            db.commit()
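        except Exception as error:
            db.rollback()  # discard the failed transaction so the connection stays usable
            logger.error(f"Failed to save {completion_type} for {roblox_universe_id}: {error}")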


def get_name_genre_and_description(db, roblox_universe_id):
    query = """
    SELECT name, genre, description 
    FROM roblox_universes
    WHERE roblox_universe_id = %s
    LIMIT 1
    """
    try:
        with db.cursor() as cursor:
            cursor.execute(query, (roblox_universe_id,))
            result = cursor.fetchone()
            if result:
                roblox_game, genre, description = result
                return roblox_game, genre, description
            else:
                logger.warning(f"No record found for roblox_universe_id: {roblox_universe_id}")
                return None, None, None
    except psycopg2.Error as e:
        # The `with` block above already closes the cursor, so no `finally` cleanup is needed.
        logger.error(f"Database error: {e}", exc_info=True)
        raise


def read_text_file(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read()


def get_embedding(text, model="text-embedding-ada-002"):
    text = text.replace("\n", " ")
    # openai>=1.0 removed openai.Embedding; use the module-level embeddings client.
    return openai.embeddings.create(input=[text], model=model).data[0].embedding


def model_completion(prompt, model="gpt-3.5-turbo"):
    # openai>=1.0 removed openai.ChatCompletion; responses are objects, not dicts.
    completion = openai.chat.completions.create(model=model,
                                                temperature=0,
                                                messages=prompt)

    return completion.choices[0].message.content


def get_model_metadata(function, model_name, version, endpoint, source):
    """Describe a model call so it can be stored alongside each completion."""
    return {
        "function": function.__name__,
        "model_name": model_name,
        "version": version,
        "endpoint": endpoint,
        "source": source
    }


def construct_prompt(instruction, roblox_universe_id, example_roblox_game, example_genre, example_description,
                     example_context, example_answer):
    instruction_embedded_vector = get_embedding(instruction)

    context = fetch_similar_docs(instruction_embedded_vector, roblox_universe_id)

    db = connect_to_postgres()
    roblox_game, genre, description = get_name_genre_and_description(db, roblox_universe_id)
    if db:
        db.close()

    example_message = {
        "role": "user",
        "content": (
            f"Name: {example_roblox_game}\n"
            f"Roblox Website Genre: {example_genre}\n"
            f"Description: \"\"\"\n{example_description}\n\"\"\"\n"
            f"Top Google Results: \"\"\"\n{example_context}\n\"\"\""
        )
    }

    example_answer_message = {
        "role": "assistant",
        "content": example_answer
    }

    actual_game_message = {
        "role": "user",
        "content": (
            f"Name: {roblox_game}\n"
            f"Roblox Website Genre: {genre}\n"
            f"Description: \"\"\"\n{description}\n\"\"\"\n"
            f"Top Google Results: \"\"\"\n{context}\n\"\"\""
        )
    }

    messages = [
        {"role": "system", "content": instruction},
        example_message,
        example_answer_message,
        actual_game_message
    ]

    return messages


def get_model_response(prompt):
    try:
        response = model_completion(prompt)
    except Exception as e:
        raise RuntimeError(f"Failed to get a response from the model: {e}")

    if not response:
        raise ValueError("Received an empty response from the model.")

    return response


def get_iterative_response(prompt, max_iterations=4):
    iteration = 0
    messages = list(prompt)  # copy so the caller's prompt is not extended in place

    while iteration < max_iterations:
        response = model_completion(messages)
        if "Description:" in response:
            description = response.split("Description:", 1)[1].strip()
            relevant_content = description
        else:
            relevant_content = response

        if len(relevant_content.replace("*", "")) <= 280 or iteration == (max_iterations - 1):
            return response
        system_msg = "Provide a brief answer (<280 chars) with the same format."
        user_msg = f"Shorten this: {response[:100]}... (truncated)"

        messages.append({"role": "system", "content": system_msg})
        messages.append({"role": "user", "content": user_msg})

        iteration += 1

    raise ValueError("Unable to get a satisfactory response after maximum iterations.")


def process_conversation_starters_response(conversation_starters_raw):
    try:
        conversation_starters_raw_split = conversation_starters_raw.strip().split('\n\n')
    except Exception as e:
        raise RuntimeError(f"Failed to process the raw response: {e}")

    response = []
    for conv_starter_raw in conversation_starters_raw_split:
        parts = conv_starter_raw.split(': ', 1)
        if len(parts) != 2:
            continue
        title, description = parts
        response.append({'title': title, 'description': description})

    return response


def conversation_starters(roblox_universe_id):
    instruction = read_text_file('instructions/conversation_starters_instruction.txt')

    example_roblox_game = "Blox Fruits"
    example_genre = "Adventure"
    example_description = read_text_file('examples/example_game_description.txt')
    example_context = read_text_file('examples/conversation_starters_example_context.txt')
    example_answer = read_text_file('examples/conversation_starters_example_answer.txt')

    try:
        prompt = construct_prompt(instruction, roblox_universe_id, example_roblox_game, example_genre,
                                  example_description, example_context, example_answer)
        conversation_starters_raw = get_model_response(prompt)
        response = process_conversation_starters_response(conversation_starters_raw)
        return response
    except (RuntimeError, ValueError) as e:
        print(f"An error occurred: {e}")
        return None


def splitting_string(string):
    lines = string.split('\n')

    json_obj = {}

    for line in lines:
        line = line.strip()
        if ": " in line.strip():
            key, value = [x.strip() for x in line.split(': ', 1)]
            json_obj[key.lower()] = value

    return json_obj


def splitting_reviews(response, title):
    json_obj = splitting_string(response)

    result_json = {
        "title": title,
        "rate": json_obj.get("rate", ""),
        "description": json_obj.get("description", "")
    }

    return result_json


def monetization(roblox_universe_id):
    instruction = read_text_file('instructions/monetization_instruction.txt')

    example_roblox_game = "Blox Fruits"
    example_genre = "Adventure"
    example_description = read_text_file('examples/example_game_description.txt')
    example_context = read_text_file('examples/monetization_example_context.txt')
    example_answer = read_text_file('examples/monetization_example_answer.txt')

    prompt = construct_prompt(instruction, roblox_universe_id, example_roblox_game, example_genre, example_description,
                              example_context, example_answer)
    try:
        response = get_iterative_response(prompt)
    except ValueError as e:
        # get_iterative_response raises ValueError when no acceptable answer is produced.
        print(f"Unexpected response for 'monetization' of {roblox_universe_id}: {e}")
        return None

    return splitting_reviews(response, "Monetization")


def violent_content(roblox_universe_id):
    instruction = read_text_file('instructions/violent_content_instruction.txt')

    example_roblox_game = "Blox Fruits"
    example_genre = "Adventure"
    example_description = read_text_file('examples/example_game_description.txt')
    example_context = read_text_file('examples/violent_content_example_context.txt')
    example_answer = read_text_file('examples/violent_content_example_answer.txt')

    prompt = construct_prompt(instruction, roblox_universe_id, example_roblox_game, example_genre, example_description,
                              example_context, example_answer)
    try:
        response = get_iterative_response(prompt)
    except ValueError as e:
        # get_iterative_response raises ValueError when no acceptable answer is produced.
        print(f"Unexpected response for 'violence_content_rate' of {roblox_universe_id}: {e}")
        return None

    return splitting_reviews(response, "Violent Content")


def social_interaction(roblox_universe_id):
    instruction = read_text_file('instructions/social_interaction_instruction.txt')

    example_roblox_game = "Blox Fruits"
    example_genre = "Adventure"
    example_description = read_text_file('examples/example_game_description.txt')
    example_context = read_text_file('examples/social_interaction_example_context.txt')
    example_answer = read_text_file('examples/social_interaction_example_answer.txt')

    prompt = construct_prompt(instruction, roblox_universe_id, example_roblox_game, example_genre, example_description,
                              example_context, example_answer)
    try:
        response = get_iterative_response(prompt)
    except ValueError as e:
        # get_iterative_response raises ValueError when no acceptable answer is produced.
        print(f"Unexpected response for 'social_interaction_rate' of {roblox_universe_id}: {e}")
        return None

    return splitting_reviews(response, "Social Interaction")


def parent_needs_to_know(roblox_universe_id):
    instruction = read_text_file('instructions/parent_needs_to_know_instruction.txt')

    example_roblox_game = "Blox Fruits"
    example_genre = "Adventure"
    example_description = read_text_file('examples/example_game_description.txt')
    example_context = read_text_file('examples/parents_need_to_know_example_context.txt')
    example_answer = read_text_file('examples/parents_need_to_know_example_answer.txt')

    prompt = construct_prompt(instruction, roblox_universe_id, example_roblox_game, example_genre, example_description,
                              example_context, example_answer)

    return {"description": get_iterative_response(prompt)}


def ai_genre(roblox_universe_id):
    instruction = read_text_file('instructions/ai_genre_instruction.txt')

    example_roblox_game = "Blox Fruits"
    example_genre = "Adventure"
    example_description = read_text_file('examples/example_game_description.txt')
    example_context = read_text_file('examples/genre_example_context.txt')
    example_answer = read_text_file('examples/genre_example_answer.txt')

    prompt = construct_prompt(instruction, roblox_universe_id, example_roblox_game, example_genre, example_description,
                              example_context, example_answer)
    response = get_iterative_response(prompt)
    json_obj = splitting_string(response)
    result_json = {
        "title": json_obj.get("genre", ""),
        "description": json_obj.get("description", "")
    }

    return result_json


def lambda_handler(event, context):
    try:
        metadata = {
            "script": {
                "timestamp": datetime.utcnow().isoformat() + "Z",
                "version": 'v1.1.1',
                "environment": os.environ.get('ENVIRONMENT', 'production')
            },
            "models": [
                get_model_metadata(get_embedding, "text-embedding-ada-002", "Not specified", "Not specified", "OpenAI"),
                get_model_metadata(model_completion, "gpt-3.5-turbo", "3.5",
                                   "https://api.openai.com/v1/chat/completions", "OpenAI")
            ]
        }
        message = event['Records'][0]['body']
        universe = json.loads(message)
        roblox_universe_id = universe.get('roblox_universe_id')
        roblox_universe_name = universe.get('name')
        print(f'Starting game {roblox_universe_name} {roblox_universe_id}')

        conversation_starters_list = conversation_starters(roblox_universe_id)
        monetization_json = monetization(roblox_universe_id)
        violent_content_json = violent_content(roblox_universe_id)
        social_interaction_json = social_interaction(roblox_universe_id)
        parent_needs_to_know_json = parent_needs_to_know(roblox_universe_id)
        ai_genre_json = ai_genre(roblox_universe_id)

        db = connect_to_postgres()
        save_to_postgres(db, roblox_universe_id, "conversation_starter", conversation_starters_list, metadata)
        save_to_postgres(db, roblox_universe_id, "review", monetization_json, metadata)
        save_to_postgres(db, roblox_universe_id, "review", violent_content_json, metadata)
        save_to_postgres(db, roblox_universe_id, "review", social_interaction_json, metadata)
        save_to_postgres(db, roblox_universe_id, "what_parents_need_to_know", parent_needs_to_know_json, metadata)
        save_to_postgres(db, roblox_universe_id, "genre", ai_genre_json, metadata)
        print(f'Finished game {roblox_universe_name} {roblox_universe_id}')
        if db:
            db.close()
    except Exception as e:
        print(f"An error occurred: {e}")
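
To see what the `mmr` re-ranking step in the cell above buys over plain similarity ranking, here is a small self-contained sketch on toy 2-D vectors. It uses the same scoring rule, but the dictionary-based signature, the `unit` helper, and the documents `a`, `b`, `c` are assumptions made for this illustration; they are not the OpenSearch hits the pipeline works with.

```python
import math


def unit(degrees):
    """Unit vector pointing at the given angle, used as a toy embedding."""
    radians = math.radians(degrees)
    return [math.cos(radians), math.sin(radians)]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def mmr(docs, query_vector, k=2, lambda_param=0.5):
    """Select k document names, trading relevance against redundancy."""
    candidates = dict(docs)  # copy so the input is left untouched
    selected = []
    while candidates and len(selected) < k:
        def score(name):
            relevance = cosine_similarity(query_vector, candidates[name])
            redundancy = max(
                (cosine_similarity(candidates[name], docs[s]) for s in selected),
                default=0.0,
            )
            return lambda_param * relevance - (1 - lambda_param) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        del candidates[best]
    return selected


# "a" and "b" are near-duplicates close to the query; "c" is less relevant but different.
docs = {"a": unit(10), "b": unit(15), "c": unit(-40)}
query = unit(0)

print(mmr(docs, query, lambda_param=1.0))  # ['a', 'b']: pure relevance keeps the duplicate
print(mmr(docs, query, lambda_param=0.5))  # ['a', 'c']: the penalty swaps in the diverse doc
```

With `lambda_param=1.0` the redundancy term vanishes and the ranking reduces to plain similarity. At the default of 0.5 used in the pipeline, a near-duplicate of an already selected document is penalized enough that a less similar but distinct document takes its place.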
