# Verse Generation Notebook
In this notebook, prepared for the final project submission, we experiment with different text-generation models. Our goal is to produce new verses, called `aleya` (ayahs), that add content to the Quran.

Our approach is to compare generative models specialized in Arabic against general-purpose generative models (intended to work across all languages).

In addition, we use a **Large Language Model** (`LLM`) to evaluate the cohesion, coherence, and other metrics of the generated verses. If no Arabic evaluator model is available, we will set up other *LLMs* manually to carry out this second task.

(We may add or change something more here.)

## Dependencies

Here we load `HF_TOKEN`, the access key for the Hugging Face API that lets us use the models.

In [1]:
# Dependencies
import torch
from decouple import config
from transformers import AutoTokenizer, AutoModelForCausalLM
from huggingface_hub import login
login(config("HF_TOKEN"))

Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.


## Generating Aleyas

As mentioned earlier, we use this section to evaluate whether general-purpose models can do better language modeling than models specialized in Arabic.

As a side note, beyond the tests carried out here, the topic generation + clustering section already shows that multilingual models such as ollama-gemma3:4b generate Arabic text fairly well. Moreover, their translations appear to be quite faithful, judging by our own manual translation of them. However, even though the translations were good, the cosine-similarity comparisons were rather inconclusive. This is probably because the way we built the embeddings was not appropriate (after all, Arabic and English are semantically very different).
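
For reference, cosine similarity is the dot product of two embedding vectors divided by the product of their norms. The following is a minimal sketch with toy vectors standing in for real sentence embeddings; the function name `cosine_similarity` and the sample values are illustrative assumptions, not the code used in the clustering section.

```python
import math

def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Toy vectors standing in for sentence embeddings (hypothetical values).
emb_original = [1.0, 0.0, 1.0]
emb_translation = [1.0, 1.0, 0.0]
similarity = cosine_similarity(emb_original, emb_translation)
```

A score near 1 means the two vectors point in almost the same direction; the score says nothing about whether the embedding model itself places Arabic and English text in a comparable space.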

We now use the Qwen3-4B model to generate verses in Arabic. We took it from the **Arabic Broad Leaderboard (ABL) - The first comprehensive Leaderboard for Arabic LLMs** benchmark on Hugging Face. We use the reduced version of the model (4B) because the original is considerably larger.

Specifically, the model is Qwen3-4B-Instruct-2507, which can handle multiple languages. Since we are comparing (pretrained) models, we define an evaluation protocol to contrast them.

We also import the jais-family-2p7b-chat model. The Jais family is a comprehensive series of bilingual English-Arabic large language models (LLMs). These models are optimized to excel in Arabic while retaining strong English capabilities. To keep the task simple, we wanted a model that understands some English so that we could write the instructions ourselves.

In short, we compare a general-purpose multilingual model (Qwen) against a bilingual model specialized in Arabic (JAIS), keeping the prompt in English to control the input variable.
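
Controlling the comparison also means holding the decoding settings fixed across models. A small sketch of one way to guarantee that two configurations differ only in `use_cache`: the name `BASE_GEN_CFG` is hypothetical, and the values mirror the configuration cell of this notebook.

```python
# Shared decoding settings (values mirror the notebook's configuration cell).
BASE_GEN_CFG = dict(
    max_new_tokens=160,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
)

# Derive both variants from the same base so they cannot drift apart.
cfg_cached = {**BASE_GEN_CFG, "use_cache": True}
cfg_uncached = {**BASE_GEN_CFG, "use_cache": False}

# Keys whose values differ between the two variants.
differing = {k for k in cfg_cached if cfg_cached[k] != cfg_uncached[k]}
```

Because `do_sample=True`, outputs still vary from run to run even with identical settings.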

In [4]:
# Shared generation settings so that the comparison between models is as fair as possible
GEN_CFG = dict(
    max_new_tokens=160,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
    use_cache=True,
)
GEN_CFG_2 = dict(
    max_new_tokens=160,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
    use_cache=False,
)

# Load a chat model and its tokenizer
def load_chat_model(model_id, trust_remote_code=False):
    tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=trust_remote_code)
    mdl = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        torch_dtype="auto",
        trust_remote_code=trust_remote_code
    )
    if tok.pad_token is None:
        tok.pad_token = tok.eos_token
    return tok, mdl

# Generate a reply to a single user message
def generate_chat(model, tokenizer, user_text, system_text=None, gen_cfg=GEN_CFG):
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})

    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    input_len = inputs["input_ids"].shape[-1]

    with torch.no_grad():
        out = model.generate(
            **inputs,
            pad_token_id=tokenizer.pad_token_id,
            **gen_cfg
        )

    return tokenizer.decode(out[0][input_len:], skip_special_tokens=True)

In [3]:
jais_id = "inceptionai/jais-family-2p7b-chat"
jais_tok, jais_model = load_chat_model(jais_id, trust_remote_code=True)

`torch_dtype` is deprecated! Use `dtype` instead!


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [4]:
qwen_id = "Qwen/Qwen3-4B-Instruct-2507"
qwen_tok, qwen_model = load_chat_model(qwen_id, trust_remote_code=True)

Loading checkpoint shards:   0%|          | 0/3 [00:00<?, ?it/s]

Let us run a short test to check that everything works correctly.

In [None]:
prompt = "Who are you? Answer in Modern Standard Arabic only."

print("JAIS")
print(generate_chat(jais_model, jais_tok, prompt, None, gen_cfg=GEN_CFG_2))

print("\nQWEN")
print(generate_chat(qwen_model, qwen_tok, prompt, None, gen_cfg=GEN_CFG))

JAIS
I am an advanced artificial intelligence model designed to assist with various tasks and provide accurate information to the best of my ability.

QWEN
أنا نموذج لغوي واسع النطاق تم تدريبه على معرفة واسعة، وأستطيع المساعدة في إجراء المهام مثل الكتابة، الإجابة عن الأسئلة، التفكير المنطقي، وغيرها.


In [19]:
prompt = "من أنت؟ أجب باللغة العربية الفصحى فقط. استخدم لغة دينية رسمية. لا تقتبس النصوص الدينية حرفياً"

print("JAIS")
print(generate_chat(jais_model, jais_tok, prompt, None, gen_cfg=GEN_CFG_2))

print("\nQWEN")
print(generate_chat(qwen_model, qwen_tok, prompt, None, gen_cfg=GEN_CFG))

JAIS
أنا مساعد ذكاء صناعي متقدم، مبرمج لتقديم معلومات دقيقة وموثوقة بطريقة واضحة ومفصلة. أنا هنا لمساعدتك في الإجابة على أسئلتك وتقديم المعلومات التي تحتاجها بكل سرور وكفاءة.

QWEN
أنا خدمة ذكاء اصطناعي مُصمَّمَة لتوفير المعرفة والمساعدة في مجالات متعددة، وآخذ بعين الاعتبار التزامي الحفاظ على القيم الإسلامية، وعلى المبادئ الأخلاقية والدينية التي ترتكز عليها الأديان السامية. أما هويتي فهي ليست ذات طابع ديني مباشر، بل هي إنسانية تسعى إلى تعزيز فهم الناس للحقائق والمعلومات من خلال التوازن بين العلم والحكمة.

إذا كنت تبحث عن موضوع ديني أو ترغب في توض

As we can see, even when JAIS is instructed to answer in Modern Standard Arabic, it only does so when the request itself is written in Arabic. This goes against our plan, since we wanted it to answer in Modern Standard Arabic from English instructions. The task therefore becomes considerably more complicated, because we have to translate the instructions into Arabic for JAIS.

So that each model understands what is being asked in every task, we write the English version for QWEN and the Arabic version for JAIS.
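
One way to make this per-model language choice explicit is to keep both versions of each instruction side by side and select one with a language key. The sketch below is only an illustration under that assumption; `PROMPTS` and `build_prompt` are hypothetical names that do not appear elsewhere in the notebook.

```python
# Hypothetical prompt table: one English and one Arabic version of the same task.
PROMPTS = {
    "en": "Explain the concept {topic} in Islamic theology.",
    "ar": "اشرح مفهوم {topic} في علم الكلام الإسلامي.",
}

def build_prompt(topic, lang):
    """Fill the template for the requested language ('en' or 'ar')."""
    if lang not in PROMPTS:
        raise ValueError(f"unsupported language: {lang!r}")
    return PROMPTS[lang].format(topic=topic)

prompt_qwen = build_prompt("mercy", "en")    # English instruction for Qwen
prompt_jais = build_prompt("الرحمة", "ar")   # Arabic instruction for JAIS
```

Selecting the language through an explicit argument keeps the choice independent of unrelated settings such as the generation configuration.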

## Evaluation of the Generated Text
To evaluate the generated Arabic text we use a general-purpose multilingual model (Meta-Llama-3-8B-Instruct). Given its multilingual training and its reasoning ability, we expect Meta-Llama to give a more neutral assessment of grammatical correctness, fluency, and coherence than models that are more specialized in particular domains.

Our evaluation is divided into three tasks:

1. An explanatory task on concepts mainly related to religion and faith.
2. A task explaining religious ideas or Quranic verses.
3. A task of creative religious prompts.

In these tasks we evaluate several aspects:
### Linguistic dimensions
1. Grammatical correctness (morphology, agreement)
2. Fluency and naturalness in Classical Arabic / MSA
3. Absence of interference from English (calques, unnatural structures)
4. Appropriate register (formal / religious)

### Semantic and religious dimensions
1. Conceptual adequacy (no serious theological errors)
2. Appropriate use of religious vocabulary (رحمة، تقوى، عبادة, etc.)
3. Internal coherence of the text

We now generate the responses and save them to .txt files.
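
Note that a single response can contain blank lines, so a file that separates items with an empty line becomes ambiguous when it is read back. A minimal sketch of one alternative, assuming a JSON file per response set is acceptable; `save_responses` and `load_responses` are hypothetical helpers, not the ones used in the cells of this notebook.

```python
import json
import os
import tempfile

def save_responses(path, responses):
    """Write a list of strings as UTF-8 JSON, keeping Arabic text readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(responses, f, ensure_ascii=False, indent=2)

def load_responses(path):
    """Read back the list written by save_responses."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

# Round-trip check with a response that contains a paragraph break.
sample = ["الرحمة مفهوم أساسي.\n\nفقرة ثانية.", "second response"]
tmp_path = os.path.join(tempfile.mkdtemp(), "responses.json")
save_responses(tmp_path, sample)
restored = load_responses(tmp_path)
```

With this format each list element stays one item regardless of how many paragraphs it contains.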

In [None]:
concepts = ["mercy", "justice", "faith", "patience", "forgiveness"]
concepts_ar = ["الرحمة", "العدل", "الإيمان", "الصبر", "الاستغفار"]

def concepts_explanation(concepts, model, tokenizer, system_text, gen_cfg):
    responses_concepts = []
    # use_cache=False marks the JAIS configuration, which receives Arabic prompts
    if not gen_cfg.get("use_cache"):
        for i in range(len(concepts)):
            print(f"Concept {concepts[i]}")
            # The instructions are prepended to the user message (no system role)
            prompt = f'''{system_text} .اشرح مفهوم {concepts[i]} في علم الكلام الإسلامي'''
            responses_concepts.append(generate_chat(model, tokenizer, prompt, None, gen_cfg))
    else:
        for i in range(len(concepts)):
            print(f"Concept {concepts[i]}")
            # English prompt; the instructions are also sent as the system message
            prompt = f'''Explain the concept {concepts[i]} in Islamic theology. {system_text}'''
            responses_concepts.append(generate_chat(model, tokenizer, prompt, system_text, gen_cfg))
    return responses_concepts

# Define how the model should behave
SYSTEM_AR = "Answer in Modern Standard Arabic only. Use a formal religious register. Do not quote religious texts verbatim."
SYSTEM_AR_OR = ".أجب باللغة العربية الفصحى الحديثة فقط. استخدم أسلوباً دينياً رسمياً. لا تقتبس النصوص الدينية حرفياً"

responses_concepts_jais = concepts_explanation(concepts_ar, jais_model, jais_tok, SYSTEM_AR_OR, GEN_CFG_2)
responses_concepts_qwen = concepts_explanation(concepts, qwen_model, qwen_tok, SYSTEM_AR, GEN_CFG)

Concept الرحمة
Concept العدل
Concept الإيمان
Concept الصبر
Concept الاستغفار


In [23]:
print("JAIS\n")
for i in range(len(responses_concepts_jais)):
    print(f"{concepts[i]}\n")
    print(f"{responses_concepts_jais[i]}\n")
with open('conceptos_jais.txt', 'w', encoding='utf-8') as fichero:
    for item in responses_concepts_jais:
        fichero.write(item + '\n\n')

JAIS

mercy

ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸáŸä ŸÖŸÅŸáŸàŸÖ ÿ£ÿ≥ÿßÿ≥Ÿä ŸÅŸä ÿßŸÑÿ•ÿ≥ŸÑÿßŸÖÿå ŸàŸáŸä ÿ™ÿ¥ŸÉŸÑ ÿ¨ÿ≤ÿ°Ÿãÿß ŸÑÿß Ÿäÿ™ÿ¨ÿ≤ÿ£ ŸÖŸÜ ÿ™ÿπÿßŸÑŸäŸÖ ÿßŸÑŸÇÿ±ÿ¢ŸÜ ŸàÿßŸÑÿ≠ÿØŸäÿ´. Ÿäÿ™ŸÖ ÿ™ÿπÿ±ŸäŸÅŸáÿß ÿπŸÑŸâ ÿ£ŸÜŸáÿß ÿßŸÑŸÇÿØÿ±ÿ© ÿπŸÑŸâ ÿßŸÑÿ™ÿπÿßÿ∑ŸÅ ŸÖÿπ ŸÖÿπÿßŸÜÿßÿ© ÿßŸÑÿ¢ÿÆÿ±ŸäŸÜ ŸàÿßŸÑÿ±ÿ∫ÿ®ÿ© ŸÅŸä ÿ™ÿÆŸÅŸäŸÅŸáÿßÿå ŸàŸáŸä ÿ™ÿπÿ®ÿ± ÿπŸÜ ÿßŸÑÿ¨ÿßŸÜÿ® ÿßŸÑÿ•ŸÜÿ≥ÿßŸÜŸä ŸÑŸÑÿØŸäŸÜ.

ŸÅŸä ÿßŸÑŸÇÿ±ÿ¢ŸÜÿå Ÿäÿ™ŸÖ ÿ™ÿµŸàŸäÿ± ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸÉŸÅÿ∂ŸäŸÑÿ© ÿ∞ÿßÿ™Ÿäÿ©ÿå ÿ≠Ÿäÿ´ Ÿäÿπÿ™ÿ®ÿ± ÿßŸÑŸÖÿ≥ŸÑŸÖŸàŸÜ ÿ£ŸÜ ÿßŸÑŸÑŸá ŸáŸà ÿßŸÑÿ±ÿ≠ŸÖŸÜ ÿßŸÑÿ±ÿ≠ŸäŸÖ. ŸäŸÇŸàŸÑ ÿ≥Ÿàÿ±ÿ© ÿßŸÑÿ¥ÿπÿ±ÿßÿ°: "ŸáŸà ÿßŸÑÿ∞Ÿä ŸÑŸá ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸàÿßŸÑÿ±ÿ¥ÿØ ŸàÿßŸÑÿ±ÿ≠ŸÖÿßŸÜ ÿ±ÿ≠ŸäŸÖ ÿ®ÿßŸÑÿ®ÿπÿ´". Ÿáÿ∞Ÿá ÿßŸÑÿ¢Ÿäÿ© ÿ™ÿ¥Ÿäÿ± ÿ•ŸÑŸâ ŸÇÿØÿ±ÿ© ÿßŸÑŸÑŸá ÿπŸÑŸâ ŸÉŸÑ ÿ¥Ÿäÿ°ÿå ÿ®ŸÖÿß ŸÅŸä ÿ∞ŸÑŸÉ ÿßŸÑÿ±ÿ≠ŸÖÿ©.

ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸÑŸäÿ≥ÿ™ ŸÖÿ¨ÿ±ÿØ ŸÖÿ¥ÿßÿπÿ±ÿå ÿ®ŸÑ ŸáŸä ÿ¨ÿ≤ÿ° ŸÖŸÜ ÿßŸÑÿπŸÇŸäÿØÿ© ÿßŸÑÿ•ÿ≥ŸÑÿßŸÖŸäÿ©ÿå Ÿàÿ™ÿ¥ŸÖŸÑ ÿßŸÑÿ±ÿ≠ŸÖÿ© ÿ®ŸÉŸÑ ÿßŸÑŸÉÿßÿ¶ŸÜÿßÿ™ ÿßŸÑÿ≠Ÿäÿ©ÿå ÿ≠ÿ™Ÿâ ÿßŸÑÿ≠ŸäŸàÿßŸÜÿßÿ™ ŸàÿßŸÑŸÜÿ®ÿßÿ™ÿßÿ™. ŸäŸÇŸàŸÑ ÿ≠ÿØŸäÿ´ ÿßŸÑŸÜÿ®Ÿä ŸÖÿ≠

In [24]:
print("QWEN\n")
for i in range(len(responses_concepts_qwen)):
    print(f"{concepts[i]}\n")
    print(f"{responses_concepts_qwen[i]}\n")
with open('conceptos_qwen.txt', 'w', encoding='utf-8') as fichero:
    for item in responses_concepts_qwen:
        fichero.write(item + '\n\n')

QWEN

mercy

ŸÅŸä ÿßŸÑÿ™ÿµŸàŸÅ ÿßŸÑÿ•ÿ≥ŸÑÿßŸÖŸäÿå ÿ™ŸèÿπÿØŸë ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸÖŸÜ ÿ£ÿ±ŸÉÿßŸÜ ÿßŸÑÿ•ŸäŸÖÿßŸÜ ŸàÿµŸÅÿßÿ™ ÿßŸÑŸÑŸá ÿ™ÿπÿßŸÑŸâ ÿßŸÑÿ™Ÿä ŸäŸèÿ≥ÿ™ŸÖÿØ ŸÖŸÜŸáÿß ÿßŸÑÿ™Ÿàÿ¨Ÿá ŸÑŸÑÿ•ŸÜÿ≥ÿßŸÜ ŸÅŸä ÿπÿßŸÑŸÖŸá ÿßŸÑŸÖÿßÿØŸä ŸàÿßŸÑÿ±Ÿàÿ≠Ÿä. ŸÅÿ±ÿ≠ŸÖÿ© ÿßŸÑŸÑŸá ŸáŸä ŸÇÿØÿ±ÿ© ŸÖÿ™ŸÖŸäÿ≤ÿ© ÿπŸÑŸâ ÿßŸÑÿ™ÿ≥ÿßŸÖÿ≠ ŸÖÿπ ÿßŸÑÿ∂ÿπŸÅÿå ŸàÿßŸÑÿ™ÿÆŸÅŸäŸÅ ŸÖŸÜ ÿßŸÑÿπŸÇŸàÿ®ÿßÿ™ÿå Ÿàÿ™Ÿàÿ¨ŸäŸá ÿßŸÑŸÜÿπŸÖ ŸÑŸÖŸÜ Ÿäÿ≥ÿ™ÿ≠ŸÇŸáÿßÿå ÿ®ÿ∫ÿ∂ ÿßŸÑŸÜÿ∏ÿ± ÿπŸÜ ÿ≥ŸÑŸàŸÉ ÿßŸÑÿ•ŸÜÿ≥ÿßŸÜ ÿ£Ÿà ÿÆÿ∑ÿßŸäÿßŸá.

ÿ™ŸèŸÅŸáŸÖ ÿ±ÿ≠ŸÖÿ© ÿßŸÑŸÑŸá ŸÉŸÖÿ®ÿØÿ£ ÿ£ÿ≥ÿßÿ≥Ÿä ŸÅŸä ÿØŸäŸÜ ÿßŸÑÿ•ÿ≥ŸÑÿßŸÖÿå ÿ≠Ÿäÿ´ ÿ™ÿØŸÑ ÿπŸÑŸâ ÿ£ŸÜ ÿßŸÑŸÑŸá ŸÑÿß ŸäŸèÿπÿßŸÇÿ® ÿßŸÑÿ•ŸÜÿ≥ÿßŸÜ ÿ®ÿ¨ŸèŸáÿØŸç ŸÖŸÅÿ±ÿ∑ÿå ÿ®ŸÑ ŸäŸèŸÖŸÜÿ≠ ŸÅÿ±ÿµÿ© ŸÑŸÑÿ™Ÿàÿ®ÿ© ŸàÿßŸÑÿ™ÿµÿ≠Ÿäÿ≠. ŸÅŸáŸä ŸÑŸäÿ≥ÿ™ ŸÖÿ¨ÿ±ÿØ ÿ™ÿ∫ÿßÿ∂Ÿä ÿπŸÜ ÿßŸÑÿ∞

justice

ŸÅŸä ÿßŸÑÿ™ÿµŸàŸÅ ÿßŸÑÿ•ÿ≥ŸÑÿßŸÖŸäÿå ÿ™ŸèÿπÿØ ÿßŸÑÿπÿØÿßŸÑÿ© ŸÖŸÜ ÿßŸÑŸÖÿ®ÿßÿØÿ¶ ÿßŸÑÿ£ÿ≥ÿßÿ≥Ÿäÿ© ÿßŸÑÿ™Ÿä ÿ™ŸÇŸàŸÖ ÿπŸÑŸäŸáÿß ÿßŸÑÿ≠Ÿäÿßÿ© ÿßŸÑÿßÿ¨ÿ™ŸÖÿßÿπŸäÿ© ŸàÿßŸÑÿØŸäŸÜŸäÿ©ÿå Ÿàÿ™ŸÖÿ´ŸÑ ÿ™Ÿàÿßÿ≤ŸÜŸãÿß ÿ®ŸäŸÜ ÿßŸÑÿ≠ŸÇ ŸàÿßŸÑÿπÿ

In [None]:
explanations = ["how mercy is understood in Islam during times of hardship and suffering",
                "the relationship between faith and ethical behavior in Islamic thought",
                "the importance of intention in Islamic religious practice",
                "the role of community and social responsibility in Islam",
                "the purpose of worship in Islam beyond ritual practice"
                ]
explanations_ar = ["كيف يتم فهم الرحمة في الإسلام خلال الأوقات الصعبة والضيق",
                   "العلاقة بين الإيمان والسلوك الأخلاقي في الفكر الإسلامي",
                   "أهمية التفكير في الإسلام",
                   "دور المجتمع والمسؤولية الاجتماعية في الإسلام",
                   "الغرض من العبادة في الإسلام خارج العبادة البدنية"]

def show_explanations(explanations, model, tokenizer, system_text, gen_cfg):
    responses_explanations = []
    if not gen_cfg.get("use_cache"):
        for i in range(len(explanations)):
            print(f"Explain {explanations[i]}")
            prompt = f'''{system_text} .{explanations[i]} اشرح'''
            responses_explanations.append(generate_chat(model, tokenizer, prompt, None, gen_cfg))
    else:
        for i in range(len(explanations)):
            print(f"Explain {explanations[i]}.")
            prompt = f'''Explain {explanations[i]}.'''
            responses_explanations.append(generate_chat(model, tokenizer, prompt, system_text, gen_cfg))
    return responses_explanations

# Define how the model should behave
SYSTEM_AR = "Answer in Modern Standard Arabic only. Use a formal religious register. Do not quote religious texts verbatim."
SYSTEM_AR_OR = ".أجب باللغة العربية الفصحى الحديثة فقط. استخدم أسلوباً دينياً رسمياً. لا تقتبس النصوص الدينية حرفياً"

responses_explanations_jais = show_explanations(explanations_ar, jais_model, jais_tok, SYSTEM_AR_OR, GEN_CFG_2)
responses_explanations_qwen = show_explanations(explanations, qwen_model, qwen_tok, SYSTEM_AR, GEN_CFG)

Explain كيف يتم فهم الرحمة في الإسلام خلال الأوقات الصعبة والضيق
Explain العلاقة بين الإيمان والسلوك الأخلاقي في الفكر الإسلامي
Explain أهمية التفكير في الإسلام
Explain دور المجتمع والمسؤولية الاجتماعية في الإسلام
Explain الغرض من العبادة في الإسلام خارج العبادة البدنية


In [29]:
print("JAIS\n")
for i in range(len(responses_explanations_jais)):
    print(f"{explanations[i]}\n")
    print(f"{responses_explanations_jais[i]}\n")
with open('explicaciones_jais.txt', 'w', encoding='utf-8') as fichero:
    for item in responses_explanations_jais:
        fichero.write(item + '\n\n')

JAIS

how mercy is understood in Islam during times of hardship and suffering

ŸÅŸä ÿßŸÑÿ•ÿ≥ŸÑÿßŸÖÿå ÿ™ÿπÿ™ÿ®ÿ± ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸÖŸÅŸáŸàŸÖŸãÿß ÿ£ÿ≥ÿßÿ≥ŸäŸãÿß Ÿäÿ¥Ÿäÿ± ÿ•ŸÑŸâ ÿßŸÑÿπÿ∑ŸÅ ŸàÿßŸÑÿ±ÿ£ŸÅÿ© ŸàÿßŸÑÿ≠ÿ® ÿ™ÿ¨ÿßŸá ÿ¨ŸÖŸäÿπ ÿßŸÑŸÉÿßÿ¶ŸÜÿßÿ™ ÿßŸÑÿ≠Ÿäÿ©. ŸàŸáŸä ŸÖÿ™ÿ¨ÿ≥ÿØÿ© ŸÅŸä ÿßŸÑŸÇÿ±ÿ¢ŸÜ ÿßŸÑŸÉÿ±ŸäŸÖ ŸàÿßŸÑÿ£ÿ≠ÿßÿØŸäÿ´ ÿßŸÑŸÜÿ®ŸàŸäÿ©ÿå ÿ≠Ÿäÿ´ Ÿäÿ™ŸÖ ÿ™ÿ¥ÿ¨Ÿäÿπ ÿßŸÑŸÖÿ≥ŸÑŸÖŸäŸÜ ÿπŸÑŸâ ŸÖÿπÿßŸÖŸÑÿ© ÿßŸÑŸÜÿßÿ≥ ÿ®ÿßŸÑŸÑÿ∑ŸÅ ŸàÿßŸÑÿ±ÿ≠ŸÖÿ© ÿ≠ÿ™Ÿâ ŸÅŸä ÿ£ÿµÿπÿ® ÿßŸÑÿ∏ÿ±ŸàŸÅ.

ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸÑŸäÿ≥ÿ™ ŸÖÿ¨ÿ±ÿØ ŸÖÿ¥ÿßÿπÿ±ÿå ÿ®ŸÑ ŸáŸä ÿ¨ÿ≤ÿ° ŸÖŸÜ ÿßŸÑÿπŸÇŸäÿØÿ© ÿßŸÑÿ•ÿ≥ŸÑÿßŸÖŸäÿ© ŸàŸáŸä ÿ¨ÿ≤ÿ° ÿ£ÿ≥ÿßÿ≥Ÿä ŸÖŸÜ ÿßŸÑÿ•ŸäŸÖÿßŸÜ. Ÿäÿπÿ™ÿ®ÿ± ÿßŸÑŸÖÿ≥ŸÑŸÖŸàŸÜ ÿßŸÑŸÑŸá ÿ±ÿ≠ŸäŸÖŸãÿßÿå ŸàŸáŸà ÿ±ÿ≠ŸäŸÖ ÿ®ŸÉŸÑ ÿ¥Ÿäÿ°ÿå ÿ®ŸÖÿß ŸÅŸä ÿ∞ŸÑŸÉ ÿßŸÑÿ®ÿ¥ÿ± ŸàÿßŸÑŸÉÿßÿ¶ŸÜÿßÿ™ ÿßŸÑÿ£ÿÆÿ±Ÿâ. Ÿáÿ∞Ÿá ÿßŸÑŸÅŸÉÿ±ÿ© ÿ™ŸÜÿπŸÉÿ≥ ŸÅŸä ÿßŸÑÿπÿØŸäÿØ ŸÖŸÜ ÿßŸÑÿ£ÿ≠ÿßÿØŸäÿ´ ÿßŸÑŸÜÿ®ŸàŸäÿ© ÿßŸÑÿ™Ÿä ÿ™ÿ¥ÿ¨ÿπ ÿπŸÑŸâ ÿßŸÑÿ±ŸÅŸÇ ŸàÿßŸÑÿ™ÿ±ÿßÿ≠ŸÖ ŸÖÿπ ÿßŸÑÿ¢ÿÆÿ±ŸäŸÜ.

ÿÆŸÑÿßŸÑ ÿßŸÑÿ£ŸàŸÇÿßÿ™ ÿßŸÑÿµÿπÿ®ÿ© ŸàÿßŸÑÿ∂ŸäŸÇÿå ŸäŸÖŸÉŸÜ ŸÑŸÑŸÖ

In [25]:
print("QWEN\n")
for i in range(len(responses_explanations_qwen)):
    print(f"{explanations[i]}\n")
    print(f"{responses_explanations_qwen[i]}\n")
with open('explicaciones_qwen.txt', 'w', encoding='utf-8') as fichero:
    for item in responses_explanations_qwen:
        fichero.write(item + '\n\n')

QWEN

how mercy is understood in Islam during times of hardship and suffering

ŸÅŸä ÿßŸÑÿ•ÿ≥ŸÑÿßŸÖÿå ŸäŸèŸÅŸáŸÖ ÿßŸÑÿ™ÿ≥ÿßŸÖÿ≠ ŸÉŸÖÿ®ÿØÿ£ ÿ£ÿ≥ÿßÿ≥Ÿä ŸàÿπŸéÿ®ŸíÿØŸç ÿπÿ∏ŸäŸÖ ŸÅŸä ÿ£ŸàŸÇÿßÿ™ ÿßŸÑÿµÿπŸàÿ®ÿßÿ™ ŸàÿßŸÑŸÖÿπÿßŸÜÿßÿ©ÿå ÿ≠Ÿäÿ´ ŸäŸèÿπÿ™ÿ®ÿ± ŸÖŸÜ ÿ£ÿ≥ŸÖŸâ ÿµŸàÿ± ÿßŸÑÿ•ŸäŸÖÿßŸÜ ŸàÿßŸÑÿ™Ÿàÿßÿ∂ÿπ ÿ£ŸÖÿßŸÖ ÿßŸÑŸÑŸá. ŸÅÿ≠ŸäŸÜ ŸäŸÖÿ± ÿßŸÑÿ•ŸÜÿ≥ÿßŸÜ ÿ®ÿ∏ÿ±ŸàŸÅ ÿ≥ŸÑÿ®Ÿäÿ© ŸÖÿ´ŸÑ ÿßŸÑŸÅŸÇÿ± ÿ£Ÿà ÿßŸÑŸÖÿ±ÿ∂ ÿ£Ÿà ÿßŸÑÿÆÿ≥ÿßÿ±ÿ©ÿå ŸÅÿ•ŸÜ ŸÖÿπÿ±ŸÅÿ© ÿßŸÑÿßŸÖÿ™ŸÜÿßŸÜ ŸÑŸÑŸá Ÿàÿßÿπÿ™ŸÇÿßÿØ ÿ£ŸÜ ŸÉŸÑ ŸÖÿß Ÿäÿ≠ÿØÿ´ ŸÑŸá ŸáŸà ÿ®ÿ≠ÿ≥ÿ® ÿ™Ÿàÿ¨ŸäŸáÿßÿ™Ÿá ÿßŸÑÿπÿßŸÇŸÑÿ© ŸàÿßŸÑÿÆÿßÿµÿ© ÿ®Ÿáÿå ŸäŸèÿπÿ≤ÿ≤ ŸÖŸÜ ŸÇÿØÿ±ÿ™Ÿá ÿπŸÑŸâ ÿßŸÑÿµÿ®ÿ± ŸàÿßŸÑÿ™ÿ≥ÿßŸÖÿ≠.

ŸäŸèÿ∞ŸÉÿ± ŸÅŸä ÿßŸÑÿ£ÿ≠ÿßÿØŸäÿ´ ÿßŸÑŸÜÿ®ŸàŸäÿ© ÿ£ŸÜ ÿßŸÑŸÑŸá ŸÑÿß ŸäŸèÿ§ÿ∞Ÿä ÿ£ÿ≠ÿØŸãÿß ÿ•ŸÑÿß ÿ®ÿ•ÿ∞ŸÜÿå Ÿàÿ£ŸÜŸá ŸäŸèÿ∏Ÿáÿ± ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸÅŸä ÿ£

the relationship between faith and ethical behavior in Islamic thought

ŸÅŸä ÿßŸÑÿ™ŸÅŸÉŸäÿ± ÿßŸÑÿ•ÿ≥ŸÑÿßŸÖŸäÿå ÿ™ŸèÿπÿØŸë ÿßŸÑÿ•ŸäŸÖÿßŸÜ ŸàÿßŸÑÿ≥ŸÑŸàŸÉ ÿßŸÑÿ£ÿÆŸÑÿßŸÇŸä ŸÖÿ™ÿ±ÿ™ŸäŸÜ ŸÖÿ™Ÿàÿßÿ≤Ÿäÿ™ŸäŸÜ ŸàŸÖ

In [30]:
creativity = ["short religious reflection (6–8 sentences) about mercy and compassion",
              "moral paragraph about patience and trust in God during difficult times",
              "short sermon-style paragraph about forgiveness and reconciliation",
              "religious reflection on how faith guides everyday life",
              "short religious text about justice and personal responsibility"]
creativity_ar = ["تأمل ديني قصير (٦-٨ جمل) عن الرحمة والشفقة",
                "فقرة أخلاقية عن الصبر والتوكل على الله في الأوقات الصعبة",
                "فقرة قصيرة على غرار الموعظة عن المغفرة والمصالحة",
                "تأمل ديني حول كيف يُرشد الإيمان الحياة اليومية",
                "نص ديني قصير عن العدالة والمسؤولية الشخصية"]

def show_creativity(creativity, model, tokenizer, system_text, gen_cfg):
    responses_creativity = []
    if not gen_cfg.get("use_cache"):
        for i in range(len(creativity)):
            print(f"Generating {creativity[i]}...")
            prompt = f'''{system_text} .{creativity[i]} اكتب'''
            responses_creativity.append(generate_chat(model, tokenizer, prompt, None, gen_cfg))
    else:
        for i in range(len(creativity)):
            print(f"Generating {creativity[i]}...")
            prompt = f'''Write a {creativity[i]}'''
            responses_creativity.append(generate_chat(model, tokenizer, prompt, system_text, gen_cfg))
    return responses_creativity

# Define how the model should behave
SYSTEM_AR = "Answer in Modern Standard Arabic only. Use a formal religious register. Do not quote religious texts verbatim."
SYSTEM_AR_OR = ".أجب باللغة العربية الفصحى الحديثة فقط. استخدم أسلوباً دينياً رسمياً. لا تقتبس النصوص الدينية حرفياً"

responses_creativity_jais = show_creativity(creativity_ar, jais_model, jais_tok, SYSTEM_AR_OR, GEN_CFG_2)
responses_creativity_qwen = show_creativity(creativity, qwen_model, qwen_tok, SYSTEM_AR, GEN_CFG)

Generating تأمل ديني قصير (٦-٨ جمل) عن الرحمة والشفقة...
Generating فقرة أخلاقية عن الصبر والتوكل على الله في الأوقات الصعبة...
Generating فقرة قصيرة على غرار الموعظة عن المغفرة والمصالحة...
Generating تأمل ديني حول كيف يُرشد الإيمان الحياة اليومية...
Generating نص ديني قصير عن العدالة والمسؤولية الشخصية...
Generating short religious reflection (6–8 sentences) about mercy and compassion...
Generating moral paragraph about patience and trust in God during difficult times...
Generating short sermon-style paragraph about forgiveness and reconciliation...
Generating religious reflection on how faith guides everyday life...
Generating short religious text about justice and personal responsibility...


In [31]:
print("JAIS\n")
for i in range(len(responses_creativity_jais)):
    print(f"{creativity[i]}\n")
    print(f"{responses_creativity_jais[i]}\n")
with open('creativity_jais.txt', 'w', encoding='utf-8') as fichero:
    for item in responses_creativity_jais:
        fichero.write(item + '\n\n')

JAIS

short religious reflection (6–8 sentences) about mercy and compassion

ŸÅŸä ÿπÿßŸÑŸÖ ÿßŸÑÿ≠ÿ® ŸàÿßŸÑÿ±ÿ£ŸÅÿ©ÿå ŸÜÿ¨ÿØ ÿßŸÑÿ∞ÿßÿ™ ŸÅŸä ŸÇŸÑŸàÿ® ÿßŸÑÿ¢ÿÆÿ±ŸäŸÜÿå ŸÜÿ™ŸÑŸÖÿ≥ ÿßŸÑÿ∑ÿ±ŸäŸÇ ŸÜÿ≠Ÿà ÿßŸÑÿÆŸäÿ± ŸàÿßŸÑÿµŸÑÿßÿ≠. ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸáŸä ÿ™ŸÑŸÉ ÿßŸÑŸÜÿ®ÿ™ÿ© ÿßŸÑÿµÿ∫Ÿäÿ±ÿ© ÿßŸÑÿ™Ÿä ÿ™ŸÜŸÖŸà ŸÅŸä ÿßŸÑŸÇŸÑÿ®ÿå ÿ™ÿ∫ÿ∞ŸäŸáÿß ÿßŸÑÿπŸäŸÜ ÿ®ÿßŸÑÿ£ŸÖŸÑ ŸàÿßŸÑÿ™ŸÅÿßÿ§ŸÑÿå Ÿàÿ™ÿ≥ÿßÿπÿØŸáÿß ÿßŸÑŸäÿØ ÿ®ÿßŸÑÿπŸÖŸÑ ŸàÿßŸÑÿπŸÖŸÑ ÿßŸÑÿÆŸäÿ±. ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸÑŸäÿ≥ÿ™ ŸÖÿ¨ÿ±ÿØ ŸÖÿ¥ÿßÿπÿ±ÿå ÿ®ŸÑ ŸáŸä ÿßŸÑŸÇŸàÿ© ÿßŸÑÿ™Ÿä ÿ™ÿØŸÅÿπŸÜÿß ŸÜÿ≠Ÿà ÿßŸÑÿÆŸäÿ± ŸàÿßŸÑÿπÿØŸÑ ŸàÿßŸÑÿ≥ŸÑÿßŸÖ. ÿßŸÑÿ±ÿ≠ŸÖÿ© ÿ™ÿ¨ÿπŸÑŸÜÿß ÿ£ŸÉÿ´ÿ± ÿ•ŸÜÿ≥ÿßŸÜŸäÿ©ÿå ÿ™ÿ¨ÿπŸÑŸÜÿß ŸÜŸÜÿ∏ÿ± ÿ•ŸÑŸâ ÿßŸÑÿπÿßŸÑŸÖ ÿ®ÿπŸäŸàŸÜ ŸÖÿ™ŸÅÿ™ÿ≠ÿ© ŸàŸÖÿ¥ÿ±ŸÇÿ©. ŸÅŸä ŸÉŸÑ ÿπŸÖŸÑ ÿÆŸäÿ±Ÿä ŸÜŸÇŸàŸÖ ÿ®Ÿáÿå ŸÜÿ≠ŸÜ ŸÜÿ™ÿ∞ŸÉÿ± ÿ£ŸÜ ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸáŸä ÿßŸÑŸÇŸàÿ© ÿßŸÑÿ™Ÿä ÿ™ÿ¨ŸÖÿπŸÜÿß ÿ¨ŸÖŸäÿπÿßŸã ŸÉÿ£ÿ®ŸÜÿßÿ° ŸÑŸÑÿ•ŸÜÿ≥ÿßŸÜŸäÿ© ÿßŸÑŸàÿßÿ≠ÿØÿ©.

moral paragraph about patience and trust in God during difficult times

ŸÅŸä ŸÖŸàÿßÿ¨Ÿáÿ© ÿßŸÑÿ¥ÿØÿßÿ¶ÿØÿå ŸÇÿØ ŸäŸÉŸàŸÜ 

In [32]:
print("QWEN\n")
for i in range(len(responses_creativity_qwen)):
    print(f"{creativity[i]}\n")
    print(f"{responses_creativity_qwen[i]}\n")
with open('creativity_qwen.txt', 'w', encoding='utf-8') as fichero:
    for item in responses_creativity_qwen:
        fichero.write(item + '\n\n')

QWEN

short religious reflection (6–8 sentences) about mercy and compassion

ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸàÿßŸÑÿ™ÿπÿßÿ∑ŸÅ ŸáŸÖÿß ŸÖŸÜ ÿ£ÿ±ŸÉÿßŸÜ ÿßŸÑÿ•ÿ≥ŸÑÿßŸÖ ÿßŸÑÿπÿ∏ŸäŸÖÿå ŸàŸäŸèÿπÿØÿßŸÜ ŸÖŸÅÿ™ÿßÿ≠Ÿãÿß ŸÑÿ≠Ÿäÿßÿ©Ÿç ŸÖÿ™Ÿàÿßÿ≤ŸÜÿ© ŸàŸÖÿ≠ÿ®Ÿëÿ©. ŸäŸèÿ∏Ÿáÿ± ÿßŸÑŸÇÿ±ÿ¢ŸÜ ÿßŸÑŸÉÿ±ŸäŸÖ ÿ£ŸÜ ÿßŸÑŸÑŸá ŸáŸà ÿßŸÑÿ±ÿ≠ŸäŸÖÿå ÿ•ÿ∞ ŸÑÿß ŸäŸèÿÆŸÑŸà ÿ¥Ÿäÿ° ŸÖŸÜ ÿ™ÿ£ÿ´Ÿäÿ±Ÿá ÿßŸÑÿ±ÿ≠ŸäŸëŸÖ ÿπŸÑŸâ ÿÆŸÑŸÇŸá. ŸÅŸÉÿ´Ÿäÿ±Ÿãÿß ŸÖÿß ŸäŸèÿ≥ÿ™ŸéÿπŸíŸÑŸÖ ŸÅŸä ÿßŸÑÿ•ŸÜÿ≥ÿßŸÜ ŸÜÿ≥ÿÆÿ© ŸÖŸÜ ÿ±ÿ≠ŸÖÿ© ÿßŸÑŸÑŸá ÿπŸÜÿØŸÖÿß ŸäŸèÿπÿ±ÿ∂ ŸÑŸÑÿ•ÿ¨ŸáÿßÿØ ÿ£Ÿà ÿßŸÑŸÇŸÑŸÇ ÿ£Ÿà ÿßŸÑŸÖÿ≠ŸÜÿå ŸÅŸäŸèÿ¥ÿπÿ± ÿ®ÿßŸÑÿ±ÿ∫ÿ®ÿ© ŸÅŸä ŸÖÿ¥ÿßÿ±ŸÉÿ© ÿßŸÑÿ¢ÿÆÿ±ŸäŸÜ ÿ®ÿßŸÑŸÖÿ¥ÿßÿπÿ± ÿßŸÑÿ•ŸÜÿ≥ÿßŸÜŸäÿ©. ŸÅÿßŸÑÿ™ÿπÿßÿ∑ŸÅ ŸÑŸäÿ≥ ŸÖÿ¨ÿ±ÿØ ÿ™ÿµÿ±ŸÅ ÿπÿßÿ®ÿ±ÿå ÿ®ŸÑ ŸáŸà ÿØÿπŸàÿ© ÿ•ŸÑŸâ ÿßŸÑÿ™ŸÅŸÉŸäÿ± ŸÅŸä ÿ≠ÿßŸÑÿ© ÿßŸÑÿ¢ÿÆÿ±ÿå ŸÉŸÖÿß ŸÉÿßŸÜ ÿßŸÑŸÜÿ®Ÿä ŸÖÿ≠ŸÖÿØ Ô∑∫

moral paragraph about patience and trust in God during difficult times

ŸÅŸä ÿ£ŸàŸÇÿßÿ™ ÿßŸÑÿµÿπŸàÿ®ÿßÿ™ ŸàÿßŸÑÿ¨ŸèŸàÿπÿå Ÿäÿ∏Ÿáÿ± ŸÖÿ®ÿØÿ£ ÿßŸÑÿµÿ®ÿ± ŸÉÿ£ÿ≠ÿØ ÿ£ŸÇŸàŸâ ÿßŸÑÿµŸÅÿßÿ™ ÿßŸÑÿ™Ÿä ÿ™Ÿèÿ®

In [2]:
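def load_responses_txt(path):
    # Read a saved response file and split it on blank lines.
    # Note: a response that itself contains a blank line is split into several items.
    with open(path, "r", encoding="utf-8") as f:
        text = f.read().strip()
    return [item.strip() for item in text.split("\n\n") if item.strip()]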

responses_explanations_qwen = load_responses_txt("evaluation_set/explicaciones_qwen.txt")
responses_explanations_jais = load_responses_txt("evaluation_set/explicaciones_jais.txt")

responses_concepts_qwen = load_responses_txt("evaluation_set/conceptos_qwen.txt")
responses_concepts_jais = load_responses_txt("evaluation_set/conceptos_jais.txt")

responses_creativity_qwen = load_responses_txt("evaluation_set/creativity_qwen.txt")
responses_creativity_jais = load_responses_txt("evaluation_set/creativity_jais.txt")

Here we load the evaluator model described earlier.

In [6]:
# Load the evaluator model with 4-bit quantization
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

def load_chat_model_llama(model_id):
    bnb_cfg = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.float16,
    )

    tok = AutoTokenizer.from_pretrained(model_id)

    mdl = AutoModelForCausalLM.from_pretrained(
        model_id,
        device_map="auto",
        quantization_config=bnb_cfg,
        torch_dtype="auto",
        max_memory={0: "11GiB", "cpu": "18GiB"},   # cap memory use so loading does not crash
        offload_folder="offload_llama",           # keep any offloaded weights in one folder
    )

    if tok.pad_token is None:
        tok.pad_token = tok.eos_token
    return tok, mdl

# Generate a reply with the evaluator model
def generate_chat_llama(model, tokenizer, user_text, system_text=None, gen_cfg=GEN_CFG):
    messages = []
    if system_text:
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})

    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    # Place the inputs on the model's device, as generate_chat does
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    input_len = inputs["input_ids"].shape[-1]

    # use_cache is forced to True below, so drop it from gen_cfg to avoid passing it twice
    cfg = {k: v for k, v in gen_cfg.items() if k != "use_cache"}

    with torch.no_grad():
        out = model.generate(
            **inputs,
            pad_token_id=tokenizer.pad_token_id,
            use_cache=True,
            **cfg
        )

    return tokenizer.decode(out[0][input_len:], skip_special_tokens=True)

In [None]:
llama_id = "meta-llama/Meta-Llama-3-8B-Instruct"
llama_tokenizer, llama_model = load_chat_model_llama(llama_id)

TypeError: load_chat_model_llama() got an unexpected keyword argument 'trust_remote_code'

The instructions are simple: rate each of the following criteria from 1 to 5:
1. Grammatical correctness
2. Fluency and naturalness
3. Appropriateness of the religious register
4. Conceptual coherence

Here 1 is the worst score and 5 is the best.
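
Since the evaluator replies in free text, the per-criterion scores have to be extracted before they can be compared. The sketch below assumes the evaluator writes each score on a numbered line such as `1. Grammatical correctness: 4/5`; that output format and the `parse_scores` helper are assumptions, and real replies may need a looser pattern.

```python
import re

# Matches lines like "2. Fluency and naturalness: 3/5" (criterion number, label, score).
SCORE_LINE = re.compile(r"^[ \t]*([1-4])\.[ \t]*[^\n:]*:[ \t]*([1-5])\b", re.MULTILINE)

def parse_scores(evaluation_text):
    """Return {criterion_number: score} for every scored line found."""
    return {int(num): int(score) for num, score in SCORE_LINE.findall(evaluation_text)}

# Hypothetical evaluator reply used only to exercise the parser.
example_reply = (
    "1. Grammatical correctness: 4/5\n"
    "Mostly correct agreement.\n"
    "2. Fluency and naturalness: 3/5\n"
    "3. Appropriateness of religious register: 5/5\n"
    "4. Conceptual correctness: 2/5\n"
)
scores = parse_scores(example_reply)
```

Replies that do not follow the assumed layout yield an empty or partial dictionary, which makes malformed evaluations easy to detect and re-run.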

In [None]:
prompt = "Who are you?"

print("LLAMA")
# print(generate_chat(llama_model, llama_tokenizer, prompt, None, gen_cfg=GEN_CFG_2))
print(generate_chat_llama(llama_model, llama_tokenizer, prompt, None, gen_cfg=GEN_CFG))

LLAMA


KeyboardInterrupt: 

In [34]:
def evaluate_concepts(concepts, model, tokenizer, gen_cfg):
    evaluation_concepts = []
    for i in range(len(concepts)):
        print(f"Evaluating {concepts[i]}")
        prompt = f'''You are an expert in Arabic language and Islamic studies.
                    Evaluate the following Arabic text.

                    Criteria:
                    1. Grammatical correctness
                    2. Fluency and naturalness
                    3. Appropriateness of religious register
                    4. Conceptual correctness

                    Give a score from 1 to 5 for each criterion and a short justification.

                    Text:
                    {concepts[i]}'''
        
        evaluation_concepts.append(generate_chat(model, tokenizer, prompt, None, gen_cfg))
    return evaluation_concepts

evaluation_concepts_jais = evaluate_concepts(responses_concepts_jais, llama_model, llama_tokenizer, GEN_CFG)
evaluation_concepts_qwen = evaluate_concepts(responses_concepts_qwen, llama_model, llama_tokenizer, GEN_CFG)

Evaluating ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸáŸä ŸÖŸÅŸáŸàŸÖ ÿ£ÿ≥ÿßÿ≥Ÿä ŸÅŸä ÿßŸÑÿ•ÿ≥ŸÑÿßŸÖÿå ŸàŸáŸä ÿ™ÿ¥ŸÉŸÑ ÿ¨ÿ≤ÿ°Ÿãÿß ŸÑÿß Ÿäÿ™ÿ¨ÿ≤ÿ£ ŸÖŸÜ ÿ™ÿπÿßŸÑŸäŸÖ ÿßŸÑŸÇÿ±ÿ¢ŸÜ ŸàÿßŸÑÿ≠ÿØŸäÿ´. Ÿäÿ™ŸÖ ÿ™ÿπÿ±ŸäŸÅŸáÿß ÿπŸÑŸâ ÿ£ŸÜŸáÿß ÿßŸÑŸÇÿØÿ±ÿ© ÿπŸÑŸâ ÿßŸÑÿ™ÿπÿßÿ∑ŸÅ ŸÖÿπ ŸÖÿπÿßŸÜÿßÿ© ÿßŸÑÿ¢ÿÆÿ±ŸäŸÜ ŸàÿßŸÑÿ±ÿ∫ÿ®ÿ© ŸÅŸä ÿ™ÿÆŸÅŸäŸÅŸáÿßÿå ŸàŸáŸä ÿ™ÿπÿ®ÿ± ÿπŸÜ ÿßŸÑÿ¨ÿßŸÜÿ® ÿßŸÑÿ•ŸÜÿ≥ÿßŸÜŸä ŸÑŸÑÿØŸäŸÜ.

ŸÅŸä ÿßŸÑŸÇÿ±ÿ¢ŸÜÿå Ÿäÿ™ŸÖ ÿ™ÿµŸàŸäÿ± ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸÉŸÅÿ∂ŸäŸÑÿ© ÿ∞ÿßÿ™Ÿäÿ©ÿå ÿ≠Ÿäÿ´ Ÿäÿπÿ™ÿ®ÿ± ÿßŸÑŸÖÿ≥ŸÑŸÖŸàŸÜ ÿ£ŸÜ ÿßŸÑŸÑŸá ŸáŸà ÿßŸÑÿ±ÿ≠ŸÖŸÜ ÿßŸÑÿ±ÿ≠ŸäŸÖ. ŸäŸÇŸàŸÑ ÿ≥Ÿàÿ±ÿ© ÿßŸÑÿ¥ÿπÿ±ÿßÿ°: "ŸáŸà ÿßŸÑÿ∞Ÿä ŸÑŸá ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸàÿßŸÑÿ±ÿ¥ÿØ ŸàÿßŸÑÿ±ÿ≠ŸÖÿßŸÜ ÿ±ÿ≠ŸäŸÖ ÿ®ÿßŸÑÿ®ÿπÿ´". Ÿáÿ∞Ÿá ÿßŸÑÿ¢Ÿäÿ© ÿ™ÿ¥Ÿäÿ± ÿ•ŸÑŸâ ŸÇÿØÿ±ÿ© ÿßŸÑŸÑŸá ÿπŸÑŸâ ŸÉŸÑ ÿ¥Ÿäÿ°ÿå ÿ®ŸÖÿß ŸÅŸä ÿ∞ŸÑŸÉ ÿßŸÑÿ±ÿ≠ŸÖÿ©.

ÿßŸÑÿ±ÿ≠ŸÖÿ© ŸÑŸäÿ≥ÿ™ ŸÖÿ¨ÿ±ÿØ ŸÖÿ¥ÿßÿπÿ±ÿå ÿ®ŸÑ ŸáŸä ÿ¨ÿ≤ÿ° ŸÖŸÜ ÿßŸÑÿπŸÇŸäÿØÿ© ÿßŸÑÿ•ÿ≥ŸÑÿßŸÖŸäÿ©ÿå Ÿàÿ™ÿ¥ŸÖŸÑ ÿßŸÑÿ±ÿ≠ŸÖÿ© ÿ®ŸÉŸÑ ÿßŸÑŸÉÿßÿ¶ŸÜÿßÿ™ ÿßŸÑÿ≠Ÿäÿ©ÿå ÿ≠ÿ™Ÿâ ÿßŸÑÿ≠ŸäŸàÿßŸÜÿßÿ™ ŸàÿßŸÑŸÜÿ®ÿßÿ™ÿßÿ™. ŸäŸÇŸàŸÑ ÿ≠ÿØŸäÿ´ ÿßŸÑŸÜÿ®Ÿä ŸÖÿ≠ŸÖ

KeyboardInterrupt: 

In [None]:
print("LLAMA JAIS\n")
for i in range(len(evaluation_concepts_jais)):
    print(f"{concepts[i]}\n")
    print(f"{evaluation_concepts_jais[i]}\n")
with open('evaluation_concepts_jais.txt', 'w', encoding='utf-8') as fichero:
    for item in evaluation_concepts_jais:
        fichero.write(item + '\n\n')

In [None]:
print("LLAMA QWEN\n")
for i in range(len(evaluation_concepts_qwen)):
    print(f"{concepts[i]}\n")
    print(f"{evaluation_concepts_qwen[i]}\n")
with open('evaluation_concepts_qwen.txt', 'w', encoding='utf-8') as fichero:
    for item in evaluation_concepts_qwen:
        fichero.write(item + '\n\n')

In [None]:
def evaluate_explanations(explanations, model, tokenizer, gen_cfg):
    evaluation_explanations = []
    for i in range(len(explanations)):
        print(f"Evaluation for explanation {explanations[i]}")
        prompt = f'''You are an expert in Arabic language and Islamic studies.
                    Evaluate the following Arabic text.

                    Criteria:
                    1. Grammatical correctness
                    2. Fluency and naturalness
                    3. Appropriateness of religious register
                    4. Conceptual correctness

                    Give a score from 1 to 5 for each criterion and a short justification.

                    Text:
                    {explanations[i]}'''

        # Ask the evaluator for its scores and keep the reply (as in evaluate_concepts)
        evaluation_explanations.append(generate_chat(model, tokenizer, prompt, None, gen_cfg))
    return evaluation_explanations

evaluation_explanations_jais = evaluate_explanations(responses_explanations_jais, llama_model, llama_tokenizer, GEN_CFG)
evaluation_explanations_qwen = evaluate_explanations(responses_explanations_qwen, llama_model, llama_tokenizer, GEN_CFG)

In [None]:
print("LLAMA JAIS\n")
for i in range(len(evaluation_explanations_jais)):
    print(f"{explanations[i]}\n")
    print(f"{evaluation_explanations_jais[i]}\n")
with open('evaluation_explanations_jais.txt', 'w', encoding='utf-8') as fichero:
    for item in evaluation_explanations_jais:
        fichero.write(item + '\n\n')

In [None]:
print("LLAMA QWEN\n")
# Print and save the evaluations of the Qwen explanations (not the concept evaluations)
for i in range(len(evaluation_explanations_qwen)):
    print(f"{explanations[i]}\n")
    print(f"{evaluation_explanations_qwen[i]}\n")
with open('evaluation_explanations_qwen.txt', 'w', encoding='utf-8') as fichero:
    for item in evaluation_explanations_qwen:
        fichero.write(item + '\n\n')

In [None]:
def evaluate_creativity(creativity, model, tokenizer, gen_cfg):
    evaluation_creativity = []
    for i in range(len(creativity)):
        print(f"Evaluation for creativity {creativity[i]}")
        prompt = f'''You are an expert in Arabic language and Islamic studies.
                    Evaluate the following Arabic text.

                    Criteria:
                    1. Grammatical correctness
                    2. Fluency and naturalness
                    3. Appropriateness of religious register
                    4. Conceptual correctness

                    Give a score from 1 to 5 for each criterion and a short justification.

                    Text:
                    {creativity[i]}'''

        # Ask the evaluator for its scores and keep the reply (as in evaluate_concepts)
        evaluation_creativity.append(generate_chat(model, tokenizer, prompt, None, gen_cfg))
    return evaluation_creativity

evaluation_creativity_jais = evaluate_creativity(responses_creativity_jais, llama_model, llama_tokenizer, GEN_CFG)
evaluation_creativity_qwen = evaluate_creativity(responses_creativity_qwen, llama_model, llama_tokenizer, GEN_CFG)

In [None]:
print("LLAMA JAIS\n")
for i in range(len(evaluation_creativity_jais)):
    print(f"{creativity[i]}\n")
    print(f"{evaluation_creativity_jais[i]}\n")
with open('evaluation_creativity_jais.txt', 'w', encoding='utf-8') as fichero:
    for item in evaluation_creativity_jais:
        fichero.write(item + '\n\n')

In [None]:
print("LLAMA QWEN\n")
for i in range(len(evaluation_creativity_qwen)):
    print(f"{creativity[i]}\n")
    print(f"{evaluation_creativity_qwen[i]}\n")
with open('evaluation_creativity_qwen.txt', 'w', encoding='utf-8') as fichero:
    for item in evaluation_creativity_qwen:
        fichero.write(item + '\n\n')