<a href="https://colab.research.google.com/github/Naomie25/Hackaton-Fashion-Description-Generator/blob/FINALE-VERSION/Finale_Fashion_Description_Generator_Hackathon.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

1. Define the Task & Pipeline Overview

Input (keyword or image) → Generation Model → Quality-Check Module → (Optional) Image Generator → Ethical Filter → Final Output
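
The flow above can be read as a chain of small functions, each of which may drop a candidate. The following is only an illustrative outline under stated assumptions: `run_stages` and the three `stub_*` helpers are hypothetical names that do not appear in this notebook, and the stubs stand in for the real models.

```python
from typing import Callable, List, Optional


def run_stages(
    keyword: str,
    generate: Callable[[str], List[str]],
    quality_check: Callable[[str], bool],
    ethical_filter: Callable[[str], bool],
    make_image: Optional[Callable[[str], str]] = None,
) -> List[dict]:
    """Chain the stages in the order given in the overview (hypothetical helper)."""
    results = []
    for text in generate(keyword):
        # Quality-check module: drop weak candidates early
        if not quality_check(text):
            continue
        # Optional image generator: skipped when no callable is supplied
        image = make_image(text) if make_image is not None else None
        # Ethical filter: last gate before the final output
        if not ethical_filter(text):
            continue
        results.append({"description": text, "image": image})
    return results


# Stand-in stages (assumptions for illustration, not the notebook's models)
def stub_generate(keyword: str) -> List[str]:
    return [f"A tailored {keyword} in soft cotton fabric.", "Too short."]


def stub_quality(text: str) -> bool:
    return len(text.split()) >= 5


def stub_ethics(text: str) -> bool:
    return "hate" not in text.lower().split()


final = run_stages("denim jacket", stub_generate, stub_quality, stub_ethics)
print(final)
```

Passing the stages as callables keeps the order of the pipeline in one place, so a stub can later be swapped for a real model without touching the loop.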

In [None]:
!pip install transformers torch sentencepiece
!pip install schedule
!pip install --upgrade datasets


In [None]:
# ============================
# Library installation (run once if needed)
# ============================
!pip install transformers torch sentencepiece
!pip install schedule

# ============================
# Imports
# ============================
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer, BartForConditionalGeneration, BartTokenizer
from transformers import pipeline, set_seed
import difflib
import re
import random

In [None]:
import torch
from transformers import (
    pipeline,
    set_seed,
    BartTokenizer,
    BartForConditionalGeneration
)
import difflib
import re

# ============================
# 1. General configuration
# ============================
device = torch.device("cpu")
print("✅ Device set to use", device)

# Text generator
generator = pipeline('text-generation', model='distilgpt2', device=-1)
set_seed(42)

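# Quality-summary model. Note: bart-base is a pretrained checkpoint that is
# not fine-tuned for summarization, so its output tends to echo the input.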
bart_model_name = "facebook/bart-base"
bart_tokenizer = BartTokenizer.from_pretrained(bart_model_name)
bart_model = BartForConditionalGeneration.from_pretrained(bart_model_name).to(device)

# Fashion keywords used for scoring
fashion_keywords = [
    "elegant", "stylish", "refined", "modern", "vintage", "casual",
    "minimalist", "chic", "versatile", "comfort", "premium", "crafted",
    "tailored", "cut", "fit", "fabric", "soft", "bold", "timeless"
]

# ============================
# 2. Description generation
# ============================
def generate_descriptions(keyword, num_variants=5):
    prompt = f"""*Item:* {keyword}\n*Description:*"""

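    outputs = generator(
        prompt,
        max_new_tokens=120,
        num_return_sequences=num_variants,
        do_sample=True,  # sampling makes temperature/top_p take effect
        temperature=0.75,
        top_p=0.9,
        no_repeat_ngram_size=2,
        # early_stopping only applies to beam search, so it is not passed here
        pad_token_id=generator.tokenizer.eos_token_id  # GPT-2 has no pad token
    )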

    results = []
    for output in outputs:
        gen_text = output["generated_text"]
        # Keep only the text generated after "*Description:*"
        description_start = gen_text.find("*Description:*") + len("*Description:*")
        description_text = gen_text[description_start:].strip()
        score = score_description(description_text, prompt)
        results.append((description_text, score))

    results = clean_descriptions(results)
    return results

# ============================
# 3. Quality summary (BART)
# ============================
def summarize_text(text):
    inputs = bart_tokenizer(text, return_tensors="pt", truncation=True, max_length=512).to(device)
    summary_ids = bart_model.generate(inputs["input_ids"], num_beams=4, max_length=30, early_stopping=True)
    summary = bart_tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# ============================
# 4. Ethical filter
# ============================
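def ethical_filter(text):
    """Return True when the text contains none of the blacklisted words."""
    blacklist = ["hate", "violence", "racism", "sexism", "terrorism"]
    # Match whole words only, so that "whatever" does not trigger "hate"
    words = set(re.findall(r"[a-z]+", text.lower()))
    return not any(bad_word in words for bad_word in blacklist)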

# ============================
# 5. Scoring and cleaning
# ============================
def has_repetitions(text, max_repeat=3):
    pattern = r'\b(\w+)( \1){' + str(max_repeat) + ',}\b'
    return re.search(pattern, text.lower()) is not None

def clean_descriptions(descriptions):
    filtered = []
    for desc, score in descriptions:
        if len(desc.split()) < 8:
            continue
        if has_repetitions(desc):
            continue
        filtered.append((desc, score))
    return filtered

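def score_description(desc, prompt):
    # Whitespace tokenization: a token with attached punctuation ("fabric,") does not match
    words = desc.lower().split()
    keyword_bonus = sum(word in words for word in fashion_keywords)
    # Length score saturates at 50 words
    length_score = min(len(words), 50) / 50
    similarity = difflib.SequenceMatcher(None, desc.lower(), prompt.lower()).ratio()
    # Originality bonus: higher when the text differs more from the prompt
    originality = 1 - similarity
    return length_score + 0.5 * keyword_bonus + originality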

# ============================
# 6. Main pipeline
# ============================
def run_pipeline(keyword, num_variants=5):
    print(f"\n--- Generation for: {keyword} ---")
    descriptions = generate_descriptions(keyword, num_variants)

    final_results = []
    for desc, score in descriptions:
        # Apply the ethical filter first to avoid summarizing rejected text
        if not ethical_filter(desc):
            print("❌ Rejected (ethical filter):", desc)
            continue
        summary = summarize_text(desc)
        final_results.append((desc, summary, score))

    for i, (desc, summary, score) in enumerate(final_results, 1):
        print(f"\n✅ Description {i} [Score: {score:.2f}]:\n{desc}")
        print(f"📝 Quality summary:\n{summary}")

    generate_image_placeholder()
    return final_results

# ============================
# 7. Image (placeholder)
# ============================
def generate_image_placeholder():
    print("🖼️ Image generation step (placeholder)")

# ============================
# 8. Pipeline documentation
# ============================
def document_pipeline():
    print("""
📌 AI pipeline - Fashion description generator
Steps:
1. Prompt → DistilGPT2 → Raw generation
2. Summary with BART → Quality check
3. Simple ethical filtering
4. Score = length + keywords + originality
5. Image (placeholder)
Usage: run_pipeline("keyword")
""")

# ============================
# 9. Example usage
# ============================
if __name__ == "__main__":
    keyword = "denim jacket"
    run_pipeline(keyword, num_variants=5)
    document_pipeline()


✅ Device set to use cpu





--- Generation for: denim jacket ---

✅ Description 1 [Score: 2.88]:
The denim jackets are hand made from cotton and cotton. The jacket is made with cotton fabric, which is woven to fit snugly into the back of the jacket.
Item Description:Item :* leather jacketItemType:&& fabricItemName:+* fabric ItemType :& cottonItemDescription:-* clothingItemSize:%& woolItemColor:0:5%* garmentItemWidth:2:1%+%%

- All items are made in the same color. All item types are produced in a different color, and will be rolled together.

Item type
📝 Quality summary:
The denim jackets are hand made from cotton and cotton. The jacket is made with cotton fabric, which is woven to fit snugly into

✅ Description 2 [Score: 1.92]:
The first garment is a denim sweatshirt with a very lightweight material with the top section of the jacket. The garment features the color of a wool shirt, a pair of leather pants, and a large gold band.
It features a full-length black band with two gold bands. It features leather socks