<a href="https://colab.research.google.com/github/dede0702/ai-axur-estagio-desafio/blob/main/Axur_Desafio_T%C3%A9cnico_Est%C3%A1gio_IA_(Multimodal).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Technical Assessment - Multimodal Script

This notebook scrapes an image from a web page, sends it for inference to the `microsoft-florence-2-large` model, and submits the model's response via an API.
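
Because the image is passed to the model inline, much of the pipeline comes down to converting between raw image bytes and a base64 `data:` URL. The following is a minimal sketch of that conversion, not the notebook's own code: the helper names `to_data_url` and `from_data_url` are hypothetical, and it assumes the data URL is base64-encoded.

```python
import base64
from typing import Tuple


def to_data_url(img_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Wrap raw image bytes in a base64 data URL."""
    encoded = base64.b64encode(img_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def from_data_url(data_url: str) -> Tuple[str, bytes]:
    """Split a base64 data URL into (MIME type, raw bytes)."""
    # Everything before the first comma is the header, the rest is the payload.
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URL")
    # The MIME type is the first ';'-separated field after the "data:" prefix.
    mime = header[len("data:"):].split(";")[0] or "text/plain"
    return mime, base64.b64decode(payload)


# Round trip on a few illustrative bytes (not a real image).
url = to_data_url(b"\x89PNG", "image/png")
mime, raw = from_data_url(url)
```

Carrying the MIME type alongside the bytes avoids hard-coding `image/jpeg` when the scraped image is in another format.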

In [None]:
# 📦 Install dependencies (if needed)
!pip install requests beautifulsoup4



In [None]:
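# 🔧 Imports and configuration
import requests
from bs4 import BeautifulSoup
import base64
import json

# NOTE: SCRAPING_URL, INFERENCIA_API, SUBMISSAO_API and HEADERS are used by the
# cells below but are not defined in this notebook as shown; define them here.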
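
The later cells rely on four settings (`SCRAPING_URL`, `INFERENCIA_API`, `SUBMISSAO_API`, `HEADERS`) whose values do not appear in the notebook. One possible way to supply them without hard-coding secrets is to read them from environment variables, as sketched below. The `load_config` helper, the `AXUR_API_TOKEN` variable name, and the bearer-token header are all assumptions made for illustration; the real endpoints and authentication scheme should come from the challenge instructions.

```python
import os


def load_config(env=None):
    """Build the notebook's four settings from environment-style variables."""
    env = os.environ if env is None else env
    required = ["SCRAPING_URL", "INFERENCIA_API", "SUBMISSAO_API", "AXUR_API_TOKEN"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing settings: {', '.join(missing)}")
    return {
        "SCRAPING_URL": env["SCRAPING_URL"],
        "INFERENCIA_API": env["INFERENCIA_API"],
        "SUBMISSAO_API": env["SUBMISSAO_API"],
        # Assumption: bearer-token auth with a JSON body; adjust to the real API.
        "HEADERS": {
            "Authorization": f"Bearer {env['AXUR_API_TOKEN']}",
            "Content-Type": "application/json",
        },
    }


# Illustrative values only; none of these URLs come from the notebook.
example_env = {
    "SCRAPING_URL": "https://example.com/page",
    "INFERENCIA_API": "https://example.com/infer",
    "SUBMISSAO_API": "https://example.com/submit",
    "AXUR_API_TOKEN": "example-token",
}
config = load_config(example_env)
```

In the notebook itself, the returned values could be unpacked into the four module-level names that the remaining cells expect.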



In [None]:
# -------------------------------
# 1. SCRAPE THE IMAGE
# -------------------------------
from urllib.parse import urljoin

print("🔍 Scraping the image...")

response = requests.get(SCRAPING_URL, timeout=30)
response.raise_for_status()  # fail early if the page could not be fetched
soup = BeautifulSoup(response.content, "html.parser")
img_tag = soup.find("img")  # first <img> element on the page

if not img_tag or not img_tag.get("src"):
    raise RuntimeError("❌ Image not found on the page!")

image_url = img_tag["src"]
if image_url.startswith("data:"):
    # The src is already a data URL: use it as-is and decode the base64
    # payload (everything after the first comma) to get the raw bytes.
    data_url = image_url
    print("✅ Image data URL found.")
    img_base64 = data_url.split(",", 1)[1]
    img_bytes = base64.b64decode(img_base64)
else:
    # The src is a regular URL: resolve relative paths against the site root.
    image_url = urljoin("https://intern.aiaxuropenings.com", image_url)
    print(f"✅ Image URL found: {image_url}")

    # -------------------------------
    # 2. DOWNLOAD AND CONVERT TO BASE64
    # -------------------------------
    print("📥 Downloading the image...")
    img_response = requests.get(image_url, timeout=30)
    if img_response.status_code != 200:
        raise RuntimeError(f"❌ Error downloading the image: {img_response.status_code}")

    img_bytes = img_response.content
    img_base64 = base64.b64encode(img_bytes).decode("utf-8")
    # Assumes the downloaded image is a JPEG.
    data_url = f"data:image/jpeg;base64,{img_base64}"

# Save a local copy of the image (the .jpg extension assumes a JPEG)
with open("imagem.jpg", "wb") as f:
    f.write(img_bytes)

# 🤖 3. Send the image for inference with a multimodal prompt
payload = {
    "model": "microsoft-florence-2-large",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "<DETAILED_CAPTION>"},
                {"type": "image_url", "image_url": {"url": data_url}}  # image passed inline as a data URL
            ]
        }
    ],
    "temperature": 0.7
}

# json=payload serializes the body and sets the JSON Content-Type header
response = requests.post(INFERENCIA_API, headers=HEADERS, json=payload, timeout=120)

if response.status_code != 200:
    raise RuntimeError(f"Error calling the model: {response.status_code}\n{response.text}")

inference_result = response.json()
print(json.dumps(inference_result, indent=2))

# 📤 4. Submit the received response for evaluation
submit_response = requests.post(SUBMISSAO_API, headers=HEADERS, json=inference_result)

if submit_response.status_code == 200:
    print("✅ Response submitted successfully!")
else:
    print(f"❌ Submission error: {submit_response.status_code}")
    print(submit_response.text)
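
The inference response recorded in this cell's output has a chat-completion-like shape, with the generated caption under `choices[0].message.content`. If only the caption text is needed, for example for logging, it could be pulled out as sketched here. The `extract_caption` helper and the `sample` dictionary are hypothetical and purely illustrative; the notebook itself submits the full JSON response unchanged.

```python
def extract_caption(result: dict) -> str:
    """Return the first choice's message text, or raise if the shape differs."""
    try:
        return result["choices"][0]["message"]["content"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError("unexpected inference response shape") from exc


# Made-up response with the same nesting as the recorded output.
sample = {
    "choices": [
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {"content": "A fruit display."},
        }
    ]
}
print(extract_caption(sample))  # prints: A fruit display.
```

The recorded output of the notebook cell follows.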

🔍 Scraping the image...
✅ Image data URL found.
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "This image depicts a vibrant fruit display in a supermarket. The arrangement is divided into three distinct sections, each showcasing a different color of apples:\n\n1. **Red Apples**: The left section features a large quantity of red apples, neatly organized on tiered shelves. These apples appear to be of a tart variety, possibly Granny Smith, characterized by their deep red color and crisp texture.\n\n2. **Green Apples**: Adjacent to the red apples, there are sections of green apples. These apples are less common, suggesting they may be Granny Smith or other varieties, again displaying a deep green color with a distinct flavor profile.\n\n3. **Mixed Apples**: At the center of the display is a mix of red and green apples. The mix creates an interesting pattern, with some red apples follow