# Auto Portrait Collector

Collecting celebrity portraits by hand is tedious. This notebook tries to delegate the task to an agent built on [browser-use](https://github.com/browser-use/browser-use).

## Dependencies

In [None]:
!wget -qO- https://astral.sh/uv/install.sh | sh

downloading uv 0.7.0 x86_64-unknown-linux-gnu
no checksums to verify
installing to /usr/local/bin
  uv
  uvx
everything's installed!


In [None]:
!uv venv

Using CPython 3.11.12 interpreter at: /usr/bin/python3
Creating virtual environment at: .venv
Activate with: source .venv/bin/activate


In [None]:
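# Note: each `!` command runs in its own subshell, so this activation does not persist into later cells.
!source .venv/bin/activate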

In [None]:
!uv pip install browser-use


In [None]:
!uv run playwright install --with-deps

Installing dependencies...
Hit:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease
Hit:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Hit:3 http://security.ubuntu.com/ubuntu jammy-security InRelease
Hit:4 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:5 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:6 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Hit:7 https://r2u.stat.illinois.edu/ubuntu jammy InRelease
Hit:8 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
Hit:9 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease
Get:10 https://dl.google.com/linux/chrome/deb stable InRelease [1,825 B]
Hit:11 https://ppa.launchpadcontent.net/ubuntugis/ppa/ubuntu jammy InRelease
Get:12 https://dl.google.com/linux/chrome/deb stable/main amd64 Packages [1,213 B]
Fetched 3,038 B in 2s (1,685 B/s)
Reading package lists... Done
W: Skipping acquire of configured file 'ma

## Agent-based image collection

In [None]:
from typing import List
from langchain_openai import ChatOpenAI
from browser_use import Agent, Browser, BrowserConfig, Controller
from browser_use.browser.context import BrowserContextConfig, BrowserContext
from pydantic import BaseModel

from google.colab import userdata
from dotenv import load_dotenv
import asyncio
import os

In [None]:
openai_api_key = userdata.get('OPENAI_API_KEY')
if openai_api_key:
    os.environ['OPENAI_API_KEY'] = openai_api_key
load_dotenv()

False

In [None]:
llm = ChatOpenAI(model="gpt-4o", temperature=0)

We test if the OpenAI client is operational.

In [None]:
messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg

AIMessage(content="J'adore la programmation.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 31, 'total_tokens': 37, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_90122d973c', 'id': 'chatcmpl-BRnt6cpcOsOpKc0NQ03WZNfbl79SB', 'finish_reason': 'stop', 'logprobs': None}, id='run-4a93d641-c327-4143-9188-ed4280da93b7-0', usage_metadata={'input_tokens': 31, 'output_tokens': 6, 'total_tokens': 37, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

We define the agent to collect target portraits.

In [None]:
browser = Browser(
    config = BrowserConfig(
        headless=True
    )
)
config = BrowserContextConfig(
    allowed_domains=['pinterest.com', 'duckduckgo.com', 'presearch.com'],
)
context = BrowserContext(browser=browser, config=config)

In [None]:
class Image(BaseModel):
    description: str
    url: str

class Images(BaseModel):
    images: List[Image]

controller = Controller(output_model=Images)

In [None]:
async def get_image_urls(target: str, location: str) -> Images | None:
  # Describe the task in plain language; the agent decides how to navigate.
  task = (
    f'Go to {location} and find at least three good portrait images of {target}. '
    'Look for images of the person smiling. For good images, get their HTML src attribute. '
    "Don't click on images or follow their links; just get the src attribute of the thumbnail."
  )

  agent = Agent(
    task=task,
    llm=llm,
    controller=controller,
    browser_context=context,
  )
  history = await agent.run()

  # No final result means the agent did not produce an answer.
  result = history.final_result()
  if not result:
    return None
  return Images.model_validate_json(result)

In [None]:
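async def collect_images(target: str, location: str):
  print(f'collecting images of {target} at {location}')
  images = await get_image_urls(target, location)
  if images is None:
    # get_image_urls returns None when the agent produced no result.
    print('no images found')
    return
  for image in images.images:
    print(f'{image.url}\t{image.description}')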

In [None]:
pinterest = 'https://www.pinterest.com/ideas/'
presearch = 'https://presearch.com/'

await collect_images('Angourie Rice', pinterest)
await collect_images('Emma Watson', pinterest)
await collect_images('Angourie Rice', presearch)
await collect_images('Emma Watson', presearch)

collecting images of Angourie Rice at https://www.pinterest.com/ideas/




https://i.pinimg.com/236x/c0/aa/fd/c0aafd560cbab191f96a34773f3be9b4.jpg	Image of Angourie Rice smiling
https://i.pinimg.com/236x/59/be/6e/59be6e049bd538b6840450aaf801b411.jpg	Image of Angourie Rice smiling
https://i.pinimg.com/236x/d9/f5/83/d9f583c00016ba2db2e3e654e1673c5f.jpg	Image of Angourie Rice smiling
collecting images of Angourie Rice at presearch.com
https://i.pinimg.com/236x/c0/aa/fd/c0aafd560cbab191f96a34773f3be9b4.jpg	Image of Angourie Rice smiling
https://i.pinimg.com/236x/d9/f5/83/d9f583c00016ba2db2e3e654e1673c5f.jpg	Image of Angourie Rice smiling
https://i.pinimg.com/236x/19/70/0a/19700acacd16b19b4e371e7530adbde6.jpg	Image of Angourie Rice smiling
collecting images of Emma Watson at https://www.pinterest.com/ideas/
https://i.pinimg.com/236x/7c/90/88/7c9088a2c45bc689524f114c564bea73.jpg	Emma Watson smiling portrait 1
https://i.pinimg.com/236x/95/6d/98/956d98bc198d46c2c158e7e5baa524f5.jpg	Emma Watson smiling portrait 2
https://i.pinimg.com/236x/cc/68/0d/cc680dbd067836f628b4
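
The runs above return some of the same thumbnail URLs from both Pinterest and Presearch. A minimal sketch of merging several result lists while dropping repeated URLs could look like this. The `merge_unique` helper and the plain `(url, description)` tuples are illustrative assumptions, not part of the notebook; in practice the items would be `Image` objects.

```python
from typing import Iterable, List, Tuple

# Hypothetical stand-in for the notebook's Image model: (url, description).
ImageRecord = Tuple[str, str]


def merge_unique(*batches: Iterable[ImageRecord]) -> List[ImageRecord]:
    """Concatenate batches, keeping only the first record seen for each URL."""
    seen = set()
    merged = []
    for batch in batches:
        for url, description in batch:
            if url in seen:
                continue
            seen.add(url)
            merged.append((url, description))
    return merged


# Placeholder data; example.com URLs are not real results.
pinterest_batch = [
    ("https://example.com/a.jpg", "portrait a"),
    ("https://example.com/b.jpg", "portrait b"),
]
presearch_batch = [
    ("https://example.com/a.jpg", "portrait a again"),
    ("https://example.com/c.jpg", "portrait c"),
]
print(merge_unique(pinterest_batch, presearch_batch))
```

Keeping the first occurrence preserves the order in which sources were queried, so the preferred source can simply be passed first.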

## Hard-coded image collection

In [None]:
from playwright.async_api import async_playwright
from datetime import datetime

from browser_use.browser.chrome import (
  CHROME_ARGS,
  CHROME_HEADLESS_ARGS,
  CHROME_DEFAULT_USER_AGENT
)

In [None]:
async def image_search(target: str, num_images: int = 10) -> Images:
    async with async_playwright() as p:
        screen_size = {'width': 1920, 'height': 1080}
        offset_x, offset_y = 0, 0
        chrome_args = {
            '--remote-debugging-port=9222',
            *CHROME_ARGS,
            *CHROME_HEADLESS_ARGS,
            f'--window-position={offset_x},{offset_y}',
            f'--window-size={screen_size["width"]},{screen_size["height"]}',
        }

        browser = await p.chromium.launch(
            headless=True,
            args=list(chrome_args),
            channel='chrome',
            handle_sigint=False,
            handle_sigterm=False)

        context = await browser.new_context(user_agent=CHROME_DEFAULT_USER_AGENT)
        page = await context.new_page()

        await page.goto("https://presearch.com/")
        # await page.goto("https://duckduckgo.com/")
        await page.fill("input[name='q']", target)
        await page.press("input[name='q']", "Enter")
        await page.click("text=Images")

        await page.wait_for_load_state("networkidle")
        await page.wait_for_timeout(2000) # Wait for the client-side javascript to run

        img_tags = await page.query_selector_all("div.image-thumbnail img")
        # img_tags = await page.query_selector_all("div[data-testid='zci-images'] li img")
        # print(f"Found {len(img_tags)} images.")
        img_srcs = await asyncio.gather(*[img.get_attribute("src") for img in img_tags])

        await browser.close()
        print("Browser closed.")

        # Generate Image objects with descriptions, skipping thumbnails without a src attribute
        urls = [url for url in img_srcs if url]
        images = Images(images=[
            Image(description=f"Image of {target}", url=url) for url in urls[:num_images]
        ])

        return images
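
The `src` values collected by the search are not guaranteed to be usable links: an `<img>` element may lack the attribute, or carry an inline `data:` URI or a relative path. As a hedged sketch, a small filter could keep only absolute http(s) URLs before the `Image` objects are built. The name `usable_srcs` and the sample values are assumptions made for illustration.

```python
from typing import Iterable, List, Optional
from urllib.parse import urlparse


def usable_srcs(srcs: Iterable[Optional[str]], limit: int) -> List[str]:
    """Keep at most `limit` src values that are absolute http(s) URLs."""
    kept = []
    for src in srcs:
        if len(kept) >= limit:
            break
        if not src:
            continue  # attribute missing or empty
        parsed = urlparse(src)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            kept.append(src)
    return kept


# Placeholder inputs covering the cases mentioned above.
sample = [
    None,
    "data:image/png;base64,AAAA",
    "https://example.com/a.jpg",
    "/relative.png",
    "https://example.com/b.jpg",
]
print(usable_srcs(sample, limit=1))
```

Applying the limit after filtering means that unusable entries at the top of the page do not reduce the number of images returned.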

In [None]:
async def collect_images2(target: str):
  print(f'collecting images of {target}')
  images = await image_search(target, 10)
  for image in images.images:
    print(f'{image.url}\t{image.description}')

In [None]:
await collect_images2('Angourie Rice')
await collect_images2('Emma Watson')

collecting images of Angourie Rice
Found 60 images.
Browser closed.
https://tse3.mm.bing.net/th?id=OIP.f1umFGfmsIXRfGGsmLzCbwHaJ4&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
https://tse1.mm.bing.net/th?id=OIP.nesaW00tFv6kIWc9NuLMOAHaLH&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
https://tse2.mm.bing.net/th?id=OIP.5gZo8H4Fm4H9k1oaTJFRWgHaJQ&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
https://tse4.mm.bing.net/th?id=OIP.N_dzJNtIZkvbeEwXqDizcgHaLH&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
https://tse1.mm.bing.net/th?id=OIP.PiCC8UqCd99Wty0B7w9TPQHaKh&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
https://tse3.mm.bing.net/th?id=OIP.jnFskXWIEn6fytczvWhHsgHaGp&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
https://tse3.mm.bing.net/th?id=OIP.2H8WjNdvsJ5HvYz14ugfOgHaLH&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
https://tse1.mm.bing.net/th?id=OIP.EG33hhFfcArCB5EkY59dqgHaLL&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
https://tse3.mm.bing.net/th?id=OIP.aGs7DNd_NIFiGicF6Gy3QQHaL

## Collect

We need to authenticate to access Google Sheets.

In [25]:
from google.colab import auth
auth.authenticate_user()

from googleapiclient.discovery import build
from google.auth import default
import gspread
creds, _ = default()
gc = gspread.authorize(creds)

We list all available Google Sheets spreadsheets.

In [26]:
spreadsheet_list = gc.list_spreadsheet_files()
for spreadsheet in spreadsheet_list:
  print(f"Spreadsheet Name: {spreadsheet['name']}, ID: {spreadsheet['id']}")

Spreadsheet Name: Celebrities, ID: 1xB-V6SzTp9mx9FTN7KVvx0QOtfn54U1G4AYumMFJ5V8


We select a worksheet, verify that it exists, and print the targets it contains.

In [27]:
spreadsheet_id = '1xB-V6SzTp9mx9FTN7KVvx0QOtfn54U1G4AYumMFJ5V8'  #@param {type: "string"}
worksheet_name = 'Sheet1' #@param {type: "string"}

try:
  sh = gc.open_by_key(spreadsheet_id)
  worksheet = sh.worksheet(worksheet_name)

  all_values = worksheet.get_all_values()

  for row in all_values[1:]:
      print(row)

except gspread.SpreadsheetNotFound:
  print(f"Spreadsheet with ID '{spreadsheet_id}' not found.")
except gspread.WorksheetNotFound:
  print(f"Worksheet '{worksheet_name}' not found in the spreadsheet.")
except Exception as e:
  print(f"An error occurred: {e}")


['Angourie Rice', '']
['Emma Watson', '']
['Jenna Ortega', '']
['Millie Bobby Brown', '']


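We iterate over all rows and collect images for each target (here with the hard-coded search; the agent call is commented out). Each image found is written to the columns to the right of the target's name.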

In [29]:
for row_index, row in enumerate(all_values[1:]):
  target = row[0]
  if target:

    print(f'collecting for {target}')
    # images = await get_image_urls(target)
    images = await image_search(target, 5)

    if images is None:
      print('no images found')
    else:
      for image_index, image in enumerate(images.images):
        print(f'{image.url}\t{image.description}')
        worksheet.update_cell(row_index + 2, 2 + image_index, f'=IMAGE("{image.url}")')

collecting for Angourie Rice
Browser closed.
https://tse3.mm.bing.net/th?id=OIP.f1umFGfmsIXRfGGsmLzCbwHaJ4&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
https://tse1.mm.bing.net/th?id=OIP.nesaW00tFv6kIWc9NuLMOAHaLH&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
https://tse4.mm.bing.net/th?id=OIP.N_dzJNtIZkvbeEwXqDizcgHaLH&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
https://tse2.mm.bing.net/th?id=OIP.5gZo8H4Fm4H9k1oaTJFRWgHaJQ&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
https://tse1.mm.bing.net/th?id=OIP.PiCC8UqCd99Wty0B7w9TPQHaKh&pid=Api&P=0&w=300&h=300	Image of Angourie Rice
collecting for Emma Watson
Browser closed.
https://tse2.mm.bing.net/th?id=OIP.qe707A47i7hoOeDcIHz_sgHaLH&pid=Api&P=0&w=300&h=300	Image of Emma Watson
https://tse1.mm.bing.net/th?id=OIP.M2YJhozsScWhneKi9nmcJQHaHa&pid=Api&P=0&w=300&h=300	Image of Emma Watson
https://tse4.mm.bing.net/th?id=OIP.HP9zdd9h6pobkEXAHhcY-gHaJZ&pid=Api&P=0&w=300&h=300	Image of Emma Watson
https://tse4.mm.bing.net/th?id=OIP.B8beC_NwIvh
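
The loop above calls `update_cell` once per image, which means one API request per cell. A possible alternative, sketched here under assumptions, is to build all `=IMAGE(...)` formulas first and write them as a single block. `image_formula_rows` and the sample URLs are hypothetical; the commented `worksheet.update` call assumes gspread's keyword arguments and is not executed here.

```python
from typing import Dict, List


def image_formula_rows(
    urls_by_target: Dict[str, List[str]], targets: List[str], width: int
) -> List[List[str]]:
    """Build one row of =IMAGE(...) formulas per target, padded to `width` cells."""
    rows = []
    for target in targets:
        urls = urls_by_target.get(target, [])[:width]
        formulas = [f'=IMAGE("{url}")' for url in urls]
        # Pad with empty strings so every row has the same number of cells.
        formulas += [""] * (width - len(formulas))
        rows.append(formulas)
    return rows


# Placeholder data; targets are listed in sheet order.
targets = ["Angourie Rice", "Emma Watson"]
urls_by_target = {
    "Angourie Rice": ["https://example.com/a1.jpg", "https://example.com/a2.jpg"],
    "Emma Watson": ["https://example.com/e1.jpg"],
}
rows = image_formula_rows(urls_by_target, targets, width=2)
print(rows)

# With a gspread worksheet, the block could then be written in one call
# (assumption: Worksheet.update accepts these keyword arguments):
# worksheet.update(range_name="B2", values=rows, value_input_option="USER_ENTERED")
```

Note that the padding writes empty strings, so any existing content in those cells would be cleared.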

We set the row height to 200px and the column width to 150px so that the images display well.

In [31]:
service = build('sheets', 'v4', credentials=creds)
requests = [
    # Update row height
    {
        "updateDimensionProperties": {
            "range": {
                "sheetId": worksheet.id,
                "dimension": "ROWS",
                "startIndex": 1,
                "endIndex": len(all_values)
            },
            "properties": {
                "pixelSize": 200
            },
            "fields": "pixelSize"
        }
    },
    # Update column width
    {
        "updateDimensionProperties": {
            "range": {
                "sheetId": worksheet.id,
                "dimension": "COLUMNS",
                "startIndex": 1,
                "endIndex": 6  # exclusive; covers columns B to F for up to five images
            },
            "properties": {
                "pixelSize": 150
            },
            "fields": "pixelSize"
        }
    }
]

body = {
    'requests': requests
}
response = service.spreadsheets().batchUpdate(
    spreadsheetId=spreadsheet_id,
    body=body
).execute()

print("Cell size updated successfully!")

Cell size updated successfully!
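
The two request dictionaries above differ only in dimension, range, and pixel size. As a sketch, a small builder could reduce the repetition and make the index convention explicit: in the Sheets API, `startIndex` is inclusive and `endIndex` is exclusive, both zero-based. `resize_request` and the sample values are hypothetical.

```python
from typing import Any, Dict


def resize_request(
    sheet_id: int, dimension: str, start: int, end: int, pixels: int
) -> Dict[str, Any]:
    """Build an updateDimensionProperties request for rows or columns.

    `start` is inclusive and `end` is exclusive, both zero-based.
    """
    if dimension not in ("ROWS", "COLUMNS"):
        raise ValueError(f"unknown dimension: {dimension!r}")
    return {
        "updateDimensionProperties": {
            "range": {
                "sheetId": sheet_id,
                "dimension": dimension,
                "startIndex": start,
                "endIndex": end,
            },
            "properties": {"pixelSize": pixels},
            "fields": "pixelSize",
        }
    }


# Hypothetical values: sheet id 0, four target rows below a header,
# and five image columns (B to F).
requests = [
    resize_request(0, "ROWS", 1, 5, 200),
    resize_request(0, "COLUMNS", 1, 6, 150),
]
print(requests[1]["updateDimensionProperties"]["range"])
```

The resulting list has the same shape as the `requests` list passed to `batchUpdate` in the cell above.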
