# **Parsing the Show Floor Map**
I want a "Map Page" where users can see the location of a booth on the PAX map. To do that, I'll need to identify the bounding box of each booth within the map image.

# Setup
The cells below set up the rest of the notebook.

I'll start by configuring the kernel that's running this notebook:

In [1]:
# Change the cwd
%cd ..

# Enable the autoreload module
%load_ext autoreload
%autoreload 2

# Load the environment variables
from dotenv import load_dotenv
load_dotenv(override=True)

/Users/thubbard/Documents/personal/programming/pax-pal-2025/experiments


True
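
The Vision client created later in this notebook authenticates through Google's Application Default Credentials, and one common way to supply them is a `GOOGLE_APPLICATION_CREDENTIALS` variable pointing at a service-account key file. Assuming that variable is what the `.env` file provides here (the notebook does not say), a quick check like the sketch below can catch a missing or mistyped path before any API call is made. `describe_credentials` is a hypothetical helper, not part of any library.

```python
import os


def describe_credentials(env=None):
    """Report whether GOOGLE_APPLICATION_CREDENTIALS points at a real file.

    Assumption: credentials are supplied via this variable. Other setups
    (for example gcloud user credentials) would not be detected here.
    """
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return f"credentials file not found: {path}"
    return "credentials file found"


print(describe_credentials())
```

An unset variable is not necessarily an error, since Application Default Credentials can come from other sources, so this is a hint rather than a hard gate.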

Next, I'm going to import the necessary modules:

In [2]:
# General imports
import json, re

# Third-party imports
from google.cloud import vision

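# Detecting Booth Numbers
Now I'll run Google Cloud Vision's text detection on the map image, keep only the annotations that are purely numeric, merge adjacent chunks on the same line into a single booth number, and save the resulting boxes to `booths.json`. Note that these boxes surround the printed booth numbers rather than the booth outlines.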

In [None]:
IMG = "experiments/data/pax-map.jpg"
PROJECT = "your-gcp-project"
client = vision.ImageAnnotatorClient()

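with open(IMG, "rb") as f:
    response = client.text_detection(image=vision.Image(content=f.read()))

# Surface API-side failures instead of silently parsing an empty response
if response.error.message:
    raise RuntimeError(response.error.message)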

digits = {}
for anno in response.text_annotations[1:]:  # skip the full-page blob
    text = anno.description.strip()
    if not re.fullmatch(r"\d+", text):
        continue

    # bounding_poly has 4 vertices (x,y). Flatten to [x1,y1,x2,y2].
    xs = [v.x for v in anno.bounding_poly.vertices]
    ys = [v.y for v in anno.bounding_poly.vertices]
    bbox = [min(xs), min(ys), max(xs), max(ys)]

    # Group multi-chunk numbers (15 + 043 → 15043) by vertical midpoint.
    # Only chunks with exactly the same midpoint end up under the same key.
    y_mid = (bbox[1] + bbox[3]) // 2
    key = f"{y_mid:05d}"  # line key
    digits.setdefault(key, []).append((bbox, text))

# merge adjacent chunks on the same scan-line
booths = {}
for line in digits.values():
    line.sort(key=lambda t: t[0][0])  # sort left→right
    cur_box, cur_txt = line[0]
    for box, txt in line[1:]:
        if box[0] - cur_box[2] < 15:  # gap under 15 px?
            # extend right edge & concat text
            cur_box[2] = box[2]
            cur_txt += txt
        else:
            booths[cur_txt] = cur_box
            cur_box, cur_txt = box, txt
    booths[cur_txt] = cur_box

with open("booths.json", "w") as f:
    json.dump(booths, f, indent=2)
print(f"Saved {len(booths)} booths")
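
One fragile spot in the cell above is the row key: two chunks are only merged when their vertical midpoints are exactly equal, and OCR boxes for text on the same printed line can differ by a pixel or two. Below is a minimal sketch of a more forgiving variant that groups chunks whose midpoints fall within a tolerance, then applies the same left-to-right merge. The helper names `group_rows` and `merge_row`, the `row_tol` value of 5 px, and the sample coordinates are all hypothetical; the tolerance would need tuning against the real map.

```python
def group_rows(items, row_tol=5):
    """Group (bbox, text) items into rows with nearby vertical midpoints.

    bbox is [x1, y1, x2, y2]. row_tol is an assumed pixel tolerance.
    """
    def mid(item):
        return (item[0][1] + item[0][3]) / 2

    rows = []  # each entry: (anchor midpoint, list of items)
    for item in sorted(items, key=mid):
        # Compare against the row's first midpoint to avoid gradual drift
        if rows and abs(mid(item) - rows[-1][0]) <= row_tol:
            rows[-1][1].append(item)
        else:
            rows.append((mid(item), [item]))
    return [row_items for _, row_items in rows]


def merge_row(row, max_gap=15):
    """Merge horizontally adjacent chunks in one row into booth numbers."""
    row = sorted(row, key=lambda t: t[0][0])  # left to right
    merged = {}
    cur_box, cur_txt = list(row[0][0]), row[0][1]
    for box, txt in row[1:]:
        if box[0] - cur_box[2] < max_gap:
            # Union of the two boxes, then concatenate the text
            cur_box = [cur_box[0], min(cur_box[1], box[1]),
                       box[2], max(cur_box[3], box[3])]
            cur_txt += txt
        else:
            merged[cur_txt] = cur_box
            cur_box, cur_txt = list(box), txt
    merged[cur_txt] = cur_box
    return merged


# Made-up chunks: "15" and "043" sit one pixel apart vertically
chunks = [
    ([100, 50, 120, 62], "15"),
    ([124, 51, 154, 63], "043"),
    ([400, 49, 450, 61], "16001"),
    ([100, 200, 150, 212], "17020"),
]

booths = {}
for row in group_rows(chunks):
    booths.update(merge_row(row))
print(booths)
```

With an exact midpoint key, "15" (midpoint 56) and "043" (midpoint 57) would land in different rows and never merge; with the tolerance they combine into 15043. The trade-off is that too large a tolerance could pull numbers from neighbouring rows together.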