<a href="https://colab.research.google.com/github/elephant-xyz/notebook/blob/main/PhotoMedtaData.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🐘 Welcome to Step 4 of Elephant Mining

Congratulations on reaching **Step 4**! By now, you’ve successfully **minted your County Data Group**. In this notebook, you'll use your **seed data** and **property images** to mint your **Photo Data Group**.

---

## 🧠 What You’ll Do in This Step

This notebook allows you to:

- Upload your property images  
- Mint a new **Photo Data Group**  
- Automatically generate a **fact sheet** based on the image metadata  

This step completes the visual layer of your dataset, setting you up for further data enrichment.

---

## ✅ Prerequisites

Before continuing, make sure you’ve completed the following two notebooks:

1. [📗 Notebook 1: Seed Minting](https://colab.research.google.com/drive/14tSNSP8Pe-mY4VwX9JhXgfyOvzmN3kC0?usp=chrome_ntp)  
2. [📘 Notebook 2: County Data Minting](https://colab.research.google.com/drive/1ZI_eScKFh2kDIZgwXljhOgBIgrenDhRi?usp=chrome_ntp)

After running both, you should have the following output files ready:

- `upload_results.csv`  
- `submit.zip`

---

## 📸 In This Notebook

Once your image files are uploaded:

1. The images will be minted into the **Photo Data Group**  
2. A **fact sheet** will be generated for inspection  
3. You can continue with **image-based metadata extraction**  
4. This will lead to a complete and enriched data product  

---



## 📥 Step 1: Upload the `.env` File

This notebook requires a `.env` file that contains your API keys and credentials.  
It will be used to securely load the following environment variables:

| Variable Name           | Purpose                                   |
|-------------------------|-------------------------------------------|
| `OPENAI_API_KEY`        | Access to the OpenAI API                  |
| `AWS_ACCESS_KEY_ID`     | AWS access key ID                         |
| `AWS_SECRET_ACCESS_KEY` | AWS secret access key                     |
| `S3_BUCKET_NAME`        | Name of the S3 bucket                     |
| `IMAGES_DIR`            | Directory containing the property images  |
| `ELEPHANT_PRIVATE_KEY`  | Elephant wallet private key               |
| `PINATA_JWT`            | Pinata API token (JWT)                    |



- Click the **folder icon** 📂 in the left sidebar to open the file browser.
- Then click the **"Upload"** button and choose your `.env` file.

```env
# example of .env file
OPENAI_API_KEY=sk-XXXXXXXXXXXXXXXXXXXXXXXXXXXX
AWS_ACCESS_KEY_ID=XXXXXX
AWS_SECRET_ACCESS_KEY=XXXXXX
S3_BUCKET_NAME=your-s3-bucket-name-here
IMAGES_DIR=images
ELEPHANT_PRIVATE_KEY=xxxxx
PINATA_JWT=xxxxx
```
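Before running the later cells, it can help to confirm that the uploaded `.env` file defines every variable in the table above. The following is a minimal sketch, not part of the Elephant tooling: the helper names `parse_env_text` and `missing_env_vars` are hypothetical, and the `/content/.env` location is an assumption about where Colab stores the upload.

```python
from pathlib import Path

# Variable names taken from the table above.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "S3_BUCKET_NAME",
    "IMAGES_DIR",
    "ELEPHANT_PRIVATE_KEY",
    "PINATA_JWT",
]


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_env_vars(text, required=REQUIRED_VARS):
    """Return the required variable names that are absent or empty."""
    values = parse_env_text(text)
    return [name for name in required if not values.get(name)]


# Assumed upload location in Colab; adjust if your file lives elsewhere.
env_path = Path("/content/.env")
if env_path.exists():
    missing = missing_env_vars(env_path.read_text())
    print("Missing variables:", missing or "none")
```

The check only reports variable names, so no secret values are printed to the notebook output.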


## Step 2: Upload `upload_results.csv`

Upload the `upload_results.csv` file to the `/content/` directory.

> 📌 **Important**: You must generate this file by running **Step 2** of the following Colab notebook:  
> 👉 [Seed Data – Step 2](https://colab.research.google.com/drive/14tSNSP8Pe-mY4VwX9JhXgfyOvzmN3kC0?usp=sharing#scrollTo=OFKp4E49651Z)

After running Step 2 in that notebook, download the `upload_results.csv` file and upload it to this notebook's `/content/` directory.
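As a quick sanity check, you might verify that `upload_results.csv` has the columns the folder setup reports when it loads the file. This is only a sketch: the column list is copied from the sample output in this notebook, and the helper name `check_upload_results` is hypothetical.

```python
import csv
import io
from pathlib import Path

# Column names as printed by the folder setup output in this notebook.
EXPECTED_COLUMNS = [
    "propertyCid",
    "dataGroupCid",
    "dataCid",
    "filePath",
    "uploadedAt",
    "htmlLink",
]


def check_upload_results(csv_text):
    """Return (missing_columns, row_count) for the given CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    row_count = sum(1 for _ in reader)
    return missing, row_count


# Assumed upload location in Colab.
csv_path = Path("/content/upload_results.csv")
if csv_path.exists():
    missing, row_count = check_upload_results(csv_path.read_text())
    print(f"Rows: {row_count}, missing columns: {missing or 'none'}")
```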





## Step 3: Upload `submit.zip`

Upload the `submit.zip` file to the `/content/` directory of this notebook.

> 📌 **Important**: You must generate `submit.zip` by running **Step 3** of the following Colab notebook:  
> 👉 [County Data Group – Step 3](https://colab.research.google.com/drive/1ZI_eScKFh2kDIZgwXljhOgBIgrenDhRi#scrollTo=HA0ppLFpUm1j)

After running Step 3 in that notebook, download the `submit.zip` file and upload it to this notebook's `/content/` directory so it can be used in the next steps.
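If you want to confirm that the archive arrived intact, a short check with Python's standard `zipfile` module is one option. This is a sketch under the assumption that the file sits at `/content/submit.zip`; the helper name `list_zip_entries` is hypothetical, and the expected contents of the archive are not specified here.

```python
import zipfile
from pathlib import Path


def list_zip_entries(zip_path):
    """Return the entry names in the archive, raising if any entry is corrupt."""
    with zipfile.ZipFile(zip_path) as archive:
        bad_entry = archive.testzip()  # None when every entry passes its CRC check
        if bad_entry is not None:
            raise ValueError(f"Corrupt entry in archive: {bad_entry}")
        return archive.namelist()


# Assumed upload location in Colab.
zip_path = Path("/content/submit.zip")
if zip_path.exists():
    entries = list_zip_entries(zip_path)
    print(f"submit.zip contains {len(entries)} entries")
```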

## Step 4: Set up folder structure for processing images

In [None]:
# 1. Install the package
!pip install --force-reinstall --no-cache-dir git+https://github.com/elephant-xyz/photo-meta-data-ai.git > /content/install_log.txt 2>&1


# 2. Set up folders
!colab-folder-setup


🚀 Starting Colab Folder Setup...
✓ Loaded upload results file: upload_results.csv
  - Rows: 2
  - Columns: ['propertyCid', 'dataGroupCid', 'dataCid', 'filePath', 'uploadedAt', 'htmlLink']
✓ Extracted 2 properties:
  - Property ID: 52434205310037080 -> CID: bafkreigzz5foh5ts76vvhxphzulptpnjwznog6lcnxw5wsvfqa7zlxeioa
  - Property ID: 30434108090030050 -> CID: bafkreifammzkemqq5xrw7kfyjtjp74gig4m3p2uv2i4qwwt4t6s2s6le4q
✓ Created images folder: ./images
✓ Created output folder: ./output
Found 2 unique properties
✓ Created image folder: 52434205310037080
✓ Created output folder: 52434205310037080

🔍 Following IPFS chain for property 52434205310037080:
  [1] Fetching propertyCid: bafkreigzz5foh5ts76vvhxphzulptpnjwznog6lcnxw5wsvfqa7zlxeioa
  [2] Found property_seed: bafkreiau7wmu7kyeec74dhzpxt3z6xbykure22t3dln4mtwaen2k3vvniy
  [3] Fetching property_seed CID: bafkreiau7wmu7kyeec74dhzpxt3z6xbykure22t3dln4mtwaen2k3vvniy
  [4] Found 'from' CID: bafkreid3fwfdgl2tywom4t6g7tbby3m3uxufduss62l2pnvjsmo

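## Step 4 (continued): Upload images into Parcel ID subfolders

The previous cell created one subfolder per parcel ID under the `images` directory (`IMAGES_DIR=images` in your `.env`). For each parcel, place all of its image files inside the matching subfolder.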
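To confirm that every parcel folder received its images before continuing, you could count the files per subfolder. This is a rough sketch: the set of image extensions is an assumption (the file types the pipeline accepts are not stated here), and `count_images_per_parcel` is a hypothetical helper.

```python
from pathlib import Path

# Assumed image extensions; adjust to match your files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def count_images_per_parcel(images_dir):
    """Map each parcel subfolder name to the number of image files inside it."""
    counts = {}
    for folder in sorted(Path(images_dir).iterdir()):
        if folder.is_dir():
            counts[folder.name] = sum(
                1
                for item in folder.iterdir()
                if item.is_file() and item.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


# "images" matches IMAGES_DIR in the example .env file.
images_dir = Path("images")
if images_dir.is_dir():
    for parcel_id, count in count_images_per_parcel(images_dir).items():
        print(f"{parcel_id}: {count} images")
```

A parcel that shows 0 images most likely has its files in the wrong folder or has not been uploaded yet.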






## Step 5: Set up AWS environment to process images



In [None]:
!bucket-manager

## Step 6: Running AWS Rekognition to categorize the pictures


In [None]:
!upload-to-s3
!photo-categorizer


📊 COMPREHENSIVE CATEGORIZATION SUMMARY

🏠 TOTAL PROPERTIES PROCESSED: 2
🖼️  TOTAL IMAGES: 98
✅ TOTAL CATEGORIZED: 98
📈 SUCCESS RATE: 100.0%

📁 OVERALL CATEGORY BREAKDOWN:
   exterior: 34 images
   kitchen: 21 images
   living_room: 19 images
   bedroom: 10 images
   other: 6 images
   closet: 3 images
   garage: 2 images
   bathroom: 1 images
   laundry: 1 images
   pool: 1 images

🏠 PROPERTY-BY-PROPERTY BREAKDOWN:
--------------------------------------------------------------------------------

📍 Property: 30434108090030050
   Address: Property 30434108090030050
   Total Images: 24
   Categorized: 24
   Success Rate: 100.0%
   Categories:
     • kitchen: 9 images
     • living_room: 5 images
     • exterior: 4 images
     • bedroom: 4 images
     • bathroom: 1 images
     • closet: 1 images

📍 Property: 52434205310037080
   Address: Property 52434205310037080
   Total Images: 74
   Categorized: 74
   Success Rate: 100.0%
   Categories:
     • exterior: 30 images
     • living_room: 1

## Step 7: Running AI to extract data from images

In [None]:
!ai-analyzer --local-folders --all-properties --batch-size 10 --max-workers 12
!quality-assessment

[TOKENS] Prompt: 8593, Completion: 1159
[COST] $0.060350 | Images in batch: 1
    [✔] Saved: structure_batch_01_photo_metadata.json
    [✔] Saved: lot_batch_01_photo_metadata.json
    [✔] Saved: utility_batch_01_photo_metadata.json
    [✔] Saved: layout_walk-in_closet_batch_01.json
    [✔] Saved: appliance_none_batch_01.json
    [✔] Updated: bafkreibzrfmqka5h7dnuz7jzilgx4ht5rqcrx3ocl23nger65frbb5hzma.json
[TOKENS] Prompt: 10888, Completion: 1191
[COST] $0.072305 | Images in batch: 4
    [✔] Saved: structure_batch_01_photo_metadata.json
    [✔] Saved: lot_batch_01_photo_metadata.json
    [✔] Saved: utility_batch_01_photo_metadata.json
    [✔] Saved: layout_bedroom_batch_01.json
    [✔] Saved: appliance_none_batch_01.json
    [→] Found existing main relationship file, merging...
    [✔] Updated: bafkreibzrfmqka5h7dnuz7jzilgx4ht5rqcrx3ocl23nger65frbb5hzma.json
[TOKENS] Prompt: 11653, Completion: 1183
[COST] $0.076010 | Images in batch: 5
    [✔] Saved: structure_batch_01_photo_metadata.js
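
The analyzer prints a `[COST]` line for every batch, so the total spend for a run can be estimated by summing those lines. The sketch below assumes the log format shown in the sample output above and that you have captured the cell output as text; `summarize_costs` is a hypothetical helper, not part of the tool.

```python
import re

# Matches lines such as "[COST] $0.060350 | Images in batch: 1".
COST_PATTERN = re.compile(
    r"\[COST\]\s+\$(\d+(?:\.\d+)?)\s+\|\s+Images in batch:\s+(\d+)"
)


def summarize_costs(log_text):
    """Return (total_cost_in_dollars, total_images) across all [COST] lines."""
    total_cost = 0.0
    total_images = 0
    for match in COST_PATTERN.finditer(log_text):
        total_cost += float(match.group(1))
        total_images += int(match.group(2))
    return total_cost, total_images


sample_log = """
[TOKENS] Prompt: 8593, Completion: 1159
[COST] $0.060350 | Images in batch: 1
[TOKENS] Prompt: 10888, Completion: 1191
[COST] $0.072305 | Images in batch: 4
"""
cost, images = summarize_costs(sample_log)
print(f"Total cost: ${cost:.6f} for {images} images")
```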

## Step 8: Summarizing the data

In [None]:

!property-summarizer -

🚀 Starting summary for 2 properties: 30434108090030050, 52434205310037080

PROCESSING PROPERTY 1/2: 30434108090030050
📊 Analyzing property data from: output/30434108090030050
    [DEBUG] Looking for structure file: output/30434108090030050/structure.json
    [DEBUG] Structure file not found: output/30434108090030050/structure.json
    [DEBUG] Looking for lot file: output/30434108090030050/lot.json
    [DEBUG] Lot file not found: output/30434108090030050/lot.json

PROPERTY SUMMARY: 30434108090030050

📋 LAYOUTS (5 total)
----------------------------------------
Space Types:
  • laundry room
  • office
  • closet
  • living room
  • dining room
  • laundry room: 
  • office: 
  • closet: 
  • living room: 
  • dining room: 

🏠 STRUCTURE
----------------------------------------
  No structure data found

🌳 LOT
----------------------------------------
  No lot data found

⚡ UTILITIES
----------------------------------------
  No utility data found

🔌 APPLIANCES (3 total)
-------------------

## Step 9: Validate the results

In [None]:
!pip install --force-reinstall --no-cache-dir git+https://github.com/elephant-xyz/photo-meta-data-ai.git > /content/install_log.txt 2>&1

!fix-submit-local