# Image Dataset Creation: FoodVision-10

This script extracts 10 specific food classes from the **Food-101 dataset** to create the **FoodVision-10** dataset. It uses metadata files to organize images into a clean structure for machine learning workflows.

## Objectives:
- Select 10 food categories (`apple_pie`, `sushi`, `hamburger`, etc.).
- Split images into `train/` and `test/` folders using metadata (`train.txt`, `test.txt`).
- Prepare the dataset for efficient processing and modeling.
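
The split files list one image per line as `class_name/image_id`, without a file extension, which is the format the copy step in this notebook relies on. A minimal sketch of filtering such lines down to the selected classes follows; `filter_entries` and the sample lines (including the `waffles/123` entry) are hypothetical and only illustrate the format.

```python
def filter_entries(lines, selected_classes):
    """Return (class_name, image_id) pairs whose class is selected."""
    selected = set(selected_classes)
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        class_name, image_id = line.split("/", 1)
        if class_name in selected:
            entries.append((class_name, image_id))
    return entries


# Hypothetical sample in the same format as train.txt / test.txt
sample = ["apple_pie/1005649\n", "sushi/3289316\n", "waffles/123\n"]
print(filter_entries(sample, ["apple_pie", "sushi"]))
# → [('apple_pie', '1005649'), ('sushi', '3289316')]
```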

## Output Structure:

- `food_vision_10/`
  - `train/`
    - `apple_pie/`
    - `sushi/`
    - ...
  - `test/`
    - `apple_pie/`
    - `sushi/`
    - ...

Each split holds one subfolder per class, the layout that common image-folder loaders expect, so the dataset is ready for training CNNs and for transfer learning.
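
Once the dataset has been written, the layout can be checked by counting the files in each class folder. The sketch below uses only the standard library; `count_images` is a hypothetical helper, and it assumes the dataset lives in `food_vision_10` under the current working directory.

```python
import os


def count_images(dataset_dir):
    """Return {split: {class_name: number_of_files}} for a split/class layout."""
    counts = {}
    for split in sorted(os.listdir(dataset_dir)):
        split_dir = os.path.join(dataset_dir, split)
        if not os.path.isdir(split_dir):
            continue  # ignore stray files next to train/ and test/
        counts[split] = {}
        for class_name in sorted(os.listdir(split_dir)):
            class_dir = os.path.join(split_dir, class_name)
            if os.path.isdir(class_dir):
                counts[split][class_name] = len(os.listdir(class_dir))
    return counts


# Assumption: the dataset was written to ./food_vision_10
dataset_dir = os.path.join(os.getcwd(), "food_vision_10")
if os.path.isdir(dataset_dir):
    for split, per_class in count_images(dataset_dir).items():
        print(split, per_class)
```

A class that is missing from one split, or a count that differs sharply between classes, usually points to a path or class-name mismatch in the copy step.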

In [33]:
import os
import shutil

# Get the current working directory (where your script is located)
base_dir = os.getcwd()

# Paths
original_images_dir = os.path.join(base_dir, "food101_data", "images")
food_vision10_dir = os.path.join(base_dir, "food_vision_10")

meta_train_file = os.path.join(base_dir, "food101_data", "meta", "meta", "train.txt")
meta_test_file = os.path.join(base_dir, "food101_data", "meta", "meta", "test.txt")

# Selected classes
selected_classes = [
    "apple_pie", "baby_back_ribs", "baklava", "bibimbap", "caprese_salad",
    "cheesecake", "chocolate_cake", "french_fries", "hamburger", "sushi"
]

# Create folders for the foodvision10 dataset
os.makedirs(os.path.join(food_vision10_dir, "train"), exist_ok=True)
os.makedirs(os.path.join(food_vision10_dir, "test"), exist_ok=True)
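
The paths above depend on where the archive was unpacked (note the nested `meta/meta` folder), so it may be worth confirming that they exist before copying anything. A small sketch, where `missing_paths` is a hypothetical helper that is not part of the original script:

```python
import os


def missing_paths(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


# Same locations as in the cell above, rebuilt here so the sketch runs on its own
base_dir = os.getcwd()
inputs = [
    os.path.join(base_dir, "food101_data", "images"),
    os.path.join(base_dir, "food101_data", "meta", "meta", "train.txt"),
    os.path.join(base_dir, "food101_data", "meta", "meta", "test.txt"),
]

missing = missing_paths(inputs)
if missing:
    print("Missing inputs:", missing)
else:
    print("All inputs found.")
```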

In [34]:
# Helper Function that copies images
def copy_images(file_list, source_dir, destination_dir):
    """
    Copy images from a source directory to a destination directory
    based on a file list and the selected classes.

    Parameters
    ----------
    file_list : str
        Path to a text file with one entry per line in the form
        "class_name/image_name" (no file extension).
    source_dir : str
        Path to the source directory containing class-based subdirectories.
    destination_dir : str
        Path to the destination directory where selected images will be copied.

    Notes
    -----
    Only images belonging to classes listed in the global `selected_classes`
    variable are copied. The class subdirectory structure is preserved.
    """
    with open(file_list, "r") as f:
        lines = f.readlines()  # Read all the lines from the file

    for line in lines:
        print(line)  # Debug output: prints every entry, including skipped classes
        if not line.strip():
            continue  # Skip blank lines, which would break the split below
        class_name, file_name = line.strip().split("/")
        if class_name in selected_classes:
            # Entries carry no extension, so ".jpg" is appended here
            source = os.path.join(source_dir, class_name, file_name + ".jpg")
            print(f"source: {source}")
            destination = os.path.join(destination_dir, class_name, file_name + ".jpg")
            print(f"destination: {destination}")
            # Create the class folder on first use, then copy with metadata
            os.makedirs(os.path.dirname(destination), exist_ok=True)
            shutil.copy2(source, destination)
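
Printing every path produces a large amount of output for a dataset of this size and can exceed Jupyter's output rate limit. A hedged alternative sketch passes the class list explicitly, prints only an occasional progress line, and returns per-class counts. `copy_images_quiet`, `classes`, and `report_every` are hypothetical names introduced here, not part of the original script.

```python
import os
import shutil


def copy_images_quiet(file_list, source_dir, destination_dir, classes, report_every=1000):
    """Copy images of the given classes and return the number copied per class."""
    wanted = set(classes)
    copied = {}
    with open(file_list, "r") as f:
        for line in f:
            entry = line.strip()
            if not entry:
                continue  # skip blank lines
            class_name, file_name = entry.split("/", 1)
            if class_name not in wanted:
                continue
            source = os.path.join(source_dir, class_name, file_name + ".jpg")
            destination = os.path.join(destination_dir, class_name, file_name + ".jpg")
            os.makedirs(os.path.dirname(destination), exist_ok=True)
            shutil.copy2(source, destination)
            copied[class_name] = copied.get(class_name, 0) + 1
            total = sum(copied.values())
            if total % report_every == 0:
                print(f"{total} images copied")  # occasional progress line
    return copied
```

Returning the counts makes it possible to confirm the result in one line, for example by printing the returned dictionary after each split has been copied.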

In [35]:
# Copy training images
copy_images(meta_train_file, original_images_dir, os.path.join(food_vision10_dir, "train"))

# Copy testing images
copy_images(meta_test_file, original_images_dir, os.path.join(food_vision10_dir, "test"))
    
print("FoodVision-10 dataset created successfully!")

apple_pie/1005649

source: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipeline_and_transfer_learning/food101_data/images/apple_pie/1005649.jpg
destination: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipeline_and_transfer_learning/food_vision_10/train/apple_pie/1005649.jpg
apple_pie/1014775

source: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipeline_and_transfer_learning/food101_data/images/apple_pie/1014775.jpg
destination: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipeline_and_transfer_learning/food_vision_10/train/apple_pie/1014775.jpg
apple_pie/1026328

source: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipeline_and_transfer_learning/food101_data/images/apple_pie/1026328.jpg
destination: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipeline_and_transfer_learning/food_vision_10/train/apple_pie/1026328.jpg
apple_pie/1028787

source: /Users/orshwartzman/Desktop/

[Output truncated by Jupyter: IOPub data rate exceeded; `ServerApp.iopub_data_rate_limit=1000000.0` bytes/sec, `ServerApp.rate_limit_window=3.0` secs.]

sushi/3289316

source: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipeline_and_transfer_learning/food101_data/images/sushi/3289316.jpg
destination: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipeline_and_transfer_learning/food_vision_10/test/sushi/3289316.jpg
sushi/3292104

source: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipeline_and_transfer_learning/food101_data/images/sushi/3292104.jpg
destination: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipeline_and_transfer_learning/food_vision_10/test/sushi/3292104.jpg
sushi/3310019

source: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipeline_and_transfer_learning/food101_data/images/sushi/3310019.jpg
destination: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipeline_and_transfer_learning/food_vision_10/test/sushi/3310019.jpg
sushi/3322020

source: /Users/orshwartzman/Desktop/My_portfolio/Project_two_image_data_pipelin