# **How to Use the Image Annotation Tool**

The following steps will guide you through using the GUI-based Image Annotation Tool for labeling images of digits.

### **Setup**
1) Make sure you have **Python 3.8 or higher** installed along with the necessary libraries:
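   - Install Pillow with pip (in a notebook cell, prefix the command with `!`):
     ```bash
     pip install pillow
     ```
   - `tkinter` is part of the Python standard library and cannot be installed with pip. On Debian/Ubuntu it may require the system package `python3-tk`.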

2) To launch the tool, simply run the provided script.
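
Before launching, it can help to confirm that both dependencies are importable. The snippet below is a minimal sketch and not part of the tool; the helper name `check_dependencies` is made up for illustration.

```python
import importlib.util


def check_dependencies(modules=("tkinter", "PIL")):
    """Return {module_name: bool} telling whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, available in check_dependencies().items():
        print(f"{name}: {'found' if available else 'MISSING'}")
```

If `PIL` is reported missing, install Pillow; if `tkinter` is missing, install it through your operating system's package manager rather than pip.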

### **Instructions for Labeling**
Once the tool is running, follow these steps to label your data (also explained in the GUI):

1) **Click "Load Images"**:  
   - A file explorer will pop up. Navigate to the folder containing your images and **open the folder itself** before confirming, rather than only highlighting it from its parent directory.

2) **First Image Appears**:  
   - You will now see the first image on the screen, ready to be labeled.

3) **Draw Orientation Line**:  
   - Since the digits may have different orientations, you should first draw a line from the **center of the leftmost digit** to the **center of the rightmost digit**.
   - Click on the center of the first digit, drag the line, and release at the center of the last digit. A blue line should appear.

4) **Confirm Line**:  
   - A pop-up will ask if the line is correct. If it is, click **Yes**. If not, click **No** to redraw.

5) **Draw Rectangle Around Each Digit**:  
   - For each digit, taken in reading order, click at the digit's **top-left corner** and drag to its **bottom-right corner** to draw a rectangle. Remember:
     - The **top-left corner** should always be above and to the left of the **bottom-right corner**, regardless of the digit's orientation.

6) **Label Each Digit**:  
   - After drawing each rectangle, a pop-up will ask you to enter the digit's label (from 0 to 9). Type the number and press **Enter** or click **OK**.

7) **Check Label**:  
   - You should see the labeled information appear in the CSV content area. If you zoom in or out, you’ll see the number appear on the image.

8) **Repeat for All Digits**:  
   - Label all 11 digits in the image. When finished, click **Next Image** to move to the next one.
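
The geometry behind steps 3 to 5 can be sketched in a few lines. This is an illustrative, standalone version of the projection used by the tool's `get_rect_coords_based_on_angle` method; the function names `orientation_angle` and `rect_corners` and the sample points are assumptions made for this example.

```python
import math


def orientation_angle(start, end):
    """Angle in radians of the line from start to end, in image coordinates (y grows downward)."""
    return math.atan2(end[1] - start[1], end[0] - start[0])


def rect_corners(start, end, angle):
    """Corners (top-left, top-right, bottom-right, bottom-left) of a box aligned with `angle`."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    width = dx * cos_a + dy * sin_a         # drag projected onto the line direction
    height = abs(dx * sin_a - dy * cos_a)   # drag projected onto the perpendicular
    tl = (start[0], start[1])
    tr = (tl[0] + width * cos_a, tl[1] + width * sin_a)
    bl = (tl[0] - height * sin_a, tl[1] + height * cos_a)
    br = (tr[0] - height * sin_a, tr[1] + height * cos_a)
    return [tl, tr, br, bl]


# Made-up points: a horizontal orientation line, then a drag around one digit
angle = orientation_angle((100, 200), (500, 200))
corners = rect_corners((120, 150), (160, 230), angle)
print(corners)
```

For a horizontal orientation line the result reduces to an ordinary axis-aligned box; for a tilted line the box rotates with it, which is why the drag has to start at the digit's top-left corner.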

### **Output**
The tool updates its output each time a digit is labeled:
- A **CSV file** named `annotations.csv`, written to the current working directory, containing all labeled data in the following columns:
   - **"Image Name"**, **"Position"**, **"Label"**, **"Top-Left X"**, **"Top-Left Y"**, **"Bottom-Right X"**, **"Bottom-Right Y"**  
     where:
     - *Image Name* corresponds to the original name of the image.
     - *Position* is the sequential position of the digit in the image.
     - *Label* is the digit value (0-9).
     - *Top-Left X, Y* are the coordinates of the top-left corner of the rectangle.
     - *Bottom-Right X, Y* are the coordinates of the bottom-right corner.
   
- A **"result" folder** containing all annotated images with the rectangles and labels drawn.
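
To work with the annotations afterwards, the CSV can be read back with Python's standard `csv` module. The following is a minimal sketch under the column layout described above; the helper name `load_annotations` and the sample rows are made up for illustration and are not real tool output.

```python
import csv
import io
from collections import defaultdict


def load_annotations(file_obj):
    """Group annotation rows by image name, converting numeric columns to int."""
    by_image = defaultdict(list)
    for row in csv.DictReader(file_obj):
        by_image[row["Image Name"]].append({
            "position": int(row["Position"]),
            "label": row["Label"],
            "box": (int(row["Top-Left X"]), int(row["Top-Left Y"]),
                    int(row["Bottom-Right X"]), int(row["Bottom-Right Y"])),
        })
    return dict(by_image)


# Made-up sample rows in the documented column layout
sample = io.StringIO(
    "Image Name,Position,Label,Top-Left X,Top-Left Y,Bottom-Right X,Bottom-Right Y\n"
    "example.png,1,4,100,200,140,260\n"
    "example.png,2,7,150,200,190,260\n"
)
annotations = load_annotations(sample)
print(len(annotations["example.png"]))
```

For the real file, pass an open handle instead of the in-memory sample, for example `with open("annotations.csv", newline="") as f: annotations = load_annotations(f)`.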

### **Navigation Controls**
- **Zoom In**: Use the **`+`** key.
- **Zoom Out**: Use the **`-`** key.
- **Move Image**: Use the **arrow keys** to pan in any direction.
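
Zooming only changes how the image is displayed: the tool divides canvas coordinates by the zoom factor before storing them, so annotations stay in unzoomed coordinates. A minimal sketch of that conversion, with hypothetical helper names and made-up sample values:

```python
def canvas_to_image(canvas_x, canvas_y, zoom_factor):
    """Map a point on the zoomed canvas back to unzoomed image coordinates."""
    return canvas_x / zoom_factor, canvas_y / zoom_factor


def image_to_canvas(image_x, image_y, zoom_factor):
    """Map a point in unzoomed image coordinates onto the zoomed canvas."""
    return image_x * zoom_factor, image_y * zoom_factor


zoom = 1.1 ** 2  # two presses of the + key, at a 1.1x step per press
x, y = canvas_to_image(242.0, 121.0, zoom)
print(round(x, 6), round(y, 6))
```

Panning is a separate offset: in the tool it is handled by `canvas.canvasx` and `canvas.canvasy`, which turn window coordinates into canvas coordinates before this scaling is applied.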

### **Quick Summary**
1) **Load Images** → 2) **Draw Orientation Line** → 3) **Draw Rectangle for Each Digit** → 4) **Enter Digit Label** → 5) **Move to Next Image**

Follow this flow, and you'll quickly annotate all your images!


In [None]:
import os
import csv
import tkinter as tk
from tkinter import filedialog, simpledialog, messagebox
from PIL import Image, ImageTk, ImageDraw, ImageFont
import math

class ImageAnnotationTool:

    # ==================================== INITIALIZATION ====================================

    def __init__(self, root):
        self.root = root
        self.root.title("Image Annotation Tool")
        
        # Variables
        self.annotations = []
        self.image_index = 0
        self.image_files = []
        self.image_folder = ""
        self.current_image = None
        self.rect = None
        self.start_x = None
        self.start_y = None
        self.end_x = None
        self.end_y = None
        self.angle = 0
        self.line = None
        self.zoom_factor = 1.0  # For zooming
        self.original_image_size = (1624, 1234)  # Fixed display size (width, height) at zoom 1.0
        self.selecting_orientation = True  # Switch between orientation and rectangle drawing
        self.step_message = "Load your images by clicking on Load Images.\nNavigate to the folder containing the images for labeling and select it (make sure you are inside the folder)."

        self.modified_image = None  # For storing the modified image with drawings
        self.position = 1  # Position in the current image

        # Create result folder if it doesn't exist
        os.makedirs("result", exist_ok=True)

        # Setup layout
        self.setup_layout()

    # ==================================== GUI LAYOUT ====================================

    def setup_layout(self):
        # Image name label
        self.image_name_label = tk.Label(self.root, text="Image: ", font=('Helvetica', 16))
        self.image_name_label.grid(row=0, column=0, columnspan=2, padx=10, pady=10, sticky="nsew")

        # Canvas for image display
        self.canvas = tk.Canvas(self.root, cursor="cross", width=self.original_image_size[0], height=self.original_image_size[1])
        self.canvas.grid(row=1, column=0, padx=10, pady=10, columnspan=2, sticky="nsew")
        self.canvas.bind("<ButtonPress-1>", self.on_button_press)
        self.canvas.bind("<B1-Motion>", self.on_move_press)
        self.canvas.bind("<ButtonRelease-1>", self.on_button_release)

        # Bind arrow keys for movement and +, - for zoom
        self.root.bind("<Up>", self.pan_up)
        self.root.bind("<Down>", self.pan_down)
        self.root.bind("<Left>", self.pan_left)
        self.root.bind("<Right>", self.pan_right)
        self.root.bind("<plus>", self.zoom_in)
        self.root.bind("<minus>", self.zoom_out)

        # Step label
        self.step_label = tk.Label(self.root, text=self.step_message, font=('Helvetica', 14))
        self.step_label.grid(row=2, column=0, columnspan=2, padx=10, pady=10)

        # CSV frame and display
        self.csv_frame = tk.Frame(self.root)
        self.csv_frame.grid(row=1, column=2, padx=20, sticky="nsew")
        self.csv_label = tk.Label(self.csv_frame, text="CSV Content (Real-Time):", font=('Helvetica', 16))
        self.csv_label.pack()
        self.csv_textbox = tk.Text(self.csv_frame, width=40, height=20, font=('Helvetica', 8))
        self.csv_textbox.pack(fill="both", expand=True)

        # Controls Frame
        self.controls_frame = tk.Frame(self.root)
        self.controls_frame.grid(row=3, column=0, columnspan=2, pady=10)

        # Buttons
        self.prev_button = tk.Button(self.controls_frame, text="Previous Image", command=self.prev_image, font=('Helvetica', 14))
        self.next_button = tk.Button(self.controls_frame, text="Next Image", command=self.next_image, font=('Helvetica', 14))
        self.prev_button.grid(row=0, column=0, padx=20, pady=10)
        self.next_button.grid(row=0, column=1, padx=20, pady=10)

        self.load_button = tk.Button(self.controls_frame, text="Load Images", command=self.load_images, font=('Helvetica', 14))
        self.load_button.grid(row=1, column=0, columnspan=2, padx=10, pady=20)

        self.quit_button = tk.Button(self.controls_frame, text="Quit", command=self.root.destroy, font=('Helvetica', 14))
        self.quit_button.grid(row=2, column=0, columnspan=2, pady=10)

        # Status message
        self.status_message = tk.Label(self.controls_frame, text="", font=('Helvetica', 16))
        self.status_message.grid(row=3, column=0, columnspan=2, padx=10, pady=10)
        
        # Instruction text for zoom and navigation
        self.instructions_label = tk.Label(self.root, text="Instructions:\n- Use the + key to zoom in\n- Use the - key to zoom out\n- Use arrow keys to move the image", font=('Helvetica', 12), justify="left")
        self.instructions_label.grid(row=2, column=2, padx=20, pady=10, sticky="nsew")

    # ==================================== IMAGE RELATED FUNCTIONS ====================================
    
    def load_images(self):
        # Load images from folder
        self.image_folder = filedialog.askdirectory()
        if not self.image_folder:
            self.status_message.config(text="No folder selected!", fg="red")
            return
        
        # Load all image files with supported extensions (case-insensitive), in a stable order
        self.image_files = sorted(
            f for f in os.listdir(self.image_folder)
            if f.lower().endswith(('.png', '.jpg', '.jpeg'))
        )
        
        if self.image_files:
            # Automatically start from the first image
            self.image_index = 0  # Reset to the first image
            self.current_image = self.image_files[self.image_index]
            self.display_image(self.current_image)  # Show the first image
            self.status_message.config(text=f"{len(self.image_files)} images loaded.", fg="green")
            self.step_message = "Draw a line from the center of the leftmost digit to the center of the rightmost digit to indicate orientation"
            self.step_label.config(text=self.step_message)
        else:
            self.status_message.config(text="No images found!", fg="red")


    def display_image(self, image_name):
        # Display the image at full size and prepare for modifications
        image_path = os.path.join(self.image_folder, image_name)
        img = Image.open(image_path)
        self.original_image = img.copy()  # Make a copy to preserve original
        self.modified_image = img  # Use this to draw directly on the image

        self.update_zoomed_image()

    def next_image(self):
        # Switch to the next image and reset position
        if self.image_index < len(self.image_files) - 1:
            self.image_index += 1
            self.zoom_factor = 1.0  # Reset zoom factor when switching images
            self.current_image = self.image_files[self.image_index]
            self.selecting_orientation = True  # Restart line selection process
            self.position = 1  # Reset position for new image
            self.step_message = "Draw a line from the center of the leftmost digit to the center of the rightmost digit to indicate orientation"
            self.step_label.config(text=self.step_message)
            self.display_image(self.current_image)

    def prev_image(self):
        if self.image_index > 0:
            self.image_index -= 1
            self.zoom_factor = 1.0  # Reset zoom factor when switching images
            self.current_image = self.image_files[self.image_index]
            self.selecting_orientation = True  # Restart line selection process
            self.position = 1  # Reset position for new image
            self.step_message = "Draw a line from the center of the leftmost digit to the center of the rightmost digit to indicate orientation"
            self.step_label.config(text=self.step_message)
            self.display_image(self.current_image)

    def update_zoomed_image(self):
        # Update the image display based on zoom level
        zoomed_width = int(self.original_image_size[0] * self.zoom_factor)
        zoomed_height = int(self.original_image_size[1] * self.zoom_factor)
        zoomed_image = self.modified_image.resize((zoomed_width, zoomed_height), Image.LANCZOS)
        self.tk_image = ImageTk.PhotoImage(zoomed_image)

        self.canvas.delete("all")
        self.canvas.create_image(0, 0, image=self.tk_image, anchor="nw")
        self.canvas.config(scrollregion=self.canvas.bbox(tk.ALL))

    def zoom_in(self, event):
        # Zoom in, increase the zoom factor
        self.zoom_factor *= 1.1
        self.update_zoomed_image()

    def zoom_out(self, event):
        # Zoom out, but never below 1.0 (the original display size).
        # max() also absorbs floating-point drift from repeated * 1.1 and / 1.1.
        self.zoom_factor = max(1.0, self.zoom_factor / 1.1)
        self.update_zoomed_image()

    def pan_up(self, event):
        # Move the canvas view up
        self.canvas.yview_scroll(-1, "units")

    def pan_down(self, event):
        # Move the canvas view down
        self.canvas.yview_scroll(1, "units")

    def pan_left(self, event):
        # Move the canvas view left
        self.canvas.xview_scroll(-1, "units")

    def pan_right(self, event):
        # Move the canvas view right
        self.canvas.xview_scroll(1, "units")

    # ==================================== LABELING RELATED FUNCTIONS ====================================   
    
    def on_button_press(self, event):
        # Convert the click from window coordinates to canvas coordinates (accounts
        # for panning), then to original image coordinates (accounts for zoom)
        canvas_x = self.canvas.canvasx(event.x)
        canvas_y = self.canvas.canvasy(event.y)
        self.start_x = canvas_x / self.zoom_factor
        self.start_y = canvas_y / self.zoom_factor

        if self.selecting_orientation:
            # Start the orientation line at the click position, in canvas coordinates
            self.line = self.canvas.create_line(
                canvas_x, canvas_y, canvas_x, canvas_y, fill="blue", width=2
            )
        else:
            # Begin drawing the inclined rectangle; its corners are set in on_move_press
            self.rect = self.canvas.create_polygon(
                0, 0, 0, 0, 0, 0, 0, 0, outline="red", fill="", width=2
            )

    def on_move_press(self, event):
        # Get current mouse position in the original image coordinates
        cur_x = self.canvas.canvasx(event.x) / self.zoom_factor
        cur_y = self.canvas.canvasy(event.y) / self.zoom_factor

        if self.selecting_orientation:
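            # Update the line on the canvas; both endpoints are in canvas coordinates
            # so the preview stays under the cursor even when the view is panned
            self.canvas.coords(
                self.line,
                self.start_x * self.zoom_factor,
                self.start_y * self.zoom_factor,
                cur_x * self.zoom_factor,
                cur_y * self.zoom_factor
            )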
        else:
            # Calculate the rectangle's coordinates in the original image space
            rect_coords = self.get_rect_coords_based_on_angle(self.start_x, self.start_y, cur_x, cur_y, self.angle)

            # Convert the original image coordinates back to zoomed canvas coordinates for display
            scaled_coords = [coord * self.zoom_factor for coord in rect_coords]

            # Update the rectangle drawing on the zoomed canvas
            self.canvas.coords(self.rect, *scaled_coords)


    def get_rect_coords_based_on_angle(self, x1, y1, x2, y2, angle):
        """
        Given the start point (x1, y1) and a second point (x2, y2),
        return the coordinates of the rectangle's four corners.
        The rectangle will be aligned with the given angle.
        """
        # Calculate the projected width and height based on the angle
        dx = x2 - x1
        dy = y2 - y1

        # Project the movement onto the angle direction (width along the line)
        width = dx * math.cos(angle) + dy * math.sin(angle)

        # Project the movement onto the perpendicular direction (height orthogonal to the line)
        height = abs(dx * math.sin(angle) - dy * math.cos(angle))

        # Use trigonometry to calculate the corners of the rectangle based on width and height
        cos_a = math.cos(angle)
        sin_a = math.sin(angle)

        # Top-left corner (starting point)
        x_tl = x1
        y_tl = y1

        # Top-right corner (move width along the angle)
        x_tr = x1 + width * cos_a
        y_tr = y1 + width * sin_a

        # Bottom-left corner (move height perpendicular to the angle)
        x_bl = x1 - height * sin_a
        y_bl = y1 + height * cos_a

        # Bottom-right corner (move both width along the angle and height perpendicular to it)
        x_br = x_tr - height * sin_a
        y_br = y_tr + height * cos_a

        return [x_tl, y_tl, x_tr, y_tr, x_br, y_br, x_bl, y_bl]

    def on_button_release(self, event):
        if self.selecting_orientation:
            # Capture the end point in original image coordinates
            self.end_x = self.canvas.canvasx(event.x) / self.zoom_factor
            self.end_y = self.canvas.canvasy(event.y) / self.zoom_factor

            # Calculate the angle in the original image coordinates
            self.angle = math.atan2(self.end_y - self.start_y, self.end_x - self.start_x)
            print(f"Orientation angle: {math.degrees(self.angle)} degrees")

            if messagebox.askyesno("Validate", "Do you want to keep this line?"):
                # Draw the line on the unzoomed image only once it is confirmed,
                # so a rejected line never ends up in the saved result
                draw = ImageDraw.Draw(self.modified_image)
                draw.line([(self.start_x, self.start_y), (self.end_x, self.end_y)], fill="blue", width=2)

                self.selecting_orientation = False
                self.status_message.config(text="")
                self.step_message = "Now draw a rectangle around each digit, in reading order.\nFor each digit, start from its top-left corner, regardless of the orientation."
                self.step_label.config(text=self.step_message)
            else:
                # Rejected: remove the preview line so the user can redraw it
                self.canvas.delete(self.line)
                self.line = None
        else:
            # Capture the end point in original image coordinates
            self.end_x = self.canvas.canvasx(event.x) / self.zoom_factor
            self.end_y = self.canvas.canvasy(event.y) / self.zoom_factor

            # Get the rectangle coordinates in the original image space
            rect_coords = self.get_rect_coords_based_on_angle(self.start_x, self.start_y, self.end_x, self.end_y, self.angle)

            # Ask for label input
            label = simpledialog.askstring("Input", "Enter number for the selected area", parent=self.root)
            if label and label.strip().isdigit():
                label = label.strip()

                # Draw the rectangle on the unzoomed image only once the label is valid,
                # so a rejected rectangle never ends up in the saved result
                draw = ImageDraw.Draw(self.modified_image)
                draw.polygon(rect_coords, outline="red", width=2)

                text_x = (rect_coords[0] + rect_coords[2]) / 2
                text_y = min(rect_coords[1], rect_coords[3]) - 20
                try:
                    font = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf", 40)
                except OSError:
                    # Font file not available (e.g. on Windows or macOS): use Pillow's built-in font
                    font = ImageFont.load_default()
                draw.text((text_x, text_y), label, fill="red", font=font)
                self.record_annotation(label, rect_coords)
                self.step_message = "Rectangle saved! Draw the next one or, if you are done, go to the next image."
                self.step_label.config(text=self.step_message)
            else:
                messagebox.showwarning("Invalid Input", "Please enter a valid number!")
                self.canvas.delete(self.rect)
                self.rect = None


    def record_annotation(self, label, rect_coords):
        # Keep only the top-left and bottom-right coordinates
        x_tl, y_tl = rect_coords[0], rect_coords[1]  # Top-left corner
        x_br, y_br = rect_coords[4], rect_coords[5]  # Bottom-right corner

        # Append the annotation to the list (including position)
        self.annotations.append((self.current_image, self.position, label, int(x_tl), int(y_tl), int(x_br), int(y_br)))

        # Increment position (reset if greater than 11)
        self.position += 1
        if self.position > 11:
            self.position = 1

        # Save annotations to CSV
        self.save_annotations_to_csv()

        # Save the modified image in the 'result' folder
        result_path = os.path.join("result", self.current_image)
        self.modified_image.save(result_path)
        print(f"Modified image saved as {result_path}")

    def save_annotations_to_csv(self):
        # Save annotations to a CSV file
        with open('annotations.csv', 'w', newline='') as file:
            writer = csv.writer(file)
            writer.writerow(["Image Name", "Position", "Label", "Top-Left X", "Top-Left Y", "Bottom-Right X", "Bottom-Right Y"])
            writer.writerows(self.annotations)
        self.update_csv_display()

    def update_csv_display(self):
        # Update the CSV display
        self.csv_textbox.delete("1.0", tk.END)
        for annotation in self.annotations:
            self.csv_textbox.insert(tk.END, f"{annotation}\n")

# Main setup
if __name__ == "__main__":
    root = tk.Tk()
    app = ImageAnnotationTool(root)
    root.mainloop()


Orientation angle: 32.8245136854354 degrees
Modified image saved as result/img_2431.png
Modified image saved as result/img_2431.png
Modified image saved as result/img_2431.png
Modified image saved as result/img_2431.png
