# Week 1: Kaggle Setup

Welcome to the AI for Good course! For the hands-on coding in this course, we will use Kaggle as a development environment. Kaggle is great because it can host notebooks, datasets, and models, and gives you 30 hours of GPU usage per week for free. The aim of this notebook is to get you set up in the Kaggle environment. In it, we'll read and write some data, run a function, and create a plot so you are good to go for the coming weeks. You are welcome to use your own development environment if you prefer, but you'll need to manage the packages and files yourself, and you won't receive support from us if your environment is causing bugs, so proceed at your own risk!

For this notebook, since we'll use an LLM, enable the GPU in the right-hand panel by selecting "GPU T4 x2", then check that it is working with the device check in the "Prompting a Multimodal LLM with Images and Text" section below. If `cpu` is returned, try again. Note that you only have 30 hours per week, so use the GPU only when the computation needs it. Learn more about the GPU options: https://www.linkedin.com/pulse/kaggle-accelerators-comparison-rukshar-alam-ki9bc/

Below is the cell that loads in a fresh Kaggle notebook; it includes some details on the resources available. You can explore more by selecting View (at the top right of the page) -> Show sidebar. The sidebar (on the right of the page) has clickable settings for the notebook, including the compute resources used, the location of persistent storage, and more.

In [1]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session
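
The comments above distinguish `/kaggle/working/` (kept when a version is saved) from `/kaggle/temp/` (discarded after the session). As a minimal sketch, a write-then-read round trip could look like the following. The helper name `round_trip` and the file name `hello.csv` are illustrative, and the fallback to a temporary directory is an assumption so that the snippet also runs outside Kaggle.

```python
import csv
import tempfile
from pathlib import Path

def round_trip(rows, directory):
    """Write rows to a CSV file in `directory`, read them back, and return them."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "hello.csv"  # hypothetical file name
    with path.open("w", newline="") as f:
        csv.writer(f).writerows(rows)
    with path.open(newline="") as f:
        return [row for row in csv.reader(f)]

# Use the persistent Kaggle directory when it exists, otherwise a temporary one
kaggle_working = Path("/kaggle/working")
target = kaggle_working if kaggle_working.is_dir() else Path(tempfile.mkdtemp())
rows = [["week", "topic"], ["1", "Kaggle setup"]]
print(round_trip(rows, target))
```

Note that `csv.reader` returns every value as a string, so numbers written out come back as text.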

## Test read, write 

First, let's grab some images using the following package: https://pypi.org/project/bing-image-downloader/. For this section, ensure that you have internet enabled in the notebook settings in the right-hand panel. You'll need a verified phone number on your Kaggle profile in order to do so.

In [None]:
%pip install bing-image-downloader

In [None]:
import os
import shutil
from bing_image_downloader import downloader
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

In [None]:
downloader.download("child sad", limit=5, output_dir="images", adult_filter_off=False)

You can see on the side panel in "Output" that the image files are under "kaggle/working/images". Let's use `ls` to list the files here.

In [None]:
%ls /kaggle/working/images/'child sad'/
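
The `%ls` magic is convenient for a quick look, but its output cannot easily be reused in later code. A small sketch of the same listing in plain Python follows; the helper name `list_files` is illustrative, and returning an empty list for a missing directory is a design choice made here rather than anything the course requires.

```python
from pathlib import Path

def list_files(directory):
    """Return (name, size_in_bytes) for each regular file in `directory`, sorted by name."""
    directory = Path(directory)
    if not directory.is_dir():
        return []  # nothing to list if the directory does not exist yet
    return sorted((p.name, p.stat().st_size) for p in directory.iterdir() if p.is_file())

for name, size in list_files("/kaggle/working/images/child sad"):
    print(f"{size:>10,d}  {name}")
```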

In [None]:
import os
from PIL import Image
import matplotlib.pyplot as plt

def load_and_display_images(image_dir):
    """
    Load images from a directory, store them in a list, 
    and display their thumbnails with names.

    Parameters:
    image_dir (str): Directory containing image files.
    
    Returns:
    images (list): List of PIL image objects.
    image_names (list): List of image filenames.
    """
    # 1. Load all images and store them in a list
    images = []
    image_names = []
    
    for filename in sorted(os.listdir(image_dir)):
        if filename.lower().endswith(('.jpg', '.jpeg')):
            image_path = os.path.join(image_dir, filename)
            images.append(Image.open(image_path))
            image_names.append(filename)

    # Nothing to plot if the directory holds no JPEG files
    if not images:
        print(f"No JPEG images found in {image_dir}")
        return images, image_names

    # 2. Plot thumbnails of the images with their names
    fig, axes = plt.subplots(1, len(images), figsize=(15, 5))
    if len(images) == 1:
        axes = [axes]  # Ensure axes is iterable when there's only 1 image

    for ax, img, name in zip(axes, images, image_names):
        ax.imshow(img.resize((128, 128)))  # Resize for thumbnail
        ax.set_title(name, fontsize=8)
        ax.axis('off')

    plt.tight_layout()
    plt.show()

    return images, image_names


In [None]:
# Example usage
image_dir = '/kaggle/working/images/child sad/'
images, image_names = load_and_display_images(image_dir)

Now, let's grab some other, clearly different, images. Let's use the Bing Image Downloader again to get some images of happy children. Go ahead and write this line of code yourself. Limit the search to 5 images, output to the "images" directory, and leave the adult filter on.

In [None]:
# Use the Bing Image Downloader again to get some images of "child happy"
# Limit the search to 5 images, output to the "images" directory, and leave the adult filter on.

Let's put these in the same directory to mix things up a bit.

In [None]:
source_dir = '/kaggle/working/images/child happy/'
destination_dir = '/kaggle/working/images/child sad/'

os.makedirs(destination_dir, exist_ok=True)

for index, file in enumerate(os.listdir(source_dir)):
    src = os.path.join(source_dir, file)
    if os.path.isfile(src):
        shutil.move(src, os.path.join(destination_dir, f"image{index}{os.path.splitext(file)[1]}"))

%ls /kaggle/working/images/'child happy'/
!rm -r /kaggle/working/images/'child happy'
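
The loop above renames each moved file to `image{index}` plus its extension. If the destination already contained a file with that exact name, the move could replace it. A hedged sketch of a variant that skips names already in use is shown below; the function name `move_without_overwrite` and its `prefix` parameter are illustrative and not part of the original notebook.

```python
import shutil
from pathlib import Path

def move_without_overwrite(source_dir, destination_dir, prefix="image"):
    """Move files from source_dir to destination_dir, skipping names already taken."""
    source_dir, destination_dir = Path(source_dir), Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    counter = 0
    for src in sorted(source_dir.iterdir()):
        if not src.is_file():
            continue
        # Advance the counter until the candidate name is free
        while True:
            dst = destination_dir / f"{prefix}{counter}{src.suffix}"
            counter += 1
            if not dst.exists():
                break
        shutil.move(str(src), str(dst))
        moved.append(dst.name)
    return moved
```

In this notebook it could be called as `move_without_overwrite('/kaggle/working/images/child happy', '/kaggle/working/images/child sad')`, provided the source directory exists.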


Now our directory has a mix of happy and sad images.

In [None]:
%ls /kaggle/working/images/'child sad'/

images, image_names = load_and_display_images('/kaggle/working/images/child sad/')

## Prompting a Multimodal LLM with Images and Text

In [None]:
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

print(f"Using device: {device}")
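
The check above reports only whether a CUDA device is usable. Since the "GPU T4 x2" option provides two GPUs, it can also be helpful to see how many devices are visible and what they are called. The following is a minimal sketch; the helper name `describe_accelerators` is illustrative, and the `ImportError` fallback is an assumption added so the snippet also runs where PyTorch is not installed.

```python
def describe_accelerators():
    """Return the names of visible CUDA devices, or an empty list if none are usable."""
    try:
        import torch  # preinstalled on Kaggle; may be absent elsewhere
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]

names = describe_accelerators()
print(f"{len(names)} GPU(s) visible: {names}" if names else "No GPU visible; running on CPU")
```

With "GPU T4 x2" enabled, one would expect two entries in the list; an empty list suggests the accelerator setting has not taken effect.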

Suppose we hadn't obtained these images from the internet ourselves, and imagine we had millions of pictures to evaluate, not just these. How might we assess what we have? One option is to try some off-the-shelf models to extract text-based meaning from the images, and then see what we can enumerate from there. We'll be using the Hugging Face APIs for this: https://huggingface.co/tasks/image-text-to-text

Below is a function that takes a random selection of images and some text input and outputs some text for each image. It uses a small model (~1.7GB, 0.5B parameters), so it is not massively accurate. Still, the individual descriptions are interesting. Don't worry too much about the technical details right now. This is just a demo setup notebook, and in future sessions we'll precede any notebook work with the conceptual and theoretical background. If you're interested, here is the background paper: https://arxiv.org/pdf/2407.07895.

In [None]:
import os
import random
import torch
import matplotlib.pyplot as plt
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

# Load the model and processor once to avoid reloading them for each image
model_id = "llava-hf/llava-interleave-qwen-0.5b-hf"
device = "cuda:0" if torch.cuda.is_available() else "cpu"  # Use the GPU when one is available

# Load model and processor
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, 
    low_cpu_mem_usage=True
).to(device)

processor = AutoProcessor.from_pretrained(model_id)

# Function to process and generate response for a given image and prompt
def generate_response(image_path, prompt):
    # Load image from the path
    image = Image.open(image_path).convert("RGB")

    # Define a chat history and use `apply_chat_template` to get the correctly formatted prompt
    conversation = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image"},
            ],
        },
    ]

    # Apply chat template to the conversation
    formatted_prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)

    # Process the inputs (text and image)
    inputs = processor(images=image, text=formatted_prompt, return_tensors="pt").to(device)

    # Generate the output from the model
    output = model.generate(**inputs, max_new_tokens=100)

    # Decode and return the response
    return processor.decode(output[0], skip_special_tokens=True), image

# Function to randomly select a specified number of images from a directory and process them with a given prompt
def process_multiple_random_images_from_directory(directory, prompt, num_images=5):
    # List all image files in the directory (assuming image files are .jpg, .jpeg, or .png)
    image_files = [f for f in os.listdir(directory) if f.lower().endswith(('jpg', 'jpeg', 'png'))]
    
    # If no image files are found, return
    if not image_files:
        print("No image files found in the directory.")
        return

    # If the requested number of images exceeds the available ones, adjust
    num_images = min(num_images, len(image_files))
    
    # Randomly select the specified number of images
    selected_image_files = random.sample(image_files, num_images)
    
    # Process each randomly selected image
    for selected_image_file in selected_image_files:
        image_path = os.path.join(directory, selected_image_file)
        print(f"Processing randomly selected image: {selected_image_file}")
        response, image = generate_response(image_path, prompt)
        
        # Display the image and the response
        display_image_with_response(image, response, selected_image_file)

# Function to display an image and its corresponding model response
def display_image_with_response(image, response, image_file):
    # Create a figure and axes
    plt.figure(figsize=(8, 8))
    
    # Display the image
    plt.imshow(image)
    plt.axis('off')  # Hide axes
    
    # Display the response as a title
    plt.title(f"Response for {image_file}:\n{response}", fontsize=12)
    
    # Show the image with response
    plt.show()
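
`generate_response` decodes the whole output sequence, which for decoder-style models includes the prompt as well as the answer, so the prompt text is echoed back in each response. One common way to keep only the answer is to drop the first `prompt_length` token ids before decoding. The sketch below illustrates the idea on plain Python lists with made-up token ids; the helper name `keep_new_tokens` is hypothetical.

```python
def keep_new_tokens(output_ids, prompt_length):
    """Return only the newly generated token ids, dropping the echoed prompt."""
    return list(output_ids[prompt_length:])

# Toy example with made-up token ids: the first three ids stand in for the prompt
prompt_ids = [101, 7, 42]
output_ids = prompt_ids + [13, 99]
print(keep_new_tokens(output_ids, len(prompt_ids)))  # [13, 99]
```

Inside `generate_response`, the same slicing could be applied as `output[0][inputs["input_ids"].shape[1]:]` before calling `processor.decode`, assuming the generated sequence begins with the input ids.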


In [None]:
directory = "/kaggle/working/images/child sad/"
prompt = "Describe the contents of this image."

# Example call to process 1 random image from a given directory
process_multiple_random_images_from_directory(directory, prompt, num_images=1)

Below we check out the ability of the model to capture sentiment.

In [None]:
directory = "/kaggle/working/images/child sad/"
prompt = "Is this image a happy scene or not?"

process_multiple_random_images_from_directory(directory, prompt, num_images=1)


Redo the above with your own prompt. Be creative!

In [None]:
# query the model with your own prompt
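
If a sentiment prompt were run over many images, the free-text replies would need to be reduced to counts before they could be compared. The sketch below shows one very crude way to do that. The keyword rules in `label_response` are an assumption made for illustration, not something the model provides, and the example replies are made up rather than real model output.

```python
from collections import Counter

def label_response(response):
    """Crude keyword heuristic (an assumption): map a free-text reply to a label."""
    text = response.lower()
    if "not happy" in text or "unhappy" in text or "sad" in text:
        return "not happy"
    if "happy" in text:
        return "happy"
    return "unclear"

def tally_labels(responses):
    """Count how many replies fall under each label."""
    return Counter(label_response(r) for r in responses)

example_replies = [  # made-up replies for illustration only
    "The child looks happy and is smiling.",
    "No, the child looks sad.",
    "A child sitting on a bench.",
]
print(tally_labels(example_replies))
```

Substring matching like this will misfire on many replies, so the counts are only a rough first look.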


One interesting thing about the model above is that it can be used to query multiple images at once. This might be useful for assessing the contents across the set of images we have, and might be worth exploring later. How do you think it would do processing multiple images at once?
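
As a starting point for that exploration, the chat format used in `generate_response` can carry more than one image placeholder in a single user turn. The sketch below only builds that conversation structure; the helper name `build_multi_image_conversation` is illustrative, and how well the small model answers such prompts is left open.

```python
def build_multi_image_conversation(prompt, num_images):
    """Build a single-turn chat with one text entry followed by `num_images` image placeholders."""
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    content = [{"type": "text", "text": prompt}]
    content += [{"type": "image"} for _ in range(num_images)]
    return [{"role": "user", "content": content}]

conversation = build_multi_image_conversation("What differs between these images?", 2)
print(conversation)
```

The result could then be passed to `processor.apply_chat_template` as in `generate_response`, with a list of PIL images given to the processor in the same order as the placeholders.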

## Clean Up

If you want to clean up your working directory, you can uncomment and run the following.

In [None]:
#!rm -r /kaggle/working/*
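
A Python alternative to the shell command is sketched below. It empties a directory while keeping the directory itself, which avoids removing the folder the notebook is currently running in. The helper name `clear_directory` is illustrative, and the demonstration runs on a throwaway temporary directory so that nothing real is deleted.

```python
import shutil
import tempfile
from pathlib import Path

def clear_directory(directory):
    """Delete everything inside `directory` but keep the directory itself.

    Returns the number of top-level entries removed.
    """
    directory = Path(directory)
    removed = 0
    if not directory.is_dir():
        return removed
    for entry in directory.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a subdirectory and its contents
        else:
            entry.unlink()  # remove a file or symlink
        removed += 1
    return removed

# Demonstrated on a throwaway directory so nothing real is deleted
demo = Path(tempfile.mkdtemp())
(demo / "images").mkdir()
(demo / "images" / "a.jpg").write_text("x")
(demo / "notes.txt").write_text("y")
print(clear_directory(demo), "entries removed;", "directory still exists:", demo.is_dir())
```

To apply it to the notebook's output, the call would be `clear_directory('/kaggle/working')`; as with the shell command, this cannot be undone.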

## Assignment

### Part 1: Kaggle Setup
This week, make sure you have done the following:
1. Created an account on Kaggle.
2. Imported this notebook into Kaggle and run all of the cells.
3. Filled in the empty code blocks as prompted.

Deliverable: upload a screenshot of your Kaggle profile to Canvas so that we can share private datasets with you in future weeks.

### Part 2: Python Assessments
To see where everyone is with their familiarity with Python and ML, we'll have you complete 3 assessments on DataCamp. You will need to create an account; the free tier is enough. You are not being graded on how well you do, but we want to see where everyone is so we can figure out how best to support you. Go to https://www.datacamp.com/signal#assessments and take the following 3 assessments, screenshot each result, and upload the screenshots to Canvas:
1. Python Programming
2. Data Manipulation with Python
3. Machine Learning Fundamentals in Python

### Bonus
Set up a personal website with a data science portfolio section where you can upload your work from each week. This is a great way to showcase your work to future employers or advisors. You can use a free service like GitHub Pages, Netlify, WordPress, Wix, or Squarespace to create your site. Here's Isaiah's as an example: https://isaiahlg.com/portfolio/home.html.
