# Build an AI stylist with IBM Granite using watsonx.ai
**Authors:** Anna Gutowska, Ash Minhas

In this tutorial, you will build a generative AI-powered personal stylist. The tutorial uses the [IBM Granite™ Vision 3.2](https://www.ibm.com/granite) [large language model (LLM)](https://www.ibm.com/think/topics/large-language-models) to process image input and [Granite 3.2](https://www.ibm.com/granite), with its enhanced reasoning capabilities, to formulate customizable outfit ideas.

## Introduction
How often do you find yourself thinking, “What should I wear today? I don’t even know where to start with picking items from my closet!” This dilemma is one that many of us share. With cutting-edge [artificial intelligence (AI)](https://www.ibm.com/think/topics/artificial-intelligence) models, it no longer needs to be a daunting task.

## AI styling: How it works
Our AI-driven solution is composed of the following stages:

1. The user uploads images of their current wardrobe or even items in their wishlist, one item at a time.

2. The user selects the following criteria:
    * Occasion: casual or formal.
    * Time of day: morning, afternoon or evening.
    * Season of the year: winter, spring, summer or fall.
    * Location (for example, a coffee shop).

3. Upon submission of the input, the [multimodal](https://www.ibm.com/think/topics/multimodal-ai) Granite Vision 3.2 model iterates over the list of images and returns the following output:
    * Description of the item.
    * Category: shirt, pants or shoes.
    * Occasion: casual or formal.

4. The Granite 3.2 model with enhanced reasoning then serves as a fashion stylist. The LLM uses the Vision model’s output to provide an outfit recommendation that is suitable for the user’s event.

5. The user receives the outfit suggestion, a data frame of the items they uploaded and the images of the items in the personalized recommendation.

For the best user experience, we recommend cloning the repository in step 3. This tutorial does not cover every available feature.

# Prerequisites

You need an [IBM Cloud® account](https://cloud.ibm.com/registration) to create a [watsonx.ai™](https://www.ibm.com/products/watsonx-ai) project.

# Steps

In order to use the watsonx [application programming interface (API)](https://www.ibm.com/think/topics/api), you will need to complete the following steps.

## Step 1. Set up your environment

1. Log in to [watsonx.ai](https://dataplatform.cloud.ibm.com/registration/stepone?context=wx&apps=all) by using your IBM Cloud account.

2. Create a [watsonx.ai project](https://www.ibm.com/docs/en/watsonx/saas?topic=projects-creating-project).

	You can get your project ID from within your project. Click the **Manage** tab. Then, copy the project ID from the **Details** section of the **General** page. You need this ID for this tutorial.

## Step 2. Set up watsonx.ai Runtime service and API key

1. Create a [watsonx.ai Runtime](https://cloud.ibm.com/catalog/services/watsonxai-runtime) service instance (choose the Lite plan, which is a free instance).

2. Generate an [API Key](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/ml-authentication.html). 

3. Associate the watsonx.ai Runtime service to the project that you created in [watsonx.ai](https://dataplatform.cloud.ibm.com/docs/content/wsj/getting-started/assoc-services.html?context=cpdaas). 

## Step 3. Clone the repository (optional)

For a more interactive experience when using this AI tool, clone the [GitHub repository](https://github.com/IBM/ibmdotcom-tutorials) and follow the setup instructions in the README.md file within the AI stylist project to launch the Streamlit application on your local machine. Otherwise, if you prefer to follow along step-by-step, create a Jupyter Notebook and continue with this tutorial.

## Step 4. Install and import relevant libraries and set up your credentials

We need a few libraries and modules for this tutorial. Make sure to import the following ones; if they're not installed, you can resolve this issue with a quick pip installation.

In [None]:
# Install required packages
!pip install -q image ibm-watsonx-ai

In [None]:
# Required imports
import getpass, os, base64
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

To set our credentials, we need the `WATSONX_PROJECT_ID` you copied in step 1 and the `WATSONX_APIKEY` you generated in step 2. We will also set the URL serving as the API endpoint.

In [None]:
WATSONX_APIKEY = getpass.getpass("Please enter your watsonx.ai Runtime API key (hit enter): ")

WATSONX_PROJECT_ID = getpass.getpass("Please enter your project ID (hit enter): ")

URL = "https://us-south.ml.cloud.ibm.com"

We can use the `Credentials` class to encapsulate our passed credentials.

In [None]:
credentials = Credentials(
    url=URL,
    api_key=WATSONX_APIKEY
)
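
If you rerun the notebook often, typing both values each time can get tedious. The following is a minimal sketch, not part of the original tutorial, that checks an environment variable first and only prompts when it is unset. The helper name `read_secret` and its `env` and `ask` parameters are hypothetical; the variable names match the ones used above.

```python
import getpass
import os


def read_secret(name, prompt, env=None, ask=getpass.getpass):
    """Return environment variable `name`, prompting only if it is unset or blank.

    `read_secret` is a hypothetical helper. `env` and `ask` are injectable so
    the function can be exercised without a terminal.
    """
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if value:
        return value
    return ask(prompt).strip()


# Example usage (prompts only when the variable is not already exported):
# WATSONX_APIKEY = read_secret("WATSONX_APIKEY", "watsonx.ai Runtime API key: ")
# WATSONX_PROJECT_ID = read_secret("WATSONX_PROJECT_ID", "Project ID: ")
```

Stripping whitespace guards against a stray newline or space picked up when pasting a key.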

## Step 5. Set up the API request for the Granite Vision model

The `augment_api_request_body` function takes the user query and a Base64-encoded image as parameters and builds the message body of the API request. We will call this function each time we query the Vision model.


In [None]:
def augment_api_request_body(user_query, image):
    messages = [
        {
            "role": "user",
            "content": [{
                "type": "text",
                "text": user_query
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/jpeg;base64,{image}"
                }
            }]
        }
    ]
    
    return messages
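
The function above labels every image as `image/jpeg`, although step 6 also accepts `.png` files. Whether the endpoint depends on an accurate MIME type is not stated in this tutorial, so treat the following as an optional sketch: a hypothetical helper, `image_data_url`, that guesses the MIME type from the file extension with the standard-library `mimetypes` module and falls back to `image/jpeg`.

```python
import mimetypes


def image_data_url(filename, encoded_image):
    """Build a data URL whose MIME type is guessed from the file extension.

    Hypothetical helper: falls back to image/jpeg when the extension is
    unknown or is not an image type, matching the hard-coded value above.
    """
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        mime_type = "image/jpeg"
    return f"data:{mime_type};base64,{encoded_image}"
```

To use it, you would pass the file name alongside the encoded image and place the returned string in the `"url"` field of the `image_url` entry.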

We can also instantiate the model interface using the `ModelInference` class. 

In [None]:
model = ModelInference(
    model_id="ibm/granite-vision-3-2-2b",
    credentials=credentials,
    project_id=WATSONX_PROJECT_ID,
    params={
        "max_tokens": 400,
        "temperature": 0,
    },
)

## Step 6. Encode images

To pass our images to the LLM, we Base64-encode the raw bytes of each file and then decode the result to a UTF-8 string. In this case, our images are located in the local `images` directory. You can find sample images in the AI stylist directory in our [GitHub repository](https://github.com/IBM/ibmdotcom-tutorials).

In [None]:
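directory = "images"  # directory that holds the wardrobe photos
images = []

for filename in sorted(os.listdir(directory)):
    # Keep only the supported image files
    if filename.endswith((".jpeg", ".png")):
        filepath = os.path.join(directory, filename)
        with open(filepath, "rb") as f:
            # Base64-encode the raw bytes, then decode to a UTF-8 string
            images.append(base64.b64encode(f.read()).decode("utf-8"))
        print(filename)  # report only the files that were encoded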
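
The prompt in step 8 asks the reasoning model to include the file name of each image, but the loop above keeps only the encoded data. One way to retain the names is sketched below, assuming you are free to restructure the loading step. The function `load_images` and the constant `IMAGE_SUFFIXES` are hypothetical, and accepting `.jpg` is an addition to the extensions checked above.

```python
import base64
from pathlib import Path

# .jpg is an addition; the tutorial's loop checks only .jpeg and .png
IMAGE_SUFFIXES = {".jpeg", ".jpg", ".png"}


def load_images(directory):
    """Return a list of (filename, base64_string) pairs, sorted by filename."""
    pairs = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            # Base64-encode the raw bytes, then decode to a UTF-8 string
            encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
            pairs.append((path.name, encoded))
    return pairs
```

Sorting makes the order reproducible across runs, because directory listings are not guaranteed to come back in any particular order.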

## Step 7. Categorize input with the Vision model

Now that we have loaded and encoded our images, we can query the Vision model. Our prompt spells out the desired output format to limit the model's creativity, because we want valid JSON. We will store the description, category and occasion of each image in a list called `closet`.

In [None]:
user_query = """Provide a description, category, and occasion for the clothing item or shoes in this image.  

                Classify the category as shirt, pants, or shoes.
                Classify the occasion as casual or formal.
                
                Ensure the output is valid JSON. Do not create new categories or occasions. Only use the allowed classifications.
                
                Your response should be in this schema: 
                {
                    "description": "<description>",
                    "category": "<category>",
                    "occasion": "<occasion>"
                }
                """

closet = []

for image in images:
    message = augment_api_request_body(user_query, image)
    answer = model.chat(messages=message)
    # Keep only the generated text, not the full API response
    content = answer['choices'][0]['message']['content']
    closet.append(content)
    print(content)
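
Even with a strict prompt, a model can wrap its JSON in extra text or return a label outside the allowed set. The sketch below shows one possible way to parse and validate each reply before it goes into the closet. `parse_item` is a hypothetical helper, and the allowed values are the ones named in the prompt above.

```python
import json

ALLOWED_CATEGORIES = {"shirt", "pants", "shoes"}
ALLOWED_OCCASIONS = {"casual", "formal"}


def parse_item(raw_text, filename=None):
    """Extract and validate the JSON object in a Vision model reply.

    Returns a dict, or None if the reply cannot be parsed or uses a label
    outside the allowed classifications. Hypothetical helper.
    """
    # Take the span from the first "{" to the last "}" to skip surrounding text
    start, end = raw_text.find("{"), raw_text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        item = json.loads(raw_text[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(item, dict):
        return None
    category = str(item.get("category", "")).strip().lower()
    occasion = str(item.get("occasion", "")).strip().lower()
    if category not in ALLOWED_CATEGORIES or occasion not in ALLOWED_OCCASIONS:
        return None
    parsed = {
        "description": str(item.get("description", "")).strip(),
        "category": category,
        "occasion": occasion,
    }
    if filename is not None:
        parsed["file_name"] = filename
    return parsed
```

You would pass in the reply text, that is, the value at `['choices'][0]['message']['content']` of the chat response, and skip or retry any image for which the helper returns `None`.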

## Step 8. Generate outfits with the reasoning model

Now that we have each clothing and shoe item categorized, it will be much easier for the reasoning model to generate an outfit for the selected occasion. Let's instantiate and query the reasoning model.

In [None]:
reasoning_model = ModelInference(
    model_id="ibm/granite-3-2-8b-instruct",
    credentials=credentials,
    project_id=WATSONX_PROJECT_ID,
)

In [None]:
occasion = input("Enter the occasion: ")            # casual or formal (e.g. "casual")
time_of_day = input("Enter the time of day: ")      # morning, afternoon or evening (e.g. "morning")
location = input("Enter the location: ")            # any location (e.g. "park")
season = input("Enter the season: ")                # spring, summer, fall or winter (e.g. "fall")

prompt = f"""Use the description, category, and occasion of the clothes in my closet to put together an outfit for a {occasion} {time_of_day} at the {location}.
                The event takes place in the {season} season. Make sure to return only one shirt, one pair of pants, and one pair of shoes.
                Use the description, category, and occasion provided. Do not classify the items yourself. 
                Include the file name of each image in your output along with the file extension. Here are the items in my closet: {closet}"""

messages = [
        {
            "role": "control",
            "content": "thinking"
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt
                }
            ]
        }
    ]

print(reasoning_model.chat(messages=messages)['choices'][0]['message']['content'])
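
The free-text `input` calls above accept any string, and the prompt embeds the closet as a Python list representation. As an optional refinement, the sketch below normalizes the three fixed-choice criteria against the options listed in this tutorial and serializes the closet as JSON. The names `ALLOWED`, `normalize_choice` and `closet_as_json` are hypothetical, and whether JSON formatting changes the model's answer is an assumption that has not been tested here.

```python
import json

# Options as listed in the "AI styling: How it works" section
ALLOWED = {
    "occasion": {"casual", "formal"},
    "time_of_day": {"morning", "afternoon", "evening"},
    "season": {"winter", "spring", "summer", "fall"},
}


def normalize_choice(field, value):
    """Lowercase and validate a user choice; raise ValueError if it is not allowed."""
    cleaned = value.strip().lower()
    if cleaned not in ALLOWED[field]:
        options = ", ".join(sorted(ALLOWED[field]))
        raise ValueError(f"{field} must be one of: {options}")
    return cleaned


def closet_as_json(items):
    """Serialize closet items (dicts or plain strings) as indented JSON text."""
    return json.dumps(items, indent=2, ensure_ascii=False)
```

For example, `normalize_choice("season", input("Enter the season: "))` would reject a typo before any tokens are spent on the reasoning model. The location stays free text, as in the original design.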

## Conclusion

In this tutorial, you built a system that uses AI to provide style advice for a user's specific event. Using photos or screenshots of the user's clothing, the system customizes outfits to meet the specified criteria. The Granite Vision 3.2 model (`granite-vision-3-2-2b`) was critical for labeling and categorizing each item. The Granite 3.2 model (`granite-3-2-8b-instruct`) then used its reasoning capabilities to generate personalized outfit ideas.

Some next steps for building on this application include:
- Customizing outfits to a user's personal style, body type, preferred color palette, and more.
- Broadening the criteria to include jackets and accessories.
    - For example, the system might propose a blazer for a user attending a formal conference in addition to the selected shirt, pants and shoes.
- Serving as a personal shopper by providing e-commerce product recommendations and pricing that align with the user's unique style and budget.
- Adding chatbot functionality to ask the LLM questions about each outfit.
- Providing a virtual try-on experience that uses a user selfie to simulate the final look.
