# Project Overview: Generating Panda-Themed Fashion Images with Hugging Face Models

This project uses models from the Hugging Face Hub to generate images of a stylish, fashionable panda in various runway scenarios. It covers the following:

1. **Prompt-Based Image Generation**: Using descriptive prompts to create fashion images of a panda, such as "A fashionable female panda walks elegantly on the runway."

2. **Model Download and Setup**: Shows how to download and set up pre-trained models from Hugging Face, including installing dependencies and authenticating with a Hugging Face token.

3. **GPU Utilization for Model Inference**: Attempts to speed up image generation by running larger models, such as Stable Diffusion, on a GPU.

4. **Two Methods for Image Generation**:
   - **Method 1**: A simpler approach that sends prompts to the model API, retrieves image bytes, and saves generated images locally.
   - **Method 2**: An advanced approach that loads the model locally through a `diffusers` pipeline (`DiffusionPipeline`) with GPU support, which could make image generation more efficient. In this notebook the pipeline failed to load (see the error under Second Method).

This project is ideal for users interested in prompt-based AI image generation, specifically using Hugging Face's tools and models to create customized images with specific themes.


In [1]:
!pip install python-dotenv



In [17]:
import requests
import io
from PIL import Image
from datetime import datetime
import torch
from diffusers import StableDiffusion3Pipeline
from diffusers import DiffusionPipeline


# Securely store tokens
This code demonstrates how to securely store and access sensitive tokens in Google Colab by saving them in a .env file on Google Drive. The process includes mounting Google Drive, creating and writing a .env file with the token, and loading it into the Colab environment. This method keeps sensitive data separate from the code, enhancing security and ease of access.

In [18]:
from dotenv import load_dotenv
import os
from google.colab import drive

In [19]:
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


In [20]:
!ls /content/drive/MyDrive

'Colab Notebooks'
'Copy of Opinion Mining.ipynb'
'dentist system design.drawio'
'dentist system design.drawio.png'
 FakeNewsClassification.ipynb
 github
'graduation photos'
 langchain.drawio
 Methodology.drawio
 Panpan_Zhang_21182197_Alfie_Abdul_Rahman_Presentat.mp4
'Pattern Recognition, Neural Networks and Deep Learning (7CCSMPNN).gdoc'
 Pipeline_Usage.ipynb
 PNN.drawio
 RAG.drawio
 secret
'set up environment.drawio'
 text_ana-1013.ipynb
 text_ana-1013_v2.ipynb
 Twitter_Sentiment_Analysis.ipynb
'user flow.drawio'


In [21]:
!ls /content/drive/MyDrive/secret/

In [13]:
# Run this only if the .env file is missing: it writes the token to Drive (replace XXX with your own token)
with open('/content/drive/MyDrive/secret/.env', 'w') as f:
    f.write("HF_TOKEN=XXX")

In [22]:
# Confirm that the .env file exists, then load HF_TOKEN from it

file_path = '/content/drive/MyDrive/secret/.env'
print(os.path.exists(file_path))

load_dotenv(file_path)
HF_TOKEN = os.getenv('HF_TOKEN')

True
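
Printing `HF_TOKEN` to confirm that it loaded would leak the secret into the notebook output. Below is a minimal sketch of a safer check. It assumes a hypothetical helper `mask_token`, which is not part of `python-dotenv` or the original notebook, and shows only the last few characters:

```python
import os


def mask_token(token, visible=4):
    """Return a display-safe version of a secret.

    Hypothetical helper for illustration: keeps only the last `visible`
    characters and replaces the rest with asterisks.
    """
    if not token:
        return "<missing>"
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]


# HF_TOKEN is read from the environment; it is unset unless load_dotenv ran first.
print("HF_TOKEN:", mask_token(os.getenv("HF_TOKEN")))
```

A `<missing>` result would indicate that the `.env` file was not found or does not define `HF_TOKEN`.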


# First Method


In [24]:
API_URL = "https://api-inference.huggingface.co/models/stabilityai/stable-diffusion-3.5-large"
headers = {"Authorization": f"Bearer {HF_TOKEN}"}

In [25]:
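def query(payload):
    # Send the prompt to the Inference API and return the raw image bytes.
    response = requests.post(API_URL, headers=headers, json=payload)
    # Fail early on HTTP errors (for example an invalid token) instead of returning an error body.
    response.raise_for_status()
    return response.content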
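
`Image.open` raises a hard-to-read error when the bytes it receives are not an image, for instance when the API answers with a JSON error message. Below is a hedged sketch of a guard. It uses a hypothetical helper `looks_like_image` and assumes the API returns PNG or JPEG data, checking for those standard file signatures before decoding:

```python
# Standard file signatures (magic bytes) for the two formats checked here.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def looks_like_image(data: bytes) -> bool:
    """Return True if `data` starts with a PNG or JPEG signature.

    Hypothetical helper for illustration; other image formats are not covered.
    """
    return data.startswith(PNG_SIGNATURE) or data.startswith(JPEG_SIGNATURE)


# A made-up error payload, used only to illustrate the non-image case.
error_body = b'{"error": "example error message"}'
print(looks_like_image(error_body))
print(looks_like_image(PNG_SIGNATURE + b"rest of file"))
```

In the generation loop, a check such as `if not looks_like_image(image_bytes): print(image_bytes[:200])` would surface the error text instead of a decoding failure.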

In [None]:
# prompts = [
#     "A glamorous female panda wearing a designer gown winks and blows a kiss to the audience as she struts down the runway.",
#     "A chic female panda glides down the runway in a fitted, elegant dress, spinning gracefully to show the back of her outfit.",
#     "A trendy female panda does a little dance on the runway in a classy outfit, adding charm to the fashion show.",
#     "A stylish female panda poses with one paw on her hip in a fashionable suit, giving a confident, fierce look on the runway.",
#     "A fashionable female panda wearing elegant accessories like pearls and a feathered hat walks gracefully down the runway.",
#     "A stylish female panda in a sleek, form-fitting dress skips down the runway, her joyful movements making the audience smile."
# ]


In [26]:
prompts = [
    "A fashionable female panda walks elegantly on the runway, dressed in a glamorous evening gown.",
    "A stylish female panda wearing a chic cocktail dress strikes a playful pose on the runway.",
    # "A sophisticated female panda in a designer dress twirls gracefully on the runway, showing off her outfit.",
    # "A female panda walks confidently down the runway in a trendy jumpsuit, giving a subtle wave to the audience.",
    # "A fashionable female panda, dressed in a vibrant, flowy dress, playfully jumps on the runway, captivating the crowd.",
    ]


In [27]:
# Method 1: send each prompt to the Inference API and save the returned image
for prompt in prompts:
    image_bytes = query({
        "inputs": prompt,
    })
    image = Image.open(io.BytesIO(image_bytes))
    # image.show()
    timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    filename = f"{timestamp}.png"
    image.save(filename)
    print(f"Image saved as {filename}")

Image saved as 2024-11-03 12:35:13.png
Image saved as 2024-11-03 12:35:45.png
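
The filenames above contain spaces and colons, which some file systems (Windows in particular) do not accept. Below is a minimal sketch of a more portable naming scheme, using a hypothetical helper `safe_timestamp_filename` that also appends an index so that two images saved in the same second do not overwrite each other:

```python
from datetime import datetime


def safe_timestamp_filename(index: int, now: datetime, extension: str = "png") -> str:
    """Build a filename from a timestamp using only hyphens and underscores.

    Hypothetical helper for illustration; `index` distinguishes images
    created within the same second.
    """
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return f"{stamp}_{index}.{extension}"


print(safe_timestamp_filename(0, datetime(2024, 11, 3, 12, 35, 13)))
# 2024-11-03_12-35-13_0.png
```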


# Second Method

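This method loads the model locally and runs it on the GPU. It did not work in this notebook: loading the pipeline raised the `ValueError` shown below.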

In [None]:
!pip install -U huggingface_hub



In [None]:
# Log in to Hugging Face so that we are authorized to use the model.
# Enter your HF_TOKEN when prompted.
!huggingface-cli login


    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    To log in, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible): 
Add token as git credential? (Y/n) y
Token is valid (permission: fineGrained).
The token `cli` has been saved to /root/.cache/huggingface/stored_tokens
Cannot authenticate through git-credential as no helper is defined on your machine.
You might have to re-authenti

In [None]:
# Check whether the login succeeded
from huggingface_hub import HfApi

api = HfApi()
user = api.whoami()
print("Logged in as:", user["name"])

Logged in as: peggy30


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


In [None]:
from diffusers import StableDiffusionPipeline

In [None]:
device = "cuda" if torch.cuda.is_available() else "cpu"
model_repo_id = "stabilityai/stable-diffusion-3.5-large"

if torch.cuda.is_available():
    torch_dtype = torch.bfloat16
else:
    torch_dtype = torch.float32

pipe = DiffusionPipeline.from_pretrained(model_repo_id, torch_dtype=torch_dtype)
# pipe = StableDiffusionPipeline.from_pretrained(model_repo_id, torch_dtype=torch_dtype, safety_checker=None)
pipe = pipe.to(device)

# Load the model and move it to the GPU
# pipe = StableDiffusion3Pipeline.from_pretrained(
#     "stabilityai/stable-diffusion-3.5-large",
#     torch_dtype=torch.bfloat16,  # Using bfloat16 improves memory efficiency
#     use_auth_token=HF_TOKEN
# )
# pipe = pipe.to("cuda")  # Load the model onto the GPU

Loading pipeline components...:   0%|          | 0/9 [00:00<?, ?it/s]

The config attributes {'qk_norm': 'rms_norm'} were passed to SD3Transformer2DModel, but are not expected and will be ignored. Please verify your config.json configuration file.


ValueError: Attention(
  (to_q): Linear(in_features=2432, out_features=2432, bias=True)
  (to_k): Linear(in_features=2432, out_features=2432, bias=True)
  (to_v): Linear(in_features=2432, out_features=2432, bias=True)
  (add_k_proj): Linear(in_features=2432, out_features=2432, bias=True)
  (add_v_proj): Linear(in_features=2432, out_features=2432, bias=True)
  (add_q_proj): Linear(in_features=2432, out_features=2432, bias=True)
  (to_out): ModuleList(
    (0): Linear(in_features=2432, out_features=2432, bias=True)
    (1): Dropout(p=0.0, inplace=False)
  )
  (to_add_out): Linear(in_features=2432, out_features=2432, bias=True)
) has no attribute norm_added_k.
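
The `qk_norm` warning followed by the missing `norm_added_k` attribute suggests that the installed `diffusers` release is older than the model's configuration expects. This is an assumption, not something the notebook confirms. Below is a minimal sketch for checking the installed version. The `0.31.0` threshold is an assumed value that should be verified against the diffusers release notes, and `parse_version` and `is_at_least` are hypothetical helpers:

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: the minimum diffusers release needed for this model.
ASSUMED_MIN_DIFFUSERS = "0.31.0"


def parse_version(text, width=3):
    """Turn '0.31.0' into (0, 31, 0), ignoring suffixes such as 'rc1' or '.dev0'."""
    parts = []
    for piece in text.split("."):
        leading = ""
        for ch in piece:
            if not ch.isdigit():
                break
            leading += ch
        if not leading:
            break
        parts.append(int(leading))
    parts = parts[:width]
    # Pad with zeros so that '0.31' compares equal to '0.31.0'.
    return tuple(parts + [0] * (width - len(parts)))


def is_at_least(installed, required):
    """Return True if the installed version is the required one or newer."""
    return parse_version(installed) >= parse_version(required)


try:
    installed = version("diffusers")
except PackageNotFoundError:
    installed = None

if installed is None:
    print("diffusers is not installed")
elif is_at_least(installed, ASSUMED_MIN_DIFFUSERS):
    print(f"diffusers {installed} meets the assumed minimum {ASSUMED_MIN_DIFFUSERS}")
else:
    print(f"diffusers {installed} is older than the assumed minimum {ASSUMED_MIN_DIFFUSERS}")
```

If the installed version turns out to be older, upgrading with `!pip install -U diffusers` and restarting the runtime may resolve the error.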

In [None]:
# Generate images and save them
for prompt in prompts:
    # Generate image using GPU
    image_list = pipe(
        prompt,
        num_inference_steps=28,    # Adjust the number of inference steps to control image quality and generation speed
        guidance_scale=3.5,         # Guidance scale to control how closely the image matches the description
        num_images_per_prompt=3
    ).images

    # Generate filename based on the current timestamp
    timestamp = datetime.now().strftime("%Y-%m-%d %H-%M-%S")
    for i, image in enumerate(image_list):
        filename = f"{timestamp}_{i}.png"
        image.save(filename)
        print(f"Image saved as {filename}")
