This is a course project. Its aim is to build a deep learning model that predicts the attributes of a human sketch (mainly based on FS2K). In short, it is a classification task: an image goes in and attributes come out. The model is based on CLIP. Its architecture is shown below.
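As a rough illustration of the image-in, attributes-out setup (not the project's actual architecture, which is defined in the notebook), a minimal PyTorch sketch might look like the following. The class name `AttributeClassifier`, the attribute names, and their class counts are hypothetical, and a small stand-in encoder replaces CLIP so the snippet runs without downloading weights.

```python
import torch
from torch import nn


class AttributeClassifier(nn.Module):
    """Shared image encoder followed by one linear head per attribute."""

    def __init__(self, encoder: nn.Module, embed_dim: int, attributes: dict):
        super().__init__()
        self.encoder = encoder
        # One classification head per attribute (name -> number of classes).
        self.heads = nn.ModuleDict(
            {name: nn.Linear(embed_dim, n) for name, n in attributes.items()}
        )

    def forward(self, images: torch.Tensor) -> dict:
        features = self.encoder(images).float()
        # Returns a dict of logits, one tensor of shape (batch, n_classes) each.
        return {name: head(features) for name, head in self.heads.items()}


# Stand-in encoder (assumption): maps an image batch to 512-d features.
# With CLIP installed, a wrapper around `clip_model.encode_image` could be
# used here instead.
stand_in_encoder = nn.Sequential(
    nn.AdaptiveAvgPool2d(8), nn.Flatten(), nn.Linear(3 * 8 * 8, 512)
)

# Hypothetical attribute set; the real one comes from the FS2K annotations.
attributes = {"hair": 2, "gender": 2, "smile": 2}

model = AttributeClassifier(stand_in_encoder, embed_dim=512, attributes=attributes)
logits = model(torch.randn(4, 3, 224, 224))
```

Training such a model would typically sum a cross-entropy loss over the heads, but that choice is also an assumption rather than something stated by this project.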
Follow these steps to run the project:
- Make sure you have installed Python (>=3.6) and PyTorch (>=1.10). Using a conda virtual environment is highly recommended: run `conda create -n sketch2attributes python=3.9` and `conda activate sketch2attributes`.
- Clone the project from GitHub.
- Install CLIP following the installation guide.
- Run `git clone https://github.com/DengPingFan/FS2K.git` and follow their instructions to download and split the dataset.
- Run `conda install --file requirements.txt` to install all the required packages.
- Run `jupyter notebook` to start the Jupyter Notebook, then click the `model.ipynb` file to run experiments.
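Before opening the notebook, the version requirements from the first step can be checked with a short script. This is only a sketch: the helper names `version_tuple` and `check_environment` are hypothetical, and it assumes CLIP is importable as `clip`, which is how the OpenAI package is exposed.

```python
import importlib.util
import re
import sys


def version_tuple(text):
    """Parse the leading 'major.minor' of a version string like '1.13.1+cu117'."""
    match = re.match(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def check_environment(min_python=(3, 6), min_torch=(1, 10)):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python is older than %d.%d" % min_python)
    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch is not installed")
    else:
        import torch
        if version_tuple(torch.__version__) < min_torch:
            problems.append("PyTorch is older than %d.%d" % min_torch)
    # Assumption: the CLIP package is importable under the name 'clip'.
    if importlib.util.find_spec("clip") is None:
        problems.append("CLIP is not installed")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

An empty result only means the packages are present at the required versions; it does not verify that the FS2K dataset has been downloaded and split.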
All procedures, including data loading, model definition, training, and evaluation, are carried out in the Jupyter Notebook, with explanatory comments.
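For the evaluation step, one common way to score a multi-attribute classifier is to report accuracy separately for each attribute. The sketch below shows that idea in plain Python; the function name `per_attribute_accuracy`, the attribute names, and the sample labels are made up for illustration and are not taken from the notebook.

```python
from collections import defaultdict


def per_attribute_accuracy(predictions, targets):
    """Compute accuracy per attribute.

    Both arguments are lists of dicts mapping attribute name -> class index,
    one dict per sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, true in zip(predictions, targets):
        for name, label in true.items():
            total[name] += 1
            correct[name] += int(pred.get(name) == label)
    return {name: correct[name] / total[name] for name in total}


# Illustrative labels only (not real FS2K results).
targets = [
    {"hair": 1, "smile": 0},
    {"hair": 1, "smile": 1},
    {"hair": 0, "smile": 0},
    {"hair": 0, "smile": 1},
]
predictions = [
    {"hair": 1, "smile": 0},
    {"hair": 1, "smile": 1},
    {"hair": 0, "smile": 1},
    {"hair": 1, "smile": 0},
]

accuracy = per_attribute_accuracy(predictions, targets)
print(accuracy)  # {'hair': 0.75, 'smile': 0.5}
```

Reporting each attribute separately makes it easier to see which ones the model struggles with than a single averaged score would.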
Many thanks to FS2K and CLIP for their valuable contributions.