CLIP: Learning Transferable Visual Models From Natural Language Supervision [paper] Implementation Code
- Install the required modules with the following command.
pip install -r requirements.txt
- Prepare the dataset. Download the [Flickr 8k Dataset] and place the downloaded data in the ./data directory.
- Train the model.
python -m src.train
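Before starting a training run, it can help to confirm that the dataset actually landed in ./data. The following is a minimal sketch, not part of this repository: the helper name `summarize_data_dir` is hypothetical, and it assumes only that the download contains .jpg images and caption files in .txt or .csv form somewhere under ./data. The layout that the training script expects may be stricter than this.

```python
from pathlib import Path

# Assumed file extensions; adjust them to match the files you downloaded.
IMAGE_SUFFIXES = {".jpg", ".jpeg"}
CAPTION_SUFFIXES = {".txt", ".csv"}


def summarize_data_dir(root="./data"):
    """Return (exists, n_images, n_caption_files) for the directory `root`.

    Hypothetical helper for a quick sanity check; it does not validate
    the folder layout that the training code may require.
    """
    path = Path(root)
    if not path.is_dir():
        return False, 0, 0

    n_images = 0
    n_caption_files = 0
    # Walk the directory tree recursively and count files by extension.
    for entry in path.rglob("*"):
        if not entry.is_file():
            continue
        suffix = entry.suffix.lower()
        if suffix in IMAGE_SUFFIXES:
            n_images += 1
        elif suffix in CAPTION_SUFFIXES:
            n_caption_files += 1
    return True, n_images, n_caption_files


if __name__ == "__main__":
    exists, n_images, n_caption_files = summarize_data_dir()
    if not exists:
        print("./data was not found; download the dataset first.")
    else:
        print(f"images: {n_images}, caption files: {n_caption_files}")
```

If the script reports zero images, the archive was probably extracted somewhere else or is still compressed.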