- Clone this repository
- Install the dependencies:
pip install -r requirements.txt
This is a pipeline that chains BLIP-2 and ALPACA together, with no additional fine-tuning of either model. For details, see ALPACA_LORA's repo here and the BLIP-2 training details on their GitHub page here. The pipeline uses the BLIP-2 model hosted on Hugging Face here.
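As a rough illustration of how the two models can be chained, the sketch below places a BLIP-2 caption into an Alpaca-style prompt. It is not taken from generate.py: the helper name `build_alpaca_prompt`, the "Image description:" prefix, and the use of the standard Alpaca instruction/input template are all assumptions made for this example, and the actual script may format its prompt differently.

```python
# Assumed Alpaca-style template (instruction + input); generate.py may differ.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str, caption: str) -> str:
    """Hypothetical helper: merge a user instruction with a BLIP-2 caption."""
    return ALPACA_TEMPLATE.format(
        instruction=instruction.strip(),
        # The caption is passed to ALPACA as the "input" context.
        context=f"Image description: {caption.strip()}",
    )


if __name__ == "__main__":
    # In the real pipeline the caption would come from BLIP-2;
    # it is hard-coded here so the example runs without any model.
    print(build_alpaca_prompt(
        "Write a short poem about this image.",
        "a dog running on a beach",
    ))
```

Keeping the prompt construction in a small pure function like this makes it easy to inspect exactly what text ALPACA receives before any model is loaded.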
- cd to the cloned repo
- Run
python3 generate.py
# TODO
- Reduce VRAM usage: the pipeline currently uses around 14 GB of VRAM with the 7B weights when combined with BLIP-2
- Let users customise the prompt sent to BLIP-2 in the Gradio interface. This could help tailor the context passed from BLIP-2 to ALPACA and improve the accuracy of the generated outputs
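One possible shape for the prompt customisation item is sketched below. It assumes the question-answering prompt format commonly used with BLIP-2 ("Question: ... Answer:"); the helper name `build_blip2_prompt` and the default question are hypothetical and do not exist in this repository. In a Gradio app, the `user_question` argument would typically come from a text input component.

```python
# Hypothetical default used when the user leaves the prompt field empty.
DEFAULT_QUESTION = "What is shown in this image?"


def build_blip2_prompt(user_question: str = "") -> str:
    """Hypothetical helper: wrap a user question in a BLIP-2 style prompt."""
    # Collapse runs of whitespace so stray newlines or spaces from the UI
    # do not end up inside the prompt.
    question = " ".join(user_question.split()) or DEFAULT_QUESTION
    return f"Question: {question} Answer:"


if __name__ == "__main__":
    print(build_blip2_prompt("How many people are in the picture?"))
    print(build_blip2_prompt(""))  # falls back to the default question
```

Falling back to a default question keeps the current behaviour available for users who do not enter a prompt.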
Once again, I would like to credit the Salesforce team for creating BLIP-2, as well as tloen, the original creator of alpaca-lora. I would also like to thank Meta, the original creator of LLaMA.