Visual instruction tuning towards large language and vision models with GPT-4 level capabilities.
- Clone this repository and navigate to the LLaVA folder
git clone https://github.com/natlamir/LLaVA-Windows.git llava
cd llava
- Create environment and install dependencies
conda create -n llava python=3.10
conda activate llava
pip install -r requirements.txt
- Install PyTorch from the PyTorch website
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
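To confirm that the CUDA build of PyTorch was installed (and not the CPU-only build), you can run:
python -c "import torch; print(torch.cuda.is_available())"
This should print True if your GPU is visible to PyTorch.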
- Downgrade pydantic to fix the infinite-load issue in the model queue
pip install pydantic==1.10.9
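You can confirm the downgrade took effect with:
pip show pydantic
The reported version should be 1.10.9.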
- Install bitsandbytes for Windows to be able to run quantized models
pip install git+https://github.com/Keith-Hon/bitsandbytes-windows.git
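A quick sanity check that the Windows build imports cleanly:
python -c "import bitsandbytes"
If this exits without an error, the quantized model worker below should be able to load it.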
The models are located in the Model Zoo. For this example, I will use the "liuhaotian/llava-v1.5-7b" model. Downloading it manually (the next three steps) is optional; the model worker can also download it automatically from Hugging Face.
- Create a folder called "models" within the "llava" install folder
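For example, from the llava install folder:
mkdir models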
- cd into the "models" folder from a prompt
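For example:
cd models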
- Download the 7b model from Hugging Face into the models folder
git lfs install
git clone https://huggingface.co/liuhaotian/llava-v1.5-7b
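If the large weight files did not come down with the clone, you can fetch them explicitly from inside the model folder:
cd llava-v1.5-7b
git lfs pull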
- Launch 3 anaconda prompts. In each one, activate the llava environment and cd into the llava install folder.
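For example, in each prompt (C:\path\to\llava is a placeholder for your install location):
conda activate llava
cd C:\path\to\llava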
- In the 1st anaconda prompt, launch the controller
python -m llava.serve.controller --host 0.0.0.0 --port 10000
- In the 2nd anaconda prompt, launch the model worker with the 8-bit quantized model
Either using the model card (this will also download the model on first run):
python -m llava.serve.model_worker --host "0.0.0.0" --controller-address "http://localhost:10000" --port 40000 --worker-address "http://localhost:40000" --model-path "liuhaotian/llava-v1.5-7b" --load-8bit
Or using the manually downloaded model from the optional step above (expected in the models folder):
python -m llava.serve.model_worker --host "0.0.0.0" --controller-address "http://localhost:10000" --port 40000 --worker-address "http://localhost:40000" --model-path "models/llava-v1.5-7b" --load-8bit
Wait until the worker finishes loading the model before moving on to the next step.
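To check that the worker has registered, you can ask the controller for its model list; this assumes the /list_models endpoint from the upstream LLaVA controller:
curl -X POST http://localhost:10000/list_models
The response should include llava-v1.5-7b once the worker is up.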
- In the 3rd anaconda prompt, launch the Gradio Web UI
python -m llava.serve.gradio_web_server --controller http://localhost:10000 --model-list-mode reload
- Open a browser and navigate to
http://127.0.0.1:7860
If you find LLaVA useful for your research and applications, please cite using this BibTeX:
@misc{liu2023improvedllava,
      title={Improved Baselines with Visual Instruction Tuning},
      author={Liu, Haotian and Li, Chunyuan and Li, Yuheng and Lee, Yong Jae},
      publisher={arXiv:2310.03744},
      year={2023},
}
@misc{liu2023llava,
      title={Visual Instruction Tuning},
      author={Liu, Haotian and Li, Chunyuan and Wu, Qingyang and Lee, Yong Jae},
      publisher={arXiv:2304.08485},
      year={2023},
}
- Vicuna: the codebase we built upon, and our base model Vicuna-13B, which has amazing language capabilities!
- Instruction Tuning with GPT-4
- LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day
- Otter: In-Context Multi-Modal Instruction Tuning
For future project ideas, please check out:
- SEEM: Segment Everything Everywhere All at Once
- Grounded-Segment-Anything: detect, segment, and generate anything by marrying Grounding DINO and Segment-Anything.