This is an implementation of TheBloke/Llama-2-7b-Chat-GPTQ as a Cog model. Cog packages machine learning models as standard containers.
First, download the pre-trained weights:
cog run script/download-weights
Then, you can run predictions:
cog predict -i prompt="Tell me about AI"
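Predictions can also be requested over HTTP, because a running Cog container serves a `POST /predictions` endpoint. The following is a minimal sketch, not part of this repository. It assumes the image has been built and started locally with port 5000 published, and that `prompt` is the only input you need to set. The function names `build_prediction_request` and `predict` and the default `base_url` are hypothetical.

```python
import json
import urllib.request


def build_prediction_request(prompt, base_url="http://localhost:5000"):
    """Build a POST request for the Cog HTTP prediction endpoint.

    Assumption: the container is reachable at base_url (hypothetical default).
    """
    # Cog expects model inputs nested under the "input" key.
    payload = json.dumps({"input": {"prompt": prompt}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predictions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(prompt, base_url="http://localhost:5000"):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_prediction_request(prompt, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, `predict("Tell me about AI")` should return a dictionary whose `output` field holds the generated text. The exact shape of `output` depends on how this model's predictor is written.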