# Exploring Llama 3.2-Vision (locally) with Ollama
Code authored by: Shaw Talebi

[Blog link](https://towardsdatascience.com/multimodal-models-llms-that-can-see-and-hear-5c6737c981d3/) |
[Video link](https://www.youtube.com/watch?v=Ot2c5MKN_-w)

Modified by: Soonwook Hwang

In [1]:
#!pip install ollama

In [2]:
import ollama

#### Pull model

In [3]:
ollama.pull('llama3.2-vision')

ProgressResponse(status='success', completed=None, total=None, digest=None)
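The pull above blocks until the download finishes and only reports the final status. As a rough sketch, the helper below turns one progress update into a readable line. It uses the `status`, `completed`, and `total` fields shown in the `ProgressResponse` output above. The name `format_progress` is hypothetical, and the commented usage assumes that `ollama.pull` accepts `stream=True` and that each update supports `.get(...)`, which may vary by library version.

```python
def format_progress(status, completed=None, total=None):
    """Format one pull progress update as a short human-readable string.

    `completed` and `total` are byte counts and may be None, as in the
    final 'success' update shown above.
    """
    if completed is None or not total:
        # No byte counts available, so report the status alone.
        return status
    return f"{status}: {100 * completed / total:.1f}%"


# Possible usage (assumption: stream=True yields progress updates one by one):
# import ollama
# for update in ollama.pull('llama3.2-vision', stream=True):
#     print(format_progress(update['status'],
#                           update.get('completed'),
#                           update.get('total')))
```

Keeping the formatting in a plain function means it can be checked without a running Ollama server.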

#### Basic Usage

In [4]:
response = ollama.chat(
    model='llama3.2-vision',
    messages=[{
        'role': 'user',
        'content': 'What is in this image?',
        'images': ['images/selfie.jpg']
    }]
)

print(response['message']['content'])

The image shows a woman sitting at a table, taking a selfie with her phone.
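Every cell in this notebook sends the same message shape: a `role`, a text `content`, and a list of `images`. The sketch below wraps that shape in a small helper and checks that the image file exists before anything is sent, which can make a mistyped path easier to spot. The name `build_vision_message` is hypothetical and not part of the `ollama` package.

```python
from pathlib import Path


def build_vision_message(prompt, image_path):
    """Build a single user message with one attached image.

    Raises FileNotFoundError early if the image path does not point
    to an existing file.
    """
    path = Path(image_path)
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    return {
        'role': 'user',
        'content': prompt,
        'images': [str(path)],
    }


# Possible usage, mirroring the cell above:
# import ollama
# response = ollama.chat(
#     model='llama3.2-vision',
#     messages=[build_vision_message('What is in this image?', 'images/selfie.jpg')],
# )
# print(response['message']['content'])
```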


#### Image captioning - streaming

In [5]:
stream = ollama.chat(
    model='llama3.2-vision',
    messages=[{
        'role': 'user',
        'content': 'Can you write a caption for this image?',
        'images': ['images/selfie.jpg']
    }],
    stream=True,
)

for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)

The image shows a woman sitting at a table, taking a selfie with her phone. The purpose of the image is to capture a moment of the woman enjoying her time at the table.

* A woman is sitting at a table:
	+ The woman is seated on the left side of the table.
	+ She is facing the camera and holding her phone up to take a selfie.
	+ Her body is turned slightly to the right, with her left arm resting on the table.
* The woman is wearing a red jacket:
	+ The jacket is a deep red color and appears to be made of a thick, warm material.
	+ It is open, revealing a black shirt underneath.
	+ The jacket has a relaxed fit and is not buttoned up.
* There is a cup on the table:
	+ The cup is placed on the right side of the table, near the edge.
	+ It is a white cup with a black lid, and it appears to be empty.
	+ The cup is sitting on a small plate or saucer, which is also white.

Overall, the image suggests that the woman is enjoying a quiet moment to herself, perhaps sipping a coffee or tea from th
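The streaming loop above prints each chunk as it arrives but does not keep the text. If the full response is needed afterwards (for example, to save it or to inspect it in a later cell), one option is to accumulate the chunks while printing them. This is a minimal sketch; `collect_stream` is a hypothetical helper name, and it assumes each chunk is indexed as `chunk['message']['content']`, as in the loop above.

```python
def collect_stream(chunks, echo=True):
    """Consume a stream of chat chunks and return the full response text.

    When `echo` is True, each piece is also printed as it arrives,
    matching the behaviour of the loop above.
    """
    parts = []
    for chunk in chunks:
        piece = chunk['message']['content']
        if echo:
            print(piece, end='', flush=True)
        parts.append(piece)
    if echo:
        print()  # finish the line once the stream ends
    return ''.join(parts)


# Possible usage with the `stream` object created above:
# caption = collect_stream(stream)
```

Because the helper only relies on the chunk shape, it works with any iterable of chunks, including a plain list of dicts.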

#### Explaining memes

In [6]:
stream = ollama.chat(
    model='llama3.2-vision',
    messages=[{
        'role': 'user',
        'content': 'Can you explain this meme to me?',
        'images': ['images/ai-meme.jpeg']
    }],
    stream=True,
)

for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)

The meme features a still from the animated TV show SpongeBob SquarePants, depicting Patrick Star, a character known for his laziness and lack of intelligence, attempting to build a table. The image is captioned "Trying to build with AI today..." and features various AI logos superimposed over the image, including those for Gemini, OpenAI, and Meta AI. The meme humorously highlights the challenges of using AI for creative tasks, implying that even Patrick Star, a character not known for his intelligence, is struggling to build a table using AI.

#### OCR

In [7]:
stream = ollama.chat(
    model='llama3.2-vision',
    messages=[{
        'role': 'user',
        'content': 'Can you transcribe the text from this screenshot in a markdown format?',
        'images': ['images/5-ai-projects.jpeg']
    }],
    stream=True,
)

for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)

5 AI Projects You Can Build This Weekend (with Python)

### 1) Resume Optimization (Beginner)

* Idea: Build a tool that adapts your resume for a specific job description

### 2) YouTube Lecture Summarizer (Beginner)

* Idea: Build a tool that takes a YouTube video link and summarizes it

### 3) Automatically Organizing PDFs (Intermediate)

* Idea: Build a tool that analyzes the contents of each PDF and organizes them into folders based on topics

### 4) Multimodal Search (Intermediate)

* Idea: Use multimodal embeddings to represent user queries, text knowledge, and images in a single space

### 5) Desktop QA (Advanced)

* Idea: Connect a multimodal knowledge base to a multimodal model like Llama-3.2-11B-Vision

This list provides a starting point for building AI projects with Python, covering topics from resume optimization to advanced desktop QA.