<a href="https://colab.research.google.com/github/arulbenjaminchandru/Python-and-Gen-AI/blob/main/Image_Generation_recognition%2C_Running_large_language_models_(LLMs)_locally.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Generative AI: Image Generation and Image Recognition**



### **Image Generation**

**Image Generation** in Generative AI refers to the process of creating new images based on certain inputs, such as text descriptions or other images. Imagine telling a computer, "Draw me a picture of a sunset over the ocean," and it creates a completely new image that matches your description.

- **How It Works**: In simple terms, the AI takes your input (like a text description) and tries to create a visual representation that matches what you've asked for. The AI learns from lots of examples, like millions of pictures of sunsets, so it can understand what makes a sunset look like a sunset.

- **Applications**:
  - **Art Creation**: Artists can use AI to generate new artwork based on their ideas.
  - **Design**: Designers might use AI to create new patterns or design concepts quickly.
  - **Content Creation**: AI can generate images for websites, marketing, or social media.

For example, if you wanted to create a logo for your new business, you could describe what you want (e.g., "A modern, minimalist logo with a mountain and a river") and the AI would generate a few different versions for you to choose from.



### **Image Recognition**

**Image Recognition** is about teaching computers to understand and identify what they see in images. Imagine showing a computer a picture of a dog, and it correctly identifies it as a dog.

- **How It Works**: The AI is trained by showing it thousands or even millions of images, each labeled with what it is (like dogs, cats, cars, etc.). Over time, the AI learns to recognize patterns and features that are specific to each type of object, so it can correctly identify them in new images.

- **Applications**:
  - **Security**: Facial recognition systems use this technology to identify people.
  - **Healthcare**: AI can analyze medical images (like X-rays) to help doctors diagnose conditions.
  - **Self-Driving Cars**: Cars use image recognition to understand their surroundings, like recognizing other cars, pedestrians, and traffic signs.

For instance, a smartphone camera might use image recognition to detect faces when you're taking a picture, ensuring that the focus is on the people in the photo.
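
The training idea described above can be sketched in a few lines of Python. This is a deliberately tiny illustration, not how production systems work: real image recognition uses neural networks trained on very large datasets, while this sketch uses a simple "nearest average" rule. The 2x2 "images", their pixel values, and the label names are all made up for the example.

```python
# Toy "images": 2x2 grayscale grids flattened to 4 pixel values (0 = black, 1 = white).
# The labels and pixel values are invented for illustration.
training_data = [
    ([1.0, 1.0, 0.0, 0.0], "top-bright"),
    ([0.9, 0.8, 0.1, 0.0], "top-bright"),
    ([0.0, 0.0, 1.0, 1.0], "bottom-bright"),
    ([0.1, 0.2, 0.9, 1.0], "bottom-bright"),
]

def train(examples):
    """Average the pixels of every example that shares a label."""
    sums, counts = {}, {}
    for pixels, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(pixels)
            counts[label] = 0
        sums[label] = [s + p for s, p in zip(sums[label], pixels)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

def predict(model, pixels):
    """Return the label whose average image is closest to the new image."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: distance(model[label], pixels))

model = train(training_data)
print(predict(model, [0.8, 0.9, 0.2, 0.1]))  # an unseen image; prints: top-bright
```

The "learning" step here is just averaging the labeled examples, and the "recognition" step compares a new image against those averages.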

Here are some examples of AI tools for **image generation** and **image recognition** that are user-friendly and don't require deep technical knowledge:



### Image Generation Tools

1. **DALL-E 3**
   - **What It Does**: Generates high-quality images from text descriptions. For example, you could type "A futuristic city at sunset" and DALL-E 3 will create an image based on that description.
   - **Use Case**: Artists, designers, and content creators use it to generate visual content quickly.

   https://openai.com/index/dall-e-3/

2. **Midjourney**
   - **What It Does**: Another popular text-to-image generation tool, known for creating stunning and artistic images based on user prompts.
   - **Use Case**: Creative professionals use it to generate unique artwork, illustrations, or concept designs.

   https://www.midjourney.com/home (this runs on Discord)

3. **Stable Diffusion**
   - **What It Does**: A text-to-image model with openly available weights, often used as an alternative to Midjourney.

   https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium


### Image Recognition Tools

1. **Google Vision AI**
   - **What It Does**: Detects and classifies objects, faces, text, and even landmarks in images.
   - **Use Case**: Used in various industries, from automating photo tagging in social media to analyzing images in large datasets.

   https://cloud.google.com/vision/docs/drag-and-drop

2. **Amazon Rekognition**
   - **What It Does**: Recognizes objects, people, text, scenes, and activities in images and videos. It can also analyze images for facial recognition.
   - **Use Case**: Often used in security, customer analysis, and content moderation.

   https://docs.aws.amazon.com/rekognition/latest/dg/what-is.html

## **Running LLMs Locally**

### **Ollama**



#### **Overview**
Ollama is a framework for running and managing large language models (LLMs) locally across different platforms. It supports various models and offers customization options.



#### **Supported Platforms**
- **macOS:** [Download](https://ollama.com/download/Ollama-darwin.zip)
- **Windows (Preview):** [Download](https://ollama.com/download/OllamaSetup.exe)
- **Linux:** Use the command:
  ```
  curl -fsSL https://ollama.com/install.sh | sh
  ```
- **Docker:** Available via Docker Hub: `ollama/ollama`.



#### **Libraries**
- Python: [ollama-python](https://github.com/ollama/ollama-python)
- JavaScript: [ollama-js](https://github.com/ollama/ollama-js)
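
As a rough sketch of how the Python library might be used, the snippet below sends one question to a local model. It assumes the `ollama` package is installed (`pip install ollama`), the Ollama server is running, and the `llama3.1` model has already been pulled. The helper names `build_messages` and `ask` are hypothetical and exist only for this example.

```python
def build_messages(question, system_prompt=None):
    """Build the list of chat messages in the role/content format."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": question})
    return messages

def ask(question, model="llama3.1"):
    """Send one question to a locally running model and return its reply text."""
    import ollama  # imported here so build_messages works without the package
    response = ollama.chat(model=model, messages=build_messages(question))
    return response["message"]["content"]

# Example call (needs the package, a running Ollama server, and a pulled model):
# print(ask("Why is the sky blue?"))
```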



#### **Getting Started**
To run a model:
```
ollama run llama3.1
```



#### **Model Library**
Ollama supports various models like Llama 3.1, Phi 3, Gemma 2, Mistral, and more. Example command:
```
ollama run llama3.1
```
> **Note:** Make sure you have enough RAM: at least 8 GB for 7B models, 16 GB for 13B models, and 32 GB for 33B models.



#### **Command-Line Interface (CLI) Basics**
- **Create a model:**
  ```
  ollama create mymodel -f ./Modelfile
  ```
- **Pull a model:**
  ```
  ollama pull llama3.1
  ```
- **Remove a model:**
  ```
  ollama rm llama3.1
  ```
- **Copy a model:**
  ```
  ollama cp llama3.1 my-model
  ```
- **Multimodal models:**
  ```
  ollama run llava "What's in this image? /path/to/image.png"
  ```
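
The `ollama create` command in the list above reads a Modelfile, a small text file that describes how to customize an existing model. The sketch below builds one in Python using the `FROM`, `PARAMETER`, and `SYSTEM` instructions. The function name `render_modelfile`, the system prompt, and the temperature value are illustrative assumptions, not recommended settings.

```python
def render_modelfile(base_model, system_prompt, temperature=None):
    """Return Modelfile text that customizes an existing model."""
    lines = [f"FROM {base_model}"]
    if temperature is not None:
        lines.append(f"PARAMETER temperature {temperature}")
    lines.append(f'SYSTEM """{system_prompt}"""')
    return "\n".join(lines) + "\n"

modelfile_text = render_modelfile(
    base_model="llama3.1",
    system_prompt="You are a friendly tutor. Explain things in simple words.",
    temperature=0.7,
)
print(modelfile_text)

# To use it, save the text as ./Modelfile, then run: ollama create mymodel -f ./Modelfile
# with open("Modelfile", "w") as f:
#     f.write(modelfile_text)
```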



#### **REST API**
Ollama provides a REST API for generating responses and chatting with models:
- **Generate response:**
  ```
  curl http://localhost:11434/api/generate -d '{"model": "llama3.1", "prompt":"Why is the sky blue?"}'
  ```
- **Chat:**
  ```
  curl http://localhost:11434/api/chat -d '{"model": "llama3.1", "messages": [{"role": "user", "content": "why is the sky blue?"}]}'
  ```
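
The same requests can be made from Python with only the standard library. The sketch below mirrors the `curl` call to `/api/generate` and assumes the server streams its answer as one JSON object per line, each carrying a `response` text fragment and a `done` flag. The helper names (`build_request`, `join_stream`, `generate`) and the sample chunks are made up for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # same address as in the curl examples

def build_request(endpoint, payload):
    """Create a POST request carrying a JSON body for an Ollama endpoint."""
    return urllib.request.Request(
        f"{OLLAMA_URL}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def join_stream(lines):
    """Concatenate the text fragments from a streamed /api/generate reply."""
    parts = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between chunks
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

def generate(model, prompt):
    """Send a prompt to /api/generate and return the full reply text."""
    request = build_request("/api/generate", {"model": model, "prompt": prompt})
    with urllib.request.urlopen(request) as response:
        return join_stream(response)  # the response yields one line at a time

# Offline demonstration with invented chunks in the assumed streamed format:
sample_stream = [
    '{"model": "llama3.1", "response": "The sky ", "done": false}',
    '{"model": "llama3.1", "response": "scatters blue light.", "done": false}',
    '{"model": "llama3.1", "response": "", "done": true}',
]
print(join_stream(sample_stream))  # prints: The sky scatters blue light.
```

With the Ollama server running and the model pulled, `generate("llama3.1", "Why is the sky blue?")` would return the model's answer as a single string.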

### **LM Studio**

https://lmstudio.ai/docs/welcome