# OpenAI Multimodal

GPT is a large language model that can work with both text and images. It can read and write text much as a person would, and it can also take a picture as input and describe what it shows. Think of it as a single assistant that helps with writing, reading, and understanding photos.

In [None]:
from IPython.display import display, HTML

# Define the HTML to display images side by side
html = """
<div style="display: flex; justify-content: space-around;">
    <div>
        <img src="nDATs9kmQk7sNrx5ELrhZ.png" height="900" width="600" />
    </div>
    <div>
        <img src="yghzBMOFHZRKGvRuw6AM6.png" height="900" width="600" />
    </div>
</div>
"""

# Display the HTML
display(HTML(html))

In [None]:
import os

os.chdir("../../")

In [None]:
from langchain.chat_models import ChatOpenAI

from src.initialization import credential_init

credential_init()

model = ChatOpenAI(openai_api_key=os.environ['OPENAI_API_KEY'],
                   model_name="gpt-4o-2024-05-13", temperature=0)

GPT does not receive the image file itself. Instead, the image is sent to it as a Base64-encoded string.

In [None]:
import io
import base64

from PIL import Image
from langchain_core.messages.human import HumanMessage
from langchain.prompts import ChatPromptTemplate


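def image_to_base64(image_path):
    """Convert an image file to a Base64-encoded JPEG string."""

    # Open the image
    with Image.open(image_path) as image:

        # JPEG has no alpha channel, so convert other modes (e.g. RGBA PNGs) to RGB first
        if image.mode != "RGB":
            image = image.convert("RGB")

        # Save the image to an in-memory buffer as JPEG
        buffered = io.BytesIO()
        image.save(buffered, format="JPEG")

    # Encode the buffer contents to Base64 and return them as text
    return base64.b64encode(buffered.getvalue()).decode('utf-8')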
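
To make the Base64 step less mysterious, here is a minimal standard-library sketch that turns raw bytes into the same kind of `data:` URL used later in the message. The helper name `bytes_to_data_url` and the file name `example.png` are hypothetical, chosen only for illustration; the sketch skips the JPEG re-encoding that `image_to_base64` performs.

```python
import base64
import mimetypes


def bytes_to_data_url(raw: bytes, filename: str) -> str:
    """Build a data URL from raw bytes (hypothetical helper, for illustration only)."""
    # Guess the MIME type from the file extension; fall back to a generic type
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        mime_type = "application/octet-stream"

    # Base64 turns arbitrary bytes into plain ASCII text
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The first 8 bytes of every PNG file (the PNG signature)
png_signature = b"\x89PNG\r\n\x1a\n"

data_url = bytes_to_data_url(png_signature, "example.png")
print(data_url)

# Decoding the Base64 part gives back the original bytes
decoded = base64.b64decode(data_url.split(",", 1)[1])
print(decoded == png_signature)
```

The encoding is lossless: decoding the string recovers exactly the bytes that went in, which is why the model can reconstruct the image from text.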



### 1. Convert Image Path to Base64 String

- The image path is constructed and passed to image_to_base64 to get the Base64 string of the image.

In [None]:
from src.io.path_definition import get_project_dir


image_str = image_to_base64(os.path.join(get_project_dir(), 'tutorial/Week-5/nDATs9kmQk7sNrx5ELrhZ.png'))

### 2. Create a Human Message

- A HumanMessage object is created containing two parts:
    - A text message asking "What is in this image?"
    - An image URL containing the Base64 encoded image.

In [None]:
human_message = HumanMessage(content=[{'type': 'text', 
                                       'text': 'What is in this image?'},
                                      {'type': 'image_url',
                                       'image_url': {
                                           'url': f"data:image/jpeg;base64,{image_str}"}
                                      }])

# Create a Prompt Template
prompt = ChatPromptTemplate.from_messages([human_message])

# Generate the Chain
chain = prompt|model

In [None]:
chain.invoke(input={})

In [None]:
from IPython.display import Image as Image_IPYTHON

Image_IPYTHON(url="https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg")

In [None]:
human_message = HumanMessage(content=[{'type': 'text', 
                                       'text': 'What is in this image?'},
                                      {'type': 'image_url',
                                       'image_url': {
                                           'url': "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"}
                                      }])

prompt = ChatPromptTemplate.from_messages([human_message])

In [None]:
chain = prompt|model

chain.invoke(input={})

## Homework 1: Use LCEL to build an image analysis function that takes a file name as input and returns the content

## Multiple Images

In [None]:
human_message = HumanMessage(content=[{'type': 'text', 
                                       'text': 'What is in these images? Is there any difference between them?'},
                                      {'type': 'image_url',
                                       'image_url': {
                                           'url': "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"}
                                      },
                                      {'type': 'image_url',
                                       'image_url': {
                                           'url': "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"}
                                      }])

prompt = ChatPromptTemplate.from_messages([human_message])
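# format_messages() keeps the image parts as structured message content;
# format() would flatten the whole message into a single plain string
model.invoke(prompt.format_messages())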
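
The message content is just a list of dictionaries: one text part followed by one `image_url` part per image. As a rough sketch, a small helper can build that list for any number of images. The name `build_image_question` and the example URLs are hypothetical; the dictionary layout follows the cells above.

```python
def build_image_question(question, image_urls):
    """Build multimodal message content: one text part plus one part per image.

    Hypothetical helper for illustration; each URL can be a web URL or a
    Base64 data URL.
    """
    content = [{"type": "text", "text": question}]
    for url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})
    return content


content = build_image_question(
    "What is in these images? Is there any difference between them?",
    [
        "https://example.com/first.jpg",   # placeholder web URL
        "data:image/jpeg;base64,AAAA",     # placeholder Base64 data URL
    ],
)

for part in content:
    print(part["type"])
```

The resulting list can be passed as `HumanMessage(content=content)`, in the same way as the hand-written lists above.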

# Text Splitting

https://www.youtube.com/watch?v=8OJC21T2SL4

Levels of text splitting, from simplest to most sophisticated:

- Character Split
- Recursive Character Split
- Document Specific Splitting
- Semantic Splitting
- Agentic Splitting

Two reasons to split text:

1. Context Limit: Language models limit the number of tokens you can pass in a single request
2. Signal to Noise: Remove information that isn't helpful to your task

## Character Splitting

Character splitting is the most basic way to split text. It simply divides the text into chunks of N characters, regardless of their content or form.

This method isn't recommended for real applications, but it's a great starting point for understanding the basics.

- Pros: Easy & Simple
- Cons: Very rigid and doesn't take into account the structure of your text

Concepts to know:

- Chunk Size - The number of characters you would like in each chunk: 50, 100, 100000, etc.
- Chunk Overlap - The number of characters by which consecutive chunks overlap. Overlap reduces the chance of cutting a single piece of context into multiple pieces, at the cost of duplicating data across chunks.

In [None]:
text = "This is the text I would like to chunk up. It is the example text for this exercise"

In [None]:
from langchain.text_splitter import CharacterTextSplitter

text_splitter = CharacterTextSplitter(chunk_size=35, chunk_overlap=0, separator='', strip_whitespace=False)
text_splitter.create_documents([text])

In [None]:
text_splitter = CharacterTextSplitter(chunk_size=35, chunk_overlap=4, separator='', strip_whitespace=False)
text_splitter.create_documents([text])
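
To see what chunk size and chunk overlap mean mechanically, here is a plain-Python sketch of fixed-size chunking. It is not LangChain's implementation, and the function name `split_by_characters` is hypothetical; it only illustrates the idea that each chunk starts `chunk_size - chunk_overlap` characters after the previous one.

```python
def split_by_characters(text, chunk_size, chunk_overlap=0):
    """Cut text into fixed-size chunks (illustrative sketch, not LangChain's code)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    # Each new chunk starts this many characters after the previous one
    step = chunk_size - chunk_overlap

    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "This is the text I would like to chunk up. It is the example text for this exercise"

no_overlap = split_by_characters(sample, chunk_size=35, chunk_overlap=0)
with_overlap = split_by_characters(sample, chunk_size=35, chunk_overlap=4)

for chunk in no_overlap:
    print(repr(chunk))
for chunk in with_overlap:
    print(repr(chunk))
```

With an overlap of 4, the last four characters of each chunk are repeated at the start of the next one, which is the duplicated data mentioned above.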

In [None]:
from IPython.display import IFrame

IFrame(src='https://chunkviz.up.railway.app/', width=800, height=800)

- A separator is the character sequence you would like to split on. For example, to chunk your data at `ch`, pass it as the `separator`.

In [None]:
text_splitter = CharacterTextSplitter(chunk_size=4, chunk_overlap=0, separator='ch')
text_splitter.create_documents([text])

## Recursive Character Splitting

This text splitter is the recommended one for generic text. It is parameterized by a list of separators and tries to split on them in order until the chunks are small enough. The default list is `["\n\n", "\n", " ", ""]`. This keeps paragraphs (then sentences, then words) together as long as possible, since those tend to be the most semantically related pieces of text.
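
The idea of trying separators in order can be sketched in plain Python. This is a simplified illustration under stated assumptions, not LangChain's implementation: the function name `recursive_split` is hypothetical, there is no chunk overlap, and separators at chunk boundaries are dropped.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split text on the first separator, recursing into oversized pieces.

    Simplified sketch for illustration only.
    """
    if len(text) <= chunk_size:
        return [text] if text else []

    # No separators left (or the empty separator): fall back to a hard cut
    if not separators or separators[0] == "":
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    separator, rest = separators[0], separators[1:]
    chunks = []
    current = ""
    for piece in text.split(separator):
        # Try to merge this piece into the chunk being built
        candidate = piece if not current else current + separator + piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue

        # The chunk is full: store it and start over with this piece
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too large: retry with finer separators
            chunks.extend(recursive_split(piece, chunk_size, rest))
            current = ""
    if current:
        chunks.append(current)
    return chunks


sample = (
    "Paragraph one is short.\n\n"
    "Paragraph two is a little bit longer than the first one.\n\n"
    "End."
)
for chunk in recursive_split(sample, chunk_size=40):
    print(repr(chunk))
```

Short paragraphs survive intact, and only the paragraph that exceeds the chunk size is broken further, at word boundaries.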

### CNN (Cable News Network) Dataset

In [None]:
import pandas as pd

df_news = pd.read_csv("tutorial/Week-5/CNN_Articels_clean.csv")

In [None]:
df_news.head(5)

In [None]:
text = df_news.iloc[0]['Article text']

In [None]:
len(text)

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=65, chunk_overlap=0)

In [None]:
documents = text_splitter.create_documents([text])

In [None]:
documents[0]

In [None]:
documents[1]

In [None]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=0)

In [None]:
documents = text_splitter.create_documents([text])

In [None]:
documents[0]

# **** Expected end of the first hour ****


## Document Specific Splitting

### Markdown splitter

This code snippet demonstrates how to use LangChain's MarkdownTextSplitter to split a Markdown text document into smaller chunks. The MarkdownTextSplitter class is designed to handle Markdown-specific structure, making it easier to process and retrieve information from Markdown documents.

### 1. Import LangChain Components

- Import the LangChain component needed for this example, `MarkdownTextSplitter`.

In [None]:
from langchain.text_splitter import MarkdownTextSplitter

### 2. Initialize the Text Splitter

- The MarkdownTextSplitter is initialized with a chunk_size of 40 and a chunk_overlap of 0, so each chunk contains at most 40 characters and chunks do not overlap.

In [None]:
text_splitter = MarkdownTextSplitter(chunk_size=40, chunk_overlap=0)

In [None]:
markdown_text = """
# Fun in California

## Driving

Try driving on the 1 down to San Diego

### Food

Make sure to eat a burrito while you're there

## Hiking

Go to Yosemite
"""

### 3. Create Documents from Markdown Text

- The create_documents method of MarkdownTextSplitter splits the Markdown text into smaller chunks based on the specified chunk size.

In [None]:
text_splitter.create_documents([markdown_text])
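
What "Markdown-specific structure" means can be shown with a small standard-library sketch that cuts a document right before every heading. This is only an illustration of structure-aware splitting, not how `MarkdownTextSplitter` is implemented; the function name `split_markdown_by_heading` is hypothetical, and it ignores chunk size entirely.

```python
import re


def split_markdown_by_heading(markdown_text):
    """Split Markdown into sections, one per heading (illustrative sketch)."""
    # Split at the start of every line that begins with 1-6 '#' and a space;
    # the lookahead keeps the heading inside the section that follows it
    sections = re.split(r"(?m)^(?=#{1,6} )", markdown_text)
    return [section.strip() for section in sections if section.strip()]


sample_markdown = """
# Fun in California

## Driving

Try driving on the 1 down to San Diego

### Food

Make sure to eat a burrito while you're there

## Hiking

Go to Yosemite
"""

for section in split_markdown_by_heading(sample_markdown):
    print(repr(section))
```

Each section keeps its heading together with the text under it, which a fixed character count would not guarantee.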

### Python splitter

In [None]:
from langchain.text_splitter import PythonCodeTextSplitter

python_text = """
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

p1 = Person("John", 36)

for i in range(10):
    print(i)
"""

python_splitter = PythonCodeTextSplitter(chunk_size=100, chunk_overlap=0)
python_splitter.create_documents([python_text])

In [None]:
from langchain.text_splitter import RecursiveCharacterTextSplitter, Language


python_splitter = RecursiveCharacterTextSplitter.from_language(
    language=Language.PYTHON, chunk_size=100, chunk_overlap=0
)
python_docs = python_splitter.create_documents([python_text])
python_docs
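
For comparison, language-aware splitting can also be sketched with Python's own `ast` module, which knows exactly where each top-level statement starts and ends. This is a different technique from the separator-based splitters above and is shown only as an illustration; the function name `split_python_top_level` is hypothetical, and it requires Python 3.8 or later.

```python
import ast


def split_python_top_level(source):
    """Return the source of each top-level statement (illustrative sketch)."""
    tree = ast.parse(source)
    # get_source_segment recovers the exact text of each parsed node
    return [ast.get_source_segment(source, node) for node in tree.body]


sample_code = """
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

p1 = Person("John", 36)

for i in range(10):
    print(i)
"""

for segment in split_python_top_level(sample_code):
    print(segment)
    print("---")
```

The class definition, the assignment, and the loop each become one chunk, so no chunk ends in the middle of a statement.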

For more on splitting code, see https://python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/code_splitter/