# Content Generation Demo

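## Scenario ideas
- Generating game advertisements
  - Target: blog articles (release information, the game's appeal and features, differences from other titles for games in a series, and concrete explanations with embedded video and image data)
  - Content: title, metadata, article body
  - Input:
    - Product information: extracted from the official page
    - Video: retrieved from YouTube
      - https://www.youtube.com/watch?v=PDHvMiX_vwk
      - https://www.youtube.com/watch?v=eoIWH1p9LfM
    - Other information:
      - Web pages: images and other data extracted from the official page
    - User prompt: instructions and customization of the content
  - Output format
    - Markdown
- Generating game advertisements that follow brand guidelines (brand image and policy)
  - Past product descriptions, images, and videos
    - These are used to retrieve similar datasets in order to:
      - Trace past successes
      - Stay consistent with the product line's image (e.g., the world of the Dragon Quest series)
    - These are indexed in advance
    - Existing web pages, if any, are also provided as input because they are useful information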
 
## Architecture Overview
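- Preprocessing video data:
  - To extract information from a video, it is split into frame images and GPT-4o generates a caption for each frame.
  - The captions are managed together as JSON metadata, which is used to find the right image during content generation.
- Preprocessing PDFs (images and text):
  - Document Intelligence is used to extract information from PDF data. It can extract images and output the text in a hierarchical format (Markdown).
  - GPT-4o generates captions for the extracted images, and they are managed together as JSON metadata, which is used to find the right image during content generation.
- Content generation
  - All JSON metadata and text data are included in the prompt as context.
  - No output format is specified; the only instruction is to write an engaging article for a blog.
- Content generation that follows the brand image
  - Information about the desired output is added so that the result matches the target content.
  - Here, information is extracted from a past blog article (assumed to be a web page) so that the generated content resembles it.
    - The information is extracted from the page with BeautifulSoup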
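
The notebook passes the whole caption metadata to the model and lets it pick images. As a rough illustration of how the same metadata could also be filtered locally, here is a minimal sketch. The helper name `search_captions` and the sample entries are hypothetical; only the `caption`, `frame_filename`, and `timestamp` keys mirror the metadata written later in this notebook.

```python
import json


def search_captions(metadata, keyword):
    """Return metadata entries whose caption contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [entry for entry in metadata if needle in entry.get("caption", "").lower()]


# Hypothetical sample entries shaped like the frame metadata
sample_metadata = [
    {"frame_filename": "demo.mp4_frame_0.jpg", "timestamp": 0.0, "caption": "Title screen with the game logo"},
    {"frame_filename": "demo.mp4_frame_60.jpg", "timestamp": 2.0, "caption": "A hero fighting a dragon in a cave"},
]

hits = search_captions(sample_metadata, "dragon")
print(json.dumps(hits, ensure_ascii=False, indent=2))
```

A plain substring match like this is only a stand-in; it does not replace the index-based retrieval listed in the roadmap.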

## Roadmap
- Index brand guidelines, past advertising data, and similar material for search, and add a RAG architecture


## Prerequisites

In [None]:
! pip install -r ../requirements.txt

In [None]:
import os
from dotenv import load_dotenv
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import ContentFormat
from preprocessing.preprocessing_with_image import analyze_layout

from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient, SearchIndexerClient
from azure.search.documents.indexes.models import (
    AIServicesVisionParameters,
    AIServicesVisionVectorizer,
    AIStudioModelCatalogName,
    AzureMachineLearningVectorizer,
    AzureOpenAIVectorizer,
    AzureOpenAIModelName,
    AzureOpenAIParameters,
    BlobIndexerDataToExtract,
    BlobIndexerParsingMode,
    CognitiveServicesAccountKey,
    DefaultCognitiveServicesAccount,
    ExhaustiveKnnAlgorithmConfiguration,
    ExhaustiveKnnParameters,
    FieldMapping,
    HnswAlgorithmConfiguration,
    HnswParameters,
    IndexerExecutionStatus,
    IndexingParameters,
    IndexingParametersConfiguration,
    InputFieldMappingEntry,
    OutputFieldMappingEntry,
    ScalarQuantizationCompressionConfiguration,
    ScalarQuantizationParameters,
    SearchField,
    SearchFieldDataType,
    SearchIndex,
    SearchIndexer,
    SearchIndexerDataContainer,
    SearchIndexerDataIdentity,
    SearchIndexerDataSourceConnection,
    SearchIndexerSkillset,
    SemanticConfiguration,
    SemanticField,
    SemanticPrioritizedFields,
    SemanticSearch,
    SimpleField,
    VectorSearch,
    VectorSearchAlgorithmKind,
    VectorSearchAlgorithmMetric,
    VectorSearchProfile,
    VisionVectorizeSkill
)
from azure.search.documents.models import (
    HybridCountAndFacetMode,
    HybridSearch,
    SearchScoreThreshold,
    VectorizableTextQuery,
    VectorizableImageBinaryQuery,
    VectorizableImageUrlQuery,
    VectorSimilarityThreshold,
)
from azure.storage.blob import BlobServiceClient
from IPython.display import Image, display, HTML
from openai import AzureOpenAI

In [None]:
# Load environment variables
load_dotenv()

# Configuration
AZURE_AI_VISION_API_KEY = os.getenv("AZURE_AI_VISION_API_KEY") # Option: For Image Embedding: https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/concept-image-retrieval
AZURE_AI_VISION_ENDPOINT = os.getenv("AZURE_AI_VISION_ENDPOINT") # Option: For Image Embedding: https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/concept-image-retrieval
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY")
BLOB_CONNECTION_STRING = os.getenv("BLOB_CONNECTION_STRING")
BLOB_CONTAINER_NAME = os.getenv("BLOB_CONTAINER_NAME")
AZURE_SEARCH_ADMIN_KEY = os.getenv("AZURE_SEARCH_ADMIN_KEY") # Optional: for the retriever in RAG
AZURE_SEARCH_ENDPOINT = os.getenv("AZURE_SEARCH_ENDPOINT") # Optional: for the retriever in RAG
AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT = os.getenv("AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT")
AZURE_DOCUMENT_INTELLIGENCE_KEY = os.getenv("AZURE_DOCUMENT_INTELLIGENCE_KEY")
aoai_deployment_name = 'gpt-4o' # your model deployment name for GPT-4o
aoai_api_version = '2024-02-01' # this might change in the future

## Extracting frames from video

#### Optional: using Blob Storage

In [None]:
# Optional: run this cell only if you want to pass images to GPT-4o through Blob Storage.
from azure.storage.blob import BlobServiceClient, BlobClient, generate_blob_sas, BlobSasPermissions
import datetime

# Setting for Azure Blob Storage
blob_service_client = BlobServiceClient.from_connection_string(BLOB_CONNECTION_STRING)

def upload_to_blob(file_path, blob_name):
    blob_client = blob_service_client.get_blob_client(container=BLOB_CONTAINER_NAME, blob=blob_name)
    with open(file_path, "rb") as data:
        blob_client.upload_blob(data, overwrite=True)
    
    start_time = datetime.datetime.now(datetime.timezone.utc)
    expiry_time = start_time + datetime.timedelta(days=7)
    
    sas_token = generate_blob_sas(
        account_name=blob_client.account_name,
        container_name=blob_client.container_name,
        blob_name=blob_client.blob_name,
        account_key=blob_service_client.credential.account_key,
        permission=BlobSasPermissions(read=True),
        expiry=expiry_time,
        start=start_time
    )
    blob_url = blob_client.url
    return blob_url, sas_token
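
`upload_to_blob` returns the blob URL and the SAS token separately, so the caller has to join them before handing the URL to GPT-4o. A minimal sketch of that step, assuming the token is used as the URL's query string; `build_sas_url` is a hypothetical helper and the URL and token below are placeholders, not real values.

```python
def build_sas_url(blob_url, sas_token):
    """Append a SAS token to a blob URL as its query string."""
    # Use "&" if the URL already carries a query string, otherwise "?"
    separator = "&" if "?" in blob_url else "?"
    return f"{blob_url}{separator}{sas_token.lstrip('?')}"


example_url = build_sas_url(
    "https://example.blob.core.windows.net/frames/frame_0.jpg",  # placeholder URL
    "sv=PLACEHOLDER&sig=PLACEHOLDER",  # placeholder token
)
print(example_url)
```

The resulting string can be passed wherever the notebook currently passes a data URL.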

In [None]:
#Note: The openai-python library support for Azure OpenAI is in preview.
#Note: This code sample requires OpenAI Python library version 1.0.0 or higher.
import os
from openai import AzureOpenAI
import json


def create_caption_by_gpt(image_url):
	# print("Image URL:", image_url)
	print("Generating caption using GPT-4o model...")
	client = AzureOpenAI(
		azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
		api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
		api_version="2024-02-01"
	)

	system_message = """
	You are an excellent caption creator. You must create a detailed caption for the image below with a maximum of 100 characters.
	The created caption will be used to describe the image in the search index.
	"""

	message_text = [
		{"role":"system","content":system_message},
		{"role":"user","content": [
			{
				"type": "text",
				"text": "Please provide a caption for the image."
			},
			{ 
				"type": "image_url",
				"image_url": {
					"url": image_url
				}
			}
		]}
	]

	completion = client.chat.completions.create(
		model="gpt-4o", # model = "deployment_name"
		messages = message_text,
		temperature=0,
		max_tokens=200,
		top_p=0.95,
		frequency_penalty=0,
		presence_penalty=0,
		stop=None
		)
	print(completion.choices[0].message.content)
	return completion.choices[0].message.content


In [None]:
import base64
from mimetypes import guess_type

# Function to encode a local image into data URL 
def local_image_to_data_url(image_path):
    # Guess the MIME type of the image based on the file extension
    mime_type, _ = guess_type(image_path)
    if mime_type is None:
        mime_type = 'application/octet-stream'  # Default MIME type if none is found

    # Read and encode the image file
    with open(image_path, "rb") as image_file:
        base64_encoded_data = base64.b64encode(image_file.read()).decode('utf-8')

    # Construct the data URL
    return f"data:{mime_type};base64,{base64_encoded_data}"

In [None]:
import cv2
import os
import base64
import json

def extract_frames(video_path, output_dir, interval=5):
    # Get the video file name
    video_filename = os.path.basename(video_path)
    
    # Create the output directory
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    
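    # Load the video file
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"Could not open video: {video_path}")
    fps = cap.get(cv2.CAP_PROP_FPS)
    if fps <= 0:
        cap.release()
        raise ValueError(f"Could not read FPS from video: {video_path}")
    # Number of frames between two saved frames (at least 1)
    frame_step = max(1, round(fps * interval))

    # List to save metadata
    metadata = []

    # Extract frames
    frame_number = 0
    while True:
        ret, frame = cap.read()
        if not ret:
            break

        current_time = frame_number / fps
        # Save one frame every `interval` seconds, using an integer frame step
        if frame_number % frame_step == 0: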
            # Save the frame as an image
            frame_filename = f"{video_filename}_frame_{frame_number}.jpg"
            frame_filepath = os.path.join(output_dir, frame_filename)
            cv2.imwrite(frame_filepath, frame)
            
            data_url = local_image_to_data_url(frame_filepath)
            
            # Generate a caption for the frame image
            caption = create_caption_by_gpt(data_url)
            
            # Save the metadata
            metadata.append({
                'video_filename': video_filename,
                'frame_filename': frame_filename,
                'frame_filepath': frame_filepath,
                'frame_number': frame_number,
                'timestamp': current_time,
                'caption': caption
            })
        
        frame_number += 1

    # Save metadata as a JSON
    metadata_json_path = os.path.join(output_dir, f"{video_filename}_metadata.json")
    with open(metadata_json_path, 'w', encoding='utf-8') as f:
        json.dump(metadata, f, ensure_ascii=False, indent=4)
    
    # Release resources
    cap.release()
    print(f"Frames and metadata have been saved in {output_dir}")
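
To reason about how many GPT-4o caption calls a given `interval` will trigger, it can help to compute the sampled frame numbers up front. This is a minimal sketch that converts the interval to an integer frame step; `sampled_frame_numbers` is a hypothetical helper and the FPS and frame count are made-up example values.

```python
def sampled_frame_numbers(fps, frame_count, interval):
    """Return the frame numbers saved when sampling one frame every `interval` seconds."""
    # Frames between two samples; never smaller than 1
    step = max(1, round(fps * interval))
    return list(range(0, frame_count, step))


# Example: a 5-second clip at 30 fps, sampled every 2 seconds
frames = sampled_frame_numbers(fps=30.0, frame_count=150, interval=2)
timestamps = [n / 30.0 for n in frames]
print(frames)      # [0, 60, 120]
print(timestamps)  # [0.0, 2.0, 4.0]
```

Each sampled frame costs one caption request, so `len(frames)` is a quick estimate of the number of calls.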


In [None]:
# Process the video file
video_path = '../data/01_scenario/video/dq3_intro.mp4'
output_dir = '../data/01_scenario/video_frames'
extract_frames(video_path, output_dir, interval=2)

## Extracting images and text from PDF

In [None]:
import os

input_file = "../data/01_scenario/pdf/dq3.pdf"
output_dir_images = "../data/01_scenario/images"
output_dir_documents = "../data/01_scenario/text"
os.makedirs(output_dir_images, exist_ok=True)
os.makedirs(output_dir_documents, exist_ok=True)

In [None]:
md_content = analyze_layout(
    input_file,
    output_dir_images,
    AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT,
    AZURE_DOCUMENT_INTELLIGENCE_KEY,
    AZURE_OPENAI_ENDPOINT,
    AZURE_OPENAI_API_KEY,
    aoai_deployment_name,
    aoai_api_version)

In [None]:
with open(f"{output_dir_documents}/{os.path.splitext(os.path.basename(input_file))[0]}.md", 'w', encoding='utf-8') as f:
	f.write(md_content)

In [None]:
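import glob
import json
import os

images_dir = "../data/01_scenario/images"
# List the extracted images; skip JSON files so a previously written images_metadata.json is not captioned
file_paths = [p for p in glob.glob(os.path.join(images_dir, "*")) if not p.lower().endswith(".json")]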

metadata = []

for file in file_paths:
	data_url = local_image_to_data_url(file)
	caption = create_caption_by_gpt(data_url)

	# Save the metadata
	metadata.append({
		'image_filepath': file,
		'caption': caption
	})
 
metadata_json_path = os.path.join(images_dir, "images_metadata.json")
with open(metadata_json_path, 'w', encoding='utf-8') as f:
	json.dump(metadata, f, ensure_ascii=False, indent=4)

## Generating content with GPT

In [None]:
import os

# Define the root directory
root_dir = "../data/01_scenario"

# Initialize the GPT prompt
gpt_prompt = ""

# Iterate over the subdirectories in the root directory
for subdir, dirs, files in os.walk(root_dir):
    # Iterate over the files in each subdirectory
    for file in files:
        # Get the file extension
        file_ext = os.path.splitext(file)[1]
        
        # Check if the file extension is json or md
        if file_ext == ".json" or file_ext == ".md":
            # Read the contents of the file
            file_path = os.path.join(subdir, file)
            with open(file_path, "r", encoding="utf-8") as f:
                file_contents = f.read()
            
            # Add the structured file information to the GPT prompt
            gpt_prompt += f"File Path: {file_path}\n"
            gpt_prompt += f"File Name: {file}\n"
            gpt_prompt += f"File Extension: {file_ext}\n"
            gpt_prompt += "File Contents:\n"
            gpt_prompt += file_contents + "\n\n"

# Print the GPT prompt
print(gpt_prompt)
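
Because every JSON and Markdown file is pasted into the prompt verbatim, the context grows quickly as more frames are captioned. One possible way to shrink it is to reduce each metadata entry to a single "path: caption" line. This is a sketch under that assumption, not what the cell above does; `captions_to_context` is a hypothetical helper, and the key names follow the frame metadata (`frame_filepath`, `caption`). For the PDF image metadata, `path_key="image_filepath"` would be the matching key.

```python
import json


def captions_to_context(metadata_json, path_key="frame_filepath"):
    """Turn a metadata JSON string into one 'path: caption' line per entry."""
    entries = json.loads(metadata_json)
    lines = [f"- {entry[path_key]}: {entry['caption']}" for entry in entries]
    return "\n".join(lines)


# Hypothetical metadata shaped like the output of the frame extraction step
sample_json = json.dumps([
    {"frame_filepath": "frames/a.jpg", "frame_number": 0, "caption": "Title screen"},
    {"frame_filepath": "frames/b.jpg", "frame_number": 60, "caption": "Battle scene"},
])

context = captions_to_context(sample_json)
print(context)
```

This drops fields such as `frame_number` and `timestamp`, so it only fits cases where the model needs just the path and the caption.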

In [None]:
#Note: The openai-python library support for Azure OpenAI is in preview.
#Note: This code sample requires OpenAI Python library version 1.0.0 or higher.
import os
from openai import AzureOpenAI
import json


client = AzureOpenAI(
	azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
	api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
	api_version="2024-02-01"
)

system_message = f"""
You are an excellent marketing content creator.
From the context you are given, create marketing content that best fits your objectives.
You must answer in Japanese.

# Context
{gpt_prompt}

"""

message_text = [
	{"role":"system","content":system_message},
	{"role":"user","content": [
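		{
			"type": "text",
			"text": "I would like you to write a blog article about the game. Be sure to write, in Japanese, an appealing blog article that makes readers want to buy and play it. The article should include release information, the game's appeal and features, differences from other titles if it is part of a series, and concrete explanations with embedded video and image data."
		}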
	]}
]

completion = client.chat.completions.create(
	model="gpt-4o", # model = "deployment_name"
	messages = message_text,
	temperature=0,
	# max_tokens=200,
	top_p=0.95,
	frequency_penalty=0,
	presence_penalty=0,
	stop=None
	)
print(completion.choices[0].message.content)

## Generating content with brand image

In [None]:
import requests
from bs4 import BeautifulSoup

def analyze_website(url):
    # Fetch the page; fail early on timeouts and HTTP errors
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    html_content = response.content

    # Parse the HTML using BeautifulSoup
    soup = BeautifulSoup(html_content, 'html5lib')

    # Get the page title
    title = soup.title.string if soup.title else "No title found"

    # Retrieve metadata
    meta_description = soup.find('meta', attrs={'name': 'description'})
    meta_keywords = soup.find('meta', attrs={'name': 'keywords'})
    meta_description_content = meta_description['content'] if meta_description else "No description meta tag found"
    meta_keywords_content = meta_keywords['content'] if meta_keywords else "No keywords meta tag found"

    # Retrieve images
    images = [img['src'] for img in soup.find_all('img') if 'src' in img.attrs]

    # Retrieve links
    # links = [a['href'] for a in soup.find_all('a') if 'href' in a.attrs]

    # Retrieve videos
    videos = [video['src'] for video in soup.find_all('video') if 'src' in video.attrs]

    # Retrieve header information
    headers = {}
    for level in range(1, 7):
        headers[f'h{level}'] = [header.get_text(strip=True) for header in soup.find_all(f'h{level}')]

    # Get the page text
    page_text = soup.get_text()

    # Format results in Markdown (lines start at column 0 so headings are not rendered as code)
    markdown = f"""# Website Analysis Report

## URL
{url}

## Title
{title}

## Meta Description
{meta_description_content}

## Meta Keywords
{meta_keywords_content}

## Images
{' '.join(images)}

## Videos
{' '.join(videos)}

## Headers
"""
    for level in range(1, 7):
        markdown += f"\n### h{level} Headers\n"
        markdown += '\n'.join(headers[f'h{level}']) + '\n'

    markdown += f"\n## Page Text (First 1000 characters)\n{page_text[:1000]}"

    # Save HTML content to a file
    with open("website_content.html", "wb") as file:
        file.write(html_content)

    return markdown
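
The `src` attributes collected from `<img>` and `<video>` tags are often relative to the page, which makes them unusable outside of it. A small sketch of how they could be resolved against the page URL with the standard library's `urljoin`; `to_absolute_urls` is a hypothetical helper and the URLs are placeholders.

```python
from urllib.parse import urljoin


def to_absolute_urls(page_url, sources):
    """Resolve relative src values against the URL of the page they came from."""
    return [urljoin(page_url, src) for src in sources]


absolute = to_absolute_urls(
    "https://example.com/news/article.html",  # placeholder page URL
    ["/img/cover.jpg", "thumb.png", "https://cdn.example.com/a.jpg"],
)
print(absolute)
```

Absolute URLs are left unchanged, so the helper can be applied to the whole list without checking each entry first.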


In [None]:
# Example usage
url = "https://www.famitsu.com/news/202401/16330797.html"
report = analyze_website(url)

# Save the Markdown report to a file
with open("website_report.md", "w", encoding="utf-8") as file:
    file.write(report)

print(report)

In [None]:
def load_text_file(filepath):
    # Load the text content of a file (used here for the Markdown report)
    with open(filepath, "r", encoding="utf-8") as file:
        content = file.read()
    return content

In [None]:
#Note: The openai-python library support for Azure OpenAI is in preview.
#Note: This code sample requires OpenAI Python library version 1.0.0 or higher.
import os
from openai import AzureOpenAI
import json


client = AzureOpenAI(
	azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
	api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
	api_version="2024-02-01"
)

system_message = f"""
You are an excellent marketing content creator.
From the context you are given, create marketing content that best fits your objectives.
Also, the reference article given is a very good article, so please refer to it in terms of structure, tone, and image citations.
You must create natural and engaging content that will attract users to the website.
You must answer in Japanese.

# Context
{gpt_prompt}

# Reference Web page Content
{load_text_file("website_report.md")}

"""

message_text = [
	{"role":"system","content":system_message},
	{"role":"user","content": [
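		{
			"type": "text",
			"text": "I would like you to write a blog article about the game. Be sure to write, in Japanese, an appealing blog article that makes readers want to buy and play it. The article should include release information, the game's appeal and features, differences from other titles if it is part of a series, and concrete explanations with embedded video and image data. However, do not include video or image data that is not useful for the blog article (such as mere web page icons)."
		}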
	]}
]

completion = client.chat.completions.create(
	model="gpt-4o", # model = "deployment_name"
	messages = message_text,
	temperature=0,
	# max_tokens=200,
	top_p=0.95,
	frequency_penalty=0,
	presence_penalty=0,
	stop=None
	)
print(completion.choices[0].message.content)

### Improve user prompt

In [None]:
user_prompt = """
I need a blog post about a game that will make readers want to purchase and play it. The blog post must be written in Japanese and should include the following elements:

1. Release information: Provide details about the game's release date, platforms, and any special editions available.
2. Game appeal and features: Highlight the unique aspects and attractions of the game. Explain what makes it stand out and why it's worth playing.
3. Differences from previous titles (if it’s part of a series): Compare it with other titles in the series, focusing on improvements, new features, and any changes that make this game unique.
4. Detailed explanations with multimedia: Include specific descriptions with embedded videos and images to illustrate the gameplay, graphics, and other notable features. Avoid including irrelevant multimedia content like website icons.
"""

In [None]:
#Note: The openai-python library support for Azure OpenAI is in preview.
#Note: This code sample requires OpenAI Python library version 1.0.0 or higher.
import os
from openai import AzureOpenAI
import json


client = AzureOpenAI(
	azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
	api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
	api_version="2024-02-01"
)

system_message = f"""
You are an excellent marketing content creator.
From the context you are given, create marketing content that best fits your objectives.
Also, the reference article given is a very good article, so please refer to it in terms of structure, tone, and image citations.
You must create natural and engaging content that will attract users to the website.
You must answer in Japanese.

# Context
{gpt_prompt}

# Reference Web page Content
{load_text_file("website_report.md")}

"""

message_text = [
	{"role":"system","content":system_message},
	{"role":"user","content": [
		{
			"type": "text",
			"text": user_prompt
		}
	]}
]

completion = client.chat.completions.create(
	model="gpt-4o", # model = "deployment_name"
	messages = message_text,
	temperature=0,
	# max_tokens=200,
	top_p=0.95,
	frequency_penalty=0,
	presence_penalty=0,
	stop=None
	)
print(completion.choices[0].message.content)

### Additional Description for images in blog post

In [None]:
#Note: The openai-python library support for Azure OpenAI is in preview.
#Note: This code sample requires OpenAI Python library version 1.0.0 or higher.
import os
from openai import AzureOpenAI
import json


def create_detail_descripton_by_gpt(image_url):
	# print("Image URL:", image_url)
	print("Generating detailed description using GPT-4o model...")
	client = AzureOpenAI(
		azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
		api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
		api_version="2024-02-01"
	)

	system_message = """
	You are an excellent blog post creator. You must create a detailed explanation for a blog post.
	The description must be attractive to the reader, not merely a visual description. This description is to be incorporated directly into the blog, so please keep the text natural.
	If you determine that the image description does not have a positive impact on the blog, return an empty string.
	You must answer in Japanese.
	"""

	message_text = [
		{"role":"system","content":system_message},
		{"role":"user","content": [
			{
				"type": "text",
				"text": "Please provide a detailed description of this image for the blog post."
			},
			{ 
				"type": "image_url",
				"image_url": {
					"url": image_url
				}
			}
		]}
	]

	completion = client.chat.completions.create(
		model="gpt-4o", # model = "deployment_name"
		messages = message_text,
		temperature=0,
		# max_tokens=200,
		top_p=0.95,
		frequency_penalty=0,
		presence_penalty=0,
		stop=None
		)
	print(completion.choices[0].message.content)
	return completion.choices[0].message.content

In [None]:
import re

original_md = completion.choices[0].message.content

# Regular expression to extract image and caption
figure_pattern = re.compile(r'(<figure>.*?<img src="(.*?)".*?alt="(.*?)".*?<figcaption>(.*?)</figcaption>.*?</figure>)', re.DOTALL)

# Markdown text after replacement
new_md = original_md

# Process each image and caption
for match in figure_pattern.findall(original_md):
    full_figure, img_src, alt_text, figcaption = match
    print(img_src)
    
    detailed_description = create_detail_descripton_by_gpt(local_image_to_data_url(img_src))
    
    # New content for the replaced figure block
    new_figure_block = f"""
<figure>
<img src="{img_src}" alt="{alt_text}">
<figcaption>{figcaption}</figcaption>
<p>{detailed_description}</p>
</figure>
"""
    # Replacing the original figure block with the new content
    new_md = new_md.replace(full_figure, new_figure_block)

print(new_md)
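
The replacement step only works if the generated article contains `<figure>` blocks in the shape the regular expression expects. The sketch below runs the same pattern on a hand-written sample so the captured groups are easy to inspect; the sample Markdown and the image path are made up for illustration.

```python
import re

# Same pattern as in the cell above: full block, img src, alt text, figcaption
figure_pattern = re.compile(
    r'(<figure>.*?<img src="(.*?)".*?alt="(.*?)".*?<figcaption>(.*?)</figcaption>.*?</figure>)',
    re.DOTALL,
)

# Hypothetical article fragment
sample_md = """# Sample post

<figure>
<img src="images/hero.jpg" alt="Hero">
<figcaption>The hero sets out</figcaption>
</figure>
"""

matches = figure_pattern.findall(sample_md)
full_figure, img_src, alt_text, figcaption = matches[0]
print(img_src, alt_text, figcaption)
```

If the model emits plain Markdown images (`![alt](path)`) instead of `<figure>` blocks, `matches` is empty and the article is left unchanged.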