## Step 1: Connect to a language model

In [2]:
import os
import openai
import langchain
from langchain.chat_models import ChatOpenAI
from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())
openai.api_key = os.environ['OPENAI_API_KEY']

llm = "gpt-3.5-turbo"
chat = ChatOpenAI(temperature=0.0, model=llm)

## Step 2: Define the problem statement

Input: the user defines a problem, a solution, or a tool.

In [4]:
problem_statement = """I want to analyze our Berne advertising performance in Germany in April 2023. I want to learn about the situation of important metrics and about immediate actions to take."""

## Step 3: Provide source data so that the model becomes context-aware

### 3.1. URL

In [5]:
# Step 1: URL Loader Functions

# UnstructuredURLLoader
from langchain.document_loaders import UnstructuredURLLoader
def load_url_unstructured(url):
    unstructured_url_loader = UnstructuredURLLoader(urls=[url])
    return unstructured_url_loader.load()

# WebBaseLoader
from langchain.document_loaders import WebBaseLoader
def load_url_webbase(url):
    webbase_url_loader = WebBaseLoader(url)
    return webbase_url_loader.load()

# Step 2: URL Path
url = "https://scaleinsights.com/learn/how-to-analyse-amazon-ppc-data"

# Step 3: Test URL loader functions

# Test UnstructuredURLLoader
url_result1 = load_url_unstructured(url)
print("Loaded Content:\n", url_result1)

# Test WebBaseLoader
url_result2 = load_url_webbase(url)
print("Loaded Content:\n", url_result2)

Loaded Content:
 [Document(page_content="Blog\n\nAbout\n\nFeatures\n\nRoadmap\n\nPricing\n\nContact\n\nSign Up For Free\n\nHow To Analyse Amazon PPC Data: 8 Must-Read Secrets\n\nStart 30 Days Free Trial\n\nNo Credit-Card Required\n\nDiscuss Amazon PPC strategies with other experts\n\nJoin our FB group\n\nFind article\n\nRecent Articles\n\nWhat Does Amazon's Choice Mean And How To Get A Badge?\n\nHow To Sell On Amazon FBA For Beginners: 12 Powerful Tactics\n\nAmazon Digital Display Advertising For Sellers\n\nTop tags\n\nAmazon PPC\n\nComparison Guides\n\nBest Amazon Tools\n\nDiscuss Amazon PPC strategies with other experts\n\nJoin our FB group\n\nScale Insights\n\nSmart PPC Solution for Smart Amazon Sellers\n\nStreamline your Amazon PPC workflows and scale profits with automation\n\nSign Up For Free\n\nHow To Analyse Amazon PPC Data: 8 Must-Read Secrets\n\nScale Insights Team\n\nShare:\n\n\n\n\n\n\n\namazon ppc data analysis\n\nanalysing amazon ppc data\n\nhow to analyse amazon ppc data

### 3.2. PDF

In [6]:
# Step 1: PDF Loader Function
from langchain.document_loaders import PyPDFLoader
def load_pdf(pdf_path):
    pdf_loader = PyPDFLoader(file_path=pdf_path)
    return pdf_loader.load()

# Step 2: File Path
pdf = "amazon_ppc_guide.pdf"

# Step 3: Test loader function
pdf_result = load_pdf(pdf)
print("Loaded Content:\n", pdf_result)

Loaded Content:
 [Document(page_content="Discuss\nAmazon PPC\nstrategies\nwith other experts\nJoin our FB group\nDiscuss\nAmazon PPC\nstrategies\nwith other experts\nJoin our FB group\nScale Insights\nSmart PPC\nSolution forSearch Here...Find article\n\uf002\nWhat Does Amazon's\nChoice Mean And How\nTo Get A Badge?\nHow To Sell On Amazon\nFBA For Beginners: 12\nPowerful Tactics\nAmazon Digital Display\nAdvertising For SellersRecent Articles\nAmazon PPC\nComparison Guides\nBest Amazon ToolsTop tagsHow To Analyse Amazon PPC Data: 8 Must-Read Secrets\nShare:\uf39e\uf099\uf0e1\n amazon ppc data analysis  analysing amazon ppc data  how to analyse amazon ppc data\nAnalysing Amazon PPC data is vital for e\x00ective campaign management and data-driven decision-making. Key\nmetrics such as click-through rates, conversion rates, and cost-per-click provide valuable insights into the\naspects of your campaign that may require adjustments.\nTherefore, strategic analysis of Amazon PPC data can optim

### 3.3. HTML

In [7]:
# Step 1: HTML Loader Functions

# HTML Loader Function
from langchain.document_loaders import UnstructuredHTMLLoader
def load_html(file_path):
    html_loader = UnstructuredHTMLLoader(file_path=file_path)
    return html_loader.load()

# HTML with BeautifulSoup4 Loader Function
from langchain.document_loaders import BSHTMLLoader
def load_bshtml(file_path):
    bshtml_loader = BSHTMLLoader(file_path=file_path)
    return bshtml_loader.load()

# Step 2: File Path
html = "amazon_ppc_guide.html"

# Step 3: Test loader functions
html_result1 = load_html(html)
print("Loaded Content:\n", html_result1)

html_result2 = load_bshtml(html)
print("Loaded Content:\n", html_result2)

Loaded Content:
 [Document(page_content='<!DOCTYPE html> <html  lang =" en "> <head>      <meta  charset =" utf-8 " />      <meta  name =" viewport "  content =" width=device-width, initial-scale=1.0 " />      <meta  name =" description "  content =" Learn to analyse your Amazon PPC data and make informed decisions to boost your PPC campaigns. This ultimate guide offers tips and insights for your campaigns. " />      <meta  name =" google-site-verification "  content =" CZknm9WnsZXpgecY8QAXSCAThTK7lvPzKp6ARxo6Ipg " />      <title> How To Analyse Amazon PPC Data: 8 Must-Read Secrets </title>      <link  rel =" canonical "  href =" https://scaleinsights.com/learn/how-to-analyse-amazon-ppc-data " />      <script  src =" https://kit.fontawesome.com/5e226041e5.js "  crossorigin =" anonymous "> </script>      <link  rel =" stylesheet "  href =" /css/learn.css?v=O1d47y045RZVCddNQqDXTBd6htSyCkb_AYGQFHKDYLA " />   \t      <!-- Google Tag Manager -->      <script>         (function (w, d, s, l, 

### 3.4. TEXT

In [8]:
from langchain.document_loaders.text import TextLoader
def load_text(file_path):
    # Use TextLoader for plain-text files (the URL loader does not take a file path)
    text_loader = TextLoader(file_path=file_path)
    return text_loader.load()
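
The loader functions from 3.1 to 3.4 could be selected automatically from the source string. The following is a minimal sketch, assuming a hypothetical helper `pick_loader_name` that is not part of LangChain. It only returns the name of the matching notebook function and leaves the actual call to the notebook.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical mapping from file extension to the loader functions defined in this notebook.
LOADER_BY_EXTENSION = {
    ".pdf": "load_pdf",
    ".html": "load_html",
    ".htm": "load_html",
    ".txt": "load_text",
}

def pick_loader_name(source: str) -> str:
    """Return the name of the notebook loader function that fits `source`."""
    parsed = urlparse(source)
    # Anything with an http(s) scheme is treated as a web page.
    if parsed.scheme in ("http", "https"):
        return "load_url_unstructured"
    suffix = PurePosixPath(source).suffix.lower()
    if suffix not in LOADER_BY_EXTENSION:
        raise ValueError(f"No loader registered for {source!r}")
    return LOADER_BY_EXTENSION[suffix]

print(pick_loader_name("amazon_ppc_guide.pdf"))
print(pick_loader_name("https://scaleinsights.com/learn/how-to-analyse-amazon-ppc-data"))
```

Returning a name rather than calling the loader keeps the sketch free of LangChain imports; in the notebook the mapping could hold the functions themselves.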

## Step 4: Process the Loaded Documents
- Use NLP techniques to extract relevant information from the loaded documents.
- This could involve summarizing the content, extracting key phrases, or identifying specific information related to advertising analysis.

In [9]:
# Step 1: Initializing Splitters
from langchain.text_splitter import CharacterTextSplitter, RecursiveCharacterTextSplitter, TokenTextSplitter, HTMLHeaderTextSplitter
separators = ["\n\n", "\n", ". ", " ", ""]
headers_to_split_on = [("h1", "Header 1"), ("h2", "Header 2"), ("h3", "Header 3"), ("h4", "Header 4")]

c_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
r_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
t_splitter = TokenTextSplitter(chunk_size=10, chunk_overlap=0)
h_splitter = HTMLHeaderTextSplitter(headers_to_split_on=headers_to_split_on)
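
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of fixed-width chunking with overlap. It is not how LangChain's splitters are implemented (they also take separators into account); `chunk_with_overlap` is a hypothetical helper for illustration only.

```python
def chunk_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split `text` into windows of `chunk_size` characters that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

sample = "abcdefghij"
print(chunk_with_overlap(sample, chunk_size=4, chunk_overlap=2))
```

Each window starts `chunk_size - chunk_overlap` characters after the previous one, so neighbouring chunks repeat the overlapping characters.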

In [10]:
# Step 2: Split
html_header_splits = h_splitter.split_text_from_url(url)

# Note: these splits are created directly from the URL; the local `html` variable from the previous step is not used.
# For a local file, use h_splitter.split_text_from_file(<path_to_file>) instead.
splits = r_splitter.split_documents(html_header_splits)
splits[15:20]

[Document(page_content="Keyword Data: This data provides information about the performance of individual keywords in your PPC campaigns. It includes metrics such as clicks, impressions, conversion rates, and cost per click (CPC). Keyword data is crucial in identifying top-performing keywords that are driving conversions and optimising bids accordingly.  \nCampaign Data: Campaign data provides an overview of the performance of your PPC campaigns as a whole. It includes metrics such as total spending, total sales, and ACoS. Analysing campaign data helps you understand the overall impact of your campaigns on your business and identify areas for improvement.  \nSearch Term Data: This data provides insight into customers' search terms to find and click on your ads. It includes metrics such as clicks, impressions, and conversions by search term. Analysing search term data helps you identify new keywords to add to your campaigns and negative keywords to exclude, as well as improve your target

## Step 5: Generate Prompt Template
- Concatenate the summaries or key information extracted from the documents to form a comprehensive background context.
- Then, use the LLM to generate a prompt template.
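
The first bullet can be sketched in plain Python. The helper `build_context` and its `max_chars` budget are assumptions for illustration; in the notebook, the input strings would come from the `page_content` of the split documents.

```python
def build_context(snippets: list[str], max_chars: int = 2000) -> str:
    """Join non-empty snippets with blank lines, stopping before `max_chars` is exceeded."""
    parts = []
    used = 0
    for snippet in snippets:
        cleaned = " ".join(snippet.split())  # collapse runs of whitespace
        if not cleaned:
            continue
        extra = len(cleaned) + (2 if parts else 0)  # account for the "\n\n" separator
        if used + extra > max_chars:
            break
        parts.append(cleaned)
        used += extra
    return "\n\n".join(parts)

context = build_context(
    ["Keyword data:  clicks, CPC", "", "Campaign data: spend, ACoS"],
    max_chars=60,
)
print(context)
```

A character budget is only a rough stand-in for the model's token limit, but it keeps the background context from growing without bound.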

#### LLM 1:
- INPUT: problem_statement by user (str)
- OUTPUT: reshaped and detailed version of the problem statement for the next LLM (JSON string, later parsed into a dict)

In [11]:
business_background = """Please keep in mind the following business background:
- The context involves an e-commerce brand specializing in home and living products sold on Amazon.
- The brand manages a large product catalogue and operates across multiple marketplaces, including DE, FR, IT, and ES.
- The primary challenge is in the analysis of data for informed decision-making in sales, inventory, and advertising domains."""

In [14]:
llm1_output_format = """
JSON format with the following keys:
"finetuned_problem_statement": "Your rephrased problem_statement that is about the same length as the original statement.",  
"domain": "One of the following: [Sales, Inventory, Advertising, Product]",
"collection_name": "if applicable, one from the following: [Berne tables, Scots Pine tables, Home office, Oviedo, Huesca, Baumkante coffee tables, Malaga, Baumkante consoles, Asymmetric mirrors, Terra mirrors, Mira mirrors, Mia, Roa, Mammo, Luna, Salamanca, Palencia, Bilbao, Murcia, Palamos]", 
"location": "if applicable, one of the following: [DE, FR, IT, ES]", 
"timeframe": "If applicable give details about timeframe mentioned",
"""

In [16]:
llm1_prompt = f"""
Your task is to refine a business owner's problem statement, provided below enclosed within triple backticks. \
Use the problem statement and Business Background Information, also enclosed within triple backticks, to understand the context. \
Your goal is to enhance the grammar, sentence structure, and clarity for better suitability for an AI model. Include specific details like KPIs or relevant metrics if applicable.

Original Problem Statement: 
```{problem_statement}```

Business Background Information (for context, not direct use):
```{business_background}```

Please follow this format for your output, focusing on clarity and detail:
```{llm1_output_format}```

Your reformulated statement will be integrated with additional documents to create a detailed prompt for another AI model. Ensure that your output is precise and thorough.
"""

In [17]:
def get_completion(prompt, model=llm, client=openai):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0
    )
    return response.choices[0].message.content

problem_statement_structured = get_completion(llm1_prompt)

In [18]:
print(problem_statement_structured)

{
"finetuned_problem_statement": "I need to analyze the advertising performance of our Berne product line in Germany during April 2023. I want to gain insights into key metrics and identify immediate actions to improve performance.",
"domain": "Advertising",
"collection_name": "Berne tables",
"location": "DE",
"timeframe": "April 2023"
}


In [19]:
import json 

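def read_string_to_list(input_string):
    """Parse the model's JSON reply into a Python object (a dict here, despite the function name)."""
    if input_string is None:
        return None

    try:
        # Try the string as-is first: blindly swapping quotes would corrupt apostrophes in valid JSON
        return json.loads(input_string)
    except json.JSONDecodeError:
        pass

    try:
        # Fallback for replies that use single-quoted keys and values
        return json.loads(input_string.replace("'", "\""))
    except json.JSONDecodeError:
        print("Error: Invalid JSON string")
        return None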

In [20]:
problem_statement_list = read_string_to_list(problem_statement_structured)
print(problem_statement_list)

{'finetuned_problem_statement': 'I need to analyze the advertising performance of our Berne product line in Germany during April 2023. I want to gain insights into key metrics and identify immediate actions to improve performance.', 'domain': 'Advertising', 'collection_name': 'Berne tables', 'location': 'DE', 'timeframe': 'April 2023'}
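
Because the model's reply is free text, it can help to check the parsed dictionary against the allowed values listed in `llm1_output_format` before passing it on. A minimal sketch, where `validate_problem_statement` is a hypothetical helper and only the `domain` and `location` lists are checked:

```python
ALLOWED_DOMAINS = {"Sales", "Inventory", "Advertising", "Product"}
ALLOWED_LOCATIONS = {"DE", "FR", "IT", "ES"}
REQUIRED_KEYS = {"finetuned_problem_statement", "domain", "collection_name", "location", "timeframe"}

def validate_problem_statement(parsed: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the dict looks valid."""
    problems = []
    missing = REQUIRED_KEYS - parsed.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if parsed.get("domain") not in ALLOWED_DOMAINS:
        problems.append(f"unexpected domain: {parsed.get('domain')!r}")
    # Location is optional ("if applicable"), so only check it when present and non-empty.
    location = parsed.get("location")
    if location and location not in ALLOWED_LOCATIONS:
        problems.append(f"unexpected location: {location!r}")
    return problems

example = {
    "finetuned_problem_statement": "Analyze Berne advertising performance in Germany, April 2023.",
    "domain": "Advertising",
    "collection_name": "Berne tables",
    "location": "DE",
    "timeframe": "April 2023",
}
print(validate_problem_statement(example))
```

Returning a list of problems instead of raising lets the notebook decide whether to retry the LLM call or continue with a partial result.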


#### LLM 2:
- INPUT: Loaded documents (URL, PDF, HTML; processed or unprocessed)

- OUTPUT: List of instructions with specialized information

In [32]:
amazon_analysis_template = """
To streamline the analysis of an Amazon PPC advertising report at a collection level, here are the step-by-step instructions:

Data Preparation:

Add a column titled 'Collection Name' to the advertising report.
Group campaign names under their respective collection names to organize the data at the collection level.
Sales Analysis:

For each collection, sum up the total sales from the '7 Day Total Sales' column.
Spending Analysis:

Calculate the total spend for each collection by adding up the amounts in the 'Spend' column.
ACoS (Advertising Cost of Sales) Calculation:

For each collection, calculate ACoS using the formula: (Total Spend / Total Sales) * 100. This gives the ACoS percentage.
ROAS (Return on Advertising Spend) Calculation:

Compute ROAS for each collection with the formula: Total Sales / Total Spend.
Impressions, Clicks, and CTR (Click-Through Rate) Analysis:

Total the 'Impressions' and 'Clicks' for each collection.
Calculate CTR using the formula: (Total Clicks / Total Impressions) * 100.
Conversion Rate Calculation:

Conversion rate is calculated by dividing the number of conversions (from the '7 Day Total Orders' column) by the total number of clicks, then multiplying by 100.
Review and Compare:

Compare these metrics across different collections to identify which are performing best and which need improvement.
Identify Trends:

Look for trends or patterns in the data over time, such as increasing or decreasing ACoS or ROAS, to inform future advertising strategies.
Actionable Insights:

Based on your analysis, determine which collections are most profitable, which campaigns need adjustment, and where budget allocation could be optimized.
"""

In [35]:
# Make sure document content is available

document_content = url_result1[0].page_content if url_result1 else "No document content available"

llm2_prompt = f"""
You are part of a consulting team that helps an e-commerce brand manage its Amazon PPC advertising. \
The client brand's background information is provided below, delimited by triple angle brackets.
Business background: <<<{business_background}>>>
Your duty in this consulting team has 3 steps: \
Step 1: Understand the business background: as a small, non-technical team, they cannot easily manage a large product catalogue and operate across multiple marketplaces. \
They would like to track performance in a streamlined way using Amazon PPC advertising reports that include campaign names, spend, sales, ACoS, ROAS, CTR, and conversion rate. \
Step 2: Read the provided document content, which gives detailed information about how to analyze Amazon advertising data and manage campaigns. \
Study these documents and take notes on how to implement a streamlined analysis of these spreadsheets. The documents are presented below, delimited by triple angle brackets: \
Document content: <<<{document_content}>>> \
Step 3: Create a report about your findings that will be used as step-by-step instructions for the brand. \
Use a format similar to the provided format, delimited by triple angle brackets. Provided format: <<<{amazon_analysis_template}>>> \
Keep the pseudo-code-like, step-by-step instruction format, but revise the accuracy of the information based on what you have learned from the documents you studied. \
Number each step. \
"""

test_output = get_completion(llm2_prompt)

In [36]:
print(test_output)

To streamline the analysis of an Amazon PPC advertising report at a collection level, follow these step-by-step instructions:

1. Data Preparation:
   a. Add a column titled 'Collection Name' to the advertising report.
   b. Group campaign names under their respective collection names to organize the data at the collection level.

2. Sales Analysis:
   a. For each collection, sum up the total sales from the '7 Day Total Sales' column.

3. Spending Analysis:
   a. Calculate the total spend for each collection by adding up the amounts in the 'Spend' column.

4. ACoS (Advertising Cost of Sales) Calculation:
   a. For each collection, calculate ACoS using the formula: (Total Spend / Total Sales) * 100. This gives the ACoS percentage.

5. ROAS (Return on Advertising Spend) Calculation:
   a. Compute ROAS for each collection with the formula: Total Sales / Total Spend.

6. Impressions, Clicks, and CTR (Click-Through Rate) Analysis:
   a. Total the 'Impressions' and 'Clicks' for each collecti
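
The formulas in the instructions above can be checked with a small worked example. The sketch below assumes a hypothetical `ppc_metrics` helper and made-up totals for one collection; it returns `None` for any ratio whose denominator is zero instead of raising.

```python
def ppc_metrics(spend, sales, impressions, clicks, orders):
    """Compute ACoS, ROAS, CTR and conversion rate from collection-level totals."""
    def ratio(numerator, denominator, scale=1.0):
        # Guard against division by zero (e.g. a collection with no sales yet).
        return None if denominator == 0 else numerator / denominator * scale

    return {
        "acos_pct": ratio(spend, sales, 100),              # (Total Spend / Total Sales) * 100
        "roas": ratio(sales, spend),                       # Total Sales / Total Spend
        "ctr_pct": ratio(clicks, impressions, 100),        # (Total Clicks / Total Impressions) * 100
        "conversion_rate_pct": ratio(orders, clicks, 100), # (Total Orders / Total Clicks) * 100
    }

# Made-up totals for illustration only.
metrics = ppc_metrics(spend=50.0, sales=200.0, impressions=10000, clicks=100, orders=5)
for name, value in metrics.items():
    print(name, value)
```

Note that ACoS and ROAS are reciprocals of each other up to the factor of 100, so a collection with a ROAS of 4 has an ACoS of 25%.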

In [27]:
prompt_format = """
Objective: [Your objective here]
Data Overview: [Brief description of the data]
Specific Requests/Questions: [List your specific queries or focus areas]
Instructions for Analysis/Response: [Step-by-step instructions or methodology preferences]
Expected Outcome: [Describe the desired output or result]
Additional Requirements/Constraints: [Any special considerations]
Call to Action/Closing Note: [Final remark or request for action]
"""

The input variables are now ready.

In [28]:
llm2_prompt = f"""
You are tasked with generating a well-structured prompt for analyzing Amazon PPC advertising data, \
which will be provided to a Language Model (LLM) along with real Amazon advertising reports. \
The objective is to create a streamlined data analysis process for an e-commerce brand specializing in home and living products sold on Amazon. \
This brand faces challenges due to its extensive product catalog (over 200 SKUs), numerous collections (around 40), and diverse product categories (around 10). \
The brand operates across multiple marketplaces, including DE, IT, FR, UK, and ES. \

Please create a prompt that outlines a comprehensive data analysis process. \
This process should focus on category-level insights and key performance indicators (KPIs) such as Click-Through Rate (CTR), Advertising Cost of Sales (ACoS), Return on Ad Spend (ROAS), \
Cost Per Click (CPC), and changes in impressions. Additionally, the process should be designed for automation, possibly using SQL agents, to enhance efficiency and accuracy.

1. To assist you in generating this prompt, you have access to the following document, which highlights the specific problems and areas of focus when managing Amazon PPC advertising campaigns:
Document content: <<<{document_content}>>>
Use information from this document to structure the ads-management instructions for the next LLM. \

2. State clearly in your prompt that it will be used by the next LLM together with real Amazon advertising reports. \
These reports will be provided in spreadsheet or tabular format and will show monthly performance with columns such as spend, budget, ACoS, ROAS, CPC, CTR, etc. \

Your prompt should incorporate technical knowledge from <<<{document_content}>>> to explain how to better structure the ads-management instructions for the next LLM. \
At the same time, it should draw attention to the fact that the prompt you create will be provided to the next LLM, which will use it with real Amazon advertising reports. \
Therefore, your instructions should guide the next LLM on how to effectively use the real data in the analysis process. \

Your generated prompt will serve as a guide for the next LLM in conducting data analysis with real Amazon advertising reports.
Please use the format that is delimited by triple brackets below:
<<<{prompt_format}>>>
"""

In [29]:
def create_ultimate_prompt(prompt, model=llm, client=openai):
    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0.2
    )
    return response.choices[0].message.content

ultimate_prompt = create_ultimate_prompt(llm2_prompt)

BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 4097 tokens. However, your messages resulted in 6776 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
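
The error above means the prompt has to be shortened before it is sent. One rough approach is to estimate tokens from the character count and trim the document text to a budget. The sketch below assumes about 4 characters per token, which is only a rule of thumb for English text and not the model's real tokenizer; `estimate_tokens` and `truncate_to_token_budget` are hypothetical helpers.

```python
CHARS_PER_TOKEN = 4  # assumption: rough average for English text, not an exact tokenizer

def estimate_tokens(text: str) -> int:
    """Estimate the token count from the character count (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division

def truncate_to_token_budget(text: str, max_tokens: int) -> str:
    """Cut `text` so that its estimated token count stays within `max_tokens`."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    # Prefer to cut at the last space so a word is not split in half.
    last_space = cut.rfind(" ")
    return cut[:last_space] if last_space > 0 else cut

long_text = "word " * 5000  # stand-in for a long document_content
short_text = truncate_to_token_budget(long_text, max_tokens=2000)
print(estimate_tokens(long_text), estimate_tokens(short_text))
```

An exact count would require the model's own tokenizer, so the budget should leave headroom for the rest of the prompt and for the reply.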

In [34]:
print(ultimate_prompt)

The document is a blog post that provides information and tips on how to analyze Amazon PPC (Pay-Per-Click) data. It emphasizes the importance of data-driven decision-making and highlights the benefits of analyzing PPC data, such as improved ROI, better targeting, increased sales, cost savings, and a competitive advantage. The document explains the different types of Amazon PPC data, including keyword data, campaign data, and search term data, and discusses key metrics to monitor, such as impressions, clicks, click-through rate (CTR), cost-per-click (CPC), and conversion rate. It also mentions various reports and tools that can be used for analyzing Amazon PPC data, such as Amazon Advertising console, Google Analytics, and third-party analytics tools. The document provides steps for analyzing Amazon PPC data, tips for effective data analysis, and ways to use data analysis to optimize campaign performance. It concludes by mentioning Scale Insights, an AI-driven analytics tool for Amazon

## Step 6: Load and Process Real Business Data
- Now, you'll need to load the actual Amazon advertising data, likely in a spreadsheet format. You can use pandas for this purpose.

In [None]:
import pandas as pd

# Load advertising data
ad_data = pd.read_csv('path/to/advertising/data.csv')

## Step 7: Analyze the Data
- With the prompt template and real data, you can now analyze the data.
- This might involve writing custom Python functions or using existing libraries.
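
As one example of such a custom function, the grouping and summing steps from the analysis template can be sketched with the standard library. The `Campaign Name` column, the campaign naming scheme, the sample rows, and the `CAMPAIGN_TO_COLLECTION` mapping are assumptions for illustration; a real report may use different names.

```python
import csv
import io
from collections import defaultdict

# Hypothetical mapping from campaign names to collection names.
CAMPAIGN_TO_COLLECTION = {
    "SP_Berne_Exact": "Berne tables",
    "SP_Berne_Broad": "Berne tables",
    "SP_Luna_Auto": "Luna",
}

def totals_by_collection(csv_text: str) -> dict:
    """Sum spend and sales per collection from a CSV advertising report."""
    totals = defaultdict(lambda: {"spend": 0.0, "sales": 0.0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Campaigns without a known collection are grouped under "Unmapped".
        collection = CAMPAIGN_TO_COLLECTION.get(row["Campaign Name"], "Unmapped")
        totals[collection]["spend"] += float(row["Spend"])
        totals[collection]["sales"] += float(row["7 Day Total Sales"])
    return dict(totals)

# Made-up rows for illustration only.
sample_report = """Campaign Name,Spend,7 Day Total Sales
SP_Berne_Exact,10.0,40.0
SP_Berne_Broad,5.0,20.0
SP_Luna_Auto,8.0,16.0
"""

collection_totals = totals_by_collection(sample_report)
for name, totals in collection_totals.items():
    acos = totals["spend"] / totals["sales"] * 100 if totals["sales"] else None
    print(name, totals, acos)
```

The same grouping could be done with pandas once the real report is loaded; the standard-library version just keeps the sketch dependency-free.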

In [None]:
# Example: Simple analysis function
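def analyze_data(data, prompt_template):
    # Placeholder: custom analysis logic goes here.
    # For now, return basic descriptive statistics so the cell runs end to end.
    analysis_results = data.describe()
    return analysis_results

# Use the prompt generated in Step 5 as the template
results = analyze_data(ad_data, ultimate_prompt)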

## Step 8: Display Results

In [None]:
print("Analysis Results:")
print(results)