**Style Analysis**

First, we load the CSV file of Amazon product listings from their online shop.

In [1]:
# load the data
import pandas as pd
df = pd.read_csv('amazon_co-ecommerce_sample.csv')

Then, we check the properties of the data frame.

In [2]:
df.head()

Unnamed: 0,index,uniq_id,product_name,manufacturer,price,number_available_in_stock,number_of_reviews,number_of_answered_questions,average_review_rating,amazon_category_and_sub_category,customers_who_bought_this_item_also_bought,description,product_information,product_description,items_customers_buy_after_viewing_this_item,customer_questions_and_answers,customer_reviews,sellers
0,0,eac7efa5dbd3d667f26eb3d3ab504464,Hornby 2014 Catalogue,Hornby,£3.42,5 new,15,1.0,4.9 out of 5 stars,Hobbies > Model Trains & Railway Sets > Rail V...,http://www.amazon.co.uk/Hornby-R8150-Catalogue...,Product Description Hornby 2014 Catalogue Box ...,Technical Details Item Weight640 g Product Dim...,Product Description Hornby 2014 Catalogue Box ...,http://www.amazon.co.uk/Hornby-R8150-Catalogue...,Does this catalogue detail all the previous Ho...,Worth Buying For The Pictures Alone (As Ever) ...,"{""seller""=>[{""Seller_name_1""=>""Amazon.co.uk"", ..."
1,1,b17540ef7e86e461d37f3ae58b7b72ac,FunkyBuys® Large Christmas Holiday Express Fes...,FunkyBuys,£16.99,,2,1.0,4.5 out of 5 stars,Hobbies > Model Trains & Railway Sets > Rail V...,http://www.amazon.co.uk/Christmas-Holiday-Expr...,Size Name:Large FunkyBuys® Large Christmas Hol...,Technical Details Manufacturer recommended age...,Size Name:Large FunkyBuys® Large Christmas Hol...,http://www.amazon.co.uk/Christmas-Holiday-Expr...,can you turn off sounds // hi no you cant turn...,Four Stars // 4.0 // 18 Dec. 2015 // By\n \...,"{""seller""=>{""Seller_name_1""=>""UHD WHOLESALE"", ..."
2,2,348f344247b0c1a935b1223072ef9d8a,CLASSIC TOY TRAIN SET TRACK CARRIAGES LIGHT EN...,ccf,£9.99,2 new,17,2.0,3.9 out of 5 stars,Hobbies > Model Trains & Railway Sets > Rail V...,http://www.amazon.co.uk/Classic-Train-Lights-B...,BIG CLASSIC TOY TRAIN SET TRACK CARRIAGE LIGHT...,Technical Details Manufacturer recommended age...,BIG CLASSIC TOY TRAIN SET TRACK CARRIAGE LIGHT...,http://www.amazon.co.uk/Train-With-Tracks-Batt...,What is the gauge of the track // Hi Paul.Trut...,**Highly Recommended!** // 5.0 // 26 May 2015 ...,"{""seller""=>[{""Seller_name_1""=>""DEAL-BOX"", ""Sel..."
3,3,e12b92dbb8eaee78b22965d2a9bbbd9f,HORNBY Coach R4410A BR Hawksworth Corridor 3rd,Hornby,£39.99,,1,2.0,5.0 out of 5 stars,Hobbies > Model Trains & Railway Sets > Rail V...,,Hornby 00 Gauge BR Hawksworth 3rd Class W 2107...,Technical Details Item Weight259 g Product Dim...,Hornby 00 Gauge BR Hawksworth 3rd Class W 2107...,,,I love it // 5.0 // 22 July 2013 // By\n \n...,
4,4,e33a9adeed5f36840ccc227db4682a36,Hornby 00 Gauge 0-4-0 Gildenlow Salt Co. Steam...,Hornby,£32.19,,3,2.0,4.7 out of 5 stars,Hobbies > Model Trains & Railway Sets > Rail V...,http://www.amazon.co.uk/Hornby-R6367-RailRoad-...,Product Description Hornby RailRoad 0-4-0 Gild...,Technical Details Item Weight159 g Product Dim...,Product Description Hornby RailRoad 0-4-0 Gild...,http://www.amazon.co.uk/Hornby-R2672-RailRoad-...,,Birthday present // 5.0 // 14 April 2014 // By...,


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 18 columns):
 #   Column                                       Non-Null Count  Dtype  
---  ------                                       --------------  -----  
 0   index                                        10000 non-null  int64  
 1   uniq_id                                      10000 non-null  object 
 2   product_name                                 10000 non-null  object 
 3   manufacturer                                 9993 non-null   object 
 4   price                                        8565 non-null   object 
 5   number_available_in_stock                    7500 non-null   object 
 6   number_of_reviews                            9982 non-null   object 
 7   number_of_answered_questions                 9235 non-null   float64
 8   average_review_rating                        9982 non-null   object 
 9   amazon_category_and_sub_category             9310 non-null   object 
 10 

At a glance, the most important columns in the data frame are the following:

1. *product_name* – title of the product
2. *product_information* – technical details & specifications
3. *product_description* – official Amazon description

Next, we want to use the product descriptions as a style reference for the model.

To do that, we need a light style analysis of the data. Specifically, we check how the descriptions are constructed (e.g. sentence structure, catchy words used) so that we can pass those patterns on to the model.

Note: this could be improved with natural language processing, but for this project we used regex.
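A rough way to surface frequently used ("catchy") words with regex alone is to tokenize each description and count the words. The sketch below is only an illustration: the `toy_descriptions` list, the `STOP_WORDS` set, and the `top_words` helper are hypothetical and not part of the original notebook, where the input would instead come from `df['product_description'].dropna()`.

```python
import re
from collections import Counter

# Hypothetical descriptions standing in for df['product_description'].dropna()
toy_descriptions = [
    "The perfect gift for young train fans. Easy to assemble!",
    "This classic set is perfect for beginners. Easy to store.",
]

# Small illustrative stop-word list (an assumption, not a standard list)
STOP_WORDS = {"the", "this", "is", "for", "to", "a", "and", "of"}

def top_words(descriptions, n=5):
    """Return the n most common lowercase words, ignoring stop words."""
    counts = Counter()
    for text in descriptions:
        # Lowercase the text and keep runs of letters and apostrophes as words
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

word_counts = top_words(toy_descriptions)
print(word_counts)
```

Words that appear across many descriptions (here, "perfect" and "easy") are candidates for the vocabulary the model should imitate.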


We then proceed with the light, manual style analysis.

In [5]:
# import requirements
import re

# create samples list
sample_descriptions = df['product_description'].dropna().sample(1000, random_state=1).tolist()
sentence_data = []

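# use regex to split each description into sentences
# (split on spaces that follow '.', '!' or '?')
for desc in sample_descriptions:
    sentences = re.split(r'(?<=[.!?]) +', desc.strip())
    sentence_data.extend(sentences)

# number of words per sentence
sentence_lengths = [len(s.split()) for s in sentence_data]
# first word of every non-empty sentence
sentence_starters = [s.split()[0] for s in sentence_data if len(s.split()) > 0]
# mean sentence length in words (0 if there are no sentences)
average_sentence_length = sum(sentence_lengths) / len(sentence_lengths) if sentence_lengths else 0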

print(f"Average sentence length: {average_sentence_length:.2f} words")
pd.Series(sentence_starters).value_counts().head(10)

Average sentence length: 16.60 words


The        363
Product    337
Box        238
This       195
It          71
A           66
Each        65
With        54
All         47
For         46
Name: count, dtype: int64

The output above summarizes a sample of 1,000 descriptions that we want to use as a guide for the model. We arrived at the following findings:

- Average sentence length of ~17 words
- Sentences typically start with words such as 'The', 'Product', 'Box', and 'This'
- Tone generally leans towards professional, slightly promotional, and often imperative
- Mix of complete sentences and bullet-point-style fragments
- Tends to highlight functionality ("stows away", "indoor dipole aerial")
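
The two quantitative findings (average sentence length and common sentence starters) could also be gathered by a small reusable helper, so that the same check can be rerun on another sample. This is a minimal sketch under assumptions: `summarize_style` and the `toy` list are hypothetical names, and unlike the cell above it skips empty strings so they do not count as zero-length sentences, which can shift the average slightly.

```python
import re
from collections import Counter

def summarize_style(descriptions, top_n=3):
    """Compute the average sentence length and the most common sentence starters."""
    sentences = []
    for desc in descriptions:
        # Same split rule as the notebook: break on spaces after '.', '!' or '?'
        parts = re.split(r'(?<=[.!?]) +', desc.strip())
        # Assumption: drop empty strings instead of counting them as sentences
        sentences.extend(p for p in parts if p.split())
    lengths = [len(s.split()) for s in sentences]
    starters = Counter(s.split()[0] for s in sentences)
    avg = sum(lengths) / len(lengths) if lengths else 0.0
    return {"avg_sentence_length": avg, "top_starters": starters.most_common(top_n)}

# Hypothetical descriptions standing in for sample_descriptions
toy = ["The set includes track. The engine needs batteries!", "This toy is fun.", "   "]
style = summarize_style(toy)
print(style)
```

In the notebook, the helper would be called as `summarize_style(sample_descriptions, top_n=10)` to mirror the cell above.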

Now we will use this analysis as a guide for our **prompt engineering**.