# KPop Merchandise Price Analysis

## Goal

We want to identify which sites offer the best deals on merchandise. We examine this at several levels: which sites are cheapest overall, which are better for particular artists, which carry exclusive merchandise, and so on.

### Importing Libraries & Loading Data

In [2]:
import pandas as pd
import numpy as np
from datetime import datetime

In [3]:
jyp = pd.read_csv("jyp_shop.csv")
kpopalbums = pd.read_csv("kpopalbums_shop.csv")
musicplaza = pd.read_csv("musicplaza_shop.csv")
sm = pd.read_csv("smglobalshop_all.csv")
kpopstoreinusa = pd.read_csv("kpopstoreinusa.csv", index_col=0)
mwave = pd.read_csv("mwave.csv")

##### Cleaning Data to Make Central Dataframe

In [4]:
# jypshop: add a vendor column, rename sold_out to is_sold_out, and prefix the relative URLs with the site root
jyp['vendor'] = "jypshop"
jyp = jyp.rename(columns={'sold_out': 'is_sold_out'})
jyp['url'] = 'https://en.thejypshop.com' + jyp['url'].astype(str)

# kpopstoreinusa: availability is the opposite of is_sold_out, so rename the column and invert it
# (the ~ operator assumes the column was read in as booleans)
kpopstoreinusa = kpopstoreinusa.rename(columns={'availability': 'is_sold_out'})
kpopstoreinusa['is_sold_out'] = ~kpopstoreinusa['is_sold_out']

In [5]:
data = pd.concat([jyp, kpopalbums, musicplaza, sm, kpopstoreinusa, mwave])

In [6]:
data.to_csv('data.csv',index=False)
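
The cleaning step above exists because `pd.concat` aligns frames by column name, so a column that is named differently in one shop's file ends up as a separate, mostly empty column in the combined frame. A minimal sketch of a pre-concat check follows. The two tiny frames, the `missing_columns` helper, and the `expected` column list are hypothetical stand-ins, not part of the original notebook.

```python
import pandas as pd

# Hypothetical miniature frames standing in for two of the scraped shop files.
shop_a = pd.DataFrame(
    {"item": ["Album A"], "price": [18.55], "vendor": ["shop_a"], "is_sold_out": [False]}
)
shop_b = pd.DataFrame(
    {"item": ["Album A"], "price": [16.51], "vendor": ["shop_b"], "availability": [True]}
)

def missing_columns(frames, expected):
    """Map each frame name to the expected columns it lacks."""
    return {
        name: sorted(set(expected) - set(df.columns))
        for name, df in frames.items()
    }

# Assumed shared schema for this illustration only.
expected = ["item", "price", "vendor", "is_sold_out"]
report = missing_columns({"shop_a": shop_a, "shop_b": shop_b}, expected)
print(report)
```

Any frame with a non-empty list in the report would need a rename (as done for `availability` above) before it is concatenated.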

### Analysis Explanation

While there are plenty of K-pop groups to choose from, it makes sense to compare "competing" groups, looking at how their products are priced both at the group level and from store to store. Although these groups have distinct styles, presentation, and fanbases, they compete with one another because of their size, fame, and importance to their parent entertainment companies. I will therefore examine how their recent comebacks are priced on different platforms, whether some groups charge more for their merchandise, and whether the same product is priced significantly differently across e-commerce platforms. For this analysis, I compare Twice and Red Velvet, who are signed to JYP Entertainment and SM Entertainment, respectively.

In [7]:
# Loading master data file
data = pd.read_csv("data.csv")

#### Identifying Official Store Prices by Select Artists

In [8]:
twice_data = data[data['item'].str.contains("twice", case = False)]
reve_data = data[data['item'].str.contains("red velvet|reve", case = False)]
bts_data = data[data['item'].str.contains("bts", case = False)]

In [9]:
# Most recent comebacks: Formula of Love: O+T=<3 and Between 1&2 for Twice; The ReVe Festival 2022 - Feel My Rhythm and Queendom for Red Velvet.
twice_fol = twice_data[twice_data['item'].str.contains("formula", case = False)].reset_index(drop=True)
twice_between = twice_data[twice_data['item'].str.contains("between", case = False)].reset_index(drop=True)
reve_queendom = reve_data[reve_data['item'].str.contains("queendom", case = False)].reset_index(drop=True)
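# Match the full phrase; "feel|my|rhythm" is a regex alternation that matches any item containing just one of those words
reve_feel_my_rhythm = reve_data[reve_data['item'].str.contains("feel my rhythm", case = False)].reset_index(drop=True)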

In [10]:
pd.set_option("display.max_colwidth", None)
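
Since `str.contains` treats its pattern as a regular expression by default, `|` means "or". The sketch below shows how an alternation pattern differs from a phrase pattern on a few titles. The third title is made up for illustration, and `na=False` is an optional safeguard (not used in the cells above) that treats missing item names as non-matches.

```python
import pandas as pd

items = pd.Series([
    "RED VELVET - [The ReVe Festival 2022 : Feel My Rhythm] Mimi Album ORGEL Version",
    "RED VELVET - [QUEENDOM] 6th Mini Album GIRLS Version",
    "Red Velvet - My Hypothetical Photobook",  # made-up title for illustration
    None,                                      # a missing item name
])

# Alternation: matches if ANY of the three words appears in the title.
any_word = items.str.contains("feel|my|rhythm", case=False, na=False)

# Phrase: matches only when the full title phrase appears.
phrase = items.str.contains("feel my rhythm", case=False, na=False)

print(pd.DataFrame({"item": items, "any_word": any_word, "phrase": phrase}))
```

Only the phrase pattern restricts the result to the Feel My Rhythm listing; the alternation pattern also picks up the unrelated title that merely contains "My".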

##### Feel My Rhythm vs Between 1&2

In [11]:
twice_b_subset = twice_between[["item","discount_price","price","vendor","is_sold_out"]]
print(twice_b_subset)

                                                                                  item  \
0                                                    TWICE 11th MINI ALBUM BETWEEN 1&2   
1                                 [VIDEO CALL EVENT] TWICE 11th MINI ALBUM BETWEEN 1&2   
2                                   [FAN SIGN EVENT] TWICE 11th MINI ALBUM BETWEEN 1&2   
3                                      TWICE BETWEEN 1&2 BADGE - TWICE 7TH ANNIVERSARY   
4                               TWICE - [BETWEEN 1&2] 11th Mini Album COMPLETE Version   
5                                  TWICE - [BETWEEN 1&2] 11th Mini Album 4 Version SET   
6                             TWICE - [BETWEEN 1&2] 11th Mini Album PATHFINDER Version   
7                                TWICE - [BETWEEN 1&2] 11th Mini Album ARCHIVE Version   
8                           TWICE - [BETWEEN 1&2] 11th Mini Album CRYPTOGRAPHY Version   
9                                 TWICE - [BETWEEN 1&2] 11th Mini Album RANDOM Version   
10      트와

In [12]:
reve_fmr_subset = reve_feel_my_rhythm[["item","discount_price","price","vendor","is_sold_out"]]
print(reve_fmr_subset)

                                                                                                       item  \
0                      RED VELVET - [The ReVe Festival 2022 : Feel My Rhythm] Mimi Album ReVe 2 Version SET   
1                    RED VELVET - [The ReVe Festival 2022 : Feel My Rhythm] Mimi Album ReVe CALMATO Version   
2                RED VELVET - [The ReVe Festival 2022 : Feel My Rhythm] Mimi Album ReVe CAPRICCIOSO Version   
3                     RED VELVET - [The ReVe Festival 2022 : Feel My Rhythm] Mimi Album ReVe RANDOM Version   
4                           RED VELVET - [The ReVe Festival 2022 : Feel My Rhythm] Mimi Album ORGEL Version   
5                RED VELVET - [The ReVe Festival 2022 : Feel My Rhythm] Mimi Album ReVe CAPRICCIOSO Version   
6                     RED VELVET - [The ReVe Festival 2022 : Feel My Rhythm] Mimi Album ReVe RANDOM Version   
7                    RED VELVET - [The ReVe Festival 2022 : Feel My Rhythm] Mimi Album ReVe CALMATO Version   
8

Looking at the prices, we find that Between 1&2 sells for $18.55 pre-tax on the official JYP site but is $2.04 cheaper on kpopalbums.com. The other sites in our scraped data do not sell the albums individually and instead bundle them as a set; on kpopstoreinusa, that set costs about $40 more. Factoring in the posters, the complete set with posters on kpopstoreinusa is still about $20 to $24 more expensive. Since none of the listed prices include shipping, kpopalbums.com appears to be the better site to use.

A similar pattern holds for Red Velvet's Feel My Rhythm. The official store prices are several dollars higher than those on third-party sites, but the difference is partially offset by exclusive collectibles that are not found on other merchant sites. Kpopalbums.com again appears to be the cheapest site overall for albums and merchandise.
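
The comparisons above were read off the printed tables by eye. A rough sketch of how the same gap could be computed is shown below. It assumes `price` and `discount_price` are numeric and that a missing `discount_price` means no discount applies. The first two rows mirror the $18.55 and $2.04 figures quoted in the text (which column held the lower figure is an assumption), and the third row and its vendor name are invented.

```python
import pandas as pd

# Illustrative rows; the real values come from the scraped CSV files.
sample = pd.DataFrame({
    "item": ["Between 1&2", "Between 1&2", "Between 1&2"],
    "price": [18.55, 17.99, 20.00],
    "discount_price": [None, 16.51, None],
    "vendor": ["jypshop", "kpopalbums", "hypothetical_shop"],
})

# Use the discounted price when one exists, otherwise fall back to the list price.
sample["effective_price"] = sample["discount_price"].fillna(sample["price"])

# Gap between each vendor's price and the cheapest listing.
cheapest = sample["effective_price"].min()
sample["gap_vs_cheapest"] = (sample["effective_price"] - cheapest).round(2)

print(sample[["vendor", "effective_price", "gap_vs_cheapest"]])
```

Shipping is not part of either column, so a gap computed this way only compares pre-shipping prices.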

##### Queendom vs Formula of Love

In [13]:
twice_fol_subset = twice_fol[["item","discount_price","price","vendor","is_sold_out"]]
print(twice_fol_subset)

                                                                                          item  \
0                                                      TWICE 3rd Album Formula of Love: O+T=<3   
1                              TWICE - [FORMULA OF LOVE: O+T=<3] 3rd Album RESULT FILE Version   
2                                 TWICE - [FORMULA OF LOVE: O+T=<3] 3rd Album BREAK IT Version   
3                             TWICE - [FORMULA OF LOVE: O+T=<3] 3rd Album FULL OF LOVE Version   
4                                   TWICE - [FORMULA OF LOVE: O+T=<3] 3rd Album RANDOM Version   
5                                    TWICE - [FORMULA OF LOVE: O+T=<3] 3rd Album 4 Version SET   
6                                TWICE - [FORMULA OF LOVE: O+T=<3] 3rd Album EXPLOSION Version   
7                         TWICE - [FORMULA OF LOVE: O+T=<3] 3rd Album STUDY ABOUT LOVE Version   
8                                          트와이스 | TWICE 3RD ALBUM [ FORMULA OF LOVE : O+T=<3 ]   
9                   

In [14]:
reve_queendom_subset = reve_queendom[["item","discount_price","price","vendor","is_sold_out"]]
print(reve_queendom_subset)

                                                                                                      item  \
0                                                     RED VELVET - [QUEENDOM] 6th Mini Album GIRLS Version   
1                                       RED VELVET - [QUEENDOM] 6th Mini Album QUEENS Version RANDOM Cover   
2                                                     RED VELVET - [QUEENDOM] 6th Mini Album GIRLS Version   
3                                              레드벨벳 | RED VELVET 6TH MINI ALBUM [ QUEENDOM ] GIRLS VERSION   
4                                             레드벨벳 | RED VELVET 6TH MINI ALBUM [ QUEENDOM ] QUEENS VERSION   
5                                           레드벨벳 | RED VELVET [ INTERVIEW VOL. 7: QUEENDOM ] POSTCARD BOOK   
6                           레드벨벳 | RED VELVET | 6TH MINI ALBUM [ QUEENDOM ] | (GIRLS - B VER.) POSTER ONLY   
7                           레드벨벳 | RED VELVET | 6TH MINI ALBUM [ QUEENDOM ] | (GIRLS - A VER.) POSTER ONLY   
8         

Prices for Formula of Love vary much less between sites. Kpopalbums.com still sells its items at a slightly better price, but it is not immediately clear which site is best because shipping costs vary. For Queendom, kpopalbums.com has the cheaper album and merchandise prices, but some exclusive items are available only through the official SM Entertainment site.

From the four albums we have looked at, kpopalbums.com appears to be the go-to site for buying albums, while the official shop sites are useful solely for exclusive merchandise. That said, some merchandise can be found only on certain e-commerce sites, so for anyone who wants merchandise that the official site no longer carries, kpopstoreinusa and musicplaza appear to be the best options.

There does not appear to be a significant difference in album and merchandise pricing between the two groups, although Red Velvet prices generally skew higher. This is not a strong claim, however, because the groups also sell different exclusive merchandise.
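
One way to put a number on the "skews higher" impression is to label each listing by group and compare medians. The sketch below uses invented listings and prices, plus a hypothetical `label_group` helper that mirrors the keyword filters used earlier. It illustrates the approach and is not a result from the scraped data.

```python
import pandas as pd

# Invented listings and prices, for illustration only.
listings = pd.DataFrame({
    "item": [
        "TWICE album",
        "TWICE badge",
        "RED VELVET album",
        "Red Velvet postcard book",
        "ReVe poster",
    ],
    "price": [18.0, 6.0, 20.0, 14.0, 8.0],
})

def label_group(item):
    """Assign a group label using the same keywords as the earlier filters."""
    lowered = item.lower()
    if "twice" in lowered:
        return "Twice"
    if "red velvet" in lowered or "reve" in lowered:
        return "Red Velvet"
    return "Other"

listings["group"] = listings["item"].map(label_group)

# The median is less sensitive than the mean to a few expensive exclusive items.
summary = listings.groupby("group")["price"].agg(["count", "median"])
print(summary)
```

Because the two groups sell different kinds of items, restricting the comparison to a single product type (for example, albums only) would give a fairer reading.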

##### Platform Product Availability

Given that we have identified kpopalbums.com as our general "go-to" site for most kpop merchandise, the next step is to gauge product availability. 

In [19]:
print("missing items total (kpopalbums): " + str(len(kpopalbums[kpopalbums['is_sold_out'] == True])))
print("total number of rows (kpopalbums): " + str(len(kpopalbums)))

missing items total (kpopalbums): 98
total number of rows (kpopalbums): 4976


In [20]:
print("missing items total (musicplaza): " + str(len(musicplaza[musicplaza['is_sold_out'] == True])))
print("total number of rows (musicplaza): " + str(len(musicplaza)))

missing items total (musicplaza): 2401
total number of rows (musicplaza): 7011


In [21]:
print("missing items total (kpopstoreinusa): " + str(len(kpopstoreinusa[kpopstoreinusa['is_sold_out'] == True])))
print("total number of rows (kpopstoreinusa): " + str(len(kpopstoreinusa)))

missing items total (kpopstoreinusa): 3341
total number of rows (kpopstoreinusa): 10222


With roughly 2% of its items sold out (98 of 4,976), kpopalbums.com is doing very well compared with other third-party sites such as musicplaza (about 34% sold out) and kpopstoreinusa (about 33%). Although kpopstoreinusa carries a wider range of merchandise, it may be a better strategy to check kpopalbums first and then kpopstoreinusa when comparing prices for popular, in-demand items such as an album or EP from a popular artist.
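
The percentages follow directly from the counts printed above. A short sketch of the arithmetic is given below, using those counts as plain numbers. The `counts` dictionary and the toy `flags` series are constructed here for illustration.

```python
import pandas as pd

# (sold-out rows, total rows) per vendor, copied from the outputs above.
counts = {
    "kpopalbums": (98, 4976),
    "musicplaza": (2401, 7011),
    "kpopstoreinusa": (3341, 10222),
}

# Sold-out share as a percentage, rounded to one decimal place.
rates = {
    vendor: round(100 * sold / total, 1)
    for vendor, (sold, total) in counts.items()
}
print(rates)

# On a boolean column, the mean is the same share: True counts as 1, False as 0.
flags = pd.Series([True, False, False, False])  # toy stand-in for is_sold_out
toy_rate = flags.mean()
print(toy_rate)
```

Applied to the real frames, the same idea would be `kpopalbums['is_sold_out'].mean()`, assuming the column holds booleans with no missing values.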