# Data cleaning and merging dataframes

## Load datasets

In [1]:
import pandas as pd

# products.csv
url = 'https://drive.google.com/file/d/1UfsHI80cpQqGfsH2g4T4Tsw8cWayOfzC/view?usp=sharing' 
path = 'https://drive.google.com/uc?export=download&id='+url.split('/')[-2]
products = pd.read_csv(path)
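
The link conversion in the cell above can be wrapped in a small helper so it can be reused for other files. This is only a sketch: `drive_download_url` is a hypothetical name, not part of pandas or any Google library, and it assumes the share link has the form `.../file/d/<file_id>/view?...`.

```python
def drive_download_url(share_url: str) -> str:
    """Turn a Google Drive share link into a direct-download link.

    Assumes the link looks like .../file/d/<file_id>/view?...,
    so the file id is the second-to-last segment when splitting on '/'.
    """
    file_id = share_url.split('/')[-2]
    return 'https://drive.google.com/uc?export=download&id=' + file_id


url = 'https://drive.google.com/file/d/1UfsHI80cpQqGfsH2g4T4Tsw8cWayOfzC/view?usp=sharing'
path = drive_download_url(url)
```

The resulting `path` can be passed to `pd.read_csv` exactly as in the cell above.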

## Challenge: Cleaning products

A quick review of the major problems in `products`:

In [2]:
print(products.info(), "\n")
print("Missing values:", products.isna().sum(), "\n")
print("Duplicate rows:", products.duplicated().sum())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 19326 entries, 0 to 19325
Data columns (total 7 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   sku          19326 non-null  object
 1   name         19326 non-null  object
 2   desc         19319 non-null  object
 3   price        19280 non-null  object
 4   promo_price  19326 non-null  object
 5   in_stock     19326 non-null  int64 
 6   type         19276 non-null  object
dtypes: int64(1), object(6)
memory usage: 1.0+ MB
None 

Missing values: sku             0
name            0
desc            7
price          46
promo_price     0
in_stock        0
type           50
dtype: int64 

Duplicate rows: 8746


Looking at this overview, several things need to change:

* Data types:
    * `price` should be a float
    * `promo_price` should be a float
* Duplicated rows must be removed.
    * To accomplish this step you can use the method `pd.DataFrame.drop_duplicates()`. Be sure to drop duplicates based on the column **sku**, as it is the one you will use to merge with orderlines.
* Missing values:
    * `desc`: the description can perhaps be inferred from the name.
    * `price`: is there a way to extract this information from another table?
    * `type`: do we need this column for our analysis?

This task can be accomplished using methods you already know.
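
As a rough illustration of the first two points, the sketch below uses a tiny made-up table (the `toy` data is invented for this example and is not taken from `products.csv`). It shows why deduplicating on **sku** differs from dropping fully identical rows, and how `pd.to_numeric` converts clean price strings to floats.

```python
import pandas as pd

# Toy data (made up for illustration): the same sku appears twice,
# but the two rows differ in the name column.
toy = pd.DataFrame({
    'sku': ['A1', 'A1', 'B2'],
    'name': ['Cable', 'Cable (new listing)', 'Mouse'],
    'price': ['25', '25', '59.99'],
})

# duplicated() with no arguments only flags rows identical in every column,
# so it does not catch the second 'A1' row here.
full_row_dupes = toy.duplicated().sum()

# Deduplicating on sku keeps the first row found for each sku.
deduped = toy.drop_duplicates(subset=['sku'])

# Once the strings are well-formed, to_numeric turns them into floats.
deduped = deduped.assign(price=pd.to_numeric(deduped['price']))
```

With the real data, the conversion step fails until the malformed price strings are repaired, which is what the rest of this notebook works on.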

In [996]:
products.shape

(19326, 7)

## Start of the challenge.

#### Duplicates

In [3]:
my_products = products.drop_duplicates(subset=['sku'])
my_products

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
0,RAI0007,Silver Rain Design mStand Support,Aluminum support compatible with all MacBook,59.99,499.899,1,8696
1,APP0023,Apple Mac Keyboard Keypad Spanish,USB ultrathin keyboard Apple Mac Spanish.,59,589.996,0,13855401
2,APP0025,Mighty Mouse Apple Mouse for Mac,mouse Apple USB cable.,59,569.898,0,1387
3,APP0072,Apple Dock to USB Cable iPhone and iPod white,IPhone dock and USB Cable Apple iPod.,25,229.997,0,1230
4,KIN0007,Mac Memory Kingston 2GB 667MHz DDR2 SO-DIMM,2GB RAM Mac mini and iMac (2006/07) MacBook Pr...,34.99,31.99,1,1364
...,...,...,...,...,...,...,...
19321,BEL0376,Belkin Travel Support Apple Watch Black,compact and portable stand vertically or horiz...,29.99,269.903,1,12282
19322,THU0060,"Enroute Thule 14L Backpack MacBook 13 ""Black",Backpack with capacity of 14 liter compartment...,69.95,649.903,1,1392
19323,THU0061,"Enroute Thule 14L Backpack MacBook 13 ""Blue",Backpack with capacity of 14 liter compartment...,69.95,649.903,1,1392
19324,THU0062,"Enroute Thule 14L Backpack MacBook 13 ""Red",Backpack with capacity of 14 liter compartment...,69.95,649.903,0,1392


In [4]:
products.shape

(19326, 7)

#### Fix data types

In [5]:
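# Drop rows with missing values, then count the dot-separated parts of each
# price string: 1 = no dot, 2 = one decimal point, 3 = two dots (malformed).
my_products = my_products.dropna()
my_products1 = my_products.assign(promo_count = my_products['promo_price'].str.split('.').str.len(),
                                  price_count = my_products['price'].str.split('.').str.len())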
my_products1

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
0,RAI0007,Silver Rain Design mStand Support,Aluminum support compatible with all MacBook,59.99,499.899,1,8696,2,2
1,APP0023,Apple Mac Keyboard Keypad Spanish,USB ultrathin keyboard Apple Mac Spanish.,59,589.996,0,13855401,2,1
2,APP0025,Mighty Mouse Apple Mouse for Mac,mouse Apple USB cable.,59,569.898,0,1387,2,1
3,APP0072,Apple Dock to USB Cable iPhone and iPod white,IPhone dock and USB Cable Apple iPod.,25,229.997,0,1230,2,1
4,KIN0007,Mac Memory Kingston 2GB 667MHz DDR2 SO-DIMM,2GB RAM Mac mini and iMac (2006/07) MacBook Pr...,34.99,31.99,1,1364,2,2
...,...,...,...,...,...,...,...,...,...
19321,BEL0376,Belkin Travel Support Apple Watch Black,compact and portable stand vertically or horiz...,29.99,269.903,1,12282,2,2
19322,THU0060,"Enroute Thule 14L Backpack MacBook 13 ""Black",Backpack with capacity of 14 liter compartment...,69.95,649.903,1,1392,2,2
19323,THU0061,"Enroute Thule 14L Backpack MacBook 13 ""Blue",Backpack with capacity of 14 liter compartment...,69.95,649.903,1,1392,2,2
19324,THU0062,"Enroute Thule 14L Backpack MacBook 13 ""Red",Backpack with capacity of 14 liter compartment...,69.95,649.903,0,1392,2,2
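
To make the two helper columns easier to read, here is a minimal sketch of the counting step on three sample strings in the shapes that appear in this dataset. Counting dots with `str.count` is shown as an equivalent alternative; it is not what the notebook uses.

```python
import pandas as pd

# Sample strings in the three shapes seen in the price columns.
prices = pd.Series(['59', '59.99', '1.639.792'])

# Splitting on '.' gives one more part than there are dots.
parts = prices.str.split('.').str.len()

# Counting the dots directly is an equivalent alternative
# (str.count takes a regular expression, so the dot is escaped).
dots = prices.str.count(r'\.')
```

A value of 3 in `parts` (two dots) marks a string that cannot be parsed as a number as it stands.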


### Cleaning the first set (promo_count == 3)

In [6]:
my_prdt3 = my_products1[my_products1['promo_count'] == 3]
my_prdt3

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
50,APP0367,Apple Mini DisplayPort to DVI Adapter Mac dual...,Adapter Mini Display Port to DVI dual channel ...,119,1.119.976,0,1325,3,1
51,APP0344,"Apple Thunderbolt Display 27 ""Monitor Mac",Monitor Display 27-inch Apple Thunderbolt (MC9...,1149,10.449.923,0,1296,3,1
66,MAK0008,Maclocks theft case iPad 2 3 and 4 black with ...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,1.199.957,0,12635403,3,1
67,MAK0007,Maclocks theft case iPad 2 3 and 4 transparent...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,1.079.961,0,12635403,3,1
97,MAK0014,Maclocks safety housing Kiosk iPad 2 3 and 4 b...,Holder and housing iPad 2 3 and 4 aluminum and...,164.99,1.649.896,0,1216,3,2
...,...,...,...,...,...,...,...,...,...
19300,REP0412,Rear Camera Repair iPhone 7 Plus,It is including parts and labor for iPhone 7 Plus,119.99,1.199.897,0,"1,44E+11",3,2
19317,REP0403,iPad LCD screen repair,Repair service including parts and labor for iPad,159.99,1.599.898,0,"1,44E+11",3,2
19318,REP0402,iPad touch screen repair,Repair service including parts and labor for iPad,139.99,1.399.897,0,"1,44E+11",3,2
19319,KNO0032,"Knomo MacBook Pro Beauchamp Backpack 14 ""Black",Backpack thin nylon mesh internal compartment ...,179,1.699.905,1,1392,3,1


In [7]:
my_prdt3[my_prdt3.price_count == 3]

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
665,CRU0015-2,Crucial memory Mac 16GB (2x8GB) SO-DIMM DDR3 1...,RAM 16GB (2x8GB) 135V MacBook Pro iMac (2012/2...,1.639.792,1.629.894,1,1364,3,3
827,PAC0339,NewerTech miniStack 4TB Hard Drive Mac,External Box Hard Drive Mac + 4TB.,2.199.791,2.199.901,0,11935397,3,3
885,PAC0376,OWC Mercury Elite Pro Dual Thunderbolt + 8TB,RAID outer box 35 inch SATA connection Thunder...,5.609.698,5.549.895,0,11935397,3,3
943,REP0188,Full Screen Repair iPad Mini 2,Repair service including parts and labor for i...,2.099.895,2.099.895,0,"1,44E+11",3,3
1057,MOP0057,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 16GB external storage...,1.329.911,1.329.911,0,"5,49E+11",3,3
...,...,...,...,...,...,...,...,...,...
19184,UBI0009,Ubiquiti Amplifi Mesh Access Point,Point Smart Wi-Fi high-density access with mes...,1.499.892,1.499.892,0,1334,3,3
19248,DJI0026,DJI Mavic Air Drone cuadricóptero Arctic White,Drone cuadricóptero laptop with integrated cam...,84.900.013,8.490.001,0,11905404,3,3
19249,DJI0025,DJI Mavic Air Drone Black Onyx cuadricóptero,Drone cuadricóptero laptop with integrated cam...,84.900.013,8.490.001,0,11905404,3,3
19251,LIN0014,Linksys Wi-Fi Velop system AC4400 2 units,Wi-Fi high-density intelligent Mesh technology,2.999.905,2.999.905,1,1334,3,3


### Create further categories based on price_count.

#### price_count == 3

In [8]:
set33 = my_prdt3[my_prdt3.price_count == 3]
set33

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
665,CRU0015-2,Crucial memory Mac 16GB (2x8GB) SO-DIMM DDR3 1...,RAM 16GB (2x8GB) 135V MacBook Pro iMac (2012/2...,1.639.792,1.629.894,1,1364,3,3
827,PAC0339,NewerTech miniStack 4TB Hard Drive Mac,External Box Hard Drive Mac + 4TB.,2.199.791,2.199.901,0,11935397,3,3
885,PAC0376,OWC Mercury Elite Pro Dual Thunderbolt + 8TB,RAID outer box 35 inch SATA connection Thunder...,5.609.698,5.549.895,0,11935397,3,3
943,REP0188,Full Screen Repair iPad Mini 2,Repair service including parts and labor for i...,2.099.895,2.099.895,0,"1,44E+11",3,3
1057,MOP0057,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 16GB external storage...,1.329.911,1.329.911,0,"5,49E+11",3,3
...,...,...,...,...,...,...,...,...,...
19184,UBI0009,Ubiquiti Amplifi Mesh Access Point,Point Smart Wi-Fi high-density access with mes...,1.499.892,1.499.892,0,1334,3,3
19248,DJI0026,DJI Mavic Air Drone cuadricóptero Arctic White,Drone cuadricóptero laptop with integrated cam...,84.900.013,8.490.001,0,11905404,3,3
19249,DJI0025,DJI Mavic Air Drone Black Onyx cuadricóptero,Drone cuadricóptero laptop with integrated cam...,84.900.013,8.490.001,0,11905404,3,3
19251,LIN0014,Linksys Wi-Fi Velop system AC4400 2 units,Wi-Fi high-density intelligent Mesh technology,2.999.905,2.999.905,1,1334,3,3


In [9]:
# Strip every dot; the raw string avoids an invalid-escape warning for '\.'.
set33 = set33.assign(result1 = set33['price'].str.replace(r'\.', '', regex=True),
                     result2 = set33['promo_price'].str.replace(r'\.', '', regex=True))
set33

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result1,result2
665,CRU0015-2,Crucial memory Mac 16GB (2x8GB) SO-DIMM DDR3 1...,RAM 16GB (2x8GB) 135V MacBook Pro iMac (2012/2...,1.639.792,1.629.894,1,1364,3,3,1639792,1629894
827,PAC0339,NewerTech miniStack 4TB Hard Drive Mac,External Box Hard Drive Mac + 4TB.,2.199.791,2.199.901,0,11935397,3,3,2199791,2199901
885,PAC0376,OWC Mercury Elite Pro Dual Thunderbolt + 8TB,RAID outer box 35 inch SATA connection Thunder...,5.609.698,5.549.895,0,11935397,3,3,5609698,5549895
943,REP0188,Full Screen Repair iPad Mini 2,Repair service including parts and labor for i...,2.099.895,2.099.895,0,"1,44E+11",3,3,2099895,2099895
1057,MOP0057,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 16GB external storage...,1.329.911,1.329.911,0,"5,49E+11",3,3,1329911,1329911
...,...,...,...,...,...,...,...,...,...,...,...
19184,UBI0009,Ubiquiti Amplifi Mesh Access Point,Point Smart Wi-Fi high-density access with mes...,1.499.892,1.499.892,0,1334,3,3,1499892,1499892
19248,DJI0026,DJI Mavic Air Drone cuadricóptero Arctic White,Drone cuadricóptero laptop with integrated cam...,84.900.013,8.490.001,0,11905404,3,3,84900013,8490001
19249,DJI0025,DJI Mavic Air Drone Black Onyx cuadricóptero,Drone cuadricóptero laptop with integrated cam...,84.900.013,8.490.001,0,11905404,3,3,84900013,8490001
19251,LIN0014,Linksys Wi-Fi Velop system AC4400 2 units,Wi-Fi high-density intelligent Mesh technology,2.999.905,2.999.905,1,1334,3,3,2999905,2999905


In [10]:
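# cor_price = corrected price
# cor_promo_price = corrected promo_price
# The decimal point is re-inserted after the third digit,
# so the [:3] slice assumes a three-digit integer part.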

set33['cor_price'] = set33.apply(lambda row: row['result1'][:3] + '.' + row['result1'][3:], axis=1)
set33['cor_promo_price'] = set33.apply(lambda row: row['result2'][:3] + '.' + row['result2'][3:], axis=1)
set33                                

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result1,result2,cor_price,cor_promo_price
665,CRU0015-2,Crucial memory Mac 16GB (2x8GB) SO-DIMM DDR3 1...,RAM 16GB (2x8GB) 135V MacBook Pro iMac (2012/2...,1.639.792,1.629.894,1,1364,3,3,1639792,1629894,163.9792,162.9894
827,PAC0339,NewerTech miniStack 4TB Hard Drive Mac,External Box Hard Drive Mac + 4TB.,2.199.791,2.199.901,0,11935397,3,3,2199791,2199901,219.9791,219.9901
885,PAC0376,OWC Mercury Elite Pro Dual Thunderbolt + 8TB,RAID outer box 35 inch SATA connection Thunder...,5.609.698,5.549.895,0,11935397,3,3,5609698,5549895,560.9698,554.9895
943,REP0188,Full Screen Repair iPad Mini 2,Repair service including parts and labor for i...,2.099.895,2.099.895,0,"1,44E+11",3,3,2099895,2099895,209.9895,209.9895
1057,MOP0057,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 16GB external storage...,1.329.911,1.329.911,0,"5,49E+11",3,3,1329911,1329911,132.9911,132.9911
...,...,...,...,...,...,...,...,...,...,...,...,...,...
19184,UBI0009,Ubiquiti Amplifi Mesh Access Point,Point Smart Wi-Fi high-density access with mes...,1.499.892,1.499.892,0,1334,3,3,1499892,1499892,149.9892,149.9892
19248,DJI0026,DJI Mavic Air Drone cuadricóptero Arctic White,Drone cuadricóptero laptop with integrated cam...,84.900.013,8.490.001,0,11905404,3,3,84900013,8490001,849.00013,849.0001
19249,DJI0025,DJI Mavic Air Drone Black Onyx cuadricóptero,Drone cuadricóptero laptop with integrated cam...,84.900.013,8.490.001,0,11905404,3,3,84900013,8490001,849.00013,849.0001
19251,LIN0014,Linksys Wi-Fi Velop system AC4400 2 units,Wi-Fi high-density intelligent Mesh technology,2.999.905,2.999.905,1,1334,3,3,2999905,2999905,299.9905,299.9905
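
The same correction can be written without a row-wise `apply`. The sketch below is an alternative formulation, not the notebook's own code. It uses two of the malformed values shown above and keeps the same assumption: the integer part has exactly three digits.

```python
import pandas as pd

# Two malformed values copied from the table above.
raw = pd.Series(['1.639.792', '84.900.013'])

# Remove every dot (regex=False treats '.' as a literal character).
digits = raw.str.replace('.', '', regex=False)

# Re-insert a single decimal point after the third digit, convert to float,
# and round to cents. Assumes a three-digit integer part.
fixed = pd.to_numeric(digits.str[:3] + '.' + digits.str[3:]).round(2)
```

String slicing through `.str[:3]` works on the whole column at once, which tends to be faster than `apply(..., axis=1)` on large frames.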


In [11]:
set33.dtypes

sku                object
name               object
desc               object
price              object
promo_price        object
in_stock            int64
type               object
promo_count         int64
price_count         int64
result1            object
result2            object
cor_price          object
cor_promo_price    object
dtype: object

In [12]:
set33['price'] = pd.to_numeric(set33.cor_price)
set33['promo_price'] = pd.to_numeric(set33.cor_promo_price)
set33

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result1,result2,cor_price,cor_promo_price
665,CRU0015-2,Crucial memory Mac 16GB (2x8GB) SO-DIMM DDR3 1...,RAM 16GB (2x8GB) 135V MacBook Pro iMac (2012/2...,163.97920,162.9894,1,1364,3,3,1639792,1629894,163.9792,162.9894
827,PAC0339,NewerTech miniStack 4TB Hard Drive Mac,External Box Hard Drive Mac + 4TB.,219.97910,219.9901,0,11935397,3,3,2199791,2199901,219.9791,219.9901
885,PAC0376,OWC Mercury Elite Pro Dual Thunderbolt + 8TB,RAID outer box 35 inch SATA connection Thunder...,560.96980,554.9895,0,11935397,3,3,5609698,5549895,560.9698,554.9895
943,REP0188,Full Screen Repair iPad Mini 2,Repair service including parts and labor for i...,209.98950,209.9895,0,"1,44E+11",3,3,2099895,2099895,209.9895,209.9895
1057,MOP0057,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 16GB external storage...,132.99110,132.9911,0,"5,49E+11",3,3,1329911,1329911,132.9911,132.9911
...,...,...,...,...,...,...,...,...,...,...,...,...,...
19184,UBI0009,Ubiquiti Amplifi Mesh Access Point,Point Smart Wi-Fi high-density access with mes...,149.98920,149.9892,0,1334,3,3,1499892,1499892,149.9892,149.9892
19248,DJI0026,DJI Mavic Air Drone cuadricóptero Arctic White,Drone cuadricóptero laptop with integrated cam...,849.00013,849.0001,0,11905404,3,3,84900013,8490001,849.00013,849.0001
19249,DJI0025,DJI Mavic Air Drone Black Onyx cuadricóptero,Drone cuadricóptero laptop with integrated cam...,849.00013,849.0001,0,11905404,3,3,84900013,8490001,849.00013,849.0001
19251,LIN0014,Linksys Wi-Fi Velop system AC4400 2 units,Wi-Fi high-density intelligent Mesh technology,299.99050,299.9905,1,1334,3,3,2999905,2999905,299.9905,299.9905


In [13]:
set33 = set33.round({'promo_price': 2, 'price': 2})

In [14]:
set33 = set33.drop(['result1', 'result2', 'cor_price', 'cor_promo_price', 'promo_count', 'price_count'], axis=1)
set33

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
665,CRU0015-2,Crucial memory Mac 16GB (2x8GB) SO-DIMM DDR3 1...,RAM 16GB (2x8GB) 135V MacBook Pro iMac (2012/2...,163.98,162.99,1,1364
827,PAC0339,NewerTech miniStack 4TB Hard Drive Mac,External Box Hard Drive Mac + 4TB.,219.98,219.99,0,11935397
885,PAC0376,OWC Mercury Elite Pro Dual Thunderbolt + 8TB,RAID outer box 35 inch SATA connection Thunder...,560.97,554.99,0,11935397
943,REP0188,Full Screen Repair iPad Mini 2,Repair service including parts and labor for i...,209.99,209.99,0,"1,44E+11"
1057,MOP0057,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 16GB external storage...,132.99,132.99,0,"5,49E+11"
...,...,...,...,...,...,...,...
19184,UBI0009,Ubiquiti Amplifi Mesh Access Point,Point Smart Wi-Fi high-density access with mes...,149.99,149.99,0,1334
19248,DJI0026,DJI Mavic Air Drone cuadricóptero Arctic White,Drone cuadricóptero laptop with integrated cam...,849.00,849.00,0,11905404
19249,DJI0025,DJI Mavic Air Drone Black Onyx cuadricóptero,Drone cuadricóptero laptop with integrated cam...,849.00,849.00,0,11905404
19251,LIN0014,Linksys Wi-Fi Velop system AC4400 2 units,Wi-Fi high-density intelligent Mesh technology,299.99,299.99,1,1334


In [15]:
set33.dtypes

sku             object
name            object
desc            object
price          float64
promo_price    float64
in_stock         int64
type            object
dtype: object

#### price_count == 2

In [16]:
my_prdt3[my_prdt3.price_count == 2]

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
97,MAK0014,Maclocks safety housing Kiosk iPad 2 3 and 4 b...,Holder and housing iPad 2 3 and 4 aluminum and...,164.99,1.649.896,0,1216,3,2
163,PAC0174,Apple MacBook Pro 133 '' 25GHz | 8GB RAM | Fus...,Apple MacBook Pro Fusion Drive 8GB internal an...,1613.99,15.389.905,0,1282,3,2
172,PAC0178,Apple MacBook Pro 133 '' 25GHz | 16GB RAM | Fu...,Apple MacBook Pro Fusion Drive 16GB 2 internal...,1733.99,15.699.895,0,1282,3,2
282,SNN0016,Sonnet XMAC mini Server,Turn your Mac mini Server Rack format,1208.79,10.999.892,0,12175397,3,2
283,FIT0009,Fitbit Aria scale smart white,smart scale with WiFi connection.,119.99,1.159.906,0,11905404,3,2
...,...,...,...,...,...,...,...,...,...
19284,DLK0144,D-Link Wi-fi system COVR powerline mesh AC1200,Two network extenders Electric PLC Lonea Wi-Fi...,220.99,2.189.894,1,1334,3,2
19300,REP0412,Rear Camera Repair iPhone 7 Plus,It is including parts and labor for iPhone 7 Plus,119.99,1.199.897,0,"1,44E+11",3,2
19317,REP0403,iPad LCD screen repair,Repair service including parts and labor for iPad,159.99,1.599.898,0,"1,44E+11",3,2
19318,REP0402,iPad touch screen repair,Repair service including parts and labor for iPad,139.99,1.399.897,0,"1,44E+11",3,2


In [17]:
set32 = my_prdt3[my_prdt3.price_count == 2]
set32

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
97,MAK0014,Maclocks safety housing Kiosk iPad 2 3 and 4 b...,Holder and housing iPad 2 3 and 4 aluminum and...,164.99,1.649.896,0,1216,3,2
163,PAC0174,Apple MacBook Pro 133 '' 25GHz | 8GB RAM | Fus...,Apple MacBook Pro Fusion Drive 8GB internal an...,1613.99,15.389.905,0,1282,3,2
172,PAC0178,Apple MacBook Pro 133 '' 25GHz | 16GB RAM | Fu...,Apple MacBook Pro Fusion Drive 16GB 2 internal...,1733.99,15.699.895,0,1282,3,2
282,SNN0016,Sonnet XMAC mini Server,Turn your Mac mini Server Rack format,1208.79,10.999.892,0,12175397,3,2
283,FIT0009,Fitbit Aria scale smart white,smart scale with WiFi connection.,119.99,1.159.906,0,11905404,3,2
...,...,...,...,...,...,...,...,...,...
19284,DLK0144,D-Link Wi-fi system COVR powerline mesh AC1200,Two network extenders Electric PLC Lonea Wi-Fi...,220.99,2.189.894,1,1334,3,2
19300,REP0412,Rear Camera Repair iPhone 7 Plus,It is including parts and labor for iPhone 7 Plus,119.99,1.199.897,0,"1,44E+11",3,2
19317,REP0403,iPad LCD screen repair,Repair service including parts and labor for iPad,159.99,1.599.898,0,"1,44E+11",3,2
19318,REP0402,iPad touch screen repair,Repair service including parts and labor for iPad,139.99,1.399.897,0,"1,44E+11",3,2


In [18]:
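# result: promo_price with all dots removed.
# price_len: number of digits before the decimal point in price.
set32 = set32.assign(result = set32['promo_price'].str.replace(r'\.', '', regex=True),
                     price_len = set32['price'].str.split('.').str.get(0).str.len())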
set32

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result,price_len
97,MAK0014,Maclocks safety housing Kiosk iPad 2 3 and 4 b...,Holder and housing iPad 2 3 and 4 aluminum and...,164.99,1.649.896,0,1216,3,2,1649896,3
163,PAC0174,Apple MacBook Pro 133 '' 25GHz | 8GB RAM | Fus...,Apple MacBook Pro Fusion Drive 8GB internal an...,1613.99,15.389.905,0,1282,3,2,15389905,4
172,PAC0178,Apple MacBook Pro 133 '' 25GHz | 16GB RAM | Fu...,Apple MacBook Pro Fusion Drive 16GB 2 internal...,1733.99,15.699.895,0,1282,3,2,15699895,4
282,SNN0016,Sonnet XMAC mini Server,Turn your Mac mini Server Rack format,1208.79,10.999.892,0,12175397,3,2,10999892,4
283,FIT0009,Fitbit Aria scale smart white,smart scale with WiFi connection.,119.99,1.159.906,0,11905404,3,2,1159906,3
...,...,...,...,...,...,...,...,...,...,...,...
19284,DLK0144,D-Link Wi-fi system COVR powerline mesh AC1200,Two network extenders Electric PLC Lonea Wi-Fi...,220.99,2.189.894,1,1334,3,2,2189894,3
19300,REP0412,Rear Camera Repair iPhone 7 Plus,It is including parts and labor for iPhone 7 Plus,119.99,1.199.897,0,"1,44E+11",3,2,1199897,3
19317,REP0403,iPad LCD screen repair,Repair service including parts and labor for iPad,159.99,1.599.898,0,"1,44E+11",3,2,1599898,3
19318,REP0402,iPad touch screen repair,Repair service including parts and labor for iPad,139.99,1.399.897,0,"1,44E+11",3,2,1399897,3


In [19]:
set32['cor_promo_price'] = set32.apply(lambda row: row['result'][:(row['price_len'])] + '.' + row['result'][(row['price_len']):], axis=1)
set32

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result,price_len,cor_promo_price
97,MAK0014,Maclocks safety housing Kiosk iPad 2 3 and 4 b...,Holder and housing iPad 2 3 and 4 aluminum and...,164.99,1.649.896,0,1216,3,2,1649896,3,164.9896
163,PAC0174,Apple MacBook Pro 133 '' 25GHz | 8GB RAM | Fus...,Apple MacBook Pro Fusion Drive 8GB internal an...,1613.99,15.389.905,0,1282,3,2,15389905,4,1538.9905
172,PAC0178,Apple MacBook Pro 133 '' 25GHz | 16GB RAM | Fu...,Apple MacBook Pro Fusion Drive 16GB 2 internal...,1733.99,15.699.895,0,1282,3,2,15699895,4,1569.9895
282,SNN0016,Sonnet XMAC mini Server,Turn your Mac mini Server Rack format,1208.79,10.999.892,0,12175397,3,2,10999892,4,1099.9892
283,FIT0009,Fitbit Aria scale smart white,smart scale with WiFi connection.,119.99,1.159.906,0,11905404,3,2,1159906,3,115.9906
...,...,...,...,...,...,...,...,...,...,...,...,...
19284,DLK0144,D-Link Wi-fi system COVR powerline mesh AC1200,Two network extenders Electric PLC Lonea Wi-Fi...,220.99,2.189.894,1,1334,3,2,2189894,3,218.9894
19300,REP0412,Rear Camera Repair iPhone 7 Plus,It is including parts and labor for iPhone 7 Plus,119.99,1.199.897,0,"1,44E+11",3,2,1199897,3,119.9897
19317,REP0403,iPad LCD screen repair,Repair service including parts and labor for iPad,159.99,1.599.898,0,"1,44E+11",3,2,1599898,3,159.9898
19318,REP0402,iPad touch screen repair,Repair service including parts and labor for iPad,139.99,1.399.897,0,"1,44E+11",3,2,1399897,3,139.9897
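
The placement rule used here is: give `promo_price` as many integer digits as `price`. A minimal sketch of that rule on two rows from the table above follows; the list comprehension is an alternative to the row-wise `apply`, not the notebook's own code.

```python
import pandas as pd

# Two rows copied from the table above: a clean price and a malformed promo_price.
df = pd.DataFrame({
    'price': ['164.99', '1613.99'],
    'promo_price': ['1.649.896', '15.389.905'],
})

# All digits of promo_price, with the dots removed.
digits = df['promo_price'].str.replace('.', '', regex=False)

# Number of digits before the decimal point in price.
int_len = df['price'].str.split('.').str.get(0).str.len()

# Place the decimal point so promo_price gets the same number of
# integer digits as price.
df['cor_promo_price'] = [d[:n] + '.' + d[n:] for d, n in zip(digits, int_len)]
```

This relies on the assumption that the promo price has the same number of integer digits as the regular price. A discount that crosses a power of ten (for example from 1000 down to 999) would be misplaced by this rule, so a check that `promo_price` is not larger than `price` may be worth adding afterwards.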


In [20]:
set32.dtypes

sku                object
name               object
desc               object
price              object
promo_price        object
in_stock            int64
type               object
promo_count         int64
price_count         int64
result             object
price_len           int64
cor_promo_price    object
dtype: object

In [21]:
set32['promo_price'] = pd.to_numeric(set32.cor_promo_price)
set32['price'] = pd.to_numeric(set32.price)
set32

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result,price_len,cor_promo_price
97,MAK0014,Maclocks safety housing Kiosk iPad 2 3 and 4 b...,Holder and housing iPad 2 3 and 4 aluminum and...,164.99,164.9896,0,1216,3,2,1649896,3,164.9896
163,PAC0174,Apple MacBook Pro 133 '' 25GHz | 8GB RAM | Fus...,Apple MacBook Pro Fusion Drive 8GB internal an...,1613.99,1538.9905,0,1282,3,2,15389905,4,1538.9905
172,PAC0178,Apple MacBook Pro 133 '' 25GHz | 16GB RAM | Fu...,Apple MacBook Pro Fusion Drive 16GB 2 internal...,1733.99,1569.9895,0,1282,3,2,15699895,4,1569.9895
282,SNN0016,Sonnet XMAC mini Server,Turn your Mac mini Server Rack format,1208.79,1099.9892,0,12175397,3,2,10999892,4,1099.9892
283,FIT0009,Fitbit Aria scale smart white,smart scale with WiFi connection.,119.99,115.9906,0,11905404,3,2,1159906,3,115.9906
...,...,...,...,...,...,...,...,...,...,...,...,...
19284,DLK0144,D-Link Wi-fi system COVR powerline mesh AC1200,Two network extenders Electric PLC Lonea Wi-Fi...,220.99,218.9894,1,1334,3,2,2189894,3,218.9894
19300,REP0412,Rear Camera Repair iPhone 7 Plus,It is including parts and labor for iPhone 7 Plus,119.99,119.9897,0,"1,44E+11",3,2,1199897,3,119.9897
19317,REP0403,iPad LCD screen repair,Repair service including parts and labor for iPad,159.99,159.9898,0,"1,44E+11",3,2,1599898,3,159.9898
19318,REP0402,iPad touch screen repair,Repair service including parts and labor for iPad,139.99,139.9897,0,"1,44E+11",3,2,1399897,3,139.9897


In [22]:
set32 = set32.round({'promo_price': 2, 'price': 2})

In [23]:
set32 = set32.drop(['result', 'cor_promo_price', 'price_count', 'promo_count', 'price_len'], axis=1)
set32

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
97,MAK0014,Maclocks safety housing Kiosk iPad 2 3 and 4 b...,Holder and housing iPad 2 3 and 4 aluminum and...,164.99,164.99,0,1216
163,PAC0174,Apple MacBook Pro 133 '' 25GHz | 8GB RAM | Fus...,Apple MacBook Pro Fusion Drive 8GB internal an...,1613.99,1538.99,0,1282
172,PAC0178,Apple MacBook Pro 133 '' 25GHz | 16GB RAM | Fu...,Apple MacBook Pro Fusion Drive 16GB 2 internal...,1733.99,1569.99,0,1282
282,SNN0016,Sonnet XMAC mini Server,Turn your Mac mini Server Rack format,1208.79,1099.99,0,12175397
283,FIT0009,Fitbit Aria scale smart white,smart scale with WiFi connection.,119.99,115.99,0,11905404
...,...,...,...,...,...,...,...
19284,DLK0144,D-Link Wi-fi system COVR powerline mesh AC1200,Two network extenders Electric PLC Lonea Wi-Fi...,220.99,218.99,1,1334
19300,REP0412,Rear Camera Repair iPhone 7 Plus,It is including parts and labor for iPhone 7 Plus,119.99,119.99,0,"1,44E+11"
19317,REP0403,iPad LCD screen repair,Repair service including parts and labor for iPad,159.99,159.99,0,"1,44E+11"
19318,REP0402,iPad touch screen repair,Repair service including parts and labor for iPad,139.99,139.99,0,"1,44E+11"


In [24]:
set32.dtypes

sku             object
name            object
desc            object
price          float64
promo_price    float64
in_stock         int64
type            object
dtype: object

#### price_count == 1

In [25]:
my_prdt3[my_prdt3.price_count == 1]

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
50,APP0367,Apple Mini DisplayPort to DVI Adapter Mac dual...,Adapter Mini Display Port to DVI dual channel ...,119,1.119.976,0,1325,3,1
51,APP0344,"Apple Thunderbolt Display 27 ""Monitor Mac",Monitor Display 27-inch Apple Thunderbolt (MC9...,1149,10.449.923,0,1296,3,1
66,MAK0008,Maclocks theft case iPad 2 3 and 4 black with ...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,1.199.957,0,12635403,3,1
67,MAK0007,Maclocks theft case iPad 2 3 and 4 transparent...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,1.079.961,0,12635403,3,1
100,APP0390,"Apple MacBook Pro 133 ""Core i5 25GHz | 4GB RAM...",MacBook Pro laptop 133 inches (MD101Y / A).,1199,11.455.917,0,1282,3,1
...,...,...,...,...,...,...,...,...,...
19266,WDT0416,"WD Hard Drive 8TB Gold 35 ""Servers",Hard Western Digital 8TB 35 inches SATA 6 Gb /...,419,3.059.945,1,12655397,3,1
19267,WDT0415,"WD Hard Drive 10TB Gold 35 ""Servers",Hard Western Digital 10TB 35 inches SATA 6 Gb ...,519,3.865.841,0,12655397,3,1
19268,WDT0414,"WD Hard Drive 12TB Gold 35 ""Servers",Hard Western Digital 12TB 35 inches SATA 6 Gb ...,689,4.265.843,0,12655397,3,1
19290,AP20474,Like new - Apple Watch GPS 38mm Case Series 3 ...,Reconditioned Apple Watch 38mm series 3 with G...,369,3.189.996,0,24885185,3,1


In [26]:
set31 = my_prdt3[my_prdt3.price_count == 1]
set31

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
50,APP0367,Apple Mini DisplayPort to DVI Adapter Mac dual...,Adapter Mini Display Port to DVI dual channel ...,119,1.119.976,0,1325,3,1
51,APP0344,"Apple Thunderbolt Display 27 ""Monitor Mac",Monitor Display 27-inch Apple Thunderbolt (MC9...,1149,10.449.923,0,1296,3,1
66,MAK0008,Maclocks theft case iPad 2 3 and 4 black with ...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,1.199.957,0,12635403,3,1
67,MAK0007,Maclocks theft case iPad 2 3 and 4 transparent...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,1.079.961,0,12635403,3,1
100,APP0390,"Apple MacBook Pro 133 ""Core i5 25GHz | 4GB RAM...",MacBook Pro laptop 133 inches (MD101Y / A).,1199,11.455.917,0,1282,3,1
...,...,...,...,...,...,...,...,...,...
19266,WDT0416,"WD Hard Drive 8TB Gold 35 ""Servers",Hard Western Digital 8TB 35 inches SATA 6 Gb /...,419,3.059.945,1,12655397,3,1
19267,WDT0415,"WD Hard Drive 10TB Gold 35 ""Servers",Hard Western Digital 10TB 35 inches SATA 6 Gb ...,519,3.865.841,0,12655397,3,1
19268,WDT0414,"WD Hard Drive 12TB Gold 35 ""Servers",Hard Western Digital 12TB 35 inches SATA 6 Gb ...,689,4.265.843,0,12655397,3,1
19290,AP20474,Like new - Apple Watch GPS 38mm Case Series 3 ...,Reconditioned Apple Watch 38mm series 3 with G...,369,3.189.996,0,24885185,3,1


In [27]:
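# Same approach as for set32; here price has no decimal point,
# so price_len is the length of the whole price string.
set31 = set31.assign(result = set31['promo_price'].str.replace(r'\.', '', regex=True),
                     price_len = set31['price'].str.split('.').str.get(0).str.len())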
set31

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result,price_len
50,APP0367,Apple Mini DisplayPort to DVI Adapter Mac dual...,Adapter Mini Display Port to DVI dual channel ...,119,1.119.976,0,1325,3,1,1119976,3
51,APP0344,"Apple Thunderbolt Display 27 ""Monitor Mac",Monitor Display 27-inch Apple Thunderbolt (MC9...,1149,10.449.923,0,1296,3,1,10449923,4
66,MAK0008,Maclocks theft case iPad 2 3 and 4 black with ...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,1.199.957,0,12635403,3,1,1199957,3
67,MAK0007,Maclocks theft case iPad 2 3 and 4 transparent...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,1.079.961,0,12635403,3,1,1079961,3
100,APP0390,"Apple MacBook Pro 133 ""Core i5 25GHz | 4GB RAM...",MacBook Pro laptop 133 inches (MD101Y / A).,1199,11.455.917,0,1282,3,1,11455917,4
...,...,...,...,...,...,...,...,...,...,...,...
19266,WDT0416,"WD Hard Drive 8TB Gold 35 ""Servers",Hard Western Digital 8TB 35 inches SATA 6 Gb /...,419,3.059.945,1,12655397,3,1,3059945,3
19267,WDT0415,"WD Hard Drive 10TB Gold 35 ""Servers",Hard Western Digital 10TB 35 inches SATA 6 Gb ...,519,3.865.841,0,12655397,3,1,3865841,3
19268,WDT0414,"WD Hard Drive 12TB Gold 35 ""Servers",Hard Western Digital 12TB 35 inches SATA 6 Gb ...,689,4.265.843,0,12655397,3,1,4265843,3
19290,AP20474,Like new - Apple Watch GPS 38mm Case Series 3 ...,Reconditioned Apple Watch 38mm series 3 with G...,369,3.189.996,0,24885185,3,1,3189996,3


In [28]:
set31['cor_promo_price'] = set31.apply(lambda row: row['result'][:(row['price_len'])] + '.' + row['result'][(row['price_len']):], axis=1)
set31

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result,price_len,cor_promo_price
50,APP0367,Apple Mini DisplayPort to DVI Adapter Mac dual...,Adapter Mini Display Port to DVI dual channel ...,119,1.119.976,0,1325,3,1,1119976,3,111.9976
51,APP0344,"Apple Thunderbolt Display 27 ""Monitor Mac",Monitor Display 27-inch Apple Thunderbolt (MC9...,1149,10.449.923,0,1296,3,1,10449923,4,1044.9923
66,MAK0008,Maclocks theft case iPad 2 3 and 4 black with ...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,1.199.957,0,12635403,3,1,1199957,3,119.9957
67,MAK0007,Maclocks theft case iPad 2 3 and 4 transparent...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,1.079.961,0,12635403,3,1,1079961,3,107.9961
100,APP0390,"Apple MacBook Pro 133 ""Core i5 25GHz | 4GB RAM...",MacBook Pro laptop 133 inches (MD101Y / A).,1199,11.455.917,0,1282,3,1,11455917,4,1145.5917
...,...,...,...,...,...,...,...,...,...,...,...,...
19266,WDT0416,"WD Hard Drive 8TB Gold 35 ""Servers",Hard Western Digital 8TB 35 inches SATA 6 Gb /...,419,3.059.945,1,12655397,3,1,3059945,3,305.9945
19267,WDT0415,"WD Hard Drive 10TB Gold 35 ""Servers",Hard Western Digital 10TB 35 inches SATA 6 Gb ...,519,3.865.841,0,12655397,3,1,3865841,3,386.5841
19268,WDT0414,"WD Hard Drive 12TB Gold 35 ""Servers",Hard Western Digital 12TB 35 inches SATA 6 Gb ...,689,4.265.843,0,12655397,3,1,4265843,3,426.5843
19290,AP20474,Like new - Apple Watch GPS 38mm Case Series 3 ...,Reconditioned Apple Watch 38mm series 3 with G...,369,3.189.996,0,24885185,3,1,3189996,3,318.9996
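
The decimal-point repair used in the cell above can be followed on single values. This is a minimal sketch in plain Python, not the notebook's code: `reinsert_decimal` is a hypothetical helper name, and the sample strings are copied from the rows displayed above.

```python
def reinsert_decimal(digits: str, int_len: int) -> str:
    """Put the decimal point back after the first `int_len` digits."""
    return digits[:int_len] + "." + digits[int_len:]


# Values taken from the APP0367 row shown above.
raw_promo = "1.119.976"               # promo_price as stored in the CSV
digits = raw_promo.replace(".", "")   # all dots removed: '1119976'
int_len = len("119".split(".")[0])    # price '119' has a 3-digit integer part

print(reinsert_decimal(digits, int_len))   # 111.9976
print(reinsert_decimal("10449923", 4))     # 1044.9923 (APP0344, price 1149)
```

The approach relies on the promo price having the same number of integer digits as the regular price, which is an assumption about the data rather than something the file guarantees.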


In [29]:
set31.dtypes

sku                object
name               object
desc               object
price              object
promo_price        object
in_stock            int64
type               object
promo_count         int64
price_count         int64
result             object
price_len           int64
cor_promo_price    object
dtype: object

In [30]:
set31['promo_price'] = pd.to_numeric(set31.cor_promo_price)
set31['price'] = pd.to_numeric(set31.price)
set31

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result,price_len,cor_promo_price
50,APP0367,Apple Mini DisplayPort to DVI Adapter Mac dual...,Adapter Mini Display Port to DVI dual channel ...,119,111.9976,0,1325,3,1,1119976,3,111.9976
51,APP0344,"Apple Thunderbolt Display 27 ""Monitor Mac",Monitor Display 27-inch Apple Thunderbolt (MC9...,1149,1044.9923,0,1296,3,1,10449923,4,1044.9923
66,MAK0008,Maclocks theft case iPad 2 3 and 4 black with ...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,119.9957,0,12635403,3,1,1199957,3,119.9957
67,MAK0007,Maclocks theft case iPad 2 3 and 4 transparent...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,107.9961,0,12635403,3,1,1079961,3,107.9961
100,APP0390,"Apple MacBook Pro 133 ""Core i5 25GHz | 4GB RAM...",MacBook Pro laptop 133 inches (MD101Y / A).,1199,1145.5917,0,1282,3,1,11455917,4,1145.5917
...,...,...,...,...,...,...,...,...,...,...,...,...
19266,WDT0416,"WD Hard Drive 8TB Gold 35 ""Servers",Hard Western Digital 8TB 35 inches SATA 6 Gb /...,419,305.9945,1,12655397,3,1,3059945,3,305.9945
19267,WDT0415,"WD Hard Drive 10TB Gold 35 ""Servers",Hard Western Digital 10TB 35 inches SATA 6 Gb ...,519,386.5841,0,12655397,3,1,3865841,3,386.5841
19268,WDT0414,"WD Hard Drive 12TB Gold 35 ""Servers",Hard Western Digital 12TB 35 inches SATA 6 Gb ...,689,426.5843,0,12655397,3,1,4265843,3,426.5843
19290,AP20474,Like new - Apple Watch GPS 38mm Case Series 3 ...,Reconditioned Apple Watch 38mm series 3 with G...,369,318.9996,0,24885185,3,1,3189996,3,318.9996


In [31]:
set31 = set31.round({'promo_price': 2, 'price': 2})

In [32]:
set31 = set31.drop(['result', 'cor_promo_price', 'price_count', 'promo_count', 'price_len'], axis=1)
set31

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
50,APP0367,Apple Mini DisplayPort to DVI Adapter Mac dual...,Adapter Mini Display Port to DVI dual channel ...,119,112.00,0,1325
51,APP0344,"Apple Thunderbolt Display 27 ""Monitor Mac",Monitor Display 27-inch Apple Thunderbolt (MC9...,1149,1044.99,0,1296
66,MAK0008,Maclocks theft case iPad 2 3 and 4 black with ...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,120.00,0,12635403
67,MAK0007,Maclocks theft case iPad 2 3 and 4 transparent...,Case antitheft iPad 2 3 and 4 polycarbonate ro...,120,108.00,0,12635403
100,APP0390,"Apple MacBook Pro 133 ""Core i5 25GHz | 4GB RAM...",MacBook Pro laptop 133 inches (MD101Y / A).,1199,1145.59,0,1282
...,...,...,...,...,...,...,...
19266,WDT0416,"WD Hard Drive 8TB Gold 35 ""Servers",Hard Western Digital 8TB 35 inches SATA 6 Gb /...,419,305.99,1,12655397
19267,WDT0415,"WD Hard Drive 10TB Gold 35 ""Servers",Hard Western Digital 10TB 35 inches SATA 6 Gb ...,519,386.58,0,12655397
19268,WDT0414,"WD Hard Drive 12TB Gold 35 ""Servers",Hard Western Digital 12TB 35 inches SATA 6 Gb ...,689,426.58,0,12655397
19290,AP20474,Like new - Apple Watch GPS 38mm Case Series 3 ...,Reconditioned Apple Watch 38mm series 3 with G...,369,319.00,0,24885185


In [33]:
set31.dtypes

sku             object
name            object
desc            object
price            int64
promo_price    float64
in_stock         int64
type            object
dtype: object

## Cleaning the second set (Promo = 2).

In [36]:
my_prdt2 = my_products1[my_products1['promo_count'] == 2]
my_prdt2

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
0,RAI0007,Silver Rain Design mStand Support,Aluminum support compatible with all MacBook,59.99,499.899,1,8696,2,2
1,APP0023,Apple Mac Keyboard Keypad Spanish,USB ultrathin keyboard Apple Mac Spanish.,59,589.996,0,13855401,2,1
2,APP0025,Mighty Mouse Apple Mouse for Mac,mouse Apple USB cable.,59,569.898,0,1387,2,1
3,APP0072,Apple Dock to USB Cable iPhone and iPod white,IPhone dock and USB Cable Apple iPod.,25,229.997,0,1230,2,1
4,KIN0007,Mac Memory Kingston 2GB 667MHz DDR2 SO-DIMM,2GB RAM Mac mini and iMac (2006/07) MacBook Pr...,34.99,31.99,1,1364,2,2
...,...,...,...,...,...,...,...,...,...
19321,BEL0376,Belkin Travel Support Apple Watch Black,compact and portable stand vertically or horiz...,29.99,269.903,1,12282,2,2
19322,THU0060,"Enroute Thule 14L Backpack MacBook 13 ""Black",Backpack with capacity of 14 liter compartment...,69.95,649.903,1,1392,2,2
19323,THU0061,"Enroute Thule 14L Backpack MacBook 13 ""Blue",Backpack with capacity of 14 liter compartment...,69.95,649.903,1,1392,2,2
19324,THU0062,"Enroute Thule 14L Backpack MacBook 13 ""Red",Backpack with capacity of 14 liter compartment...,69.95,649.903,0,1392,2,2


### Create further categories based on price_count.

#### price_count == 3

In [37]:
my_prdt2[my_prdt2.price_count == 3]

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
898,REP0156,iPhone 5 GSM antenna repair,Repair service including parts and labor for i...,69.989.909,699.899,0,"1,44E+11",2,3
941,REP0185,Home button repair iPad mini,Repair service including parts and labor for i...,69.989.909,699.899,0,"1,44E+11",2,3
1955,PAC0784,Samsung SSD 850 expansion kit EVO 500GB + 16GB...,Expansion kit SSD 500GB + 16GB 1600Mhz RAM + D...,4.519.592,353.585,1,1433,2,3
2010,PAC0746,Kit PRO 512GB Samsung SSD expansion + 16GB 160...,SSD upgrade kit 512GB + 16GB 1600Mhz RAM for M...,4.689.592,418.585,1,1433,2,3
2279,TRA0011,Transcend JetDrive Lite 330 64GB Macbook Pro R...,MLC memory card 64GB for Macbooks Retina 13 in...,4.528.062,452.806,0,57445397,2,3
...,...,...,...,...,...,...,...,...,...
19312,REP0424,Input repair Headphones iPad,Repair service including parts and labor for iPad,6.999.003,69.99,0,"1,44E+11",2,3
19313,REP0421,iPad charging connector repair,Repair service including parts and labor for iPad,6.999.003,69.99,0,"1,44E+11",2,3
19314,REP0416,iPad front camera repair,Repair service including parts and labor for iPad,6.999.003,69.99,0,"1,44E+11",2,3
19315,REP0413,repair rear camera iPad,Repair service including parts and labor for iPad,6.999.003,69.99,0,"1,44E+11",2,3


In [38]:
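set23 = my_prdt2[my_prdt2.price_count == 3]

# Strip every dot from both price columns (literal match, no regex needed).
set23 = set23.assign(result1 = set23['price'].str.replace('.', '', regex=False),
                     result2 = set23['promo_price'].str.replace('.', '', regex=False))

# Re-insert the decimal point after the second digit.
# Note: this assumes every price in this subset has a two-digit integer part.
set23['cor_price'] = set23.apply(lambda row: row['result1'][:2] + '.' + row['result1'][2:], axis=1)
set23['cor_promo_price'] = set23.apply(lambda row: row['result2'][:2] + '.' + row['result2'][2:], axis=1)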

set23

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result1,result2,cor_price,cor_promo_price
898,REP0156,iPhone 5 GSM antenna repair,Repair service including parts and labor for i...,69.989.909,699.899,0,"1,44E+11",2,3,69989909,699899,69.989909,69.9899
941,REP0185,Home button repair iPad mini,Repair service including parts and labor for i...,69.989.909,699.899,0,"1,44E+11",2,3,69989909,699899,69.989909,69.9899
1955,PAC0784,Samsung SSD 850 expansion kit EVO 500GB + 16GB...,Expansion kit SSD 500GB + 16GB 1600Mhz RAM + D...,4.519.592,353.585,1,1433,2,3,4519592,353585,45.19592,35.3585
2010,PAC0746,Kit PRO 512GB Samsung SSD expansion + 16GB 160...,SSD upgrade kit 512GB + 16GB 1600Mhz RAM for M...,4.689.592,418.585,1,1433,2,3,4689592,418585,46.89592,41.8585
2279,TRA0011,Transcend JetDrive Lite 330 64GB Macbook Pro R...,MLC memory card 64GB for Macbooks Retina 13 in...,4.528.062,452.806,0,57445397,2,3,4528062,452806,45.28062,45.2806
...,...,...,...,...,...,...,...,...,...,...,...,...,...
19312,REP0424,Input repair Headphones iPad,Repair service including parts and labor for iPad,6.999.003,69.99,0,"1,44E+11",2,3,6999003,6999,69.99003,69.99
19313,REP0421,iPad charging connector repair,Repair service including parts and labor for iPad,6.999.003,69.99,0,"1,44E+11",2,3,6999003,6999,69.99003,69.99
19314,REP0416,iPad front camera repair,Repair service including parts and labor for iPad,6.999.003,69.99,0,"1,44E+11",2,3,6999003,6999,69.99003,69.99
19315,REP0413,repair rear camera iPad,Repair service including parts and labor for iPad,6.999.003,69.99,0,"1,44E+11",2,3,6999003,6999,69.99003,69.99
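
When the split position is the same for every row, as with the fixed two digits used here, the row-wise `apply` can be swapped for vectorised string slicing. The sketch below is an alternative, not the notebook's own code; the two sample values are copied from rows 898 and 2279 above, and the variable names are made up for the example.

```python
import pandas as pd

# Two raw prices from the table above (index labels kept for reference).
prices = pd.Series(["69.989.909", "4.528.062"], index=[898, 2279])

# Remove every dot, then put one back after the second digit.
digits = prices.str.replace(".", "", regex=False)
fixed = pd.to_numeric(digits.str[:2] + "." + digits.str[2:]).round(2)

print(fixed)
```

Like the original cell, this is only valid if every value in the subset really has a two-digit integer part, so a quick look at a few products with known prices is a sensible check.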


In [39]:
set23['price'] = pd.to_numeric(set23.cor_price)
set23['promo_price'] = pd.to_numeric(set23.cor_promo_price)

set23 = set23.round({'promo_price': 2, 'price': 2})

set23 = set23.drop(['result1', 'result2', 'cor_price', 'cor_promo_price', 'promo_count', 'price_count'], axis=1)
set23

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
898,REP0156,iPhone 5 GSM antenna repair,Repair service including parts and labor for i...,69.99,69.99,0,"1,44E+11"
941,REP0185,Home button repair iPad mini,Repair service including parts and labor for i...,69.99,69.99,0,"1,44E+11"
1955,PAC0784,Samsung SSD 850 expansion kit EVO 500GB + 16GB...,Expansion kit SSD 500GB + 16GB 1600Mhz RAM + D...,45.20,35.36,1,1433
2010,PAC0746,Kit PRO 512GB Samsung SSD expansion + 16GB 160...,SSD upgrade kit 512GB + 16GB 1600Mhz RAM for M...,46.90,41.86,1,1433
2279,TRA0011,Transcend JetDrive Lite 330 64GB Macbook Pro R...,MLC memory card 64GB for Macbooks Retina 13 in...,45.28,45.28,0,57445397
...,...,...,...,...,...,...,...
19312,REP0424,Input repair Headphones iPad,Repair service including parts and labor for iPad,69.99,69.99,0,"1,44E+11"
19313,REP0421,iPad charging connector repair,Repair service including parts and labor for iPad,69.99,69.99,0,"1,44E+11"
19314,REP0416,iPad front camera repair,Repair service including parts and labor for iPad,69.99,69.99,0,"1,44E+11"
19315,REP0413,repair rear camera iPad,Repair service including parts and labor for iPad,69.99,69.99,0,"1,44E+11"


In [40]:
set23.dtypes

sku             object
name            object
desc            object
price          float64
promo_price    float64
in_stock         int64
type            object
dtype: object

#### price_count == 2.

In [41]:
my_prdt2[my_prdt2.price_count == 2]

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
0,RAI0007,Silver Rain Design mStand Support,Aluminum support compatible with all MacBook,59.99,499.899,1,8696,2,2
4,KIN0007,Mac Memory Kingston 2GB 667MHz DDR2 SO-DIMM,2GB RAM Mac mini and iMac (2006/07) MacBook Pr...,34.99,31.99,1,1364,2,2
6,KIN0008,Mac Memory Kingston 1GB 667MHz DDR2 SO-DIMM,1GB RAM Mac mini and iMac (2006/07) MacBook Pr...,18.99,146.471,0,1364,2,2
7,KIN0009,Mac Memory Kingston 2GB 800MHz DDR2 SO-DIMM,2GB RAM iMac with Intel Core 2 Duo (Penryn).,36.99,274.694,0,1364,2,2
11,SEN0021,Sennheiser CX 300-II Precision headphones iPho...,Headphones iPhone iPad iPad 2 iPad 3 and iPod.,49.99,449.878,0,5384,2,2
...,...,...,...,...,...,...,...,...,...
19321,BEL0376,Belkin Travel Support Apple Watch Black,compact and portable stand vertically or horiz...,29.99,269.903,1,12282,2,2
19322,THU0060,"Enroute Thule 14L Backpack MacBook 13 ""Black",Backpack with capacity of 14 liter compartment...,69.95,649.903,1,1392,2,2
19323,THU0061,"Enroute Thule 14L Backpack MacBook 13 ""Blue",Backpack with capacity of 14 liter compartment...,69.95,649.903,1,1392,2,2
19324,THU0062,"Enroute Thule 14L Backpack MacBook 13 ""Red",Backpack with capacity of 14 liter compartment...,69.95,649.903,0,1392,2,2


In [42]:
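set22 = my_prdt2[my_prdt2.price_count == 2]

# result: promo_price with all dots removed; price_len: number of digits before the decimal point in price.
set22 = set22.assign(result = set22['promo_price'].str.replace('.', '', regex=False),
                     price_len = set22['price'].str.split('.').str.get(0).str.len())

# Re-insert the decimal point so promo_price gets the same number of integer digits as price.
set22['cor_promo_price'] = set22.apply(lambda row: row['result'][:(row['price_len'])] + '.' + row['result'][(row['price_len']):], axis=1)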

set22

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result,price_len,cor_promo_price
0,RAI0007,Silver Rain Design mStand Support,Aluminum support compatible with all MacBook,59.99,499.899,1,8696,2,2,499899,2,49.9899
4,KIN0007,Mac Memory Kingston 2GB 667MHz DDR2 SO-DIMM,2GB RAM Mac mini and iMac (2006/07) MacBook Pr...,34.99,31.99,1,1364,2,2,3199,2,31.99
6,KIN0008,Mac Memory Kingston 1GB 667MHz DDR2 SO-DIMM,1GB RAM Mac mini and iMac (2006/07) MacBook Pr...,18.99,146.471,0,1364,2,2,146471,2,14.6471
7,KIN0009,Mac Memory Kingston 2GB 800MHz DDR2 SO-DIMM,2GB RAM iMac with Intel Core 2 Duo (Penryn).,36.99,274.694,0,1364,2,2,274694,2,27.4694
11,SEN0021,Sennheiser CX 300-II Precision headphones iPho...,Headphones iPhone iPad iPad 2 iPad 3 and iPod.,49.99,449.878,0,5384,2,2,449878,2,44.9878
...,...,...,...,...,...,...,...,...,...,...,...,...
19321,BEL0376,Belkin Travel Support Apple Watch Black,compact and portable stand vertically or horiz...,29.99,269.903,1,12282,2,2,269903,2,26.9903
19322,THU0060,"Enroute Thule 14L Backpack MacBook 13 ""Black",Backpack with capacity of 14 liter compartment...,69.95,649.903,1,1392,2,2,649903,2,64.9903
19323,THU0061,"Enroute Thule 14L Backpack MacBook 13 ""Blue",Backpack with capacity of 14 liter compartment...,69.95,649.903,1,1392,2,2,649903,2,64.9903
19324,THU0062,"Enroute Thule 14L Backpack MacBook 13 ""Red",Backpack with capacity of 14 liter compartment...,69.95,649.903,0,1392,2,2,649903,2,64.9903


In [43]:
set22.dtypes

sku                object
name               object
desc               object
price              object
promo_price        object
in_stock            int64
type               object
promo_count         int64
price_count         int64
result             object
price_len           int64
cor_promo_price    object
dtype: object

In [44]:
set22['promo_price'] = pd.to_numeric(set22.cor_promo_price)
set22['price'] = pd.to_numeric(set22.price)

set22 = set22.round({'promo_price': 2, 'price': 2})

set22 = set22.drop(['result', 'cor_promo_price', 'price_count', 'promo_count', 'price_len'], axis=1)

set22

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
0,RAI0007,Silver Rain Design mStand Support,Aluminum support compatible with all MacBook,59.99,49.99,1,8696
4,KIN0007,Mac Memory Kingston 2GB 667MHz DDR2 SO-DIMM,2GB RAM Mac mini and iMac (2006/07) MacBook Pr...,34.99,31.99,1,1364
6,KIN0008,Mac Memory Kingston 1GB 667MHz DDR2 SO-DIMM,1GB RAM Mac mini and iMac (2006/07) MacBook Pr...,18.99,14.65,0,1364
7,KIN0009,Mac Memory Kingston 2GB 800MHz DDR2 SO-DIMM,2GB RAM iMac with Intel Core 2 Duo (Penryn).,36.99,27.47,0,1364
11,SEN0021,Sennheiser CX 300-II Precision headphones iPho...,Headphones iPhone iPad iPad 2 iPad 3 and iPod.,49.99,44.99,0,5384
...,...,...,...,...,...,...,...
19321,BEL0376,Belkin Travel Support Apple Watch Black,compact and portable stand vertically or horiz...,29.99,26.99,1,12282
19322,THU0060,"Enroute Thule 14L Backpack MacBook 13 ""Black",Backpack with capacity of 14 liter compartment...,69.95,64.99,1,1392
19323,THU0061,"Enroute Thule 14L Backpack MacBook 13 ""Blue",Backpack with capacity of 14 liter compartment...,69.95,64.99,1,1392
19324,THU0062,"Enroute Thule 14L Backpack MacBook 13 ""Red",Backpack with capacity of 14 liter compartment...,69.95,64.99,0,1392


In [45]:
set22.dtypes

sku             object
name            object
desc            object
price          float64
promo_price    float64
in_stock         int64
type            object
dtype: object
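
The same sequence of steps (strip the dots, split at the length of the price's integer part, convert, round) is repeated for several subsets, so it could be wrapped in one function. The sketch below is one possible way to do that: `fix_promo_from_price` is a hypothetical name, and it assumes `price` and `promo_price` are string columns without missing values.

```python
import pandas as pd


def fix_promo_from_price(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: repair promo_price using the integer-part length of price."""
    digits = df["promo_price"].str.replace(".", "", regex=False)
    int_len = df["price"].str.split(".").str.get(0).str.len()
    # Split each digit string at the position given by the matching price.
    fixed = [d[:n] + "." + d[n:] for d, n in zip(digits, int_len)]
    return df.assign(
        price=pd.to_numeric(df["price"]),
        promo_price=pd.to_numeric(pd.Series(fixed, index=df.index)).round(2),
    )


# Sample rows copied from the tables above (RAI0007, APP0023, APP0367).
demo = pd.DataFrame(
    {"price": ["59.99", "59", "119"],
     "promo_price": ["499.899", "589.996", "1.119.976"]},
    index=[0, 1, 50],
)
cleaned = fix_promo_from_price(demo)
print(cleaned)
```

The function returns a new DataFrame through `assign`, so the helper columns never need to be dropped afterwards.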

#### price_count == 1.

In [47]:
my_prdt2[my_prdt2.price_count == 1]

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
1,APP0023,Apple Mac Keyboard Keypad Spanish,USB ultrathin keyboard Apple Mac Spanish.,59,589.996,0,13855401,2,1
2,APP0025,Mighty Mouse Apple Mouse for Mac,mouse Apple USB cable.,59,569.898,0,1387,2,1
3,APP0072,Apple Dock to USB Cable iPhone and iPod white,IPhone dock and USB Cable Apple iPod.,25,229.997,0,1230,2,1
5,APP0073,Apple Composite AV Cable iPhone and iPod white,IPhone and iPod AV Cable Dock to Composite Video.,45,420.003,0,1230,2,1
8,KIN0001-2,Mac memory Kingston 4GB (2x2GB) 667MHz DDR2 SO...,RAM 4GB (2x2GB) Mac mini and iMac (2006/07) Ma...,74,669.904,0,1364,2,1
...,...,...,...,...,...,...,...,...,...
19283,AP20468,Like new - Apple iPhone Black Lightning Dock,Support base and refitted with dock connector ...,59,440.004,0,13615399,2,1
19285,AP20470,Like new - Apple Thunderbolt to Gigabit Ethern...,Refurbished Mac adapter Thunderbolt to Gigabit...,35,279.994,0,1325,2,1
19288,AP20649,Like new - Apple Leather Case iPhone Case 8/7 ...,Reconditioned sleeve leather and microfiber Ap...,55,420.003,0,11865403,2,1
19295,AP20471,Apple Thunderbolt to FireWire 800 adapter,Reconditioned connection adapter Thunderbolt t...,35,279.994,0,1325,2,1


In [48]:
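set21 = my_prdt2[my_prdt2.price_count == 1]

# result: promo_price with all dots removed; price_len: number of digits before the decimal point in price.
set21 = set21.assign(result = set21['promo_price'].str.replace('.', '', regex=False),
                     price_len = set21['price'].str.split('.').str.get(0).str.len())

# Re-insert the decimal point so promo_price gets the same number of integer digits as price.
set21['cor_promo_price'] = set21.apply(lambda row: row['result'][:(row['price_len'])] + '.' + row['result'][(row['price_len']):], axis=1)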

set21

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result,price_len,cor_promo_price
1,APP0023,Apple Mac Keyboard Keypad Spanish,USB ultrathin keyboard Apple Mac Spanish.,59,589.996,0,13855401,2,1,589996,2,58.9996
2,APP0025,Mighty Mouse Apple Mouse for Mac,mouse Apple USB cable.,59,569.898,0,1387,2,1,569898,2,56.9898
3,APP0072,Apple Dock to USB Cable iPhone and iPod white,IPhone dock and USB Cable Apple iPod.,25,229.997,0,1230,2,1,229997,2,22.9997
5,APP0073,Apple Composite AV Cable iPhone and iPod white,IPhone and iPod AV Cable Dock to Composite Video.,45,420.003,0,1230,2,1,420003,2,42.0003
8,KIN0001-2,Mac memory Kingston 4GB (2x2GB) 667MHz DDR2 SO...,RAM 4GB (2x2GB) Mac mini and iMac (2006/07) Ma...,74,669.904,0,1364,2,1,669904,2,66.9904
...,...,...,...,...,...,...,...,...,...,...,...,...
19283,AP20468,Like new - Apple iPhone Black Lightning Dock,Support base and refitted with dock connector ...,59,440.004,0,13615399,2,1,440004,2,44.0004
19285,AP20470,Like new - Apple Thunderbolt to Gigabit Ethern...,Refurbished Mac adapter Thunderbolt to Gigabit...,35,279.994,0,1325,2,1,279994,2,27.9994
19288,AP20649,Like new - Apple Leather Case iPhone Case 8/7 ...,Reconditioned sleeve leather and microfiber Ap...,55,420.003,0,11865403,2,1,420003,2,42.0003
19295,AP20471,Apple Thunderbolt to FireWire 800 adapter,Reconditioned connection adapter Thunderbolt t...,35,279.994,0,1325,2,1,279994,2,27.9994


In [49]:
set21['promo_price'] = pd.to_numeric(set21.cor_promo_price)
set21['price'] = pd.to_numeric(set21.price)

set21 = set21.round({'promo_price': 2, 'price': 2})

set21 = set21.drop(['result', 'cor_promo_price', 'price_count', 'promo_count', 'price_len'], axis=1)

set21

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
1,APP0023,Apple Mac Keyboard Keypad Spanish,USB ultrathin keyboard Apple Mac Spanish.,59,59.00,0,13855401
2,APP0025,Mighty Mouse Apple Mouse for Mac,mouse Apple USB cable.,59,56.99,0,1387
3,APP0072,Apple Dock to USB Cable iPhone and iPod white,IPhone dock and USB Cable Apple iPod.,25,23.00,0,1230
5,APP0073,Apple Composite AV Cable iPhone and iPod white,IPhone and iPod AV Cable Dock to Composite Video.,45,42.00,0,1230
8,KIN0001-2,Mac memory Kingston 4GB (2x2GB) 667MHz DDR2 SO...,RAM 4GB (2x2GB) Mac mini and iMac (2006/07) Ma...,74,66.99,0,1364
...,...,...,...,...,...,...,...
19283,AP20468,Like new - Apple iPhone Black Lightning Dock,Support base and refitted with dock connector ...,59,44.00,0,13615399
19285,AP20470,Like new - Apple Thunderbolt to Gigabit Ethern...,Refurbished Mac adapter Thunderbolt to Gigabit...,35,28.00,0,1325
19288,AP20649,Like new - Apple Leather Case iPhone Case 8/7 ...,Reconditioned sleeve leather and microfiber Ap...,55,42.00,0,11865403
19295,AP20471,Apple Thunderbolt to FireWire 800 adapter,Reconditioned connection adapter Thunderbolt t...,35,28.00,0,1325


In [50]:
set21.dtypes

sku             object
name            object
desc            object
price            int64
promo_price    float64
in_stock         int64
type            object
dtype: object
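
Here `price` ends up as `int64`, while the subsets with decimal prices hold `float64`. If the cleaned subsets are later stacked back into one table (an assumption about the next steps), an explicit cast keeps the column types predictable. A minimal sketch with two rows copied from the outputs above:

```python
import pandas as pd

# One row with an integer price (APP0023) and one with a decimal price (RAI0007).
int_priced = pd.DataFrame({"sku": ["APP0023"], "price": [59], "promo_price": [59.00]})
float_priced = pd.DataFrame({"sku": ["RAI0007"], "price": [59.99], "promo_price": [49.99]})

# Stack the subsets and make both price columns float explicitly.
combined = pd.concat([int_priced, float_priced])
combined = combined.astype({"price": "float64", "promo_price": "float64"})

print(combined.dtypes)
```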

## Cleaning the third set (Promo = 1).

In [53]:
my_prdt1 = my_products1[my_products1['promo_count'] == 1]
my_prdt1

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
145,PAC0185,"Apple MacBook Pro 133 ""i5 25GHz | RAM 16GB | 2...",Apple MacBook Pro 133 inches (MD101Y / A) with...,1639,1469,0,1282,1,1
226,PAC1181,"Apple MacBook Pro 133 ""i5 25GHz | 16GB RAM | 1...",Apple MacBook Pro 133 inches (MD101Y / A) with...,2159,1769,0,1282,1,1
986,SNN0032,Sonnet Echo Express III-R 2U Chassis Rack Thun...,PCIe expansion chassis for Mac via Thunderbolt.,1208.79,1089,0,12995397,1,2
1413,HAR0015,Harman Kardon Esquire Mini Portable Speaker Brown,Bluetooth wireless speaker with battery for iP...,149.99,121,0,5398,1,2
1420,APP0916,Apple Smart Cover iPad Air 2 Case Black,Smart Leather Case for iPad Air 2.,89,74,0,12635403,1,1
...,...,...,...,...,...,...,...,...,...
18674,SYN0157-A,Open - Synology RT2600AC Wifi Router AC2600,Refurbished Wifi Wireless Router AC2600 17GHz ...,229.9,222,0,1334,1,2
18686,IFX0041-A,Open - iFixit P6 Battery Pentalobe screwdriver...,Refurbished devices screwdriver for MacBook Pr...,8.95,6,0,14305406,1,2
18697,APP0432-A,Open - Apple Lightning connector cable to USB ...,Lightning USB cable 1 meter to charge and sync...,25,18,0,1230,1,1
18883,PAC2286,"Second hand - Apple LED Cinema Display 24 """,Monitor Refurbished Apple Cinema Display 24 inch,899,499,0,1282,1,1


### Create further categories based on price_count

#### price_count == 3.

In [54]:
my_prdt1[my_prdt1.price_count == 3]

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
11486,APP0819-A,(Open) Apple iPhone 6 64GB Space Gray,New iPhone 6 64GB Free (MG4F2QL / A).,7.490.021,679,0,1298,1,3
12301,APP1603,Space Gray Apple Watch Sport 42mm Black Belt,Apple Watch Sport Aluminum 42mm Space Gray wit...,41.900.001,419,0,24895185,1,3
17485,APP2493,Apple TV 32GB 4K,Apple multimedia player with 4K resolution and...,1.990.002,194,1,113464259,1,3


In [55]:
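set13 = my_prdt1[my_prdt1.price_count == 3]

# Strip every dot from both price columns (literal match, no regex needed).
set13 = set13.assign(result1 = set13['price'].str.replace('.', '', regex=False),
                     result2 = set13['promo_price'].str.replace('.', '', regex=False))

# Re-insert the decimal point after the third digit.
# Note: this assumes every price in this subset has a three-digit integer part.
set13['cor_price'] = set13.apply(lambda row: row['result1'][:3] + '.' + row['result1'][3:], axis=1)
set13['cor_promo_price'] = set13.apply(lambda row: row['result2'][:3] + '.' + row['result2'][3:], axis=1)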

set13

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count,result1,result2,cor_price,cor_promo_price
11486,APP0819-A,(Open) Apple iPhone 6 64GB Space Gray,New iPhone 6 64GB Free (MG4F2QL / A).,7.490.021,679,0,1298,1,3,7490021,679,749.0021,679.0
12301,APP1603,Space Gray Apple Watch Sport 42mm Black Belt,Apple Watch Sport Aluminum 42mm Space Gray wit...,41.900.001,419,0,24895185,1,3,41900001,419,419.00001,419.0
17485,APP2493,Apple TV 32GB 4K,Apple multimedia player with 4K resolution and...,1.990.002,194,1,113464259,1,3,1990002,194,199.0002,194.0


In [56]:
set13['price'] = pd.to_numeric(set13.cor_price)
set13['promo_price'] = pd.to_numeric(set13.cor_promo_price)

set13 = set13.round({'promo_price': 2, 'price': 2})

set13 = set13.drop(['result1', 'result2', 'cor_price', 'cor_promo_price', 'promo_count', 'price_count'], axis=1)
set13

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
11486,APP0819-A,(Open) Apple iPhone 6 64GB Space Gray,New iPhone 6 64GB Free (MG4F2QL / A).,749.0,679.0,0,1298
12301,APP1603,Space Gray Apple Watch Sport 42mm Black Belt,Apple Watch Sport Aluminum 42mm Space Gray wit...,419.0,419.0,0,24895185
17485,APP2493,Apple TV 32GB 4K,Apple multimedia player with 4K resolution and...,199.0,194.0,1,113464259


In [57]:
set13.dtypes

sku             object
name            object
desc            object
price          float64
promo_price    float64
in_stock         int64
type            object
dtype: object

#### price_count == 2.

In [58]:
my_prdt1[my_prdt1.price_count == 2]

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
986,SNN0032,Sonnet Echo Express III-R 2U Chassis Rack Thun...,PCIe expansion chassis for Mac via Thunderbolt.,1208.79,1089,0,12995397,1,2
1413,HAR0015,Harman Kardon Esquire Mini Portable Speaker Brown,Bluetooth wireless speaker with battery for iP...,149.99,121,0,5398,1,2
2352,TN10018-A,(Open) Ten One Pogo Connect Bluetooth 4.0 Poin...,Bluetooth Stylus iPhone iPad and iPod.,79.95,19,0,1298,1,2
2678,QNA0104,QNAP TVS-463 | 4GB RAM Mac and PC Server Nas,4-bay NAS Server for Mac and PC.,797.39,789,0,12175397,1,2
2822,SNN0020-A,(Open) Sonnet Presto 10GbE Ethernet 1 port PCI...,1 PCIe 10GbE Ethernet card for Mac and PC port.,475.99,399,0,1298,1,2
10966,SAN0092,SanDisk Ultra USB 3.0 128GB pendrive,Pendrive USB 3.0 Flash Drive 128G for Mac and PC.,44.99,47,0,11935397,1,2
11038,SAN0110,SanDisk Ultra Flair Flash Drive 16GB USB 3.0,USB 3.0 flash drive 16GB USB Flash Drive Mac a...,8.99,9,1,11935397,1,2
13076,NTE0099-A,Open - NewerTech NuPower battery 63 W Ti Power...,63W battery compatible with 15-inch PowerBook ...,71.99,49,0,10142,1,2
13475,SAN0154,SanDisk Ultra Fit Flash Drive 16GB USB 3.0,Pendrive 16GB ultra-compact USB 3.0 transfers ...,9.99,9,0,11935397,1,2
13487,SAN0158,SanDisk Cruzer Dial Flash Drive 16GB USB 2.0,Ultra compact flash drive with built-dial for ...,7.99,9,0,12655397,1,2


In [63]:
set12 = my_prdt1[my_prdt1.price_count == 2]
set12

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
986,SNN0032,Sonnet Echo Express III-R 2U Chassis Rack Thun...,PCIe expansion chassis for Mac via Thunderbolt.,1208.79,1089,0,12995397,1,2
1413,HAR0015,Harman Kardon Esquire Mini Portable Speaker Brown,Bluetooth wireless speaker with battery for iP...,149.99,121,0,5398,1,2
2352,TN10018-A,(Open) Ten One Pogo Connect Bluetooth 4.0 Poin...,Bluetooth Stylus iPhone iPad and iPod.,79.95,19,0,1298,1,2
2678,QNA0104,QNAP TVS-463 | 4GB RAM Mac and PC Server Nas,4-bay NAS Server for Mac and PC.,797.39,789,0,12175397,1,2
2822,SNN0020-A,(Open) Sonnet Presto 10GbE Ethernet 1 port PCI...,1 PCIe 10GbE Ethernet card for Mac and PC port.,475.99,399,0,1298,1,2
10966,SAN0092,SanDisk Ultra USB 3.0 128GB pendrive,Pendrive USB 3.0 Flash Drive 128G for Mac and PC.,44.99,47,0,11935397,1,2
11038,SAN0110,SanDisk Ultra Flair Flash Drive 16GB USB 3.0,USB 3.0 flash drive 16GB USB Flash Drive Mac a...,8.99,9,1,11935397,1,2
13076,NTE0099-A,Open - NewerTech NuPower battery 63 W Ti Power...,63W battery compatible with 15-inch PowerBook ...,71.99,49,0,10142,1,2
13475,SAN0154,SanDisk Ultra Fit Flash Drive 16GB USB 3.0,Pendrive 16GB ultra-compact USB 3.0 transfers ...,9.99,9,0,11935397,1,2
13487,SAN0158,SanDisk Cruzer Dial Flash Drive 16GB USB 2.0,Ultra compact flash drive with built-dial for ...,7.99,9,0,12655397,1,2


In [64]:
set12.dtypes

sku            object
name           object
desc           object
price          object
promo_price    object
in_stock        int64
type           object
promo_count     int64
price_count     int64
dtype: object

In [65]:
# assign() returns a new DataFrame, so the filtered subset is not modified in place
# and pandas does not raise a SettingWithCopyWarning.
set12 = set12.assign(price=pd.to_numeric(set12.price),
                     promo_price=pd.to_numeric(set12.promo_price))

set12 = set12.drop(['promo_count', 'price_count'], axis=1)
set12

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
986,SNN0032,Sonnet Echo Express III-R 2U Chassis Rack Thun...,PCIe expansion chassis for Mac via Thunderbolt.,1208.79,1089,0,12995397
1413,HAR0015,Harman Kardon Esquire Mini Portable Speaker Brown,Bluetooth wireless speaker with battery for iP...,149.99,121,0,5398
2352,TN10018-A,(Open) Ten One Pogo Connect Bluetooth 4.0 Poin...,Bluetooth Stylus iPhone iPad and iPod.,79.95,19,0,1298
2678,QNA0104,QNAP TVS-463 | 4GB RAM Mac and PC Server Nas,4-bay NAS Server for Mac and PC.,797.39,789,0,12175397
2822,SNN0020-A,(Open) Sonnet Presto 10GbE Ethernet 1 port PCI...,1 PCIe 10GbE Ethernet card for Mac and PC port.,475.99,399,0,1298
10966,SAN0092,SanDisk Ultra USB 3.0 128GB pendrive,Pendrive USB 3.0 Flash Drive 128G for Mac and PC.,44.99,47,0,11935397
11038,SAN0110,SanDisk Ultra Flair Flash Drive 16GB USB 3.0,USB 3.0 flash drive 16GB USB Flash Drive Mac a...,8.99,9,1,11935397
13076,NTE0099-A,Open - NewerTech NuPower battery 63 W Ti Power...,63W battery compatible with 15-inch PowerBook ...,71.99,49,0,10142
13475,SAN0154,SanDisk Ultra Fit Flash Drive 16GB USB 3.0,Pendrive 16GB ultra-compact USB 3.0 transfers ...,9.99,9,0,11935397
13487,SAN0158,SanDisk Cruzer Dial Flash Drive 16GB USB 2.0,Ultra compact flash drive with built-dial for ...,7.99,9,0,12655397


In [66]:
set12.dtypes

sku             object
name            object
desc            object
price          float64
promo_price      int64
in_stock         int64
type            object
dtype: object
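
Assigning a column on a DataFrame that was produced by boolean filtering can make pandas emit a `SettingWithCopyWarning`, depending on the pandas version. Taking an explicit `.copy()` of the filtered rows is one common way to avoid it. The sketch below uses a small made-up frame (`source`, `subset` are illustrative names) with values borrowed from the tables above:

```python
import pandas as pd

source = pd.DataFrame({
    "price": ["1208.79", "149.99", "89"],
    "promo_price": ["1089", "121", "74"],
    "price_count": [2, 2, 1],
})

# .copy() makes the subset an independent DataFrame, so in-place column
# assignment is unambiguous and leaves `source` untouched.
subset = source[source["price_count"] == 2].copy()
subset["price"] = pd.to_numeric(subset["price"])
subset["promo_price"] = pd.to_numeric(subset["promo_price"])

print(subset.dtypes)
```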

#### price_count == 1.

In [67]:
my_prdt1[my_prdt1.price_count == 1]

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
145,PAC0185,"Apple MacBook Pro 133 ""i5 25GHz | RAM 16GB | 2...",Apple MacBook Pro 133 inches (MD101Y / A) with...,1639,1469,0,1282,1,1
226,PAC1181,"Apple MacBook Pro 133 ""i5 25GHz | 16GB RAM | 1...",Apple MacBook Pro 133 inches (MD101Y / A) with...,2159,1769,0,1282,1,1
1420,APP0916,Apple Smart Cover iPad Air 2 Case Black,Smart Leather Case for iPad Air 2.,89,74,0,12635403,1,1
2483,DRB0012-A,(Open) Dr Bott Digital Video Link DVI to Mini ...,DVI to Mini DisplayPort Converter for 27-inch ...,118,49,0,1298,1,1
2561,SNS0021,Sonos Play 3 Speaker Black,Wireless Speaker for iPhone iPad and iPod.,299,279,0,5398,1,1
...,...,...,...,...,...,...,...,...,...
18591,APP1770-A,Open - Smart Battery Apple Battery Case iPhone...,Battery Case for iPhone 6s and 6,119,89,0,"5,49E+11",1,1
18597,APP2482,Apple iPhone 8 256GB Gold,256GB Apple iPhone 8 in Gold Free,979,959,1,113291716,1,1
18697,APP0432-A,Open - Apple Lightning connector cable to USB ...,Lightning USB cable 1 meter to charge and sync...,25,18,0,1230,1,1
18883,PAC2286,"Second hand - Apple LED Cinema Display 24 """,Monitor Refurbished Apple Cinema Display 24 inch,899,499,0,1282,1,1


In [68]:
set11 = my_prdt1[my_prdt1.price_count == 1]
set11

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,promo_count,price_count
145,PAC0185,"Apple MacBook Pro 133 ""i5 25GHz | RAM 16GB | 2...",Apple MacBook Pro 133 inches (MD101Y / A) with...,1639,1469,0,1282,1,1
226,PAC1181,"Apple MacBook Pro 133 ""i5 25GHz | 16GB RAM | 1...",Apple MacBook Pro 133 inches (MD101Y / A) with...,2159,1769,0,1282,1,1
1420,APP0916,Apple Smart Cover iPad Air 2 Case Black,Smart Leather Case for iPad Air 2.,89,74,0,12635403,1,1
2483,DRB0012-A,(Open) Dr Bott Digital Video Link DVI to Mini ...,DVI to Mini DisplayPort Converter for 27-inch ...,118,49,0,1298,1,1
2561,SNS0021,Sonos Play 3 Speaker Black,Wireless Speaker for iPhone iPad and iPod.,299,279,0,5398,1,1
...,...,...,...,...,...,...,...,...,...
18591,APP1770-A,Open - Smart Battery Apple Battery Case iPhone...,Battery Case for iPhone 6s and 6,119,89,0,"5,49E+11",1,1
18597,APP2482,Apple iPhone 8 256GB Gold,256GB Apple iPhone 8 in Gold Free,979,959,1,113291716,1,1
18697,APP0432-A,Open - Apple Lightning connector cable to USB ...,Lightning USB cable 1 meter to charge and sync...,25,18,0,1230,1,1
18883,PAC2286,"Second hand - Apple LED Cinema Display 24 """,Monitor Refurbished Apple Cinema Display 24 inch,899,499,0,1282,1,1


In [69]:
set11.dtypes

sku            object
name           object
desc           object
price          object
promo_price    object
in_stock        int64
type           object
promo_count     int64
price_count     int64
dtype: object

In [70]:
set11 = set11.copy()  # independent copy, so the assignments below do not raise SettingWithCopyWarning
set11['price'] = pd.to_numeric(set11['price'])
set11['promo_price'] = pd.to_numeric(set11['promo_price'])

set11 = set11.drop(columns=['promo_count', 'price_count'])
set11


Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
145,PAC0185,"Apple MacBook Pro 133 ""i5 25GHz | RAM 16GB | 2...",Apple MacBook Pro 133 inches (MD101Y / A) with...,1639,1469,0,1282
226,PAC1181,"Apple MacBook Pro 133 ""i5 25GHz | 16GB RAM | 1...",Apple MacBook Pro 133 inches (MD101Y / A) with...,2159,1769,0,1282
1420,APP0916,Apple Smart Cover iPad Air 2 Case Black,Smart Leather Case for iPad Air 2.,89,74,0,12635403
2483,DRB0012-A,(Open) Dr Bott Digital Video Link DVI to Mini ...,DVI to Mini DisplayPort Converter for 27-inch ...,118,49,0,1298
2561,SNS0021,Sonos Play 3 Speaker Black,Wireless Speaker for iPhone iPad and iPod.,299,279,0,5398
...,...,...,...,...,...,...,...
18591,APP1770-A,Open - Smart Battery Apple Battery Case iPhone...,Battery Case for iPhone 6s and 6,119,89,0,"5,49E+11"
18597,APP2482,Apple iPhone 8 256GB Gold,256GB Apple iPhone 8 in Gold Free,979,959,1,113291716
18697,APP0432-A,Open - Apple Lightning connector cable to USB ...,Lightning USB cable 1 meter to charge and sync...,25,18,0,1230
18883,PAC2286,"Second hand - Apple LED Cinema Display 24 """,Monitor Refurbished Apple Cinema Display 24 inch,899,499,0,1282


In [71]:
set11.dtypes

sku            object
name           object
desc           object
price           int64
promo_price     int64
in_stock        int64
type           object
dtype: object
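
Assigning new column values on a DataFrame that was produced by filtering another one can trigger pandas' `SettingWithCopyWarning`. The following is a minimal sketch of one way to avoid it, by taking an explicit copy before converting. The frame `demo` and its values are made up for illustration and are not taken from `products.csv`.

```python
import pandas as pd

# Hypothetical miniature of a products subset; values are illustrative only.
demo = pd.DataFrame({
    "sku": ["A1", "B2", "C3"],
    "price": ["1639", "89", "25"],
    "promo_price": ["1469", "74", "18"],
    "price_count": [1, 1, 2],
})

# .copy() makes the subset independent of demo, so later assignments are unambiguous.
subset = demo[demo["price_count"] == 1].copy()

# Convert both price columns from strings to numbers.
# errors="coerce" would turn any unparseable string into NaN instead of raising.
for col in ["price", "promo_price"]:
    subset[col] = pd.to_numeric(subset[col], errors="coerce")

# Drop the helper column once it is no longer needed.
subset = subset.drop(columns=["price_count"])
print(subset.dtypes)
```

Whether to coerce or to raise on bad strings is a judgment call; coercing is convenient here only if the resulting NaN values are inspected afterwards.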

## Concatenate all cleaned sets.

In [72]:
clean_products = pd.concat([set33, set32, set31, set23, set22, set21, set13, set12, set11], axis=0)
clean_products.sample(20)

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
13433,TUC0282,Tucano ELEKTRO FLEX 8/7 Ultraslim iPhone Case ...,thin flexible sleeve with metallic shades for ...,16.9,12.99,0,11865403
1058,MOP0058,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 32GB external storage...,159.99,159.99,0,"5,49E+11"
11696,OWC0118-A,Open - OWC ThunderBay 4 Thunderbolt 2,OWC external storage box 4 bay professional Th...,580.99,413.35,0,1298
17207,AP20265,Like new - Apple iPhone 7 Plus 128GB Silver,Apple iPhone 7 Plus 128GB Free Silver Recondit...,889.0,759.0,0,85651716
14962,SAN0171,Sandisk Extreme Pro microSDHC Memory Card 32GB...,Micro memory card reading speed 95MB / s and s...,34.99,25.28,0,57445397
11844,STI0011,Stil Ange Gardien Mind iPhone Case 6 / 6s Black,Stilmind leather case for iPhone 6 / 6S,29.99,18.49,0,11865403
1859,OWC0141,OWC Aura Pro Express 6G - 120GB SSD MacBook Ai...,120GB SSD hard drive for MacBook Air 2012.,108.99,111.58,0,12215397
2566,SNS0015,Connect Sonos Zone Player,Music player and receiver to manage music from...,399.0,399.0,1,5398
18134,MOL0008,Moleskine Smart Writing in September smartpen,Pack digitizer pen and Moleskine notebook Blue...,229.0,202.99,1,1229
1444,BEA0020,Beats by Dr. Dre bike mount speaker Pill,Pill bike mount speakers.,49.95,32.99,0,5398


In [73]:
clean_products.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 10477 entries, 665 to 19131
Data columns (total 7 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   sku          10477 non-null  object 
 1   name         10477 non-null  object 
 2   desc         10477 non-null  object 
 3   price        10477 non-null  float64
 4   promo_price  10477 non-null  float64
 5   in_stock     10477 non-null  int64  
 6   type         10477 non-null  object 
dtypes: float64(2), int64(1), object(4)
memory usage: 654.8+ KB
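
Since `sku` is the merge key, it may be worth confirming after the concatenation that no `sku` appears more than once. A small sketch of such a check, using two hypothetical subsets (`part_a`, `part_b`) in place of the real `set11` ... `set33`:

```python
import pandas as pd

# Two hypothetical cleaned subsets; the sku "B2" deliberately appears in both.
part_a = pd.DataFrame({"sku": ["A1", "B2"], "price": [10.0, 20.0]}, index=[5, 9])
part_b = pd.DataFrame({"sku": ["C3", "B2"], "price": [30.0, 20.0]}, index=[2, 7])

combined = pd.concat([part_a, part_b], axis=0)

# concat keeps the original row labels, so the index is not necessarily sorted.
dup_mask = combined["sku"].duplicated(keep=False)
dup_skus = list(combined.loc[dup_mask, "sku"].unique())
print("Duplicated skus:", dup_skus)

# Keep the first row per sku and rebuild a clean 0..n-1 index.
deduped = combined.drop_duplicates(subset="sku").reset_index(drop=True)
print(deduped)
```

Keeping the first row per `sku` is an assumption made for the sketch; which duplicate to keep depends on how the subsets were built.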


### Calculate the average discount

In [75]:
clean_products['price'].mean()

650.002191467023

In [76]:
clean_products['promo_price'].mean()

892.7402815691516
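
A mean `promo_price` above the mean `price` suggests that some promo prices are inflated, so a plain difference of means would not be a meaningful average discount. The sketch below, on made-up rows, computes a per-product discount and excludes rows where the promo price exceeds the regular price. Column names follow the notebook; the frame `demo` and the `discount` columns are illustrative assumptions.

```python
import pandas as pd

# Made-up rows; the last one mimics a promo price with a misplaced decimal point.
demo = pd.DataFrame({
    "sku": ["A1", "B2", "C3"],
    "price": [100.0, 50.0, 11.99],
    "promo_price": [80.0, 50.0, 99.9],
})

# Absolute and relative discount per product.
demo["discount"] = demo["price"] - demo["promo_price"]
demo["discount_pct"] = demo["discount"] / demo["price"] * 100

# Rows where the promotion costs more than the regular price are suspicious.
suspicious = demo["promo_price"] > demo["price"]
print("Suspicious rows:", int(suspicious.sum()))

# Average discount over the plausible rows only.
mean_discount_pct = demo.loc[~suspicious, "discount_pct"].mean()
print("Mean discount (%):", mean_discount_pct)
```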

In [77]:
clean_products[clean_products['promo_price'] > clean_products['price']].sample(10)

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
2622,OWC0128,OWC Mercury Extreme Pro SSD 120GB 6GB,SSD 120GB SATA hard drive for Mac and PC III.,114.99,975.84,0,12215397
15574,TPL0017-A,Open - TP-Link M7350 Mobile Hotspot 4G Dual Band,4G point portable mobile access SD card reader...,119.79,971.69,0,1298
13015,KIN0162,Kingston SSD Disk 240GB UV400,SSD HDD 240GB SATA 3.0 6Gb / s for Mac and PC,100.77,859.94,0,12215397
2420,FCM0012,FCM Mac Memory 2GB DDR3 1066MHz SO-DIMM,2GB RAM Mac mini iMac MacBook and MacBook Pro ...,14.99,17.99,0,1364
11947,SAN0163,Sandisk Extreme SDHC UHS-I 32GB v30 90MB / s 4...,SDHC UHS Class 3 v30 speeds 90MB / s-40MB / s.,20.99,21.28,0,57445397
1165,OTR0075,Startech Ethernet Cable Cat6 10m Black,Ethernet cable 10 meters Unit Class 6 Mac and PC.,11.99,99.9,1,1325
2503,ADO0083,Adobe InDesign CC - design-,Adobe InDesign CC for Mac and PC.,435.45,459.99,0,1416
11108,SEA0038-A,"Open - Seagate Barracuda 3TB 35 ""SATA 7200rpm ...",internal hard drive Mac and PC 3TB (ST3000DM001).,112.0,796.41,0,1298
2498,ADO0075,Adobe Premiere Pro CC -Edition video-,Adobe Premiere Pro CC software for Mac and PC.,435.45,459.99,0,1416
17440,MUV0189,Muvit iPhone Case X Case Crystal Clear,durable and lightweight X iPhone Case,10.95,89.89,0,11865403


### Extra-cleaning of clean_products.

In [78]:
extra_cleaning = clean_products[clean_products['promo_price'] > clean_products['price']]
extra_cleaning

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
827,PAC0339,NewerTech miniStack 4TB Hard Drive Mac,External Box Hard Drive Mac + 4TB.,219.98,219.99,0,11935397
11188,PAC1655,QNAP TS-253A | 16GB RAM | 6TB (2x3TB) WD Red,NAS with 16GB of RAM + memory 6TB (2x3TB) Netw...,786.08,791.18,1,12175397
11258,PAC1675,QNAP TS-453A | 16GB | 12TB (4x3TB) WD Red,QNAP NAS TS-453A with 16GB of RAM memory + 12T...,118.93,119.64,1,12175397
11498,OWC0013-A,(Open) OWC Mercury Aura Pro SSD 240GB MacBook ...,240GB SSD for MacBook Air 2008/2009.,207.99,209.87,0,1298
11578,PAC1286,QNAP TS-451 Pack | 4GB RAM | Seagate Desktop 16TB,QNAP TS-451 Pack + 4GB memory RAM + 16TB (4x4T...,107.10,982.99,0,12175397
...,...,...,...,...,...,...,...
18989,PAC2507,Keyboard Replacement numerical Wireless iMac,Keyboard replacement service at the time of pu...,149.00,499.90,1,13855401
19074,PAC2509,substitution Magic Mouse 2 Trackpad 2,Replacement Service Mouse Trackpad at the time...,149.00,999.90,1,1387
10966,SAN0092,SanDisk Ultra USB 3.0 128GB pendrive,Pendrive USB 3.0 Flash Drive 128G for Mac and PC.,44.99,47.00,0,11935397
11038,SAN0110,SanDisk Ultra Flair Flash Drive 16GB USB 3.0,USB 3.0 flash drive 16GB USB Flash Drive Mac a...,8.99,9.00,1,11935397


In [79]:
extra_cleaning.dtypes

sku             object
name            object
desc            object
price          float64
promo_price    float64
in_stock         int64
type            object
dtype: object

In [80]:
# assign() returns a new DataFrame, which avoids SettingWithCopyWarning on the filtered slice
extra_cleaning = extra_cleaning.assign(promo_price=extra_cleaning['promo_price'] / 10)
extra_cleaning


Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
827,PAC0339,NewerTech miniStack 4TB Hard Drive Mac,External Box Hard Drive Mac + 4TB.,219.98,21.999,0,11935397
11188,PAC1655,QNAP TS-253A | 16GB RAM | 6TB (2x3TB) WD Red,NAS with 16GB of RAM + memory 6TB (2x3TB) Netw...,786.08,79.118,1,12175397
11258,PAC1675,QNAP TS-453A | 16GB | 12TB (4x3TB) WD Red,QNAP NAS TS-453A with 16GB of RAM memory + 12T...,118.93,11.964,1,12175397
11498,OWC0013-A,(Open) OWC Mercury Aura Pro SSD 240GB MacBook ...,240GB SSD for MacBook Air 2008/2009.,207.99,20.987,0,1298
11578,PAC1286,QNAP TS-451 Pack | 4GB RAM | Seagate Desktop 16TB,QNAP TS-451 Pack + 4GB memory RAM + 16TB (4x4T...,107.10,98.299,0,12175397
...,...,...,...,...,...,...,...
18989,PAC2507,Keyboard Replacement numerical Wireless iMac,Keyboard replacement service at the time of pu...,149.00,49.990,1,13855401
19074,PAC2509,substitution Magic Mouse 2 Trackpad 2,Replacement Service Mouse Trackpad at the time...,149.00,99.990,1,1387
10966,SAN0092,SanDisk Ultra USB 3.0 128GB pendrive,Pendrive USB 3.0 Flash Drive 128G for Mac and PC.,44.99,4.700,0,11935397
11038,SAN0110,SanDisk Ultra Flair Flash Drive 16GB USB 3.0,USB 3.0 flash drive 16GB USB Flash Drive Mac a...,8.99,0.900,1,11935397


In [81]:
# ratio of price to the rescaled promo price; assign() returns a new DataFrame, so no SettingWithCopyWarning
extra_cleaning = extra_cleaning.assign(indicator=extra_cleaning['price'] / extra_cleaning['promo_price'])
extra_cleaning


Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
827,PAC0339,NewerTech miniStack 4TB Hard Drive Mac,External Box Hard Drive Mac + 4TB.,219.98,21.999,0,11935397,9.999545
11188,PAC1655,QNAP TS-253A | 16GB RAM | 6TB (2x3TB) WD Red,NAS with 16GB of RAM + memory 6TB (2x3TB) Netw...,786.08,79.118,1,12175397,9.935539
11258,PAC1675,QNAP TS-453A | 16GB | 12TB (4x3TB) WD Red,QNAP NAS TS-453A with 16GB of RAM memory + 12T...,118.93,11.964,1,12175397,9.940655
11498,OWC0013-A,(Open) OWC Mercury Aura Pro SSD 240GB MacBook ...,240GB SSD for MacBook Air 2008/2009.,207.99,20.987,0,1298,9.910421
11578,PAC1286,QNAP TS-451 Pack | 4GB RAM | Seagate Desktop 16TB,QNAP TS-451 Pack + 4GB memory RAM + 16TB (4x4T...,107.10,98.299,0,12175397,1.089533
...,...,...,...,...,...,...,...,...
18989,PAC2507,Keyboard Replacement numerical Wireless iMac,Keyboard replacement service at the time of pu...,149.00,49.990,1,13855401,2.980596
19074,PAC2509,substitution Magic Mouse 2 Trackpad 2,Replacement Service Mouse Trackpad at the time...,149.00,99.990,1,1387,1.490149
10966,SAN0092,SanDisk Ultra USB 3.0 128GB pendrive,Pendrive USB 3.0 Flash Drive 128G for Mac and PC.,44.99,4.700,0,11935397,9.572340
11038,SAN0110,SanDisk Ultra Flair Flash Drive 16GB USB 3.0,USB 3.0 flash drive 16GB USB Flash Drive Mac a...,8.99,0.900,1,11935397,9.988889


In [82]:
extra_cleaning1 = extra_cleaning[(extra_cleaning['indicator'] >= 1) & (extra_cleaning['indicator'] <= 3)]

extra_cleaning1.sample(20)                          # to be used in dataframe

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
12969,PLA0023,Plantronics Voyager Edge Bluetooth Wireless He...,Smart handset with internal sensor for iPhone ...,108.33,92.99,0,5384,1.164964
18771,WDT0382-A,Open - WD My Passport 2TB External Hard Drive ...,External Hard Drive USB 3.0 2TB refurbished wi...,124.99,87.577,0,11935397,1.427201
1814,KEN0200,Kensington KeyFolio Plus X2 Thin Case with bac...,Cover with Spanish backlit keyboard for iPad A...,109.99,69.99,0,12575403,1.57151
18769,WDT0350-A,Open - WD My Passport 2TB External Hard Drive ...,External Hard Drive USB 3.0 2TB refurbished wi...,124.99,83.998,0,11935397,1.488012
15823,WDT0169-A,"Open - Hard Disk 3TB WD Blue 35 ""Mac and PC",Western Digital Internal Hard Disk 3TB 35 inch...,129.0,97.663,0,1298,1.320869
18103,PAC2468,DS218play Synology NAS Server | 24TB (2x12TB) ...,2-bay NAS server can accommodate 4K Ultra HD f...,1240.97,987.178,0,12175397,1.257088
17253,AP20030,"Apple iMac 215 ""Core i5 Quad-Core 16GHz | 8GB ...",IMac reconditioned 215 inch quad-core i5 16GHz...,1279.0,965.593,0,"2,16E+11",1.324575
15510,APP1459-A,Open - Magic Spanish Keyboard Mac Keyboard (OEM),Spanish Keyboard Mac and Apple iPad Ultrathin ...,119.0,94.995,0,5401,1.252698
12809,IFX0109,iFixit Piece Cable and Audio Control button on...,Part button on audio and control iPhone 5,15.95,8.99,0,21485407,1.774194
13015,KIN0162,Kingston SSD Disk 240GB UV400,SSD HDD 240GB SATA 3.0 6Gb / s for Mac and PC,100.77,85.994,0,12215397,1.171826


In [83]:
extra_cleaning2 = extra_cleaning[(extra_cleaning['indicator'] < 1) | (extra_cleaning['indicator'] > 3)]
extra_cleaning2.sample(20)

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
11817,APP1493,"Apple iPad Pro 9.7 ""256GB Wi-Fi Gray Space",9.7-inch Apple iPad Pro 256GB Wi-Fi Gray Space.,899.0,90.281,0,12141714,9.957798
15956,PAC2067,"Second hand - Apple iMac 24 ""Core 2 Duo 28 GHz...",Computer Refurbished iMac 24 inch Core 2 Duo 2...,1799.0,485.595,0,1282,3.704733
15162,JAW0034-A,Open - Jawbone UP3 Activity Monitor Black,Bluetooth activity monitor recorded sleep data...,179.99,39.892,0,1298,4.511932
2330,LIF0058,Waterproof iPad Case Lifeproof nüüd 234 Black,Water resistant protective cover for iPad 2/3/4.,119.99,19.989,0,12635403,6.002802
12539,PUR0145,Pure UltraSlim Case 03 + Protector iPhone 6 / ...,03mm skinny sleeve with included screen protec...,9.95,0.999,0,11865403,9.95996
560,KIN0078,Kingston V300 SSD Disk 120GB,SSD 120GB SATA Hard Drive Mac and PC III.,60.0,6.858,0,12215397,8.748906
11834,APP1511,"Apple iPad Pro 12.9 ""Wi-Fi 256GB Gold",New iPad Pro 256GB Wi-Fi.,1119.0,112.281,0,1714,9.966067
1913,KAN0010-A,(Open) Kanex Multi-Sync Bluetooth keyboard Mac...,Bluetooth keypad for Mac iPad and iPhone.,59.95,5.999,0,1298,9.993332
1777,JMO0066,Just AluFrame Mobile Phone Case Skin 6 Black,Aluminum and leather iPhone 6.,39.99,9.99,0,11865403,4.003003
11824,APP1500,"Apple iPad Pro 9.7 ""Wi-Fi + Cellular 32GB Rose...",9.7-inch Apple iPad Pro Wi-Fi + Cellular 32GB ...,829.0,83.281,0,12141714,9.954251
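
The two cells above split the rows by whether `indicator` lies between 1 and 3. As a sketch of an equivalent way to write that split, a single boolean mask built with `Series.between` can produce both halves. The four rows reuse price pairs visible in the outputs above, but the `sku` labels and the frame `demo` are placeholders.

```python
import pandas as pd

# promo_price here is already divided by 10, as in the notebook.
demo = pd.DataFrame({
    "sku": ["A1", "B2", "C3", "D4"],
    "price": [108.33, 219.98, 149.00, 8.99],
    "promo_price": [92.99, 21.999, 49.99, 0.90],
})
demo["indicator"] = demo["price"] / demo["promo_price"]

# between() is inclusive on both ends by default, matching >= 1 and <= 3.
in_band = demo["indicator"].between(1, 3)

plausible = demo[in_band].copy()      # counterpart of extra_cleaning1
needs_review = demo[~in_band].copy()  # counterpart of extra_cleaning2

print(plausible)
print(needs_review)
```

Using one mask and its negation guarantees the two subsets cover every row exactly once. Note that a missing `indicator` would fall into `needs_review` here, whereas two explicit comparisons would drop it from both subsets.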


In [84]:
extra_cleaning21 = extra_cleaning2[extra_cleaning2['price'] < 100]
extra_cleaning21.sample(20)

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
13260,PUR0153,"Puro TPU UltraSlim 03 ""iPhone Case + Screen Pr...",Ultrathin Flexible TPU cover material included...,12.95,1.299,0,11865403,9.969207
2726,TRA0034,Transcend External DVD Recorder White,superfine external recorder and portable DVD.,29.9,3.215,0,1424,9.300156
12804,IFX0138,Screwdriver P2 Pentalobe iFixit iPhone 4 4s and 5,Screwdriver for iPhone 44s & 5,7.95,0.799,1,12645406,9.949937
1287,MOS0153,Moshi Venturo shoulder bag / backpack Macbook ...,Backpack thin Macbook Pro 13 inches / 15 inches.,82.64,9.999,0,1392,8.264826
1134,PIE0028,internal battery for iPhone 5,Replacement AC Adapter for Apple iPhone 5.,14.95,1.499,1,21485407,9.973316
1786,KUA0017,Support Kukaclip car + Funda iPhone 6 / 6S Red,Magnetic car holder with 360 degrees rotating ...,24.99,6.99,0,11865403,3.575107
16955,GRT0460-A,Open - Griffin Survivor Case Tough Journey iPh...,Cast and impact resistant padded case for iPho...,29.99,4.49,0,1298,6.679287
14660,BOO0060-A,Open - Booq Boa Skin XS iPad Case Purple / Gray,soft and durable nylon and neoprene iPad sleeve.,14.95,2.99,0,12635403,5.0
11668,BEL0178-A,(Open) Classic Belkin Folio Case iPhone 6 / 6S...,protective case with microfiber interior for i...,29.9,8.92,0,1298,3.352018
11281,PHI0062,Philips Hue Dimmer light dimmer switch White,Remote control to control lamps and bulbs Hue,24.95,2.499,1,11905404,9.983994


In [85]:
extra_cleaning211 = extra_cleaning21[abs(extra_cleaning21['price'] - (extra_cleaning21['promo_price']*10)) <= 15]
extra_cleaning211

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
15,MOS0021,Clearguard Moshi MacBook Pro and Air,Keyboard Protector MacBook Pro 13-inch Retina ...,24.95,2.499,0,13835403,9.983994
39,JMO0026,Just Mobile Lazy Couch Support Mac and iPad,Mac and iPad small lift stand.,19.95,1.999,0,8696,9.979990
327,OWC0037,"OWC Mercury Elite Pro Mini aluminum box 25 ""FW...",outer case 25 inch SATA eSATA / FW800 / FW400 ...,69.99,7.490,0,12995397,9.344459
411,KIN0074,Kingston DataTraveler SE9 8GB USB 2.0 key,8GB USB 2.0 key minimalist design.,4.99,0.578,0,57445397,8.633218
468,OWC0056,Mac OWC Memory 4GB (2x2GB) 667MHz DDR2 FB-DIMM,RAM 4GB (2x2GB) for Mac Pro.,52.99,5.699,0,1364,9.298122
...,...,...,...,...,...,...,...,...
18962,CAV0009,Cavus Foot Support Sonos Play 1 Black,Floor stand for Speaker Sonos Play 1,59.00,7.139,1,5398,8.264463
18963,CAV0010,Cavus Foot Support Sonos Play 1 White,Floor stand for Speaker Sonos Play 1,59.00,7.139,0,5398,8.264463
10966,SAN0092,SanDisk Ultra USB 3.0 128GB pendrive,Pendrive USB 3.0 Flash Drive 128G for Mac and PC.,44.99,4.700,0,11935397,9.572340
11038,SAN0110,SanDisk Ultra Flair Flash Drive 16GB USB 3.0,USB 3.0 flash drive 16GB USB Flash Drive Mac a...,8.99,0.900,1,11935397,9.988889
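
The same absolute-difference check is applied several times with different thresholds (15 for prices under 100, 180 for prices of 100 or more). A sketch of wrapping it in a small helper follows. The function name `promo_matches_price` and its parameters are hypothetical; the two rows reuse values shown in the notebook outputs.

```python
import pandas as pd

def promo_matches_price(df: pd.DataFrame, scale: float, tolerance: float) -> pd.Series:
    """Return True where promo_price * scale lies within `tolerance` of price."""
    return (df["price"] - df["promo_price"] * scale).abs() <= tolerance

demo = pd.DataFrame({
    "sku": ["MOS0021", "ROL0010"],
    "price": [24.95, 49.99],
    "promo_price": [2.499, 9.99],  # already divided by 10
})

mask = promo_matches_price(demo, scale=10, tolerance=15)
close = demo[mask]   # promo_price * 10 is near price
far = demo[~mask]    # everything else

print(close)
print(far)
```

Calling the helper with `tolerance=180` would reproduce the check used for the higher-priced group, under the same assumptions.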


In [86]:
# assign() returns a new DataFrame, which avoids SettingWithCopyWarning on the filtered slice
extra_cleaning211 = extra_cleaning211.assign(promo_price=extra_cleaning211['promo_price'] * 10)

extra_cleaning211                                    # to be used in main product dataframe


Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
15,MOS0021,Clearguard Moshi MacBook Pro and Air,Keyboard Protector MacBook Pro 13-inch Retina ...,24.95,24.99,0,13835403,9.983994
39,JMO0026,Just Mobile Lazy Couch Support Mac and iPad,Mac and iPad small lift stand.,19.95,19.99,0,8696,9.979990
327,OWC0037,"OWC Mercury Elite Pro Mini aluminum box 25 ""FW...",outer case 25 inch SATA eSATA / FW800 / FW400 ...,69.99,74.90,0,12995397,9.344459
411,KIN0074,Kingston DataTraveler SE9 8GB USB 2.0 key,8GB USB 2.0 key minimalist design.,4.99,5.78,0,57445397,8.633218
468,OWC0056,Mac OWC Memory 4GB (2x2GB) 667MHz DDR2 FB-DIMM,RAM 4GB (2x2GB) for Mac Pro.,52.99,56.99,0,1364,9.298122
...,...,...,...,...,...,...,...,...
18962,CAV0009,Cavus Foot Support Sonos Play 1 Black,Floor stand for Speaker Sonos Play 1,59.00,71.39,1,5398,8.264463
18963,CAV0010,Cavus Foot Support Sonos Play 1 White,Floor stand for Speaker Sonos Play 1,59.00,71.39,0,5398,8.264463
10966,SAN0092,SanDisk Ultra USB 3.0 128GB pendrive,Pendrive USB 3.0 Flash Drive 128G for Mac and PC.,44.99,47.00,0,11935397,9.572340
11038,SAN0110,SanDisk Ultra Flair Flash Drive 16GB USB 3.0,USB 3.0 flash drive 16GB USB Flash Drive Mac a...,8.99,9.00,1,11935397,9.988889


In [87]:
extra_cleaning212 = extra_cleaning21[~(abs(extra_cleaning21['price'] - (extra_cleaning21['promo_price']*10)) <= 15)]
extra_cleaning212

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
293,ROL0010,Rollei Chest Mount iPhone 4 / 4S,Case iPhone 4 / 4S for sport.,49.99,9.99,0,11865403,5.004004
334,NTE0023,NewerTech eSATA Cable Adapter Mac Pro,Add 2 eSATA ports with the adapter cable for M...,30.99,7.39,1,12755395,4.193505
607,BOO0066,Booq Boa Skin XS iPad Case Purple / Gray,soft and durable nylon and neoprene iPad sleeve.,14.95,3.991,0,12635403,3.745928
636,BEL0139,Belkin Lego iPhone Case SE / 5s / 5 Blue / Purple,Lego rigid shell for iPhone SE / 5s / 5.,29.99,9.995,0,11865403,3.0005
817,OWC0087,OWC Bluetooth module shielding shielding kit B...,Shielding Kit Bluetooth Module for Mac mini 2012.,55.99,9.99,1,12755395,5.604605
882,PUR0107,Puro Just Cavalli Swan iPhone Case Passion / 5...,IPhone anti-shock housing for iPhone SE / 5s / 5.,19.99,3.991,0,11865403,5.00877
1035,KEN0175,Kensington SafeGrip iPad Air Rugged Case with ...,Shockproof housing with handle for iPad.,43.99,9.995,0,12635403,4.401201
1036,OPU0010,Exo Opulus alcantara iPhone Case SE / 5s / 5 B...,Cover for iPhone SE / 5s / 5 polycarbonate and...,39.99,8.99,0,11865403,4.448276
1040,OPU0009,Exo Opulus alcantara Case iPhone 5 / 5S black,Case for iPhone 5 / 5S polycarbonate and micro...,39.99,9.99,0,11865403,4.003003
1082,NON0007,Yoko Ono Non Violence Case iPhone 5 / 5S White,Yoko Ono hard case design for iPhone 5 / 5S.,14.99,4.99,0,11865403,3.004008


In [88]:
extra_cleaning212.dtypes

sku             object
name            object
desc            object
price          float64
promo_price    float64
in_stock         int64
type            object
indicator      float64
dtype: object

In [89]:
extra_cleaning212.isna().sum()

sku            0
name           0
desc           0
price          0
promo_price    0
in_stock       0
type           0
indicator      0
dtype: int64

In [90]:
# price is already float64 (see dtypes above), so this conversion is only a safeguard
extra_cleaning212 = extra_cleaning212.assign(price=pd.to_numeric(extra_cleaning212['price']))


In [91]:
# assign() returns a new DataFrame, which avoids SettingWithCopyWarning on the filtered slice
extra_cleaning212 = extra_cleaning212.assign(price=extra_cleaning212['price'] / 10)
extra_cleaning212  # to be used in main product dataframe


Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
293,ROL0010,Rollei Chest Mount iPhone 4 / 4S,Case iPhone 4 / 4S for sport.,4.999,9.99,0,11865403,5.004004
334,NTE0023,NewerTech eSATA Cable Adapter Mac Pro,Add 2 eSATA ports with the adapter cable for M...,3.099,7.39,1,12755395,4.193505
607,BOO0066,Booq Boa Skin XS iPad Case Purple / Gray,soft and durable nylon and neoprene iPad sleeve.,1.495,3.991,0,12635403,3.745928
636,BEL0139,Belkin Lego iPhone Case SE / 5s / 5 Blue / Purple,Lego rigid shell for iPhone SE / 5s / 5.,2.999,9.995,0,11865403,3.0005
817,OWC0087,OWC Bluetooth module shielding shielding kit B...,Shielding Kit Bluetooth Module for Mac mini 2012.,5.599,9.99,1,12755395,5.604605
882,PUR0107,Puro Just Cavalli Swan iPhone Case Passion / 5...,IPhone anti-shock housing for iPhone SE / 5s / 5.,1.999,3.991,0,11865403,5.00877
1035,KEN0175,Kensington SafeGrip iPad Air Rugged Case with ...,Shockproof housing with handle for iPad.,4.399,9.995,0,12635403,4.401201
1036,OPU0010,Exo Opulus alcantara iPhone Case SE / 5s / 5 B...,Cover for iPhone SE / 5s / 5 polycarbonate and...,3.999,8.99,0,11865403,4.448276
1040,OPU0009,Exo Opulus alcantara Case iPhone 5 / 5S black,Case for iPhone 5 / 5S polycarbonate and micro...,3.999,9.99,0,11865403,4.003003
1082,NON0007,Yoko Ono Non Violence Case iPhone 5 / 5S White,Yoko Ono hard case design for iPhone 5 / 5S.,1.499,4.99,0,11865403,3.004008


In [92]:
extra_cleaning22 = extra_cleaning2[extra_cleaning2['price'] >= 100]
extra_cleaning22.sample(20)

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
16005,PAC2066,"Second hand - Apple iMac 20 ""Core 2 Duo 226GHz...",Refurbished iMac 20 inch Core 2 Duo | 4GB RAM ...,1199.0,379.0,0,"5,43E+15",3.163588
18846,AP20067,"Very Good - Apple iMac 20 ""Core 2 Duo 226GHz |...",Refurbished iMac 20 inch Core 2 Duo 226GHz | 2...,1199.0,375.595,0,"5,43E+15",3.192268
17405,OWC0248,OWC SSD expansion Aura 2TB 6G MacBook Pro 2012...,2TB SSD expansion MacBook Pro 13-inch and 15-i...,966.99,97.358,0,12215397,9.932312
12868,PAC2025,Synology DS216 + II | 2GB RAM,NAS with 4K transcoding and direct copy button...,373.97,38.618,0,12175397,9.683826
2882,APP1093,Apple iPod Touch 64GB Rosa,New 6th generation iPod Touch 64GB with 8 mega...,292.81,34.281,0,11821715,8.541466
2503,ADO0083,Adobe InDesign CC - design-,Adobe InDesign CC for Mac and PC.,435.45,45.999,0,1416,9.46651
929,LAC0103,LaCie CloudBox 2TB Hard Drive,2TB hard drive with cloud storage.,124.99,13.179,0,11935397,9.484028
644,WDT0185,"Red 750GB WD 25 ""Mac PC hard drive and NAS",Western Digital hard drive designed for NAS 75...,639.9,70.584,0,12655397,9.065794
15140,EIZ0024,"Eizo FlexScan EV2736W Monitor 27 ""QHD DP pivot...",27-inch monitor with slim frame and adjustable...,629.0,64.999,0,1296,9.677072
17667,PAC2195,"Second hand - Apple iMac 20 ""Core 2 Duo 226GHz...",Refurbished iMac 20 inch Core 2 Duo | 2GB RAM ...,1199.0,255.595,0,"5,43E+15",4.691015


In [93]:
extra_cleaning221 = extra_cleaning22[abs(extra_cleaning22['price'] - (extra_cleaning22['promo_price']*10)) <= 180]
extra_cleaning221

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
827,PAC0339,NewerTech miniStack 4TB Hard Drive Mac,External Box Hard Drive Mac + 4TB.,219.98,21.999,0,11935397,9.999545
11188,PAC1655,QNAP TS-253A | 16GB RAM | 6TB (2x3TB) WD Red,NAS with 16GB of RAM + memory 6TB (2x3TB) Netw...,786.08,79.118,1,12175397,9.935539
11258,PAC1675,QNAP TS-453A | 16GB | 12TB (4x3TB) WD Red,QNAP NAS TS-453A with 16GB of RAM memory + 12T...,118.93,11.964,1,12175397,9.940655
11498,OWC0013-A,(Open) OWC Mercury Aura Pro SSD 240GB MacBook ...,240GB SSD for MacBook Air 2008/2009.,207.99,20.987,0,1298,9.910421
12854,SAN0139,"SanDisk SSD 480GB Plus 25 ""SATA 6Gb / s",Hard SSD 480GB 25 inches,149.99,15.658,0,12215397,9.579129
...,...,...,...,...,...,...,...,...
13329,APP1758,Apple iPad Mini 4 Wi-Fi 32GB Silver,Apple iPad Mini 4 32GB Wi-Fi.,429.00,43.281,0,24861714,9.911971
13330,APP1759,Apple iPad Mini 4 Wi-Fi 32GB Gold,Apple iPad Mini 4 32GB Wi-Fi.,429.00,43.281,0,24861714,9.911971
15140,EIZ0024,"Eizo FlexScan EV2736W Monitor 27 ""QHD DP pivot...",27-inch monitor with slim frame and adjustable...,629.00,64.999,0,1296,9.677072
15248,APP1659-A,Open - 32GB Apple iPhone 6s Rose Gold,New iPhone 6S 16GB Free,529.00,60.033,0,1298,8.811820


In [94]:
extra_cleaning221.dtypes

sku             object
name            object
desc            object
price          float64
promo_price    float64
in_stock         int64
type            object
indicator      float64
dtype: object

In [95]:
extra_cleaning221.isna().sum()

sku            0
name           0
desc           0
price          0
promo_price    0
in_stock       0
type           0
indicator      0
dtype: int64

In [96]:
# assign() returns a new DataFrame, which avoids SettingWithCopyWarning on the filtered slice
extra_cleaning221 = extra_cleaning221.assign(promo_price=extra_cleaning221['promo_price'] * 10)

extra_cleaning221                                                         # to be used in dataframe


Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
827,PAC0339,NewerTech miniStack 4TB Hard Drive Mac,External Box Hard Drive Mac + 4TB.,219.98,219.99,0,11935397,9.999545
11188,PAC1655,QNAP TS-253A | 16GB RAM | 6TB (2x3TB) WD Red,NAS with 16GB of RAM + memory 6TB (2x3TB) Netw...,786.08,791.18,1,12175397,9.935539
11258,PAC1675,QNAP TS-453A | 16GB | 12TB (4x3TB) WD Red,QNAP NAS TS-453A with 16GB of RAM memory + 12T...,118.93,119.64,1,12175397,9.940655
11498,OWC0013-A,(Open) OWC Mercury Aura Pro SSD 240GB MacBook ...,240GB SSD for MacBook Air 2008/2009.,207.99,209.87,0,1298,9.910421
12854,SAN0139,"SanDisk SSD 480GB Plus 25 ""SATA 6Gb / s",Hard SSD 480GB 25 inches,149.99,156.58,0,12215397,9.579129
...,...,...,...,...,...,...,...,...
13329,APP1758,Apple iPad Mini 4 Wi-Fi 32GB Silver,Apple iPad Mini 4 32GB Wi-Fi.,429.00,432.81,0,24861714,9.911971
13330,APP1759,Apple iPad Mini 4 Wi-Fi 32GB Gold,Apple iPad Mini 4 32GB Wi-Fi.,429.00,432.81,0,24861714,9.911971
15140,EIZ0024,"Eizo FlexScan EV2736W Monitor 27 ""QHD DP pivot...",27-inch monitor with slim frame and adjustable...,629.00,649.99,0,1296,9.677072
15248,APP1659-A,Open - 32GB Apple iPhone 6s Rose Gold,New iPhone 6S 16GB Free,529.00,600.33,0,1298,8.811820


In [100]:
extra_cleaning222 = extra_cleaning22[abs(extra_cleaning22['price'] - (extra_cleaning22['promo_price']*10)) > 180]
extra_cleaning222

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
14716,QNA0187,QNAP TVS-1271U-RP NAS server with redundant po...,Expansion unit rack format with 12 bays and 32 GB,3386.79,356.799,0,12175397,9.492151
14717,QNA0188,QNAP TVS-1271U-RP NAS server with redundant po...,Expansion unit rack bays 12 and format memory ...,2781.79,296.299,0,12175397,9.388456
14718,QNA0189,QNAP TVS-1271U-RP NAS with redundant source,Expansion unit rack bays 12 format and 8GB of ...,2346.19,253.899,0,12175397,9.240643
14996,PAC1921,"Second hand - Apple iMac 20 ""Core 2 Duo 24GHz ...",IMac used 20 inch Core 2 Duo 24GHz | 3GB RAM |...,1199.0,325.584,0,1282,3.682613
14997,PAC1920,"Second hand - Apple iMac 20 ""Core 2 Duo 24GHz ...",IMac used 20 inch Core 2 Duo 24GHz | 2GB RAM |...,1199.0,336.584,0,"5,43E+15",3.562261
15933,PAC2058,"Second hand - Apple iMac 20 ""Core 2 Duo 24GHz ...",Refurbished iMac 20 inch Core 2 Duo 24GHz | 2G...,1199.0,326.584,0,"5,43E+15",3.671337
15935,PAC2057,"Second hand - Apple iMac 20 ""Core 2 Duo 24GHz ...",Refurbished iMac 20 inch Core 2 Duo 24GHz | 4G...,1499.0,355.594,0,"5,43E+15",4.215482
15937,PAC2060,"Second hand - Apple iMac 20 ""Core 2 Duo 226GHz...",Refurbished iMac 20 inch Core 2 Duo 226GHz | 4...,1199.0,395.595,0,51882158,3.030878
15956,PAC2067,"Second hand - Apple iMac 24 ""Core 2 Duo 28 GHz...",Computer Refurbished iMac 24 inch Core 2 Duo 2...,1799.0,485.595,0,1282,3.704733
16005,PAC2066,"Second hand - Apple iMac 20 ""Core 2 Duo 226GHz...",Refurbished iMac 20 inch Core 2 Duo | 4GB RAM ...,1199.0,379.0,0,"5,43E+15",3.163588


In [101]:
# assign() returns a new DataFrame, which avoids SettingWithCopyWarning on the filtered slice
extra_cleaning222 = extra_cleaning222.assign(price=extra_cleaning222['price'] / 10)

extra_cleaning222                                           # to be used in dataframe


Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
14716,QNA0187,QNAP TVS-1271U-RP NAS server with redundant po...,Expansion unit rack format with 12 bays and 32 GB,338.679,356.799,0,12175397,9.492151
14717,QNA0188,QNAP TVS-1271U-RP NAS server with redundant po...,Expansion unit rack bays 12 and format memory ...,278.179,296.299,0,12175397,9.388456
14718,QNA0189,QNAP TVS-1271U-RP NAS with redundant source,Expansion unit rack bays 12 format and 8GB of ...,234.619,253.899,0,12175397,9.240643
14996,PAC1921,"Second hand - Apple iMac 20 ""Core 2 Duo 24GHz ...",IMac used 20 inch Core 2 Duo 24GHz | 3GB RAM |...,119.9,325.584,0,1282,3.682613
14997,PAC1920,"Second hand - Apple iMac 20 ""Core 2 Duo 24GHz ...",IMac used 20 inch Core 2 Duo 24GHz | 2GB RAM |...,119.9,336.584,0,"5,43E+15",3.562261
15933,PAC2058,"Second hand - Apple iMac 20 ""Core 2 Duo 24GHz ...",Refurbished iMac 20 inch Core 2 Duo 24GHz | 2G...,119.9,326.584,0,"5,43E+15",3.671337
15935,PAC2057,"Second hand - Apple iMac 20 ""Core 2 Duo 24GHz ...",Refurbished iMac 20 inch Core 2 Duo 24GHz | 4G...,149.9,355.594,0,"5,43E+15",4.215482
15937,PAC2060,"Second hand - Apple iMac 20 ""Core 2 Duo 226GHz...",Refurbished iMac 20 inch Core 2 Duo 226GHz | 4...,119.9,395.595,0,51882158,3.030878
15956,PAC2067,"Second hand - Apple iMac 24 ""Core 2 Duo 28 GHz...",Computer Refurbished iMac 24 inch Core 2 Duo 2...,179.9,485.595,0,1282,3.704733
16005,PAC2066,"Second hand - Apple iMac 20 ""Core 2 Duo 226GHz...",Refurbished iMac 20 inch Core 2 Duo | 4GB RAM ...,119.9,379.0,0,"5,43E+15",3.163588
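
Rescaling a column on a filtered subset, as in the cell above, is a pattern where pandas may emit a `SettingWithCopyWarning`. The following is a minimal sketch of one way to avoid it, assuming a hypothetical toy frame (`toy`, `subset`) rather than the real products data: take an explicit `.copy()` of the filtered rows before assigning.

```python
import pandas as pd

# Hypothetical toy data standing in for the products table.
toy = pd.DataFrame({"sku": ["A1", "B2", "C3"], "price": [1199.0, 24.95, 1799.0]})

# Take an explicit copy of the filtered rows so that later assignments
# modify an independent DataFrame instead of a possible view of `toy`.
subset = toy[toy["price"] > 1000].copy()
subset["price"] = subset["price"] / 10

print(subset)
print(toy)  # the original frame keeps its prices
```

The copy costs a little memory, but it makes it unambiguous which object is being modified.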


In [102]:
true_clean = clean_products[clean_products['promo_price'] <= clean_products['price']]

true_clean                                                                            # rows where promo_price <= price; kept unchanged in the final dataframe

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
665,CRU0015-2,Crucial memory Mac 16GB (2x8GB) SO-DIMM DDR3 1...,RAM 16GB (2x8GB) 135V MacBook Pro iMac (2012/2...,163.98,162.99,1,1364
885,PAC0376,OWC Mercury Elite Pro Dual Thunderbolt + 8TB,RAID outer box 35 inch SATA connection Thunder...,560.97,554.99,0,11935397
943,REP0188,Full Screen Repair iPad Mini 2,Repair service including parts and labor for i...,209.99,209.99,0,"1,44E+11"
1057,MOP0057,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 16GB external storage...,132.99,132.99,0,"5,49E+11"
1058,MOP0058,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 32GB external storage...,159.99,159.99,0,"5,49E+11"
...,...,...,...,...,...,...,...
18591,APP1770-A,Open - Smart Battery Apple Battery Case iPhone...,Battery Case for iPhone 6s and 6,119.00,89.00,0,"5,49E+11"
18597,APP2482,Apple iPhone 8 256GB Gold,256GB Apple iPhone 8 in Gold Free,979.00,959.00,1,113291716
18697,APP0432-A,Open - Apple Lightning connector cable to USB ...,Lightning USB cable 1 meter to charge and sync...,25.00,18.00,0,1230
18883,PAC2286,"Second hand - Apple LED Cinema Display 24 """,Monitor Refurbished Apple Cinema Display 24 inch,899.00,499.00,0,1282


In [103]:
true_clean_products = pd.concat([true_clean, extra_cleaning1, extra_cleaning211, extra_cleaning212, extra_cleaning221, extra_cleaning222], axis=0)
true_clean_products

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
665,CRU0015-2,Crucial memory Mac 16GB (2x8GB) SO-DIMM DDR3 1...,RAM 16GB (2x8GB) 135V MacBook Pro iMac (2012/2...,163.980,162.990,1,1364,
885,PAC0376,OWC Mercury Elite Pro Dual Thunderbolt + 8TB,RAID outer box 35 inch SATA connection Thunder...,560.970,554.990,0,11935397,
943,REP0188,Full Screen Repair iPad Mini 2,Repair service including parts and labor for i...,209.990,209.990,0,"1,44E+11",
1057,MOP0057,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 16GB external storage...,132.990,132.990,0,"5,49E+11",
1058,MOP0058,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 32GB external storage...,159.990,159.990,0,"5,49E+11",
...,...,...,...,...,...,...,...,...
1909,KEN0201,Kensington KeyFolio X3 Thin Keyboard and Case ...,Spanish keyboard holster and battery for iPad ...,10.999,29.900,0,12635403,3.678595
13767,PEB0001-A,Open - Pebble Smartwatch Original Black,Bluetooth watch intelligent LED backlit compat...,12.999,34.981,0,1298,3.716017
15162,JAW0034-A,Open - Jawbone UP3 Activity Monitor Black,Bluetooth activity monitor recorded sleep data...,17.999,39.892,0,1298,4.511932
2345,ADN0016-A,Open - Adonit Jot Touch with PixelPoint pointe...,Special precision pointer to draw fine tip,11.900,38.083,0,1298,3.124754
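
The empty `indicator` cells in the first rows above come from how `pd.concat` aligns columns: `true_clean` has no `indicator` column, so those rows receive `NaN`. A minimal sketch of that behaviour, using hypothetical toy frames (`kept`, `fixed`) in place of the real ones:

```python
import pandas as pd

# Hypothetical toy frames: only the second one carries an `indicator` column.
kept = pd.DataFrame({"sku": ["A1"], "price": [163.98]}, index=[665])
fixed = pd.DataFrame(
    {"sku": ["B2"], "price": [10.999], "indicator": [3.68]}, index=[1909]
)

# concat aligns on column names (outer join by default);
# cells missing from one input are filled with NaN.
combined = pd.concat([kept, fixed], axis=0)

print(combined)
print(combined["indicator"].isna().tolist())
```

The original index labels are preserved as well, which is why the combined table is not numbered from 0.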


In [105]:
true_clean_products[true_clean_products.promo_price > true_clean_products.price]       # check: rows where promo_price is still greater than price

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type,indicator
15,MOS0021,Clearguard Moshi MacBook Pro and Air,Keyboard Protector MacBook Pro 13-inch Retina ...,24.950,24.990,0,13835403,9.983994
39,JMO0026,Just Mobile Lazy Couch Support Mac and iPad,Mac and iPad small lift stand.,19.950,19.990,0,8696,9.979990
327,OWC0037,"OWC Mercury Elite Pro Mini aluminum box 25 ""FW...",outer case 25 inch SATA eSATA / FW800 / FW400 ...,69.990,74.900,0,12995397,9.344459
411,KIN0074,Kingston DataTraveler SE9 8GB USB 2.0 key,8GB USB 2.0 key minimalist design.,4.990,5.780,0,57445397,8.633218
468,OWC0056,Mac OWC Memory 4GB (2x2GB) 667MHz DDR2 FB-DIMM,RAM 4GB (2x2GB) for Mac Pro.,52.990,56.990,0,1364,9.298122
...,...,...,...,...,...,...,...,...
1909,KEN0201,Kensington KeyFolio X3 Thin Keyboard and Case ...,Spanish keyboard holster and battery for iPad ...,10.999,29.900,0,12635403,3.678595
13767,PEB0001-A,Open - Pebble Smartwatch Original Black,Bluetooth watch intelligent LED backlit compat...,12.999,34.981,0,1298,3.716017
15162,JAW0034-A,Open - Jawbone UP3 Activity Monitor Black,Bluetooth activity monitor recorded sleep data...,17.999,39.892,0,1298,4.511932
2345,ADN0016-A,Open - Adonit Jot Touch with PixelPoint pointe...,Special precision pointer to draw fine tip,11.900,38.083,0,1298,3.124754
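
To judge how much of the table is affected, it can help to count the rows matched by this filter instead of only displaying them. A minimal sketch, assuming a hypothetical toy frame with the same `price` and `promo_price` columns:

```python
import pandas as pd

# Hypothetical toy data with the same two price columns.
toy = pd.DataFrame(
    {
        "price": [24.95, 163.98, 4.99, 209.99],
        "promo_price": [24.99, 162.99, 5.78, 209.99],
    }
)

# Boolean mask of rows where the promotional price exceeds the regular price.
mask = toy["promo_price"] > toy["price"]

n_violations = int(mask.sum())   # True counts as 1
share = float(mask.mean())       # fraction of affected rows

print(n_violations, share)
```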


#### Save & Download the clean products dataset

In [106]:
true_clean_products = true_clean_products.drop(columns='indicator')    # helper column no longer needed
true_clean_products

Unnamed: 0,sku,name,desc,price,promo_price,in_stock,type
665,CRU0015-2,Crucial memory Mac 16GB (2x8GB) SO-DIMM DDR3 1...,RAM 16GB (2x8GB) 135V MacBook Pro iMac (2012/2...,163.980,162.990,1,1364
885,PAC0376,OWC Mercury Elite Pro Dual Thunderbolt + 8TB,RAID outer box 35 inch SATA connection Thunder...,560.970,554.990,0,11935397
943,REP0188,Full Screen Repair iPad Mini 2,Repair service including parts and labor for i...,209.990,209.990,0,"1,44E+11"
1057,MOP0057,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 16GB external storage...,132.990,132.990,0,"5,49E+11"
1058,MOP0058,Mophie Space Pack Battery Case (1700mAh) and S...,Housing with battery and 32GB external storage...,159.990,159.990,0,"5,49E+11"
...,...,...,...,...,...,...,...
1909,KEN0201,Kensington KeyFolio X3 Thin Keyboard and Case ...,Spanish keyboard holster and battery for iPad ...,10.999,29.900,0,12635403
13767,PEB0001-A,Open - Pebble Smartwatch Original Black,Bluetooth watch intelligent LED backlit compat...,12.999,34.981,0,1298
15162,JAW0034-A,Open - Jawbone UP3 Activity Monitor Black,Bluetooth activity monitor recorded sleep data...,17.999,39.892,0,1298
2345,ADN0016-A,Open - Adonit Jot Touch with PixelPoint pointe...,Special precision pointer to draw fine tip,11.900,38.083,0,1298


In [107]:
true_clean_products.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 10477 entries, 665 to 13671
Data columns (total 7 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   sku          10477 non-null  object 
 1   name         10477 non-null  object 
 2   desc         10477 non-null  object 
 3   price        10477 non-null  float64
 4   promo_price  10477 non-null  float64
 5   in_stock     10477 non-null  int64  
 6   type         10477 non-null  object 
dtypes: float64(2), int64(1), object(4)
memory usage: 654.8+ KB


In [108]:
true_clean_products.isna().sum()

sku            0
name           0
desc           0
price          0
promo_price    0
in_stock       0
type           0
dtype: int64

In [109]:
true_clean_products.to_csv('products_clean.csv', index=False)
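
Because the file is written with `index=False`, the original row labels (665, 885, ...) are not stored, and reading the file back produces a fresh 0-based index. A minimal sketch of that round trip, assuming a hypothetical toy frame and an in-memory buffer in place of `products_clean.csv`:

```python
import io

import pandas as pd

# Hypothetical toy data with a non-contiguous index, like the cleaned table.
toy = pd.DataFrame(
    {"sku": ["CRU0015-2", "PAC0376"], "price": [163.98, 560.97]},
    index=[665, 885],
)

# Write to an in-memory buffer instead of a file, for the sake of the sketch.
buffer = io.StringIO()
toy.to_csv(buffer, index=False)
buffer.seek(0)

# Reading back gives a new 0..n-1 index and only the data columns.
reloaded = pd.read_csv(buffer)
print(reloaded)
```

Had the index been written as well, `pd.read_csv` would by default load it as an extra column named `Unnamed: 0`.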