## SKU Creation

1. Isolate each product in the options dataframe together with all of its options.
2. Fill nulls using the forward-fill (ffill) method.
3. Compile a list of option values for each optionid.
4. Generate all combinations.
5. Create a new dataframe of option combinations.
6. Generate SKUs based on those combinations.
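
The steps above can be sketched as a single function. This is a minimal sketch, not the final implementation: the function name `build_skus`, the toy data, and the choice of catalogid plus concatenated part numbers as the SKU are all assumptions made for illustration, and it assumes every option row has a string part number.

```python
from itertools import product

import pandas as pd


def build_skus(options: pd.DataFrame) -> pd.DataFrame:
    """Return one row per option combination for every catalogid (hypothetical helper)."""
    # Step 2: option rows inherit their header fields from the row above.
    header_cols = ["optionid", "productid", "catalogid"]
    filled = options.copy()
    filled[header_cols] = filled[header_cols].ffill()

    rows = []
    # Step 1: isolate each product together with all of its options.
    for catalogid, prod_opts in filled.groupby("catalogid", sort=False):
        # Step 3: one list of part numbers per optionid.
        choices = [grp["partnumber"].tolist()
                   for _, grp in prod_opts.groupby("optionid", sort=False)]
        # Step 4: Cartesian product of those lists.
        for combo in product(*choices):
            # Steps 5-6: one output row and one SKU per combination.
            rows.append({"catalogid": str(int(catalogid)),
                         "sku": str(int(catalogid)) + "".join(combo)})
    return pd.DataFrame(rows)


# Toy input shaped like the options file: header fields only on the
# first row of each option, blanks underneath.
demo = pd.DataFrame({
    "optionid":   [1.0, None, 2.0, None],
    "productid":  ["123", None, "123", None],
    "catalogid":  [123.0, None, 123.0, None],
    "partnumber": ["M", "P", "BLUE", "GREEN"],
})
sku_table = build_skus(demo)
print(sku_table["sku"].tolist())
```

Grouping by catalogid first means the same function can run over the whole options file instead of being repeated by hand for each product.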

## Business Problem
As per my conversation with Allen Shimon, StrobesnMore needs a merge of their products and product options so that every possible combination of options has a unique SKU. The challenge their website currently faces is that inventory lookup requires visiting multiple links and clicking through multiple dropdowns, which makes the current list ineffective for looking up a product quickly.

Example: 
Product 123 has 2 options, each with multiple values. 
Option 1 (product feature): Magnetic vs. Permanent 
Option 2 (color): Blue vs. Green 

Currently: Product 123 has only 1 SKU (123) 
After the complete SKU list is created, the following SKUs will be provided: 
123MBLUE 
123PBLUE 
123MGREEN 
123PGREEN 
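
The example above can be reproduced in a few lines with `itertools.product`. This is only an illustrative sketch; the variable names and the short codes `M`, `P`, `BLUE` and `GREEN` are taken from the example, not from the client's data.

```python
from itertools import product

product_id = "123"
# Option codes from the example: M = Magnetic, P = Permanent
features = ["M", "P"]
colors = ["BLUE", "GREEN"]

# product() varies its last argument fastest, so colors go first to
# reproduce the order of the list above.
skus = [f"{product_id}{feature}{color}"
        for color, feature in product(colors, features)]
print(skus)
# ['123MBLUE', '123PBLUE', '123MGREEN', '123PGREEN']
```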


## Data Understanding 
Two CSV files were provided to demonstrate the need for this service: one with the full list of products and another with the list of product options. The goal is to provide a query that merges these two files so that each product has a SKU for every combination of its options.

This process will require digging into the data and coding a query that pulls the list of products and merges all of their option combinations. To ensure quality, an understanding of the options and their products will be required.

In the product options list, a few observations have been made and need to be confirmed:

1. It is assumed that each product can only have one unique color.
2. The optionid is the identifier for the option categories for each product. It is assumed that each product can only have one unique option per optionid (for example, it cannot be both magnetic and permanent).
3. The catalogid, NOT the productid, appears to be the proper unique identifier for each product in the CSV.
4. The client should state whether there is a preferred format for SKU creation (color first, capitals, etc.).





## Challenges

Making this iterative process work on all products. There are roughly 1,200 of them, so repeating the steps by hand for each one would take an estimated 60 hours.



# EDA

In [25]:
import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option("display.max_colwidth", 100)

In [26]:
df_prod = pd.read_csv('products.csv', encoding='unicode_escape')
len(df_prod)

1277

### Checking the products

This is a simple check of the product list, mainly to see if there are any duplicates.

In [27]:
df_prod['name'].duplicated().value_counts()

False    1254
True       23
Name: name, dtype: int64

In [28]:
df_prod[df_prod["id"].duplicated()].head(10)

Unnamed: 0,catalogid,id,name,image1,thumbnail,price,categories,description,keywords,stock
11,3486,50,Whelen 500 Series TIR6 Super-LED,assets/images/whelen-500-tir6-lighthead-blue.jpg,assets/images/thumbnails/whelen-500-tir6-black-flange-5flangeb_thumbnail.jpg,139.99,LED Lights/Body & Grille Mount,"These lightheads feature a six Super-LED panel, three Super-LEDs per color, in a single lighthea...",,9930.0
12,3487,68,Whelen 500 Series Lens for LED and Halogen with Optics,assets/images/whelen-replacement-500-lens-halogen-optic.jpg,assets/images/thumbnails/whelen-replacement-500-lens-halogen-optic_thumbnail.jpg,39.99,Replacement Bulbs & Lenses/Replacement Lenses,"These are Whelen 500 Series replacement lenses. Please choose optic, which has the groves in the...",,-79.0
14,3490,68,Whelen 600 Series Lens,assets/images/whelen-replacement-600-lens-halogen-optic.jpg,assets/images/thumbnails/whelen-replacement-600-lens-halogen-optic_thumbnail.jpg,76.99,Replacement Bulbs & Lenses/Replacement Lenses,These Whelen 600 Series lenses fit the below lightheads.,,-97.0
17,3493,68,Whelen 700 Series Optic Lens,assets/images/whelen-replacement-700-lens-halogen-optic.jpg,assets/images/thumbnails/whelen-replacement-700-lens-halogen-optic_thumbnail.jpg,77.99,Replacement Bulbs & Lenses/Replacement Lenses,These Whelen 700 Series lenses fit the 700 Series LED lightheads.,,-104.0
20,3496,68,Whelen 900 Series Optic Lens,assets/images/Whelen_900Series_OpticLens_Red.jpg,assets/images/thumbnails/Whelen_900Series_OpticLens_Red_thumbnail.jpg,77.99,Replacement Bulbs & Lenses/Replacement Lenses,These Whelen replacement lens fit the 900 Series Super-Led lightheads.,,-187.0
81,3754,40,Whelen 400 Series Single Level Super-LED,assets/images/whelen-400-series-singe-level-led-amber.jpg,assets/images/thumbnails/whelen-400-series-singe-level-led-amber_thumbnail.jpg,110.99,LED Lights/Body & Grille Mount,Whelen's Super Linear 400 Series lightheads feature 5 built in scan-lcok flash patterns and are ...,,33977.0
340,5401,M9,Whelen M9 Conversion Flange from M9 to 900 Series,assets/images/M9FC900.jpg,assets/images/thumbnails/m9fc900_thumbnail.jpg,41.99,LED Lights/Mounting Brackets & Accessories,,,-8.0
366,5431,CCSRN,Whelen CenCom Carbide Amplifier Control Module,assets/images/carbide-grp.jpg,assets/images/thumbnails/carbide-grp_thumbnail.jpg,1197.99,Sirens & Speakers/Sirens & Amplifiers,,,-35.0
380,5462,AV,Whelen Avenger II DUO Single Combination Linear/TIR Super-LED Dash Light,assets/images/AVENGERIIDUO.jpg,assets/images/thumbnails/avengeriiduo_thumbnail.jpg,182.99,LED Lights/Dash & Window Mount,,,-13.0
381,5463,AV,Whelen Avenger II TRIO Single Combination Linear/TIR Super-LED Dash Light,assets/images/AVENGERIITRIO.jpg,assets/images/thumbnails/avengeriitrio_thumbnail.jpg,194.99,LED Lights/Dash & Window Mount,,,-6.0


In [29]:
df_prod['name'].duplicated().sort_values(ascending=False).head(10)

638    True
491    True
511    True
582    True
607    True
774    True
917    True
918    True
920    True
921    True
Name: name, dtype: bool

In [30]:
df_prod['catalogid'].duplicated().value_counts()

False    1277
Name: catalogid, dtype: int64

In [31]:
df_prod.isnull().sum()

catalogid         0
id                1
name              1
image1          115
thumbnail       117
price             0
categories      111
description     807
keywords       1085
stock             0
dtype: int64

In [32]:
df_prod.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1277 entries, 0 to 1276
Data columns (total 10 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   catalogid    1277 non-null   int64  
 1   id           1276 non-null   object 
 2   name         1276 non-null   object 
 3   image1       1162 non-null   object 
 4   thumbnail    1160 non-null   object 
 5   price        1277 non-null   float64
 6   categories   1166 non-null   object 
 7   description  470 non-null    object 
 8   keywords     192 non-null    object 
 9   stock        1277 non-null   float64
dtypes: float64(2), int64(1), object(7)
memory usage: 99.9+ KB


In [33]:
df_prod['catalogid'] = df_prod['catalogid'].astype(str)

In [34]:
df_prod[df_prod['catalogid'] == '3487']['name']

12    Whelen 500 Series Lens for LED and Halogen with Optics
Name: name, dtype: object

In [35]:
df_options = pd.read_csv('options.csv',encoding='unicode_escape')
len(df_options)

4397

In [36]:
df_options.columns

Index(['optionid', 'productid', 'catalogid', 'category_id', 'featurecaption',
       'featuretype', 'featurerequired', 'sorting', 'url', 'info', 'featureid',
       'featurename', 'featureprice', 'sorting.1', 'partnumber', 'imagepath',
       'selected', 'hidden', 'optcatalogid', 'qty', 'thumbpath', 'colorcode'],
      dtype='object')

In [37]:
df_option_list = df_options.copy()

1) Identify all unique products (from the products dataframe).
2) Identify all product options (reshape the options dataframe for merging with the products).
3) Melt the dataframe so there is a product row for every combination of options.

Example:

Product 123 has 2 options with multiple types.

Option 1: Magnetic vs. Permanent
Option 2: Blue vs. Green

Product 123 only has 1 SKU (123)

I need 4 SKUs created:

 - 123MBLUE - magnetic and blue
 - 123PBLUE - permanent and blue
 - 123MGREEN - magnetic and green
 - 123PGREEN - permanent and green

In [38]:
df_option_list.columns

Index(['optionid', 'productid', 'catalogid', 'category_id', 'featurecaption',
       'featuretype', 'featurerequired', 'sorting', 'url', 'info', 'featureid',
       'featurename', 'featureprice', 'sorting.1', 'partnumber', 'imagepath',
       'selected', 'hidden', 'optcatalogid', 'qty', 'thumbpath', 'colorcode'],
      dtype='object')

In [39]:
df_option_list.isnull().sum()

optionid           3544
productid          3544
catalogid          3544
category_id        3544
featurecaption     3544
featuretype        3544
featurerequired    3544
sorting            3544
url                4396
info               4395
featureid             6
featurename           6
featureprice          6
sorting.1             6
partnumber          554
imagepath          4397
selected              6
hidden                6
optcatalogid          6
qty                   6
thumbpath          4397
colorcode          2775
dtype: int64

In [40]:
df_option_list.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4397 entries, 0 to 4396
Data columns (total 22 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   optionid         853 non-null    float64
 1   productid        853 non-null    object 
 2   catalogid        853 non-null    float64
 3   category_id      853 non-null    float64
 4   featurecaption   853 non-null    object 
 5   featuretype      853 non-null    object 
 6   featurerequired  853 non-null    float64
 7   sorting          853 non-null    float64
 8   url              1 non-null      float64
 9   info             2 non-null      object 
 10  featureid        4391 non-null   float64
 11  featurename      4391 non-null   object 
 12  featureprice     4391 non-null   float64
 13  sorting.1        4391 non-null   float64
 14  partnumber       3843 non-null   object 
 15  imagepath        0 non-null      float64
 16  selected         4391 non-null   float64
 17  hidden        

In [41]:
df_option_list.columns

Index(['optionid', 'productid', 'catalogid', 'category_id', 'featurecaption',
       'featuretype', 'featurerequired', 'sorting', 'url', 'info', 'featureid',
       'featurename', 'featureprice', 'sorting.1', 'partnumber', 'imagepath',
       'selected', 'hidden', 'optcatalogid', 'qty', 'thumbpath', 'colorcode'],
      dtype='object')

In [42]:
df_option_list = df_option_list[['optionid', 'productid', 'catalogid', 'category_id','featurecaption',
       'featuretype', 'featurerequired', 'sorting', 'url', 'info', 'featureid',
       'featurename','partnumber']]

In [43]:
df_option_list.head()

Unnamed: 0,optionid,productid,catalogid,category_id,featurecaption,featuretype,featurerequired,sorting,url,info,featureid,featurename,partnumber
0,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11418.0,Blue,BP
1,,,,,,,,,,,11419.0,Clear,CP
2,,,,,,,,,,,11420.0,Red,RP
3,,,,,,,,,,,11421.0,Amber,AP
4,4114.0,11.1002,3474.0,0.0,Choose your Momentary Option,Dropdown,1.0,1.0,,,18984.0,Positive Momentary,.PM


In [44]:
# Forward-fill every column; fillna(method='ffill') is deprecated in recent pandas
df_option_list_filled = df_option_list.ffill()

In [45]:
df_option_list_filled.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4397 entries, 0 to 4396
Data columns (total 13 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   optionid         4397 non-null   float64
 1   productid        4397 non-null   object 
 2   catalogid        4397 non-null   float64
 3   category_id      4397 non-null   float64
 4   featurecaption   4397 non-null   object 
 5   featuretype      4397 non-null   object 
 6   featurerequired  4397 non-null   float64
 7   sorting          4397 non-null   float64
 8   url              416 non-null    float64
 9   info             911 non-null    object 
 10  featureid        4397 non-null   float64
 11  featurename      4397 non-null   object 
 12  partnumber       4397 non-null   object 
dtypes: float64(7), object(6)
memory usage: 446.7+ KB
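
In the output above, partnumber goes from 3843 to 4397 non-null values, so 554 option rows took the part number of the row above them. If that is not intended, one option is to forward-fill only the columns that describe the option group. The sketch below uses a toy frame, and the choice of `header_cols` is an assumption to confirm against the real file.

```python
import pandas as pd

# Toy frame mimicking the export layout: header fields appear only on the
# first row of each option, and one option row has no part number.
toy = pd.DataFrame({
    "optionid":    [2451.0, None, None],
    "productid":   ["RB6T", None, None],
    "featurename": ["Blue", "Clear", "Red"],
    "partnumber":  ["BP", None, "RP"],
})

header_cols = ["optionid", "productid"]  # fields that describe the option group
filled = toy.copy()
filled[header_cols] = filled[header_cols].ffill()

print(filled["partnumber"].isna().sum())   # 1: the gap stays visible
print(toy.ffill()["partnumber"].tolist())  # ['BP', 'BP', 'RP']: the gap is silently filled
```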


# SKU format

The SKU will consist of the product id followed by the part numbers associated with the selected options.

Example:

For product 68 with LED (-1183), Non-Optic (725), and Amber (-1SB), the SKU will read 68-1183725-1SB
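
As a small sketch of that format, the SKU is the product id with the part numbers appended in option order. The helper name `make_sku` is hypothetical.

```python
def make_sku(product_id: str, partnumbers: list[str]) -> str:
    """Concatenate the product id and the selected part numbers (hypothetical helper)."""
    return product_id + "".join(partnumbers)


# Part numbers copied from the example above: LED, Non-Optic, Amber
sku = make_sku("68", ["-1183", "725", "-1SB"])
print(sku)  # 68-1183725-1SB
```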

In [46]:
df_option_list_filled[df_option_list_filled['productid'] == '68']

Unnamed: 0,optionid,productid,catalogid,category_id,featurecaption,featuretype,featurerequired,sorting,url,info,featureid,featurename,partnumber
27,2462.0,68,3483.0,0.0,Choose your Bulb,Dropdown,1.0,2.0,,,11466.0,LED,-1183
28,2462.0,68,3483.0,0.0,Choose your Bulb,Dropdown,1.0,2.0,,,11467.0,Strobe,-3183
29,2462.0,68,3483.0,0.0,Choose your Bulb,Dropdown,1.0,2.0,,,11468.0,Halogen,-1183
30,2463.0,68,3483.0,0.0,Choose your Lens Color,Dropdown,1.0,3.0,,,11469.0,Amber,-1SB
31,2463.0,68,3483.0,0.0,Choose your Lens Color,Dropdown,1.0,3.0,,,11470.0,Blue,-2SB
32,2463.0,68,3483.0,0.0,Choose your Lens Color,Dropdown,1.0,3.0,,,11471.0,Clear,-3SB
33,2463.0,68,3483.0,0.0,Choose your Lens Color,Dropdown,1.0,3.0,,,11472.0,Red,-5SB
34,2464.0,68,3483.0,0.0,Choose your Model,Dropdown,1.0,2.0,,,11473.0,Non-Optic,725
35,2464.0,68,3483.0,0.0,Choose your Model,Dropdown,1.0,2.0,,,11474.0,Optic,726
50,2467.0,68,3487.0,0.0,Choose your Bulb,Dropdown,1.0,0.0,,,11487.0,Halogen,-196


In [48]:
df_prod[df_prod['id'] == '68'][['id','name','catalogid']]

Unnamed: 0,id,name,catalogid
9,68,Whelen 400 Series Lens,3483
12,68,Whelen 500 Series Lens for LED and Halogen with Optics,3487
14,68,Whelen 600 Series Lens,3490
17,68,Whelen 700 Series Optic Lens,3493
20,68,Whelen 900 Series Optic Lens,3496
704,68,Whelen ION Clear Replacement Lens,6128
819,68,Whelen 700 Series Non-Optic Lens,6387


In [54]:
df_prod[df_prod['id'] == '2'][['id','name','catalogid']]

Unnamed: 0,id,name,catalogid
6,2,Whelen Warning Par36 Fairing Light,3479


In [61]:
df_option_list_filled['partnumber'].value_counts()

#NAME?                  888
R                       145
A                       139
B                       130
Equipment Info          119
C                        98
-R                       72
BB                       60
L                        58
M                        46
G                        44
J                        44
D                        40
W                        38
F                        34
RB                       33
No SAK                   30
2                        29
E                        29
V                        27
P                        24
3COLORKIT                24
K                        24
-C                       21
AA                       21
RC                       20
8SP1                     20
RR                       20
BC                       19
MBAJ94                   16
MKEZ94                   16
AC                       15
RA                       14
CUSTOM                   13
T5                       13
UNI                 
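
The most frequent value in the counts above, `#NAME?`, is the error text a spreadsheet shows for a formula it cannot resolve. That suggests, as an assumption to confirm with the client, that these part numbers were damaged before export and cannot be used in a SKU as they stand. A minimal sketch for flagging them, using toy values and a hypothetical `PLACEHOLDERS` set:

```python
import pandas as pd

# Toy values; the real column is df_option_list_filled['partnumber']
partnumbers = pd.Series(["BP", "#NAME?", "-RC", "#NAME?", "Equipment Info"])

PLACEHOLDERS = {"#NAME?"}  # assumption: extend once the client confirms other junk values
is_placeholder = partnumbers.isin(PLACEHOLDERS)

print(int(is_placeholder.sum()))              # 2
print(partnumbers[~is_placeholder].tolist())  # ['BP', '-RC', 'Equipment Info']
```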

## Sample SKU creation [RB6T and 11.1002]

In [49]:
df_option_list_filled.iloc[:17]

Unnamed: 0,optionid,productid,catalogid,category_id,featurecaption,featuretype,featurerequired,sorting,url,info,featureid,featurename,partnumber
0,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11418.0,Blue,BP
1,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11419.0,Clear,CP
2,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11420.0,Red,RP
3,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11421.0,Amber,AP
4,4114.0,11.1002,3474.0,0.0,Choose your Momentary Option,Dropdown,1.0,1.0,,,18984.0,Positive Momentary,.PM
5,4114.0,11.1002,3474.0,0.0,Choose your Momentary Option,Dropdown,1.0,1.0,,,18985.0,Negative/Ground Momentary,.GSM
6,2458.0,11.12,3475.0,0.0,Choose your LED Colors,Dropdown,1.0,1.0,,,11444.0,Blue,#NAME?
7,2458.0,11.12,3475.0,0.0,Choose your LED Colors,Dropdown,1.0,1.0,,,11445.0,Red/Blue,#NAME?
8,2458.0,11.12,3475.0,0.0,Choose your LED Colors,Dropdown,1.0,1.0,,,11446.0,Blue/White,#NAME?
9,2458.0,11.12,3475.0,0.0,Choose your LED Colors,Dropdown,1.0,1.0,,,11447.0,Red/White,-RC


### Product RB6T

In [53]:
df_RB6T = df_option_list_filled.iloc[:4]




### Create SKU Column

In [54]:
df_RB6T

Unnamed: 0,optionid,productid,catalogid,category_id,featurecaption,featuretype,featurerequired,sorting,url,info,featureid,featurename,partnumber
0,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11418.0,Blue,BP
1,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11419.0,Clear,CP
2,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11420.0,Red,RP
3,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11421.0,Amber,AP


In [56]:
# Work on an explicit copy so the new column is not assigned to a slice of df_option_list_filled
df_RB6T = df_RB6T.copy()
df_RB6T['sku'] = df_RB6T['productid'] + '-' + df_RB6T['partnumber']
df_RB6T


Unnamed: 0,optionid,productid,catalogid,category_id,featurecaption,featuretype,featurerequired,sorting,url,info,featureid,featurename,partnumber,sku
0,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11418.0,Blue,BP,RB6T-BP
1,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11419.0,Clear,CP,RB6T-CP
2,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11420.0,Red,RP,RB6T-RP
3,2451.0,RB6T,3457.0,0.0,Choose your Lens Color,Dropdown,1.0,0.0,,,11421.0,Amber,AP,RB6T-AP


In [59]:
df_11 = df_option_list_filled.iloc[4:17]


In [60]:
df_11

Unnamed: 0,optionid,productid,catalogid,category_id,featurecaption,featuretype,featurerequired,sorting,url,info,featureid,featurename,partnumber
4,4114.0,11.1002,3474.0,0.0,Choose your Momentary Option,Dropdown,1.0,1.0,,,18984.0,Positive Momentary,.PM
5,4114.0,11.1002,3474.0,0.0,Choose your Momentary Option,Dropdown,1.0,1.0,,,18985.0,Negative/Ground Momentary,.GSM
6,2458.0,11.12,3475.0,0.0,Choose your LED Colors,Dropdown,1.0,1.0,,,11444.0,Blue,#NAME?
7,2458.0,11.12,3475.0,0.0,Choose your LED Colors,Dropdown,1.0,1.0,,,11445.0,Red/Blue,#NAME?
8,2458.0,11.12,3475.0,0.0,Choose your LED Colors,Dropdown,1.0,1.0,,,11446.0,Blue/White,#NAME?
9,2458.0,11.12,3475.0,0.0,Choose your LED Colors,Dropdown,1.0,1.0,,,11447.0,Red/White,-RC
10,2458.0,11.12,3475.0,0.0,Choose your LED Colors,Dropdown,1.0,1.0,,,11448.0,Amber/White,#NAME?
11,2458.0,11.12,3475.0,0.0,Choose your LED Colors,Dropdown,1.0,1.0,,,11449.0,Green,#NAME?
12,2458.0,11.12,3475.0,0.0,Choose your LED Colors,Dropdown,1.0,1.0,,,11450.0,Red,#NAME?
13,2458.0,11.12,3475.0,0.0,Choose your LED Colors,Dropdown,1.0,1.0,,,11451.0,Amber,#NAME?


In [62]:
df_11_moment = df_11.iloc[:2]
df_11_moment_options = list(df_11_moment['featurename'])

In [63]:
df_11_color = df_11.iloc[2:11]

df_11_color_options = list(df_11_color['featurename'])

In [64]:
df_11_magperm = df_11.iloc[11:13]
df_11_magperm_options = list(df_11_magperm['featurename'])

In [65]:
print(df_11_magperm_options)
print(df_11_color_options)
print(df_11_moment_options)

['Magnetic', 'Permanent']
['Blue', 'Red/Blue', 'Blue/White', 'Red/White', 'Amber/White', 'Green', 'Red', 'Amber', 'Red/Amber']
['Positive Momentary', 'Negative/Ground Momentary']


In [66]:
from itertools import product

# The three option lists
options1 = ['Magnetic', 'Permanent']
options2 = ['Blue', 'Red/Blue', 'Blue/White', 'Red/White', 'Amber/White', 'Green', 'Red', 'Amber', 'Red/Amber']
options3 = ['Positive Momentary', 'Negative/Ground Momentary']

# Cartesian product: one tuple per combination, taking one value from each list
combos = list(product(options1, options2, options3))

print(len(combos))
combos

36


[('Magnetic', 'Blue', 'Positive Momentary'),
 ('Magnetic', 'Blue', 'Negative/Ground Momentary'),
 ('Magnetic', 'Red/Blue', 'Positive Momentary'),
 ('Magnetic', 'Red/Blue', 'Negative/Ground Momentary'),
 ('Magnetic', 'Blue/White', 'Positive Momentary'),
 ('Magnetic', 'Blue/White', 'Negative/Ground Momentary'),
 ('Magnetic', 'Red/White', 'Positive Momentary'),
 ('Magnetic', 'Red/White', 'Negative/Ground Momentary'),
 ('Magnetic', 'Amber/White', 'Positive Momentary'),
 ('Magnetic', 'Amber/White', 'Negative/Ground Momentary'),
 ('Magnetic', 'Green', 'Positive Momentary'),
 ('Magnetic', 'Green', 'Negative/Ground Momentary'),
 ('Magnetic', 'Red', 'Positive Momentary'),
 ('Magnetic', 'Red', 'Negative/Ground Momentary'),
 ('Magnetic', 'Amber', 'Positive Momentary'),
 ('Magnetic', 'Amber', 'Negative/Ground Momentary'),
 ('Magnetic', 'Red/Amber', 'Positive Momentary'),
 ('Magnetic', 'Red/Amber', 'Negative/Ground Momentary'),
 ('Permanent', 'Blue', 'Positive Momentary'),
 ('Permanent', 'Blue', 'N

## Functions 

In [67]:
skews = pd.DataFrame(combos, columns=['Magnetic/Permanent', 'Color', 'Positive or Neg/Ground'])

# Short codes for each option value; values without a code are left unchanged
color_codes = {'Blue': 'B', 'Red/Blue': 'RB', 'Blue/White': 'BW', 'Amber/White': 'AW',
               'Red/White': 'RW', 'Green': 'G', 'Red': 'R', 'Amber': 'A', 'Red/Amber': 'RA'}
mount_codes = {'Magnetic': 'M', 'Permanent': 'P'}
moment_codes = {'Positive Momentary': 'P', 'Negative/Ground Momentary': 'NG'}

skews['Color'] = skews['Color'].map(lambda x: color_codes.get(x, x))
skews['Magnetic/Permanent'] = skews['Magnetic/Permanent'].map(lambda x: mount_codes.get(x, x))
skews['Positive or Neg/Ground'] = skews['Positive or Neg/Ground'].map(lambda x: moment_codes.get(x, x))

skews.insert(0, 'prodid', '3474')
# Concatenate prodid and the option codes row by row to form the SKU
skews['sku'] = skews.agg(''.join, axis=1)
skews['description'] = combos
skews

Unnamed: 0,prodid,Magnetic/Permanent,Color,Positive or Neg/Ground,sku,description
0,3474,M,B,P,3474MBP,"(Magnetic, Blue, Positive Momentary)"
1,3474,M,B,NG,3474MBNG,"(Magnetic, Blue, Negative/Ground Momentary)"
2,3474,M,RB,P,3474MRBP,"(Magnetic, Red/Blue, Positive Momentary)"
3,3474,M,RB,NG,3474MRBNG,"(Magnetic, Red/Blue, Negative/Ground Momentary)"
4,3474,M,BW,P,3474MBWP,"(Magnetic, Blue/White, Positive Momentary)"
5,3474,M,BW,NG,3474MBWNG,"(Magnetic, Blue/White, Negative/Ground Momentary)"
6,3474,M,RW,P,3474MRWP,"(Magnetic, Red/White, Positive Momentary)"
7,3474,M,RW,NG,3474MRWNG,"(Magnetic, Red/White, Negative/Ground Momentary)"
8,3474,M,AW,P,3474MAWP,"(Magnetic, Amber/White, Positive Momentary)"
9,3474,M,AW,NG,3474MAWNG,"(Magnetic, Amber/White, Negative/Ground Momentary)"


### More samples - products 2 and 68

Here we see that the productid alone is not enough to identify a product: productid 68, for example, is shared by several catalog entries. It is still adequate for the samples below. The catalogid is the unique product identifier that will be used when the CSV files are finally merged.
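
One detail to watch in that merge is the key type: catalogid was cast to a string in the products dataframe but is a float in the options data. The sketch below uses toy stand-ins for both tables (the names `products` and `sku_table` are assumptions) and brings the keys to the same form before a left merge.

```python
import pandas as pd

# Toy stand-ins for df_prod and a generated SKU table
products = pd.DataFrame({
    "catalogid": ["3474", "3479"],  # str, as after astype(str)
    "name": ["Sho-Me Universal On/Off/Flash Switch",
             "Whelen Warning Par36 Fairing Light"],
})
sku_table = pd.DataFrame({
    "catalogid": [3479.0, 3479.0],  # float, as in the options file
    "sku": ["3479BWCLEL", "3479BWCLFL"],
})

# Bring both keys to the same string form before merging: 3479.0 -> '3479'
sku_table["catalogid"] = sku_table["catalogid"].astype(int).astype(str)

# A left merge keeps products that have no generated SKU (sku is NaN for them)
merged = products.merge(sku_table, on="catalogid", how="left")
print(merged[["catalogid", "name", "sku"]])
```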

In [68]:
df_sample = df_option_list.iloc[17:36]
df_sample = df_sample.ffill()

In [69]:
df_prod[df_prod['id'] == '2'][['id','name']]

Unnamed: 0,id,name
6,2,Whelen Warning Par36 Fairing Light


In [70]:
df_prod[df_prod['id'] == '68'][['id','name','catalogid']]

Unnamed: 0,id,name,catalogid
9,68,Whelen 400 Series Lens,3483
12,68,Whelen 500 Series Lens for LED and Halogen with Optics,3487
14,68,Whelen 600 Series Lens,3490
17,68,Whelen 700 Series Optic Lens,3493
20,68,Whelen 900 Series Optic Lens,3496
704,68,Whelen ION Clear Replacement Lens,6128
819,68,Whelen 700 Series Non-Optic Lens,6387


In [71]:
df_sample

Unnamed: 0,optionid,productid,catalogid,category_id,featurecaption,featuretype,featureid,featurename,partnumber
17,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,11453.0,Blue with Clear Lens,B00ZCR
18,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,11454.0,White with Clear Lens,C00ZCR
19,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,11455.0,Red with Clear Lens,R00ZCR
20,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,11456.0,Amber with Clear Lens,A00ZCR
21,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,18738.0,Red/Blue with Clear Lens EXTENDED LENS ONLY,J00ZCR
22,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,18739.0,Amber with Amber Lens,A00ZAR
23,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,18740.0,Blue with Blue Lens,B00ZBR
24,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,18741.0,Red with Red Lens,R00ZRR
25,4858.0,2,3479.0,0.0,Choose your Lens Style,Dropdown,24256.0,Extended Lens,E
26,4858.0,2,3479.0,0.0,Choose your Lens Style,Dropdown,24257.0,Flat Lens,F


In [72]:
df_prod.head()

Unnamed: 0,catalogid,id,name,image1,thumbnail,price,categories,description,keywords,stock
0,3457,RB6T,Whelen Dual Reflector Rota-Beam Beacon,assets/images/whelen-RB6-dual-rotabeam-beacon-amber.jpg,assets/images/thumbnails/whelen-rb6-dual-rotabeam-beacon-amber_thumbnail.jpg,227.99,360° Beacons/360° Beacons,"The RB6T is a versatile, mid-sized beacon, perfect for utility, maintenance, fire and rescue app...",,99995.0
1,3465,11.1005SF,Sho-Me Universal Strobe-Style LED Flasher,assets/images/able-2-sho-me-strobe-style-flasher-111005SF.jpg,assets/images/thumbnails/able-2-sho-me-strobe-style-flasher-111005sf_thumbnail.jpg,26.99,Flasher Modules,This Able 2/Sho-Me Universal Strobe-Style LED Flasher features flash patterns in your choice of ...,"Able 2, Universal, LED, Flasher",99971.0
2,3466,11.1032,Sho-Me Micro Switch with Built-In LED Flasher,assets/images/able-2-sho-me-switch-111032.jpg,assets/images/thumbnails/able-2-sho-me-switch-111032_thumbnail.jpg,52.99,Switches & Controllers,This touch pad switch features a built in LED flasher and is the smallest switch of its kind on ...,"Able 2, sho me, Micro Switch, Built in, LED, Flasher",99999.0
3,3474,11.1002,Sho-Me Universal On/Off/Flash Switch,assets/images/able2-sho-me-on-off-mometary-switch-11-1002.jpg,assets/images/thumbnails/able2-sho-me-on-off-mometary-switch-11-1002_thumbnail.jpg,28.99,Switches & Controllers,This Sho-Me universal on/off switch features a momentary switch which will work with most any po...,"Able 2, show me, shome, On, Off, Momentary, Switch",99992.0
4,3475,11.120,Sho-Me Low-Profile LED Mini Lightbar,assets/images/able2-sho-me-on-low-profile-mini-lightbar-11-1200-red-blue.jpg,assets/images/thumbnails/able2-sho-me-on-low-profile-mini-lightbar-11-1200-red-blue_thumbnail.jpg,349.99,Lightbars/Mini Lightbars,The Able 2 Low-Profile LED Mini Lightbar offers 360 degree light output with a continuous loop o...,"Able 2, Low-Profile, LED, Mini, Lightbar",99998.0


In [73]:
df_prod[df_prod['id'] == '2'][['id','name']]

Unnamed: 0,id,name
6,2,Whelen Warning Par36 Fairing Light


In [74]:
df_sample[df_sample['optionid'] == 2459.0]

Unnamed: 0,optionid,productid,catalogid,category_id,featurecaption,featuretype,featureid,featurename,partnumber
17,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,11453.0,Blue with Clear Lens,B00ZCR
18,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,11454.0,White with Clear Lens,C00ZCR
19,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,11455.0,Red with Clear Lens,R00ZCR
20,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,11456.0,Amber with Clear Lens,A00ZCR
21,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,18738.0,Red/Blue with Clear Lens EXTENDED LENS ONLY,J00ZCR
22,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,18739.0,Amber with Amber Lens,A00ZAR
23,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,18740.0,Blue with Blue Lens,B00ZBR
24,2459.0,2,3479.0,0.0,Choose your LED and Lens Color,Dropdown,18741.0,Red with Red Lens,R00ZRR


In [37]:
df_sample[df_sample['optionid'] == 4858.0]

Unnamed: 0,optionid,productid,catalogid,category_id,featurecaption,featuretype,featureid,featurename,partnumber
25,4858.0,2,3479.0,0.0,Choose your Lens Style,Dropdown,24256.0,Extended Lens,E
26,4858.0,2,3479.0,0.0,Choose your Lens Style,Dropdown,24257.0,Flat Lens,F


In [75]:
from itertools import product

# The two option lists for product 2
options1 = list(df_sample[df_sample['optionid'] == 2459.0]['featurename'])
options2 = list(df_sample[df_sample['optionid'] == 4858.0]['featurename'])

# Cartesian product: one tuple per combination, taking one value from each list
combos = list(product(options1, options2))

print(len(combos))
combos

16


[('Blue with Clear Lens', 'Extended Lens'),
 ('Blue with Clear Lens', 'Flat Lens'),
 ('White with Clear Lens', 'Extended Lens'),
 ('White with Clear Lens', 'Flat Lens'),
 ('Red with Clear Lens', 'Extended Lens'),
 ('Red with Clear Lens', 'Flat Lens'),
 ('Amber with Clear Lens', 'Extended Lens'),
 ('Amber with Clear Lens', 'Flat Lens'),
 ('Red/Blue with Clear Lens EXTENDED LENS ONLY', 'Extended Lens'),
 ('Red/Blue with Clear Lens EXTENDED LENS ONLY', 'Flat Lens'),
 ('Amber with Amber Lens', 'Extended Lens'),
 ('Amber with Amber Lens', 'Flat Lens'),
 ('Blue with Blue Lens', 'Extended Lens'),
 ('Blue with Blue Lens', 'Flat Lens'),
 ('Red with Red Lens', 'Extended Lens'),
 ('Red with Red Lens', 'Flat Lens')]
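
The full Cartesian product pairs 'Red/Blue with Clear Lens EXTENDED LENS ONLY' with 'Flat Lens', which the option's own label appears to rule out. If the client confirms that reading, such pairs could be filtered out before SKUs are generated. The rule in `is_allowed` below is a hypothetical example of such a filter, not a confirmed business rule.

```python
from itertools import product

colors = ["Blue with Clear Lens", "Red/Blue with Clear Lens EXTENDED LENS ONLY"]
styles = ["Extended Lens", "Flat Lens"]


def is_allowed(color: str, style: str) -> bool:
    # Hypothetical rule: a color labelled 'EXTENDED LENS ONLY' cannot be
    # paired with any other lens style.
    return not ("EXTENDED LENS ONLY" in color and style != "Extended Lens")


valid = [(c, s) for c, s in product(colors, styles) if is_allowed(c, s)]
print(len(valid))  # 3
```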

## Functions

In [76]:
def replace_words_with_first_letter(text):
    """Abbreviate text to the upper-cased first letter of each whitespace-separated word."""
    words = text.split()
    initials = [word[0].upper() for word in words]
    return ''.join(initials)

In [77]:
skew_sample = pd.DataFrame(combos, columns=['color_lens', 'Ext_or_Fl'])

skew_sample['color_lens'] = skew_sample['color_lens'].apply(replace_words_with_first_letter)
skew_sample['Ext_or_Fl'] = skew_sample['Ext_or_Fl'].apply(replace_words_with_first_letter)
skew_sample.insert(0, 'prodid', '3479')
# Concatenate prodid and the option codes row by row to form the SKU
skew_sample['sku'] = skew_sample.agg(''.join, axis=1)
skew_sample['description'] = combos
skew_sample

Unnamed: 0,prodid,color_lens,Ext_or_Fl,sku,description
0,3479,BWCL,EL,3479BWCLEL,"(Blue with Clear Lens, Extended Lens)"
1,3479,BWCL,FL,3479BWCLFL,"(Blue with Clear Lens, Flat Lens)"
2,3479,WWCL,EL,3479WWCLEL,"(White with Clear Lens, Extended Lens)"
3,3479,WWCL,FL,3479WWCLFL,"(White with Clear Lens, Flat Lens)"
4,3479,RWCL,EL,3479RWCLEL,"(Red with Clear Lens, Extended Lens)"
5,3479,RWCL,FL,3479RWCLFL,"(Red with Clear Lens, Flat Lens)"
6,3479,AWCL,EL,3479AWCLEL,"(Amber with Clear Lens, Extended Lens)"
7,3479,AWCL,FL,3479AWCLFL,"(Amber with Clear Lens, Flat Lens)"
8,3479,RWCLELO,EL,3479RWCLELOEL,"(Red/Blue with Clear Lens EXTENDED LENS ONLY, Extended Lens)"
9,3479,RWCLELO,FL,3479RWCLELOFL,"(Red/Blue with Clear Lens EXTENDED LENS ONLY, Flat Lens)"


In [78]:
list(skew_sample['sku'])

['3479BWCLEL',
 '3479BWCLFL',
 '3479WWCLEL',
 '3479WWCLFL',
 '3479RWCLEL',
 '3479RWCLFL',
 '3479AWCLEL',
 '3479AWCLFL',
 '3479RWCLELOEL',
 '3479RWCLELOFL',
 '3479AWALEL',
 '3479AWALFL',
 '3479BWBLEL',
 '3479BWBLFL',
 '3479RWRLEL',
 '3479RWRLFL']
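
The 16 SKUs above are all distinct, but initials-based codes can collide: 'Red/Blue' counts as one word and abbreviates to 'R', just like 'Red'. A quick uniqueness check is therefore worth running on every generated list. In the sketch below, `initials` mirrors replace_words_with_first_letter, and the second option name is a hypothetical variant of the listed one with its 'EXTENDED LENS ONLY' suffix dropped.

```python
from collections import Counter


def initials(text: str) -> str:
    # Same idea as replace_words_with_first_letter: first letter of each word
    return "".join(word[0].upper() for word in text.split())


options = [
    "Red with Clear Lens",
    "Red/Blue with Clear Lens",  # hypothetical: suffix removed to show a collision
    "Amber with Clear Lens",
]
codes = [initials(name) for name in options]

# Any code that occurs more than once would produce duplicate SKUs
duplicates = [code for code, n in Counter(codes).items() if n > 1]
print(duplicates)  # ['RWCL']
```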

In [79]:
df_sample[df_sample['optionid'] == 2462.0]
df_sample[df_sample['optionid'] == 2463.0]
df_sample[df_sample['optionid'] == 2464.0]

Unnamed: 0,optionid,productid,catalogid,category_id,featurecaption,featuretype,featureid,featurename,partnumber
34,2464.0,68,3483.0,0.0,Choose your Model,Dropdown,11473.0,Non-Optic,725
35,2464.0,68,3483.0,0.0,Choose your Model,Dropdown,11474.0,Optic,726


In [43]:
from itertools import product

# The three option lists for catalogid 3483
options1 = list(df_sample[df_sample['optionid'] == 2462.0]['featurename'])
options2 = list(df_sample[df_sample['optionid'] == 2463.0]['featurename'])
options3 = list(df_sample[df_sample['optionid'] == 2464.0]['featurename'])

# Cartesian product: one tuple per combination, taking one value from each list
combos = list(product(options1, options2, options3))

print(len(combos))
combos

24


[('LED', 'Amber', 'Non-Optic'),
 ('LED', 'Amber', 'Optic'),
 ('LED', 'Blue', 'Non-Optic'),
 ('LED', 'Blue', 'Optic'),
 ('LED', 'Clear', 'Non-Optic'),
 ('LED', 'Clear', 'Optic'),
 ('LED', 'Red', 'Non-Optic'),
 ('LED', 'Red', 'Optic'),
 ('Strobe', 'Amber', 'Non-Optic'),
 ('Strobe', 'Amber', 'Optic'),
 ('Strobe', 'Blue', 'Non-Optic'),
 ('Strobe', 'Blue', 'Optic'),
 ('Strobe', 'Clear', 'Non-Optic'),
 ('Strobe', 'Clear', 'Optic'),
 ('Strobe', 'Red', 'Non-Optic'),
 ('Strobe', 'Red', 'Optic'),
 ('Halogen', 'Amber', 'Non-Optic'),
 ('Halogen', 'Amber', 'Optic'),
 ('Halogen', 'Blue', 'Non-Optic'),
 ('Halogen', 'Blue', 'Optic'),
 ('Halogen', 'Clear', 'Non-Optic'),
 ('Halogen', 'Clear', 'Optic'),
 ('Halogen', 'Red', 'Non-Optic'),
 ('Halogen', 'Red', 'Optic')]
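
The cell above builds one list per `optionid` by hand. A minimal sketch of the same idea for any number of option groups follows, assuming rows shaped like `df_option_list`, where only the first row of each group carries its `optionid`. The function name `option_combinations` and the toy rows are hypothetical and are not part of the notebook.

```python
from itertools import product

import pandas as pd


def option_combinations(options: pd.DataFrame) -> list:
    """Return every combination of feature names, one choice per optionid."""
    filled = options.copy()
    # Only the first row of each option group carries its optionid; fill it down.
    filled["optionid"] = filled["optionid"].ffill()
    # sort=False keeps the groups in the order they appear in the data.
    groups = [
        list(group["featurename"])
        for _, group in filled.groupby("optionid", sort=False)
    ]
    # Cartesian product: one tuple per way of picking one feature from each group.
    return list(product(*groups))


# Toy rows for illustration only (hypothetical subset of an option list).
toy = pd.DataFrame(
    {
        "optionid": [2467.0, None, None, 2468.0, None],
        "featurename": ["Halogen", "LED", "Strobe", "Red", "Amber"],
    }
)
toy_combos = option_combinations(toy)
print(len(toy_combos))  # 3 bulbs x 2 colors = 6
```

Grouping on the forward-filled `optionid` avoids hard-coding the ids, so the same call would cover products with two, three, or more option groups.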

In [82]:
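# combos must hold the 3-tuples built from optionids 2462-2464; if it still holds
# the 2-tuples from product 3479, pd.DataFrame raises the ValueError shown below
skew_sample_2 = pd.DataFrame(combos, columns=['light', 'color', 'optic'])

# Split 'Non-Optic' into two words so that it abbreviates to 'NO'
skew_sample_2['optic'] = skew_sample_2['optic'].str.replace('-', ' ')
skew_sample_2 = skew_sample_2.applymap(replace_words_with_first_letter)
skew_sample_2['prodid'] = '3483'
skew_sample_2 = skew_sample_2[['prodid', 'light', 'color', 'optic']]
# Concatenate the string columns of each row into the SKU
skew_sample_2['sku'] = skew_sample_2.sum(axis=1)
skew_sample_2['description'] = combos
skew_sample_2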

ValueError: 3 columns passed, passed data had 2 columns

In [83]:
df_3487 = df_option_list.iloc[50:61]
df_3487

Unnamed: 0,optionid,productid,catalogid,category_id,featurecaption,featuretype,featureid,featurename,partnumber
50,2467.0,68.0,3487.0,0.0,Choose your Bulb,Dropdown,11487.0,Halogen,-196
51,,,,,,,11488.0,LED,-196
52,,,,,,,11489.0,Strobe,STROBE-
53,2468.0,68.0,3487.0,0.0,Choose your Lens Color,Dropdown,11490.0,Red,-50A
54,,,,,,,11491.0,Amber,-10A
55,,,,,,,11492.0,Blue,-20A
56,,,,,,,11493.0,Clear,-30A
57,2469.0,68.0,3487.0,0.0,Choose your Model,Dropdown,11494.0,Optic - Halogen,-1963237
58,,,,,,,11495.0,20? Optic - LED 5MM w/o Flange,-1963583
59,,,,,,,29439.0,20? Optic - TIR6 Super LED,-1963583


In [46]:
from itertools import product

df_3487 = df_option_list.iloc[50:61]
# Only the first row of each option group is populated; forward-fill the rest
df_3487 = df_3487.ffill()
df_2467 = df_3487[df_3487['optionid'] == 2467.0]
df_2468 = df_3487[df_3487['optionid'] == 2468.0]
df_2469 = df_3487[df_3487['optionid'] == 2469.0]

# Feature names for each of the three option groups
options1 = list(df_2467['featurename'])
options2 = list(df_2468['featurename'])
options3 = list(df_2469['featurename'])
# Cartesian product: every way of picking one feature from each group
permutations = list(product(options1, options2, options3))
combos = list(permutations)

print(len(combos))
combos


skew_sample_3487 = pd.DataFrame(combos, columns=['bulb', 'color_lens', 'model'])

# Strip punctuation before abbreviating; regex=False treats '?' as a literal character
skew_sample_3487['model'] = (
    skew_sample_3487['model']
    .str.replace('-', ' ', regex=False)
    .str.replace('?', '', regex=False)
    .str.replace('w/', 'w', regex=False)
)
skew_sample_3487 = skew_sample_3487.applymap(replace_words_with_first_letter)
skew_sample_3487['prodid'] = '3487'
skew_sample_3487 = skew_sample_3487[['prodid', 'bulb', 'color_lens', 'model']]
# Concatenate the string columns of each row into the SKU
skew_sample_3487['sku'] = skew_sample_3487.sum(axis=1)
skew_sample_3487['description'] = combos
skew_sample_3487

48



Unnamed: 0,prodid,bulb,color_lens,model,sku,description
0,3487,H,R,OH,3487HROH,"(Halogen, Red, Optic - Halogen)"
1,3487,H,R,2OL5WF,3487HR2OL5WF,"(Halogen, Red, 20? Optic - LED 5MM w/o Flange)"
2,3487,H,R,2OTSL,3487HR2OTSL,"(Halogen, Red, 20? Optic - TIR6 Super LED)"
3,3487,H,R,2OL5WF,3487HR2OL5WF,"(Halogen, Red, 20? Optic - LED 5MM w/ Flange)"
4,3487,H,A,OH,3487HAOH,"(Halogen, Amber, Optic - Halogen)"
5,3487,H,A,2OL5WF,3487HA2OL5WF,"(Halogen, Amber, 20? Optic - LED 5MM w/o Flange)"
6,3487,H,A,2OTSL,3487HA2OTSL,"(Halogen, Amber, 20? Optic - TIR6 Super LED)"
7,3487,H,A,2OL5WF,3487HA2OL5WF,"(Halogen, Amber, 20? Optic - LED 5MM w/ Flange)"
8,3487,H,B,OH,3487HBOH,"(Halogen, Blue, Optic - Halogen)"
9,3487,H,B,2OL5WF,3487HB2OL5WF,"(Halogen, Blue, 20? Optic - LED 5MM w/o Flange)"
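
Rows 1 and 3 of the output above receive the same SKU, `3487HR2OL5WF`, because "w/o Flange" and "w/ Flange" both abbreviate to `WF`. Since the goal is one unique SKU per combination, a quick duplicate check is worth running before the list is used. The sketch below uses a few SKUs copied from that output; the variable names are illustrative only.

```python
from collections import Counter

# SKUs copied from rows 0-3 of the output above.
skus = ["3487HROH", "3487HR2OL5WF", "3487HR2OTSL", "3487HR2OL5WF"]

# Keep only the SKUs that occur more than once.
duplicates = {sku: count for sku, count in Counter(skus).items() if count > 1}
print(duplicates)  # {'3487HR2OL5WF': 2}
```

On the full dataframe, `skew_sample_3487[skew_sample_3487['sku'].duplicated(keep=False)]` would list the colliding rows together with their descriptions.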


In [56]:
skew_6 = skew_sample_2[['prodid','sku','description']]
skew_2 = skew_sample[['prodid','sku','description']]
skew_11 = skews[['prodid','sku','description']]
skew_3487 = skew_sample_3487[['prodid','sku','description']]
# Copy the slice so that adding a column does not trigger SettingWithCopyWarning
skew_rb = df_RB6T[['sku','description']].copy()
skew_rb['prodid'] = '3457'
skew_sample_list = pd.concat([skew_6, skew_2, skew_11, skew_rb, skew_3487])
skew_sample_list.columns = ['prodid','sku','option_description']

def join_tuple_with_underscore(t):
    """Join the option names of one combination, e.g. ('LED', 'Amber') -> 'LED_Amber'."""
    return '_'.join(map(str, t))

skew_sample_list['option_description'] = skew_sample_list['option_description'].apply(join_tuple_with_underscore)
skew_sample_list


Unnamed: 0,prodid,sku,option_description
0,3483,3483LANO,LED_Amber_Non-Optic
1,3483,3483LAO,LED_Amber_Optic
2,3483,3483LBNO,LED_Blue_Non-Optic
3,3483,3483LBO,LED_Blue_Optic
4,3483,3483LCNO,LED_Clear_Non-Optic
5,3483,3483LCO,LED_Clear_Optic
6,3483,3483LRNO,LED_Red_Non-Optic
7,3483,3483LRO,LED_Red_Optic
8,3483,3483SANO,Strobe_Amber_Non-Optic
9,3483,3483SAO,Strobe_Amber_Optic


In [51]:
df_prod['catalogid'] = df_prod['catalogid'].astype(str)

In [59]:
product_option_list_sample = skew_sample_list.merge(df_prod, how='inner', left_on='prodid',right_on='catalogid')

In [60]:
product_option_list_sample.to_csv('product_option_list_sample.csv')
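
The inner merge above silently drops any SKU whose `prodid` has no matching `catalogid`. One way to surface such rows, sketched here on made-up data, is a left merge with `indicator=True`. The two small frames and their values are hypothetical stand-ins for `skew_sample_list` and `df_prod`.

```python
import pandas as pd

# Hypothetical stand-ins for skew_sample_list and df_prod; keys are strings in both.
skus = pd.DataFrame(
    {"prodid": ["3483", "3483", "9999"], "sku": ["3483LANO", "3483LAO", "9999X"]}
)
products = pd.DataFrame({"catalogid": ["3483"], "name": ["Example product"]})

# A left merge keeps every SKU; the _merge column records whether a product matched.
checked = skus.merge(
    products, how="left", left_on="prodid", right_on="catalogid", indicator=True
)
unmatched = checked[checked["_merge"] == "left_only"]
print(unmatched["sku"].tolist())  # ['9999X']
```

An empty `unmatched` frame would suggest that the inner merge loses nothing. Casting `catalogid` to `str` first, as done above, matters because the SKU frames store `prodid` as strings.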