# Suction Cup Selector Project

This is the notebook for the suction cup selector project.

## Download the project

If you have git installed, you should be able to pull the project down from the repository using the command:

```bash
git clone <url>
```

Once you have cloned the project, you can switch to an existing branch or create your own:

```bash
git checkout <branchname>            # switch to an existing branch
git checkout -b <yourownbranchname>  # create a new branch and switch to it
```

To stage your changes and record them in the git history, run:

```bash
git add .                        # stage all changes
git status                       # review what is staged
git commit -m "commit message"   # record the staged changes as a commit
```

If you want to push your changes to the repository, run:

```bash
git push -u origin <branchname>
```

To submit your changes for review, open a pull request on GitHub.


## Environment setup
For this project I chose Python 3.9 with pandas and Jupyter.

In [2]:
import sys
print(sys.version)

3.9.7 (default, Sep 16 2021, 08:50:36) 
[Clang 10.0.0 ]


To isolate the project's dependencies, create a virtual environment for it. Go to your project folder, open a terminal (or cmd on Windows) and run the following command:

```bash
python3 -m venv myenv
```

To activate the environment, run the following command:
```bash
# for windows
myenv\Scripts\activate

# for Mac/Linux
source myenv/bin/activate
```
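
Once the environment is activated, it can be handy to confirm from Python that the interpreter in use really is the one inside `myenv`. The following is a minimal sketch using only the standard library; the helper name `in_virtualenv` is made up for this example, and the check applies to environments created with `venv` (other tools, such as conda, may behave differently).

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If the first line prints `False`, the notebook kernel is probably still using the system interpreter.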

## Data Processing
The first step of the project is to process the flat files. There are several ways to do this; here we use the pandas `read_csv()` function to load the CSV files into pandas **DataFrame** objects.

In [3]:
import pandas as pd

suctionCups = pd.read_csv('SuctionCups.csv')
graspTypes = pd.read_csv('GraspTypes.csv')
items = pd.read_csv('items.csv')
itemConfigs = pd.read_csv('itemConfigs.csv')
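
As the `items.info()` output further down shows, `sku_no` is parsed as `float64` because some values are missing. One possible alternative, sketched here under the assumption that identifier columns are only ever used as labels, is to ask `read_csv()` to keep them as strings at load time. The CSV text below is a made-up stand-in so the snippet runs without the project files.

```python
import io

import pandas as pd

# A tiny stand-in for items.csv (illustrative values, not project data).
csv_text = """item_id,sku_no,weight
1,100200,0.5
2,,1.2
3,800300,0.1
"""

# Reading identifier columns as strings avoids the float conversion,
# so no trailing ".0" appears even when some SKUs are missing.
sample = pd.read_csv(io.StringIO(csv_text), dtype={"item_id": str, "sku_no": str})

print(sample.dtypes)
print(sample)
```

With this approach the missing SKU stays a null value and can still be removed with `dropna()`.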

#### To see the type of an object, use type()


In [4]:
type(items)

pandas.core.frame.DataFrame

#### To inspect the data we just loaded, here are some common methods and attributes:
* df.head(): Returns the first few rows of the DataFrame.
* df.tail(): Returns the last few rows of the DataFrame.
* df.shape: Returns the dimensions (rows, columns) of the DataFrame.
* df.info(): Provides information about the DataFrame, including column data types and missing values.
* df.describe(): Generates descriptive statistics of numerical columns, such as count, mean, min, max, etc.

#### items object overview

In [5]:
items.head()

| | item_id | sku_no | unit_length | unit_width | unit_height | weight | item_description |
|---|---|---|---|---|---|---|---|
| 0 | 12810 | 24287592.0 | 8.4 | 3.7 | 2.3 | 2.1 | SILK PURE ALMOND UNSWT VAN |
| 1 | 19327 | 1266017.0 | 5.5 | 5.5 | 8.4 | 1.47 | NEO-GEL 48PC TUB BLU |
| 2 | 24874 | 24529912.0 | 10.0 | 8.05 | 2.6 | 0.72 | FULL SIZE HOT GLUE GUN |
| 3 | 15205 | 565284.0 | 6.85 | 2.7 | 2.65 | 1.285 | TAPE DISPENSER |
| 4 | 13444 | 2610177.0 | 9.2 | 8.1 | 6.7 | 1.3 | DESKTOP DRAWER SYSTEM SMALL |


In [6]:
items.shape

(7622, 7)

In [7]:
items.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7622 entries, 0 to 7621
Data columns (total 7 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   item_id           7622 non-null   int64  
 1   sku_no            7114 non-null   float64
 2   unit_length       7622 non-null   float64
 3   unit_width        7622 non-null   float64
 4   unit_height       7622 non-null   float64
 5   weight            7622 non-null   float64
 6   item_description  7622 non-null   object 
dtypes: float64(5), int64(1), object(1)
memory usage: 417.0+ KB


In [8]:
items.describe()

| | item_id | sku_no | unit_length | unit_width | unit_height | weight |
|---|---|---|---|---|---|---|
| count | 7622.0 | 7114.0 | 7622.0 | 7622.0 | 7622.0 | 7622.0 |
| mean | 9634.173183 | 6313149.0 | 6.342267 | 4.147844 | 1.943269 | 0.552853 |
| std | 7048.288945 | 9914077.0 | 1.854861 | 1.580922 | 1.478202 | 0.701571 |
| min | 744.0 | 12203.0 | 0.1 | 0.2 | 0.0 | 0.0012 |
| 25% | 4099.5 | 500045.2 | 5.1 | 3.0 | 0.9 | 0.125 |
| 50% | 7782.5 | 831648.5 | 6.1 | 3.8 | 1.4 | 0.3 |
| 75% | 13471.5 | 2735132.0 | 7.6 | 5.0 | 2.6 | 0.7 |
| max | 26945.0 | 24563190.0 | 12.8 | 11.1 | 10.1 | 9.2 |


#### graspTypes object overview

In [9]:
graspTypes.head()

| | id | name | description |
|---|---|---|---|
| 0 | 0 | suction_only | suction only |
| 1 | 1 | default | suction + fingers |
| 2 | 2 | stabilized | stabilized grasp |


In [10]:
graspTypes.shape

(3, 3)

In [11]:
graspTypes.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   id           3 non-null      int64 
 1   name         3 non-null      object
 2   description  3 non-null      object
dtypes: int64(1), object(2)
memory usage: 200.0+ bytes


#### suctionCups object overview

In [12]:
suctionCups.head(n=10)

| | id | description | name | minDim | maxDim | maxWeight |
|---|---|---|---|---|---|---|
| 0 | 0 | any | any | 0.0 | 0 | 0.0 |
| 1 | 1 | small-25mm | swappable_vs_25_nr | 0.25 | 5 | 0.8 |
| 2 | 2 | medium | swappable_b3_bgi34 | 2.0 | 1000 | 1.9 |
| 3 | 3 | large | swappable_vsa_63_nr | 3.0 | 1000 | 6.6 |
| 4 | 4 | bag | swappable_bgx_48 | 1.9 | 1000 | 2.42 |
| 5 | 5 | small-18mm | swappable_vs_18_nr | 0.18 | 5 | 0.8 |


In [13]:
suctionCups.shape

(6, 6)

In [14]:
suctionCups.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Data columns (total 6 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   id           6 non-null      int64  
 1   description  6 non-null      object 
 2   name         6 non-null      object 
 3   minDim       6 non-null      float64
 4   maxDim       6 non-null      int64  
 5   maxWeight    6 non-null      float64
dtypes: float64(2), int64(2), object(2)
memory usage: 416.0+ bytes


#### itemConfigs object overview

In [15]:
itemConfigs.head(n=100)

| | item_id | suction_cup_id | name | arm_config | name.1 |
|---|---|---|---|---|---|
| 0 | 823 | 5 | swappable_vs_18_nr | 1 | default |
| 1 | 763 | 0 | any | 1 | default |
| 2 | 7116 | 3 | swappable_vsa_63_nr | 1 | default |
| 3 | 766 | 0 | any | 1 | default |
| 4 | 767 | 0 | any | 0 | suction_only |
| ... | ... | ... | ... | ... | ... |
| 95 | 13424 | 3 | swappable_vsa_63_nr | 1 | default |
| 96 | 3763 | 4 | swappable_bgx_48 | 2 | stabilized |
| 97 | 3926 | 4 | swappable_bgx_48 | 1 | default |
| 98 | 4878 | 4 | swappable_bgx_48 | 2 | stabilized |


In [16]:
itemConfigs.shape

(10819, 5)

In [17]:
itemConfigs.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10819 entries, 0 to 10818
Data columns (total 5 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   item_id         10819 non-null  int64 
 1   suction_cup_id  10819 non-null  int64 
 2   name            10819 non-null  object
 3   arm_config      10819 non-null  int64 
 4   name.1          10819 non-null  object
dtypes: int64(3), object(2)
memory usage: 422.7+ KB


In [18]:
itemConfigs['name'].value_counts()

name
swappable_bgx_48       3775
swappable_vs_25_nr     3696
swappable_vsa_63_nr    1910
any                     812
swappable_vs_18_nr      626
Name: count, dtype: int64

In [19]:
itemConfigs.groupby('item_id').filter(lambda x: len(x)>1).shape

(0, 5)
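
The `groupby(...).filter(...)` call above keeps only the rows whose `item_id` appears more than once, so an empty result means every item has exactly one configuration row. The same check can be written with `duplicated(keep=False)`. The sketch below uses a small made-up frame in place of `itemConfigs` so that it runs on its own.

```python
import pandas as pd

# Illustrative frame; in the notebook this would be itemConfigs.
configs = pd.DataFrame({
    "item_id": [823, 763, 7116, 763],
    "suction_cup_id": [5, 0, 3, 1],
})

# keep=False marks every row whose item_id occurs more than once.
dupes = configs[configs["item_id"].duplicated(keep=False)]
print(dupes)

# The groupby/filter form selects the same rows.
same = configs.groupby("item_id").filter(lambda g: len(g) > 1)
print(len(dupes) == len(same))
```

On larger tables, `duplicated()` usually avoids the per-group Python call that `filter()` with a lambda makes.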

## Data Cleaning

After a preliminary inspection, it is evident that the dataset contains incorrect data types and null values. Text fields also often contain noisy punctuation, such as quotes and semicolons, which should be removed. Cleaning the data at this stage is important so that such records do not affect the later steps.

### Item object
* Many item rows don't have a SKU number (`sku_no` has 7114 non-null values out of 7622 rows).
* Item ID should be a string, since we are not going to do numeric manipulation on it.
* SKU number should be a string field without the trailing `.0`.
* `item_description` is a text field, so it deserves a closer look.
* Length/width/height don't quite fit our purpose. Sorting them into `dim1`, `dim2` and `dim3` in descending order makes more sense.

In [20]:
# drop null
items.dropna(subset=['sku_no'], inplace=True)

In [21]:
items.shape

(7114, 7)

In [22]:
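# reformat sku_no field: cast to int first so that only the fractional ".0" is dropped
# (str.rstrip('.0') would also strip trailing zeros that belong to the SKU itself)
items['sku_no'] = items['sku_no'].astype('int64').astype(str)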
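
A point worth checking when removing a trailing `.0`: `str.rstrip` treats its argument as a set of characters, not as a suffix, so it keeps stripping `.` and `0` until it meets a different character. The sketch below, on three made-up SKU values, compares that behaviour with a cast through `int64` (which is safe here only because the null SKUs were dropped first).

```python
import pandas as pd

sku = pd.Series([800301.0, 800300.0, 1266017.0])

# rstrip(".0") strips any run of '.' and '0' characters from the right,
# so a SKU that itself ends in zeros loses digits.
stripped = sku.astype(str).str.rstrip(".0")

# Casting through int64 drops only the fractional part.
cast = sku.astype("int64").astype(str)

print(stripped.tolist())  # ['800301', '8003', '1266017']
print(cast.tolist())      # ['800301', '800300', '1266017']
```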

In [23]:
items.sample(5)

| | item_id | sku_no | unit_length | unit_width | unit_height | weight | item_description |
|---|---|---|---|---|---|---|---|
| 3790 | 6683 | 800301 | 5.9 | 5.3 | 1.3 | 0.2 | PAD COLD COMPRESS 4X5 |
| 756 | 24900 | 24526138 | 7.0 | 4.4 | 2.4 | 0.585 | P2 910/912 ONLIN ENRL KIT |
| 3808 | 4955 | 889545 | 6.95 | 5.4 | 0.3 | 0.13 | NO HEAT LUGGAGE TAG W/LOOP CLR |
| 3520 | 4516 | 2618975 | 6.0 | 2.5 | 0.34 | 0.1 | SPLS UNV SLIM STYLUS BL |
| 1996 | 12518 | 388687 | 7.7 | 5.0 | 3.6 | 2.2 | RUBBERBAND #84-1LB |


In [24]:
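# check punctuation
import re
import string

print(string.punctuation)
# escape the punctuation so that characters such as ] and \ are matched literally inside [...]
mask = items['item_description'].str.contains(f"[{re.escape(string.punctuation)}]")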

!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~


In [25]:
items[mask].sample(10)

| | item_id | sku_no | unit_length | unit_width | unit_height | weight | item_description |
|---|---|---|---|---|---|---|---|
| 2232 | 26337 | 24540519 | 6.9 | 3.7 | 1.7 | 0.6 | 130W GAN USB-C 4-PORT CHARGER |
| 5056 | 24414 | 388754 | 5.95 | 4.0 | 0.45 | 0.035 | 32GB PRO USB 3.0 |
| 932 | 4946 | 2498461 | 8.9 | 8.0 | 0.3 | 0.3 | STAPLES BLUE MOUSE PAD 2/PACK |
| 4242 | 26008 | 2396606 | 5.5 | 3.7 | 0.01 | 0.01 | 64GB STORENGO 2 USB-C BLU |
| 4400 | 1019 | 504704 | 2.7 | 2.5 | 0.7 | 0.1 | TAPE HIGHLAND INV REF .5X1296 |
| 5291 | 5356 | 24411834 | 7.5 | 5.38 | 0.18 | 0.11 | NORTON AV+ AR WEB |
| 5728 | 6884 | 72565 | 3.1 | 2.7 | 0.3 | 0.02 | POST-IT 1/2 FLAG NEON 4PK |
| 1857 | 15990 | 635867 | 6.7 | 3.2 | 2.8 | 0.3 | TEA APPLE CINNAMON 28CT.B |
| 1338 | 3643 | 440642 | 4.0 | 4.0 | 1.3 | 0.3 | LBL C/CODE ALPHA-K ROLL |
| 1835 | 9147 | 778936 | 6.1 | 3.0 | 1.2 | 0.6 | POSTIT NTS 3X3 POP-UP JAIPUR |


In [26]:
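# remove unwanted punctuation (single quote, double quote, ampersand)
unwantedChar = '\'"&'
for c in unwantedChar:
    # regex=False treats each character literally on every pandas version
    items['item_description'] = items['item_description'].str.replace(c, '', regex=False)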
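
The character-by-character loop can also be expressed as a single pass with a regular-expression character class. This is only a sketch, assuming the same three unwanted characters; the sample descriptions are invented for the example.

```python
import re

import pandas as pd

descriptions = pd.Series(["POST-IT 'NEON' 4PK", 'TAPE 1/2" X 36YD', "PEN & PENCIL SET"])

unwanted = "'\"&"
# re.escape guards against characters that are special inside [...].
pattern = f"[{re.escape(unwanted)}]"
cleaned = descriptions.str.replace(pattern, "", regex=True)

print(cleaned.tolist())
```

Note that removing `&` leaves two consecutive spaces behind; whether that matters depends on how the descriptions are used later.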

In [27]:
# reformat length/width/height into dim1 >= dim2 >= dim3
dimCols = ['unit_length', 'unit_width', 'unit_height']
items['dim1'] = items[dimCols].max(axis=1)
items['dim2'] = items[dimCols].median(axis=1)  # the median of three values is the middle one
items['dim3'] = items[dimCols].min(axis=1)
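
Another way to get the three ordered dimensions is to sort each row once with NumPy. The sketch below is an alternative under that assumption, checked on the first two rows shown in `items.head()`; the frame name `demo` is a stand-in for `items`.

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({
    "unit_length": [8.4, 5.5],
    "unit_width": [3.7, 5.5],
    "unit_height": [2.3, 8.4],
})

# Sort each row ascending, then reverse so column 0 holds the largest dimension.
raw = demo[["unit_length", "unit_width", "unit_height"]].to_numpy()
dims = np.sort(raw, axis=1)[:, ::-1]

demo["dim1"] = dims[:, 0]
demo["dim2"] = dims[:, 1]
demo["dim3"] = dims[:, 2]

print(demo[["dim1", "dim2", "dim3"]])
```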

## Suction Cup Selection Logic
In this section, our main focus will be on developing the selection logic. The selection logic consists of a series of conditional statements with expandable rules. To ensure flexibility for future rule additions, we can leverage object-oriented programming (OOP) concepts. By adopting an OOP approach, we can easily incorporate new rules into the existing framework.

### Item Objects
The basis of OOP is the object. A pandas DataFrame provides convenient utilities for data manipulation, but it is not designed for OOP. To implement the selection logic, I convert each item row into an object, which is easier to access later on.

In [28]:
items

| | item_id | sku_no | unit_length | unit_width | unit_height | weight | item_description | dim1 | dim2 | dim3 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 12810 | 24287592 | 8.40 | 3.70 | 2.30 | 2.100 | SILK PURE ALMOND UNSWT VAN | 8.40 | 3.70 | 2.30 |
| 1 | 19327 | 1266017 | 5.50 | 5.50 | 8.40 | 1.470 | NEO-GEL 48PC TUB BLU | 8.40 | 5.50 | 5.50 |
| 2 | 24874 | 24529912 | 10.00 | 8.05 | 2.60 | 0.720 | FULL SIZE HOT GLUE GUN | 10.00 | 8.05 | 2.60 |
| 3 | 15205 | 565284 | 6.85 | 2.70 | 2.65 | 1.285 | TAPE DISPENSER | 6.85 | 2.70 | 2.65 |
| 4 | 13444 | 2610177 | 9.20 | 8.10 | 6.70 | 1.300 | DESKTOP DRAWER SYSTEM SMALL | 9.20 | 8.10 | 6.70 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 7617 | 11997 | 49616 | 5.30 | 3.70 | 1.50 | 0.600 | LAMINATING POUCH BADGE SIZE | 5.30 | 3.70 | 1.50 |
| 7618 | 9779 | 664524 | 6.30 | 3.50 | 1.80 | 0.700 | SDFC MULTIPLICATION 0-12 | 6.30 | 3.50 | 1.80 |
| 7619 | 17560 | 478187 | 7.80 | 5.50 | 2.90 | 1.400 | NUTRA GRAIN RASPBERRY-BX | 7.80 | 5.50 | 2.90 |
| 7620 | 10738 | 735767 | 5.00 | 4.90 | 1.40 | 0.300 | MAGIC TAPE 1/2X2592 3IN 2PK | 5.00 | 4.90 | 1.40 |


In [29]:
# import the Item class from SuctionCupRules.py
from SuctionCupRules import Item

In [30]:
# Initialize a dict to hold the items
# the key is the item ID, the value is the Item object: {1: item1, 2: item2, ...}
itemMap = {}

# Loop through the dataframe and wrap each row in an Item object
for _, row in items.iterrows():
    itemMap[row['item_id']] = Item(row.item_id, row.sku_no, row.dim1, row.dim2, row.dim3, row.weight, row.item_description)

In [31]:
print(itemMap[12810])

item_id:12810;
sku_no:24287592;
dim1:8.4;
dim2:3.7;
dim3:2.3;
weight:2.1;
item_description:SILK PURE ALMOND UNSWT VAN         ;
suctionCupConfig:set();
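
The actual `Item` class lives in `SuctionCupRules.py` and is not reproduced in this notebook. For readers without the file at hand, here is a hypothetical sketch of what such a class could look like, inferred only from the constructor call and the printed output above. The attribute names follow that output, but the implementation details are assumptions.

```python
class Item:
    """Hypothetical stand-in for the Item class in SuctionCupRules.py."""

    def __init__(self, item_id, sku_no, dim1, dim2, dim3, weight, item_description):
        self.item_id = item_id
        self.sku_no = sku_no
        self.dim1 = dim1  # largest dimension
        self.dim2 = dim2
        self.dim3 = dim3  # smallest dimension
        self.weight = weight
        self.item_description = item_description
        self.suctionCupConfig = set()  # filled in later by the rules

    def addSuctionCupConfig(self, config):
        self.suctionCupConfig.add(config)

    def getSuctionCupConfig(self):
        return self.suctionCupConfig

    def __str__(self):
        # One "name:value;" line per attribute, similar to the output above.
        return "\n".join(f"{name}:{value};" for name, value in vars(self).items())


demo_item = Item(12810, "24287592", 8.4, 3.7, 2.3, 2.1, "SILK PURE ALMOND UNSWT VAN")
print(demo_item)
```

Keeping the configurations in a `set` means that applying the same rule twice does not create duplicate entries.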



### Rule Object
Instead of hardcoding the selection rules, a more flexible approach would be to leverage object-oriented programming (OOP) concepts. This involves encapsulating the rules into separate Rule objects and applying them dynamically during the selection process. Let's compare the two styles:

**Naive if flow**: 
```Python
# rule one
if foo>bar and ....:
    do something here
# rule two 
if foo<bar and ....:
    do other things here
# rule three, four, ...
...
```

In this approach, the selection logic is directly implemented within the code, making it less adaptable to changes in rules or the need for additional rules. Modifying the selection criteria requires manual changes to the code, which can be error-prone and time-consuming.

----

**OOP**:
```Python
rules = [rule1, rule2, rule3, rule4, ...]
for rule in rules:
    rule.apply(item)
...
```

By using OOP concepts, we can encapsulate the selection rules into separate Rule objects. Each Rule object represents a specific selection criterion and can be modified or extended without affecting the overall structure of the code. The rules can also be arranged in a class hierarchy, which keeps related rules together and makes them easier to maintain.

During the selection process, the Rule objects can be dynamically applied based on the desired criteria. This flexibility enables easy addition, modification, or removal of rules, providing a more scalable and adaptable solution.
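
To make the pattern concrete, here is a small self-contained sketch of the rule-object idea. It is not the code from `SuctionCupRules.py`: the method names `isEligible` and `getConfig` and the enum names mirror what the notebook uses, but the class bodies and the numeric thresholds are assumptions chosen for illustration, and items are plain dicts to keep the example short.

```python
from enum import Enum


class SuctionCup(Enum):
    any = 0
    swappable_vs_25_nr = 1
    swappable_vsa_63_nr = 3


class GraspType(Enum):
    suction_only = 0
    default = 1


class BaseRule:
    """Fallback rule: always eligible, yields the generic configuration."""
    config = (SuctionCup.any, GraspType.default)

    def isEligible(self, item) -> bool:
        return True

    def getConfig(self):
        return self.config


class SmallCupRule(BaseRule):
    """Illustrative thresholds only; the real limits are in SuctionCupRules.py."""
    config = (SuctionCup.swappable_vs_25_nr, GraspType.default)

    def isEligible(self, item) -> bool:
        return item["dim3"] >= 0.25 and item["dim1"] <= 5 and item["weight"] <= 0.8


class LargeCupRule(BaseRule):
    """Illustrative thresholds only."""
    config = (SuctionCup.swappable_vsa_63_nr, GraspType.default)

    def isEligible(self, item) -> bool:
        return item["dim2"] >= 3.0 and item["weight"] <= 6.6


def select(item, rules):
    """Collect the config of every eligible rule, falling back to BaseRule."""
    configs = {rule.getConfig() for rule in rules if rule.isEligible(item)}
    return configs or {BaseRule().getConfig()}


rules = [SmallCupRule(), LargeCupRule()]
stamp = {"dim1": 2.55, "dim2": 2.3, "dim3": 1.2, "weight": 0.075}
print(select(stamp, rules))
```

Adding a new criterion then means writing one more subclass and appending it to `rules`; the `select` loop itself does not change.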


### Check the rule in SuctionCupRules.py for implementation

Below is how to run the suction cup selection:

In [64]:
# Import the rules from the file
from SuctionCupRules import *

# create the rule list
# Removed the rules for medium suction cup as we don't use it currently.
rules = [MiniCupRule, MiniCupPreferred1SORule, MiniCupPreferred2, 
         SmallCupRule, SmallCupPreferredCase3SO, SmallCupPreferredCase2, SmallCupPreferredCase1,
         BagCupRule, BagCupPreferred2, BagCupPreferred1,
         LargeCupRule, BigCupPreferredCase1, BigCupPreferredCase2SO, BigCupPreferredCase3]

In [69]:
# run through the items and apply every rule in the list
# loop over items and rules
for itemId, item in itemMap.items():
    for ruleClass in rules:
        rule = ruleClass()  # instantiate the rule once per check
        if rule.isEligible(item):
            item.addSuctionCupConfig(rule.getConfig())
    # if nothing fits, use BaseRule as default
    if len(item.getSuctionCupConfig()) == 0:
        item.addSuctionCupConfig(BaseRule().getConfig())

In [70]:
# show n random examples (sampled without repeats)
import random
n = 3
for item in random.sample(list(itemMap.values()), n):
    print(item)

item_id:10662;
sku_no:24428137;
dim1:5.9;
dim2:3.1;
dim3:1.2;
weight:0.25;
item_description:PM PROFILE GEL 0.7MM 12CD BLK      ;
suctionCupConfig:{(<SuctionCup.any: 0>, <GraspType.default: 1>)};

item_id:4091;
sku_no:24468673;
dim1:5.1;
dim2:5.0;
dim3:3.7;
weight:0.985;
item_description:5-PORT SWITCH                      ;
suctionCupConfig:{(<SuctionCup.swappable_vsa_63_nr: 3>, <GraspType.default: 1>), (<SuctionCup.swappable_bgx_48: 4>, <GraspType.default: 1>)};

item_id:971;
sku_no:398378;
dim1:2.55;
dim2:2.3;
dim3:1.2;
weight:0.075;
item_description:2-CLR PRE-INKED STAMP   PAID       ;
suctionCupConfig:{(<SuctionCup.swappable_vs_25_nr: 1>, <GraspType.default: 1>), (<SuctionCup.swappable_vs_18_nr: 5>, <GraspType.default: 1>)};

