# Suction Cup Selector Project

This is the notebook for the suction cup selector project.

## Download the project
If you have git installed, you can clone the project from the repository with:

```bash
git clone <url>
```

Once you have cloned the project, you can switch branches using:

```bash
git checkout <branchname>
```

To review your changes and record them in the git history, run:

```bash
git add .  # stage all changes
git status # confirm what is staged
git commit -m "commit message" # commit the staged changes
```

If you want to push your changes to the repository, run:

```bash
git push -u origin <branchname>
```

To submit your changes for review, open a pull request on GitHub.


## Environment setup
For this project I chose Python 3.9 with the pandas library and Jupyter.

In [None]:
import sys
print(sys.version)

To isolate the environment, create a virtual environment for the project. Go to your project folder, open a terminal (cmd on Windows) and run the following command:

```bash
python3 -m venv myenv
```

To activate the environment, run the following command:
```bash
# for Windows
myenv\Scripts\activate

# for Mac/Linux
source myenv/bin/activate
```
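
To double-check from Python that the virtual environment is active, one option is to compare `sys.prefix` with `sys.base_prefix`, which differ inside a venv. A minimal sketch (the helper name `in_virtualenv` is only for illustration):

```python
import sys

def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix

print("virtual environment active:", in_virtualenv())
```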

## Data Processing
The first step of the project is to process the flat files. There are multiple ways to do data processing. Here we use the pandas `read_csv()` function to load the CSV files into pandas **DataFrame** objects.

In [None]:
import pandas as pd

suctionCups = pd.read_csv('SuctionCups.csv')
graspTypes = pd.read_csv('GraspTypes.csv')
items = pd.read_csv('items.csv')
itemConfigs = pd.read_csv('itemConfigs.csv')

#### To see the type of an object, use type()


In [None]:
type(items)

#### To inspect the data we just loaded, here are some common methods:
* df.head(): Returns the first few rows of the DataFrame.
* df.tail(): Returns the last few rows of the DataFrame.
* df.shape: Returns the dimensions (rows, columns) of the DataFrame.
* df.info(): Provides information about the DataFrame, including column data types and missing values.
* df.describe(): Generates descriptive statistics of numerical columns, such as count, mean, min, max, etc.
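
As a quick illustration of the methods listed above, here is a sketch on a small made-up DataFrame (the `demo` data is hypothetical and not taken from the project files):

```python
import pandas as pd

# Toy data for illustration only.
demo = pd.DataFrame({
    "item_id": [1, 2, 3, 4],
    "unit_length": [4.0, 2.5, None, 1.0],
})

print(demo.head(2))     # first two rows
print(demo.tail(1))     # last row
print(demo.shape)       # (rows, columns)
demo.info()             # dtypes and non-null counts (prints directly)
print(demo.describe())  # summary statistics for numeric columns
```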

#### items object overview

In [None]:
items.head()

In [None]:
items.shape

In [None]:
items.info()

In [None]:
items.describe()

#### graspTypes object overview

In [None]:
graspTypes.head()

In [None]:
graspTypes.shape

In [None]:
graspTypes.info()

#### suctionCups object overview

In [None]:
suctionCups.head(n=10)

In [None]:
suctionCups.shape

In [None]:
suctionCups.info()

#### itemConfigs object overview

In [None]:
itemConfigs.head(n=100)

In [None]:
itemConfigs.shape

In [None]:
itemConfigs.info()

In [None]:
itemConfigs['name'].value_counts()

In [None]:
itemConfigs.groupby('item_id').filter(lambda x: len(x)>1).shape
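
The last cell counts the rows whose `item_id` appears more than once. A toy sketch of the same `groupby(...).filter(...)` pattern, next to `duplicated(keep=False)`, which selects the same rows here (the `demo` data is made up):

```python
import pandas as pd

# Toy data for illustration only.
demo = pd.DataFrame({
    "item_id": [1, 1, 2, 3, 3, 3],
    "name": ["a", "b", "c", "d", "e", "f"],
})

# Keep only the groups (and therefore rows) with more than one entry.
repeated = demo.groupby("item_id").filter(lambda g: len(g) > 1)
print(repeated.shape)

# Alternative: flag every row whose item_id occurs more than once.
same_rows = demo[demo["item_id"].duplicated(keep=False)]
print(same_rows.shape)
```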

## Data Cleaning

After a preliminary inspection, it is evident that the dataset contains incorrect data types and null values. In addition, text fields often contain noisy punctuation, such as quotes and semicolons, which should be removed. Cleaning the data at this stage removes such records and keeps the later steps reliable.

### Item object
* Many item rows don't have a SKU #.
* Item ID should be a string, since we are not going to do numeric manipulation on it.
* SKU # should be a string field without the trailing .0
* item_description is a text field. We probably want to take a deeper look.
* len/wid/hgt don't quite fit our purpose. Sorting them into dim1, dim2 and dim3 in descending order (dim1 largest, dim3 smallest) makes more sense.

In [None]:
# drop null
items.dropna(subset=['sku_no'], inplace=True)

In [None]:
items.shape

In [None]:
# reformat item and sku_no field
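# remove only a literal trailing ".0"; rstrip('.0') would also strip trailing zeros (e.g. "1200.0" -> "12")
items['sku_no'] = items['sku_no'].astype(str).str.replace(r'\.0$', '', regex=True)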
items['item_id'] = items['item_id'].astype(str)
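
One subtlety when trimming the trailing `.0`: `str.rstrip` treats its argument as a set of characters rather than a suffix, so it can also eat trailing zeros that belong to the SKU. A small sketch with made-up SKU values shows the difference from an anchored regex:

```python
import pandas as pd

# Toy SKU values as they might look after a float-to-string conversion.
sku = pd.Series(["1200.0", "4507.0", "10.0"])

# rstrip('.0') removes every trailing '.' or '0', so real digits get lost.
print(sku.str.rstrip(".0").tolist())   # ['12', '4507', '1']

# Anchoring the pattern at the end removes only a literal ".0" suffix.
cleaned = sku.str.replace(r"\.0$", "", regex=True)
print(cleaned.tolist())                # ['1200', '4507', '10']
```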

In [None]:
items.sample(5)

In [None]:
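# check punctuation
import re
import string

print(string.punctuation)
# escape the punctuation so it is safe inside a regex character class;
# na=False treats missing descriptions as "no match"
mask = items['item_description'].str.contains(f"[{re.escape(string.punctuation)}]", na=False)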

In [None]:
items[mask].sample(10)

In [None]:
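# remove unwanted punctuation (single quote, double quote, ampersand)
unwantedChar = '\'"&'
for c in unwantedChar:
    # regex=False makes the literal replacement explicit across pandas versions
    items['item_description'] = items['item_description'].str.replace(c, '', regex=False)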
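
The loop above removes one character per pass. As a sketch of an alternative, a single regex character class can strip the same three characters in one call (the sample descriptions below are made up for illustration):

```python
import pandas as pd

# Toy descriptions for illustration only.
desc = pd.Series(['8" pipe & fitting', "kid's toy"])

# One character class covering single quote, double quote and ampersand.
pattern = "['\"&]"
cleaned = desc.str.replace(pattern, "", regex=True)
print(cleaned.tolist())
```

Note that removing `&` leaves the surrounding spaces in place, so a follow-up whitespace cleanup may be worth considering.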

In [None]:
items[mask].sample()

In [None]:
# reformat length/width/height into dim1 >= dim2 >= dim3 (descending order)
items['dim1'] = items[['unit_length', 'unit_width', 'unit_height']].apply(max, axis=1)
items['dim2'] = items[['unit_length', 'unit_width', 'unit_height']].apply(lambda x: sorted(x)[1], axis=1)
items['dim3'] = items[['unit_length', 'unit_width', 'unit_height']].apply(min, axis=1)
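
The row-wise `apply` calls above can be slow on large tables. As a sketch of a vectorized alternative, assuming the three dimension columns are numeric with no missing values, `numpy.sort` can order each row in one pass (toy data, not from the project files):

```python
import numpy as np
import pandas as pd

# Toy dimensions for illustration only.
demo = pd.DataFrame({
    "unit_length": [4.0, 2.0],
    "unit_width": [1.0, 9.0],
    "unit_height": [3.0, 5.0],
})

cols = ["unit_length", "unit_width", "unit_height"]
# Sort each row ascending, then reverse so the largest value comes first.
sorted_dims = np.sort(demo[cols].to_numpy(), axis=1)[:, ::-1]

demo["dim1"] = sorted_dims[:, 0]  # largest
demo["dim2"] = sorted_dims[:, 1]  # middle
demo["dim3"] = sorted_dims[:, 2]  # smallest

print(demo[["dim1", "dim2", "dim3"]])
```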

## Suction Cup Selection Logic
In this section, our main focus will be on developing the selection logic. The selection logic consists of a series of conditional statements with expandable rules. To ensure flexibility for future rule additions, we can leverage object-oriented programming (OOP) concepts. By adopting an OOP approach, we can easily incorporate new rules into the existing framework.

### Item Objects
Objects are the foundation of OOP. A pandas DataFrame provides convenient utilities for data manipulation, but it is not designed for OOP. To implement the selection logic, I convert each item into an object, which is easier to access later on.

In [None]:
items

In [None]:
# here I create an Item class to hold each row's fields (name, description, dimensions, ...)
class Item:
    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)
        self.mySelection = []
    
    def __str__(self):
        pstr = ''
        for k, v in vars(self).items():
            pstr += f"{k}:{v};\n"
        return pstr

In [None]:
# Initialize a dict to hold the items
# the key is the item ID, the value is the Item object: {'1': item1, '2': item2, ...}
itemMap = {}

# Loop through the DataFrame and put each row into an Item object
for idx, row in items.iterrows():
    itemMap[row['item_id']] = Item(**row.to_dict())

In [None]:
itemMap['12810']

In [None]:
print(itemMap['12810'])

In [None]:
# attributes mirror the DataFrame columns, so the smallest dimension is dim3
itemMap['12810'].dim3

### Rule Object
Instead of hardcoding the selection rules, a more flexible approach would be to leverage object-oriented programming (OOP) concepts. This involves encapsulating the rules into separate Rule objects and applying them dynamically during the selection process. Let's compare the two styles:

**Naive if flow**: 
```Python
# rule one
if foo > bar and ...:
    ...  # do something here
# rule two
if foo < bar and ...:
    ...  # do other things here
# rule three, four, ...
...
```

In this approach, the selection logic is directly implemented within the code, making it less adaptable to changes in rules or the need for additional rules. Modifying the selection criteria requires manual changes to the code, which can be error-prone and time-consuming.

----

**OOP**:
```Python
rules = [rule1, rule2, rule3, rule4, ...]
for rule in rules:
    rule.apply(item)
...
```

By using OOP concepts, we can encapsulate the selection rules into separate Rule objects. Each Rule object represents a specific selection criterion and can be easily modified or extended without affecting the overall structure of the code. The rules can be organized into a cohesive hierarchy, allowing for better organization and maintainability.

During the selection process, the Rule objects can be dynamically applied based on the desired criteria. This flexibility enables easy addition, modification, or removal of rules, providing a more scalable and adaptable solution.


In [487]:
# here is an example of how to implement a Rule object
# Base Rule object. Every Rule object should implement these two methods
class BaseRule:
    def isEligible(self, item: Item):
        '''returns a true/false (boolean) value for the selection logic to use.'''
        return False
    
    def getGraspSuctionTuple(self):
        '''returns a tuple (suctionCupID, graspTypeID) for the selection logic to use.'''
        return (0, 1)

    
# each suction cup should have a base rule to enforce the min/max dim and weight condition
class miniCupRule(BaseRule):
    def isEligible(self, item: Item):
        return item.dim3 > 0.18 and item.dim1 < 5 and item.weight < 0.8
    
    def getGraspSuctionTuple(self):
        '''returns a tuple (suctionCupID, graspTypeID) for the selection logic to use.'''
        return (5, 1)


# advanced rule inheriting from cup base rule
class MiniCupPreferred1SORule(miniCupRule):
    '''Rule MiniCupPreferred1SO Implementation'''
    def __init__(self):
        super().__init__()
        self.name = 'MiniCupPreferred1SO'
    
    def isEligible(self, item: Item):
        return super().isEligible(item) and item.dim3 <1.1 and item.dim1 < 5.8 and item.weight < 0.088
    
    def getGraspSuctionTuple(self):
        '''returns a tuple (suctionCupID, graspTypeID) for the selection logic to use.'''
        return (0, 1)


In [None]:
# run through the items and apply MiniCupPreferred1SO rule:
rules = [MiniCupPreferred1SORule(),]
itemSample = []
# loop over items and rules
for itemId, item in itemMap.items():
    for rule in rules:
        if rule.isEligible(item):
            print(f"item {itemId} is eligible for rule {rule.name}")
            item.mySelection.append(rule.getGraspSuctionTuple())
            itemSample.append(item)

In [None]:
itemMap['5048'].mySelection
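
To illustrate the extensibility argument, here is a self-contained sketch in which a second rule is added without touching the loop. The rule names, thresholds and ID tuples (`SmallItemRule`, `LightItemRule`, `(5, 1)`, `(2, 1)`) are hypothetical, and `SimpleNamespace` stands in for the `Item` class:

```python
from types import SimpleNamespace

class BaseRule:
    name = "Base"

    def isEligible(self, item) -> bool:
        return False

    def getGraspSuctionTuple(self):
        return (0, 1)

class SmallItemRule(BaseRule):  # hypothetical rule
    name = "SmallItem"

    def isEligible(self, item) -> bool:
        return item.dim1 < 5

    def getGraspSuctionTuple(self):
        return (5, 1)

class LightItemRule(BaseRule):  # hypothetical rule
    name = "LightItem"

    def isEligible(self, item) -> bool:
        return item.weight < 0.5

    def getGraspSuctionTuple(self):
        return (2, 1)

# Adding a rule means appending one more instance to this list.
rules = [SmallItemRule(), LightItemRule()]

# Stand-in for an Item: small enough, but too heavy for LightItemRule.
item = SimpleNamespace(dim1=4.0, weight=0.9, mySelection=[])

for rule in rules:
    if rule.isEligible(item):
        item.mySelection.append(rule.getGraspSuctionTuple())

print(item.mySelection)  # [(5, 1)]
```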