# An exploratory analysis of the Ergonomic Study on Chopsticks
Link to the original study: https://www.ncbi.nlm.nih.gov/pubmed/15676839

In [None]:
from collections import Counter

import numpy as np
import pandas as pd

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

data = pd.read_csv('../input/chopstick-effectiveness.csv')

# Quick Preview of the Data
- There are 186 entries, and none of them are null.
- The study had 31 participants. The dataset carries no further information about them, but the abstract of the original study describes them as male junior college students, who were treated as the adult group.
- The other part of this study included 21 primary school students, who were considered to be children. This data was not provided on Kaggle.
- The study tested 6 different lengths of chopsticks: 180, 210, 240, 270, 300, and 330 mm.
- Each of the 31 participants used all 6 lengths of chopsticks, resulting in the 186 data points.
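
The claim that every participant used every length exactly once can be checked directly rather than inferred from the row count. Below is a minimal sketch, assuming the column names `Individual` and `Chopstick.Length` used elsewhere in this notebook. The helper `check_balanced_design` and the `toy` frame are hypothetical and only illustrate the idea; the toy values are made up and are not taken from the study.

```python
import pandas as pd

def check_balanced_design(df, subject_col='Individual', length_col='Chopstick.Length'):
    """Return True if every participant has exactly one row per chopstick length."""
    n_subjects = df[subject_col].nunique()
    n_lengths = df[length_col].nunique()
    # number of rows for each observed (participant, length) pair
    pair_counts = df.groupby([subject_col, length_col]).size()
    all_pairs_present = len(pair_counts) == n_subjects * n_lengths
    no_duplicates = bool((pair_counts == 1).all())
    return all_pairs_present and no_duplicates

# made-up toy data: 2 participants x 2 lengths (NOT values from the study)
toy = pd.DataFrame({
    'Individual': [1, 1, 2, 2],
    'Chopstick.Length': [180, 210, 180, 210],
    'Food.Pinching.Efficiency': [20.0, 25.0, 22.0, 21.0],
})
print(check_balanced_design(toy))
```

In the notebook itself, `check_balanced_design(data)` should return `True` if the description above holds for the real file.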

In [None]:
data.info()

In [None]:
data.describe()

In [None]:
data['Individual'].unique()

In [None]:
data['Chopstick.Length'].unique()

# Visualizing the Data
I started by plotting the points on a scatter plot to see the overall distribution of Food Pinching Efficiency for each chopstick length. From the scatter plot alone it is difficult to conclude anything, although performance visibly dips between the 240 mm and 270 mm chopsticks.

I plotted a similar graph using six bins, one for each length. That graph estimates the central tendency of Food Pinching Efficiency for each length and makes it easy to see that the 240 mm chopsticks appear to be the best fit for this group of participants.
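
The per-length central tendency can also be tabulated as numbers, which makes the comparison easier to quote than reading it off a plot. This is a rough sketch, assuming the same column names as the notebook. `summarize_by_length` and the `toy` frame are hypothetical names, and the toy values are invented for illustration only.

```python
import pandas as pd

def summarize_by_length(df, length_col='Chopstick.Length',
                        value_col='Food.Pinching.Efficiency'):
    """Mean, standard error, and count of efficiency per length, best mean first."""
    summary = df.groupby(length_col)[value_col].agg(['mean', 'sem', 'count'])
    return summary.sort_values('mean', ascending=False)

# made-up toy data: 2 participants x 3 lengths (NOT values from the study)
toy = pd.DataFrame({
    'Individual': [1, 1, 1, 2, 2, 2],
    'Chopstick.Length': [180, 240, 330, 180, 240, 330],
    'Food.Pinching.Efficiency': [20.0, 30.0, 15.0, 24.0, 26.0, 19.0],
})
summary = summarize_by_length(toy)
print(summary)
```

Calling `summarize_by_length(data)` in the notebook would list the six lengths ordered by mean efficiency, with the standard error giving a rough sense of how well separated the means are.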

Before finalizing my conclusion, I determined the best and worst lengths for each participant and plotted the counts on bar graphs. In these two bar graphs, 240 mm comes out as the best length and 330 mm as the worst.

In [None]:
# visualize everything in a single scatterplot with a trend line
ax = sns.regplot(x='Chopstick.Length', y='Food.Pinching.Efficiency', data=data)
ax.set(xlabel='Chopstick Length (mm)', 
       ylabel='Food Pinching Efficiency',
       title='Food Pinching Efficiency vs. Chopstick Length')
plt.show()

In [None]:
# visualize the same data binned into six groups along the x-axis (one per chopstick length)
ax = sns.regplot(x='Chopstick.Length', y='Food.Pinching.Efficiency', data=data, x_bins=6)
ax.set(xlabel='Chopstick Length (mm)', 
       ylabel='Food Pinching Efficiency',
       title='Food Pinching Efficiency vs. Chopstick Length')
plt.show()

In [None]:
# obtain the optimal chopstick length for each individual
maxFPE = []
optimalLength = []

for i in range(31):
    maxFPE.append(0)
    optimalLength.append(0)

for index, row in data.iterrows():
    individual = int(row['Individual']) - 1
    if (row['Food.Pinching.Efficiency'] > maxFPE[individual]):
        maxFPE[individual] = row['Food.Pinching.Efficiency']
        optimalLength[individual] = row['Chopstick.Length']

# print(optimalLength)
optimalLengthCounts = dict(Counter(optimalLength))
# print(optimalLengthCounts)
lengths = list(optimalLengthCounts.keys())
counts = list(optimalLengthCounts.values())
ax = sns.barplot(x=lengths, y=counts)
ax.set(xlabel='Chopstick Length (mm)', 
       ylabel='Number of Best Performances',
       title='Number of Best Performances for Each Chopstick Length')
plt.show()

In [None]:
# similarly, obtain the worst chopstick length for each individual
minFPE = []
worstLength = []

for i in range(31):
    minFPE.append(1000)
    worstLength.append(0)

for index, row in data.iterrows():
    individual = int(row['Individual']) - 1
    if (row['Food.Pinching.Efficiency'] < minFPE[individual]):
        minFPE[individual] = row['Food.Pinching.Efficiency']
        worstLength[individual] = row['Chopstick.Length']

# print(worstLength)
worstLengthCounts = dict(Counter(worstLength))
# print(worstLengthCounts)
lengths = list(worstLengthCounts.keys())
counts = list(worstLengthCounts.values())
ax = sns.barplot(x=lengths, y=counts)
ax.set(xlabel='Chopstick Length (mm)', 
       ylabel='Number of Worst Performances',
       title='Number of Worst Performances for Each Chopstick Length')
plt.show()
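
The two loops above hard-code 31 participants and rely on `Individual` being numbered from 1. As an alternative sketch, the same per-participant best and worst lengths can be obtained with `groupby` plus `idxmax`/`idxmin`, which makes neither assumption. `best_and_worst_lengths` and the `toy` frame are hypothetical, and the toy values are invented. On ties, `idxmax` and `idxmin` keep the first occurrence, which matches the strict comparisons in the loops.

```python
import pandas as pd

def best_and_worst_lengths(df, subject_col='Individual',
                           length_col='Chopstick.Length',
                           value_col='Food.Pinching.Efficiency'):
    """Count how often each length is a participant's best and worst."""
    grouped = df.groupby(subject_col)[value_col]
    # idxmax/idxmin return the row label of each participant's extreme value
    best = df.loc[grouped.idxmax(), length_col].value_counts().sort_index()
    worst = df.loc[grouped.idxmin(), length_col].value_counts().sort_index()
    return best, worst

# made-up toy data: 3 participants x 3 lengths (NOT values from the study)
toy = pd.DataFrame({
    'Individual': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'Chopstick.Length': [180, 240, 330] * 3,
    'Food.Pinching.Efficiency': [20.0, 30.0, 15.0,
                                 25.0, 28.0, 26.0,
                                 31.0, 29.0, 22.0],
})
best, worst = best_and_worst_lengths(toy)
print(best)
print(worst)
```

Each returned Series is indexed by chopstick length, so it could be passed to `sns.barplot(x=best.index, y=best.values)` in place of the `Counter`-based lists.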

# Conclusions
- The optimal chopstick length for these male college students appears to be 240 mm.
- The worst chopstick length for these male college students appears to be 330 mm.
- Chopsticks with a length of 240 mm may or may not be optimal for other adults. That would need to be tested in further studies that include other age groups and women.
- If we had the data for the children who participated, we could run a similar analysis to find the optimal lengths for them.