# Skincare Ingredient Classifier

This notebook is a prototype tool for classifying skincare products, aimed especially at people with eczema or a damaged skin barrier. It estimates whether a formula is mostly a **humectant**, an **emollient**, an **occlusive**, or a blend, based on an ingredient list that the user pastes in.


## Step 1: Ingredient Dictionary

This step defines a small dictionary that maps common skincare ingredients to their function: humectant, emollient, or occlusive.


In [1]:
# Step 1: Ingredient Type Dictionary

# Keys must be lowercase: Step 2 lowercases the pasted names before lookup.
ingredient_types = {
    "glycerin": "humectant",
    "petrolatum": "occlusive",
    "dimethicone": "occlusive",
    "shea butter": "emollient",
    "hyaluronic acid": "humectant",
    "coconut oil": "emollient",
    "ceramide": "emollient",
    "lanolin": "occlusive",
    "aloe vera": "humectant",
    "propylene glycol": "humectant",
    "panthenol": "humectant",  # aka provitamin B5
    "jojoba oil": "emollient",
    "mineral oil": "occlusive",
    "squalane": "emollient",
    "beeswax": "occlusive",
    "urea": "humectant",
    "sweet almond oil": "emollient",
    "castor oil": "occlusive",
    "avocado oil": "emollient",
    "colloidal oatmeal": "occlusive",
    "tocopherol": "emollient",  # vitamin E
    "butylene glycol": "humectant",
    "sorbitol": "humectant",
    "olive oil": "emollient",
    "argan oil": "emollient",
    "zinc oxide": "occlusive",
    "lanolin alcohol": "emollient",
    "petroleum jelly": "occlusive",
    "caprylic/capric triglyceride": "emollient",
    "lauric acid": "emollient",
    "stearic acid": "emollient",
    "lecithin": "emollient",
    "cetearyl alcohol": "emollient",
    "isopropyl myristate": "emollient",
    "polyethylene glycol": "humectant",
    "sodium lactate": "humectant",
    "dimethiconol": "occlusive",
    "ozokerite": "occlusive",
    "marula oil": "emollient",
    "tamanu oil": "emollient",
    "baobab oil": "emollient",
    "meadowfoam seed oil": "emollient",
    "algae extract": "humectant",
    "cucumber extract": "humectant",
    "rosehip oil": "emollient",
    "calendula oil": "emollient",
    "shea oil": "emollient",
    "chamomile extract": "humectant",
    "cyclopentasiloxane": "occlusive",
    "isododecane": "occlusive",
    "carbomer": "humectant",
    "ammonium lactate": "humectant",
    "glycereth-26": "humectant",
    "polyquaternium-10": "humectant",
    "ethylhexyl palmitate": "emollient",
    "c12-15 alkyl benzoate": "emollient",
    "cetyl alcohol": "emollient",
    "peg-100 stearate": "emollient",
    "dimethyl isosorbide": "humectant",
    "ceramide np": "emollient",
    "ceramide ap": "emollient",
    "ceramide eop": "emollient",
    "cholesterol": "emollient",
    "phytosphingosine": "emollient",
    "sodium pca": "humectant",
    "niacinamide": "humectant",  # water retention + barrier strengthening
    "allantoin": "humectant",
    "beta-glucan": "humectant",
    "madecassoside": "humectant",  # from centella asiatica
    "zinc gluconate": "occlusive",
    "magnesium ascorbyl phosphate": "humectant",  # stable vitamin C
}

print("Ingredient dictionary loaded.")


Ingredient dictionary loaded.
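Because Step 2 lowercases everything the user pastes, a dictionary key that contains capital letters or stray spaces can never match. A minimal sketch of a normalization pass, assuming a hypothetical helper `normalize_keys` and a small stand-in dictionary `sample_types` (neither is part of the notebook):

```python
def normalize_keys(mapping):
    """Return a copy of mapping with stripped, lowercased keys.

    Hypothetical helper, not part of the original notebook.
    Raises ValueError if two keys collapse to the same name
    but carry different types.
    """
    normalized = {}
    for name, kind in mapping.items():
        key = name.strip().lower()
        if key in normalized and normalized[key] != kind:
            raise ValueError(f"conflicting types for {key!r}")
        normalized[key] = kind
    return normalized


# Small stand-in for ingredient_types, used only for illustration
sample_types = {"Glycerin": "humectant", " PEG-100 Stearate": "emollient"}
clean_types = normalize_keys(sample_types)
print(clean_types)
```

Running the real dictionary through such a pass once, right after it is defined, would keep the lookup in Step 3 unchanged.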


## Step 2: Paste Your Ingredients

Paste a list of ingredients below (comma-separated). Example:  
`Glycerin, Water, Shea Butter, Fragrance, Dimethicone`


In [2]:
# Step 2: Paste ingredients

user_input = input("Paste a list of ingredients, separated by commas:\n")
# Trim whitespace, lowercase, and drop empty entries (e.g. from a trailing comma)
custom_ingredients = [i.strip().lower() for i in user_input.split(",") if i.strip()]

print("\nYou entered:")
print(custom_ingredients)


StdinNotImplementedError: raw_input was called, but this frontend does not support input requests.
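When the frontend does not support `input()`, as the error above shows, the same parsing can be applied to a string assigned directly in the cell. A sketch, assuming a hypothetical `parse_ingredients` helper and the example list from this step with a trailing comma added:

```python
def parse_ingredients(raw):
    """Split a comma-separated ingredient string into clean lowercase names.

    Hypothetical helper, not part of the original notebook.
    Strips whitespace, lowercases, and drops empty entries.
    """
    return [part.strip().lower() for part in raw.split(",") if part.strip()]


# Example list from this step, assigned directly instead of read via input()
user_input = "Glycerin, Water, Shea Butter, Fragrance, Dimethicone,"
custom_ingredients = parse_ingredients(user_input)
print(custom_ingredients)
```

Keeping the parsing in a function also makes it easy to reuse the later steps on several products in one run.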

## Step 3: Classify Ingredient Types

In [None]:
# Step 3: Classify ingredients

type_counts = {"humectant": 0, "emollient": 0, "occlusive": 0, "unknown": 0}

for ingredient in custom_ingredients:
    match = ingredient_types.get(ingredient.lower())
    if match:
        type_counts[match] += 1
    else:
        type_counts["unknown"] += 1

print("Classification Results:")
print(type_counts)
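As a worked example, the same tally can be run on the example list from Step 2. This is a sketch: `count_types` and the three-entry `sample_types` subset are illustrative names, not part of the notebook.

```python
def count_types(ingredients, known_types):
    """Tally ingredients per type; names missing from known_types count as 'unknown'."""
    counts = {"humectant": 0, "emollient": 0, "occlusive": 0, "unknown": 0}
    for name in ingredients:
        counts[known_types.get(name, "unknown")] += 1
    return counts


# Stand-in subset of the Step 1 dictionary, for illustration only
sample_types = {
    "glycerin": "humectant",
    "shea butter": "emollient",
    "dimethicone": "occlusive",
}
example = ["glycerin", "water", "shea butter", "fragrance", "dimethicone"]
example_counts = count_types(example, sample_types)
print(example_counts)
```

Here "water" and "fragrance" land in the `unknown` bucket because the dictionary has no entry for them, which is why Step 4 excludes unknowns from the percentages.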


## Step 4: Percentage Breakdown

This calculates the percentage of known ingredients that fall into each category: humectant, emollient, or occlusive. Unknown ingredients are excluded from this calculation.


In [None]:
# Step 4: Percentages

total_known = type_counts["humectant"] + type_counts["emollient"] + type_counts["occlusive"]
percentages = {}

for key in ["humectant", "emollient", "occlusive"]:
    if total_known > 0:
        percentages[key] = round((type_counts[key] / total_known) * 100, 1)
    else:
        percentages[key] = 0

print("Percentage Breakdown:")
print(percentages)
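The calculation above can be checked on a small hand-made case. The sketch below wraps the same arithmetic in a hypothetical `to_percentages` function; the counts are invented for illustration.

```python
def to_percentages(type_counts):
    """Share of each known type among matched ingredients, rounded to one decimal."""
    kinds = ["humectant", "emollient", "occlusive"]
    total_known = sum(type_counts[k] for k in kinds)
    if total_known == 0:
        # Nothing matched the dictionary, so every share is zero
        return {k: 0.0 for k in kinds}
    return {k: round(type_counts[k] / total_known * 100, 1) for k in kinds}


# Hypothetical counts: 5 known ingredients, 2 unknown (unknowns are ignored)
example = to_percentages({"humectant": 3, "emollient": 1, "occlusive": 1, "unknown": 2})
print(example)

# Rounding to one decimal can leave the total slightly off 100
thirds = to_percentages({"humectant": 1, "emollient": 1, "occlusive": 1, "unknown": 0})
print(thirds, sum(thirds.values()))
```

With three known humectants out of five known ingredients, the humectant share is 3 / 5 = 60%, regardless of how many unknowns were pasted.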


## Step 5: Confidence Score

The classifier's confidence depends on how many ingredients matched the dictionary versus how many were unknown. The cell below plots the percentage breakdown from Step 4 as a bar chart.



In [None]:
# Step 5: Visualize the classification breakdown

import matplotlib.pyplot as plt

# Data
labels = list(percentages.keys())
values = list(percentages.values())

# Plot
plt.bar(labels, values)
plt.title("Ingredient Type Breakdown")
plt.ylabel("Percentage (%)")
plt.ylim(0, 100)
plt.show()
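The cell above draws the chart but does not compute a confidence value. One plausible definition, assumed here rather than taken from the notebook, is the share of pasted ingredients that were found in the dictionary. The function name `confidence_score` and the example counts are hypothetical.

```python
def confidence_score(type_counts):
    """Percentage of pasted ingredients that matched the dictionary.

    Assumed definition: known / (known + unknown) * 100.
    """
    total = sum(type_counts.values())
    if total == 0:
        return 0.0
    known = total - type_counts.get("unknown", 0)
    return round(known / total * 100, 1)


# Hypothetical counts in the same shape as type_counts from Step 3
example_counts = {"humectant": 1, "emollient": 1, "occlusive": 1, "unknown": 2}
score = confidence_score(example_counts)
print(f"Confidence: {score}%")
```

A low score signals that the percentage breakdown rests on only a few matched ingredients and should be read with caution.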


## Step 6: Primary Classification

This identifies the ingredient type (humectant, emollient, or occlusive) that appears most frequently among the known ingredients and labels the product accordingly. On a tie, `max` returns the first of the tied types in dictionary order.


In [None]:
# Step 6: Print best guess category

# percentages always holds three keys, so test for a nonzero value instead
if any(percentages.values()):
    best_type = max(percentages, key=percentages.get)
    print(f"Primary classification: this product is mostly {best_type}.")
else:
    print("No known ingredients to classify.")


## Step 7: Unknown Ingredients

This lists any ingredients that were not recognized by the dictionary, helping identify which items may need to be researched or added in the future.


In [None]:
# Step 7: Show unknown ingredients

unknowns = [ingredient for ingredient in custom_ingredients if ingredient.lower() not in ingredient_types]

if unknowns:
    print("⚠️ Unknown ingredients (not in database):")
    for u in unknowns:
        print(f" - {u}")
else:
    print("✅ All ingredients were matched.")


## Summary

This notebook is a prototype skincare ingredient classifier, designed for people with eczema or a damaged skin barrier. It lets users paste a product's ingredients, matches them against a basic ingredient dictionary, and classifies the product as mostly humectant, emollient, or occlusive.

The output includes:
- A type breakdown (percentage)
- A confidence score
- The primary classification
- A list of unrecognized ingredients
- A bar chart for visual feedback

This tool can be expanded over time with more ingredient data and advanced classification logic.

