In [2]:
import pandas as pd
import numpy as np
from collections import Counter
import matplotlib.pyplot as plt

## Skincare Ingredients Breakdown
<b> <font size="3.5">Introduction </font><b>

<font size="2">Skincare has blown up throughout the past recent years and has been a big topic flowing around the internet. We can see an increase in discussions online (Reddit, Youtube, Tiktok) about what products are bad and good for your skin, yet we are left with vague and inconsistent answers that don't really help us gauge if we should using specific cosmetic products. However, organization such as EWG's Skin Deep database have found a way to give consumers a deeper look on their skincare products/ingredients through assessing their own scale of skincare "safeness".</font>

<font size="2">The potential issue with databases such as EWG's is that there tends to be an overestimate on the dangers of specific ingredients and EWG has even been coined as a "fear-mongering" campaign by many. EWG assess the safeness based off of scientific papers, noting of an ingredient's potential hazards such as toxicity and allergies. Sometimes, these hazard ratings can be based off of scientific suspicions of certain ingredients that have happened and can cause certain products to be thrown in the dust due to these safeness ratings.</font>

<font size="2">It's difficult to find this "perfect balance" of what is good and bad and also have it truthfully represent how everyone views cosmetic ingredients. Although there are generally well-known harmful ingredients to avoid, it's almost impossible to sort through thousands of different ingredients that exist today and do extensive research to reach a point of confidence to coin whether a product/ingredient is harzardous or not.</font>

<b><font size="3.5"> Approach </font> <b>
    
<font size="2.5">Instead of having to resort to resources online that may or may not be credible, we decided to approach this issue by allowing cosmetic product consumers to decide on their own what they believe is "good" or "useful" in terms of their skin. We wanted to provide breakdowns of skincare products and their ingredients in a more neutral manner, providing informational breakdown on where ingredients come from and how each product may benefit them.</font>
    
<font size="2">Although this shouldn't be a replacement for ingredient research that has credentials, it can provide consumers with a lot less stress on having to navigating every ingredient they have or will use.</font>

## Datasets Involved
    
<font size="2.5">Here, we have a webscrapped dataset taken from [kaggle](https://www.kaggle.com/code/eward96/skincare-recommendation-engine#Extracting-brand-names-%F0%9F%A7%AA) that provides general information on products found on [LOOKFANTASTIC](https://us.lookfantastic.com/). If this project was considered in a long term sense, we would webscrape cosmetic products from different websites to get a more general overview on all kinds of products sold in different places. </font>

In [93]:
skincare = pd.read_csv('skincare_products_clean.csv')
skincare.head(5)

Unnamed: 0,product_name,product_url,product_type,clean_ingreds,price
0,The Ordinary Natural Moisturising Factors + HA...,https://www.lookfantastic.com/the-ordinary-nat...,Moisturiser,"['capric triglyceride', 'cetyl alcohol', 'prop...",£5.20
1,CeraVe Facial Moisturising Lotion SPF 25 52ml,https://www.lookfantastic.com/cerave-facial-mo...,Moisturiser,"['homosalate', 'glycerin', 'octocrylene', 'eth...",£13.00
2,The Ordinary Hyaluronic Acid 2% + B5 Hydration...,https://www.lookfantastic.com/the-ordinary-hya...,Moisturiser,"['sodium hyaluronate', 'sodium hyaluronate', '...",£6.20
3,AMELIORATE Transforming Body Lotion 200ml,https://www.lookfantastic.com/ameliorate-trans...,Moisturiser,"['ammonium lactate', 'c12-15', 'glycerin', 'pr...",£22.50
4,CeraVe Moisturising Cream 454g,https://www.lookfantastic.com/cerave-moisturis...,Moisturiser,"['glycerin', 'cetearyl alcohol', 'capric trigl...",£16.00


<font size="2">Here, we have a dataset created by Anthony, displaying ingredients and information on how they may have been made, what kind of material_based composition they have (animal,mineral,etc), ingredient benefits, etc.. Although there aren't a ton of ingredients this is a good starting point to evaluate a product in the dataset above. </font>

In [94]:
ingredients = pd.read_csv('skincare_ingredients - Sheet1.csv')
# ingore this data cleaning step 
ingredients['is_vegan_friendly'] = ingredients['is_vegan_friendly'].str.strip(' ')
ingredients.at[0,'animal_based'] = 'yes'
ingredients['other_names'] = ingredients['other_names'].str.lower()
ingredients.head(5)

Unnamed: 0,ingredient,is_vegan_friendly,not_vegan_reason,petroleum_oil_based,plant_oil_based,plant_oil_kind,mineral_based,plant_nonoil_based,animal_based,paraben_based,fragrance_based,is_synthetic,is_natural,function,proposed_risks,known_benefits,other_names,suflate_based,web,cunt
0,capric triglyceride,not,contain a mixture of glycerin,no,yes,coconut,no,no,yes,no,no,no,yes,emollient,no,hydration,"caprylic/ capric triglyceride, caprylic/capric...",no,,
1,cetyl alcohol,maybe,maybe derived from animal oils,no,yes,"coconut, palm",no,no,maybe,no,no,no,yes,emulsifier,no,hydration,"1-hexadecanol, cetanol, cetyl alcohol, hexadec...",no,,
2,glycerin,not,mainly made from animal fats/ sometimes vegeta...,no,yes,"coconut, palm, soybean",no,no,yes,no,no,no,yes,humectant,no,"hydration, anti-aging","1,2,3-propanetriol, 1,2,3-trihydroxypropane, 1...",no,,
3,propanediol,yes,,yes,no,,no,yes,no,no,no,yes,yes,solvent,no,hydration,"1,3-dihydroxypropane, 1,3-propylene glycol, 1,...",no,,
4,hyaluronic acid,yes,,no,no,,no,no,no,no,no,yes,no,humectant,no,"anti-aging, hydration","hyaluronan, hyaluronic acid",no,,


<b><font size="3.5"> Dataset Breakdown/How was this Collected? </font> <b>
    
Here are the column breakdown for this dataset:

* **`ingredient`** : the name of the ingredient
* **`is_vegan_friendly`** : is it fully vegan friendly? "maybe" can occur if an ingredient can possibly be non-vegan depending on how it was made
* **`not_vegan_reason`** : reason why the ingredient isn't vegan or might not be vegan
* **`petroleum_oil_based`** : is it a petroleum based ingredient?
* **`plant_oil_based`** : is it a plant oil based ingredient?
* **`mineral_based`** : is it a mineral based ingredient?
* **`plant_nonoil_based`** : is it a plant-based ingredient (not including plant-based oils)?
* **`animal_based`** : is it a animal based product?
* **`paraben_based`** : is it a paraben based ingredient?
* **`fragrance_based`** : is it a fragrance based ingredient?
* **`is_synthetic`** : is this ingredient synethically produced?
* **`is_natural`** : does this ingredient naturally occur?
* **`function`** : proposed function based off of [Paula's Choice Ingredient Dictionary](https://www.paulaschoice.com/ingredient-dictionary)
* **`proposed_risks`** : Does this ingredient have any studyies or research that show potential risks?
* **`known_benefits`** : proposed benefits based off of [Paula's Choice Ingredient Dictionary](https://www.paulaschoice.com/ingredient-dictionary)
* **`other_names`** : other names this ingredient might go under taken from [EWG's skin deep](https://www.ewg.org/skindeep/) 

**NOTE - most of these columns were produced through research on each individual ingredient**

## Project Demo

In [95]:
# lets take this product for example
example = skincare[['product_name','clean_ingreds']].iloc[895].to_frame().T.reset_index()
example

Unnamed: 0,index,product_name,clean_ingreds
0,895,Avene Gentle Exfoliating Scrub 75ml,"['glycerin', 'pentylene glycol', 'hydroxyethyl..."


In [96]:
skincare.iloc[895].clean_ingreds

"['glycerin', 'pentylene glycol', 'hydroxyethyl acrylate/sodium acryloyldimethyl taurate', 'niacinamide', 'cellulose acetate', 'ascorbyl palmitate', 'cetrimonium bromide', 'citric acid', 'coco-glucoside', 'parfum', 'glyceryl oleate', 'hydrogenated palm glycerides citrate', 'simmondsia chinensis leaf extract', 'lecithin', 'polysorbate 60', 'ci 73360', 'sodium salicylate', 'sorbitan isostearate', 'talc', 'tocopherol', 'trisodium ethylenediamine disuccinate', 'zinc gluconate']"

In [97]:
# here are the list of ingredients that were scrapped from the given website
example_ingred = example['clean_ingreds'][0]
example_ingred

"['glycerin', 'pentylene glycol', 'hydroxyethyl acrylate/sodium acryloyldimethyl taurate', 'niacinamide', 'cellulose acetate', 'ascorbyl palmitate', 'cetrimonium bromide', 'citric acid', 'coco-glucoside', 'parfum', 'glyceryl oleate', 'hydrogenated palm glycerides citrate', 'simmondsia chinensis leaf extract', 'lecithin', 'polysorbate 60', 'ci 73360', 'sodium salicylate', 'sorbitan isostearate', 'talc', 'tocopherol', 'trisodium ethylenediamine disuccinate', 'zinc gluconate']"

In [98]:
hello = np.array([])
for ingredient in example_cleaned_ingred:
    hello = np.append(hello,ingredient)

In [99]:
example_ingred

"['glycerin', 'pentylene glycol', 'hydroxyethyl acrylate/sodium acryloyldimethyl taurate', 'niacinamide', 'cellulose acetate', 'ascorbyl palmitate', 'cetrimonium bromide', 'citric acid', 'coco-glucoside', 'parfum', 'glyceryl oleate', 'hydrogenated palm glycerides citrate', 'simmondsia chinensis leaf extract', 'lecithin', 'polysorbate 60', 'ci 73360', 'sodium salicylate', 'sorbitan isostearate', 'talc', 'tocopherol', 'trisodium ethylenediamine disuccinate', 'zinc gluconate']"

In [100]:
", ".join(hello)

'glycerin, pentylene glycol, hydroxyethyl acrylate/sodium acryloyldimethyl taurate, niacinamide, cellulose acetate, ascorbyl palmitate, cetrimonium bromide, citric acid, coco-glucoside, parfum, glyceryl oleate, hydrogenated palm glycerides citrate, simmondsia chinensis leaf extract, lecithin, polysorbate 60, ci 73360, sodium salicylate, sorbitan isostearate, talc, tocopherol, trisodium ethylenediamine disuccinate, zinc gluconate'

In [101]:
# lets get this cleaned up so we can analyze the breakdown of this cosmetic product
# here we can see a Series where each element is a each ingredient
example_cleaned_ingred = pd.Series(np.array(example_ingred.strip("\[\]").split(', '))).str.strip("\'\'")
for ingredient in example_cleaned_ingred:
    print(ingredient)

glycerin
pentylene glycol
hydroxyethyl acrylate/sodium acryloyldimethyl taurate
niacinamide
cellulose acetate
ascorbyl palmitate
cetrimonium bromide
citric acid
coco-glucoside
parfum
glyceryl oleate
hydrogenated palm glycerides citrate
simmondsia chinensis leaf extract
lecithin
polysorbate 60
ci 73360
sodium salicylate
sorbitan isostearate
talc
tocopherol
trisodium ethylenediamine disuccinate
zinc gluconate


In [102]:
# now we can see that ingredients can full under a lot of different names
# even if the ingredient appears inside the ingredients dataframe, even the slightest discrepancy in the name
# can make us assume that our database has not covered that ingredient yet
for ingredient in example_cleaned_ingred:
    if ingredient in ingredients['ingredient'].unique():
        print(ingredient + ': Found!')
    else:
        print(ingredient + ': Not Found!')

glycerin: Found!
pentylene glycol: Found!
hydroxyethyl acrylate/sodium acryloyldimethyl taurate: Found!
niacinamide: Found!
cellulose acetate: Not Found!
ascorbyl palmitate: Found!
cetrimonium bromide: Found!
citric acid: Found!
coco-glucoside: Found!
parfum: Found!
glyceryl oleate: Found!
hydrogenated palm glycerides citrate: Not Found!
simmondsia chinensis leaf extract: Not Found!
lecithin: Found!
polysorbate 60: Found!
ci 73360: Not Found!
sodium salicylate: Not Found!
sorbitan isostearate: Found!
talc: Found!
tocopherol: Found!
trisodium ethylenediamine disuccinate: Not Found!
zinc gluconate: Found!


In [103]:
# lets try a different approach
# inside the ingredients dataframe includes a column 'other_name' taken from EWG with different names the ingredient
# can go under
ingredients['other_names'].head(5)

0    caprylic/ capric triglyceride, caprylic/capric...
1    1-hexadecanol, cetanol, cetyl alcohol, hexadec...
2    1,2,3-propanetriol, 1,2,3-trihydroxypropane, 1...
3    1,3-dihydroxypropane, 1,3-propylene glycol, 1,...
4                          hyaluronan, hyaluronic acid
Name: other_names, dtype: object

In [104]:
# we can see that 'simmondsia chinensis leaf extract' DOES exist but has a different base ingredient name
# we make sure to drop/fill the Nan values as they can interfere with the any() function (Nan values = True)
any(ingredients['other_names'].str.contains('simmondsia chinensis leaf extract').dropna())

True

In [105]:
ingredients[ingredients['other_names'].str.contains('simmondsia chinensis leaf extract').fillna(False)]

Unnamed: 0,ingredient,is_vegan_friendly,not_vegan_reason,petroleum_oil_based,plant_oil_based,plant_oil_kind,mineral_based,plant_nonoil_based,animal_based,paraben_based,fragrance_based,is_synthetic,is_natural,function,proposed_risks,known_benefits,other_names,suflate_based,web,cunt
24,simmondsia chinensis (jojoba) leaf extract,yes,,no,yes,jojoba,no,yes,no,no,no,no,yes,"antioxidant, emollient",no,"hydration, anti-aging, soothing","simmondsia chinensis (jojoba) seed oil,extract...",no,,


In [106]:
# we have covered all ingredients that are available in the ingredients dataframe
# there are still some that are not found, we decided to leave out ingredients that don't have a lot of research done
found_ingredients = []
counter = 0
for ingredient in example_cleaned_ingred:
    found_base = ingredient in ingredients['ingredient'].unique()
    found_alt = ingredients['other_names'].str.contains(ingredient).fillna(False)
    if (found_base or any(found_alt)):
        found_ingredients.append(ingredient) if found_base else found_ingredients.append(ingredients[found_alt]['ingredient'].iloc[0])
            
        print(ingredient + ': Found!')
        counter += 1
        
    else:
        print(ingredient + ': Not Found!')
print('\n'+str(counter) + ' ingredient information were found in total')

glycerin: Found!
pentylene glycol: Found!
hydroxyethyl acrylate/sodium acryloyldimethyl taurate: Found!
niacinamide: Found!
cellulose acetate: Not Found!
ascorbyl palmitate: Found!
cetrimonium bromide: Found!
citric acid: Found!
coco-glucoside: Found!
parfum: Found!
glyceryl oleate: Found!
hydrogenated palm glycerides citrate: Not Found!
simmondsia chinensis leaf extract: Found!
lecithin: Found!
polysorbate 60: Found!
ci 73360: Not Found!
sodium salicylate: Not Found!
sorbitan isostearate: Found!
talc: Found!
tocopherol: Found!
trisodium ethylenediamine disuccinate: Not Found!
zinc gluconate: Found!

17 ingredient information were found in total


In [107]:
# now lets try to make something digestable for consumer purposes
# how can we use the information we have for this product's ingredients with the ingredient dataframe
example_ingred_info = ingredients[ingredients['ingredient'].isin(found_ingredients)]
example_ingred_info.head(5)

Unnamed: 0,ingredient,is_vegan_friendly,not_vegan_reason,petroleum_oil_based,plant_oil_based,plant_oil_kind,mineral_based,plant_nonoil_based,animal_based,paraben_based,fragrance_based,is_synthetic,is_natural,function,proposed_risks,known_benefits,other_names,suflate_based,web,cunt
2,glycerin,not,mainly made from animal fats/ sometimes vegeta...,no,yes,"coconut, palm, soybean",no,no,yes,no,no,no,yes,humectant,no,"hydration, anti-aging","1,2,3-propanetriol, 1,2,3-trihydroxypropane, 1...",no,,
7,niacinamide,yes,,no,no,,no,yes,no,no,no,no,yes,"emollient, humectant, antioxidant",no,"anti-aging, pore minimizer, soothing","3-aminopyridine, 3-carbamoylpyridine, 3-pyridi...",no,,
10,parfum,maybe,parfum refers to fragrance which can be made f...,no,maybe,,no,maybe,maybe,no,yes,yes,no,fragrance,yes,,"aroma, fragrance",no,EWG Skin Deep: https://www.ewg.org/skindeep/in...,
11,talc,yes,,no,no,,yes,no,no,no,no,no,yes,absorbent,yes,oil control,"cosmetic talc, french chalk, magnesium silicat...",no,EWG Skin Deep: https://www.ewg.org/skindeep/in...,"Canada, EU"
17,pentylene glycol,yes,,no,no,,no,yes,no,no,no,yes,yes,"humectant, solvent, preservative",no,hydration,"1,2-dihydroxypentane, 1,2-pentanediol, pentane...",no,,


In [108]:
# here is the most popular consumer-demanded breakdown -> are the products their using vegan or contain mostly vegan
# ingredients?
example_ingred_info['is_vegan_friendly'].value_counts()

yes      10
maybe     5
not       2
Name: is_vegan_friendly, dtype: int64

In [109]:
# here lets try to breakdown the benefits that each ingredient in the product has
# lets get this all into one dictionary so we can see the total times a benefit shows in a product
benefit_count = example_ingred_info['known_benefits'].dropna().str.split(', ').apply(Counter).reset_index(drop=True)
benefit_count

0                    {'hydration': 1, 'anti-aging': 1}
1    {'anti-aging': 1, 'pore minimizer': 1, 'soothi...
2                                   {'oil control': 1}
3                                     {'hydration': 1}
4    {'anti-aging': 1, 'dark spot fading': 1, 'even...
5     {'hydration': 1, 'anti-aging': 1, 'soothing': 1}
6                                     {'hydration': 1}
7                                     {'hydration': 1}
8    {'anti-aging': 1, 'soothing': 1, 'evens skin t...
Name: known_benefits, dtype: object

In [110]:
list_of_dicts = list(benefit_count)
merged_dict = {}

for dict in list_of_dicts:
    for key, value in dict.items():
        if key in merged_dict:
            merged_dict[key] += value
        else:
            merged_dict[key] = value

merged_dict

{'hydration': 5,
 'anti-aging': 5,
 'pore minimizer': 1,
 'soothing': 3,
 'oil control': 1,
 'dark spot fading': 1,
 'evens skin tone': 2}

<font size="2.5">Here is an example output that we can give based off of two columns. We summarize the benefits that this product can have based on their ingredients alongside a list of ingredients that may have proposed risks. In order to further this project, we may want to include actual links to articles that demonstrate potential ingredient risks.</font>

In [111]:
# we can show consumers what kind benefits that each product includes
# here is a sample of what it would look like
for (key,value) in merged_dict.items():
    ingred_list = example_ingred_info[example_ingred_info['known_benefits'].fillna('None').str.contains(key)]['ingredient']
    print('There were ' + str(value) + ' ingredients in this product that have benefits towards ' + str(key))
    print('Ingredient(s): ' + ', '.join(str(x) for x in list(ingred_list)))
    print('─' * 115)
    
risks = example_ingred_info[example_ingred_info['proposed_risks'] == 'yes']['ingredient']
risks_copy = risks.copy()
risks_copy.iloc[-1] = 'and ' + risks_copy.iloc[-1]
print('However, ' + ', '.join(str(x) for x in list(risks_copy)) + ' may have some proposed risks')
print('See how these ingredients may affect you: \n')

for val in risks:
    print(val + ": LINK GOES HERE")

There were 5 ingredients in this product that have benefits towards hydration
Ingredient(s): glycerin, pentylene glycol, simmondsia chinensis (jojoba) leaf extract, lecithin, zinc gluconate
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
There were 5 ingredients in this product that have benefits towards anti-aging
Ingredient(s): glycerin, niacinamide, ascorbyl palmitate, simmondsia chinensis (jojoba) leaf extract, tocopherol
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
There were 1 ingredients in this product that have benefits towards pore minimizer
Ingredient(s): niacinamide
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
There were 3 ingredients in this product that have benefits towards soothing
Ingredient(s): niacinamide, simmondsia chinensis (jojoba) leaf extract, tocopherol


In [112]:
merged_dict

{'hydration': 5,
 'anti-aging': 5,
 'pore minimizer': 1,
 'soothing': 3,
 'oil control': 1,
 'dark spot fading': 1,
 'evens skin tone': 2}

In [113]:
la = list(example_ingred_info['is_vegan_friendly'].value_counts(normalize = True))

In [114]:
lala = list(example_ingred_info['is_vegan_friendly'].value_counts(normalize = True).index)

In [115]:
list(zip(lala,la))

[('yes', 0.5882352941176471),
 ('maybe', 0.29411764705882354),
 ('not', 0.11764705882352941)]

In [116]:
example_ingred_info

Unnamed: 0,ingredient,is_vegan_friendly,not_vegan_reason,petroleum_oil_based,plant_oil_based,plant_oil_kind,mineral_based,plant_nonoil_based,animal_based,paraben_based,fragrance_based,is_synthetic,is_natural,function,proposed_risks,known_benefits,other_names,suflate_based,web,cunt
2,glycerin,not,mainly made from animal fats/ sometimes vegeta...,no,yes,"coconut, palm, soybean",no,no,yes,no,no,no,yes,humectant,no,"hydration, anti-aging","1,2,3-propanetriol, 1,2,3-trihydroxypropane, 1...",no,,
7,niacinamide,yes,,no,no,,no,yes,no,no,no,no,yes,"emollient, humectant, antioxidant",no,"anti-aging, pore minimizer, soothing","3-aminopyridine, 3-carbamoylpyridine, 3-pyridi...",no,,
10,parfum,maybe,parfum refers to fragrance which can be made f...,no,maybe,,no,maybe,maybe,no,yes,yes,no,fragrance,yes,,"aroma, fragrance",no,EWG Skin Deep: https://www.ewg.org/skindeep/in...,
11,talc,yes,,no,no,,yes,no,no,no,no,no,yes,absorbent,yes,oil control,"cosmetic talc, french chalk, magnesium silicat...",no,EWG Skin Deep: https://www.ewg.org/skindeep/in...,"Canada, EU"
17,pentylene glycol,yes,,no,no,,no,yes,no,no,no,yes,yes,"humectant, solvent, preservative",no,hydration,"1,2-dihydroxypentane, 1,2-pentanediol, pentane...",no,,
18,hydroxyethyl acrylate/sodium acryloyldimethyl ...,maybe,"is synthetically made, unknown about if vegan",no,maybe,,no,maybe,maybe,no,no,yes,no,"thickener, stabilizer, emulsifier",no,,hydroxyethyl acrylate/ sodium acryloyldimethyl...,no,,
19,ascorbyl palmitate,yes,,no,no,,no,yes,no,no,no,yes,no,antioxidant,no,"anti-aging, dark spot fading, evens skin tone","6-hexadecanoate l-ascorbic acid, 6-o-palmitoyl...",no,,
20,cetrimonium bromide,yes,,no,yes,coconut,no,no,no,no,no,yes,yes,"surfactant, emulsifier, preservative",yes,,"1-hexadecanaminium, n,n,n-trimethyl-, bromide,...",no,EWG Skin Deep: https://www.ewg.org/skindeep/in...,
21,citric acid,yes,,no,no,,no,yes,no,no,no,no,yes,"antioxidant, exfoliator",no,,"1,2,3-propanetricarboxylic acid, 2-hydroxy-, 1...",no,,
22,coco-glucoside,yes,,no,yes,coconut,no,yes,no,no,no,no,yes,cleansing agent,no,,,no,,


In [117]:
# formatting for each skin concern they have
yes = (example_ingred_info.is_vegan_friendly.value_counts(normalize=True) * 100).round(2)
idx = yes.index
val = yes.values
for i in range(len(yes)):
    print(f'{val[i]}% of the found ingredients are confirmed "{idx[i]}" to be vegan')
    if idx[i] == 'yes':
        print('Ingredients include: ' + ", ".join(example_ingred_info[example_ingred_info.is_vegan_friendly == idx[i]].ingredient))
    
    if idx[i] != 'yes':
        df = example_ingred_info[example_ingred_info.is_vegan_friendly == idx[i]]
        #print('_______________________________________')
        for j in range(df.shape[0]):
            print(f'Ingredient: {df.ingredient.iloc[j]}; Reason: {df.not_vegan_reason.iloc[j]}')
    print('___________________________________________________________________')

58.82% of the found ingredients are confirmed "yes" to be vegan
Ingredients include: niacinamide, talc, pentylene glycol, ascorbyl palmitate, cetrimonium bromide, citric acid, coco-glucoside, simmondsia chinensis (jojoba) leaf extract, zinc gluconate, tocopherol
___________________________________________________________________
29.41% of the found ingredients are confirmed "maybe" to be vegan
Ingredient: parfum; Reason: parfum refers to fragrance which can be made for a multitude of chemicals
Ingredient: hydroxyethyl acrylate/sodium acryloyldimethyl taurate; Reason: is synthetically made, unknown about if vegan
Ingredient: lecithin; Reason: plant-based and may contain animal tissues/organs or eggs
Ingredient: polysorbate 60; Reason: is a polysorbate of stearic acid. Stearic Acid can have plant or animal sources
Ingredient: sorbitan isostearate; Reason: derived from sorbitol, which is plant-derived, and stearic acid, which may be plant- or animal-derived
_______________________________

In [138]:
yuh = example_ingred_info[example_ingred_info.web.notna()]
print('Ingredient Proposed Risks According to Online Resources:')
for i in range(len(yes)):
    print(f'{yuh.ingredient.iloc[i]}')
    
    websites = yuh.web.iloc[0].split(', ')
    for j in range(len(websites)):
        print(websites[j])
    print('__')

Ingredient Proposed Risks According to Online Resources:
parfum
EWG Skin Deep: https://www.ewg.org/skindeep/ingredients/702512-FRAGRANCE/
Pubmed: https://pubmed.ncbi.nlm.nih.gov/?term=fragrance+ingredient+safety
__
talc
EWG Skin Deep: https://www.ewg.org/skindeep/ingredients/702512-FRAGRANCE/
Pubmed: https://pubmed.ncbi.nlm.nih.gov/?term=fragrance+ingredient+safety
__
cetrimonium bromide
EWG Skin Deep: https://www.ewg.org/skindeep/ingredients/702512-FRAGRANCE/
Pubmed: https://pubmed.ncbi.nlm.nih.gov/?term=fragrance+ingredient+safety
__


In [None]:
ha = example_ingred_info[example_ingred_info.cunt.notna()]
ha.ingredient.values

ha.ingredient == 

In [150]:
benefit_count = example_ingred_info['function'].dropna(
    ).str.split(', ').apply(Counter).reset_index(drop=True)
list_of_dicts = list(benefit_count)
merged_dict = {}
for dict in list_of_dicts:
    for key, value in dict.items():
        if key in merged_dict:
            merged_dict[key] += value
        else:
            merged_dict[key] = value
merged_dict

{'humectant': 4,
 'emollient': 4,
 'antioxidant': 5,
 'fragrance': 2,
 'absorbent': 1,
 'solvent': 1,
 'preservative': 3,
 'thickener': 1,
 'stabilizer': 1,
 'emulsifier': 6,
 'surfactant': 1,
 'exfoliator': 1,
 'cleansing agent': 4}

In [135]:
yuh.web.iloc[0].split(', ')

['EWG Skin Deep: https://www.ewg.org/skindeep/ingredients/702512-FRAGRANCE/',
 'Pubmed: https://pubmed.ncbi.nlm.nih.gov/?term=fragrance+ingredient+safety']

In [86]:
(example_ingred_info.petroleum_oil_based.value_counts(normalize=True) * 100).round(2)

no    100.0
Name: petroleum_oil_based, dtype: float64

In [84]:
example_ingred_info[example_ingred_info.petroleum_oil_based == 'no']

Unnamed: 0,ingredient,is_vegan_friendly,not_vegan_reason,petroleum_oil_based,plant_oil_based,plant_oil_kind,mineral_based,plant_nonoil_based,animal_based,paraben_based,fragrance_based,is_synthetic,is_natural,function,proposed_risks,known_benefits,other_names,suflate_based,web,cunt
2,glycerin,no,mainly made from animal fats/ sometimes vegeta...,no,yes,"coconut, palm, soybean",no,no,yes,no,no,no,yes,humectant,no,"hydration, anti-aging","1,2,3-propanetriol, 1,2,3-trihydroxypropane, 1...",no,,
7,niacinamide,yes,,no,no,,no,yes,no,no,no,no,yes,"emollient, humectant, antioxidant",no,"anti-aging, pore minimizer, soothing","3-aminopyridine, 3-carbamoylpyridine, 3-pyridi...",no,,
10,parfum,maybe,parfum refers to fragrance which can be made f...,no,maybe,,no,maybe,maybe,no,yes,yes,no,fragrance,yes,,"aroma, fragrance",no,https://www.ewg.org/skindeep/ingredients/70251...,
11,talc,yes,,no,no,,yes,no,no,no,no,no,yes,absorbent,yes,oil control,"cosmetic talc, french chalk, magnesium silicat...",no,https://www.ewg.org/skindeep/ingredients/70642...,"Canada, EU"
17,pentylene glycol,yes,,no,no,,no,yes,no,no,no,yes,yes,"humectant, solvent, preservative",no,hydration,"1,2-dihydroxypentane, 1,2-pentanediol, pentane...",no,,
18,hydroxyethyl acrylate/sodium acryloyldimethyl ...,maybe,"is synthetically made, unknown about if vegan",no,maybe,,no,maybe,maybe,no,no,yes,no,"thickener, stabilizer, emulsifier",no,,hydroxyethyl acrylate/ sodium acryloyldimethyl...,no,,
19,ascorbyl palmitate,yes,,no,no,,no,yes,no,no,no,yes,no,antioxidant,no,"anti-aging, dark spot fading, evens skin tone","6-hexadecanoate l-ascorbic acid, 6-o-palmitoyl...",no,,
20,cetrimonium bromide,yes,,no,yes,coconut,no,no,no,no,no,yes,yes,"surfactant, emulsifier, preservative",yes,,"1-hexadecanaminium, n,n,n-trimethyl-, bromide,...",no,https://www.ewg.org/skindeep/ingredients/71762...,
21,citric acid,yes,,no,no,,no,yes,no,no,no,no,yes,"antioxidant, exfoliator",no,,"1,2,3-propanetricarboxylic acid, 2-hydroxy-, 1...",no,,
22,coco-glucoside,yes,,no,yes,coconut,no,yes,no,no,no,no,yes,cleansing agent,no,,,no,,


In [55]:
yes

yes      58.82
maybe    29.41
no       11.76
Name: is_vegan_friendly, dtype: float64

In [35]:
#example_ingred_info['mineral_based'].value_counts()

In [63]:
# species = (
#     "petroleum",
#     "plant oil",
#     "mineral",
# )
# counts = {
#     "yes": np.array([0, 5, 2]),
#     "no": np.array([15, 9, 14]),
#     "maybe" : np.array([0, 2, 0])
# }
# width = 0.5

# fig, ax = plt.subplots()
# bottom = np.zeros(3)
# colors = ['black', 'orange', 'yellow']
# counter = 0
# for boolean, weight_count in counts.items():
#     p = ax.bar(species, weight_count, width, color = colors[counter],label=boolean, bottom=bottom)
#     bottom += weight_count
#     counter +=1

# #ax.set_title("Number of penguins with above average body mass")
# ax.legend(loc="upper right")

# #fig.figsize(100)
# plt.show()

WIP making some graphs :)

In [64]:
# mention what skin types there are: sensitive, dry, oily, normal, and combination -> breakdown benefits that could
# well with specific skin types

In [71]:
'hydration' in merged_dict.keys()

True

In [None]:
merged_dict['hydration']