## **Q5** How are popular brands rated by Good On You? *(Megan)*

### Qualitative:
#### Problem - 
- How can we learn more about the sustainability of popular brands?
#### Hypothesis & Claim - 
- We will extract data from Good On You's website to create a dataset of sustainability ratings for brands.
- We will use this data to compare sustainability practices of different brands and understand possible factors that go into a sustainability rating.
#### Context, Motivation & Rationale - 
- A *brand's* sustainability practices play a major role in determining the sustainability of a specific *product*, so it's important that we have that information.
- We can also examine Good On You's brand descriptions to identify the criteria they use to evaluate sustainability ratings.
#### Definitions, Data, and Methods - 
- For each of the brands, use Beautiful Soup to go to its webpage and get the overall rating, subratings (Planet, People, Animals), and description/reasoning.
#### Assumptions - 
- Good On You has done extensive research to provide accurate and comprehensive sustainability ratings for brands.
- Good On You considers a variety of factors/criteria related to sustainability practices.

### Quantitative:

In [1]:
# step 1: create a list of brands

brands = [
    'Princess Polly',
    'Brandy Melville',
    'Shein',
    'Nike',
    'Abercrombie & Fitch',
    'Amazon',
    'ASOS',
    'Forever 21', 
    'American Eagle',
    'Alo',
    'Reformation',
    'Acne Studios',
    'Alice + Olivia',
    'Sandy Liang',
    'Billabong',
    'Adidas',
    'Aritzia',
    'Uniqlo',
    'Area',
    'Balenciaga',
    'Bottega Veneta',
    'Brooks Brothers',
    'Burberry',
    'Chanel',
    'Coach',
    'Fendi',
    'Gucci',
    'Hermes',
    'Louis Vuitton',
    'Prada',
    'Ralph Lauren',
    'Saint Laurent',
    'Stella McCartney',
    'Telfar',
    'The Row',
    'Theory',
    'Tom Ford',
    'Tory Burch',
    'Valentino',
    '7 For All Mankind',
    "Arc'teryx",
    'aventura',
    'Banana Republic',
    'Boden',
    'Buck Mason',
    'Calvin Klein',
    'Carhartt',
    'Christy Dawn',
    'Columbia',
    'Cotopaxi',
    'Dickies',
    'Djerf Avenue',
    'Doen',
    'Edikted',
    'Everlane',
    'Faithfull the Brand',
    'Frankies Bikinis',
    'Girlfriend Collective',
    'Good American',
    'House of Sunny',
    'J.Crew',
    "Levi's",
    'Madewell',
    'Organic Basics',
    'Pact',
    'Patagonia',
    'prAna',
    'Quince',
    'RE/DONE',
    'Réalisation Par',
    'REI',
    'Sezane',
    'Spanx',
    'Summersalt',
    'tentree',
    'The North Face',
    'Tommy Hilfiger',
    'True Religion',
    'Wrangler',
    'Yes Friends',
    'Aeropostale',
    'Boohoo',
    'Cider',
    'Fashion Nova',
    'GUESS',
    'Hollister',
    'Hot Topic',
    'House of CB',
    'Mango',
    'Missguided',
    'Nasty Gal',
    'PacSun',
    'PrettyLittleThing',
    'Primark',
    'Romwe',
    'Temu',
    'Topshop',
    'Torrid',
    'Under Armour',
    "Victoria's Secret",
    'Yesstyle'
]

# brands added manually (only overall rating is provided): 
# - Ann Taylor
# - Aerie
# - Garage
# - Pink

In [2]:
# step 2: get data (ratings and description) from each brand

import requests
from bs4 import BeautifulSoup
import pandas as pd

all_brand_info = []
cols = ['brand', 'overall_rating', 'planet_score', 'people_score', 'animals_score', 'description']

url = 'https://directory.goodonyou.eco/brand/'

for brand in brands:
    try:
        # convert brand name to url format
        brand_converted = brand.lower() # lowercase
        brand_converted = brand_converted.replace(' ', '-') # replace spaces with dashes
        brand_converted = brand_converted.replace('&', 'and') # replace '&' with 'and'
        brand_converted = brand_converted.replace('+', '-') # replace '+' with '-'
        brand_converted = brand_converted.replace("'", '') # remove apostrophes
        brand_converted = brand_converted.replace('.', '') # remove periods
        brand_converted = brand_converted.replace('/', '') # remove slashes
        brand_converted = brand_converted.replace('é', 'e') # remove accent

        # get data on each brand's page
        response = requests.get(url+brand_converted)
        soup = BeautifulSoup(response.text, 'html.parser')

        # note: when finding classes on GOU's website, you might need to disable JavaScript (since BeautifulSoup can't load dynamic content)

        # get overall rating 
        overall_rating = soup.find('h6', class_='StyledHeading-sc-1rdh4aw-0 jNSEQB id__OverallRating-sc-12z6g46-7 cjSjNJ')
        overall_rating = overall_rating.text.split(': ')[1]

        # get subratings for planet, people, and animals
        subratings = soup.find_all('div', class_='id__RatingSingle-sc-12z6g46-9 ksJKxw')
    
        # remove category name from text 
        subratings[0] = subratings[0].text.split('Planet')[1]
        subratings[1] = subratings[1].text.split('People')[1]
        subratings[2] = subratings[2].text.split('Animals')[1]

        # if there's a rating, convert to int (makes it easier to analyze later)
        for i, rating in enumerate(subratings):
            if rating != 'Not applicable':
                subratings[i] = int(rating.split(' ')[0])

        # get description/justification
        text = soup.find('div', class_='id__BodyText-sc-12z6g46-15 eUqrmK').text
        
        # create new list of current brand info and add data
        brand_info = []
        brand_info.append(brand)
        brand_info.append(overall_rating)
        brand_info.append(subratings[0])
        brand_info.append(subratings[1])
        brand_info.append(subratings[2])
        brand_info.append(text)
        
        # add to overall list of brand info
        all_brand_info.append(brand_info)
    except:
        print(f"{brand} is not in Good On You's Drectory")

In [3]:
# step 3: add certain brands (with limited data provided) manually 

# aerie
aerie_brand_info = []
aerie_brand_info.append('Aerie')
aerie_brand_info.append(2)
aerie_brand_info.append('Not applicable')
aerie_brand_info.append('Not applicable')
aerie_brand_info.append('')

# ann taylor
ann_taylor_brand_info = []
ann_taylor_brand_info.append('Ann Taylor')
ann_taylor_brand_info.append(2)
ann_taylor_brand_info.append('Not applicable')
ann_taylor_brand_info.append('Not applicable')
ann_taylor_brand_info.append('')

# garage
garage_brand_info = []
garage_brand_info.append('Garage')
garage_brand_info.append(2)
garage_brand_info.append('Not applicable')
garage_brand_info.append('Not applicable')
garage_brand_info.append('')

# pink
pink_brand_info = []
pink_brand_info.append('Pink')
pink_brand_info.append(2)
pink_brand_info.append('Not applicable')
pink_brand_info.append('Not applicable')
pink_brand_info.append('')

all_brand_info.append(aerie_brand_info)
all_brand_info.append(ann_taylor_brand_info)
all_brand_info.append(garage_brand_info)
all_brand_info.append(pink_brand_info)

In [4]:
# step 4: create dataframe

brand_df = pd.DataFrame(all_brand_info, columns=cols)

# replace Good On You's categories with numerical ratings
# source: https://saturncloud.io/blog/how-to-convert-categorical-data-to-numerical-data-with-pandas
brand_df['overall_rating'] = brand_df['overall_rating'].replace({
    'We avoid': 1,
    'Not good enough': 2,
    "It's a start": 3,
    'Good': 4,
    'Great': 5
})

In [5]:
brand_df

Unnamed: 0,brand,overall_rating,planet_score,people_score,animals_score,description
0,Princess Polly,2,2,2,4,Our “Planet” rating evaluates brands based on ...
1,Brandy Melville,1,1,1,0,This brand provides insufficient relevant info...
2,Shein,1,1,1,2,Our “Planet” rating evaluates brands based on ...
3,Nike,3,3,3,2,Our “Planet” rating evaluates brands based on ...
4,Abercrombie & Fitch,2,2,2,2,Abercrombie & Fitch is owned by Abercrombie Ab...
...,...,...,...,...,...,...
100,Yesstyle,1,1,1,0,This brand provides insufficient relevant info...
101,Aerie,2,Not applicable,Not applicable,,
102,Ann Taylor,2,Not applicable,Not applicable,,
103,Garage,2,Not applicable,Not applicable,,


In [None]:
## step 5: export to a CSV file
# this CSV file is saved in the 'data' folder
brand_df.to_csv('../data/brand_info.csv')

### Qualitative:
#### Answer/Update to Question/Claim
- How can we learn more about the sustainability of popular brands?
   - We can use Good On You, a source for fashion sustainability ratings, to extract ratings and rationales for popular clothing brands.
#### Summary & Re-contextualization
- We were able to get sustainability ratings for 105 brands (luxury, sustainable, fast fashion, etc.).
#### Uncertainty, Limitations & Caveats
- Some brands have not been rated by Good On You.
- So far, we are only relying on one source for ratings.
#### New Problems & Next Steps
- We plan to cross-reference these ratings with other fashion sustainability sites, such as Eco-Stylist and Sustainable Review. This will ensure a more comprehensive and balanced assessment of the brands' sustainability practices and might also give us a broader selection of brands to analyze.

## **Q6** How can we compile a list of sustainable brands recommended by Good On You? *(Megan)*

### Qualitative:
#### Problem - 
- How can we find all of the sustainable brands that have been recommended by Good On You?
#### Hypothesis & Claim - 
- We should be able to extract data from Good On You's website to create a dataset of sustainability ratings for brands.
- We will use this data to compare sustainability practices of different brands and understand possible factors that go into a sustainability rating.
#### Context, Motivation & Rationale - 
- We want our Chrome extension to provide more sustainable alternatives to users, so we need a set of possible sustainable brands to recommend.
- We also aim to analyze the reasoning behind these ratings to understand the factors that Good On You took into consideration.
#### Definitions, Data, and Methods - 
- For each of the categories listed in the directory, use Selenium to parse through their 100 recommended brands and extract the overall rating, subratings (Planet, People, Animals), and description/reasoning.
#### Assumptions - 
- Good On You's description of each sustainable brand is informative enough that we are able to understand *why* they are considered sustainable.

### Quantitative

In [6]:
## step 1: get brands recommended by Good On You

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time

categories = [
    'tops',
    'dresses',
    'basics',
    'bottoms',
    'denim',
    'outerwear',
    'knitwear',
    'activewear',
    'sleepwear'
]

brands = set()

# set Chrome options for headless mode
chrome_options = Options()

# initialize WebDriver with headless mode
driver = webdriver.Chrome(options=chrome_options)

for category in categories:
    # open webpage
    driver.get('https://directory.goodonyou.eco/categories/' + category)

    # get initial height of the page
    last_height = driver.execute_script("return document.body.scrollHeight")

    while True:
        # scroll
        # source: https://stackoverflow.com/questions/73792388/how-to-scroll-to-the-bottom-of-the-page-with-selenium-python
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        
        # wait for 2 seconds
        time.sleep(2)
        
        # get new height of page
        new_height = driver.execute_script("return document.body.scrollHeight")
        
        # stop once it gets to the bottom
        if new_height == last_height:
            break
        
        # update height
        last_height = new_height

    # 100 results are now shown on the page
    for i in range(1, 101):
        # get xpath for each brand
        brand_xpath = f'//*[@id="__next"]/div/div[4]/div/div[2]/div/div/div[{i}]/div/div/div[2]/h5/a'

        # get brand element (after 5 second delay)
        brand_element = WebDriverWait(driver, 5).until(
            EC.visibility_of_element_located((By.XPATH, brand_xpath))
        )

        # get brand name and link
        # source for link: https://stackoverflow.com/questions/54862426/python-selenium-get-href-value
        brand_name = brand_element.text
        brand_link = brand_element.get_attribute('href')

        brands.add((brand_name, brand_link))
     
# close the WebDriver
driver.quit()

In [7]:
## step 2: get data (ratings and description) from each brand

# create list of all recommended brands + their info
recommended = []
cols = ['brand', 'overall_rating', 'planet_score', 'people_score', 'animals_score', 'description']

for brand in brands:
    brand_name = brand[0]
    brand_href = brand[1]

    # get data on each brand's page
    response = requests.get(brand_href)
    soup = BeautifulSoup(response.text, 'html.parser')

    # get overall rating 
    overall_rating = soup.find('h6', class_='StyledHeading-sc-1rdh4aw-0 jNSEQB id__OverallRating-sc-12z6g46-7 cjSjNJ')
    overall_rating = overall_rating.text.split(': ')[1]
    
    # get subratings for planet, people, and animals
    subratings = soup.find_all('div', class_='id__RatingSingle-sc-12z6g46-9 ksJKxw')
    
    # remove category name from text 
    subratings[0] = subratings[0].text.split('Planet')[1]
    subratings[1] = subratings[1].text.split('People')[1]
    subratings[2] = subratings[2].text.split('Animals')[1]

    # if there's a rating, convert to int (makes it easier to analyze later)
    for i, rating in enumerate(subratings):
        if rating != 'Not applicable':
            subratings[i] = int(rating.split(' ')[0])

    # get description/justification
    text = soup.find('div', class_='id__BodyText-sc-12z6g46-15 eUqrmK').text

    # create new list of current brand info and add data
    brand_info = []
    brand_info.append(brand_name)
    brand_info.append(overall_rating)
    brand_info.append(subratings[0])
    brand_info.append(subratings[1])
    brand_info.append(subratings[2])
    brand_info.append(text)
    
    # add to overall list of brand info
    recommended.append(brand_info)

In [8]:
## step 3: create dataframe

recommended_df = pd.DataFrame(recommended, columns=cols)

# replace Good On You's categories with numerical ratings
# source: https://saturncloud.io/blog/how-to-convert-categorical-data-to-numerical-data-with-pandas
recommended_df['overall_rating'] = recommended_df['overall_rating'].replace({
    'We avoid': 1,
    'Not good enough': 2,
    "It's a start": 3,
    'Good': 4,
    'Great': 5
})

In [None]:
recommended_df

Unnamed: 0,brand,overall_rating,planet_score,people_score,animals_score,description
0,Sami Miro Vintage,4,5,3,4,Our “Planet” rating evaluates brands based on ...
1,Ognx,3,3,3,Not applicable,Our “Planet” rating evaluates brands based on ...
2,Mantis World,5,4,4,5,Mantis World's environment rating is 'good'. I...
3,Le Gramme,3,4,2,Not applicable,Our “Planet” rating evaluates brands based on ...
4,WAXON,4,3,3,4,WAXON's environment rating is 'it's a start'. ...
...,...,...,...,...,...,...
451,Luva Huva,4,4,3,Not applicable,Our “Planet” rating evaluates brands based on ...
452,Pareto,4,5,3,5,Our “Planet” rating evaluates brands based on ...
453,Viktoria and Woods,3,3,3,3,Viktoria and Woods's environment rating is 'it...
454,London W11,4,5,3,4,London W11's environment rating is 'great'. It...


In [None]:
## step 4: export to a CSV file

# this CSV file is saved in the 'data' folder
recommended_df.to_csv('../data/gou_recommended.csv')

### Qualitative:
#### Answer/Update to Question/Claim
- How can we find all of the sustainable brands that have been recommended by Good On You?
   - We can use Selenium to find 100 sustainable brands for each clothing category.
#### Summary & Re-contextualization
- We were able to get sustainability ratings for 456 brands from 9 clothing categories.
#### Uncertainty, Limitations & Caveats
- While Good On You provides valuable sustainability ratings for a wide range of brands, it is important to note that their database may not encompass all sustainable brands in the market. Some brands, especially smaller or newer ones that may not have been evaluated by Good On You yet, could be missing from their ratings.
#### Next Steps
- Our Chrome extension will have a feature to recommend more sustainable alternatives from these brands.