# Question: Does the company promote diversity and inclusion?

Although many ESG issues may not be financially material to all industries, we believe that diversity and inclusion of all people -- regardless of gender, race, sexual orientation, religion, background, and other similar traits -- is one that can be viewed as material across the board.

Embracing different types of people is the moral thing to do, but it's also the best financial thing to do. In 2018, management consulting firm McKinsey & Co. released the report "Why Diversity Matters"; it examined data sets from 366 public companies in the U.S., Canada, Latin America, and the U.K. Companies in the top quartile for gender, racial, or ethnic diversity were more likely to generate financial returns above the national medians for their industry; the converse was also true. Meanwhile, McKinsey concluded that "diversity is probably a competitive differentiator that shifts market share toward more diverse companies over time."

Why? Plenty of data shows that diverse groups make better decisions than homogeneous ones. Narrow-minded in-group thinking, in which members are more likely to have similar experiences and viewpoints and come to the same conclusions, can be terrible for decision-making. Meanwhile, groups in which many different life experiences and perspectives are represented help set the stage for better decisions.

Underlining this principle is the correlation between what a company does and who runs it. For example, if a retailer sells jewelry aimed at female consumers, having an entirely male management team and board of directors would seem absurdly mismatched -- it would result in missing valuable perspectives about the very people it's aiming to attract as customers.

In ESG investing, we seek positive elements like a transparent workforce-diversity disclosure (some companies don't disclose their staff makeup at all); composition of its board of directors and executive leadership; and internal programs and policies that foster or support diversity -- adequate paid family leave, professional programs for people of color, and fair hiring practices.

Companies with poor diversity statistics, or lawsuits related to discrimination and harassment, will be noted for their heightened risk and probably disqualified from inclusion.

# Data Sources

- SEC Reports 

# Imports

In [4]:
import sec_edgar_downloader
from sec_edgar_downloader import Downloader
import re
import os
import shutil
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Getting Company Ticker Symbols

In [5]:
# Getting site
page = requests.get('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
soup = BeautifulSoup(page.content, 'html.parser')

# Finding table
table = soup.find(id='constituents').tbody

# Getting table rows
table_rows = table.find_all('tr')

# Finding company ticker in row
companies = []
for row in table_rows[1:70]:
    elements = row.find_all('td')
    company = elements[0].text
    company = company.replace('\n','')
    companies.append(company)

# Getting Data

In [8]:
def get_dandi_df(companies):
    
    dandi_df = pd.DataFrame(columns=['company','diversity_inclusion_mentions'])
    
    for company in companies:
        try:
            d = Downloader()
            d.get('DEF 14A', 
                  '{}'.format(company), 
                  after='2015-01-01',
                  download_details=False,
                  query = 'diversity inclusion')

            text_files = []
            for root, dirs, files in os.walk("/Users/MichaelWirtz/Desktop/portfolio/programming_projects/long_term_trading_bot/archive/sec-edgar-filings/{}".format(company)):
                for file in files:
                    if file.endswith(".txt"):
                         text_files.append(os.path.join(root, file))

            dandi_count = 0

            for file in text_files:
                content = open(file).read()
                valids = re.sub(r"[^a-z]+", ' ', content)
                content_count = valids.count('diversity') + valids.count('inclusion')
                dandi_count += content_count

            dandi_df = dandi_df.append({'company': company,
                                        'diversity_inclusion_mentions': dandi_count}, ignore_index=True)

            shutil.rmtree('./sec-edgar-filings')
        except:
            dandi_df = dandi_df.append({'company': company,
                                        'diversity_inclusion_mentions': 0}, ignore_index=True)
    
    dandi_df = dandi_df.sort_values(by='diversity_inclusion_mentions', ascending=False)
    
    return dandi_df  

In [9]:
get_dandi_df(companies)

Unnamed: 0,company,diversity_inclusion_mentions
11,A,757
52,T,431
68,BIO,277
62,BK,238
61,BAC,237
...,...,...
53,ATO,24
34,AMP,14
26,AMCR,11
65,BRK.B,0
