Starting from the df_most_diverse DataFrame built in our earlier College Data Collection step, we scraped course descriptions from the websites of the 20 most diverse four-year colleges using BeautifulSoup and, for JavaScript-rendered pages, Selenium.

*We omitted colleges whose websites could not be scraped easily with BeautifulSoup and Selenium alone.*

Because the HTML and element structures differ from site to site, each college website needed its own scraping method.
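Although the selectors differ per site, most of the BeautifulSoup cells in this notebook follow the same two-step shape: find each course block, then pull the description element out of it. The sketch below illustrates that shape on a made-up HTML snippet. The function name `extract_descriptions`, its parameters, and the sample markup are assumptions for illustration and are not part of the original notebook.

```python
from bs4 import BeautifulSoup


def extract_descriptions(html, block_selector, desc_selector):
    """Return the stripped description text found inside each course block."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for block in soup.select(block_selector):
        desc = block.select_one(desc_selector)
        if desc is None:
            # Some course blocks carry a title but no description element
            continue
        text = desc.get_text(" ", strip=True)
        if text:
            results.append(text)
    return results


# Made-up markup, for illustration only
sample_html = """
<div class="courseblock"><p class="courseblockdesc"> Intro to the field. </p></div>
<div class="courseblock"><p class="courseblocktitle">Title only</p></div>
<div class="courseblock"><p class="courseblockdesc">Advanced seminar.</p></div>
"""

found = extract_descriptions(sample_html, "div.courseblock", "p.courseblockdesc")
print(found)
```

Passing the selectors as arguments keeps the site-specific part to two strings, while the `None` check covers blocks that have no description.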

In [1]:
import requests
from bs4 import BeautifulSoup

In [46]:
!pip install selenium



In [47]:
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

In [48]:
!apt-get install chromium-chromedriver


In [49]:
def web_driver():
  options = webdriver.ChromeOptions()
  options.add_argument('--verbose')
  options.add_argument('--no-sandbox')
  options.add_argument('--headless')
  options.add_argument('--disable-gpu')
  options.add_argument('--window-size=1920,1200')
  options.add_argument('--disable-dev-shm-usage')
  driver = webdriver.Chrome(options=options)
  return driver

In [50]:
driver = web_driver()

# Columbia University


In [7]:
driver.get('https://doc.sis.columbia.edu/#sel/WMST_Spring2024.html')
table = driver.find_element(By.XPATH, '//table[@class="course-listing"]')
rows = table.find_elements(By.XPATH, './/tbody/tr/td')

all_links = []
skip_count = 0

for row in rows:
  course = row.find_element(By.XPATH, './/a').get_attribute('href')
  if skip_count % 2 == 0:
    all_links.append(course)
  skip_count += 1
  time.sleep(.1)

In [8]:
columbia = []
for link in all_links:
  response = requests.get(link)
  soup = BeautifulSoup(response.text, "html.parser")

  description = soup.find('p')
  if description:
    description = description.text.strip()

    if description not in columbia:
      columbia.append(description)
  time.sleep(.1)

In [9]:
columbia[0:2]

['Combines critical feminist and anti-racist analyses of medicine with current research in epidemiology and biomedicine to understand health and health disparities as co-produced by social systems and biology.',
 'This course offers a chronological study of the Anglophone, Hispanophone, and Francophone insular Caribbean through the eyes of some of the region’s most important writers and thinkers. We will focus on issues that key Caribbean intellectuals--including two Nobel prize-winning authors--consider particularly enduring and relevant in Caribbean cultures and societies. Among these are, for example, colonization, slavery, national and postcolonial identity, race, class, popular culture, gender, sexuality, tourism and migration. This course will also serve as an introduction to some of the exciting work on the Caribbean by professors at Barnard College and Columbia University (faculty spotlights).']

# Yale University

In [11]:
url = 'https://catalog.yale.edu/ycps/courses/wgss/'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

table = soup.find('div', class_='sc_sccoursedescs')

yale = []
for course_block in table.find_all('div', class_='courseblock'):
  description_element = course_block.find('p', class_='courseblockdesc')

  if description_element:
    description = description_element.text.strip()
    yale.append(description)

  time.sleep(0.1)

yale[0:2]

['Overview of LGBTQ cultures and their relation to geography in literature, history, film, visual culture, and ethnography. Discussion topics include the historical emergence of urban communities; their tensions and intersections with rural locales; race, sexuality, gender, and suburbanization; and artistic visions of queer and trans places within the city and without. Emphasis is on the wide variety of U.S. metropolitan environments and regions, including New York City, Los Angeles, Miami, the Deep South, Appalachia, New England, and the Pacific Northwest. Enrollment limited to first-year students.\xa0 \u2002HUTTh 11:35am-12:50pm',
 'Exploration of scientific and medical writings on sexuality over the past century. Focus on the tension between nature and culture in shaping theories, the construction of heterosexuality and homosexuality, the role of scientific studies in moral discourse, and the rise of sexology as a scientific discipline. Enrollment limited to first-year students.\xa0

# Stanford University

In [12]:
url = 'https://archived-bulletin.stanford.mobi/coursedescriptions/femgen/'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

table = soup.find('div', class_='sc_sccoursedescs')
rows = table.find_all('div', class_='courseblock')

descriptions = []
for row in rows:
  descriptions.append(row.find('p', class_='courseblockdesc'))
  time.sleep(.1)

In [13]:
stanford = []

for description in descriptions:
  if 'Same as:' in description.text:
    cleaned_description = description.text.split('Same as:')[0].strip()
    stanford.append(cleaned_description)
  else:
    stanford.append(description.text.strip())

stanford[0:2]

['Taught by long-time community organizer, Beatriz Herrera. This course explores the theory, practice and history of grassroots community organizing as a method for developing community power to promoting social justice. We will develop skills for 1-on-1 relational meetings, media messaging, fundraising strategies, power structure analysis, and strategies organizing across racial/ethnic difference. And we will contextualize these through the theories and practices developed in the racial, gender, queer, environmental, immigrant, housing and economic justice movements to better understand how organizing has been used to engage communities in the process of social change. Through this class, students will gain the hard skills and analytical tools needed to successfully organize campaigns and movements that work to address complex systems of power, privilege, and oppression. As a Community-Engaged Learning course, students will work directly with community organizations on campaigns to ad

# Swarthmore College

In [14]:
url = 'https://www.swarthmore.edu/gender-sexuality-studies/courses'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")


table = soup.find('ul', class_='list-group acalog-course__listing')
rows = table.find_all('li', class_='list-group-item')

links = []
for row in rows:
  link = row.find('a')

  if link is not None:
    links.append(link['href'])

  time.sleep(.1)

In [15]:
swarthmore = []
for link in links:
    url = link
    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")

    table = soup.find('table', class_='table_default')
    td_element = soup.select_one('td.block_content')

    if td_element:
      main_text = td_element.get_text(separator=' ', strip=True)

      start_index = main_text.find('College Bulletin 2024-2025')
      end_index = main_text.find('Print (opens a new window)Help (opens a new window)', main_text.rfind('Print (opens a new window)Help (opens a new window)'))
      if start_index != -1 and end_index != -1:
          description = main_text[start_index + len('College Bulletin 2024-2025') : end_index].strip()
      else:
          description = main_text.strip()

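      index = description.find("window")
      end = description.find("Eligible")

      # Default to the full description so a value left over from the
      # previous course is never appended by mistake
      text = description
      if index != -1:
          # Slice to the end of the string when "Eligible" is absent
          text = description[index + 7: end if end != -1 else None]

      swarthmore.append(text)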
      time.sleep(.1)
    else:
      swarthmore.append("Description not found")

swarthmore[0:2]

[' ANTH 044. Gender, Sexuality, and Social Change (Cross-listed as PEAC 043 ) How has gender emerged as an analytical category? How has sexuality emerged as an analytical category? What role did discourses surrounding gender and sexuality play in the context of Western colonialism in the Global South historically as well as in the context of Western imperialism in the Global South today? How are gender and sexuality-based liberation understood differently around the world? What global social movements have surfaced to codify rights for women and LGBTQ populations? How has the global human rights apparatus shaped the experiences of women and queer communities? What is the relationship between gender and masculinity? What are the promises and limits of homonationalism and pinkwashing as theoretical frameworks in our understanding of LGBT rights discourses? When considering the relationship between faith and homosexuality, how are religious actors queering theology? How do we define socia

# Amherst College

In [16]:
driver.get('https://www.amherst.edu/academiclife/departments/sexuality_womens_gender_studies/courses')
table = driver.find_element(By.XPATH, '//div[@id="academics-course-list"]')

rows = table.find_elements(By.XPATH, './/div[@class="coursehead"]')

links = []
for row in rows:
  course = row.find_element(By.XPATH, './/a').get_attribute('href')
  links.append(course)
  time.sleep(.1)

In [17]:
amherst = []

for link in links:
  driver.get(link)
  table = driver.find_element(By.XPATH, '//div[@id="academics-course-list"]')
  all_text = table.text
  start = all_text.find('Description\n')
  start = start + 12
  end = all_text.find('Preference')
  description = all_text[start:end]
  amherst.append(description)
  time.sleep(0.1)

amherst[0:2]

["(Offered as CLAS 111 and SWAG 110) Since its invention in Athens, tragic drama has focused upward on the great or mighty as they fall but also outward on the disempowered as they are for once given public voice: women, slaves, and barbarians. The cosmic forces of fate and the gods play out along social fault lines with conflicting viewpoints. We look to a “hero,” but, changing his mask, a Greek actor could go from god to wife to peasant. This multiplicity complicates itself in modern stagings and films as they cast actors with specific gender and racial identities. Female actors now have indisputable claim on the once-male roles of Antigone, Cassandra, Medea, and Electra, as they do on Shakespeare’s Cleopatra. The dialects of tragic performance are multiple: translationese, Shakespeare, and Spanglish.\nIn this course we start with the formation of Hellenic identity and notions of heroism in Homer's Iliad and then look at the performance of plays by Aeschylus, Sophocles, Euripides, an

# University of Chicago

In [18]:
url = 'http://collegecatalog.uchicago.edu/thecollege/genderstudies/#genderandsexualitystudiescourses-general'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

tables = soup.find_all('div', class_='sc_courseblock')

all_rows = []

for table in tables:
    rows = table.find_all('div', class_=['courseblock', 'main'])
    all_rows.extend(rows)

chicago=[]
for row in all_rows:
  chicago.append(row.find('p', class_='courseblockdesc').text.strip())
  time.sleep(.1)

chicago[0:2]

['This is a one-quarter, seminar-style course for undergraduates. Its aim is triple: to engage scenes and concepts central to the interdisciplinary study of gender and sexuality; to provide familiarity with key theoretical anchors for that study; and to provide skills for deriving the theoretical bases of any kind of method. Students will produce descriptive, argumentative, and experimental engagements with theory and its scenes as the quarter progresses.',
 'In medieval literature, various modes of desire intersect in surprising ways: spiritual devotion unfolds through sensual longing, and personal pleasure intertwines with sacrificial love, producing structures of desire that are conflicting, disorienting, and not so dissimilar from our own. In this course, we will survey a range of late medieval genres to unpack the richly imaginative and experimental discourses of desire housed in fourteenth- and fifteenth-century England. Readings will include dream-vision poems like Pearl, where 

# Rice University


In [19]:
url = 'https://ga.rice.edu/programs-study/departments-programs/humanities/study-women-gender-sexuality/#coursestext'

response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

table = soup.find('div', class_='sc_sccoursedescs')
rows = table.find_all('div', class_='courseblock')

rice = []
for row in rows:
  text = row.find('p', class_='courseblockdesc')
  if text:
    text = text.text.strip()
    start = text.find('Description:')
    start = start + 12
    end = text.find('Cross-list:')
    text = text[start:end]
    rice.append(text)

  time.sleep(.1)

rice[0:2]

[' An introduction to the interdisciplinary study of these themes: 1) genders and sexualities as these intersect with race, class, migration status and other differences that shape our lives; 2) the social, political, and legal situations of women and LGBTQ+ people globally and in the United States; and 3) the production of gender and sexual identities and desires (e.g. queer, two-spirit, trans, nonbinary, asexual as well as normative).  We will also touch on efforts to theorize gender, sexuality, race, class, and other differences and become acquainted with the concept of engaged research. This course is required for the SGWS major and the SWGM minor',
 ' This class builds on SWGS\xa0100 to take a closer look at how sexualities and genders come to be known. Students will become familiar with queer, trans, and feminist theory and will see that key issues and debates appear in both scholarship and political activism globally and in the United States. The course addresses intersectional 
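Several cells in this notebook cut a description out of a longer string with `str.find` and a slice. When the end marker is missing, `find` returns -1, so the slice silently drops the last character of the text. The helper below is a minimal sketch of a safer version. The name `slice_between` and the sample strings are made up for illustration and do not appear in the original notebook.

```python
def slice_between(text, start_marker=None, end_marker=None):
    """Return the part of text after start_marker and before end_marker.

    A marker that is not found means "do not cut on that side", which
    avoids the off-by-one slices caused by using find() == -1 directly.
    """
    start = 0
    if start_marker:
        pos = text.find(start_marker)
        if pos != -1:
            start = pos + len(start_marker)
    end = len(text)
    if end_marker:
        pos = text.find(end_marker, start)
        if pos != -1:
            end = pos
    return text[start:end].strip()


# Made-up strings, for illustration only
with_both = "Description: Intro to gender studies. Cross-list: ANTH 100."
no_end = "Description: Intro to gender studies."

print(slice_between(with_both, "Description:", "Cross-list:"))
print(slice_between(no_end, "Description:", "Cross-list:"))

# The unguarded slice loses the final period when the end marker is absent
naive = no_end[no_end.find("Description:") + 12 : no_end.find("Cross-list:")]
print(repr(naive))
```

Unlike the cell above, this sketch also strips surrounding whitespace, which may or may not be wanted depending on how the descriptions are used later.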

# Johns Hopkins University


In [21]:
url = 'https://e-catalogue.jhu.edu/course-descriptions/study_of_women__gender____sexuality/'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

table = soup.find('div', class_='sc_sccoursedescs')
rows = table.find_all('div', class_='courseblock')

jhu = []

for row in rows:
    text_elements = row.find_all('div', class_='noindent')
    for text_element in text_elements:
        text = text_element.find_all('p', class_='courseblockextra noindent')
        for t in text:
          t = t.text.strip()
          end = t.find('Distribution Area')
          t = t[:end]
          jhu.append(t)

    time.sleep(0.1)

jhu[0:2]

['This course will serve as an intensive introduction to contemporary approaches to theories of gender and sexuality, and their relationship to cultural production and politics. Students will develop a historically situated knowledge of the development of feminist and queer scholarship in the 20th and 21st centuries, and consider the multiply intersecting forces which shape understandings of sexual and gender identity. We will consider both foundational questions (What is gender? Who is the subject of feminism? What defines queerness?) and questions of aesthetic and political strategy, and spend substantial time engaging with feminist and queer scholarship in comparative contexts. Students will be introduced to debates in Black feminism, intersectionality theory, third world feminism, socialist feminism, queer of colour critique, and trans* theory. We will read both canonical texts and recent works of scholarship, and the final weeks of the course will be devoted to thinking with our t

# Rutgers University - Newark

In [22]:
url = 'https://catalogs.rutgers.edu/generated/nwk-ug_1113/pg431.html'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

rutgers = []
table = soup.find_all('span', class_='course-desc')
for row in table:
  rutgers.append(row.text)
  time.sleep(0.1)

rutgers[0:2]

["Addresses the historical influences that have defined women's roles and\nexperiences and have contributed to current reevaluations of women's\nplace in modern society; provides an overview of developments in\nvarious fields. 21:988:201 emphasizes the humanities.\n21:988:202 emphasizes the social science perspectives.",
 '  Focuses on understanding culture from a feminist perspective. Explores ways in which gender influences and is influenced by class, ethnicity, race, nationality, language, and religion. ']
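The scraped strings often contain embedded newlines, non-breaking spaces (`\xa0`), and leading or trailing blanks, and some pages repeat the same description. One possible cleanup pass, sketched under the assumption that collapsing all runs of whitespace into single spaces is acceptable for later analysis, is shown below. The function name `clean_descriptions` and the sample list are illustrative only.

```python
def clean_descriptions(raw):
    """Collapse whitespace and drop empty or repeated descriptions, keeping order."""
    # str.split() with no arguments splits on any Unicode whitespace,
    # including newlines, "\xa0", and "\u2002"
    normalized = (" ".join(text.split()) for text in raw)
    # dict.fromkeys keeps the first occurrence of each string, in order
    return list(dict.fromkeys(t for t in normalized if t))


# Made-up input, for illustration only
raw = [
    "Addresses the historical\ninfluences.",
    "  Focuses on culture.\xa0 \u2002",
    "Focuses on culture.",
    "",
]

cleaned = clean_descriptions(raw)
print(cleaned)
```

Normalizing before deduplicating matters here: the second and third entries only compare equal once their whitespace has been collapsed.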

# University of Southern California

In [23]:
url = "https://catalogue.usc.edu/content.php?catoid=12&navoid=4099"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

table = soup.find_all('ul', class_='program-list')
courses_table = table[-1]
courses = courses_table.find_all('li')

domain = 'https://catalogue.usc.edu/'

usc = []

for course in courses:
  link = course.find('a').get('href')
  course_url = domain + link
  response = requests.get(course_url)
  soup = BeautifulSoup(response.text, "html.parser")
  container = soup.find('td', class_="block_content")
  des = container.find('hr').find_next_sibling(string=True)

  if des != '' and des != ' ' and des != '(Enroll in ':
    usc.append(des)

  time.sleep(0.1)

usc[0:2]

['Exploration of identity development in terms of social, political, and cultural constructs; examination of collegiate athletics and the contributions of women of color. ',
 'Multidisciplinary survey of gender assumptions in relation to sexuality, mental health, social and political relations, and artistic expression. ']
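The cell above builds each course URL by concatenating `domain + link`, which works as long as every `href` is a plain relative path. If a catalog ever returns root-relative or absolute links, `urllib.parse.urljoin` from the standard library handles all three cases. The sketch below uses made-up `href` values, not links taken from the USC catalog.

```python
from urllib.parse import urljoin

base = "https://catalogue.usc.edu/"

# Hypothetical href values, for illustration only
relative = "preview_course_nopop.php?catoid=12&coid=1"
root_relative = "/content.php?catoid=12&navoid=4099"
absolute = "https://example.edu/course/1"

joined = [urljoin(base, href) for href in (relative, root_relative, absolute)]
for url in joined:
    print(url)
```

With plain concatenation, the root-relative link would gain a double slash and the absolute link would be glued onto the domain; `urljoin` avoids both.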

# University of Washington - Tacoma Campus

In [24]:
url = 'https://www.washington.edu/students/crscatt/tegl.html'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

table = soup.find_all('a')

uwash = []
for row in table:
  courses = row.find_all('p')
  for course in courses:
    desc = course.text.strip()
    start = desc.find('DIV') + 3
    end= desc.find('View course')
    desc = desc[start:end]

    uwash.append(desc)

  time.sleep(0.1)

uwash[0:2]

['Introduces theories, methods, and analytical frameworks for understanding the intersection of race, class, gender, and sexuality by examining key thinkers, texts, ideas, and concepts from across the humanities and social sciences. Teaches the core values and ideals of social justice that are foundational to ethnic, gender, and labor studies.',
 'Introduces foundational and interdisciplinary concepts about human diversity in the United States and critical multinational theory. Covers an examination of historical and contemporary issues of power, privilege and difference, and micro and macro methods for creating positive social change, reducing inequality and achieving equity.']

# George Mason University

In [25]:
url = 'https://catalog.gmu.edu/colleges-schools/humanities-social-sciences/women-gender-studies/'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

courseblocks = soup.find_all('div', class_='courseblocklevel')

gmu = []

for courses in courseblocks:
  course_list = courses.find_all('div', class_='courseblock')
  for course in course_list:
    des = course.find('div', class_="courseblockdesc").text.strip()
    remove_index = des.find('Offered by')

    if remove_index != -1:
        des = des[:remove_index].strip()

    gmu.append(des)

  time.sleep(0.1)

gmu[0:2]

['Explores ways women are portrayed around the world in advertising, film, TV, cartoons, and news media; literature and religious texts; as well as photography, and the visual and performing arts. Through interdisciplinary study, students evaluate the powerful effects these representations have on the political, economic, and social lives of women throughout the world.',
 'Interdisciplinary introduction to women’s, gender and sexuality studies, encompassing key concepts in the field, history of women’s movements and women’s studies in America, cross-cultural constructions of gender, and a thematic emphasis on the diversity of women’s experience across class, race, and cultural lines.']

# University of Maryland-Baltimore County

In [26]:
url = 'https://catalog.umbc.edu/content.php?filter%5B27%5D=GWST&filter%5B29%5D=&filter%5Bkeyword%5D=&filter%5B32%5D=1&filter%5Bcpage%5D=1&cur_cat_oid=36&expand=&navoid=2613&search_database=Filter#acalog_template_course_filter'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

all_tables = soup.find_all('table', class_='table_default')
table = all_tables[-1]
courses = table.find_all('td', class_='width')

umb = []

domain = 'https://catalog.umbc.edu/'

for course in courses:
  link = course.find('a').get('href')
  course_url = domain + link
  response = requests.get(course_url)
  soup = BeautifulSoup(response.text, "html.parser")
  container = soup.find('td', class_="block_content")
  des = container.find('hr').find_next_sibling(string=True)
  umb.append(des.strip())
  time.sleep(0.1)

umb[0:2]

['Drawing on feminist, queer, social, and critical race theory, this course examines the status of the body in both historical and contemporary debates about identity, representation, and politics. We tend to take the body for granted as the ground of experience and knowledge, but this course challenges that common sense, asking how the body is produced, managed, and deployed in\xa0various ways to discipline and manage populations. We will also investigate the political possibilities of body work to resist and reshape these same disciplinary practices, paying particular attention to “queer” forms of embodiment.',
 'This course introduces students to the interdisciplinary field of gender and women’s studies, feminist scholarship, and feminist activism. We will examine the relationship between gender, power, and the production of feminist knowledge in a variety of fields, including psychology, sociology, literature, media studies and history. The course provides critical perspectives on 

# University of Massachusetts-Boston

In [27]:
url = 'https://catalog.umb.edu/content.php?filter%5B27%5D=WGS&filter%5B29%5D=&filter%5Bkeyword%5D=&filter%5B32%5D=1&filter%5Bcpage%5D=1&cur_cat_oid=54&expand=&navoid=9274&search_database=Filter#acalog_template_course_filter'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

all_tables = soup.find_all('table', class_='table_default')
table = all_tables[-1]
courses = table.find_all('td', class_='width')

mass = []

domain = 'https://catalog.umb.edu/'

for course in courses:
  link = course.find('a').get('href')
  course_url = domain + link
  response = requests.get(course_url)
  soup = BeautifulSoup(response.text, "html.parser")
  container = soup.find('td', class_="block_content")

  des_tag = container.find('strong', string='Description:')
  des_text = des_tag.find_next_sibling(string=True)

  mass.append(des_text.strip())
  time.sleep(0.1)

mass[0:2]

['This course focuses on literary expressions and representations of the desire for and the crises of human rights. The various literary genres (poetry, fiction, drama, memoir, and essay) evoke the yearning of peoples to be awarded the right to live in safety and with dignity so that they pursue meaningful lives, and these literary genres record the abuses of the basic rights of people as they seek to lead lives of purpose. This course will examine the ways in which the techniques of literature (e.g., narrative, description, point of view, voice, image) compel readers’ attention and bring us nearer to turn to human rights abuses and peoples’ capacities to survive and surmount these conditions.',
 'This interdisciplinary course examines how social constructions of gender and sexuality shape our day-to-day interactions with a variety of social institutions, such as the family and workplace, and contribute to systems of power and privilege. Through a careful examination of texts, films an

# University of Nevada-Las Vegas

In [35]:
url = 'https://catalog.unlv.edu/content.php?filter%5B27%5D=-1&filter%5B29%5D=&filter%5Bkeyword%5D=WMST&filter%5B32%5D=1&filter%5Bcpage%5D=1&cur_cat_oid=4&expand=&navoid=206&search_database=Filter#acalog_template_course_filter'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

all_tables = soup.find_all('table', class_='table_default')
table = all_tables[-1]
courses = table.find_all('td', class_='width')

nev = []

domain = 'https://catalog.unlv.edu/'

for course in courses:
  link = course.find('a').get('href')
  course_url = domain + link
  response = requests.get(course_url)
  soup = BeautifulSoup(response.text, "html.parser")
  container = soup.find('td', class_="block_content")
  des = container.find('hr').find_next_sibling(string=True).strip()

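  # Start from the full description so a value from the previous course is never reused
  cleaned = des
  if '(Same as' in des:
    cleaned = des.split('(Same as')[1]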

  if cleaned != '':
    nev.append(cleaned)

  time.sleep(0.1)

nev[0:2]

[' WMST 427B.) Study of gender and literature through the ages. Focus may be aesthetic, historical, or thematic.',
 ' WMST 440B.) Study of gender, sexuality, and literature from the beginning to the Early Modern period.']

# Massachusetts Institute of Technology

In [51]:
driver.get('https://catalog.mit.edu/subjects/wgs/#:~:text=Examines%20representations%20of%20race%2C%20gender,these%20social%20constructions%20in%20society.')

table = driver.find_element(By.XPATH, './/div[@id="sc_sccoursedescs"]')
rows = table.find_elements(By.XPATH, './/div[@class="courseblock"]')

mit = []
for row in rows:
  desc = row.find_element(By.XPATH, './/p[@class="courseblockdesc"]').text
  mit.append(desc)
  time.sleep(0.1)

mit[0:2]

['Drawing on multiple disciplines - such as literature, history, economics, psychology, philosophy, political science, anthropology, media studies and the arts - to examine cultural assumptions about sex, gender, and sexuality. Integrates analysis of current events through student presentations, aiming to increase awareness of contemporary and historical experiences of women, and of the ways sex and gender interact with race, class, nationality, and other social identities. Students are introduced to recent scholarship on gender and its implications for traditional disciplines.',
 "An interdisciplinary subject that examines questions of feminism, international women's issues, and globalization through the study of novels, films, critical essays, painting and music. Considers how women redefine the notions of community and nation, how development affects their lives, and how access to the internet and to the production industry impacts women's lives. Primary topics of interest include t

# University of California - Santa Cruz

In [38]:
url = 'https://registrar.ucsc.edu/catalog/archive/11-12/programs-courses/course-descriptions/fmstcourses.html'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

table = soup.find('div', class_='content contentBox')

rows = table.find_all('p')
rows = rows[3:]

ucsc = []
for row in rows:
  start = row.text.find('\n') + 1
  end= row.text.find('Prerequisite')
  desc = row.text[start:end]
  ucsc.append(desc)
  time.sleep(0.1)

ucsc[0:2]

['Introduces the core concepts underlying the interdisciplinary field-formation of feminist studies within multiple geopolitical contexts. Explores how feminist inquiry rethinks disciplinary assumptions and categories, and animates our engagement with culture, history, and society. Topics include: the social construction of gender; the gendered division of labor, production, and reproduction; intersections of gender, race, class, and ethnicity; and histories of sexuality. (Formerly Introduction to Feminisms.) (General Education Code(s): CC, IH.) A. Arondeka',
 'Examines, and critically analyzes, select post-World War II movements for social justice in the United States from feminist perspectives. Considers how those movements and their participants responded to issues of race, class, gender, and sexuality. A feminist, transnational, analytic framework is also developed to consider how those movements may have embraced, enhanced, or debilitated feminist formations in other parts of the 

# University of California - Santa Barbara

In [52]:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver.get('https://catalog.ucsb.edu/departments/FEMST/courses')

table = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, '//tbody[@class="base-table__body"]'))
)
rows = table.find_elements(By.XPATH, './/tr')

ucsb = []
for row in rows:
  desc = row.find_elements(By.XPATH, './/td')[1].text.strip()

  if desc not in ucsb:
    ucsb.append(desc)

  time.sleep(0.1)

ucsb[0:2]

['Critical reading and analysis of gender relations and the place of women in major intellectual traditions. Texts will be drawn from Plato, Rousseau, Wollstonecraft, Mill, Marx, Freud, and de Beauvoir, among others.',
 'This interdisciplinary course will highlight how a curriculum focusing on racial, ethnic, gender, and LGBTQ studies is central to teaching and learning within diverse societal contexts. This grounding is essential for K-12 teachers in History and Eng...']

# University of Connecticut - Stamford

In [41]:
url = 'https://catalog.uconn.edu/undergraduate/courses/wgss/'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

table = soup.find('div', class_='sc_sccoursedescs')
rows = table.find_all('div', class_='courseblock')

ucstam = []
for row in rows:
  desc = row.find_all('div', class_='noindent')
  if len(desc) >= 2:
    desc = desc[1]
    end = desc.text.find('CA')
    desc = desc.text[:end]
    ucstam.append(desc)
  time.sleep(0.1)

ucstam[0:2]

['How gender, sex, and sexuality are woven into systems of difference and stratification that shape everyday life. Examines these processes in the family, education, work, and politics with sensitivity to the diversity of individual experiences across class, racial ethnic groups, cultures, and regions. Provides experience in introductory research methods to analyze the social construction and structural organization of gender and sexuality. ',
 '(Also offered as HIST\xa01203.) The historical roots of challenges faced by contemporary women as revealed in the Western and/or non-Western experience: the political, economic, legal, religious, intellectual and family life of women. ']

# Edmonds College

In [42]:
url = 'https://www.edmonds.edu/programs-and-degrees/areas-of-study/social-sciences-and-cultural-studies/diversity-studies/women-course-descriptions.html#:~:text=This%20course%20explores%20political%2C%20historical,oppression%2C%20empowerment%2C%20and%20resistance.'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

table = soup.find('maincontent')
rows = table.find_all('p')

edmonds = []
for i in [0, 4]:
  if i < len(rows):
      clean_description = rows[i].text.strip()
      edmonds.append(clean_description)
  time.sleep(0.1)

edmonds[0:2]

["Introduction to interdisciplinary methods and concepts related to Women's Studies. This course explores political, historical, and cultural constructions of gender, race, class, and sexuality. Topics include women's histories, intersections of identity, family, work, body politics, health, violence and protection, oppression, empowerment, and resistance. Dual listed as DIVST 200. Prerequisite(s): Placement in ENGL& 101 or instructor permission. Crosslisted as: DIVST 200.\n\nCourse Level Objectives\nApply key concepts and theories from the field of Women's Studies to a broad spectrum of historical, political, international and social issues.Reason and think critically about gender relations and women's positions from a wide variety of theoretical perspectives.Analyze and explore relationships between sociopolitical institutions and individual experience.Explore overlapping meanings and constructions of race, class, gender, and sexuality.Compare and contextualize the histories, stories

# Most Diverse DataFrame

In [54]:
import pandas as pd
from google.colab import drive
drive.mount('/content/drive')

df_diverse = pd.read_csv('/content/drive/My Drive/df_diverse.csv')
df_most = df_diverse.sort_values(by='SDI', ascending=False)

most_diverse_colleges = [
    'Columbia University in the City of New York', 'Yale University', 'Stanford University',
    'Swarthmore College', 'Amherst College', 'University of Chicago', 'Rice University',
    'Johns Hopkins University', 'Rutgers University-Newark', 'University of Southern California',
    'University of Washington-Tacoma Campus', 'George Mason University',
    'University of Maryland-Baltimore County', 'University of Massachusetts-Boston',
    'University of Nevada-Las Vegas', 'Massachusetts Institute of Technology',
    'University of California-Santa Cruz', 'University of California-Santa Barbara',
    'University of Connecticut-Stamford', 'Edmonds College',
]

# .copy() makes filtered_df an independent DataFrame, so adding a column
# below does not raise pandas' SettingWithCopyWarning.
filtered_df = df_most[df_most['Name'].isin(most_diverse_colleges)].copy()
filtered_df['Course Descriptions'] = [[] for _ in range(len(filtered_df))]

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [55]:
descriptions = [columbia, yale, stanford, swarthmore, amherst, chicago, rice,
                jhu, rutgers, usc, uwash, gmu, umb, mass, nev, mit, ucsc, ucsb, ucstam,
                edmonds]

# `descriptions` above is listed in the same order as most_diverse_colleges.
names = most_diverse_colleges

# Pair each college name with its scraped descriptions and store the list in that college's row.
for name, college_descriptions in zip(names, descriptions):
  index = filtered_df[filtered_df['Name'] == name].index[0]
  filtered_df.at[index, 'Course Descriptions'] = college_descriptions
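
The assignment above pairs names and descriptions by position, so it depends on both lists staying in the same order, and `.index[0]` fails if a name is missing from the DataFrame. A minimal sketch of a name-keyed alternative follows. The toy DataFrame, the `_demo` lists, and the SDI values are hypothetical stand-ins, not the scraped results:

```python
import pandas as pd

# Toy stand-ins for the scraped lists (hypothetical values for illustration only).
columbia_demo = ["description A", "description B"]
edmonds_demo = ["description C"]

# Key each list by the exact string used in the 'Name' column.
descriptions_by_name = {
    "Columbia University in the City of New York": columbia_demo,
    "Edmonds College": edmonds_demo,
}

demo_df = pd.DataFrame({
    "Name": [
        "Edmonds College",
        "Columbia University in the City of New York",
        "Yale University",
    ],
    "SDI": [0.74, 0.75, 0.76],  # made-up numbers
})

# Look up each row's name; colleges without an entry get an empty list
# instead of raising an error.
demo_df["Course Descriptions"] = [
    descriptions_by_name.get(name, []) for name in demo_df["Name"]
]

print(demo_df[["Name", "Course Descriptions"]])
```

With this shape, row order in the DataFrame no longer matters, and a college that was dropped from scraping simply keeps an empty list.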

In [56]:
filtered_df

Unnamed: 0,UnitID,Name,State,Affiliation,Urbanization,Total,Men total,Women total,American Indian or Alaska Native total,Asian total,...,Women %,American Indian or Alaska Native %,Asian %,Black or African American %,Hispanic %,Native Hawaiian or Other Pacific Islander %,White %,Two or more races %,SDI,Course Descriptions
4,243744,Stanford University,CA,Private not-for-profit (no religious affiliation),21,8054.0,3900.0,4154.0,63.0,2175.0,...,51.6,0.8,27.0,8.1,18.5,0.2,23.8,9.3,0.767208,"[Taught by long-time community organizer, Beat..."
5,216287,Swarthmore College,PA,Private not-for-profit (no religious affiliation),21,1644.0,795.0,849.0,7.0,288.0,...,51.6,0.4,17.5,9.4,15.1,0.1,29.6,11.0,0.765415,"[ ANTH 044. Gender, Sexuality, and Social Chan..."
7,162928,Johns Hopkins University,MD,Private not-for-profit (no religious affiliation),11,6089.0,2742.0,3347.0,9.0,1551.0,...,55.0,0.1,25.5,9.8,19.9,0.1,21.8,6.2,0.761917,[This course will serve as an intensive introd...
8,182281,University of Nevada-Las Vegas,NV,Public,12,25794.0,11330.0,14464.0,73.0,3997.0,...,56.1,0.3,15.5,9.0,35.2,0.8,22.7,14.1,0.761161,[ WMST 427B.) Study of gender and literature t...
9,377564,University of Washington-Tacoma Campus,WA,Public,12,4014.0,2097.0,1917.0,22.0,937.0,...,47.8,0.5,23.3,11.3,15.8,1.0,34.3,8.9,0.759269,"[Introduces theories, methods, and analytical ..."
11,166638,University of Massachusetts-Boston,MA,Public,11,12373.0,5314.0,7059.0,12.0,1935.0,...,57.1,0.1,15.6,17.3,19.0,0.0,29.9,3.9,0.75374,[This course focuses on literary expressions a...
13,163268,University of Maryland-Baltimore County,MD,Public,21,10490.0,5681.0,4809.0,11.0,2501.0,...,45.8,0.1,23.8,23.8,9.3,0.0,30.6,6.4,0.751392,"[Drawing on feminist, queer, social, and criti..."
15,232186,George Mason University,VA,Public,21,27666.0,14111.0,13555.0,30.0,6481.0,...,49.0,0.1,23.4,12.3,17.3,0.1,32.9,5.6,0.748846,[Explores ways women are portrayed around the ...
16,123961,University of Southern California,CA,Private not-for-profit (no religious affiliation),11,21023.0,10148.0,10875.0,35.0,5157.0,...,51.7,0.2,24.5,7.0,18.1,0.2,27.3,6.4,0.74832,[Exploration of identity development in terms ...
17,190150,Columbia University in the City of New York,NY,Private not-for-profit (no religious affiliation),11,9111.0,4607.0,4504.0,27.0,1647.0,...,49.4,0.3,18.1,7.6,15.9,0.2,29.4,6.4,0.745629,[Combines critical feminist and anti-racist an...


In [None]:
filtered_df.to_csv('most_diverse_colleges.csv', index=False)
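
Because `Course Descriptions` holds Python lists, `to_csv` writes each cell as the list's string representation, and a plain `read_csv` returns strings rather than lists. A hedged sketch of recovering the lists on reload with `ast.literal_eval`, using an in-memory buffer and toy data instead of the real file:

```python
import ast
import io

import pandas as pd

# Toy one-row frame (hypothetical content) with a list-valued column.
toy_df = pd.DataFrame({
    "Name": ["Edmonds College"],
    "Course Descriptions": [["Intro to Women's Studies.", "Second description."]],
})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
toy_df.to_csv(buffer, index=False)

# A plain read gives back the list as a string.
buffer.seek(0)
raw = pd.read_csv(buffer)

# A converter parses each cell's text back into a Python list.
buffer.seek(0)
restored = pd.read_csv(
    buffer, converters={"Course Descriptions": ast.literal_eval}
)

print(type(raw.loc[0, "Course Descriptions"]))
print(type(restored.loc[0, "Course Descriptions"]))
```

`ast.literal_eval` only evaluates Python literals, which makes it a safer choice than `eval` for text read from a file.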