Penn Alumni Directory Scraper

A Python-based web scraper that extracts alumni information from the Penn Alumni Directory, specifically targeting alumni in Management Consulting.

Features

- Automated login to the MyPenn portal with DuoMobile authentication support
- Scrapes alumni profiles in batches of 100
- Extracts names and email addresses
- Saves data to CSV files
- Includes error handling and logging
- Supports pagination through an offset parameter

Prerequisites

Python 3.x
uv (used below to install packages and run the script)
Chrome browser installed
Required Python packages:
- selenium
- beautifulsoup4
- webdriver_manager

Installation

Install required packages:

uv add selenium beautifulsoup4 webdriver_manager

Configuration

Edit penn_alum_scraper.py and set your credentials:

username = "your_pennkey"
password = "your_password"
offset = 0  # Change this to start from a different page; keep it a multiple of 100 (0, 100, 200, ...).
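
Because the offset is meant to stay in steps of 100, it can help to check the value before starting a run. The following is a minimal sketch, not code from penn_alum_scraper.py: the helper name `batch_range` and the constant `BATCH_SIZE` are hypothetical, and it assumes a run covers the 100 profiles starting at the offset.

```python
BATCH_SIZE = 100  # batch size stated in this README


def batch_range(offset: int, batch_size: int = BATCH_SIZE) -> tuple[int, int]:
    """Return the (start, end) positions covered by one run.

    Hypothetical helper for illustration only; the real script may
    compute its range differently.
    """
    if offset < 0 or offset % batch_size != 0:
        raise ValueError(
            f"offset must be a non-negative multiple of {batch_size}, got {offset}"
        )
    return offset, offset + batch_size
```

With this sketch, `batch_range(200)` returns `(200, 300)`, while `batch_range(150)` raises a `ValueError` instead of silently starting mid-batch.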

Usage

Run the scraper:

uv run penn_alum_scraper.py

When prompted, approve the login request on your DuoMobile app.
The scraper will:
- Log into MyPenn
- Navigate to the directory
- Scrape profiles in the Management Consulting industry
- Save results to CSV files in the output directory

Output

CSV files are saved in the output directory
File naming format: penn_alumni_[start]-[end].csv
Each file contains:
- Name
- Email address
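
The output layout described above can be sketched with the standard library. This is an illustration under assumptions, not the script's actual code: the function name `save_batch`, the header labels `Name` and `Email`, and the reading of `[start]` and `[end]` as the batch boundaries are all hypothetical.

```python
import csv
from pathlib import Path


def save_batch(rows, start, end, output_dir="output"):
    """Write (name, email) pairs to output_dir/penn_alumni_[start]-[end].csv.

    Hypothetical helper; the column headers are assumed, not taken
    from the real scraper.
    """
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    path = out_dir / f"penn_alumni_{start}-{end}.csv"
    # newline="" lets the csv module control line endings itself
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Name", "Email"])
        writer.writerows(rows)
    return path
```

Calling `save_batch([("Ada Example", "ada@example.com")], 0, 100)` would create `output/penn_alumni_0-100.csv` with a header row followed by one data row.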

Notes

- The scraper includes a 15-second wait for DuoMobile authentication
- Each batch processes up to 100 profiles
- The scraper includes warm-up navigation to ensure proper loading
- Error handling is implemented for various scenarios
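
One common way to combine error handling and logging in a batch scraper is to log a failed profile and continue with the rest. The sketch below shows that pattern only; it is an assumption about the general approach, not a description of what penn_alum_scraper.py does. The names `scrape_batch` and `extract` are hypothetical.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("penn_alum_scraper")


def scrape_batch(profiles, extract):
    """Apply `extract` to each profile, logging and skipping failures.

    Hypothetical helper: `extract` stands in for whatever function
    pulls a (name, email) pair out of one profile.
    """
    results = []
    for index, profile in enumerate(profiles):
        try:
            results.append(extract(profile))
        except Exception:
            # logger.exception records the traceback along with the message
            logger.exception("Skipping profile %d after extraction error", index)
    return results
```

Skipping a single bad profile keeps one malformed entry from discarding the rest of a 100-profile batch, at the cost of needing to check the log for gaps afterwards.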

Limitations

- Only scrapes Management Consulting industry profiles
- Requires manual DuoMobile authentication
- Limited to 100 profiles per run
- May need adjustments if the MyPenn interface changes
