UCSB Faculty Directory

A tool for scraping, structuring, and enriching faculty data from UCSB departmental websites.

Project Goals

This project aims to create a structured database of faculty information from UC Santa Barbara's departmental websites, including:

Basic contact information (name, title, email, office, etc.)
Research specializations and expertise
Structured summaries of faculty research using AI
Department and inter-departmental relationships

The project uses a notebook-driven development approach with nbdev to maintain well-documented, tested code with rich explanatory context.
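
To make the target data concrete, here is a minimal sketch of what one structured faculty record could look like. The class name FacultyRecord and every field name are assumptions chosen for illustration; they are not taken from the project's actual schema, and the example person is a placeholder.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class FacultyRecord:
    """Hypothetical record shape; field names are illustrative only."""
    name: str
    department: str
    title: str = ""
    email: str = ""
    office: str = ""
    # Research specializations and expertise, as free-text tags.
    specializations: List[str] = field(default_factory=list)
    # Would be filled in later by an AI enrichment step.
    research_summary: str = ""
    # Other departments this person is affiliated with.
    related_departments: List[str] = field(default_factory=list)


# Placeholder data, not a real faculty member.
record = FacultyRecord(
    name="Jane Doe",
    department="Computer Science",
    title="Professor",
    specializations=["machine learning"],
)

# Convert to a plain dict, e.g. to build one row of a table.
row = asdict(record)
print(row)
```

Using defaults for everything except name and department lets a scraper emit a partial record when a page omits details such as office or email.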

Features

Specialized scrapers for different department website layouts (Drupal, WordPress, custom)
Flexible Unit class to manage department-specific scraping and enrichment
AI-powered faculty research summarization (using OpenAI)
Utilities for crawling and analyzing faculty websites
Modular design for easy extension to additional departments
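
The layout-specific scrapers listed above could be organized as a small dispatch table, sketched below with only the standard library. This is an illustration under stated assumptions, not the project's implementation: detect_layout, extract_text_by_class, scrape_names, the marker strings, and the CSS class names are all hypothetical, and real department pages will need their own selectors.

```python
from html.parser import HTMLParser


class _ClassTextParser(HTMLParser):
    """Collect the text of elements whose class list contains a target class.

    Simplification: capture stops at the first closing tag after the match,
    which is enough for flat elements or a single nested link.
    """

    def __init__(self, target):
        super().__init__()
        self.target = target
        self.capturing = False
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.target in classes:
            self.capturing = True
            self.results.append("")

    def handle_endtag(self, tag):
        self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.results[-1] += data.strip()


def extract_text_by_class(html, css_class):
    """Return the non-empty text of every element carrying css_class."""
    parser = _ClassTextParser(css_class)
    parser.feed(html)
    return [text for text in parser.results if text]


def detect_layout(html):
    """Guess the CMS from common markers in the page source (heuristic only)."""
    lowered = html.lower()
    if "wp-content" in lowered or "wp-includes" in lowered:
        return "wordpress"
    if "drupal" in lowered or "/sites/default/files" in lowered:
        return "drupal"
    return "custom"


# Hypothetical class names; each real site needs its own selector.
SCRAPERS = {
    "wordpress": lambda html: extract_text_by_class(html, "entry-title"),
    "drupal": lambda html: extract_text_by_class(html, "views-field-title"),
    "custom": lambda html: extract_text_by_class(html, "faculty-name"),
}


def scrape_names(html):
    """Pick the scraper that matches the detected layout and run it."""
    return SCRAPERS[detect_layout(html)](html)
```

Supporting an additional department layout then means adding one entry to the table rather than changing the calling code.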

Developer Guide

If you are new to nbdev, here are some pointers to get you started.

Install faculty_expertise in Development mode

# make sure faculty_expertise package is installed in development mode
$ pip install -e .

# make changes under nbs/ directory
# ...

# export, test, and clean the notebooks so the changes apply to faculty_expertise
$ nbdev_prepare

Usage

Installation

Install the latest version from the GitHub repository:

$ pip install git+https://github.com/caylor/faculty_expertise.git

or from conda:

$ conda install -c caylor faculty_expertise

or from PyPI:

$ pip install faculty_expertise

Documentation

Documentation is hosted on this GitHub repository's pages. Package-manager-specific guidelines are available on conda and PyPI, respectively.

Project Structure

nbs/: Jupyter notebooks that define the code base
- 00_core.ipynb: Core data structures (Unit class)
- 01_scrapers.ipynb: HTML scrapers for different department layouts
- 02_enrichment.ipynb: AI enrichment and metadata extraction
faculty_expertise/: Auto-generated Python modules from notebooks
- core.py: Core data structures
- my_scrapers.py: HTML scraping functions
- my_enrichment.py: AI enrichment functions
faculty_html/: HTML files from department websites
faculty_screenshots/: Screenshots of department pages

Quick Start Example

# Scrape faculty from a department
from faculty_expertise.core import Unit

# Create a unit for Computer Science department
unit = Unit("Computer Science", "faculty_html/Computer_Science.html")

# Scrape faculty information
df = unit.scrape()

# Display the first few rows
print(df.head())

# Optionally enrich with AI-powered summaries (requires OpenAI API key)
from faculty_expertise.my_enrichment import enrich_faculty_row
row = df.iloc[0]
result = enrich_faculty_row(row)
print(result)
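
Because the enrichment step requires an OpenAI API key, it can help to check for one before calling it. The sketch below assumes the key is supplied through the OPENAI_API_KEY environment variable, which is the default the official OpenAI Python client reads; whether this project reads the key the same way is an assumption, and openai_key_available is a hypothetical helper, not part of faculty_expertise.

```python
import os


def openai_key_available(env=None):
    """Return True when a non-empty OPENAI_API_KEY is present.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise without touching real settings.
    """
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if openai_key_available():
    print("OpenAI key found; enrichment can run.")
else:
    print("No OpenAI key set; skipping enrichment.")
```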

License

This project is licensed under the MIT License - see the LICENSE file for details.
