# GEN AI PRINCIPLES: Course Project 1
## RAG-Based Interactive AI for the MS in Applied Data Science Website

Link: https://datascience.uchicago.edu/education/masters-programs/ms-in-applied-data-science

## 1. Understanding and Preparing the Data

### Objective: Analyze and preprocess the textual content from the MS in Applied Data Science webpage so that the data is suitable for retrieval and generation tasks
- Scraping a website with many sublinks calls for a structured approach so that all relevant pages are captured efficiently. The steps below gather data from the main page and its sublinks:

### Identify the Main Page and the Structure of the Website
- Start by identifying the main webpage URL, in this case the URL for the MS in Applied Data Science program.
- Inspect the structure of the webpage to understand how sublinks are organized. These might include links to sections such as curriculum details, faculty profiles, and admissions information.
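One way to get a feel for how sublinks are organized is to group them by the first segment of their path. The sketch below is illustrative only: `group_by_section` is a hypothetical helper, not part of the notebook, and the sample URLs are a few of the seed links used later in this project.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_section(urls):
    """Group URLs by the first segment of their path (e.g. 'education', 'about')."""
    sections = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        key = segments[0] if segments else "(root)"
        sections[key].append(url)
    return dict(sections)


# A handful of URLs taken from the seed list; any list of URLs works.
sample_urls = [
    "https://datascience.uchicago.edu/education/masters-programs/ms-in-applied-data-science/",
    "https://datascience.uchicago.edu/about/",
    "https://datascience.uchicago.edu/about/contact/",
    "https://datascience.uchicago.edu/",
]

for section, links in group_by_section(sample_urls).items():
    print(section, len(links))
```

A summary like this makes it easier to decide which sections (for example `people` or `news`) are worth including in the RAG corpus.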

### Tasks 
- Extract and structure content from the various sections of the webpage, such as program overview, curriculum details, faculty profiles, admissions criteria, and career resources.
- Ensure data consistency and enhance content quality to optimize the performance of the RAG system.
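One consistency issue visible in the scraped text at the end of this section is that every page starts with the same navigation lines ("Skip to main content", "About", and so on). A minimal sketch of one possible cleanup, assuming pages are stored as `(url, text)` tuples, is to drop lines that occur on most pages. The function name, the `min_share` threshold, and the sample pages are all hypothetical.

```python
from collections import Counter


def drop_repeated_lines(pages, min_share=0.8):
    """Remove lines that appear on at least `min_share` of all pages.

    pages: list of (url, text) tuples. Returns a new list of the same shape.
    """
    if len(pages) < 2:
        return list(pages)  # nothing to compare against

    # Count on how many pages each distinct line occurs.
    line_counts = Counter()
    for _, text in pages:
        line_counts.update(set(text.split("\n")))

    threshold = min_share * len(pages)
    cleaned = []
    for url, text in pages:
        kept = [line for line in text.split("\n") if line_counts[line] < threshold]
        cleaned.append((url, "\n".join(kept)))
    return cleaned


# Hypothetical pages that share two navigation lines.
sample_pages = [
    ("https://example.org/a", "Skip to main content\nAbout\nResearch overview"),
    ("https://example.org/b", "Skip to main content\nAbout\nAdmissions criteria"),
    ("https://example.org/c", "Skip to main content\nAbout\nFaculty profiles"),
]
cleaned_pages = drop_repeated_lines(sample_pages)
print(cleaned_pages[0])
```

The threshold is a judgment call: set too low, it can remove short lines that legitimately recur in page content.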


## Structure 

### 1A. Extract Data 

In [2]:
!pip install --upgrade pip



In [1]:
!pip install requests beautifulsoup4


### 1A. 1) Main Page

In [3]:
import requests
from bs4 import BeautifulSoup

# Step 1: Request the webpage (the timeout prevents the call from hanging indefinitely)
url = "https://datascience.uchicago.edu/education/masters-programs/ms-in-applied-data-science/"
response = requests.get(url, timeout=30)
soup = BeautifulSoup(response.content, 'html.parser')

In [4]:
# Step 2: Print the page title
print("Title:", soup.title.text)

Title: Master’s in Applied Data Science – DSI


In [5]:
# Step 3: Extract and print main text content
main_content = soup.find_all(['h1', 'h2', 'p'])
for tag in main_content:
    print(tag.get_text(strip=True))

The Data Science Institute (DSI) executes the University of Chicago’s bold, innovative vision of Data Science as a new discipline.
Open faculty, postdoctoral, staff, and student roles with the UChicago Data Science Institute and our partners.
A new paradigm of transformational AI-enabled scientific discovery across the physical and biological sciences.
Protecting democracy in the digital age through cross-disciplinary research and convening key stakeholders.
Measuring and analyzing Internet performance and reliability to address inequity in U.S. communities.
Inter-discplinary integration of AI with fundamental domain knowledge to accelerate and transform climate research with a focus on both scientific advances and societal impacts.
The goal of data ecology is to study dataflows and design interventions to control how data impacts our world.
Exploring how to train AI models that augment rather than mimic human capacity.
A team committed to bringing equity to education with a focus on l

### 1A. 2) All Pages

In [11]:
# Step 1: Import required libraries
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

In [12]:
# Step 1.1: main pages to scrape
seed_links = [
    "https://datascience.uchicago.edu/education/masters-programs/ms-in-applied-data-science/",
    "https://datascience.uchicago.edu/about/",
    "https://datascience.uchicago.edu/about/leadership-staff/",
    "https://datascience.uchicago.edu/research/",
    "https://datascience.uchicago.edu/education/",
    "https://datascience.uchicago.edu/outreach/",
    "https://datascience.uchicago.edu/news/",
    "https://datascience.uchicago.edu/events/",
    "https://datascience.uchicago.edu/about/jobs/",
    "https://datascience.uchicago.edu/newsletter-archive/",
    "https://datascience.uchicago.edu/about/contact/"
]

In [13]:
# Step 2: Function to clean the raw text (remove blank lines, extra spaces)
def clean_text(text):
    return '\n'.join([line.strip() for line in text.split('\n') if line.strip()])

In [14]:
# Step 2.1: Function to extract all internal sublinks from a page
def get_internal_links(base_url, soup):
    domain = urlparse(base_url).netloc
    internal_links = set()
    for a in soup.find_all('a', href=True):
        href = a['href']
        full_url = urljoin(base_url, href)
        if urlparse(full_url).netloc == domain:
            internal_links.add(full_url)
    return internal_links
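To make the internal-link test concrete, the sketch below applies the same two standard-library calls (`urljoin` and `urlparse`) to a few hand-written `href` values. The `resolve` helper and the example hrefs are hypothetical; only the base URL comes from the seed list.

```python
from urllib.parse import urljoin, urlparse


def resolve(base_url, href):
    """Return (absolute_url, is_internal) for one href found on base_url."""
    full_url = urljoin(base_url, href)
    is_internal = urlparse(full_url).netloc == urlparse(base_url).netloc
    return full_url, is_internal


base = "https://datascience.uchicago.edu/education/"
examples = [
    "/about/",                      # root-relative link
    "masters-programs/",            # link relative to the current page
    "#top",                         # fragment only: same page
    "https://example.com/x",        # external site
    "mailto:someone@example.com",   # not a web page; has no netloc
]
for href in examples:
    print(href, "->", resolve(base, href))
```

Note that a fragment-only link resolves to the current page with `#top` appended, which is why stripping fragments before checking the visited set matters.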

In [15]:
# Step 3 (first version, superseded by the next cell): scrape a page and return its sublinks
visited = set()
all_scraped = []

def scrape_and_expand(url):
    if url in visited:
        return set()

    try:
        response = requests.get(url)
        soup = BeautifulSoup(response.content, 'html.parser')
        text = clean_text(soup.get_text(separator='\n', strip=True))
        all_scraped.append((url, text))
        print(f"Scraped: {url}")
        visited.add(url)

        return get_internal_links(url, soup)
    except Exception as e:
        print(f"Failed: {url} — {e}")
        return set()


In [17]:
# Step 3 (revised): normalize each URL before checking the visited set, then scrape the page and return its sublinks
visited = set()
all_scraped = []

def normalize_url(url):
    # Remove fragment (#...) and query params (?...)
    return url.split('#')[0].split('?')[0]

def scrape_and_expand(url):
    normalized = normalize_url(url)

    if normalized in visited:
        return set()

    try:
        response = requests.get(url, timeout=30)  # avoid hanging on an unresponsive page
        soup = BeautifulSoup(response.content, 'html.parser')
        text = clean_text(soup.get_text(separator='\n', strip=True))

        all_scraped.append((normalized, text))
        visited.add(normalized)

        print(f"Scraped: {normalized} | Total visited: {len(visited)}")

        return get_internal_links(normalized, soup)

    except Exception as e:
        print(f"Failed: {url} — {e}")
        return set()
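`normalize_url` removes fragments and query strings. The crawl log in this section also shows an `http://` URL and URLs without a trailing slash, so the same page may still be visited under more than one spelling. The sketch below is one possible stricter variant. `canonical_url` is a hypothetical name, and forcing `https` and appending a trailing slash are assumptions about how this site serves its pages, not confirmed behavior.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Return a canonical form of `url` for de-duplication.

    Assumptions (not verified against the site): http and https serve the
    same content, and page URLs with and without a trailing slash are the
    same page. File-like paths such as '.pdf' keep their exact form.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    last_segment = path.rsplit("/", 1)[-1]
    if not path.endswith("/") and "." not in last_segment:
        path += "/"
    # Drop the query string and fragment, as normalize_url does.
    return urlunsplit(("https", parts.netloc.lower(), path, "", ""))


print(canonical_url("http://datascience.uchicago.edu/about?x=1#top"))
print(canonical_url("https://datascience.uchicago.edu"))
```

Dropping query strings, here and in `normalize_url`, can merge pages that differ only by query parameters (paginated listings, for example), so it is worth checking whether the site relies on them.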

In [18]:
# Step 4: Initialize scraping loop from your seed pages
to_visit = set(seed_links)

while to_visit:
    current = to_visit.pop()
    new_links = scrape_and_expand(current)
    to_visit.update(new_links - visited)

Scraped: https://datascience.uchicago.edu/research/ | Total visited: 1
Scraped: https://datascience.uchicago.edu/news/the-university-of-chicago-data-science-institute-and-google-partner-on-cutting-edge-ai-and-security-research/ | Total visited: 2
Scraped: https://datascience.uchicago.edu/about/about-dsi/ | Total visited: 3
Scraped: https://datascience.uchicago.edu/news/rina-foygel-barber-and-margaret-gardel-elected-to-the-national-academy-of-sciences/ | Total visited: 4
Scraped: https://datascience.uchicago.edu/research/a-data-driven-trigger-system-for-the-large-hadron-collider/ | Total visited: 5
Scraped: https://datascience.uchicago.edu/news-events/past-events | Total visited: 6
Scraped: https://datascience.uchicago.edu/outreach/data4all/ | Total visited: 7
Scraped: https://datascience.uchicago.edu/outreach/ | Total visited: 8
Scraped: https://datascience.uchicago.edu/research/internet-equity/ | Total visited: 9
Scraped: https://datascience.uchicago.edu/outreach/capacity-accelerator-

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: http://datascience.uchicago.edu/wp-content/uploads/2023/09/OSL-Evaluating-Data-Capacity-Report-Links.pdf | Total visited: 217
Scraped: https://datascience.uchicago.edu/people/hieu-nguyen-he-him-mscapp-25/ | Total visited: 218
Scraped: https://datascience.uchicago.edu/people/neha-lingareddy/ | Total visited: 219
Scraped: https://datascience.uchicago.edu/events/data-science-institute-summit/ | Total visited: 220
Scraped: https://datascience.uchicago.edu/people/joseph-helbing/ | Total visited: 221
Scraped: https://datascience.uchicago.edu/events/dss-gari-clifford/ | Total visited: 222
Scraped: https://datascience.uchicago.edu/people/jarvis-lam/ | Total visited: 223
Scraped: https://datascience.uchicago.edu/people/yuxin-chen/ | Total visited: 224
Scraped: https://datascience.uchicago.edu/people/jacqueline-chen/ | Total visited: 225
Scraped: https://datascience.uchicago.edu/people/andrew-ferguson/ | Total visited: 226
Scraped: https://datascience.uchicago.edu/people/nathalie-valenz

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2023/03/The-Case-for-Component-based-Research.pdf | Total visited: 295
Scraped: https://datascience.uchicago.edu/people/jonatas-marques/ | Total visited: 296
Scraped: https://datascience.uchicago.edu/education/masters-programs/ms-in-applied-data-science/online-program/ | Total visited: 297
Scraped: https://datascience.uchicago.edu | Total visited: 298
Scraped: https://datascience.uchicago.edu/people/sydney-jenkins/ | Total visited: 299
Scraped: https://datascience.uchicago.edu/events/ai-science-summer-school-2025/ | Total visited: 300
Scraped: https://datascience.uchicago.edu/research/data-democracy-initiative/ | Total visited: 301
Scraped: https://datascience.uchicago.edu/people/amy-nussbaum/ | Total visited: 302
Scraped: https://datascience.uchicago.edu/events/ms-in-applied-data-science-online-program-student-panel/ | Total visited: 303
Scraped: https://datascience.uchicago.edu/insights/the-challenging-path-to-internet-broa

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2024/11/ilecon_248502_9731221_IL-CEI_Final_Poster.pdf | Total visited: 462
Scraped: https://datascience.uchicago.edu/news/mapping-and-mitigating-the-urban-digital-divide/ | Total visited: 463
Scraped: https://datascience.uchicago.edu/research/machine-learning-and-satellite-imaging-to-reduce-methane-emissions/ | Total visited: 464
Scraped: https://datascience.uchicago.edu/education/masters-programs/ms-in-applied-data-science/capstone-projects/ | Total visited: 465
Scraped: https://datascience.uchicago.edu/people/mentor-anjali-adukia-2/ | Total visited: 466
Scraped: https://datascience.uchicago.edu/people/david-uminsky/ | Total visited: 467
Scraped: https://datascience.uchicago.edu/people/emma-kerr/ | Total visited: 468
Scraped: https://datascience.uchicago.edu/people/christina-tuttle/ | Total visited: 469
Scraped: https://datascience.uchicago.edu/people/maria-v-fernandez/ | Total visited: 470
Scraped: https://datascience.uchic

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2024/11/MS-ADS-Capstone-Sponsor-Guide-2025-2.pdf | Total visited: 497
Scraped: https://datascience.uchicago.edu/events/grad-school-qa-ms-in-applied-data-science-for-uchicago-undergraduates/ | Total visited: 498
Scraped: https://datascience.uchicago.edu/people/joseph-jaiyeola/ | Total visited: 499
Scraped: https://datascience.uchicago.edu/people/cong-ma/ | Total visited: 500
Scraped: https://datascience.uchicago.edu/insights/debit-development-bank-investment-tracker/ | Total visited: 501
Scraped: https://datascience.uchicago.edu/people/julia-luo/ | Total visited: 502
Scraped: https://datascience.uchicago.edu/engage/data4all/ | Total visited: 503
Scraped: https://datascience.uchicago.edu/broadband/setup | Total visited: 504
Scraped: https://datascience.uchicago.edu/people/tiffany-shaw/ | Total visited: 505
Scraped: https://datascience.uchicago.edu/people/su-karaca/ | Total visited: 506
Scraped: https://datascience.uchicago.edu/

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2024/10/DSRF_title.pdf | Total visited: 556
Scraped: https://datascience.uchicago.edu/people/aloni-cohen/ | Total visited: 557
Scraped: https://datascience.uchicago.edu/research/improving-prediction-of-breast-cancer-patient-response-to-therapy/ | Total visited: 558
Scraped: https://datascience.uchicago.edu/people/ayah-ahmad/ | Total visited: 559
Scraped: https://datascience.uchicago.edu/events/chicago-data-night-david-zaretsky-northwestern/ | Total visited: 560
Scraped: https://datascience.uchicago.edu/people/julio-ramirez2-2/ | Total visited: 561
Scraped: https://datascience.uchicago.edu/research/exploring-climate-and-biodiversity-with-3d-deep-learning/ | Total visited: 562
Scraped: https://datascience.uchicago.edu/people/sonali-shaw-she-her/ | Total visited: 563
Scraped: https://datascience.uchicago.edu/people/resty-fufunan/ | Total visited: 564
Scraped: https://datascience.uchicago.edu/research/disentangling-visual-style-a

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2022/03/7f8f00_694f097ce09f442fa93444925455f6c5_mv2.webp | Total visited: 569
Scraped: https://datascience.uchicago.edu/people/isaac-mehlhaff/ | Total visited: 570
Scraped: https://datascience.uchicago.edu/news/online-database-brings-transparency-to-financing-of-mega-development-projects/ | Total visited: 571
Scraped: https://datascience.uchicago.edu/news/2020-cdac-summer-lab-kicks-off-with-37-student-researchers/ | Total visited: 572
Scraped: https://datascience.uchicago.edu/insights/deepstyle/ | Total visited: 573
Scraped: https://datascience.uchicago.edu/news/eric-and-wendy-schmidt-ai-in-science-postdoctoral-fellows-host-first-ever-hackathon/ | Total visited: 574
Scraped: https://datascience.uchicago.edu/civic-data-technology-clinic/ | Total visited: 575
Scraped: https://datascience.uchicago.edu/events/cdac-summer-lab-info-session/ | Total visited: 576
Scraped: https://datascience.uchicago.edu/people/abdelrahman-helal-he-h

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2024/11/capr_169460_9731015_CPR-DSI-Summer-Lab-Poster.pdf.pdf | Total visited: 845
Scraped: https://datascience.uchicago.edu/people/michael-giurcanu/ | Total visited: 846
Scraped: https://datascience.uchicago.edu/news/a-day-at-google-for-uchicagos-ms-in-applied-data-science-students/ | Total visited: 847
Scraped: https://datascience.uchicago.edu/research/ai-science/partnerships/ | Total visited: 848
Scraped: https://datascience.uchicago.edu/people/athmika-senthilkumar-she-her/ | Total visited: 849
Scraped: https://datascience.uchicago.edu/people/aditya-nandy/ | Total visited: 850
Scraped: https://datascience.uchicago.edu/people/colm-talbot/ | Total visited: 851
Scraped: https://datascience.uchicago.edu/news/meet-the-autumn-2019-cdac-discovery-grant-projects/ | Total visited: 852
Scraped: https://datascience.uchicago.edu/people/ali-khowaja-he-him/ | Total visited: 853
Scraped: https://datascience.uchicago.edu/research/postdoct

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2021/04/CDAC-Prospectus.pdf | Total visited: 873
Scraped: https://datascience.uchicago.edu/people/john-chodera/ | Total visited: 874
Scraped: https://datascience.uchicago.edu/people/shufan-zhang/ | Total visited: 875
Scraped: https://datascience.uchicago.edu/people/alexandra-nisenoff/ | Total visited: 876
Scraped: https://datascience.uchicago.edu/events/ask-a-student-in-ms-in-applied-data-science-4/ | Total visited: 877
Scraped: https://datascience.uchicago.edu/people/yi-wu/ | Total visited: 878
Scraped: https://datascience.uchicago.edu/news/victor-veitch-first-uchicago-data-science-faculty-builds-safe-and-credible-ai-systems/ | Total visited: 879
Scraped: https://datascience.uchicago.edu/people/diamon-dunlap-she-her/ | Total visited: 880
Scraped: https://datascience.uchicago.edu/people/kenia-godinez-nogueda/ | Total visited: 881
Scraped: https://datascience.uchicago.edu/people/dimitriy-leksanov/ | Total visited: 882
Scraped:

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2024/11/rafi_194417_9731017_RAFI-USA-POSTER-Final-.pptx.pdf | Total visited: 941
Scraped: https://datascience.uchicago.edu/people/chris-kennedy/ | Total visited: 942
Scraped: https://datascience.uchicago.edu/people/vira-kasprova/ | Total visited: 943
Scraped: https://datascience.uchicago.edu/events/chicago-data-night-james-nowell-ezbotai/ | Total visited: 944
Scraped: https://datascience.uchicago.edu/events/chicago-data-night-neal-sample-walgreens-boots-alliance/ | Total visited: 945
Scraped: https://datascience.uchicago.edu/people/seyed-a-esmaeili/ | Total visited: 946
Scraped: https://datascience.uchicago.edu/people/linyi-li/ | Total visited: 947
Scraped: https://datascience.uchicago.edu/events/exploring-climate-biodiversity/ | Total visited: 948
Scraped: https://datascience.uchicago.edu/events/bryce-meredig-citrineio-aiscience-schmidt-fellows-speaker-series/ | Total visited: 949
Scraped: https://datascience.uchicago.edu/pe

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2024/11/internetequity_192998_9731696_DSSI-Internet-Equity-Poster-FINAL.pdf | Total visited: 969
Scraped: https://datascience.uchicago.edu/people/ruishan-liu/ | Total visited: 970
Scraped: https://datascience.uchicago.edu/research/postdoctoral-programs/rising-stars/2021/ | Total visited: 971
Scraped: https://datascience.uchicago.edu/news/university-of-chicago-city-colleges-of-chicago-join-forces-to-increase-diversity-in-science-careers/ | Total visited: 972
Scraped: https://datascience.uchicago.edu/people/harlin-lee/ | Total visited: 973


Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2024/11/preventblindness_LATE_248515_9732718_Final-Poster.pdf | Total visited: 974
Scraped: https://datascience.uchicago.edu/education/masters-programs/ms-in-applied-data-science/in-person-program/ | Total visited: 975
Scraped: https://datascience.uchicago.edu/engage/jobs/ | Total visited: 976
Scraped: https://datascience.uchicago.edu/people/vlad-andreichuk/ | Total visited: 977
Scraped: https://datascience.uchicago.edu/people/yihang-wang/ | Total visited: 978
Scraped: https://datascience.uchicago.edu/people/johann-gaebler/ | Total visited: 979
Scraped: https://datascience.uchicago.edu/people/lizet-casas-sheher/ | Total visited: 980
Scraped: https://datascience.uchicago.edu/people/francesco-pinto/ | Total visited: 981
Scraped: https://datascience.uchicago.edu/events/c3-ai-dti-illinois-matchmaking-webinar/ | Total visited: 982
Scraped: https://datascience.uchicago.edu/people/emily-aiken/ | Total visited: 983
Scraped: https://d

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2021/10/Agenda_092821.pdf | Total visited: 989
Scraped: https://datascience.uchicago.edu/people/roshni-sahoo/ | Total visited: 990
Scraped: https://datascience.uchicago.edu/news/uchicago-researchers-demonstrate-the-quantifiable-uniqueness-of-former-president-donald-trumps-language-use/ | Total visited: 991
Scraped: https://datascience.uchicago.edu/news/top-data-science-skills-employers-are-looking-for/ | Total visited: 992
Scraped: https://datascience.uchicago.edu/research/ai-science/news/ | Total visited: 993
Scraped: https://datascience.uchicago.edu/insights/spatial-filters-of-function-and-phylogeny-determine-morphological-disparity-with-latitude/ | Total visited: 994
Scraped: https://datascience.uchicago.edu/news/life-sciences-tech-startups-advance-to-fall-2023-george-shultz-innovation-fund-finals/ | Total visited: 995
Scraped: https://datascience.uchicago.edu/people/phiala-shanahan/ | Total visited: 996
Scraped: https://d

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2024/11/cpl_248509_9731681_CPL-DSI-Summer-Lab_Poster-1.pdf | Total visited: 1011
Scraped: https://datascience.uchicago.edu/people/shana-mcdowell/ | Total visited: 1012
Scraped: https://datascience.uchicago.edu/people/yuanyuan-lei/ | Total visited: 1013
Scraped: https://datascience.uchicago.edu/news/the-world-awaits/ | Total visited: 1014
Scraped: https://datascience.uchicago.edu/people/anima-anandkumar/ | Total visited: 1015
Scraped: https://datascience.uchicago.edu/news/public-interest-technology-grant-funds-and-expands-cdacharris-civic-data-technology-clinic/ | Total visited: 1016
Scraped: https://datascience.uchicago.edu/people/ronald-carter/ | Total visited: 1017
Scraped: https://datascience.uchicago.edu/events/rising-stars-in-data-science-workshop-autumn2022/ | Total visited: 1018
Scraped: https://datascience.uchicago.edu/people/jax-alemu/ | Total visited: 1019
Scraped: https://datascience.uchicago.edu/events/hae-kyung-i

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.


Scraped: https://datascience.uchicago.edu/wp-content/uploads/2019/09/CDAC-Discovery-Grants-Autumn-2019-RFP.pdf | Total visited: 1066
Scraped: https://datascience.uchicago.edu/people/julio-ramirez2/ | Total visited: 1067
Scraped: https://datascience.uchicago.edu/events/introducing-palmwatch-mapping-the-impact-of-big-brands-palm-oil-use/ | Total visited: 1068
Scraped: https://datascience.uchicago.edu/people/jingshu-wang/ | Total visited: 1069
Scraped: https://datascience.uchicago.edu/people/ryan-shi/ | Total visited: 1070
Scraped: https://datascience.uchicago.edu/insights/broadband-terms-questions-and-myths/ | Total visited: 1071
Scraped: https://datascience.uchicago.edu/news/the-ms-in-applied-data-science-program-celebrates-10-years/ | Total visited: 1072
Scraped: https://datascience.uchicago.edu/news/asst-prof-aloni-cohen-receives-award-for-revealing-flaws-in-deidentifying-data/ | Total visited: 1073
Scraped: https://datascience.uchicago.edu/events/alex-kale-u-of-washington-data-interf
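
The warning "Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER." comes from BeautifulSoup's encoding detection. In the log above it appears next to `.pdf` and `.webp` URLs, which suggests the crawler is parsing binary files as HTML. A minimal sketch of a filter that could run before parsing is shown below. `looks_like_html` and the suffix list are hypothetical and would need tuning for this site.

```python
from urllib.parse import urlsplit

# Hypothetical list of file extensions to skip; extend as needed.
NON_HTML_SUFFIXES = (".pdf", ".webp", ".png", ".jpg", ".jpeg", ".pptx", ".docx", ".zip")
HTML_TYPES = ("text/html", "application/xhtml+xml")


def looks_like_html(url, content_type=None):
    """Return True if the URL (and optional Content-Type header) looks like an HTML page."""
    path = urlsplit(url).path.lower()
    if path.endswith(NON_HTML_SUFFIXES):
        return False
    if content_type is not None:
        # A header such as 'text/html; charset=UTF-8' carries parameters after ';'.
        media_type = content_type.split(";")[0].strip().lower()
        return media_type in HTML_TYPES
    return True


print(looks_like_html("https://datascience.uchicago.edu/wp-content/uploads/2021/04/CDAC-Prospectus.pdf"))
print(looks_like_html("https://datascience.uchicago.edu/people/yi-wu/", "text/html; charset=UTF-8"))
```

With `requests`, the header value is available as `response.headers.get("Content-Type")`, so the check can be made after the request and before the body is handed to BeautifulSoup. Text from PDFs, if wanted in the corpus, would need a separate PDF extraction step.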

In [19]:
# Step 5: Save all scraped content to a local text file
with open("uchicago_datascience_fullscrape.txt", "w", encoding="utf-8") as f:
    for url, content in all_scraped:
        f.write(f"URL: {url}\n")
        f.write(content)
        f.write("\n" + "="*100 + "\n")

print("Full scrape complete and saved to 'uchicago_datascience_fullscrape.txt'")

Full scrape complete and saved to 'uchicago_datascience_fullscrape.txt'


In [26]:
# Define the text variable by reading the file
with open("uchicago_datascience_fullscrape.txt", "r", encoding="utf-8") as f:
    text = f.read()
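For later chunking and retrieval it is usually more convenient to work with one record per page than with a single string. The sketch below splits the saved file back into records, relying on the `URL: ` prefix and the 100-character `=` separator written in Step 5. `parse_scrape_dump` is a hypothetical helper and the sample text is made up for illustration.

```python
SEPARATOR = "=" * 100  # same separator line that Step 5 writes between pages


def parse_scrape_dump(text):
    """Split the saved scrape file into a list of {'url', 'text'} records."""
    records = []
    for block in text.split("\n" + SEPARATOR + "\n"):
        block = block.strip("\n")
        if not block.startswith("URL: "):
            continue  # skips the empty tail after the final separator
        header, _, body = block.partition("\n")
        records.append({"url": header[len("URL: "):], "text": body})
    return records


# Made-up sample in the same layout as the saved file.
sample_dump = (
    "URL: https://example.org/a/\nline one\nline two\n" + SEPARATOR + "\n"
    "URL: https://example.org/b/\nonly line\n" + SEPARATOR + "\n"
)
records = parse_scrape_dump(sample_dump)
print(len(records), records[0]["url"])
```

This assumes no scraped page contains a line of exactly 100 `=` characters. Saving the pages as JSON Lines instead would avoid that assumption.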


In [28]:
# Print the first 15,500 characters of the scraped text
print("First 15,500 characters:\n")
print(text[:15500])

First 15,500 characters:

URL: https://datascience.uchicago.edu/research/
Research – DSI
Skip to main content
About
About the Data Science Institute
The Data Science Institute (DSI) executes the University of Chicago’s bold, innovative vision of Data Science as a new discipline.
Jobs & Opportunities
Open faculty, postdoctoral, staff, and student roles with the UChicago Data Science Institute and our partners.
Visiting DSI @ UChicago
Contact
Research
Initiatives
AI + Science
A new paradigm of transformational AI-enabled scientific discovery across the physical and biological sciences.
Data & Democracy
Protecting democracy in the digital age through cross-disciplinary research and convening key stakeholders.
Internet Equity
Measuring and analyzing Internet performance and reliability to address inequity in U.S. communities.
AICE: AI for Climate
Inter-discplinary integration of AI with fundamental domain knowledge to accelerate and transform climate research with a focus on both scientific 