# Understanding Uncertainty
### DS 5030
### Tuesday/Thursday, 9:30–10:45 or 11:00–12:15, SDS 306

## The Course
- “Provides an in-depth exploration of probabilistic and statistical methods used to understand, quantify, and manage uncertainty. Learn foundational concepts in probability and statistics, simulation techniques, and modern approaches to parameter estimation, decision theory, and hypothesis testing. Topics include parametric and nonparametric methods, Bayesian and frequentist paradigms, and applications of uncertainty in real-world problems.”
- In this course, we pick up the threads of your math and probability education, and provide a foundation for later developments in machine learning and artificial intelligence, grounded in data. You will take multiple classes on machine learning, so the goal is not to compete with those courses, but to complement and foreshadow what will happen in them. We want to ensure you are ready to be quantitative thinkers and thoughtful modelers for the rest of your career. The course is slightly recursive, with elements building on themselves. This is intentional.

## Schedule Sketch
- 8/26 -- 9/9: Non-Parametric Analytics
- 9/16 -- 10/9: Probability Theory
- 10/14 -- 11/11: Bayesian Estimation
- 11/13 -- 12/9: Frequentism

## Assignments and Grading
1. Recital: Generative modeling from Markov Chains (12%, 9/23)
2. Exam: Traditional blue book exam on probability theory (22%, 11/6)
3. Bayesian Project: Detecting and blocking potential fraudulent banking transactions (12%, 11/25)
4. Frequentist Project: High dimensional inference with genomics data (12%, 12/9)
5. Final Presentation: Portfolio optimization, or expand on the Bayesian/Frequentist projects (12%, Dec 15, 9:00 am-12:00 pm or Dec 16, 2:00-5:00 pm)
6. Assignments: Each class will include two to five exercises that can be done as a group (30%)

## Course Content
- I'll continuously update: https://github.com/ds4e/understanding_uncertainty
- I plan to move content to AWS where storage constraints are less restrictive, but you can still use Git to clone/pull/fetch materials
- Learning Management Systems are awful, but I will collect assignments and distribute grades on Canvas
- I've learned that students organize their own group chats, and that trying to force technology on them is somewhat pointless, but I am happy to set up Slack/Piazza/Discord/etc. for the course if students will use it

## Teaching Assistants
- 9:30 - 10:45: Oai Tran, dzn7nf@virginia.edu
- 11:00 - 12:15: Eirik Steen, zxc6hs@virginia.edu

## Who is this class for? 

Everyone is welcome to participate in this class, of any age, culture, gender, language or geographic heritage, learning and physical abilities, political or social beliefs, race or ethnicity, religious or spiritual beliefs, sex, and social or economic class. Conversely, everyone participating in this class is expected to respect the dignity and humanity of their peers. Please feel free to approach or email your TAs or instructor with instructions about how you would like to be addressed, including adjustments to or pronunciation of names or preferred pronouns.

## Academic Integrity

(Wording suggested by the administration) “I trust every student in this course to fully comply with all of the provisions of the University’s Honor Code. By enrolling in this course, you have agreed to abide by and uphold the Honor System of the University of Virginia. All graded assignments must be pledged, including homework and exams. All suspected violations will be forwarded to the Honor Committee. Please let me know if you have any questions regarding the course Honor policy. If you believe you may have committed an Honor Offense, you may wish to file a Conscientious Retraction by calling the Honor Offices at (434) 924-7602.”

## Generative Artificial Intelligence

1. Cite AI code appropriately. 
2. Make sure it actually runs and does what you intend. 
3. Recognize that you are not learning how to code, but instead learning how to query an AI system upon which you will become entirely dependent.

## Recommended Technology
1. Operating System
- Windows: Install WSL2 and get used to Linux
- macOS/Linux: Terminal and command lines work great

2. Programming language: Python and Miniconda
- Make sure you've got Python 3 installed on your computer
- Install Miniconda as your package/environment manager: https://www.anaconda.com/docs/getting-started/miniconda/install
- You can use Miniconda to install and manage Python libraries, falling back to pip when necessary

3. Integrated Development Environment: VS Code, maybe PyCharm
- Visual Studio Code is a great option for working on .py and .ipynb files

4. GitHub
- You need a GitHub account for submitting assignments and doing group work

## Conda Cheatsheet
Conda allows you to build and maintain named environments that can be activated from any directory on your machine, whereas venv/pip environments are typically tied to a single project folder

- Create environment: `conda create -n env python=3.12`
- Activate environment: `conda activate env`
- Install packages: `conda install <package>` (for example, `conda install numpy`)
- Deactivate environment: `conda deactivate`
- Remove environment: `conda env remove -n env`
- List all environments: `conda env list`
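
After activating an environment, it can help to confirm that Python is actually running from it. The following is a minimal sketch, not part of the course materials: the function name `describe_environment` is made up for illustration, and it assumes that `conda activate` has set the `CONDA_DEFAULT_ENV` environment variable, which will be missing if you are not using conda.

```python
import os
import sys


def describe_environment():
    """Return basic facts about the Python interpreter that is currently running.

    Hypothetical helper for illustration only.
    """
    version = sys.version_info
    return {
        "python_version": f"{version.major}.{version.minor}.{version.micro}",
        # Path to the interpreter binary; inside a conda environment this
        # usually points somewhere under that environment's folder.
        "interpreter": sys.executable,
        # Root folder of the active Python installation or environment.
        "prefix": sys.prefix,
        # Set by `conda activate`; absent when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV", "(no conda environment active)"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the reported environment name or interpreter path is not what you expect, the usual cause is that the environment was not activated in the terminal (or the VS Code interpreter setting) you are running from.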

## Prompts
On your index card, I am giving you the chance to write whatever you want, anonymously or not

- Maybe you're worried about math or coding
- Maybe you're pivoting from a different field and feel like you're in over your head, and want to voice that concern
- Maybe, because of unanticipated life events, this semester is going to be harder than you thought
- Maybe you find sharks or spiders terrifying, and don't want me to use examples about shark or spider attacks

If there's something you feel like I should be aware of, but you don't feel comfortable saying it to me, this is your chance

## Student Heterogeneity
- There are something like 40 different undergraduate majors represented among the approximately 100 enrolled students
- There are people who are excellent at computer science but know little about statistics, there are people who have worked in industry for 20 years and forgotten all the calculus they ever knew, etc.
- I try to provide a balance of content for as many audiences as I can, but at some point in the course, it will be your turn to either (A) be bored or (B) be frustrated
- The purpose of the course is not to bore or frustrate you, but it's part of being in a diverse intellectual community: We have to work together, and be respectful
- But if you know everything in the course, and are feeling uniformly bored, please come talk to me and we can give you more appropriate work
- Likewise, if you feel like everything is constantly over your head, let's talk sooner rather than later, and find a more appropriate path for you to develop these skills

## Groups
- We'll break into fairly large groups (6-8 members) to work together during class and on projects
- Here is a somewhat arbitrary taxonomy of interests:
    1. Medical Sciences: Precision health, genomics, public health, epidemiology
    2. Social/Behavioral Sciences: Sociology, policy, business, economics, finance, epidemiology
    3. Computer Sciences: LLMs, computer vision, deep learning
    4. Earth and Life Sciences: Physics, chemistry, biology, engineering