Merge pull request #1 from data-8/dsep-changes
Showing 18 changed files with 1,230 additions and 31 deletions.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
<!-- #region --> | ||
# Connector courses | ||
|
||
|
||
Connectors are one- or two-unit courses that give students an in-depth introduction to the use of the data science toolkit and concepts in a particular field. Connector courses vary every semester, but are usually within the domain of the social sciences. Connectors are not an official part of Data 8, though they play a crucial role in the data science education ecosystem of the DSEP (Data Science Education Program). Students are encouraged to take them together with Data 8 or in the semester after they take Data 8. | ||
|
||
For more information about connector courses, check out [The Data Science Connector Courses Page](https://data.berkeley.edu/education/connectors). | ||
|
||
|
||
|
||
# Modules | ||
|
||
Modules are short explorations into data science that can be integrated into (almost) any class on campus. Modules allow students to explore a data set relevant to their course with the help of the data science toolkit. | ||
|
||
Like connectors, modules vary widely and can be customized to fit each instructor's needs and objectives for the course. A module might consist of one or two lectures on analyzing course-relevant data and learning how to interpret the graphs, or it might be a set of labs in which students learn the basics of programming and statistics that they can apply to their field of study. Some modules also include a final class project run completely within Jupyter Notebooks. | ||
|
||
Modules allow students from different fields of study to become familiar with the foundations of data analysis. | ||
|
||
You can learn more about the modules the DSEP Team has created over the years on the [Modules webpage](https://data.berkeley.edu/education/modules). | ||
<!-- #endregion --> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# Data 8 Course Staff Structure | ||
|
||
## Motivation | ||
|
||
Data 8 has three hours of lecture a week and a two-hour weekly lab section, 12 weekly labs and 12 homework assignments, and three larger projects. These are reflected in the course resource allocation diagram below. Weekly lab sections have a ratio of 27 students to one graduate student instructor, which allows students to make personal connections with instructors in a large course. With enrollment in the thousands, UC Berkeley had to address how to make such a large class feel small. | ||
|
||
![](https://i.imgur.com/e421NaU.png) | ||
|
||
|
||
The answer is reflected in Data 8’s support resource strategy in the figure above. While the lecture halls seat hundreds, the teaching staff create a personal experience for students in the course by: | ||
|
||
- Hosting over **25 weekly office hours** for one-on-one help staffed by anywhere between 2 and 10+ people an hour (with 2 separate professor office hours) | ||
- Hosting **weekly small group tutoring sessions** (5 students and 1 tutor), and tailored exam review lectures. | ||
|
||
In the past, Data 8 has also hosted: | ||
- **Guerrilla sections**: Optional small topical worksheet-based review sessions offered by tutors once major course topics are covered. | ||
- **Project “parties”**: Sessions where students work in the same room and can ask staff members questions close to deadlines. | ||
|
||
With this wide array of support resources, students in a 1500-person computing course can connect with their individual staff members. | ||
|
||
The scale of staffing for these support resources is only possible through the enthusiasm of former students in the class. An undergraduate who has performed well in Data 8 and enjoys the course material can join the course staff as an **academic intern (AI)**. Academic interns assist teaching assistants (TAs) by answering student questions in a weekly lab section for 1 academic Pass/No Pass unit; an end-of-semester reflection on their experience is also required for credit. Undergraduates who prove to be engaged and talented academic interns are then selected to be tutors for the class. | ||
|
||
**Tutors** assist with grading written responses on assignments, holding office hours, and conducting small group tutoring sections. Experienced tutors can eventually advance to positions as **Undergraduate Student Instructors (uGSIs)**, holding a weekly lab section and helping out with various pedagogical and logistical aspects of running the course. This hierarchical course staff structure is enabled by the careful leadership of Head GSIs and course instructors. | ||
|
||
|
||
You can read more about the [Roles and Responsibilities here](roles_data8). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
<!-- #region --> | ||
# The Data 8 Pedagogy Guide | ||
|
||
Data 8, “The Foundations of Data Science,” is a course taught to first-year students at UC Berkeley. It combines principles and skills from statistics and computer science, such as inference, modeling, hypothesis testing, visualization, and others. It provides a foundation in the many disciplines encompassed by “data science” and gives students a practical introduction to the field. | ||
|
||
Teaching Data Science requires a shift in the way we traditionally teach each of the individual concepts. What were once introductory classes in statistics, computer science, and ethics (among others) are now combined into a single introductory course. | ||
|
||
|
||
This book covers many of the pedagogical decisions that were made in Data 8 and should be seen as a reference and background for it. | ||
|
||
All of the tools that Data 8 uses are open source, and available for the community to use (either as broader community-run projects, or as Berkeley projects). The course material can be accessed at the following online resources: | ||
|
||
- [The Data 8 Course Site](http://data8.org) | ||
- [The Data 8 Course Textbook](https://www.inferentialthinking.com/chapters/intro) | ||
|
||
To explore the guide, select a section to the left! | ||
|
||
### Contacting Us | ||
|
||
If you would like to learn more about any of the tools used in Data 8 or are interested in deploying your own Data 8 course, please fill out our Data 8 [Instructor Interest form](https://forms.gle/y5aQBi816xDXKe7r7) or send us an email at [ds-help@berkeley.edu](mailto:ds-help@berkeley.edu). | ||
<!-- #endregion --> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,126 @@ | ||
# Roles and Responsibilities | ||
Each semester, around 45 teaching assistants (TAs), 45 tutors, and 150 academic interns (AIs) are hired to be part of the Data 8 course staff. Hourly commitments and responsibilities differ both across and within roles, depending on the staff member’s experience and ability to contribute to the course staff. | ||
|
||
### Academic Interns (AIs) | ||
|
||
##### Lab assisting (2 hrs/week) | ||
- 10 out of the 13 weeks (20 hours), required for a Pass (P) | ||
- Actively participating in the worksheet discussion time and answering questions during the lab notebook time | ||
##### Lab prep (0.5 hr/week) | ||
- Work through the lab assignment | ||
- Discussion worksheet solutions provided | ||
|
||
##### Pedagogy Practice/Reflection (0.5 hr/week) | ||
- Weekly written check-ins | ||
- General meetings/professional development | ||
##### Meetings | ||
- Meetings with TA | ||
- Pedagogy lessons for the upcoming weeks | ||
|
||
### Tutor | ||
|
||
#### Returning Tutors (8 hours) | ||
##### Grading (2) | ||
- Read grading rubrics | ||
- Grading assignments | ||
- Handling regrades | ||
##### Two tutoring sections (2) | ||
##### Preparation (1) | ||
- Review material being covered in lectures that week | ||
- Tutoring section worksheet | ||
- Weekly homework/projects | ||
##### Office hours (1) | ||
##### Flexible hour (1) | ||
- Midterm proctoring, prep, etc. | ||
##### Staff meeting (1) | ||
|
||
|
||
#### New Tutors (8 hours) | ||
|
||
##### Grading (2) | ||
- Read grading rubrics | ||
- Grading assignments | ||
- Handling regrades | ||
|
||
##### Preparation (1) | ||
- Review material being covered in lectures that week | ||
- Tutoring section worksheet | ||
- Weekly homework/projects | ||
|
||
##### Office hours (2) | ||
##### GSI-shadowed tutoring section (1) | ||
##### Flexible hour (1) | ||
- Midterm proctoring, prep, etc. | ||
##### Staff meeting (1) | ||
|
||
### uGSIs | ||
#### Lead uGSI (20) | ||
##### One lab section (2) | ||
##### Tutor mentoring/Office hours (1) | ||
##### Preparation (2) | ||
- Review material being covered in lectures that week | ||
- Lab discussion worksheet | ||
- Work through/familiarize self with the lab assignment | ||
- Weekly homework/project | ||
##### Staff meeting (1) | ||
##### Other (2, spread out through the semester) | ||
- Proctoring for exams | ||
- Helping at review sessions or guerrilla sections | ||
##### Primary team (12) | ||
- Delegating/managing their team | ||
- Varies per role | ||
|
||
#### uGSI, returning (8) | ||
##### Lab section (2) | ||
##### Office hours (1) | ||
##### Preparation (2) | ||
- Review material being covered in lectures that week | ||
- Lab discussion worksheet | ||
- Work through/familiarize self with the lab assignment | ||
- Weekly homework/project | ||
##### Staff meeting (1) | ||
##### Additional role (1) | ||
- The lead TA delegates roles; this should work out to roughly 1 hour a week, but varies per team | ||
##### Other (1, spread out through the semester) | ||
- Proctoring and grading for exams | ||
- Lab assistant management | ||
|
||
#### uGSI, new (8) | ||
New uGSIs are only hired in the Fall semester to be part of Data 8 course staff. In the Spring semester, all GSIs have roles and responsibilities equal to those for returning uGSIs. | ||
|
||
##### Lab section (2) | ||
##### Tutor mentoring/Office hours (1) | ||
##### Staff meeting (1) | ||
##### Pedagogy/New GSI meeting (1) | ||
##### Preparation (2) | ||
- Review material being covered in lectures that week | ||
- Lab discussion worksheet | ||
- Work through/familiarize self with the lab assignment | ||
- Tutoring section worksheet | ||
- Weekly homework/project | ||
##### Other (1, spread out through the semester) | ||
- Proctoring and grading for exams | ||
- Lab assistant management | ||
|
||
### uGSI Teams | ||
Each uGSI (new or returning) is assigned to one of the following teams: | ||
|
||
##### New GSI Support/GSI Support | ||
- Run weekly new GSI meetings, check-ins with new GSIs | ||
- "Lost student" support like connecting students to GSIs | ||
##### Content & Infrastructure | ||
- HW/Project/Lab Development and Release | ||
- Infrastructure (Website + Jupyter support) | ||
##### Grading | ||
- Autograder wrangling | ||
- Maintaining student grades, handling regrades | ||
##### Lab Logistics | ||
- LA Hiring/evaluations | ||
- Lab assignments, lab shadowing | ||
##### Tutor Support | ||
- Tutor management and logistics | ||
- Lead weekly tutor meetings | ||
##### Pedagogy | ||
- Assignment creation, exam writing | ||
- Help new GSIs/tutors teach |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,140 @@ | ||
# Course structure | ||
|
||
The following is a general description of the course structure for Data 8. | ||
It covers how to build your course syllabus and is a reflection of the content that is covered in the Data 8 | ||
textbook, which is freely available at the following link: | ||
|
||
https://www.inferentialthinking.com/ | ||
|
||
The following sections describe some of the major takeaways that students | ||
should learn in the class. Data 8 assumes *no programming*, | ||
*no statistics*, and *no math* beyond a standard high-school level. | ||
|
||
## Conceptual understanding of uncertainty and causality | ||
|
||
A lot of the technical pieces in the course focus on enabling students to practice | ||
specific technical skills (like programming). It is crucial that these skills | ||
be learned in order to solidify a high-level understanding of how data, statistics, | ||
and inference are inter-related. For example, see | ||
[Chapter 2: Causality and Experiments](https://www.inferentialthinking.com/chapters/02/causality-and-experiments.html) | ||
from the Data 8 textbook. | ||
|
||
Below are a few high-level concepts that students should come away with: | ||
|
||
* A solid understanding of causality on a conceptual level. | ||
* Where did you get your data from? | ||
  * Is it from a randomized controlled study, or is it observational data? | ||
  * What do cause and effect mean in the context of data? | ||
|
||
## Programming fundamentals | ||
|
||
Scripting and interactive computing are the primary ways that we operationalize | ||
the data science methods covered in the course. While it is possible to find | ||
programs that let you carry out various techniques with user-interfaces, Data 8 | ||
stresses that programming fundamentals will facilitate learning the analytic | ||
topics and provide a more useful and generic skillset in computational methods. | ||
|
||
In Data 8, programming fundamentals are taught *alongside* statistical concepts. | ||
For example, [iteration is taught alongside random sampling](https://www.inferentialthinking.com/chapters/09/2/iteration.html). | ||
|
||
Below are some programming fundamentals that students come away with: | ||
|
||
* Data types, structures, functions, tables, etc. | ||
* Programming fundamentals *in the context of data*. | ||
* How to answer data science questions with the tables you have. | ||
* Complex operations with tables (grouping, joining, etc). | ||
* How to create visualizations (charts, histograms, etc) and understand what is being represented. | ||
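
As a rough illustration of the grouping operations listed above, here is a minimal sketch in plain Python. It is not the course's own table library: each row is a dictionary, the `rows` data is made up, and the helper names `group_count` and `group_mean` are hypothetical, chosen only for this example.

```python
from collections import defaultdict

# Hypothetical data: each dictionary plays the role of one table row.
rows = [
    {"flavor": "chocolate", "price": 4.75},
    {"flavor": "strawberry", "price": 3.55},
    {"flavor": "chocolate", "price": 5.25},
    {"flavor": "strawberry", "price": 5.25},
    {"flavor": "bubblegum", "price": 4.75},
]

def group_count(rows, column):
    """Count how many rows share each distinct value in `column`."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[column]] += 1
    return dict(counts)

def group_mean(rows, column, value_column):
    """Average `value_column` within each group defined by `column`."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[column]] += row[value_column]
        counts[row[column]] += 1
    return {key: totals[key] / counts[key] for key in totals}

counts = group_count(rows, "flavor")
mean_prices = group_mean(rows, "flavor", "price")
```

The point of the sketch is the idea, not the tooling: a grouping question ("how many of each flavor?", "what is the average price per flavor?") reduces to one pass over the rows with an accumulator per group.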
|
||
|
||
## Statistics, sampling, and hypothesis testing | ||
|
||
Randomness and statistics are core components of data science. Data 8 | ||
has a heavy emphasis on both. It is particularly important that students come | ||
away with an appreciation for how a sampling method is | ||
used to generate data, as well as an understanding for how statistics can | ||
be used (and mis-used) to understand a dataset given a limited number | ||
of data points. | ||
|
||
Below are some statistics fundamentals that students come away with: | ||
|
||
* Iteration in the context of the probability of an event (e.g., the probability of an outcome when flipping a coin) | ||
  * "for" loops and related constructs are covered in the context of a coin-flipping simulation | ||
* How to manipulate tables to calculate probability. | ||
* Sampling and empirical distributions - how to make conclusions based on random samples? | ||
* How to compare two samples. | ||
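
To make the coin-flipping idea above concrete, here is a minimal sketch of a simulation built from nested `for` loops. It uses only Python's standard `random` module rather than the course's own tools, and the function names, the seed, and the choice of 10 flips and 5 heads are assumptions made for illustration.

```python
import random

def simulate_heads(num_flips, rng):
    """Flip a fair coin `num_flips` times and return the number of heads."""
    heads = 0
    for _ in range(num_flips):
        if rng.random() < 0.5:
            heads += 1
    return heads

def estimate_probability(num_flips, target_heads, repetitions, seed=0):
    """Estimate the chance of exactly `target_heads` heads by repeated simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repetitions):
        if simulate_heads(num_flips, rng) == target_heads:
            hits += 1
    return hits / repetitions

# Estimated chance of exactly 5 heads in 10 flips of a fair coin.
estimate = estimate_probability(num_flips=10, target_heads=5, repetitions=20000)
```

The exact answer here is 252/1024, roughly 0.246, so the estimate should land close to that value; the gap between the two is itself a useful talking point about sampling variability.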
|
||
## Inference, prediction, and models | ||
|
||
While statistics describe a dataset, they do not inherently make *predictions* | ||
about the underlying distribution from which the data are drawn. Data 8 relies | ||
heavily on bootstrapping and permutation methods in order to make estimations | ||
of error/confidence in parameters derived from the data. | ||
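
As a hedged sketch of the bootstrap idea mentioned above, the following resamples a single sample with replacement and reads a confidence interval for the mean off the percentiles of the resampled means. It uses only the standard library; the sample values, the function name `bootstrap_mean_interval`, and the default of 2,000 repetitions are assumptions for illustration, not the course's implementation.

```python
import random
import statistics

def bootstrap_mean_interval(sample, repetitions=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(sample)
    means = []
    for _ in range(repetitions):
        # Resample n values from the original sample, with replacement.
        resample = rng.choices(sample, k=n)
        means.append(statistics.mean(resample))
    means.sort()
    tail = (1 - level) / 2
    lower = means[int(tail * repetitions)]
    upper = means[int((1 - tail) * repetitions) - 1]
    return lower, upper

# Made-up measurements, used only to exercise the function.
sample = [12, 15, 9, 20, 14, 17, 11, 13, 16, 18]
lower, upper = bootstrap_mean_interval(sample)
```

The interval describes how much the sample mean would plausibly vary across samples; it narrows as the sample grows, which is one way to show students why more data gives more precise estimates.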
|
||
Beyond estimating the value of a model's parameter given limited data, models | ||
are also used to generate *predictions* about the world given a new set of | ||
data. Data 8 treats prediction as an extension of inference. In the same | ||
sense that inference quantifies uncertainty in a model's parameter, we can also | ||
generate uncertainty in predictions given a data point that the model has not | ||
seen before. This is given treatment in the case of regression (models with quantitative outputs) | ||
as well as classification (models with qualitative outputs). | ||
|
||
Below are some inference, prediction, and modeling fundamentals that students come away with: | ||
|
||
* Correlation in the context of modeling. | ||
* Linear regression and least squares. | ||
* Regression inference - what does a regression line really mean? | ||
* How to compute confidence intervals of regression lines. | ||
* How to train classifiers with simple algorithms such as k-nearest neighbors | ||
* Brief introduction to machine learning | ||
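
The link between correlation and linear regression in the list above can be sketched as follows. This is a minimal standard-library version, assuming the usual formulation in which the least-squares slope equals the correlation times the ratio of standard deviations; the helper names and the toy data are hypothetical.

```python
import statistics

def standard_units(values):
    """Convert values to standard units: (value - mean) / SD."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD
    return [(v - mean) / sd for v in values]

def correlation(x, y):
    """Correlation: the mean of the products of x and y in standard units."""
    xs, ys = standard_units(x), standard_units(y)
    return statistics.mean([a * b for a, b in zip(xs, ys)])

def regression_line(x, y):
    """Slope and intercept of the least-squares line through (x, y)."""
    r = correlation(x, y)
    slope = r * statistics.pstdev(y) / statistics.pstdev(x)
    intercept = statistics.mean(y) - slope * statistics.mean(x)
    return slope, intercept

# Toy data lying exactly on the line y = 2x + 1.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
slope, intercept = regression_line(x, y)
```

Because the toy points lie exactly on a line, the correlation is 1 and the recovered slope and intercept match the line that generated them; with noisy data the same functions return the best-fitting line in the least-squares sense.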
|
||
## Comparing distributions | ||
|
||
Once students learn the various steps that go into statistically describing a | ||
single dataset, Data 8 covers how to make comparisons *between* datasets. This | ||
is a crucial part of most scientific analysis, as well as in industry data | ||
analytics (e.g., in A/B testing). Data 8 covers comparisons between distributions | ||
as an advanced case of the material that has been covered above. | ||
|
||
Below are some fundamentals for comparing two distributions that students come away with: | ||
|
||
* A/B testing - with what confidence can we say if two numerical samples come from the same underlying distribution or not? | ||
* Bootstrapping and confidence intervals | ||
* Importance of bias and variance of the sample mean | ||
* Statistics about statistics (e.g. variability of sample mean) | ||
* How to use sample means effectively for inference? | ||
* How to design experiments and interpret the distributions with the Central Limit Theorem, Chebyshev's inequality, etc. | ||
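
One common way to frame the A/B testing question above is a permutation test: shuffle the group labels many times and see how often the shuffled difference in means is at least as large as the observed one. The sketch below uses only the standard library; the function name, the repetition count, and the two example groups are assumptions for illustration.

```python
import random
import statistics

def permutation_test(group_a, group_b, repetitions=5000, seed=0):
    """Empirical p-value for the absolute difference in group means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(repetitions):
        # Shuffling the pooled values is equivalent to shuffling the labels.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / repetitions

# Made-up groups whose values clearly differ.
group_a = [1, 2, 3, 4, 5, 6, 7, 8]
group_b = [11, 12, 13, 14, 15, 16, 17, 18]
p_value = permutation_test(group_a, group_b)
```

A small p-value means that a difference this large rarely arises when labels are assigned at random, which is evidence against the hypothesis that both groups come from the same underlying distribution.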
|
||
|
||
# Building a Course syllabus page | ||
|
||
You may use whatever technology you prefer for managing your course and | ||
distributing content. However, we recommend setting up a syllabus page that | ||
is used for distributing interact links and course materials. For an example, | ||
see the structure of the Spring 2020 course syllabus: http://data8.org/sp20/ | ||
|
||
The syllabus has the following structure: | ||
|
||
| Date | Topic | Lecture | Reading | Assignment | ||
| ------------- | ----- | ------- | ------- | ---------- | ||
| Fri 01/24 | Cause and Effect | [Slides](https://docs.google.com/presentation/d/1lSwG_uGwQRL3oGQnmn7aphxYZlzX0G0KHeyNM-cNZb4/edit?usp=sharing) | [Chapter 2](https://www.inferentialthinking.com/chapters/02/causality-and-experiments.html) | [Homework 01](http://datahub.berkeley.edu/hub/user-redirect/git-sync?repo=https://github.com/data-8/materials-sp20&subPath=materials/sp20/hw/hw01/hw01.ipynb) | ||
| Mon 01/27 | Tables | [Slides](https://docs.google.com/presentation/d/1jn2X5JtbOqOfiBa_QdV2ITw491dhRdxIvKR0bR1tWUk/edit?usp=sharing), [Demos](http://datahub.berkeley.edu/hub/user-redirect/git-sync?repo=https://github.com/data-8/materials-sp20&subPath=lec/lec03.ipynb), [Video](https://www.youtube.com/watch?v=BW9XcOG8jag) | [Chapter 3](https://www.inferentialthinking.com/chapters/03/programming-in-python.html)| | | ||
| Wed 01/29 | Data Types | [Slides](https://docs.google.com/presentation/d/1TjU8ismB9qSrWAivDq2HvuYy68sRTE_Q_V3gPcHP0qQ/edit?usp=sharing), [Demos](http://datahub.berkeley.edu/hub/user-redirect/git-sync?repo=https://github.com/data-8/materials-sp20&subPath=lec/lec04.ipynb), [Video](http://youtube.com/watch?v=F3krVj7GkI0) | [Chapters 4](https://www.inferentialthinking.com/chapters/04/data-types.html), [5](https://www.inferentialthinking.com/chapters/05/sequences.html)| [Lab 02: Table Operations](http://datahub.berkeley.edu/hub/user-redirect/git-sync?repo=https://github.com/data-8/materials-sp20&subPath=materials/sp20/lab/lab02/lab02.ipynb) | ||
|
||
Each row is a lecture, and each column is a type of material you can distribute. | ||
The links in the columns point either to pages in the [course textbook](https://inferentialthinking.com) | ||
or to interact links that connect students with the course JupyterHub for distribution of homeworks and labs. | ||
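
The interact links in the table rows above all follow one visible pattern: a hub address, the path `/hub/user-redirect/git-sync`, and `repo` and `subPath` query parameters. The sketch below rebuilds a link from those parts. The pattern is inferred only from the sample rows, and the function name `interact_link` is hypothetical; check your own JupyterHub's link format before relying on it.

```python
from urllib.parse import urlencode

def interact_link(hub_url, repo_url, sub_path):
    """Build an interact link in the shape used by the sample syllabus rows."""
    # Keep ":" and "/" unescaped so the result matches the links shown above.
    query = urlencode({"repo": repo_url, "subPath": sub_path}, safe=":/")
    return f"{hub_url}/hub/user-redirect/git-sync?{query}"

# Values copied from the Homework 01 link in the sample syllabus table.
hw01_link = interact_link(
    "http://datahub.berkeley.edu",
    "https://github.com/data-8/materials-sp20",
    "materials/sp20/hw/hw01/hw01.ipynb",
)
```

Generating links this way keeps a syllabus consistent when the repository changes each semester, since only the repository address and notebook path need updating.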
|
||
The videos and slides listed above and on the [Data 8 website](http://data8.org/sp20/) are restricted to berkeley.edu addresses. | ||
|
||
|
||
## Assignments | ||
|
||
Alongside the textbook are several computational homeworks, labs, and projects that let students interact with the | ||
ideas covered in class. They can all be run interactively in the Data 8 environment. | ||
|
||
These homework, lab, and project materials are freely available in the semester course repository. | ||
There is a new repository created for each semester of the course, as the materials | ||
tend to evolve over time. For example, here is the repository for the Spring 2020 course: https://github.com/data-8/materials-sp20 | ||
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
# Syllabus & Lectures | ||
Teaching Data Science requires a shift in the way we traditionally teach each of the individual concepts in the course. What were once several introductory courses in statistics, computer science, writing, and ethics (among others) are now combined into a single introductory course. | ||
|
||
In Data 8, touching on these topics in a single course is a feature, not a bug. For example, by teaching programming with the fundamentals of statistics, students gain an appreciation for the close interplay between these two fields and learn a valuable set of skills at the same time. Through this course, students are able to learn how to interpret and ask questions about their data with the support of inferential and computational tools. | ||
|
||
This section covers many of the pedagogical decisions that were made in Data 8. It should be seen as a reference and background for the [Data 8 textbook](https://inferentialthinking.com), which is freely available online. |