# Title of Notebook [EDIT ME]

* * * 

### Icons used in this notebook
🔔 **Question**: A quick question to help you understand what's going on.<br>
🥊 **Challenge**: Interactive exercise. We'll work through these in the workshop!<br>
💭 **Reflection**: Helping you think about programming.<br>
⚠️ **Warning**: Heads-up about tricky stuff or common mistakes.<br>
💡 **Tip**: How to do something a bit more efficiently or effectively.<br>
🎬 **Demo**: Showing off something more advanced – so you know what Python can be used for!<br>

### Learning Objectives
1. [Section Name - EDIT ME](#section1)
2. [Section Name - EDIT ME](#section2)
3. [Reflection: [Title of Reflection] - EDIT ME](#refl)
4. [Demo: [Title of Demo] - EDIT ME](#demo)

<a id='section1'></a>

# Section Name [EDIT ME]

Main sections should be an H1 (one hashtag) header. They are linked to in the Learning Objectives. Use `<a id='section_name'></a>` at the top of the markdown cell to create the link. Capitalize Each Word for section headers. 

Be sure to have text between code cells explaining the material, step-by-step.

Tips for markdown cells:
- Keep text limited! 
- Use colloquial language and minimal jargon; if you need it, explain it.
- Use **boldface** to highlight important terms, but use it sparingly.
- Use hyphens (`-`) for lists.
- Consistently use emoji for relevant sections. Here are some examples:

In [3]:
# Code cells

# Comments:
# Use comments sparingly: better to explain in markdown!
# Use comments to further specify particular lines of code when needed.

# Coding style: 
# Use snake case and single quotation marks.
some_variable = 'Hello world'

# Data: 
# Use relevant social science datasets whenever you can. 
# Think health data, demographics, etc.
# Stay away from "impersonal" datasets like the Iris dataset.
# Use a "data" folder in the main repo to store data

In [None]:
# You might want to have an install cell, if your workshop uses a special package
# e.g.,
# !pip install [PACKAGE]

In [None]:
# You might want to have an import cell, so that you can do all imports at the beginning.
# e.g.,
# import numpy as np
# import pandas as pd

## Subsection [EDIT ME]

Use H2 headers for subsections. These **do not** need to be linked to in the Learning Objectives at the top of the notebook.

### Subsubsection [EDIT ME]

Use H3 headers for subsubsections.

<a id='section2'></a>

# Section Name [EDIT ME]



## 🥊 Challenge: [Name of challenge]

- Challenges are typically formatted as a subsection (##).
- Challenges have names.
- Keep challenges short: 5-10 minutes.
- Don't use "Bonus Challenges" or the like. If really needed, use a "Take-Home Challenge" (1 max!) so it's clear these can be completed outside the workshop.
- You can work through challenges by interacting with participants and allowing them to paste in chat, or by letting them work on their own.
- If letting participants work on their own, **don't just show them the answer afterwards** but make sure to let them answer in chat and take them through the answer.

### *Example:*

## 🥊 Challenge: Printing!

Write your own `print` statement in the code cell below. Follow the syntax of the example above, and change the text in the quotation marks.


In [None]:
# You may have some starter code for the challenge that you can put in its own cell.
# Always have a following cell that says "YOUR CODE HERE" with a few empty lines beneath it, 
# so that attendees know where to put their code.

In [4]:
# YOUR CODE HERE




💡 **Tip**: Keep them short and to the point. 

### *Example:*

💡 **Tip**: A method is written with parentheses: e.g. `gap.value_counts()`. An attribute is written without parentheses: e.g. `gap.columns`.
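
The same distinction shows up outside `pandas`. Here is a minimal sketch using the standard library's `date` object, chosen only for illustration; the variable name `survey_date` is made up:

```python
from datetime import date

# A hypothetical date, used only to illustrate the syntax.
survey_date = date(2024, 1, 15)

# Attribute: no parentheses. It is a stored value.
print(survey_date.year)

# Method: parentheses. It runs code and returns a result.
print(survey_date.isoformat())

# Forgetting the parentheses gives you the method itself, not its result.
forgot_parentheses = survey_date.isoformat
print(callable(forgot_parentheses))
```

If a line prints something like `<built-in method ...>` instead of a value, missing parentheses are a likely cause.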


## 💡 Tip: [Name of tip]
- Tips can be formatted as a subsection (##) when they're more substantial.
- Keep text short. Use links to relevant materials when needed.


🔔 **Question:** These are typically written in-line (no subsection header). They do not require participants to enter code. Scatter them throughout the notebook as attention checks, and discuss them during the workshop. Keep them short.

### *Example:*

🔔 **Question**: What will the output of the following code be?

In [None]:
numbers = [12, 20, 43, 88, 97, 100, 105, 110]

for number in numbers:
    if number > 100:
        print(number, 'is greater than 100.')

⚠️ **Warning:** These are written in-line, and act as short reminders to participants of common mistakes and errors. 

### *Example:*

⚠️ **Warning:** Jupyter remembers all lines of code it executed, **even if it's not currently displayed in the notebook**. Deleting a line of code does not delete it from the notebook's memory if it has already been run! This can cause a lot of confusion.
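
One way to picture this is a rough sketch in which a plain dictionary stands in for the notebook's memory. This is an analogy only, not how Jupyter is implemented, and the names `kernel_memory` and `cell_text` are made up for the illustration:

```python
# A plain dictionary stands in for the notebook's memory (analogy only).
kernel_memory = {}

# "Run" a cell that defines a variable.
cell_text = "greeting = 'Hello world'"
exec(cell_text, kernel_memory)

# "Delete" the cell's text from the notebook.
cell_text = ''

# The variable is still in memory, even though the code is gone from view.
still_defined = 'greeting' in kernel_memory
print(still_defined)

# Only removing the name itself clears it from memory.
del kernel_memory['greeting']
print('greeting' in kernel_memory)
```

In a real notebook, restarting the kernel and re-running the cells from the top is the usual way to get back to a clean state.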

<a id='refl'></a>

# 💭 Reflection: [Title of Reflection]

Reflections are longer pieces of text that need to be discussed in the workshop, and that can be referred to by participants when working on their own. They can be their own section, referred to in the Learning Objectives. Keep them as short and colloquial as possible. Don't use jargon if you can avoid it.


### *Example:*

# 💭 Reflection: What it Means to "Know How to Code"

Python is a general-purpose, powerful, and high-level programming language, which makes it a very useful language to be comfortable with. Python can be used for many tasks, from building websites and software, to automating tasks, to conducting data analysis.

This workshop will take you through the fundamentals of Python with a focus on data analysis. But just as important is knowing *how to code* in general, as opposed to "knowing" Python, R, Matlab, or any other specific language. This is not a matter of memorization, but of a set of problem-solving skills.

A programmer knows 1) general structures and programming logic, 2) how to find and use new functions, and 3) how to work through problems that arise. It is these three aspects we want to give you an intuition for.

When you're programming, 80% or more of your time will be spent debugging, looking stuff up (like program-specific syntax, [documentation](https://github.com/dlab-berkeley/python-intensive/blob/master/glossary.md#documentation) for packages, useful functions, etc.), or testing. Relatively little time is actually spent typing out the code - most of it goes into the thinking, planning, and testing to ensure well-designed code!


<a id='demo'></a>


# 🎬 Demo: [Title of demo]

Demos can be used at the end of a notebook to highlight advanced functionality outside the scope of the workshop. Use them as main sections (#) that are linked to in the Learning Objectives at the top of the notebook.

- Only one per notebook can be used.
- Use them to enthuse participants, and to point them to relevant other D-Lab workshops.
- Make sure to walk through the functionality; don't skip over it.
- Keep them short: 5-10 mins max.

### *Example:*

# 🎬 Demo: Working with Data Frames

To cap off this workshop, here's a demo to see what reproducible data science with Python looks like.
Just run the code cell below, and don't worry if you don't understand everything!

* We'll be using a `pandas` DataFrame to store and manipulate the data - you'll learn more about `pandas` and DataFrames in the next workshop!
* Our data comes from the California Health Interview Survey (CHIS), the nation's largest state health survey. 

Let's have a look at the data:

In [None]:
import pandas as pd

# Reading in a comma-separated values file
chis_df = pd.read_csv('SOME_FILE')
chis_df.head()

Looks like we have a bunch of information here. Let's focus on the column for the number of sodas people have per day (the "number_sodas" column), and whether people rent or own a house (the "tenure" column).

In the next steps, we'll...
1. change the datatype of a column,
2. create dummy variables (variables that take values of 0 and 1), and 
3. group our data.

This allows us to calculate the average number of sodas consumed by people who do and do not own a house.

In [None]:
# Changing the data type of a column
chis_df['number_sodas'] = chis_df['number_sodas'].astype(float)

# Creating dummy variables (dtype=int gives 0/1 values rather than True/False)
chis_dummies = pd.get_dummies(chis_df, columns=['tenure'], dtype=int)

# Averaging the number of sodas consumed, grouped by whether people own a house
chis_dummies['number_sodas'].groupby(chis_dummies['tenure_OWN']).mean()
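
To see what those three steps compute without needing the data file, here is a small sketch in plain Python rather than `pandas`. The rows are made up for illustration and are not real CHIS values. The column names mirror the ones used above.

```python
# Made-up rows for illustration only; these are NOT real CHIS values.
rows = [
    {'number_sodas': '2', 'tenure': 'OWN'},
    {'number_sodas': '0', 'tenure': 'RENT'},
    {'number_sodas': '1', 'tenure': 'OWN'},
    {'number_sodas': '4', 'tenure': 'RENT'},
]

# Step 1: change the data type (text to float).
for row in rows:
    row['number_sodas'] = float(row['number_sodas'])

# Step 2: create a dummy variable (1 if the person owns, 0 otherwise).
for row in rows:
    row['tenure_OWN'] = 1 if row['tenure'] == 'OWN' else 0

# Step 3: group by the dummy, then average the sodas within each group.
groups = {}
for row in rows:
    groups.setdefault(row['tenure_OWN'], []).append(row['number_sodas'])

average_sodas = {key: sum(values) / len(values) for key, values in groups.items()}
print(average_sodas)
```

The `pandas` version above does the same work in three lines, which is a good part of why it is popular for data analysis.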