# Discussion 1 – Getting Started with Jupyter Notebooks

## DSC 10, Winter 2024

### Agenda

- Introduction.
- Python Basics.
- DataHub.
- Using Jupyter notebooks.
- Good programming style.

### Hi! I am Arya Rahnama
- BS in Computer Science and Mathematics from Oregon State University.
- Currently a first-year Data Science master's student at UCSD.
    - 2nd time TA-ing DSC 10! First time was last quarter.
    - Previously tutored 2x for DSC 40A (Theoretical Foundations of DS).
- Data Science internships at Intel and Deloitte.
- Outside interests: video games, chess, basketball, my dog.

<img src='images/joon.png' width=25% style="float:left"> <img src='images/arya_joon.png' width=30% style="float:right">

### Important Dates

- There are six quizzes throughout the quarter, administered in discussion section:
    - Quiz 1: Monday, January 22 (in 2 weeks).
    - Quiz 2: Monday, January 29.
    - Quiz 3: Monday, February 5.
    - Quiz 4: Monday, February 26.
    - Quiz 5: Monday, March 4.
    - Quiz 6: Monday, March 11.
- Each quiz is worth 4%, and your lowest two quiz scores are dropped, so the four that count are worth 16% in total.

### Group Activity
Find a group of 2-4 people sitting near you and introduce yourself! Ask your classmates about:
- Major, college, and year at UCSD.
- Hometown (who is from the furthest away?).
- Favorite hobby.
- Why are you interested in data science?

### Python Basics

- *Programming*: Giving instructions to a computer to perform tasks. Typically these instructions are in the form of *code*, which can be in one of many *languages*.

- *Python*: A popular programming language that is widely used for its readability and simplicity.

- *Jupyter Notebooks*: An interactive environment where you can write and run code and see results immediately. 

### Hello, World!

A "Hello, World!" program is a simple computer program that displays a message similar to "Hello, World!" on the screen (often the console) and ignores any user input. Small test programs have existed since the development of programmable computers, but the tradition of using the phrase "Hello, World!" goes back to an example program from 1978! Here is Python's "Hello, World!" program:

In [None]:
print("Hello, World!")

Here, the Python built-in `print()` function displays the text or numbers provided to it. In this case, we gave the text "Hello, World!" as the input to the print function.

### Variables and Types

- *Variables*: Containers for storing data values. In Python, variables are created the moment you give or assign a value to them using the `=` operator. Variables are case-sensitive.
- *Types*: The different types of data that can be stored in variables. Python variables can hold various data types, including integers, floats, strings, booleans, tuples and lists. Use the built-in `type()` function to check the type of a variable by giving that variable as an input.

In [None]:
a = 10       # An integer variable (whole numbers)

In [None]:
a

In [None]:
b = 5.5      # A float variable (numbers with a decimal point)
b

In [None]:
c = "Hello!" # A string variable (text)
c

In [None]:
type(a), type(b), type(c)
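
The cells above cover integers, floats, and strings. As a small sketch of the other types mentioned (booleans, lists, and tuples), here is the same `type()` check on a few more variables. The variable names and values are made up for illustration only.

```python
is_enrolled = True             # A boolean variable (True or False)
quiz_dates = [22, 29, 5]       # A list (an ordered collection that can be changed later)
first_quiz = ("January", 22)   # A tuple (an ordered collection that cannot be changed)

print(type(is_enrolled))
print(type(quiz_dates))
print(type(first_quiz))
```

Each line of output names the type, just like the `type(a), type(b), type(c)` cell does for `int`, `float`, and `str`.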

- Variable names must start with a letter or an underscore (`_`).
- Variable names can only contain letters, numbers, and underscores.
- Variable names cannot contain spaces or special characters.
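
One way to try out these rules is the string method `isidentifier()`, which reports whether a piece of text would be accepted as a name. The sketch below uses made-up candidate names, and it also shows that names differing only in capitalization are separate variables.

```python
# Each result is True if the text follows Python's naming rules, and False otherwise.
starts_with_letter = "total_students".isidentifier()
starts_with_underscore = "_temp".isidentifier()
starts_with_digit = "2nd_quiz".isidentifier()
contains_space = "total students".isidentifier()
contains_hyphen = "total-students".isidentifier()

print(starts_with_letter, starts_with_underscore)            # True True
print(starts_with_digit, contains_space, contains_hyphen)    # False False False

# Variables are case-sensitive: score and Score are two different variables.
score = 90
Score = 75
print(score, Score)                                          # 90 75
```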

### Python Operators
<center>$\textbf{Python Arithmetic Operators and Functions}$</center>

| Operator | Math | Code | Math Example | Code Example |
|----------|------|------|---------|---------|
|Addition  |$+$   |`+`   |$1 + 2 = 3$   |`1 + 2 == 3`  |
|Subtraction|$-$   |`-`  |$1 - 2 = -1$|`1 - 2 == -1` |
|Multiplication|$\times$ |`*` |$1 \times  2 = 2$|`1 * 2 == 2`|
|Division  |$\div$|`/`   |$1 \div 2 = 0.5$|`1 / 2 == 0.5`|
|Floor Division|$\lfloor \frac{a}{b}\rfloor$|`//`|$\lfloor \frac{1}{2}\rfloor = 0$|`1 // 2 == 0`|
|Exponentiation|$a^b$ |`**`  |$1^2 = 1$| `1 ** 2 == 1`|
|Absolute Value|$\lvert a \rvert$|`abs()`|$\lvert -1 \rvert = 1$|`abs(-1) == 1`|
|Remainder|mod|`%`|$5~\text{mod}~2 = 1$|`5 % 2 == 1`|

In [None]:
# Addition
1 + 2


In [None]:
# Subtraction
1 - 2

In [None]:
# Multiplication
1 * 2

In [None]:
# Division
1 / 2

In [None]:
# Floor Division - divide and round down to the nearest whole number
1 // 2

In [None]:
# Exponentiation
1 ** 2

In [None]:
# Absolute Value
abs(-1)

In [None]:
# Remainder
5 % 2
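
Floor division and remainder fit together: the quotient times the divisor, plus the remainder, gives back the original number. The sketch below uses arbitrary example numbers to check this. It also shows that `//` rounds down, which is worth noticing when the result is negative.

```python
a = 17
b = 5

quotient = a // b      # How many whole times b fits into a
remainder = a % b      # What is left over

print(quotient, remainder)               # 3 2
print(quotient * b + remainder == a)     # True

# With a negative result, rounding down moves away from zero.
print(-7 / 2)                            # -3.5
print(-7 // 2)                           # -4
```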

### Imports and Modules

In Python, you use the `import` keyword to access a bundle of code called a module (or package). Access components of the imported module using "dot notation."

In [None]:
import math

In [None]:
# Calculate the area of a circle with radius r. Try changing r!
r = 1
math.pi * r ** 2 # Note the order of operations

In [None]:
r = 1
pi * r ** 2  # Doesn't work because there is no variable named pi!

In this class, essentially every Jupyter notebook will start with the code `import babypandas as bpd`, which imports our custom version of the popular data science module `pandas` for you to use throughout that notebook.
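
The `as` keyword gives a module a shorter nickname. As a minimal sketch, the cell below applies the same idea to `math`, since it comes with Python. The nickname `m` is an arbitrary choice for this example. It also shows `from math import pi`, which is one way to make the name `pi` usable on its own.

```python
import math as m       # "m" is a nickname chosen only for this example
from math import pi    # Brings the name pi in directly

r = 1
area_with_nickname = m.pi * r ** 2   # Dot notation with the nickname
area_with_name = pi * r ** 2         # Works now because pi was imported by name

print(area_with_nickname == area_with_name)   # True
```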

### DataHub

Let's go through:
- What is DataHub?
- Accessing DataHub
    - Selecting the DSC 10 environment on DataHub for this class
- Viewing the lecture notebooks
- Uploading files
- Creating a notebook

### Using Jupyter Notebooks

Let's see how to:
- Add and delete cells, run all, run before, run after. 
- Run current cell using **shift + enter**. 
- Change cells to code or markdown.
- Enter command mode (blue bar) and edit mode (green bar). 
- Add cell before with 'a', add cell after with 'b', and delete cell with 'dd'.
    - All keyboard shortcuts are under the "Help" tab
- Comment and indent blocks of text.
- Tab to autocomplete variable names.
- Display variables in code cells

### Jupyter Notebook Memory
- The Jupyter kernel has short-term memory: leaving it for a while or restarting it will cause it to "forget" all the variables you've defined.
- When this happens, restart the kernel and run all cells.
    - In the menu, Kernel -> Restart and Run All does this in one step, but it will stop at the first error it encounters.
- Be careful: running cells out of order or running the same cell more than once may leave variables with values you don't expect.

In [None]:
a = 3

In [None]:
b = 7

In [None]:
print("a:", a)
print("b:", b)
a = a + b
a
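
To see why running a cell more than once matters, the sketch below mimics pressing **shift + enter** on the `a = a + b` cell three times. The loop and the `history` list are stand-ins added for illustration. In a real notebook you would simply run the cell again.

```python
a = 3
b = 7

history = []             # Records the value of a after each "run"
for run in range(3):     # Pretend the cell is run three times in a row
    a = a + b
    history.append(a)

print(history)           # [10, 17, 24]
```

The value of `a` keeps growing because each run starts from whatever the previous run left behind. Restarting the kernel and running all cells in order resets `a` to 3 before the addition happens.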

### Navigating Jupyter Notebooks Video

We highly recommend watching [this video](https://youtu.be/Hq8VaNirDRQ) for more information. The video is long, but is timestamped so you can jump to specific topics:

Timestamps:  
0:20 – What is DataHub?  
1:04 – Accessing DataHub  
2:15 – File organization  
4:07 – Clicking links on the course website  
6:29 – Uploading files and creating new notebooks  
10:02 – Running cells (IMPORTANT!)  
10:58 – Test cases in labs vs. homeworks  
11:57 – Restarting the kernel  
13:49 – Restart and Run All  
14:40 – Adding and deleting cells with keyboard shortcuts; command mode vs. edit mode  
17:54 – Interrupting the kernel  
19:11 – Saving  
20:05 – Download as .ipynb (even on iPads!)  

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo("Hq8VaNirDRQ")

### Debugging

If you run into a technical issue, check out the [debugging page](https://dsc10.com/debugging/) on the course website.

### Good Programming Style
#### Variable Names
- Choose descriptive, concise variable names.
- See [r/badcode](https://www.reddit.com/r/badcode/) for more fun examples.

<img src='images/goodcode.png' width=50% style="float:left"> <font color="red" style="font-size:2.5em"> <b>vs</b> </font><img src='images/badcode.png' width=30% style="float:right">

- If using multiple words, `separate_like_this` (called snake_case).
<center><img src='images/image.png' width=45%></center>

### Whitespace

Extra spaces within a line don't affect how your Python code runs (indentation at the start of a line is the exception), but consistent spacing is important for readability, and for future classes or programming languages!

- Put spaces in between operators (`x = 1` vs `x=1`).
- Don't be afraid to start new lines or create new variables if your line of code becomes too long.
- Break up long lines of code with `\`.
<img src='images/example.png' width=70%>
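
The whitespace tips above can be sketched as follows. The section sizes for B and C are made-up numbers chosen for this example. Wrapping an expression in parentheses is shown as an alternative to `\` for splitting a long line.

```python
section_a_students = 150
section_b_students = 175   # Made-up value for illustration
section_c_students = 200   # Made-up value for illustration

# Hard to read: no spaces around operators.
total_cramped=section_a_students+section_b_students+section_c_students

# Easier to read: spaces around operators, and the long line is broken up with \.
total_students = section_a_students \
    + section_b_students \
    + section_c_students

# Parentheses also let one expression span several lines.
total_in_parentheses = (
    section_a_students
    + section_b_students
    + section_c_students
)

print(total_cramped, total_students, total_in_parentheses)   # 525 525 525
```

All three versions compute the same number. Only the readability changes.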

In [None]:
# Comment your code like this!
section_a_students = 150
total_students = 525 # Total number of students in all sections of the class.

### Thank You!

- There is no lecture or discussion next Monday, January 15 (MLK Day).
- In two weeks, we will have our first quiz in discussion. You must attend your enrolled discussion section or request a change on the [Welcome Survey](https://forms.gle/ggWCXXVvL6CREbrX7).
- If you have any questions, feel free to ask after your discussion section, during office hours, or on Ed!