# DS 3000 lecture 0

- Introduce myself
- Motivating clear communication in DS
- Admin/syllabus
- Jupyter (Google Colab also acceptable)
    - markdown
    - code    
    
- Python lightning review (brush up on our skills & pick up a few new tricks)
    - types (tuple, lists, dict, floats & ints, string)
        - tuple unpacking
        - list indexing
        - string formatting & operations
    - if statement & comparison operators
    - iteration (for & while loops)
        - iterating through dict
    - functions
        - default arguments
        - multiple return values (or is it?) ... tuple unpacking

# About me

- BS in Statistics and Math at University of Wisconsin-Madison
- Master's and PhD in Statistics at Carnegie Mellon University
- Tenure-track Assistant Professor at Creighton University
- Thesis: Learning social networks from text data using covariate information
- Teaching: DS3000 Foundations of Data Science; DS4200 Information presentation and data visualization
- New faculty, still learning and happy to communicate!

# Motivating Clear Communication in DS

Let's look at a small example from statistics.

Suppose we want to study the gender gap in DS. At the end of the semester, I calculate the average score for male and female students separately. It turns out that the average for male students is 84 and the average for female students is 86. Can I claim that male students tend to perform worse than female students in the DS major? If not, what other questions do I need to ask before I draw that conclusion?

# Admin/Syllabus

Take a look at the course Canvas page.

- Please turn on Canvas announcement notifications.
- The first homework will be assigned on July 2nd and is due on the 7th.
- Homework will be graded on both effort and correctness.
- Homework is graded through Gradescope, and we will use Piazza to answer questions. Make sure you sign up for Piazza.
- The project is a group project. I will assign the groups based on your pre-course survey. Please finish the pre-course survey by midnight tonight!

# Jupyter Notebooks

Jupyter contains two types of cells (shown in these blue / green rectangles):
- markdown
    - markdown is a simple text/document formatting language
- python cells
    - a Python interpreter runs in the background and keeps track of all Python variables, functions, etc.
    
By merging both, Jupyter provides a 'living' document which includes:
- results of analysis
- method of how analysis was done (the code)
- the ability to easily tweak a few things and poke around in an analysis


    
# Installing Python and Jupyter Notebook  
  
A detailed installation manual is included in the `supplemental` folder on Canvas. Please let the TA know if you encounter any problems during installation.   
  
In the terminal type:

`pip install notebook`

Then to run Jupyter Notebook in the browser, type in the terminal:

`jupyter notebook`

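**Note**: `jupyter notebook` opens a file browser rooted at the folder where you ran the command, so run it from the folder that contains your `.ipynb` notebook file (or from a parent folder).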

# Navigating Jupyter

- selecting a cell
- changing cell type
- running a cell
    - for markdown cell: renders text
    - for python cell: runs the code
- add a cell
- remove a cell


# Markdown Rundown

Let's go through [this markdown guide](https://www.markdownguide.org/basic-syntax/)

## Headings

## Lists

## Links
You can link to a website, like [this one](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet), which contains a more complete markdown reference (it was used to generate this quick markdown guide, and many examples are taken from it).

## Images

![alt text](http://dangerouslyirrelevant.org/wp-content/uploads/2016/03/2015-Gallup-Student-Poll-1-3.jpg "Logo Title Text 1")

## Tables

| Car Repair                         | Cost ($) | Prob | Salted Roads? |
|------------------------------------|----------|------|---------------|
| None                               | 0        | .9   | No            |
| Oxygen sensor replacement          | 250      | .01  | No            |
| Under car rust repair              | 1000     | .02  | Yes           |
| Timing Belt Replacement            | 750      | .03  | No            |
| Fuel cap replacement or tightening | 25       | .03  | No            |
| rusted muffler repair              | 250      | .01  | Yes           |

Tables can be tough to generate by hand; go ahead and use a [table generator](https://www.tablesgenerator.com/markdown_tables) online to save yourself some time.

## Block quote 

## Python code for display (not for running)  


## Latex Math


# Exercise 1 (Part of your Homework 1)

Write an introduction. It can be your biography or anything you want to introduce. Be sure to use:
- 2 different heading levels
- a list
- a link to some website
- an image
    - avoid pictures of yourself please
    - link to something available online, see example above

You're welcome to be funny; this is really an excuse to get warmed up with Jupyter and markdown and to meet each other. Feel free to share with each other once you finish. 

Please be mindful:
- everything you share should make all classmates feel safe and welcome
- your response should be positive; take a moment to make somebody else smile and feel good :)

# Jupyter Output

# The Jupyter-Python Gotcha

The state of variables and functions may depend on previous cells which have since been modified or deleted.

This can be problematic because `.ipynb` files are saved with the outputs of each cell!

Mitigate the issue by:
- observing the index (idx) in `In [idx]` and `Out [idx]`, which shows the order in which cells were executed

Best practice:

- Run a fresh `Kernel>Restart & Run All`
    - before sharing
    - when debugging

**Note**: this is required of all your submissions for this class
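
To make the gotcha described above concrete, here is a minimal sketch written as a single script, with comments marking hypothetical cell boundaries. The names `scale` and `scaled` are illustrative and not from the lecture, and `del scale` only stands in for what happens after a kernel restart once the defining cell has been deleted.

```python
# Cell 1 (imagine this cell is later deleted from the notebook)
scale = 10

# Cell 2: silently depends on `scale` from Cell 1
def scaled(x):
    return x * scale

# This works, because the running kernel still remembers `scale`,
# even if Cell 1 is no longer visible in the notebook.
before_restart = scaled(3)
print("before restart:", before_restart)

# Simulate a fresh kernel in which Cell 1 no longer exists.
del scale

try:
    scaled(3)
    after_restart = "ok"
except NameError as err:
    after_restart = "NameError"
    print("after restart:", err)
```

Running `Kernel>Restart & Run All` surfaces this kind of hidden dependency before anyone else opens the notebook.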


# Python lightning review

- brush up on our skills 
- pick up a few new tricks

This is not intended to be an introduction to these topics, but a quick refresher. Please let me know if you have trouble following any of the topics here. 

Also note:
- this review will be quicker paced than when we go over new material
- please interrupt me (raise your hand or just speak up!), questions make class much more fun and tailored to your needs

# Types

- int/floats
- tuple: an immutable sequence of objects
- list: a mutable sequence of objects, can be sorted
- dict: a mutable mapping between objects
- string
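
As a quick illustration of the types listed above, the sketch below checks a few types and contrasts a mutable list with an immutable tuple. All variable names and values are made up for the example.

```python
n_students = 30      # int
avg_score = 84.5     # float
print(type(n_students), type(avg_score))

# `/` always produces a float, `//` is floor division
half = 7 / 2
floor_half = 7 // 2
print(half, floor_half)

# lists are mutable: items can be reassigned in place
scores = [86, 84, 90]
scores[0] = 88

# tuples are immutable: item assignment raises a TypeError
point = (1, 2)
try:
    point[0] = 5
    tuple_changed = True
except TypeError as err:
    tuple_changed = False
    print("tuples are immutable:", err)
```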


## Tuples
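
A minimal sketch of tuple unpacking, one of the tricks named in the outline. The `student` record and its field order are hypothetical.

```python
# a tuple groups related values together
student = ("Ada", 3000, 91.5)   # hypothetical (name, course, score)

# tuple unpacking: one variable per item, in order
name, course, score = student
print(name, course, score)

# unpacking also gives a compact way to swap two variables
a, b = 1, 2
a, b = b, a

# a starred name collects "everything else" into a list
first, *rest = (10, 20, 30, 40)
print(first, rest)
```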

## Lists
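
A short sketch of list indexing, slicing, and sorting. The values in `numbers` are arbitrary.

```python
numbers = [40, 10, 30, 20]

first = numbers[0]       # indexing starts at 0
last = numbers[-1]       # negative indices count from the end
middle = numbers[1:3]    # slice: start included, stop excluded

numbers.append(50)       # lists are mutable, so they can grow

ordered = sorted(numbers)     # returns a new sorted list
numbers.sort(reverse=True)    # sorts the existing list in place

print(first, last, middle)
print(ordered)
print(numbers)
```

Note the difference between `sorted(...)`, which leaves the original list untouched, and `.sort()`, which modifies it.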

## Dictionaries

A real life dictionary assigns a definition (value) to every word (key).

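Python dictionaries assign a value to every key; keys are unique, but different keys may share the same value.  
(and they're not sorted alphabetically like real dictionaries! Since Python 3.7 they keep insertion order instead.)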
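
A minimal sketch of building and querying a dictionary, reusing the word/definition analogy. The dictionary names and contents are made up for illustration.

```python
# keys are the "words", values are the "definitions"
definitions = {
    "tuple": "immutable sequence",
    "list": "mutable sequence",
}

# add a new key, then look one up
definitions["dict"] = "mutable mapping"
print(definitions["list"])

# .get() returns a fallback instead of raising KeyError
missing = definitions.get("set", "not defined yet")

# membership tests check the keys
has_tuple = "tuple" in definitions

# two different keys may map to the same value
grades = {"alice": "A", "bob": "A"}

# keys come back in insertion order, not alphabetical order
key_order = list(definitions)
print(key_order)
```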

# Strings

Python has awesome [string manipulation methods](https://docs.python.org/3/library/stdtypes.html#string-methods); we'll highlight a few useful ones here.  It's worth a few minutes to familiarize yourself with the link.

(tip: handling file paths?  use [pathlib](https://docs.python.org/3/library/pathlib.html) instead of treating them as strings)

In [None]:
url = 'https://www.some-website.com/this-section/this-subsection/file_<useful-thing>_gibberish-here-too.html'
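
One possible walk through a few string methods, using the `url` from the cell above (redefined here so the sketch runs on its own). Which methods get highlighted in lecture may differ; this is only an illustration.

```python
url = 'https://www.some-website.com/this-section/this-subsection/file_<useful-thing>_gibberish-here-too.html'

# split on '/' and keep the last piece (the file name)
parts = url.split('/')
domain = parts[2]
filename = parts[-1]

# split the file name on '_' to isolate the useful part
useful = filename.split('_')[1]

# simple checks and replacements
is_https = url.startswith('https')
is_html = filename.endswith('.html')
as_txt = filename.replace('.html', '.txt')

# strip unwanted characters from both ends, then change case
cleaned = useful.strip('<>').upper()

# f-strings are a convenient way to format output
print(f"domain: {domain}")
print(f"useful part: {useful} -> {cleaned}")
```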

# Control Flow (If statements)

In [None]:
# write statements to decide whether a given number is larger than a boundary
x = 3
boundary = 10
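
One way the prompt above could be answered is sketched below, using `if` / `elif` / `else` with comparison operators. The `message` variable is an addition for the example.

```python
x = 3
boundary = 10

if x > boundary:
    message = f"{x} is larger than {boundary}"
elif x == boundary:
    message = f"{x} is equal to {boundary}"
else:
    message = f"{x} is smaller than {boundary}"

print(message)
```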

# Iteration (loops)

In [None]:
# print the integers from 0 to 4


In [None]:
# describe what this code is doing
for idx in range(5):
    print(idx)
    if idx > 2:
        break

In [None]:
# describe what this code is doing
for idx in range(5):
    if idx==2:
        continue
    print(idx)    
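
The outline also mentions `while` loops and iterating through a dict. The sketch below shows one way to do each. The lists that collect values exist only to make the results easy to check, and the `costs` dictionary borrows two rows from the car repair table earlier in these notes.

```python
# a for loop over range(5) visits 0, 1, 2, 3, 4
for_values = []
for idx in range(5):
    for_values.append(idx)

# the same iteration as a while loop: we manage the counter ourselves
while_values = []
idx = 0
while idx < 5:
    while_values.append(idx)
    idx += 1

print(for_values, while_values)

# .items() yields (key, value) pairs, unpacked in the loop header
costs = {
    "Oxygen sensor replacement": 250,
    "Timing Belt Replacement": 750,
}
total = 0
for repair, cost in costs.items():
    print(f"{repair}: ${cost}")
    total += cost
print("total:", total)
```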

# Functions

In [None]:
# write a function that returns its input raised to the power 3
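
One possible version of such a function is sketched below. It also uses a default argument, which the outline lists under functions. The name `power` and the `exponent` parameter are choices made for this example.

```python
def power(base, exponent=3):
    """ raise a number to a power

    Args:
        base (float): number to be raised
        exponent (int): power to raise to (defaults to 3)

    Returns:
        result (float): base raised to exponent
    """
    return base ** exponent

cubed = power(2)                  # uses the default exponent of 3
squared = power(2, exponent=2)    # overrides the default
print(cubed, squared)
```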


In [None]:
# You can also return multiple things
def nonsense_fxn(some_list, some_int, some_float):
    """ a nonsense function
    
    Args:
        some_list (list): a list
        some_int (int): integer
        some_float (float): float
        
    Returns:
        list_truncate (list): some_list, truncated to first some_int
            items
        float_scaled (float): some_float, multiplied by some_int
    """
    # truncate list
    list_truncate = some_list[:some_int]
    
    # scale float
    float_scaled = some_float * some_int
    
    return list_truncate, float_scaled
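
The "(or is it?)" in the outline hints that a function returning "multiple things" really returns one tuple, which the caller may unpack. A minimal sketch with a smaller, hypothetical function `min_and_max`:

```python
def min_and_max(values):
    """ return the smallest and largest item of a list """
    return min(values), max(values)

# without unpacking, the result is a single tuple
result = min_and_max([3, 1, 4, 1, 5])
print(type(result), result)

# with tuple unpacking, each item gets its own name
lowest, highest = min_and_max([3, 1, 4, 1, 5])
print(lowest, highest)
```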

### Exercise 2:

Here is a list of numbers:
    
    numbers = [10, 15, 20, 25, 30, 35, 40, 45, 50]

We want to: 
1. Print all the even numbers from the list.
2. Print all the odd numbers from the list.
3. Print the number of even numbers.
4. Print the number of odd numbers.


### Exercise 3: 

Once you finish Exercise 2, think about how to turn it into a function that takes a list as input and outputs all four things. 