# UBC
## Programming in Python for DS

June 07, 2023  

Instructor: Socorro Dominguez Vidana

## How does this Course Work?

4 Main Components:
- [Course Material](https://prog-learn.mds.ubc.ca/en/)
- JupyterHub for assignments
- Piazza for Questions
- Office Hours: 2 hours a week, recorded (attendance is optional)

### Office Hours Expectations

- Students **must** bring in questions:
    - On Mondays, we will do a Module preview.
    - On Wednesdays, Office Hours rely entirely on the questions you bring.
<br> 

- Try to keep your mic muted but your camera on (at least when you are speaking). Stay engaged.

- Although attendance is optional, you are strongly encouraged to at least watch the recordings, since announcements may be made during Office Hours.

- To ask a question in Office Hours, be ready to share your screen and walk us through where you are stalled. If you have worked through the solution partially, also give us a tour :) 

### Piazza Expectations

- Piazza is similar to a website called Stack Overflow.

- Its main purpose is to share questions and answers among students and instructors.
    - Collaboration among all is extremely important.

- Try working alone first on a problem. If you have worked for 15-30 minutes and you cannot figure it out, **ASK** on Piazza. If you have that question, it is very likely other students are facing the same problem. And if someone finds the answer first, they can share it with you.

- Are you shy? Don't worry, you can post anonymously.

- Post questions related to course content on Piazza. If you email the instructors, you might wait longer for an answer, and the discussion then reaches only you.

- A good question involves:
    - The question that you are trying to solve.
    - What you have tried doing (it is totally OK to post code).
        - For posting code, use three backticks ``` before the code and three backticks after the code. Copy and paste your code rather than posting screenshots, so that others can try to run it too.
            ```python
            print("hello")
            ``` 
    - The error message that you got.
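
As a rough illustration of the last two points, the sketch below pairs a tiny runnable snippet with the error message it produces. The undefined name `hello` and the variable `error_message` are made up for this example; `traceback.format_exc()` comes from the standard library and returns the same text Python would normally print.

```python
import traceback

try:
    # Bug on purpose: hello is an undefined name, not the string "hello".
    print(hello)
except NameError:
    # Capture the full traceback text, the part worth pasting into Piazza.
    error_message = traceback.format_exc()

print(error_message)
```

Pasting both the snippet and the printed traceback gives others everything they need to reproduce the problem.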

### Assignment Expectations
- You will have one assignment per week. 
    - The first submission is on Sunday (the first assignment is more of a walkthrough).  
    <br> 

- **Do not** change the names of the files. It might be tempting to add your name, but please do not, as this may cause problems with grading.

- Avoid uploading data or material unrelated to the course to the server. This might crash the server.

- Try to submit on time. We will talk more about late submissions next week. But try to stay on track as much as possible.

- All assignments (except for the Final Project) must be submitted on the server only. To submit, simply save your file changes; it is as simple as that.

### Hints on How to Navigate the Course

- Read the Goals of the Module.

- Read the Assignment (so that you have an idea of what you are expected to do)

- Watch the videos and solve the exercises.

- Instead of waiting to solve the whole assignment on the weekend, code along and try solving a bit every day.

- Try working on a problem for 15-30 minutes before asking on Piazza.

- If Piazza is not enough, bring your question to Office Hours.
    - Maybe you got the right answer but don't understand how/why it works. Bring the problem and we will discuss it.

### What is Python?

- Python is a widely used general-purpose, high-level programming language.

- Designed by Guido van Rossum and first released in 1991; now developed under the stewardship of the Python Software Foundation.

- Developed to allow programmers to express concepts in fewer lines of code.

- Object-oriented programming language (can model real-world entities).

- Dynamically typed and interpreted: we don't need to compile it ourselves before running it.

- Python 3 was released in December 2008.
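
To make "dynamically typed" concrete, here is a minimal sketch (the variable names are only illustrative). The same name is bound first to an integer and then to a string, and the code runs directly without a separate compile step.

```python
# The same name can be bound to values of different types over time.
value = 3
first_type = type(value).__name__

value = "three"
second_type = type(value).__name__

print(first_type, second_type)  # int str
```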

#### Python's Fun Facts

- First introduced in 1991 at the National Research Institute for Mathematics and Computer Science (CWI) in the Netherlands. [source](https://www.journaldev.com/34415/history-of-python-programming-language)

- Named after the comedy show Monty Python's Flying Circus (it's on Netflix).

- Python has become one of the most popular programming languages in the world.
    - This makes a career in Python a great choice. Not just for Data Science/Analytics.

- Python is now over 30 years old, but:
    - Google users have searched for "Python" much more than they have searched for Kim Kardashian, Donald Trump, or Tom Cruise.
    [source](https://trends.google.com/trends/?geo=CA)

### Why Python for Data Science?

- Fast programming language to pick up - from a syntax point of view.
    - We will use Python mainly by writing functions rather than in an object-oriented (OOP) style.

- Active community with a vast selection of libraries (such as pandas and Altair) and resources.

- Professionals working with Data Science applications want to focus on insights rather than on the complications of the language.
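
As a small sketch of the function-based style mentioned above (the function name `average` and the data are made up for illustration), a computation is wrapped in a plain function instead of a class:

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


scores = [80, 90, 100]  # made-up data
result = average(scores)
print(result)  # 90.0
```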

### What is Jupyter?

- It is a web-based interactive development environment, used much like an IDE (integrated development environment).

- We can use Python via Jupyter.

- You can think of Python like a car's engine, while Jupyter is like a car's dashboard.
    - Python is the programming language that runs computations
    - Jupyter is the IDE that provides an interface by adding convenient features and tools.
    
- We can use other programming languages with Jupyter (R, Julia, MATLAB, ...).

### Why JupyterLab?

- In Jupyter, we can write code, make plots, and format text and equations in a single document.

- Allows us to run Python code interactively.

- Notebooks are great for exploration and for documenting a complete workflow.

- Notebooks can be shared in a human-readable format:
    - Share online with nbviewer.jupyter.org
    - GitHub: any notebooks you upload are automatically rendered on the site.
    - Convert to HTML, PDF, etc.

### *Course Requirements?*

For this course, you do not need to install anything.  

The Jupyter server that loads when you start an assignment suffices for this course.

If you want to install it on your computer, follow these instructions:
[MDS Installation Guide](https://ubc-mds.github.io/resources_pages/installation_instructions/)

### Characteristics of Notebooks

- A notebook consists of a series of "cells":
    - Code cells: execute snippets of code and display the output
    - Markdown cells: formatted text, equations, images, and more

In [1]:
```python
# Code Cell

x = 3
x + 6
```

```
9
```

```python
# Markdown Cell
```

$x + y = 10$

Note: By default, a new cell is always a code cell.

### Python Data Science Ecosystem

- Python has many uses: 
    - Web development
    - Automation or scripting
    - Software testing and prototyping
    - Everyday tasks
    - Data Analysis & Data Science
    
- Python has built-in functions, but they are not enough for us, and we don't want to reinvent every function ourselves.

- The Python libraries for data science are developed and maintained by external "3rd party" development teams.

- Python core + 3rd party libraries = **ecosystem**
    - To install and manage 3rd party libraries, you need to use a package manager such as conda (which comes with Anaconda/Miniconda) - More on this in the DS Toolbox
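
The difference between built-ins and libraries can be sketched as follows. To keep the example runnable without installing anything, it uses the standard-library `statistics` module as a stand-in; third-party libraries such as pandas or NumPy are imported the same way once they are installed. The data are made up.

```python
import statistics  # standard library, used here as a stand-in for pandas/NumPy

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data

# Built-in functions cover the basics...
mean_by_hand = sum(values) / len(values)

# ...while a library gives us ready-made, tested functions.
mean_from_library = statistics.mean(values)
spread = statistics.pstdev(values)  # population standard deviation

print(mean_by_hand, mean_from_library, spread)
```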
    

Some of the libraries in the Python data science ecosystem:

![](ecosystem_big.png)

During the program, we will be working with pandas, NumPy, and Altair.

## Tricks with Notebooks

In [2]:
```python
# This is a code cell

x = 5
3 + x  # Shows output
```

```
8
```

In [3]:
```python
y = 3
```

### Writing a formula 
- Render with LaTeX using `$`

Write

>```markdown
>$x + y = 8$
>```

The output is: 

>$x + y = 8$

- Loading an image:

```markdown
![](image_path)
```

To write a chunk of code in a Markdown cell (displayed but not executed), type:
```markdown
    ```python
        print("hello world!")
    ```
```


This renders as:

```python
print("hello world!")
```

Write variable names from the document between backticks:

`x`

In [5]:
```python
z + y
```

```
8
```

In [4]:
```python
z = 5
```
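
The bracketed numbers above record execution order: `z = 5` ran as In [4] before `z + y` ran as In [5], even though it sits lower on the page. Below is a minimal sketch of what would likely happen if the same cells were run top to bottom in a fresh kernel. The `try`/`except` wrapper and the names `first_attempt` and `second_attempt` are additions for illustration only.

```python
y = 3

# Running the "z + y" cell first fails, because z does not exist yet.
try:
    first_attempt = z + y
except NameError:
    first_attempt = None

# After the "z = 5" cell has run, the same expression works.
z = 5
second_attempt = z + y

print(first_attempt, second_attempt)  # None 8
```

Keeping cells in the order they need to run makes a notebook easier for others to re-run from top to bottom.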