# Introduction to Python

## What is Python?
Python is a versatile programming language known for its readability and wide range of applications, from web development to data analysis. It's often the first choice for beginners due to its simple syntax and extensive community support.

## What is JupyterLab?
JupyterLab is the next-generation web-based interactive development environment for notebooks, code, and data. It extends the classic Jupyter Notebook interface with flexible layouts, file browsers, terminals, and rich support for many file types — all inside a single integrated workspace.

## The Jupyter Ecosystem (“JupyterVerse”)
Jupyter is more than just notebooks — it’s a whole ecosystem of tools designed to make interactive computing easier and more productive:

- Jupyter Notebook: The original web app for interactive coding in notebooks.
- JupyterLab: A powerful, flexible interface that builds on notebooks with IDE-like features.
- JupyterHub: A multi-user server for running Jupyter for teams or classrooms on shared infrastructure.
- Jupyter Kernels: Language backends that run your code — Python is the most common, but kernels exist for R, Julia, and many others.
- nbconvert: A tool to convert notebooks to HTML, PDF, slides, and more.
- Voila: Turn notebooks into standalone web apps without code cells visible.
- Quarto: A publishing system built on Pandoc that can run code through Jupyter, letting you create scientific documents, presentations, and websites from .qmd files.

### Key Features of JupyterLab:
- Multi-Document Interface: Work with multiple notebooks, terminals, text editors, and output panels side by side.
- Extensible: Install extensions for Git integration, debugging, variable inspectors, and more.
- Supports Notebooks and More: Open .ipynb notebooks, Markdown files, CSVs, JSON, images, and even Quarto .qmd documents.
- Code Consoles: Run snippets interactively without cluttering your notebooks.
- Real-Time Collaboration: Some extensions enable live sharing and collaboration.

# How to Get Started with JupyterLab
## Installing and Launching JupyterLab

There are several ways to [install JupyterLab](https://jupyter.org/install) on your computer. Choose the method that works best for you.

---

1. Install JupyterLab directly with `pip`

If you already have Python installed, you can install JupyterLab using `pip`:

```bash
pip install jupyterlab
```
After installation, you can launch JupyterLab from the terminal (Mac/Linux) or Command Prompt (Windows):
```bash
jupyter lab
```

This will open JupyterLab in your default web browser.

2. Install JupyterLab using Anaconda

[Anaconda](https://www.anaconda.com/docs/getting-started/anaconda/install) is a distribution of Python that comes with many scientific and data analysis packages pre-installed.

- Download and install Anaconda for your operating system.

- Open Anaconda Navigator (the app you installed).

- From Navigator, you can launch JupyterLab with a single click.

**Notes**:

- JupyterLab runs locally in your browser — your notebooks are stored on your computer.

- You only need to install it once. After that, run `jupyter lab` (or use Anaconda Navigator) to open it again.


## Creating and Activating Your Virtual Environment

1. We recommend creating a dedicated environment for the class to keep dependencies organized. From your terminal:

```bash
python -m venv dsci401
# Activate the environment with ONE of the following, depending on your system:
source dsci401/bin/activate    # macOS/Linux
dsci401\Scripts\activate       # Windows
```

2. Install required packages:

```bash
pip install jupyter ipykernel pandas numpy matplotlib
```

3. Linking Your Environment to JupyterLab
Register your environment as a kernel so you can select it inside JupyterLab:
```bash
python -m ipykernel install --user --name=dsci401 --display-name "Python (DSCI 401)"
```
4. Launch JupyterLab in the Class Folder

    1. Navigate to the course folder where you want to keep all notebooks and files:
       ```bash
       cd path/to/DSCI401
       ```

    2. Start JupyterLab:
       ```bash
       jupyter lab
       ```

    3. When JupyterLab opens, choose the Python (DSCI 401) kernel from the top-right dropdown.
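
Once a notebook is open, one way to confirm that it is running on the class environment is to ask Python which interpreter it is using. This is a minimal sketch: the paths printed depend on where you created `dsci401`, so no expected output is shown, and the `in_virtual_env` variable name is only illustrative.

```python
import sys

# Path of the Python interpreter that the current kernel is running.
# If the DSCI 401 kernel is selected, this path should point inside
# the dsci401 environment folder (an assumption about your setup).
print(sys.executable)

# Root folder of the active environment.
print(sys.prefix)

# True when running inside a venv-style virtual environment.
in_virtual_env = sys.prefix != sys.base_prefix
print("Inside a virtual environment:", in_virtual_env)
```

If the printed path does not mention `dsci401`, the notebook is probably attached to a different kernel.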

Recommended Workflow
- Organization: Work inside the DSCI401 folder. Create subfolders for each week/project (e.g., week1/, week2/).

- Saving Work: Save your notebook frequently with Ctrl + S or Cmd + S.

- Using Markdown: Use Markdown cells to document your code and explain your analysis.

- Shortcuts: Get familiar with Jupyter shortcuts like Shift + Enter to run cells.

# Layout and Interface

- **Main Work Area**: This is the central space where you edit and run your files. You’ll usually be working in a notebook (.ipynb) or a script (.py). Each file opens in its own tab, similar to a web browser.

- **Launcher**: When JupyterLab first opens, you’ll see the Launcher tab. From here you can create new notebooks, text files, terminals, or open consoles.

- **File Browser** (Left Sidebar): On the left-hand side is a file navigation pane. This lets you browse your directories and open files, much like a file explorer.

- **Sidebar Tabs**:
Also in the left sidebar, you can switch between tools such as the Running Terminals and Kernels panel (shows which notebooks and processes are active), the Table of Contents (if enabled), and the Extension Manager (you probably won't need it for now).

- **Kernel**:
Each notebook runs on a kernel, which is the computational engine that executes your code. For Python notebooks, you’ll see a Python kernel (e.g., Python 3). The current kernel is shown in the top-right of the notebook window. If your code seems “stuck” or you need to restart fresh, you can use the Kernel menu at the top to restart, interrupt, or change kernels. *Our class uses the DSCI 401 kernel you just created.*

- **Variables/Inspector Panel**:
On the right-hand side, you may see a tab called Variable Inspector (if installed), which shows the current variables in memory and their values. If you don’t see it, variables can still be printed directly in the notebook by calling them in a code cell.

- **Status Bar (Bottom)**:
The bottom of JupyterLab shows information such as which kernel is running, whether the notebook is saved, and background activity (like code execution).

# Common File Types in Data Science

When working in JupyterLab (or in general data science projects), you’ll often run into a variety of file types. Each serves a different purpose:

1. Notebook files (.ipynb)

- These are Jupyter Notebook files.

- Contain a mix of code cells, Markdown (text) cells, and outputs (tables, plots, etc.).

- Great for analysis, documentation, and reproducibility.

- Example: analysis.ipynb

2. Python scripts (.py)

- Plain Python code files.

- Often used to hold reusable functions or longer code that doesn’t need to live inside a notebook.

- You can import them into a notebook with:

```python
import script_name
```
- Example: helper_functions.py

3. Data files

- CSV (.csv) – Comma-separated values, a very common format for datasets.

- Excel (.xlsx) – Spreadsheet files; can be read with pandas.

- JSON (.json) – Structured data often used for web APIs.


4. Configuration and environment files

- requirements.txt – Lists Python packages your project needs.

- environment.yml – Same idea, but used with Conda environments.

- These help recreate the same setup across different computers.

5. Other text-based files

- Markdown (.md) – Simple formatting for documentation (used in GitHub READMEs).

- Text (.txt) – Basic text files.

- Quarto/LaTeX (.qmd, .tex) – For reports, articles, and presentations.

General Workflow:
- Use notebooks (.ipynb) to explore and explain.

- Use scripts (.py) to organize and reuse code.

- Store your datasets in formats like CSV/Excel/Parquet.

- Document your project with README.md and environment files.

# Jupyter Notebook Toolbar Buttons (top of a notebook)

1. Play / Run Cell Button ▶️

- Runs the code in the current cell and moves to the next one.

- Equivalent to pressing Shift + Enter.

2. Stop / Interrupt Kernel ⏹ / ⏸️

- Stops code that is currently running.

- Useful if you accidentally run an infinite loop or a very long computation.

3. Restart Kernel 🔄

- Stops and restarts the kernel, clearing all variables in memory.

- Use when things get messy or you want to start fresh.

4. Bug / Debug Icon 🐞 

- Runs the cell in debug mode.

- Opens a panel showing all current variables and their values.

- Lets you step through your code line by line, set breakpoints, and inspect variables — very handy for troubleshooting.

5. Other common toolbar buttons

- Save 💾 – Saves the notebook. Shortcut: Ctrl+S / Cmd+S.

- Add Cell ➕ – Adds a new code or Markdown cell.

- Cut / Copy / Paste Cells ✂️📋 – Reorganize your notebook cells.

- Move Cell Up/Down ⬆️⬇️ – Rearrange cells without cutting/pasting.

- Cell Type Dropdown – Switch a cell between Code, Markdown, or Raw.

# Packages
In Python, **packages** are collections of modules that provide additional functionality and tools, making it easier to perform specific tasks without having to write code from scratch. A package typically contains various functions, classes, and variables that you can use directly by importing the package into your Python script.

Here are some commonly used packages:

-   **`math`**: This standard-library module provides mathematical functions and constants, such as `sqrt` for square roots, `sin` for sine functions, and `pi` for the constant π.

-   **`numpy`**: Short for Numerical Python, this package is essential for scientific computing. It offers support for large, multi-dimensional arrays and matrices, along with a wide variety of mathematical functions to operate on these arrays.

-   **`pandas`**: This package is widely used for data manipulation and analysis. It allows for easy handling of structured data through its powerful data structures like DataFrames, which are similar to tables in a database or Excel spreadsheet.

## What is a Kernel and Why Does It Matter?
When you run Python code in JupyterLab (or any Jupyter interface), it runs inside a kernel — a computational engine that executes your code and keeps track of all variables and imports.

Each kernel is tied to a specific Python environment, meaning it only has access to the packages installed in that environment.

This is why:
- If you open a notebook and use a kernel linked to a virtual environment where numpy or pandas is not installed, you will get an error when you try to import those packages.

- Conversely, if you select the kernel connected to the environment where all required packages are installed (like the Python (DSCI 401) kernel we set up), your notebook will run smoothly.

### How This Works in This Course
- You will create a dedicated Python environment for this class (using venv or Conda).

- You will install the packages we use (e.g., numpy, pandas) inside this environment.

- You will register this environment as a kernel named Python (DSCI 401).

When working in JupyterLab, always select this kernel for your notebooks to ensure you have access to all the packages you need.
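
A quick way to see whether the selected kernel can find the course packages is to look them up before importing them. The sketch below is one possible check, not part of the official setup: `missing_packages` is a hypothetical helper name, and the package list is taken from the install step earlier in this document.

```python
import importlib.util

def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Packages installed for this course (from the pip install step).
required = ["numpy", "pandas", "matplotlib"]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
    print("Check that the Python (DSCI 401) kernel is selected.")
else:
    print("All required packages are available.")
```

`importlib.util.find_spec` returns `None` for a top-level package that is not installed in the current environment, so nothing is imported until you know it is available.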

In [1]:
# Importing necessary packages
import math  # Math functions like sqrt, sin, etc.
import numpy as np  # For array and matrix operations
import pandas as pd  # For working with data frames



# File Types and Comments in Python

## Understanding File Types

In Python, working with different file formats is a common task, especially when dealing with data analysis and storage. Here are some of the most common file types you might encounter:

-   **CSV Files**:

    -   CSV (Comma-Separated Values) files are plain text files used to store tabular data. They are widely used because they are simple and can be easily read and written by both humans and machines.

    -   Example operations include reading a CSV file into a DataFrame and saving a DataFrame to a CSV file.

-   **JSON Files**:

    -   JSON (JavaScript Object Notation) files are used to store and exchange data in a lightweight and human-readable format. JSON is often used for transmitting data in web applications.

    -   Python allows you to easily read data from a JSON file and write data back into it.

-   **Pickle Files**:

    -   Pickle is a Python-specific format used to serialize and deserialize Python objects. This format is useful for saving complex data structures like DataFrames or custom Python objects, so they can be easily loaded back into your program later.
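
The three formats above can be tried without any existing data files. The following sketch uses only Python's standard library (`csv`, `json`, `pickle`) rather than pandas, writes to a temporary folder, and reads each file back. The sample rows and file names are made up for illustration.

```python
import csv
import json
import pickle
import tempfile
from pathlib import Path

# Hypothetical sample data used only for this illustration.
rows = [{"name": "Wakko", "score": "90"}, {"name": "Dot", "score": "95"}]

with tempfile.TemporaryDirectory() as folder:
    folder = Path(folder)

    # CSV: write the rows, then read them back as dictionaries.
    csv_path = folder / "scores.csv"
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "score"])
        writer.writeheader()
        writer.writerows(rows)
    with open(csv_path, newline="") as f:
        rows_from_csv = list(csv.DictReader(f))

    # JSON: text-based and readable by most languages.
    json_path = folder / "scores.json"
    json_path.write_text(json.dumps(rows))
    rows_from_json = json.loads(json_path.read_text())

    # Pickle: binary and Python-specific. Only load pickles you trust.
    pkl_path = folder / "scores.pkl"
    pkl_path.write_bytes(pickle.dumps(rows))
    rows_from_pickle = pickle.loads(pkl_path.read_bytes())

print(rows_from_csv == rows)     # True
print(rows_from_json == rows)    # True
print(rows_from_pickle == rows)  # True
```

The scores are stored as strings on purpose: the `csv` module reads every value back as text, while JSON and pickle preserve numbers.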

Here’s an illustrative example in Python (wrapped in a triple-quoted string so that it is shown but not executed):

In [2]:
'''
# Example: Reading and writing files in Python
# CSV files
df = pd.read_csv('file.csv')
df.to_csv('output.csv')

# JSON files
data = pd.read_json('data.json')
data.to_json('output.json')

# Pickle files (for saving Python objects)
df.to_pickle('data.pkl')
df = pd.read_pickle('data.pkl')
'''

"\n# Example: Reading and writing files in Python\n# CSV files\ndf = pd.read_csv('file.csv')\ndf.to_csv('output.csv')\n\n# JSON files\ndata = pd.read_json('data.json')\ndata.to_json('output.json')\n\n# Pickle files (for saving Python objects)\ndf.to_pickle('data.pkl')\ndf = pd.read_pickle('data.pkl')\n"

# Comments in Python

Comments are an essential part of any programming language, and Python is no exception. They are used to explain the code, making it easier to understand for others (or for yourself when you come back to it later). Python supports two types of comments:

-   **Single-line Comments**:

    -   These begin with a `#` symbol and continue until the end of the line. They are used for brief explanations or annotations within the code.

    -   Example: `# This is a single-line comment`

-   **Multi-line Comments**:

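    -   Python has no dedicated syntax for multi-line comments. A common practice is to wrap the lines in triple quotes (`'''` or `"""`). Strictly speaking, this creates a string literal that Python evaluates and discards rather than a true comment, which is why a notebook echoes it as output when it is the last expression in a cell. It is still useful when you want to temporarily disable a block of code or provide a detailed explanation.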



In [3]:
# This is a single-line comment

'''
This is a multi-line comment
It can span several lines
'''


'\nThis is a multi-line comment\nIt can span several lines\n'
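
Triple-quoted strings have one more common use that is worth knowing: placed as the first statement of a function, they become its documentation string (docstring). The sketch below uses a hypothetical `greet` function to contrast a docstring with a regular `#` comment.

```python
def greet(name):
    """Return a short greeting for `name`."""
    # A real comment: ignored completely by Python.
    return "Hello " + name

print(greet("Dot"))   # Hello Dot
print(greet.__doc__)  # Return a short greeting for `name`.
```

Unlike a `#` comment, the docstring is kept at runtime and is what `help(greet)` displays.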

# Assignment and Variable Types in Python

## Understanding Variable Assignment

In Python, variables are used to store data that can be used and manipulated throughout your code. You don’t need to declare the type of a variable explicitly, as Python is a dynamically typed language, meaning it infers the type based on the value assigned to the variable.

### Example of Variable Types

Here’s an example that demonstrates how Python handles different variable types:

In [4]:
# You do not HAVE to declare types in Python
# But you can. This may help to avoid problems later on.

height = 442  # 'height' is an integer (int)
print(type(height))  # Output: <class 'int'>

height = 442.0  # 'height' is now a floating-point number (float)
print(type(height))  # Output: <class 'float'>

height = 'really tall'  # 'height' is now a string (str)
print(type(height))  # Output: <class 'str'>


<class 'int'>
<class 'float'>
<class 'str'>


In the above example, the variable `height` changes its type based on the value assigned to it. Python allows you to reassign variables to different types without any issues.
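
The comment in the cell above mentions that you can declare types if you want to. One way to do that is with optional type hints, sketched below. The variable `label` is introduced only for this illustration.

```python
# Optional type hints document the intended type of a variable.
height: int = 442
label: str = "really tall"

# Python does not enforce hints at runtime, so this reassignment still runs.
height = 442.0
print(type(height))  # <class 'float'>

# To check a type while the program runs, use isinstance().
print(isinstance(height, float))  # True
print(isinstance(label, str))     # True
```

Hints mainly help readers and external tools such as editors and type checkers. They do not change how the code executes.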

### String Variables and Their Use

Python supports various ways to create and manipulate strings:

In [5]:
a = 'Yes'
b = "Yup"
c = '''
If you're happy and you know it
clap your hands.
Clap your hands.'''

print(a)  # Output: Yes
print(b)  # Output: Yup
print(c)  # Output: The multi-line string


Yes
Yup

If you're happy and you know it
clap your hands.
Clap your hands.


Here, the variables `a`, `b`, and `c` are all strings, even though they are created using different quote styles (`'`, `"`, and `'''`). Python supports single, double, and triple quotes for string creation. Triple quotes are particularly useful for multi-line strings.
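
As a small illustration of manipulating these strings, the sketch below reuses `c` from the cell above. It also shows why the printed output starts with a blank line: the text begins on the line after the opening triple quote.

```python
c = '''
If you're happy and you know it
clap your hands.
Clap your hands.'''

# The string starts with a newline character.
print(c.startswith("\n"))   # True

# splitlines() breaks a multi-line string into a list of lines.
lines = c.splitlines()
print(len(lines))           # 4 (the first entry is an empty string)
print(lines[1])             # If you're happy and you know it

# Strings can be joined with + and changed in case with methods.
print("Yes" + " " + "Yup")  # Yes Yup
print("Yes".upper())        # YES
```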

### Importance of Spaces and Case Sensitivity

In Python, spacing inside strings and the case of variable names matter:

In [4]:
# Space outside quotes doesn't matter
a = "Yes"  # This is the same as a ="Yes"

# Space INSIDE quotes DOES matter
a = "Yes"  # This is NOT the same as a = " Yes"

# Python is case-sensitive! These are all different variables
name = 'Wakko'
Name = 'Yakko'
NAME = 'Dot'


Here, `name`, `Name`, and `NAME` are considered different variables because Python is case-sensitive. Additionally, spacing within strings is important, as it affects how the string is stored and displayed.
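
Both points can be checked directly with comparisons. This is a short sketch using the same values as the cell above.

```python
# A leading space makes a different string.
print("Yes" == " Yes")           # False
print(len("Yes"), len(" Yes"))   # 3 4

# strip() removes whitespace from both ends of a string.
print(" Yes ".strip() == "Yes")  # True

# Names that differ only in case refer to separate variables.
name = 'Wakko'
Name = 'Yakko'
print(name == Name)              # False
```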

### Printing Variables and Text

The `print()` function in Python is used to display text and variables:

In [5]:
# Use print to display a single line of text.
print('Hello', name)  # Output: Hello Wakko
print('Hello', Name)  # Output: Hello Yakko
print('Hello', NAME)  # Output: Hello Dot

# Built-in names like 'print' are lowercase, and Python is case-sensitive
print("Hello World")  # Correct usage
# PRINT("Hello World")  # This would raise a NameError

# The default separator in print is a space.
print(name, Name, NAME)  # Output: Wakko Yakko Dot
print('Your name is', NAME)  # Output: Your name is Dot

# Removing spaces between words in the output
print(name, Name, NAME, sep="")  # Output: WakkoYakkoDot


Hello Wakko
Hello Yakko
Hello Dot
Hello World
Wakko Yakko Dot
Your name is Dot
WakkoYakkoDot


The `print()` function can take multiple arguments and, by default, separates them with a space. You can customize the separator using the `sep` parameter, as shown in the last example.

By understanding how to assign variables, the significance of spacing and case, and how to print text and variables, you’ll be able to write more effective and readable Python code.
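
Beyond `sep`, `print()` also accepts an `end` parameter, and formatted strings (f-strings) offer another way to mix text and variables. The sketch below reuses the three names from the earlier cells.

```python
name = 'Wakko'
Name = 'Yakko'
NAME = 'Dot'

# sep sets the text placed between arguments.
print(name, Name, NAME, sep=", ")   # Wakko, Yakko, Dot

# end sets the text printed after the last argument (default: newline).
print("Hello", end=" ")
print(NAME)                         # Hello Dot

# An f-string builds the text first, then prints it.
print(f"Your name is {NAME}")       # Your name is Dot
```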

# Values in Python

## Numbers in Python

Python supports four main types of numbers: Booleans, Integers, Floating-point numbers, and Complex numbers. Each type is used for different kinds of numerical operations and holds specific properties.

### Booleans

Booleans are a simple data type in Python that can hold one of two values: `True` or `False`. These values are useful in logical operations and control flow.


In [None]:
# Booleans
a = True
b = False

# True evaluates as 1 and False as 0
c = 4 + True  # Adds 1 to 4, because True is equivalent to 1
print('c =', c)  # Output: c = 5

d = 0
if d == False:  # Checking if 'd' is equivalent to False
  print('d is False')  # Output: d is False


c = 5
d is False


Booleans can be combined with other data types, such as integers, and used in logical comparisons.
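
The reason `4 + True` works is that `bool` is a subclass of `int` in Python. A short sketch of what follows from that (the list `passed` is made up for the example):

```python
# bool is a subclass of int, which is why True behaves like 1.
print(isinstance(True, int))   # True
print(True + True)             # 2

# Summing booleans counts how many are True.
passed = [True, False, True, True]
print(sum(passed))             # 3

# Zero and empty values are treated as False in conditions.
d = 0
if not d:
    print('d is falsy')        # d is falsy
```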

### Integers

Integers are whole numbers, both positive and negative, including zero. They are of arbitrary size in Python, meaning they can grow as large as your memory allows.

In [None]:
# Integers
a = 37
b = -29875486231854477
c = 0x7fa8  # Hexadecimal representation
d = 0b1000001111  # Binary representation

# Python supports both small and large integer values
print(a, b, c, d)  # Output: 37 -29875486231854477 32680 527


37 -29875486231854477 32680 527
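
To illustrate the arbitrary size of integers and the alternative notations, here is a brief sketch. The variable names `big` and `population` are only for this example.

```python
# Integers grow as needed, so very large results stay exact.
big = 2 ** 100
print(big)           # 1267650600228229401496703205376

# Underscores make long literals easier to read; Python ignores them.
population = 8_000_000_000
print(population)    # 8000000000

# hex() and bin() convert an integer back to a string in that base.
print(hex(32680))    # 0x7fa8
print(bin(527))      # 0b1000001111
```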


### Integer Operations

Python provides various operators and functions to perform arithmetic operations on integers. These operations are fundamental to many programming tasks.

#### Common Integer Operations:

-   **"+" : Sum**\
    Adds two numbers together.

-   **"-" : Difference**\
    Subtracts one number from another.

-   **"\*" : Product**\
    Multiplies two numbers together.

-   **"/" : Quotient**\
    Divides one number by another. The result is a floating-point number.

-   **"//" : Floored Quotient**\
    Divides one number by another and rounds down to the nearest integer.

-   **"%" : Remainder (Modulus)**\
    Returns the remainder of a division operation.

-   **`abs(x)` : Absolute Value**\
    Returns the absolute (non-negative) value of `x`.

-   **`int(x)` : Convert to Integer**\
    Converts a number or string to an integer.

-   **`float(x)` : Convert to Float**\
    Converts a number or string to a floating-point number.

-   **`complex(re, im)` : Complex Number**\
    Creates a complex number with real part `re` and imaginary part `im`.

-   **`c.conjugate()` : Conjugate of Complex Number `c`**\
    Returns the complex conjugate of the number `c`.

-   **`divmod(x, y)` : Pair of `x // y` and `x % y`**\
    Returns a tuple containing the floored quotient and the remainder.

-   **`pow(x, y)` : `x` to the Power of `y`**\
    Raises `x` to the power `y`.

-   **`x ** y` : `x` to the Power of `y`**\
    Another way to raise `x` to the power `y`.


In [8]:
# Integer operations
print("5 + 4 =", 5 + 4)  # Sum: Output 5 + 4 = 9
print("5 - 4 =", 5 - 4)  # Difference: Output 5 - 4 = 1
print("5 * 4 =", 5 * 4)  # Product: Output 5 * 4 = 20
print("5 / 4 =", 5 / 4)  # Quotient: Output 5 / 4 = 1.25
print("5 // 4 =", 5 // 4)  # Floored quotient: Output 1
print("5 % 4 =", 5 % 4)  # Remainder: Output 1
print("abs(-5) =", abs(-5))  # Absolute value: Output 5
print("int(3.14) =", int(3.14))  # Convert to integer: Output 3
print("float(5) =", float(5))  # Convert to float: Output 5.0

# Complex number and conjugate
c = complex(1, 3)
print("Complex number:", c)  # Output: (1+3j)
print("Conjugate:", c.conjugate())  # Output: (1-3j)

# Power and divmod operations
# Note: "2^3" in the labels is math notation; in Python code, ^ is bitwise XOR
print("2^3 =", pow(2, 3))  # Power: Output 2^3 = 8
print("2^3 =", 2 ** 3)  # Power (alternative syntax): Output 2^3 = 8
print("divmod(12, 5) =", divmod(12, 5))  # Output: (2, 2)


5 + 4 = 9
5 - 4 = 1
5 * 4 = 20
5 / 4 = 1.25
5 // 4 = 1
5 % 4 = 1
abs(-5) = 5
int(3.14) = 3
float(5) = 5.0
Complex number: (1+3j)
Conjugate: (1-3j)
2^3 = 8
2^3 = 8
divmod(12, 5) = (2, 2)
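
Two details of these operators tend to surprise newcomers: how `//` and `%` behave with negative numbers, and what `^` means in Python. The sketch below shows both.

```python
# // rounds toward negative infinity, not toward zero.
print(7 // 2)          # 3
print(-7 // 2)         # -4

# % takes the sign of the divisor, so x == (x // y) * y + (x % y).
print(-7 % 2)          # 1
print(divmod(-7, 2))   # (-4, 1)

# int() truncates toward zero instead.
print(int(-3.5))       # -3

# ^ is bitwise XOR in Python, not exponentiation.
print(2 ^ 3)           # 1
print(2 ** 3)          # 8
```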


In [None]:
# Comparisons
a = 1
b = 2
c = 3
if b >= a and b <= c:
  print('b is between a and c')  # Output: b is between a and c


b is between a and c


You can also combine logical conditions using the `and`, `or`, and `not` operators, but I do not recommend it.
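
For reference, this is how the three logical operators behave, together with Python's chained comparison, which expresses the same "between" test as the cell above without an explicit `and`. The values match that cell.

```python
a = 1
b = 2
c = 3

# Chained comparison: equivalent to (a <= b) and (b <= c).
print(a <= b <= c)      # True

# or: True when at least one side is True.
print(b < a or b < c)   # True

# not: flips True to False and False to True.
print(not (b > c))      # True
```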

### Floating Point

Floating-point numbers are used to represent real numbers that include decimal points. They can be written using decimal notation or exponential notation.


In [None]:
# Floating point
a = 37.5
b = 4e5  # Exponential notation (equivalent to 400000.0)
c = 1.34e-10
print('a =', a)  # Output: a = 37.5
print('b =', b)  # Output: b = 400000.0
print('c =', c)  # Output: c = 1.34e-10


Floating-point numbers are not always exact, due to the way they are stored in memory.

In [None]:
# Floating point precision
a = 2.1 + 4.1
print(a == 6.3)  # Output: False
print(a)  # Output: 6.300000000000001


Python provides many functions in the `math` module to work with floating-point numbers, including square roots, trigonometric functions, and logarithms.

In [None]:
import math  # Already imported in the first cell; repeated so this cell runs on its own

x = 4
print(math.sqrt(x))  # Square root: Output 2.0
print(math.sin(x))  # Sine of x: Output -0.7568024953079282
print(math.log(2))  # Natural log of 2: Output 0.6931471805599453
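
One detail worth noting about the cell above: `math.sin(4)` treats `4` as an angle in radians. The sketch below shows how to convert from degrees, along with a few other `math` functions and constants.

```python
import math

# Trigonometric functions expect angles in radians, not degrees.
angle = math.radians(90)   # 90 degrees expressed in radians
print(math.sin(angle))     # 1.0

# math.pi is a built-in constant.
print(math.pi)             # 3.141592653589793

# log2 and log10 cover the most common bases other than e.
print(math.log2(8))        # 3.0
print(math.log10(1000))    # 3.0
```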


### Type Conversion

Python allows you to convert between different types of numbers using built-in functions like `int()`, `float()`, and `complex()`.


In [None]:
# Type conversion
x = 5.0
print(x)  # Output: 5.0
print(type(x))  # Output: <class 'float'>

a = int(x)  # Convert to integer
print(a)  # Output: 5
print(type(a))  # Output: <class 'int'>

b = float(a)  # Convert back to float
print(b)  # Output: 5.0
print(type(b))  # Output: <class 'float'>

pi = '3.14159'
a = float(pi)  # Convert string to float
print(a)  # Output: 3.14159
print(type(a))  # Output: <class 'float'>
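
A few edge cases of these conversions are easy to trip over. The sketch below shows that `int()` truncates rather than rounds, and that a string containing a decimal point has to go through `float()` first.

```python
# int() on a float drops the fractional part; round() rounds to nearest.
print(int(3.99))     # 3
print(round(3.99))   # 4

# int() cannot parse a string that contains a decimal point directly.
try:
    int('3.14')
except ValueError as err:
    print('ValueError:', err)

# Convert the string to float first, then to int.
print(int(float('3.14')))    # 3

# str() converts a number back to text.
print(str(42) + ' is text')  # 42 is text
```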


In [None]:
print