## **Lecture 1: Introduction to Python for Data Science**

### **1. Introduction to Python**

* **What is Python?**
    * High-level, interpreted language
    * General-purpose, but excels in data science
    * Readable syntax, emphasizing code clarity
* **Why Python for Data Science?**
    * **Rich ecosystem of libraries:** NumPy, Pandas, Matplotlib, Scikit-learn, TensorFlow, PyTorch
    * **Large and active community:** Constant development, support, and resources
    * **Versatility:** Can be used for data cleaning, analysis, visualization, machine learning, and more
* **Basic Syntax**
    * **Variables and data types:**
        * Numbers (integers, floats)
        * Strings
        * Booleans
        * Lists
        * Dictionaries
    * **Operators:** Arithmetic, comparison, logical
    * **Control flow:** `if`, `else`, `elif`, `for`, `while`
    * **Functions:** Defining and calling functions

**Example:**

```python
# Simple Python program
print("Hello, world!")

# Variables and data types
x = 10  # Integer
y = 3.14  # Float
name = "Alice"  # String
is_student = True  # Boolean

# List
my_list = [1, 2, 3, "apple", "banana"]

# Dictionary
my_dict = {"name": "Bob", "age": 30, "city": "New York"}

# Control flow
if x > 5:
    print("x is greater than 5")
else:
    print("x is less than or equal to 5")

# Function
def greet(name):
    print("Hello, " + name + "!")

greet("Charlie")
```

Output:

```
Hello, world!
x is greater than 5
Hello, Charlie!
```
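
The example above does not exercise the operators, `elif`, loops, or return values from the syntax list. The following is a minimal sketch of those pieces; the names `classify`, `scores`, and `ages` are made up for illustration and are not part of the course material.

```python
# Arithmetic, comparison, and logical operators
a, b = 7, 2
print(a + b, a - b, a * b)    # 9 5 14
print(a / b)                  # 3.5 (true division always returns a float)
print(a // b, a % b, a ** b)  # 3 1 49 (floor division, remainder, power)
print(a > b and b > 0)        # True


def classify(score):
    """Return a label for a numeric score using if / elif / else."""
    if score >= 90:
        return "high"
    elif score >= 50:
        return "medium"
    else:
        return "low"


# for loop over a list, collecting the returned values
scores = [95, 72, 30]
labels = []
for score in scores:
    labels.append(classify(score))
print(labels)  # ['high', 'medium', 'low']

# for loop over the key/value pairs of a dictionary
ages = {"Bob": 30, "Alice": 25}
for name, age in ages.items():
    print(name, "is", age)

# while loop: repeats until the condition becomes false
countdown = 3
while countdown > 0:
    print(countdown)
    countdown -= 1
```

Unlike `greet`, which only prints, `classify` returns its result, so the caller can store it or pass it on to other code.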


### **2. Setting Up the Environment**

**Virtual Environments**
* **Why use virtual environments?**
    * Isolation of project dependencies
    * Avoid conflicts between different projects
* **Creating a virtual environment with `venv`:**
```bash
python -m venv prog-spring-2025
```
* **Activating the virtual environment:**
    * **Windows:** `prog-spring-2025\Scripts\activate`
    * **Linux/macOS:** `source prog-spring-2025/bin/activate`
* **Installing packages:**
```bash
pip install numpy pandas matplotlib scikit-learn
```
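
To confirm that `pip install` targets the virtual environment rather than the system Python, one option is to ask the interpreter itself. The sketch below assumes the environment was created with `venv` as shown above; the helper name `in_virtual_environment` is hypothetical.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```

With the environment activated, the interpreter path should sit inside the `prog-spring-2025` directory. The same script is a quick way to check which interpreter an editor has selected.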

* **Installing packages with a requirements file:**

Create a file named `requirements-prog-spring-2025.txt`:
```
scikit-learn==1.3.0
numpy==1.24.4
pandas==2.0.3
matplotlib==3.7.2
mlxtend==0.22.0
graphviz==0.20.1
mglearn==0.2.0
future==0.18.3
notebook==7.0.2
ipykernel==6.25.0
shap==0.42.1
category-encoders==2.6.1
seaborn==0.12.2
optuna==3.3.0
ipywidgets==8.1.0
xgboost==2.0.3
plotly==5.14.1
pyarrow==16.1.0
polars==0.20.29
```
Install the pinned packages (make sure your virtual environment is activated):
```bash
pip install -r requirements-prog-spring-2025.txt
```
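
After installation, it can be useful to check that the installed versions match the pins. The following is a rough sketch using the standard-library `importlib.metadata` module (Python 3.8+). The helpers `parse_pins` and `find_mismatches` are hypothetical names, and the comparison is a plain string match that assumes every line has the exact `name==version` form used in the file above.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_pins(text):
    """Parse 'name==version' lines into a {name: version} dictionary."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not an exact pin
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, pinned = line.split("==", 1)
        pins[name.strip()] = pinned.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} for packages that differ or are missing."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Two pins copied from the requirements file; in practice, read the whole
# file with open("requirements-prog-spring-2025.txt").read()
sample = """
numpy==1.24.4
pandas==2.0.3
"""
pins = parse_pins(sample)
for name, (pinned, installed) in find_mismatches(pins).items():
    print(f"{name}: pinned {pinned}, installed {installed}")
```

If the environment matches the requirements file, the loop prints nothing; a missing package shows up with `installed None`.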

**VS Code Setup**
* **Installing VS Code:** Download and install from the official website.
* **Installing the Python extension:** Search for "Python" in the Extensions view and install the extension published by Microsoft.
* **Configuring the interpreter:**
    * Open the command palette (Ctrl+Shift+P, or Cmd+Shift+P on macOS) and search for "Python: Select Interpreter".
    * Choose the Python interpreter from your virtual environment.
* **Using VS Code features:**
    * Code completion, syntax highlighting, linting
    * Debugging
    * Git integration

**Additional Tips**
* **Learn from online resources:**
    * Official Python documentation
    * DataCamp, Coursera, edX
    * YouTube tutorials
* **Practice regularly:**
    * Work on small projects
    * Participate in online challenges (Kaggle, HackerRank)

* **Use AI resources responsibly:**
    * Learn concepts before applying AI to generate code
    * If you use AI to generate code you don't understand, you will not be able to debug it or spot its errors
    * To use AI effectively, you need to:
        * Understand the task
        * Understand what the solution should look like
        * Be able to spot errors in the solution provided by AI
        * Be able to debug code generated by AI
