# Module 0: Setting Up Your Data Science Environment

Welcome to **Data Science From Scratch**! Before we dive into Python and machine learning, we need to set up a clean, reliable development environment. Getting this right now prevents most installation and dependency problems later in the course. 🚀

**In this notebook, we will:**
1.  Install Python using the Anaconda distribution.
2.  Set up Visual Studio Code (VS Code) as our code editor.
3.  Create an isolated virtual environment.
4.  Install the essential data science libraries.
5.  Verify that everything is working correctly.

## Step 1: Install Python via Anaconda

While you can install Python directly from python.org, we **highly recommend** using the **Anaconda Distribution**.

**Why Anaconda?**
* **It's an all-in-one package:** Anaconda comes with Python, a powerful package manager (`conda`), Jupyter Notebook, and hundreds of the most popular data science libraries pre-installed.
* **Simplifies Environment Management:** It makes handling different project dependencies much easier.

➡️ **Action:** [Download and install the Anaconda Distribution for your operating system (Windows, macOS, or Linux).](https://www.anaconda.com/download)

*Choose the installer for the latest Python 3.x version.*



## Step 2: Set Up Visual Studio Code (VS Code)

VS Code is a modern, free, and highly versatile code editor that has excellent support for Python and Jupyter Notebooks.

➡️ **Action:** [Download and install VS Code.](https://code.visualstudio.com/download)

#### Install Essential Extensions

Once VS Code is installed, add the extensions that make it useful for data science. Open the Extensions view (`Ctrl+Shift+X`, or `Cmd+Shift+X` on macOS) and install the following:

1.  **Python** (by Microsoft): This provides IntelliSense, linting, debugging, and more.
2.  **Jupyter** (by Microsoft): This allows you to work with `.ipynb` notebooks directly within VS Code.

## Step 3: Create a Virtual Environment

A virtual environment is a self-contained directory that holds a specific Python interpreter and its own set of installed libraries. This prevents conflicts between different projects.
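
To make "self-contained" concrete, here is a minimal sketch that prints which interpreter is running and where that interpreter looks for installed libraries. It uses only the standard library. If you run it once before and once after activating an environment, both paths should change.

```python
import sys
import sysconfig

# Path to the Python interpreter executing this code.
interpreter = sys.executable

# Directory where this interpreter installs and looks up third-party packages.
site_packages = sysconfig.get_paths()["purelib"]

print(f"Interpreter:   {interpreter}")
print(f"Site-packages: {site_packages}")
```

Two projects that use different environments get different values for both lines, which is why their libraries cannot collide.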

➡️ **Action:** Open the terminal in VS Code (`View > Terminal`, or ``Ctrl+` ``) and run the following commands.

**1. Create the environment (we'll name it `dsfs_env`):**

```bash
conda create --name dsfs_env python=3.10 -y
```

**2. Activate the environment:**

```bash
conda activate dsfs_env
```

You'll know it's active because your terminal prompt will change to show `(dsfs_env)`.

**3. Connect VS Code to the environment:**

Open the command palette (`Ctrl+Shift+P`, or `Cmd+Shift+P` on macOS), type `Python: Select Interpreter`, and choose the entry that shows `('dsfs_env')` in its name.
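
If you are unsure whether a notebook or script is really running inside `dsfs_env`, a small check like the one below can help. It is a sketch that relies on one assumption: conda stores each named environment in a folder with the same name, so the last component of `sys.prefix` should be `dsfs_env`. The helper name `in_environment` is made up for this example.

```python
import sys
from pathlib import Path


def in_environment(name, prefix=None):
    """Return True if the interpreter prefix ends in a folder called `name`.

    Assumes the conda layout in which an environment lives in a
    directory named after the environment (for example .../envs/dsfs_env).
    """
    if prefix is None:
        prefix = sys.prefix  # root folder of the running interpreter
    return Path(prefix).name == name


print(f"Interpreter prefix: {sys.prefix}")
print(f"Running inside dsfs_env: {in_environment('dsfs_env')}")
```

If the second line prints `False`, select the interpreter again and restart the notebook kernel.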

## Step 4: Install Core Data Science Libraries

With your `dsfs_env` environment active, let's install the fundamental libraries we'll be using throughout this course.

➡️ **Action:** Run the following command in your active terminal.

```bash
pip install numpy pandas matplotlib seaborn scikit-learn notebook
```

**What are these libraries?**
* **`numpy`**: For numerical operations and working with arrays.
* **`pandas`**: For data manipulation and analysis (think Excel on steroids).
* **`matplotlib` & `seaborn`**: For data visualization. `matplotlib` is the core plotting library; `seaborn` builds on it with a higher-level interface for statistical graphics.
* **`scikit-learn`**: The go-to library for machine learning algorithms.
* **`notebook`**: The Jupyter Notebook application and the components it needs to run `.ipynb` files.

## Step 5: Verify Your Installation

Let's run a quick piece of code to make sure all our libraries are installed and accessible within the notebook.

➡️ **Action:** Run the code cell below. You can do this by clicking the 'Run' icon next to the cell or by pressing `Shift + Enter`.

```python
import numpy as np
import pandas as pd
import matplotlib
import seaborn as sns
import sklearn

print(f"NumPy version: {np.__version__}")
print(f"Pandas version: {pd.__version__}")
print(f"Matplotlib version: {matplotlib.__version__}")
print(f"Seaborn version: {sns.__version__}")
print(f"Scikit-learn version: {sklearn.__version__}")
```
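
The imports above stop at the first library that is missing, so a single failure hides the status of the rest. As an optional alternative, the sketch below asks the standard library's `importlib.metadata` for each installed version and reports every package, including the ones that are not found. The `REQUIRED` list and the `check_packages` helper are names chosen for this example.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as used in the pip install command in Step 4.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "scikit-learn", "notebook"]


def check_packages(names):
    """Map each distribution name to its installed version, or None if missing."""
    report = {}
    for name in names:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


report = check_packages(REQUIRED)
for name, installed in report.items():
    status = installed if installed is not None else "NOT INSTALLED"
    print(f"{name:<14}{status}")
```

Any package reported as `NOT INSTALLED` can be added by rerunning the `pip install` command from Step 4 with the environment active.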

## ✅ All Done!

If the cell above ran without errors and printed a version number for each library, your environment is set up and ready for the rest of the course.

You are now ready to move on to the next module.