# Python-SQL-Data-Science Workspace Setup

This notebook sets up a new workspace for Python, SQL, and data science development. Run the sections in order: later cells reuse variables such as `workspace_dir` and `venv_dir` that earlier cells define.

## 1. Create a New Workspace Directory

Use Python's `pathlib` module to create a dedicated directory for your workspace. The directory is created under the notebook's current working directory, and `exist_ok=True` makes the cell safe to re-run.

In [1]:
import os
from pathlib import Path

# Define the workspace directory name
workspace_dir = Path.cwd() / "my_data_science_project"

# Create the directory if it doesn't exist
workspace_dir.mkdir(exist_ok=True)
print(f"Workspace directory created at: {workspace_dir}")

Workspace directory created at: /Users/drahmetacik/Projects/python-sql-data-science/my_data_science_project

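The same `mkdir` call extends to nested folders. The sketch below assumes a hypothetical layout (`data/raw`, `notebooks`, `sql`) that this notebook does not prescribe; the helper name `create_layout` is also illustrative. It runs in a temporary directory so it leaves nothing behind.

```python
import tempfile
from pathlib import Path


def create_layout(root, subdirs=("data/raw", "notebooks", "sql")):
    """Create each subdirectory under root and return the resulting paths."""
    root = Path(root)
    created = []
    for name in subdirs:
        path = root / name
        # parents=True also creates intermediate folders such as "data";
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Demonstrate in a throwaway directory (the layout itself is an assumption).
with tempfile.TemporaryDirectory() as tmp:
    for path in create_layout(tmp):
        print(path.relative_to(tmp))
```

To apply it to the real workspace, one would call `create_layout(workspace_dir)` after the cell above.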

## 2. Initialize a Git Repository

Initialize a new Git repository in your workspace directory to enable version control for your project files.

In [2]:
import subprocess

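# Initialize the repository by running git inside the workspace directory.
# Passing cwd avoids changing the notebook's own working directory, so
# re-running the first cell still resolves Path.cwd() to the same location.
subprocess.run(["git", "init"], cwd=workspace_dir, check=True)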
print(f"Initialized empty Git repository in {workspace_dir}")

Initialized empty Git repository in /Users/drahmetacik/Projects/python-sql-data-science/my_data_science_project/.git/
Initialized empty Git repository in /Users/drahmetacik/Projects/python-sql-data-science/my_data_science_project

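Because the next section creates a `.venv` folder inside the repository, it usually helps to keep it out of version control. The following is a minimal sketch, assuming a hypothetical helper named `write_gitignore` and a small set of ignore patterns chosen for illustration; adjust the patterns to your project.

```python
import tempfile
from pathlib import Path


def write_gitignore(repo_dir, patterns=(".venv/", "__pycache__/", ".ipynb_checkpoints/")):
    """Ensure each pattern appears in repo_dir/.gitignore and return its path."""
    path = Path(repo_dir) / ".gitignore"
    existing = []
    if path.exists():
        existing = path.read_text(encoding="utf-8").splitlines()
    # Keep whatever is already there and add only the missing patterns.
    lines = existing + [p for p in patterns if p not in existing]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


# Demonstrate in a throwaway directory; calling twice adds no duplicates.
with tempfile.TemporaryDirectory() as tmp:
    write_gitignore(tmp)
    print(write_gitignore(tmp).read_text(encoding="utf-8"))
```

In this notebook the call would be `write_gitignore(workspace_dir)`.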

## 3. Set Up a Python Virtual Environment

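Create a Python virtual environment in your workspace to isolate dependencies and ensure reproducibility. The cell below only creates the environment; it does not activate it. Later cells call the environment's executables directly by path instead.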

In [3]:
import sys

# Create a virtual environment in the workspace directory
venv_dir = workspace_dir / ".venv"
subprocess.run([sys.executable, "-m", "venv", str(venv_dir)], check=True)
print(f"Virtual environment created at: {venv_dir}")

Virtual environment created at: /Users/drahmetacik/Projects/python-sql-data-science/my_data_science_project/.venv

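The location of executables inside a virtual environment depends on the operating system: `bin/` on macOS and Linux, `Scripts\` on Windows. The helper below is a small sketch for resolving those paths; the name `venv_executable` and its `os_name` parameter are assumptions made for illustration.

```python
import os
from pathlib import Path


def venv_executable(venv_dir, name="python", os_name=os.name):
    """Return the path of an executable inside a virtual environment."""
    venv_dir = Path(venv_dir)
    if os_name == "nt":
        # Windows venvs place executables under Scripts with an .exe suffix.
        return venv_dir / "Scripts" / f"{name}.exe"
    # POSIX venvs (macOS, Linux) place executables under bin.
    return venv_dir / "bin" / name


print(venv_executable(".venv"))
print(venv_executable(".venv", "pip", os_name="nt"))
```

Passing `os_name` explicitly is only for demonstration; in normal use the default picks the layout of the machine running the notebook.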

## 4. Install Essential Packages

Use pip to install essential packages for data science and SQL workflows, such as numpy, pandas, SQLAlchemy, jupyter, and database connectors.

In [5]:
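# Install essential packages in the virtual environment.
# pip is invoked through the environment's own interpreter ("python -m pip"),
# which is the recommended way to upgrade pip itself.
# Note: "bin" applies to macOS/Linux; Windows environments use "Scripts".
venv_python = venv_dir / "bin" / "python"
packages = ["numpy", "pandas", "sqlalchemy", "jupyter", "ipython", "matplotlib", "mysql-connector-python"]
subprocess.run([str(venv_python), "-m", "pip", "install", "--upgrade", "pip"], check=True)
subprocess.run([str(venv_python), "-m", "pip", "install", *packages], check=True)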
print("Essential packages installed.")

Collecting numpy
  Using cached numpy-2.0.2-cp39-cp39-macosx_14_0_x86_64.whl.metadata (60 kB)
Collecting pandas
  Using cached pandas-2.3.2-cp39-cp39-macosx_10_9_x86_64.whl.metadata (91 kB)
Collecting sqlalchemy
  Using cached sqlalchemy-2.0.43-cp39-cp39-macosx_10_9_x86_64.whl.metadata (9.6 kB)
Collecting jupyter
  Using cached jupyter-1.1.1-py2.py3-none-any.whl.metadata (2.0 kB)
Collecting ipython
  Using cached ipython-8.18.1-py3-none-any.whl.metadata (6.0 kB)
Collecting matplotlib
  Using cached matplotlib-3.9.4-cp39-cp39-ma

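After installation it can be useful to confirm which versions are actually present. The sketch below uses the standard library's `importlib.metadata`; the helper name `installed_versions` is illustrative. Note that it inspects the interpreter running the code, so it reflects the virtual environment only if the notebook kernel uses that environment.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


packages = ["numpy", "pandas", "sqlalchemy", "jupyter", "ipython",
            "matplotlib", "mysql-connector-python"]
for name, version in installed_versions(packages).items():
    print(f"{name}: {version or 'not installed'}")
```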
## 5. Configure VS Code Workspace Settings

Set up VS Code workspace settings to use the correct Python interpreter and configure other preferences for data science development.

In [6]:
import json
vscode_dir = workspace_dir / ".vscode"
vscode_dir.mkdir(exist_ok=True)
settings_path = vscode_dir / "settings.json"

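# Point VS Code at the virtual environment's interpreter.
# "python.defaultInterpreterPath" replaces the deprecated "python.pythonPath".
# The formatting and linting keys below are legacy settings; recent versions
# of the Python extension may ignore them in favor of dedicated extensions.
# Note: "bin" applies to macOS/Linux; Windows environments use "Scripts".
settings = {
    "python.defaultInterpreterPath": str(venv_dir / "bin" / "python"),
    "python.formatting.provider": "black",
    "python.linting.enabled": True,
    "python.linting.pylintEnabled": True,
    "jupyter.jupyterServerType": "local"
}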
with open(settings_path, "w") as f:
    json.dump(settings, f, indent=4)
print(f"VS Code workspace settings written to {settings_path}")

VS Code workspace settings written to /Users/drahmetacik/Projects/python-sql-data-science/my_data_science_project/.vscode/settings.json

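The cell above overwrites any existing `settings.json`. If the workspace already has settings worth keeping, merging is safer. The following is a minimal sketch, assuming a hypothetical helper named `update_settings` and a settings file that contains plain JSON; VS Code also accepts comments in this file, which `json.loads` would reject.

```python
import json
import tempfile
from pathlib import Path


def update_settings(settings_path, updates):
    """Merge updates into settings.json, creating the file if it is missing."""
    settings_path = Path(settings_path)
    current = {}
    if settings_path.exists():
        # Assumes plain JSON; a file with comments would raise an error here.
        current = json.loads(settings_path.read_text(encoding="utf-8"))
    current.update(updates)
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(current, indent=4) + "\n", encoding="utf-8")
    return current


# Demonstrate in a throwaway directory: the second call keeps the first key.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / ".vscode" / "settings.json"
    update_settings(path, {"editor.formatOnSave": True})
    merged = update_settings(path, {"files.trimTrailingWhitespace": True})
    print(sorted(merged))
```

In this notebook the call would be `update_settings(settings_path, settings)` in place of the `open`/`json.dump` pair.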