## 1. Introduction to SQL

In [8]:
import sqlite3
import pandas as pd

# Create in-memory database
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()

print("=" * 60)
print("SQL FUNDAMENTALS")
print("=" * 60)
print("\n✓ Database and connection created")

SQL FUNDAMENTALS

✓ Database and connection created


## 1. Data Definition Language (DDL) - CREATE

DDL statements define the structure of databases and tables.

**CREATE TABLE** - Defines a new table with columns and constraints
- PRIMARY KEY: Unique identifier for each row
- NOT NULL: Value must be provided
- UNIQUE: All values must be different
- AUTOINCREMENT: Auto-generate the next key value; in SQLite it also prevents reuse of keys from deleted rows
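
To see what these constraints do in practice, here is a minimal sketch that assumes a fresh in-memory database with the same `departments` schema used in this notebook. The helper `try_insert` is hypothetical and exists only for illustration; it reports whether SQLite accepted or rejected a row.

```python
import sqlite3

# Assumption: a separate in-memory database, so nothing else in the notebook is touched.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE departments (
    dept_id INTEGER PRIMARY KEY AUTOINCREMENT,
    dept_name TEXT NOT NULL UNIQUE,
    location TEXT
)
""")

def try_insert(dept_name, location):
    """Hypothetical helper: attempt an insert and report whether a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO departments (dept_name, location) VALUES (?, ?)",
            (dept_name, location),
        )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

results = [
    try_insert("Sales", "New York"),  # valid row
    try_insert("Sales", "Boston"),    # duplicate dept_name violates UNIQUE
    try_insert(None, "Chicago"),      # missing dept_name violates NOT NULL
]
for outcome in results:
    print(outcome)

# Only the valid row is stored.
row_count = conn.execute("SELECT COUNT(*) FROM departments").fetchone()[0]
print("rows stored:", row_count)
```

Constraint violations surface in Python as `sqlite3.IntegrityError`, so the rejected rows never reach the table.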

In [9]:
# Create departments table
print("\n" + "=" * 60)
print("1. CREATE TABLE - Departments")
print("=" * 60)

cursor.execute("""
CREATE TABLE departments (
    dept_id INTEGER PRIMARY KEY AUTOINCREMENT,
    dept_name TEXT NOT NULL UNIQUE,
    location TEXT
)
""")

print("✓ departments table created")
print("\nStructure:")
print("  dept_id (PRIMARY KEY, AUTOINCREMENT)")
print("  dept_name (NOT NULL, UNIQUE)")
print("  location")

# Create employees table
cursor.execute("""
CREATE TABLE employees (
    emp_id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    department TEXT NOT NULL,
    salary REAL NOT NULL,
    hire_date TEXT
)
""")

print("\n✓ employees table created")
print("\nStructure:")
print("  emp_id (PRIMARY KEY, AUTOINCREMENT)")
print("  name (NOT NULL)")
print("  department (NOT NULL)")
print("  salary (NOT NULL)")
print("  hire_date")


1. CREATE TABLE - Departments
✓ departments table created

Structure:
  dept_id (PRIMARY KEY, AUTOINCREMENT)
  dept_name (NOT NULL, UNIQUE)
  location

✓ employees table created

Structure:
  emp_id (PRIMARY KEY, AUTOINCREMENT)
  name (NOT NULL)
  department (NOT NULL)
  salary (NOT NULL)
  hire_date


## 2. Data Manipulation Language (DML) - INSERT

DML statements modify data in tables.

**INSERT** - Adds new rows to a table
- Single insert: One row at a time
- Batch insert: Multiple rows with executemany()
- Must provide values for NOT NULL columns
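
The two insert styles can be compared side by side. The sketch below assumes a fresh in-memory database with the `employees` schema from the previous section; the rows for Ivy, Jack, and Kim are made-up sample data, not part of the notebook's dataset.

```python
import sqlite3

# Assumption: a separate in-memory database with the employees schema shown earlier.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("""
CREATE TABLE employees (
    emp_id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    department TEXT NOT NULL,
    salary REAL NOT NULL,
    hire_date TEXT
)
""")

insert_sql = (
    "INSERT INTO employees (name, department, salary, hire_date) "
    "VALUES (?, ?, ?, ?)"
)

# Single insert: one tuple of values is bound to the ? placeholders.
cursor.execute(insert_sql, ("Ivy", "Sales", 82000, "2023-01-09"))  # hypothetical row
first_id = cursor.lastrowid  # key generated for the row just inserted

# Batch insert: executemany() runs the same statement once per tuple.
cursor.executemany(insert_sql, [
    ("Jack", "Engineering", 118000, "2023-02-01"),  # hypothetical row
    ("Kim", "HR", 71000, None),  # hire_date is nullable, so None (NULL) is accepted
])
conn.commit()

total = cursor.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(first_id, total)
```

Binding values through `?` placeholders, rather than formatting them into the SQL string, lets the driver handle quoting and avoids SQL injection.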

In [10]:
print("\n" + "=" * 60)
print("2. DML - INSERT DATA")
print("=" * 60)

# Insert departments
departments = [
    ('Sales', 'New York'),
    ('Engineering', 'San Francisco'),
    ('Marketing', 'Los Angeles'),
    ('HR', 'Chicago'),
]

cursor.executemany(
    "INSERT INTO departments (dept_name, location) VALUES (?, ?)",
    departments
)

print(f"✓ Inserted {len(departments)} departments")

# Insert employees
employees = [
    ('Alice', 'Sales', 85000, '2021-01-15'),
    ('Bob', 'Engineering', 120000, '2020-03-22'),
    ('Carol', 'Sales', 80000, '2021-06-10'),
    ('David', 'Engineering', 130000, '2019-11-30'),
    ('Eve', 'Marketing', 75000, '2022-02-14'),
    ('Frank', 'Sales', 90000, '2020-09-01'),
    ('Grace', 'Engineering', 125000, '2020-05-20'),
    ('Henry', 'HR', 70000, '2022-08-10'),
]

cursor.executemany(
    "INSERT INTO employees (name, department, salary, hire_date) VALUES (?, ?, ?, ?)",
    employees
)
conn.commit()

print(f"✓ Inserted {len(employees)} employees")


2. DML - INSERT DATA
✓ Inserted 4 departments
✓ Inserted 8 employees


## 3. Data Query Language (DQL) - SELECT

DQL statements retrieve data from tables.

**SELECT** - Retrieves specific columns from tables
- SELECT * : All columns
- SELECT col1, col2 : Specific columns
- WHERE : Filter conditions
- ORDER BY : Sort results
- LIMIT : Restrict number of rows
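
These clauses are usually combined in a single statement. As a rough sketch, assuming a small stand-alone copy of a few `employees` rows and using only the standard library instead of pandas, the query below filters, sorts, and truncates in one pass.

```python
import sqlite3

# Assumption: a separate in-memory table holding a subset of the sample employees.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (name TEXT NOT NULL, department TEXT NOT NULL, salary REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Alice", "Sales", 85000), ("Bob", "Engineering", 120000),
     ("Carol", "Sales", 80000), ("Frank", "Sales", 90000)],
)

# WHERE filters first, ORDER BY sorts the remaining rows, LIMIT truncates last.
query = """
SELECT name, salary
FROM employees
WHERE department = ?
ORDER BY salary DESC
LIMIT ?
"""
top_sales = conn.execute(query, ("Sales", 2)).fetchall()
print(top_sales)
```

Each row comes back as a plain tuple; `pd.read_sql_query` wraps the same result in a DataFrame.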

In [11]:
print("\n" + "=" * 60)
print("3. DQL - SELECT QUERIES")
print("=" * 60)

# Simple SELECT
print("\nQuery 1: All employees")
query = "SELECT * FROM employees"
result = pd.read_sql_query(query, conn)
print(result)

# SELECT specific columns
print("\n" + "-" * 60)
print("Query 2: Employee names and salaries")
query = "SELECT name, salary FROM employees"
result = pd.read_sql_query(query, conn)
print(result)


3. DQL - SELECT QUERIES

Query 1: All employees
   emp_id   name   department    salary   hire_date
0       1  Alice        Sales   85000.0  2021-01-15
1       2    Bob  Engineering  120000.0  2020-03-22
2       3  Carol        Sales   80000.0  2021-06-10
3       4  David  Engineering  130000.0  2019-11-30
4       5    Eve    Marketing   75000.0  2022-02-14
5       6  Frank        Sales   90000.0  2020-09-01
6       7  Grace  Engineering  125000.0  2020-05-20
7       8  Henry           HR   70000.0  2022-08-10

------------------------------------------------------------
Query 2: Employee names and salaries
    name    salary
0  Alice   85000.0
1    Bob  120000.0
2  Carol   80000.0
3  David  130000.0
4    Eve   75000.0
5  Frank   90000.0
6  Grace  125000.0
7  Henry   70000.0


## 4. Filtering with WHERE

The WHERE clause filters rows based on conditions.

**Operators:**
- = (equal), != (not equal)
- > (greater), < (less), >= (greater/equal), <= (less/equal)
- AND, OR: Multiple conditions
- IN: List of values
- BETWEEN: Range of values
- LIKE: Pattern matching
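
LIKE uses two wildcards: `%` matches any run of characters and `_` matches exactly one. The sketch below assumes a stand-alone in-memory table with a few of the sample names; `names_like` is a hypothetical helper. Note that SQLite's LIKE is case-insensitive for ASCII letters by default.

```python
import sqlite3

# Assumption: a separate in-memory table with a few of the sample employees.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT NOT NULL, department TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Alice", "Sales"), ("Bob", "Engineering"), ("Carol", "Sales"),
     ("David", "Engineering"), ("Eve", "Marketing")],
)

def names_like(pattern):
    """Hypothetical helper: return the names matching a LIKE pattern, sorted A to Z."""
    rows = conn.execute(
        "SELECT name FROM employees WHERE name LIKE ? ORDER BY name", (pattern,)
    )
    return [name for (name,) in rows]

starts_with_a = names_like("A%")   # names beginning with A
ends_with_e = names_like("%e")     # names ending in e
three_letters = names_like("_o_")  # three-character names with o in the middle

print(starts_with_a, ends_with_e, three_letters)
```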

In [12]:
print("\n" + "=" * 60)
print("4. WHERE - FILTERING DATA")
print("=" * 60)

# Single condition
print("\nQuery 1: Employees in Sales department")
query = "SELECT name, department, salary FROM employees WHERE department = 'Sales'"
result = pd.read_sql_query(query, conn)
print(result)

# Multiple conditions with AND
print("\n" + "-" * 60)
print("Query 2: Sales employees earning > 80000")
query = """
SELECT name, department, salary FROM employees 
WHERE department = 'Sales' AND salary > 80000
"""
result = pd.read_sql_query(query, conn)
print(result)

# OR condition
print("\n" + "-" * 60)
print("Query 3: Sales OR Engineering department")
query = """
SELECT name, department, salary FROM employees 
WHERE department = 'Sales' OR department = 'Engineering'
"""
result = pd.read_sql_query(query, conn)
print(result)

# IN clause
print("\n" + "-" * 60)
print("Query 4: Using IN clause")
query = "SELECT name, salary FROM employees WHERE department IN ('Sales', 'Marketing')"
result = pd.read_sql_query(query, conn)
print(result)

# BETWEEN clause
print("\n" + "-" * 60)
print("Query 5: Salary between 80000 and 100000")
query = "SELECT name, salary FROM employees WHERE salary BETWEEN 80000 AND 100000"
result = pd.read_sql_query(query, conn)
print(result)


4. WHERE - FILTERING DATA

Query 1: Employees in Sales department
    name department   salary
0  Alice      Sales  85000.0
1  Carol      Sales  80000.0
2  Frank      Sales  90000.0

------------------------------------------------------------
Query 2: Sales employees earning > 80000
    name department   salary
0  Alice      Sales  85000.0
1  Frank      Sales  90000.0

------------------------------------------------------------
Query 3: Sales OR Engineering department
    name   department    salary
0  Alice        Sales   85000.0
1    Bob  Engineering  120000.0
2  Carol        Sales   80000.0
3  David  Engineering  130000.0
4  Frank        Sales   90000.0
5  Grace  Engineering  125000.0

------------------------------------------------------------
Query 4: Using IN clause
    name   salary
0  Alice  85000.0
1  Carol  80000.0
2    Eve  75000.0
3  Frank  90000.0

------------------------------------------------------------
Query 5: Salary between 80000 and 100000
    name   salary
0  Alice  85000.0
1  Carol  80000.0
2  Frank  90000.0

## 5. Sorting with ORDER BY and LIMIT

**ORDER BY** - Sorts results
- ASC: Ascending (default)
- DESC: Descending

**LIMIT** - Restricts number of rows returned

In [13]:
print("\n" + "=" * 60)
print("5. ORDER BY and LIMIT")
print("=" * 60)

# ORDER BY ascending
print("\nQuery 1: Employees ordered by name (A to Z)")
query = "SELECT name, salary FROM employees ORDER BY name ASC"
result = pd.read_sql_query(query, conn)
print(result)

# ORDER BY descending
print("\n" + "-" * 60)
print("Query 2: Employees ordered by salary (highest to lowest)")
query = "SELECT name, salary FROM employees ORDER BY salary DESC"
result = pd.read_sql_query(query, conn)
print(result)

# LIMIT
print("\n" + "-" * 60)
print("Query 3: Top 3 highest paid employees")
query = "SELECT name, salary FROM employees ORDER BY salary DESC LIMIT 3"
result = pd.read_sql_query(query, conn)
print(result)

# Multiple ORDER BY
print("\n" + "-" * 60)
print("Query 4: Order by department, then by salary")
query = "SELECT name, department, salary FROM employees ORDER BY department ASC, salary DESC LIMIT 5"
result = pd.read_sql_query(query, conn)
print(result)


5. ORDER BY and LIMIT

Query 1: Employees ordered by name (A to Z)
    name    salary
0  Alice   85000.0
1    Bob  120000.0
2  Carol   80000.0
3  David  130000.0
4    Eve   75000.0
5  Frank   90000.0
6  Grace  125000.0
7  Henry   70000.0

------------------------------------------------------------
Query 2: Employees ordered by salary (highest to lowest)
    name    salary
0  David  130000.0
1  Grace  125000.0
2    Bob  120000.0
3  Frank   90000.0
4  Alice   85000.0
5  Carol   80000.0
6    Eve   75000.0
7  Henry   70000.0

------------------------------------------------------------
Query 3: Top 3 highest paid employees
    name    salary
0  David  130000.0
1  Grace  125000.0
2    Bob  120000.0

------------------------------------------------------------
Query 4: Order by department, then by salary
    name   department    salary
0  David  Engineering  130000.0
1  Grace  Engineering  125000.0
2    Bob  Engineering  120000.0
3  Henry           HR   70000.0
4    Eve    Marketing   75000.0

## 6. Data Modification - UPDATE and DELETE

**UPDATE** - Modifies existing data
- Use WHERE to specify which rows to change; without it, every row is updated
- Can update one or multiple columns

**DELETE** - Removes rows
- Use WHERE to specify which rows to remove
- DELETE without WHERE deletes all rows!
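
One way to guard against an overly broad statement is to count the matching rows first and then compare that number with `cursor.rowcount` after the change. This is a minimal sketch, assuming a stand-alone in-memory table with three of the sample employees.

```python
import sqlite3

# Assumption: a separate in-memory table with three of the sample employees.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE employees (name TEXT NOT NULL, salary REAL NOT NULL)")
cursor.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Alice", 85000), ("Eve", 75000), ("Henry", 70000)],
)
conn.commit()

threshold = 75000

# Preview: how many rows would the WHERE clause match?
matching = cursor.execute(
    "SELECT COUNT(*) FROM employees WHERE salary < ?", (threshold,)
).fetchone()[0]

# Targeted DELETE: rowcount reports how many rows were actually removed.
cursor.execute("DELETE FROM employees WHERE salary < ?", (threshold,))
deleted = cursor.rowcount

# Targeted UPDATE: only the row named in WHERE changes.
cursor.execute("UPDATE employees SET salary = salary + 5000 WHERE name = ?", ("Eve",))
updated = cursor.rowcount

# DELETE without WHERE removes every remaining row.
cursor.execute("DELETE FROM employees")
wiped = cursor.rowcount

remaining = cursor.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(matching, deleted, updated, wiped, remaining)
```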

In [14]:
print("\n" + "=" * 60)
print("6. UPDATE AND DELETE")
print("=" * 60)

# UPDATE example
print("\nQuery 1: Give Alice a 10% raise")
salary_increase = """
UPDATE employees 
SET salary = salary * 1.1 
WHERE name = 'Alice'
"""
cursor.execute(salary_increase)
conn.commit()

query = "SELECT name, salary FROM employees WHERE name = 'Alice'"
result = pd.read_sql_query(query, conn)
print(result)
print("✓ Alice's salary updated")

# DELETE example
print("\n" + "-" * 60)
print("Query 2: Delete employees with salary < 75000")
print("Before deletion:")
query = "SELECT name, salary FROM employees WHERE salary < 75000"
result = pd.read_sql_query(query, conn)
print(result)

cursor.execute("DELETE FROM employees WHERE salary < 75000")
conn.commit()

print("\nAfter deletion:")
query = "SELECT name, salary FROM employees WHERE salary < 75000"
result = pd.read_sql_query(query, conn)
print(f"Rows remaining: {len(result)}")
print("✓ Low salary employees deleted")


6. UPDATE AND DELETE

Query 1: Give Alice a 10% raise
    name   salary
0  Alice  93500.0
✓ Alice's salary updated

------------------------------------------------------------
Query 2: Delete employees with salary < 75000
Before deletion:
    name   salary
0  Henry  70000.0

After deletion:
Rows remaining: 0
✓ Low salary employees deleted


## Summary of SQL Fundamentals

| SQL Type | Statements | Purpose |
|----------|-----------|---------|
| **DDL** | CREATE, ALTER, DROP | Define database structure |
| **DML** | INSERT, UPDATE, DELETE | Modify data |
| **DQL** | SELECT | Retrieve data |
| **DCL** | GRANT, REVOKE | Control access |
| **TCL** | COMMIT, ROLLBACK | Transaction control |
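
Transaction control is the one category in this table that the notebook uses only through `conn.commit()`. As a hedged sketch, assuming a stand-alone in-memory table, the example below uses the `sqlite3` connection as a context manager, which commits when the block succeeds and rolls back when it raises.

```python
import sqlite3

# Assumption: a separate in-memory table with a UNIQUE name column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT NOT NULL UNIQUE, salary REAL NOT NULL)")

with conn:  # commits on normal exit
    conn.execute("INSERT INTO employees VALUES ('Alice', 85000)")

try:
    with conn:  # rolls back if the block raises
        conn.execute("UPDATE employees SET salary = salary * 2 WHERE name = 'Alice'")
        conn.execute("INSERT INTO employees VALUES ('Alice', 1)")  # violates UNIQUE
except sqlite3.IntegrityError:
    rolled_back = True
else:
    rolled_back = False

# The UPDATE was undone together with the failed INSERT.
salary = conn.execute("SELECT salary FROM employees WHERE name = 'Alice'").fetchone()[0]
print(rolled_back, salary)
```

Because both statements belong to the same transaction, the failed INSERT also discards the salary change, which illustrates the atomicity part of ACID.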

### Key Concepts:
- **Primary Key**: Unique identifier for rows
- **Constraints**: Rules for data integrity
- **WHERE**: Filters rows before they are sorted or limited
- **ORDER BY**: Sorts results
- **LIMIT**: Restricts output

In [15]:
print("=" * 50)
print("SQL FUNDAMENTALS")
print("=" * 50)

intro = """
SQL (Structured Query Language):
- Declarative language for managing databases
- DDL: Data Definition Language (CREATE, ALTER, DROP)
- DML: Data Manipulation Language (INSERT, UPDATE, DELETE)
- DQL: Data Query Language (SELECT)
- DCL: Data Control Language (GRANT, REVOKE)
- TCL: Transaction Control Language (COMMIT, ROLLBACK)

Key Features:
- Set-based operations
- Database independence (mostly)
- Powerful joins and aggregations
- Transaction support
- ACID properties
"""
print(intro)

SQL FUNDAMENTALS

SQL (Structured Query Language):
- Declarative language for managing databases
- DDL: Data Definition Language (CREATE, ALTER, DROP)
- DML: Data Manipulation Language (INSERT, UPDATE, DELETE)
- DQL: Data Query Language (SELECT)
- DCL: Data Control Language (GRANT, REVOKE)
- TCL: Transaction Control Language (COMMIT, ROLLBACK)

Key Features:
- Set-based operations
- Database independence (mostly)
- Powerful joins and aggregations
- Transaction support
- ACID properties



## 2. SQLite Setup

In [16]:
import sqlite3
import pandas as pd

print("\n" + "=" * 50)
print("SQLITE SETUP")
print("=" * 50)

# Create in-memory database
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()

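print("✓ SQLite connection established")
# sqlite3.version is the version of Python's sqlite3 module, not of the SQLite
# library, and is deprecated since Python 3.12; sqlite3.sqlite_version reports
# the library version.
print(f"✓ sqlite3 module version: {sqlite3.version}")


SQLITE SETUP
✓ SQLite connection established
✓ sqlite3 module version: 2.6.0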



## 3. Data Definition Language (DDL) - CREATE

In [17]:
print("\n" + "=" * 50)
print("DDL - CREATE TABLE")
print("=" * 50)

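# Create tables
# Note: SQLite has no dedicated DATE type. A column declared DATE gets NUMERIC
# affinity, and ISO-8601 strings such as '2020-01-15' are stored as text.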
create_employees = """
CREATE TABLE employees (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    department TEXT,
    salary REAL,
    hire_date DATE,
    UNIQUE(name)
)
"""

create_departments = """
CREATE TABLE departments (
    dept_id INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL,
    location TEXT
)
"""

cursor.execute(create_employees)
cursor.execute(create_departments)
conn.commit()

print("✓ Tables created:")
print("  - employees (id, name, department, salary, hire_date)")
print("  - departments (dept_id, dept_name, location)")


DDL - CREATE TABLE
✓ Tables created:
  - employees (id, name, department, salary, hire_date)
  - departments (dept_id, dept_name, location)
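
Since `hire_date` is declared as DATE here, it helps to check how SQLite actually stores the values. The sketch below assumes a stand-alone in-memory table with just two of the sample rows and uses SQLite's `typeof()` function to inspect the storage class.

```python
import sqlite3

# Assumption: a separate in-memory table with only the columns needed for the check.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT NOT NULL, hire_date DATE)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Alice Johnson", "2020-01-15"), ("Bob Smith", "2019-05-20")],
)

# typeof() reports the storage class SQLite used for each stored value.
storage = conn.execute("SELECT DISTINCT typeof(hire_date) FROM employees").fetchall()

# ISO-8601 strings sort chronologically, so plain text comparison works for date filters.
hired_2020_on = conn.execute(
    "SELECT name FROM employees WHERE hire_date >= '2020-01-01' ORDER BY hire_date"
).fetchall()

print(storage, hired_2020_on)
```

The dates are stored as text, which is why keeping them in `YYYY-MM-DD` form matters: any other format would not compare or sort correctly.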


## 4. Data Manipulation Language (DML) - INSERT

In [18]:
print("\n" + "=" * 50)
print("DML - INSERT")
print("=" * 50)

# Insert departments
departments_data = [
    (1, 'Sales', 'New York'),
    (2, 'Engineering', 'San Francisco'),
    (3, 'Marketing', 'Boston'),
    (4, 'HR', 'Chicago')
]

cursor.executemany(
    "INSERT INTO departments (dept_id, dept_name, location) VALUES (?, ?, ?)",
    departments_data
)

# Insert employees
employees_data = [
    ('Alice Johnson', 'Sales', 75000, '2020-01-15'),
    ('Bob Smith', 'Engineering', 95000, '2019-05-20'),
    ('Carol White', 'Engineering', 90000, '2021-03-10'),
    ('David Brown', 'Sales', 70000, '2020-06-01'),
    ('Eve Davis', 'Marketing', 65000, '2021-01-20'),
    ('Frank Miller', 'Engineering', 100000, '2018-11-30'),
    ('Grace Lee', 'HR', 60000, '2022-02-14'),
    ('Henry Wilson', 'Sales', 72000, '2021-07-18')
]

cursor.executemany(
    "INSERT INTO employees (name, department, salary, hire_date) VALUES (?, ?, ?, ?)",
    employees_data
)

conn.commit()

print(f"✓ Inserted {len(departments_data)} departments")
print(f"✓ Inserted {len(employees_data)} employees")


DML - INSERT
✓ Inserted 4 departments
✓ Inserted 8 employees


## 5. Data Query Language (DQL) - SELECT

In [19]:
print("\n" + "=" * 50)
print("DQL - SELECT")
print("=" * 50)

# Basic SELECT
print("\n1. Select all employees:")
query = "SELECT * FROM employees"
result = pd.read_sql_query(query, conn)
print(result)

# SELECT specific columns
print("\n2. Select name and salary:")
query = "SELECT name, salary FROM employees"
result = pd.read_sql_query(query, conn)
print(result)

# SELECT with WHERE
print("\n3. Employees with salary > 75000:")
query = "SELECT name, salary FROM employees WHERE salary > 75000"
result = pd.read_sql_query(query, conn)
print(result)


DQL - SELECT

1. Select all employees:
   id           name   department    salary   hire_date
0   1  Alice Johnson        Sales   75000.0  2020-01-15
1   2      Bob Smith  Engineering   95000.0  2019-05-20
2   3    Carol White  Engineering   90000.0  2021-03-10
3   4    David Brown        Sales   70000.0  2020-06-01
4   5      Eve Davis    Marketing   65000.0  2021-01-20
5   6   Frank Miller  Engineering  100000.0  2018-11-30
6   7      Grace Lee           HR   60000.0  2022-02-14
7   8   Henry Wilson        Sales   72000.0  2021-07-18

2. Select name and salary:
            name    salary
0  Alice Johnson   75000.0
1      Bob Smith   95000.0
2    Carol White   90000.0
3    David Brown   70000.0
4      Eve Davis   65000.0
5   Frank Miller  100000.0
6      Grace Lee   60000.0
7   Henry Wilson   72000.0

3. Employees with salary > 75000:
           name    salary
0     Bob Smith   95000.0
1   Carol White   90000.0
2  Frank Miller  100000.0


## 6. Filtering and Conditions

In [ ]:
print("\n" + "=" * 50)
print("WHERE CLAUSE - FILTERING")
print("=" * 50)

# WHERE with AND
print("\n1. Engineering department AND salary > 90000:")
query = "SELECT name, department, salary FROM employees WHERE department = 'Engineering' AND salary > 90000"
result = pd.read_sql_query(query, conn)
print(result)

# WHERE with OR
print("\n2. Sales OR Marketing department:")
query = "SELECT name, department FROM employees WHERE department = 'Sales' OR department = 'Marketing'"
result = pd.read_sql_query(query, conn)
print(result)

# WHERE with IN
print("\n3. Using IN clause:")
query = "SELECT name, department FROM employees WHERE department IN ('Sales', 'Engineering')"
result = pd.read_sql_query(query, conn)
print(result)

# WHERE with BETWEEN
print("\n4. Salary between 65000 and 80000:")
query = "SELECT name, salary FROM employees WHERE salary BETWEEN 65000 AND 80000"
result = pd.read_sql_query(query, conn)
print(result)

## 7. Sorting and Limiting

In [ ]:
print("\n" + "=" * 50)
print("ORDER BY AND LIMIT")
print("=" * 50)

# ORDER BY ASC (default)
print("\n1. Employees sorted by salary (ascending):")
query = "SELECT name, salary FROM employees ORDER BY salary ASC LIMIT 5"
result = pd.read_sql_query(query, conn)
print(result)

# ORDER BY DESC
print("\n2. Top 3 highest paid employees:")
query = "SELECT name, salary FROM employees ORDER BY salary DESC LIMIT 3"
result = pd.read_sql_query(query, conn)
print(result)

# ORDER BY multiple columns
print("\n3. Sorted by department then salary:")
query = "SELECT name, department, salary FROM employees ORDER BY department ASC, salary DESC"
result = pd.read_sql_query(query, conn)
print(result)

## 8. Data Manipulation - UPDATE and DELETE

In [20]:
print("\n" + "=" * 50)
print("DML - UPDATE AND DELETE")
print("=" * 50)

# UPDATE
print("\n1. Update Bob's salary to 98000:")
cursor.execute("UPDATE employees SET salary = 98000 WHERE name = 'Bob Smith'")
conn.commit()
query = "SELECT name, salary FROM employees WHERE name = 'Bob Smith'"
result = pd.read_sql_query(query, conn)
print(result)

# UPDATE multiple rows
print("\n2. Give 10% raise to Engineering department:")
cursor.execute("UPDATE employees SET salary = salary * 1.1 WHERE department = 'Engineering'")
conn.commit()
query = "SELECT name, department, salary FROM employees WHERE department = 'Engineering'"
result = pd.read_sql_query(query, conn)
print(result)

# DELETE
print("\n3. Delete employee with lowest salary:")
cursor.execute("DELETE FROM employees WHERE salary = (SELECT MIN(salary) FROM employees)")
conn.commit()
print(f"✓ Deleted {cursor.rowcount} employee")


DML - UPDATE AND DELETE

1. Update Bob's salary to 98000:
        name   salary
0  Bob Smith  98000.0

2. Give 10% raise to Engineering department:
           name   department    salary
0     Bob Smith  Engineering  107800.0
1   Carol White  Engineering   99000.0
2  Frank Miller  Engineering  110000.0

3. Delete employee with lowest salary:
✓ Deleted 1 employee


## 9. Summary

In [21]:
print("\n" + "=" * 50)
print("SUMMARY")
print("=" * 50)

summary = """
SQL Fundamentals Covered:

DDL (Data Definition):
✓ CREATE TABLE - Define table structure
✓ Data types (INTEGER, TEXT, REAL, DATE)
✓ Constraints (PRIMARY KEY, NOT NULL, UNIQUE)

DML (Data Manipulation):
✓ INSERT - Add new records
✓ UPDATE - Modify existing records
✓ DELETE - Remove records

DQL (Data Query):
✓ SELECT - Retrieve data
✓ WHERE - Filter conditions
✓ AND, OR, IN, BETWEEN - Complex conditions
✓ ORDER BY - Sort results
✓ LIMIT - Restrict row count

Key Concepts:
✓ Relational databases
✓ Table structure and relationships
✓ CRUD operations
✓ Basic query construction
"""
print(summary)


SUMMARY

SQL Fundamentals Covered:

DDL (Data Definition):
✓ CREATE TABLE - Define table structure
✓ Data types (INTEGER, TEXT, REAL, DATE)
✓ Constraints (PRIMARY KEY, NOT NULL, UNIQUE)

DML (Data Manipulation):
✓ INSERT - Add new records
✓ UPDATE - Modify existing records
✓ DELETE - Remove records

DQL (Data Query):
✓ SELECT - Retrieve data
✓ WHERE - Filter conditions
✓ AND, OR, IN, BETWEEN - Complex conditions
✓ ORDER BY - Sort results
✓ LIMIT - Restrict row count

Key Concepts:
✓ Relational databases
✓ Table structure and relationships
✓ CRUD operations
✓ Basic query construction

