# 1.2 Creating Tables and Data Types

This section covers table creation in SQL, including data types, constraints, and best practices for database schema design.

## Learning Objectives
By the end of this section, you will be able to:
- Create tables with appropriate data types
- Apply constraints to ensure data integrity
- Understand primary keys and foreign keys
- Design basic database schemas

## Prerequisites
- Completed section 1.1 (Database Setup)
- Understanding of basic data types
- Familiarity with database concepts

## Database Connection

Let's ensure we have an active database connection for this section.

In [1]:
import sqlite3
import pandas as pd
from IPython.display import display

# Connect to our database
conn = sqlite3.connect('my_database.db')
cursor = conn.cursor()

print("✅ Database connection established!")

✅ Database connection established!


## Understanding SQL Data Types

### Common SQLite Data Types:
- **INTEGER** - Whole numbers (1, 2, 100, -5)
- **TEXT** - String data ('John', 'Hello World')
- **REAL** - Floating point numbers (3.14, 2.5)
- **BLOB** - Binary data (images, files)
- **NULL** - Missing or empty values

### Data Type Mapping:
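SQLite uses dynamic typing: a column's declared type sets a *type affinity* (a preferred storage class) rather than a strict rule. Names such as `VARCHAR(100)`, `DECIMAL(10, 2)`, `DATE`, and `BOOLEAN` are accepted, but values are still stored using the five classes above. Declaring types is still worthwhile for clarity: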

In [2]:
# Demonstrate data type examples
print("📊 SQLite Data Type Examples:")
print("=" * 40)

examples = {
    'INTEGER': [1, 42, -10, 999999],
    'TEXT': ['John', 'Hello World', 'SQL@email.com'],
    'REAL': [3.14, 2.5, -1.23, 100.0],
    'BOOLEAN': ['TRUE', 'FALSE', '1', '0']  # no native BOOLEAN type; SQLite stores true/false as INTEGER 1/0
}

for data_type, values in examples.items():
    print(f"{data_type:10}: {values}")

📊 SQLite Data Type Examples:
INTEGER   : [1, 42, -10, 999999]
TEXT      : ['John', 'Hello World', 'SQL@email.com']
REAL      : [3.14, 2.5, -1.23, 100.0]
BOOLEAN   : ['TRUE', 'FALSE', '1', '0']
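Because a declared type only sets an affinity, it can help to look at what SQLite actually stores. The following is a minimal sketch using a throwaway in-memory database; the table `affinity_demo` and its columns are hypothetical names chosen for illustration and are not part of this section's schema.

```python
import sqlite3

# Throwaway in-memory database, so my_database.db is left untouched
demo = sqlite3.connect(":memory:")
demo.execute("""
CREATE TABLE affinity_demo (
    name   VARCHAR(5),       -- TEXT affinity; the length limit is not enforced
    amount DECIMAL(10, 2),   -- NUMERIC affinity
    flag   BOOLEAN,          -- NUMERIC affinity
    day    DATE              -- NUMERIC affinity
)
""")
demo.execute(
    "INSERT INTO affinity_demo VALUES (?, ?, ?, ?)",
    ("a much longer name", 19.99, True, "2024-01-31"),
)

# typeof() reports the storage class actually used for each stored value
stored = demo.execute(
    "SELECT typeof(name), typeof(amount), typeof(flag), typeof(day), name "
    "FROM affinity_demo"
).fetchone()
print(stored)
demo.close()
```

In this sketch the 18-character name is kept in full despite `VARCHAR(5)`, the Python `True` is stored as an integer, and the date remains a plain text value.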


## Creating Your First Table

Let's create a simple `departments` table with proper data types and constraints.

In [3]:
# Create departments table with detailed data types
cursor.execute('''
DROP TABLE IF EXISTS departments
''')

cursor.execute('''
CREATE TABLE departments (
    dept_id INTEGER PRIMARY KEY AUTOINCREMENT,
    dept_name VARCHAR(100) NOT NULL,
    location VARCHAR(100),
    budget DECIMAL(10, 2) DEFAULT 0,
    created_date DATE DEFAULT CURRENT_DATE,
    is_active BOOLEAN DEFAULT 1
)
''')

print("✅ Departments table created successfully!")

# Show table structure
cursor.execute("PRAGMA table_info(departments)")
columns = cursor.fetchall()

print("\n📋 Table Structure:")
print("Column Name       | Data Type    | Not Null | Default | Primary Key")
print("-" * 65)
for col in columns:
    pk = "YES" if col[5] else "NO"
    not_null = "YES" if col[3] else "NO"
    default = col[4] if col[4] else "None"
    print(f"{col[1]:17} | {col[2]:12} | {not_null:8} | {default:7} | {pk}")

✅ Departments table created successfully!

📋 Table Structure:
Column Name       | Data Type    | Not Null | Default | Primary Key
-----------------------------------------------------------------
dept_id           | INTEGER      | NO       | None    | YES
dept_name         | VARCHAR(100) | YES      | None    | NO
location          | VARCHAR(100) | NO       | None    | NO
budget            | DECIMAL(10, 2) | NO       | 0       | NO
created_date      | DATE         | NO       | CURRENT_DATE | NO
is_active         | BOOLEAN      | NO       | 1       | NO


## Understanding Constraints

### Primary Key
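- Uniquely identifies each row
- Should never be NULL (SQLite guarantees this for `INTEGER PRIMARY KEY` columns; for other primary key types, add `NOT NULL` explicitly)
- Each table can have at most one primary key, which may span several columns

### Foreign Key
- References the primary key (or a unique column) in another table
- Maintains referential integrity
- Creates relationships between tables
- Is only enforced by SQLite after `PRAGMA foreign_keys = ON` has been run on the connection; enforcement is off by default in most builds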

### Other Constraints
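- **NOT NULL**: Column cannot contain NULL (an empty string is still allowed)
- **UNIQUE**: Values must be unique across rows
- **DEFAULT**: Provides a default value if none is specified
- **CHECK**: Values must satisfy a condition, such as `salary > 0`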
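The constraints above can be watched in action with a short experiment. This is a minimal sketch on an in-memory database; the `staff` table and the sample values are hypothetical and exist only for this demonstration. In Python's `sqlite3` module, a violated constraint surfaces as `sqlite3.IntegrityError`.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute("""
CREATE TABLE staff (
    staff_id INTEGER PRIMARY KEY,
    email TEXT UNIQUE NOT NULL,
    salary REAL CHECK (salary > 0)
)
""")
demo.execute("INSERT INTO staff (email, salary) VALUES (?, ?)", ("a@example.com", 100.0))

# One deliberately bad row per constraint
bad_rows = {
    "NOT NULL": (None, 100.0),             # missing email
    "UNIQUE": ("a@example.com", 200.0),    # email already present
    "CHECK": ("b@example.com", -5.0),      # salary is not greater than 0
}

rejected = []
for label, row in bad_rows.items():
    try:
        demo.execute("INSERT INTO staff (email, salary) VALUES (?, ?)", row)
    except sqlite3.IntegrityError as exc:
        rejected.append(label)
        print(f"{label} violation rejected: {exc}")

count = demo.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
print(f"Rows stored: {count}")
demo.close()
```

Only the first, valid row remains in the table. One subtlety worth knowing: a `CHECK` condition passes when it evaluates to NULL, so a NULL salary would be accepted here unless the column is also declared `NOT NULL`.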

In [4]:
# Create employees table with foreign key constraint
cursor.execute('''
DROP TABLE IF EXISTS employees
''')

cursor.execute('''
CREATE TABLE employees (
    emp_id INTEGER PRIMARY KEY AUTOINCREMENT,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE NOT NULL,
    hire_date DATE NOT NULL,
    salary DECIMAL(10, 2) CHECK (salary > 0),
    dept_id INTEGER,
    FOREIGN KEY (dept_id) REFERENCES departments (dept_id)
)
''')

print("✅ Employees table created with constraints!")

# Show the table structure
cursor.execute("PRAGMA table_info(employees)")
emp_columns = cursor.fetchall()

print("\n📋 Employees Table Structure:")
print("Column Name       | Data Type    | Not Null | Default | Primary Key")
print("-" * 65)
for col in emp_columns:
    pk = "YES" if col[5] else "NO"
    not_null = "YES" if col[3] else "NO"
    default = col[4] if col[4] else "None"
    print(f"{col[1]:17} | {col[2]:12} | {not_null:8} | {default:7} | {pk}")

✅ Employees table created with constraints!

📋 Employees Table Structure:
Column Name       | Data Type    | Not Null | Default | Primary Key
-----------------------------------------------------------------
emp_id            | INTEGER      | NO       | None    | YES
first_name        | VARCHAR(50)  | YES      | None    | NO
last_name         | VARCHAR(50)  | YES      | None    | NO
email             | VARCHAR(100) | YES      | None    | NO
hire_date         | DATE         | YES      | None    | NO
salary            | DECIMAL(10, 2) | NO       | None    | NO
dept_id           | INTEGER      | NO       | None    | NO


## Creating a Projects Table

Let's create a more complex table that combines column defaults, column-level `CHECK` constraints, a foreign key, and a table-level `CHECK` that compares two columns.

In [5]:
# Create projects table with comprehensive constraints
cursor.execute('''
DROP TABLE IF EXISTS projects
''')

cursor.execute('''
CREATE TABLE projects (
    project_id INTEGER PRIMARY KEY AUTOINCREMENT,
    project_name VARCHAR(100) NOT NULL UNIQUE,
    description TEXT,
    start_date DATE NOT NULL,
    end_date DATE,
    budget DECIMAL(12, 2) NOT NULL DEFAULT 0,
    status VARCHAR(20) DEFAULT 'Planning',
    priority INTEGER CHECK (priority BETWEEN 1 AND 5) DEFAULT 3,
    dept_id INTEGER NOT NULL,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (dept_id) REFERENCES departments (dept_id),
    CHECK (end_date IS NULL OR end_date >= start_date)
)
''')

print("✅ Projects table created with advanced constraints!")

# Commit all table creations
conn.commit()
print("💾 All tables committed to database!")

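# Show all tables in database (sqlite_sequence is an internal table created by
# AUTOINCREMENT; tables from earlier runs of this notebook may also be listed)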
cursor.execute("SELECT name FROM sqlite_master WHERE type='table'")
tables = cursor.fetchall()

print(f"\n📚 Database now contains {len(tables)} tables:")
for table in tables:
    print(f"  • {table[0]}")

✅ Projects table created with advanced constraints!
💾 All tables committed to database!

📚 Database now contains 5 tables:
  • sqlite_sequence
  • customers
  • departments
  • employees
  • projects
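The table-level `CHECK (end_date IS NULL OR end_date >= start_date)` works because SQLite stores these dates as text and ISO 8601 strings (`YYYY-MM-DD`) sort in chronological order. The sketch below tries a few date pairs against the same rule; the `schedule` table is a hypothetical stand-in for `projects`.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute("""
CREATE TABLE schedule (
    start_date DATE NOT NULL,
    end_date DATE,
    CHECK (end_date IS NULL OR end_date >= start_date)
)
""")

candidates = [
    ("2023-01-01", "2023-06-30"),   # valid range
    ("2023-01-01", None),           # open-ended, allowed by the IS NULL branch
    ("2023-06-30", "2023-01-01"),   # ends before it starts
    ("12/31/2022", "01/15/2023"),   # chronologically valid, but not ISO 8601
]

results = []
for start, end in candidates:
    try:
        demo.execute("INSERT INTO schedule VALUES (?, ?)", (start, end))
        results.append("accepted")
    except sqlite3.IntegrityError:
        results.append("rejected")

for pair, outcome in zip(candidates, results):
    print(pair, "->", outcome)
demo.close()
```

The last pair is rejected even though January 2023 follows December 2022, because the comparison is made on the strings themselves. Sticking to `YYYY-MM-DD` keeps string order and date order in agreement.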


## Inserting Sample Data

Now let's populate our tables with some sample data to test our schema.

In [6]:
# Insert sample departments
departments_data = [
    ('Human Resources', 'New York', 500000, '2020-01-01', 1),
    ('Engineering', 'San Francisco', 2000000, '2019-06-15', 1),
    ('Marketing', 'Chicago', 750000, '2020-03-01', 1),
    ('Sales', 'Los Angeles', 1200000, '2019-12-01', 1),
    ('Finance', 'Boston', 600000, '2020-02-15', 1)
]

cursor.executemany('''
INSERT INTO departments (dept_name, location, budget, created_date, is_active) 
VALUES (?, ?, ?, ?, ?)
''', departments_data)

print("✅ Sample departments inserted!")

# Insert sample employees
employees_data = [
    ('John', 'Doe', 'john.doe@company.com', '2020-01-15', 75000, 2),
    ('Jane', 'Smith', 'jane.smith@company.com', '2019-03-22', 82000, 2),
    ('Mike', 'Johnson', 'mike.johnson@company.com', '2021-06-10', 65000, 3),
    ('Sarah', 'Williams', 'sarah.williams@company.com', '2020-09-05', 90000, 1),
    ('David', 'Brown', 'david.brown@company.com', '2018-11-12', 95000, 4)
]

cursor.executemany('''
INSERT INTO employees (first_name, last_name, email, hire_date, salary, dept_id) 
VALUES (?, ?, ?, ?, ?, ?)
''', employees_data)

print("✅ Sample employees inserted!")

# Insert sample projects
projects_data = [
    ('Website Redesign', 'Complete overhaul of company website', '2023-01-01', '2023-06-30', 150000, 'In Progress', 4, 2),
    ('Mobile App Development', 'Native mobile app for iOS and Android', '2023-03-15', '2023-12-31', 300000, 'Planning', 5, 2),
    ('Marketing Campaign Q2', 'Summer marketing campaign', '2023-04-01', '2023-06-30', 75000, 'Active', 3, 3),
    ('Sales Training Program', 'Comprehensive sales training', '2023-02-01', '2023-05-31', 50000, 'Completed', 2, 4),
    ('Financial System Upgrade', 'ERP system modernization', '2023-01-15', '2023-08-31', 200000, 'In Progress', 5, 5)
]

cursor.executemany('''
INSERT INTO projects (project_name, description, start_date, end_date, budget, status, priority, dept_id) 
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
''', projects_data)

print("✅ Sample projects inserted!")

# Commit all data
conn.commit()
print("💾 All sample data committed!")

✅ Sample departments inserted!
✅ Sample employees inserted!
✅ Sample projects inserted!
💾 All sample data committed!
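When a batch insert fails partway through, it is usually preferable that none of its rows are kept. One way to get that behavior, sketched below under the assumption of an in-memory database with a simplified `departments` table, is to use the connection as a context manager: it commits if the block succeeds and rolls back if an exception is raised.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL UNIQUE)"
)

# The third row duplicates the first, so it violates the UNIQUE constraint
rows = [("Engineering",), ("Sales",), ("Engineering",)]

try:
    with demo:  # commit on success, roll back on any exception
        demo.executemany("INSERT INTO departments (dept_name) VALUES (?)", rows)
except sqlite3.IntegrityError as exc:
    print(f"Batch rejected: {exc}")

count = demo.execute("SELECT COUNT(*) FROM departments").fetchone()[0]
print(f"Rows stored after rollback: {count}")
demo.close()
```

The two valid rows inserted before the failure are discarded along with the bad one, which leaves the table in its previous state and makes the batch safe to retry after the data is corrected.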


## Verifying Our Schema

Let's verify that our tables and data were created correctly.

In [7]:
# Check data in each table
print("🔍 Data Verification:")
print("=" * 50)

# Departments
print("\n📊 DEPARTMENTS:")
df = pd.read_sql_query("SELECT * FROM departments", conn)
display(df)

print("\n👥 EMPLOYEES:")
df = pd.read_sql_query("SELECT * FROM employees", conn)
display(df)

print("\n🚀 PROJECTS:")
df = pd.read_sql_query("SELECT * FROM projects", conn)
display(df)

# Show record counts
print("\n📈 Record Counts:")
tables = ['departments', 'employees', 'projects']
for table in tables:
    # Table names cannot be bound as ? parameters; these come from the fixed list above
    cursor.execute(f"SELECT COUNT(*) FROM {table}")
    count = cursor.fetchone()[0]
    print(f"  {table:12}: {count:3} records")

🔍 Data Verification:

📊 DEPARTMENTS:


|   | dept_id | dept_name | location | budget | created_date | is_active |
|---|---|---|---|---|---|---|
| 0 | 1 | Human Resources | New York | 500000 | 2020-01-01 | 1 |
| 1 | 2 | Engineering | San Francisco | 2000000 | 2019-06-15 | 1 |
| 2 | 3 | Marketing | Chicago | 750000 | 2020-03-01 | 1 |
| 3 | 4 | Sales | Los Angeles | 1200000 | 2019-12-01 | 1 |
| 4 | 5 | Finance | Boston | 600000 | 2020-02-15 | 1 |



👥 EMPLOYEES:


|   | emp_id | first_name | last_name | email | hire_date | salary | dept_id |
|---|---|---|---|---|---|---|---|
| 0 | 1 | John | Doe | john.doe@company.com | 2020-01-15 | 75000 | 2 |
| 1 | 2 | Jane | Smith | jane.smith@company.com | 2019-03-22 | 82000 | 2 |
| 2 | 3 | Mike | Johnson | mike.johnson@company.com | 2021-06-10 | 65000 | 3 |
| 3 | 4 | Sarah | Williams | sarah.williams@company.com | 2020-09-05 | 90000 | 1 |
| 4 | 5 | David | Brown | david.brown@company.com | 2018-11-12 | 95000 | 4 |



🚀 PROJECTS:


|   | project_id | project_name | description | start_date | end_date | budget | status | priority | dept_id | created_at | updated_at |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Website Redesign | Complete overhaul of company website | 2023-01-01 | 2023-06-30 | 150000 | In Progress | 4 | 2 | 2025-07-04 07:48:20 | 2025-07-04 07:48:20 |
| 1 | 2 | Mobile App Development | Native mobile app for iOS and Android | 2023-03-15 | 2023-12-31 | 300000 | Planning | 5 | 2 | 2025-07-04 07:48:20 | 2025-07-04 07:48:20 |
| 2 | 3 | Marketing Campaign Q2 | Summer marketing campaign | 2023-04-01 | 2023-06-30 | 75000 | Active | 3 | 3 | 2025-07-04 07:48:20 | 2025-07-04 07:48:20 |
| 3 | 4 | Sales Training Program | Comprehensive sales training | 2023-02-01 | 2023-05-31 | 50000 | Completed | 2 | 4 | 2025-07-04 07:48:20 | 2025-07-04 07:48:20 |
| 4 | 5 | Financial System Upgrade | ERP system modernization | 2023-01-15 | 2023-08-31 | 200000 | In Progress | 5 | 5 | 2025-07-04 07:48:20 | 2025-07-04 07:48:20 |



📈 Record Counts:
  departments :   5 records
  employees   :   5 records
  projects    :   5 records
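Row counts confirm that data arrived, but they do not show whether foreign key references actually resolve. SQLite offers two pragmas for that: `PRAGMA foreign_key_list(table)` describes the declared foreign keys (which `PRAGMA table_info` omits), and `PRAGMA foreign_key_check` reports rows whose references point nowhere. The sketch below uses a scratch in-memory database with simplified tables and a deliberately orphaned row; none of it touches the tutorial database.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
# Enforcement is switched off so an orphan row can be created on purpose
demo.execute("PRAGMA foreign_keys = OFF")
demo.execute(
    "CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)"
)
demo.execute("""
CREATE TABLE employees (
    emp_id INTEGER PRIMARY KEY,
    dept_id INTEGER,
    FOREIGN KEY (dept_id) REFERENCES departments (dept_id)
)
""")
demo.execute("INSERT INTO departments (dept_name) VALUES ('Engineering')")
demo.execute("INSERT INTO employees (dept_id) VALUES (1)")    # valid reference
demo.execute("INSERT INTO employees (dept_id) VALUES (42)")   # orphan: no department 42

# Each row: (id, seq, parent table, child column, parent column, on_update, on_delete, match)
fk_list = demo.execute("PRAGMA foreign_key_list(employees)").fetchall()
for fk in fk_list:
    print(f"employees.{fk[3]} references {fk[2]}.{fk[4]}")

# Each row: (child table, rowid of the offending row, parent table, foreign key id)
violations = demo.execute("PRAGMA foreign_key_check").fetchall()
for table, rowid, parent, _ in violations:
    print(f"Orphan in {table} (rowid {rowid}): no matching row in {parent}")
demo.close()
```

Running `PRAGMA foreign_key_check` against `my_database.db` after loading the sample data should return no rows if every `dept_id` refers to an existing department.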


## Practice Exercises

Practice creating tables and understanding constraints:

1. **Create a new table**: Design a `customers` table with appropriate constraints
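2. **Add constraints**: Modify an existing table to add new constraints (in SQLite this means rebuilding the table, because `ALTER TABLE` cannot add a constraint to an existing column)
3. **Foreign key relationships**: Create a table that references multiple other tables

The cell below works through the first exercise and then loads sample customers (its output labels that step "Exercise 2"); the remaining exercises are left for you to try.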

In [8]:
# Exercise 1: Create a customers table
print("Exercise 1: Creating customers table")

cursor.execute('''
DROP TABLE IF EXISTS customers
''')

# Your turn - create a customers table with these columns:
# - customer_id (primary key, auto-increment)
# - company_name (text, not null)
# - contact_name (text, not null)  
# - email (text, unique)
# - phone (text)
# - address (text)
# - city (text)
# - country (text, default 'USA')
# - credit_limit (decimal, check > 0)

cursor.execute('''
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY AUTOINCREMENT,
    company_name VARCHAR(100) NOT NULL,
    contact_name VARCHAR(100) NOT NULL,
    email VARCHAR(100) UNIQUE,
    phone VARCHAR(20),
    address TEXT,
    city VARCHAR(50),
    country VARCHAR(50) DEFAULT 'USA',
    credit_limit DECIMAL(10, 2) CHECK (credit_limit > 0)
)
''')

print("✅ Customers table created!")

# Verify the structure
cursor.execute("PRAGMA table_info(customers)")
customer_columns = cursor.fetchall()
print("\nCustomers table structure:")
for col in customer_columns:
    print(f"  {col[1]} ({col[2]}) - Not Null: {bool(col[3])}")

print("\n" + "="*30 + "\n")

# Exercise 2: Insert sample customer data
print("Exercise 2: Adding sample customers")

customers_data = [
    ('Tech Solutions Inc', 'Alice Johnson', 'alice@techsolutions.com', '555-0101', '123 Tech St', 'San Francisco', 'USA', 50000),
    ('Global Marketing Ltd', 'Bob Smith', 'bob@globalmarketing.com', '555-0102', '456 Market Ave', 'New York', 'USA', 75000),
    ('Innovation Corp', 'Carol Davis', 'carol@innovation.com', '555-0103', '789 Innovation Blvd', 'Austin', 'USA', 100000)
]

cursor.executemany('''
INSERT INTO customers (company_name, contact_name, email, phone, address, city, country, credit_limit)
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
''', customers_data)

conn.commit()
print("✅ Sample customers added!")

# Display the results
df = pd.read_sql_query("SELECT * FROM customers", conn)
display(df)

Exercise 1: Creating customers table
✅ Customers table created!

Customers table structure:
  customer_id (INTEGER) - Not Null: False
  company_name (VARCHAR(100)) - Not Null: True
  contact_name (VARCHAR(100)) - Not Null: True
  email (VARCHAR(100)) - Not Null: False
  phone (VARCHAR(20)) - Not Null: False
  address (TEXT) - Not Null: False
  city (VARCHAR(50)) - Not Null: False
  country (VARCHAR(50)) - Not Null: False
  credit_limit (DECIMAL(10, 2)) - Not Null: False


Exercise 2: Adding sample customers
✅ Sample customers added!


|   | customer_id | company_name | contact_name | email | phone | address | city | country | credit_limit |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Tech Solutions Inc | Alice Johnson | alice@techsolutions.com | 555-0101 | 123 Tech St | San Francisco | USA | 50000 |
| 1 | 2 | Global Marketing Ltd | Bob Smith | bob@globalmarketing.com | 555-0102 | 456 Market Ave | New York | USA | 75000 |
| 2 | 3 | Innovation Corp | Carol Davis | carol@innovation.com | 555-0103 | 789 Innovation Blvd | Austin | USA | 100000 |
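For the exercise on adding constraints to an existing table, one common approach in SQLite is to build a replacement table, copy the rows across, and swap the names. The following is a sketch of that pattern on a scratch in-memory database with a trimmed-down `customers` table; the choice of a `UNIQUE` constraint on `phone`, the `customers_new` name, and the `Copycat LLC` row are assumptions made for illustration.

```python
import sqlite3

demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, "
    "company_name TEXT NOT NULL, phone TEXT)"
)
demo.executemany(
    "INSERT INTO customers (company_name, phone) VALUES (?, ?)",
    [("Tech Solutions Inc", "555-0101"), ("Global Marketing Ltd", "555-0102")],
)

# 1. Create a replacement table that carries the new UNIQUE constraint
demo.execute("""
CREATE TABLE customers_new (
    customer_id INTEGER PRIMARY KEY,
    company_name TEXT NOT NULL,
    phone TEXT UNIQUE
)
""")
# 2. Copy the rows, 3. drop the old table, 4. rename the replacement
demo.execute(
    "INSERT INTO customers_new SELECT customer_id, company_name, phone FROM customers"
)
demo.execute("DROP TABLE customers")
demo.execute("ALTER TABLE customers_new RENAME TO customers")
demo.commit()

kept = demo.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

# The rebuilt table now refuses a duplicate phone number
try:
    demo.execute(
        "INSERT INTO customers (company_name, phone) VALUES (?, ?)",
        ("Copycat LLC", "555-0101"),
    )
    duplicate_phone = "accepted"
except sqlite3.IntegrityError:
    duplicate_phone = "rejected"

print(f"Rows kept: {kept}, duplicate phone: {duplicate_phone}")
demo.close()
```

The copy step fails if existing rows already violate the new constraint, so it is worth checking the data first. Indexes, triggers, and views tied to the old table would also need to be recreated in a real schema.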


## Section Summary

In this section, you mastered:

✅ **Table Creation**: Using `CREATE TABLE` with proper syntax  
✅ **Data Types**: INTEGER, TEXT, and REAL, plus declared types such as DATE, DECIMAL, and BOOLEAN that SQLite maps to a type affinity  
✅ **Constraints**: PRIMARY KEY, FOREIGN KEY, NOT NULL, UNIQUE, CHECK  
✅ **Schema Design**: Planning table relationships and data integrity  
✅ **Sample Data**: Inserting test data to verify your schema  

### Key SQL Commands:
- `CREATE TABLE` - Define new tables
- `DROP TABLE IF EXISTS` - Remove a table and all of its data, without raising an error if the table does not exist
- `PRAGMA table_info()` - View table structure
- `INSERT INTO` - Add data to tables
- `CHECK` constraints - Validate data ranges and conditions

### Best Practices Learned:
- Declare a data type that matches the values each column will hold
- Include NOT NULL for required fields
- Use CHECK constraints for data validation
- Plan foreign key relationships carefully, and remember that SQLite enforces them only when `PRAGMA foreign_keys = ON` is set
- Include default values where appropriate

### Next Steps
In section 1.3, you'll learn how to query this data using SELECT statements and retrieve information from your tables.