# Basic SQL Queries with DuckDB

This notebook covers fundamental SQL queries:
- Selecting all columns (SELECT *)
- Selecting specific columns
- Column aliases
- Basic filtering with WHERE
- Using comparison operators
- Combining conditions with AND/OR

In [None]:
import duckdb
import pandas as pd

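# Connect to the movies database (DuckDB creates the file if it does not exist yet,
# so a successful connection does not guarantee that the tables are present)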
conn = duckdb.connect('movies.db')
print("Connected to database successfully!")

## SELECT All Columns

The `SELECT *` statement retrieves all columns from a table:
```sql
SELECT *
FROM table_name;
```

Example: Get all information about movies

In [None]:
query = """
SELECT *
FROM movies;
"""
result = conn.execute(query).df()
display(result.head())

🏋️ Challenge: Get all information from the actors table

In [None]:
# Your code here

## SELECT Specific Columns

To select specific columns, list them after SELECT:
```sql
SELECT column1, column2
FROM table_name;
```

Example: Get movie titles and release years

In [None]:
query = """
SELECT title, release_year
FROM movies;
"""
result = conn.execute(query).df()
display(result.head())

🏋️ Challenge: Get only the name and birth year from the actors table

In [None]:
# Your code here

## Column Aliases

Use AS to give columns different names in the output. Wrap an alias in double quotes when it contains spaces:
```sql
SELECT column1 AS "New Name"
FROM table_name;
```

Example: Rename columns for clearer output

In [None]:
query = """
SELECT 
    title AS "Movie Title",
    release_year AS "Year Released"
FROM movies;
"""
result = conn.execute(query).df()
display(result.head())

🏋️ Challenge: Get the name column from actors as "Actor Name" and birth_year as "Year Born"

In [None]:
# Your code here

## WHERE Clause

The WHERE clause filters rows based on conditions:
```sql
SELECT column1, column2
FROM table_name
WHERE condition;
```

Example: Find movies released after 2020

In [None]:
query = """
SELECT title, release_year
FROM movies
WHERE release_year > 2020;
"""
result = conn.execute(query).df()
display(result)

🏋️ Challenge: Find all actors born after 1990

In [None]:
# Your code here

## Comparison Operators

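SQL supports six comparison operators: `=` (equal to), `<>` (not equal to), `>` (greater than), `<` (less than), `>=` (greater than or equal to), and `<=` (less than or equal to).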


Example: Find movies from exactly 2023

In [None]:
query = """
SELECT title, release_year
FROM movies
WHERE release_year = 2023;
"""
result = conn.execute(query).df()
display(result)

🏋️ Challenge: Find all movies released before 1950

In [None]:
# Your code here

## AND/OR Conditions

Combine conditions using AND/OR:
```sql
SELECT column1, column2
FROM table_name
WHERE condition1 AND condition2;  -- or: condition1 OR condition2
```

Example: Find movies released in 2020 or later with "Love" in the title. The LIKE operator matches a pattern, and % stands for any sequence of characters.

In [None]:
query = """
SELECT title, release_year
FROM movies
WHERE release_year >= 2020 
AND title LIKE '%Love%';
"""
result = conn.execute(query).df()
display(result)

🏋️ Challenge: Find all movies released either before 1950 OR after 2020

In [None]:
# Your code here

## Solutions

Here are solutions to the challenges:

### Challenge 1: All Actor Information

In [None]:
query = """
SELECT *
FROM actors;
"""
result = conn.execute(query).df()
display(result)

### Challenge 2: Actor Names and Birth Years

In [None]:
query = """
SELECT name, birth_year
FROM actors;
"""
result = conn.execute(query).df()
display(result)

### Challenge 3: Actor Column Aliases

In [None]:
query = """
SELECT 
    name AS "Actor Name",
    birth_year AS "Year Born"
FROM actors;
"""
result = conn.execute(query).df()
display(result)

### Challenge 4: Actors Born After 1990

In [None]:
query = """
SELECT name, birth_year
FROM actors
WHERE birth_year > 1990;
"""
result = conn.execute(query).df()
display(result)

### Challenge 5: Old Movies

In [None]:
query = """
SELECT title, release_year
FROM movies
WHERE release_year < 1950;
"""
result = conn.execute(query).df()
display(result)

### Challenge 6: Very Old or Very New Movies

In [None]:
query = """
SELECT title, release_year
FROM movies
WHERE release_year < 1950 
OR release_year > 2020;
"""
result = conn.execute(query).df()
display(result)

In [None]:
conn.close()
print("Database connection closed.")

## Key Points to Remember
- SELECT * returns all columns; it is handy for exploring a table but should be used sparingly
- In other queries, specify only the columns you actually need
- Use column aliases to make output more readable
- The WHERE clause filters rows based on conditions
- You can combine conditions using AND/OR
- Always close your database connection when finished