# Database 101: Structure and Elementary Functions in SQL


## 📘 Introduction
This session introduces SQL, the Structured Query Language, which is used to interact with relational databases. Understanding SQL is fundamental for data analysts and researchers to retrieve, filter, and explore datasets efficiently.


## What is SQL?
SQL (Structured Query Language) is a standard language used to interact with relational databases. It allows you to:
- Retrieve data from one or more tables
- Filter, sort, and aggregate data
- Insert, update, or delete records

> SQL is declarative: you tell the database *what* you want, not *how* to get it.
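
To make the "what, not how" point concrete, here is a minimal sketch that finds the same rows twice: once with a hand-written Python loop and once with a SQL query. The three sample rows and the two-column table are hypothetical and exist only for this illustration.

```python
import sqlite3

# Hypothetical sample rows, used only for this illustration.
rows = [("Game A", 5), ("Game B", 3), ("Game C", 4)]

# Imperative style: spell out *how* to find the rows, step by step.
high_rated_loop = []
for game_name, rating in rows:
    if rating >= 4:
        high_rated_loop.append(game_name)

# Declarative style: state *what* is wanted and let the database decide how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (game_name TEXT, rating INTEGER);")
conn.executemany("INSERT INTO reviews VALUES (?, ?);", rows)
high_rated_sql = [
    name
    for (name,) in conn.execute("SELECT game_name FROM reviews WHERE rating >= 4;")
]

print(high_rated_loop)
print(high_rated_sql)
```

Both approaches select the same games; the SQL version leaves the scanning strategy to the database engine.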


## What is a Database?
A **database** is a structured collection of data. A **relational database** organizes data into tables. Each **table** consists of:
- **Rows**: individual records (also called tuples)
- **Columns**: attributes of each record

### Example Table: `reviews`
| review_id | game_name | review_text        | rating | review_date |
|-----------|-----------|--------------------|--------|-------------|
| 1         | Game A    | Great experience!  | 5      | 2023-01-01  |
| 2         | Game B    | Needs improvement  | 3      | 2023-02-10  |


## Basic SQL Syntax: `SELECT`
The `SELECT` statement is used to query data from a table.

```sql
SELECT column1, column2 FROM table_name;
```

### Example:
```sql
SELECT game_name, rating FROM reviews;
```

## Filtering Data: `WHERE`
The `WHERE` clause filters rows based on conditions.

```sql
SELECT * FROM reviews WHERE rating >= 4;
```

You can use operators like:
- `=` (equal to)
- `!=` or `<>` (not equal to)
- `<`, `<=`, `>`, `>=` (less than, less than or equal to, greater than, greater than or equal to)
- `LIKE` (pattern matching)
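
The operators listed above can be tried out with a short sketch like the one below. It assumes an in-memory SQLite database holding the two rows from the example `reviews` table (with fewer columns for brevity); the `run` helper is a hypothetical convenience function, not part of any library.

```python
import sqlite3

# In-memory database with the two rows from the example table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (game_name TEXT, review_text TEXT, rating INTEGER);"
)
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?);",
    [
        ("Game A", "Great experience!", 5),
        ("Game B", "Needs improvement", 3),
    ],
)


def run(sql):
    """Hypothetical helper: execute a query and return all rows as a list."""
    return conn.execute(sql).fetchall()


# Equality and inequality
equal_rows = run("SELECT game_name FROM reviews WHERE rating = 5;")
not_equal_rows = run("SELECT game_name FROM reviews WHERE game_name <> 'Game A';")

# Comparison
below_four_rows = run("SELECT game_name FROM reviews WHERE rating < 4;")

# In LIKE patterns, % matches any run of characters and _ matches exactly one.
like_rows = run(
    "SELECT game_name FROM reviews WHERE review_text LIKE '%experience%';"
)

print(equal_rows, not_equal_rows, below_four_rows, like_rows)
```

Note that in SQLite, `LIKE` is case-insensitive for ASCII letters by default; other database systems may behave differently.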


## Preview Data: `LIMIT`
The `LIMIT` clause restricts the number of rows returned.

```sql
SELECT * FROM reviews LIMIT 5;
```

This is useful for quickly exploring a table without fetching every row.



## Setting Up for Practice
We'll use `sqlite3` and `pandas` in Python to simulate a small SQL database.

```python
import sqlite3
import pandas as pd

# Sample data for a small local database, used to illustrate the SQL statements above
data = [
    (1, 'Game A', 'Amazing experience', 5, '2023-01-01'),
    (2, 'Game B', 'Not bad but buggy', 3, '2023-01-15'),
    (3, 'Game A', 'Loved it!', 4, '2023-02-01'),
    (4, 'Game C', 'Terrible experience', 1, '2023-03-12'),
    (5, 'Game A', 'Pretty good', 4, '2023-04-05')
]

# Open an in-memory SQLite database, create the table, and insert the sample rows
conn = sqlite3.connect(':memory:')
cursor = conn.cursor()
cursor.execute('''
    CREATE TABLE reviews (
        review_id INTEGER,
        game_name TEXT,
        review_text TEXT,
        rating INTEGER,
        review_date TEXT
    );
''')
cursor.executemany('INSERT INTO reviews VALUES (?, ?, ?, ?, ?);', data)
conn.commit()
```
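
After a setup like this, it can help to confirm that the table has the expected columns and rows. The sketch below rebuilds a two-row version of the table so that it runs on its own, then inspects it. `PRAGMA table_info` is specific to SQLite, so this check is an assumption about the practice environment rather than portable SQL.

```python
import sqlite3

# Rebuild a small version of the reviews table so this sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE reviews (
        review_id INTEGER,
        game_name TEXT,
        review_text TEXT,
        rating INTEGER,
        review_date TEXT
    );
""")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?, ?, ?);",
    [
        (1, "Game A", "Amazing experience", 5, "2023-01-01"),
        (2, "Game B", "Not bad but buggy", 3, "2023-01-15"),
    ],
)
conn.commit()

# Each row returned by PRAGMA table_info is
# (cid, name, type, notnull, dflt_value, pk); keep only name and type.
columns = [
    (name, col_type)
    for _, name, col_type, *_ in conn.execute("PRAGMA table_info(reviews);")
]

# Count the stored rows by fetching them all (fine for a tiny table).
row_count = len(conn.execute("SELECT * FROM reviews;").fetchall())

print(columns)
print(row_count)
```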


## 🔍 Query Example in Python
```python
query = """
SELECT game_name, rating FROM reviews WHERE rating >= 4;
"""
pd.read_sql_query(query, conn)
```
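
When the filter value comes from a Python variable, one option is to pass it through a `?` placeholder, the same style the setup code uses for `INSERT`, instead of pasting it into the query string. The sketch below uses plain `sqlite3` with a reduced two-column table; `min_rating` is a hypothetical variable name. pandas' `read_sql_query` also accepts a `params` argument for the same purpose.

```python
import sqlite3

# Reduced two-column version of the reviews table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (game_name TEXT, rating INTEGER);")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?);",
    [("Game A", 5), ("Game B", 3), ("Game A", 4), ("Game C", 1)],
)

min_rating = 4  # hypothetical threshold chosen in Python

# The ? placeholder is filled in by sqlite3 from the tuple of parameters.
query = "SELECT game_name, rating FROM reviews WHERE rating >= ?;"
rows = conn.execute(query, (min_rating,)).fetchall()

print(rows)
```

Binding values this way avoids quoting mistakes and keeps the query text separate from the data.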

## ✏️ Exercises
1. Select all reviews for 'Game A'.
2. Show all review texts with a rating below 4.
3. Retrieve the first 3 reviews.
4. Select `game_name` and `review_date` where the rating is 5.

You can try them below:

```python
# Exercise 1: All reviews for Game A
pd.read_sql_query("""
SELECT * FROM reviews WHERE game_name = 'Game A';
""", conn)

# Exercise 2: Review texts with rating below 4
# (Fill in below)

# Exercise 3: First 3 reviews
# (Fill in below)

# Exercise 4: Game names and review dates with rating 5
# (Fill in below)
```


## Summary
Today you learned:
- What a relational database and table structure look like
- How to write `SELECT`, use `WHERE` conditions, and limit results with `LIMIT`
- How to query a sample SQLite database in Python

Next session: **Aggregation and Sorting with SQL**.

```python
pd.read_sql_query("""
SELECT * FROM reviews;
""", conn)
```

| review_id | game_name | review_text         | rating | review_date |
|-----------|-----------|---------------------|--------|-------------|
| 1         | Game A    | Amazing experience  | 5      | 2023-01-01  |
| 2         | Game B    | Not bad but buggy   | 3      | 2023-01-15  |
| 3         | Game A    | Loved it!           | 4      | 2023-02-01  |
| 4         | Game C    | Terrible experience | 1      | 2023-03-12  |
| 5         | Game A    | Pretty good         | 4      | 2023-04-05  |
