# SQL Query 1 – Validation with DuckDB  

This notebook validates **Query 1**: it reads the SQL statement from
`sql/query_1.txt` and runs it with DuckDB over the original CSVs, which are
first loaded and cleaned with pandas.

### Data cleaning

Two malformed rows, one in each of the files below, contain the literal string
**`#VALUE!`** in a date column. They have to be dropped before that column can
be cast to a proper date type:

| File                       | Bad column      |
|----------------------------|-----------------|
| `date_info.csv`            | `calendar_date` |
| `restaurants_visitors.csv` | `visit_date`    |
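
One way to locate rows like these without hard-coding the `#VALUE!` string is to let pandas attempt the date conversion and flag whatever fails to parse. The snippet below is a minimal sketch on a made-up three-row sample, not the real files; the helper name `split_bad_dates` and the `day_of_week` column are assumptions introduced for illustration only.

```python
import io

import pandas as pd

# Hypothetical sample standing in for a raw CSV; only `calendar_date`
# comes from the notebook, `day_of_week` is an invented extra column.
sample_csv = io.StringIO(
    "calendar_date,day_of_week\n"
    "2016-01-01,Friday\n"
    "#VALUE!,Saturday\n"
    "2016-01-03,Sunday\n"
)
raw = pd.read_csv(sample_csv)


def split_bad_dates(df, column):
    """Return (clean, bad): rows whose `column` parses as a date, and the rest."""
    # errors="coerce" turns unparsable values such as '#VALUE!' into NaT
    parsed = pd.to_datetime(df[column], errors="coerce")
    bad = df[parsed.isna()]
    clean = df[parsed.notna()].copy()
    clean[column] = parsed[parsed.notna()]
    return clean, bad


clean, bad = split_bad_dates(raw, "calendar_date")
print(len(bad), "malformed row(s):", bad["calendar_date"].tolist())
```

Compared with filtering on the exact string, this approach would also catch other unparsable values, at the cost of silently treating genuinely empty dates as malformed. Inspecting `bad` before discarding it keeps that trade-off visible.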

### Business context

The query returns **genre** and **area** in addition to the restaurant ID and
average visitors. These extra columns make the results easier to interpret for
non-technical stakeholders.

In [3]:
import pathlib

import duckdb
import pandas as pd

DATA_PATH  = pathlib.Path('../data/raw/Data set')
QUERY_PATH = pathlib.Path('../sql/query_1.txt')

con = duckdb.connect()

# ------------ Load & clean with pandas ------------
date_info = pd.read_csv(DATA_PATH / 'date_info.csv')
# Drop the malformed '#VALUE!' row in date_info; .copy() avoids a
# SettingWithCopyWarning when the column is reassigned below
date_info = date_info[date_info['calendar_date'] != '#VALUE!'].copy()
date_info['calendar_date'] = pd.to_datetime(date_info['calendar_date'])

visitors = pd.read_csv(DATA_PATH / 'restaurants_visitors.csv')

# Drop the malformed '#VALUE!' row in visitors; .copy() avoids a
# SettingWithCopyWarning when the column is reassigned below
visitors = visitors[visitors['visit_date'] != '#VALUE!'].copy()
visitors['visit_date'] = pd.to_datetime(visitors['visit_date'])

store_info = pd.read_csv(DATA_PATH / 'store_info.csv')

# ---------- Register clean DataFrames ----------
con.register('date_info', date_info)
con.register('restaurants_visitors', visitors)
con.register('store_info', store_info)

# ---------- Execute SQL from file ----------
sql_query = QUERY_PATH.read_text()
result = con.execute(sql_query).df()
result


|   | restaurant_id    | genre_name     | area_name                                    | avg_holiday_visitors |
|---|------------------|----------------|----------------------------------------------|----------------------|
| 0 | db80363d35f10926 | Dining bar     | Hokkaidō Asahikawa-shi 6 Jōdōri              | 7.275                |
| 1 | bb09595bab7d5cfb | Izakaya        | Niigata-ken Niigata-shi Teraohigashi         | 5.833333             |
| 2 | e053c561f32acc28 | Izakaya        | Hokkaidō Asahikawa-shi 6 Jōdōri              | 5.24                 |
| 3 | 24b9b2a020826ede | Japanese food  | Fukuoka-ken Kitakyūshū-shi Ōtemachi          | 4.333333             |
| 4 | 42c9aa6d617c5057 | Italian/French | Hyōgo-ken Kakogawa-shi Kakogawachō Kitazaike | 4.228571             |
