# Data Validation Script  

## Overview  
This script **validates the values in a restaurant dataset** to ensure data integrity and correctness.  
It checks for:  
- **Valid customer counts** (must be greater than zero)  
- **Valid price values** (must be greater than zero)  
- **Correct drink values** (must match one of the predefined options: "Yes" or "No")  
- **Presence of required fields**  
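
The required-fields check listed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `check_required_fields`, the `REQUIRED_COLUMNS` list, and the sample rows are hypothetical and not part of the original script, and the column names are assumed from the dataset's header.

```python
import pandas as pd

# Assumed set of required columns, based on the dataset's header.
REQUIRED_COLUMNS = ["Day", "Time", "Customers", "Price", "Menu", "Drink"]


def check_required_fields(df, required=REQUIRED_COLUMNS):
    """Return messages for required columns that are absent or contain missing values."""
    errors = []
    for column in required:
        if column not in df.columns:
            errors.append(f"Missing required column: {column}")
        elif df[column].isna().any():
            errors.append(f"{column} column contains missing values")
    return errors


# Made-up rows for illustration only; not taken from the real dataset.
sample = pd.DataFrame({
    "Day": ["Fri", "Sat"],
    "Time": ["Dinner", "Lunch"],
    "Customers": [4, 2],
    "Price": [200000, None],  # one missing price
    "Drink": ["Yes", "No"],   # the Menu column is deliberately left out
})
field_errors = check_required_fields(sample)
print("\n".join(field_errors))
```

On the sample frame this reports the missing `Price` value and the absent `Menu` column. Returning a list of messages keeps the sketch compatible with checks that collect errors in the same way.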

## Purpose  
- Automatically detect data entry errors  
- Ensure that dataset values meet expected business rules  
- Support a structured data validation process for quality assurance  

## How to Run  
Run the following command in the terminal:  
```bash
python validate_data.py
```

```python
import pandas as pd

# Path to the restaurant dataset (local to the author's machine).
data = "C:/Users/jaeki/OneDrive/바탕 화면/데이터/데이터 분석/data/Restaurant.csv"

df = pd.read_csv(data)
df
```

| Unnamed: 0.1 | Unnamed: 0 | Day | Time | Customers | Price | Menu | Drink |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | Fri | Dinner | 4 | 200000 | Lamb steak, Lamb steak, Mushroom confit, Lamb ... | Yes |
| 1 | 1 | Fri | Lunch | 4 | 110000 | Octopus confit, Mushroom confit, Mushroom conf... | No |
| 2 | 2 | Sun | Dinner | 3 | 112000 | Mushroom confit, Lamb steak, Lamb steak, Octop... | No |
| 3 | 3 | Fri | Lunch | 4 | 122000 | gnocchi, Lamb steak, Lamb steak, Truffle pasta | No |
| 4 | 4 | Wed | Dinner | 4 | 92000 | gnocchi, Mushroom confit, Truffle pasta, Truff... | No |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 233 | 233 | Sun | Lunch | 2 | 174000 | gnocchi, gnocchi, Lamb steak | Yes |
| 234 | 234 | Fri | Lunch | 1 | 128000 | Truffle pasta | Yes |
| 235 | 235 | Sat | Lunch | 4 | 160000 | Mushroom confit, Truffle pasta, Lamb steak, La... | No |
| 236 | 236 | Tue | Dinner | 4 | 100000 | Octopus confit, Truffle pasta, gnocchi, Mushro... | No |


Define a function that checks this DataFrame for errors.
It validates the `Customers`, `Price`, and `Drink` columns.

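```python
def validate_data(df):
    """Return a list of validation error messages, or a success message if none are found."""
    errors = []

    # Customer counts must be positive.
    if (df['Customers'] <= 0).any():
        errors.append("Customers column contains values <= 0")

    # Prices must be positive.
    if (df['Price'] <= 0).any():
        errors.append("Price column contains values <= 0")

    # Drink must be one of the two allowed labels.
    if not df['Drink'].isin({"Yes", "No"}).all():
        errors.append("Drink column contains invalid values")

    return errors if errors else ["data is valid"]


print('\n'.join(validate_data(df)))
```

```text
data is valid
```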
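
`validate_data` only reports whether a rule was broken, not where. One possible way to locate the offending rows is sketched below. `find_invalid_rows` and the sample frame are hypothetical additions for illustration and are not part of the original script.

```python
import pandas as pd


def find_invalid_rows(df):
    """Map each broken rule to the index labels of the rows that break it."""
    checks = {
        "Customers <= 0": df["Customers"] <= 0,
        "Price <= 0": df["Price"] <= 0,
        "Drink not in {Yes, No}": ~df["Drink"].isin(["Yes", "No"]),
    }
    # Keep only the rules that at least one row violates.
    return {rule: df.index[mask].tolist() for rule, mask in checks.items() if mask.any()}


# Made-up rows for illustration only; not taken from Restaurant.csv.
sample = pd.DataFrame({
    "Customers": [4, 0, 2],
    "Price": [200000, 110000, -5],
    "Drink": ["Yes", "No", "Maybe"],
})
report = find_invalid_rows(sample)
for rule, rows in report.items():
    print(f"{rule}: rows {rows}")
```

An empty dictionary means no rule was broken. Note that comparisons such as `df["Price"] <= 0` evaluate to `False` for missing values, so a missing price or customer count is not flagged by these two rules.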
