## Check Uniqueness & Validity

**Objective**: Evaluate data quality by checking for uniqueness and validity of data entries.

For this activity, you will use a sample dataset, `students.csv`, that contains the following
columns: `ID`, `Name`, `Age`, `Grade`, `Email`.
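
As a minimal sketch of loading the dataset and confirming that the expected columns are present, the snippet below reads CSV text from memory. The two sample rows and the `csv_text` and `expected_columns` names are illustrative assumptions, not the actual contents of `students.csv`; with the real file, the `io.StringIO(...)` argument would be replaced by the file path.

```python
import io

import pandas as pd

# Hypothetical file contents standing in for students.csv (assumed layout).
csv_text = """ID,Name,Age,Grade,Email
101,Alice,20,85,alice@example.com
102,Bob1,21,110,bob@example.com
"""

expected_columns = ["ID", "Name", "Age", "Grade", "Email"]

# With the real file this would be: pd.read_csv("students.csv")
students = pd.read_csv(io.StringIO(csv_text))

# Strip stray whitespace from header names before comparing them.
students.columns = students.columns.str.strip()

# Fail early if any expected column is absent.
missing = [col for col in expected_columns if col not in students.columns]
if missing:
    raise ValueError(f"Missing expected columns: {missing}")

print(students.dtypes)
```

Checking the header up front keeps the later uniqueness and validity checks from failing with a less obvious `KeyError`.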

**Steps**:
1. Check Uniqueness
    - Unique IDs
    - Unique Email Addresses
    - Unique Combination

2. Check Validity
    - Validate Age Range
    - Validate Grade Scale
    - Validate Name Format

In [2]:
# Write your code from here
import pandas as pd
import re

# Sample students data (inline stand-in for students.csv)
data = pd.DataFrame({
    'ID': [101, 102, 103, 104, 102],  # duplicate ID 102
    'Name': ['Alice', 'Bob1', 'Charlie', 'David', 'Eve'],
    'Age': [20, 21, 22, 19, 25],
    'Grade': [85, 110, 88, -5, 95],  # invalid grades 110 and -5
    'Email': ['alice@example.com', 'bob@example.com', 'charlie@example.com', 'david@sample.com', 'bob@example.com']  # duplicate email bob@example.com
})

# 1. Check Uniqueness
unique_ids = data['ID'].is_unique
unique_emails = data['Email'].is_unique
# True when no (ID, Name) pair appears more than once
unique_id_name_combo = not data.duplicated(subset=['ID', 'Name']).any()

# 2. Check Validity
# a. Validate Age Range (0-120)
valid_age = data['Age'].between(0, 120)

# b. Validate Grade Scale (0-100)
valid_grade = data['Grade'].between(0, 100)

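# c. Validate Name Format (only letters and spaces)
# A literal space replaces \s so tabs and newlines are rejected;
# fullmatch() requires the whole string to match, and non-string values count as invalid.
name_pattern = re.compile(r'[A-Za-z ]+')
valid_name = data['Name'].apply(lambda x: isinstance(x, str) and bool(name_pattern.fullmatch(x)))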

# Results summary
print("Uniqueness checks:")
print(f"Unique IDs: {unique_ids}")
print(f"Unique Emails: {unique_emails}")
print(f"Unique ID+Name combinations: {unique_id_name_combo}\n")

print("Validity checks (True means valid):")
print(pd.DataFrame({
    'Age_Valid': valid_age,
    'Grade_Valid': valid_grade,
    'Name_Valid': valid_name
}))

Uniqueness checks:
Unique IDs: False
Unique Emails: False
Unique ID+Name combinations: True

Validity checks (True means valid):
   Age_Valid  Grade_Valid  Name_Valid
0       True         True        True
1       True        False       False
2       True         True        True
3       True        False        True
4       True         True        True
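
The summary above reports whether each check passed but not which rows caused a failure. The following is a minimal sketch of one way to list the offending rows, assuming the same sample data as in the cell above; the names `dup_ids`, `dup_emails`, `checks`, and `invalid_rows` are introduced here for illustration only.

```python
import pandas as pd

# Same sample data as in the notebook cell above.
data = pd.DataFrame({
    'ID': [101, 102, 103, 104, 102],
    'Name': ['Alice', 'Bob1', 'Charlie', 'David', 'Eve'],
    'Age': [20, 21, 22, 19, 25],
    'Grade': [85, 110, 88, -5, 95],
    'Email': ['alice@example.com', 'bob@example.com', 'charlie@example.com',
              'david@sample.com', 'bob@example.com'],
})

# keep=False flags every member of a duplicate group, not only the later repeats.
dup_ids = data[data['ID'].duplicated(keep=False)]
dup_emails = data[data['Email'].duplicated(keep=False)]

# One boolean column per validity rule (True means valid).
checks = pd.DataFrame({
    'Age_Valid': data['Age'].between(0, 120),
    'Grade_Valid': data['Grade'].between(0, 100),
    'Name_Valid': data['Name'].str.fullmatch(r'[A-Za-z ]+', na=False),
})

# A row is invalid if it fails at least one rule.
invalid_rows = data[~checks.all(axis=1)]

print("Rows with duplicate IDs:")
print(dup_ids)
print("\nRows with duplicate emails:")
print(dup_emails)
print("\nRows failing at least one validity check:")
print(invalid_rows)
```

Using `duplicated(keep=False)` shows both copies of a repeated value, which makes it easier to decide which record to keep or correct.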
