## Build a Data Quality Dashboard

**Description**: Create a simple dashboard that displays data quality metrics using a library like `dash` or `streamlit`.

**Steps:**
1. Install Streamlit: `pip install streamlit`
2. Create a Python script named `dashboard.py`.
3. Run the dashboard: `streamlit run dashboard.py`

# dashboard.py: Streamlit app that reports basic data quality metrics
import streamlit as st
import pandas as pd
import numpy as np

# Sample data for the demo: one missing name, one missing age, and one repeated email
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', None],
    'Age': [25, np.nan, 30, 22, 25],
    'Email': ['a@example.com', 'b@example.com', 'c@example.com', 'd@example.com', 'a@example.com']
}
df = pd.DataFrame(data)

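# Data Quality Metrics
def calculate_metrics(df):
    """Return per-column missing-value percentages and the number of fully duplicated rows."""
    # The mean of the boolean null mask is the fraction of missing values per column
    missing_percent = df.isnull().mean() * 100
    # duplicated() flags rows that repeat an earlier row across all columns
    duplicate_rows = int(df.duplicated().sum())
    return missing_percent, duplicate_rows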

# Streamlit UI
st.title("📊 Data Quality Dashboard")

st.subheader("Input Data")
st.dataframe(df)

missing_percent, duplicate_rows = calculate_metrics(df)

st.subheader("🔍 Missing Value Percentage")
st.write(missing_percent)

st.subheader("📎 Duplicate Records")
st.write(f"Total duplicate rows: {duplicate_rows}")

st.subheader("✅ Validity Check (Email Format)")
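# Permissive pattern: non-space text, "@", non-space text, ".", non-space text.
# na=False treats missing emails as invalid.
valid_emails = df['Email'].str.contains(r'^\S+@\S+\.\S+$', na=False)
invalid_count = int((~valid_emails).sum())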
st.write(f"Invalid email entries: {invalid_count}")
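
The metrics shown in the dashboard can also be checked without starting Streamlit. The sketch below is one possible way to do that, not part of the original exercise: `summarize_quality` and its `key` parameter are hypothetical names introduced here, and the `duplicate_keys` metric is an added assumption that counts repeats in a single column using the `subset` argument of `DataFrame.duplicated`.

```python
import numpy as np
import pandas as pd

# Same permissive pattern as the dashboard: text, "@", text, ".", text
EMAIL_PATTERN = r"^\S+@\S+\.\S+$"


def summarize_quality(df: pd.DataFrame, key: str = "Email") -> dict:
    """Hypothetical helper: compute the dashboard metrics as plain Python values."""
    valid = df[key].str.contains(EMAIL_PATTERN, na=False)
    return {
        # Share of null cells per column, in percent
        "missing_percent": (df.isnull().mean() * 100).to_dict(),
        # Rows that repeat an earlier row in every column
        "duplicate_rows": int(df.duplicated().sum()),
        # Rows that repeat an earlier value in the key column only (added assumption)
        "duplicate_keys": int(df.duplicated(subset=[key]).sum()),
        # Missing emails count as invalid, matching na=False in the dashboard
        "invalid_emails": int((~valid).sum()),
    }


# Same sample data as the dashboard script
sample = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David", None],
    "Age": [25, np.nan, 30, 22, 25],
    "Email": ["a@example.com", "b@example.com", "c@example.com",
              "d@example.com", "a@example.com"],
})

summary = summarize_quality(sample)
print(summary)
```

On this sample no row repeats another row in every column, so the full-row duplicate count stays at zero even though `a@example.com` appears twice. Counting duplicates on a key column, as `duplicate_keys` does, is one way to surface that kind of repeat.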


