Clean and prepare the raw dataset by handling missing values, duplicates, inconsistent text, and incorrect formats.
- Removed missing values and duplicates
- Standardized column names (lowercase + underscores)
- Cleaned text formats for gender and no-show columns
- Converted dates to datetime format
- Fixed data types (age as int, IDs as string)
- Removed rows with invalid ages (below 0 or above 115)

Tools:
- Python 3.11
- pandas

Output: a cleaned dataset ready for analysis, saved as cleaned_medical_appointments.csv
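
The cleaning steps listed above could be sketched in pandas roughly as follows. This is a minimal illustration, not the project's actual script: the raw column names (`Patient Id`, `Appointment-ID`, `Scheduled Day`, `Appointment Day`, `No-show`), the sample rows, and the helper name `clean_appointments` are all assumptions made for the example, as is the choice to uppercase gender values and lowercase no-show values.

```python
import pandas as pd


def clean_appointments(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper applying the cleaning steps to a raw frame."""
    df = df.copy()

    # Standardize column names: trim, lowercase, spaces/hyphens -> underscores.
    df.columns = (
        df.columns.str.strip()
        .str.lower()
        .str.replace(r"[\s\-]+", "_", regex=True)
    )

    # Drop rows with missing values, then exact duplicate rows.
    df = df.dropna().drop_duplicates()

    # Clean text formats (assumed convention: gender upper, no_show lower).
    df["gender"] = df["gender"].astype(str).str.strip().str.upper()
    df["no_show"] = df["no_show"].astype(str).str.strip().str.lower()

    # Convert dates; values that fail to parse become NaT and are dropped.
    for col in ("scheduled_day", "appointment_day"):
        df[col] = pd.to_datetime(df[col], errors="coerce")
    df = df.dropna(subset=["scheduled_day", "appointment_day"])

    # Fix data types: age as int, IDs as string.
    df["age"] = df["age"].astype(int)
    df["patient_id"] = df["patient_id"].astype(str)
    df["appointment_id"] = df["appointment_id"].astype(str)

    # Keep only ages in the valid range 0..115 (inclusive).
    df = df[df["age"].between(0, 115)]
    return df.reset_index(drop=True)


# Made-up sample data, only to exercise each step.
raw = pd.DataFrame(
    {
        "Patient Id": [101, 101, 102, 103, 104, 105],
        "Appointment-ID": [1, 1, 2, 3, 4, 5],
        "Gender": [" f", " f", "M", None, "m ", "M"],
        "Scheduled Day": ["2016-04-29", "2016-04-29", "2016-04-30",
                          "2016-05-01", "2016-05-01", "2016-05-02"],
        "Appointment Day": ["2016-05-02", "2016-05-02", "2016-05-03",
                            "2016-05-04", "2016-05-05", "2016-05-06"],
        "Age": [34, 34, -1, 50, 120, 62],
        "No-show": ["No", "No", "Yes", "No", "yes", "YES "],
    }
)

clean = clean_appointments(raw)
print(clean)

# To write the result to disk:
# clean.to_csv("cleaned_medical_appointments.csv", index=False)
```

In this sketch, duplicates are dropped before text normalization, following the order of the list; normalizing first would also catch rows that differ only in casing or whitespace.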

Author: Rushikesh Palekar