Hospital Patient Data Analysis using SQL, Python, and Excel || Capstone Project For End Semester EXCELR
This project focuses on analyzing hospital patient data to extract meaningful insights related to disease patterns, doctor workload, and treatment costs. The objective is to use data analysis techniques to support better decision-making in healthcare management.
The dataset includes patient details such as age, gender, disease, assigned doctor, and treatment cost.
Objectives
- Identify the most common diseases among patients
- Analyze doctor workload distribution
- Calculate average treatment costs
- Detect high-cost medical cases
- Present findings using data visualization
Tools and Technologies
- SQL (MySQL / SQLite) → Data storage and querying
- Python (Pandas, Matplotlib) → Data analysis and visualization
- Google Sheets / Excel → Pivot tables and data summarization
Data Collection
- Patient dataset stored in CSV format
Data Processing (SQL)
- Created table and inserted data
- Performed queries to analyze disease frequency, cost, and workload
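The SQL step above can be sketched roughly as follows, using Python's built-in `sqlite3` module so the example runs without a database server. The table name `patients`, the column names, and the sample rows are assumptions made for illustration; they are not taken from the project's `sql_queries.sql` or `patients.csv`.

```python
import sqlite3

# Illustrative rows only, NOT the project's data. Column order (assumed):
# patient_id, age, gender, disease, doctor, treatment_cost
SAMPLE_ROWS = [
    (1, 45, "M", "Diabetes", "Dr Shah", 5000.0),
    (2, 60, "F", "Heart Disease", "Dr Mehta", 15000.0),
    (3, 30, "F", "Flu", "Dr Shah", 1000.0),
    (4, 52, "M", "Diabetes", "Dr Rao", 5500.0),
    (5, 67, "M", "Heart Disease", "Dr Shah", 14000.0),
    (6, 38, "F", "Diabetes", "Dr Mehta", 4500.0),
]


def build_database(rows):
    """Create an in-memory SQLite database with a hypothetical `patients` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE patients (
            patient_id     INTEGER PRIMARY KEY,
            age            INTEGER,
            gender         TEXT,
            disease        TEXT,
            doctor         TEXT,
            treatment_cost REAL
        )
        """
    )
    conn.executemany("INSERT INTO patients VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def disease_frequency(conn):
    """Number of patients per disease, most frequent first."""
    return conn.execute(
        "SELECT disease, COUNT(*) AS n FROM patients "
        "GROUP BY disease ORDER BY n DESC, disease"
    ).fetchall()


def doctor_workload(conn):
    """Number of patients assigned to each doctor, busiest first."""
    return conn.execute(
        "SELECT doctor, COUNT(*) AS n FROM patients "
        "GROUP BY doctor ORDER BY n DESC, doctor"
    ).fetchall()


def average_cost_by_disease(conn):
    """Average treatment cost per disease, most expensive first."""
    return conn.execute(
        "SELECT disease, AVG(treatment_cost) AS avg_cost FROM patients "
        "GROUP BY disease ORDER BY avg_cost DESC"
    ).fetchall()


def average_cost(conn):
    """Average treatment cost across all patients."""
    return conn.execute("SELECT AVG(treatment_cost) FROM patients").fetchone()[0]


if __name__ == "__main__":
    conn = build_database(SAMPLE_ROWS)
    print(disease_frequency(conn))
    print(doctor_workload(conn))
    print(average_cost_by_disease(conn))
    print(average_cost(conn))
```

The same `SELECT` statements should carry over to MySQL largely unchanged, since they rely only on `COUNT`, `AVG`, `GROUP BY`, and `ORDER BY`.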
Data Analysis (Python)
- Loaded dataset using Pandas
- Generated summary statistics
- Analyzed disease distribution and costs
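A minimal sketch of the Pandas step might look like the following. The column names and the inline sample records are hypothetical; in the project itself the data would presumably be read from `patients.csv`. The quantile cut-off used to flag high-cost cases is likewise an assumed definition, not one stated in the project.

```python
import io

import pandas as pd

# Illustrative records only, NOT the project's data; column names are assumed.
SAMPLE_CSV = """patient_id,age,gender,disease,doctor,treatment_cost
1,45,M,Diabetes,Dr Shah,5000
2,60,F,Heart Disease,Dr Mehta,15000
3,30,F,Flu,Dr Shah,1000
4,52,M,Diabetes,Dr Rao,5500
5,67,M,Heart Disease,Dr Shah,14000
6,38,F,Diabetes,Dr Mehta,4500
"""


def load_patients(source):
    """Load patient records from a CSV path or file-like object."""
    return pd.read_csv(source)


def summarize(df):
    """Return summary statistics, disease counts, and average cost per disease."""
    return {
        "cost_summary": df["treatment_cost"].describe(),
        "disease_counts": df["disease"].value_counts(),
        "avg_cost_by_disease": (
            df.groupby("disease")["treatment_cost"]
            .mean()
            .sort_values(ascending=False)
        ),
    }


def high_cost_cases(df, quantile=0.75):
    """Flag rows whose cost exceeds the given quantile (an assumed threshold)."""
    threshold = df["treatment_cost"].quantile(quantile)
    return df[df["treatment_cost"] > threshold]


if __name__ == "__main__":
    # For a real file, pass a path instead, e.g. load_patients("patients.csv")
    patients = load_patients(io.StringIO(SAMPLE_CSV))
    stats = summarize(patients)
    print(stats["cost_summary"])
    print(stats["disease_counts"])
    print(stats["avg_cost_by_disease"])
    print(high_cost_cases(patients))
```

A quantile-based threshold adapts to the data's scale; a fixed rupee amount would work equally well if the hospital has a defined cost ceiling.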
Data Visualization
- Pie Chart → Disease distribution
- Bar Chart → Doctor workload
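The two charts listed above could be produced along these lines with Matplotlib. This is only a sketch: the sample records, column names, and output file name are assumptions, and the project notebook may style its charts differently.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Illustrative records only, NOT the project's data; column names are assumed.
SAMPLE = pd.DataFrame(
    {
        "disease": ["Diabetes", "Heart Disease", "Flu",
                    "Diabetes", "Heart Disease", "Diabetes"],
        "doctor": ["Dr Shah", "Dr Mehta", "Dr Shah",
                   "Dr Rao", "Dr Shah", "Dr Mehta"],
    }
)


def plot_overview(df):
    """Draw a pie chart of disease distribution and a bar chart of doctor workload."""
    disease_counts = df["disease"].value_counts()
    doctor_counts = df["doctor"].value_counts()

    fig, (ax_pie, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

    # Pie chart: share of patients per disease
    ax_pie.pie(
        disease_counts.values,
        labels=disease_counts.index.tolist(),
        autopct="%1.1f%%",
    )
    ax_pie.set_title("Disease distribution")

    # Bar chart: number of patients per doctor
    ax_bar.bar(doctor_counts.index.tolist(), doctor_counts.values)
    ax_bar.set_title("Doctor workload")
    ax_bar.set_xlabel("Doctor")
    ax_bar.set_ylabel("Number of patients")

    fig.tight_layout()
    return fig


if __name__ == "__main__":
    figure = plot_overview(SAMPLE)
    figure.savefig("overview.png")  # hypothetical output file name
```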
Data Summarization
- Used Pivot Tables in Google Sheets for quick insights
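The project built its pivot tables in Google Sheets; for readers who prefer to stay in Python, a roughly equivalent summary can be sketched with Pandas' `pivot_table`. The sample records and column names below are assumptions for illustration only.

```python
import pandas as pd

# Illustrative records only, NOT the project's data; column names are assumed.
SAMPLE = pd.DataFrame(
    {
        "patient_id": [1, 2, 3, 4, 5, 6],
        "disease": ["Diabetes", "Heart Disease", "Flu",
                    "Diabetes", "Heart Disease", "Diabetes"],
        "doctor": ["Dr Shah", "Dr Mehta", "Dr Shah",
                   "Dr Rao", "Dr Shah", "Dr Mehta"],
    }
)


def patients_per_doctor_and_disease(df):
    """Count patients for each doctor/disease pair, like a spreadsheet pivot table."""
    return df.pivot_table(
        index="doctor",        # one row per doctor
        columns="disease",     # one column per disease
        values="patient_id",   # the field being counted
        aggfunc="count",
        fill_value=0,          # show 0 instead of NaN for empty combinations
    )


if __name__ == "__main__":
    print(patients_per_doctor_and_disease(SAMPLE))
```

Rows correspond to the pivot table's row field, columns to its column field, and `aggfunc` to the "summarize by" setting in a spreadsheet.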
Key Findings
- Diabetes is the most common disease among patients
- Dr Shah handles the highest number of patients
- The average treatment cost is approximately ₹6600
- Heart Disease cases have the highest treatment cost
Hospital-Patient-Analysis/
│
├── patients.csv
├── sql_queries.sql
├── python_analysis.ipynb
└── Project Report.pdf
Future Improvements
- Expand the dataset with more patient records
- Build an interactive dashboard using Power BI or Tableau
- Apply machine learning for predicting disease trends
- Include additional parameters like treatment duration and recovery rate
This project demonstrates how basic data analysis tools such as SQL, Python, and Excel can be used together to extract valuable insights from healthcare data. It highlights the importance of data-driven decision-making in improving hospital operations and patient care.
Author: Satyam Anand (Roll No. 2330045)