# Exercise 1: Hospital Patient Data Analysis

Goal: Create a mini-project to manipulate patient data and visualize basic statistics.

## Step 1: Create the project directory

Create a project folder named `HospitalDataAnalysis` and move into it:

```
mkdir HospitalDataAnalysis
cd HospitalDataAnalysis
```


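## Step 2: Create a virtual environment with a specific Python version

If you have multiple Python versions installed (e.g., Python 3.10 and 3.11), specify the one you need. On Windows, the `py` launcher selects a version (3.11 below is only an example). A plain `python -m venv env` uses whichever `python` comes first on your PATH:

```
py -3.11 -m venv env
```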


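Activate it and install the required libraries. The activation command below is for Windows; on macOS/Linux use `source env/bin/activate`. `seaborn` is included because the plotting code imports it:

```
.\env\Scripts\activate
pip install numpy pandas matplotlib seaborn jupyter
```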
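
To confirm that the environment is active and the libraries are importable, a small check like the following can be run with the environment's `python`. This is a minimal sketch: the helper names `in_virtualenv` and `missing_packages` are illustrative, and the list covers only the analysis libraries used in this exercise.

```python
import importlib.util
import sys

# Analysis libraries used in this exercise (seaborn is imported by the plotting code).
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn"]


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv (prefix differs from base)."""
    return sys.prefix != sys.base_prefix


def missing_packages(names):
    """Return the names that this interpreter cannot locate for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python version:", sys.version.split()[0])
print("Inside a virtual environment:", in_virtualenv())
print("Missing packages:", missing_packages(REQUIRED) or "none")
```

If the second line prints `False`, the environment is probably not activated in the current terminal.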

## Step 3: Create the project structure

Organize your project into a clean structure by creating the following folders and files:
```
HospitalDataAnalysis/
├── env/                # Virtual environment folder (ignored by Git)
├── data/               # Folder to store CSV files
│   └── patients.csv
├── notebooks/          # Jupyter notebooks
│   └── patients_analysis.ipynb
├── scripts/            # Optional Python scripts
├── .gitignore
└── README.md
```
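
The folders can be created by hand, or with a short script such as the sketch below. The function name `scaffold` is hypothetical. It deliberately skips `env/` (created by `venv`) and the notebook (an empty `.ipynb` file is not a valid notebook, so it is better created from Jupyter).

```python
from pathlib import Path

# Folders and placeholder files from the project tree.
FOLDERS = ["data", "notebooks", "scripts"]
FILES = ["data/patients.csv", ".gitignore", "README.md"]


def scaffold(root):
    """Create the project folders and any missing files under root; return root as a Path."""
    root = Path(root)
    for folder in FOLDERS:
        (root / folder).mkdir(parents=True, exist_ok=True)
    for name in FILES:
        (root / name).touch(exist_ok=True)  # existing content is left untouched
    return root
```

Calling `scaffold(".")` from inside `HospitalDataAnalysis` creates the layout in place, and running it a second time changes nothing.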



## Step 4: Load and explore the data

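- Create a small CSV file `patients.csv` in the `data/` folder with the columns `PatientID, Age, Sex, BloodPressure, Cholesterol, Diagnosis`. The commands below assume a Unix-like shell (Git Bash, macOS, Linux) started in the project root. In Windows `cmd`, `echo` writes the quotation marks into the file, so drop them there:

```
mkdir -p data
cd data
echo "PatientID,Age,Sex,BloodPressure,Cholesterol,Diagnosis" > patients.csv
echo "1,45,M,130,200,Healthy" >> patients.csv
echo "2,54,F,150,240,Hypertension" >> patients.csv
echo "3,39,M,120,180,Healthy" >> patients.csv
echo "4,65,F,170,260,Hypertension" >> patients.csv
cd ..
```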
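
Because `echo` and `>` behave differently across shells, the same file can also be written from Python. The following is a minimal sketch using the standard `csv` module; the function name `write_patients` is illustrative, and the rows are the four sample patients from this exercise.

```python
import csv
from pathlib import Path

HEADER = ["PatientID", "Age", "Sex", "BloodPressure", "Cholesterol", "Diagnosis"]
ROWS = [
    [1, 45, "M", 130, 200, "Healthy"],
    [2, 54, "F", 150, 240, "Hypertension"],
    [3, 39, "M", 120, 180, "Healthy"],
    [4, 65, "F", 170, 260, "Hypertension"],
]


def write_patients(path):
    """Write the sample patients to path as UTF-8 CSV, creating the folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # newline="" lets the csv module control line endings on every platform.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(HEADER)
        writer.writerows(ROWS)
    return path
```

From the project root, `write_patients("data/patients.csv")` produces the same five lines as the shell commands.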

- Load it in the notebook (the relative path assumes the notebook lives in `notebooks/`):

```
import pandas as pd

df = pd.read_csv("../data/patients.csv")
df.head()
```


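- Run basic descriptive statistics. A notebook cell displays only its last expression, so run each line in its own cell or wrap it in `print()`:

```
df.describe()
df['Sex'].value_counts()
df['Diagnosis'].value_counts()
```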
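
Beyond overall summaries, statistics can be compared between groups. The sketch below is one possible extension, not part of the original exercise: it averages the numeric columns per diagnosis with `groupby`. It reads the sample rows from an in-memory string so it runs on its own, and the names `sample_df` and `mean_by_diagnosis` are illustrative.

```python
import io

import pandas as pd

CSV_TEXT = """PatientID,Age,Sex,BloodPressure,Cholesterol,Diagnosis
1,45,M,130,200,Healthy
2,54,F,150,240,Hypertension
3,39,M,120,180,Healthy
4,65,F,170,260,Hypertension
"""

# Same content as data/patients.csv, loaded from a string instead of a file.
sample_df = pd.read_csv(io.StringIO(CSV_TEXT))


def mean_by_diagnosis(frame):
    """Return the mean Age, BloodPressure and Cholesterol for each Diagnosis."""
    columns = ["Age", "BloodPressure", "Cholesterol"]
    return frame.groupby("Diagnosis")[columns].mean()


summary = mean_by_diagnosis(sample_df)
print(summary)
```

In the notebook, the same function can be applied directly to `df`.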

### Visualize the data

Plot a histogram of Age and a scatter plot of Age vs. BloodPressure:

```
import matplotlib.pyplot as plt
import seaborn as sns

sns.histplot(df['Age'], bins=10)
plt.show()

sns.scatterplot(x='Age', y='BloodPressure', data=df)
plt.show()
```
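
The same two plots can also be saved to an image file from a standalone script, for example one placed in the optional `scripts/` folder. This is a minimal sketch using plain matplotlib rather than seaborn. The function name `save_plots` and the output file name are assumptions, and the values are the sample rows from `patients.csv`. The `Agg` backend is meant for scripts; do not select it in the notebook, where plots should display inline.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display window is needed

import matplotlib.pyplot as plt

# Age and BloodPressure columns of the four sample patients.
AGES = [45, 54, 39, 65]
BLOOD_PRESSURE = [130, 150, 120, 170]


def save_plots(path):
    """Draw the age histogram and the Age vs BloodPressure scatter plot, then save them."""
    fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

    ax_hist.hist(AGES, bins=10)
    ax_hist.set_xlabel("Age")
    ax_hist.set_ylabel("Count")
    ax_hist.set_title("Age distribution")

    ax_scatter.scatter(AGES, BLOOD_PRESSURE)
    ax_scatter.set_xlabel("Age")
    ax_scatter.set_ylabel("BloodPressure")
    ax_scatter.set_title("Age vs BloodPressure")

    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)  # release the figure's memory
    return path
```

Calling `save_plots("patients_plots.png")` writes both panels into a single PNG file.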






## Step 5: Initialize Git and create the README

### Create a GitHub account (if you don’t already have one) and a repository

- Go to [https://github.com](https://github.com) and sign up.
- Create a new repository on GitHub called **HospitalDataAnalysis**.  
   👉 Do not initialize it with a README, `.gitignore`, or license (we’ll add them from the local project).

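### Link your local project with the GitHub repository

Run these commands from the project root. The folder must first be initialized as a Git repository, and `git push` needs at least one commit, so commit the data and the notebook before pushing:

```
git init
git add data notebooks
git commit -m "Initial commit"
git branch -M main
git remote add origin https://github.com/<your-username>/HospitalDataAnalysis.git
git push -u origin main
```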
   
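### Create a new README.md file in your local project with a short description of the project

In a notebook, the `%%writefile` magic must be the first line of its cell. The path is relative to the notebook's working directory, so from a notebook in `notebooks/` use `../README.md` to write to the project root:

```
%%writefile ../README.md
A mini data science project to analyze hospital data and visualize basic statistics.
```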


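### Create a .gitignore file in your local project with the following content

Use `%%writefile` as the first line of a notebook cell (the `../` prefix again assumes the notebook runs from `notebooks/`). Alternatively, create the file in a text editor with the same three lines:

```
%%writefile ../.gitignore
env/
__pycache__/
*.pyc
```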

### Commit and push the changes

```
git add README.md .gitignore
git commit -m "Add README and .gitignore"
git push
```


## Step 6: Share the project

How can you export all the libraries installed in your virtual environment to a file so that someone else can recreate the same environment? Use `pip freeze` with the environment activated:

```
pip freeze > requirements.txt
```

Test recreating the environment on a different machine, inside a fresh, activated virtual environment:

```
pip install -r requirements.txt
```
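
After installing, it can be useful to check that the installed versions match the pinned ones. The following is a minimal sketch using the standard `importlib.metadata` module. The helper names `parse_pins` and `find_mismatches` are hypothetical, and only simple `name==version` lines, as written by `pip freeze`, are handled.

```python
from importlib import metadata


def parse_pins(text):
    """Return {name: version} for lines of the form name==version; skip everything else."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop any environment marker
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} for packages that are missing or differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

For example, `find_mismatches(parse_pins(open("requirements.txt").read()))` returns an empty dictionary when every pinned package is installed at the pinned version.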