# 📊 Visualizing Biomedical Data in Python
This short notebook introduces basic data visualization techniques using `pandas`, `seaborn`, and `matplotlib`. The dataset is simulated: expression values for five genes across four tissue types and two experimental conditions (control and treated), with ten replicates per group.

## 🔧 Setup

In [None]:
# Install required packages (uncomment if needed)
# !pip install pandas seaborn matplotlib

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Set seaborn style
sns.set(style="whitegrid")

## 📁 Simulate the Data

In [None]:
# Simulate some gene expression data
import numpy as np

np.random.seed(42)

genes = ['TP53', 'EGFR', 'BRCA1', 'MYC', 'PTEN']
tissues = ['Liver', 'Lung', 'Breast', 'Brain']
conditions = ['Control', 'Treated']

data = []
for gene in genes:
    for tissue in tissues:
        for condition in conditions:
            for rep in range(10):  # 10 replicates
                expression = np.random.normal(loc=10 if condition == 'Control' else 12, scale=2)
                data.append([gene, tissue, condition, expression])

df = pd.DataFrame(data, columns=['Gene', 'Tissue', 'Condition', 'Expression'])
df.head()
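
The nested loops above can also be written in vectorized form. The sketch below is one way to do it, not the notebook's own method: it assumes NumPy's `default_rng` generator in place of `np.random.seed`, so the individual values differ from those in `df`. The names `design`, `df_vec`, and `summary`, and the extra `Replicate` column, are introduced here only for illustration. The group summary at the end is a quick check that the simulated shift between conditions (mean 10 versus mean 12) is present.

```python
from itertools import product

import numpy as np
import pandas as pd

genes = ['TP53', 'EGFR', 'BRCA1', 'MYC', 'PTEN']
tissues = ['Liver', 'Lung', 'Breast', 'Brain']
conditions = ['Control', 'Treated']
n_reps = 10

# One row per gene x tissue x condition x replicate
design = pd.DataFrame(
    list(product(genes, tissues, conditions, range(n_reps))),
    columns=['Gene', 'Tissue', 'Condition', 'Replicate'],
)

# Assumption: a new-style generator, so the draws differ from np.random.seed(42)
rng = np.random.default_rng(42)

# Mean 10 for Control, 12 for Treated, SD 2 (same parameters as the loop version)
loc = np.where(design['Condition'] == 'Control', 10.0, 12.0)
df_vec = design.assign(Expression=rng.normal(loc=loc, scale=2.0))

# Sanity check: group means should sit near 10 and 12
summary = df_vec.groupby('Condition')['Expression'].agg(['mean', 'std', 'count'])
print(summary)
```

Keeping an explicit `Replicate` column makes later reshaping easier, because each observation then has an identifier that is shared across genes.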

## 📈 Boxplot: Expression by Condition

In [None]:
plt.figure(figsize=(8, 6))
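# Map hue to the x variable so the palette applies (seaborn 0.13+ warns about palette without hue)
sns.boxplot(data=df, x='Condition', y='Expression', hue='Condition', palette='Set2', legend=False)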
plt.title("Gene Expression by Condition")
plt.show()

## 📉 Violin Plot: Expression by Gene and Condition

In [None]:
plt.figure(figsize=(10, 6))
sns.violinplot(data=df, x='Gene', y='Expression', hue='Condition', split=True, palette='muted')
plt.title("Distribution of Expression Across Genes")
plt.show()

## 🔬 Tissue-Specific Comparison: Expression of TP53

In [None]:
tp53_data = df[df['Gene'] == 'TP53']

plt.figure(figsize=(8, 6))
# errorbar='sd' replaces the deprecated ci='sd' (seaborn 0.12+)
sns.barplot(data=tp53_data, x='Tissue', y='Expression', hue='Condition', errorbar='sd', palette='pastel')
plt.title("TP53 Expression by Tissue and Condition")
plt.ylabel("Mean Expression (± SD)")
plt.show()
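
The bar heights and error bars can be cross-checked against a plain `pandas` summary. The sketch below rebuilds the simulated data so that it runs on its own; `tp53_summary` is a name introduced here for illustration. Note that `pandas` computes the sample standard deviation (`ddof=1`) by default.

```python
import numpy as np
import pandas as pd

# Rebuild the simulated data so this cell runs on its own (same scheme as the loop above)
np.random.seed(42)
genes = ['TP53', 'EGFR', 'BRCA1', 'MYC', 'PTEN']
tissues = ['Liver', 'Lung', 'Breast', 'Brain']
conditions = ['Control', 'Treated']
rows = [
    (gene, tissue, condition, np.random.normal(10 if condition == 'Control' else 12, 2))
    for gene in genes
    for tissue in tissues
    for condition in conditions
    for _ in range(10)
]
df = pd.DataFrame(rows, columns=['Gene', 'Tissue', 'Condition', 'Expression'])

# Mean, sample SD, and replicate count for each tissue/condition group of TP53
tp53_summary = (
    df[df['Gene'] == 'TP53']
    .groupby(['Tissue', 'Condition'])['Expression']
    .agg(['mean', 'std', 'count'])
    .round(2)
)
print(tp53_summary)
```

With only ten replicates per group, individual group means can land noticeably away from the simulated values of 10 and 12.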

## 🧠 Optional: Pairplot for Multivariate Exploration

In [None]:
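# Reshape to wide format: one row per replicate, one column per gene
# (only 1 tissue and 1 condition for clarity). Pivoting on the original row index
# would leave every row with a single non-missing gene, so number the replicates first.
# The pairing across genes is arbitrary because each value is simulated independently.
subset = df[(df['Tissue'] == 'Lung') & (df['Condition'] == 'Treated')].copy()
subset['Replicate'] = subset.groupby('Gene').cumcount()
pivoted = subset.pivot(index='Replicate', columns='Gene', values='Expression')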

sns.pairplot(pivoted)
plt.suptitle("Pairwise Gene Expression Comparison (Lung, Treated)", y=1.02)
plt.show()
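
A correlation matrix is a compact numeric counterpart to the pair plot. The sketch below is a minimal version that runs on its own, assuming replicates can be numbered by their order of appearance within each gene. The `Replicate` column and the names `wide` and `corr` are introduced here for illustration. Because each value is simulated independently, any off-diagonal correlation reflects chance alone.

```python
import numpy as np
import pandas as pd

# Rebuild the simulated data so this cell runs on its own (same scheme as the loop above)
np.random.seed(42)
genes = ['TP53', 'EGFR', 'BRCA1', 'MYC', 'PTEN']
tissues = ['Liver', 'Lung', 'Breast', 'Brain']
conditions = ['Control', 'Treated']
rows = [
    (gene, tissue, condition, np.random.normal(10 if condition == 'Control' else 12, 2))
    for gene in genes
    for tissue in tissues
    for condition in conditions
    for _ in range(10)
]
df = pd.DataFrame(rows, columns=['Gene', 'Tissue', 'Condition', 'Expression'])

# Keep one tissue and one condition, then number the replicates within each gene
subset = df[(df['Tissue'] == 'Lung') & (df['Condition'] == 'Treated')].copy()
subset['Replicate'] = subset.groupby('Gene').cumcount()  # assumption: order of appearance

# Wide table: one row per replicate, one column per gene
wide = subset.pivot(index='Replicate', columns='Gene', values='Expression')

# Pairwise Pearson correlations between genes
corr = wide.corr()
print(corr.round(2))
```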

## 📝 Practice Exercise
Use `seaborn` to create your own plot:
1. Choose a different gene.
2. Plot its expression across tissues, separated by condition.
3. Use a `boxplot`, `barplot`, or `stripplot`.
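
One possible starting point for the exercise is sketched below, using a `stripplot` for EGFR. The helper `plot_gene_by_tissue` is hypothetical and exists only in this example; the cell rebuilds the simulated data so that it runs on its own.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Rebuild the simulated data so this cell runs on its own (same scheme as the loop above)
np.random.seed(42)
genes = ['TP53', 'EGFR', 'BRCA1', 'MYC', 'PTEN']
tissues = ['Liver', 'Lung', 'Breast', 'Brain']
conditions = ['Control', 'Treated']
rows = [
    (gene, tissue, condition, np.random.normal(10 if condition == 'Control' else 12, 2))
    for gene in genes
    for tissue in tissues
    for condition in conditions
    for _ in range(10)
]
df = pd.DataFrame(rows, columns=['Gene', 'Tissue', 'Condition', 'Expression'])


def plot_gene_by_tissue(data, gene):
    """Hypothetical helper: strip plot of one gene across tissues, split by condition."""
    gene_data = data[data['Gene'] == gene]
    if gene_data.empty:
        raise ValueError(f"No rows found for gene {gene!r}")
    fig, ax = plt.subplots(figsize=(8, 6))
    # dodge=True places Control and Treated points side by side within each tissue
    sns.stripplot(data=gene_data, x='Tissue', y='Expression', hue='Condition', dodge=True, ax=ax)
    ax.set_title(f"{gene} Expression by Tissue and Condition")
    return ax


ax = plot_gene_by_tissue(df, 'EGFR')
```

Call `plt.show()` afterwards, as in the cells above, and try swapping `sns.stripplot` for `sns.boxplot` to compare how each plot type presents the same ten replicates per group.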