<h1 style="color: #db0909; font-weight: bold;">
DEEP BREAST: AI BASED BREAST CANCER DETECTION
</h1>

## **Short Explanation of the Project Workflow and File Structure**

This project uses deep learning and image processing, so summarizing the file purposes helps keep things clear.

Raw microscope images are prepared for classification, a CNN model is trained on them, and the best model is evaluated on the test set using metrics like accuracy, precision, recall, and F1-score.

## **GENERAL STRUCTURE**

#### **README.md**
A file that introduces the project's working logic and sections of the Streamlit interface.

#### **requirements.txt**
Lists the basic packages needed by the application.
* torch
* torchvision
* numpy
* pandas
* matplotlib
* tqdm
* scikit-learn
* opencv-python
* streamlit
* pillow
* plotly
* reportlab

#### **.gitignore**
Lists the outputs, data folders, and virtual environment files that are excluded from the GitHub upload.

#### **setting.json**
Specifies which Python interpreter is used when the project is opened.
We use Python 3.10 in the virtual environment (venv) because it is well supported by the deep learning libraries this project relies on.

#### **__version__.py**
The file where our version information is kept.

#### **temp_uploaded_image.png**
A cached version of the sample histopathological image loaded in the last test.

## **DATASETS, MODELS and REPORTS**

#### **data**
The folder containing raw data, processed (benign/malignant) data and sample data.

#### **reports/DeepBreast_Model_Report.pdf**
Example of the final PDF report generated from the Streamlit performance dashboard.

## **SOURCE CODES**

## **REFACTORED SOURCE STRUCTURE**

The `src/` directory has been reorganized into logical modules for better maintainability:

```
src/
├── core/                 # Core components
│   ├── model.py              → CNN architecture
│   ├── data_loader.py        → Data loading & splitting
│   └── xai_visualizer.py     → Grad-CAM implementation
│
├── training/             # Training & evaluation
│   ├── train_model.py        → Model training script
│   ├── evaluate_model.py     → Model evaluation script
│   └── organize_dataset.py   → Dataset organization script
│
├── ui/                   # Streamlit interface
│   ├── app.py                → Main Streamlit app
│   ├── predict.py            → Prediction panel
│   ├── analysis_panel.py     → Grad-CAM analysis panel
│   ├── performance.py        → Performance metrics dashboard
│   └── about.py              → About page
│
└── scripts/              # Standalone test scripts
    └── test_xai.py           → XAI testing script
```

**Benefits:**
- ✅ Better code organization
- ✅ Easy navigation and maintenance
- ✅ Modular design for scalability
- ✅ Clear separation of concerns

#### **src/ui/app.py**
The main entry point of the Streamlit application. It handles the **top/bottom banner design**, **logo upload**, and **sidebar tabs**, and dispatches to the **run_prediction**, **run_analysis**, **run_performance**, and **run_about** calls.

**Note:** A wrapper `app.py` exists in the project root for easy execution with `streamlit run app.py`.

#### **src/ui/predict.py**
Keeps the trained **CNN** loaded in memory (GPU/CPU), preprocesses the loaded images, rejects unsuitable images with a histopathology similarity filter, and displays the prediction and confidence score with Streamlit components.
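
As a rough illustration of the prediction and confidence step, the sketch below turns raw model outputs into a label and a softmax confidence. The function name `predict_with_confidence` and the class index order (benign = 0, malignant = 1) are assumptions made for this example, not details taken from the project code.

```python
import math

CLASS_NAMES = ("benign", "malignant")  # assumed index order


def predict_with_confidence(logits):
    """Turn raw model outputs into a class label and a softmax confidence."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs[best]
```

The returned confidence is the softmax probability of the winning class, so it always lies between 0 and 1.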

#### **src/ui/analysis_panel.py**
Generates a Grad-CAM heatmap for the image passed from the Prediction page or reloaded on this page; it applies the same histopathology filter and shows the **xai_visualizer.generate_gradcam** output as a semi-transparent overlay.

#### **src/ui/performance.py**
Reads the training log and evaluation results from **JSON** and displays them with Plotly graphs (accuracy/loss curves, metric bars, confusion matrix).

#### **src/ui/about.py**
Information page within Streamlit presenting the project's purpose, the dataset used, and current accuracy information.

#### **src/core/model.py**
Defines the BreastCancerCNN architecture, which consists of **four convolution + batch norm blocks** and **two fully connected layers**.
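
A minimal PyTorch sketch of a network with this shape is shown below. Only the block count (four convolution + batch norm blocks, two fully connected layers) comes from the description above; the class name `BreastCancerCNNSketch`, the channel widths, the 3x3 kernels, the max pooling, and the adaptive average pooling are all assumptions chosen for illustration.

```python
import torch
from torch import nn


def conv_block(in_channels: int, out_channels: int) -> nn.Sequential:
    """One convolution + batch norm block, followed by ReLU and 2x2 max pooling."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )


class BreastCancerCNNSketch(nn.Module):
    """Illustrative stand-in, not the project's actual BreastCancerCNN."""

    def __init__(self, num_classes: int = 2) -> None:
        super().__init__()
        # Four convolution + batch norm blocks (channel widths are assumed).
        self.features = nn.Sequential(
            conv_block(3, 32),
            conv_block(32, 64),
            conv_block(64, 128),
            conv_block(128, 256),
        )
        # Pool to a fixed 1x1 map so the classifier accepts any input size.
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Two fully connected layers ending in one logit per class.
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.pool(self.features(x)))
```

Calling the model on a batch of RGB images of shape `(N, 3, H, W)` returns a tensor of shape `(N, 2)` with one logit per class.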

#### **src/core/data_loader.py**
Reads the BreakHis data from the data/processed directory, splits it into train/val/test sets, defines the **augmentation** and **normalization** transforms separately, and prepares the DataLoader objects.

- Augmentation: Applying artificial changes such as rotating, cropping, or flipping images to diversify the data.
- Normalization: Bringing the data to a standard scale such as 0–1 so that it is suitable for the model, similar to a standard scaler.
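
The splitting and normalization ideas can be sketched in plain Python as follows. The 70/15/15 split ratio, the fixed seed, the mean/std of 0.5, and the helper names `split_indices` and `normalize_pixel` are assumptions for this example; the project's own values may differ.

```python
import random


def split_indices(n_items, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_test = round(n_items * test_frac)
    n_val = round(n_items * val_frac)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


def normalize_pixel(value, mean=0.5, std=0.5):
    """Scale an 8-bit pixel to 0-1, then standardize it like a standard scaler."""
    scaled = value / 255.0
    return (scaled - mean) / std
```

Augmentation would normally be applied only to the training subset, while the same normalization is applied to all three subsets.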

#### **src/training/organize_dataset.py**
Traverses the raw folder structure of the Kaggle download, creates the benign/malignant target directories, and copies the files into them under new names. In short, it creates our processed data.
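
A simplified version of this step might look like the sketch below. It assumes that the class name (`benign` or `malignant`) appears as a folder name somewhere in each image's path; the function name `organize_dataset`, the `.png` filter, and the sequential naming scheme are hypothetical choices for this example.

```python
import shutil
from pathlib import Path

CLASSES = ("benign", "malignant")


def organize_dataset(raw_dir, processed_dir, extensions=(".png",)):
    """Copy images from a nested raw tree into flat benign/malignant folders.

    Assumes the class name appears as a folder name in each image's path.
    Returns the number of files copied per class.
    """
    raw_dir, processed_dir = Path(raw_dir), Path(processed_dir)
    counts = {name: 0 for name in CLASSES}
    for name in CLASSES:
        (processed_dir / name).mkdir(parents=True, exist_ok=True)

    for path in sorted(raw_dir.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in extensions:
            continue
        parts = {part.lower() for part in path.relative_to(raw_dir).parts[:-1]}
        label = next((name for name in CLASSES if name in parts), None)
        if label is None:
            continue  # skip files that are not under a class folder
        counts[label] += 1
        # Rename to a sequential, collision-free name such as benign_00001.png.
        new_name = f"{label}_{counts[label]:05d}{path.suffix.lower()}"
        shutil.copy2(path, processed_dir / label / new_name)
    return counts
```

Copying rather than moving leaves the raw download untouched, so the processed folder can be rebuilt at any time.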

#### **src/training/train_model.py**
Runs the training loop (10 epochs), collects the metrics for each epoch, saves the model with the best accuracy as **best_model.pth**, and updates **train_history.json**.
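
The bookkeeping part of such a loop (tracking the best accuracy and writing the history file) can be sketched without any training code. The helper names `record_epoch` and `save_history` and the JSON keys used here are assumptions; the actual layout of `train_history.json` is not documented in this file.

```python
import json
from pathlib import Path


def record_epoch(history, epoch, train_loss, val_loss, val_acc):
    """Append one epoch's metrics and report whether val_acc is a new best."""
    previous_best = max((e["val_acc"] for e in history), default=float("-inf"))
    history.append(
        {
            "epoch": epoch,
            "train_loss": train_loss,
            "val_loss": val_loss,
            "val_acc": val_acc,
        }
    )
    return val_acc > previous_best


def save_history(history, path="train_history.json"):
    """Rewrite the history file so a dashboard can read it after every epoch."""
    Path(path).write_text(json.dumps(history, indent=2))
```

Inside a training loop, a `True` return value from `record_epoch` would be the cue to write the checkpoint, for example with `torch.save(model.state_dict(), "best_model.pth")`.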

#### **src/training/evaluate_model.py**
Runs the saved model on the test DataLoader, calculates precision, recall, F1-score, and the confusion matrix, and saves them as **models/eval_results.json**.
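
To make the metric definitions concrete, the sketch below computes them from lists of true and predicted labels in plain Python. It is written without scikit-learn purely to show the arithmetic; the function name `binary_metrics`, the result keys, and the choice of label 1 as the positive (malignant) class are assumptions.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute a confusion matrix and precision/recall/F1 for the positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
        "confusion_matrix": [[tn, fp], [fn, tp]],
    }
```

The returned dictionary contains only numbers and nested lists, so it can be passed directly to `json.dump`.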

#### **src/core/xai_visualizer.py**
Generates Grad-CAM output: it registers forward/backward hooks, captures the target layer's activations and gradients, and creates heatmaps with **OpenCV**.
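
The core Grad-CAM arithmetic, applied after the hooks have captured the activations and gradients, can be sketched with NumPy as below. The function name `gradcam_from_hooks` and the `(C, H, W)` input layout are assumptions for this example; the hook registration and the OpenCV rendering are left out.

```python
import numpy as np


def gradcam_from_hooks(activations, gradients):
    """Combine hooked activations and gradients of shape (C, H, W) into a heatmap.

    Returns an (H, W) array scaled to the range 0-1.
    """
    # Channel weights: global-average-pool the gradients over the spatial axes.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of activation maps, then ReLU to keep positive evidence only.
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    # Normalize to 0-1 so the map can be resized and color-mapped for an overlay.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

The resulting map would typically be resized to the input image size and color-mapped, for example with `cv2.resize` and `cv2.applyColorMap`, before being blended over the original image.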

#### **src/scripts/test_xai.py**
A small helper script that tests the Grad-CAM function on its own (opens the sample image and shows the overlay).

## **NOTES**

- .gitkeep in models/ ensures that empty folders are kept in the repo; ui/ and notebooks/, while currently empty, are reserved for future non-Streamlit UI prototypes or Jupyter experiments.

- data/test_samples and logo_assets are small artifacts kept in the repository for quick demos and visual needs.


## **RUNNING the PROJECT**

*Because the training stage takes close to 2 hours, we will skip the training part and continue directly from the part that uses the already trained model.*

* `pip install -r requirements.txt`: Installing the libraries
* `python src/training/organize_dataset.py`: Organizing the data into classes
* `python src/training/train_model.py`: Training the model
* `python src/training/evaluate_model.py`: Calculating test set metrics
* `streamlit run app.py`: Starting the user interface (wrapper in root directory)
* `python src/scripts/test_xai.py`: Testing the XAI/Grad-CAM visualization