# Code Smell Detector CLI Tool — Usage Guide

This notebook explains how to use the `CodeSmellDetector` command-line tool to analyze source code files and detect common code smells with a fine-tuned [CodeT5](https://arxiv.org/abs/2109.00859) transformer model.

---

## Project Structure Requirements

Ensure the following directory layout exists in your project:

```
goit-cp-code-smell-transformers/
├── src/
│   ├── code_smell_detector/
│   │   ├── code_smell_detector.py
│   │   └── __init__.py
│   └── data_processing/
│       ├── cleaner.py
│       └── __init__.py
└── models/
    └── transformers/
        └── codet5/
            └── codet5-base_multilabel_finetuned/
```

The model path can be customized via the `--model_path` parameter.
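
Before running the tool, it can help to confirm that the layout is in place. The following is a minimal sketch, not part of the project: the helper name `missing_paths` and the `REQUIRED_PATHS` list are hypothetical, and the list simply mirrors the tree shown above.

```python
from pathlib import Path

# Paths copied from the directory layout above, relative to the project root.
# This list is an illustration; adjust it if your checkout differs.
REQUIRED_PATHS = [
    "src/code_smell_detector/code_smell_detector.py",
    "src/code_smell_detector/__init__.py",
    "src/data_processing/cleaner.py",
    "src/data_processing/__init__.py",
    "models/transformers/codet5/codet5-base_multilabel_finetuned",
]


def missing_paths(project_root):
    """Return the required paths that do not exist under project_root."""
    root = Path(project_root)
    return [p for p in REQUIRED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Project layout looks complete.")
```

Run it from the project root; an empty result means every path in the list was found.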

---

## Preprocessing

The tool internally applies code cleaning before prediction:
- Removes single-line and multi-line comments
- Normalizes whitespace
- Strips empty lines

This preprocessing is intended to make inference results more consistent across files that differ only in comments or formatting.
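
The three cleaning steps listed above can be sketched as follows. This is an illustrative approximation for Java-style comments, not the actual contents of `cleaner.py`; the function name `clean_code` is hypothetical. String and character literals are matched first so that a `//` inside a literal is not mistaken for a comment.

```python
import re

# Matches string/char literals (kept as-is) or comments (dropped).
# Literals are listed first so comment markers inside them are ignored.
_TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'      # double-quoted string literal
    r"|'(?:\\.|[^'\\])*'"     # character literal
    r"|//[^\n]*"              # single-line comment
    r"|/\*.*?\*/",            # multi-line comment
    re.DOTALL,
)


def clean_code(source):
    """Remove comments, collapse whitespace, and drop empty lines."""

    def _replace(match):
        token = match.group(0)
        # Comments start with "/"; literals start with a quote character.
        return " " if token.startswith("/") else token

    without_comments = _TOKEN.sub(_replace, source)

    cleaned_lines = []
    for line in without_comments.splitlines():
        # Collapse runs of whitespace (this also affects spaces inside literals).
        normalized = " ".join(line.split())
        if normalized:
            cleaned_lines.append(normalized)
    return "\n".join(cleaned_lines)
```

The sketch assumes well-formed input; unterminated literals or comments are not handled specially.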

---

## Usage

### Run from the command line (macOS/Linux)

Run the commands below from the project root (`goit-cp-code-smell-transformers/`) so that the `src` package can be resolved:

```bash
PYTHONPATH=src python -m src.code_smell_detector.code_smell_detector \
  --code /absolute/path/to/your/SourceFile.java
```

Alternatively, if `sys.path` patching is enabled (as in `code_smell_detector.py`), you can omit `PYTHONPATH`:

```bash
python -m src.code_smell_detector.code_smell_detector \
  --code /absolute/path/to/your/SourceFile.java
```

Optionally, append `--model_path` to either command to point to a custom fine-tuned model directory:

```bash
--model_path models/transformers/codet5/codet5-base_multilabel_finetuned
```

---

## Output

Example output for a given `.java` file:

```
Predicted code smells: Long Method, Feature Envy
```

The tool prints a **comma-separated list of predicted code smell categories**, including:

- `Long Method`
- `God/Large Class`
- `Feature Envy`
- `Data Class`
- `Clean` (when no smells are detected)
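
If the output needs to be consumed by another script, the result line can be split back into individual labels. The sketch below assumes the line always has the form shown in the example above (`Predicted code smells: ...`); the helper name `parse_prediction` is hypothetical, and the tool may print additional lines that this function does not handle.

```python
PREFIX = "Predicted code smells:"


def parse_prediction(line):
    """Split a result line into a list of code smell labels."""
    line = line.strip()
    if not line.startswith(PREFIX):
        raise ValueError(f"Unexpected output line: {line!r}")
    # Everything after the prefix is a comma-separated list of labels.
    remainder = line[len(PREFIX):]
    return [part.strip() for part in remainder.split(",") if part.strip()]


labels = parse_prediction("Predicted code smells: Long Method, Feature Envy")
print(labels)
```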

---

## Notes

- The tool is designed for **Java-like syntax**, but can be extended to support other languages.
- The model was fine-tuned using a **multi-label classification formulation** with natural language prompts:
  > `"detect code smell: {code snippet}"`
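
To make the multi-label formulation more concrete, here is a small sketch of how a prompt could be built and how per-label scores could be turned into the categories listed in the Output section. How the fine-tuned model actually produces its labels is not described in this guide, so the thresholding scheme, the `0.5` threshold, the label order, and the function names are all assumptions for illustration.

```python
# Label order is an assumption; the real model may use a different one.
LABELS = ["Long Method", "God/Large Class", "Feature Envy", "Data Class"]


def build_prompt(code_snippet):
    """Wrap a code snippet in the prompt format quoted above."""
    return f"detect code smell: {code_snippet}"


def decode_labels(probabilities, threshold=0.5):
    """Map one probability per label to the list of predicted smells.

    In a multi-label setting each label is decided independently, so
    zero, one, or several labels can pass the threshold at once.
    """
    if len(probabilities) != len(LABELS):
        raise ValueError("Expected one probability per label.")
    predicted = [
        label for label, p in zip(LABELS, probabilities) if p >= threshold
    ]
    # Fall back to "Clean" when no smell passes the threshold.
    return predicted or ["Clean"]


print(build_prompt("int x = 0;"))
print(decode_labels([0.9, 0.1, 0.7, 0.2]))
```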

---

Feel free to run inference on different files and explore predictions!