# Chapter 86: Development Best Practices

## **Learning Objectives**

By the end of this chapter, you will be able to:

- Understand the importance of code quality and maintainability in time‑series prediction systems.
- Apply coding standards (PEP 8, docstrings, type hints) to improve readability and reduce errors.
- Implement a comprehensive testing strategy, including unit tests, integration tests, and data validation tests.
- Establish an effective code review process that catches issues early and shares knowledge.
- Write clear documentation for code, APIs, and models to ensure long‑term usability.
- Use version control effectively with feature branches, commit conventions, and tagging.
- Set up continuous integration and continuous deployment (CI/CD) pipelines for automated testing and deployment.
- Recognise and manage technical debt to keep the system healthy.
- Foster a culture of knowledge sharing and continuous improvement within the team.

---

## **86.1 Introduction to Development Best Practices**

Building a time‑series prediction system, such as the NEPSE stock predictor, is not just about training a good model. The codebase must be robust, maintainable, and scalable. In a team environment, multiple developers contribute to the system over time. Without best practices, shortcuts pile up as “technical debt”: the code becomes tangled, bugs multiply, development slows, and the system can eventually fail.

Best practices are not one‑time activities but ongoing disciplines. They cover the entire software development lifecycle: writing code, testing, reviewing, documenting, deploying, and maintaining. In this chapter, we will cover the essential practices that every team working on a prediction system should adopt, with concrete examples drawn from the NEPSE project.

---

## **86.2 Code Quality Standards**

### **86.2.1 PEP 8 and Style Guides**

Python code should follow **PEP 8**, the official style guide. Consistent style makes code easier to read and maintain. Key points:

- Use 4 spaces per indentation level.
- Limit lines to 79 characters for code, 72 for docstrings.
- Use blank lines to separate functions and classes.
- Use descriptive variable names (`close_price` not `cp`).

Tools like `flake8`, `pylint`, and `black` (auto‑formatter) help enforce these rules. Integrating them into the CI pipeline ensures that all code meets the standard.

**Example: Inconsistent vs. Consistent Code**

```python
# Inconsistent (hard to read)
def calc_mae(p,a): return sum(abs(p-a))/len(p)

# Consistent (clear)
def calculate_mean_absolute_error(predictions, actuals):
    """Calculate the mean absolute error between predictions and actuals."""
    errors = [abs(p - a) for p, a in zip(predictions, actuals)]
    return sum(errors) / len(errors)
```

### **86.2.2 Docstrings and Comments**

Docstrings describe what a function, class, or module does. Use the **Google** or **NumPy** style. For the NEPSE system, every public function should have a docstring.

```python
def fetch_nepse_data(date):
    """
    Fetch raw NEPSE data for a given date.

    Args:
        date (datetime.date): The date for which to fetch data.

    Returns:
        pd.DataFrame: DataFrame with columns ['Symbol', 'Open', 'High', 'Low', 'Close', 'Vol'].

    Raises:
        ValueError: If no data is available for the given date.
    """
    # implementation...
```

Comments should explain **why**, not **what**. The code itself should show what it does.
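
As a brief illustration, the sketch below contrasts a comment that restates the code with one that records the reasoning. The helper `split_train_holdout` and its 30‑day default are assumptions made for this example, not part of the NEPSE codebase.

```python
from typing import List, Tuple


def split_train_holdout(
    closes: List[float], holdout_days: int = 30
) -> Tuple[List[float], List[float]]:
    """Split a chronological price list into training and hold-out parts."""
    if not 0 < holdout_days < len(closes):
        raise ValueError("holdout_days must be between 1 and len(closes) - 1")

    # Unhelpful "what" comment: slice the list at the split index.
    # Helpful "why" comment: the newest days are held out so that evaluation
    # only uses prices the model could not have seen during training.
    split_index = len(closes) - holdout_days
    return closes[:split_index], closes[split_index:]
```

The second comment would still be useful after the slicing logic changes, which is a reasonable test of whether a comment explains intent rather than mechanics.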

### **86.2.3 Type Hints**

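Type hints improve code readability and allow static type checkers (e.g., `mypy`) to catch errors before the code runs. Function annotations in the PEP 484 style are available from Python 3.5, and variable annotations from Python 3.6.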

```python
from typing import List, Optional
import pandas as pd

def engineer_features(
    df: pd.DataFrame,
    symbols: Optional[List[str]] = None
) -> pd.DataFrame:
    """
    Engineer features from raw NEPSE data.
    If symbols is provided, filter to those symbols.
    """
    ...
```

**Benefits**: Better IDE autocompletion, fewer runtime type errors, and self‑documenting code.
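
To make the "fewer runtime type errors" point concrete, here is a minimal sketch of the kind of mistake a static checker can catch. The function `average_close` is a hypothetical helper introduced only for this illustration.

```python
from typing import List, Optional


def average_close(closes: List[float]) -> Optional[float]:
    """Return the mean closing price, or None when no prices are given."""
    if not closes:
        return None
    return sum(closes) / len(closes)


# Correct usage: a list of floats.
print(average_close([410.0, 415.5, 412.5]))

# A static checker such as mypy would flag the call below as an incompatible
# argument type (a str passed where List[float] is expected). It is left
# commented out because at runtime it would fail inside sum().
# average_close("410.0")
```

The `Optional[float]` return type also tells callers that they must handle the `None` case, which a docstring alone would not enforce.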

---

## **86.3 Testing Strategies**

Testing is essential to ensure that changes do not break existing functionality. For a prediction system, we need tests at multiple levels.

### **86.3.1 Unit Tests**

Unit tests verify individual functions or methods. They should be fast and isolated (no external dependencies like databases or APIs). Use `pytest` as the testing framework.

**Example**: Testing a function that computes daily returns.

```python
# feature_engineering.py
import pandas as pd


def compute_daily_return(df: pd.DataFrame) -> pd.DataFrame:
    """Add a column 'daily_return' as percentage change of 'Close'."""
    df = df.copy()  # avoid mutating the caller's DataFrame
    df['daily_return'] = df['Close'].pct_change() * 100
    return df


# test_feature_engineering.py
import numpy as np
import pandas as pd
from feature_engineering import compute_daily_return


def test_compute_daily_return():
    data = pd.DataFrame({
        'Close': [100, 105, 103, 108]
    })
    result = compute_daily_return(data)
    # The first row has no previous close, so its return is NaN.
    expected = [np.nan, 5.0, -1.90476, 4.85437]  # rounded to 5 decimal places
    # check_less_precise was removed in pandas 2.0; pass an explicit tolerance instead.
    pd.testing.assert_series_equal(
        result['daily_return'],
        pd.Series(expected, name='daily_return'),
        rtol=1e-4
    )
```

**Best practices**: Test edge cases (empty DataFrame, single row, missing values). Use parameterised tests to cover multiple scenarios.
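
A parameterised test along these lines might look like the sketch below. It assumes `pytest` and `pandas` are installed, repeats `compute_daily_return` so the block runs on its own, and uses illustrative price lists that are not real NEPSE data.

```python
import pandas as pd
import pytest


def compute_daily_return(df: pd.DataFrame) -> pd.DataFrame:
    """Same logic as the function under test, repeated so the sketch is self-contained."""
    df = df.copy()
    df['daily_return'] = df['Close'].pct_change() * 100
    return df


@pytest.mark.parametrize(
    "closes, expected",
    [
        ([100.0], [float("nan")]),                            # single row: no previous close
        ([100.0, 100.0], [float("nan"), 0.0]),                # flat price
        ([100.0, 110.0], [float("nan"), 10.0]),               # simple gain
        ([50.0, 25.0, 50.0], [float("nan"), -50.0, 100.0]),   # loss, then recovery
    ],
)
def test_compute_daily_return_cases(closes, expected):
    data = pd.DataFrame({'Close': pd.Series(closes, dtype='float64')})
    result = compute_daily_return(data)['daily_return'].tolist()
    # nan_ok=True lets the leading NaN compare equal to the expected NaN.
    assert result == pytest.approx(expected, nan_ok=True)
```

Each tuple becomes a separate test case in the pytest report, so a failure points directly at the scenario that broke. Empty inputs and missing values deserve their own cases too, with expectations checked against the pandas version the project pins.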

### **86.3.2 Integration Tests**

Integration tests verify that components work together correctly, e.g., the feature engineering pipeline followed by model training. They may involve a test database or file system.

**Example**: Test that the full pipeline from raw data to prediction does not crash.

```python
import pandas as pd
import xgboost as xgb

# NEPSEIngestion and NEPSEFeatureEngineer are the project's own classes;
# import them from wherever they live in your codebase.


def test_end_to_end(tmp_path):
    # Create a small synthetic dataset
    raw_data = pd.DataFrame(...)
    # Write CSV, because the ingestion step below reads CSV
    raw_path = tmp_path / "raw.csv"
    raw_data.to_csv(raw_path, index=False)

    # Run ingestion
    ingest = NEPSEIngestion()
    df = ingest.fetch_from_csv(raw_path)

    # Run feature engineering
    engineer = NEPSEFeatureEngineer()
    features = engineer.compute_features(df)

    # Train a tiny model
    model = xgb.XGBRegressor(n_estimators=5)
    X = features[engineer.feature_columns]
    y = features['Close']
    model.fit(X, y)

    # Make a prediction for a single row and check its shape
    pred = model.predict(X.iloc[[0]])
    assert pred.shape == (1,)
```

Integration tests are slower, so run them less frequently (e.g., in CI but not on every save).
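
One way to separate fast and slow tests is a custom pytest marker, sketched below. The marker name `integration` and both placeholder tests are assumptions for this example. Custom markers should also be registered under the `markers` setting in the pytest configuration, otherwise pytest reports an unknown-mark warning.

```python
import pytest


@pytest.mark.integration
def test_pipeline_end_to_end_placeholder():
    """Stand-in for a slow test such as an end-to-end pipeline run."""
    assert True


def test_fast_unit_placeholder():
    """Unmarked tests stay in the quick default run."""
    assert 1 + 1 == 2


# Quick local run that skips the slow tests:   pytest -m "not integration"
# Full run, for example in CI:                 pytest
```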

### **86.3.3 Data Quality Tests**

In a prediction system, data quality is critical. Tests should validate:

- No missing values in critical columns.
- Date ranges are as expected.
- No duplicates (same symbol, same date).
- Prices are positive, volume non‑negative.

These can be implemented as assertions in the data ingestion or as separate test scripts.

```python
def test_data_quality(df):
    assert df['Close'].min() > 0, "Close prices must be positive"
    assert not df[['Symbol', 'Date']].duplicated().any(), "Duplicate symbol-date pairs"
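    # Trading calendars skip weekends and public holidays, so consecutive rows are
    # not always one day apart. Flag only long gaps (10 days is an assumed limit).
    max_gap_days = 10
    date_diffs = df.sort_values('Date').groupby('Symbol')['Date'].diff().dt.days
    assert (date_diffs.dropna() <= max_gap_days).all(), "Unexpected gap in trading dates"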
```

### **86.3.4 Model Tests**

Model tests ensure that trained models meet basic sanity checks:

- Model can predict on a sample input.
- Performance on a fixed validation set does not degrade beyond a threshold (regression testing).
- Feature importance is not extreme (e.g., no single feature dominates).
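
The checks above can be sketched in plain Python as follows. The helper names, the 5% tolerance, the 0.8 dominance limit, and all numbers are illustrative assumptions. In a real project the predictions would come from the trained model and the baseline from a stored evaluation result.

```python
from typing import Dict, Sequence


def mean_absolute_error(predictions: Sequence[float], actuals: Sequence[float]) -> float:
    """Average absolute difference between predictions and actual values."""
    if len(predictions) != len(actuals) or len(actuals) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predictions, actuals)) / len(actuals)


def check_no_regression(current_mae: float, baseline_mae: float, tolerance: float = 0.05) -> bool:
    """True when the current MAE is at most `tolerance` (a fraction) worse than the baseline."""
    return current_mae <= baseline_mae * (1 + tolerance)


def max_importance_share(importances: Dict[str, float]) -> float:
    """Share of total importance held by the single most important feature."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")
    return max(importances.values()) / total


def test_model_sanity():
    # Hypothetical fixed validation set and model output.
    actuals = [100.0, 102.0, 101.0, 105.0]
    predictions = [101.0, 101.0, 103.0, 104.0]
    mae = mean_absolute_error(predictions, actuals)

    # Regression test: fail if the error grows more than 5% over the baseline.
    assert check_no_regression(mae, baseline_mae=1.25)

    # Dominance test: no single feature should carry most of the importance.
    importances = {"Close_Lag_1": 0.5, "SMA_20": 0.3, "RSI": 0.2}
    assert max_importance_share(importances) < 0.8
```

Keeping the baseline value under version control alongside the test makes any change to the accepted performance level visible in code review.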

### **86.3.5 Test Coverage**

Aim for high test coverage (e.g., >80%). Use `pytest-cov` to measure coverage. However, coverage is not a goal in itself; focus on testing critical paths.

---

## **86.4 Code Review Process**

Code reviews are a powerful way to catch bugs, share knowledge, and maintain code quality.

### **86.4.1 Review Checklist**

A good code review checklist includes:

- Does the code meet the style guide?
- Are there unit tests for new functionality?
- Is the logic correct? (Reviewer should understand the code.)
- Are there any performance issues?
- Is error handling appropriate?
- Is documentation updated?

### **86.4.2 Review Tools**

Use platforms like GitHub, GitLab, or Bitbucket for pull requests. Require at least one approval before merging. Automated checks (linting, tests) should run and pass.

### **86.4.3 Best Practices**

- Keep pull requests small and focused (under 400 lines).
- Be constructive and respectful in comments.
- Explain *why* a change is needed in the description.
- Rotate reviewers to spread knowledge.

**Example PR description**:

```
## Description
Adds a new feature: 14-day RSI to the feature engineering pipeline.
- Implements RSI calculation as per technical analysis standard.
- Adds unit tests for RSI on known data.
- Updates feature list documentation.

## Testing
- Unit tests pass.
- Ran end‑to‑end pipeline on one month of NEPSE data; RSI values look plausible.

## Related Issue
Closes #123
```

---

## **86.5 Documentation Strategies**

Documentation is often neglected but vital for long‑term maintenance.

### **86.5.1 Code Documentation**

As covered earlier, use docstrings for all public modules, classes, and functions. Tools like Sphinx can generate HTML documentation from these docstrings.

### **86.5.2 API Documentation**

For services (e.g., prediction API), document endpoints, request/response formats, and error codes. FastAPI automatically generates OpenAPI (Swagger) docs.

```python
@app.post("/predict", response_model=PredictionResponse)
def predict(request: PredictionRequest):
    """
    Predict the closing price for a given symbol and date.

    The model uses features up to the day before the requested date.
    """
```

### **86.5.3 Architecture Documentation**

Maintain high‑level documents describing the system architecture, data flow, and component interactions. Include diagrams (e.g., using C4 model) in a `docs/` folder.

### **86.5.4 Model Documentation**

For each model, create a **model card** (as introduced in Chapter 77). This card should include:

- Model type and version.
- Training data period and source.
- Features used.
- Performance metrics (overall and per subgroup).
- Intended use and limitations.
- Ethical considerations.

**Example** (from Chapter 77):

```markdown
# Model Card: NEPSE Close Price Predictor v2.3

## Description
XGBoost regressor trained on daily NEPSE data from 2018‑2022.

## Performance
- Overall MAE: 12.34
- MAE on high‑volatility days (>3% change): 18.56
- MAE on low‑volatility days: 8.21

## Features
Close_Lag_1, Close_Lag_5, SMA_20, RSI, Volume_Z_Score, ...

## Limitations
Not suitable for predicting during market holidays or after major policy announcements.
```
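
Model cards drift out of date when they are written by hand, so some teams generate them from the metadata recorded at training time. The function below is a minimal sketch of that idea. `render_model_card` and its parameters are assumptions for this example, and the usage simply reuses the values from the card above.

```python
from typing import Dict, List


def render_model_card(
    name: str,
    version: str,
    description: str,
    metrics: Dict[str, float],
    features: List[str],
    limitations: str,
) -> str:
    """Render a Markdown model card from structured metadata."""
    lines = [f"# Model Card: {name} v{version}", "", "## Description", description, ""]
    lines.append("## Performance")
    lines.extend(f"- {label}: {value:.2f}" for label, value in metrics.items())
    lines.extend(["", "## Features", ", ".join(features), ""])
    lines.extend(["## Limitations", limitations])
    return "\n".join(lines) + "\n"


card = render_model_card(
    name="NEPSE Close Price Predictor",
    version="2.3",
    description="XGBoost regressor trained on daily NEPSE data from 2018-2022.",
    metrics={"Overall MAE": 12.34, "MAE on low-volatility days": 8.21},
    features=["Close_Lag_1", "Close_Lag_5", "SMA_20", "RSI", "Volume_Z_Score"],
    limitations="Not suitable for predicting during market holidays "
                "or after major policy announcements.",
)
print(card)
```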

### **86.5.5 README and Onboarding**

The repository `README.md` should explain how to set up the project, run tests, and contribute. Include a quick start guide for new team members.

---

## **86.6 Version Control**

Git is the de facto standard. Best practices include:

### **86.6.1 Branching Strategy**

A common strategy is **GitHub Flow**:

- `main` branch is always deployable.
- Create feature branches from `main` (e.g., `feature/add-rsi`).
- Open a pull request to merge back.
- After review, merge (usually squash merge to keep history clean).

For larger teams, **Git Flow** with `develop` and release branches may be used, but for most ML projects, GitHub Flow is sufficient.

### **86.6.2 Commit Messages**

Follow the [Conventional Commits](https://www.conventionalcommits.org/) specification:

```
feat: add RSI feature
fix: correct off-by-one error in lag calculation
docs: update API documentation for /predict
test: add unit tests for compute_daily_return
```

This makes it easy to generate changelogs and automate versioning.
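
As a rough sketch of how that automation can work, the code below groups commit messages by type and suggests a semantic version bump. The helper names are assumptions, and the bump rules follow the mapping described in the Conventional Commits specification (breaking change: major, `feat`: minor, otherwise patch). Established release tools cover many more cases.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional

# Header format: type, optional (scope), optional "!" for breaking changes, description.
COMMIT_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<description>.+)$"
)


def parse_header(message: str) -> Optional["re.Match[str]"]:
    """Match the first line of a commit message against the header pattern."""
    lines = message.splitlines()
    return COMMIT_PATTERN.match(lines[0]) if lines else None


def group_commits(messages: List[str]) -> Dict[str, List[str]]:
    """Group commit descriptions by type; non-conforming messages go under 'other'."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for message in messages:
        match = parse_header(message)
        if match:
            groups[match.group("type")].append(match.group("description"))
        else:
            groups["other"].append(message)
    return dict(groups)


def suggest_bump(messages: List[str]) -> str:
    """Return 'major', 'minor', or 'patch' for a list of commit messages."""
    bump = "patch"
    for message in messages:
        match = parse_header(message)
        if match is None:
            continue
        if match.group("breaking") or "BREAKING CHANGE:" in message:
            return "major"
        if match.group("type") == "feat":
            bump = "minor"
    return bump


commits = [
    "feat: add RSI feature",
    "fix: correct off-by-one error in lag calculation",
    "docs: update API documentation for /predict",
    "test: add unit tests for compute_daily_return",
]
print(group_commits(commits))
print(suggest_bump(commits))
```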

### **86.6.3 Tagging Releases**

When a new version of the model or API is deployed, tag the commit with a version number (e.g., `v2.3.0`). This allows rolling back to a known state if needed.

### **86.6.4 Ignoring Unnecessary Files**

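Use `.gitignore` to exclude virtual environments, data files, notebook checkpoint folders (`.ipynb_checkpoints/`), and credentials. A `.gitignore` rule cannot tell whether a notebook contains output, so clear notebook outputs before committing, for example with a tool such as `nbstripout`.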

---

## **86.7 CI/CD Pipelines**

Continuous Integration (CI) automatically runs tests on every push. Continuous Deployment (CD) automatically deploys to staging or production after tests pass.

### **86.7.1 CI Pipeline**

A typical CI pipeline for a Python project includes:

- Linting (`flake8`, `black --check`)
- Type checking (`mypy`)
- Unit tests (`pytest`)
- Integration tests (if feasible)

Use platforms like GitHub Actions, GitLab CI, or Jenkins.

**Example GitHub Actions workflow** (`.github/workflows/ci.yml`):

```yaml
name: CI

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.9'
      - name: Install dependencies
        run: |
          pip install poetry
          poetry install
      - name: Lint
        run: |
          poetry run flake8 .
          poetry run black --check .
      - name: Type check
        run: poetry run mypy .
      - name: Test
        run: poetry run pytest --cov=src
```

### **86.7.2 CD Pipeline**

For the prediction service, CD could:

- Build a Docker image.
- Push to a container registry.
- Deploy to a staging environment.
- Run smoke tests.
- If successful, deploy to production (maybe with a manual approval gate).

**Example deployment to Kubernetes** (using `kubectl`):

```yaml
- name: Deploy to production
  if: github.ref == 'refs/heads/main'
  run: |
    kubectl set image deployment/prediction-service prediction-service=${{ steps.build.outputs.image }}
    kubectl rollout status deployment/prediction-service
```
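
The smoke-test step in the list above can be as small as one request against a health endpoint. The sketch below assumes the service exposes a hypothetical `/health` route that returns `{"status": "ok"}`. Neither the route nor the response format comes from the NEPSE project, so adapt both to the real service.

```python
import json
import urllib.request
from typing import Callable, Tuple

Fetch = Callable[[str], Tuple[int, str]]


def http_get(url: str, timeout: float = 5.0) -> Tuple[int, str]:
    """Fetch a URL and return (status code, response body)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


def run_smoke_test(base_url: str, fetch: Fetch = http_get) -> bool:
    """Return True when the (assumed) /health endpoint reports a healthy service."""
    try:
        status, body = fetch(f"{base_url}/health")
        payload = json.loads(body)
    except (OSError, ValueError):
        # Network failures and HTTP errors are OSError subclasses;
        # invalid JSON raises a ValueError subclass.
        return False
    return status == 200 and isinstance(payload, dict) and payload.get("status") == "ok"
```

Passing `fetch` as a parameter keeps the check testable without a running service. In the pipeline, the script would call `run_smoke_test` with the staging URL and exit with a non-zero status when it returns `False`.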

### **86.7.3 Model Training Pipeline**

For models, CI/CD is often complemented by a **Continuous Training (CT)** pipeline that retrains models periodically. This is often implemented with Airflow or a similar scheduler, and the same version control and testing principles apply to it.

---

## **86.8 Refactoring and Technical Debt**

Technical debt refers to the cost of additional rework caused by choosing an easy solution now instead of a better approach that would take longer. It accumulates like financial debt, with interest (slower development, more bugs).

### **86.8.1 Signs of Technical Debt**

- Code is hard to understand or modify.
- Tests are slow or flaky.
- Duplicated code.
- Many “TODO” comments.
- Long functions or classes.
- Fear of changing code because “it might break something”.
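
Two of these signs, long functions and lingering TODO comments, are easy to measure automatically. The sketch below uses only the standard library. The function names, the length limit, and the sample source are assumptions for this illustration, and dedicated linters report far more than this.

```python
import ast
import io
import tokenize
from typing import List, Tuple


def find_long_functions(source: str, max_lines: int = 50) -> List[Tuple[str, int]]:
    """Return (name, length in lines) for functions longer than max_lines."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1  # end_lineno needs Python 3.8+
            if length > max_lines:
                results.append((node.name, length))
    return results


def count_todo_comments(source: str) -> int:
    """Count comments containing 'TODO', ignoring TODOs inside string literals."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return sum(
        1 for tok in tokens
        if tok.type == tokenize.COMMENT and "TODO" in tok.string
    )


SAMPLE = '''
def short():
    return 1  # TODO: handle holidays

def longer():
    a = 1
    b = 2
    c = 3
    return a + b + c
'''

print(find_long_functions(SAMPLE, max_lines=3))
print(count_todo_comments(SAMPLE))
```

Tracking these counts over time is usually more informative than any single reading, because a rising trend shows that debt is accumulating faster than it is being paid down.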

### **86.8.2 Managing Technical Debt**

- **Refactor regularly**: Allocate time in each sprint for small refactorings (e.g., “boy scout rule” – leave code cleaner than you found it).
- **Automate**: Use tools like `pylint` to flag complexity.
- **Document known debt**: Keep a list in the project’s issue tracker.
- **Prioritise**: Address debt in areas that change frequently or are critical.

**Example refactoring**: Replace repeated code with a function.

Before:
```python
sma_5 = df['Close'].rolling(5).mean()
sma_10 = df['Close'].rolling(10).mean()
sma_20 = df['Close'].rolling(20).mean()
```

After:
```python
def moving_average(series, window):
    return series.rolling(window).mean()

for w in [5, 10, 20]:
    df[f'SMA_{w}'] = moving_average(df['Close'], w)
```

---

## **86.9 Knowledge Sharing and Onboarding**

### **86.9.1 Pair Programming**

Pair programming (two developers working together) is excellent for sharing knowledge and catching mistakes in real time. It can be used for complex features or when onboarding a new team member.

### **86.9.2 Tech Talks and Demos**

Regularly schedule short presentations where team members share what they’ve learned, demonstrate a new feature, or discuss a recent challenge.

### **86.9.3 Onboarding Documentation**

Create a `CONTRIBUTING.md` file that explains:

- How to set up the development environment.
- How to run tests and linting.
- The branching strategy and pull request process.
- Where to find documentation.

Also maintain a `docs/onboarding.md` with a step‑by‑step guide for new developers.

### **86.9.4 Wiki or Knowledge Base**

Use a company wiki (Confluence, Notion) to store design documents, meeting notes, and post‑mortems. Keep it organised and up to date.

---

## **86.10 Continuous Improvement**

Development best practices are not static; they should evolve as the team and project grow. Conduct **retrospectives** (e.g., after each sprint or release) to discuss:

- What went well?
- What could be improved?
- What actions will we take?

Implement the agreed‑upon improvements in the next cycle.

---

## **Chapter Summary**

In this chapter, we covered the essential development best practices that ensure a time‑series prediction system (like the NEPSE stock predictor) remains robust, maintainable, and scalable. We discussed:

- Code quality through style guides, docstrings, and type hints.
- A multi‑level testing strategy including unit, integration, data quality, and model tests.
- The importance of code reviews and how to conduct them effectively.
- Documentation at the code, API, architecture, and model levels.
- Version control best practices: branching, commit messages, and tagging.
- CI/CD pipelines for automated testing and deployment.
- Managing technical debt through regular refactoring.
- Fostering a culture of knowledge sharing and continuous improvement.

Adopting these practices may require upfront effort, but they pay off in reduced bugs, faster development, and happier teams. As you continue to develop and enhance your prediction system, make these practices an integral part of your daily work.

In the next chapter, we will explore **Team Collaboration**, diving deeper into how teams structure themselves, communicate, and work together effectively.

---

**End of Chapter 86**

