# CI/CD for Machine Learning – Personal Notes

## 1. Introduction
This course covers **Continuous Integration (CI)** and **Continuous Delivery/Deployment (CD)** techniques tailored for machine learning workflows.

---

## 2. Software Development Life Cycle (SDLC)

### Definition
SDLC is a structured process for developing, deploying, and maintaining software applications.

### Key Stages
- **Build**: Compile source code into executable form.
- **Test**: Validate functionality and quality.
- **Deploy**: Release software into target environments.

---

## 3. SDLC in Machine Learning

### Unique Challenges
- ML models evolve with data; they are not static algorithms.
- Data engineering is resource-intensive: includes collection, transformation, storage, and serving.
- Integration with SDLC requires automation for speed and quality.

### Benefits of CI/CD in ML
- Streamlines delivery of high-quality ML software.
- Enables rapid prototyping and testing.
- Facilitates algorithm and hyperparameter exploration.
- Improves decision-making through faster iteration.

**Reference**: [Google Cloud – ML Lifecycle](https://cloud.google.com/blog/products/ai-machine-learning/making-the-machine-the-machine-learning-lifecycle)

---

## 4. What is CI/CD?

### Continuous Integration (CI)
- Automatically builds and tests code on integration into a shared repo.
- Prevents integration issues and ensures code stability.

### Continuous Delivery (CD)
- Automates delivery of code to production-like environments.
- Requires manual approval before deployment.

### Continuous Deployment (CD)
- Fully automates release to production without manual intervention.

---

## 5. CI/CD in Machine Learning

### Key Differences from Traditional Software
- ML = Code + Data -> both must be versioned.
- Experimentation requires tracking model performance and configurations.
- Reproducibility demands versioning of data, models, and code.
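
One way to picture "code + data must both be versioned" is a fingerprint that changes whenever either input changes. The sketch below is only an illustration using the standard library; the function name `fingerprint` and the toy inputs are assumptions, and dedicated tools such as DVC track data versions in their own way.

```python
import hashlib


def fingerprint(code: bytes, data: bytes, params: str) -> str:
    """Return a short, deterministic ID for one (code, data, config) combination."""
    digest = hashlib.sha256()
    for part in (code, data, params.encode("utf-8")):
        # Hash each part on its own first so part boundaries cannot blur together.
        digest.update(hashlib.sha256(part).digest())
    return digest.hexdigest()[:12]


# Hypothetical inputs standing in for a training script, a CSV file, and a config.
code = b"model.fit(X, y)"
data_v1 = b"temp,rain\n21.5,0\n"
data_v2 = b"temp,rain\n21.5,0\n18.0,1\n"

run_a = fingerprint(code, data_v1, "n_estimators=100")
run_b = fingerprint(code, data_v2, "n_estimators=100")
print(run_a, run_b, run_a != run_b)  # same code, different data, different version
```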

### Testing in ML CI
- Goes beyond unit tests: includes data preprocessing, training, and evaluation.
- Ensures pipeline reliability and model quality.
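
As a rough sketch of what "beyond unit tests" can mean, a CI job could check the incoming data and then refuse to pass if the model's metrics fall below agreed minimums. The helper names, column names, and thresholds below are all hypothetical.

```python
def check_columns(rows, required):
    """Return the required columns that are missing from at least one row."""
    missing = set()
    for row in rows:
        missing |= set(required) - set(row)
    return sorted(missing)


def quality_gate(metrics, minimums):
    """Return one message per metric that is absent or below its minimum."""
    failures = []
    for name, floor in minimums.items():
        value = metrics.get(name)
        if value is None or value < floor:
            failures.append(f"{name}: got {value}, need at least {floor}")
    return failures


# Toy data: the second row lacks the target column.
rows = [{"temp": 21.5, "rain_tomorrow": 0}, {"temp": 18.0}]
print(check_columns(rows, {"temp", "rain_tomorrow"}))

# Toy metrics: accuracy clears its floor, F1 does not.
print(quality_gate({"accuracy": 0.91, "f1": 0.58}, {"accuracy": 0.85, "f1": 0.60}))
```

In a pipeline, a non-empty result from either function would be turned into a failing exit code so the run is marked as failed.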

### Deployment Considerations
- More complex than traditional software.
- Requires:
  - Model serving infrastructure
  - Performance monitoring
  - Update management and rollback strategies

---

## 6. Course Scope

Focus areas:
- Data preparation and versioning
- Model development and evaluation
- Hyperparameter tuning
- CI/CD integration across these stages

---

## 7. Summary

### SDLC Workflow
- Build -> Test -> Deploy

### CI/CD Benefits in ML
- **CI**: Frequent code merging, early bug detection
- **CD (Delivery)**: Manual approval before release
- **CD (Deployment)**: Fully automated release

### ML-Specific Enhancements
- Data/model versioning for reproducibility
- Automation for experimentation
- Full pipeline testing
- Reliable and rapid deployment

Continuous deployment automatically releases every code change that passes the pipeline to production. Continuous delivery prepares every change for release but keeps a manual approval step before production deployment.
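
The difference between the two meanings of CD comes down to a single manual gate. A minimal sketch of that decision, with a hypothetical function name and return values:

```python
def release_decision(tests_passed: bool, mode: str, approved: bool = False) -> str:
    """Decide what happens to a change after the CI stage.

    mode is "delivery" (manual gate) or "deployment" (fully automated).
    """
    if not tests_passed:
        return "rejected"
    if mode == "deployment":
        return "released"
    if mode == "delivery":
        return "released" if approved else "awaiting approval"
    raise ValueError(f"unknown mode: {mode!r}")


print(release_decision(True, "deployment"))
print(release_decision(True, "delivery"))
print(release_decision(True, "delivery", approved=True))
```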

![image.png](attachment:3e1638c9-d138-42da-9357-d31700bafc0d.png)

Generic Workflow


# YAML for CI/CD in Machine Learning – Personal Notes

## 1. What is YAML?

### Definition
- YAML stands for **"YAML Ain't Markup Language"**
- A human-readable data serialization format used for:
  - Configuration files
  - Data exchange
  - Structured data representation

### Comparison
- Alternative to XML
- Comparable to JSON in functionality
- Designed for readability and simplicity

### Usage in CI/CD
- YAML is the backbone of configuration in tools like:
  - **GitHub Actions** (workflow orchestration)
  - **DVC** (pipeline stages and metadata)
- File extensions: `.yaml` or `.yml`

---

## 2. YAML Syntax

### Structure Rules
- Uses **indentation** and **line separation** to define hierarchy
- Indentation is space-based (no tabs allowed)
- Syntax errors often stem from inconsistent spacing

```yaml
name: Santosh
occupation: Instructor
# A full-line comment is valid
programming_languages:  # an inline comment is valid too
  python: advanced
  javascript: advanced
```

  


### Best Practices
- Use YAML-aware IDEs with validation support
- Comments begin with `#` and are ignored during parsing

---

## 3. YAML Scalars

### Supported Scalar Types
- **String**: quoted or unquoted; `"Rustam"` and `Rustam` are both strings
- **Number**: integer or float
- **Boolean**: `true` / `false` (unquoted)
- **Null**: `null` or `~`

### Notes
- Booleans and nulls must not be quoted to retain type
- Strings can be wrapped in `'single'` or `"double"` quotes when needed

---

## 4. YAML Collections

### Sequences (Lists)
- Ordered elements
- **a. Block style**: uses hyphens
  ```yaml
  - item1
  - item2
  - item3
  ```
- **b. Flow style**: uses brackets
  ```yaml
  [item1, item2, item3]
  ```

### Mappings (Key-Value Pairs)
- Uniquely keyed values
- Syntax:
  ```yaml
  key1: value1
  key2:
    - nested1
    - nested2
  key3: [val1, val2, val3]
  ```

![image.png](attachment:f089ce70-4ec8-479a-8f0b-a56108899283.png)


![image.png](attachment:9fa891f4-202e-42ca-be75-9262776af238.png)


# GitHub Actions (GHA) – Personal Notes

## 1. What is GitHub Actions?
![image.png](attachment:1742f301-4612-4a96-930c-eafad069f3b9.png)
### Definition
- GitHub Actions (GHA) is GitHub’s built-in automation and CI/CD system.
- Enables automation of build, test, and deployment pipelines directly within GitHub repositories.
- A **pipeline** is a sequence of interconnected steps representing the flow of work and data.

### Analogy
- Similar to a car assembly line: each step performs a specific task (e.g., attach engine, paint).
- In GHA, each step automates a part of the software development lifecycle.

**Reference**: [Medium – CI/CD with GitHub Actions for Android](https://medium.com/empathyco/applying-ci-cd-using-github-actions-for-android-1231e40cc52f)

---

## 2. Core Components of GitHub Actions

### Event
- An **event** triggers the execution of a workflow.
- Examples:
  - `push` to a branch
  - `pull_request` opened
  - `issues` opened (an issue is created)

### Workflow
- A **workflow** is a YAML-defined automated process.
- Stored in `.github/workflows/` directory.
- Can be triggered by:
  - Events
  - Manual triggers
  - Scheduled intervals
- Multiple workflows can exist in a repo:
  - One for testing PRs
  - One for deployment
  - One for issue labeling

### Steps and Actions
- A **step** is a unit of work executed in sequence.
- Steps share the same runner and can pass data between them.
- Examples:
  - Build application
  - Run tests
  - Execute shell scripts
- An **action** is a reusable application that performs a task.
  - Examples: `actions/checkout`, auto-commenting on PRs

![image.png](attachment:f1a3e195-9a37-435d-a37f-827024b89bfd.png)

### Jobs and Runners
- A **job** is a set of steps.
- Jobs are independent and can run in parallel.
- Jobs can be configured with dependencies.
- All steps in a job run on the same **runner** (compute machine).

---
![image.png](attachment:08478645-eea1-4b6a-a595-ce3aedde42b1.png)

## 3. Example Workflow

### Trigger
- A `push` event initiates the workflow.

### Job
- Runs on an Ubuntu Linux runner.

### Steps
```yaml
- name: Checkout code
  uses: actions/checkout@v3

- name: Run Python app
  run: python app.py
```

![image.png](attachment:7b91431b-6067-4613-a0aa-ac558b56ad8f.png)


![image.png](attachment:760a7523-1175-42a7-b9bd-9fbc12a62494.png)


You can also make one job depend on another job by using the `needs` key.
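
In GitHub Actions a job declares its dependencies with the `needs` key. To see what such dependencies imply for execution order, the sketch below groups a set of hypothetical jobs into "waves": jobs in the same wave have no unmet dependencies and could run in parallel. This only models the ordering and is not how GitHub schedules jobs internally.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs: each key maps to the set of jobs it needs to finish first.
needs = {
    "preprocess": set(),
    "lint": set(),
    "train": {"preprocess"},
    "report": {"train", "lint"},
}

sorter = TopologicalSorter(needs)
sorter.prepare()

waves = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # jobs whose dependencies are all done
    waves.append(ready)
    sorter.done(*ready)

for number, jobs in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(jobs)}")
```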

# Intermediate YAML 

## 1. Overview

To work effectively with GitHub Actions and other CI/CD tools, a deeper understanding of YAML is required—especially for handling multiline strings, dynamic values, and multi-document structures.

![image.png](attachment:0765f84a-5ab6-4c2f-a796-49bd18aa3f88.png)

---

## 2. Multiline Strings: Block Scalar Format

### Purpose
- Used to represent multi-line strings with preserved formatting.
- Common in:
  - Shell commands
  - Log messages
  - Configuration blocks

### Styles
- **Literal (`|`)**: preserves line breaks and indentation exactly.
- **Folded (`>`)**: collapses line breaks into spaces for wrapped text.

---

## 3. Literal Style (`|`)

### Behavior
- Maintains all line breaks and indentation.
- Ideal for shell scripts or formatted logs.

![image.png](attachment:4a6bc64e-bd27-469f-ac78-303c0e7de61b.png)

### Example
```yaml
script: |
  echo "Starting process"

    indented line
  echo "Done"
```

---

## 4. Folded Style (`>`)

### Behavior
- Converts single line breaks into spaces.
- Preserves blank lines and more-indented blocks.

![image.png](attachment:3678756a-9751-420f-9c89-b8596fa39c73.png)


![image.png](attachment:6f3e88ff-e229-4776-be75-0b51552c096b.png)


### Example
```yaml
message: >
  This is a long message
  that will be folded into
  a single paragraph.
```

---

## 5. Chomping Indicators

### Purpose
- Control how trailing newlines are handled in block scalars.
- Added directly after the style indicator (`|` or `>`).

### Modes
- **Clip** (default, no symbol needed): keeps a single newline at the end.
- **Strip** (`-`): removes all trailing newlines.

![image.png](attachment:b3e30fdd-a7c0-4a8e-9ebe-d9522b250b06.png)

- **Keep** (`+`): retains all trailing newlines.

![image.png](attachment:dff853e9-481b-4681-819f-5b8dfe22aba3.png)

### Example
```yaml
log: |-
  Line one
  Line two
```

```yaml
log: |+
  Line one
  Line two
```
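
To make the three chomping modes concrete, the sketch below applies each one to the raw text of a block scalar. It is a simplified model written for these notes (the function name `chomp` is an assumption), not a YAML parser.

```python
def chomp(body: str, indicator: str = "") -> str:
    """Apply a YAML chomping indicator to the raw text of a block scalar.

    "" (clip) keeps one final newline, "-" (strip) keeps none,
    "+" (keep) leaves every trailing newline in place.
    """
    if indicator == "+":
        return body
    stripped = body.rstrip("\n")
    if indicator == "-":
        return stripped
    if indicator == "":
        # Clip: exactly one newline, unless there is no text at all.
        return stripped + "\n" if stripped else ""
    raise ValueError(f"unknown chomping indicator: {indicator!r}")


raw = "Line one\nLine two\n\n\n"  # block content followed by two blank lines
for mark in ("", "-", "+"):
    print(repr(mark), repr(chomp(raw, mark)))
```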

## 6. Dynamic Value Injection
### Description
- Not part of the standard YAML spec.
- Used by specific tools to inject runtime values.
- Syntax: `${{ expression }}` or `$ENV_VAR`

### Use Cases
- Referencing environment variables
- Accessing config values from other YAML sections

### Example
```yaml
host: ${{ secrets.host_url }}
database: ${{ secrets.DB_URL }}
```

Note: Support depends on the tool (e.g., GitHub Actions, Helm, etc.).
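
Because the injection happens in the tool rather than in YAML itself, it can be thought of as a text substitution pass. The sketch below imitates that idea with a regular expression; it handles only the simple `context.name` form, the values are made up, and it is not how GitHub Actions evaluates expressions.

```python
import re

# Matches ${{ context.name }} with optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\$\{\{\s*(\w+)\.(\w+)\s*\}\}")


def render(text: str, contexts: dict) -> str:
    """Replace every ${{ context.name }} with its value from contexts."""

    def lookup(match: re.Match) -> str:
        context, name = match.group(1), match.group(2)
        return contexts[context][name]  # raises KeyError if the value is undefined

    return PLACEHOLDER.sub(lookup, text)


config = "host: ${{ secrets.host_url }}\ndatabase: ${{ secrets.DB_URL }}"
# Hypothetical values; real secrets would never be written into source code.
values = {"secrets": {"host_url": "db.example.com", "DB_URL": "postgres://example"}}
print(render(config, values))
```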

## 7. Multi-Document YAML
### Purpose
- Store multiple independent YAML documents in one file.
- Useful for grouping related configs or metadata.

### Syntax
- Use `---` to separate documents.

### Example
```yaml
---
name: Alice
age: 30
---
name: Bob
age: 40
occupation: Engineer
---
name: Carol
age: 25
```

### References
- yaml-multiline.info

![image.png](attachment:2306b283-94f8-4bdc-8ecd-3328096e4b50.png)

```python
import yaml

# safe_load_all returns a lazy generator, so iterate while the file is still open.
with open("yaml_practice/demo.yaml", "r") as f:
    docs = yaml.safe_load_all(f)
    for i, doc in enumerate(docs):
        print(f"Document {i+1}:")
        print(doc)
```


Document 1:
{'string_value': 'hello world', 'integer_value': 42, 'float_value': 3.14, 'boolean_true': True, 'boolean_false': False, 'null_value_1': None, 'null_value_2': None, 'tools': ['GitHub Actions', 'DVC', 'MLflow', 'Great Expectations'], 'mlops_stack': ['GitHub Actions', 'DVC', 'MLflow', 'Great Expectations'], 'pipeline_stage': {'name': 'model_training', 'duration_minutes': 45, 'dependencies': ['data_preprocessing', 'feature_engineering']}, 'bash_script': '#!/bin/bash\necho "Starting model training"\n\n  python train.py --epochs 50\necho "Training complete"\n', 'deployment_notes': 'This deployment includes updated model weights, revised preprocessing logic, and improved monitoring hooks for production stability.\n', 'note_strip': 'This string has no trailing newlines.\nAnother line is added.', 'note_keep': 'This string keeps all trailing newlines.\nAnother line is added.\n\n', 'current_date': datetime.datetime(2024, 6, 15, 12, 0, tzinfo=datetime.timezone.utc), 'database_url': '${

# CI/CD with GitHub Actions

## 1. What is GitHub Actions?

GitHub Actions is GitHub’s built-in automation system for CI/CD. It allows you to define workflows that build, test, and deploy your code based on events in your repository.

- **Workflow**: A YAML-defined automation pipeline
- **Event**: Triggers the workflow (e.g., `push`, `pull_request`)
- **Job**: A set of steps executed on a runner
- **Step**: A single task (e.g., run script, checkout code)
- **Runner**: The compute machine that executes jobs

---

## 2. Anatomy of a GitHub Actions Workflow

### Minimal Example
```yaml
name: CI

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Run script
        run: |
          echo "Running script.py"
          python3 script.py
```

- `run: |` uses the literal block style to preserve line breaks.
- Jobs can run in parallel unless dependencies are defined.

## 3. Setting Up GitHub Actions
### Steps
1. Create a repo at github.com/new.
2. Add a Python `.gitignore` and a license.
3. Navigate to **Settings > Actions** and enable permissions.
4. In the **Actions** tab, pick the simple workflow template and commit it to create the first YAML file.

![image.png](attachment:cf2c32b5-c4e6-4c93-9c92-9f1e353104db.png)

5. Create `.github/workflows/ci.yaml`.
6. Commit and push to trigger the workflow.

![image.png](attachment:0def0dd9-1e3a-4304-b86b-302c7ce5dc65.png)

A new workflow run then shows up in the **Actions** tab.

![image.png](attachment:980f3ebf-51ec-4d75-b45d-cbfd6371b3cc.png)

## 4. Inspecting Workflow Runs
1. Go to the **Actions** tab.
2. Click the workflow name.
3. View job logs and step outputs.

Click the workflow run that was just added. It contains the `build` job, and opening that job shows its full description.

![image.png](attachment:834e730a-8f01-4305-9308-fbd0d835d652.png)

The build view shows the detailed output logs for each step that GitHub Actions executed.




![image.png](attachment:11b0f4f4-101b-459d-a992-e0785e29c08b.png)






# GitHub Actions – Pull Request Triggered CI Pipeline

Topics covered: branching, workflow configuration, action syntax, and log inspection.

---

## 2. Shared Repository Model


![image.png](attachment:1a8dbbe6-341d-4808-86f7-3a89f45b214e.png)

In collaborative development:
- Developers work on **feature/topic branches**
- Changes are merged via **pull requests (PRs)**
- CI/CD tools run tests on PR creation:
  - Code quality
  - Security vulnerabilities
  - Compatibility checks

This early feedback ensures high-quality code before merging into `main`.

---

## 3. Creating a Feature Branch

### Steps
1. Go to your repo’s landing page
2. Click **Branch**
3. Click **New branch**
4. Name it (e.g., `pr-workflow`)
5. Confirm it’s active

---

## 4. Adding Repository Code

Create a simple Python script:
```python
# hello_world.py
import datetime

print("Hello, World!")
print("Current time:", datetime.datetime.now())
```
## 5. Configuring Workflow Trigger
Update your workflow YAML to trigger on pull requests:

```yaml
name: PR Workflow

on:
  pull_request:
    branches: [main]
```

This runs the workflow when a PR targets the `main` branch.


## 6. GitHub Actions Syntax
### Action Format
```yaml
uses: org_or_user/repo_name@version
```

### With Arguments
```yaml
with:
  argument_name: value
```
Think of Actions as functions and `with` as their parameters.

## 7. Workflow Steps and Actions
### Full Example
```yaml
name: PR Workflow

on:
  pull_request:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Run script
        run: |
          echo "Running hello_world.py"
          python3 hello_world.py
```

## 8. Creating a Pull Request
### Steps
1. Commit the workflow to `pr-workflow`.
2. Open a PR from `pr-workflow` → `main`.
3. GitHub Actions will trigger automatically.

## 9. Inspecting Workflow Logs
1. Go to the **Actions** tab.
2. Click the workflow run.
3. Click the job (e.g., `build`).
4. View logs for:
   - Checkout repository
   - Setup Python
   - Run script

You’ll see output from `hello_world.py`, confirming successful execution.




## 2. Shared Repository Model
In collaborative environments:
- Multiple developers work on the same repository simultaneously.
- Feature or topic branches are created to organize related work.
- When work is complete, developers open a pull request for code review.
- CI/CD tools run tests automatically on PR creation, checking for:
  - Code quality
  - Security vulnerabilities
  - Compatibility with other components  
This early feedback ensures issues are caught before merging into the main branch, maintaining high-quality code.

---

## 3. Creating a Feature Branch
- Navigate to the repository landing page.
- Click on **Branch** → **New branch**.
- Provide a name (e.g., `pr-workflow`).
- Confirm that the new branch is active.

---

## 4. Adding Repository Code
- Add a simple Python script to the branch.
- The script prints a “Hello World” message and the current time.
- Commit the file as `hello_world.py`.

---

## 5. Configuring Workflow Event
- Modify the workflow trigger from `on: push` to `on: pull_request`.
- The `branches` key specifies the target branch of the PR (in this case, `main`).
- This ensures the workflow runs when a PR is opened against `main`.

---

## 6. Actions Syntax
- Actions are defined in workflow steps under the `uses` key.
- Syntax: `organization/repository@version`.
- Arguments can be passed using the `with` key.
- Actions function like reusable modules, with parameters acting as inputs.
- Many ready-to-use Actions are available in the GitHub Marketplace.

---

## 7. Configuring Workflow Steps and Actions
- To run repository code, two key steps are required:
  - **Checkout**: Uses the `checkout` action to retrieve repository code.
  - **Setup Python**: Uses the `setup-python` action to configure the Python environment.
- Each action specifies its version (e.g., `@v3`, `@v4`) and arguments such as `python-version`.

---

## 8. Putting It Together
- The workflow now triggers on pull requests.
- Steps include:
  - Checking out the repository
  - Setting up Python
  - Running the Python script  
This creates a complete CI pipeline for PR validation.

---

## 9. Creating a Pull Request
- Commit workflow changes to the `pr-workflow` branch.
- Open a PR from `pr-workflow` → `main`.
- The workflow triggers immediately upon PR creation.
- Status can be inspected via the **Details** link.

---

## 10. Inspecting Workflow Logs
- Logs show execution of all steps:
  - Repository checkout
  - Python setup
  - Script execution  
The successful run confirms that the workflow is correctly configured.

Put the following in the workflow YAML file. Whenever a pull request targets `main`, it sets up Python and runs the script on the GitHub runner:

```yaml
name: PR

on:
  pull_request:
    branches: ["main"]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: setup python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'
      - name: run test.py
        run: |
          echo "Starting test.py execution"
          python cicd_assets/test.py
          echo "Finished test.py execution"
```

For this to work, the branch (here `pr-workflow`) must contain the workflow file and `cicd_assets/test.py`, or whichever Python file should be executed. The workflow above sets up the Python environment and runs that file. It only runs once a pull request is opened from the branch, and the logs are then reachable from the pull request page. The script was written to print the date and a few log messages, and the log shows them as expected.
 
 
![image.png](attachment:2d6bb4ee-4243-45b8-93ed-83b3181d3150.png)
 



## 2. Contexts in GitHub Actions
GitHub provides predefined **contexts**—structured data available during a workflow run. These contexts allow dynamic behavior based on the environment or event.

### Common Contexts:
- `github`: Information about the repository, workflow, and event.
- `env`: Variables defined at workflow, job, or step level.
- `secrets`: Encrypted values like API keys or tokens.
- `job`: Metadata about the current job.
- `runner`: Details about the runner executing the job.

Access syntax: `${{ github.actor }}`, `${{ secrets.MY_SECRET }}`, `${{ env.MY_VAR }}`  
Reference: [GitHub Contexts Documentation](https://docs.github.com/en/actions/learn-github-actions/contexts)

---

## 3. Environment Variables
- Used for non-sensitive data (e.g., compiler flags, usernames).
- Declared using the `env` keyword.
- Scope can be workflow-wide, job-specific, or step-specific.
- Accessed via `${{ env.VARIABLE_NAME }}`.

---

## 4. Secrets
- Used for sensitive data (e.g., passwords, API keys).
- Encrypted and masked in logs.
- Accessed via `${{ secrets.SECRET_NAME }}`.
- Can be passed as environment variables or action inputs.
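
From the point of view of a Python script started by a workflow step, both `env` values and secrets passed as environment variables arrive through `os.environ`. The sketch below reads such values and hides the secret before printing. The variable names `MODEL_NAME` and `API_KEY` and the `mask` helper are hypothetical; GitHub applies its own masking to logs.

```python
import os
from typing import Optional


def read_setting(name: str, default: Optional[str] = None) -> str:
    """Read a value that the workflow exported as an environment variable."""
    value = os.environ.get(name, default)
    if value is None:
        raise RuntimeError(f"{name} is not set; check the workflow's env block")
    return value


def mask(text: str, secrets: list) -> str:
    """Replace secret values with *** before the text is printed."""
    for secret in secrets:
        if secret:
            text = text.replace(secret, "***")
    return text


# Stand-ins for values a workflow would normally provide via `env:`.
os.environ.setdefault("MODEL_NAME", "random_forest")
os.environ.setdefault("API_KEY", "s3cr3t-value")

api_key = read_setting("API_KEY")
print(mask(f"Training {read_setting('MODEL_NAME')} with key {api_key}", [api_key]))
```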

---

## 5. Setting Secrets in GitHub
To add a repository-level secret:
- Go to the repository → Settings → Secrets and Variables → Actions.
- Click the **Secrets** tab → **New repository secret**.

![image.png](attachment:33254306-d10e-47e4-b31d-7ac2b4a82e06.png)

- Provide a name and value → Click **Add secret**.


![image.png](attachment:8aa55c67-c5c0-4880-83b3-ed0674643501.png)


---

## 6. GITHUB_TOKEN Secret
- Built-in secret automatically available in every workflow.
- Enables interaction with GitHub API:
  - Clone repository
  - Open/close issues and PRs
  - Comment on issues and PRs
- Permissions are auto-configured based on the event.
- Can be scoped using:
  ```yaml
  permissions:
    pull-requests: write
  ```

---

## 7. Example: Commenting on a Pull Request
Use the `thollander/actions-comment-pull-request` Action to post comments via GitHub Actions.

![image.png](attachment:cff53530-03f0-4b4f-ae82-7641940b5d95.png)

### Workflow Snippet
```yaml
jobs:
  comment-on-pr:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - name: Comment on PR
        uses: thollander/actions-comment-pull-request@v2
        with:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          message: |
            Hello world! This is an automated comment.
```

This posts a comment on the pull request using the GitHub Actions bot.

## Topic: Model Training with GitHub Actions and CML

---

## 1. Overview
We explored how to automate machine learning model training using GitHub Actions, with a focus on CI/CD integration via Continuous Machine Learning (CML).

---

## 2. Dataset: Weather Prediction in Australia
- Source: [Kaggle – Weather Data](https://www.kaggle.com/datasets/rever3nd/weather-data)
- Task: Binary classification – predict whether it will rain tomorrow.
- Features:
  - 5 categorical: location, wind directions, rain today, etc.
  - 17 numerical: temperature, wind gusts, rainfall amount, etc.

---

## 3. Modeling Workflow
- Convert categorical features to numerical (target encoding).
- Impute missing values (mean strategy).
- Scale features to zero mean and unit variance.
- Split into train/test sets.
- Train a `RandomForestClassifier` with fixed hyperparameters.
- Report metrics: precision, recall, accuracy, F1 score.

> No hyperparameter tuning yet — deferred to later stages.

---

## 4. Data Preparation: Target Encoding
- Reference: [Target Encoding Blog](https://maxhalford.github.io/blog/target-encoding/)
- Strategy:
  - Replace each categorical value with its average target value.
  - Useful for high-cardinality features.
  - Avoids complexity of one-hot encoding.
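
The strategy above can be sketched in plain Python. This is a toy version with made-up rows, not the course's preprocessing code; the function names and the fallback to the global mean for unseen categories are assumptions.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets):
    """Map each category to the mean of the target over its rows."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    return {category: sums[category] / counts[category] for category in sums}


def apply_target_encoding(categories, mapping, fallback):
    """Encode categories; values not seen during fitting get the fallback."""
    return [mapping.get(category, fallback) for category in categories]


# Toy rows: a location column and the rain-tomorrow target (1 = rain).
locations = ["Sydney", "Perth", "Sydney", "Darwin", "Perth", "Sydney"]
rain_tomorrow = [1, 0, 1, 1, 0, 0]

mapping = fit_target_encoding(locations, rain_tomorrow)
global_mean = sum(rain_tomorrow) / len(rain_tomorrow)
print(mapping)
print(apply_target_encoding(["Perth", "Sydney", "Hobart"], mapping, global_mean))
```

In practice the mapping would normally be fitted on the training split only, so that test-set targets do not leak into the features.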

---

## 5. Imputing and Scaling
- Impute missing values using mean.
- Scale features using `impute_and_scale_data` function:
  - Zero mean
  - Unit standard deviation
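
A standalone sketch of the two operations on a single column, using only the standard library. It is not the course's `impute_and_scale_data` function; the name `impute_and_scale` and the toy values are assumptions. It uses the population standard deviation, which is also what scikit-learn's `StandardScaler` uses.

```python
from statistics import fmean, pstdev


def impute_and_scale(column):
    """Fill None with the column mean, then standardize to zero mean, unit std."""
    observed = [value for value in column if value is not None]
    mean = fmean(observed)
    filled = [mean if value is None else value for value in column]
    std = pstdev(filled)
    if std == 0:
        return [0.0 for _ in filled]  # constant column: nothing to scale
    return [(value - mean) / std for value in filled]


temperatures = [20.0, None, 24.0, 16.0]  # toy column with one missing value
scaled = impute_and_scale(temperatures)
print(scaled)
```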

---

## 6. Model Training
- Split data using `train_test_split` from scikit-learn.
- Train using `RandomForestClassifier`:
  - Often reaches good accuracy with little tuning
  - Relatively robust to overfitting
  - Handles large feature sets

---

## 7. Evaluation Metrics
- Reference: [Scikit-learn Classification Metrics](https://scikit-learn.org/stable/modules/model_evaluation.html#classification-metrics)
- Metrics reported:
  - Accuracy
  - Precision
  - Recall
  - F1 Score

---

## 8. Confusion Matrix Plot
- Visualized as a heatmap.
- Cells show:
  - True Positive
  - False Positive
  - True Negative
  - False Negative
- Diagonal cells = correct predictions.
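
The four cells of the confusion matrix are enough to derive all of the reported metrics. A small sketch with made-up labels (scikit-learn provides equivalent functions; this version only spells out the arithmetic):

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Derive accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels: 1 = rain tomorrow, 0 = no rain.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))
print(classification_metrics(y_true, y_pred))
```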

---

## 9. GitHub Actions Workflow
- Trigger: Pull request from feature branch → main.
- Tool: [Continuous Machine Learning (CML)](https://cml.dev/)
- Purpose:
  - Provision runner
  - Train and evaluate model
  - Compare experiments
  - Monitor dataset changes
  - Auto-generate visual report in PR

![image.png](attachment:f074fdda-fb03-4611-a739-3f2acb4c3c9a.png)

> Reference: [Feature Branch Workflow](https://martinfowler.com/bliki/FeatureBranch.html)

---

## 10. CML Commands in Workflow
- Use `setup-cml` GitHub Action.
- Run training code as shell command.
- Read outputs:
  - `results.txt`
  - Graph image
- Write to markdown file.
- Use:
  ```bash
  cml comment create report.md
  ```
  to post results in the pull request.
- The GitHub token is passed as an environment variable to enable commenting.

---

## 11. Output
- When a PR is opened, the workflow runs.
- CML posts a comment with:
  - Evaluation metrics
  - Confusion matrix plot
  - Summary of model performance
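
The report-building step (read the metrics file, reference the plot, write a markdown file) is usually a few shell lines in the workflow. The same idea is sketched here in Python for clarity. The section titles, the plot file name, and the metric value are assumptions, and how the image ends up uploaded to the comment depends on the CML version.

```python
import tempfile
from pathlib import Path


def build_report(results_path: Path, plot_name: str, report_path: Path) -> str:
    """Combine a metrics text file and a plot reference into one markdown report."""
    metrics = results_path.read_text(encoding="utf-8").strip()
    report = "\n".join(
        [
            "## Model metrics",
            "",
            metrics,
            "",
            "## Confusion matrix",
            "",
            f"![Confusion matrix]({plot_name})",
            "",
        ]
    )
    report_path.write_text(report, encoding="utf-8")
    return report


# Demo in a temporary directory with a toy results file.
with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp) / "results.txt"
    results.write_text("accuracy: 0.75\n", encoding="utf-8")
    print(build_report(results, "confusion_matrix.png", Path(tmp) / "report.md"))
```

The resulting `report.md` is the file that `cml comment create` would then post.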


# Training the Model in GitHub Actions
The repository needs the following files and folders:


```
processed_dataset/
rsw_dataset/
  weather.csv
metrics_and_plots.py
model.py
preprocess_dataset.py
train.py
utils_and_constants.py
```

All of these must be present in the repository.

GitHub Actions executes them through a workflow file, `blank.yaml`, stored under `.github/workflows/`. That workflow runs the preprocessing and training scripts. The run produces a confusion matrix as a PNG, a preprocessed CSV in `processed_dataset/`, and a `.json` file with the accuracy metric, which is computed in `train.py`.
