Fraud Detection - Training Job

This component runs as a Kubernetes batch Job in the fraud-detection workflow. The job is responsible for periodically retraining the fraud detection model. Unlike the checkout and inference services, which are long-running Kubernetes Deployments, this job exits once the model has finished training and been saved.

Workflow

The training job follows these steps:

1. Consume Events: The job connects to the RabbitMQ queue (checkout.events), which is provisioned by Terraform, and consumes the transaction events that have arrived since the last run.
2. Append Data: The new events are appended to a historical dataset (data.csv) stored on a persistent volume, also provisioned by Terraform.
3. Train Model: A new scikit-learn LogisticRegression model is trained on the entire updated dataset.
4. Save Model: The newly trained model artifact (model.pkl) is saved to a MinIO object storage bucket. The inference service then loads this model to make predictions.
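
The append, train, and serialize steps above can be sketched as follows. This is a minimal illustration, not the code in main.py: the function names (append_events, train_model, serialize_model) are hypothetical, events are assumed to be dicts with an "amount" field, and "amount" is assumed to be the only feature. The RabbitMQ consumer and the MinIO upload are left out because they need live services; the sketch takes a list of events as input and stops at the pickled bytes that would be uploaded as model.pkl.

```python
import io
import os
import pickle
import tempfile

import pandas as pd
from sklearn.linear_model import LogisticRegression


def append_events(events, csv_path):
    """Append new event dicts to the historical CSV and return the full dataset."""
    new_rows = pd.DataFrame(events)
    if os.path.exists(csv_path):
        history = pd.read_csv(csv_path)
        full = pd.concat([history, new_rows], ignore_index=True)
    else:
        full = new_rows  # first run: no history yet
    full.to_csv(csv_path, index=False)
    return full


def train_model(df):
    """Fit LogisticRegression on the whole dataset using the rule-based labels."""
    X = df[["amount"]]                     # assumption: amount is the only feature
    y = (df["amount"] > 1000).astype(int)  # labeling rule described in this README
    model = LogisticRegression()
    model.fit(X, y)                        # raises ValueError if only one class is present
    return model


def serialize_model(model):
    """Pickle the model to bytes, ready to be uploaded as model.pkl."""
    buffer = io.BytesIO()
    pickle.dump(model, buffer)
    return buffer.getvalue()


# Two simulated runs against a temporary stand-in for the persistent volume.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.csv")
    append_events([{"amount": 20.0}, {"amount": 1500.0}], path)
    dataset = append_events([{"amount": 75.5}, {"amount": 4200.0}], path)

model = train_model(dataset)
artifact = serialize_model(model)
print(len(dataset), type(pickle.loads(artifact)).__name__)
```

Because the model is retrained on the full history each run, training time grows with data.csv. The dataset also needs at least one transaction on each side of the threshold, otherwise the fit fails.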

Model Labeling Strategy

A crucial part of the training process is how transactions are labeled as fraudulent or not. Since we don't have pre-existing fraud labels, the script employs a simple rule-based approach for training purposes:

Note: Any transaction with an amount greater than 1000 is automatically labeled as fraudulent (1), and all others are labeled as non-fraudulent (0).

This is implemented in the code as:

y = (df["amount"] > 1000).astype(int) # Fake threshold for fraud
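
To make the rule concrete, here is a small self-contained example that applies the same expression to a few made-up amounts. The sample values and the FRAUD_THRESHOLD constant name are illustrative assumptions; only the comparison itself comes from the line above.

```python
import pandas as pd

FRAUD_THRESHOLD = 1000  # threshold stated in this README; the constant name is hypothetical

# Made-up amounts chosen to straddle the threshold.
df = pd.DataFrame({"amount": [25.0, 999.99, 1000.0, 1000.01, 5400.0]})

# Same rule as in the training script: strictly greater than the threshold.
y = (df["amount"] > FRAUD_THRESHOLD).astype(int)

print(y.tolist())  # [0, 0, 0, 1, 1]
```

Note that the comparison is strict, so a transaction of exactly 1000 is labeled non-fraudulent.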
