Proof of concept (PoC) for a dashboard built specifically for monitoring ML workloads (training and deployment).
This project has three parts:
- Jobs: three example jobs (image classification training, text classification training, text classification deployment). Each job logs a different set of values, to mimic real-world behaviour.
- Server: the core of the project. This is the part we want to make efficient and easy to use.
- Website: the dashboard
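
To make the split between the three parts concrete, here is a minimal sketch of how a job might send a log record to the server and how the dashboard might read it back. Everything in it is an assumption made for illustration: the `LogRecord` schema, the `InMemoryServer` class, the job names and the metric names are hypothetical and are not taken from this project's code. The real server would presumably sit behind a network API and use persistent storage.

```python
import json
import time
from collections import defaultdict
from dataclasses import asdict, dataclass, field


@dataclass
class LogRecord:
    """One log entry emitted by a job (hypothetical schema)."""
    job: str        # e.g. "image-classification-training"
    kind: str       # "training" or "deployment"
    metrics: dict   # free-form; each job logs different values
    timestamp: float = field(default_factory=time.time)


class InMemoryServer:
    """Stand-in for the server: keeps records in memory, grouped by job."""

    def __init__(self):
        self._records = defaultdict(list)

    def ingest(self, payload: str) -> None:
        # Jobs send JSON; the server parses it and stores the record.
        record = LogRecord(**json.loads(payload))
        self._records[record.job].append(record)

    def latest(self, job: str):
        # The kind of query the dashboard might issue.
        records = self._records.get(job)
        return records[-1] if records else None


def send(server: InMemoryServer, record: LogRecord) -> None:
    # A job serialises its record to JSON before handing it over.
    server.ingest(json.dumps(asdict(record)))


server = InMemoryServer()

# Assumed metric names: a training job logs loss per epoch,
# a deployment job logs request latency.
send(server, LogRecord("image-classification-training", "training",
                       {"epoch": 1, "loss": 0.92}))
send(server, LogRecord("text-classification-deployment", "deployment",
                       {"latency_ms": 41.5}))

latest = server.latest("text-classification-deployment")
print(latest.job, latest.metrics)
```

Keeping `metrics` as a free-form dictionary is one way to let each job log different things without changing the server; a stricter per-job schema would be an equally reasonable choice.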