1. Add a pipeline that converts each dataset into the input format expected by the base model being tuned.
2. Add a pipeline that mixes the converted datasets for tuning:
   - Random mix
   - Similarity-based mix
   - Other state-of-the-art (SOTA) mixing methods
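
The two pipelines above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposed design: the target format (a `prompt`/`response` pair), the source field names (`question`, `answer`, `document`, `summary`), and the helper names `normalize`, `from_qa`, `from_summarization`, and `random_mix` are all hypothetical. Only the random mix is shown, implemented as weighted sampling with replacement; the similarity-based and other mixing strategies are not covered.

```python
import random
from typing import Callable, Dict, List

# Assumed common target format: {"prompt": ..., "response": ...}
Record = Dict[str, str]


def from_qa(row: Dict[str, str]) -> Record:
    """Map a hypothetical QA row onto the assumed target format."""
    return {"prompt": row["question"], "response": row["answer"]}


def from_summarization(row: Dict[str, str]) -> Record:
    """Map a hypothetical summarization row onto the assumed target format."""
    return {"prompt": "Summarize:\n" + row["document"], "response": row["summary"]}


def normalize(rows: List[Dict[str, str]],
              mapper: Callable[[Dict[str, str]], Record]) -> List[Record]:
    """Step 1: convert every row of one dataset with its dataset-specific mapper."""
    return [mapper(row) for row in rows]


def random_mix(datasets: Dict[str, List[Record]],
               weights: Dict[str, float],
               n: int,
               seed: int = 0) -> List[Record]:
    """Step 2 (random mix): draw n records, picking the source dataset
    in proportion to its weight, then a record uniformly from that dataset.
    Sampling is with replacement, so records may repeat."""
    rng = random.Random(seed)  # local RNG keeps the mix reproducible
    names = list(datasets)
    dataset_weights = [weights[name] for name in names]
    mixed: List[Record] = []
    for _ in range(n):
        name = rng.choices(names, weights=dataset_weights, k=1)[0]
        mixed.append(rng.choice(datasets[name]))
    return mixed


if __name__ == "__main__":
    qa = normalize([{"question": "What is 2 + 2?", "answer": "4"}], from_qa)
    summ = normalize([{"document": "A long text.", "summary": "Short."}],
                     from_summarization)
    mix = random_mix({"qa": qa, "summ": summ}, {"qa": 0.7, "summ": 0.3}, n=10)
    print(len(mix), "records in the mixed set")
```

Keeping one mapper per dataset means adding a new dataset only requires a new mapper, and the mixing step works unchanged because it only sees records in the common format.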