Week 6: Sharded data-parallel training, distributed training optimizations
Lecture: TBA
Seminar: link
Homework: see the homework folder