MapReduce project for GIU Distributed & Web-based Systems.
Implemented tasks:
- Task 1: Inverted Index
- Task 2: Maximum Temperature
- Task 3: Average Movie Rating
- Task 4: Product Sales Total
- Task 5: Most Frequent Word
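
As a rough illustration of the map/shuffle/reduce pattern behind Task 1, here is a minimal plain-Python sketch. It does not use MRJob and is not the code in src/. The function names (`map_line`, `reduce_word`, `inverted_index`), the lowercase whitespace tokenization, and the sample file names are all assumptions made for this example.

```python
from collections import defaultdict


def map_line(doc_id, line):
    """Mapper (hypothetical): emit (word, doc_id) for every word in a line."""
    # Assumption: words are split on whitespace and lowercased.
    for word in line.lower().split():
        yield word, doc_id


def reduce_word(word, doc_ids):
    """Reducer (hypothetical): collapse the ids for one word into a sorted, de-duplicated list."""
    return word, sorted(set(doc_ids))


def inverted_index(docs):
    """Run map, shuffle (group values by key), and reduce in memory."""
    grouped = defaultdict(list)
    for doc_id, text in docs.items():
        for line in text.splitlines():
            for word, value in map_line(doc_id, line):
                grouped[word].append(value)
    return dict(reduce_word(word, ids) for word, ids in grouped.items())


if __name__ == "__main__":
    # Made-up sample input, not taken from data/raw/.
    sample = {"a.txt": "big data tools", "b.txt": "big clusters"}
    print(inverted_index(sample))
```

In an actual MapReduce framework the grouping step is handled by the runtime, so only the mapper and reducer logic would need to be written by hand.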
Project structure:
- src/: MRJob task implementations and shared base class
- tests/: unit tests for the base class and all tasks
- data/raw/: input datasets for Tasks 1 to 5
- notebooks/: final Google Colab submission notebook
Useful commands:
- make test: run all tests
- make run-t1 to make run-t5: generate task outputs locally