The Big Data Processes course teaches how to manage and use datasets, interpret and visualise data, and understand data in larger contexts. Students learn to identify Big Data trends, assess the value of insights to organisations, and design Big Data processes. The course also covers producing analytical insights and understanding the implications of Big Data processes.
This course is available to all DIM students. Non-DIM students should have basic literacy in a programming language (for instance R or Python), corresponding to an introductory programming course or equivalent.
Week | Topic | Exercise Description |
---|---|---|
Week 1 | Introduction | Opening and examining simple datasets |
Week 2 | Prediction | Where to get datasets, dataset manipulation, visualisations |
Week 3 | Classification | Pearson correlation matrix, decision trees for classification, K-NN |
Week 4 | Ensemble Methods | Splitting and scaling, bagging, boosting, ensemble voting |
Week 5 | Evaluating | Confusion matrix, scores and metrics, over- and undersampling |
Week 6 | ML & Climate Change | Using EmissionsTracker from codecarbon |
Week 7 | Exploratory Data Analysis | Data cleaning, exploration, outliers, and visualisation |
Week 8 | Power | NO CODE |
Week 9 | Development | NO CODE |
Week 10 | Implementation & Maintenance | NO CODE |
Week 11 | AI Ethics | NO CODE |
Week 12 | International Contexts | NO CODE |
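
To give a flavour of the coding exercises in Weeks 4 and 5, the following is a minimal sketch of ensemble voting and confusion-matrix metrics in plain Python. It is not taken from the course material: the function names (`majority_vote`, `confusion_counts`, `precision_recall_f1`) and the toy labels are made up for illustration, and the actual exercises may well use a library such as scikit-learn instead.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Hard voting: for each sample, return the label most models predicted."""
    return [
        Counter(votes).most_common(1)[0][0]
        for votes in zip(*predictions_per_model)
    ]


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1, returning 0.0 where undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data, invented purely for illustration (not a course dataset).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [1, 0, 1, 0, 0, 1, 1, 0]
model_b = [1, 1, 1, 1, 0, 0, 0, 0]
model_c = [0, 0, 1, 1, 0, 0, 1, 1]

# Three models and two classes, so a vote can never tie.
ensemble = majority_vote([model_a, model_b, model_c])

for name, preds in [("model_a", model_a), ("ensemble", ensemble)]:
    tp, fp, fn, tn = confusion_counts(y_true, preds)
    precision, recall, f1 = precision_recall_f1(tp, fp, fn)
    print(name, "confusion:", (tp, fp, fn, tn),
          "precision:", precision, "recall:", recall, "f1:", f1)
```

In this toy setup each individual model makes two errors on different samples, so the majority vote corrects them; real models tend to make correlated errors, and the gain from voting is usually smaller.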