Data Science and Machine Learning

2020 MSA Phase 1 Data Pathway


What you will learn

  • Dataset building - API Call and Web Scraping
  • Exploratory analysis - Python and R
  • Model building - Python and R

What you will need


Video tutorial

Our data science pathway video playlist can be found here.

Prerequisite

Online Training via Microsoft Learn – complete at least 2000 XP from any modules within the learning paths below

| Course Name | Duration of learning path (min) | Total XP in learning path |
| --- | --- | --- |
| Principles of Cloud Computing | 62 | 800 |
| Analyze climate data with Azure Notebooks | 45 | 900 |
| Predict flight delays | 51 | 700 |
| Azure Fundamentals | 588 | 12000 |
| Data Science VM | N/A | N/A |

You can mix modules from different learning paths, as long as the total XP you complete is at least 2000.


Introduction Video Tutorials


Assessment

  1. Watch the video tutorials (unless you are quite familiar with what we are teaching)
  2. Using the provided house price dataset as a base, add at least 2 extra columns/features. These could be the 2018 Census population and the deprivation index, or any other features. Refer to the Data Collection section for instructions on how to do this.
  3. Using either R or Python, analyse the resulting dataset. This would include preprocessing the data, exploring patterns in the data and building a machine learning model. You may get some inspiration from the 'Data Analysis and Model Building' section.
  4. As the final output of the assignment, produce a report detailing your findings from the steps above. It should include:
    • Executive Summary
    • Initial data analysis
    • Analysis of correlations and patterns in the data
    • A model, with commentary on it (e.g. if you excluded any attributes, explain why you dropped them)
    • Conclusions
  5. The 'report' can be a PDF written in Microsoft Word, an Azure/Jupyter notebook with markdown, or an R Markdown file. Submit your dataset, Rmd/notebook files and report (in PDF format).
  6. Even if you did not manage to complete everything listed, we encourage you to submit it regardless. The aim of this assessment is to get you to try out some of the skills we taught, and we would like to see what you have tried. Just because not all the boxes are ticked doesn't mean that you automatically won't pass! While marking, we are more interested in seeing the effort you put in and your train of thought.
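
As a rough illustration of steps 2 and 3 above, the sketch below joins two extra features onto a tiny house price table, checks how one of them correlates with price, and fits a one-variable linear model. Everything in it is an assumption made for illustration: the column names (`suburb`, `bedrooms`, `price`, `population`, `deprivation_index`), the made-up numbers, and the helper functions are hypothetical and are not taken from the provided dataset. It uses only the Python standard library so that it runs anywhere; in a notebook you would more likely use pandas for the join and a modelling library for the fit.

```python
import csv
import io
from statistics import mean

# Hypothetical stand-in for the provided house price dataset (made-up values).
HOUSES_CSV = """suburb,bedrooms,price
A,2,500000
B,3,650000
C,4,800000
D,3,700000
"""

# Hypothetical extra features keyed by suburb (made-up values).
EXTRA_CSV = """suburb,population,deprivation_index
A,1200,8
B,3400,5
C,5100,2
D,2900,4
"""


def read_rows(text):
    """Parse CSV text into a list of dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def add_features(houses, extra, key):
    """Left-join `extra` onto `houses` by `key`; unmatched rows get None."""
    extra_cols = [c for c in extra[0] if c != key]
    lookup = {row[key]: row for row in extra}
    merged = []
    for house in houses:
        match = lookup.get(house[key])
        new_cols = {c: (match[c] if match else None) for c in extra_cols}
        merged.append({**house, **new_cols})
    return merged


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    slope = cov / var_x
    return slope, my - slope * mx


# Step 2: add the extra columns to the base dataset.
dataset = add_features(read_rows(HOUSES_CSV), read_rows(EXTRA_CSV), "suburb")

# Step 3: convert to numbers, explore one relationship, fit a simple model.
price = [float(r["price"]) for r in dataset]
bedrooms = [float(r["bedrooms"]) for r in dataset]
deprivation = [float(r["deprivation_index"]) for r in dataset]

print("corr(deprivation_index, price):", round(pearson(deprivation, price), 3))
slope, intercept = fit_line(bedrooms, price)
print("price is roughly", slope, "* bedrooms +", intercept)
```

A left join is used so that no house is dropped when a feature is missing; deciding how to handle those missing values (drop, impute, or flag) is part of the preprocessing you would discuss in the report.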