97alexlo/datascience-projects

Python

Business Startup Analysis with PySpark and Prediction

In this project, my goal was to predict the cost of starting a business in different countries around the world. I decided to use the World Development Indicators dataset, which presents "the most current and accurate global development data available, and includes national, regional, and global estimates."

R

National Case Study Competition: Predicting Ferry Delays in B.C.

This was my first national Kaggle competition. It was hosted by CANSSI (Canadian Statistical Sciences Institute), and the goal was to analyze and predict sailing delays between Vancouver and Victoria. The contest was open to all graduate and undergraduate students across Canada. In this project, I cleaned and transformed data, performed feature engineering, and implemented Logistic Regression and XGBoost to create a predictive model. I placed 24th out of 65 teams on the private leaderboard.

Predicting House Sales in King County

In this project, I collaborated with two classmates to analyze a dataset from Kaggle to predict house sales. We created a multiple linear regression model by comparing different feature selection methods and evaluated diagnostic plots and summaries. Our findings were summarised and presented in an executive report and PowerPoint presentation to a class of 70 students.

Analysis of Marvel Cinematic Universe films from 2008 - 2019

In this project, I scraped and analyzed data from Wikipedia's webpage of MCU films. I created a variety of visualizations to find trends in the data.

Twitter (and Sentiment) Analysis of Donald Trump’s tweets

In this project, I collected data from Donald Trump's Twitter account through an API. I performed text preprocessing and exploratory analysis to draw insights on his tweeting behaviour.
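As a rough illustration of what a text preprocessing step like this can involve, here is a minimal Python sketch. The project itself is written in R, so this is not the project's code. The stop-word list, the function names `preprocess` and `top_terms`, and the sample tweets are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "is", "to", "of", "in", "we", "will"}


def preprocess(tweet):
    """Lowercase a tweet, strip links, @mentions and punctuation, drop stop words."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)          # remove @mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def top_terms(tweets, n=3):
    """Return the n most frequent tokens across all tweets."""
    counts = Counter(tok for t in tweets for tok in preprocess(t))
    return counts.most_common(n)


# Made-up example tweets, not real data.
sample_tweets = [
    "We will build the economy! https://example.com",
    "The economy is GREAT, thanks @someone",
]

print(top_terms(sample_tweets, 1))
```

Token counts like these are a common starting point for exploratory plots such as word frequencies over time.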

SFU Course Outline Webscraper

This is a project from STAT 240 (Introduction to Data Science) where I created a web scraper that collects and cleans relevant course information from SFU's webpages (e.g. professor's name, exam locations, course ID, textbooks).
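The general idea of pulling named fields out of a course page can be sketched in Python with the standard library's `html.parser`. The project itself is written in R, and the HTML below is invented: the class names `course-id`, `instructor` and `textbook` are assumptions for illustration and do not reflect SFU's actual page markup.

```python
from html.parser import HTMLParser


class OutlineParser(HTMLParser):
    """Collect the text of elements whose class names a field of interest."""

    # Hypothetical class names; real pages will use different markup.
    FIELDS = {"course-id", "instructor", "textbook"}

    def __init__(self):
        super().__init__()
        self.current = None  # field currently being read, if any
        self.record = {}     # field name -> cleaned text

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in self.FIELDS:
            self.current = cls

    def handle_data(self, data):
        if self.current and data.strip():
            # Cleaning step: collapse runs of whitespace into single spaces.
            self.record[self.current] = " ".join(data.split())

    def handle_endtag(self, tag):
        self.current = None


def parse_outline(html):
    """Parse one course outline page into a dict of fields."""
    parser = OutlineParser()
    parser.feed(html)
    return parser.record


# Invented sample page, not real SFU content.
SAMPLE_HTML = """
<div>
  <h1 class="course-id">STAT 240</h1>
  <p class="instructor">  Dr.   Example  </p>
  <p class="textbook">Hypothetical Textbook, 1st ed.</p>
</div>
"""

print(parse_outline(SAMPLE_HTML))
```

In practice the HTML would be downloaded from each course page first, and the resulting dictionaries collected into a table.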

My first Shiny app - Climate change in BC from 1979 - 2017

This is my first Shiny app. I customized the layout and added buttons that plot different curves/slopes from the data sets provided by Dr. Campbell.

SQLite Databases - STAT 240 Introduction to Data Science

I connected to an SQLite database in R and wrote queries to extract relevant information, perform calculations, and create graphs to answer questions.
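The connect, query, then calculate workflow looks roughly like the following Python sketch using the standard `sqlite3` module. The coursework itself was done in R, and the `sales` table, its columns, and its rows are hypothetical stand-ins for the course database.

```python
import sqlite3

# In-memory database with a made-up table, standing in for the course database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.0), ("north", 30.0), ("south", 5.0)],
)


def average_by_region(connection):
    """Let SQL do the aggregation and return {region: average amount}."""
    rows = connection.execute(
        "SELECT region, AVG(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


print(average_by_region(conn))
```

Doing the aggregation in the query keeps the data pulled into the analysis environment small, and the result is ready to pass to a plotting function.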
