My Data Science Portfolio contains projects that I have completed, each either solving a particular business problem or introducing a new product or algorithm.
Why: To estimate house prices so that people planning to buy a house know the likely price range and can plan their finances accordingly. House price predictions also help property investors follow the trend of housing prices in a given location.
Steps:
- Used the AmesHousing dataset compiled by Dean De Cock
- Applied Feature Transformation and Feature Selection
- Used K-fold Cross Validation to detect overfitting and estimate performance on unseen data
- Trained a Linear Regression model to predict the house prices
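The cross-validation and regression steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: it uses only the Python standard library, a single feature, and synthetic stand-in data rather than the AmesHousing dataset. The helper names (`fit_line`, `rmse`, `k_fold_rmse`) and the choice of `k=5` are assumptions made for the example.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the fitted line on (xs, ys)."""
    return mean((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) ** 0.5


def k_fold_rmse(xs, ys, k=5, seed=0):
    """Fit on k-1 folds, score on the held-out fold, repeat for every fold."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(
            rmse([xs[i] for i in held_out], [ys[i] for i in held_out], slope, intercept)
        )
    return scores


# Synthetic stand-in data (assumption): price = 100 * area + 50,000 + noise
rng = random.Random(42)
area = [rng.uniform(500, 3000) for _ in range(200)]
price = [100 * a + 50_000 + rng.gauss(0, 10_000) for a in area]

fold_scores = k_fold_rmse(area, price, k=5)
print([round(s) for s in fold_scores], round(mean(fold_scores)))
```

If the per-fold errors are close to each other and to the training error, the model is probably not overfitting. With scikit-learn, `KFold` or `cross_val_score` together with `LinearRegression` would be the usual way to do the same thing on the full feature set.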
Why: Heart disease can be predicted from patient attributes such as age, gender, and heart rate. Machine learning algorithms can make this prediction systematically, and earlier detection could help to reduce the death rate of heart patients.
Steps:
- Analysing the different variables and finding relations and insights using descriptive statistical analysis techniques
- Stepwise Data Preprocessing
- Splitting the data into train and test sets
- Using the Logistic Regression algorithm to predict heart disease
- Analysing the model's performance
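The split, training, and evaluation steps above might look like the sketch below. It is an illustration under stated assumptions: the data is synthetic (two standardized features, not real patient records), logistic regression is implemented by hand with batch gradient descent so the example needs only the standard library, and the 80/20 split, learning rate, and epoch count are arbitrary choices.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on the log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - label
            for j in range(d):
                grad_w[j] += err * row[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Classify as 1 when the predicted probability is at least 0.5."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) >= 0.5 else 0
        for row in X
    ]


# Synthetic stand-in data (assumption): label depends on the sum of two features
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(300)]
y = [1 if x1 + x2 + rng.gauss(0, 0.5) > 0 else 0 for x1, x2 in X]

# 80/20 train/test split (rows are already in random order)
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

w, b = train_logistic(X_train, y_train)
y_pred = predict(X_test, w, b)
accuracy = sum(p == t for p, t in zip(y_pred, y_test)) / len(y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Accuracy is measured on the held-out test set only, so it reflects how the model handles patients it has not seen. For a medical problem, it would also make sense to look at the confusion matrix, since missing a sick patient and flagging a healthy one are not equally costly.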
The project aims to group the different types of customers in a mall using the K-Means clustering technique.
Why: This project shows how Machine Learning can be used in business. By clustering your customer data you can group customers by age, salary, gender, spending, or any other feature in your customer dataset. The resulting clusters reflect customer trends and can inform the marketing strategy for each group.
Steps:
- Feature Transformation
- Analysing the different Parameters
- Principal Component Analysis (PCA) for reducing the parameters and finding Eigenvalues and Loading Scores
- K-means Clustering: modelling using scikit-learn and algorithm-based modelling
- Cluster-wise Analysis and Observations
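As an illustration of the algorithm-based modelling mentioned above, here is a minimal from-scratch K-means sketch. It is not the project's code: the two features (income and spending score), the synthetic customer groups, and `k=2` are assumptions chosen to keep the example small, and no PCA step is included.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
        for p in points
    ]


def k_means(points, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Synthetic stand-in customers (assumption): (income, spending score)
rng = random.Random(1)
low_income_high_spend = [(rng.gauss(20, 5), rng.gauss(80, 5)) for _ in range(50)]
high_income_low_spend = [(rng.gauss(80, 5), rng.gauss(20, 5)) for _ in range(50)]
customers = low_income_high_spend + high_income_low_spend

centroids, labels = k_means(customers, k=2)
for c, centre in enumerate(centroids):
    print(c, [round(v, 1) for v in centre], labels.count(c))
```

The two features here are on a similar scale, so no transformation is applied. With features on very different scales, such as age and salary, scaling them first matters because K-means relies on Euclidean distance. scikit-learn's `KMeans` offers the same algorithm with better initialization.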
Why: To estimate the churn probability of bank customers in order to identify those who are likely to leave the company and encourage them to stay through various marketing tools.
How: Using customer behaviour data to analyse the different parameters and train an Artificial Neural Network (ANN) to predict the probability of each customer leaving the company.
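To show how a network can output a churn probability, here is a small sketch of a feed-forward network with one hidden layer, trained by stochastic gradient descent. Everything in it is an assumption made for illustration: the two synthetic features, the hidden layer size, the `tanh` and sigmoid activations, and the hyperparameters. The project itself would more likely use a deep learning library than hand-written backpropagation.

```python
import math
import random


def forward(x, W1, b1, W2, b2):
    """Hidden tanh layer followed by a sigmoid output in [0, 1]."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    p = 1.0 / (1.0 + math.exp(-(sum(w * hj for w, hj in zip(W2, h)) + b2)))
    return h, p


def train(X, y, hidden=4, lr=0.1, epochs=200, seed=0):
    """Stochastic gradient descent on binary cross-entropy."""
    rng = random.Random(seed)
    d = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            h, p = forward(x, W1, b1, W2, b2)
            delta_out = p - t  # gradient of the loss w.r.t. the output pre-activation
            for j in range(hidden):
                # Backpropagate through tanh before W2[j] is updated
                delta_h = delta_out * W2[j] * (1 - h[j] ** 2)
                W2[j] -= lr * delta_out * h[j]
                for k in range(d):
                    W1[j][k] -= lr * delta_h * x[k]
                b1[j] -= lr * delta_h
            b2 -= lr * delta_out
    return W1, b1, W2, b2


# Synthetic stand-in data (assumption): two standardized behaviour features
rng = random.Random(7)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [1 if x0 - x1 + rng.gauss(0, 0.5) > 0 else 0 for x0, x1 in X]

params = train(X, y)
churn_prob = [forward(x, *params)[1] for x in X]
train_accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(churn_prob, y)) / len(y)
print(f"first customer churn probability: {churn_prob[0]:.2f}")
print(f"training accuracy: {train_accuracy:.2f}")
```

Because the output is a probability rather than a hard label, customers can be ranked by risk and the retention effort focused on those above a chosen threshold.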
Why: In quantitative finance, predicting the stock price with high accuracy is a very important and difficult task. Although it is impossible to estimate stock prices exactly, past stock prices can be used to make informed predictions about their trend (the theory behind Brownian motion).
How: We analyse the upward and downward trends in the past and use them as an indicator to predict future prices. Long Short-Term Memory (LSTM) is a more sophisticated version of the Recurrent Neural Network (RNN) that addresses the Vanishing Gradient Problem RNNs often suffer from.
For this project, we use five years of Google stock price data, covering 2016-2020. The goal is to predict the upward or downward trend in Google's stock price for January 2021.
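Before a recurrent model can learn from past prices, the series has to be cut into fixed-length input windows, each paired with the price that follows it. The sketch below shows one common way to do that. The window length of 60 days, the min-max scaling, and the synthetic random-walk series are assumptions for illustration, not details taken from the project or real Google prices.

```python
import random


def min_max_scale(values):
    """Rescale a series to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def make_windows(prices, window=60):
    """Pair each run of `window` past prices with the price that follows it."""
    X, y = [], []
    for i in range(window, len(prices)):
        X.append(prices[i - window:i])
        y.append(prices[i])
    return X, y


# Synthetic stand-in series (assumption): a random walk starting at 100
rng = random.Random(3)
prices = [100.0]
for _ in range(299):
    prices.append(prices[-1] + rng.gauss(0, 1))

scaled = min_max_scale(prices)
X, y = make_windows(scaled, window=60)

# Trend label: did the next price rise above the last price in the window?
trends = ["up" if target > past[-1] else "down" for past, target in zip(X, y)]
print(len(X), len(X[0]), trends[:5])
```

Each window would then be reshaped to (samples, time steps, features) before being passed to an LSTM layer. In a real setup the scaling parameters should come from the training period only, so that the January 2021 prices do not leak into training.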