GitHub - Minsifye/portfolio: my portfolio website content holder.

Monika Bagyal

Data Science Portfolio

Personal Website

Spark Project: Sparkify

Sparkify is a fictional music streaming service like Spotify or Pandora. This project uses churn prediction for Sparkify as its problem statement. The main findings are summarized in the Medium post available here.

Future Work

In the future, I can split the dataset into monthly or weekly windows and predict which customers will churn in the next month or week.
I can use more advanced machine learning techniques, such as combining two or more algorithms, to improve the overall prediction rate.
I can run this on an AWS cluster to measure model performance and use cross-validation for a better F1 score.
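
As a rough illustration of the cross-validation and F1 ideas above, here is a minimal pure-Python sketch. It is not the project's Spark code: the function names `f1_score` and `k_fold_indices` are hypothetical helpers, and the label convention (1 = churned, 0 = stayed) is an assumption.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score for the `positive` class (assumed: 1 = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index lists."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


# Toy labels only, not results from the Sparkify dataset.
score = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
folds = k_fold_indices(5, 2)
```

In a cross-validated run, a model would be trained on each `train` index list, scored with `f1_score` on the matching `test` list, and the fold scores averaged.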

Disaster Response Pipeline

This project analyzes disaster data from Figure Eight to build a model for an API that classifies disaster messages.

The goal is to start from scratch with a dataset, create an ETL pipeline for the data engineering work, and create a machine learning pipeline that trains a model to read text data and predict 36 classification categories.
The trained and tuned model is then used to predict which disaster categories any new message fits.
A front-end application built with Flask showcases the visualizations and the model's disaster category predictions on a webpage.
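
A minimal sketch of two small steps such an ETL and text pipeline might contain. It assumes the raw category labels arrive as one string of the form `name-0;name-1;...` (an assumption about the dataset layout, not confirmed by this README), and the helper names `parse_categories` and `tokenize` are hypothetical.

```python
import re


def parse_categories(raw):
    """Turn 'related-1;request-0' into {'related': 1, 'request': 0}."""
    labels = {}
    for item in raw.split(";"):
        # Split on the last hyphen so names containing '-' stay intact.
        name, _, value = item.rpartition("-")
        labels[name] = int(value)
    return labels


def tokenize(message):
    """Lowercase a message and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", message.lower())


labels = parse_categories("related-1;request-0;aid_related-1")
tokens = tokenize("We need WATER, please!")
```

Each parsed label becomes one binary target column, which is what lets a single model predict many categories for one message.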

Recommendations with IBM

Analyze the interactions that users have with articles on the IBM Watson Studio platform, and make recommendations to them about new articles you think they will like.

This project is divided into the following tasks:

I. Exploratory Data Analysis
II. Rank Based Recommendations
III. User-User Based Collaborative Filtering
V. Matrix Factorization
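
To make tasks II and III concrete, here is a minimal pure-Python sketch of rank-based and user-user recommendations. It assumes interactions are `(user_id, article_id)` pairs. The function names and the toy data are hypothetical, and the similarity measure (number of shared articles) is one simple choice, not necessarily the one used in the project.

```python
from collections import Counter


def top_articles(interactions, n=3):
    """Rank-based: return the n articles with the most interactions."""
    counts = Counter(article for _user, article in interactions)
    return [article for article, _count in counts.most_common(n)]


def articles_by_user(interactions):
    """Map each user to the set of articles they interacted with."""
    seen = {}
    for user, article in interactions:
        seen.setdefault(user, set()).add(article)
    return seen


def similar_users(interactions, user_id):
    """Order other users by how many articles they share with user_id."""
    seen = articles_by_user(interactions)
    target = seen.get(user_id, set())
    overlap = {u: len(arts & target) for u, arts in seen.items() if u != user_id}
    return sorted(overlap, key=lambda u: (-overlap[u], u))


def user_user_recs(interactions, user_id, n=3):
    """Recommend unseen articles, taken from the most similar users first."""
    seen = articles_by_user(interactions)
    target = seen.get(user_id, set())
    recs = []
    for neighbor in similar_users(interactions, user_id):
        for article in sorted(seen[neighbor] - target):
            if article not in recs:
                recs.append(article)
            if len(recs) == n:
                return recs
    return recs


# Toy interactions only, not data from the IBM Watson Studio platform.
toy = [(1, "a"), (1, "b"), (2, "a"), (2, "b"), (2, "c"), (3, "a"), (3, "d"), (1, "a")]
popular = top_articles(toy, n=2)
recs_for_user_1 = user_user_recs(toy, 1, n=2)
```

Rank-based recommendations need no user history, which makes them a reasonable fallback for new users, while the user-user approach only works once a user has some interactions to compare.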

Stack-Overflow Survey Analysis

I used the CRISP-DM process during this analysis.

Business Understanding - I started the analysis with the posed questions in mind.
Data Understanding - To better understand the data, I went through the dataset and noted how each part could be used in my analysis, for example which columns help answer a particular question.
Prepare Data - At various points, I had to wrangle and transform the data to get the results. Keeping DRY principles in mind, I also created a function to draw a Plotly bar chart, since this code was repeated often.
Model Data - My analysis does not involve a modeling step. I might add this in future work.
Results - I use visualizations such as bar charts and pie charts to convey my findings, and I added a result statement after every visualization so the thought process is easy to follow.
Deploy - I am not deploying this code anywhere right now. For now, it is available only as a Jupyter notebook.
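
The reusable bar chart helper mentioned under Prepare Data could look roughly like the sketch below. This is not the notebook's actual function: `bar_chart_spec` is a hypothetical name, and it only builds a plain figure dictionary in the `data`/`layout` shape that Plotly figures use, so it runs without Plotly installed.

```python
from collections import Counter


def bar_chart_spec(values, title, x_label, y_label="Respondents"):
    """Count each distinct answer and describe the counts as a bar chart."""
    counts = Counter(values).most_common()  # most frequent answer first
    return {
        "data": [
            {
                "type": "bar",
                "x": [answer for answer, _count in counts],
                "y": [count for _answer, count in counts],
            }
        ],
        "layout": {
            "title": {"text": title},
            "xaxis": {"title": {"text": x_label}},
            "yaxis": {"title": {"text": y_label}},
        },
    }


# Toy answers only, not values from the Stack Overflow survey.
spec = bar_chart_spec(["Python", "SQL", "Python"], "Languages used", "Language")
```

With Plotly available, a dictionary like `spec` can be passed to `plotly.graph_objects.Figure(spec)` to render the chart, so every plot in the notebook shares one code path.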

The main findings of the code can be found at the Medium post available here.

BigView - PearlHack2020

Natural Language Processing on Online Reviews Data, a BigView Idea. Created at the Pearl Hackathon event at UNC Chapel Hill.
Simplifying daily buying decisions by getting insights from online reviews faster.

Identifying Suspicious Activities in Financial Data

This project explores how suspicious activities can be caught by applying a supervised learning algorithm to existing customer data held by the compliance department of a bank or financial institution.
The results show that false positives can be reduced using supervised machine learning algorithms, because these algorithms can differentiate between regular and suspicious patterns of customer activity. This project was created for ITCS 6156 - Machine Learning at UNC Charlotte. Created in Dec 2019. Quick View.

Image Classifier Model

In this project, I implemented an image classification application using a deep learning model on a dataset of images.
I first trained the model to classify new images in a Jupyter notebook and then converted it into a Python application that runs from the command line. A Udacity Data Scientist Nanodegree project, Term 1. Created in Mar 2019. Quick View.
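
A minimal sketch of how the command-line entry point for such an application might be wired up with `argparse`. The argument names (`image_path`, `checkpoint`, `--top_k`, `--gpu`) and the file names in the example are assumptions for illustration, not the project's confirmed interface.

```python
import argparse


def build_parser():
    """Build the argument parser for a hypothetical prediction script."""
    parser = argparse.ArgumentParser(
        description="Predict the class of an image with a trained model."
    )
    parser.add_argument("image_path", help="path to the image to classify")
    parser.add_argument("checkpoint", help="path to the saved model checkpoint")
    parser.add_argument("--top_k", type=int, default=5,
                        help="number of most likely classes to report")
    parser.add_argument("--gpu", action="store_true",
                        help="run inference on a GPU if one is available")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["flower.jpg", "checkpoint.pth", "--top_k", "3"])
```

Keeping the parser in its own function makes the options easy to test separately from model loading and inference.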

Identify Customer Segments

The data and design for this project were provided by Arvato Financial Services. I applied unsupervised learning techniques to demographic and spending data for a sample of German households.
I preprocessed the data, applied dimensionality reduction techniques, and implemented clustering algorithms to segment customers, with the goal of optimizing customer outreach for a mail-order company. A Udacity Data Scientist Nanodegree project, Term 1. Created in Mar 2019. Quick View.
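
To illustrate the clustering step only, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm). It is a stand-in under stated assumptions: the README does not say which clustering algorithm was used, the preprocessing and dimensionality reduction steps are not shown, and the initialization (first `k` points as centroids) is chosen purely to keep the example deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Cluster points with Lloyd's algorithm; returns (labels, centroids)."""
    # Assumption for determinism: the first k points seed the centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Toy 2-D points only, not the Arvato demographic data.
labels, centroids = kmeans([(0, 0), (10, 10), (0, 1), (10, 11)], k=2)
```

In practice the points would be the reduced feature vectors for each household, and each resulting cluster would be read as one customer segment.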

Finding Donor Charity Model

CharityML is a fictitious charity organization that provides financial support for people learning machine learning. To improve donor outreach effectiveness, I built an algorithm that best identifies potential donors.
My goal was to evaluate and optimize several supervised learners to determine which algorithm provides the highest donation yield. A Udacity Data Scientist Nanodegree project, Term 1. Created in Jan 2019. Quick View.

Bank Churn Model

The Bank Churn Model predicts the likelihood that a customer will leave the bank. The dataset is fictional and taken from the https://www.superdatascience.com/machine-learning/ platform. Created in Jan 2019. Quick View.

Housing Data Linear Regression

In this project, I use a linear regression model to predict house prices for King County. The dataset is taken from the Kaggle platform. Created in Jan 2019. [Quick View](https://nbviewer.jupyter.org/github/Minsifye/House-Price-Prediction-Linear Regression/blob/master/HousingData_LinearRegression.ipynb).
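
As a small worked example of the technique, the sketch below fits a one-feature least-squares line in pure Python. The function names and the toy numbers are hypothetical. The notebook itself presumably uses more features and a library implementation, which this sketch does not reproduce.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Toy values only, not King County house prices.
slope, intercept = fit_line([1, 2, 3], [3, 5, 7])
estimate = predict(4, slope, intercept)
```

For house prices, `xs` could be a single predictor such as living area and `ys` the sale price. With several predictors, the same idea generalizes to multiple linear regression.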

Medium
