<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=device-width, initial-scale=1.0" name="viewport">
<title>Portfolio Details</title>
<meta content="" name="description">
<meta content="" name="keywords">
<!-- Favicons -->
<link href="assets/img/favicon.png" rel="icon">
<link href="assets/img/apple-touch-icon.png" rel="apple-touch-icon">
<!-- Google Fonts -->
<link href="https://fonts.googleapis.com/css?family=Open+Sans:300,300i,400,400i,600,600i,700,700i|Raleway:300,300i,400,400i,500,500i,600,600i,700,700i|Poppins:300,300i,400,400i,500,500i,600,600i,700,700i" rel="stylesheet">
<!-- Vendor CSS Files -->
<link href="assets/vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet">
<link href="assets/vendor/icofont/icofont.min.css" rel="stylesheet">
<link href="assets/vendor/boxicons/css/boxicons.min.css" rel="stylesheet">
<link href="assets/vendor/venobox/venobox.css" rel="stylesheet">
<link href="assets/vendor/owl.carousel/assets/owl.carousel.min.css" rel="stylesheet">
<link href="assets/vendor/aos/aos.css" rel="stylesheet">
<!-- Template Main CSS File -->
<link href="assets/css/style.css" rel="stylesheet">
<!-- =======================================================
* Template Name: iPortfolio - v1.3.0
* Template URL: https://bootstrapmade.com/iportfolio-bootstrap-portfolio-websites-template/
* Author: BootstrapMade.com
* License: https://bootstrapmade.com/license/
======================================================== -->
</head>
<body>
<!-- ======= Mobile nav toggle button ======= -->
<button type="button" class="mobile-nav-toggle d-xl-none"><i class="icofont-navigation-menu"></i></button>
<!-- ======= Header ======= -->
<header id="header">
<div class="d-flex flex-column">
<div class="profile">
<img src="assets/img/profile-img.jpg" alt="Monika Bagyal" class="img-fluid rounded-circle">
<h1 class="text-light"><a href="index.html">Monika Bagyal</a></h1>
<div class="social-links mt-3 text-center">
<a href="https://github.com/Minsifye" class="github"><i class="bx bxl-github"></i></a>
<a href="https://www.linkedin.com/in/mbagyal/" class="linkedin"><i class="bx bxl-linkedin"></i></a>
</div>
</div>
<nav class="nav-menu">
<ul>
<li><a href="index.html"><i class="bx bx-home"></i> <span>Home</span></a></li>
<li><a href="index.html"><i class="bx bx-user"></i> <span>About</span></a></li>
<li><a href="index.html"><i class="bx bx-file-blank"></i> <span>Resume</span></a></li>
<li class="active"><a href="index.html"><i class="bx bx-book-content"></i> Portfolio</a></li>
<li><a href="index.html"><i class="bx bx-server"></i> Blogs</a></li>
<li><a href="index.html"><i class="bx bx-envelope"></i> Contact</a></li>
</ul>
</nav><!-- .nav-menu -->
<button type="button" class="mobile-nav-toggle d-xl-none"><i class="icofont-navigation-menu"></i></button>
</div>
</header><!-- End Header -->
<main id="main">
<!-- ======= Breadcrumbs ======= -->
<section id="breadcrumbs" class="breadcrumbs">
<div class="container">
<div class="d-flex justify-content-between align-items-center">
<h2>Project Details</h2>
<ol>
<li><a href="index.html">Home</a></li>
<li>Project Details</li>
</ol>
</div>
</div>
</section><!-- End Breadcrumbs -->
<!-- ======= Portfolio Details Section ======= -->
<section id="portfolio-details" class="portfolio-details">
<div class="container">
<div class="portfolio-details-container">
<div class="owl-carousel portfolio-details-carousel">
<img src="assets/img/portfolio/sparkify.jpg" class="img-fluid" alt="">
</div>
<div class="portfolio-info">
<h3>Project information</h3>
<ul>
<li><strong>Category</strong>: Classification</li>
<li><strong>Project date</strong>: April 2020</li>
<li><strong>Project URL</strong>: <a href="https://github.com/Minsifye/Sparkify">GitHub</a></li>
<li><strong>Blog Post</strong>: <a href="https://medium.com/@monika.bagyal/how-to-predict-customer-churn-using-machine-learning-model-on-spark-99f5277993b7?sk=dfae709ed6563237de260a46195bd783">Medium</a></li>
</ul>
</div>
</div>
<span>Photo by <a href="https://unsplash.com/@stephaniemccabe?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Stephanie McCabe</a> on <a href="https://unsplash.com/s/photos/spark?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a></span>
</div>
</section><!-- End Portfolio Details Section -->
<section id="resume" class="resume">
<div class="container">
<div class="section-title">
<h2>Sparkify</h2>
<p>
Sparkify is a fictional music streaming service, similar to Spotify or Pandora. This project was completed as a requirement of the Udacity Data Scientist Nanodegree.
</p>
</div>
<div class="row">
<div class="col-lg-6" data-aos="fade-up">
<h3 class="resume-title">Project Requirements</h3>
<div class="resume-item pb-0">
<p><em>The problem statement is Sparkify churn prediction. I used PySpark throughout the project so that it could be deployed on AWS.</em></p>
<ul>
<li>Use PySpark as the primary programming tool.</li>
<li>Write a blog post discussing the results.</li>
<li>Run the project on the AWS Spark platform with the 12 GB dataset.</li>
</ul>
</div>
<h3 class="resume-title">Skills Required</h3>
<div class="resume-item">
<h5>Python</h5>
<h5>PySpark</h5>
<h5>pandas</h5>
<h5>Matplotlib</h5>
<h5>NumPy</h5>
<h5>Seaborn</h5>
<h5>DecisionTreeClassifier</h5>
<h5>GBTClassifier</h5>
<h5>RandomForestClassifier</h5>
</div>
</div>
<div class="col-lg-6" data-aos="fade-up" data-aos-delay="100">
<h3 class="resume-title">Techniques</h3>
<div class="resume-item">
<h4>Major Tasks</h4>
<ul>
<li><strong>Data Understanding</strong>: Exploratory data analysis on a local machine with the small (128 MB) Sparkify dataset provided by Udacity.</li>
<li><strong>Data Preparation</strong>: Handled missing values, cleaned the data, imputed categorical variables, and identified a target variable for predicting churned customers.</li>
<li><strong>Data Visualizations</strong>: Created visualizations for a deeper understanding of Sparkify customers.</li>
<li><strong>Modeling</strong>: Trained Decision Tree Classifier, Random Forest Classifier, and Gradient Boosting Classifier models and compared their outcomes before choosing the winning model.</li>
<li><strong>Evaluation Metric</strong>: PySpark's DataFrame-based evaluators do not return a confusion matrix directly, so I used MulticlassClassificationEvaluator to compute accuracy and F1 score. BinaryClassificationEvaluator can also compute the area under the ROC curve and the area under the precision-recall curve.</li>
<li><strong>Results</strong>: After comparing all models on the different evaluation metrics, Gradient Boosting performed better than the other two. Its F1 score is 0.90, while the Random Forest F1 score is 0.79.</li>
<li><strong>Deploy</strong>: Run this project on AWS Spark with the 12 GB dataset.</li>
</ul>
<p><em>I followed the CRISP-DM process throughout the project, which was completed in Jupyter Notebook.</em></p>
</div>
</div>
</div>
</div>
</section><!-- End Resume Section -->
</main><!-- End #main -->
<!-- ======= Footer ======= -->
<footer id="footer">
<div class="container">
<div class="copyright">
© Copyright <strong><span>iPortfolio</span></strong>
</div>
<div class="credits">
<!-- All the links in the footer should remain intact. -->
<!-- You can delete the links only if you purchased the pro version. -->
<!-- Licensing information: https://bootstrapmade.com/license/ -->
<!-- Purchase the pro version with working PHP/AJAX contact form: https://bootstrapmade.com/iportfolio-bootstrap-portfolio-websites-template/ -->
Designed by <a href="https://bootstrapmade.com/">BootstrapMade</a>
</div>
</div>
</footer><!-- End Footer -->
<a href="#" class="back-to-top"><i class="icofont-simple-up"></i></a>
<!-- Vendor JS Files -->
<script src="assets/vendor/jquery/jquery.min.js"></script>
<script src="assets/vendor/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="assets/vendor/jquery.easing/jquery.easing.min.js"></script>
<script src="assets/vendor/php-email-form/validate.js"></script>
<script src="assets/vendor/waypoints/jquery.waypoints.min.js"></script>
<script src="assets/vendor/counterup/counterup.min.js"></script>
<script src="assets/vendor/isotope-layout/isotope.pkgd.min.js"></script>
<script src="assets/vendor/venobox/venobox.min.js"></script>
<script src="assets/vendor/owl.carousel/owl.carousel.min.js"></script>
<script src="assets/vendor/typed.js/typed.min.js"></script>
<script src="assets/vendor/aos/aos.js"></script>
<!-- Template Main JS File -->
<script src="assets/js/main.js"></script>
</body>
</html>