Application-of-DMAIC-Framework-on-the-Sentiment-Analysis

This repository contains scraping and sentiment analysis code with the aim of making business recommendations using the DMAIC framework.

Sentiment Analysis

Sentiment analysis is a popular task in Natural Language Processing (NLP) and is usually treated as a supervised learning problem. In practice it is a text classification task with a specific goal: determining the sentiment of public opinion toward a subject or object (for example, HP products on Tokopedia).

DMAIC

DMAIC (Define, Measure, Analyze, Improve, Control) is a structured problem-solving procedure widely used in quality and process improvement. It is often associated with Six Sigma activities, and almost all implementations of Six Sigma use the DMAIC process.

Each step in the cyclical DMAIC process is required to ensure the best possible results. The process steps are:

- Define the Customer, their Critical to Quality (CTQ) issues, and the Core Business Process involved.
- Measure the performance of the Core Business Process involved.
- Analyze the data collected and the process map to determine root causes of defects and opportunities for improvement.
- Improve the target process by designing creative solutions to fix and prevent problems.
- Control the improvements to keep the process on the new course.

What do we need to prepare?

Data

The data consists of reviews of the *** application, obtained by scraping the Google Play Store.
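
As a rough sketch of how such reviews might be collected, the snippet below calls the `reviews` function of the google-play-scraper package and keeps a few fields per review. The app ID `com.example.app`, the `lang` and `country` values, and the choice of kept fields are placeholders and assumptions, not the settings used in this study.

```python
def fetch_reviews(app_id, count=200):
    """Download the newest reviews for one app (needs network access)."""
    # Imported lazily so the helper below also works without the package.
    from google_play_scraper import Sort, reviews

    result, _continuation_token = reviews(
        app_id,
        lang="id",         # assumption: Indonesian-language reviews
        country="id",      # assumption: Indonesian store
        sort=Sort.NEWEST,
        count=count,
    )
    return result


def to_rows(raw_reviews):
    """Keep only the review text, star score, and timestamp of each review."""
    return [
        {"content": r.get("content"), "score": r.get("score"), "at": r.get("at")}
        for r in raw_reviews
    ]


# Hypothetical usage ("com.example.app" is a placeholder app ID):
# rows = to_rows(fetch_reviews("com.example.app", count=100))
```

Separating the download from the field selection keeps the second step usable on reviews that were already saved to disk.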

Environment

- Python version: 3.7.6
- Libraries: Pandas, NumPy, Matplotlib, Seaborn, scikit-learn, Google-Play-Scraper, Sastrawi, re, string.

Data Preparation

The data obtained from the Google Play Store is not yet clean. It has to be cleaned, and features have to be selected, before the sentiment analysis.

Data Cleaning & Preprocessing

The data cleaning and preprocessing procedure includes:
- Lowercase
- Remove Number
- Remove Punctuation
- Remove Whitespace
- Remove ASCII and Unicode
- Remove Newline
- Stop Words
- Stemming
- Vectorization
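
The first steps in the list above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's code: the stop-word set is a tiny made-up sample, and "Remove ASCII and Unicode" is read here as dropping non-ASCII characters such as emoji, which is an assumption.

```python
import re
import string

# Tiny illustrative stop-word set; a real run would use a full Indonesian list.
STOP_WORDS = {"yang", "dan", "di", "ini", "itu"}

# Translation table that turns every punctuation character into a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_text(text):
    """Lowercase, strip non-ASCII, numbers and punctuation, drop stop words."""
    text = text.lower()
    text = text.encode("ascii", "ignore").decode("ascii")  # drop emoji etc.
    text = re.sub(r"\d+", " ", text)                        # remove numbers
    text = text.translate(PUNCT_TO_SPACE)                   # remove punctuation
    tokens = text.split()        # also collapses newlines and extra whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return " ".join(tokens)


print(clean_text("Aplikasi ini BAGUS!!! 😀\nversi 2 yang   baru"))
# prints: aplikasi bagus versi baru
```

Stemming (for example with Sastrawi) and vectorization are left out so that the sketch stays free of third-party dependencies.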

Feature Selection

From the variables we already have, we need to add one more: a label marking each review's sentiment as positive or negative. This gives 5 variables for the data analysis.

Labeling

- As the initial labeling, scores 1-3 are classified as negative and scores 4-5 as positive.
- Only 2 classes are used because DMAIC needs just 2 classes: defective and non-defective data.
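
The labeling rule above amounts to a one-line threshold on the star score. A minimal sketch follows; the function name and the label strings are illustrative choices, not taken from the notebooks.

```python
def label_from_score(score):
    """Map a 1-5 star rating to a binary sentiment label."""
    if score not in (1, 2, 3, 4, 5):
        raise ValueError(f"score must be an integer from 1 to 5, got {score!r}")
    # Scores 1-3 count as negative (defective), 4-5 as positive (non-defective).
    return "positive" if score >= 4 else "negative"


print([label_from_score(s) for s in range(1, 6)])
# prints: ['negative', 'negative', 'negative', 'positive', 'positive']
```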

We only use data from 2021. After cleaning, stop-word removal, stemming, removal of missing values, and filtering, 30,119 reviews remain.
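
The year filter and the removal of missing values could look roughly like the sketch below. The field names `content`, `score`, and `at`, and the sample rows, are assumptions made for illustration.

```python
from datetime import datetime


def filter_reviews(rows, year=2021):
    """Keep rows from the given year that have review text and a score."""
    kept = []
    for row in rows:
        text, score, at = row.get("content"), row.get("score"), row.get("at")
        if not text or not text.strip() or score is None or at is None:
            continue  # drop rows with missing values
        if at.year == year:
            kept.append(row)
    return kept


# Made-up sample rows.
sample = [
    {"content": "bagus", "score": 5, "at": datetime(2021, 5, 1)},
    {"content": "lambat", "score": 2, "at": datetime(2020, 12, 31)},
    {"content": "", "score": 1, "at": datetime(2021, 7, 9)},
]
print(len(filter_reviews(sample)))  # prints: 1
```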

Exploratory Data Analysis

Model

The classification model is a Support Vector Machine (SVM), and the class imbalance is handled by resampling.
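
The source does not say whether classes were oversampled or undersampled, so the sketch below is only one possible reading: it draws a fixed number of rows per class, without replacement when the class is large enough and with replacement otherwise. The function name, the `label` key, and the sample data are hypothetical.

```python
import random


def resample_to_size(rows, label_key, size, seed=0):
    """Return `size` rows per class, undersampling or oversampling as needed."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)

    balanced = []
    for group in by_label.values():
        if len(group) >= size:
            balanced.extend(rng.sample(group, size))     # undersample
        else:
            balanced.extend(rng.choices(group, k=size))  # oversample
    rng.shuffle(balanced)
    return balanced


# Made-up imbalanced data: 8 negative rows and 2 positive rows.
data = [{"label": "negative"}] * 8 + [{"label": "positive"}] * 2
balanced = resample_to_size(data, "label", size=4)
print(len(balanced))  # prints: 8
```

A fixed seed makes the draw repeatable, which helps when comparing model runs.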
After resampling, we use 9,500 reviews for the negative and positive labels.
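
An SVM text classifier of this kind could be sketched with scikit-learn as follows. TF-IDF features and `LinearSVC` are assumptions (the source names neither the vectorizer nor the kernel), and the toy reviews are made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy, made-up reviews; the real input would be the cleaned, balanced data.
texts = [
    "aplikasi bagus sekali",
    "sangat membantu dan cepat",
    "mantap mudah dipakai",
    "aplikasi error terus",
    "lambat dan sering gagal",
    "tidak bisa login",
]
labels = ["positive", "positive", "positive", "negative", "negative", "negative"]

# Vectorize the text and fit a linear SVM in one pipeline.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, labels)

predictions = model.predict(["aplikasi bagus", "sering error"])
print(predictions)
```

Wrapping both steps in one pipeline ensures new text is vectorized with the vocabulary learned during training.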
Classification Score:

DMAIC

P-Control Table
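
For reference, a p control chart compares each subgroup's proportion of defective items with three-sigma limits around the overall proportion. The sketch below assumes that negative reviews play the role of defectives and that subgroups are periods such as months; both the grouping and the counts are made up.

```python
import math


def p_chart_limits(defectives, sample_sizes):
    """Return (p_bar, [(p_i, lcl_i, ucl_i), ...]) for a p control chart."""
    p_bar = sum(defectives) / sum(sample_sizes)  # overall defective proportion
    points = []
    for d, n in zip(defectives, sample_sizes):
        sigma = math.sqrt(p_bar * (1 - p_bar) / n)
        lcl = max(0.0, p_bar - 3 * sigma)  # lower control limit, floored at 0
        ucl = min(1.0, p_bar + 3 * sigma)  # upper control limit, capped at 1
        points.append((d / n, lcl, ucl))
    return p_bar, points


# Made-up counts of negative reviews out of 100 reviews per period.
p_bar, points = p_chart_limits([30, 45, 25], [100, 100, 100])
for p, lcl, ucl in points:
    status = "in control" if lcl <= p <= ucl else "out of control"
    print(f"p={p:.2f}  LCL={lcl:.4f}  UCL={ucl:.4f}  {status}")
```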

Business Recommendation

The sentiment labels obtained are used as input to the DMAIC analysis, which is expected to yield recommendations suited to the existing conditions. The business recommendations for this study are available as a PDF file in this repository.

License

The underlying code of this project is licensed under the MIT license: https://github.com/madekrisnaj/Application-of-DMAIC-Framework-on-the-Sentiment-Analysis/blob/main/LICENSE
