LingAdeu/README.md

About

Hi, I am a junior linguist currently interested in computational stylometry. Because this field requires a solid grounding in linguistics, statistics, and computational methods, I'm using data science and machine learning to explore it.

Currently, I'm learning how to extract important linguistic features from text data and how to experiment with machine learning models for text classification. I am also exploring how to apply statistical techniques to authorship attribution. In addition, I am working on several data science projects in a business context to get more familiar with numbers.

Key Projects
  • PREDICTIVE MODELING
    • Optimizing Ride Fares: A Dynamic Pricing Model for Ride-Sharing Services
      • Currently, ride-sharing prices are primarily set based on ride duration, overlooking fluctuating demand and supply. This project explores a dynamic pricing model powered by machine learning to enhance profitability while keeping prices appealing to customers. By experimenting with 12 ML algorithms and two feature engineering techniques, the project developed a model that, when tested with a simulation of 100 customers, showed that increasing the expected ride duration by 20% through a promotional campaign could generate a net profit of $2.4K. (Read More)
    • Addressing Customer Churn in an E-Commerce Company
      • This project seeks to reduce an e-commerce company's customer churn rate from 16.8% to 10%. Using diagnostic analysis and a classification model, we focused on minimizing false negatives due to their higher financial impact. After testing various techniques and algorithms, we chose XGBoost and identified tenure and cashback amount as key factors for intervention. Simulations showed that with targeted strategies, achieving the 10% churn rate is feasible. (Read More)

  • DATA ANALYSIS
    • Evaluating Marketing Campaign Effectiveness for New Menu Items: An A/B Testing Approach
      • This project assesses which promotional campaign best boosts sales for a fast-food company's new menu items. Because the sales distributions were non-normal and contained outliers, the analysis used non-parametric tests: the Kruskal-Wallis H test followed by Dunn's post-hoc test. Results showed that the first campaign achieved the highest median sales, but the practical difference between campaigns (effect size, $\eta^2$) was minor. It is recommended that the Marketing Manager re-evaluate marketing strategies and target customers to improve campaign impact. (Read More)
    • Improving the Number of Reviews: Exploring Review Patterns in Bangkok's Airbnb Landscape
      • Despite an increase in reviews, about 36% (5.7 thousand) of Airbnb listings in Bangkok received none from 2012 to 2022. This project explores why some listings lack reviews and offers recommendations for Airbnb Thailand. It finds that unreviewed listings often have higher prices and longer minimum stays, which may deter bookings and reviews. In contrast, reviewed listings are typically entire homes or apartments, more centrally located, and closer to popular areas. Recommendations include adjusting prices and minimum stays for unreviewed listings, running promotions to boost reviews, and improving marketing to highlight unique features and attractions. (Read More)

  • NATURAL LANGUAGE PROCESSING
    • Regular Expression for Rule-Based Content Moderation
      • This project addresses taboo expressions in computer-mediated communications by detecting and censoring specific elements of messages (e.g., "Shit, I forgot!" $\rightarrow$ "****, I forgot!"). A rule-based approach using regular expressions was chosen over machine learning for its efficient implementation, high explainability to stakeholders, and reliable detection of inappropriate content through rule matching. (Read More)
    • Using Personal Names to Predict Gender: A 3-Character N-Gram Approach
      • This project investigated whether conventional machine learning algorithms with character n-grams could outperform Long Short-Term Memory (LSTM) models, which achieved an F1 score of 0.93 (Septiandri, 2017). Using 3-character n-grams focusing on word boundaries to capture spacing between name parts, the Support Vector Machine with a linear kernel performed best, achieving an F1 score of 0.94. The results suggest that conventional models can match or exceed LSTM performance when using word-boundary 3-character n-grams. (Read More)
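
As a minimal sketch of the rule-based censoring described under "Regular Expression for Rule-Based Content Moderation", the snippet below masks whole-word matches with asterisks of the same length. The word list, the `censor` function name, and the whole-word matching rule are assumptions made for illustration; the project's actual rules may differ.

```python
import re

# Hypothetical word list for illustration only; not the project's actual rules.
TABOO_WORDS = ["shit", "damn"]

# \b restricts matches to whole words; re.escape guards against
# regex metacharacters appearing in the word list.
TABOO_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in TABOO_WORDS) + r")\b",
    flags=re.IGNORECASE,
)


def censor(message: str) -> str:
    """Replace each matched taboo word with asterisks of the same length."""
    return TABOO_PATTERN.sub(lambda match: "*" * len(match.group(0)), message)


print(censor("Shit, I forgot!"))  # ****, I forgot!
```

Because every replacement traces back to an explicit entry in the word list, each censoring decision can be explained by pointing at the rule that fired, which is the explainability advantage mentioned above.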

My tools

rstudio logo python logo vscode logo markdown logo msql logo tableau logo

Connect with me

linkedin logo medium logo

Pinned

  1. customer-churn-prediction Public

    This project aims to reduce churn rate from 16.8% to 10% by exploiting both diagnostic and predictive analytics. Using the final model, the churn rate can be reduced to even below 10% based on a si…

    Jupyter Notebook

  2. dynamic-pricing-model Public

    The goal of this project is to build a dynamic pricing model to adjust fares of bike-ride services based on different factors. Per 100 customers, the model can help generate net profit of USD 2.4K …

    Jupyter Notebook

  3. ab-testing-campaign-effectiveness Public

    This project investigates which of three promotional campaigns performs best in terms of generating sales, using an A/B testing approach.

    Jupyter Notebook

  4. predicting-gender-based-on-name Public

    This project seeks to build a classifier to predict someone's gender (binary categories) based on their full names. It is IMPORTANT to note that the model's predictions are only valid for Indonesia…

    Jupyter Notebook

  5. bangkok-airbnb-review-exploration Public

    This repository contains the code and resources for analyzing Airbnb listing reviews in Bangkok, Thailand. My aim is to explore the factors influencing the lack of reviews for certain listings in B…

    Jupyter Notebook

  6. spam-message-prediction Public

    This project aims to build a classification model that detects spam messages, using term frequency-inverse document frequency (TF-IDF) as the text representation.

    Jupyter Notebook