Skip to content

Latest commit

 

History

History
31 lines (23 loc) · 1.67 KB

README.md

File metadata and controls

31 lines (23 loc) · 1.67 KB

Outfox a Fraud

Online ticket vendors offer the convenience of digital tickets without the expense of building an online sale platform. As intermediaries, these vendors are particularly reliant on the convenience factor; their service must be easy and secure. There are few things less convenient than getting scammed. Fraud detection is a growing application of machine learning, and is a vital security measure across a range of industries.

This repo explores the ways to customize fraud detection classification models for use in the ticket vendor industry. Because we utilize real data from a ticket sales company, some segments of the work posted here will not be reproducible. This is one of several steps taken to ensure the anonymity of the vendor and its clients.

Also, please note: this repository is a work in progress. I am in the process of recreating files to remove all reference to identifiable information. As new sections are completed, I will post files here and update the content list below.

Available for Review

  • EDA

  • Feature Engineering

  • Baseline Random Forest Model

In Progress

  • Natural Language Processing
    • Process text columns (event & organization descriptions)
      • Remove HTML, punctuation, stop words
      • Tokenize & lemmatize
    • Compare vectorizers: Count, TF-IDF, and Hash
  • Model Selection
    • Hyperparameter tuning using Bayesian Optimizer
    • Compare model performance with cross-validation
      • Random Forest
      • Logistic Regression
      • Gradient Boosting