
betas-org/betas


Background

Betas is a simple, convenient visualization tool that helps data scientists and data analysts analyze model performance in Python. With a single line of code, users can generate custom plots for diagnosing the assumptions of a linear regression model, computing model scores in binary classification, and evaluating the performance of principal component analysis (PCA) and clustering algorithms. The tool also helps users fit machine learning models to datasets without a detailed understanding of how the models work. The Betas package is pip installable and easy to use by following our example IPython notebooks, which use the Spam and College datasets for demonstration. In addition, we provide two interactive web dashboards for model diagnostics in linear regression and binary classification.
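To make "model scores in binary classification" concrete, here is a minimal pure-Python sketch of the kind of threshold-based scores such a tool typically reports. It is not the Betas API: the function names `confusion_counts` and `binary_scores`, the 0.5 default threshold, and the toy data are all assumptions made for illustration.

```python
def confusion_counts(scores, labels, threshold=0.5):
    """Count (tp, fp, tn, fn) when predicting 1 for scores >= threshold.

    Hypothetical helper for illustration; not part of the betas package.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def binary_scores(scores, labels, threshold=0.5):
    """Return accuracy, precision, recall, and F1 at the given threshold."""
    tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
    total = tp + fp + tn + fn
    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy model scores and true labels, invented for this example.
    toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
    toy_labels = [1, 0, 1, 0, 0]
    print(binary_scores(toy_scores, toy_labels, threshold=0.5))
```

Sweeping `threshold` over a range of values and recording the scores at each step gives the data behind a typical score-versus-threshold plot.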

Team Members

Joel Stremmel

Yiming Liu

Cathy Jia

Mengying Bi

Arjun Singh

Data

Data Set 1: The Spam data (Source)

Data Set 2: The Breast Cancer data (Source)

Data Set 3: The College data (Source)

Data Set 4: The Auto data (Source)

Software

Programming Languages

Python

Python Packages

numpy >= 1.13.1

pandas >= 0.23.1

matplotlib >= 2.0.2

seaborn >= 0.9.0

scipy <= 1.2.0

scikit-learn >= 0.20.2

statsmodels >= 0.9.0

dash >= 0.43.0

bokeh >= 1.0.4
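The version bounds above can be checked against the current environment with a short script. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+). The helper names (`parse_version`, `satisfies`, `check_requirements`) are hypothetical, and the simplified version parsing ignores pre-release tags, so treat the output as a quick sanity check rather than a full dependency resolver.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Bounds copied from the package list above.
REQUIREMENTS = {
    "numpy": (">=", "1.13.1"),
    "pandas": (">=", "0.23.1"),
    "matplotlib": (">=", "2.0.2"),
    "seaborn": (">=", "0.9.0"),
    "scipy": ("<=", "1.2.0"),
    "scikit-learn": (">=", "0.20.2"),
    "statsmodels": (">=", "0.9.0"),
    "dash": (">=", "0.43.0"),
    "bokeh": (">=", "1.0.4"),
}


def parse_version(text):
    """Turn '1.13.1' (or '1.13.1rc2') into a tuple of three ints.

    Simplification: suffixes such as 'rc2' are dropped, so a
    pre-release compares equal to its final release.
    """
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def satisfies(installed, op, required):
    """Check one bound; only '>=' and '<=' are supported."""
    have, need = parse_version(installed), parse_version(required)
    return have >= need if op == ">=" else have <= need


def check_requirements(requirements=REQUIREMENTS):
    """Map each package name to 'ok', 'missing', or a violation note."""
    report = {}
    for name, (op, required) in requirements.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if satisfies(installed, op, required):
            report[name] = "ok"
        else:
            report[name] = f"{installed} violates {op} {required}"
    return report


if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Note that scipy is the only package with an upper bound, so a recent environment is most likely to be flagged on that line.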

Structure

The package has the following structure. See the betas library documentation for details.

betas/
  |- betas/
     |- README.md
     |- __init__.py
     |- binary_score_diagnostics.py
     |- binary_score_plot.py
     |- clustering_evaluate.py
     |- download.js
     |- pca_evaluate.py
     |- regression_analysis_plot.py
     |- regression_diagnostics.py
     |- setup.cfg
     |- test_analysis_plot.py
     |- test_binary_score_plot.py
     |- test_clustering_evaluate.py
     |- test_pca_evaluate.py
  |- data/
     |- college.csv
     |- spam.data.txt
     |- spam.traintest.txt
     |- spam_score_label.csv
  |- dist/
     |- betas-v1.3.tar.gz
  |- docs/
     |- Component_Specification.pdf
     |- Final_Presentation.pdf
     |- Functional_Specification.pdf
     |- Project_Summary.pdf
     |- Technology_Review.pdf
     |- logo_black.png
     |- logo_white.png
  |- examples/
     |- demo_regression_analysis_plot.ipynb
     |- demo_binary_score_plot.ipynb
     |- demo_clustering_evaluate.ipynb
     |- demo_pca_evaluate.ipynb
  |- LICENSE.txt
  |- README.md
  |- environment.yml
  |- requirements.txt
  |- setup.py

Installation

pip install betas

License Information

MIT License