vhuynh2017/Data-Science-Python

Data-Science-Python

This repository contains data science assignments and projects. They come either from a class at Florida Atlantic University (posted with the professor's permission to share old assignments) or from side projects found on the web.

Current Topics:

  1. The Python Data Science Stack
  • Compute Summary Statistics
  • Compute Pearson coefficient
  • Test Hypothesis
  • Data Cleaning/Data preprocessing
  • Plot histogram, kernel density estimate (KDE) plot, lmplot (regression and fit), Box-and-whisker plots
  2. Exploratory Data Analysis
  • Exploratory data analysis based on the content of each dataset
  • Plot barplot, pie plot, histogram, line graph
  • Data Manipulation (pivot table, group by, etc)
  • Hypothesis Testing
  3. Statistical Analysis Part 1, Part 2
  • Compute Summary Statistics
  • Compute Empirical Cumulative Distribution Function (ECDF)
  • Compute Percentiles
  • Compute the Pearson Correlation Coefficient
  • Compute the Cumulative Distribution Function (CDF) and Complementary CDF (CCDF)
  • Compute Moments
  • Plot the Probability Density Function (PDF) & Cumulative Distribution Function (CDF)
  • Plot histogram, swarmplot, Box-and-whisker plots, Scatter plots, Pair plots
  4. Hypothesis Testing
  • Hypothesis Testing
  • Compute Empirical Cumulative Distribution Function (ECDF)
  • Compute p-values
  • Test of Correlation
  5. Regression Analysis Part 1, Part 2, Part 3
  • Linear Regression by least squares
  • Linear Regression on Anscombe's quartet data
  • Polynomial Regression
  • Regularization in Polynomial Regression: Ridge regression (L2 Regularization), Lasso regression (L1 Regularization)
  • Compute RMSE and R^2 score
  • Multiple Regression
  • Logistic Regression
  6. Data Science / Machine Learning workflow (CAP5768 Final Project)
  • Compute Summary Statistics
  • Decision Trees
  • Plot Pair plots
  • Digit classification with Naive Bayes classifier using scikit-learn's MultinomialNB()
  • Digit classification with RandomForestClassifier
  • Hyperparameter Optimization with GridSearchCV
  • Face Recognition with multi-class SVM classifier
  • Principal Component Analysis (PCA)
  7. Effect of Outliers on Central Tendency Measures
  • Mean, Median, and Mode
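
As a rough illustration of the last topic, the sketch below compares the mean, median, and mode of a small sample before and after appending one extreme value. It is not taken from the repository's notebooks: the `central_tendency` helper and the sample values are hypothetical, made up for this example, and it uses only Python's standard `statistics` module rather than the data science stack the notebooks may rely on.

```python
from statistics import mean, median, mode


def central_tendency(values):
    """Return the mean, median, and mode of a sequence of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "mode": mode(values),
    }


# Hypothetical sample, invented purely for illustration.
sample = [2, 3, 3, 4, 5, 5, 5, 6, 7]

# The same sample with a single extreme value appended.
with_outlier = sample + [100]

before = central_tendency(sample)
after = central_tendency(with_outlier)

# Print each measure before and after adding the outlier.
for name in ("mean", "median", "mode"):
    print(f"{name:>6}: {before[name]:.2f} -> {after[name]:.2f}")
```

In this made-up sample the single outlier roughly triples the mean, while the median and mode stay at 5, which is the usual reason the median is preferred for skewed data.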
