jryan814/datab-projects

datab-projects

Project Portfolio

Selection

I selected these projects to showcase some of my work. I chose them based on my personal interest in each project, the technical skills involved, and the domain of its application. Virtually all of the data I have worked with was unstructured and required some degree of cleaning, transforming, and restructuring.

Some of the libraries and tools used are:

  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • SQL (SQLite, SQLAlchemy, PostgreSQL)
  • scikit-learn

Technical Skills Used:

  • Python
  • SQL
  • Statistical Analysis
  • Exploratory Data Analysis (EDA)
  • Data Analysis and Visualization
  • Machine Learning
  • Deep Learning
  • Web Scraping & APIs

Project Descriptions (wip)

  • Business Intelligence Tool Development (subfolder)
    • Engineered an ETL pipeline that cleans and transforms a flat file, then loads it into a star-schema database.
    • Queried and extracted a data subset (xxx,xxx records) from the Federal Procurement Data System (FPDS) based on NAICS codes, business size, and recency.
    • Wrote Python and SQL scripts that report key business analytics, identifying competitor companies and ranking them by size and level of activity.
    • Created visualizations in Tableau and Jupyter Notebooks to represent key market spaces.
    • Reported detailed analytics on relevant competitor entities to support competitive positioning and identify potential acquisition targets.
  • HNproject.ipynb
    • Analysis to determine optimal times to post on Hacker News.
  • advertising_analysis.ipynb
    • Analysis to determine the best markets for an advertising campaign.
  • bike_rental_predictions.ipynb
    • Implementation of multiple machine learning models to predict volume of bike rentals.
  • war_on_spam.ipynb
    • SMS spam filter that uses a naive Bayes algorithm to classify messages as spam or non-spam.
  • stock_market_project Dashboard
    • Stock market machine learning pipeline.
    • The feature engineering and model training pipelines are in the stock_market_project directory.
  • agraph.py
    • An AI maze solver. Generates nodes and edges from the maze array. TODO: add maze image input and graphical output.
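
As a rough illustration of the maze-solver idea above (nodes and edges generated from a maze array), here is a minimal sketch. It is not the code in agraph.py: the function names, the 0/1 cell encoding, and the use of breadth-first search as the solver are all assumptions made for the example.

```python
from collections import deque


def build_graph(maze):
    """Map each open cell (row, col) to the list of open cells next to it."""
    rows, cols = len(maze), len(maze[0])
    graph = {}
    for r in range(rows):
        for c in range(cols):
            if maze[r][c] == 1:  # assumption: 1 marks a wall, 0 an open cell
                continue
            neighbors = []
            # Edges connect cells that touch vertically or horizontally.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and maze[nr][nc] == 0:
                    neighbors.append((nr, nc))
            graph[(r, c)] = neighbors
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the nodes from start to goal, or None."""
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph[node]:
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical 3x3 maze used only for this example.
maze = [
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
]
graph = build_graph(maze)
path = shortest_path(graph, (0, 0), (2, 1))
print(path)
```

Storing the graph as a dictionary of adjacency lists keeps the solver independent of the grid, so a different search strategy could be swapped in without changing how the nodes and edges are generated.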
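
The naive Bayes spam filter described under war_on_spam.ipynb can be sketched in a few lines as well. This is a generic multinomial naive Bayes with Laplace smoothing written for illustration, not the notebook's code; the function names, the whitespace tokenizer, and the four training messages are made up for the example.

```python
import math
from collections import Counter

LABELS = ("spam", "non-spam")


def train(messages):
    """messages: iterable of (label, text) pairs. Returns counts and vocabulary."""
    word_counts = {label: Counter() for label in LABELS}
    label_counts = Counter()
    for label, text in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set()
    for label in LABELS:
        vocab |= set(word_counts[label])
    return word_counts, label_counts, vocab


def classify(text, word_counts, label_counts, vocab, alpha=1.0):
    """Return the label with the highest log-posterior score."""
    total = sum(label_counts.values())
    scores = {}
    for label in LABELS:
        n_words = sum(word_counts[label].values())
        # Start from the log prior: the share of training messages with this label.
        score = math.log(label_counts[label] / total)
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen during training
            # Laplace-smoothed estimate of P(word | label).
            p = (word_counts[label][word] + alpha) / (n_words + alpha * len(vocab))
            score += math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training messages, for illustration only.
training = [
    ("spam", "win cash prize now"),
    ("spam", "free prize claim now"),
    ("non-spam", "are we meeting for lunch"),
    ("non-spam", "see you at lunch tomorrow"),
]
model = train(training)
print(classify("claim your free cash", *model))
print(classify("lunch tomorrow", *model))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the smoothing term alpha keeps a word that appears under only one label from driving the other label's probability to zero.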