Scott Miner sminerport

Scott Miner's Tech and Data Journey

Hey there! I’m the kind of person who finds joy in data—whether it’s teasing out insights from numbers or building a slick new website. My career has been a bit of a journey through the tech world, with stops as a Data Science Engineer, Data Analyst, SQL Server Developer, and even a Data Conversion Specialist. But at heart, I just love making things work better through technology.

I got my start with a B.A. in Psychology from the University of Maryland, Baltimore County, where I not only delved into the human mind but also played tunes as part of the jazz bands under the legendary Chris Vadala. That experience harmonized perfectly with my academic path, which later led me to Red Rocks Community College where I earned a double encore in tech with certifications in Software Development and Programming.

The crescendo of my educational symphony was a Master of Science in Data Analytics from CSU-Global, class of 2021, with a focus on the cutting-edge fields of AI and ML. My particular fascinations? Computer vision and natural language processing (NLP), fields where tech meets human perception and communication.

When I’m not delving into datasets or getting lost in lines of code, I find relaxation and inspiration by watching a variety of YouTube videos. From Mixed Martial Arts to in-depth coding tutorials and beyond, these videos fuel my passion for continuous learning and keep me updated on the latest trends and techniques. My world is one where creativity meets logic—whether I'm exploring the intricacies of web development, or marveling at the complexities of machine learning and data visualization, it’s all about finding the harmony in chaos and the patterns that lead to insights. I’m eager to connect and collaborate, so let’s see what stories we can tell through data and how we can make it sing!

🏆 Awards and Certifications

2022 ANNY Award - Contributed to Echo Global Logistics winning the Excellence in Analytics Award for their Digital Freight Marketplace program
Predictive Analytics Management - SAS and Colorado State University Global (December 2020)
Business Intelligence and Performance Management - SAS and Colorado State University Global (December 2020)
Applied Data Analytics - SAS and Colorado State University Global (December 2020)
Programming Specialist Certification - Red Rocks Community College (August 2017)
Software Developer Certification - Red Rocks Community College (May 2016)

🎓 Education

Master of Science in Data Analytics - Colorado State University Global, Aurora, CO (2021)
- Specialization in Artificial Intelligence and Machine Learning
- GPA: 4.0
Programming Specialist and Software Developer Certifications - Red Rocks Community College, Lakewood, CO (2015-2017)
- Recognized by instructor Dr. Julie Schneider for exceptional achievement in creating a highly efficient and high-quality C++ application with a well-designed user interface.
- GPA: 4.0
Bachelor of Arts in Psychology, cum laude - University of Maryland, Baltimore County (UMBC), Baltimore, MD (2009)

💼 Experience

Data Analyst/Librarian, Vibrant Emotional Health | New York, NY (08/2023 – Present)
Data Science Engineer, Echo Global Logistics | Chicago, IL (01/2022 – 01/2023)
Data Analyst, Community College of Aurora | Aurora, CO (02/2020 – 06/2021)
Senior SQL Server Developer, MeritDirect | Rye Brook, NY (07/2018 – 09/2019)
Data Conversion Specialist, Journal Technologies | Englewood, CO (01/2017 – 06/2018)

🔧 Technical Skills

Languages: Python, R, SAS, SQL, Java, C++, C#, JavaScript, HTML, CSS
Frameworks and Libraries: TensorFlow, scikit-learn, NumPy, pandas, matplotlib, seaborn, NLPAug
Tools: Git, Docker, Kubernetes, Microsoft Azure, Microsoft Office Suite, SSIS, SSMS, Tableau, Visual Studio, VS Code
Databases: Amazon Redshift, Oracle, PostgreSQL, SQL Server

🌟 Featured Projects

K-Nearest Neighbors Classifier - A Python implementation of the KNN classification algorithm, including preprocessing, cross-validation, and a sample dataset for demonstration.
Word2Vec with TensorFlow 2.0 - An implementation of the Word2Vec algorithm using TensorFlow 2.0 to compute vector representations of words, trained on the text8 dataset.
Text Augmentation using Back-Translation - A Python script that uses the NLPAug library and round-trip translation to augment text datasets for NLP tasks.
AVL Tree Implementation - A Python implementation of an AVL tree (balanced binary search tree) that automatically maintains balance as new nodes are inserted or removed.
Boston Housing Analysis - A comprehensive analysis of the Boston Housing dataset using various regression models, providing valuable insights into the housing market in Boston.
Universal-EDA - A Python library that provides an easy-to-use interface for Exploratory Data Analysis (EDA) and data loading, supporting various datasets.
Sequence Prediction ANN - A Python implementation of a simple Artificial Neural Network (ANN) for predicting the next number in a given sequence, using NumPy and scikit-learn.
Naive Bayes Golf Prediction - A Python implementation of a Naive Bayes classifier for predicting golf play based on the weather outlook, demonstrating the Laplacian correction technique, and incorporating UML diagrams for clear understanding.
8-Puzzle Solver - A Python-based solution to solve the classic 8-puzzle game using the A* search algorithm, capable of checking puzzle solvability and calculating solution time. The project includes comprehensive diagrams for clear understanding and a user-friendly console interface.
Healthcare Expert System with Pyke - A Python implementation of a rule-based expert system for heart failure telemonitoring, inspired by Seto et al.'s research. The system utilizes the Pyke engine and includes a comprehensive rule base and a fact base. It demonstrates the application of forward-chaining in reasoning with a healthcare-specific rule set. The project is complete with detailed documentation and interactive diagrams for clear understanding.

1. K-Nearest Neighbors Classifier

Short description: This repository contains a Python implementation of the k-nearest neighbors (KNN) classification algorithm. The project includes preprocessing, cross-validation, and the KNN classifier itself, along with a sample dataset to demonstrate its usage. The implementation is modular and easy to use, making it suitable for a wide range of classification tasks.

Problem it solves: The k-nearest neighbors algorithm is a versatile and easy-to-understand classification method that can be applied to a variety of problems in machine learning, such as image recognition, text classification, and recommendation systems. By implementing the KNN classifier, this project provides a foundation for users to apply the algorithm to their own datasets and understand its inner workings.

Technologies used:

Python 3.x
CSV file handling
Euclidean distance calculation
Preprocessing and data loading
Cross-validation for model evaluation

Sample use case: The repository includes a sample dataset (in CSV format) to demonstrate the usage of the KNN classifier. Users can follow along with the example to understand how to preprocess the data, split it into training and test sets, and apply the KNN classifier to make predictions. The example also showcases the use of cross-validation for model evaluation and selection.
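The core of the classifier can be sketched in a few lines. This is a minimal illustration, not the repository's code: the function names `euclidean_distance` and `knn_predict` and the toy data are assumptions made for the example, and preprocessing and cross-validation are left out.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_rows, train_labels, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    # Pair each distance with its label, then keep the k smallest distances.
    neighbors = sorted(
        (euclidean_distance(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
rows = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]
prediction = knn_predict(rows, labels, (1.1, 1.0), k=3)
```

An odd `k` avoids tied votes in two-class problems; in practice `k` would be chosen with cross-validation, as the project does.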

2. Word2Vec with TensorFlow 2.0

Short description: This project is an implementation of the Word2Vec algorithm using TensorFlow 2.0 to compute vector representations of words. The Word2Vec model used is the Skip-Gram model, which is trained on a small chunk of Wikipedia articles (the text8 dataset).

Problem it solves: Word2Vec is a popular word embedding technique that represents words as vectors in a high-dimensional space. These embeddings can be used in various natural language processing tasks, such as sentiment analysis, document classification, and machine translation. The main idea behind Word2Vec is that words with similar meanings tend to occur in similar contexts.

Technologies used:

Python 3
TensorFlow 2.0
NumPy
urllib
zipfile
Text8 dataset
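The Skip-Gram model learns from (center word, context word) pairs drawn from a sliding window over the text. The sketch below shows only that pair-generation step in plain Python; the function name `skipgram_pairs` and the `window` parameter are assumptions for illustration, and the TensorFlow training loop is not shown.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for tokens within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


tokens = "the quick brown fox".split()
pairs = skipgram_pairs(tokens, window=1)
```

Each pair becomes one training example: the model is asked to predict the context word from the center word, which is what pushes words that share contexts toward similar vectors.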

3. Text Augmentation using Back-Translation

Short description: This repository contains a Python script that uses the NLPAug library and the round-trip translation (RTT) technique to augment text datasets. The script processes TXT files in the "data/" folder, translating each text into another language and back to produce augmented versions. The augmented dataset adds training data for natural language processing tasks such as chatbot training or text classification.

Problem it solves: Text augmentation is a technique used to expand or modify existing text data in a way that increases the variety and quantity of training data for natural language processing tasks. This back-translation augmentation technique helps create more diverse training data, improving the performance of NLP models in tasks like text classification, sentiment analysis, named entity recognition, machine translation, chatbot training, and question-answering systems.

Technologies used:

Python
NLPAug library

4. AVL Tree Implementation

Short description: This repository contains the code for implementing an AVL tree (balanced binary search tree) in Python. The implementation covers the Node and Tree classes, build_tree() method, and the insert() and delete() methods for inserting and removing nodes in the AVL tree.

Problem it solves: AVL trees are a type of self-balancing binary search tree that automatically maintain their balance as new nodes are inserted or removed. This property ensures that the tree remains efficient for searching, insertion, and deletion operations, making it useful for various applications, such as databases and file systems.

Technologies used:

Python 3.x
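As a rough illustration of how an AVL tree rebalances on insertion, here is a compact functional sketch. It is not the repository's Node and Tree classes: the helper names (`rotate_left`, `rotate_right`, `balance_factor`, `in_order`) are assumptions for this example, and deletion is omitted.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # a new node is a leaf


def height(node):
    return node.height if node else 0


def update_height(node):
    node.height = 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    return height(node.left) - height(node.right)


def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update_height(y)
    update_height(x)
    return x  # new subtree root


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update_height(x)
    update_height(y)
    return y  # new subtree root


def insert(node, key):
    """Insert key into the subtree rooted at node and return the new root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node  # ignore duplicates
    update_height(node)
    bf = balance_factor(node)
    if bf > 1:  # left-heavy
        if key > node.left.key:  # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if bf < -1:  # right-heavy
        if key < node.right.key:  # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


def in_order(node):
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


root = None
for key in range(1, 8):
    root = insert(root, key)
```

Inserting sorted keys is the worst case for a plain binary search tree, which degenerates into a linked list; here the rotations keep the height logarithmic in the number of nodes.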

5. Boston Housing Analysis

Short description: This repository contains a comprehensive analysis of the Boston Housing dataset using various regression models, including Linear Regression, Lasso Regression, and Ridge Regression. The project explores the dataset, visualizes the relationships between features and target variables, and evaluates the performance of the different regression models, providing valuable insights into the housing market in Boston.

Problem it solves: The analysis of the Boston Housing dataset aims to identify factors that influence housing prices and evaluate the effectiveness of different regression models for predicting housing prices based on given features. By understanding the relationships between variables in the housing market, stakeholders such as homeowners, real estate agents, and policymakers can make informed decisions about the housing market.

Technologies used:

Python
numpy
pandas (version 1.5.3)
matplotlib
seaborn
scikit-learn (version lower than 1.0)

6. Universal-EDA

Short description: Universal-EDA is a Python library that provides an easy-to-use interface for Exploratory Data Analysis (EDA) and data loading. It supports various datasets and helps streamline the EDA process.

Problem it solves: This project aims to simplify and streamline the EDA process by providing a unified interface for working with various datasets. Universal-EDA allows users to load common datasets, perform EDA, and convert DataFrames to CSV files, making it easier to perform data analysis and prepare data for machine learning tasks.

Technologies used:

Python
pandas
scikit-learn

7. Sequence Prediction ANN

Short description: This repository contains a Python implementation of a simple Artificial Neural Network (ANN) for predicting the next number in a given sequence. The ANN is built with NumPy and trained with backpropagation. The network's performance is measured by Mean Squared Error (MSE) over a configurable number of training epochs.

Problem it solves: Predicting the next number in a sequence is a fundamental problem in various fields, such as finance, weather forecasting, and data analysis. This project demonstrates the use of a basic 2-layer ANN for sequence prediction, providing a foundation for understanding neural networks and their applications in solving real-world problems.

Technologies used:

Python 3.x
NumPy
scikit-learn
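A two-layer network trained with backpropagation can be sketched with NumPy as follows. Everything specific here is an assumption chosen for illustration rather than taken from the repository: the window of three inputs, the four tanh hidden units, the learning rate, and the toy sequence 1..20.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training data: windows of three consecutive numbers from the
# sequence 1..20, scaled into (0, 1], with the following number as the target.
seq = np.arange(1, 21) / 20.0
X = np.array([seq[i:i + 3] for i in range(len(seq) - 3)])
y = seq[3:].reshape(-1, 1)

# One tanh hidden layer and a linear output layer.
W1 = rng.normal(0, 0.5, size=(3, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(0, 0.5, size=(4, 1))
b2 = np.zeros((1, 1))

lr, epochs = 0.1, 2000
mse_history = []
for _ in range(epochs):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - y
    mse_history.append(float(np.mean(err ** 2)))

    # Backward pass: gradients of the MSE with respect to each parameter
    d_pred = 2 * err / len(X)
    dW2 = h.T @ d_pred
    db2 = d_pred.sum(axis=0, keepdims=True)
    d_h = (d_pred @ W2.T) * (1 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0, keepdims=True)

    # Gradient-descent update
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2
```

Plotting `mse_history` against the epoch number gives the kind of training curve the project uses to analyze performance.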

8. Naive Bayes Golf Prediction

Short description: This repository contains a Python implementation of a Naive Bayes classifier to predict whether or not one will play golf based on the weather outlook. The project includes the Laplacian correction technique for handling zero-probability problems, along with UML diagrams for better code understanding and a sample dataset to demonstrate its usage.

Problem it solves: The Naive Bayes algorithm is a well-known classification technique that can be applied to various predictive modeling problems. In this project, we demonstrate its usage in predicting whether golf will be played under different weather conditions. This not only provides a hands-on approach to understanding and implementing the Naive Bayes algorithm but also demonstrates the use of Laplacian correction in handling the zero-probability problem, which is a common issue in probabilistic models.

Technologies used:

Python 3.x
scikit-learn
Pandas
PlantUML (for creating UML diagrams)

Sample use case: The repository includes a sample dataset to demonstrate the usage of the Naive Bayes classifier. Users can follow along with the example to understand how to apply the Laplacian correction, create a frequency table, calculate likelihoods, and compute posterior probabilities for each class. The example also showcases the use of UML diagrams for understanding and visualizing the structure and workflow of the application.
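The effect of the Laplacian correction is easiest to see on a zero count. The sketch below assumes the classic 14-row "play golf" teaching dataset, in which an overcast outlook never appears with "no"; the repository's sample data may differ, and the function names `likelihood` and `posterior` are illustrative.

```python
from collections import Counter
from fractions import Fraction

# (outlook, play) pairs from the classic 14-row "play golf" teaching dataset.
# Assumption: the repository's sample data may not match this exactly.
data = (
    [("sunny", "no")] * 3 + [("sunny", "yes")] * 2
    + [("overcast", "yes")] * 4
    + [("rainy", "yes")] * 3 + [("rainy", "no")] * 2
)


def likelihood(outlook, label, alpha=1):
    """P(outlook | label) with add-alpha (Laplacian) smoothing."""
    values = {o for o, _ in data}  # distinct outlook values
    class_count = sum(1 for _, l in data if l == label)
    joint_count = sum(1 for o, l in data if o == outlook and l == label)
    return Fraction(joint_count + alpha, class_count + alpha * len(values))


def posterior(outlook):
    """Normalized P(label | outlook) for each class label."""
    priors = Counter(l for _, l in data)
    scores = {
        label: Fraction(count, len(data)) * likelihood(outlook, label)
        for label, count in priors.items()
    }
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


smoothed = likelihood("overcast", "no")             # (0 + 1) / (5 + 3)
unsmoothed = likelihood("overcast", "no", alpha=0)  # 0 / 5
```

Without the correction, the zero likelihood would force the posterior for "no" to exactly zero whenever the outlook is overcast, no matter what other evidence is multiplied in.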

9. 8-Puzzle Solver

Short description: This project offers a Python-based solution to the classic 8-puzzle game. The program uses the A* search algorithm, a heuristic-guided search that finds a shortest solution path from a given initial state to the goal state, provided the heuristic is admissible. It can also check whether a given 8-puzzle is solvable and reports the time taken to find a solution.

Problem it solves: The 8-puzzle is a sliding puzzle frequently used in AI and problem-solving courses. It is a good candidate for practicing the A* search algorithm, which also applies to pathfinding, game AI, and other decision-making problems.

Technologies used:

Python 3.x
SimpleAI library
Colorama library

Sample use case: The solver allows users to input any valid initial state of the 8-puzzle. The program will then use the A* algorithm to find a solution, displaying each step of the process as it goes. If the given puzzle is unsolvable, it will inform the user. The time taken to solve the puzzle is also recorded and displayed, offering a way to benchmark the performance of the algorithm.
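The solvability check and a typical A* heuristic can each be written in a few lines. This is a sketch under stated assumptions, not the repository's code: it assumes the goal state is 1 through 8 with the blank last, represents a state as a flat tuple with 0 for the blank, and uses Manhattan distance, a common admissible heuristic that the project may or may not use.

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # assumed goal layout, 0 is the blank


def is_solvable(state):
    """Return True if a 3x3 state can reach GOAL.

    On a board of odd width, sliding a tile never changes the parity of the
    inversion count (pairs of tiles that appear in the wrong order, ignoring
    the blank). GOAL has zero inversions, so only even counts are solvable.
    """
    tiles = [t for t in state if t != 0]
    inversions = sum(
        1
        for i in range(len(tiles))
        for j in range(i + 1, len(tiles))
        if tiles[i] > tiles[j]
    )
    return inversions % 2 == 0


def manhattan(state, goal=GOAL):
    """Sum of row and column offsets of every tile from its goal position."""
    total = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue  # the blank does not count toward the heuristic
        target = goal.index(tile)
        total += abs(index // 3 - target // 3) + abs(index % 3 - target % 3)
    return total
```

Running the parity check first lets a solver reject an unsolvable puzzle immediately, before A* spends time searching every reachable state without finding the goal.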

10. Healthcare Expert Systems: A Heart Failure Telemonitoring Case Study

Short description: This repository presents a Python-based expert system using the Pyke knowledge-based system to emulate a heart failure telemonitoring system. The project takes inspiration from the research of Seto et al. (2012), demonstrating how expert systems can significantly improve self-care and clinical management in healthcare scenarios.

Problem it solves: The purpose of this project is to illustrate the potential of expert systems, particularly in healthcare. Expert systems can use collected data to provide specific, personalized advice or recommendations, thus improving patient self-care, clinical management, and overall patient outcomes.

Technologies used:

Python 2.6 (required by the legacy Pyke release)
Pyke (legacy knowledge engine)
CSV file handling
Rule Base and Fact Base for data and logic handling

Sample use case: The repository includes various rule sets in the rule base that monitor patient health, like detecting high body mass index, high blood pressure, and abnormal heart rate. A driver code executes the rule base against the fact base, providing the user with relevant recommendations based on the patient's vital signs. The project also considers the potential for future enhancements like integrating a question rule base and back-chaining rules into the expert system.
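Forward chaining itself is independent of Pyke and can be illustrated in plain Python 3. The sketch below does not use the Pyke API or the repository's rule base: the fact names, the rules, and the `forward_chain` function are hypothetical, and nothing here is clinical guidance.

```python
def forward_chain(facts, rules):
    """Apply rules until no new fact can be derived; return all known facts."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire a rule when all of its premises are already known.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical rule base: (set of premises, derived fact).
RULES = [
    (frozenset({"bmi_high"}), "advise_weight_review"),
    (frozenset({"systolic_high"}), "blood_pressure_alert"),
    (frozenset({"heart_rate_abnormal"}), "heart_rate_alert"),
    (frozenset({"blood_pressure_alert", "heart_rate_alert"}), "contact_clinician"),
]

# Hypothetical fact base for one patient reading.
derived = forward_chain({"systolic_high", "heart_rate_abnormal"}, RULES)
```

The last rule fires only on a second pass, after the two alerts have been derived, which is the data-driven behavior that distinguishes forward chaining from the goal-driven back-chaining mentioned as a future enhancement.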

📧 Contact

Email: scott.miner.data.scientist@gmail.com
Last.fm Profile - Just for fun!🎵
Personal Website - Explore more Data Science projects!🚀💻
