hgallocodes/README.md

Hello, I'm Hector!
More on my Career

🤖 Machine Learning Projects:

Tools: Python, scikit-learn, pandas, NumPy

  • Designed and deployed a machine learning pipeline using XGBoost to predict flight delays at LAX, integrating historical flight records (2021–2024) with local weather and holiday data to capture seasonal and operational patterns.
  • Developed an interactive Dash web application that guides users through airline selection, destination input, and flight details entry, delivering real-time delay probabilities via visualizations including gauges, summary tables, and route maps.
  • Applied Python data engineering techniques (Pandas, NumPy, Scikit-Learn) for dataset preparation, feature engineering, and model validation, ensuring accurate predictions and actionable insights for travelers and analysts.
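
As a rough illustration of the data-integration step described above, the sketch below joins flight records with same-day weather and a holiday calendar using only the standard library. The field names (`dep_hour`, `precip_in`, `wind_mph`), the sample rows, and the `build_features` helper are hypothetical and are not taken from the project.

```python
from datetime import date

# Hypothetical sample inputs; the project's real schema is not shown in this README.
flights = [
    {"flight": "AA100", "date": date(2023, 7, 4), "dep_hour": 18},
    {"flight": "DL200", "date": date(2023, 7, 5), "dep_hour": 9},
]
weather_by_date = {
    date(2023, 7, 4): {"precip_in": 0.0, "wind_mph": 12.0},
    date(2023, 7, 5): {"precip_in": 0.3, "wind_mph": 20.0},
}
holidays = {date(2023, 7, 4)}


def build_features(flight, weather_by_date, holidays):
    """Combine one flight record with same-day weather and calendar features."""
    day = flight["date"]
    weather = weather_by_date.get(day, {})  # missing weather yields None values below
    return {
        "dep_hour": flight["dep_hour"],
        "month": day.month,            # seasonal signal
        "day_of_week": day.weekday(),  # 0 = Monday
        "is_holiday": int(day in holidays),
        "precip_in": weather.get("precip_in"),
        "wind_mph": weather.get("wind_mph"),
    }


feature_rows = [build_features(f, weather_by_date, holidays) for f in flights]
```

Rows shaped like `feature_rows` could then be passed to a gradient-boosted classifier; the model-training step itself is omitted here.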

Tools: Python, scikit-learn, pandas, NumPy, Matplotlib

  • Developed a K-Nearest Neighbors model aimed at predicting future contracts of soccer players based on a comprehensive set of variables including performance statistics, age, and market potential.
  • Employed Python for data manipulation and analysis, using pandas and NumPy to clean and merge the data so that the dataset used for model training and validation was reliable.
  • Used scikit-learn to implement the KNN model, analyzing player data to forecast key contract elements such as transfer value and salary.
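
To make the idea concrete, here is a minimal pure-Python sketch of k-nearest-neighbors regression with min-max feature scaling. It is a stand-in for illustration, not the scikit-learn model used in the project; the player features, salary figures, and helper names are all hypothetical.

```python
import math


def fit_min_max(rows):
    """Return per-column minimums and maximums of the training features."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale_row(row, lows, highs):
    """Map each feature to [0, 1] so no single feature dominates the distance."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)]


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points closest to the query."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Hypothetical players: [age, goals per season] -> salary in millions.
players = [[21, 5], [24, 15], [28, 20], [31, 10], [34, 4]]
salaries = [1.0, 4.0, 6.0, 3.0, 1.5]

lows, highs = fit_min_max(players)
train_x = [scale_row(p, lows, highs) for p in players]
query = scale_row([25, 14], lows, highs)  # scale the query with the training ranges
predicted_salary = knn_predict(train_x, salaries, query, k=3)
```

Scaling matters here because KNN relies on raw distances: without it, a feature measured in large units would outweigh the others.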

Tools: Python, scikit-learn, pandas, Plotly

  • Developed and fine-tuned a k-Nearest Neighbors (k-NN) regression model to predict the durations of power outages based on influential factors like cause, geography, and population metrics, using a dataset that included detailed information on past power outage events.
  • Conducted rigorous data cleaning and exploratory data analysis to identify key predictors, manage outliers, and ensure data quality, which included encoding categorical variables and selecting relevant quantitative features for model training.
  • Evaluated the model's performance using root mean square error (RMSE) and R-squared metrics, performed a fairness analysis comparing predictions across different U.S. regions, and refined the model by incorporating features like peak energy consumption hours and population metrics to improve prediction accuracy.
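
The evaluation metrics mentioned above can be sketched in a few lines. The example below computes RMSE and R-squared, then compares RMSE across groups as one simple way to approach a regional fairness check. The outage durations, predictions, and region labels are invented for illustration and do not reflect the project's results.

```python
import math
from collections import defaultdict


def rmse(actual, predicted):
    """Root mean square error between paired actual and predicted values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Share of variance in the actual values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def rmse_by_group(actual, predicted, groups):
    """Compute RMSE separately for each group label."""
    buckets = defaultdict(lambda: ([], []))
    for a, p, g in zip(actual, predicted, groups):
        buckets[g][0].append(a)
        buckets[g][1].append(p)
    return {g: rmse(a_vals, p_vals) for g, (a_vals, p_vals) in buckets.items()}


# Hypothetical outage durations (hours), model predictions, and regions.
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 37]
regions = ["West", "West", "South", "South"]

overall_rmse = rmse(actual, predicted)
overall_r2 = r_squared(actual, predicted)
regional_rmse = rmse_by_group(actual, predicted, regions)
```

A large gap between the per-region errors would suggest the model serves some regions worse than others; a permutation test on that gap is one way to judge whether it exceeds chance.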

📊 Data Science Projects:

Tools: Python, pandas, NumPy, Matplotlib

  • Conducted in-depth data analysis for a comprehensive study on the impact of COVID-19 on air quality in San Diego County, utilizing Python, pandas, and NumPy for data cleaning and merging, and Matplotlib and Seaborn for data visualization.
  • Applied statistical expertise to investigate correlations between pandemic trends and NO2 levels using linear regression models in Python, interpreting results to establish a nuanced understanding of the pandemic's environmental effects.
  • Collaborated within a multidisciplinary team, contributing to research, data interpretation, and ethical analysis, leveraging communication and technical skills to ensure the project's success and validity.
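
For readers unfamiliar with the regression step, the following sketch fits a one-variable least-squares line and reports the Pearson correlation. The case counts and NO2 readings are made-up numbers that only demonstrate the mechanics; they say nothing about the study's actual findings.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, pearson_r


# Hypothetical weekly case counts and NO2 concentrations (arbitrary units).
weekly_cases = [0, 10, 20, 30, 40]
no2_levels = [20, 18, 17, 14, 11]

slope, intercept, pearson_r = fit_line(weekly_cases, no2_levels)
```

The slope gives the estimated change in NO2 per additional case, while `pearson_r` summarizes the strength and direction of the linear relationship.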

Tools: Python, pandas, NumPy

  • Analyzed a comprehensive dataset of major power outages in the U.S. from January 2000 to July 2016 to identify the correlation between weather conditions and power outage durations, focusing specifically on the impact of colder weather on the length of outages.
  • Conducted extensive data cleaning and exploratory data analysis to refine the dataset for relevant variables, implemented hypothesis testing to examine the relationship between power outage durations and cold weather, and communicated findings through clear insights.
  • Used statistical methods and permutation tests to test the hypothesis that colder weather leads to longer power outages, with the aim of informing strategies for improving infrastructure resilience and emergency response during adverse weather conditions.
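
A permutation test of this kind can be sketched as follows. The test statistic (difference in group means), the one-sided direction, and the sample durations are assumptions chosen for illustration; the project's exact setup is not described in this README.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """One-sided test of whether mean(group_a) exceeds mean(group_b) beyond chance.

    Returns the observed difference in means and the estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_permutations


# Hypothetical outage durations in hours.
cold_weather = [50, 60, 55, 70]
other_weather = [30, 35, 40, 25]

observed_diff, p_value = permutation_test(cold_weather, other_weather)
```

A small `p_value` means that few random relabelings produce a gap as large as the observed one, which counts as evidence against the null hypothesis of no difference.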

Tools: JavaScript, Svelte, D3

  • Developed an interactive globe, using JavaScript and D3, that lets the user hover over a country to reveal its COVID-19 cases, GDP, and population.
  • Automated the visualization to spin at a constant rate and programmed it to give the user the ability to zoom in and out of countries as well as pan from one side to the other.

Tools: Python, pandas, NumPy, Tableau

  • Analyzed large datasets on eBike accidents in the County of San Diego to find trends and perform statistical analysis, using Python, pandas, and NumPy, and visualized the findings in Tableau.
  • Conducted literature reviews on eBike safety and synthesized key research findings to assess whether eBikes have a higher accident-to-user ratio than conventional bicycles.
  • Revised and improved newsletter surveys to gather feedback from nonprofit organizations and medical institutions on the health benefits and associated costs of eBike use.

Tools: Java

  • Developed a music management system to store, organize, and recommend songs, using Java data structures for efficient categorization and CSV file integration.
  • Built a recommendation engine to analyze user data and generate personalized song suggestions by tracking listening habits and song attributes.
  • Applied object-oriented design, creating PlayableItem objects for sorting and comparison based on duration and popularity.
  • Collaborated on file I/O, data parsing, and recommendation algorithms to ensure seamless integration of music database, playlist management, and user preferences.
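
The project itself was written in Java. The Python sketch below only mirrors the idea of a `PlayableItem` that can be ordered by duration or popularity; the field names and sample songs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlayableItem:
    """A song with the two attributes used for sorting in this sketch."""
    title: str
    duration_sec: int
    popularity: int


# Hypothetical catalog entries.
songs = [
    PlayableItem("Song A", 210, 80),
    PlayableItem("Song B", 185, 60),
    PlayableItem("Song C", 240, 95),
]

# Shortest first, then most popular first.
by_duration = sorted(songs, key=lambda s: s.duration_sec)
by_popularity = sorted(songs, key=lambda s: s.popularity, reverse=True)
```

In Java, the same effect is typically achieved by implementing `Comparable` or by passing a `Comparator` to the sort call.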

🤳 Connect with me:

Hector Gallo | LinkedIn

Popular repositories

  1. COGS108_Repo (Public)

  2. Overview (Public)

    Forked from COGS108/Overview

    Overview and map of the organization, for the UCSD course COGS108: Data Science in Practice.

  3. MyFirstPullRequest (Public)

    Forked from COGS108/MyFirstPullRequest

    To be used for the assignments in Cogs 108

  4. Group018-Sp22 (Public)

    Forked from COGS118A/Group018-Sp22

    Jupyter Notebook

  5. hgallocodes (Public)

  6. Cold-Weather-and-Power-Outages (Public)

    This is a project for DSC80 at UCSD

    Jupyter Notebook