sgjholt/README.md

A Quick Who Am I?

I’m an experienced data scientist with almost a decade of employment successfully applying scientific methods to different disciplines in both industry and academia. I hold three advanced degrees: two in Geophysics (Ph.D. and BSc.) and one in Risk and Uncertainty Management (MRes.), awarded by the University of Liverpool (Liverpool, UK). I typically use Python as my language of choice, given its wide support and integration within the machine learning community, but I also frequently use shell (Bash/Zsh/PowerShell), SQL, HTML/CSS, and Rust, and have some experience with JavaScript, Ruby, C++, C#, Perl, Fortran, and MATLAB.

More About Me

My passion for data science stems from my curiosity about the natural phenomena that shape our planet and the data that can help us comprehend them better. That’s why I obtained a Ph.D. in geophysics, where I investigated seismic waves and their interactions with the Earth’s structure. I utilized Python and MATLAB to process and examine large datasets of seismic recordings (digital time series) from around the world. As a researcher, I authored three peer-reviewed publications in reputable journals.

After completing my degree, I joined a leading seismological observatory, the University of Utah Seismograph Stations, as a researcher, where I continued to work on seismic data analysis and earthquake magnitude modeling. I collaborated with scientists from the USGS, AFRL, LLNL, and other institutions to enhance our understanding of earthquake hazards and risks.

I then decided to switch gears and explore other domains where data science can make an impact. I joined Bayer Crop Science as a data scientist, where I worked on models that forecast corn seed yield from historic growth and other agronomic factors. I used Python, SQL, and Domino to manipulate and explore large datasets of crop measurements, weather data, soil data, and satellite imagery. I constructed the predictive models using scikit-learn.

My most recent position was as a Data Scientist at Coyote Logistics, a 3PL (third-party logistics) service provider, where I worked on sophisticated predictive models that estimated potential distributions of market cost for a given load of freight. I used Python, SQL (and NoSQL), and Databricks to ingest and transform data from various sources such as carriers, shippers, brokers, and third-party APIs. I also used LightGBM, scikit-learn, and PyTorch to build and fit models that captured the uncertainty and variability of the freight market.

Data Science Projects

Here are some of the projects I've worked on or I'm currently working on:

  • Freight Market Cost Distribution: I was the team lead of the spot pricing team, which developed a cutting-edge machine learning system that used a custom Boosted Gradient Random Forest algorithm and LightGBM / H2O.ai Gradient Boosting models, integrated through a live API model service. The model API produced a predicted distribution that showed the most probable range of costs to transport different types of freight equipment and cargo to and from distribution centers across the USA, Canada, and Mexico. This model helped us generate millions of dollars of extra revenue, as it enabled faster and more intelligent buying and brokering negotiations at scale.

  • Corn Yield Prediction: Created new random forest models, and improved existing ones, to predict corn seed yield across the world, leveraging multi-year historic yields and agronomic features (e.g., weather conditions, soil conditions, and field clusters defined using remote sensing data).

  • Earthquake Magnitude Estimation: Created physics-informed empirical models of amplitude decay over distance via non-parametric inversions, pre-conditioned using events with known earthquake magnitudes. This method was applied to both Local (Richter) and Moment Magnitude estimation, with the latter focusing on small magnitudes, which are difficult to estimate via conventional physical modeling. This work resulted in two peer-reviewed publications, in the Bulletin of the Seismological Society of America and Seismological Research Letters.

Personal Projects

Skills

Here are some of the skills I've learned or improved during my data science journey:

  • Programming Languages: Python (advanced), SQL (intermediate), MATLAB (intermediate)
  • Data Analysis Tools: Pandas (advanced), NumPy (advanced), SciPy (advanced), PySpark (intermediate), Polars (intermediate)
  • Data Visualization Tools: Matplotlib (advanced), Seaborn (advanced), Plotly (advanced), Streamlit (intermediate)
  • Machine Learning Tools: scikit-learn (advanced), statsmodels (intermediate), PyTorch (basic)
  • Cloud Computing Platforms: Azure (advanced), AWS (beginner)
  • Data Engineering Tools: Databricks, Airflow (beginner), Docker (beginner), Kubernetes (beginner)
  • Version Control Systems: Git, GitHub, Azure DevOps

Contact

LinkedIn

Pinned

  1. specmod_dash

    A Dash application to view and review SpecMod spectra.

    Python

  2. eqstochsim

    A library to explore the stochastic method following Boore (2003).

    Jupyter Notebook

  3. SpecMod

    Forked from uofuseismo/SpecMod

    SpecMod - A Toolbox for Processing and Modeling Seismic Spectra

    Jupyter Notebook

  4. matplotconf

    A repository of custom matplotlib settings.

    Python

  5. mlmags

    A small data science project that explores the use of some basic machine learning tools to predict moment magnitude for small- to moderate-sized earthquakes in Utah, USA.

    Jupyter Notebook

  6. ynp_local_magnitude_recalibration

    A companion repository of code that was used to recalibrate Local Magnitude (ML), then compare and contrast past and present ML assignments for earthquakes recorded in Yellowstone National Park, USA.

    Jupyter Notebook