Adds first scenario for feature engineering examples #68

HamiltonRepoMigrationBot · 2023-02-26T02:08:02Z

Issue by skrawcz
Tuesday Feb 14, 2023 at 20:22 GMT
Originally opened as stitchfix/hamilton#311

This example shows how you can use the same Hamilton feature definitions in both an offline (training) setting and an online setting.

Assumptions:

The API request can provide the same raw data that training provides.
If you have aggregation features, you need to store the values computed for them during training and provide those stored values to the online side.
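
To make the two assumptions concrete, here is a minimal sketch in plain Python. It does not use the Hamilton driver API; it only mirrors the Hamilton convention that a function's parameter names refer to inputs or to other feature functions. The feature names (`age_mean`, `age_std_dev`, `age_zero_mean_unit_var`), the sample data, and the `stored` dict standing in for whatever storage you use are all hypothetical and are not taken from the example in this PR.

```python
from statistics import mean, pstdev
from typing import Dict, List


# Hypothetical feature definitions, shared by the offline and online paths.
def age_mean(age: List[float]) -> float:
    """Aggregation feature: only meaningful over the full training set."""
    return mean(age)


def age_std_dev(age: List[float]) -> float:
    """Aggregation feature: population standard deviation of the training set."""
    return pstdev(age)


def age_zero_mean_unit_var(
    age: List[float], age_mean: float, age_std_dev: float
) -> List[float]:
    """Row-level feature: works the same on a batch or on a single request."""
    return [(a - age_mean) / age_std_dev for a in age]


# Offline (training): compute the aggregates from raw data and store them.
training_age = [20.0, 30.0, 40.0]
stored: Dict[str, float] = {
    "age_mean": age_mean(training_age),
    "age_std_dev": age_std_dev(training_age),
}
offline_features = age_zero_mean_unit_var(training_age, **stored)

# Online: the request supplies the same raw column, and the stored
# aggregates are passed in instead of being recomputed.
request_age = [30.0]
online_features = age_zero_mean_unit_var(request_age, **stored)
```

The row-level feature function is identical in both paths; only the source of the aggregate values changes, which is the reason the second assumption requires storing them at training time.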

Changes

adds feature_engineering folder to examples
adds scenario 1

How I tested this

ran this code locally

Notes

Checklist

PR has an informative and human-readable title (this will be pulled into the release notes)
Changes are limited to a single goal (no scope creep)
Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
Any change in functionality is tested
New functions are documented (with a description, list of inputs, and expected output)
Placeholder code is flagged / future TODOs are captured in comments
Project documentation has been updated if adding/changing functionality.

skrawcz included the following code: https://github.com/stitchfix/hamilton/pull/311/commits

HamiltonRepoMigrationBot · 2023-02-26T02:08:03Z

Comment by skrawcz
Monday Feb 20, 2023 at 23:51 GMT

Good start -- I don't think this is going to be clear to most people who haven't really dug into this. A few thoughts:

We can clarify the wording/make it crisper to specify why this is a problem, how it's normally done, and why Hamilton alleviates this.

We can give more context about what we're doing here/why it's in an online context.

We can root on tooling that might be familiar to them. While loading fake models/whatnot makes sense, I think it's going to confuse the users. So either load from a model/feature store they're used to, or (more likely) abstract it away and make it very clear that it could be implemented in many different ways.

This stuff is natural to us as we've been building online/batch inference/training tooling for years, but I think this will be extremely complex to most people out there, and fall flat. Hamilton is simple enough and makes this easy enough that this is a good chance to capture market share, but to do so we need to really hammer home a pattern and a motivation.

That's the point of the scenarios: there is no one-size-fits-all. That is, show the simplest possible thing, then one where there is a feature store, etc.

Will add more to motivation -- and draw some pictures.

elijahbenizzy · 2024-01-18T16:48:08Z

Closing this, we have multiple blog posts: https://blog.dagworks.io/p/feature-engineering-with-hamilton

elijahbenizzy added the migrated-from-old-repo label Feb 26, 2023

elijahbenizzy closed this as completed Jan 18, 2024
