Adds first scenario for feature engineering examples #68

Closed
7 tasks done
HamiltonRepoMigrationBot opened this issue Feb 26, 2023 · 2 comments
Labels
migrated-from-old-repo Migrated from old repository

Comments

@HamiltonRepoMigrationBot
Collaborator

Issue by skrawcz
Tuesday Feb 14, 2023 at 20:22 GMT
Originally opened as stitchfix/hamilton#311


This example shows how to define features once in Hamilton and use the same definitions in both an offline (training) setting and an online (serving) setting.

Assumptions:

  • The API request can provide the same raw data that training provides.
  • If you have aggregation features, you need to store the values computed during training and provide them to the online side.
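
The two assumptions above can be illustrated with a minimal sketch. It is plain Python written in Hamilton's style (function name = feature name, parameter names = inputs) and does not call Hamilton itself; in a real setup Hamilton's driver would do the wiring. The feature names (`age_mean`, `age_zero_mean`), the helper functions, and the dict used as a "store" are hypothetical and are not taken from the example in this PR.

```python
from statistics import mean
from typing import Dict, List, Tuple


# --- Shared feature definitions (hypothetical names) ---------------------
def age_mean(age: List[float]) -> float:
    """Aggregation feature: only meaningful over the training data."""
    return mean(age)


def age_zero_mean(age: List[float], age_mean: float) -> List[float]:
    """Row-level feature that depends on the aggregate above."""
    return [a - age_mean for a in age]


# --- Offline: compute everything from raw training data ------------------
def offline_features(raw: Dict[str, List[float]]) -> Tuple[dict, dict]:
    """Returns the features plus the aggregates that must be stored."""
    m = age_mean(raw["age"])
    features = {"age_zero_mean": age_zero_mean(raw["age"], m)}
    stored = {"age_mean": m}  # assumption: persisted somewhere after training
    return features, stored


# --- Online: same definition, aggregate supplied from storage ------------
def online_features(request: Dict[str, float], stored: Dict[str, float]) -> dict:
    """Reuses age_zero_mean; age_mean is read from storage, not recomputed."""
    return {"age_zero_mean": age_zero_mean([request["age"]], stored["age_mean"])}


if __name__ == "__main__":
    train_features, stored = offline_features({"age": [20.0, 30.0, 40.0]})
    print(train_features, stored)
    print(online_features({"age": 35.0}, stored))
```

The point of the sketch is that `age_zero_mean` is written once and called in both paths; only the source of `age_mean` changes. How the stored aggregate is persisted and loaded is left open on purpose, since that could be implemented in many ways.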

Changes

  • adds feature_engineering folder to examples
  • adds scenario 1

How I tested this

  • ran this code locally

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

skrawcz included the following code: https://github.com/stitchfix/hamilton/pull/311/commits

@HamiltonRepoMigrationBot
Collaborator Author

Comment by skrawcz
Monday Feb 20, 2023 at 23:51 GMT


Good start -- I don't think this is going to be clear to most people who haven't really dug into this. A few thoughts:

  1. We can clarify the wording/make it crisper to specify why this is a problem, how it's normally done, and why Hamilton alleviates it.
  2. We can give more context about what we're doing here and why it's in an online context.
  3. We can root it in tooling that might be familiar to them. While loading fake models/whatnot makes sense, I think it's going to confuse the users. So either load from a model/feature store they're used to, or (more likely) abstract it away and make it very clear that it could be implemented in many different ways.

This stuff is natural to us as we've been building online/batch inference/training tooling for years, but I think this will be extremely complex to most people out there, and fall flat. Hamilton is simple enough and makes this easy enough that this is a good chance to capture market share, but to do so we need to really hammer home a pattern and a motivation.

That's the point of the scenarios: there is no one-size-fits-all approach. That is, show the simplest possible thing, then one where there is a feature store, etc.

Will add more to motivation -- and draw some pictures.

@elijahbenizzy elijahbenizzy added the migrated-from-old-repo Migrated from old repository label Feb 26, 2023
@elijahbenizzy
Collaborator

Closing this, as we now have multiple blog posts: https://blog.dagworks.io/p/feature-engineering-with-hamilton

2 participants