@YassinNouh21
Contributor

What this PR does / why we need it

Adds CLI commands to import dbt models as Feast FeatureViews, enabling automatic generation of Feast objects from dbt manifest.json files.

  • feast dbt list - Discover dbt models available for import
  • feast dbt import - Create Feast FeatureViews from dbt models
  • --output option to generate Python files instead of applying to registry
  • Supports BigQuery, Snowflake, and File data sources
  • Maps 38 dbt data types to Feast types (including ARRAY and NUMBER with precision)
  • Preserves dbt metadata (tags, descriptions) in generated Feast objects
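
The type-mapping idea can be sketched roughly as follows. This is a minimal illustration, not the PR's `mapper.py`: the table holds only a hypothetical subset of the 38 types, Feast types are represented by their names as strings, and both the "zero scale means integer" rule and the fallback to `String` are assumptions made for the example.

```python
import re

# Hypothetical subset of a dbt-to-Feast type table (the PR's table has 38 entries).
_SIMPLE_TYPES = {
    "STRING": "String",
    "VARCHAR": "String",
    "INT64": "Int64",
    "INTEGER": "Int64",
    "BIGINT": "Int64",
    "FLOAT64": "Float64",
    "DOUBLE": "Float64",
    "BOOL": "Bool",
    "BOOLEAN": "Bool",
    "BYTES": "Bytes",
    "TIMESTAMP": "UnixTimestamp",
}

# NUMBER(precision) or NUMBER(precision, scale), plus common aliases.
_NUMBER_RE = re.compile(
    r"^(?:NUMBER|NUMERIC|DECIMAL)\s*\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\)$"
)
# ARRAY<element_type>
_ARRAY_RE = re.compile(r"^ARRAY\s*<\s*(.+?)\s*>$")


def map_dbt_type(dbt_type: str) -> str:
    """Return the name of a Feast type for a dbt column type string."""
    normalized = dbt_type.strip().upper()

    array_match = _ARRAY_RE.match(normalized)
    if array_match:
        # Map the element type recursively and wrap it in Array(...).
        return f"Array({map_dbt_type(array_match.group(1))})"

    number_match = _NUMBER_RE.match(normalized)
    if number_match:
        scale = int(number_match.group(2) or 0)
        # Assumed rule: zero scale is treated as an integer, anything else as a float.
        return "Int64" if scale == 0 else "Float64"

    # Assumed fallback: unknown types are treated as strings.
    return _SIMPLE_TYPES.get(normalized, "String")
```

With these assumptions, `map_dbt_type("NUMBER(10, 2)")` yields `"Float64"` and `map_dbt_type("ARRAY<STRING>")` yields `"Array(String)"`.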

Which issue(s) this PR fixes

Closes #3335

Does this PR introduce a user-facing change

Yes. Users can now import dbt models as Feast FeatureViews using:

```bash
# List models with 'feast' tag
feast dbt list -m target/manifest.json --tag feast

# Import models to registry
feast dbt import -m target/manifest.json -e driver_id --tag feast

# Generate Python file instead
feast dbt import -m target/manifest.json -e driver_id --output features.py
```
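
Conceptually, the `--tag` filter selects model nodes from the manifest by their `tags` list. The sketch below shows that idea on a plain dictionary with the same `nodes` / `resource_type` / `tags` layout that dbt writes to `manifest.json`. The function name `list_models_with_tag` and the sample manifest are made up for illustration, and the PR itself parses manifests through dbt-artifacts-parser rather than raw dictionaries.

```python
from typing import Any, Dict, List


def list_models_with_tag(manifest: Dict[str, Any], tag: str) -> List[str]:
    """Return the names of dbt models in a parsed manifest that carry `tag`."""
    names = []
    for node in manifest.get("nodes", {}).values():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots and other node kinds
        if tag in node.get("tags", []):
            names.append(node["name"])
    return sorted(names)


# Hand-written stand-in for the content of target/manifest.json.
example_manifest = {
    "nodes": {
        "model.demo.driver_features": {
            "resource_type": "model",
            "name": "driver_features",
            "tags": ["feast"],
        },
        "model.demo.staging_trips": {
            "resource_type": "model",
            "name": "staging_trips",
            "tags": [],
        },
        "test.demo.not_null_driver_id": {
            "resource_type": "test",
            "name": "not_null_driver_id",
            "tags": ["feast"],
        },
    }
}

print(list_models_with_tag(example_manifest, "feast"))  # ['driver_features']
```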

Test plan

  • Unit tests for parser, mapper, and codegen modules
  • Tested with real dbt projects at 3 complexity levels
  • Verified generated Python files are syntactically correct and importable

Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>

YassinNouh21 requested a review from a team as a code owner on January 10, 2026 12:34
Contributor

Copilot AI left a comment

Pull request overview

This PR adds dbt integration to Feast, enabling users to automatically import dbt models as Feast FeatureViews. The integration includes CLI commands for discovering and importing dbt models, with support for BigQuery, Snowflake, and File data sources.

Changes:

  • Adds feast dbt list and feast dbt import CLI commands for dbt model discovery and import
  • Implements comprehensive dbt type mapping (38 data types including ARRAY and NUMBER with precision)
  • Provides code generation capability to output Python files instead of applying directly to registry

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| setup.py | Adds dbt-artifacts-parser dependency for dbt integration |
| sdk/python/feast/dbt/parser.py | Implements dbt manifest.json parsing with typed support for dbt versions 0.19-1.11+ |
| sdk/python/feast/dbt/mapper.py | Maps dbt data types to Feast types and creates Feast objects from dbt models |
| sdk/python/feast/dbt/codegen.py | Generates Python code for Feast objects using Jinja2 templates |
| sdk/python/feast/cli/dbt_import.py | Implements CLI commands for listing and importing dbt models |
| sdk/python/feast/cli/cli.py | Integrates dbt command group into main CLI |
| sdk/python/tests/unit/dbt/test_parser.py | Unit tests for dbt manifest parser |
| sdk/python/tests/unit/dbt/test_mapper.py | Unit tests for dbt-to-Feast type mapping and object creation |

YassinNouh21 force-pushed the feat/dbt-feast-integration-3335-clean branch 3 times, most recently from e581fec to 3fb62f3, on January 10, 2026 14:27
…-dev#3335)

This PR implements the dbt-Feast integration feature requested in feast-dev#3335,
enabling users to import dbt models as Feast FeatureViews.

## New CLI Commands

- `feast dbt list` - List dbt models available for import
- `feast dbt import` - Import dbt models as Feast objects

## Features

- Parse dbt manifest.json files to extract model metadata
- Map dbt types to Feast types (38 types supported)
- Generate Entity, DataSource, and FeatureView objects
- Support for BigQuery, Snowflake, and File data sources
- Tag-based filtering (--tag) to select specific models
- Code generation (--output) to create Python files
- Dry-run mode to preview changes before applying

## Usage Examples

```bash
# List models with 'feast' tag
feast dbt list -m target/manifest.json --tag feast

# Import models to registry
feast dbt import -m target/manifest.json -e driver_id --tag feast

# Generate Python file instead
feast dbt import -m target/manifest.json -e driver_id --output features.py
```

Closes feast-dev#3335

Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
- Add dbt-artifacts-parser as optional dependency (feast[dbt])
- Update parser to use typed parsing with fallback to raw dict
- Provides better support for manifest versions v1-v12

Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
When parsing minimal/incomplete manifests (e.g., in unit tests),
dbt-artifacts-parser may fail validation. This change adds a graceful
fallback to use raw dict parsing when typed parsing fails.

Also updated test fixture with dbt_schema_version field.

Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
Since dbt-artifacts-parser is an optional dependency, unit tests
should be skipped in CI when it's not installed.

Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
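
One common way to skip tests when an optional package is absent is sketched below with the standard library only. It assumes the package's import name is `dbt_artifacts_parser`. The class and test names are hypothetical, and the PR's own tests may rely on a different mechanism (for example `pytest.importorskip`).

```python
import importlib.util
import unittest

# True only when the optional parser package can be found on the import path.
HAS_DBT_PARSER = importlib.util.find_spec("dbt_artifacts_parser") is not None


@unittest.skipUnless(HAS_DBT_PARSER, "dbt-artifacts-parser is not installed")
class DbtParserTests(unittest.TestCase):
    """Hypothetical test case that only runs when the optional dependency exists."""

    def test_parser_package_is_importable(self):
        import dbt_artifacts_parser  # noqa: F401

        self.assertIsNotNone(dbt_artifacts_parser)
```

The drawback of this pattern, which the later commits in this PR address, is that the tests silently stop running unless CI installs the optional dependency.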
Removed manual/fallback dict parsing code. The parser now exclusively
uses dbt-artifacts-parser typed objects. Updated test fixtures to
create complete manifests that dbt-artifacts-parser can parse.

Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
Install dbt-artifacts-parser in CI so dbt unit tests run instead
of being skipped.

Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
- mapper.py: Fix Array element type check to use set membership instead
  of incorrect isinstance() comparison
- codegen.py: Add safe getattr() with fallback for Array.base_type access

Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
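
The sketch below illustrates one way an `isinstance()` comparison can go wrong when types are modelled as enum members, and how set membership avoids it. The `PrimitiveType` enum and the set of allowed element types are stand-ins written for this example and are not copied from Feast or from `mapper.py`.

```python
from enum import Enum


class PrimitiveType(Enum):
    """Stand-in for a type system whose types are enum members, not classes."""

    STRING = 1
    INT64 = 2
    BYTES = 3


String = PrimitiveType.STRING
Int64 = PrimitiveType.INT64
Bytes = PrimitiveType.BYTES

# Hypothetical set of element types that may appear inside an Array.
ARRAY_ELEMENT_TYPES = {String, Int64}


def is_valid_array_element(element_type: PrimitiveType) -> bool:
    """Check the element type by set membership.

    isinstance(element_type, (String, Int64)) would raise TypeError here,
    because String and Int64 are enum members (values), not classes.
    """
    return element_type in ARRAY_ELEMENT_TYPES
```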
YassinNouh21 force-pushed the feat/dbt-feast-integration-3335-clean branch from 3fb62f3 to 01730a8 on January 10, 2026 14:28
Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
@YassinNouh21
Contributor Author

@franciscojavierarceo CI passed, and I reviewed your comments.

@franciscojavierarceo
Member

Can you add documentation for this? 🙏

YassinNouh21 force-pushed the feat/dbt-feast-integration-3335-clean branch from 2261481 to 9929527 on January 11, 2026 11:38
- Add dbt-artifacts-parser to pyproject.toml under feast[dbt] and feast[ci] extras
- Remove separate install step from unit_tests.yml workflow
- Update all requirements lock files

Addresses review feedback from @ntkathole.

Signed-off-by: YassinNouh21 <yassinnouh21@gmail.com>
Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
YassinNouh21 force-pushed the feat/dbt-feast-integration-3335-clean branch from 9929527 to fb40e93 on January 11, 2026 11:53
Add comprehensive documentation for the new dbt integration feature:
- Quick start guide with step-by-step instructions
- CLI reference for `feast dbt list` and `feast dbt import`
- Type mapping table for dbt to Feast types
- Data source configuration examples (BigQuery, Snowflake, File)
- Best practices for tagging, documentation, and CI/CD
- Troubleshooting section

Addresses review feedback from @franciscojavierarceo.

Signed-off-by: YassinNouh21 <yassinnouh21@gmail.com>
Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
@YassinNouh21
Contributor Author

@franciscojavierarceo @ntkathole

Thanks for the reviews! I've addressed all the feedback:

  • Added comprehensive documentation for the dbt integration (commit 53932ff)
  • Added dbt-artifacts-parser to both setup.py and pyproject.toml under feast[ci] (commit fb40e93)

All CI checks are passing and there are no merge conflicts. The PR should be ready for another review when you have a chance.

Let me know if there's anything else needed! 🙏

cli_check_repo(repo, fs_yaml_file)
store = create_feature_store(ctx)

store.apply(all_objects)

I feel the user should generate the code first and then run feast apply manually as a separate step.

@YassinNouh21 this is one change I do want to include here. Can we add an "apply" parameter to the CLI instead? By default, the dbt code generation shouldn't apply anything to the registry.

It should be:

feast dbt-import --apply=True

Or something like that.
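
A rough sketch of the opt-in behaviour requested here, written with `argparse` so that it runs without Feast installed (the Feast CLI itself is not built with `argparse`). The `-m`, `-e`, and `--output` options come from the PR description. The long option names, the `--apply` flag as a boolean switch, and the `plan_actions` helper are assumptions for illustration, not the merged interface.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="dbt-import-sketch")
    # Long option names are hypothetical; only -m and -e appear in the PR.
    parser.add_argument("-m", "--manifest-path", required=True)
    parser.add_argument("-e", "--entity-column", required=True)
    parser.add_argument("--output", default=None, help="write generated Python here")
    # Off by default: nothing touches the registry unless the user asks for it.
    parser.add_argument("--apply", action="store_true", help="apply objects to the registry")
    return parser


def plan_actions(args: argparse.Namespace) -> List[str]:
    """Describe what a run would do, given the parsed options."""
    actions = []
    if args.output:
        actions.append(f"write {args.output}")
    if args.apply:
        actions.append("apply to registry")
    if not actions:
        actions.append("preview only")
    return actions
```

For example, parsing `["-m", "target/manifest.json", "-e", "driver_id"]` leaves `args.apply` as `False`, so `plan_actions` reports a preview only.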

Float64: "Float64",
Bool: "Bool",
UnixTimestamp: "UnixTimestamp",
Bytes: "Bytes",

I guess ImageBytes and PdfBytes wouldn't be sensible here.

@HaoXuAI
Collaborator

HaoXuAI commented Jan 14, 2026

looks good to me

YassinNouh21 added a commit to YassinNouh21/feast that referenced this pull request Jan 14, 2026
Add prominent warning callout highlighting that the dbt integration is
an alpha feature with current limitations. This sets proper expectations
for users regarding:
- Supported data sources (BigQuery, Snowflake, File only)
- Single entity per model constraint
- Potential for breaking changes in future releases

Addresses feedback from PR feast-dev#5827 review comments.
YassinNouh21 added a commit to YassinNouh21/feast that referenced this pull request Jan 14, 2026
Ensure dbt-artifacts-parser is installed in CI environments by adding
it to the CI_REQUIRED list in setup.py. This matches the dependency
already present in pyproject.toml and ensures CI tests for dbt
integration have access to the required parser library.

Addresses feedback from PR feast-dev#5827 review comments.
YassinNouh21 force-pushed the feat/dbt-feast-integration-3335-clean branch from 2807a7a to b2901f4 on January 14, 2026 08:57
Add logging and defensive attribute access for Array.base_type in code
generation to prevent potential AttributeError. While Array.__init__
always sets base_type, defensive programming with warnings provides:
- Protection against edge cases or future Array implementation changes
- Clear visibility when fallback occurs via logger.warning
- Consistent error handling across both usage sites

Changes:
- Add logging module and logger instance
- Update _get_feast_type_name() to use getattr with warning
- Update import tracking logic to use getattr with warning
- Add concise comments with examples (e.g., Array(String) -> base_type = String)

Addresses code review feedback from PR feast-dev#5827.

Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
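
The defensive access described in this commit message might look roughly like the sketch below. The `Array` class is a stand-in, `array_type_name` is a hypothetical helper rather than the PR's `_get_feast_type_name()`, and the fallback to `"String"` is an assumption.

```python
import logging

logger = logging.getLogger(__name__)


class Array:
    """Stand-in for an array type wrapper, e.g. Array(String) -> base_type = String."""

    def __init__(self, base_type):
        self.base_type = base_type


def array_type_name(array_type, default: str = "String") -> str:
    """Render an array type name, falling back with a warning if base_type is missing."""
    base_type = getattr(array_type, "base_type", None)
    if base_type is None:
        # Make the fallback visible instead of failing with AttributeError.
        logger.warning(
            "Array type %r has no base_type; falling back to %s", array_type, default
        )
        base_type = default
    return f"Array({base_type})"
```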
@YassinNouh21
Contributor Author

You're right - ImageBytes and PdfBytes wouldn't make sense here. While these types exist in Feast, dbt manifests only expose generic BYTES type without semantic information about whether bytes represent images, PDFs, or other binary data.

Example:

```sql
-- dbt model
SELECT
    user_id,
    profile_photo,  -- BigQuery type: BYTES
    resume_pdf      -- BigQuery type: BYTES
FROM users
```

dbt manifest only shows "data_type": "BYTES" for both columns - no way to distinguish images from PDFs. The mapping only includes types that can actually appear in dbt artifacts.


Re: codegen.py:152 ImageBytes/PdfBytes comment

Add clarifying comment in type_map explaining why ImageBytes and
PdfBytes are not included in the dbt type mapping. While these types
exist in Feast, dbt manifests only expose generic BYTES type without
semantic information to distinguish between regular bytes, images, or
PDFs.

Example: A dbt model with image and PDF columns both appear as
'BYTES' in the manifest, making ImageBytes/PdfBytes types unmappable
from dbt artifacts.

Addresses feedback from PR feast-dev#5827 review (franciscojavierarceo).

Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
- Fix E402 linter error in feast/dbt/codegen.py by moving imports before logger initialization
- Update requirements files to include dbt-artifacts-parser in pydantic dependency comments

Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
YassinNouh21 force-pushed the feat/dbt-feast-integration-3335-clean branch from 05866d7 to 7a50c73 on January 16, 2026 14:23
Member

franciscojavierarceo left a comment

Okay, going to ship this. @YassinNouh21 it'd be great if we can get a proper integration test in a follow-up PR that sets up a lightweight local dbt project and then does the compilation, but I don't want to block on that here.

@franciscojavierarceo
Member

@copilot can you make a github issue for this?

@YassinNouh21
Contributor Author

Code Review - Issues Created

I've reviewed PR #5827 and created the following issues to track improvements and edge cases:

High Priority

  1. #5869 - Add integration tests for dbt import with local dbt project
  2. #5870 - Improve error handling for missing dbt-artifacts-parser (good first issue)
    • CLI doesn't catch ImportError from parser
  3. #5874 - FileSource generates invalid hardcoded paths
    • /data/{model}.parquet likely doesn't exist
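
For the missing-dependency item, one possible shape of the fix is to turn a bare `ImportError` into a message that names the extra to install. The helper and exception names below are hypothetical. The `feast[dbt]` extra is the one mentioned in this PR's commits.

```python
import importlib
from types import ModuleType


class MissingOptionalDependency(RuntimeError):
    """Raised when an optional package needed by a command is not installed."""


def import_optional(module_name: str, extra: str) -> ModuleType:
    """Import a module, or fail with a hint about which extra provides it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise MissingOptionalDependency(
            f"'{module_name}' is required for this command. "
            f"Install it with: pip install 'feast[{extra}]'"
        ) from exc
```

A CLI command could call `import_optional("dbt_artifacts_parser", "dbt")` up front and print the resulting message instead of a traceback.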

Validation & Safety

  1. #5871 - Add validation for timestamp field data type
    • Should verify timestamp field is actually a timestamp type
  2. #5876 - Validate entity column data type is appropriate
    • Should warn if entity column is FLOAT, BYTES, etc.
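
The two validation checks could be sketched as below. Everything here is illustrative: the sets of type names are assumptions (only FLOAT and BYTES are named in the issue summary), and `validate_columns` is a hypothetical helper that works on a plain mapping of column name to dbt type string.

```python
from typing import Dict, List

# Assumed type names; a real implementation would cover each warehouse's dialect.
TIMESTAMP_TYPES = {"TIMESTAMP", "TIMESTAMP_NTZ", "TIMESTAMP_LTZ", "TIMESTAMP_TZ", "DATETIME"}
RISKY_ENTITY_TYPES = {"FLOAT", "FLOAT64", "DOUBLE", "BYTES"}


def _base_type(dbt_type: str) -> str:
    """Strip precision arguments and normalise case, e.g. 'timestamp(6)' -> 'TIMESTAMP'."""
    return dbt_type.split("(")[0].strip().upper()


def validate_columns(
    columns: Dict[str, str], timestamp_field: str, entity_column: str
) -> List[str]:
    """Return human-readable problems; an empty list means no issue was found."""
    problems = []
    for role, name in (("timestamp field", timestamp_field), ("entity column", entity_column)):
        if name not in columns:
            problems.append(f"{role} '{name}' is not a column of the model")

    if timestamp_field in columns:
        ts_type = _base_type(columns[timestamp_field])
        if ts_type not in TIMESTAMP_TYPES:
            problems.append(f"timestamp field '{timestamp_field}' has type {ts_type}")

    if entity_column in columns:
        entity_type = _base_type(columns[entity_column])
        if entity_type in RISKY_ENTITY_TYPES:
            problems.append(f"entity column '{entity_column}' has type {entity_type}")

    return problems
```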

Feature Enhancements

  1. #5872 - Support multiple entities per FeatureView
    • Real-world use cases often need composite keys
  2. #5873 - Make TTL default configurable and increase default value
    • 1 day default is very aggressive, consider 7 days
  3. #5875 - Add support for data source connection configuration
    • Need way to specify BigQuery project, Snowflake account, etc.

Overall Assessment

The PR is well-structured with good test coverage for the core functionality. The main gaps are:

  • Integration tests with actual dbt projects
  • Validation of input data types
  • Configuration flexibility for production use cases

Great work on getting this feature merged! 🎉

franciscojavierarceo pushed a commit that referenced this pull request Jan 16, 2026
# [0.59.0](v0.58.0...v0.59.0) (2026-01-16)

### Bug Fixes

* Add get_table_query_string_with_alias() for PostgreSQL subquery aliasing ([#5811](#5811)) ([11122ce](11122ce))
* Add hybrid online store to ONLINE_STORE_CLASS_FOR_TYPE mapping ([#5810](#5810)) ([678589b](678589b))
* Add possibility to overwrite send_receive_timeout for clickhouse offline store ([#5792](#5792)) ([59dbb33](59dbb33))
* Denial by default to all resources when no permissions set  ([#5663](#5663)) ([1524f1c](1524f1c))
* Make operator include full OIDC secret in repo config ([#5676](#5676)) ([#5809](#5809)) ([a536bc2](a536bc2))
* Populate Postgres `registry.path` during `feast init` ([#5785](#5785)) ([f293ae8](f293ae8))
* **redis:** Preserve millisecond timestamp precision for Redis online store ([#5807](#5807)) ([9e3f213](9e3f213))
* Search API to return all matching tags in matched_tags field ([#5843](#5843)) ([de37f66](de37f66))
* Spark Materialization Engine Cannot Infer Schema ([#5806](#5806)) ([58d0325](58d0325)), closes [#5594](#5594) [#5594](#5594)
* Support arro3 table schema with newer deltalake packages ([#5799](#5799)) ([103c5e9](103c5e9))
* Timestamp formatting and lakehouse-type connector for trino_offline_store. ([#5846](#5846)) ([c2ea7e9](c2ea7e9))
* Update model_validator to use instance method signature (Pydantic v2.12 deprecation) ([#5825](#5825)) ([3c10b6e](3c10b6e))

### Features

* Add dbt integration for importing models as FeatureViews ([#5827](#5827)) ([b997361](b997361)), closes [#3335](#3335) [#3335](#3335) [#3335](#3335)
* Add GCS registry store in Go feature server ([#5818](#5818)) ([1dc2be5](1dc2be5))
* Add progress bar to CLI from feast apply ([#5867](#5867)) ([ab3562b](ab3562b))
* Add RBAC blog post to website ([#5861](#5861)) ([b1844a3](b1844a3))
* Add skip_feature_view_validation parameter to FeatureStore.apply() and plan() ([#5859](#5859)) ([5482a0e](5482a0e))
* Added batching to feature server /push to offline store ([#5683](#5683)) ([#5729](#5729)) ([ce35ce6](ce35ce6))
* Enable static artifacts for feature server that can be used in Feature Transformations ([#5787](#5787)) ([edefc3f](edefc3f))
* Improve lambda materialization engine ([#5829](#5829)) ([f6116f9](f6116f9))
* Offline Store historical features retrieval based on datetime range in Ray ([#5738](#5738)) ([e484c12](e484c12))
* Read, Save docs and chat fixes ([#5865](#5865)) ([2081b55](2081b55))
* Resolve pyarrow >21 installation with ibis-framework ([#5847](#5847)) ([8b9bb50](8b9bb50))
* Support staging for spark materialization ([#5671](#5671)) ([#5797](#5797)) ([5b787af](5b787af))