# Feature Stores

A feature store gives an organization a single catalog of all available features, together with their metadata. When data scientists start a new project, they can browse this catalog and find existing features instead of rebuilding them. But a feature store is not just a data layer; it is also a data transformation service that lets users turn raw data into features and store them ready for both offline (training) and online (serving) use, without duplicating the work. In addition, some feature stores support strong security, versioning, and data snapshots, enabling better data lineage, compliance, and manageability.

## Here are some major benefits of a feature store:

- Faster development with far fewer engineering resources

- Smooth migration from development to production

- Increased model accuracy (the same pipeline produces features for both training and serving, which avoids training/serving skew)

- Better collaboration and security across teams

- Ability to track lineage and address regulatory compliance

Feature stores are a factory and central repository for machine learning features. They handle the collection of raw data from various sources, the transformation pipeline, storage, cataloging, versioning, security, serving, and monitoring. They automate many of the processes described in this chapter, shortening time to production and reducing engineering effort. Feature stores form a shared catalog of production-ready features, enable collaboration and sharing between teams, and accelerate the innovation and delivery of new AI applications.

## The core components of a feature store are:

### Transformation layer
- Converts raw offline or online data into features and stores them in both an online (key/value) and offline (object) store.
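
As a rough illustration of the dual write described above, the following sketch turns raw purchase events into two features and stores each row twice. Everything here is a hypothetical stand-in rather than any product's API: `ToyStores`, `to_feature_row`, and `run_transformation` are invented names, a Python list plays the offline (object) store, and a dict plays the online (key/value) store.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ToyStores:
    """Hypothetical in-memory stand-ins for the two stores."""
    offline: list = field(default_factory=list)  # full history, used for training
    online: dict = field(default_factory=dict)   # latest row per key, used for serving


def to_feature_row(event, totals):
    """Turn one raw purchase event into a feature row (illustrative features)."""
    stats = totals[event["user_id"]]
    stats["count"] += 1
    stats["sum"] += event["amount"]
    return {
        "user_id": event["user_id"],
        "timestamp": event["timestamp"],
        "purchase_count": stats["count"],
        "avg_purchase": stats["sum"] / stats["count"],
    }


def run_transformation(raw_events, stores):
    """Apply the same transformation once and write the result to both stores."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for event in raw_events:
        row = to_feature_row(event, totals)
        stores.offline.append(row)           # append: keep every historical value
        stores.online[row["user_id"]] = row  # overwrite: keep only the latest value
    return stores


raw_events = [
    {"user_id": "u1", "timestamp": 1, "amount": 10.0},
    {"user_id": "u2", "timestamp": 2, "amount": 5.0},
    {"user_id": "u1", "timestamp": 3, "amount": 30.0},
]
stores = run_transformation(raw_events, ToyStores())
print(len(stores.offline))                  # 3 historical rows
print(stores.online["u1"]["avg_purchase"])  # 20.0, the latest value for u1
```

Because both stores are fed by the same function, the values a model sees at serving time are computed exactly as they were for training.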

### Storage layer
- Stores multiple versions of a feature in feature tables (feature sets) and manages the data lifecycle (create, append, delete, monitor, and secure the data). The storage layer keeps each feature in two forms: offline for training and analysis, and online for serving and monitoring.

### Feature retrieval
- Accepts requests for multiple features (feature vectors) and other properties (such as time ranges and event data), and produces an offline data snapshot for training or a real-time vector for serving.
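
The two retrieval paths can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular feature store: the `OFFLINE` and `ONLINE` dicts and both function names are hypothetical, and the offline path assumes a "latest value at or before the requested timestamp" rule, which is one common way to build a training snapshot but is not spelled out in the text above.

```python
from bisect import bisect_right

# Hypothetical offline history: per entity, (timestamp, features) sorted by timestamp.
OFFLINE = {
    "u1": [
        (1, {"purchase_count": 1}),
        (5, {"purchase_count": 2}),
        (9, {"purchase_count": 3}),
    ],
    "u2": [(3, {"purchase_count": 1})],
}
# Hypothetical online store: only the most recent features per entity.
ONLINE = {entity: rows[-1][1] for entity, rows in OFFLINE.items()}


def get_offline_features(requests, feature_names):
    """Build a training snapshot from (entity, timestamp) requests.

    Each row uses the latest feature values recorded at or before the
    requested timestamp, so later values never leak into the snapshot.
    """
    snapshot = []
    for entity, ts in requests:
        rows = OFFLINE.get(entity, [])
        timestamps = [t for t, _ in rows]
        idx = bisect_right(timestamps, ts) - 1   # last row with t <= ts
        values = rows[idx][1] if idx >= 0 else {}
        snapshot.append([values.get(name) for name in feature_names])
    return snapshot


def get_online_vector(entity, feature_names):
    """Return the current feature vector for one entity, for real-time serving."""
    values = ONLINE.get(entity, {})
    return [values.get(name) for name in feature_names]


print(get_offline_features([("u1", 6), ("u2", 3)], ["purchase_count"]))  # [[2], [1]]
print(get_online_vector("u1", ["purchase_count"]))                       # [3]
```

Features with no value at the requested time come back as `None`; how to fill such gaps is a modeling decision left to the caller.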

### Metadata management and cataloging
- Stores the feature definition, metadata, labels, and relations.
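
To make the catalog idea concrete, here is a small sketch of what a metadata record and a label search might look like. The `FeatureDefinition` fields and the `Catalog` class are hypothetical names chosen for illustration; a real feature store would persist this metadata and apply access control to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureDefinition:
    """Hypothetical metadata record for one feature."""
    name: str
    entity: str                # key the feature is attached to, e.g. "user_id"
    dtype: str
    description: str = ""
    labels: tuple = ()         # free-form tags used for discovery
    derived_from: tuple = ()   # relations: names of upstream features or sources


class Catalog:
    """Toy in-memory catalog of feature definitions."""

    def __init__(self):
        self._definitions = {}

    def register(self, definition):
        if definition.name in self._definitions:
            raise ValueError(f"feature already registered: {definition.name}")
        self._definitions[definition.name] = definition

    def get(self, name):
        return self._definitions[name]

    def search(self, label):
        """Return the names of all features carrying the given label, sorted."""
        return sorted(
            name for name, d in self._definitions.items() if label in d.labels
        )


catalog = Catalog()
catalog.register(
    FeatureDefinition("purchase_count", "user_id", "int", labels=("sales",))
)
catalog.register(
    FeatureDefinition(
        "avg_purchase",
        "user_id",
        "float",
        description="Mean purchase amount per user",
        labels=("sales", "finance"),
        derived_from=("purchase_count",),
    )
)
print(catalog.search("sales"))                   # ['avg_purchase', 'purchase_count']
print(catalog.get("avg_purchase").derived_from)  # ('purchase_count',)
```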

## Ingesting data into the feature store

### Direct ingestion
- Ingest the data directly from the client/notebook (interactively).

### Batch/scheduled ingestion
- Create a service or job that ingests data from a source (for example, a file or database) on a schedule.

### Real-time/streaming ingestion
- Create an online service that accepts real-time events (from a stream, HTTP, and so on) and pushes them into the feature store.
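
The three ingestion modes above can share a single write path. The sketch below is only an illustration: `FeatureSet`, `batch_job`, and `on_event` are hypothetical names, an in-memory CSV stands in for a file or database source, and a plain function call stands in for a stream or HTTP handler.

```python
import csv
import io


class FeatureSet:
    """Toy feature set with an offline history and an online latest-value view."""

    def __init__(self, key):
        self.key = key
        self.offline = []   # every ingested row
        self.online = {}    # latest row per key

    def ingest(self, rows):
        """Direct ingestion: write rows passed in by the caller (e.g. a notebook)."""
        count = 0
        for row in rows:
            record = dict(row)
            self.offline.append(record)
            self.online[record[self.key]] = record
            count += 1
        return count


def batch_job(feature_set, source):
    """Batch/scheduled ingestion: read every row from a CSV file-like source.

    A scheduler (not shown) would call this periodically. Note that
    csv.DictReader yields all values as strings.
    """
    return feature_set.ingest(csv.DictReader(source))


def on_event(feature_set, event):
    """Real-time ingestion: handler invoked once per incoming event."""
    return feature_set.ingest([event])


clicks = FeatureSet(key="user_id")
clicks.ingest([{"user_id": "u1", "clicks": 3}])           # direct
batch_job(clicks, io.StringIO("user_id,clicks\nu2,7\n"))  # batch
on_event(clicks, {"user_id": "u1", "clicks": 4})          # streaming
print(len(clicks.offline))          # 3
print(clicks.online["u1"]["clicks"])  # 4
```

Routing all three modes through one `ingest` method keeps the stored features consistent regardless of how the data arrived.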