## Batch attributes from existing tables using Signals

This notebook uses the Signals Python SDK to create a new attribute group whose attributes are materialized in batches from an existing warehouse table.

### Flow of data

```mermaid
flowchart LR
    wh[(Warehouse)]
    cron[/Materialization CRON job/]
    signals(Signals)

    wh --> cron
    cron --> signals
```

---

### Installation and setup

In [None]:
%pip install snowplow-signals

In [1]:
from snowplow_signals import Signals
import os

try:
    # On Google Colab, read the credentials from the notebook's user secrets.
    from google.colab import userdata
    sp_signals = Signals(
        api_url=userdata.get('SP_API_URL'),
        api_key=userdata.get('SP_API_KEY'),
        api_key_id=userdata.get('SP_API_KEY_ID'),
        org_id=userdata.get('SP_ORG_ID'),
    )
except ImportError:
    # Outside Colab, load the credentials from a local .env file or the environment.
    from dotenv import load_dotenv
    load_dotenv()
    sp_signals = Signals(
        api_url=os.environ['SP_API_URL'],
        api_key=os.environ['SP_API_KEY'],
        api_key_id=os.environ['SP_API_KEY_ID'],
        org_id=os.environ['SP_ORG_ID'],
    )
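
Outside Colab, the setup cell raises a bare `KeyError` if any of the four environment variables is unset. A minimal sketch of a pre-flight check is shown below; `missing_settings` is a hypothetical helper written for this notebook, not part of `snowplow_signals`, and it only inspects the variable names used in the cell above.

```python
import os
from typing import Mapping

# Setting names taken from the setup cell above.
REQUIRED_SETTINGS = ("SP_API_URL", "SP_API_KEY", "SP_API_KEY_ID", "SP_ORG_ID")


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """Return the required setting names that are absent or empty in `env`."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


# Hypothetical helper: report every missing name at once instead of failing on the first.
missing = missing_settings(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
```

Running this after `load_dotenv()` and before constructing `Signals` lists all missing names in one pass.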

### Define a new data source

The following cell defines a batch data source that points at the Snowflake table holding the attribute values and names the column used as the row timestamp.

In [None]:
from snowplow_signals import BatchSource

data_source = BatchSource(
    name="ecommerce_transaction_interactions_source",
    database="SNOWPLOW_DEV1",
    schema="SIGNALS",
    table="SNOWPLOW_ECOMMERCE_TRANSACTION_INTERACTIONS_FEATURES",
    timestamp_field="UPDATED_AT",
)

### Create an attribute group with the table fields

An attribute group declares which fields of the data source are exposed as attributes and which attribute key (here `domain_userid`) they relate to.

In [None]:
from snowplow_signals import ExternalBatchAttributeGroup, domain_userid, Field

attribute_group = ExternalBatchAttributeGroup(
    name="ecommerce_transaction_interactions_attributes",
    version=1,
    attribute_key=domain_userid,
    fields=[
        Field(
            name="TOTAL_TRANSACTIONS",
            type="int32",
        ),
        Field(
            name="TOTAL_REVENUE",
            type="int32",
        ),
        Field(
            name="AVG_TRANSACTION_REVENUE",
            type="int32",
        ),
    ],
    batch_source=data_source,
    owner="user@company.com",
)
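
Because every field above is declared as `int32`, it can be worth checking a sample row from the warehouse table against the declared types before publishing. The sketch below is a hypothetical, standalone check that does not use the SDK; the sample row is made up for illustration, and whether the real table stores fractional revenue values is not known from this notebook.

```python
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1

# Field names and types copied from the attribute group definition above.
DECLARED_FIELDS = {
    "TOTAL_TRANSACTIONS": "int32",
    "TOTAL_REVENUE": "int32",
    "AVG_TRANSACTION_REVENUE": "int32",
}


def check_row(row: dict) -> list[str]:
    """Return the problems found in one warehouse row (hypothetical helper)."""
    problems = []
    for name, declared in DECLARED_FIELDS.items():
        if name not in row:
            problems.append(f"{name}: column missing")
            continue
        value = row[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if declared == "int32" and (
            not isinstance(value, int)
            or isinstance(value, bool)
            or not INT32_MIN <= value <= INT32_MAX
        ):
            problems.append(f"{name}: {value!r} is not a valid int32")
    return problems


# Made-up sample row; the fractional average is flagged as not fitting int32.
sample = {"TOTAL_TRANSACTIONS": 3, "TOTAL_REVENUE": 120, "AVG_TRANSACTION_REVENUE": 40.5}
print(check_row(sample))
```

If a check like this reports fractional values, the declared field type is the thing to revisit; consult the SDK documentation for the supported type names.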

### Applying the data source and attribute group to Signals

The following cell publishes the attribute group, together with its data source, to the Signals API. Once published, the definition is available to a background CRON job that incrementally materializes the data from the warehouse table to the online attribute store.

In [None]:
applied = sp_signals.publish([attribute_group])
print(f"{len(applied)} objects applied")
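
To make "incrementally materializes" concrete, here is a conceptual sketch of a timestamp-based incremental load. It is an illustration only, not the actual Signals job: the `materialize` function, the in-memory `store` dictionary, and the `DOMAIN_USERID` column name are all assumptions. The idea is that each run copies only the rows whose `UPDATED_AT` is newer than the previous run's high-water mark.

```python
from datetime import datetime, timezone


def materialize(rows, online_store, watermark):
    """Upsert rows newer than `watermark` and return the new watermark.

    rows: iterable of dicts with "DOMAIN_USERID" and "UPDATED_AT" keys (assumed names).
    online_store: dict keyed by user id, holding the latest row per user.
    """
    new_watermark = watermark
    # Process oldest first so the newest row for a user is written last.
    for row in sorted(rows, key=lambda r: r["UPDATED_AT"]):
        if row["UPDATED_AT"] <= watermark:
            continue  # already materialized in an earlier run
        online_store[row["DOMAIN_USERID"]] = row
        new_watermark = max(new_watermark, row["UPDATED_AT"])
    return new_watermark


def ts(day):
    """Build a UTC timestamp in January 2024 (made-up sample data)."""
    return datetime(2024, 1, day, tzinfo=timezone.utc)


table = [
    {"DOMAIN_USERID": "u1", "UPDATED_AT": ts(1), "TOTAL_TRANSACTIONS": 1},
    {"DOMAIN_USERID": "u2", "UPDATED_AT": ts(2), "TOTAL_TRANSACTIONS": 4},
]
store = {}

# First run: everything is newer than the initial watermark.
mark = materialize(table, store, datetime.min.replace(tzinfo=timezone.utc))

# A later update for u1 arrives; the second run picks up only that row.
table.append({"DOMAIN_USERID": "u1", "UPDATED_AT": ts(3), "TOTAL_TRANSACTIONS": 2})
mark = materialize(table, store, mark)
print(store["u1"]["TOTAL_TRANSACTIONS"])  # prints 2
```

This is why the data source needs a `timestamp_field`: without a reliable update timestamp, a job of this kind cannot tell which rows changed since its last run.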

### Retrieving data

You can fetch the latest attribute values for a particular user from the attribute group as follows.

In [None]:
response = attribute_group.get_attributes(
    signals=sp_signals,
    identifier="9999999999999999999999999",
)

response