iterable-data-import is a library for ad-hoc bulk imports to Iterable.
Install using pip:
$ pip install iterable-data-import
To run an import, create an IterableDataImport instance and call
IterableDataImport.run with a function that accepts a SourceDataRecord and
returns an ImportAction. When IterableDataImport.run is called, the library
will:
- Stream records one at a time from your data source
- Parse each record into a SourceDataRecord
- Call your function to map the SourceDataRecord to an Iterable ImportAction
- Use batching to efficiently transfer the data to Iterable
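The steps above can be sketched with the standard library alone. This is only an illustration of the stream, parse, map, and batch flow, not the library's actual implementation: `iter_batches`, `run_import`, `send_batch`, and the batch size of 2 are hypothetical names and values chosen for the example.

```python
import csv
import io
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List


def iter_batches(items: Iterable, size: int) -> Iterator[list]:
    """Yield lists of at most `size` items without loading everything at once."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def run_import(
    source: Iterable[str],
    map_function: Callable[[Dict[str, str]], dict],
    send_batch: Callable[[List[dict]], None],
    batch_size: int = 2,
) -> None:
    # csv.DictReader reads rows lazily, so records are streamed one at a time.
    records = csv.DictReader(source)
    # The generator applies the map function only as each batch is filled.
    actions = (map_function(record) for record in records)
    for batch in iter_batches(actions, batch_size):
        send_batch(batch)  # hypothetical stand-in for one bulk API request


data = io.StringIO(
    "id,email\n1,a@example.com\n2,b@example.com\n3,c@example.com\n"
)
sent: List[List[dict]] = []
run_import(data, lambda record: {"email": record["email"]}, sent.append)
print([len(batch) for batch in sent])  # [2, 1]
```

Three source rows with a batch size of 2 result in two calls to `send_batch`, which is the point of batching: fewer requests than records.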
Given the following example data:
id,email,lifetime_value
1,test@iterable.com,79
Example usage:
import pathlib

from iterable_data_import import (
    IterableDataImport,
    FileFormat,
    UserProfile,
    ImportAction,
    UpdateUserProfile,
    SourceDataRecord,
)

if __name__ == "__main__":

    def map_function(record: SourceDataRecord) -> ImportAction:
        # Map one source row to a user profile update.
        email = record["email"]
        user_id = record["id"]
        data_fields = {"ltv": record["lifetime_value"]}
        user = UserProfile(email, user_id, data_fields)
        return UpdateUserProfile(user)

    # Read data.csv from the directory that contains this script.
    idi = IterableDataImport.create(
        api_key="some_api_key",
        source_file_path=pathlib.Path(__file__).parent / "data.csv",
        source_file_format=FileFormat.CSV,
    )
    idi.run(map_function)
When you run the import, each source record will be deserialized to a
SourceDataRecord, which is a type alias for Dict[str, object].
If you select FileFormat.NEWLINE_DELIMITED_JSON as your source file format,
the SourceDataRecord passed into your map function will be a Dict[str, object]
where the object type is one of the Python types that can be decoded from JSON.
iterable-data-import uses the json standard library module to decode your
source JSON objects. For documentation on how json translates JSON values to
Python types, see the standard library page for json - JSON encoder and
decoder.
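As a quick illustration of that decoding, the snippet below runs json.loads on one line of newline-delimited JSON and prints the Python type of each value. The fields other than id, email, and lifetime_value are made up for the example.

```python
import json

# One line of a newline-delimited JSON file: a single JSON object.
line = (
    '{"id": 1, "email": "test@iterable.com", "lifetime_value": 79.5, '
    '"tags": ["a"], "vip": true, "referrer": null}'
)

record = json.loads(line)
types = {key: type(value).__name__ for key, value in record.items()}
print(types)
# {'id': 'int', 'email': 'str', 'lifetime_value': 'float',
#  'tags': 'list', 'vip': 'bool', 'referrer': 'NoneType'}
```

Unlike CSV input, numbers, booleans, and nulls arrive already typed, so a map function usually does not need to cast them.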
If you select FileFormat.CSV as your source file format, the SourceDataRecord
passed into your map function will be a Dict[str, str]. You may need to cast
the values in the dict from str to their proper Python type.
iterable-data-import uses a DictReader from the csv standard library module to
decode your source CSV rows. For additional documentation on how the DictReader
parses each CSV row, see the standard library page for csv - CSV File Reading
and Writing.
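For instance, reading the example data from earlier with csv.DictReader directly shows that every value is a string, and that a numeric field such as lifetime_value needs an explicit cast. This is a minimal sketch using only the standard library; the data_fields name mirrors the usage example.

```python
import csv
import io

source = io.StringIO("id,email,lifetime_value\n1,test@iterable.com,79\n")
record = next(csv.DictReader(source))

# Every value decoded from CSV is a str, including numeric-looking ones.
print(type(record["lifetime_value"]).__name__)  # str

# Cast explicitly before sending the value as a number.
data_fields = {"ltv": int(record["lifetime_value"])}
print(data_fields)  # {'ltv': 79}
```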
ImportAction represents actions that the library can perform on an
IterableResource. Your map function should return a single ImportAction or a
list of ImportAction objects.
class UpdateUserProfile(ImportAction):
    def __init__(self, user: UserProfile) -> None:
        ...

class TrackCustomEvent(ImportAction):
    def __init__(self, event: CustomEvent) -> None:
        ...

class TrackPurchase(ImportAction):
    def __init__(self, purchase: Purchase) -> None:
        ...
IterableResource represents the entities to be imported or updated in Iterable.
At least one of email or user_id must be provided.
class UserProfile(IterableResource):
    def __init__(
        self,
        email: Optional[str] = None,
        user_id: Optional[str] = None,
        data_fields: Optional[Dict[str, Any]] = None,
        prefer_user_id: bool = False,
        merge_nested_objects: bool = False,
    ) -> None:
        ...
At least one of email or user_id must be provided. created_at is a Unix epoch
timestamp.
class CustomEvent(IterableResource):
    def __init__(
        self,
        event_name: str,
        email: Optional[str] = None,
        user_id: Optional[str] = None,
        data_fields: Optional[Dict[str, Any]] = None,
        event_id: Optional[str] = None,
        template_id: Optional[int] = None,
        campaign_id: Optional[int] = None,
        created_at: Optional[int] = None,
    ):
        ...
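Because created_at is an integer Unix epoch value, source timestamps usually need converting before they are passed in. The sketch below assumes the value is expressed in whole seconds, which this document does not state explicitly; to_unix_epoch is a hypothetical helper, not part of the library.

```python
from datetime import datetime, timezone


def to_unix_epoch(moment: datetime) -> int:
    """Return whole seconds since 1970-01-01T00:00:00Z for an aware datetime."""
    if moment.tzinfo is None:
        # Naive datetimes would be interpreted in the local time zone.
        raise ValueError("expected a timezone-aware datetime")
    return int(moment.timestamp())


created_at = to_unix_epoch(datetime(2021, 1, 1, tzinfo=timezone.utc))
print(created_at)  # 1609459200
```

Rejecting naive datetimes avoids results that silently depend on the time zone of the machine running the import.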
class CommerceItem(IterableResource):
    def __init__(
        self,
        item_id: str,
        name: str,
        price: float,
        quantity: int,
        sku: Optional[str] = None,
        description: Optional[str] = None,
        categories: Optional[List[str]] = None,
        image_url: Optional[str] = None,
        url: Optional[str] = None,
        data_fields: Optional[Dict[str, Any]] = None,
    ) -> None:
        ...
created_at is a Unix epoch timestamp.
class Purchase(IterableResource):
    def __init__(
        self,
        user: UserProfile,
        items: List[CommerceItem],
        total: float,
        created_at: Optional[int] = None,
        data_fields: Optional[Dict[str, Any]] = None,
        purchase_id: Optional[str] = None,
        campaign_id: Optional[int] = None,
        template_id: Optional[int] = None,
    ):
        ...