-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🎉 Source Iterable: add new streams #5915
Conversation
/test connector=connectors/source-iterable
|
/test connector=connectors/source-iterable
|
/test connector=connectors/source-iterable
|
airbyte-integrations/connectors/source-iterable/source_iterable/api.py
Outdated
Show resolved
Hide resolved
@@ -0,0 +1,8 @@ | |||
{ | |||
"properties": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there is a reason why we have so generic schema?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I can see there's no a strict schema for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this can be useful for users, because it lacks campaign id, also we definitely know at least some structure:
metric: <string>
value: <number>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the id of campaign is inside of data
, lets maybe rename it to metrics? or even better move it to the top level of object? i.e. campaign_metrics
will looks like:
{
<id>: <campaign_id>,
<metric_i>: <value_i>,
...
}
instead of
{
data: {...}
}
airbyte-integrations/connectors/source-iterable/source_iterable/api.py
Outdated
Show resolved
Hide resolved
{ | ||
"properties": { | ||
"data": { | ||
"type": ["null", "object"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same question regarding events
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, custom fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some questions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, only small changes.
try: | ||
result[key] = int(value) | ||
except ValueError: | ||
result[key] = float(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please explain what is it for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The response from the campaign_metrics endpoint is a csv file, the first line contains the name of the values, the second contains the values. There are 2 types of values: integer and floating point, so here we are casting from a string to the actual value type in a pretty naive way.
{ | ||
"api_key": "test-api-key", | ||
"start_date": "2020-12-12T00:00:00Z" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Newline
"requests==2.25.1", | ||
"airbyte-cdk~=0.1", | ||
"pendulum~=1.2", | ||
"requests", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why you didn't set version?
yield {"data": self._parse_csv_string_to_dict(content)} | ||
|
||
@staticmethod | ||
def _parse_csv_string_to_dict(csv_string: str) -> Dict[str, Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
important to note that we assume that there is only one line in csv input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mb move it outside of the class and make it return list of objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically there are two lines, the first one is for variable names, and the second – for values.
This is used only here at the moment, and we can extract it when necessary.
We return CampaignMetircs as a raw json, so converting to object and then back to dict looks unnecessary, what do you think?
|
||
return params | ||
|
||
def stream_slices(self, **kwargs) -> Iterable[Optional[Mapping[str, any]]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about slice not by one campaign, but by 20-30, since we already doing a lot of sub reads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comments
# Conflicts: # airbyte-integrations/connectors/source-iterable/Dockerfile # airbyte-integrations/connectors/source-iterable/source_iterable/api.py # docs/integrations/sources/iterable.md
/test connector=connectors/source-iterable
|
/test connector=connectors/source-iterable
|
/test connector=connectors/source-iterable
|
/test connector=connectors/source-iterable
|
/test connector=connectors/source-iterable
|
/test connector=connectors/source-iterable
|
/publish connector=connectors/source-iterable
|
* Add new streams * Upd requirements versions * Upd docs * Remove tests for the templates stream * Upd csv field parsing * Fix file permissions * Set dependency version * Refactor * Merge * Upd licence * Add bulk metrics retrieving * Actualize schema
What
New streams added: Campaign Metrics, Events
How
Recommended reading order
source.py
api.py
schemas/campaigns_metrics.json
schemas/events.json
Pre-merge Checklist
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog exampleAirbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name>
command is passing./publish
command described here