Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Source Harvest: Fix pendulum parsing error #35305

Merged
merged 6 commits into from
Feb 15, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 1 addition & 2 deletions airbyte-integrations/connectors/source-harvest/metadata.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ data:
connectorSubtype: api
connectorType: source
definitionId: fe2b4084-3386-4d3b-9ad6-308f61a6f1e6
dockerImageTag: 0.1.22
dockerImageTag: 0.1.23
dockerRepository: airbyte/source-harvest
documentationUrl: https://docs.airbyte.com/integrations/sources/harvest
githubIssueLabel: source-harvest
Expand All @@ -24,7 +24,6 @@ data:
registries:
cloud:
enabled: true
dockerImageTag: 0.1.21
oss:
enabled: true
releaseStage: generally_available
Expand Down
155 changes: 45 additions & 110 deletions airbyte-integrations/connectors/source-harvest/poetry.lock

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ include = "source_harvest"

[tool.poetry.dependencies]
python = "^3.9,<3.12"
airbyte-cdk = "==0.55.2"
airbyte-cdk = ">=0.62.1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this <3


[tool.poetry.scripts]
source-harvest = "source_harvest.run:run"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,7 @@ def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapp
class IncrementalHarvestStream(HarvestStream, ABC):
cursor_field = "updated_at"

def __init__(self, replication_start_date: pendulum.datetime = None, **kwargs):
def __init__(self, replication_start_date: Optional[pendulum.DateTime] = None, **kwargs) -> None:
super().__init__(**kwargs)
self._replication_start_date = replication_start_date

Expand All @@ -96,7 +96,12 @@ def request_params(
next_page_token: Mapping[str, Any] = None,
) -> MutableMapping[str, Any]:
params = super().request_params(stream_state=stream_state, stream_slice=stream_slice, next_page_token=next_page_token)
replication_start_date = stream_state.get(self.cursor_field) or self._replication_start_date

replication_start_date = None
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this code is not needed given that the CDK pins pendulum version to <3.0.0 but this will allow us to unpin the CDK version. I was able to test that by:

  • Testing with previous version
    • pip install pendulum==3.0.0 with previous code
      • integration tests fail
  • Testing with previous version
    • Checking out this code
      • integration tests pass

if stream_state.get(self.cursor_field):
replication_start_date = stream_state.get(self.cursor_field)
elif self._replication_start_date:
replication_start_date = self._replication_start_date.format("YYYY-MM-DDTHH:mm:ssZ")
params.update({"updated_since": replication_start_date})
return params

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
# Copyright (c) 2023 Airbyte, Inc., all rights reserved.

from datetime import datetime
from typing import Any, Dict


class ConfigBuilder:
def __init__(self) -> None:
self._config: Dict[str, Any] = {
"account_id": "an account id",
"replication_start_date": "2021-01-01T00:00:00Z",
"credentials": {
"api_token": "an api key"
}
}

def with_account_id(self, account_id: str) -> "ConfigBuilder":
self._config["account_id"] = account_id
return self

def with_replication_start_date(self, replication_start_date: datetime) -> "ConfigBuilder":
self._config["start_date"] = replication_start_date.isoformat()[:-13]+"Z"
return self

def with_api_token(self, api_token: str) -> "ConfigBuilder":
self._config["credentials"]["api_token"] = api_token
return self

def build(self) -> Dict[str, Any]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where can I read more about ConfigBuilder as a concept? For example, I gather that build() can hypothetically have dynamic pieces and that's why it's a function — but in this implementation it just returns the underlying dict.

Is ConfigBuilder part of the integration tests architecture that we want to see in other connectors? Is it a "standard" class? Should we have a parent class for it that mandates it's behavior then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where can I read more about ConfigBuilder as a concept?

Personally, I like builders because they are expressive (if a part of the config is not important to my test, I don't need to configure it) and add a layer of abstraction to avoid breaking change/mass changes

but in this implementation it just returns the underlying dict.

I think that to be precise, it should be a mapping but eh!

connection_specification: Mapping[str, Any]

Is ConfigBuilder part of the integration tests architecture

Not really as configs are often very specific for connectors

return self._config
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
# Copyright (c) 2023 Airbyte, Inc., all rights reserved.

from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional
from unittest import TestCase

from airbyte_cdk.sources.source import TState
from airbyte_cdk.test.catalog_builder import CatalogBuilder
from airbyte_cdk.test.entrypoint_wrapper import EntrypointOutput, read
from airbyte_cdk.test.mock_http import HttpMocker, HttpRequest, HttpResponse
from airbyte_cdk.test.mock_http.response_builder import (
FieldPath,
HttpResponseBuilder,
NestedPath,
RecordBuilder,
create_record_builder,
create_response_builder,
find_template,
)
from airbyte_protocol.models import ConfiguredAirbyteCatalog, FailureType, SyncMode
from integration.config import ConfigBuilder
from source_harvest import SourceHarvest

_A_REPLICATION_START_DATE = "2021-01-01T00:00:00+00:00"
_AN_ACCOUNT_ID = "1209384"
_AN_API_KEY = "harvestapikey"
_AN_INVOICE_ID = "an-invoice-id"
_STREAM_NAME = "invoice_messages"
_TEMPLATE_NAME = "invoice_messages"
_INVOICES_TEMPLATE_NAME = "invoices"
_RECORDS_PATH = FieldPath("invoice_messages")
_INVOICES_RECORDS_PATH = FieldPath("invoices")


def _a_message() -> RecordBuilder:
return create_record_builder(
find_template(_TEMPLATE_NAME, __file__),
_RECORDS_PATH,
)


def _invoice_messages_response() -> HttpResponseBuilder:
return create_response_builder(
find_template(_TEMPLATE_NAME, __file__),
_RECORDS_PATH,
)


def _an_invoice() -> RecordBuilder:
return create_record_builder(
find_template(_INVOICES_TEMPLATE_NAME, __file__),
_INVOICES_RECORDS_PATH,
record_id_path=FieldPath("id"),
)


def _invoices_response() -> HttpResponseBuilder:
return create_response_builder(
find_template(_INVOICES_TEMPLATE_NAME, __file__),
_INVOICES_RECORDS_PATH,
)


def _read(
config_builder: ConfigBuilder,
state: Optional[Dict[str, Any]] = None,
expecting_exception: bool = False
) -> EntrypointOutput:
return read(
SourceHarvest(),
config_builder.build(),
CatalogBuilder().with_stream(_STREAM_NAME, SyncMode.full_refresh).build(),
state,
expecting_exception
)


class InvoicesTest(TestCase):
@HttpMocker()
def test_given_replication_start_date_when_read_then_request_is_created_properly(self, http_mocker: HttpMocker):
http_mocker.get(
HttpRequest(
url="https://api.harvestapp.com/v2/invoices",
query_params={
"per_page": "50",
},
),
_invoices_response().with_record(_an_invoice().with_id(_AN_INVOICE_ID)).build()
)
http_mocker.get(
HttpRequest(
url=f"https://api.harvestapp.com/v2/invoices/{_AN_INVOICE_ID}/messages",
query_params={
"per_page": "50",
"updated_since": _A_REPLICATION_START_DATE,
},
),
_invoices_response().with_record(_a_message()).build()
)

_read(ConfigBuilder().with_account_id(_AN_ACCOUNT_ID).with_api_token(_AN_API_KEY).with_replication_start_date(datetime.fromisoformat(_A_REPLICATION_START_DATE)))

# endpoint is called
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
# Copyright (c) 2023 Airbyte, Inc., all rights reserved.

from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional
from unittest import TestCase

from airbyte_cdk.sources.source import TState
from airbyte_cdk.test.catalog_builder import CatalogBuilder
from airbyte_cdk.test.entrypoint_wrapper import EntrypointOutput, read
from airbyte_cdk.test.mock_http import HttpMocker, HttpRequest, HttpResponse
from airbyte_cdk.test.mock_http.response_builder import (
FieldPath,
HttpResponseBuilder,
NestedPath,
RecordBuilder,
create_record_builder,
create_response_builder,
find_template,
)
from airbyte_protocol.models import ConfiguredAirbyteCatalog, FailureType, SyncMode
from integration.config import ConfigBuilder
from source_harvest import SourceHarvest

_A_REPLICATION_START_DATE = "2021-01-01T00:00:00+00:00"
_AN_ACCOUNT_ID = "1209384"
_AN_API_KEY = "harvestapikey"
_STREAM_NAME = "invoices"
_TEMPLATE_NAME = "invoices"
_RECORDS_PATH = FieldPath("invoices")


def _an_invoice() -> RecordBuilder:
return create_record_builder(
find_template(_TEMPLATE_NAME, __file__),
_RECORDS_PATH,
)


def _invoices_response() -> HttpResponseBuilder:
return create_response_builder(
find_template(_TEMPLATE_NAME, __file__),
_RECORDS_PATH,
)


def _read(
config_builder: ConfigBuilder,
state: Optional[Dict[str, Any]] = None,
expecting_exception: bool = False
) -> EntrypointOutput:
return read(
SourceHarvest(),
config_builder.build(),
CatalogBuilder().with_stream(_STREAM_NAME, SyncMode.full_refresh).build(),
state,
expecting_exception
)


class InvoicesTest(TestCase):
@HttpMocker()
def test_given_replication_start_date_when_read_then_request_is_created_properly(self, http_mocker: HttpMocker):
http_mocker.get(
HttpRequest(
url="https://api.harvestapp.com/v2/invoices",
query_params={
"per_page": "50",
"updated_since": _A_REPLICATION_START_DATE,
},
headers={
"Authorization": f"Bearer {_AN_API_KEY}",
"Harvest-Account-ID": _AN_ACCOUNT_ID,
}
),
_invoices_response().build()
)

_read(ConfigBuilder().with_account_id(_AN_ACCOUNT_ID).with_api_token(_AN_API_KEY).with_replication_start_date(datetime.fromisoformat(_A_REPLICATION_START_DATE)))

# endpoint is called
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
{
"invoice_messages": [
{
"id": 27835209,
"sent_by": "Bob Powell",
"sent_by_email": "bobpowell@example.com",
"sent_from": "Bob Powell",
"sent_from_email": "bobpowell@example.com",
"include_link_to_client_invoice": false,
"send_me_a_copy": false,
"thank_you": false,
"reminder": false,
"send_reminder_on": null,
"created_at": "2017-08-23T22:15:06Z",
"updated_at": "2017-08-23T22:15:06Z",
"attach_pdf": true,
"event_type": null,
"recipients": [
{
"name": "Richard Roe",
"email": "richardroe@example.com"
}
],
"subject": "Past due invoice reminder: #1001 from API Examples",
"body": "Dear Customer,\r\n\r\nThis is a friendly reminder to let you know that Invoice 1001 is 144 days past due. If you have already sent the payment, please disregard this message. If not, we would appreciate your prompt attention to this matter.\r\n\r\nThank you for your business.\r\n\r\nCheers,\r\nAPI Examples"
},
{
"id": 27835207,
"sent_by": "Bob Powell",
"sent_by_email": "bobpowell@example.com",
"sent_from": "Bob Powell",
"sent_from_email": "bobpowell@example.com",
"include_link_to_client_invoice": false,
"send_me_a_copy": true,
"thank_you": false,
"reminder": false,
"send_reminder_on": null,
"created_at": "2017-08-23T22:14:49Z",
"updated_at": "2017-08-23T22:14:49Z",
"attach_pdf": true,
"event_type": null,
"recipients": [
{
"name": "Richard Roe",
"email": "richardroe@example.com"
},
{
"name": "Bob Powell",
"email": "bobpowell@example.com"
}
],
"subject": "Invoice #1001 from API Examples",
"body": "---------------------------------------------\r\nInvoice Summary\r\n---------------------------------------------\r\nInvoice ID: 1001\r\nIssue Date: 04/01/2017\r\nClient: 123 Industries\r\nP.O. Number: \r\nAmount: €288.90\r\nDue: 04/01/2017 (upon receipt)\r\n\r\nThe detailed invoice is attached as a PDF.\r\n\r\nThank you!\r\n---------------------------------------------"
}
],
"per_page": 2000,
"total_pages": 1,
"total_entries": 2,
"next_page": null,
"previous_page": null,
"page": 1,
"links": {
"first": "https://api.harvestapp.com/api/v2/invoices/13150403/messages?page=1&per_page=2000",
"next": null,
"previous": null,
"last": "https://api.harvestapp.com/v2/invoices/13150403/messages?page=1&per_page=2000"
}
}
Loading
Loading