Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Source Zenloop: add new stream properties #15843

Merged
merged 11 commits into from Sep 19, 2022

Conversation

jablonskijakub
Copy link
Contributor

@jablonskijakub jablonskijakub commented Aug 22, 2022

What

Include the stream 'Properties' in the connector

How

Pull out all properties using a survey_id

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

Tests

Unit

Put your unit tests output here.

Integration

Put your integration tests output here.

Acceptance

Put your acceptance tests output here.

@CLAassistant
Copy link

CLAassistant commented Aug 22, 2022

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added area/connectors Connector related issues area/documentation Improvements or additions to documentation labels Aug 22, 2022
@jablonskijakub jablonskijakub changed the title add new stream properties Zenloop: add new stream properties Aug 22, 2022
@marcosmarxm
Copy link
Member

marcosmarxm commented Aug 23, 2022

/test connector=connectors/source-zenloop

🕑 connectors/source-zenloop https://github.com/airbytehq/airbyte/actions/runs/2909087320
❌ connectors/source-zenloop https://github.com/airbytehq/airbyte/actions/runs/2909087320
🐛 https://gradle.com/s/2xrlthlfc5kse

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestSpec::test_additional_properties_is_true[inputs0] - ...
FAILED test_core.py::TestConnection::test_check[inputs0] - AssertionError: as...
FAILED test_core.py::TestDiscovery::test_backward_compatibility[inputs0] - so...
FAILED test_core.py::TestBasicRead::test_read[inputs0] - docker.errors.Contai...
FAILED test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
FAILED test_incremental.py::TestIncremental::test_two_sequential_reads[inputs0]
FAILED test_incremental.py::TestIncremental::test_read_sequential_slices[inputs0]
FAILED test_incremental.py::TestIncremental::test_state_with_abnormally_large_values[inputs0]
======================== 8 failed, 21 passed in 29.27s =========================

@sajarin sajarin changed the title Zenloop: add new stream properties Source Zenloop: add new stream properties Aug 24, 2022
@sajarin sajarin added bounty-M Maintainer program: claimable medium bounty PR bounty-S Maintainer program: claimable small bounty PR and removed bounty-M Maintainer program: claimable medium bounty PR labels Aug 24, 2022
@sajarin
Copy link
Contributor

sajarin commented Aug 24, 2022

assigned to @monai

Copy link
Contributor

@monai monai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After you make the requested changes, run ./gradlew --no-daemon :airbyte-integrations:connectors:source-zenloop:airbytePythonFormat in the Airbyte root directory to fix the remaining code formatting issues.

@@ -1124,7 +1124,7 @@
- name: Zenloop
sourceDefinitionId: f1e4c7f6-db5c-4035-981f-d35ab4998794
dockerRepository: airbyte/source-zenloop
dockerImageTag: 0.1.1
dockerImageTag: 0.1.2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert property dockerImageTag. The version should only be bumped in Dockerfile, as you did. The publish script will automatically update version in all the required files, including this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted

@@ -5,7 +5,7 @@
"title": "Zenloop Spec",
"type": "object",
"required": ["api_token"],
"additionalProperties": false,
"additionalProperties": true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove "additionalProperties": true. This property is true by default, so there is no need to set it explicitly. For context, see #14878 and #14924.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@@ -11017,7 +11017,7 @@
supportsDBT: false
supported_destination_sync_modes:
- "append"
- dockerImage: "airbyte/source-zenloop:0.1.1"
- dockerImage: "airbyte/source-zenloop:0.1.2"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert airbyte-config/init/src/main/resources/seed/source_specs.yaml. This file is autogenerated by publishing script.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted

@@ -0,0 +1,14 @@
{
"type": ["null", "object"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add $schema: http://json-schema.org/draft-07/schema# as it's mainly used version across all connectors. The schema version, as far as I know, is not used for processing and validation, but it's good practice to include it.

Excerpt from the JSON Schema reference states:

If $schema is not used, an implementation might allow you to specify a value externally or it might make assumptions about which specification version should be used to evaluate the schema. It’s recommended that all JSON Schemas have a $schema keyword to communicate to readers and tooling which specification version is intended.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

{
"stream": {
"name": "properties",
"json_schema": {},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set value of property json_schema to contents of airbyte-integrations/connectors/source-zenloop/source_zenloop/schemas/properties.json. Without it, integration tests may report success when there's an actual failure. All other configured catalogs lack it too, but they're a bad example and should be fixed eventually in other PRs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set. could it be done with $ref ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The short answer is no.

Airbyte partially supports $refs only in the schema directory to reference other files in the shared subdirectory. For more information, you can check #7966 and related PRs. For example, see https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-flexport/source_flexport/schemas.

Files in the integration_tests directory are solely for tests and examples. Their implementation should not depend on connectors code; hence referencing schema files, even if that were possible technically, still isn't a good design.

"name": "properties",
"supported_sync_modes": ["full_refresh"],
"source_defined_cursor": true,
"json_schema": {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set value of property json_schema to contents of airbyte-integrations/connectors/source-zenloop/source_zenloop/schemas/properties.json.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set

@monai
Copy link
Contributor

monai commented Aug 24, 2022

I have a suggestion for future improvement of this connector. It's out of the scope of this PR, but I highly recommend implementing it before transitioning the connector from Alpha to Beta.

This connector implements the class ChildStreamMixin, and duplicates the functionality of the CDK class HttpSubStream. The latter, in addition, supports request caching where it's applicable.

@jablonskijakub
Copy link
Contributor Author

After you make the requested changes, run ./gradlew --no-daemon :airbyte-integrations:connectors:source-zenloop:airbytePythonFormat in the Airbyte root directory to fix the remaining code formatting issues.

I'm having troubles to run this command. I tried to fix it but could not find the solution.
image

@monai
Copy link
Contributor

monai commented Aug 25, 2022

It looks like your development environment doesn't have Java compiler. Airbyte works well with OpenJDK, a free, open-source Java implementation.

For macOS easiest way to install it is using Homebrew:

brew install openjdk
sudo ln -sfn /usr/local/opt/openjdk/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk.jdk

For Linux, I suggest using the system package manager.

And for Windows, I think this SO thread might help you.

Make sure that command javac is in the $PATH.

@sajarin
Copy link
Contributor

sajarin commented Aug 29, 2022

/test connector=connectors/source-zenloop

@monai
Copy link
Contributor

monai commented Aug 31, 2022

The code looks good now. However, integration tests on CI servers are failing due to account issues. We're looking into it. Can you please share a screenshot of passing integration tests?

First, build a Docker container with the latest code changes and then run the test suite:

docker build . -t airbyte/source-zenloop:dev
python -m pytest -p integration_tests.acceptance

@jablonskijakub
Copy link
Contributor Author

Unfortunately I cannot get all the acceptance tests passing. Could you please help?

image

image

@monai
Copy link
Contributor

monai commented Sep 12, 2022

Unfortunately I cannot get all the acceptance tests passing. Could you please help?

Please paste the complete log, including the command line you used to run tests. It's hard to tell what's the cause from the trimmed screenshot.

@jablonskijakub
Copy link
Contributor Author

jablonskijakub commented Sep 12, 2022

thanks for your reply. I ran the commands suggested by you

docker build . -t airbyte/source-zenloop:dev
python -m pytest -p integration_tests.acceptance

zenloop_integration_test.txt

@monai
Copy link
Contributor

monai commented Sep 12, 2022

self = <socket.SocketIO object at 0x1084f2b90>, b = <memory at 0x1098a7340>

    def readinto(self, b):
        """Read up to len(b) bytes into the writable buffer *b* and return
        the number of bytes read.  If the socket is non-blocking and no bytes
        are available, None is returned.
    
        If *b* is non-empty, a 0 return value indicates that the connection
        was shutdown at the other end.
        """
        self._checkClosed()
        self._checkReadable()
        if self._timeout_occurred:
            raise OSError("cannot read from timed out object")
        while True:
            try:
>               return self._sock.recv_into(b)
E               Failed: Timeout >300.0s

The reading test exceeds a timeout of 300 seconds. The problem can be solved in two ways.

First, update date_from in secrets/config.json to a more recent date - ideally, the latest possible date with enough data. That will solve the issue in your local development environment.

The other, more reliable solution would be to split configured catalog for each stream. This would help to spot the particular stream that causes a such issue. See https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-flexport/acceptance-test-config.yml for example. Note basic_read and following tests, also referenced configured catalog files. It's not necessary for the scope of this PR, but if you could contribute this change, it would improve the quality of tests.

@jablonskijakub
Copy link
Contributor Author

I added additional configured_catalog for each stream and was able to identify the one which causes the time out. It's properties whose full refresh takes ~ 12 mins (I was able to sync the data). I guess that's the problem?

@monai
Copy link
Contributor

monai commented Sep 13, 2022

Yes, that's an inherent issue of full_refresh streams and read tests. However, Airbytes test accounts, I think, have few data points, so the sync is quick and doesn't fail on Github CI. Anyway, setting start_date to as late as possible, if the connector and streams support it, may fix the issue in this and other connectors.

I experienced this issue with other connectors that extract lots of data using real production accounts. A good example is the Amazon Ads connector, in which report generation may take up to half an hour while test timeout is only 300 seconds. Therefore the test almost always fails with a production account and actual data.

@jablonskijakub
Copy link
Contributor Author

I limited the number of the data points and now all the test have passed. thank you @monai for your help.

zenloop_integration_tests.txt

@@ -35,6 +35,29 @@
},
"sync_mode": "full_refresh",
"destination_sync_mode": "overwrite"
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One big configured_catalog.json is no longer needed since you split it to separate files. Remove it and a reference from acceptance-test-config.yml.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted

{
"stream": {
"name": "answers",
"json_schema": {},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add streams JSON Schema.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

{
"stream": {
"name": "answers_survey_group",
"json_schema": {},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add streams JSON Schema.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

{
"stream": {
"name": "survey_groups",
"json_schema": {},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add streams JSON Schema.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

{
"stream": {
"name": "surveys",
"json_schema": {},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add streams JSON Schema.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@@ -25,6 +25,26 @@
"source_defined_cursor": true,
"default_cursor_field": "test",
"json_schema": {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're entirely refactoring integration tests, fill in the json_schema properties for all streams in catalog.json to be completely consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@sajarin
Copy link
Contributor

sajarin commented Sep 19, 2022

/test connector=connectors/source-zenloop

🕑 connectors/source-zenloop https://github.com/airbytehq/airbyte/actions/runs/3081132784
✅ connectors/source-zenloop https://github.com/airbytehq/airbyte/actions/runs/3081132784
Python tests coverage:

Name                         Stmts   Miss  Cover
------------------------------------------------
source_zenloop/__init__.py       2      0   100%
source_zenloop/source.py       137     25    82%
------------------------------------------------
TOTAL                          139     25    82%
	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          10      4    60%   15-18
	 source_acceptance_test/config.py                        83      6    93%   78-80, 84-86
	 source_acceptance_test/conftest.py                     164    164     0%   6-282
	 source_acceptance_test/plugin.py                        48     48     0%   6-104
	 source_acceptance_test/tests/test_core.py              329    111    66%   39, 50-58, 63-70, 74-75, 79-80, 164, 202-219, 228-236, 240-245, 251, 284-289, 327-334, 374-376, 379, 439-448, 477-478, 484, 487, 520-530, 543-568, 573-577
	 source_acceptance_test/tests/test_full_refresh.py       52      2    96%   34, 65
	 source_acceptance_test/tests/test_incremental.py       121     25    79%   21-23, 29-31, 36-43, 48-61, 208-216
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  77     17    78%   15-16, 24-30, 47-54, 64, 67
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       110     48    56%   23-26, 32, 36, 39-64, 67-69, 72-74, 77-79, 82-84, 87-89, 92-110, 144-146
	 source_acceptance_test/utils/json_schema_helper.py     105     13    88%   30-31, 38, 41, 65-68, 96, 120, 190-192
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1325    463    65%

Build Passed

Test summary info:

All Passed

@sajarin
Copy link
Contributor

sajarin commented Sep 19, 2022

/publish connector=connectors/source-zenloop

🕑 Publishing the following connectors:
connectors/source-zenloop
https://github.com/airbytehq/airbyte/actions/runs/3081232001


Connector Did it publish? Were definitions generated?
connectors/source-zenloop

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@sajarin sajarin merged commit 1c0d237 into airbytehq:master Sep 19, 2022
robbinhan pushed a commit to robbinhan/airbyte that referenced this pull request Sep 29, 2022
* add new stream properties

* add PR number

* change additionalProperties to true

* revert bumoed version of an image, add json schema in the tests files, remove additionalProperties: true

* gradle build completed

* improve test acceptance quality, adjust the order of objects for the properties json schema

* delete reduntant configured_catalog, add json schema for all streams

* auto-bump connector version [ci skip]

Co-authored-by: Sajarin <sajarindider@gmail.com>
Co-authored-by: Marie Amelie Sandrock <marieamelie94@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
jhammarstedt pushed a commit to jhammarstedt/airbyte that referenced this pull request Oct 31, 2022
* add new stream properties

* add PR number

* change additionalProperties to true

* revert bumoed version of an image, add json schema in the tests files, remove additionalProperties: true

* gradle build completed

* improve test acceptance quality, adjust the order of objects for the properties json schema

* delete reduntant configured_catalog, add json schema for all streams

* auto-bump connector version [ci skip]

Co-authored-by: Sajarin <sajarindider@gmail.com>
Co-authored-by: Marie Amelie Sandrock <marieamelie94@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation bounty bounty-S Maintainer program: claimable small bounty PR community connectors/source/zenloop reward-50
Projects
Development

Successfully merging this pull request may close these issues.

None yet

8 participants