🎉 Source Shopify: Add metafield streams #17962

Merged: artem1205 merged 17 commits into master from 5493-source-shopify-fetch-metafields on Oct 17, 2022

Conversation

artem1205
Collaborator

What

Resolving #5493

How

Add metafield streams

🚨 User Impact 🚨

Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub and connector version bumped by running the /publish command described here

Connector Generator
  • Issue acceptance criteria met
  • PR name follows PR naming conventions
  • If adding a new generator, add it to the list of scaffold modules being tested
  • The generator test modules (all connectors with -scaffold in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
  • Documentation which references the generator is updated as needed


@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Oct 13, 2022
Collaborator

@bazarnov bazarnov left a comment

Minor changes are required. Please refer to the comments.

Also, let's try caching the Products stream, since it is reused by many of the subsequent streams. To do so, add the use_cache = True attribute to the Products stream class. The cache will be created and should be reused by the other stream classes where possible, which I think will also reduce the sync time dramatically. WDYT?
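
A minimal sketch of the caching idea in plain Python, assuming nothing about the CDK's internals. Only the use_cache = True attribute comes from the comment above; ToyStream, fetch_count, MetafieldProducts.paths and the example path are hypothetical names used for illustration, not the connector's actual code.

```python
from typing import Dict, Iterable, Iterator, List, Optional


class ToyStream:
    """Stand-in for a stream class; NOT the Airbyte CDK implementation."""

    use_cache = False  # opt-in flag, mirroring the attribute named in the review

    def __init__(self) -> None:
        self._cache: Optional[List[Dict]] = None
        self.fetch_count = 0  # how many times the (simulated) API was hit

    def _fetch(self) -> Iterable[Dict]:
        raise NotImplementedError

    def read_records(self) -> Iterator[Dict]:
        # Serve from the cache when caching is enabled and already populated.
        if self.use_cache and self._cache is not None:
            yield from self._cache
            return
        self.fetch_count += 1
        records = list(self._fetch())
        if self.use_cache:
            self._cache = records
        yield from records


class Products(ToyStream):
    use_cache = True  # parent stream reused by several child streams

    def _fetch(self) -> Iterable[Dict]:
        # Simulated API response; a real stream would issue HTTP requests here.
        return [{"id": 1}, {"id": 2}]


class MetafieldProducts:
    """Hypothetical child stream that needs one request path per parent record."""

    def __init__(self, parent: Products) -> None:
        self.parent = parent

    def paths(self) -> List[str]:
        # Illustrative path shape only.
        return [f"products/{p['id']}/metafields.json" for p in self.parent.read_records()]
```

With one shared Products instance, the second child stream that iterates over the parent reads the cached records instead of fetching them again, which is where the sync-time saving would come from.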

@artem1205
Collaborator Author

artem1205 commented Oct 14, 2022

/test connector=connectors/source-shopify

🕑 connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3252588598
❌ connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3252588598
🐛 https://gradle.com/s/eanicrrk2ds6a

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED unit_tests/unit_test.py::test_privileges_validation - AssertionError: ...
================== 1 failed, 23 passed, 92 warnings in 0.65s ===================

@artem1205
Collaborator Author

artem1205 commented Oct 14, 2022

/test connector=connectors/source-shopify

🕑 connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3252691468
❌ connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3252691468
🐛 https://gradle.com/s/3tddte24fal4o

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestDiscovery::test_backward_compatibility[inputs0] - so...
FAILED test_core.py::TestDiscovery::test_backward_compatibility[inputs1] - so...
FAILED test_core.py::TestDiscovery::test_backward_compatibility[inputs2] - so...
FAILED test_incremental.py::TestIncremental::test_state_with_abnormally_large_values[inputs0]
================== 4 failed, 47 passed in 1341.51s (0:22:21) ===================

@artem1205
Collaborator Author

artem1205 commented Oct 17, 2022

/test connector=connectors/source-shopify

🕑 connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3264128887
❌ connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3264128887
🐛 https://gradle.com/s/ickzt6lg4buve

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestDiscovery::test_backward_compatibility[inputs0] - so...
FAILED test_core.py::TestDiscovery::test_backward_compatibility[inputs1] - so...
FAILED test_core.py::TestDiscovery::test_backward_compatibility[inputs2] - so...
FAILED test_core.py::TestBasicRead::test_read[inputs0] - docker.errors.Contai...
FAILED test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
FAILED test_incremental.py::TestIncremental::test_two_sequential_reads[inputs0]
FAILED test_incremental.py::TestIncremental::test_read_sequential_slices[inputs0]
FAILED test_incremental.py::TestIncremental::test_state_with_abnormally_large_values[inputs0]
======================== 8 failed, 43 passed in 48.10s =========================

@artem1205
Collaborator Author

artem1205 commented Oct 17, 2022

/test connector=connectors/source-shopify

🕑 connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3264428114
❌ connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3264428114
🐛 https://gradle.com/s/nh7u6thvdsvjw

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_incremental.py::TestIncremental::test_state_with_abnormally_large_values[inputs0]
SKIPPED [3] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:235: Backward compatibility tests are disabled for version 0.1.38.
============= 1 failed, 47 passed, 3 skipped in 1337.53s (0:22:17) =============

@artem1205
Collaborator Author

artem1205 commented Oct 17, 2022

/test connector=connectors/source-shopify

🕑 connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3264945674
❌ connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3264945674
🐛 https://gradle.com/s/xit4hgrmcx5bo

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_incremental.py::TestIncremental::test_state_with_abnormally_large_values[inputs0]
SKIPPED [3] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:235: Backward compatibility tests are disabled for version 0.1.38.
============= 1 failed, 47 passed, 3 skipped in 1291.03s (0:21:31) =============

@artem1205
Collaborator Author

artem1205 commented Oct 17, 2022

/test connector=connectors/source-shopify

🕑 connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3265970253
❌ connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3265970253
🐛 https://gradle.com/s/stf52pgshqtsa

Build Failed

Test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestBasicRead::test_read[inputs0] - AssertionError: All ...
SKIPPED [3] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:235: Backward compatibility tests are disabled for version 0.1.38.
============= 1 failed, 47 passed, 3 skipped in 1587.67s (0:26:27) =============

@artem1205
Collaborator Author

artem1205 commented Oct 17, 2022

/test connector=connectors/source-shopify

🕑 connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3266658108
✅ connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3266658108
Python tests coverage:

Name                          Stmts   Miss  Cover
-------------------------------------------------
source_shopify/__init__.py        2      0   100%
source_shopify/transform.py      58      3    95%
source_shopify/utils.py          58      6    90%
source_shopify/auth.py           20      4    80%
source_shopify/source.py        402    122    70%
-------------------------------------------------
TOTAL                           540    135    75%
	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          10      4    60%   15-18
	 source_acceptance_test/config.py                        83      6    93%   78-80, 84-86
	 source_acceptance_test/conftest.py                     164    164     0%   6-282
	 source_acceptance_test/plugin.py                        48     48     0%   6-104
	 source_acceptance_test/tests/test_core.py              329    111    66%   39, 50-58, 63-70, 74-75, 79-80, 164, 202-219, 228-236, 240-245, 251, 284-289, 327-334, 374-376, 379, 439-448, 477-478, 484, 487, 520-530, 543-568, 573-577
	 source_acceptance_test/tests/test_full_refresh.py       52      2    96%   34, 65
	 source_acceptance_test/tests/test_incremental.py       152     26    83%   21-23, 29-31, 36-43, 48-61, 239, 250-258
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  77     17    78%   15-16, 24-30, 47-54, 64, 67
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       112     50    55%   23-26, 32, 36, 39-67, 70-72, 75-77, 80-82, 85-87, 90-92, 95-113, 147-149
	 source_acceptance_test/utils/json_schema_helper.py     105     13    88%   30-31, 38, 41, 65-68, 96, 120, 190-192
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1358    466    66%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [3] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:235: Backward compatibility tests are disabled for version 0.1.38.
================== 48 passed, 3 skipped in 1366.08s (0:22:46) ==================

@bazarnov
Collaborator

bazarnov commented Oct 17, 2022

/test connector=connectors/source-shopify

🕑 connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3267559648
✅ connectors/source-shopify https://github.com/airbytehq/airbyte/actions/runs/3267559648
Python tests coverage:

Name                          Stmts   Miss  Cover
-------------------------------------------------
source_shopify/__init__.py        2      0   100%
source_shopify/transform.py      58      3    95%
source_shopify/utils.py          58      6    90%
source_shopify/auth.py           20      4    80%
source_shopify/source.py        403    123    69%
-------------------------------------------------
TOTAL                           541    136    75%
	 Name                                                 Stmts   Miss  Cover   Missing
	 ----------------------------------------------------------------------------------
	 source_acceptance_test/base.py                          10      4    60%   15-18
	 source_acceptance_test/config.py                        83      6    93%   78-80, 84-86
	 source_acceptance_test/conftest.py                     164    164     0%   6-282
	 source_acceptance_test/plugin.py                        48     48     0%   6-104
	 source_acceptance_test/tests/test_core.py              329    111    66%   39, 50-58, 63-70, 74-75, 79-80, 164, 202-219, 228-236, 240-245, 251, 284-289, 327-334, 374-376, 379, 439-448, 477-478, 484, 487, 520-530, 543-568, 573-577
	 source_acceptance_test/tests/test_full_refresh.py       52      2    96%   34, 65
	 source_acceptance_test/tests/test_incremental.py       145     20    86%   21-23, 29-31, 36-43, 48-61, 224
	 source_acceptance_test/utils/asserts.py                 37      2    95%   57-58
	 source_acceptance_test/utils/common.py                  77     10    87%   15-16, 24-30, 64, 67
	 source_acceptance_test/utils/compare.py                 62     23    63%   21-51, 68, 97-99
	 source_acceptance_test/utils/connector_runner.py       112     50    55%   23-26, 32, 36, 39-67, 70-72, 75-77, 80-82, 85-87, 90-92, 95-113, 147-149
	 source_acceptance_test/utils/json_schema_helper.py     105     13    88%   30-31, 38, 41, 65-68, 96, 120, 190-192
	 ----------------------------------------------------------------------------------
	 TOTAL                                                 1351    453    66%

Build Passed

Test summary info:

=========================== short test summary info ============================
SKIPPED [3] ../usr/local/lib/python3.9/site-packages/source_acceptance_test/tests/test_core.py:235: Backward compatibility tests are disabled for version 0.1.38.
================== 48 passed, 3 skipped in 1754.60s (0:29:14) ==================

@bazarnov bazarnov linked an issue Oct 17, 2022 that may be closed by this pull request
Collaborator

@bazarnov bazarnov left a comment

Nice work.

@artem1205
Collaborator Author

artem1205 commented Oct 17, 2022

/publish connector=connectors/source-shopify

🕑 Publishing the following connectors:
connectors/source-shopify
https://github.com/airbytehq/airbyte/actions/runs/3268171033


Connector Did it publish? Were definitions generated?
connectors/source-shopify

if you have connectors that successfully published but failed definition generation, follow step 4 here ▶️

@octavia-squidington-iii octavia-squidington-iii temporarily deployed to more-secrets October 17, 2022 20:46 Inactive
@artem1205 artem1205 merged commit fc8be13 into master Oct 17, 2022
@artem1205 artem1205 deleted the 5493-source-shopify-fetch-metafields branch October 17, 2022 21:01
YatsukBogdan1 pushed a commit that referenced this pull request Oct 18, 2022
* 🎉 Source Shopify: Add metafield streams

* Source Shopify: fix unittest

* Source Shopify: docs update

* Source Shopify: fix backward compatibility test

* Source Shopify: fix schemas

* Source Shopify: fix state filter

* Source Shopify: refactor & optimize

* Source Shopify: fix test privileges

* Source Shopify: fix stream filter

* Source Shopify: fix streams

* Source Shopify: update abnormal state

* Source Shopify: fix abnormal state streams

* Source Shopify: fix streams

* updated methods, formated code

* Source Shopify: typo fix

* auto-bump connector version

Co-authored-by: Oleksandr Bazarnov <oleksandr.bazarnov@globallogic.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
YatsukBogdan1 added a commit that referenced this pull request Oct 19, 2022
* Implement ColumnSortButton component

* Updates component name; Moves component to ui/Table folder; Refactors formattedMessageId property into using render content as children directly; Removes minor SortIcon component

* Update airbyte-webapp/src/App.tsx

Co-authored-by: Edmundo Ruiz Ghanem <168664+edmundito@users.noreply.github.com>

* Updates next properties: wasActive -> isActive, lowToLarge -> isAscending

* Skip psql stop in acceptance test for gke (#18023)

* Checks for iterator hasNext element (#18041)

* Checks for iterator hasNext element

* Fix linter with newline

* Add Message Migration to Destination Connection Checks (#17954)

* Add Message Migration to Destination Connection Checks

* Fix test setup

* Update helm release workflow (#18048)

* Update workflow

* Update trigger rules

* fix: Update release workflow with abillity to add tags

* Update workflow

* Remove unused `airbyte-cli` (#18009)

* 🐛  [low-code] $options shouldn't overwrite values that are already defined (#18060)

* fix

* Add missing test

* remove prints

* extract to method

* rename

* Add missing test

* rename

* bump

* Update helm chart comments (#18072)

* Update helm charts (#18073)

* add test

* fix chart.yaml

* 16250 Destination Redis: Add SSH support (#17951)

* 16250 Destination Redis: Add SSH support

* 16250 Resolve port issue

* 11679 Bump version

* auto-bump connector version

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* Bump helm chart version reference to 0.40.20 (#18074)

* Bump helm chart version reference to 0.40.20

* remove binary

Co-authored-by: xpuska513 <xpuska513@users.noreply.github.com>
Co-authored-by: Kyryl Skobylko <xpuska513@gmail.com>

* Helm Chart: Create service annotations for airbyte-server (#17932)

* Support annotations for airbyte-server as well, update version and update docs.

* Fix auto-indent.

Co-authored-by: Kyryl Skobylko <xpuska513@gmail.com>

* Bmoric/remove dep server worker (#17894)

* test [ci skip]

* Autogenerated files

* Add missing annotation

* Remove unused json2Schema block from worker

* Move tess

* Missing deps and format

* Fix test build

* TMP

* Add missing dependencies

* PR comments

* Tmp

* [ci skip] Tmp

* Fix acceptance test and add the seed dependency

* Fix build

* For diff

* tmp

* Build pass

* make the worker to be  on the platform only

* fix setting.yaml

* Fix pmd

* Fix Cron

* Add chart

* Fix cron

* Fix server build.gradle

* Fix jar conflict

* PR comments

* Add cron micronaut environemnt

* Updated connector catalog page (#18076)

* Move the port forward outside of the main docker-compose (#17864)

* Bump Airbyte version from 0.40.14 to 0.40.15 (#17970)

Co-authored-by: benmoriceau <benmoriceau@users.noreply.github.com>

* 🎉 Source Shopify: Add metafield streams (#17962)

* 🎉 Source Shopify: Add metafield streams

* Source Shopify: fix unittest

* Source Shopify: docs update

* Source Shopify: fix backward compatibility test

* Source Shopify: fix schemas

* Source Shopify: fix state filter

* Source Shopify: refactor & optimize

* Source Shopify: fix test privileges

* Source Shopify: fix stream filter

* Source Shopify: fix streams

* Source Shopify: update abnormal state

* Source Shopify: fix abnormal state streams

* Source Shopify: fix streams

* updated methods, formated code

* Source Shopify: typo fix

* auto-bump connector version

Co-authored-by: Oleksandr Bazarnov <oleksandr.bazarnov@globallogic.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* fix check for streams that do not use a stream slicer (#18080)

* fix check for streams that do not use a stream slicer

* increment version and changelog before publish

* tolerate database nulls in webhook operation configs (#18084)

* Implement webhook operation in the sync workflow (#18022)

Implements the webhook operation as part of the sync workflow.

- Introduces the new activity implementation
- Updates the various interfaces that pass input to get the relevant configs to the sync workflow
- Hooks the new activity into the sync workflow
- Passes the webhook configs along into the sync workflow job

* Bump helm chart version reference to 0.40.22 (#18077)

* Added new "filters" python file, along with a "hash" filter. This can… (#18000)

* Added new "filters" python file, along with a "hash" filter. This can be extended to include other custom filters in the future.

* Added additional comments

* Moved usage of the hash_obj inside the conditional that confirms it exists

* Moved the hash function call inside a condition to ensure that it exists

* Fixed the application of the salt , so that it does not modify the hash unless it is actually passed in.

* Added unit tests to validate new jinja hash functionality

* Updated unit test to pass numeric value as a float instead of string

* Removed unreferenced import to pytest

* Updated version

* format

* format

* format

* format

* format

Co-authored-by: Alexandre Girard <alexandre@airbyte.io>

* Bump helm chart version reference to 0.40.24 (#18081)

* Bump helm chart version reference to 0.40.24

* Update .gitignore

Co-authored-by: benmoriceau <benmoriceau@users.noreply.github.com>
Co-authored-by: Kyryl Skobylko <xpuska513@gmail.com>

* SATs: allow new records in a sequential read for full refresh test (#17660)

* SATs: allow new records in a sequential read for full refresh test

* SATs: upd changelog

* SATs: change the output when failing full refresh test

* SATs: upd according to code review

* Source facebook-marketing: remove `pixel` from custom conversions stream (#18045)

* #744 source facebook-marketing: rm pixel from custom conversions stream

* #744 source fb marketing: upd changelog

* #744 source facebook-marketing - add custom_conversions to the test catalog

* auto-bump connector version

Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>

* #17506 fix klaviyo & marketo expected_records (#18101)

Co-authored-by: Edmundo Ruiz Ghanem <168664+edmundito@users.noreply.github.com>
Co-authored-by: terencecho <3916587+terencecho@users.noreply.github.com>
Co-authored-by: Ryan Fu <ryan.fu@airbyte.io>
Co-authored-by: Jimmy Ma <gosusnp@users.noreply.github.com>
Co-authored-by: Kyryl Skobylko <xpuska513@gmail.com>
Co-authored-by: Evan Tahler <evan@airbyte.io>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
Co-authored-by: Yevhen Sukhomud <suhomud@gmail.com>
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: xpuska513 <xpuska513@users.noreply.github.com>
Co-authored-by: Prasanth <72515998+sfc-gh-pkommini@users.noreply.github.com>
Co-authored-by: Benoit Moriceau <benoit@airbyte.io>
Co-authored-by: Amruta Ranade <11484018+Amruta-Ranade@users.noreply.github.com>
Co-authored-by: Octavia Squidington III <90398440+octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: benmoriceau <benmoriceau@users.noreply.github.com>
Co-authored-by: Artem Inzhyyants <36314070+artem1205@users.noreply.github.com>
Co-authored-by: Oleksandr Bazarnov <oleksandr.bazarnov@globallogic.com>
Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>
Co-authored-by: Michael Siega <109092231+mfsiega-airbyte@users.noreply.github.com>
Co-authored-by: Alexander Marquardt <alexander.marquardt@gmail.com>
Co-authored-by: Denys Davydov <davydov.den18@gmail.com>
jhammarstedt pushed a commit to jhammarstedt/airbyte that referenced this pull request Oct 31, 2022
jhammarstedt pushed a commit to jhammarstedt/airbyte that referenced this pull request Oct 31, 2022
Co-authored-by: Denys Davydov <davydov.den18@gmail.com>
@loris

loris commented Jul 18, 2023

Minor changes are required. Please refer to the comments.

Also, let's try to cache the Products stream, since it is re-used a lot by subsequent streams. To do so, simply add the use_cache = True attribute to the Products stream class; the cache will be created and should be re-used for all other subsequent stream classes where possible, which should also reduce the sync time dramatically, I think. WDYT?

Hi @bazarnov, I have been looking at the source-shopify connector performance lately, especially to speed up syncs of metafields. I am wondering why you only enabled the cache for the products stream here. Did you leave the others (like customers) without cache on purpose?

@bazarnov
Collaborator

bazarnov commented Jul 19, 2023

@loris My best guess is that we didn't realize the need.

@artem1205 are there any difficulties with caching the metadata-* related streams to improve the performance?

In my view, this will not speed up the connector, since these streams are not re-used: each one stands alone and depends only on the parent stream it needs the ids from. But we can research this anyway.

Did you create an issue for this? @loris

@loris

loris commented Jul 24, 2023

Hi @bazarnov. Thanks for the reply. I will create a new issue regarding Shopify metafields, but first I have some questions you can answer to help me understand some Airbyte design choices for handling metafields with substreams.

Considering the way metafields are modeled and exposed in Shopify:

  • You can only fetch the metafields of a single parent resource (ie, /admin/products/{product_id}/metafields.json)
  • If you change the metafields of a parent resource (add, update or remove any metafields), then the parent resource document will have its updated_at property updated

I don't understand why metafields (and substreams in general) are designed the way they are in Airbyte right now: as separate standalone streams, with their own state, which need to use the parent stream query to fetch the resource IDs (hopefully some queries can be cached, but that comes with memory challenges). My understanding is that it was designed this way to make it possible in an Airbyte connection to sync only the substream (i.e., the metafields) without syncing the parent stream (i.e., the products), but I don't get why someone would want to do that.

It would be easier to consider metafields as dependent streams of the parent resource, meaning:

  • that you cannot enable it if you have not enabled the parent resource stream
  • that they don't have their own state: it is the parent stream which uses its own state to fetch the updated parent resources (i.e., /admin/products.json?updated_at_min={state}), then fetches the metafields for each parent resource and emits all the metafields returned by Shopify to the destination (this is the only way to handle deletions of metafields)

WDYT? Are you aware of other Airbyte connectors which have these "required" streams (i.e., you cannot enable stream A if you don't have stream B enabled)?

@bazarnov
Collaborator

bazarnov commented Jul 24, 2023

@loris Regarding your question:

My understanding is that it was designed this way to make it possible in an Airbyte connection to sync only the substream (i.e., the metafields) without syncing the parent stream (i.e., the products), but I don't get why someone would want to do that.

Me neither, but there are customers who need just 1 or 2 streams fetched independently. In a nutshell, the metafields are dependent on their parent streams: even though they look independent in the UI, the backend fetches the parent stream first, then the substream using the parent id, as you described earlier here:

that they don't have their own state: it is the parent stream which uses its own state to fetch the updated parent resources (i.e., /admin/products.json?updated_at_min={state}), then fetches the metafields for each parent resource and emits all the metafields returned by Shopify to the destination (this is the only way to handle deletions of metafields)

As for:

WDYT? Are you aware of other Airbyte connectors which have these "required" streams (i.e., you cannot enable stream A if you don't have stream B enabled)?

The Shopify connector is one of the few that use this technique: the streams are exposed as explicitly independent, and both the parent stream state and the substream state are kept so that the substream can be fetched incrementally based on the parent stream. As already said, they look independent in the UI, but on the backend they are synced based on the parent updates. If that turns out not to be the case, we will consider all the issues around the metafields streams during the planned maintenance we are already working on.

@loris

loris commented Jul 24, 2023

@bazarnov Thanks for the detailed answer.
However, I'm convinced the substream has its own state, which is not the same as the parent one. I opened an issue a while ago (#27355) where one of our Shopify stores had the following state at some point:

[
  {
    "streamDescriptor": {
      "name": "customers"
    },
    "streamState": {
      "updated_at": "2023-06-14T09:40:41+02:00"
    }
  },
  {
    "streamDescriptor": {
      "name": "metafield_customers"
    },
    "streamState": {
      "customers": {
        "updated_at": "2023-03-30T05:32:13+02:00"
      },
      "updated_at": "2021-09-01T17:57:41+02:00"
    }
  }
]

As you can see, the metafield_customers stream has its own state (updated_at for metafield_customers and updated_at for customers, the latter being the most recent customer updated_at value found in the most recent metafield_customers value). The bug in that case is that the sync will:

  • correctly synchronize the customers stream (using updated_at_min=2023-06-14T09:40:41+02:00)
  • but will use the 2023-03-30T05:32:13+02:00 timestamp to fetch all customer IDs, and then all metafields for those customers (instead of just using the 2023-06-14T09:40:41+02:00 timestamp from the parent stream)
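
To make the gap concrete, here is a minimal sketch (standard library only) that reads the state shown above and compares the cursor the customers stream resumes from with the nested customers cursor stored under metafield_customers. The helper name `stream_states` and the variable names are illustrative and not part of the connector.

```python
import json
from datetime import datetime

# State copied from the comment above.
STATE = json.loads("""
[
  {"streamDescriptor": {"name": "customers"},
   "streamState": {"updated_at": "2023-06-14T09:40:41+02:00"}},
  {"streamDescriptor": {"name": "metafield_customers"},
   "streamState": {"customers": {"updated_at": "2023-03-30T05:32:13+02:00"},
                   "updated_at": "2021-09-01T17:57:41+02:00"}}
]
""")


def stream_states(state):
    """Index the per-stream state objects by stream name (illustrative helper)."""
    return {s["streamDescriptor"]["name"]: s["streamState"] for s in state}


states = stream_states(STATE)
parent_cursor = datetime.fromisoformat(states["customers"]["updated_at"])
nested_cursor = datetime.fromisoformat(
    states["metafield_customers"]["customers"]["updated_at"]
)

# Window of customers that is re-read only to collect ids for the metafields.
lag = parent_cursor - nested_cursor
print(f"customers resumes from:            {parent_cursor.isoformat()}")
print(f"metafield_customers re-reads from: {nested_cursor.isoformat()}")
print(f"extra window re-read for ids:      {lag.days} days")
```
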

@bazarnov
Collaborator

@loris Have you noticed that metafields can be added and updated regardless of the actual customer object, for instance? For example, the metafield's "name" field could be updated directly through the API without an update to the actual customer record. That means there is a possibility that someone will update the properties of the metafields instead of updating the actual record. That's why the metafields have their own "updated_at": they are not stateless.
Making them depend directly on the parent object state could in this case cause missing metafields records for some users, if the parent record state has a bigger value than the metafield state.

@bazarnov
Collaborator

I'll double-check the actual logic of how the metafields streams work once I start working on this issue in a few days. Thank you for your findings.

@loris

loris commented Jul 25, 2023

@bazarnov I just did some tests on a Shopify store we own to validate which state we should use:

Initial state

customer has an updated_at=2023-07-20T10:13:58+02:00 and no metafields

GET /admin/customers/7150840250556.json?fields=id,updated_at
{
    "customer": {
        "id": 7150840250556,
        "updated_at": "2023-07-20T10:13:58+02:00"
    }
}

GET /admin/customers/7150840250556/metafields.json?fields=id,created_at,updated_at
{
    "metafields": []
}

Adding a metafield

customer state got properly updated as we add a new metafield

POST /admin/customers/7150840250556/metafields.json {"metafield":{"namespace":"foo","key":"bar","type":"single_line_text_field","value":"baz"}}

GET /admin/customers/7150840250556.json?fields=id,updated_at
{
    "customer": {
        "id": 7150840250556,
        "updated_at": "2023-07-25T10:22:09+02:00"
    }
}

GET /admin/customers/7150840250556/metafields.json?fields=id,created_at,updated_at
{
    "metafields": [
        {
            "id": 37437136044220,
            "created_at": "2023-07-25T10:22:09+02:00",
            "updated_at": "2023-07-25T10:22:09+02:00"
        }
    ]
}

Updating a metafield

customer state got properly updated as we update an existing metafield

PUT /admin/customers/7150840250556/metafields/37437136044220.json {"metafield":{"value":"baz2"}}

GET /admin/customers/7150840250556.json?fields=id,updated_at
{
    "customer": {
        "id": 7150840250556,
        "updated_at": "2023-07-25T10:25:58+02:00"
    }
}

GET /admin/customers/7150840250556/metafields.json?fields=id,created_at,updated_at
{
    "metafields": [
        {
            "id": 37437136044220,
            "created_at": "2023-07-25T10:22:09+02:00",
            "updated_at": "2023-07-25T10:25:58+02:00"
        }
    ]
}

Deleting a metafield

customer state got also updated as we delete an existing metafield

DELETE /admin/customers/7150840250556/metafields/37437136044220.json

GET /admin/customers/7150840250556.json?fields=id,updated_at
{
    "customer": {
        "id": 7150840250556,
        "updated_at": "2023-07-25T10:27:27+02:00"
    }
}

GET /admin/customers/7150840250556/metafields.json?fields=id,created_at,updated_at
{
    "metafields": []
}

As you can see, the only proper way to synchronize the metafields of a Shopify resource (and handle all operations: creation, modification and deletion of metafields) is to do a full refresh of the metafields of a given parent resource when the parent resource has its updated_at value changed. That's why I proposed (though I'm not sure whether it goes against how Airbyte handles child streams) to sync metafields dependently on the parent stream's sync: we only store the updated_at state of the parent stream, fetch the updated parent resources, and for each fetched parent record fetch the associated metafields (no need for any metafields state). WDYT?
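
A minimal sketch of the parent-driven sync proposed above, assuming an in-memory stand-in for the two Shopify endpoints. The data, `fetch_customers`, `fetch_metafields` and `sync_metafields` are all hypothetical names for illustration and are not the connector's code. Only the parent cursor is stored, and each updated parent yields a full snapshot of its metafields, including an empty one after a deletion.

```python
from datetime import datetime
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-in for the Shopify data.
CUSTOMERS = [
    {"id": 1, "updated_at": "2023-07-20T10:13:58+02:00"},
    {"id": 2, "updated_at": "2023-07-25T10:27:27+02:00"},
]
METAFIELDS = {
    1: [{"id": 11, "owner_id": 1, "updated_at": "2023-07-01T08:00:00+02:00"}],
    2: [],  # its only metafield was deleted, which bumped the customer's updated_at
}


def ts(value: str) -> datetime:
    return datetime.fromisoformat(value)


def fetch_customers(updated_at_min: str) -> List[dict]:
    """Stands in for a customers listing filtered by updated_at_min."""
    return [c for c in CUSTOMERS if ts(c["updated_at"]) >= ts(updated_at_min)]


def fetch_metafields(customer_id: int) -> List[dict]:
    """Stands in for GET /admin/customers/{customer_id}/metafields.json."""
    return list(METAFIELDS[customer_id])


def sync_metafields(parent_cursor: str) -> Tuple[Dict[int, List[dict]], str]:
    """Return a full metafield snapshot per updated parent, plus the new parent cursor."""
    snapshots: Dict[int, List[dict]] = {}
    new_cursor = parent_cursor
    for customer in fetch_customers(parent_cursor):
        # No metafield cursor: every metafield of an updated parent is emitted.
        snapshots[customer["id"]] = fetch_metafields(customer["id"])
        if ts(customer["updated_at"]) > ts(new_cursor):
            new_cursor = customer["updated_at"]
    return snapshots, new_cursor


snapshots, cursor = sync_metafields("2023-07-24T00:00:00+02:00")
print(snapshots)
print(cursor)
```

In this sketch customer 2 is the only parent past the cursor, and its empty snapshot is what would let a destination notice that its metafields are gone.
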

@bazarnov
Collaborator

@loris
I've just double-checked the connector's logic against this statement:

to sync metafields dependently on the parent stream's sync: we only store the updated_at state of the parent stream, fetch the updated parent resources, and for each fetched parent record fetch the associated metafields (no need for any metafields state). WDYT?

It turns out that we're indeed doing what you propose:

We sync the parent incrementally and get a few records, then, if state is available, we pass the ids fetched from the parent to fetch the metafields for those ids (incremental parent to incremental substream).

See the logic here: https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-shopify/source_shopify/source.py#L257

This method produces the slices with ids. Let's use the metafields Customers stream as an example here.

The ids of the incrementally fetched customers (those updated after the state value) are taken and passed on to fetch all of their metafields.

Initial sync sample:

Parent stream

Customer slice example: {'id': 123, 'updated_at': '2023-04-24T06:53:48-07:00'}

Substream:

// Metafields Customer

[
    {
        "id": 1111,
        "namespace": "custom",
        "value": "Test\n",
        "description": null,
        "owner_id": 123,
        "created_at": "2023-04-13T04:50:10-07:00",
        "updated_at": "2023-04-13T04:50:10-07:00",
        ....
    },
    {
        "id": 222,
        "namespace": "custom",
        "value": "Test15",
        "description": null,
        "owner_id": 123,
        "created_at": "2023-04-13T04:50:10-07:00",
        "updated_at": "2023-04-14T05:31:11-07:00",
        ....
    }
]

State

The state for the metafield_customers stream would, in this case, be:

{
    "state": {
        "metafield_customers": {
            "updated_at": "2023-04-14T05:31:11-07:00",
            "customers": {
                "updated_at": "2023-04-24T06:53:48-07:00"
            }
        }
    }
}

This is because the parent record was updated while the substream records for this parent were not. The metafields were updated manually days earlier; after that, the parent record was updated with other information unrelated to the metafields.
There are 2 records listed above for this parent in our sandbox because it was added twice at the same time.

Now, when we run the subsequent sync for the same metafield_customers stream, we have the latest state available:

Subsequent Incremental sync

Parent stream

Customer slice example: {'id': 123, 'updated_at': '2023-04-24T06:53:48-07:00'}

The updated_at was not changed because it's a static value for this test.

We fetch the same customer again, but we expect a smaller number of records.

Substream:

{
    "id": 222,
    "namespace": "custom",
    "value": "Test15",
    "description": null,
    "owner_id": 123,
    "created_at": "2023-04-13T04:50:10-07:00",
    "updated_at": "2023-04-14T05:31:11-07:00",
    ....
}

State

{
    "state": {
        "metafield_customers": {
            "updated_at": "2023-04-14T05:31:11-07:00",
            "customers": {
                "updated_at": "2023-04-24T06:53:48-07:00"
            }
        }
    }
}

The reason the metafields are designed like this (and I totally agree with what you've said earlier) is that the parent object could be updated with non-metafields information, leaving the related metafields behind if they are not explicitly updated, manually or in another way.

If we fetch the records like you propose, this record will be omitted from the substream, leading to lost records and lots of issues in the future. We'd like to avoid unnecessary complications by keeping this logic transparent and understandable for the end user.
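
The two syncs walked through above can be reproduced with a small sketch. It assumes an in-memory copy of the sample data and an inclusive cursor filter (inferred from record 222 being returned again on the second sync). `read_metafield_customers` and the other names are hypothetical and this is not the connector's actual implementation.

```python
from datetime import datetime
from typing import List, Optional, Tuple

# Sample data copied from the comment above (one customer, two metafields).
CUSTOMERS = [{"id": 123, "updated_at": "2023-04-24T06:53:48-07:00"}]
METAFIELDS = {
    123: [
        {"id": 1111, "owner_id": 123, "updated_at": "2023-04-13T04:50:10-07:00"},
        {"id": 222, "owner_id": 123, "updated_at": "2023-04-14T05:31:11-07:00"},
    ]
}


def ts(value: str) -> datetime:
    return datetime.fromisoformat(value)


def latest(*values: Optional[str]) -> Optional[str]:
    """Most recent of the given timestamps, ignoring missing ones."""
    present = [v for v in values if v]
    return max(present, key=ts) if present else None


def read_metafield_customers(state: Optional[dict]) -> Tuple[List[dict], dict]:
    """One sync of the substream: slice by parent, then filter by the substream cursor."""
    parent_cursor = state["customers"]["updated_at"] if state else None
    sub_cursor = state["updated_at"] if state else None
    new_parent, new_sub = parent_cursor, sub_cursor

    records: List[dict] = []
    for customer in CUSTOMERS:
        # Parent slice: skip customers older than the nested parent cursor.
        if parent_cursor and ts(customer["updated_at"]) < ts(parent_cursor):
            continue
        new_parent = latest(new_parent, customer["updated_at"])
        for metafield in METAFIELDS[customer["id"]]:
            # Assumption: the filter is inclusive, as the sample output suggests.
            if sub_cursor and ts(metafield["updated_at"]) < ts(sub_cursor):
                continue
            records.append(metafield)
            new_sub = latest(new_sub, metafield["updated_at"])

    return records, {"updated_at": new_sub, "customers": {"updated_at": new_parent}}


first_records, first_state = read_metafield_customers(None)
second_records, second_state = read_metafield_customers(first_state)
print([r["id"] for r in first_records], first_state)
print([r["id"] for r in second_records], second_state)
```
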

@loris

loris commented Aug 18, 2023

@bazarnov Thanks for the investigation, but how can the current design (both in the code and as you described it) handle deletions of metafields?

Is this a known issue / limitation, that Airbyte will not propagate deletions of metafields (and thus we should warn customers and give them workarounds, like doing a full refresh)? Or did I miss something?

My understanding is still that the only way to handle deletions is to do a full refresh (i.e., fetch all metafields for a given parent and emit them all to the destination) for any parent we get to sync.

As you can see, the only proper way to synchronize the metafields of a Shopify resource (and handle all operations: creation, modification and deletion of metafields) is to do a full refresh of the metafields of a given parent resource when the parent resource has its updated_at value changed.
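
A minimal sketch of why a per-parent snapshot is needed to see deletions, using the customer and metafield ids from the test earlier in this thread. The `stored` mapping and `deleted_metafield_ids` are hypothetical and only illustrate the comparison a destination (or the connector) would have to make.

```python
from typing import Iterable, List, Set


def deleted_metafield_ids(stored_ids: Iterable[int], snapshot: List[dict]) -> Set[int]:
    """Ids known from earlier syncs that are missing from the fresh per-parent snapshot."""
    return set(stored_ids) - {m["id"] for m in snapshot}


# What earlier syncs wrote for this customer (hypothetical destination contents).
stored = {7150840250556: {37437136044220}}

# Response body of the metafields listing after the DELETE: "metafields": [].
fresh_snapshot: List[dict] = []

# A cursor-based read can only emit records that still exist, so it emits nothing here.
emitted = list(fresh_snapshot)

# Comparing the stored ids against the full snapshot reveals the deletion.
removed = deleted_metafield_ids(stored[7150840250556], fresh_snapshot)
print(emitted)
print(removed)
```
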

Labels
area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/shopify
Projects
None yet
Development

Successfully merging this pull request may close these issues.

🎉 Source Shopify: fetch all Shopify metafields
5 participants