🎉 New Source: Faker #11738

Merged: 20 commits, Apr 14, 2022

@evantahler evantahler commented Apr 5, 2022

What

Adds a Source Connection that uses the Faker Python library to generate e-commerce-like user data for testing.

Screen Shot 2022-04-05 at 4 04 19 PM

🚨 User Impact 🚨

None! A new connector appears.

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally, e.g. a screenshot or copy-pasted unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow the instructions in the README. For Java connectors, run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/SUMMARY.md
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • If new credentials are required for use in CI, add them to GSM. Instructions.
  • /test connector=connectors/<name> command is passing
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Tests

Unit
python -m pytest unit_tests
Test session starts (platform: darwin, Python 3.9.12, pytest 6.1.2, pytest-sugar 0.9.4)
cachedir: .pytest_cache
rootdir: /Users/evan/workspace/airbyte/airbyte, configfile: pytest.ini
plugins: sugar-0.9.4, Faker-13.3.1, timeout-1.4.2
collecting ...
 airbyte-integrations/connectors/source-faker/unit_tests/unit_test.py::test_source_streams ✓                                                                                         33% ███▍
 airbyte-integrations/connectors/source-faker/unit_tests/unit_test.py::test_read_random_data ✓                                                                                       67% ██████▋
 airbyte-integrations/connectors/source-faker/unit_tests/unit_test.py::test_read_with_seed ✓                                                                                        100% ██████████
======================================================================================== warnings summary =========================================================================================
.venv/lib/python3.9/site-packages/airbyte_cdk/sources/utils/transform.py:12
  /Users/evan/workspace/airbyte/airbyte/airbyte-integrations/connectors/source-faker/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/utils/transform.py:12: DeprecationWarning: Call to deprecated class AirbyteLogger. (Use logging.getLogger('airbyte') instead) -- Deprecated since version 0.1.47.
    logger = AirbyteLogger()

.venv/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/rate_limiting.py:19
  /Users/evan/workspace/airbyte/airbyte/airbyte-integrations/connectors/source-faker/.venv/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/rate_limiting.py:19: DeprecationWarning: Call to deprecated class AirbyteLogger. (Use logging.getLogger('airbyte') instead) -- Deprecated since version 0.1.47.
    logger = AirbyteLogger()

.venv/lib/python3.9/site-packages/airbyte_cdk/utils/event_timing.py:13
  /Users/evan/workspace/airbyte/airbyte/airbyte-integrations/connectors/source-faker/.venv/lib/python3.9/site-packages/airbyte_cdk/utils/event_timing.py:13: DeprecationWarning: Call to deprecated class AirbyteLogger. (Use logging.getLogger('airbyte') instead) -- Deprecated since version 0.1.47.
    logger = AirbyteLogger()

-- Docs: https://docs.pytest.org/en/stable/warnings.html

Results (0.22s):
       3 passed
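
The `test_read_with_seed` case above suggests that a configured seed should make reads reproducible. A minimal sketch of that idea follows. It uses the standard-library `random` module in place of Faker, and `generate_users` and its fields are hypothetical names for illustration, not the connector's code.

```python
import random


def generate_users(count, seed=None):
    """Return `count` fake user dicts; the same seed yields the same users."""
    # A private Random instance keeps the global random state untouched.
    rng = random.Random(seed)
    jobs = ["Musician", "Engineer", "Teacher", "Chef"]  # illustrative values only
    return [
        {"id": i + 1, "job": rng.choice(jobs), "age": rng.randint(18, 90)}
        for i in range(count)
    ]


# Two reads with the same seed should produce identical records.
first = generate_users(5, seed=1234)
second = generate_users(5, seed=1234)
```

Comparing `first` and `second` for equality is the kind of assertion a seed test could make; without a seed, two reads would usually differ.
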
Integration
./acceptance-test-docker.sh
[+] Building 0.6s (16/16) FINISHED
 => [internal] load build definition from Dockerfile                                                                                                                               0.0s
 => => transferring dockerfile: 37B                                                                                                                                                0.0s
 => [internal] load .dockerignore                                                                                                                                                  0.0s
 => => transferring context: 34B                                                                                                                                                   0.0s
 => [internal] load metadata for docker.io/library/python:3.9.11-alpine3.15                                                                                                        0.5s
 => [base 1/1] FROM docker.io/library/python:3.9.11-alpine3.15@sha256:45ddd216e6b4efee0617e15d541e9148ffd6898203fcbe86a9f5bf906ce7837f                                             0.0s
 => [internal] load build context                                                                                                                                                  0.0s
 => => transferring context: 494B                                                                                                                                                  0.0s
 => CACHED [builder 1/4] WORKDIR /airbyte/integration_code                                                                                                                         0.0s
 => CACHED [builder 2/4] RUN apk --no-cache upgrade     && pip install --upgrade pip     && apk --no-cache add tzdata build-base                                                   0.0s
 => CACHED [builder 3/4] COPY setup.py ./                                                                                                                                          0.0s
 => CACHED [builder 4/4] RUN pip install --prefix=/install .                                                                                                                       0.0s
 => CACHED [stage-2 2/7] COPY --from=builder /install /usr/local                                                                                                                   0.0s
 => CACHED [stage-2 3/7] COPY --from=builder /usr/share/zoneinfo/Etc/UTC /etc/localtime                                                                                            0.0s
 => CACHED [stage-2 4/7] RUN echo "Etc/UTC" > /etc/timezone                                                                                                                        0.0s
 => CACHED [stage-2 5/7] RUN apk --no-cache add bash                                                                                                                               0.0s
 => CACHED [stage-2 6/7] COPY main.py ./                                                                                                                                           0.0s
 => CACHED [stage-2 7/7] COPY source_faker ./source_faker                                                                                                                          0.0s
 => exporting to image                                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                                            0.0s
 => => writing image sha256:e41adcd00dcea5e727d718f1329f24f20b8c04553860c625986e70941ca02905                                                                                       0.0s
 => => naming to docker.io/airbyte/source-faker:dev                                                                                                                                0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
latest: Pulling from airbyte/source-acceptance-test
Digest: sha256:b38eb9b2205246b354ec7233b898f0e8d98a813046d2247b5f85e01ab94a522c
Status: Image is up to date for airbyte/source-acceptance-test:latest
docker.io/airbyte/source-acceptance-test:latest
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
Test session starts (platform: linux, Python 3.7.11, pytest 6.2.5, pytest-sugar 0.9.4)
rootdir: /test_input
plugins: sugar-0.9.4, timeout-1.4.2
collecting ...
 test_core.py ✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓✓                                                                                                                                         95% █████████▌
 test_full_refresh.py ✓                                                                                                                                                  100% ██████████

=============================================================================== short test summary info ================================================================================
SKIPPED [1] source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config

Results (14.18s):
      20 passed

@github-actions github-actions bot added the area/connectors Connector related issues label Apr 5, 2022

evantahler commented Apr 5, 2022

We learned #11743 along the way

@evantahler evantahler self-assigned this Apr 6, 2022
@evantahler evantahler changed the title from "Faker Source" to "🎉 New Source: Faker" Apr 13, 2022
@github-actions github-actions bot added the area/documentation Improvements or additions to documentation label Apr 13, 2022

@evantahler evantahler left a comment

I'd love some help with the callouts below (mostly me learning 🐍) and with the failing SAT test (see description).

The SAT test is failing, but I'm not sure why.

E   pydantic.error_wrappers.ValidationError: 3 validation errors for AirbyteRecordMessage
E   stream
E     field required (type=value_error.missing)
E   data
E     field required (type=value_error.missing)
E   emitted_at
E     field required (type=value_error.missing)

This seems to indicate that those properties of the record are missing, but the emitted record contains all three:

{"type": "RECORD", "record": {"stream": "Users", "data": {"job": "Musician", "company": "Williams-Sheppard", "ssn": "498-52-4970", "residence": "Unit 5938 Box 2421\nDPO AP 33335", "current_location": [52.958961, 143.143712], "blood_group": "B+", "website": ["http://www.rivera.com/", "http://grimes-green.net/", "http://www.larsen.com/"], "username": "leeashley", "name": "Gary Cross", "sex": "M", "address": "711 Golden Overpass\nWest Andreaville, MA 71317", "mail": "tamaramorrison@hotmail.com", "birthdate": "1945-06-05", "id": 1, "created_at": "2022-04-12T17:32:21.905904", "updated_at": "2022-04-12T17:32:21.905904"}, "emitted_at": 1649809941000}}
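
The thread does not record what the fix was. One way to get exactly these three errors from a well-formed message is to validate the outer envelope as a record instead of its nested `record` object. The stdlib-only sketch below illustrates that hypothesis; `missing_record_fields` is a hypothetical helper, not part of the Airbyte CDK or pydantic, and the message is an abbreviated copy of the one above.

```python
import json

# The three fields the validation error reports as required.
REQUIRED_RECORD_FIELDS = ("stream", "data", "emitted_at")


def missing_record_fields(payload):
    """List the required record fields absent from the top level of `payload`."""
    return [name for name in REQUIRED_RECORD_FIELDS if name not in payload]


# Abbreviated version of the emitted message, round-tripped through JSON
# the way a test harness reading connector output would see it.
line = json.dumps(
    {
        "type": "RECORD",
        "record": {
            "stream": "Users",
            "data": {"id": 1, "name": "Gary Cross"},
            "emitted_at": 1649809941000,
        },
    }
)
message = json.loads(line)

# Checking the envelope reports all three fields as missing...
envelope_missing = missing_record_fields(message)
# ...while checking the nested record reports none.
record_missing = missing_record_fields(message["record"])
```

If the failure had this shape, the record itself would be fine and the mismatch would lie in which level of the message was being validated.
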

@evantahler

Thanks to @girarda and @sherifnada the tests are passing!

@evantahler evantahler requested a review from girarda April 13, 2022 23:31
@evantahler evantahler marked this pull request as ready for review April 13, 2022 23:31

evantahler commented Apr 13, 2022

/test connector=connectors/source-faker

🕑 connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2164310341
❌ connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2164310341
🐛 https://gradle.com/s/iyajzbwh6yt2o
Python short test summary info:

=========================== short test summary info ============================
ERROR test_core.py::TestSpec::test_config_match_spec[inputs0] - FileNotFoundE...
ERROR test_core.py::TestConnection::test_check[inputs0] - FileNotFoundError: ...
ERROR test_core.py::TestDiscovery::test_discover[inputs0] - FileNotFoundError...
ERROR test_core.py::TestDiscovery::test_defined_cursors_exist_in_schema[inputs0]
ERROR test_core.py::TestDiscovery::test_defined_refs_exist_in_schema[inputs0]
ERROR test_core.py::TestDiscovery::test_defined_keyword_exist_in_schema[inputs0-allOf]
ERROR test_core.py::TestDiscovery::test_defined_keyword_exist_in_schema[inputs0-not]
ERROR test_core.py::TestDiscovery::test_primary_keys_exist_in_schema[inputs0]
ERROR test_core.py::TestBasicRead::test_read[inputs0] - FileNotFoundError: [E...
ERROR test_full_refresh.py::TestFullRefresh::test_sequential_reads[inputs0]
SKIPPED [1] ../usr/local/lib/python3.7/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
=================== 10 passed, 1 skipped, 10 errors in 8.77s ===================

evantahler commented Apr 14, 2022

/test connector=connectors/source-faker

🕑 connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2164401122
❌ connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2164401122
🐛 https://gradle.com/s/rdpuu3np6hk7g
Python short test summary info:

=========================== short test summary info ============================
FAILED test_core.py::TestBasicRead::test_read[inputs0] - AssertionError: Stre...
SKIPPED [1] ../usr/local/lib/python3.7/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
=================== 1 failed, 19 passed, 1 skipped in 16.26s ===================

evantahler commented Apr 14, 2022

/test connector=connectors/source-faker

🕑 connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2164453823
✅ connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2164453823
Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        74      6    92%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/utils/common.py                  70     17    76%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/tests/test_core.py              285    106    63%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
source_acceptance_test/tests/test_incremental.py        69     38    45%
------------------------------------------------------------------------
TOTAL                                                  886    259    71%
Name                       Stmts   Miss  Cover
----------------------------------------------
source_faker/__init__.py       2      0   100%
source_faker/source.py        51      3    94%
----------------------------------------------
TOTAL                         53      3    94%

Python short test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.7/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
======================== 20 passed, 1 skipped in 15.17s ========================

codecov bot commented Apr 14, 2022

Codecov Report

❗ No coverage uploaded for pull request base (master@6ef2f80).
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master   #11738   +/-   ##
=========================================
  Coverage          ?   96.00%           
=========================================
  Files             ?        2           
  Lines             ?       50           
  Branches          ?        0           
=========================================
  Hits              ?       48           
  Misses            ?        2           
  Partials          ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6ef2f80...8fba7d7. Read the comment docs.

github-actions bot referenced this pull request Apr 14, 2022

@Phlair Phlair left a comment

Awesome, first source!!!

Nothing major on my review, mostly just nit comments and one point on repeating index numbers on incremental syncs.
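
To illustrate the point about repeating index numbers: one way a generated-data source could avoid reusing IDs across incremental syncs is to persist the last emitted ID as a cursor and resume from it. The sketch below assumes that approach; `read_users`, the `cursor` state key, and the per-record seeding are hypothetical and are not taken from the connector's code.

```python
import random


def read_users(count, state=None, seed=None):
    """Return (`records`, `new_state`), continuing ids after the stored cursor."""
    # The cursor is the last id emitted by the previous sync (0 if none).
    start = (state or {}).get("cursor", 0)
    records = []
    for user_id in range(start + 1, start + count + 1):
        # Seeding per id means a given id always maps to the same row,
        # regardless of which sync produced it.
        rng = random.Random(None if seed is None else seed + user_id)
        records.append({"id": user_id, "age": rng.randint(18, 90)})
    return records, {"cursor": start + count}


# Two sequential syncs: the second resumes where the first stopped.
first_sync, state = read_users(3, seed=42)
second_sync, state = read_users(3, state=state, seed=42)
```

Here the first sync emits ids 1 to 3 and the second emits ids 4 to 6, so a destination that appends records never sees the same id twice.
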

evantahler commented Apr 14, 2022

/test connector=connectors/source-faker

🕑 connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2168805305
❌ connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2168805305
🐛 https://gradle.com/s/bp4cwk2jdkncc

evantahler commented Apr 14, 2022

/test connector=connectors/source-faker

🕑 connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2168885822
✅ connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2168885822
Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        74      6    92%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/utils/common.py                  70     17    76%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/tests/test_core.py              285    106    63%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
source_acceptance_test/tests/test_incremental.py        69     38    45%
------------------------------------------------------------------------
TOTAL                                                  886    259    71%
Name                       Stmts   Miss  Cover
----------------------------------------------
source_faker/__init__.py       2      0   100%
source_faker/source.py        48      2    96%
----------------------------------------------
TOTAL                         50      2    96%

Python short test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.7/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
======================== 20 passed, 1 skipped in 15.61s ========================

evantahler commented Apr 14, 2022

/test connector=connectors/source-faker

🕑 connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2168899496
✅ connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2168899496
Python tests coverage:

Name                                                 Stmts   Miss  Cover
------------------------------------------------------------------------
source_acceptance_test/utils/__init__.py                 6      0   100%
source_acceptance_test/tests/__init__.py                 4      0   100%
source_acceptance_test/__init__.py                       2      0   100%
source_acceptance_test/tests/test_full_refresh.py       52      2    96%
source_acceptance_test/utils/asserts.py                 37      2    95%
source_acceptance_test/config.py                        74      6    92%
source_acceptance_test/utils/json_schema_helper.py     105     13    88%
source_acceptance_test/utils/common.py                  70     17    76%
source_acceptance_test/utils/compare.py                 62     23    63%
source_acceptance_test/tests/test_core.py              285    106    63%
source_acceptance_test/base.py                          10      4    60%
source_acceptance_test/utils/connector_runner.py       110     48    56%
source_acceptance_test/tests/test_incremental.py        69     38    45%
------------------------------------------------------------------------
TOTAL                                                  886    259    71%
Name                       Stmts   Miss  Cover
----------------------------------------------
source_faker/__init__.py       2      0   100%
source_faker/source.py        48      2    96%
----------------------------------------------
TOTAL                         50      2    96%

Python short test summary info:

=========================== short test summary info ============================
SKIPPED [1] ../usr/local/lib/python3.7/site-packages/source_acceptance_test/plugin.py:56: Skipping TestIncremental.test_two_sequential_reads because not found in the config
======================== 20 passed, 1 skipped in 16.37s ========================

evantahler commented Apr 14, 2022

/publish connector=connectors/source-faker

🕑 connectors/source-faker https://github.com/airbytehq/airbyte/actions/runs/2169591843
🚀 Successfully published connectors/source-faker
❌ Couldn't auto-bump version for connectors/source-faker

@evantahler evantahler temporarily deployed to more-secrets April 14, 2022 21:26 Inactive
@evantahler evantahler temporarily deployed to more-secrets April 14, 2022 21:26 Inactive
@evantahler evantahler merged commit 8293ce3 into master Apr 14, 2022
@evantahler evantahler deleted the faker-source branch April 14, 2022 22:21
suhomud pushed a commit that referenced this pull request May 23, 2022
* Faker WIP

* Update catalog to handle dates better

* Adding unit tests for faker source

* WIP  - tests mostly passing

* add docs

* bump python version and fix unit tests

* test array types

* remove comment

* better python map

* update `moduleDirectory`

* simplify initialization

* use `ConfiguredAirbyteCatalog` in test rather than custom dict class

* Tests passing by using deterministic time

* Bump birthdays

* Update airbyte-integrations/connectors/source-faker/integration_tests/acceptance.py

Co-authored-by: George Claireaux <george@claireaux.co.uk>

* remove bootstrap and stronger types

* better incremental support

* fixup un-used imports

* bump to test codecov

* Add connector to metadata files

Co-authored-by: George Claireaux <george@claireaux.co.uk>
Labels: area/connectors (Connector related issues), area/documentation (Improvements or additions to documentation), team/extensibility

4 participants