Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Source Facebook Marketing: SAT fails on schema validation #4827

Closed
keu opened this issue Jul 19, 2021 · 10 comments · Fixed by #4978
Closed

Source Facebook Marketing: SAT fails on schema validation #4827

keu opened this issue Jul 19, 2021 · 10 comments · Fixed by #4978

Comments

@keu
Copy link
Contributor

keu commented Jul 19, 2021

The problem with schema is that FB wraps numeric values into strings.
The schema says it is null or numeric but in the record it is "1234".

Steps to Reproduce

Run SAT on the latest FB connector
UPDATE: use this comment for instructions #4827 (comment)

Possible fix

  1. update schemas - set type = ['null', 'string'], this is less desirable solution because destination will loose any type info.
  2. do transform while reading inside streams, according to schema type. i.e. 'numeric' -> int, float
@keu keu added the type/bug Something isn't working label Jul 19, 2021
@sherifnada sherifnada added the area/connectors Connector related issues label Jul 20, 2021
@sherifnada sherifnada added this to the Connectors August 6th milestone Jul 23, 2021
@Zirochkaa Zirochkaa self-assigned this Jul 26, 2021
@Zirochkaa
Copy link
Contributor

I can't reproduce this bug because of the following error:

image

The above error was also occurring in these 3 failed builds:
image

@Zirochkaa
Copy link
Contributor

Zirochkaa commented Jul 26, 2021

I had to change few things in order to run tests and all tests passed, see screenshot. Also I linked a PR with changes I made to tests. So I can't reproduce this issue.

Lonk to zira.

@Zirochkaa
Copy link
Contributor

I commented out validate_schema: no line in acceptance-test-config.yml file and still wasn't able to reproduce this issue.

image

@keu
Copy link
Contributor Author

keu commented Jul 26, 2021

Ok, it seems our config bypass the dates with records, so all problematic streams are empty. To reproduce this use the following config:

{
  "start_date": "2021-02-01T00:00:00Z",
  "account_id": "your_account",
  "access_token": "your_token",
  "include_deleted": false,
  "insights_lookback_window": 14,
  "insights_days_per_job": 1,
}
ERROR    root:test_core.py:136 The ad_sets stream has the following schema errors:
'0' is not of type 'null', 'number'

Failed validating 'type' in schema['properties']['daily_budget']:
    {'exclusiveMaximum': True,
     'exclusiveMinimum': True,
     'maximum': 100000000000000000000000000000000,
     'minimum': -100000000000000000000000000000000,
     'multipleOf': 1e-06,
     'type': ['null', 'number']}

On instance['daily_budget']:
    '0'
--------------------------------------------------------------------------------
'5000' is not of type 'null', 'number'

Failed validating 'type' in schema['properties']['budget_remaining']:
    {'exclusiveMaximum': True,
     'exclusiveMinimum': True,
     'maximum': 100000000000000000000000000000000,
     'minimum': -100000000000000000000000000000000,
     'multipleOf': 1e-06,
     'type': ['null', 'number']}

On instance['budget_remaining']:
    '5000'
--------------------------------------------------------------------------------

in the end

        if streams_errors:
>           pytest.fail(f"Please check your json_schema in selected streams {tuple(streams_errors.keys())}.")
E           Failed: Please check your json_schema in selected streams ('ad_sets', 'ads_insights', 'ads_insights_age_and_gender').

The tests passed when I set validate_schema: no
@Zirochkaa

@keu
Copy link
Contributor Author

keu commented Jul 26, 2021

@Zirochkaa please check this PR to get the idea
#4994

@keu keu removed their assignment Jul 26, 2021
@Zirochkaa
Copy link
Contributor

@keu which config do we have in GitHub secrets for Facebook Marketing source?

@Zirochkaa
Copy link
Contributor

Zirochkaa commented Aug 3, 2021

We need to:

  • Create transform() function and apply it on each record (~10h)
  • Check that transform() works for 3 above mentioned streams (~2h)

@Zirochkaa
Copy link
Contributor

We decided to use this issue as an experiment to write a generic class for performing transformations of record so it'll require more time than the estimate above.

@Zirochkaa
Copy link
Contributor

With latest changes from master branch SAT tests are failing:

image

@Zirochkaa
Copy link
Contributor

Blocked on #5592.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants