📝 (providers_google) add a location check #19571

Merged
merged 2 commits into from
Feb 13, 2022

Conversation

david30907d
Contributor

@david30907d david30907d commented Nov 13, 2021

If you don't set location, get_records() fails with a 'returned "Not found: Job xxx"' exception.
closes: #19570

Although BigQueryHook says location is optional, it turns out to be required in some scenarios.
code

When we invoke get_records() (here), it ends up calling run_query() to get the job id (here).

Another alternative is to raise a missing-location exception in run_query().

Thoughts on this?
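A minimal sketch of the run_query() alternative, assuming a simplified stand-in class rather than the real hook. FakeBigQueryCursor, MissingLocationError, and the placeholder job id are hypothetical names used only for illustration; the actual provider code is structured differently.

```python
from typing import Optional


class MissingLocationError(Exception):
    """Hypothetical error type; the name is not taken from the Airflow code base."""


class FakeBigQueryCursor:
    """Simplified stand-in for the hook's cursor, for illustration only."""

    def __init__(self, location: Optional[str] = None) -> None:
        self.location = location

    def run_query(self, sql: str) -> str:
        # Fail before any job is submitted, rather than letting the later
        # job lookup come back as "Not found: Job ...".
        if self.location is None:
            raise MissingLocationError(
                "location must be set before running a query"
            )
        # The real method would submit the job here; this stub only
        # returns a placeholder job id.
        return "fake_job_id"
```

With this shape, a cursor created without a location raises as soon as run_query() is called, while one created with location="US" proceeds to the (stubbed) job submission.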

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1157, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1331, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1361, in _execute_task
    result = task_copy.execute(context=context)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/operators/python.py", line 150, in execute
    return_value = self.execute_callable()
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/operators/python.py", line 161, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/opt/airflow/dags/dags/utils/others/subscription_related.py", line 112, in wrapper
    return func(*args, **kwargs)
  File "/opt/airflow/dags/dags/utils/extractors/platform_data_extractors/shopify_extractor.py", line 75, in wrapper
    return func(*args, **kwargs)
  File "/opt/airflow/dags/dags/utils/extractors/platform_data_extractors/shopify_extractor.py", line 1019, in add_abandoned
    abandoned_checkouts_of_this_page = _parse_this_page(response_json)
  File "/opt/airflow/dags/dags/utils/extractors/platform_data_extractors/shopify_extractor.py", line 980, in _parse_this_page
    persons_queried_by_checkout_id = db_hook.get_records(
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/hooks/dbapi.py", line 135, in get_records
    return cur.fetchall()
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 2886, in fetchall
    one = self.fetchone()
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 2811, in fetchone
    return self.next()
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/bigquery.py", line 2827, in next
    self.service.jobs()
  File "/home/airflow/.local/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 134, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/googleapiclient/http.py", line 915, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 404 when requesting https://bigquery.googleapis.com/bigquery/v2/projects/xxxxxx/queries/airflow_xxxxx?alt=json returned "Not found: Job xxxxx:xxxxx


@boring-cyborg boring-cyborg bot added area:providers provider:google Google (including GCP) related issues labels Nov 13, 2021
@potiuk
Member

potiuk commented Nov 14, 2021

Could you please add a unit test for that ?

Comment on lines 2635 to 2636
if self.location is None:
    raise Exception("Need to specify location when instantiating BigQueryHook, otherwise it would result in Job Not Found error!")
Member

Instead of failing at runtime, can we detect this in __init__ and fail the entire DAG when location is invalid? Or is location=None a valid value for certain use cases?

Contributor Author

Yeah, location=None works for other functions; it seems to fail only when invoking get_records() ...
Btw, I might need to postpone this PR; I'll try to finish it by the end of Nov.
Will ping @uranusjr for help 😂

Member

OK this sounds like a good reason.

Could you change this to raise AirflowException (or a subclass, e.g. BigQueryLocationUnset) instead? This pattern is more common in Airflow than raising a bare Exception.

I’d do

raise BigQueryLocationUnset("Parameter 'location' is required to fetch records with BigQuery hook")
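A rough sketch of how that suggestion might look, not the code that was merged. BigQueryLocationUnset is the name proposed above; require_location is a hypothetical helper added only to make the example self-contained, and the fallback AirflowException class is a local stand-in used when Airflow is not installed.

```python
from typing import Optional

try:
    from airflow.exceptions import AirflowException
except ImportError:
    class AirflowException(Exception):
        """Local stand-in so the sketch runs without Airflow installed."""


class BigQueryLocationUnset(AirflowException):
    """Raised when records are fetched without a configured location."""


def require_location(location: Optional[str]) -> str:
    """Hypothetical helper: return the location or raise if it is unset."""
    if location is None:
        raise BigQueryLocationUnset(
            "Parameter 'location' is required to fetch records with BigQuery hook"
        )
    return location
```

Because the subclass derives from AirflowException, callers that already catch AirflowException keep working, while a unit test can assert on the more specific type.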

Contributor Author

Sure, will do. Thx for the inputs 🎉

@david30907d
Contributor Author

Could you please add a unit test for that ?

sure~

@github-actions

github-actions bot commented Jan 1, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Jan 1, 2022
@turbaszek
Member

@david30907d are you still willing to work on this one?

@david30907d
Contributor Author

david30907d commented Jan 3, 2022

@david30907d are you still willing to work on this one?

Hi @turbaszek, probably not 😅
I'm afraid I have to focus on this one first. Sorry about that.

btw, if you get a chance, would you check this problem please? 🙏
#19508 (comment)

@github-actions github-actions bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Jan 4, 2022
uranusjr
uranusjr previously approved these changes Jan 5, 2022
@github-actions github-actions bot added the okay to merge It's ok to merge this PR as it does not require more tests label Jan 5, 2022
@github-actions

github-actions bot commented Jan 5, 2022

The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.

@potiuk potiuk closed this Jan 8, 2022
@potiuk potiuk reopened this Jan 8, 2022
@eladkal
Contributor

eladkal commented Feb 6, 2022

@david30907d can you resolve conflicts and rebase?

@eladkal
Contributor

eladkal commented Feb 13, 2022

@david30907d can you resolve conflicts and rebase so we can merge this PR?

@david30907d
Contributor Author

@david30907d can you resolve conflicts and rebase so we can merge this PR?

oh so sorry I missed this thread, on it!

david30907d and others added 2 commits February 13, 2022 22:56
if you didn't, you'll get this `returned "Not found: Job xxx"` exception
@david30907d david30907d force-pushed the bq-get_records branch 2 times, most recently from 41d4225 to e82c4fa Compare February 13, 2022 14:58
@david30907d
Contributor Author

@eladkal sorry for the late reply, I finally figured out how to rebase 😅 (I'm a git noob)

@potiuk
Member

potiuk commented Feb 13, 2022

@eladkal sorry for the late reply, I finally figured out how to rebase 😅 (I'm a git noob)

As of last week you can EASILY rebase with the GitHub UI:

[image]

Successfully merging this pull request may close these issues.

get_records() don't work out for BigQueryHook