Amazon Glue job provider not printing logs when a job completes or fails #26196

Closed
nikhi-suthar opened this issue Sep 7, 2022 · 5 comments
Labels: area:providers, good first issue, kind:bug, provider:amazon-aws

Comments


nikhi-suthar commented Sep 7, 2022

Apache Airflow Provider(s)

amazon

Versions of Apache Airflow Providers

5.0.0

Apache Airflow version

>= 2.3.2

Operating System

Linux

Deployment

Docker-Compose

Deployment details

docker-compose

What happened

The job_completion method of GlueJobHook calls print_job_logs in a finally block that is never reached once the job finishes: when the job completes successfully, the method returns a value via a return statement (which will not execute the finally block), and when the job fails, it raises an exception, which will also not execute the finally block.
Because of that, Airflow does not show the Glue job logs from CloudWatch.


def job_completion(self, job_name: str, run_id: str, verbose: bool = False) -> Dict[str, str]:
        """
        Waits until Glue job with job_name completes or
        fails and return final state if finished.
        Raises AirflowException when the job failed
        :param job_name: unique job name per AWS account
        :param run_id: The job-run ID of the predecessor job run
        :param verbose: If True, more Glue Job Run logs show in the Airflow Task Logs.  (default: False)
        :return: Dict of JobRunState and JobRunId
        """
        failed_states = ['FAILED', 'TIMEOUT']
        finished_states = ['SUCCEEDED', 'STOPPED']
        next_log_token = None
        job_failed = False

        while True:
            try:
                job_run_state = self.get_job_state(job_name, run_id)
                if job_run_state in finished_states:
                    self.log.info('Exiting Job %s Run State: %s', run_id, job_run_state)
                    return {'JobRunState': job_run_state, 'JobRunId': run_id}
                if job_run_state in failed_states:
                    job_failed = True
                    job_error_message = f'Exiting Job {run_id} Run State: {job_run_state}'
                    self.log.info(job_error_message)
                    raise AirflowException(job_error_message)
                else:
                    self.log.info(
                        'Polling for AWS Glue Job %s current run state with status %s',
                        job_name,
                        job_run_state,
                    )
                    time.sleep(self.JOB_POLL_INTERVAL)
            finally:
                if verbose:
                    next_log_token = self.print_job_logs(
                        job_name=job_name,
                        run_id=run_id,
                        job_failed=job_failed,
                        next_token=next_log_token,
                    )

What you think should happen instead

It should print the logs in all cases (failure or success) when verbose=True.

How to reproduce

Use the latest version of the Amazon provider (apache-airflow-providers-amazon==5.0.0) and create an Airflow task for any Glue job.
Make sure to pass verbose=True to GlueJobOperator, as below:

from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

# Glue job task; with verbose=True the Glue run's CloudWatch logs should appear in the Airflow task log.
job_task = GlueJobOperator(
    task_id="testJob",
    job_name="<Glue job name>",
    job_desc="Test Job",
    region_name="ap-south-1",
    verbose=True,
    script_location="s3://..//../",
    num_of_dpus=2,
)

Anything else

This is a code bug that results in no CloudWatch logs being printed in any case. I will provide a solution along with an enhancement for continuous logging.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

  • I agree to follow this project's Code of Conduct

nikhi-suthar added the area:providers and kind:bug labels on Sep 7, 2022

boring-cyborg bot commented Sep 7, 2022

Thanks for opening your first issue here! Be sure to follow the issue template!

o-nikolas (Contributor) commented

when job get completed successfully it return value by using return statement (that will not execute finally block) and similarly in case of failure it will raise exception that will also not execute the finally block.

Is this really true? From Python docs here:

If a finally clause is present, the finally clause will execute as the last task before the try statement completes. The finally clause runs whether or not the try statement produces an exception. The following points discuss more complex cases when an exception occurs:

  • If an exception occurs during execution of the try clause, the exception may be handled by an except clause. If the exception is not handled by an except clause, the exception is re-raised after the finally clause has been executed.
  • An exception could occur during execution of an except or else clause. Again, the exception is re-raised after the finally clause has been executed.
  • If the try statement reaches a break, continue or return statement, the finally clause will execute just prior to the break, continue or return statement’s execution.
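
A minimal standalone sketch (plain Python with hypothetical helper functions, not code from the provider) illustrates the quoted behavior: the finally clause runs both when the try block returns and when it raises.

# Sketch only: hypothetical helpers, no Airflow involved.
def returns_early() -> str:
    try:
        return "SUCCEEDED"
    finally:
        # Runs just before the return value is handed to the caller.
        print("finally ran on return")


def raises() -> None:
    try:
        raise RuntimeError("Exiting Job Run State: FAILED")
    finally:
        # Runs before the exception propagates to the caller.
        print("finally ran on raise")


print(returns_early())       # "finally ran on return", then "SUCCEEDED"
try:
    raises()
except RuntimeError as err:  # "finally ran on raise" is printed before this handler
    print(f"caught: {err}")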

ferruzzi (Contributor) commented Oct 17, 2022

As Niko said, the "finally" should be executed whether there is an exception or not:

(screenshot attached in the original comment; not reproduced here)

o-nikolas (Contributor) commented

Hey @nikhi-suthar,

Is this still a relevant issue? This issue is quite old, and the few PRs that were linked to it have been closed.

o-nikolas (Contributor) commented

Not seeing any action on this one; if the issue arises again, we can re-open.
