Emit "logs not found" message when ES logs appear to be missing #21261

Merged (7 commits) on Feb 7, 2022

dstandish
Contributor

The current ES log handler waits up to 5 minutes for logs to appear (or for more logs to appear after the last log message was emitted). This produces undesirable behavior when the logs have been deleted from the Elasticsearch cluster: a user may wait a long time thinking that logs are coming when they are not.

To resolve this, if no logs whatsoever have been retrieved after 5 seconds of trying, we give up and emit a "logs not found" message.

If the task has only just started, this may be a "false negative", and we guide the user to refresh if they think that might be the case.
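
The decision described above can be sketched roughly as follows. This is a minimal illustration and not the handler's actual code: the function name `should_stop_polling`, the constant names, and the wording of the message are all assumptions made for the example; only the 5-second and 5-minute thresholds come from the description.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Thresholds taken from the description above; the names are hypothetical.
NO_LOGS_TIMEOUT = timedelta(seconds=5)
MORE_LOGS_TIMEOUT = timedelta(minutes=5)

# Hypothetical wording, not the message the handler actually emits.
NOT_FOUND_MESSAGE = (
    "*** Log not found. If the task has only just started, "
    "try refreshing in a moment."
)


def should_stop_polling(
    logs_seen: int,
    started_polling: datetime,
    last_log_at: Optional[datetime],
    now: datetime,
) -> Tuple[bool, Optional[str]]:
    """Return (stop, message) for one polling iteration."""
    # Nothing at all retrieved after a short grace period: give up early.
    if logs_seen == 0:
        if now - started_polling > NO_LOGS_TIMEOUT:
            return True, NOT_FOUND_MESSAGE
        return False, None
    # Some logs were seen: keep waiting for more, up to the long timeout.
    reference = last_log_at or started_polling
    if now - reference >= MORE_LOGS_TIMEOUT:
        return True, None
    return False, None
```

The early exit only applies when no logs have been retrieved at all, so a task that is actively logging still gets the full 5-minute window between messages.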

Member

uranusjr commented Feb 3, 2022

Logic lgtm.

dstandish and others added 6 commits February 2, 2022 23:40
Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
Comment on lines +42 to +52
def get_ti(dag_id, task_id, execution_date, create_task_instance):
    ti = create_task_instance(
        dag_id=dag_id,
        task_id=task_id,
        execution_date=execution_date,
        dagrun_state=DagRunState.RUNNING,
        state=TaskInstanceState.RUNNING,
    )
    ti.try_number = 1
    ti.raw = False
    return ti
Member

How about turning this into a fixture?

@pytest.fixture()
def create_running_task_instance(create_task_instance):
    def _create_ti(**kwargs):
        ti = create_task_instance(
            dagrun_state=DagRunState.RUNNING,
            state=TaskInstanceState.RUNNING,
            **kwargs,
        )
        ti.try_number = 1
        ti.raw = False
        return ti

    return _create_ti

Contributor Author

it was a fixture but i pulled it out so i could make a TI with diff params...

but i guess you can parametrize a fixture like so?

Member

You can, but it generally shouldn’t be needed; it’s easier to make the fixture return a function that takes arguments instead (like this one here). I searched your changes, and this implementation seems to be good enough for the usages in this PR. You’d do:

@pytest.fixture()
def ti(self, create_running_task_instance):
    yield create_running_task_instance(
        dag_id=self.DAG_ID,
        task_id=self.TASK_ID,
        execution_date=self.EXECUTION_DATE,
    )
    clear_db_runs()
    clear_db_dags()
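
To illustrate how a factory of this shape lets a single test pick a different execution date, here is a small standalone sketch. It deliberately runs without pytest or Airflow: `fake_create_task_instance` is a hypothetical stand-in for the real `create_task_instance` fixture, the state values are plain strings rather than `DagRunState`/`TaskInstanceState`, and the DAG and task ids are made up. In a real test module the outer function would carry `@pytest.fixture()` as in the suggestion above.

```python
from datetime import datetime
from types import SimpleNamespace


def fake_create_task_instance(**kwargs):
    """Hypothetical stand-in for the real create_task_instance fixture."""
    return SimpleNamespace(**kwargs)


def create_running_task_instance(create_task_instance):
    """Factory: returns a function, so each caller chooses its own arguments."""

    def _create_ti(**kwargs):
        ti = create_task_instance(
            dagrun_state="running",  # placeholder for DagRunState.RUNNING
            state="running",  # placeholder for TaskInstanceState.RUNNING
            **kwargs,
        )
        ti.try_number = 1
        ti.raw = False
        return ti

    return _create_ti


make_ti = create_running_task_instance(fake_create_task_instance)

# The shared fixture would build its TI with the class-level defaults...
default_ti = make_ti(
    dag_id="dag_for_testing",
    task_id="task_for_testing",
    execution_date=datetime(2016, 1, 1),
)
# ...while an individual test can ask for a different execution date.
other_ti = make_ti(
    dag_id="dag_for_testing",
    task_id="task_for_testing",
    execution_date=datetime(2022, 2, 1),
)
```

Both task instances go through the same setup (`try_number`, `raw`, running state), so existing tests can keep using the shared `ti` fixture while only the test that needs a different date calls the factory directly.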

Contributor Author

i'm confused @uranusjr
i'm not seeing how this fixture helps me.
there is already a fixture here like this now. i just pulled out a portion of it (and still use it in the existing fixture) but i just want to be able to specify a different execution date in my specific test than the one used by the fixture.

Contributor Author

if you are saying i should just change the fixture so that it returns a create_ti(execution_date) function and update all the other tests to call that function (instead of just using a returned TI) then i can do that -- lemme know

github-actions bot added the "okay to merge" label Feb 7, 2022

github-actions bot commented Feb 7, 2022

The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.

Labels
area:logging, area:providers, okay to merge (It's ok to merge this PR as it does not require more tests)
4 participants