Log filename template records #20165
airflow/models/dagrun.py
Outdated
    @provide_session
    def get_log_filename_template(self, *, session: Session = NEW_SESSION) -> Optional[str]:
        if self.log_filename_id is None:  # DagRun created before LogFilename introduction.
            template = session.query(LogFilename.template).order_by(LogFilename.id).limit(1).scalar()
        else:
            template = session.query(LogFilename.template).filter_by(id=self.log_filename_id).one_or_none()
        if template is not None:
            return template
        return airflow_conf.get("logging", "LOG_FILENAME_TEMPLATE")
I’m not sure if setting up a relationship (for caching) is worthwhile. Maybe?
        sa.Column("timestamp", sa.UtcDateTime, nullable=False),
    )
    with op.batch_alter_table("task_instance") as batch_op:
        batch_op.add_column(sa.Column("log_filename_id"))
I think the `server_default` (or equivalent) needs to go here for it to have any effect: https://docs.sqlalchemy.org/en/13/core/defaults.html#server-defaults
> A variant on the SQL expression default is the Column.server_default, which gets placed in the CREATE TABLE statement during a Table.create() operation
I switched to using `default` instead. If I read the SQLAlchemy docs correctly, supplying an expression would make it use the expression as a subquery in `INSERT INTO`?
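To make the distinction discussed above concrete, here is a minimal sketch in plain SQL through the standard-library `sqlite3` module. It is not SQLAlchemy's API and not the PR's migration: the table names `run_with_server_default` and `run_with_insert_default` are hypothetical, and the template values are placeholders. It only illustrates the difference between a default stored in the `CREATE TABLE` statement and a value computed by a subquery inside each `INSERT INTO`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE log_filename (id INTEGER PRIMARY KEY, template TEXT NOT NULL);
    INSERT INTO log_filename (template) VALUES ('first.log'), ('second.log');

    -- Server-side default: stored in the DDL. SQLite does not accept a
    -- subquery here, so the default is a fixed constant.
    CREATE TABLE run_with_server_default (
        id INTEGER PRIMARY KEY,
        log_filename_id INTEGER DEFAULT 1
    );

    -- No DDL default: every INSERT has to supply the value itself.
    CREATE TABLE run_with_insert_default (
        id INTEGER PRIMARY KEY,
        log_filename_id INTEGER
    );
    """
)

# The database fills in the constant from the CREATE TABLE statement.
conn.execute("INSERT INTO run_with_server_default DEFAULT VALUES")

# The value is computed by a subquery embedded in the INSERT statement,
# so it follows whatever the newest log_filename row is at insert time.
conn.execute(
    "INSERT INTO run_with_insert_default (log_filename_id) "
    "VALUES ((SELECT max(id) FROM log_filename))"
)

server_side = conn.execute(
    "SELECT log_filename_id FROM run_with_server_default"
).fetchone()[0]
insert_side = conn.execute(
    "SELECT log_filename_id FROM run_with_insert_default"
).fetchone()[0]
```

With two rows in `log_filename`, the DDL default keeps pointing at the constant while the insert-time subquery picks up the newest row.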
Force-pushed from 3795d9a to 3082d5a.
Test failures are likely unrelated (one flaky scheduler test failure as usual, the MSSQL job exploded). This should be good to go.
Can you rebase on main and fix conflicts too plz
Oh because I just merged a PR with model changes…
Force-pushed from 3082d5a to f5ce0e4.
airflow/models/tasklog.py
Outdated
    id = Column(Integer, primary_key=True)
    template = Column(Text, nullable=False)
    timestamp = Column(UtcDateTime, nullable=False, default=timezone.utcnow)
This is just informational, right? Call it `created_at` maybe?
Yeah, only informational. The name is borrowed from `XCom`; I like `created_at` more myself as well.
LGTM now, one possible column to rename
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
Force-pushed from f5ce0e4 to 2c608d9.
Eh MSSQL does not support
Oh yeah. Unfortunately, SQLAlchemy's "common" interface starts to break easily once we use more and more sophisticated DB functionality.
Force-pushed from 2c608d9 to dfb5c3e.
Force-pushed from dfb5c3e to d867d0e.
Changed to
See #19058 (comment) and #19625 for context. This is the main prerequisite to make it possible for us to change the `log_filename_template` config’s default value.

This adds a new table `LogFilename`, and whenever an Airflow command is run (except those that explicitly set `check_db=False`), the user config value of `log_filename_template` is synced into the table. Each `DagRun` gets a new `log_filename_id` foreign key (populated on creation) that can be used to look up which template it uses to render task log filenames. All existing DagRun rows set this value to NULL (for performance reasons), and internally this makes them all use the first row in `LogFilename`, which should be the value in use when a user upgrades to 2.3.

The first commit is from #20163; that one needs to be merged first. Merged.

Edit: Oh, and this still needs tests, and an entry in `UPDATING.md`.