Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-16871][runtime] Make more build time information available at runtime #11592

Merged
merged 1 commit into from May 6, 2020

Conversation

nielsbasjes
Copy link
Contributor

@nielsbasjes nielsbasjes commented Apr 1, 2020

FLINK-16871: Make more build time information available at runtime

What is the purpose of the change

As discussed here additional build time information is needed at runtime. This pull request is to make this specific change a separate thing.

Brief change log

  • Generate a Version.java file from maven with several properties.
  • Update the git-commit-id-plugin to the latest version (only the mentioned 2.1.14 was problematic). And updated the url in the comment to the git-commit-id-plugin github repo.
  • By fixating the datetime format in the git-commit-id-plugin these can be reliably exposed as returning an Instant. This makes it possible to reliably format these timestamps in any form we like. The code has been structured in such a way that as little as possible memory remains allocated. For the latest commit and build time a String version of those timestamps is also available and those are formatted using the Europe/Berlin timezone (the origin of Flink) and thus independent of the locale/timezone of the system that created the build.
  • The .version.properties is no longer generated as this was part of the old solution and it is not used anywhere else in the code base.

Verifying this change

This change added a few basic test steps.

Verifying the actual values is not possible because after commit and on each build the values will (and should) be different. Also checking if the commit time is not older than X would mean that a working version of the software would fail the build a while later.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): yes: build time plugin is updated
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no, not really A few more getters are now available in the EnvironmentInformation which only affects developers who need this information.
  • If yes, how is the feature documented? JavaDocs

@flinkbot
Copy link
Collaborator

flinkbot commented Apr 1, 2020

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit 5e4d9cc (Wed Apr 01 08:35:50 UTC 2020)

Warnings:

  • 3 pom.xml files were touched: Check for build and licensing issues.
  • No documentation files were touched! Remember to keep the Flink docs up to date!
  • This pull request references an unassigned Jira ticket. According to the code contribution guide, tickets need to be assigned before starting with the implementation work.

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Copy link
Collaborator

flinkbot commented Apr 1, 2020

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

@nielsbasjes nielsbasjes changed the title [WIP] [FLINK-16871][runtime] Make more build time information available at runtime [FLINK-16871][runtime] Make more build time information available at runtime Apr 1, 2020
@nielsbasjes
Copy link
Contributor Author

nielsbasjes commented Apr 1, 2020

On my machine the mvn clean verify passed.
It took 2:21h

@nielsbasjes
Copy link
Contributor Author

@azagrebin / @wangyang0918 Would you be so kind to have a look at this.
This is purely the "Environment variable" part we discussed.

@azagrebin
Copy link
Contributor

Thanks for splitting this @nielsbasjes

As I understand, the current PR still requires running code generation otherwise the runtime module is broken. I agree that e.g. IntelliJ Idea has a button in maven panel to generate sources which takes some time, at least initially. Nonetheless, this will take time for people to adopt to this.

As mentioned in the original discussion, we could try approach where we have a Version.java file in our code base with "unknown" versions and pack the real versions into the built artefacts.

@zentol might also have further opinion about this.

Copy link
Contributor

@zentol zentol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should not be required to first generate some code before we can do anything in the IDE. This will be super annoying when switching branches.

pom.xml Show resolved Hide resolved
flink-runtime/pom.xml Outdated Show resolved Hide resolved
@nielsbasjes
Copy link
Contributor Author

@azagrebin @zentol About generating the Version.java and switching branches.
If you look at the rest of the Flink code base you'll find that in several places things are generated in the generate-sources phase. I see a lot of places using things like Protobuf, Avro and an SQL parser that all rely on this same mechanism and the same way of working.

So in general the problems you are addressing already exist in a large scale in the project. You cannot simply switch branch and then successfully run the project without doing a mvn clean generate-sources. I see this in almost all of my own projects and also in almost all Apache projects: code needs to be generated before the project can be compiled and tested.

Putting a question back: What implementation approach do you recommend?

@zentol
Copy link
Contributor

zentol commented Apr 8, 2020

You cannot simply switch branch and then successfully run the project without doing a mvn clean generate-sources.

This isn't quite accurate. By virtue of us rarely adding code generation and generally not changing it, it over time becomes feasible to switch between the currently supported version branches.
Naturally this goes out the window the moment we add a new code generation, hence this discussion.

Putting a question back: What implementation approach do you recommend?

So, I only briefly looked into it.

My original idea was to have a normal Version.java in the source, to be modified during the build only, but you'd have to jump through quite a few hoops to make this work. (Exclude the vanilla version from the source, add the modified file into a custom directory so IntelliJ doesn't complain, add this directory to the list of source directories). So this isn't really feasible, which is unfortunate, and a bit surprising since the use-case is fairly simple.

One approach we could take though is similar to the old one we had.
Since we're just interested in exposing maven properties we should be able to add a text file into src/main/resources containing lines like PROJECT_VERSION: ${project.version} and configure maven to use filtering for this resource.

<project>
  <build>
    <resources>
        <resource>
            <directory>src/main/resources</directory>
            <filtering>true</filtering>
        </resource>
    </resources>
  </build>
</project>

Then just parse it back. This could even be a JSON file that we throw jackson at.

However, I do realize that I may be too strict in regards to code-generation, so I would encourage you to raise the issue on the mailing list in search for supporters.

@nielsbasjes
Copy link
Contributor Author

I went back to the generating a .version.properties file which is then loaded as a resource using classloader (which I consider to be brittle and unreliable!). This way you can say 'if no file then use default' as requested. The big downside of this is that someone can (accidentally) change the code by overriding this resource by putting a different one in the classpath (which happens) and IntelliJ still messes things up occasionally. So the runtime code has to catch these problems.

Also in the original code

  1. The .version.properties is generated somewhere into src/main/... instead of target/...
    Why? Generated code (that is not committed) should never be placed in the src/main.
    I changed that to generate to target/
  2. A second .version.properties is generated somewhere into flink-dist/src/main/flink-bin/
    Why? The only place the .version.properties is referenced is in the flink-runtime.
    I consider the overriding these build time variables is a bad idea.
    I removed that.

NOTE: If you run this now from within IntelliJ it still does not generate the files correctly (i.e. IntelliJ does not follow the instructions defined in the pom.xml) which sometimes creates a version file like this (which is handled by the runtime code):

project.version=1.11-SNAPSHOT
scala.binary.version=2.11

git.commit.id=${git.commit.id}
git.commit.id.abbrev=${git.commit.id.abbrev}
git.commit.time=${git.commit.time}
git.build.time=${git.build.time}

Side note: Looking a the RevisionInformation class I found that all uses get that class and then take out the two values (or even only one of them) and throw it away again. Seems too much overhead to me.

@nielsbasjes
Copy link
Contributor Author

@zentol The patch I submitted just now does what was originally the case (i.e. the generated properties file) which has been extended with a few additional attributes. Essentially the solution you suggested just now

Although I really think the generated Version.java (as I submitted before) is the best way to do this, I fine with either solution.

Can you please review what is there now (and my previous post, which we both posted at almost exactly the same moment) so this foundation for https://issues.apache.org/jira/browse/FLINK-15794 can be used to begin that change.

Thanks.

@zentol zentol self-assigned this Apr 9, 2020
@nielsbasjes
Copy link
Contributor Author

@nielsbasjes
Copy link
Contributor Author

@zentol I have been thinking about everything that was said and I think I found a way that is in line with what was said in the mailing list.
In my opinion the file must be present and have valid values. Because these values will be used to load the correct docker image: If it has invalid values it is unsafe to continue as this will lead to bad testing.

In line with the remark by @tillrohrmann if the file is missing or badly generated the system will throw a
java.lang.IllegalStateException: The file .flink-runtime.version.properties has not been generated correctly. You MUST run 'mvn generate-sources' in the flink-runtime module.

So the file is either valid or will lead to an exception.

Please indicate if this is an acceptable solution.

@nielsbasjes nielsbasjes force-pushed the FLINK-16871-MoreBuildInfo branch 2 times, most recently from 85b8520 to 3f68dbf Compare April 20, 2020 12:47
@nielsbasjes
Copy link
Contributor Author

nielsbasjes commented Apr 28, 2020

So what it does now:

  1. There is no properties file --> everything returns a "default" value.
    These are the defaults I have chosen:
Version : <unknown>
ScalaVersion : <unknown>
BuildTime : 1970-01-01T00:00:00Z
BuildTimeString : 1970-01-01T00:00:00+0000
GitCommitId : DecafC0ffeeD0d0F00d
GitCommitIdAbbrev : DeadD0d0
GitCommitTime : 1970-01-01T00:00:00Z
GitCommitTimeString : 1970-01-01T00:00:00+0000
  1. A poorly generated properties file exists --> Only the bad values return a default.
    This happens because the maven git plugin is not run by the IDE.
    Example:
project.version=1.11-SNAPSHOT
scala.binary.version=2.11
git.commit.id=${git.commit.id}
git.commit.id.abbrev=${git.commit.id.abbrev}
git.commit.time=${git.commit.time}
git.build.time=${git.build.time}
  1. A full generated properties file exists --> Everything returns the right value.
    Example:
project.version=1.11-SNAPSHOT
scala.binary.version=2.11
git.commit.id=8e2642ec0046bb02fe9c1648b589051e66c2f956
git.commit.id.abbrev=8e2642e
git.commit.time=2020-04-28T09:22:20+0200
git.build.time=2020-04-28T10:06:42+0200

Copy link
Contributor

@zentol zentol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found some minor things, but I overall it is good.

@nielsbasjes
Copy link
Contributor Author

I did a rebase + squash to the latest master branch.

@zentol zentol force-pushed the FLINK-16871-MoreBuildInfo branch from 6bfea91 to ec92e1b Compare May 6, 2020 08:45
@zentol zentol merged commit a9f72f9 into apache:master May 6, 2020
@nielsbasjes nielsbasjes deleted the FLINK-16871-MoreBuildInfo branch May 16, 2020 09:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants