Java: Regenerate framework models automatically #7767

bmuskalla · 2022-01-27T10:17:09Z

Takes each framework model we have on main and regenerates it with the latest version of the model generator. This lets us update and evolve the models without manual intervention.

bmuskalla · 2022-01-27T10:58:54Z

.github/workflows/mad_regenerate-models.yml

+      - name: Build database
+        env:
+          SLUG: ${{ matrix.slug }}
+          REF: ${{ matrix.ref }}
+        run: |
+          mkdir dbs
+          cd repos/${REF}
+          SHORTNAME=${SLUG//[^a-zA-Z0-9_]/}
+          codeql database create --language=java ../../dbs/${SHORTNAME}


We could consider using actions/cache for this
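For readers less familiar with Bash parameter expansion, the `SHORTNAME` line in the step above removes every character that is not an ASCII letter, digit, or underscore. A rough Python equivalent is sketched below; the function name `short_name` is made up for illustration and is not part of the PR.

```python
import re


def short_name(slug: str) -> str:
    """Mirror ${SLUG//[^a-zA-Z0-9_]/}: drop everything except letters, digits, and '_'."""
    return re.sub(r"[^a-zA-Z0-9_]", "", slug)


# The slug used in the workflow matrix becomes a directory-safe database name.
print(short_name("apache/commons-io"))  # apachecommonsio
```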

bmuskalla · 2022-02-02T09:31:36Z

@tamasvajk In case you have some time, I'd appreciate it if someone could look at this action. Happy to pair if anything is unclear.

michaelnebel

This is really good work!

I have added a couple of suggestions, but these could also just be follow ups - so not anything blocking.

michaelnebel · 2022-02-04T08:04:50Z

java/ql/src/utils/model-generator/RegenerateModels.py

+import sys
+
+
+lgtmSlugToModelFile = {


Should we consider introducing an invariant instead of an explicit mapping?
The root path could be java/ql/lib/semmle/code/java/frameworks/, and the slug could then determine the subfolder and file name.
E.g. slug = "apache/commons-beanutils" -> apache/CommonsBeanutilsGenerated.qll.

This would solve the minor cohesion issue that we otherwise have to remember to do two things: update the (slug, ref) pair in the workflow and add an entry to lgtmSlugToModelFile.

I used to do that but quickly realized that we have different locations for our models. And yes, we could try to move/rename them all into the same spot, but I held back on that. I'm happy either way.

Are you thinking about models that have been auto-generated but don't follow this location pattern, or about hand-written models?

Yes, mostly because we don't have a good naming pattern (apache/commons-io -> apache/IO.qll vs apachecommonsio.qll vs apache/commonsio.qll). E.g. when we have multiple models from an organization, we keep them in folders, whereas for others we split one framework into multiple parts (still to be discussed when it comes to the generator). Generally, I agree we can extract the root path though.

Alright. Thank you for the context. If this is undecided, then just leave it as it is :-)
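For reference, the naming invariant floated earlier in this thread could look roughly like the sketch below. It is not what the PR implements (the explicit mapping was kept), and the function name `model_file_for` is hypothetical. Only the root path and the `apache/commons-beanutils` example come from the discussion.

```python
from pathlib import PurePosixPath

# Root path as proposed in the review comment.
FRAMEWORKS_ROOT = PurePosixPath("java/ql/lib/semmle/code/java/frameworks")


def model_file_for(slug: str) -> PurePosixPath:
    """Derive '<root>/<org>/<RepoInCamelCase>Generated.qll' from an '<org>/<repo>' slug."""
    org, _, repo = slug.partition("/")
    if not org or not repo:
        raise ValueError(f"expected '<org>/<repo>', got {slug!r}")
    # 'commons-beanutils' -> 'CommonsBeanutils'
    class_name = "".join(part.capitalize() for part in repo.split("-") if part)
    return FRAMEWORKS_ROOT / org / f"{class_name}Generated.qll"


print(model_file_for("apache/commons-beanutils"))
```

Note that the same rule turns `apache/commons-io` into `apache/CommonsIoGenerated.qll`, which does not match the existing `apache/IO.qll` style mentioned above. That mismatch is the reason the explicit mapping was left in place.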

michaelnebel · 2022-02-04T08:06:18Z

java/ql/src/utils/model-generator/RegenerateModels.py

+    print("============================================================")
+    print("Generating models for " + lgtmSlug)
+    print("============================================================")
+    modelFile = lgtmSlugToModelFile[lgtmSlug]


Maybe add some exception handling, since we might forget to update lgtmSlugToModelFile when adding entries to the strategy matrix in the workflow (unless the lookup is removed)?
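One possible shape for that guard is sketched below, assuming the script should stop with a readable message rather than a raw `KeyError` traceback. The helper name `model_file_for_slug` and the mapping contents are illustrative only; the real mapping lives in RegenerateModels.py.

```python
import sys


def model_file_for_slug(slug, mapping):
    """Look up the model file for a slug, exiting with a hint if the entry is missing."""
    try:
        return mapping[slug]
    except KeyError:
        known = ", ".join(sorted(mapping)) or "<none>"
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(
            f"No model file registered for slug '{slug}'. "
            f"Known slugs: {known}. "
            "Was the workflow matrix updated without updating the mapping?"
        )


# Hypothetical mapping entry, for illustration only.
example_mapping = {"apache/commons-io": "path/to/ExampleGenerated.qll"}
print(model_file_for_slug("apache/commons-io", example_mapping))
```

Failing fast here makes a forgotten mapping entry show up as a clear error in the workflow log.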

bmuskalla · 2022-02-04T08:37:59Z

@michaelnebel Good points overall on lgtmSlugToModelFile. One major consideration I had (and I'm still torn) is whether we even need the Python script at all. We could simplify the whole PR by moving everything into the action. The only downside would be that we lose the ability to regenerate the models locally (but that case is already a lot harder than it was initially, as I had to move the database building into the action). Thinking about it right now, I feel like moving this into the action makes the most sense. Locally, you can always run GenerateFlowModel.py manually and point it to the right database. Thoughts?

michaelnebel · 2022-02-04T09:03:36Z

@michaelnebel Good points overall on lgtmSlugToModelFile. One major consideration I had (and I'm still torn) is whether we even need the Python script at all. We could simplify the whole PR by moving everything into the action. The only downside would be that we lose the ability to regenerate the models locally (but that case is already a lot harder than it was initially, as I had to move the database building into the action). Thinking about it right now, I feel like moving this into the action makes the most sense. Locally, you can always run GenerateFlowModel.py manually and point it to the right database. Thoughts?

Without any experience, it is hard for me to say, but my immediate response would be to keep it as a python script, just to ease potential debugging/extensions/rewriting of the script.

tamasvajk

Looks plausible to me.

BTW, where did you test this workflow? I used to push my workflow changes first to dsp-testing to see them in action (pun intended).

tamasvajk · 2022-02-04T09:46:26Z

.github/workflows/mad_regenerate-models.yml

+        slug: ["placeholder"]
+        ref: ["placeholder"]
+        include:
+          - slug: "apache/commons-io"
+            ref: "8985de8fe74f6622a419b37a6eed0dbc484dc128"
+        exclude:
+          - slug: "placeholder"
+            ref: "placeholder"


Why do we need the placeholders here (which are then excluded)?

To create a parameterized job without the combinatorial nature of a matrix build (every axis is required to have at least one value).
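To make the "combinatorial nature" concrete: a plain matrix runs one job for every combination of axis values, so listing slugs and refs as two independent axes would also pair each slug with refs belonging to other repositories. The sketch below only models that cross product in Python. It is not how GitHub Actions is implemented, and the second slug and the `<ref-...>` values are hypothetical placeholders.

```python
from itertools import product


def cross_product_jobs(axes):
    """Expand axes the way a plain matrix does: one job per combination of values."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*(axes[n] for n in names))]


# Two independent axes (second slug and both refs are made-up placeholders).
axes = {
    "slug": ["apache/commons-io", "example/other-framework"],
    "ref": ["<ref-for-commons-io>", "<ref-for-other-framework>"],
}

# Explicit (slug, ref) pairs, which is what the include list expresses.
pairs = [
    {"slug": "apache/commons-io", "ref": "<ref-for-commons-io>"},
    {"slug": "example/other-framework", "ref": "<ref-for-other-framework>"},
]

jobs = cross_product_jobs(axes)
mismatched = [job for job in jobs if job not in pairs]
print(len(jobs), len(pairs), len(mismatched))  # 4 2 2
```

With the placeholder axes excluded and the real pairs supplied through `include`, only the intended pairs run.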

bmuskalla · 2022-02-04T10:09:35Z

Without any experience, it is hard for me to say, but my immediate response would be to keep it as a python script, just to ease potential debugging/extensions/rewriting of the script

Fair enough, keeping it for now

BTW, where did you test this workflow?
I tested it on a branch so far, but I will change it so that it also runs on any PR that changes the workflow itself.

michaelnebel

Looks good to me!

Automation to regenerate framework models

c1b5565

bmuskalla added the Java label Jan 27, 2022

bmuskalla requested a review from a team as a code owner January 27, 2022 10:17

bmuskalla added the no-change-note-required This PR does not need a change note label Jan 27, 2022

bmuskalla requested a review from adityasharad February 1, 2022 11:07

bmuskalla requested a review from michaelnebel February 3, 2022 12:29

michaelnebel previously approved these changes Feb 4, 2022

tamasvajk reviewed Feb 4, 2022

Benjamin Muskalla added 2 commits February 4, 2022 11:26

Improve error handling and refactor base path

b747391

Enable debugging action

fcaead4

bmuskalla dismissed michaelnebel’s stale review via fcaead4 February 4, 2022 10:30

Fix path expression

bc5753c

michaelnebel approved these changes Feb 4, 2022

bmuskalla merged commit eee03eb into github:main Feb 4, 2022
