Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build only changed examples and do it parallel #559

Merged
merged 1 commit into from Mar 11, 2024

Conversation

ihostage
Copy link
Member

@ihostage ihostage commented Mar 1, 2024

Main reason: decrease a build time

Target behaviour:

  1. For Pull Requests check only changed examples
  2. For long-live branches check all examples
  3. Check any example in separated job

@ihostage ihostage force-pushed the parallel-build branch 19 times, most recently from 2f97bf4 to 3cb0e06 Compare March 5, 2024 13:57
@ihostage ihostage changed the title Build examples in parallel Build only changed examples and do it parallel Mar 5, 2024
Copy link
Member Author

@ihostage ihostage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkurz Can you check/approve my idea so that I'm going to continue this work?

cmd: |
./test.sh --netty
changes:
if: github.event_name == 'pull_request'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Detect changes only for PR builds

- uses: dorny/paths-filter@v3
id: filter
with:
filters: |
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create named filter per example


finish:
name: Finish
if: github.event_name == 'pull_request'
if: ${{ github.event_name == 'pull_request' && !failure() && !cancelled() }}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Success if depends was skipped

@ihostage ihostage requested a review from mkurz March 5, 2024 14:29
@mkurz
Copy link
Member

mkurz commented Mar 5, 2024

@mkurz Can you check/approve my idea so that I'm going to continue this work?

Looking now.

gradle-tests:
name: Gradle Tests
uses: playframework/.github/.github/workflows/cmd.yml@v3
java-chatroom:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So does this mean you have to duplicate this section for each samples, like 31 times? Each time basically the same except a different sample name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right. Potentially, we can try to pass needs.changes.outputs.examples array as another dimension for matrix job, but because I need to do that only once, currently I prefer duplicate jobs per example to have a full control of build logic for every example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a reasonable cost to me to get the target. But of course it's discussable 😉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking...

find -name build.sbt | xargs sed -i 's/\/\/.enablePlugins(PlayNettyServer)/.enablePlugins(PlayNettyServer)/g'
fi
sbt ++$MATRIX_SCALA test
./gradlew -Dscala.version="$MATRIX_SCALA" check
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this needs to be duplicated 31 times?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it isn't duplicate, it's a moving from scripts folder in every examples, that I want to delete in the future 😂 But technically, you're absolutely right.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this introduced much more duplication, we have

find -name build.sbt | xargs sed -i 's/\/\/.enablePlugins(PlayNettyServer)/.enablePlugins(PlayNettyServer)/g'

just once in the test.sh file now, but now you have to add that (and the if condition) 31 times...

@mkurz
Copy link
Member

mkurz commented Mar 7, 2024

Sorry for the delay Sergey...
The problem for me is if we move everything into the yml file and remove the script files (./test.sh and the scripts within the scripts/ folder of each sample), I will not be able to run the tests locally anymore (without maintaining my own script files). I actually do run the test scripts locally quite often... (I just need to export MATRIX_SCALA=... and then run ./test.sh or if testing a single sample just run that sample's script). This way I also found the bug in Scala 3.3.2 which now even has it's own blog post..
So, personally, I would prefer to keep the script files (they will be excluded in the zip files anyway).

So what about if we do following?
Allow the test.sh script to take an additional parameter --sample=play-java-chatroom-example which you can use to pass the example to run. If --sample=... is missing all samples will run, just like before.
So, together with chatgtp I came up with following modified test.sh script (so not 100% sure if it really works because "AI" 😉):

#!/usr/bin/env bash
if [ -z "$MATRIX_SCALA" ]; then
    echo "Error: the environment variable MATRIX_SCALA is not set"
    exit 1
fi

# Initialize variables
netty=false
sample=""

# Parse command line arguments
while [[ "$#" -gt 0 ]]; do
  case $1 in
    --netty) netty=true ;;  # Set netty to true if --netty is passed
    --sample=*) sample="${1#*=}" ;;  # Extract the value right of the equal sign
    *) echo "Unknown parameter passed: $1"; exit 1 ;;
  esac
  shift  # Move to next argument
done

if [ "$netty" = true ]; then
  find -name build.sbt | xargs sed -i 's/\/\/.enablePlugins(PlayNettyServer)/.enablePlugins(PlayNettyServer)/g'
fi

if [ -n "$sample" ]; then
  # Sample is set and not empty
  pushd $sample                           && scripts/test-sbt && popd
else
  # sample not set, therefore run all samples
  pushd play-java-chatroom-example        && scripts/test-sbt && popd
  pushd play-java-compile-di-example      && scripts/test-sbt && popd
  # ...
  # ... and all other samples just like we have now
fi

It should be possble now to call the script from the workflow yml with

  • ./test.sh --netty --sample=play-java-chatroom-example if you duplicate the jobs 31 times in the yml files (which I do not like so much)
  • ./test.sh --netty --sample=$MATRIX_SAMPLE if we add the output json array as another dimension for matrix job (which I actually prefer to avoid so much duplication in the yml file and also now I think with my proposed solution this makes much more sense because we keep the logic in the sh scripts anyway, so there is not much to control anymore in the yml file...)

And now even better:
I saw you introduced a $MATRIX_SERVER, so instead of having a --netty flag we should just introduce a --backend-server=netty flag in the test.sh script. If the flag is present and has the value netty then the netty server is used, otherwise just fall back to pekko as default (if no flag passed or if its value is unrecognized like asdf).
This way we do not even have to make that if condition just to pass the --netty flag and we can build a very nice matrix to run all the 62 jobs (31 samples x 2 backend servers) with just one line in a single job (actually even more jobs will run because the matrix is even bigger with java: 17, 11, scala: 2.13.x, 3.x):

./test.sh --backend-server=$MATRIX_SERVER --sample=$MATRIX_SAMPLE

What do you think about that?
My solution keeps the scripts so I/we can still run everything locally, but also allows us to parallelize all samples executions on CI in parallel. The only thing you don't have is full control in the yml file, but you still have them in the sh files (which is not bad IMHO, so if you look the the test script it's directly in the samples folder...)

@mkurz
Copy link
Member

mkurz commented Mar 7, 2024

Aso one more idea (in addition to my last comment):

If you want we can remove some of the <sample>/scripts/test-sbt files, because some of them just are duplicates and just run sbt ++$MATRIX_SCALA test so in the bash script we could check if such a test-sbt file exists and only run it when it does exist and if not just fall back to sbt ++$MATRIX_SCALA test in the test.sh script itself.
So we keep only some test-sbt scripts which are doing specific things.
We don't have to do that, just an additional idea. WDYT?

@ihostage ihostage force-pushed the parallel-build branch 3 times, most recently from e3e4e31 to 698bd4d Compare March 7, 2024 12:16
@ihostage
Copy link
Member Author

ihostage commented Mar 7, 2024

So, Matthias, I totally agree with many of your points 👍
And I remember that you use this common script to check samples locally 😉
Can you check a new version locally again?

@ihostage
Copy link
Member Author

ihostage commented Mar 7, 2024

It should be possble now to call the script from the workflow yml with

  • ./test.sh --netty --sample=play-java-chatroom-example if you duplicate the jobs 31 times in the yml files (which I do not like so much)
  • ./test.sh --netty --sample=$MATRIX_SAMPLE if we add the output json array as another dimension for matrix job (which I actually prefer to avoid so much duplication in the yml file and also now I think with my proposed solution this makes much more sense because we keep the logic in the sh scripts anyway, so there is not much to control anymore in the yml file...)

One main reason to me of using duplicates in YAML is visualization in Action page. If we will use samples as one matrix dimension we will have a long ungrouped list of jobs and finding in this list one needed job log will be very complicated 😞
image

test.sh Outdated Show resolved Hide resolved
test.sh Outdated Show resolved Hide resolved
test.sh Show resolved Hide resolved
@mkurz
Copy link
Member

mkurz commented Mar 7, 2024

Can you check a new version locally again?

I checked locally and the scripts works for me 👍 Added some comments.

One main reason to me of using duplicates in YAML is visualization in Action page. If we will use samples as one matrix dimension we will have a long ungrouped list of jobs and finding in this list one needed job log will be very complicated 😞

For me it is ok if you add the 31 jobs manually and not adding it to the matrix if you think it's better. We will have much less duplication now anyway, it should always be just one line I think?

One more thing, which is just for you and you probably know anyway. Currently using netty together with gradle does not "work", will still use pekko. So you maybe want to make this work also, but it's up to you:

... --backend-server=netty --build-tool=gradle # this combination does not work yet

@ihostage ihostage force-pushed the parallel-build branch 3 times, most recently from 664d4fa to ce4a611 Compare March 7, 2024 14:26
@mkurz
Copy link
Member

mkurz commented Mar 7, 2024

One main reason to me of using duplicates in YAML is visualization in Action page. If we will use samples as one matrix dimension we will have a long ungrouped list of jobs and finding in this list one needed job log will be very complicated 😞
image

Actually, a bit of a stupid question, but why do you want to find a specific job within the matrix?
I think only when a job fails you click on it and then you see anyway which job it was (in the title or in the logs).
Not sure if we really are looking for a specific job very often....

@ihostage ihostage force-pushed the parallel-build branch 2 times, most recently from 524ced6 to b08484a Compare March 7, 2024 18:17
@ihostage
Copy link
Member Author

ihostage commented Mar 7, 2024

Not sure if we really are looking for a specific job very often....

Yes, but sometimes it can be useful and our workflow should be ready to this case 😄

Actually, a bit of a stupid question, but why do you want to find a specific job within the matrix?

Maybe in the future we can back to this discussion, but currently I see the next cases when separated jobs are useful:

  • gradle-build-root property. I don't remember, but it can be trouble using matrix variable as value of this property on the same level as matrix definition
  • I change a few samples and want to be sure that all changes was checked = check the name of samples on the second column in action summary page or list of jobs on the left side of page
  • Temporary add/update/remove matrix settings for one sample. I mean Java/Scala/Sbt/Gradle/Pekko/Netty settings for concrete sample. Juggling (include/exclude) settings of unified matrix isn't as I want to do and it isn't a quickly 😄
  • Simplification sequential working on New folder structure #558

@ihostage ihostage marked this pull request as ready for review March 7, 2024 18:46
.github/workflows/build-test.yml Outdated Show resolved Hide resolved
.github/workflows/build-test.yml Outdated Show resolved Hide resolved
.github/workflows/build-test.yml Outdated Show resolved Hide resolved
echo "+----------------------------+"
echo "| Executing tests using sbt |"
echo "+----------------------------+"
sbt ++$MATRIX_SCALA test docs/paradox
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You removed docs/paradox? Why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, removed. This docs had been publishing in Lightbend Developer Portal, but we don't publish it anymore. That why I think it isn't actual for us now and we don't need to check it during build. We should migrate this docs to Antora in the future 🤔

echo "+----------------------------+"
echo "| Executing tests using sbt |"
echo "+----------------------------+"
sbt ++$MATRIX_SCALA test docs/paradox
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again docs/paradox removed?

@mkurz
Copy link
Member

mkurz commented Mar 11, 2024

Hmm... why was

and

skipped?

Maybe because the java file upload sample failed and these two started later so they did not even start running because of !failure()?

Should we maybe just restart all jobs once more to see if they pass? Maybe can you rebase?

@ihostage
Copy link
Member Author

Hmm... why was

and

skipped?

Because these samples didn't change 😉

Copy link
Member

@mkurz mkurz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because these samples didn't change 😉

Yeah, that makes sense 😉

If you are finished you can merge this. I am not totally happy about that many jobs and its duplicated lines in the yml file but I can life with it. I have a bit of a feeling we might overengineer things here... But let's give it a try.

@ihostage ihostage merged commit 051d2e1 into playframework:main Mar 11, 2024
313 checks passed
@ihostage ihostage deleted the parallel-build branch March 11, 2024 11:48
@ihostage
Copy link
Member Author

@Mergifyio backport 3.0.x

Copy link
Contributor

mergify bot commented Mar 11, 2024

backport 3.0.x

✅ Backports have been created

@ihostage
Copy link
Member Author

@Mergifyio backport 2.9.x

Copy link
Contributor

mergify bot commented Mar 11, 2024

backport 2.9.x

✅ Backports have been created

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants