Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ATH-1036] Fix ath-container and docker based tests #1037

Merged
merged 16 commits into from
Mar 8, 2023

Conversation

raul-arabaolaza
Copy link
Contributor

See #1036 since #878 docker based tests are not run on ci.jenkins.io because docker can not be found on the ath image.

I have updated the docker installation process according to https://docs.docker.com/engine/install/ubuntu/ and also reverted c54ef5f#diff-59086a3e54e38095246aa3a051f9020274dbf005b4652a7808cd4c6c239c536eL67 so ath-user can see and run the docker command.

I have checked that the image builds and you can run docker inside as ath-user. Note that by running again docker tests we may find failures that were ignored before by accident. The purpose of this PR is not to solve those tests but to make the failures visible again. To check this is working we can compare the number of tests run for the WorkflowPluginTest suite in master and for this PR

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes

Copy link
Collaborator

@dduportal dduportal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from the jenkins-infra point of view

@raul-arabaolaza
Copy link
Contributor Author

Too many failures here, and I can not make sense of the error, I suspect it is because of an infra problem.

On the good side it looks like more tests are run, I will retrigger this build by using an empty commit

src/main/resources/ath-container/Dockerfile Outdated Show resolved Hide resolved
@jtnord
Copy link
Member

jtnord commented Feb 23, 2023

it looks like something has seriously regressed in either the LTS or was merged that broke pretty much all the tests.

https://ci.jenkins.io/job/Core/job/acceptance-test-harness/job/master/683/testReport/

could not see any merges around this timeline and is mostly around save - given the nature of the first few tests I looked at...

something is clearly wrong though.

https://ci.jenkins.io/job/Core/job/acceptance-test-harness/job/master/683/testReport/junit/core/FreestyleJobTest/java_11_jenkins_lts_split4___discardBuilds/ if you look at the last screenshot you see a jenkins version that is not an LTS version.

Could well be a bug in the junit attachment plugin but still....

Co-authored-by: James Nord <jtnord@users.noreply.github.com>
@raul-arabaolaza
Copy link
Contributor Author

So, 56 skipped tests in this pr vs 226 in master -> 170 new tests run

Also for example https://ci.jenkins.io/job/Core/job/acceptance-test-harness/job/PR-1037/4/testReport/plugins/WorkflowPluginTest/java_11_jenkins_latest_split6___hello_world_from_git/ is run in this pr, but in https://ci.jenkins.io/job/Core/job/acceptance-test-harness/job/master/684/testReport/plugins/WorkflowPluginTest/java_11_jenkins_lts_split8___hello_world_from_git/ is skipped because docker is not found.

I do believe we can merge this, as failures are unrelated and the goal of unlocking docker tests is achieved

Copy link
Member

@timja timja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do believe we can merge this, as failures are unrelated and the goal of unlocking docker tests is achieved

Please do not merge this, a broken main build is the most important activity. Without a green baseline we cannot correctly evaluate this change and make sure there's no regression introduced, e.g. if this breaks any docker tests etc.

@timja
Copy link
Member

timja commented Feb 27, 2023

it looks like something has seriously regressed in either the LTS or was merged that broke pretty much all the tests.

Filed #1039 which should be fixed before merging this.
From an initial look so far seem to be infra / container related, on my local machine the test I ran passed just fine.

@raul-arabaolaza
Copy link
Contributor Author

@timja

What makes you believe those now ignored docker tests will be green? Actually in proprietary runs using another infra most of those tests are failing because of changes already merged in master. I can reproduce also locally without any infra changes involved.

I understand a green master would be desirable before merging any PR. But in this case I still believe it does not matter. Why? Because that green master will be an illusion. Even if all tests come back to be green in master that would be no guarantee at all as 170 docker based tests are skipped by error. Not because of this PR but other changes already merged. Should we continue ignoring those failures trying to get a fake green on master? I believe is better to get all tests run again and then confidently get master back to a real green.

Or in other words, I do prefer a legit red build that I can fix with confidence step by step than trying to fix something that will be anyway broken without me knowing. Because right now any fix that I may propose to for example WorkflowPluginTest#hello_world_from_git will not be validated until this pr is merged. And I will not create a gigantic PR that is gonna restore docker based tests and contain unrelated ( to the original intention of the PR) test fixes. I would rather do those in separate, small, isolated and easily reviewable PRs after the PR builder is working.

@timja
Copy link
Member

timja commented Feb 27, 2023

Or in other words, I do prefer a legit red build that I can fix with confidence step by step than trying to fix something that will be anyway broken without me knowing. Because right now any fix that I may propose to for example WorkflowPluginTest#hello_world_from_git will not be validated until this pr is merged. And I will not create a gigantic PR that is gonna restore docker based tests and contain unrelated ( to the original intention of the PR) test fixes. I would rather do those in separate, small, isolated and easily reviewable PRs after the PR builder is working.

You (or anyone else) can branch from this PR to fix those as well. Merging on top of red is something I absolutely do not support.

(I've ran out of time to check them for now but the next thing I would try it reverting #1024 to see if they go back to green)

@raul-arabaolaza
Copy link
Contributor Author

raul-arabaolaza commented Feb 27, 2023

#1024 has proved blind faith in green builds is not a guarantee of success. But anyway I do not have more time to dedicate to this for the moment. If anyone wants to use this as base for future PRs please go forward.

@timja
Copy link
Member

timja commented Mar 6, 2023

Seems lots of docker tests are failing for not being able to build the image currently

@raul-arabaolaza
Copy link
Contributor Author

Let me check, I had problems with old docker versions locally

@raul-arabaolaza
Copy link
Contributor Author

Not all tests are failing because of that, of my random 4 picks it was 50%, anyway, let's see what is failing

@raul-arabaolaza
Copy link
Contributor Author

Oh, 2.977 Package viewvc is not available, but is referred to by another package. #6 2.977 This may mean that the package is missing, has been obsoleted, or #6 2.977 is only available from another source #6 2.977 #6 3.040 E: Package 'viewvc' has no installation candidate

Those kind of errors should be easy to fix or to die trying

@raul-arabaolaza
Copy link
Contributor Author

viewvc/viewvc#310

See refenrence in Dockerfile
@raul-arabaolaza
Copy link
Contributor Author

So, subversion and (I hope) jira plugin should be fixed on the last run, next step git plugin tests

@raul-arabaolaza
Copy link
Contributor Author

DeplyPluginTests should be fixed now

@@ -3,7 +3,7 @@
# and prepares for execution of gitplugin tests.
#

FROM ubuntu:22.04
FROM ubuntu:xenial
Copy link
Member

@jtnord jtnord Mar 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most likely the line 24 and 19 are incorrect

RUN cat /home/git/unsafe.pub >> /home/git/.ssh/authorized_keys
is likely creating the file using the default keymask (infact the line 19 mkdir -p /home/git/.ssh is probably just as bad.

the .ssh folder should be 0700 and be owned by the git user (not root)

line 30 RUN chown -R git /home/git changes the ownership back but would not address any permission changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have had some success with this suggestion locally, let's see what the pr builder says

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 with @jtnord . I'm not sure how this Java code works to spin up Docker container (and I'm too scared to try it), but I got the following updated Dockerfile here tested as below (as a first step) and passed through hadolint (which has very very good tips and actionnables, like shellcheck):

FROM ubuntu:22.04

SHELL ["/bin/bash", "-o", "pipefail", "-c"]

# install SSHD, Git and zip, always on the latest available version
# hadolint ignore=DL3008
RUN apt-get update \
    && apt-get install --yes --no-install-recommends \
    ca-certificates \
    openssh-server \
    git \
    zip \
    && rm -rf /var/lib/apt/lists/*

# create a git user and setup its .ssh dir (including public key)
COPY unsafe.pub  /home/git/.ssh/authorized_keys
RUN useradd git -d /home/git \
    && echo "git:git" | chpasswd \
    && mkdir -p /var/run/sshd \
    && chmod 0700 /home/git/.ssh \
    && chmod 0600 /home/git/.ssh/* \
    && chown -R git /home/git

# run SSHD in the foreground with error messages to stderr
ENTRYPOINT ["/usr/sbin/sshd"]
CMD ["-D","-e"]
$ pwd
<my workspace>/acceptance-test-harness/src/main/resources/org/jenkinsci/test/acceptance/docker/fixtures/GitContainer

$ docker build -t sshd ./
# Success

$ docker run --rm --detach --name=sshd-test --publish 2222:22 sshd
# Run in background the container

$ ssh -i ./unsafe -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -p2222 git@localhost whoami
# ..
git
## It works

@raul-arabaolaza raul-arabaolaza requested review from timja and dduportal and removed request for timja and dduportal March 7, 2023 20:19
@raul-arabaolaza
Copy link
Contributor Author

@timja @dduportal @jtnord Can you re review please? AFAICT the JobConfigHistoryTest is a flake, it has been working and failing randomly during the course of this PR while my changes should not affect it at all

RUN chmod 700 /home/git/.ssh
RUN chmod 600 /home/git/.ssh/authorized_keys

RUN echo "HostbasedAcceptedKeyTypes +ssh-rsa\nHostKeyAlgorithms +ssh-rsa\nPubkeyAcceptedKeyTypes +ssh-rsa" >> /etc/ssh/sshd_config
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this for jsch to be able to sftp into the container, I tried with more modern algorithms for the keys but jsch did not supported them


@Test
@Native("bash")
@WithCredentials(credentialType = WithCredentials.USERNAME_PASSWORD, values = {"admin", "tomcat"}, id = "tomcat")
public void deploy_sample_webapp_to_tomcat7() throws IOException, InterruptedException {
public void deploy_sample_webapp_to_tomcat10() throws IOException, InterruptedException {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not believe deploying in tomcat 10 vs deploying in tomcat 7 changes the intent of the test

@timja
Copy link
Member

timja commented Mar 7, 2023

@timja @dduportal @jtnord Can you re review please? AFAICT the JobConfigHistoryTest is a flake, it has been working and failing randomly during the course of this PR while my changes should not affect it at all

You could file an issue and disable it if you don’t have time to look at it.

auto-merge was automatically disabled March 8, 2023 08:26

Head branch was pushed to by a user without write access

@raul-arabaolaza
Copy link
Contributor Author

Issues created, see link, and test ignored

@raul-arabaolaza
Copy link
Contributor Author

raul-arabaolaza commented Mar 8, 2023

@timja Can you please reenable auto merge or merge if green?

@raul-arabaolaza raul-arabaolaza changed the title [ATH-1036] Fix docker image so docker based tests are run [ATH-1036] Fix ath-container and docker based tests Mar 8, 2023
Copy link
Collaborator

@dduportal dduportal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 🚀

@dduportal dduportal merged commit 8ac1ea0 into jenkinsci:master Mar 8, 2023
@raul-arabaolaza raul-arabaolaza deleted the ATH-1036 branch March 8, 2023 12:54
@timja
Copy link
Member

timja commented Aug 11, 2023

Caused #1322

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants