Support uploading to multiple Artifactory servers #8817

Merged
merged 2 commits into eclipse-openj9:master from multi_artifactory on Mar 27, 2020

@AdamBrousseau AdamBrousseau commented Mar 10, 2020

Add support for multiple Artifactory servers

Artifactory bandwidth has become an issue in the CI builds.
It has caused slow uploads and downloads, as well as
failed downloads due to inadequate infrastructure.
The problem is worst for the UNB machines: a large portion
of the farm is located there, and the pipe into UNB is small.

Being able to set up multiple Artifactory servers spread across
geographies and colocated with pools of machines will allow us
to push to and, more importantly, pull from a nearby server.

The design leaves one site as the main (default) server (OSU). We
will always upload to the default server. We will only upload
to a secondary server if there are machines with a matching
platform colocated with that (non-default) server. The design
is laid out so it will scale to more servers without any code
change other than adding to the defaults.yml config.

Since we cannot determine where a test will land (geographically),
we need to upload to every server that is colocated with machines
of a matching platform. We will also always pass the default SDK URL.
We will add a curl wrapper on the nodes which have colocated servers.
This wrapper will redirect requests from one server to another; in
this case, it redirects OSU requests to UNB. The wrapper will also strip
off the user/password since a) the servers allow anonymous access and
b) the user's API key will be different for every server.

Changes introduced:

  • Redesign the Artifactory config in defaults.yml to support multiple
    servers. Identify servers based on geo(graphy) and identify one
    geo as default.
  • Redesign the Artifactory variables as a single hashmap. This keeps
    all the config together. Continue to write out the default server
    values to env, in order for the parent job to continue as-is.
  • Move Artifactory setup onto the compile node. This is needed
    in order to determine where we are compiling and where we need
    to upload.
  • Add support for uploading to a server behind a vpn. If we aren't
    compiling behind the vpn, stash the SDK for later when we can
    grab a node in the same geo as the server we need to push to.
    Note: We cannot publish buildInfo to vpn'd servers. Artifacts
    come from nodes but buildInfo comes from Master, so Master needs
    to see the server.
  • Add some more details to the Artifactory README. Also add more
    info specific to the UNB setup.

Also:

  • Unrelated change to stop adding JAVADOC_LIB_URL to the CUSTOMIZED_SDK_URL.
    CUSTOMIZED_SDK_URL is for test and test doesn't need JAVADOC.

[skip ci]
Issue #8425

Signed-off-by: Adam Brousseau <adam.brousseau88@gmail.com>

4 use cases

  1. Compile & Test with default server (OSU) (eg. zlinux)
  2. Compile & Test at UNB (eg. xlinux)
  3. Compile at OSU and Test at UNB (eg. plinux)
  4. Compile at UNB and Test at OSU (eg. plinux)
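
The upload rule behind these use cases can be sketched in shell as follows. This is only an illustration of the rule described above (always upload to the default geo, and also to any other geo that has machines of the matching platform); it is not the pipeline's Groovy code. The function names and the platform-to-geo table are hypothetical, chosen to line up with the examples in the list.

```sh
#!/bin/sh
# Illustrative values only: one default geo plus one secondary geo.
DEFAULT_GEO="osu"
GEOS="osu unb"

# Hypothetical table: platforms that have machines colocated with each geo.
platforms_at() {
    case "$1" in
        osu) echo "zlinux plinux" ;;
        unb) echo "xlinux plinux" ;;
    esac
}

# Print, one per line, the geos a build of platform $1 should upload to.
upload_geos() {
    platform="$1"
    for geo in $GEOS; do
        if [ "$geo" = "$DEFAULT_GEO" ]; then
            echo "$geo"    # the default server always receives the upload
            continue
        fi
        for p in $(platforms_at "$geo"); do
            if [ "$p" = "$platform" ]; then
                echo "$geo"    # secondary server with matching machines
                break
            fi
        done
    done
}

upload_geos zlinux   # osu
upload_geos plinux   # osu, then unb
```

With this layout, adding a server would only mean extending the geo list and the table, which mirrors the intent of keeping all changes in defaults.yml.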

@AdamBrousseau
Contributor Author

Jenkins compile zlinux jdk11

@AdamBrousseau

Jenkins compile plinux jdk11

@AdamBrousseau

Jenkins compile plinux jdk11

@AdamBrousseau

Jenkins compile plinux jdk11

@AdamBrousseau

Jenkins compile plinux jdk11

@AdamBrousseau

Jenkins compile plinux jdk11

@AdamBrousseau

Jenkins test sanity,extended plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

buildenv/jenkins/common/build.groovy (outdated, resolved)
buildenv/jenkins/common/pipeline-functions.groovy (outdated, resolved)
buildenv/jenkins/common/variables-functions.groovy (outdated, resolved)
buildenv/jenkins/common/variables-functions.groovy (outdated, resolved)
@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity plinux jdk11

@AdamBrousseau

Jenkins test sanity zlinux jdk11

@AdamBrousseau

Jenkins test sanity zlinux jdk11

@AdamBrousseau AdamBrousseau force-pushed the multi_artifactory branch 2 times, most recently from 17c6d87 to f7923c4 Compare March 24, 2020 04:05
@AdamBrousseau

Jenkins compile xlinux,plinux,zlinux,osx jdk11

@AdamBrousseau

Might have an issue with the cert on cent6/x machines.

Jenkins compile xlinux,xlinuxxl jdk8,jdk11

@AdamBrousseau

Seems ok now. Going to try and stress it a bit.

Jenkins compile xlinux,xlinuxxl,plinux,plinuxxl,osx,osxxl jdk8,jdk11

@pshipton
Member

We don't want to stress the UNB plinux box / artifactory too much while we don't have any access to UNB to reboot it if something goes wrong. I discussed with Joe not enabling artifactory at UNB until after the lockdown is lifted, just in case.

@AdamBrousseau

Didn't see any issues on the last build, which had 12 compiles. Other than that 1 failure earlier, I have only seen 1 other retry, and it succeeded on the first retry attempt.

While I understand we do not want to take down the host, my understanding from Raj, via @jdekonin, was that the issue was related to CPU being overcommitted and there was high confidence the issue was resolved. I understand it's mostly you (Pete) feeling the effects of the nightly curl failures, so it's up to you to decide if merging this is worth the risk. I'm fine if we block it for now; I just don't want it to sit for 6 months and then have a bunch of merge conflicts that may not be trivial.

@AdamBrousseau

Another option would be to merge this without the UNB server added to defaults.yml. That way the code is in but the server isn't enabled.

@pshipton

I'm fine to merge without the unb server added. There aren't any nightly curl failures any more, since the retry problem was resolved. It doesn't seem worth the risk of taking down the host, because if we do take it down we'll be stuck.

@AdamBrousseau

We'll still have 3 OSU plinux but yes.... :D
I will update the change and add a comment.

- We will hold off on enabling until we
  have physical access to UNB again. If we
  take down the p host we will be stuck.

[skip ci]
Issue eclipse-openj9#8425

Signed-off-by: Adam Brousseau <adam.brousseau88@gmail.com>
@AdamBrousseau

jenkins compile plinux jdk11

@AdamBrousseau AdamBrousseau commented Mar 26, 2020

@vsebe @pshipton
Commented out the UNB line in defaults. Works as expected.
Can this be merged now then?

ARTIFACTORY_CONFIG:'[defaultGeo:osu, geos:[osu], repo:ci-eclipse-openj9, uploadDir:ci-eclipse-openj9/Build_JDK11_ppc64le_linux_Personal/789/, osu:[server:ci-eclipse-openj9, numArtifacts:30, daysToKeepArtifacts:50, manualCleanup:true, vpn:false, uploadBool:true]]'
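
The per-server `vpn` field in the config above ties back to the stash-or-upload rule from the PR description. A minimal shell sketch of that decision, assuming that a compile node in the same geo as a VPN'd server can reach it directly; the function name and arguments are hypothetical and this is not the pipeline's Groovy code:

```sh
#!/bin/sh
# Decide how the SDK reaches one Artifactory server.
#   $1 = geo where the compile ran
#   $2 = geo of the target server
#   $3 = "true" if the target server is behind a VPN
delivery_mode() {
    compile_geo="$1"
    server_geo="$2"
    server_vpn="$3"
    if [ "$server_vpn" = "true" ] && [ "$compile_geo" != "$server_geo" ]; then
        echo "stash"     # keep the SDK and push later from a node in that geo
    else
        echo "upload"    # the compile node can push to the server directly
    fi
}

delivery_mode osu osu false   # upload
delivery_mode osu unb true    # stash
delivery_mode unb unb true    # upload
```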

@pshipton pshipton self-assigned this Mar 27, 2020
@pshipton pshipton merged commit bf841bb into eclipse-openj9:master Mar 27, 2020
AdamBrousseau added a commit to AdamBrousseau/openj9 that referenced this pull request Apr 16, 2020
This enables uploading to the UNB
Artifactory server. Downloading will
need to be enabled on each machine
as per eclipse-openj9#8817.

Related eclipse-openj9#8425
[skip ci]

Signed-off-by: Adam Brousseau <adam.brousseau88@gmail.com>
AdamBrousseau added a commit to AdamBrousseau/openj9 that referenced this pull request May 6, 2020
See eclipse-openj9#8817 eclipse-openj9#9258 eclipse-openj9#9274
[skip ci]

Signed-off-by: Adam Brousseau <adam.brousseau88@gmail.com>
@AdamBrousseau

Note that the cent6 machines do not use /usr/bin/curl; they use the curl 7 install under /usr/local.

All machines were using /usr/bin/curl except for the 7 cent6-x machines, which were using a /usr/local curl that was a symlink to curl 7.61.1. I left both curl wrapper scripts on the proxy:

$ curl-usr-bin-curl-link
echo $@ | sed 's#--user .*:.* ##' | xargs /usr/bin/curl --resolve 140-211-168-230-openstack.osuosl.org:443:192.168.10.216
$ curl-usr-local-curl_7_61_1-link
echo $@ | sed 's#--user .*:.* ##' | xargs /usr/local/curl-7.61.1/bin/curl --resolve 140-211-168-230-openstack.osuosl.org:443:192.168.10.216

Also note, the wrapper has been updated via #10985
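
One property of the `sed` expression in the wrappers shown above is worth spelling out: `.*` is greedy, so the pattern appears to rely on the credentials being followed by exactly one more token (typically the URL). The sketch below compares that pattern with a stricter one that stops at the next space. The function names and sample arguments are hypothetical, nothing here invokes curl, and the stricter pattern is not necessarily what #10985 changed.

```sh
#!/bin/sh
# Pattern as used in the wrappers: greedy, removes up to the LAST space
# that follows a colon after "--user ".
strip_greedy() {
    printf '%s\n' "$*" | sed 's#--user .*:.* ##'
}

# Stricter sketch: [^ ]* cannot cross a space, so only the single
# "name:secret" token after --user is removed.
strip_strict() {
    printf '%s\n' "$*" | sed 's#--user [^ ]*:[^ ]* ##'
}

# URL is the last argument: both patterns give the same result.
strip_greedy -OLJSks --user alice:secret https://example.org/sdk.tar.gz
# prints: -OLJSks https://example.org/sdk.tar.gz

# Extra arguments after the URL: the greedy pattern also removes the URL.
strip_greedy -OLJSks --user alice:secret https://example.org/sdk.tar.gz -o sdk.tar.gz
# prints: -OLJSks sdk.tar.gz
strip_strict -OLJSks --user alice:secret https://example.org/sdk.tar.gz -o sdk.tar.gz
# prints: -OLJSks https://example.org/sdk.tar.gz -o sdk.tar.gz
```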

AdamBrousseau added a commit to AdamBrousseau/openj9 that referenced this pull request Oct 16, 2023
In the original change to add support for multiple
Artifactory servers, including servers behind a VPN
(eclipse-openj9#8817), I made a note that buildInfo cannot be published
to VPN'd servers because, while the artifacts are pushed
from the node, the buildInfo is pushed from the Controller
and therefore would fail to push if the Controller could
not see the Artifactory server behind the VPN. Since that
change, we've moved the UNB VPN connection directly onto our
Controller node, which means we can now publish buildInfo to
the UNB server. As of today, we can therefore push buildInfo
to all servers.

Note we still need the Artifactory vpn config info for other
parts of the code.

Also note that the if condition written prior to
this change was incorrect, so we actually have never published
buildInfo since 8817 went in.

Supersedes eclipse-openj9#18203

Signed-off-by: Adam Brousseau <adam.brousseau88@gmail.com>
midronij pushed a commit to midronij/openj9 that referenced this pull request Oct 26, 2023