Add Ubuntu Build in Pipeline #6030

Merged

Conversation

xinzweb
Member

@xinzweb xinzweb commented Oct 17, 2018

  • Rename continuous-integration to gp-continuous-integration
  • gpdb-tpl: add compile_gpdb_ubuntu16
  • Copy out the gpdb_bin.tar.gz at the end of build
  • Add docker based Ubuntu build
  • Rename ubuntu-16.04/ to ubuntu16-runner/
  • Move compile_gpdb_ubuntu16 task config into task yaml

@bradfordboyle
Contributor

There is a lot to unpack in this PR:

  1. Updating references to continuous-integration
  2. Switching {{}} to (())
  3. Adding the debian packaging to the pipeline

It seems like the first two are not strict requirements for the last one and they significantly contribute to the "visual" clutter in reviewing the new packaging tasks. I think it makes sense to split this into three separate PRs.

At this point, I'm not sold on switching from {{}} to (()). The concourse documentation does not document the use of {{}} but they are still supported and I don't think there are plans to remove support for them. One of the advantages of sticking with the older style {{}} is that fly set-pipeline will error out (w/ a helpful message) if any variables are missing; the new style (()) does not do this check. Concourse v4.1.0 adds a --check-creds option to fly set-pipeline that you can use to check for missing variables. Maybe it makes sense to switch after upgrading concourse.
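
For illustration, the kind of missing-variable check described here can be approximated outside of fly. This is a minimal sketch, assuming a flat vars file with one top-level "name: value" key per line; the function name missing_vars is hypothetical, and this is not how fly itself validates variables.

```bash
# Hypothetical pre-flight check (not part of fly): print every ((name))
# placeholder used in a pipeline file that has no matching top-level key
# in a flat vars file.
missing_vars() {
  local pipeline="$1" vars="$2" name
  # Extract each ((name)) placeholder, strip the parentheses, de-duplicate.
  grep -o '(([A-Za-z0-9_.-][A-Za-z0-9_.-]*))' "$pipeline" | tr -d '()' | sort -u |
    while read -r name; do
      # A variable counts as defined when a line's key equals the name exactly.
      awk -F: -v n="$name" '$1 == n { found = 1 } END { exit !found }' "$vars" ||
        echo "$name"
    done
}

# Usage: missing_vars pipeline.yml vars.yml
```

An empty result means every placeholder has a value; any printed name is a variable that would be left unresolved.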

@kmacoskey
Contributor

kmacoskey commented Oct 18, 2018

Maybe it makes sense to switch after upgrading concourse.

+1 to this idea. Losing input validation outright does not seem desirable. The toolsmiths are actively working on upgrades so it won't be long until the features of v4.1.0 are available. In the interim, if there are specific variable interpolations that do require (()) then those could be merged in before that upgrade.

I think it makes sense to split this into three separate PRs.

+1 to this as well. I'd like to hold off adding review comments to the PR if everyone agrees that it should be split up.

@larham
Contributor

larham commented Oct 18, 2018

@kmacoskey @bradfordboyle Good point about the {{}} change. We removed that commit. (Turns out we're not using the secrets file that has nested keys after all, so we don't need to do that at this time.)

Regarding splitting this into multiple PRs: Part of the motivation for this is now gone with the removal of the {{}} change. Further, the commit for the gp-continuous-integration rename is 3c85a90

So there are 2 main changes in this PR: the rename, and the "meat" of this PR, all the rest of the commits.

What do you think of reviewing all the commits separately (see "Commits" tab) within one PR? If not, we can separate this out into 2 PRs if it is hard to make sense of; it feels like the big issue was the {{}} change...

@larham larham force-pushed the 5X_ubuntu_build_160839258 branch 3 times, most recently from d70fa8c to 7d983c9 Compare October 18, 2018 21:21
@kmacoskey
Contributor

Let's pause on this PR until we can have some context sharing about this from a release engineering perspective. Comments in the PR may not be the best place to start this conversation.

@kmacoskey
Contributor

@jpatel-pivotal Thank you for providing some insight into the longer term plans! Getting this PR into 5X_STABLE is still the goal and next step.

bucket: ((bucket-name))
region_name: ((aws-region))
secret_access_key: ((bucket-secret-access-key))
regexp: deb_package_ubuntu16/greenplum-db_(.*)_amd64.deb
Contributor

Please correct me if I am mistaken, but the regexp here resolves to VERSION=$(</tmp/gpdb_src/VERSION awk '{print $1}')?

Is there a downstream reason it's preferable to have the SHA within the Debian package name in s3? The other blobs being exported to ((bucket-name)) all have generic names like bin_gpdb.tar.gz so that downstream processes can consume those versioned objects (it's a versioned bucket) as versioned_files. With a SHA in the name you cannot consume it either as a regexp, because it's not semi-semantically sortable, or as a versioned_file, because each object name will be unique.
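
To make the capture group concrete, here is a minimal sketch of what the regexp above would extract from an object name. The file name in the usage line is invented for illustration (the thread does not show a real one), deb_version is a hypothetical helper, and bash's own regex matching is assumed to behave closely enough to the resource's for this purpose.

```bash
# Hypothetical helper: print the text captured by the (.*) group of the
# regexp quoted in the review, or fail if the name does not match.
deb_version() {
  local re='deb_package_ubuntu16/greenplum-db_(.*)_amd64.deb'
  [[ "$1" =~ $re ]] || return 1
  echo "${BASH_REMATCH[1]}"
}

# Usage (made-up object name):
#   deb_version 'deb_package_ubuntu16/greenplum-db_5.0.0+dev.1.gabc1234_amd64.deb'
```

Everything between greenplum-db_ and _amd64.deb, including any SHA suffix, becomes the "version" that downstream consumers would have to order.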

Member Author

Yeah, this is using the regexp to find the version information, and we follow extended semantic versioning, using + to append additional information. We need the SHA in the file name to ensure the downstream packaging knows exactly which commit we depend on.

Including a commit SHA here makes it much easier for us to check which features are in and which are out.

Contributor

Which part of the downstream packaging needs to know this SHA? Typically any downstream packaging should be based on either using release candidates, which means just consuming the latest versioned object out of an s3 bucket, or be based on an actual release which is driven by a tag on a repository.

What specifically drives knowing exactly which SHA should be consumed in any downstream process?

Contributor

@kmacoskey regarding the bucket-file naming with included SHA, the key consumer of that file in deliverables/ will be code refactored from the gp-enterprise-debian-package repo. Currently, the SHA does not appear to be used. In particular, the version discovery in Makefile discovers the Greenplum version by running greenplum. @xinzweb may have more context.

Member Author

@kmacoskey For an official release, we don't need the SHA, since the release tag of GPDB is good enough. If we are doing bleeding-edge testing to verify certain behavior of GPDB (not released yet) in our environment, then we need to know which SHA we are referring to.

What additional complexity are you worried about in having the .deb file named with the version and SHA? Maybe we overlooked a naming issue here. Thanks.

@@ -1019,6 +1053,8 @@ jobs:
- get: centos-mingw
- get: ubuntu-gpdb-dev-16
- get: ubuntu-gpdb-debian-dev-16
- get: docker-in-concourse
Contributor

While it does fit the pattern, it's not necessary to have this resource, docker-in-concourse, pass through the gate_compile_start job. I believe it would be reasonable to completely remove that job at this point. Its original purpose, staggering starts before the pipeline template generator existed, has long since passed.

Member Author

Good point. I was wondering about the same thing, but just wanted to follow the pattern. We will remove it.

- get: gpaddon_src
  passed: [gate_compile_start]
- get: docker-in-concourse
  passed: [gate_compile_start]
Contributor

Related to the above, this resource doesn't need to pass through gate_compile_start

Member Author

It's great to know that. That should be easy to fix.

@kmacoskey
Contributor

Is there a dev copy of this pipeline set anywhere? I'd like to see that the pipeline is green and to take a look at the build output for the new job.

Contributor

@bradfordboyle bradfordboyle left a comment

This PR seems to follow a consistent pattern of inline shell scripts in task yaml and Dockerfile. What does this pattern do better than having this logic as a separate bash script? I would think that separate scripts would increase the potential reuse and make it easier to test/lint.

Other than that, this looks OK.

args:
- -ec
- |
  . docker-in-concourse/dind.bash
Contributor

This seems like a lot of bash in a task definition. Is there a reason to not have this as a bash script?

WORKDIR /tmp/gpdb_src

RUN \
VERSION=$(</tmp/gpdb_src/VERSION awk '{print $1}') && \
Contributor

Is there a reason to not have this be a separate script file and then COPY and RUN in the dockerfile?
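
As a rough idea of what the extracted script could look like, here is a minimal sketch. The function name read_version and the idea of a standalone file are assumptions for illustration, not what the PR ended up containing; the awk call mirrors the inline one quoted above.

```bash
# Hypothetical standalone helper: print the first whitespace-separated
# field of each line of a VERSION file, like the inline awk call in the
# Dockerfile. Defaults to the path used in the Dockerfile snippet.
read_version() {
  local version_file="${1:-/tmp/gpdb_src/VERSION}"
  awk '{ print $1 }' "$version_file"
}

# Usage: VERSION=$(read_version /tmp/gpdb_src/VERSION)
```

Keeping the logic in a file like this would let it be linted and exercised on its own, and the Dockerfile would only need to copy it in and run it.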

Contributor

@kmacoskey kmacoskey left a comment

Given this job being run in a concourse pipeline, what is this error that occurs a large number of times? A single build included this error in the stdout 966 times.

ERROR: ld.so: object 'libfakeroot-sysv.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.

When the deb package is built at the end of the job execution there is another echoing of the contents of a built gpdb binary directory. Could this be avoided? It's adding to the already large amount of stdout generated by this concourse job and I am unsure of the value of having this particular output included in the stdout.

Example output:

E: greenplum-db: dir-or-file-in-opt opt/gpdb/bin/.rcfile
E: greenplum-db: dir-or-file-in-opt opt/gpdb/bin/README
E: greenplum-db: dir-or-file-in-opt opt/gpdb/bin/analyzedb
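
One way to quantify this kind of noise in a saved build log is to collapse repeated lines into counts. This is a minimal sketch using standard tools; summarize_repeats is a hypothetical name, and the log file in the usage line is assumed to have been saved by hand.

```bash
# Hypothetical filter: read log lines on stdin and print "count line" pairs,
# most frequent first, so a message repeated hundreds of times appears once.
summarize_repeats() {
  sort | uniq -c | sort -rn | sed 's/^ *//'
}

# Usage: summarize_repeats < build.log | head
```

This only helps with diagnosing which messages dominate the output; it does not address the underlying LD_PRELOAD or packaging warnings.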

@@ -0,0 +1,95 @@
FROM pivotaldata/ubuntu-gpdb-dev:16.04_gcc_6_4
Contributor

We are currently only using GCC 6.2 to compile gpdb5. Has switching to 6.4 been validated as an OK change? I'd suggest sticking with 6.2 and letting compiler changes be driven by more comprehensive testing. It could also be confusing for support if issues introduced by 6.4 ever surface and it is not known that this particular platform is compiled with 6.4 while the rest of the platforms use 6.2.

Contributor

Good point, please see email to directors to get their opinion on this.

@larham
Contributor

larham commented Oct 22, 2018

@bradfordboyle @kmacoskey We added 4 commits for the refactors you suggested, extracting shell files out of the concourse yaml, and quieting both tar and the debian build step dh_shlibdeps. Thanks for that feedback.

Regarding the final issue Kris mentioned: gcc version.

I recall discussing this during LL; I recall we formerly pinned the gcc version to 6.x, but changed that to the current situation, where we are using the latest debian-native version of gcc 6. I'd like to verify that the whole team wants to pin that to a previous gcc version. Email forthcoming to confirm.

Thanks,

Larry & Fei

@Eulerizeit

I don't really have a strong opinion on this PR given what it is doing and the fact that it is doing it on 5.

On master I think that it is important for all of the compile, testing, packaging, etc. jobs to be the same, meaning that changes made to one should be made to all.

@dsharp-pivotal dsharp-pivotal force-pushed the 5X_ubuntu_build_160839258 branch 2 times, most recently from 0dfdd44 to e09a2fd Compare October 23, 2018 23:59
@dotyjim-work
Contributor

I took a look at this, and I see two things that I am trying to wrap my head around:

  1. The introduction of the new docker in docker pattern - I don't understand the benefits of this pattern in concourse, and thus can currently only see the increased complexity.
  2. The pattern of copying objects off of the Ubuntu container into ${INSTALL_LOC}/.... I feel that we have shifted from copying them from a location that we control (Artifactory) to a location that we don't control (the Ubuntu container). If upstream changes one of those artifacts, we consume and release it, and then have to come back and revert that, will we be able to? It's hard now, and I am trying to understand if these changes will make it harder, are neutral, or are an improvement.

@jpatel-pivotal
Contributor

@doty-pivotal I will reach out and we can talk through the topics in your comment this am. Thanks

@goutamtadi1 goutamtadi1 force-pushed the 5X_ubuntu_build_160839258 branch 3 times, most recently from 4dbda90 to fc7e8d7 Compare November 6, 2018 22:31
@goutamtadi1 goutamtadi1 force-pushed the 5X_ubuntu_build_160839258 branch 13 times, most recently from 6866389 to bb47677 Compare November 6, 2018 23:49
@jpatel-pivotal
Contributor

@kmacoskey we have cleaned up and squashed them down to a simpler commit history. Let us know as we would like to merge this PR in sooner rather than later.

dsharp-pivotal and others added 2 commits November 7, 2018 10:47
[#160667453]

Authored-by: David Sharp <dsharp@pivotal.io>
Co-authored-by: Xin Zhang <xzhang@pivotal.io>
Co-authored-by: David Sharp <dsharp@pivotal.io>
Co-authored-by: Larry Hamel <lhamel@pivotal.io>
Contributor

@kmacoskey kmacoskey left a comment

It looks as though the generated pipeline based on the changes to gpdb-tpl.yml is not included in the PR. If you generate the pipeline based on the template, there are a number of issues that will need to be resolved.

dsharp-pivotal and others added 3 commits November 7, 2018 16:08
…ntu16

Authored-by: David Sharp <dsharp@pivotal.io>
- Copy out the gpdb_bin.tar.gz at the end of build
- for details, please see /src/tools/docker/ubuntu16/README.md
- Skip building gphdfs, orafce, pgbench, pgbouncer on Ubuntu
[#160667453]

Co-authored-by: Xin Zhang <xzhang@pivotal.io>
Co-authored-by: Fei Yang <fyang@pivotal.io>
Co-authored-by: David Sharp <dsharp@pivotal.io>
Co-authored-by: Goutam Tadi <gtadi@pivotal.io>
Co-authored-by: Larry Hamel <lhamel@pivotal.io>
Co-authored-by: Jemish Patel <jpatel@pivotal.io>
@jpatel-pivotal jpatel-pivotal force-pushed the 5X_ubuntu_build_160839258 branch 2 times, most recently from 1e00000 to c6cf8ff Compare November 8, 2018 00:20
@jpatel-pivotal
Contributor

@kmacoskey should be all good now. PTAL.

@kmacoskey
Contributor

OK, in order to confirm things on my side, I've checked that the template matches the 5X_STABLE-generated.yml and that the generated pipeline sets as expected and matches the set development pipeline that is green. Everything looks OK there.

xinzweb and others added 2 commits November 8, 2018 10:54
Co-authored-by: Xin Zhang <xzhang@pivotal.io>
Co-authored-by: Fei Yang <fyang@pivotal.io>
Co-authored-by: Larry Hamel <lhamel@pivotal.io>
Co-authored-by: David Sharp <dsharp@pivotal.io>
Co-authored-by: Goutam Tadi <gtadi@pivotal.io>
Co-authored-by: Xin Zhang <xzhang@pivotal.io>
Co-authored-by: Larry Hamel <lhamel@pivotal.io>
Co-authored-by: David Sharp <dsharp@pivotal.io>
Co-authored-by: Fei Yang <fyang@pivotal.io>
@dsharp-pivotal dsharp-pivotal merged commit cd9f477 into greenplum-db:5X_STABLE Nov 8, 2018

8 participants