Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI improvements #525

Merged

Conversation

andyundso
Copy link
Contributor

@andyundso andyundso commented Jan 11, 2023

Background

I read through @aharpervc feedback on my PR for the Ruby 3.0 support (#524). He asked to implement three features for the CI noted here.

  • Cache the compiled ports on Windows
  • Publish an artifact of the built gem
  • Publish test results to Circle CI

The issue I see with the current CI is that each Windows build step compiles the gem for itself. I assume this means that the compiled version works exactly for one Ruby version. So in order to test it on Windows, you would need to download the artifact using your Ruby version and architecture that you want to test. However, in the final fat gem, the needed binaries are provided for all the supported Ruby versions.

Second issue I see is the build time on Windows. Although the time would go down when using a cache, the gem was only build on Windows, but not tested.

So the main idea behind those pipeline changes are:

  • Compile the ports and tiny_tds code on Linux for Windows as it's faster.
  • Build the regular gem and the fat gem for Windows on Windows using the precompiled ports and code and store them in a workspace.
  • Run the tests on Windows using the precompiled ports and code to have confidence that they work.

I originally had another step in mind that runs after all the tests passed and will build the actual gem, but this proofed to not work very well with cache, as it always recompiled the tiny_tds code, which is the one thing I didn't wanted to happen.

Preparation

There were a couple of changes necessary to enable this workflow. Some are the same as in #524.

  • Lock Bundler down to v2.3.26 when installing it on Windows, as it's the last release to support Ruby 2.4 and 2.5.
  • Remove a workaround for rake-compiler-dock on Windows. There are more details about this in 7b33fe3. But TL;DR Docker without WSL2 on Windows isn't support anymore and WSL2 is Linux again, so fix is obsolete.
  • Bump OpenSSL to v1.1.1s. I got an error when compiling the 32bit version of the ports for Windows. I went over the changelog and didn't see anything where I would assume that it breaks the gem. However, I'm not really knowledgable how this gem and openssl interact, so chances are that I overlooked something.

Windows setup

I would say the Windows setup is quite simple:

  • Install the correct Ruby version as done before.
  • Install an MSSQL Express 2017. The PowerShell script is adapted from the MSSQL Windows containers.
  • Install toxiproxy-server
  • Run the tests

There is one test named raises TinyTds exception with dead connection network failure that didn't work. I was able to reproduce this on my Windows machine, but didn't really see how I can fix that, so I skip it on Windows for now as it seems to work fine on Linux.

Performance

Previously, the Windows builds had about 30min to complete. I checked several runs with this pipeline to see how it performs now.

  • The cross-compile takes about 6 minutes without cache, 90 seconds if the cache is present.
  • Ruby installation clocks between 5 and 6 minutes.
  • MSSQL installation takes about 5 minutes without cache, 3 minutes with cache.
  • toxiproxy-server is another 90 seconds.
  • tests takes about 90 seconds.
  • Building the final gems takes another 90 seconds.

So totalling at between 14 and 22 minutes, depending if the cache is present.

... speaking of caches

I added caches for the following things:

  • The precompiled ports.
  • The precompiled code in tmp, rebuilt on every commit but then shared between jobs.
  • The MSSQL installation file.

Potentially we can also cache:

  • The built ports on Linux
  • The downloaded Ruby versions for Windows.

Other changes

  • I added builds against Ruby 2.4 just to see if it works. @aharpervc asked if it makes sense to drop support for older Ruby version. on one hand I think it would be good to keep it for now as Draft: Ruby 3.1 #523 could introduce a few breaking changes. But on the other hand, the gemspec specifies that the gem supports Ruby 2.0 and newer and for the versions 2.0 to 2.3, tests don't run on CI and the code is not precompiled on Windows. So support for those versions is already somewhat not given.
  • I noted that the Linux steps always ran against Ruby 2.7, so I fixed it.
  • Publish test results to CircleCI.

Sorry for the very long PR description, I wanted to be explicit what I built and thought as there are quite some changes.

andyundso and others added 10 commits January 10, 2023 16:05
This is the last version that supports Ruby 2.4 and 2.5.
The original author states that the code I removed is for using virtualbox on Windows to compile the gem. I read up on this and while rake-compiler-dock still supports using docker-machine to compile gems, the project has been discontinued in favor of Docker Desktop, which nowadays only works with Docker in WSL2.
1.1.1d gives an error when compiling against i686-w64-mingw32. Upgrading it seems to work.
Copy link
Contributor

@aharpervc aharpervc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, exciting! Lots to think about here. I left some comments for consideration

name: save ports cache
paths:
- ./ports
key: ports-libiconv115-openssl111s-freetds1124
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally wanted to use the checksum of extconsts.rb for the ports cache, right now I hardcoded the used version into the name of the cache, but it could also be an option to check if we manage to get the same checksum on both systems. I could further investigate here in a follow-up to make the cache key not dependent on a hardcoded value.

I'm trying to brainstorm alternate approaches here... even something like v1-ports-{{ checksum "extconsts.rb" }} seems like an improvement, although that ignores the fact that those versions are overridable via env var, so another brainstorm: a new task rake ports:versions_file that's basically File.write '.ports_versions' { "iconv=#{ICONV_VERSION}\nfreetds=#{FREETDS_VERSION}\n" }, etc. then our CI config could have {{ checksum ".ports_versions" }}.

Also, can you clarify why it's necessary to switch from one job step per ports to compiling them all in one step? I preferred the version where we were able to see each of those individually. If it was just for caching, I assume we could still do restore cache, iconv, freetds, openssl, save cache, since each one of those individual steps would (or wouldn't) benefit from the restored cache, and would all be present by the time save_cache ran.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

9d12f78 implements what you suggest with file the base the cache on. Let me know what you think about the code changes. I didn't want to repeat the library names again, that was my main concern behind the changes.

Also, can you clarify why it's necessary to switch from one job step per ports to compiling them all in one step?

You kind of mentioned my intention behind this below:

(I also recognize the counterpoint, which is the final fat gem that we publish is cross compiled, so why not do it that same way here... I'm conflicted).

I'll give a more detailed answer in that conversation.

Comment on lines 237 to 242
# Move gem files to a separate directory, as CircleCI artifacts don't support wildcards for paths
mkdir pkg_artifacts
mv pkg/*.gem pkg_artifacts

- store_artifacts:
path: pkg_artifacts
Copy link
Contributor

@aharpervc aharpervc Jan 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting... here's some other options for discussion:

  1. do this as is
  2. store the whole pkg dir (and including the other files besides .gems)
  3. run the equivalent of find ./pkg ! -name "*.gem" -delete to delete non-.gem files
  4. tell rake gem to put the .gems somewhere in a folder by themselves

I'm not fully opposed to the current approach, although would you consider calling the new subfolder "gems" since that's what'll go in (rather than pkg_artifacts)?

Also, I recommend having the mkdir/move be a separate job step. Generally my preference is to have more single-purpose job steps rather than job steps with multiple commands.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented your suggestions with 741d1c5.

tasks/ports/libiconv.rb Show resolved Hide resolved
Comment on lines +158 to +160
- restore_cache:
name: restore mssql installation file
key: downloads-{{ checksum "test/bin/install-mssql.ps1" }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify the benefit of caching this installation script? It's just another source file in the repo, so it should already be available on disk in the job step.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't cache the PowerShell script, but the C:/Downloads folder where I download the MSSQL installation file into. As the URL is hardcoded into this script, I thought it was fine to use the script for the checksum.

Potential side-effect is that if you change something in the script, it'll re-download the installation file. But it was also continue to use the cache even if Microsoft updates the SQL installation file on their side. I wouldn't say it's too important that the CI uses the latest version and rather take the 2 minute speed-up from caching the file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't cache the PowerShell script, but the C:/Downloads folder where I download the MSSQL installation file into

Oh yeah... duh.

Comment on lines 148 to 151
$rubyArchitecture = (ruby -e 'puts RUBY_PLATFORM').Trim()
$source = "C:\home\circleci\project\tmp\$rubyArchitecture\stage\lib\tiny_tds"
$destination = ".\lib\tiny_tds"
Get-ChildItem $source -Recurse -Exclude "*.rb" | Copy-Item -Destination {Join-Path $destination $_.FullName.Substring($source.length)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I recommend using relative paths if possible
  2. Generally this job step seems a bit strange, and that's related to my understanding of moving the ports code to a separate job. If it were to happen that CircleCI cleared the workflow cache after the ports job ran but before this job ran, wouldn't that mean that this job would fail? Is it guaranteed that won't happen? What is the recommended way to transfer files between jobs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it were to happen that CircleCI cleared the workflow cache after the ports job ran but before this job ran, wouldn't that mean that this job would fail? Is it guaranteed that won't happen? What is the recommended way to transfer files between jobs?

Good point. So the Windows job has a requires declaration on the precompile job. So in case that one fails, the Windows jobs wouldn't even start.

Regarding cache, I did some more research here and looks like workspaces are the better data store, as they are meant to pass build artifacts down in CI. Didn't know about that before, but it's implement now with 2a071ad. This also gets rid of the absolute paths.


workflows:
test_supported_ruby_versions:
jobs:
- test_linux:
matrix:
- cross_compile_ports
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still trying to understand this strategy. If I'm understanding you, the idea is to quit trying to build the native dependencies on each OS in isolation, but rather to have a separate job here that cross compiles the native deps for Windows but from Ubuntu. On Ubuntu the strategy remains to build the native deps for itself.

If that understanding is correct, I'm conflicted about this part of the PR. The longer build time ought to be a non-issue after caching the ports folder. This also seems much more complicated but also not essential. Plus, I think it is useful to demonstrate that the native deps can be compiled on each OS, for that OS, with each version of Ruby in the list.

Another way to say it is that I see virtue in having the structure & list of job steps between the OS's (including potentially macOS in the future) be as similar as possible, even if the contents of the step needs to be altered to fit the OS (eg, different exact syntax to install/activate a particular version of Ruby, etc).

(I also recognize the counterpoint, which is the final fat gem that we publish is cross compiled, so why not do it that same way here... I'm conflicted).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of right now, as a Windows user I would use the fat gem while on Linux / Mac OS X I get the "regular" gem where I as the user need to make sure that I installed freetds. This is now reflected in CI with the approach I suggested.

I see the point that showing that tiny_tds can be compiled against those versions specified in extconsts.conf is helpful, although I'm not quite sure about Windows here. You can force the installation of the "regular" gem on Windows instead of the fat gem. Since tiny_tds specifies msys2_mingw_dependencies in its gemspec, freetds will be installed prior to the compile process of the gem defined with extconf.rb. If you want to influence the openssl / libiconv / freetds version in your msys2 mingw system, you need to correct this yourself.

The point I want to bring across is: If you don't use the fat gem on Windows, the compile process is out of our hands. However, the compiled ports are used for the fat gem, and it would be good if we are confident that those work.

That said, I could imagine that there could be two windows jobs, one using the precompiled code and the other one compiling the code on the system.

I added some improvements with 9d12f78. So instead of sharing the cached code between the jobs, I share the final gem. Then on Windows, I first install the appropriate fat gem, which is a nice test to also make sure the build gem works, and copy the compiled resources out of the precompiled gem into the project folder to run the tests. I think this simplifies the setup by a bit.

If you still think it's not a good idea, I'm fine with reverting the changes regarding the precompiled code and just adding the cache to the Windows jobs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, I could imagine that there could be two windows jobs, one using the precompiled code and the other one compiling the code on the system.

Eh, I don't think that's necessary for right now

Then on Windows, I first install the appropriate fat gem, which is a nice test to also make sure the build gem works

This makes sense and seems like a good idea

copy the compiled resources out of the precompiled gem into the project folder to run the tests

This seems a bit weird, but maybe on balance not worth changing further right now

Invoke-WebRequest -Uri "https://go.microsoft.com/fwlink/?linkid=829176" -OutFile "C:\Downloads\sqlexpress.exe"
}

Write-Host "Installing SQL Express ..."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely agree it'd be great to test on Windows (especially since the job name is already "test windows"... oops... 🤦‍♂️). However, installing SQL Server is SUCH a headache. Are we able to use Docker for this? I don't think the Windows container is available any longer, so we'd need to use the Linux image, which ought to be possible by connecting to it via setup_remote_docker, I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the Windows container are no longer available. But also Windows containers were quite a pain to set up, I can see why it was easier for Microsoft to just push WSL2 😁

https://circleci.com/docs/building-docker-images/#accessing-services doesn't allow to access services from the primary executor. It looks like it's a detached machine somewhere else in the CircleCI cloud.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, I use a similar version of this script at work and has been working without any changes for more than a year now. I also use it for SQL server 2019 and 2022, so in case the pipeline should be extended at some point to test across different SQL server versions, it should be possible with little modification.

command: |
$Env:PATH = "C:\\Ruby<< parameters.ruby_version >>-x64\\bin;$Env:PATH"
bundle exec rake ports
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha, oops

this seems to be the more appropriate type of data store according to the CircleCI documentation.
@andyundso
Copy link
Contributor Author

ping @aharpervc :)

Copy link
Contributor

@aharpervc aharpervc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Generally, I think this moves the project forward and gets what we fundamentally want, which is the path to 3.1+ support tested across platforms. So basically I'm ready to approve this pending opinions from others.
  2. I added a couple other folks to the PR review for visibility @bryanwieg feel free to comment as well.
  3. I am slightly disappointed that the CI config now feels much more complicated than before, particularly in jumbling all the ports together (rather than itemized steps for each one) and needing to actually install SQL Server on Windows. I'm thinking about this in terms of future support & diagnostics, where 6 months from now something random breaks and now we have to figure it out. (Again, not necessarily something to block progress over)
  4. I'd still like to figure out how to get equivalent CI job steps across all platforms: install dependencies, build ports, build gem, test code. As discussed, this is in tension with wanting to set things up the way people actually use this gem somewhat, such as producing the fat gem from Linux for Windows. Not sure yet if it's possible to elegantly accommodate both of these needs

Again, net positive here. It'll be huge for all future PR's to direct casual reviewers to the artifacts page to be able to install & validate changes https://app.circleci.com/pipelines/github/andyundso/tiny_tds/65/workflows/44c766c0-9daa-42c8-b7d4-6a3f0fad355e/jobs/416/artifacts

@andyundso
Copy link
Contributor Author

glad to hear that the changes generally look good to you.

I updated the PR description now to better reflect what's actually in the PR, as some thing changed with your initial review.

@andyundso
Copy link
Contributor Author

since it's been two weeks, I would like to ask if you could review the PR so can get it merged (@aharpervc @bvogelzang @larskanis)?

@andyundso
Copy link
Contributor Author

andyundso commented Feb 28, 2023

sorry to annoy again, but Ruby 2.7 goes end of life in about a month and it would be nice if I could start working on the support for Ruby 3, but those changes need to get merged first (@aharpervc @larskanis @bvogelzang) (not sure why Github removed the review requests , i didn't do it intentionally ..)

@andyundso andyundso requested review from aharpervc and removed request for bvogelzang and larskanis February 28, 2023 09:46
@aharpervc
Copy link
Contributor

@andyundso I've approved this PR & added you as a collaborator, so you should be able to merge when you're ready

@andyundso
Copy link
Contributor Author

andyundso commented Mar 30, 2023

@andyundso I've approved this PR & added you as a collaborator, so you should be able to merge when you're ready

great, thanks for the trust! will probably merge on Sunday so I can get back to #524 afterwards.

Looks like a new option that has been introduced in the most recent 2.7 version and explains the timeout we see with the Ruby installation on CircleCI.
https://github.com/oneclick/rubyinstaller2/wiki/FAQ#user-content-install-mode

Older versions seem to ignore the parameter and run the installation as usual.
@andyundso andyundso merged commit 65757c4 into rails-sqlserver:master Apr 2, 2023
@andyundso andyundso deleted the circle-ci-improvements branch April 2, 2023 10:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants