Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test Native and GSL Implementations #196

Merged
merged 1 commit into from
May 13, 2022

Conversation

mkasberg
Copy link
Contributor

classifier-reborn is designed to work with or without
GSL support.

Note that `LSI` will work without these libraries, but as soon as they are installed, classifier will make use of them. No configuration changes are needed, we like to keep things ridiculously easy for you.

If GSL is installed, classifier-reborn will detect it and use it. If GSL
is not installed, classifier-reborn will fall back to a pure-ruby
implementation. The mechanism for doing so is in lsi.rb:

begin
raise LoadError if ENV['NATIVE_VECTOR'] == 'true' # to test the native vector class, try `rake test NATIVE_VECTOR=true`
require 'gsl' # requires https://github.com/SciRuby/rb-gsl
require_relative 'extensions/vector_serialize'
$GSL = true
rescue LoadError
$GSL = false
require_relative 'extensions/vector'
require_relative 'extensions/zero_vector'
end

I think it's important to test the classifier-reborn gem with GSL
support in CI. One of my goals is to add similar support using Numo,
and I'd like that to be tested in CI as well. Also, I want to make sure I
don't do anything along the way that could break existing GSL support.
As far as I can tell, GSL support was never tested in CI before now
(though it was previously discussed
here).

In this PR, I'm expanding our text matrix to test with and without GSL
enabled. When the matrix has GSL enabled, we install the gsl gem
(which is not included in our Gemfile since it's optional). Lucky for
us, GitHub already includes libgsl as pre-installed
software
,
so we don't need to do anything special for the apt package. Our gem
already has everything needed to build & install. When matrix.gsl is
false, we won't install the gem and tests will run the native Ruby
implementation. When matrix.gsl is true, we'll install the gem tests
will run the GSL implementation.

Currently, GSL only works with Ruby 2.7. (One of the main reasons I want
to add support for Numo is because GSL is becoming difficult to
support.) As such, I've excluded other versions of ruby in our test
matrix for now. They'll still be tested with GSL disabled, but not with
it enabled.

While working on this, I noticed some tests in the LSI spec that return
early when $GSL is not enabled. It would be better for those tests to
report as skipped when GSL is not enabled (and this matches the pattern
of the redis tests, that report as skipped if redis isn't available).

classifier-reborn is designed to work with or without
[GSL](https://www.gnu.org/software/gsl/) support.

https://github.com/jekyll/classifier-reborn/blob/99d13af5adf040ba40a6fe77dbe0b28756562fcc/docs/index.md?plain=1#L68

If GSL is installed, classifier-reborn will detect it and use it. If GSL
is not installed, classifier-reborn will fall back to a pure-ruby
implementation. The mechanism for doing so is in `lsi.rb`:

https://github.com/jekyll/classifier-reborn/blob/99d13af5adf040ba40a6fe77dbe0b28756562fcc/lib/classifier-reborn/lsi.rb#L7-L17

I think it's important to test the classifier-reborn gem with GSL
support in CI. One of my goals is to add similar support using Numo,
and I'd like that to be tested in CI as well. Also, I want to make sure I
don't do anything along the way that could break existing GSL support.
As far as I can tell, GSL support was never tested in CI before now
(though it was previously discussed
[here](jekyll#46 (comment))).

In this PR, I'm expanding our text matrix to test with and without GSL
enabled. When the matrix has GSL enabled, we install the `gsl` gem
(which is not included in our Gemfile since it's optional). Lucky for
us, GitHub already [includes libgsl as pre-installed
software](https://github.com/actions/virtual-environments/blob/main/images/linux/Ubuntu2004-Readme.md#installed-apt-packages),
so we don't need to do anything special for the apt package. Our gem
already has everything needed to build & install. When `matrix.gsl` is
false, we won't install the gem and tests will run the native Ruby
implementation. When `matrix.gsl` is true, we'll install the gem tests
will run the GSL implementation.

Currently, GSL only works with Ruby 2.7. (One of the main reasons I want
to add support for Numo is because GSL is becoming difficult to
support.) As such, I've excluded other versions of ruby in our test
matrix for now. They'll still be tested with GSL disabled, but not with
it enabled.

While working on this, I noticed some tests in the LSI spec that return
early when `$GSL` is not enabled. It would be better for those tests to
report as skipped when GSL is not enabled (and this matches the pattern
of the redis tests, that report as skipped if redis isn't available).
@mattr-
Copy link
Member

mattr- commented May 13, 2022

@jekyllbot: merge +dev

@jekyllbot jekyllbot merged commit 0d3d235 into jekyll:master May 13, 2022
jekyllbot added a commit that referenced this pull request May 13, 2022
@mattr-
Copy link
Member

mattr- commented May 13, 2022

Thanks! This is a great improvement!

@ashmaroli
Copy link
Member

@mkasberg I think there is some misunderstanding here.

The lone job that runs with gsl: true installs the gsl gem in a separate step using gem install ... However, the test suite is consequently run with bundle exec .., meaning the test-suite is run in the context of pre-existing Gemfile.lock (that does not include ``gsl` gem).

Theoretically, my argument can be proven / disproved by having the test-helper issue an appropriate log output before running the tests.

To remove the ambiguity nevertheless, the best route is to have Bundler install the gsl gem during bundle install by using an environment variable:

# Gemfile

gem 'gsl' if ENV["LOAD_GSL"]

And then initialize the environmental variable before setting up Ruby:

# workflow config

jobs:
  ci:
    env:
      # See https://github.com/marketplace/actions/setup-ruby-jruby-and-truffleruby#matrix-of-gemfiles
      BUNDLE_GEMFILE: ${{ matrix.gemfile }}
      LOAD_GSL: ${{ matrix.gsl }}

@mkasberg
Copy link
Contributor Author

mkasberg commented May 14, 2022

You're right! These two tests should not be skipped when GSL is actually loaded:

image

Thanks for the suggestion, I think I can implement something like that and I'll make a PR to do so! Might take me about a week to get to it, I'm going on vacation this week.

mkasberg added a commit to mkasberg/classifier-reborn that referenced this pull request May 23, 2022
In jekyll#196, I attempted to make the CI tests execute both with and without
GSL support because classifier-reborn is supposed to work in both cases
(though GSL increases performance). However, while doing so, I missed
the fact that `script/test` (which our CI job runs) uses `bundle exec`
to run the tests. And when the tests are run with `bundle exec`, Bundler
will not load gems that aren't specified in the Gemfile. So our GSL
tests weren't actually using the GSL gem.

To fix this, we add `gem 'gsl' if ENV["LOAD_GSL"]` to our Gemfile. (This
solution was proposed by ashmaroli.) This won't do anything in most
cases (when `LOAD_GSL` isn't set). But when we set it (in our CI env),
it will include GSL in our bundled gems so that we can actually test
with GSL support in CI.

See here for more info:
jekyll#196 (comment)
@mkasberg mkasberg deleted the native-gsl-tests branch May 23, 2022 02:38
mkasberg added a commit to mkasberg/classifier-reborn that referenced this pull request May 23, 2022
In jekyll#196, I attempted to make the CI tests execute both with and without
GSL support because classifier-reborn is supposed to work in both cases
(though GSL increases performance). However, while doing so, I missed
the fact that `script/test` (which our CI job runs) uses `bundle exec`
to run the tests. And when the tests are run with `bundle exec`, Bundler
will not load gems that aren't specified in the Gemfile. So our GSL
tests weren't actually using the GSL gem.

To fix this, we add `gem 'gsl' if ENV["LOAD_GSL"]` to our Gemfile. (This
solution was proposed by ashmaroli.) This won't do anything in most
cases (when `LOAD_GSL` isn't set). But when we set it (in our CI env),
it will include GSL in our bundled gems so that we can actually test
with GSL support in CI.

See here for more info:
jekyll#196 (comment)
mkasberg added a commit to mkasberg/classifier-reborn that referenced this pull request May 23, 2022
In jekyll#196, I attempted to make the CI tests execute both with and without
GSL support because classifier-reborn is supposed to work in both cases
(though GSL increases performance). However, while doing so, I missed
the fact that `script/test` (which our CI job runs) uses `bundle exec`
to run the tests. And when the tests are run with `bundle exec`, Bundler
will not load gems that aren't specified in the Gemfile. So our GSL
tests weren't actually using the GSL gem.

To fix this, we add `gem 'gsl' if ENV["LOAD_GSL"]` to our Gemfile. (This
solution was proposed by ashmaroli.) This won't do anything in most
cases (when `LOAD_GSL` isn't set). But when we set it (in our CI env),
it will include GSL in our bundled gems so that we can actually test
with GSL support in CI.

See here for more info:
jekyll#196 (comment)
mkasberg added a commit to mkasberg/classifier-reborn that referenced this pull request May 23, 2022
In jekyll#196, I attempted to make the CI tests execute both with and without
GSL support because classifier-reborn is supposed to work in both cases
(though GSL increases performance). However, while doing so, I missed
the fact that `script/test` (which our CI job runs) uses `bundle exec`
to run the tests. And when the tests are run with `bundle exec`, Bundler
will not load gems that aren't specified in the Gemfile. So our GSL
tests weren't actually using the GSL gem.

To fix this, we add `gem 'gsl' if ENV["LOAD_GSL"]` to our Gemfile. (This
solution was proposed by ashmaroli.) This won't do anything in most
cases (when `LOAD_GSL` isn't set). But when we set it (in our CI env),
it will include GSL in our bundled gems so that we can actually test
with GSL support in CI.

See here for more info:
jekyll#196 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants