Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Finish implementation of compact index api draft #435 #547

Closed
wants to merge 73 commits into from

Conversation

LukeIGS
Copy link

@LukeIGS LukeIGS commented May 30, 2023

Closes: #435

Saves the result of merging rubygems.org and local repo versions file to a file containing both indexes because this merging process takes a while and consumes a ton of memory to generate, so it should only be regenerated when necessary.

Also fix the concurrent implementation to function on windows.

Plus everything else @skaes implemented in the original draft.

ahorek and others added 30 commits February 2, 2023 10:21
This is a preparation step to move the indexing logic into separate
classes/modules which will not access Sinatra helpers.
On our server this reduces the gem upload time from around five minutes
down to a few seconds.

This also moves the reindex logic from the Sinatra class context to
the instance context. As a consequence, proxied gems are no longer added
to the index. We'll restore the old behavior once the indexer logic has
been extracted to a separate class.
I would suggest to drop support for such ancient ruby versions.
This uses the same strategy as employed by rubygems for the older index
files: write the new files to a fresh temp directory, then use the OS
move command to update them.
The old temp dir has been deleted already by the Gem::Indexer
The /info/:gem_name API will only be called by bundler after it has
called /versions first. Since /versions will build all info files, the
extra logic for computing /info/:gem_name on demand can be removed.
…x again

Whether this is the right thing to do is debatable, but it restores
functionality available on master.
This makes it somewhat easier to see that combining local and remote
gem versions and compact index cannot be enabled at the same time.
this commit also introduces units tests for indexer logic
Bundler uses MD5 hashes as etags in combination with the if-none-match
HTTP request header to avoid downloading large files, such as the
/versions files.
This approach allows us to store upstream gems locally without polluting
the local gem index and thereby forcing manual uploads to the server for
all other versions of a proxied gem.

Etags are used to only transfer upstream files when they have changed.
This option is no longer supported by modern rubygems versions and we
don't need it either.
Since we only build modern indexes we can drop a number paths that will
always return 404 anyway.
Bundler can only work efficiently when the gem server offers the compact
index. If not available, bundler will fetch them gem dependencies for
each versions for all gems that are part of the dependency graph given
by the Gemfile. This can mean hundreds if not thousands of slow calls.
With the compact index, only n+1 calls are made, where n is the number
of gems in the dependency graph and the one extra call is caused by
fetching the list of all versions for all gems.

In addition we also drop all insecure merge strategies: they all permit
attackers to inject malicious gems that replace in-house gems that are
not on ruybgems.org but on your geminabox server.
@LukeIGS
Copy link
Author

LukeIGS commented Aug 29, 2023

Combination of a few things,

  1. satisfying code climate tests
  2. Some of the specs defined in the compact index tests are unnecessary, such as the test for deduplication of gems in the version file where they get rewritten to be merged. Rubygems.org uses append only logic and rewrites the entire thing every month or so, so just excluding any gems that the local server has with the same name as a remote gem is likely good enough.
  3. Decision of actual specs that need to be fulfilled. I don't fully trust myself to think of everything that should be tested and this is an open source project that has potentially larger impact.
  4. Getting someone in the maintainers list to actually review and merge.

@nelsnelson
Copy link

That does seems like a bit of work. I'll take a look at the codeclimate stuff, and see what I can address.

@LukeIGS
Copy link
Author

LukeIGS commented Aug 31, 2023

https://github.com/igs-opensource/geminabox/tree/feature/compact-index-using-file-streams
Here's the current state, feel free to pr to this branch, i'll likely pr this into igsos:main after and then get the finished work into this pr. It should be mostly functional, i've lightly tested it locally and have been working at fixing up the specs. The current unit tests will not work on a windows machine, because they rely on fork.

@github-actions
Copy link

You should update this pull request by commenting on it. Otherwise the PR will be closed in 14 days.

@natebird
Copy link
Contributor

natebird commented Nov 7, 2023

Any updates on this branch? Is there a workaround until this is merged?

@LukeIGS
Copy link
Author

LukeIGS commented Nov 7, 2023

running geminabox from https://github.com/igs-opensource/geminabox/tree/feature/compact-index-using-file-streams branch for the most part seems to work just fine for in the meantime as far as i can tell. I've yet to run into any issues with it locally, though more testing would increase my confidence a bit.

@ahorek
Copy link
Contributor

ahorek commented Nov 7, 2023

hello @tnir is there any plan / hope to move this forward? it's likely bundler will remove support for old APIs sooner or later (rubygems.org already dropped support 5 months ago).
There's always room for further improvements, but is there anything important missing? Something we can fix / change / discuss? I understand it's a large feature, but it's just gathering dust for 3 years without any feedback from the maintainers.

as you can see, many users who want to use the new API are forced to install random forks which also isn't ideal... thanks for any reply :)

@conathan
Copy link

conathan commented Nov 13, 2023

Ran into some issues with a early version of the #435 patch (didn't investigate yet), so redid our setup to https://github.com/igs-opensource/geminabox/tree/feature/compact-index-using-file-streams.

Ran into a 500 error,

App 1784719 output: 2023-11-12 18:36:27 - TypeError - no implicit conversion of nil into String:
App 1784719 output: /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-db3c501500a0/lib/geminabox/gem_versions_merge.rb:14:in initialize' App 1784719 output: /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-db3c501500a0/lib/geminabox/gem_versions_merge.rb:14:in open'
App 1784719 output: /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-db3c501500a0/lib/geminabox/gem_versions_merge.rb:14:in block in merge' App 1784719 output: /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-db3c501500a0/lib/geminabox/gem_versions_merge.rb:13:in open'
App 1784719 output: /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-db3c501500a0/lib/geminabox/gem_versions_merge.rb:13:in merge' App 1784719 output: /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-db3c501500a0/lib/geminabox/compact_index_api.rb:37:in versions'
App 1784719 output: /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-db3c501500a0/lib/geminabox/server.rb:82:in block in <class:Server>' App 1784719 output: /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1697:in call'
App 1784719 output: /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1697:in block in compile!' App 1784719 output: /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1030:in block (3 levels) in route!'

lib/geminabox/gem_versions_merge.rb
    def self.merge(local_gem_list)
    ...
  • local_gem_list was nil

Also working off a new folder of /gems/rubygems.org, as wasn't sure if anything changed in the folder structure between that branch, and pull request #435

config.ru

require "rubygems"
require "geminabox"

Geminabox.data = "/gems/rubygems.org"
Geminabox.rubygems_proxy = true

Geminabox.allow_remote_failure = true
# No longer needed since patch 435
#Geminabox.rubygems_proxy_merge_strategy = :remote_gems_take_precedence_over_local_gems
run Geminabox

This is a caching mirror, and no local gems.

---- what triggered it
AlmaLinux 9.2, with stock packages
bundler 2.2.33
ruby 3.0.4p208

Gemfile with:
gem "eventmachine", '1.2.7'
gem "rails", "7.1.1"
gem "sprockets-rails"
gem "bootsnap", require: false
gem "dalli"

Probably shouldn't be posting it here, but I didn't see a issues link on https://github.com/igs-opensource/geminabox/tree/feature/compact-index-using-file-streams

@LukeIGS
Copy link
Author

LukeIGS commented Nov 13, 2023

That error basically happens when the local gem list either hasn't been populated or there are no gems on the server... i'll try to get a fix out for it. Skae's intitial configuration instructions i believe had you build the local index before doing anything else using the cli tool... but in reality it should just work out of box like it always has, just install and go.

@LukeIGS
Copy link
Author

LukeIGS commented Nov 13, 2023

@conathan Try the newest version i just pushed to that branch, it should now just generate an index if there isn't one already rather than requiring you to set one up yourself. In reality this really just won't add much time on a fresh server.

@conathan
Copy link

conathan commented Nov 14, 2023

Ran into:

App 1900072 output: 2023-11-13 17:59:17 - ArgumentError - wrong number of arguments (given 3, expected 2):
App 1900072 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:39:in `write_version_entries'
App 1900072 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:32:in `block (2 levels) in merge'
App 1900072 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:14:in `open'
App 1900072 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:14:in `block in merge'
App 1900072 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:13:in `open'
App 1900072 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:13:in `merge'

lib/geminabox/gem_versions_merge.rb
Changed
write_version_entries(preamble, local_io, remote_versions_file)
to
#write_version_entries(preamble, local_io, remote_versions_file) # Probably affected below line numbers by 1
write_version_entries(local_io, remote_versions_file)

Then got:

App 1900906 output: 2023-11-13 18:03:56 - EOFError - end of file reached:
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:49:in `readline'
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:49:in `block in write_version_entries'
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:92:in `block in try_load_cached_file'
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:94:in `open'
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:94:in `try_load_cached_file'
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:41:in `write_version_entries'
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:33:in `block (2 levels) in merge'
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:14:in `open'
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:14:in `block in merge'
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:13:in `open'
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:13:in `merge'
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/compact_index_api.rb:35:in `versions'
App 1900906 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/server.rb:82:in `block in '

Wiped out the files I had in /gems/rubygems.org, as thought last exception could have caused it, then got

App 1901941 output: 2023-11-13 18:06:50 - ArgumentError - no time information in "at:":
App 1901941 output: 	/usr/share/ruby/time.rb:194:in `make_time'
App 1901941 output: 	/usr/share/ruby/time.rb:380:in `parse'
App 1901941 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:23:in `block (3 levels) in merge'
App 1901941 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:92:in `block in try_load_cached_file'
App 1901941 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:94:in `open'
App 1901941 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:94:in `try_load_cached_file'
App 1901941 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:20:in `block (2 levels) in merge'
App 1901941 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:14:in `open'
App 1901941 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:14:in `block in merge'
App 1901941 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:13:in `open'
App 1901941 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:13:in `merge'
App 1901941 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/compact_index_api.rb:35:in `versions'
App 1901941 output: 	/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/server.rb:82:in `block in '

lib/geminabox/gem_versions_merge.rb: self.write_version_entries

Changed
preamble = "created at: #{Time.now}"
To
preamble = "created_at: #{Time.now}"
(Wiped out files again for clean slate)
App 1903164 output: /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb:49:in `readline' (again)

root@gems [/gems/rubygems.org/compact_index]# cat versions 
created_at: 2023-11-14T02:12:10.149+0000
---
root@gems [/gems/rubygems.org/compact_index]# cat merged_versions 
created_at: 2023-11-13 18:12:40 -0800---

Noticed a missing newline after created_at:

Changed in self.write_version_entries
merged_versions_file.write("#{preamble}")
To
merged_versions_file.write("#{preamble}\n")
(Suspect unrelated looking at the code).

Still same error, but wanted to post what I got so far.

Also to confirm, have not built the index (current_local_gem = local_gem_io.readline) seems to be where it's failing right now). Will be continuing on.

@conathan
Copy link

For the last issue I ran into, local_gems_list is
"created_at: 2023-11-14T02:20:24.633+0000\n---\n"

self.write_version_entries

does a rewind, and 3 readline's (3rd being current_local_gem)

irb(main):001:0> require 'stringio'
=> true
irb(main):002:0> s=StringIO.new "created_at: 2023-11-14T02:20:24.633+0000\n---\n"
=> #
irb(main):003:0> s.readline
=> "created_at: 2023-11-14T02:20:24.633+0000\n"
irb(main):004:0> s.readline
=> "---\n"
irb(main):005:0> s.readline
(irb):5:in `readline': end of file reached (EOFError)
	from (irb):5:in `'
	from /usr/share/gems/gems/irb-1.3.5/exe/irb:11:in `'
	from /usr/bin/irb:23:in `load'
	from /usr/bin/irb:23:in `'

Not familiar enough with the codebase to suggest a fix for this, but did:

        current_local_gem = local_gem_io.readline rescue nil
        return nil if current_local_gem.nil?

In our own environment (and got what appears to be a successful bundle install going).

@conathan
Copy link

Putting together everything as a diff.

diff --git a/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b.orig/lib/geminabox/gem_versions_merge.rb b/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb
index 7d57901..c31c973 100644
--- a/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b.orig/lib/geminabox/gem_versions_merge.rb
+++ b/home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-b4e8a262200b/lib/geminabox/gem_versions_merge.rb
@@ -29,7 +29,7 @@ module Geminabox
             end
           end
 
-          write_version_entries(preamble, local_io, remote_versions_file)
+          write_version_entries(local_io, remote_versions_file)
           File.read(merged_versions_path)
         end
       end
@@ -41,11 +41,12 @@ module Geminabox
         local_gem_list = local_gem_io.readlines.map { |it| it.split[0] }
         local_gem_io.rewind
         local_gem_io.readline # advance past preamble since we're writing our own merged versions file
-        preamble = "created at: #{Time.now}"
+        preamble = "created_at: #{Time.now}"
         merged_versions_file.flock(File::LOCK_EX)
-        merged_versions_file.write("#{preamble}")
+        merged_versions_file.write("#{preamble}\n")
         merged_versions_file.write(local_gem_io.readline) # get the ---
-        current_local_gem = local_gem_io.readline
+        current_local_gem = local_gem_io.readline rescue nil
+        return nil if current_local_gem.nil?
         remote_version_io.each_line do |remote_version_line|
           next if remote_version_line.eql?("---\n")
           remote_gem_name = remote_version_line.split[0]

@LukeIGS
Copy link
Author

LukeIGS commented Nov 14, 2023

Just so i'm sure i understand this, the context here is that you're performing a bundle install with no local gems? Just trying to figure out what process you're using so i can replicate it.

@LukeIGS
Copy link
Author

LukeIGS commented Nov 14, 2023

okay, further updates are in the branch, one of the errors was caused by me forgetting to put an underscore between createdand at in the merged file, so when it split it would try to parse at: to a time.

@conathan
Copy link

That is correct. Speaking of, your latest changes fix all my issues. Thumbsup.

Copy link

You should update this pull request by commenting on it. Otherwise the PR will be closed in 14 days.

@conathan
Copy link

This is still not merged, (mostly replying for github actions)

@conathan
Copy link

@LukeIGS Not sure where better to report this (then a closed ticket I don't think I can reopen), but was having troubles with some bundle installs (when ran in parallel via capistrano) saying things like ffi not found in any source.

And found this coresponding with it on geminabox

App 373483 output: 2024-06-10 17:47:35 - ArgumentError - mon out of range:
App 373483 output:  /usr/share/ruby/time.rb:266:in `local'
App 373483 output:  /usr/share/ruby/time.rb:266:in `make_time'
App 373483 output:  /usr/share/ruby/time.rb:380:in `parse'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-1c2fc3f28da5/lib/geminabox/gem_versions_merge.rb:23:in `block (3 levels) in merge'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-1c2fc3f28da5/lib/geminabox/gem_versions_merge.rb:91:in `block in try_load_cached_file'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-1c2fc3f28da5/lib/geminabox/gem_versions_merge.rb:93:in `open'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-1c2fc3f28da5/lib/geminabox/gem_versions_merge.rb:93:in `try_load_cached_file'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-1c2fc3f28da5/lib/geminabox/gem_versions_merge.rb:20:in `block (2 levels) in merge'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-1c2fc3f28da5/lib/geminabox/gem_versions_merge.rb:14:in `open'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-1c2fc3f28da5/lib/geminabox/gem_versions_merge.rb:14:in `block in merge'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-1c2fc3f28da5/lib/geminabox/gem_versions_merge.rb:13:in `open'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-1c2fc3f28da5/lib/geminabox/gem_versions_merge.rb:13:in `merge'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-1c2fc3f28da5/lib/geminabox/compact_index_api.rb:35:in `versions'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-1c2fc3f28da5/lib/geminabox/server.rb:82:in `block in '
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1697:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1697:in `block in compile!'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1030:in `block (3 levels) in route!'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1049:in `route_eval'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1030:in `block (2 levels) in route!'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1078:in `block in process_route'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1076:in `catch'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1076:in `process_route'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1028:in `block in route!'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1025:in `each'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1025:in `route!'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1147:in `block in dispatch!'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1119:in `block in invoke'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1119:in `catch'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1119:in `invoke'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1142:in `dispatch!'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:956:in `block in call!'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1119:in `block in invoke'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1119:in `catch'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1119:in `invoke'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:956:in `call!'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:945:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/rack-protection-2.2.4/lib/rack/protection/xss_header.rb:18:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/rack-protection-2.2.4/lib/rack/protection/path_traversal.rb:16:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/rack-protection-2.2.4/lib/rack/protection/json_csrf.rb:26:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/rack-protection-2.2.4/lib/rack/protection/base.rb:50:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/rack-protection-2.2.4/lib/rack/protection/base.rb:50:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/rack-protection-2.2.4/lib/rack/protection/frame_options.rb:31:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/rack-2.2.8/lib/rack/null_logger.rb:11:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/rack-2.2.8/lib/rack/head.rb:12:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/rack-2.2.8/lib/rack/method_override.rb:24:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:218:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:2004:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1564:in `block in call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1780:in `synchronize'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/gems/sinatra-2.2.4/lib/sinatra/base.rb:1564:in `call'
App 373483 output:  /home/webapps/geminabox-bn/shared/bundle/ruby/3.0.0/bundler/gems/geminabox-1c2fc3f28da5/lib/geminabox.rb:85:in `call'
App 373483 output:  /usr/share/ruby/vendor_ruby/phusion_passenger/rack/thread_handler_extension.rb:107:in `process_request'
App 373483 output:  /usr/share/ruby/vendor_ruby/phusion_passenger/request_handler/thread_handler.rb:149:in `accept_and_process_next_request'
App 373483 output:  /usr/share/ruby/vendor_ruby/phusion_passenger/request_handler/thread_handler.rb:110:in `main_loop'
App 373483 output:  /usr/share/ruby/vendor_ruby/phusion_passenger/request_handler.rb:419:in `block (3 levels) in start_threads'
App 373483 output:  /usr/share/ruby/vendor_ruby/phusion_passenger/utils.rb:113:in `block in create_thread_and_abort_on_exception'

I printed out the results of merged_versions_preamble from lib/geminabox/gem_versions_merge.rb, I believe it is expected to be:
Created At: Jan 1 2024 (not exact)

But when it failed I was getting lines like
LanGrove 0.0.1,0.0.2 d37557b5d8ab1b6f85301f50d2618c59

I theorize when there are multiple requests at the same time, it can be reading the file while it is in the process of being created, so tried the following patch

diff -Naurp geminabox-1c2fc3f28da5.orig/lib/geminabox/gem_versions_merge.rb geminabox-1c2fc3f28da5/lib/geminabox/gem_versions_merge.rb
--- geminabox-1c2fc3f28da5.orig/lib/geminabox/gem_versions_merge.rb     2024-06-11 10:14:34.312054817 -0700
+++ geminabox-1c2fc3f28da5/lib/geminabox/gem_versions_merge.rb  2024-06-11 10:20:24.289371650 -0700
@@ -19,6 +19,10 @@ module Geminabox
           preamble = (local_time > remote_time ? local_preamble : remote_preamble)
           try_load_cached_file do |merged_versions_file|
             unless merged_versions_file.eof?
+              #  - Multiple requests in parallel, first line is not created_at:
+              # This seems to resolve it
+              merged_versions_file.flock(File::LOCK_EX)
+              #  - End
               merged_versions_preamble = merged_versions_file.readline
               merged_version_time = Time.parse(merged_versions_preamble.split[1])
               preamble_time = Time.parse(preamble.split[1])

And no longer able to reproduce the problem since (Appears to solve it, but I don't know the codebase well enough to say if it's the right solution)

@conathan
Copy link

Also adding to this, we have been using @LukeIGS 's branch since mid November, 2023 with this being the only issue we encountered so far (Was working fine when used from a single server, but doing deploy's with bundle install's in parallel caused the issue reported in the previous comment). Seems to be working well

@LukeIGS
Copy link
Author

LukeIGS commented Jun 11, 2024 via email

@conathan
Copy link

conathan commented Jun 11, 2024

@LukeIGS
To give a simpler reproceable version of the error,

merged_versions_preamble = "LanGrove 0.0.1,0.0.2 d37557b5d8ab1b6f85301f50d2618c59"
Time.parse(merged_versions_preamble.split[1])

(or)
Time.parse("0.0.1,0.0.2")

irb(main):001:0> merged_versions_preamble = "LanGrove 0.0.1,0.0.2 d37557b5d8ab1b6f85301f50d2618c59"
irb(main):002:0> Time.parse(merged_versions_preamble.split[1])
(irb):2:in `': undefined method `parse' for Time:Class (NoMethodError)
	from /usr/share/gems/gems/irb-1.3.5/exe/irb:11:in `'
	from /usr/bin/irb:23:in `load'
	from /usr/bin/irb:23:in `'

More specifically, intermittently, it's not grabbing the date, but one of the gems via readline and trying to parse it's version as the date. (Does not happen when a single server is using it).

@conathan
Copy link

@LukeIGS Will be my first time doing a PR, so I may not get to it immediately. (in our case, I just patched locally, but thought useful to report upstream).

@LukeIGS
Copy link
Author

LukeIGS commented Jun 11, 2024 via email

@LukeIGS
Copy link
Author

LukeIGS commented Jun 12, 2024

It appears i also, at some point made this patch locally and then forgot to upstream it.

@conathan
Copy link

@LukeIGS I'll skip the PR then.

Hopefully someday this can get merged into geminabox, Seems like the github actions are a bit aggressive for closing these down (was a bit late to keep it open the last time)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants