-
Notifications
You must be signed in to change notification settings - Fork 39
Fix missing dependencies (as seen from the api) #18
Conversation
Issue rubygems#17 describes a problem with gem dependencies missing when viewed through the api. These same dependencies are visible from the gem and from the rubygems.org dependency list. The fix is actually quite simple: the version_id was not being captured in the case where new versions are inserted for gems. As dependencies are associated with both the gem and the version (associations set up in the Rails app) the API returns no dependencies when the version is missing. Also added the built_at attribute to be the same as that used in rubygems.org - it was causing postgres constraint violation in development.
|
Wow, very nice, guys. Thanks! |
@indirect, I'm currently putting together some tests around this. A little challenging because the specs are really wound up with the jobs. |
This shores up the dependency fixes with specs to test expectations for existing behavior and new behavior. Verify that dependencies are _not_ created when a gem is added before its dependents. (Existing behavior) Verify that dependencies _are_ created when a gem is added after its dependents. (New behavior) Extras: * Update GemspecHelper#generate_gemspec to interpolate the name in more fields so we can create gems other than 'foo'. * Update GemBuilder#create_version to add built_at so we don't run foul of database constraints.
And the specs are IN! |
👍 on the tests |
Something to keep in mind: this is a database issue so there will need to be a way to go back and update existing gems that currently have missing dependencies. (I'd be happy to help with that.) |
Nice work Gentlemen! ✨ |
Maybe we could write a job or migration to backfill the missing dependencies in production? On Nov 24, 2012, at 12:50 PM, Don Morrison notifications@github.com wrote:
|
Fix missing dependencies (as seen from the api)
Thanks guys for catching the bug I introduced. That first spec still seems like broken functionality in the sync code. |
@hone, if you're referring to the fact that dependencies will get skipped if the gems don't already exist, then yes. It seems like a bug but I wanted to catch the existing behavior. Have you been able to deploy the fix to production? And if so, were you able to write up something to catch the database up? If not, I would like to take a stab at it. I just don't want to duplicate effort. This is keeping people from being able to do their work. Saw someone say something in #bundler and my guess is we'll see more and more as the days go by. |
As a temporary workaround, anyone can run "bundle install --full-index" in order to not use the Bundler API while it's acting up. On Nov 25, 2012, at 3:16 PM, Don Morrison notifications@github.com wrote:
|
@indirect, ah, good to know, that helps me in the short-term. I wonder how many know about that. After digging through these issues I am keen on helping with this project. It is a hugely important piece of infrastructure and I have some ideas on how best to help. Is there a way I can dive in and help get the index updated? |
Hopefully anyone who runs "bundle help install", but I'm not sure how many people that is. :) I know that @hone has been working on the index stuff, so I'm going to ask him to chime in here… what would you suggest? On Nov 25, 2012, at 4:11 PM, Don Morrison notifications@github.com wrote:
|
@indirect, most of what I would like to do is optimize this project for use by bundler (and other applications that use the api to get dependency information). This probably isn't the best place for the comments, but you asked. :-) Also, I should mention, that I totally understand the circumstances under which this project was written - the time constraints and application considerations. Most of my thoughts are around the complications of keeping this db in sync with rubygems.org and on responding to request quickly. Sync The api is just polling for changes by checking when the spec index was last modified. Then comparing that to the local data. I know it's a future scenario but this looks like a good place for a hook. Just shoot off a message to the api site(s) and they can get what they need. Again, that's future but it's one we can enable right now with access to the rg codebase and this project. In the absence of hooks we can optimize the local data for resolving that changed status so the sync can be faster and lightweight. Data Requests I plan to spike on alternative ways to structure the data here that would make it easier to keep updated and faster to respond to requests. When looking at what the api serves up - it is the gem name, version and a list of dependencies. That's it. And all of that is contained in the gem spec. Feels like the database is overkill. We need a way to index them but couldn't we just store the data as it needs to be returned? If I had to choose what to start with:
So, yeah, braindump. I am very serious about helping though. Thanks for listening. |
You should check out the youtube video where we talk about some of this stuff. It'd be great to have help. As for 4, it's something we've talked about doing and is on the roadmap. The polling was just the easiest thing to get working off the ground since it was the least invasive to rubygems. When we originally built this, I didn't even know if it would go into production. I'm not sure what you mean by #1? The system is using a threaded consumer pool and a queue. Potentially, I might want to pull the resque 2 code for worker management in the future whenever we finish that. For #2, we're playing around with using flatfiles potentially on S3. @dpiddy can probably say more about that. The syncs are actually relatively fast once it's warmed up. I've been meaning to document most of the architecture but haven't gotten around to it. I've deployed the fix to prevent further gems from being messed up. I'm still working on the rake task and will get something out by tomorrow to fix the existing data. |
@hone, thanks for the response. Again, I totally understand the situation you were in when you wrote it. Nothing I am saying is an attack on you or the decisions you have made and I am not suggesting anything I have here is unique or magical. It's working, right? So, seriously, great job! I have watched the video I was mainly just responding to @indirect since he asked what I would suggest. For 1. I'm a huge fan of leveraging other libraries to do the pool and queue work for me - like Yeah 2, is just my gut reaction to seeing you keep a database that is the same as the production database to serve a different purpose. @evanphx mentioned spiking on some memcache work and the flat files on s3 could work too. In the end, I'd love to be involved, I have a growing understanding of the problem, and a desire to contribute so direct me where I can be the most help and I'll jump in. This is such a new project that it seems like a good time to question things and discuss them. |
@elskwid oh I didn't take it that way, sorry if my response came off that way. I just wanted to show the current state of where we stood on all the points. I'd love your help to make this better. The new data model is interesting. Let's chat about it off this issue? ping me on #bundler in freenode. |
Issue #17 describes a problem with gem
dependencies missing when viewed through the api.
These same dependencies are visible from the gem
and from the rubygems.org dependency list.
The fix is actually quite simple: the version_id
was not being captured in the case where new
versions are inserted for gems. As dependencies
are associated with both the gem and the version
(associations set up in the Rails app) the
API returns no dependencies when the version is
missing.
Also added the built_at attribute to be the same
as that used in rubygems.org - it was causing
postgres constraint violation in development.
(Fixes #17) Thanks to @phlipper for working with
me all day on this and to @rsutphin for doing a nice
job with the initial report.