Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.Sign up
[express] handle the "unclaimed" case in authentication & "asynchronize" owner read/write #107
Thanks Tom. I appreciate the contribution, but I do have a couple of questions though. I know I've got a lot of bugs still, but in the interest of learning as I go here it would help to know what bugs a commit fixes.
The changes to authenticated also worry me. It's route middleware that could potentially get called a lot, and now requires a hit to the file system where it didn't before. I'm also not sure we want the default behavior of a wiki that thinks it's unclaimed to be treat everyone like they own it. The idea now is that an unclaimed wiki can be claimed, but to exploit that on an established wiki you would have to break it in such a way that the wiki thought it wasn't claimed, but it wasn't so broken that you couldn't reclaim it yourself. With this new setup anything falsy happens in the path.exists call and the wiki is accepting all writes.
The setOwner call back stuff is probably a good idea though. I hadn't run into a problem, but that should be more robust going forward.
Quick response before I crawl into bed -- I've already stayed up far too late :)
As you've identified, there's room for improvement on the performance of authenticated() -- we can cache the "claimed" state in memory, for example.
wrt these specific changes, we were just trying to address what I think might just be a bit of a misunderstanding: if I understood Ward correctly (and I'm sure he'll correct me if I'm lying!) an "unclaimed" Wiki should be editable by anyone. The ruby/sinatra code base reflects this (see server.rb:84 and server.rb:208 as at b24872c...).
Once a Wiki has been claimed, the openid filesystem "tag" gets created. The logic in master for the express server seems to be overlooking the "unclaimed" case.
Hope I'm making sense there!
I don't understand most of what you were saying from "The idea now is ..." onwards, sorry -- it must be getting too late.
And yup, the setOwner thing was more a simple "correctness" thing. Wasn't causing issues, wasn't likely to cause issues, but theoretically it could have caused issues. :)
In any case, thanks for taking a look.
Yeah, as you can tell I was up way too late as well :)
So instead of having separate claimed logic like the ruby implementation does, I've instead been using the owner variable to show that the wiki is unclaimed if it is falsy. The whole
Good point on how the ruby works, I had missed that. So I'll have to champion the idea that current behavior is wrong :)
I think that this behavior would improve security, making the default action if a wiki is broken and can't tell who owns it to be read only instead of accept all writes. It would also prevent heartbreak from users in the case where someone starts editing a wiki thinking "i'll claim this later, for whatever inane reason i'm too lazy right now, even though I'm putting in time, effort, and important data", and then someone else just comes along and claims it out from under them.
Looking back up I appear to have mostly been babbling incoherently after "The idea now is", sorry. I think I covered my arguments in a less crazy way this time... I hope.
I am absolutely glad for "correctness" changes, my ideas on best practices in node are still evolving, so that is actually quite helpful :)
I'm going to hold off on this merge until Nick has a chance to think through the logic of the changes we made. I'll explain myself here.
Why make unclaimed sites writable?
Stephen and I made this choice when we first added authentication. I encouraged it because it preserved the existing behavior and kept our integration tests working. Its also how the original wiki worked: 100% open.
Nick correctly points out that anyone could claim a shared wiki. That doesn't make sense. It is one more way that our federated codebase does not support effective sharing. This doesn't concern me. I'd rather find more ways to get people into their own servers.
I've used this "write first and claim later" workflow to amaze people when I demo. Faced with the protest: "I don't want to setup a server" I just say, "You already have one". We then go to "their site" on my farm and start editing. Two edits into the experience they start wondering how this could possibly happen and what makes it theirs. Then I say, one more step: claim this site by pressing this (G) button right here. Done.
This is an instance of the Have-Fun-First pattern for encouraging viral behavior. Here is what the viral loop could look like:
I'm suspicious of most viral loops as get rich quick schemes. However, since we can return to the author (or anyone else) the complete contents of the site as one large JSON, we give everyone the opportunity to defect. Git has this property too. GitHub does not.
Why touch the filesytem with each request?
Touching the filesystem with each request is foolish in a high volume environment. And, of course, high volume is where node.js is known to shine.
However, when I first got the farm logic working in sinatra I blew an important demo because I had cached information incorrectly flowing between sites. I've since de-optimized the sinatra server in order to get the logic right first.
My suggestion is to write node.js code in the style that confines de-optimizations and follows all the performance patterns one would expect of node.js code, even if it isn't currently optimized.
More discussion? Do the pull? What do you think?
I definitely agree that the do things fist and set it up later workflow amazes people. However I think a typical scenario will see this demoed virally with the demoers own wiki, making the magic editing locally, and then pushing the demoee's changes up to a farmed server they then own. The other way still works with an unclaimed site and default-data, but the demo is local until the claim happens. I think this setup does more to funnel visitors into contributors then an open site, however I'm willing to give up this argument to you're judgement here.
However, I'm still going to argue about how we handle the claimed/owned state internally in the express port :)
This commit moves us to what I see as a bad combination of the two where we track ownership in memory, but then don't trust that owner information as to the claimed status. While I agree that emphasizing the logic over the performance is key at this stage,I don't think that tracking the owner in memory and persisting to disk is premature optimization.
If i'm not overstepping my bounds what I'd like to do is merge in the setOwner cb changes, but remove isClaimed, and instead trust that a truthy owner is the same as the wiki being claimed, and go with your judgment on changing an unclaimed wiki until I can come up with more compelling arguments :)
Nick, I like your plan. Can you pull the mods from Tom's repo and put together a pull request?
One thing more you could do: We thought that the claim logic modifications would make the integration tests pass again. Unfortunately we had trouble getting RSpec/Sinatra running on Tom's machine. It was late. Would you see if our work makes the tests pass again. It would make us feel good to know that. Thanks.