Conversation
There was a problem hiding this comment.
The infamous nested repository feature strikes back! I don't think this will match test/mytest.git, for example.
|
Hi Paul, This looks pretty amazing. We'll definitely squeeze this in for 1.7.0. We'll also need some documentation on how to use this. I haven't used Github's LFS before. Do you know of an example repo? |
a9724a3 to
a4c7ae9
Compare
|
Hi James, thanks for the feedback, I felt there was a lot of duplication in the GitLFS specific filter that I created first time around so I've made some minor modifications to the accessRestrictionFilter to give access to the method and auth headers to allow the GitLFS filter to be part of the standard GitFilter. This has also allowed the default GitLFS path to be used to simply getting up and running with using the client. I'm not aware of there being any open access LFS repositories as GitHub appears to be testing only on a small subset of users, in fact, from the looks of things GitBlit will be one of the first open-source integrated providers! :) Even if there were open access though we still would be unable to use that for testing as this PR is only for the server side, not the client (that's pending a JGit update for smudge and clean filters) I've identified some limitations on the new help page, but I may have missed some so it would be great for another pair of eyes to go over that ;) As a result I've created tests for the servlet by creating the filestore aspects using the separately tested Manager functions. I've also updated the repository extraction to handle nested repositories and added tests for the regex for that. Let me know how it looks and if there's any changes you'd like to see, cheers |
There was a problem hiding this comment.
This should probably be canClone() to be consistent with the rest of Gitblit.
There was a problem hiding this comment.
Ah yes I had this thought a few times. I was going to put canClone() but then in the case that a user has view permissions but not clone permissions they would be unable to request the blob via the web interface (not sure how likely that is). Probably a more realistic scenario though is for in a later PR I'm hoping to use the filestore to handle ticket attachments. In that case I thought it may be more feasible to have someone with view but not clone rights, what do you think?
There was a problem hiding this comment.
Ah. That makes sense, leave it.
|
I made a couple of small suggestions. One thing that is giving me pause is how and how often the metadata is stored. I do like object serialization and I use it in a ton of other places in my work, but I'm not sure about it's use here as the source of truth for blob metadata. Would a JSON array of The other thing I notice is it looks like the backing store is out-of-sync with the in-memory representation until Gitblit is cleanly shutdown. In a perfect world of uninterrupted power and clean shutdowns, that would be fine but I'm thinking we need to persist the metadata when a blob is added & deleted. Do you agree? |
There was a problem hiding this comment.
Broken link. Need https://.
That's a good idea! I have to admit in the first instance I was just wanting to get something that worked without too many changes so serialization seemed the most straight forward approach, though I'm not sure how it would scale. Certainly JSON would be a better approach. Is there a particular example would recommend me look at, maybe the ticket journal implementation?
Fully agree! I wasn't sure what the best way is and thought persisting on each transaction might become an issue with serializing the whole map. I guess using JSON following the ticket journal approach would minimize this and could also provide an audit log if needed. In that case would you say JSON array or journal be better? |
|
I hadn't considered the ticket journal. That would actually be a pretty good fit. The ticket backends initially used a SHA1 for the ticket id (instead of a long) and the sharded directory structure - same as you are using in the LFS directory & similar to Gerrit. We could do something like Or we could go with JSON, for now, synchronize the add/delete methods, and ensure the metadata is re-serialized on those operations. I imagine that updating/removing the blobs probably has a low-incidence, at least for a Gitblit install. No doubt GitHub has to think harder about concurrency and persistence I/O bottlenecks with their 27 million(?) repos but I think Gitblit and it's userbase can tolerate a synchronized lock on the |
I wonder if this would provide a simple means by which to federate the filestore later on... In the mean time, going with the JSON array approach I think I'm almost there. Just having second thoughts on the use of |
aea3cc0 to
7269b81
Compare
|
@gitblit Ok I've implemented the metadata storage as an append-only JSON journal file so it automatically provides an audit log on each status-changing transaction. Also to simplify deserialization of the FilestoreModel I've switched to a string of the principal user name rather than the UserModel. Let me know if there's anything else that needs improving. |
There was a problem hiding this comment.
As you saw in ConfigUserService I usually write to a temp file and then rename to the target file on successful completion of write. I'd like to have that logic here.
There was a problem hiding this comment.
try (RandomAccessFile fs = new RandomAccessFile(metaFile, "rw"))
|
@paulsputer We're almost there. I made a few comments on making sure we cleanup streams and we safely persist your JSON array. |
+ Metadata maintained in append-only JSON file providing complete audit history. + Filestore menu item + Lists filestore items + Current size and availability + Link to GitBlit Filestore help page (top right) + Hooks into existing repository permissions + Uses default repository path for out-of-box operation with Git-LFS client + accessRestrictionFilter now has access to http method and auth header + Testing for servlet and manager
7269b81 to
bd0e83e
Compare
|
Thanks @gitblit, I've added in writing to a temporary file then copying, is this ok? I'm not caching the whole history of the changes, just the current status. Not sure which is the best approach. |
|
Looks good. 👍 |
Noticed quite a few people in the forums mentioning Git-LFS support but haven't seen any PR's yet so here is mine. I'm sure there's a fair bit of clean up needed yet but It would be great to get some feedback for improvements and would be fantastic if it could be part of the 1.7 release :)
Notes
<server>/<user>/<repoName>/info/lfs