Add LFS Migration and Mirror #14726

Merged
merged 85 commits into from
Apr 8, 2021
Changes from 22 commits
Commits
5e4642d
Implemented LFS client.
KN4CK3R Feb 18, 2021
6a236f1
Implemented scanning for pointer files.
KN4CK3R Feb 18, 2021
ac8127b
Implemented downloading of lfs files.
KN4CK3R Feb 18, 2021
0189942
Moved model-dependent code into services.
KN4CK3R Feb 18, 2021
9aa898f
Removed models dependency. Added TryReadPointerFromBuffer.
KN4CK3R Feb 18, 2021
4fd390f
Migrated code from service to module.
KN4CK3R Feb 18, 2021
f94c599
Centralised storage creation.
KN4CK3R Feb 18, 2021
85621c9
Removed dependency from models.
KN4CK3R Feb 18, 2021
ae60e34
Moved ContentStore into modules.
KN4CK3R Feb 18, 2021
37fa1c2
Share structs between server and client.
KN4CK3R Feb 18, 2021
c6ce58c
Moved method to services.
KN4CK3R Feb 18, 2021
7a86be8
Implemented lfs download on clone.
KN4CK3R Feb 18, 2021
68e34cd
Implemented LFS sync on clone and mirror update.
KN4CK3R Feb 18, 2021
4775b80
Added form fields.
KN4CK3R Feb 18, 2021
e458d64
Updated templates.
KN4CK3R Feb 18, 2021
4d2abc3
Fixed condition.
KN4CK3R Feb 18, 2021
a5e64dd
Use alternate endpoint.
KN4CK3R Feb 18, 2021
5615d1e
Added missing methods.
KN4CK3R Feb 19, 2021
a6b0d26
Fixed typo and make linter happy.
KN4CK3R Feb 19, 2021
dd889c6
Detached pointer parser from gogit dependency.
KN4CK3R Feb 19, 2021
86acab3
Fixed TestGetLFSRange test.
KN4CK3R Feb 19, 2021
534886c
Merge branch 'master' of https://github.com/go-gitea/gitea into featu…
KN4CK3R Feb 20, 2021
a951992
Added context to support cancellation.
KN4CK3R Feb 20, 2021
3b69a7d
Use ReadFull to probably read more data.
KN4CK3R Feb 21, 2021
ceefad0
Removed duplicated code from models.
KN4CK3R Feb 21, 2021
ef86fa8
Moved scan implementation into pointer_scanner_nogogit.
KN4CK3R Feb 21, 2021
bffe86c
Changed method name.
KN4CK3R Feb 21, 2021
50d4814
Added comments.
KN4CK3R Feb 21, 2021
e9024a5
Added more/specific log/error messages.
KN4CK3R Feb 21, 2021
59a55b3
Embedded lfs.Pointer into models.LFSMetaObject.
KN4CK3R Feb 22, 2021
649d236
Moved code from models to module.
KN4CK3R Feb 22, 2021
73ce39d
Moved code from models to module.
KN4CK3R Feb 22, 2021
80fe20b
Moved code from models to module.
KN4CK3R Feb 22, 2021
37a1d02
Reduced pointer usage.
KN4CK3R Feb 22, 2021
b0e0804
Embedded type.
KN4CK3R Feb 22, 2021
3779dea
Use promoted fields.
KN4CK3R Feb 22, 2021
932a370
Fixed unexpected eof.
KN4CK3R Feb 23, 2021
6a47890
Added unit tests.
KN4CK3R Feb 23, 2021
55618f0
Merge branch 'master' into feature-lfs-clone
KN4CK3R Mar 5, 2021
bbac160
Merge branch 'master' into feature-lfs-clone
KN4CK3R Mar 7, 2021
64dc148
Merge branch 'master' of https://github.com/go-gitea/gitea into featu…
KN4CK3R Mar 8, 2021
3e30421
Implemented migration of local file paths.
KN4CK3R Mar 8, 2021
0ef54f7
Show an error on invalid LFS endpoints.
KN4CK3R Mar 8, 2021
a44781f
Hide settings if not used.
KN4CK3R Mar 8, 2021
856416a
Added LFS info to mirror struct.
KN4CK3R Mar 10, 2021
5e9c606
Fixed comment.
KN4CK3R Mar 10, 2021
c0b5e50
Check LFS endpoint.
KN4CK3R Mar 10, 2021
4c127de
Manage LFS settings from mirror page.
KN4CK3R Mar 10, 2021
9cf903c
Fixed selector.
KN4CK3R Mar 10, 2021
745812b
Adjusted selector.
KN4CK3R Mar 10, 2021
8808d77
Added more tests.
KN4CK3R Mar 11, 2021
4fcc2f6
Added local filesystem migration test.
KN4CK3R Mar 11, 2021
3283e58
Fixed typo.
KN4CK3R Mar 11, 2021
d190403
Reset settings.
KN4CK3R Mar 11, 2021
4e864f3
Added special windows path handling.
KN4CK3R Mar 11, 2021
750ad79
Added unit test for HTTPClient.
KN4CK3R Mar 12, 2021
8fd1f7d
Added unit test for BasicTransferAdapter.
KN4CK3R Mar 12, 2021
028cbd9
Moved into util package.
KN4CK3R Mar 13, 2021
8df4b21
Test if LFS endpoint is allowed.
KN4CK3R Mar 13, 2021
72b2a08
Merge branch 'master' into feature-lfs-clone
KN4CK3R Mar 13, 2021
d8844ba
Merge branch 'master' into feature-lfs-clone
KN4CK3R Mar 15, 2021
27ad4e4
Merge branch 'master' of https://github.com/go-gitea/gitea into featu…
KN4CK3R Mar 16, 2021
5b33ef4
Merge branch 'master' of https://github.com/go-gitea/gitea into featu…
KN4CK3R Mar 17, 2021
184a949
Added support for git://
KN4CK3R Mar 17, 2021
5a1f3ed
Merge branch 'master' of https://github.com/go-gitea/gitea into featu…
KN4CK3R Mar 18, 2021
b7774e5
Merge branch 'master' of https://github.com/go-gitea/gitea into featu…
KN4CK3R Mar 21, 2021
ba58438
Merge branch 'master' of https://github.com/go-gitea/gitea into featu…
KN4CK3R Mar 21, 2021
a727b46
Just use a static placeholder as the displayed url may be invalid.
KN4CK3R Mar 21, 2021
60a4177
Reverted to original code.
KN4CK3R Mar 21, 2021
2555639
Added "Advanced Settings".
KN4CK3R Mar 23, 2021
1eae7e1
Updated wording.
KN4CK3R Mar 23, 2021
70d0594
Added discovery info link.
KN4CK3R Mar 23, 2021
8b4fedd
Implemented suggestion.
KN4CK3R Mar 23, 2021
1c7181d
Fixed missing format parameter.
KN4CK3R Mar 25, 2021
2a80bd9
Added Pointer.IsValid().
KN4CK3R Mar 25, 2021
b49dc68
Always remove model on error.
KN4CK3R Mar 27, 2021
7aeeaf9
Merge branch 'master' of https://github.com/go-gitea/gitea into featu…
KN4CK3R Apr 5, 2021
32058a1
Merge branch 'master' of https://github.com/go-gitea/gitea into featu…
KN4CK3R Apr 6, 2021
71477b9
Merge branch 'master' of https://github.com/go-gitea/gitea into featu…
KN4CK3R Apr 7, 2021
a9b3ca2
Added suggestions.
KN4CK3R Apr 7, 2021
b5c1649
Use channel instead of array.
KN4CK3R Apr 8, 2021
70a4b72
Update routers/repo/migrate.go
zeripath Apr 8, 2021
45d560c
Merge branch 'master' into feature-lfs-clone
zeripath Apr 8, 2021
0a2593d
fmt
zeripath Apr 8, 2021
0592552
Merge branch 'master' into feature-lfs-clone
zeripath Apr 8, 2021
2 changes: 1 addition & 1 deletion cmd/serv.go
@@ -18,11 +18,11 @@ import (
"time"

"code.gitea.io/gitea/models"
"code.gitea.io/gitea/modules/lfs"
"code.gitea.io/gitea/modules/log"
"code.gitea.io/gitea/modules/pprof"
"code.gitea.io/gitea/modules/private"
"code.gitea.io/gitea/modules/setting"
"code.gitea.io/gitea/services/lfs"

"github.com/dgrijalva/jwt-go"
"github.com/kballard/go-shellquote"
8 changes: 4 additions & 4 deletions integrations/lfs_getobject_test.go
@@ -18,7 +18,6 @@ import (
"code.gitea.io/gitea/models"
"code.gitea.io/gitea/modules/lfs"
"code.gitea.io/gitea/modules/setting"
"code.gitea.io/gitea/modules/storage"
"code.gitea.io/gitea/routers/routes"

gzipp "github.com/klauspost/compress/gzip"
@@ -50,11 +49,12 @@ func storeObjectInRepo(t *testing.T, repositoryID int64, content *[]byte) string
lfsID++
lfsMetaObject, err = models.NewLFSMetaObject(lfsMetaObject)
assert.NoError(t, err)
contentStore := &lfs.ContentStore{ObjectStorage: storage.LFS}
exist, err := contentStore.Exists(lfsMetaObject)
pointer := lfsMetaObject.AsPointer()
contentStore := lfs.NewContentStore()
exist, err := contentStore.Exists(pointer)
assert.NoError(t, err)
if !exist {
err := contentStore.Put(lfsMetaObject, bytes.NewReader(*content))
err := contentStore.Put(pointer, bytes.NewReader(*content))
assert.NoError(t, err)
}
return oid
7 changes: 7 additions & 0 deletions models/lfs.go
@@ -12,6 +12,7 @@ import (
"io"
"path"

"code.gitea.io/gitea/modules/lfs"
"code.gitea.io/gitea/modules/timeutil"

"xorm.io/builder"
@@ -41,6 +42,12 @@ func (m *LFSMetaObject) Pointer() string {
return fmt.Sprintf("%s\n%s%s\nsize %d\n", LFSMetaFileIdentifier, LFSMetaFileOidPrefix, m.Oid, m.Size)
}

// AsPointer creates a Pointer with Oid and Size
func (m *LFSMetaObject) AsPointer() *lfs.Pointer {
pointer := &lfs.Pointer{Oid: m.Oid, Size: m.Size}
return pointer
}

// LFSTokenResponse defines the JSON structure in which the JWT token is stored.
// This structure is fetched via SSH and passed by the Git LFS client to the server
// endpoint for authorization.
2 changes: 2 additions & 0 deletions modules/forms/repo_form.go
@@ -73,6 +73,8 @@ type MigrateRepoForm struct {
// required: true
RepoName string `json:"repo_name" binding:"Required;AlphaDashDot;MaxSize(100)"`
Mirror bool `json:"mirror"`
LFS bool `json:"lfs"`
LFSEndpoint string `json:"lfs_endpoint"`
Private bool `json:"private"`
Description string `json:"description" binding:"MaxSize(255)"`
Wiki bool `json:"wiki"`
111 changes: 111 additions & 0 deletions modules/lfs/client.go
@@ -0,0 +1,111 @@
// Copyright 2021 The Gitea Authors. All rights reserved.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.

package lfs

import (
"bytes"
"encoding/json"
"errors"
"fmt"
"io"
"net/http"
"strings"
)

// Client is used to communicate with the LFS server
type Client struct {
client *http.Client
transfers map[string]TransferAdapter
}

// NewClient creates a LFS client
func NewClient(hc *http.Client) *Client {
client := &Client{hc, make(map[string]TransferAdapter)}

basic := &BasicTransferAdapter{hc}

client.transfers[basic.Name()] = basic

return client
}

func (c *Client) transferNames() []string {
keys := make([]string, len(c.transfers))

i := 0
for k := range c.transfers {
keys[i] = k
i++
}

return keys
}

func (c *Client) batch(url, operation string, objects []*Pointer) (*BatchResponse, error) {
url = fmt.Sprintf("%s.git/info/lfs/objects/batch", strings.TrimSuffix(url, ".git"))

request := &BatchRequest{operation, c.transferNames(), nil, objects}

payload := new(bytes.Buffer)
err := json.NewEncoder(payload).Encode(request)
if err != nil {
return nil, err
}

req, err := http.NewRequest("POST", url, payload)
if err != nil {
return nil, err
}
req.Header.Set("Content-type", MediaType)
req.Header.Set("Accept", MediaType)

res, err := c.client.Do(req)
if err != nil {
return nil, err
}
defer res.Body.Close()

if res.StatusCode != http.StatusOK {
return nil, fmt.Errorf("Unexpected server response: %s", res.Status)
}

var response BatchResponse
err = json.NewDecoder(res.Body).Decode(&response)
if err != nil {
return nil, err
}

if len(response.Transfer) == 0 {
response.Transfer = "basic"
}

return &response, nil
}

// Download reads the specific LFS object from the LFS server
func (c *Client) Download(url, oid string, size int64) (io.ReadCloser, error) {
var objects []*Pointer
objects = append(objects, &Pointer{oid, size})

result, err := c.batch(url, "download", objects)
if err != nil {
return nil, err
}

transferAdapter, ok := c.transfers[result.Transfer]
if !ok {
return nil, fmt.Errorf("Transferadapter not found: %s", result.Transfer)
}

if len(result.Objects) == 0 {
return nil, errors.New("No objects in result")
}

content, err := transferAdapter.Download(result.Objects[0])
if err != nil {
return nil, err
}
return content, nil
}
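
The batch call above can be illustrated in isolation. The following is a minimal sketch, not the PR's implementation: `pointer` and `batchRequest` are hypothetical stand-ins for the `Pointer` and `BatchRequest` types (whose definitions are not shown in this diff), and the JSON field names are an assumption based on the Git LFS Batch API. Only the URL construction is taken directly from `Client.batch`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// pointer and batchRequest are hypothetical stand-ins for the PR's Pointer
// and BatchRequest types; the JSON tags are assumed from the LFS Batch API.
type pointer struct {
	Oid  string `json:"oid"`
	Size int64  `json:"size"`
}

type batchRequest struct {
	Operation string     `json:"operation"`
	Transfers []string   `json:"transfers,omitempty"`
	Objects   []*pointer `json:"objects"`
}

// batchURL mirrors the URL construction in Client.batch: a trailing ".git"
// is trimmed first so the suffix is never doubled, then the batch path is
// appended.
func batchURL(repoURL string) string {
	return fmt.Sprintf("%s.git/info/lfs/objects/batch", strings.TrimSuffix(repoURL, ".git"))
}

// encodeBatch serialises a download request for a single object.
func encodeBatch(oid string, size int64) (string, error) {
	req := &batchRequest{
		Operation: "download",
		Transfers: []string{"basic"},
		Objects:   []*pointer{{Oid: oid, Size: size}},
	}
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Both spellings of the clone URL resolve to the same batch endpoint.
	fmt.Println(batchURL("https://example.com/owner/repo"))
	fmt.Println(batchURL("https://example.com/owner/repo.git"))

	body, err := encodeBatch("abc", 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

Trimming the suffix before appending it means callers may pass the clone URL with or without `.git`.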
76 changes: 49 additions & 27 deletions modules/lfs/content_store.go
@@ -11,91 +11,107 @@ import (
"fmt"
"io"
"os"
"path"

"code.gitea.io/gitea/models"
"code.gitea.io/gitea/modules/log"
"code.gitea.io/gitea/modules/storage"
)

var (
errHashMismatch = errors.New("Content hash does not match OID")
errSizeMismatch = errors.New("Content size does not match")
// ErrHashMismatch occurs if the content hash does not match OID
ErrHashMismatch = errors.New("Content hash does not match OID")
// ErrSizeMismatch occurs if the content size does not match
ErrSizeMismatch = errors.New("Content size does not match")
)

// ErrRangeNotSatisfiable represents an error where the requested range is not satisfiable.
type ErrRangeNotSatisfiable struct {
FromByte int64
}

func (err ErrRangeNotSatisfiable) Error() string {
return fmt.Sprintf("Requested range %d is not satisfiable", err.FromByte)
}

// IsErrRangeNotSatisfiable returns true if the error is an ErrRangeNotSatisfiable
func IsErrRangeNotSatisfiable(err error) bool {
_, ok := err.(ErrRangeNotSatisfiable)
return ok
}

func (err ErrRangeNotSatisfiable) Error() string {
return fmt.Sprintf("Requested range %d is not satisfiable", err.FromByte)
}

// ContentStore provides a simple file system based storage.
type ContentStore struct {
storage.ObjectStorage
}

// NewContentStore creates the default ContentStore
func NewContentStore() *ContentStore {
contentStore := &ContentStore{ObjectStorage: storage.LFS}
return contentStore
}

func relativePath(p *Pointer) string {
if len(p.Oid) < 5 {
return p.Oid
}

return path.Join(p.Oid[0:2], p.Oid[2:4], p.Oid[4:])
}

// Get takes a pointer and retrieves the content from the store, returning
// it as an io.Reader. If fromByte > 0, the reader starts from that byte
func (s *ContentStore) Get(meta *models.LFSMetaObject, fromByte int64) (io.ReadCloser, error) {
f, err := s.Open(meta.RelativePath())
func (s *ContentStore) Get(pointer *Pointer, fromByte int64) (io.ReadCloser, error) {
f, err := s.Open(relativePath(pointer))
if err != nil {
log.Error("Whilst trying to read LFS OID[%s]: Unable to open Error: %v", meta.Oid, err)
log.Error("Whilst trying to read LFS OID[%s]: Unable to open Error: %v", pointer.Oid, err)
return nil, err
}
if fromByte > 0 {
if fromByte >= meta.Size {
if fromByte >= pointer.Size {
return nil, ErrRangeNotSatisfiable{
FromByte: fromByte,
}
}
_, err = f.Seek(fromByte, io.SeekStart)
if err != nil {
log.Error("Whilst trying to read LFS OID[%s]: Unable to seek to %d Error: %v", meta.Oid, fromByte, err)
log.Error("Whilst trying to read LFS OID[%s]: Unable to seek to %d Error: %v", pointer.Oid, fromByte, err)
}
}
return f, err
}

// Put takes a pointer and an io.Reader and writes the content to the store.
func (s *ContentStore) Put(meta *models.LFSMetaObject, r io.Reader) error {
func (s *ContentStore) Put(pointer *Pointer, r io.Reader) error {
hash := sha256.New()
rd := io.TeeReader(r, hash)
p := meta.RelativePath()
p := relativePath(pointer)
written, err := s.Save(p, rd)
if err != nil {
log.Error("Whilst putting LFS OID[%s]: Failed to copy to tmpPath: %s Error: %v", meta.Oid, p, err)
log.Error("Whilst putting LFS OID[%s]: Failed to copy to tmpPath: %s Error: %v", pointer.Oid, p, err)
return err
}

if written != meta.Size {
if written != pointer.Size {
if err := s.Delete(p); err != nil {
log.Error("Cleaning the LFS OID[%s] failed: %v", meta.Oid, err)
log.Error("Cleaning the LFS OID[%s] failed: %v", pointer.Oid, err)
}
return errSizeMismatch
return ErrSizeMismatch
}

shaStr := hex.EncodeToString(hash.Sum(nil))
if shaStr != meta.Oid {
if shaStr != pointer.Oid {
if err := s.Delete(p); err != nil {
log.Error("Cleaning the LFS OID[%s] failed: %v", meta.Oid, err)
log.Error("Cleaning the LFS OID[%s] failed: %v", pointer.Oid, err)
}
return errHashMismatch
return ErrHashMismatch
}

return nil
}

// Exists returns true if the object exists in the content store.
func (s *ContentStore) Exists(meta *models.LFSMetaObject) (bool, error) {
_, err := s.ObjectStorage.Stat(meta.RelativePath())
func (s *ContentStore) Exists(pointer *Pointer) (bool, error) {
_, err := s.ObjectStorage.Stat(relativePath(pointer))
if err != nil {
if os.IsNotExist(err) {
return false, nil
@@ -106,15 +122,21 @@ func (s *ContentStore) Exists(meta *models.LFSMetaObject) (bool, error) {
}

// Verify returns true if the object exists in the content store and size is correct.
func (s *ContentStore) Verify(meta *models.LFSMetaObject) (bool, error) {
p := meta.RelativePath()
func (s *ContentStore) Verify(pointer *Pointer) (bool, error) {
p := relativePath(pointer)
fi, err := s.ObjectStorage.Stat(p)
if os.IsNotExist(err) || (err == nil && fi.Size() != meta.Size) {
if os.IsNotExist(err) || (err == nil && fi.Size() != pointer.Size) {
return false, nil
} else if err != nil {
log.Error("Unable to stat file: %s for LFS OID[%s] Error: %v", p, meta.Oid, err)
log.Error("Unable to stat file: %s for LFS OID[%s] Error: %v", p, pointer.Oid, err)
return false, err
}

return true, nil
}

// ReadMetaObject will open the content referenced by the pointer and return a reader
func ReadMetaObject(pointer *Pointer) (io.ReadCloser, error) {
Review comment (Contributor): May need to consider cancellation here and elsewhere in this file, but that can probably wait for another PR.

contentStore := NewContentStore()
return contentStore.Get(pointer, 0)
}
54 changes: 54 additions & 0 deletions modules/lfs/pointer.go
@@ -0,0 +1,54 @@
// Copyright 2021 The Gitea Authors. All rights reserved.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.

package lfs

import (
"io"
"strconv"
"strings"
)

const (
blobSizeCutoff = 1024

// TODO remove duplicate from models

// LFSMetaFileIdentifier is the string appearing at the first line of LFS pointer files.
// https://github.com/git-lfs/git-lfs/blob/master/docs/spec.md
LFSMetaFileIdentifier = "version https://git-lfs.github.com/spec/v1"

// LFSMetaFileOidPrefix appears in LFS pointer files on a line before the sha256 hash.
LFSMetaFileOidPrefix = "oid sha256:"
)

// TryReadPointer tries to read LFS pointer data from the reader
func TryReadPointer(reader io.Reader) *Pointer {
buf := make([]byte, blobSizeCutoff)
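// A single Read may return fewer bytes than are available; ReadFull keeps
// reading until the buffer is full or the input ends.
n, _ := io.ReadFull(reader, buf)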
buf = buf[:n]

return TryReadPointerFromBuffer(buf)
}

// TryReadPointerFromBuffer will return a pointer if the provided byte slice is a pointer file or nil otherwise.
func TryReadPointerFromBuffer(buf []byte) *Pointer {
headString := string(buf)
if !strings.HasPrefix(headString, LFSMetaFileIdentifier) {
return nil
}

splitLines := strings.Split(headString, "\n")
if len(splitLines) < 3 {
return nil
}

oid := strings.TrimPrefix(splitLines[1], LFSMetaFileOidPrefix)
size, err := strconv.ParseInt(strings.TrimPrefix(splitLines[2], "size "), 10, 64)
if len(oid) != 64 || err != nil {
return nil
}

return &Pointer{Oid: oid, Size: size}
}