Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

handle max storage when gc is disabled #5768

Closed
wants to merge 1 commit into from
Closed

Conversation

kjzz
Copy link
Contributor

@kjzz kjzz commented Nov 12, 2018

feature #5140

License: MIT
Signed-off-by: Kejie Zhang 601172892@qq.com

@schomatis
Copy link
Contributor

@kjzz Can I trouble you with a brief description of the PR please? Just a couple of sentences explaining what did you wanted to achieve and how, thanks!

@kjzz
Copy link
Contributor Author

kjzz commented Nov 20, 2018

Hi @schomatis , this pr is aimed to solve the problem that when a repo size limit is configured bug gc is disabled (i.e., no --enable-gc flag has been passed), I think we should have multiple levels of warnings:

1.warning when datastore usage is at 90%
2.non fatal error when datastore usage is at 95%
3.fatal error, preventing startup when datastore usage is at 99%

you can also see the discussion in issue #5140 .

@@ -24,6 +25,10 @@ import (
mfs "gx/ipfs/QmcUXFi2Fp7oguoFT81f2poJpnb44dFkZanQhDBHMoYyG9/go-mfs"
)

var (
errFullStorage = errors.New("DataStorage is full,please enable gc and restart node")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add a space after the "," (same for the log errors below)

@@ -126,6 +131,23 @@ func (api *UnixfsAPI) Add(ctx context.Context, files files.File, opts ...options
fileAdder.SetMfsRoot(mr)
}

// Get storageMax when node is online.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Add the only entry point that will make the repo grow?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Unfortunately, this really needs to go much lower. We probably need some kind of datastore wrapper that rejects writes when the datastore is full (and calls a user-provided "free" function at N% of full).

@@ -126,6 +131,23 @@ func (api *UnixfsAPI) Add(ctx context.Context, files files.File, opts ...options
fileAdder.SetMfsRoot(mr)
}

// Get storageMax when node is online.
if fileAdder.Pin && !n.EnableGC && n.OnlineMode() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check for Pin here? the file will be added anyway even if we don't pin it, right? (it can be removed later but it will make the repo grow)

@@ -126,6 +131,23 @@ func (api *UnixfsAPI) Add(ctx context.Context, files files.File, opts ...options
fileAdder.SetMfsRoot(mr)
}

// Get storageMax when node is online.
if fileAdder.Pin && !n.EnableGC && n.OnlineMode() {
storageUsage, err := n.Repo.GetStorageUsage()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is GetStorageUsage a faithful metric of repo size? Is it compatible with StorageMax? (this is a naive question but I'm not familiar with the repo API)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my testing , maybe GetStorageUsage is a faithful metric of repo size.

return nil, err
}
if !gc.EnableGC {
log.Warningf("GC: %s", "Repo GC is disabled.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this warning? Is it part of the original intention of the PR?

Copy link
Contributor Author

@kjzz kjzz Jan 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it part of the original intention of the PR?

We should print some warning message when the gc is disable.
You can see the comment in the issue

@@ -23,10 +23,12 @@ var ErrMaxStorageExceeded = errors.New("maximum storage limit exceeded. Try to u
type GC struct {
Node *core.IpfsNode
Repo repo.Repo
EnableGC bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this second EnableGC?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I need to get this param in func maybeRunGC(req *cmds.Request, node *core.IpfsNode) (<-chan error, error) .

if err != nil {
return nil, err
}
gc.EnableGC = enableGC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep enableGC set-value and load-value operations as close as possible, move the enableGC, _ := req.Options[enableGCKwd].(bool) line down here if it's not used anywhere else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because i should get this param in other function.

func maybeRunGC(req *cmds.Request, node *core.IpfsNode) (<-chan error, error) {
	gc, err := initGC(req, node)
	if err != nil {
		return nil, err
	}
	if !gc.EnableGC {
		log.Warningf("GC: %s", "Repo GC is disabled.")
		return nil, nil
	}
.........

}

if cfg.Datastore.GCPeriod == "" {
cfg.Datastore.GCPeriod = "1h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this PR (since this was already present in the code) but could we add a warning here to notify the user the GCPeriod wasn't set and we're using a default value of an hour?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

if err != nil {
return nil, err
}
storPercentage := float64(storageUsage) / float64(n.StorageMax)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From gc.go I'm seeing that StorageMax can be left unset by the user and we'll silently assume a 10GB limit which can make this test a little brittle, a user who just doesn't care about any of this (didn't set the GC, didn't set a max capacity for the repo) will hit this error (which will even prevent it from using IPFS) without understanding what's really going on (and how to fix it).

@schomatis
Copy link
Contributor

schomatis commented Nov 20, 2018

Left several comments but the most important one to address first is #5768 (diff) (which may require a change in the approach of the PR's proposed solution).

@schomatis
Copy link
Contributor

ping @kjzz

@schomatis schomatis added the need/author-input Needs input from the original author label Dec 13, 2018
@kjzz
Copy link
Contributor Author

kjzz commented Dec 14, 2018

Sorry @schomatis,recently I'm preparing the opening report of my graduation thesis.I will do for this on next week.Thx.

@schomatis
Copy link
Contributor

Hey no rush, take your time, definitively concentrate on your thesis now :)

License: MIT
Signed-off-by: Kejie Zhang <601172892@qq.com>
@kjzz
Copy link
Contributor Author

kjzz commented Jan 4, 2019

@schomatis sorry for delay,i have update my pr.please help me review it again.Thx a lot!

@magik6k magik6k added need/review Needs a review and removed need/author-input Needs input from the original author labels Jan 7, 2019
Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for jumping in on this very late but, unfortunately, we need to handle this case at a much lower level. Ideally at the datastore level but we probably don't want to reject writes of non-blocks for any reason (will make the daemon unusable).

I believe the correct approach (for now) is to do this at the blockstore level. That is, we'll have to modify the blockstore to reject puts if RepoSize grows too much.

}

// PeriodicGC create a goroutine to gc.
func PeriodicGC(ctx context.Context, gc *GC) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, this should just be a function on the GC instance.

@@ -126,6 +131,23 @@ func (api *UnixfsAPI) Add(ctx context.Context, files files.File, opts ...options
fileAdder.SetMfsRoot(mr)
}

// Get storageMax when node is online.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Unfortunately, this really needs to go much lower. We probably need some kind of datastore wrapper that rejects writes when the datastore is full (and calls a user-provided "free" function at N% of full).

@momack2 momack2 added this to Needs Review in ipfs/go-ipfs May 9, 2019
@Stebalien Stebalien closed this May 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need/review Needs a review
Projects
No open projects
ipfs/go-ipfs
Needs Review
Development

Successfully merging this pull request may close these issues.

None yet

4 participants