
[pmempool create] should support auto growing poolsets (directories) #4223

Closed
sscargal opened this issue Aug 1, 2019 · 11 comments
Labels: pmempool (src/libpmempool and src/tools/pmempool), Type: Feature (a feature request), won't do (the requested improvement is not planned to be done)
Comments

@sscargal commented Aug 1, 2019

FEAT: 'pmempool create' should support creating auto-growing poolsets

Rationale

This feature request differs from the pmempool resize feature (#4170): resize lets users grow an existing single pool, whereas this request lets the user create a poolset that, from the start, auto-grows on demand up to the limit of available space within the filesystem.

At creation time, it is not always known how large a pool should be; the current amount of data plus some room to grow is only a starting point. Poolsets support dynamic growth (in directory mode) by adding small 128MB part files to an existing poolset, but this requires manual administration to create the initial DIRECTORY-based poolset. Currently, pmempool create does not support this.

Description

It would be nice if pmempool create allowed the user to specify a base directory, initial size, optional max size, and growth chunk size.

From poolset(5) - http://pmem.io/pmdk/manpages/linux/master/poolset/poolset.5

DIRECTORIES
Providing a directory as a part’s pathname allows the pool to dynamically create files and consequently removes the user-imposed limit on the size of the pool.

The size argument of a part in a directory poolset becomes the size of the address space reservation required for the pool. In other words, the size argument is the maximum theoretical size of the mapping. This value can be freely increased between instances of the application, but decreasing it below the real required space will result in an error when attempting to open the pool.

The directory must NOT contain user created files with extension .pmem, otherwise the behavior is undefined. If a file created by the library within the directory is in any way altered (resized, renamed) the behavior is undefined.

A directory poolset must exclusively use directories to specify paths - combining files and directories will result in an error. A single replica can consist of one or more directories. If there are multiple directories, the address space reservation is equal to the sum of the sizes.

The order in which the files are created is unspecified, but the library will try to maintain equal usage of the directories.

By default pools grow in 128 megabyte increments.

Only poolsets with the SINGLEHDR option can safely use directories.
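For reference, the kind of directory-based poolset described above can already be written by hand today; a minimal sketch (the mount point /mnt/pmemfs, the 10GiB reservation, and the file name are placeholders, not values from this proposal):

```shell
# Hand-written directory-based poolset of the kind `pmempool create
# --autogrow` would have to generate. Paths and sizes are placeholders.
cat > myautogrow.set <<'EOF'
PMEMPOOLSET
OPTION SINGLEHDR
10GiB /mnt/pmemfs/
EOF

# The 10GiB line is the address-space reservation (the maximum size), not
# an upfront allocation; the library creates *.pmem part files inside the
# directory as the pool grows.
cat myautogrow.set
```

This is exactly the manual step the proposed --autogrow option would automate.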

API Changes

pmempool needs to accept both files and directories as input arguments; currently it assumes a file.

Internally, no API changes should be needed. The feature is already integrated into PMDK; we just need a user interface to set it up. The only caveat may be if we want to support multiple directories or file systems for storing the poolset parts, in case one fills up.

Implementation details

The proposal would provide the following user command options and extend the use of existing ones (--size and --max-size):

pmempool create [<options>] [<type>] [<bsize>] <file|directory>

Available options:
       -a, --autogrow

       Create an auto growing poolset

       -g, --growby <size>

       Auto growing poolsets will automatically grow by the given increment size.  Defaults to 128MiB.

       -M, --max-size <size>

Set the maximum size of an auto-growing pool. If no value is provided, the maximum defaults to the available space of the underlying file system.

       -s, --size <size>

       Size of pool file or initial size of auto growing pools.
Examples:

Example 1:
Create an auto-growing pool within the mounted /mnt/pmemfs file system with an initial size of 1GiB. This pool will grow until the file system runs out of space.

    pmempool create --autogrow --size=1GiB /mnt/pmemfs

Example 2:
Create an auto-growing pool within the mounted /mnt/pmemfs file system with an initial size of 1GiB and a maximum size of 10GiB. The pool will grow in 1GiB increments.

    pmempool create --autogrow --size=1GiB --max-size=10GiB --growby=1GiB /mnt/pmemfs

If we wanted to support concatenating or striping auto growing pools across multiple file systems, we should also allow this syntax:

pmempool create [<options>] [<type>] [<bsize>] <file|directory> ...

Example 3:
Create an auto-growing pool spanning the mounted /mnt/pmemfs0, /mnt/pmemfs1, and /mnt/pmemfs2 file systems with an initial size of 1GiB. The pool will grow in 1GiB increments.

    pmempool create --autogrow --size=1GiB --growby=1GiB /mnt/pmemfs0 /mnt/pmemfs1 /mnt/pmemfs2
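If multi-directory support were added, the command in Example 3 would presumably generate a poolset equivalent to a hand-written one like this (a sketch; the 10GiB per-directory reservations are invented, and per poolset(5) the total reservation is the sum of the sizes):

```shell
# Hypothetical poolset equivalent to Example 3: one replica spread across
# three directories. The per-directory sizes are invented for illustration.
cat > multi.set <<'EOF'
PMEMPOOLSET
OPTION SINGLEHDR
10GiB /mnt/pmemfs0/
10GiB /mnt/pmemfs1/
10GiB /mnt/pmemfs2/
EOF

# Per poolset(5), part files are created in an unspecified order, but the
# library tries to keep usage of the directories balanced.
```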


@pbalcer (Member) commented Aug 1, 2019

That's a good idea, and should be relatively simple to implement.

@seghcder (Contributor) commented Aug 1, 2019

Coming from an operations perspective, could we also include shrinking pmem files? I imagine it's harder, since we'd need to move the live data back within the smaller target size, like a defrag.

Another feature/option might be to allow adding (and removing) a pool file to/from an existing poolset.
(Edit: dynamic adding is supported, based on the man page.)

Shrink might be better as its own FEAT, as I agree the ability to autogrow is a nice feature.

Is there also a way to get the "current utilisation" of a pool from a function call within libpmemobj while running?

@pbalcer (Member) commented Aug 1, 2019

We might implement pool shrinking after we implement #4187. Right now it's just not realistic.
And yes, you should be able to add parts to an existing poolset.

How would you define "current utilization"? % of occupied space?

@seghcder (Contributor) commented Aug 3, 2019

In general, yes. However, with fragmentation it can be misleading, so it should only be taken as an indicator.

E.g., if someone looks at their pool and it reports 10% free of a 100GB pool, they might expect 10GB of contiguous free space. Then a 5GB allocation fails because the largest contiguous space is only 4GB. Or they are confused about why an autogrow was triggered when there seemed to be sufficient space.
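That scenario can be sketched with a few lines of illustrative arithmetic (the extent sizes are invented for the example):

```shell
# Invented free-space layout: the pool reports 10GB free in total,
# but that space is split into four extents.
free_extents_gb="4 3 2 1"

total=0
largest=0
for e in $free_extents_gb; do
  total=$((total + e))
  if [ "$e" -gt "$largest" ]; then
    largest=$e
  fi
done

echo "total free: ${total}GB, largest contiguous: ${largest}GB"
# A 5GB allocation fails here even though 10GB is nominally free.
```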

Another stat might be "largest contiguous space"... though that might be hard or expensive to track?

Ideally we could get the info from pmempool info through the API, from within an app with the pool open. I understand pmempool info only works on offline pools at present. However, in future an app may want to provide an SNMP interface and/or send traps based on those stats.

Alternatively, could pmempool info (and even pmempool check without repair) be adapted to run in read-only mode against an open pool? That way monitoring tools (Nagios, SCOM, SolarWinds, etc.) could write an agent for any pool.

@pbalcer (Member) commented Aug 5, 2019

I've been very reluctant to introduce any generic interfaces around space utilization for the reasons you list. Right now there's an API to retrieve the total size of allocated objects, but that might not be very useful.

Tracking the largest free contiguous space: we could probably implement a rough estimate, but tracking it accurately would negatively impact scalability and overall performance.
One statistic I can reasonably and efficiently expose is the number of free/used/run chunks in the pool. This can be used to approximately deduce the pool's utilization. We could also track allocated/freed objects at the allocation-class level.
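To illustrate how such chunk counts might be turned into a rough utilization figure, here is a sketch with invented numbers (the counts and the half-full heuristic for run chunks are assumptions made for the example, not PMDK output):

```shell
# Invented chunk counts -- not real PMDK output.
used_chunks=300   # fully used chunks
free_chunks=100   # completely free chunks
run_chunks=50     # "run" chunks holding small allocations; assume half full

total_chunks=$((used_chunks + free_chunks + run_chunks))

# Integer-percent estimate: used chunks count fully, run chunks at 50%.
utilization=$(( (used_chunks * 100 + run_chunks * 50) / total_chunks ))

echo "approximate utilization: ${utilization}%"
```

As the comment above notes, run chunks are only partially occupied, which is exactly why any single utilization number is an estimate rather than a guarantee.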

@sscargal (Author) commented Aug 5, 2019

Within this feature enhancement, we should support the AUTO value for size within the poolset file when using FSDAX. Currently, directory support uses the size argument as the maximum size.

From poolset(5):

The size argument of a part in a directory poolset becomes the size of the address space reservation required for the pool. In other words, the size argument is the maximum theoretical size of the mapping. This value can be freely increased between instances of the application, but decreasing it below the real required space will result in an error when attempting to open the pool.

AUTO works for devdax only.

Pools created on Device DAX have additional options and restrictions:

The size may be set to “AUTO”, in which case the size of the device will be automatically resolved at pool creation time.

In other words, the following myautogrowpool.set configuration fails:

    PMEMPOOLSET
    OPTION SINGLEHDR
    AUTO /pmemfs0/

    # pmempool create --layout="mylayout" obj myautogrowpool.set
    error: 'myautogrowpool.set' -- directory based pools are not supported for poolsets with headers (without SINGLEHDR option)
    error: creating pool file failed

But this works:

    PMEMPOOLSET
    OPTION SINGLEHDR
    10GiB /pmemfs0/

    # pmempool create --layout="mylayout" obj myautogrowpool.set
    #
    # ls -lh /pmemfs0/*.pmem
    -rw-rw-r--. 1 root root 8.0M Aug  5 05:08 /pmemfs0/000000.pmem

So the proposed -M, --max-size <size> option should default to AUTO if no value is provided, in which case we determine the available capacity of the FSDAX file system at pool creation time. If the file system fills up, growing the pool will fail with ENOSPC.

@lplewa (Member) commented Aug 8, 2019

@pbalcer How is this feature supposed to work?

As far as I know, autogrow is only supported for poolset files. Should we automatically create a poolset in the given directory?

If yes, how will I open this pool? Should I use the directory path as my pool, or the poolset file created by pmempool (and how would a user find it)?

@pbalcer (Member) commented Aug 8, 2019

Not in the directory, but where the user specified.
I think it should look like this:

pmempool create [<options>] [<type>] [<bsize>] <file>

Available options:
       -a, --autogrow [directories ...]

       Create an auto growing poolset

where the file is the poolset, so:

./pmempool create obj --autogrow /mnt/pmem --size=1GiB --maxsize=10GiB poolset.file

@marcinslusarz (Contributor) commented

pmempool doesn't create poolset files by design. Auto-growing poolsets are no different from other types of poolsets, so I don't see why we should implement this particular feature but leave the other types out.

So the fundamental question is: what problem are we trying to solve here?
The number of characters you have to type to create an auto-growing poolset by hand would be similar to the number needed to use this new feature, so it can't be just that...

@pbalcer (Member) commented Aug 12, 2019

I think the problem is discoverability of this feature. Most users probably skip over the directories section of the poolset man page.
Maybe we should add a command to create a poolset?

@marcinslusarz marcinslusarz transferred this issue from pmem/issues Nov 7, 2019
@marcinslusarz marcinslusarz added the Type: Feature A feature request label Nov 7, 2019
@marcinslusarz marcinslusarz added the pmempool src/libpmempool and src/tools/pmempool label Mar 31, 2020
@pbalcer pbalcer added this to the 1.11 milestone Feb 2, 2021
@janekmi janekmi added the won't do The requested improvement is not planned to be done. label Aug 31, 2023
@janekmi (Contributor) commented Aug 31, 2023

This improvement is not considered vital at the moment, so we do not have the resources to fulfil your request. Sorry.

@janekmi janekmi closed this as completed Aug 31, 2023