-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HPCC-27204 Support planes with multiple devices (striping) #15782
HPCC-27204 Support planes with multiple devices (striping) #15782
Conversation
@ghalliday - please review, leaving a draft for the moment, because I need to do some more testing. |
https://track.hpccsystems.com/browse/HPCC-27204 |
c2705da
to
ff6eabc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakesmith a few comments from an initial scan. Main comment is that a logical file/file descriptor should have a baseDevice field - generally set to a hash of the logical filename - which is added to the part number before modulus. (Then the hash code is localised to creating the file, rather than the readers.)
dali/base/dadfs.cpp
Outdated
unsigned numStripedDevices = queryPartDiskMapping(cn).numStripedDevices; | ||
unsigned stripeNum = 0; | ||
if (numStripedDevices>1) | ||
stripeNum = (i%numStripedDevices)+1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more: also hash of logical filename/base hash stored in the file descriptor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I deliberately left this functionality off in this draft so far (but there is a comment somewhere), so prove the rest worked.
mspec.flags &= ~CPDMSF_striped; | ||
#else | ||
// Bare-metal can have multiple devices per plane (e.g. data + mirror), but it doesn't stripe across them | ||
mspec.numStripedDevices = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is possibly a flaw. May want to revisit bare metal at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, it would break bare metal to consider what it thinks of multidevice planes now as striped, but we could have a switch where new plane types were flagged as striped.
As you say, for future.
dali/base/dafdesc.hpp
Outdated
@@ -326,7 +328,8 @@ extern da_decl StringBuffer &makePhysicalPartName( | |||
unsigned replicateLevel, // uses replication directory | |||
DFD_OS os, // os must be specified if no dir specified | |||
const char *diroverride, // override default directory | |||
bool dirPerPart); // generate a subdirectory per part | |||
bool dirPerPart, // generate a subdirectory per part | |||
unsigned stripeNum); // strip number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trivial: indent of comment and "strip"->"stripe"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will fix in next commit
189ab8d
to
6cc22dc
Compare
@jakesmith The error with the failed spray_dir_test.ecl is:
|
6cc22dc
to
da24759
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jakesmith looks good. a few relatively minor comments/questions
@@ -546,6 +546,7 @@ class IndexWriteSlaveActivity : public ProcessSlaveActivity, public ILookAheadSt | |||
getPartFilename(*tlkDesc, l, path, true); | |||
if (0 == l) | |||
{ | |||
ensureDirectoryForFile(path.str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
efficiency: I will open a separate PR, but in most cases it would be better to try and copy a file and only try and create the directory if that fails. Not an issue for local file systems, but more of an issue for remote.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, revisit this and other similar places in that new JIRA then I think.
common/thorhelper/roxiehelper.cpp
Outdated
makePhysicalPartName(logicalName.get(), 0, 0, dir, 0, DFD_OSdefault, prefix, false, false); | ||
|
||
StringBuffer fullPath; | ||
makePhysicalPartName(logicalName.get(), 1, 1, fullPath, 0, DFD_OSdefault, prefix, false, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is passing 0 for stripe correct?
@@ -2399,7 +2400,10 @@ void ClusterWriteHandler::getPhysicalName(StringBuffer & name, const char * clus | |||
{ | |||
Owned<IStoragePlane> plane = getDataStoragePlane(cluster, false); | |||
const char * prefix = plane ? plane->queryPrefix() : nullptr; | |||
makePhysicalPartName(logicalName.get(), 1, 1, name, 0, DFD_OSdefault, prefix, false); | |||
unsigned stripeNum = 0; | |||
if (plane) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if plane && numDevices > 1? or calcStripeNumber should return 0 if numPlanes = 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I'll add a check to calcStrripeNumber and simplify the calls
dali/base/dafdesc.cpp
Outdated
lfnHash = attr->getPropInt("@lfnHash"); | ||
else if (tracename.length()) | ||
{ | ||
lfnHash = hashc((const unsigned char *)tracename.str(), tracename.length(), 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: More encapsulated if the function to calculate the hash was in a separately named function - so the actual hash function is isolated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will change.
dali/base/dafdesc.cpp
Outdated
@@ -2336,7 +2428,7 @@ IFileDescriptor *createFileDescriptor(const char *lname,IGroup *grp,IPropertyTre | |||
width = grp->ordinality(); | |||
StringBuffer s; | |||
for (unsigned i=0;i<width;i++) { | |||
makePhysicalPartName(lname, i+1, width, s.clear(), 0, os, nullptr, false); | |||
makePhysicalPartName(lname, i+1, width, s.clear(), 0, os, nullptr, false, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 0 for the stripe correct? Would be worth having a comment why. (added later?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is, but this version of createFileDescriptor is only called in one context in dfurun.cpp to do with keydiff.
However, a more sensible/normalized version should be called which would create an IFileDescriptor based on plane/striping.
I'll change and remove this defunct version.
dali/base/dautils.cpp
Outdated
#endif | ||
StringBuffer descPath; | ||
makePhysicalPartName(lfn.get(), 0, 0, descPath, 0, DFD_OSdefault, dir, false, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last parameter should be a stripeNumber rather than a boolean. Is it worth having a special function for this since similar code occurs elsewhere (I assume it gets the base directory).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I'll introduce a helper for clarify, but call common code.
extern da_decl void addStripeDirectory(StringBuffer &out, const char *directory, const char *planeName, unsigned partNum, unsigned lfnHash, unsigned numStripes); | ||
inline unsigned calcStripeNumber(unsigned partNum, unsigned lfnHash, unsigned numStripes) | ||
{ | ||
return ((partNum+lfnHash)%numStripes)+1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: return 0 if numStripes <= 1? (to simplify calling code)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, will change to that.
51baa32
to
a7f944f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ghalliday - please review latest commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last changes look good and cleaned up the code a bit.
@richardkchapman - want me to squash? |
@jakesmith Yes please |
Signed-off-by: Jake Smith <jake.smith@lexisnexisrisk.com>
a7f944f
to
743e752
Compare
@richardkchapman - squashed |
Automated Smoketest: ✅ Unit tests result:
Regression test result:
HPCC Stop: OK
|
Signed-off-by: Jake Smith jake.smith@lexisnexisrisk.com
Type of change:
Checklist:
Smoketest:
Testing: