Skip to content

Simplify rev opts #352

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Jan 7, 2025
Merged

Simplify rev opts #352

merged 16 commits into from
Jan 7, 2025

Conversation

leekillough
Copy link
Collaborator

@leekillough leekillough commented Dec 23, 2024

This change removes duplicate code in RevOpts.h and RevOpts.cc by templatizing the functions.

  • The mappings are simplified to be std::vector instead of std::unordered_map, since the indices are integers and we know numCores in advance
  • However, startSym must be a map rather than a vector of initialized elements, because of how it is constructed piecemeal
  • Therefore the code has been made general, with templates, if constexpr and generic lambdas, to handle any std::vector, std::unordered_map, or std::map.
  • The std::vector mapping members have default initializers which are based on the value of numCores, which is already initialized before them because it's declared before them. This initializes them the way they were initialized in a loop in the RevOpts constructor before.
  • numCores, numHarts and verbosity have been marked as const members, so that failure to initialize them in the constructor is an error, and because they never change after construction
  • A new GetProperty() template function returns the Core -> Val mappings. For std::vector it returns true iff Core < numCores, and for std::unordered_map it returns true iff a mapping entry for Core exists.
  • It also is able to accept std::tie() as an assignable rvalue-reference argument for its destination, in order to unpack a pair or tuple into a set of destinations. This is how GetMemCost() extracts both the Min and Max values from a vector of pairs, by simply passing std::tie( Min, Max ) to GetProperty().
  • A new InitPropertyMap() template function parses the options, setting one for each Core:Value pair. It will work for both vector and map containers, using templates, if constexpr, and lambdas to make the code compact .
  • A new InitPropertyMapCores() template function allows an optional CORES:value and broadcasts the value to all cores. If it does not detect CORES:value, it falls back on InitPropertyMap() which requires per-element values.
  • Instead of calling map.insert(std::pair<type,type>(i, value)) in a loop to initialize the mappings, default member initialization is used to initialize the vectors with numCores copies of a value.
  • Some mappings go from Core to an integer value, while others go from Core to a string. With if constexpr() both cases can be handled with an if which is resolved in the template at compile-time and whose false case is ignored so it cannot produce errors
  • Some mappings are handed with std::vector while others are handled with std::unordered_map. With if constexpr() both cases can be similarly handled. To prevent source code explosion, generic lambdas are used to handle multiple cases with small code.
  • The old code had an off-by-one error. It was only comparing Core > numCores, potentially allowing Core == numCores which is outside of the range of [0, numCores-1].
  • The old code used the unsafe map.find( Core )->second = value which could crash if the key was not already in the map. Because all values were initialized with defaults, it map.find( Core ) never returned map.end() which is not dereferenceable. I've replaced it with the simpler syntax map[ Core] = value which works for both std::vector and std::unordered_map and in the latter case, creates a new entry if one doesn't already exist.
  • Because map keys and values might have different types, I've used explicit casting based on the underlying decltypes when calling std::stoull to avoid potential warnings. This way the declaration of the map variables is the only place where the underlying types need to be declared.

struct is_vector : std::false_type {};

template<typename T>
struct is_vector<std::vector<T>> : std::true_type {};
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is standard traits programming, where a trait query class is defined as false in general, but then a partial specialization is defined for which it's true for a proper subset of the original template.

auto it = Map.find( Core );
return it != Map.end() ? Val = it->second, true : false;
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one of the if constexpr() branches will be taken, and the other will be discarded at compile-time, and will produce no errors.

So we can write one function which operates differently depending on whether MAP is a vector or not.

VAL&& Val is a universal reference, meaning VAL will be of lvalue reference type for lvalue arguments (arguments with a name whose address can be taken) and VAL will be of rvalue reference type for rvalue arguments (arguments with no name or address, which represent temporary expressions which are fleeting and whose resources can be moved instead of deep-copied). Inside GetProperty, Val is an lvalue even if it bound to an rvalue. std::tie() creates an rvalue tuple of references, and can be assigned to even though it is an rvalue, in which case it assigns to its lvalue reference elements.

@@ -59,22 +90,24 @@ class RevOpts {
bool InitPrefetchDepth( const std::vector<std::string>& Depths );

/// RevOpts: retrieve the start address for the target core
bool GetStartAddr( uint32_t Core, uint64_t& StartAddr );
bool GetStartAddr( uint32_t Core, uint64_t& StartAddr ) const { return GetProperty( Core, startAddr, StartAddr ); }
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the Get* functions are now implemented in terms of GetProperty().


/// RevOpts: retrieve the memory cost range for the target core
bool GetMemCost( uint32_t Core, uint32_t& Min, uint32_t& Max );
bool GetMemCost( uint32_t Core, uint32_t& Min, uint32_t& Max ) const {
return GetProperty( Core, memCosts, std::tie( Min, Max ) );
Copy link
Collaborator Author

@leekillough leekillough Dec 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::tie( Min, Max ) is passed to GetProperty( ), whose universal reference VAL will be deduced to be an rvalue reference. But it can still be assigned to, and the assignment to Val will extract elements of std::pair or std::tuple into Min and Max.

std::vector<std::string> table{decltype( table )( numCores, "_REV_INTERNAL_" )}; ///< RevOpts: inst table
std::vector<std::pair<uint32_t, uint32_t>> memCosts{decltype( memCosts )( numCores, {0, 10} )}; ///< RevOpts: memory cost range
std::vector<uint32_t> prefetchDepth{decltype( prefetchDepth )( numCores, 16 )}; ///< RevOpts: prefretch depth
// clang-format on
Copy link
Collaborator Author

@leekillough leekillough Dec 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the vectors which are initialized to default values are initialized at constructor time, after numCores, declared above them, is initialized. The decltype() syntax is syntactic sugar for not having to repeat the type in the far-left column a second time, which we need in order to make direct constructor calls to std::vector rather than initialization-list calls to std::vector, which have a different meaning.

std::vector<uint32_t> prefetchDepth{decltype( prefetchDepth )( numCores, 16 )}; ///< RevOpts: prefretch depth
// clang-format on

std::unordered_map<uint32_t, std::string> startSym; ///< RevOpts: starting symbol
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of how RevLoader works, startSym has to remain a std::unordered_map instead of a std::vector. Here it is default-initialized as being empty.

prefetchDepth.insert( std::pair<uint32_t, uint32_t>( i, 16 ) );
}
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of that loop has been moved to the constructor initialization syntax above, where std::vector(numProcs, value) initializes the vector to length numProcs copies of value.

src/RevOpts.cc Outdated
std::string s = Depths[i];
template<typename MAP>
bool RevOpts::InitPropertyMap( const std::vector<std::string>& Opts, MAP& map ) {
for( auto& s : Opts ) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Range-based-for is generally preferred to old C-style for loops when it works.

} else {
map[Core] = vstr[1];
}
};
Copy link
Collaborator Author

@leekillough leekillough Dec 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lambda represents an anonymous function. The [&] makes it capture any expressions it uses in its parent by reference. The auto val is an argument which is only used in decltype( val ) but which is a way of making this lambda a "generic lambda" which is like a template under-the-hood. The if constexpr will only execute one of its branches, for integral and non-integral vector/map elements.

parse( typename MAP::value_type{} );
} else {
parse( typename MAP::mapped_type{} );
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the parse lambda defined above is called one of two ways based on whether MAP is a vector type or not. If it's a vector, then its value_type member defines the vector element type, and we pass a default {} (or zero) value of it. If it's not a vector, we assume it's a map / unordered_map and get the value type from its mapped_type member and pass a similar value to the lambda.

if constexpr( is_vector<MAP>::value ) {
parse( typename MAP::value_type{} );
} else {
parse( typename MAP::mapped_type{} );
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we have the similar lambda / template / if constexpr trickery, generating the cross product of {map, vector} x { integral element type, non-integral element type} with as little code as possible.

return false;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InitMemCosts() is different from the others because it has {Min, Max} pairs. But it does not need to be as general as the others, so it is fairly short code. The decltype( memCosts[Core].first ) and decltype( memCosts[Core].second ) are used to cast stoull() to the target {Min, Max} types, to prevent warnings in case they are smaller than stoull().

return true;
}
}
return InitPropertyMap( Opts, map );
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is where InitPropertyMapCores() falls back on InitPropertyMap() if no CORES:* is found.

Copy link
Collaborator

@kpgriesser kpgriesser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the extremely helpful detailed code descriptions. I have no specific feedback on the code itself.

@leekillough leekillough merged commit 55f8d37 into devel Jan 7, 2025
@leekillough leekillough deleted the SimplifyRevOpts branch January 7, 2025 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants