-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multi-step conversion + documentation changes + bug fixes + small code changes #206
Conversation
e7655a1
to
f389917
Compare
acb9635
to
18c8622
Compare
… it from code docs.
18c8622
to
821a3f5
Compare
``` | ||
Every preprocessing algorithm in the library (reordering, partitioning, feature extraction, etc.) can be implemented for many format types. In this case, `RCMReorder`is only implemented for the `CSR` type. However, when we set the `convert_input` parameter (last parameter) to `true` in the call, this will allow function matching to take place. | ||
|
||
Function matching takes the input formats to a preprocessing and the contexts the user passes to the function call, and, if the input format can't be used directly for the preprocessing (i.e., no function exists for the input's format type), it attempts to convert the input (using the passed contexts) to a format for which an implementation exists in the preprocessing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe specifically mention what conversion via the context means. So in this case the CSR actually gets copied to the CPU.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
ConversionCondition; | ||
|
||
//! A single conversion step composed of a conversion function and a context to use for the function | ||
typedef std::tuple<ConversionFunction, context::Context *> ConversionStep; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason this is not an std::pair
. Is this where we would add weights?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there isn't really a reason. It can be a pair. I opted to associate the cost with an entire chain rather than a step, but now that I think about it it's more general to associate steps with costs. I'll add that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That cost is currently not being used, but in the future, it might become useful.
* \return a vector of format with the first being the original format, the last being the target format, | ||
* and the rest being intermediate formats. If a conversion is empty or false, only returns the original format. | ||
*/ | ||
static std::vector<format::Format*> ApplyConversionChain(const ConversionChain& chain, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if move conversion is enabled, only the last format of this vector is valid. But if it's not, doesn't it contain duplicates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does. Technically, all the formats in the chain will be different representations of the same logical Format object. However, if users don't want these formats, they will just get deleted at the next step in the execution (Execute
).
I see now that users might want to save memory by deleting a format as soon as it is not needed. Maybe we can replace the move options with such an option? I mean, if someone wants to move convert, they probably don't want the intermediate formats, right? What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that was my thought process as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Let me add that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so here's what I did:
- For
ApplyConversionChain
, I replacedis_move_conversion
withclear_intermediate
.is_move_conversion
wasn't being used to begin with since at that point, the conversion functions were already decided byGetConversionChain
.clear_intermediate
would delete intermediate formats after each step, keeping only the start format and destination format.
- Replaced
is_move_conversion
withclear_intermediate
inApplyConversionSchema
for the same reason as above. - In
FunctionMatcher
, added aclear_conversion
parameter toCachedExecute
. Here is what happens when it's true and when it's false:clear_intermediate == true
: if a chain size > 1, all the formats between the first and the last are deleted. It will only return the destination format. (In contrast, Execute doesn't return any formats)clear_intermediate == false
: will return all the destination format and all the intermediate formats.
- In
FunctionMatcher
inExecute
, the call toCachedExecute
will always clear intermediate formats. Then insideExecute
, the destination formats are also deleted.
Now, I want to retract my statement that "if someone wants to move convert they probably don't want the intermediate formats." That can actually be the case. Imagine move converting CSC->CSR. This will do CSC -> COO -> CSR. However, if move conversions are used, we will end up with three objects but only 6 arrays:
row_ptr
(CSR and COO share this)
col_ptr
(CSC)
col
(CSR and COO share this)
vals
(COO and CSR share this)
row
(CSC)
vals
(CSR and COO share this)
vals_csc
(CSC)
(CSC vals and the other two vals are different because of how edges are sorted)
Which is what the "view" idea would've required. So I think clearing intermediate and moving should be separate ideas.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok looks good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good. If you fix the small nitpicky things I requested and answer my questions I can approve.
# Conflicts: # docs/pages/getting_started/usage.md
…eps, and fixed some docs.
Added multi-step conversion to the library. More specific changes are:
Converter
no longer finds a conversion function, but a conversion chain of function-context pairs.Convert
that returns the intermediate formats when a multi-step conversion occurs.Convert
functions now can take a vector of contexts rather than a single destination context. Note: a version that takes a single context still exists for simplicity.FunctionMatcherMixin
to work with multi-step conversion.Additional changes:
add_opt_library
in header only.AutoX
to indicate users shouldn't touch them.typedefs
in Converter to make code easier to read.utils.h