Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Moved the bulk of the implementation of configure_pipeline_features() to #4462

Merged
merged 6 commits into from Jan 22, 2020

Conversation

benoitsteiner
Copy link
Contributor

the DefaultCostModel class. This allows custom cost models more
flexibility in modeling pipeline features.

@steven-johnson
Copy link
Contributor

Lots of build failures, apparently

@benoitsteiner
Copy link
Contributor Author

It looks like the command I used to validate the PR (make test) didn't trigger a full build. I'm at a conference this week, so I'm not sure I'll be able to make much progress towards fixing the errors, but I'll have time to get back to this PR next week.

@@ -36,7 +36,7 @@ set_target_properties(retrain_cost_model_process
target_include_directories(retrain_cost_model_process
PRIVATE ${CMAKE_CURRENT_SOURCE_DIR}/../support)
target_link_libraries(retrain_cost_model_process
PRIVATE cost_model train_cost_model)
PRIVATE cost_model train_cost_model Halide)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not need to link in libHalide -- in general, you should only need libHalide if you are using the JIT (or you are a Generator). Why did you need to add this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without the libHalide dependency, symbols corresponding to multiple methods of the Halide::Internal::Introspection and Halide::Internal::ErrorReport::ErrorReport classes are missing. I picked up the dependency through the #include "Error.h" in FunctionDAG.h. I'm guessing that the introspection dependency came through the added dependency on the Error class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch. We need to not do that. It is a bad code smell to need to link in libHalide just for error reporting.

apps/support/autoscheduler.inc Show resolved Hide resolved
@steven-johnson
Copy link
Contributor

Just to be clear: we don't want to land this as-is, as libHalide should not be necessary to use in this way. We should instead fix retrain_cost_model to avoid the need forit.

@benoitsteiner
Copy link
Contributor Author

It looks like this requires a fair amount of work to create a standalone library with basic utilities such as logging and error handling that can then be linked into various apps without having to pull in libHalide. The only other solution I can think of is to forego the Halide error handling code in the autoscheduler code and rewrite the code to either assert or throw exceptions whenever something bad happens.

@steven-johnson
Copy link
Contributor

Linking in all of libHalide just to get the logging & error-handling code is definitely a code smell. I would be very reluctant to approve it.

@abadams
Copy link
Member

abadams commented Dec 27, 2019

Most of this code already uses aslog for logging and regular old assert for error checking. Let's just keep doing that. I think there's a new internal_assert in this PR that should just be assert

@benoitsteiner
Copy link
Contributor Author

Removing all the #include "Errors.h" still result in missing symbols erros related to Halide::Internal::Introspection::get_source_locationabi:cxx11. It looks like this gets included from HalideBuffer.h.
Most of the autoscheduler code depends one way or another on Halide internals, so even if I manage to remove the dependency of retrain_cost_model on libHalide.so the problem will reappear sooner or later.

@abadams
Copy link
Member

abadams commented Jan 6, 2020

HalideBuffer.h definitely doesn't depend on libHalide - otherwise we'd be unable to ship Halide code without shipping the whole compiler.

Elsewhere in the project we have a hard line between things that depend on libHalide, and things that depend on Halide-AOT-compiled code. retrain_cost_model is supposed to be in that latter category. The autoscheduler itself is awkwardly in the middle.

@steven-johnson
Copy link
Contributor

retrain_cost_model is used by the autoscheduling script, but it is a standalone tool, not part of the autoscheduler library itself. It should not depend on libHalide. If it needs to, we're doing something very wrong.

@benoitsteiner
Copy link
Contributor Author

I just managed to trace the root cause of the problem: FunctionDAG.h includes Halide.h, which contains the following code:

#ifndef COMPILING_HALIDE
namespace Halide {
namespace Internal {
static bool check_introspection(const void *var, const std::string &type,
                                const std::string &correct_name,
                                const std::string &correct_file, int line) {
    std::string correct_loc = correct_file + ":" + std::to_string(line);
    std::string loc = Introspection::get_source_location();
    std::string name = Introspection::get_variable_name(var, type);
    return name == correct_name && loc == correct_loc;
}
}  // namespace Internal
}  // namespace Halide
#endif 

Both Introspection::get_source_location() and Introspection::get_variable_name() are declared in Halide.h but not defined, which is why we end up with these 2 missing symbols unless we link Halide.
I'm not sure that the proper way to fix this is. Adding #define COMPILING_HALIDE in FunctionDAG.h before including Halide.h would solve this problem, but it looks like an ugly hack.

@abadams
Copy link
Member

abadams commented Jan 9, 2020

FunctionDAG.h definitely depends on libHalide. It uses Exprs. retrain_cost_model shouldn't need FunctionDAG.h

@benoitsteiner
Copy link
Contributor Author

I updated CostModel::set_pipeline_features to take a FunctionDAG in order to enable us extract different pipeline features from a function dag without having to fork and update the code in several places. As a result the default cost model and ultimately retrain_cost_model also depend on FunctionDAG,

@steven-johnson
Copy link
Contributor

Unfortunately it sounds like some refactoring is really necessary.

@abadams
Copy link
Member

abadams commented Jan 10, 2020

OK, I think I get it now. This PR is a refactoring that gives cost models more flexibility in how they extract pipeline features, but in doing so it changes the cost model interface - instead of grabbing a predefined common set of pipeline features in a specific order as a Runtime::Buffer, it consumes a whole FunctionDAG. This adds all of libHalide as a dependency to anything that wants to deal with a cost model, because a FunctionDAG uses things like Exprs.

Hrm. Having cost model subclasses accept a FunctionDAG and do their own featurization is a reasonable bit of flexibility to add. I guess we either eat the new dependency, or we refactor FunctionDAG to separate the libHalide-using parts from the other parts, which seems unnecessarily difficult compared to just add a -lHalide. So Steven: I think there's a good reason for the new dependency.

@steven-johnson
Copy link
Contributor

Mixing AOT and JIT code in the same runtime is treading on interesting ground. Before we go down this path, can we get a little more background on why this is a must-have feature?

@benoitsteiner
Copy link
Contributor Author

The CostModel class allows us to experiment with different ways to predict costs given a set of pipeline features. However, with the current code we have to modify AutoScheduler.cc every time we want to experiment with different pipeline features. This makes the whole process much more difficult than it needs to be, and has forced us to fork the autoscheduler code. With the PR, we gain the flexibility we need to experiment with different pipeline features.

@steven-johnson
Copy link
Contributor

Thanks for the details. I'll re-review today with that in mind.

Copy link
Contributor

@steven-johnson steven-johnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

various formatting nits that are best solved by applying clang-format

}
internal_assert(stage == num_stages);
pipeline_feat_queue = pipeline_features;
assert(params.parallelism > 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We almost certainly shouldn't be mixing assert() and internal_assert() in the same file; IMHO, if internal_assert is available, we should prefer it, as it has consistent meaning (doesn't rely on build settings) and has much richer failure messages.

assert(pipeline_feat_queue.data() && "Call set_schedule_features before calling enqueue\n");
const int max_num_stages = pipeline_feat_queue.dim(2).extent();
if (num_stages > max_num_stages) {
std::cerr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment; if we are committed to making libHalide available here, we should use internal_error or internal_assert rather than std::cerr and/or abort.

@steven-johnson
Copy link
Contributor

(Apologies for the very-belated-review!)

This PR is a refactoring that gives cost models more flexibility in how they extract pipeline features, but in doing so it changes the cost model interface - instead of grabbing a predefined common set of pipeline features in a specific order as a Runtime::Buffer, it consumes a whole FunctionDAG. This adds all of libHalide as a dependency to anything that wants to deal with a cost model, because a FunctionDAG uses things like Exprs.

Yeah, I see that now. That said: IIUC, DefaultCostModel::set_pipeline_features() doesn't actually use any of the libHalide-needing-parts of FunctionDAG; AFAICT, it walks over the nodes list to count the input stages, then extracts parts of the PipelineFeatures.

A refactoring to avoid bringing in libHalide would be much preferable and not too difficult (e.g., make an abstract class to allow accessing the PipelineFeatures, make that the input to set_pipeline_features() instead of FunctionDAG, and have FunctionDAG provide an implementation of that class (via wrapper, inheritance, etc).

Am I missing something here? Based on what I'm seeing, this still seems like an unnecessary (and undesirable) dependence.

steven-johnson added a commit that referenced this pull request Jan 17, 2020
This is my attempt at an alternate version of #4462 that avoids the need to add a libHalide dependency to retrain_cost_model; this achieves the same immediate goal, though I'm not sure if it has the same long-term goal. Please take a look.
@steven-johnson
Copy link
Contributor

I've taken the liberty of prepping an alternate take on this code (#4536) that I think achieves the same goal without adding the libHalide dependency. Please take a look.

@abadams
Copy link
Member

abadams commented Jan 17, 2020

Discussed with Steven offline why a CostModel subclass might need all of libHalide in a way that it can't encode in the PipelineFeatures. The current state of this PR LGTM assuming it passes test. I guess we need it to clone it into the main repo to trigger full testing.

@benoitsteiner benoitsteiner merged commit 646779e into halide:master Jan 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants