
Clean up #4672

Closed · wants to merge 10 commits

Conversation

@Sentient07 (Contributor) commented Jun 27, 2016

The cleanup of PR #4570. Will rebase onto master once #4570 gets merged.
Tasks:

  • Correctly handle HostFromGpu nodes
  • Remove the special case for Alloc / AllocEmpty from the new optimizer, and register these lifters in an "out2in" topooptimizer
  • Replace op_lifter with out2in
  • Factor out caching of Op instances
  • Check if all the optimizers used by op_lifter are also used by op_lifter_topo
  • Implement backward pass.
  • Remove op_lifter from EquilibriumOptimizer

@lamblin Am I missing anything in the to-do list?

@@ -468,4 +469,4 @@ def use_gpu_cumsumop(op, ctx_name, inputs, outputs):
     if axis is None:
         axis = 0
     assert isinstance(x.type, GpuArrayType)
-    return GpuCumsum(axis)(x)
+    return GpuCumsum(axis, ctx_name)(x)
@Sentient07 (Contributor, Author)
I felt it is better to pass the context_name. Will change if necessary

(Member)

As documented here, the best practice is to infer the context during make_node, not to force the context when creating the Op.
Please remove the context_name attribute.
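To illustrate the practice the reviewer points to, here is a schematic sketch (not actual Theano code; `Variable`, `Apply`, and `GpuCumsumSketch` are simplified stand-ins) of an Op that keeps only `axis` in its props and infers the GPU context from its input inside `make_node`, instead of fixing a context at construction time:

```python
# Simplified stand-ins for Theano's graph classes (assumption: names
# are illustrative only, not the real API).
class Variable:
    def __init__(self, context_name):
        self.context_name = context_name  # context the data lives on

class Apply:
    def __init__(self, op, inputs, context_name):
        self.op = op
        self.inputs = inputs
        self.context_name = context_name

class GpuCumsumSketch:
    __props__ = ('axis',)  # context_name deliberately NOT part of the props

    def __init__(self, axis):
        self.axis = axis

    def make_node(self, x):
        # Infer the context from the input at graph-construction time,
        # rather than forcing it when creating the Op.
        return Apply(self, [x], context_name=x.context_name)

x = Variable(context_name='dev0')
node = GpuCumsumSketch(axis=0).make_node(x)
print(node.context_name)  # 'dev0': inferred from x, not stored on the Op
```

Keeping the context out of `__props__` also means two `GpuCumsumSketch(0)` instances compare equal regardless of which device their inputs end up on.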

@Sentient07 (Contributor, Author)

Yeah, I'll remove that commit.

@Sentient07 (Contributor, Author)

At line 467 of theano/gpuarray/type.py, if I call transfer instead of host_from_gpu, it gives:
RuntimeError: maximum recursion depth exceeded

@Sentient07 (Contributor, Author)

In this PR, I've removed ShapeOptimizer from fast_compile. Some optimizations that are part of fast_compile need ShapeOptimizer, so I'm removing those optimizations from fast_compile as well. I have yet to open the PR that removes ShapeOptimizer. @lamblin, I believe that PR should be merged before this one?

gpu_seqopt.register('gpuarray_local_optimiziations', gpu_optimizer, 1,
                    'fast_compile', 'fast_run', 'gpuarray')
gpu_seqopt.register('gpuarray_cut_transfers', gpu_cut_copies, 2,
                    'fast_compile', 'fast_run', 'gpuarray')

gpu_seqopt.register('op_lifter_topo',
                    TopoOptimizer(gpu_topo.query('+fast_compile'),
                                  order='out_to_in'),
                    10, 'fast_run', 'gpuarray', 'fast_compile')
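The registration calls above pair a position with a set of tags, and `query('+fast_compile')` selects registered optimizers by tag. As a rough illustration of those semantics, here is a hypothetical miniature registry (an assumption for illustration; `MiniSeqDB` is not Theano's actual optimizer-DB implementation, which supports richer query expressions):

```python
# Hypothetical miniature of Theano's tag-and-priority registration scheme.
class MiniSeqDB:
    def __init__(self):
        self._entries = []  # (position, name, tags, optimizer)

    def register(self, name, opt, position, *tags):
        self._entries.append((position, name, set(tags), opt))

    def query(self, tag_expr):
        # This sketch supports only the '+tag' (include) form.
        tag = tag_expr.lstrip('+')
        hits = [e for e in self._entries if tag in e[2]]
        # Lower position runs earlier, mirroring the registrations above.
        return [name for _, name, _, _ in sorted(hits, key=lambda e: e[0])]

db = MiniSeqDB()
db.register('gpuarray_local_optimiziations', object(), 1,
            'fast_compile', 'fast_run', 'gpuarray')
db.register('gpuarray_cut_transfers', object(), 2,
            'fast_compile', 'fast_run', 'gpuarray')
db.register('op_lifter_topo', object(), 10,
            'fast_run', 'gpuarray', 'fast_compile')

print(db.query('+fast_compile'))
# ['gpuarray_local_optimiziations', 'gpuarray_cut_transfers', 'op_lifter_topo']
```

Under this model, the reviewer's request below amounts to moving `op_lifter_topo` from position 10 to somewhere in [-0.5, 1], so it runs before (or alongside) the first pass rather than last.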
(Member)

Use something between -0.5 and 1 instead of 10 here.

@lamblin (Member) commented Jun 30, 2016

It seems to make sense for a first draft of the backward pass (using TopoOptimizer and LocalGroupDB).
When you change the priorities so it is executed before the EquilibriumOptimizer, we can see in the profiling if there is something to gain, and if we need to optimize the implementation further.
For instance, maybe we would need to implement tracks in LocalGroupDB, like in EquilibriumDB.

@Sentient07 (Contributor, Author)

> For instance, maybe we would need to implement tracks in LocalGroupDB, like in EquilibriumDB.

Could you please elaborate on this?

@lamblin (Member) commented Jun 30, 2016

EquilibriumDB has "tracks": the Ops to match when trying to apply its local optimizers. Some local optimizers will only be tried on nodes corresponding to certain Ops; the tracks come from the Ops passed to the local_optimizer decorator.
For instance, if a local opt is registered with @local_optimizer([GpuFromHost, GpuToGpu, HostFromGpu]), then the EquilibriumOptimizer will only try to apply it on nodes where node.owner.op is an instance of those types.
I'm not sure if LocalGroupDB uses this information. If it does not, then we may need to implement that logic, like in EquilibriumOptimizer.
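The dispatch idea behind tracks can be sketched in a few lines (a simplified illustration, not Theano's implementation; the op classes, `local_optimizer`, and `try_optimizers` here are stand-ins):

```python
# Stand-in op classes (assumption: illustrative only).
class GpuFromHost: pass
class HostFromGpu: pass
class AddOp: pass

def local_optimizer(tracks):
    """Decorator that records which Op classes an optimizer applies to."""
    def wrap(fn):
        fn.tracks = tuple(tracks)
        return fn
    return wrap

@local_optimizer([GpuFromHost, HostFromGpu])
def lift_transfer(node_op):
    return 'lifted'

def try_optimizers(node_op, optimizers):
    # The driver consults the tracks and skips optimizers whose
    # tracked Op classes don't match this node's op.
    applied = []
    for opt in optimizers:
        if isinstance(node_op, opt.tracks):
            applied.append(opt(node_op))
    return applied

print(try_optimizers(GpuFromHost(), [lift_transfer]))  # ['lifted']
print(try_optimizers(AddOp(), [lift_transfer]))        # []
```

The profiling win lamblin mentions comes from the second call: optimizers are never even invoked on nodes they cannot rewrite.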

@Sentient07 (Contributor, Author)

From what I see, LocalGroupDB does not use that info. LocalOptimizer uses tracks, but I don't think LocalGroupDB does. I'll confirm once more.

@Sentient07 (Contributor, Author)

@nouiz The new optimizer op_lifter_topo doesn't show all the local optimizers that gpu_optimizer does. Am I doing the query wrong?

@theano-bot (Contributor)

Can one of the admins verify this patch?

@Sentient07 (Contributor, Author)

We're registering theano.tensor.opt.local_remove_all_assert into gpu_optimizer separately. Don't we need to do the same for gpu_optimizer2?

@@ -22,8 +22,9 @@ class GpuCumsum(GpuKernelBase, Op):
     SUPPORTED_NDIMS = 3
     __props__ = ('axis',)

-    def __init__(self, axis):
(Member)

Why this change? Just remove this commit, or explain why it is needed.

@Sentient07 (Contributor, Author)

Oh! This should have been removed in the rebase. It existed before, but I changed this behaviour in #4570 itself.

@Sentient07 (Contributor, Author)

Yeah, this diff doesn't exist anymore; it has been removed. Earlier, I thought it was better to pass context_name.

@Sentient07 (Contributor, Author)

There are a few redundant commits: 128f257 and 3855671 (done in #4570 but under a different name), and 5bbbb31 was done in a different PR. Shall I squash them?

@nouiz nouiz added this to the 0.9 milestone Dec 1, 2016
@lamblin (Member) commented Feb 14, 2017

We will need to do the automatic cache (in Op's metaclass) first, and then go on with the cleanup.
We will not be able to do that for 0.9, so bumping to 0.10.
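One way such an automatic cache in Op's metaclass could look (a sketch under assumptions, not Theano's actual code; the class names are illustrative) is a metaclass that memoizes instances by their constructor arguments, so constructing the same Op twice yields one shared object:

```python
class CachedOpMeta(type):
    """Metaclass sketch: memoize Op instances by constructor args."""

    def __call__(cls, *args):
        # Use cls.__dict__ so each class gets its own cache rather
        # than inheriting a parent's.
        cache = cls.__dict__.get('_instance_cache')
        if cache is None:
            cache = {}
            cls._instance_cache = cache
        if args not in cache:
            cache[args] = super().__call__(*args)
        return cache[args]

class GpuCumsum(metaclass=CachedOpMeta):
    __props__ = ('axis',)

    def __init__(self, axis):
        self.axis = axis

print(GpuCumsum(0) is GpuCumsum(0))  # True: second call returns the cached instance
print(GpuCumsum(0) is GpuCumsum(1))  # False: different props, different instance
```

Caching like this makes Op identity checks cheap and keeps equal-props Ops merged by construction, which is what would let the cleanup drop hand-written Op-instance caches.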

@lamblin lamblin modified the milestones: 0.10, 0.9 Feb 14, 2017
@ReyhaneAskari ReyhaneAskari mentioned this pull request Feb 20, 2017
@Sentient07 (Contributor, Author)

Closing as continued in #5579

@Sentient07 Sentient07 closed this Feb 21, 2017