Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up circuit building by not forgetting about cached objects #5280

Merged
merged 15 commits into from
May 9, 2022

Conversation

dabacon
Copy link
Collaborator

@dabacon dabacon commented Apr 21, 2022

This speeds up a 10^2 qubit by 10^3 deep creation of a circuit made up entirely of cirq.X gates, created by just appending these gates onto the circuit, from 25s to 5s.

The issue is that Moment's are immutable, so they need to be copied when adding in new operations. Before this PR we don't copy two cached objects, the measurement key objects, and control keys during this. This copies these caches over and update them.

Because using insert on circuit for earliest insertion strategy has to look up measurement keys or control keys (in order to not move an object with such a key before a moment that has such a key), moments during creating are always being asked what their measurement and control keys are.

@dabacon dabacon requested review from a team, vtomole and cduck as code owners April 21, 2022 21:08
@dabacon dabacon requested a review from viathor April 21, 2022 21:08
@CirqBot CirqBot added the size: M 50< lines changed <250 label Apr 21, 2022
@dabacon
Copy link
Collaborator Author

dabacon commented Apr 21, 2022

The question this does beg is whether we should be caching these objects at all on the moment, or whether we should just be storing them and updating them appropriately. I don't have a strong opinion about this, it is true that these are for gates that don't often show up (measurement, classically controlled), but on the other hand it is likely that only people who construct circuits moment by moment benefit from this caching.

@@ -1880,10 +1880,11 @@ def earliest_available_moment(
while k > 0:
k -= 1
moment = self._moments[k]
if moment.operates_on(op_qubits):
return last_available
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this break instead, since we already have return last_available after the loop? Same this for the following if-statement.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionally the same, but I think returning reads easier since I don't have to go down and see what the rest of the function does as I'm reading it.

Comment on lines 195 to 196
measurement_key_objs = set(self._measurement_key_objs or {})
control_keys = set(self._control_keys or {})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
measurement_key_objs = set(self._measurement_key_objs or {})
control_keys = set(self._control_keys or {})
measurement_key_objs = set(self._measurement_key_objs_())
control_keys = set(self._control_keys_())

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines 172 to 175
m._measurement_key_objs = set(self._measurement_key_objs or {})
m._control_keys = set(self._control_keys or {})
m._measurement_key_objs |= protocols.measurement_key_objs(operation)
m._control_keys |= protocols.control_keys(operation)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These will be incorrect if self._measurement_key_objs or self._control_keys have not yet been computed on this moment. Instead, should access these through the methods that will populate them if needed:

Suggested change
m._measurement_key_objs = set(self._measurement_key_objs or {})
m._control_keys = set(self._control_keys or {})
m._measurement_key_objs |= protocols.measurement_key_objs(operation)
m._control_keys |= protocols.control_keys(operation)
m._measurement_key_objs = set(self._measurement_key_objs_())
m._measurement_key_objs |= protocols.measurement_key_objs(operation)
m._control_keys = set(self._control_keys_())
m._control_keys |= protocols.control_keys(operation)

Also, it strikes me that maybe this should just delegate to self.with_operations(operation) to avoid duplicating most of the code. I suspect the cost of the loop over one operation would be negligible.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh doh, I went through like 5 iterations of this thing trying to see if I could get more speed and ended up with the bad one

The problem with delegating is that self.with_operations creates a new Moment while copying over all the information each time, so you are constantly creating and destroying a bunch of sets which I think would be bad.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Contributor

@maffoo maffoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think the mix of set and frozenset is confusing, but it looks like this is an existing issue in the Moment implementation that we could clean up separately.

@@ -168,6 +168,11 @@ def with_operation(self, operation: 'cirq.Operation') -> 'cirq.Moment':
m._operations = self._operations + (operation,)
m._qubits = frozenset(self._qubits.union(set(operation.qubits)))
m._qubit_to_op = self._qubit_to_op.copy()

m._measurement_key_objs = set(self._measurement_key_objs_())
m._control_keys = frozenset(self._control_keys_())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: A bit odd to mix use of set and frozenset here. Using set might be more efficient since it can be modified below instead of creating a new set. If you want frozenset for m to ensure these are not modified, could combine things together and freeze once:

m._control_keys = frozenset(itertools.chain(self._control_keys_(), protocols.control_keys(operation)))

Comment on lines 210 to 211
m._measurement_key_objs = measurement_key_objs
m._control_keys = frozenset(control_keys)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, too, it's a bit odd to mix set and frozenset. It looks like this might be coming from the fact that _control_keys is defined as returning FrozenSet, while _measurement_key_objs_ is defined as returning AbstractSet. This seems like an oversight because the protocol methods for control_keys and measurement_key_objs are both defined as returning AbstractSet.

Copy link
Collaborator Author

@dabacon dabacon Apr 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is broken, but I think it needs more discussion in a separate follow up. The problem is that the protocol return value (for both protocols) is AbstractSet, which means that you could return a set and mutate it and expect it to mutate the set on the object. I'm not sure that was the right choice. But even if it was, then Moment should definitely use FrozenSet since we don't want to make it mutable via these protocols. But then restricting the return type from AbstractSet to FrozenSet is technically a "breaking" change.

I've gone and used itertools.chain to hopefully only read once where possible across these methods. Opened up #5298 for the frozen set issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of AbstractSet is it's meant to indicate that the return value should not be mutated (AbstractSet itself does not include mutation methods), even though we might return a mutable subclass like Set. This relies on callers respecting the annotated type, even if the runtime type is mutable, which is not great but sort of par for the course in python (is anything ever really immutable? :-) ). That said, if we annotate the type as AbstractSet we can still return FrozenSet if we're paranoid.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah typing.Set isthe type for the stdlib Set. I'd forgot how butchered that namespace is.

Then I think both should be frozensets so people don't shoot themselves in the foot mutating these things.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I updated to return frozenset for measurement_key_obj and control_keys. Also moved to a pattern that utilized union on frozenset (just straight using itertools.chain isn't as good if the initial sets have values).

Also confirmed these still give speedups.

@daxfohl
Copy link
Contributor

daxfohl commented Apr 28, 2022

I think dedicated CircuitBuilder / MomentBuilder classes that are mutable and optimized for this would be best. All the temp object allocations and frozen set chaining is probably causing another order of magnitude perf overhead.

@dabacon
Copy link
Collaborator Author

dabacon commented Apr 28, 2022

I think dedicated CircuitBuilder / MomentBuilder classes that are mutable and optimized for this would be best. All the temp object allocations and frozen set chaining is probably causing another order of magnitude perf overhead.

Yeah, definitely if you look at the profile there is still a ton of time spent on these sets.

@quantumlib quantumlib deleted a comment from review-notebook-app bot Apr 28, 2022
@daxfohl
Copy link
Contributor

daxfohl commented Apr 29, 2022

Just as a POC, I changed Moment.init to

    def __init__(self, *contents: 'cirq.OP_TREE') -> None:
        self._operations = []
        self._qubit_to_op: Dict['cirq.Qid', 'cirq.Operation'] = {}
        self._qubits: Set['cirq.Qid'] = set()
        self._measurement_key_objs: Set['cirq.MeasurementKey'] = set()
        self._control_keys: Set['cirq.MeasurementKey'] = set()

        for op in op_tree.flatten_to_ops(contents):
            self.with_operation(op)

and with_operation to

    def with_operation(self, operation: 'cirq.Operation') -> 'cirq.Moment':
        self._operations.append(operation)
        self._qubits.update(operation.qubits)
        for q in operation.qubits:
            self._qubit_to_op[q] = operation
        keys = protocols.measurement_key_objs(operation)
        if keys is not None:
            self._measurement_key_objs.update(keys)
        keys = protocols.control_keys(operation)
        self._control_keys.update(keys)
        return self

(and also operates_on to return not self.qubits.isdisjoint(qubits))

And I am getting 720 ms

def test_me():
    qs = 100
    ms = 1000
    xs = [cirq.X(cirq.LineQubit(i)) for i in range(qs)]
    all = [xs[i] for _ in range(ms) for i in range(qs)]
    print(len(all))
    t = time.perf_counter()
    circuit = cirq.Circuit(all)
    print(time.perf_counter() - t)
    print(len(circuit))

Obviously we can't do this for Moment since it relies on mutation, but we can create MomentBuilder with the above, and then as last step convert the 1000 MomentBuilders to 1000 Moments, which should be ~instantaneous.

@daxfohl
Copy link
Contributor

daxfohl commented Apr 29, 2022

An additional option for speeding this up is to have dedicated measurement_keys() and control_keys() methods on the Operation and Moment classes, with default implementation to return a singleton empty frozenset. This would be a couple clock ticks. The protocol on the other hand is quite slow; it reflectively checks for several different dunder methods before finally giving up and allocating/returning a new empty set. A lot of overhead for an inner loop.

Having the method may also be a bit more discoverable than requiring users to dig through top-level protocols too.

This would be a breaking change though, as any 3rd party custom measurement gate that correctly implemented the measurement_keys protocol, would suddenly start falling past controlled ops on the same key, because we're no longer checking the protocol. We'd have to check both places for a deprecation cycle.

@daxfohl
Copy link
Contributor

daxfohl commented May 5, 2022

I went ahead and created a PR for using a MomentBuilder #5331. The changes here are independent and still beneficial for users who create empty circuits and add gates programmatically.

@dabacon dabacon added the automerge Tells CirqBot to sync and merge this PR. (If it's running.) label May 9, 2022
@CirqBot CirqBot added the front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. label May 9, 2022
@CirqBot CirqBot merged commit b378ebd into quantumlib:master May 9, 2022
@CirqBot CirqBot removed automerge Tells CirqBot to sync and merge this PR. (If it's running.) front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. labels May 9, 2022
rht pushed a commit to rht/Cirq that referenced this pull request May 1, 2023
…ntumlib#5280)

This speeds up a 10^2 qubit by 10^3 deep creation of a circuit made up entirely of `cirq.X` gates, created by just appending these gates onto the circuit, from 25s to 5s.

The issue is that Moment's are immutable, so they need to be copied when adding in new operations.  Before this PR we don't copy two cached objects, the measurement key objects, and control keys during this.  This copies these caches over and update them.

Because using `insert` on circuit for earliest insertion strategy has to look up measurement keys or control keys (in order to not move an object with such a key before a moment that has such a key), moments during creating are always being asked what their measurement and control keys are.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size: M 50< lines changed <250
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants