Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add seconds to engine program id generation #3906

Merged
merged 3 commits into from
Mar 16, 2021

Conversation

dstrain115
Copy link
Collaborator

  • Clients were occasionally getting duplicate program ids
    when running successive results.
  • Root cause is unknown, but likely cause is that a sub-module
    was calling random.seed() somewhere.
  • This changes program id generation to include seconds in the id,
    so that successive calls will generate distinct ids even if the
    random number generator is seeded.

- Clients were occasionally getting duplicate program ids
when running successive results.
- Root cause is unknown, but likely cause is that a sub-module
was calling random.seed() somewhere.
- This changes program id generation to include seconds in the id,
so that successive calls will generate distinct ids even if the
random number generator is seeded.
@dstrain115 dstrain115 requested review from cduck, vtomole, wcourtney and a team as code owners March 11, 2021 14:33
@google-cla google-cla bot added the cla: yes Makes googlebot stop complaining. label Mar 11, 2021
@@ -65,7 +65,7 @@ class ProtoVersion(enum.Enum):
def _make_random_id(prefix: str, length: int = 16):
random_digits = [random.choice(string.ascii_uppercase + string.digits) for _ in range(length)]
suffix = ''.join(random_digits)
suffix += datetime.date.today().strftime('%y%m%d')
suffix += datetime.date.today().strftime('%y%m%d-%H%M%S')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use UUID?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with this second based approach that it assumes that if the faulty (assumingly seeding) client program is generating more than one program per second, you'll still get collisions. With UUID, neither seeding nor speed won't be a problem - though it will be longer (36 characters).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A UUID based on what? I am not sure if that solves the problem.
Would you hash based on the time + circuit? If we use a random uuid, then that would be based on random() and also fail if seeded.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that https://docs.python.org/3/library/uuid.html#uuid.uuid4 is based on os.urandom(). It can't be seeded or manipulated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor

@balopat balopat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for experimenting but I think it might still not really fix the issue for clients that generate a lot of programs fast.

Copy link
Collaborator

@mrwojtek mrwojtek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also fine with this as a temporary fix. I like the UUID approach, this is marginal compared to the rest of request payload.

@dstrain115
Copy link
Collaborator Author

LGTM for experimenting but I think it might still not really fix the issue for clients that generate a lot of programs fast.

Possibly, but you would have to generate more than one program in a second and also set random.seed() accidentally before generating each program.

@balopat balopat added the automerge Tells CirqBot to sync and merge this PR. (If it's running.) label Mar 15, 2021
@balopat
Copy link
Contributor

balopat commented Mar 15, 2021

Let's experiment then!

@CirqBot CirqBot added the front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. label Mar 15, 2021
@CirqBot
Copy link
Collaborator

CirqBot commented Mar 15, 2021

Automerge cancelled: A required status check is not present.

Missing statuses: ['Build docs', 'Build protos', 'Changed Notebooks Isolated Test against Cirq stable', 'Changed files test', 'Coverage check', 'Doc test', 'Format check', 'Lint check', 'Misc check', 'Notebook formatting', 'Pytest MacOS (3.7)', 'Pytest MacOS (3.8)', 'Pytest Ubuntu (3.7)', 'Pytest Ubuntu (3.8)', 'Pytest Windows (3.7)', 'Pytest Windows (3.8)', 'Type check']

@CirqBot CirqBot removed automerge Tells CirqBot to sync and merge this PR. (If it's running.) front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. labels Mar 15, 2021
@dstrain115 dstrain115 merged commit 8cf7825 into quantumlib:master Mar 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Makes googlebot stop complaining.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants