Skip to content

Make random value generation and use more reliable#1065

Merged
rhatdan merged 7 commits intocontainers:mainfrom
mtrmac:random
Jun 15, 2022
Merged

Make random value generation and use more reliable#1065
rhatdan merged 7 commits intocontainers:mainfrom
mtrmac:random

Conversation

@mtrmac
Copy link
Copy Markdown
Collaborator

@mtrmac mtrmac commented Nov 19, 2021

  • Maintain a private RNG for pkg/stringid so that it is unaffected by anyone else thinking that math/rand.Seed() guarantees anything in a large codebase.
  • When generating container/image/layer IDs, make sure not to generate duplicates, to eliminate the slightest risk of this randomly failing.

Compare the plight at containers/podman#12155 .

⚠️ Please review the getMaxSizeFromImage part carefully; it’s quite possible that the use of the container ID there is intentional, I wouldn’t know.


It might make sense to merge this only after various other users of math/rand are modified/fixed, in case they unknowingly depend on this seeding the global RNG with unpredictable values (vs. the Go standard library of Seed(1)).

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Nov 19, 2021

When generating container/image/layer IDs, make sure not to generate duplicates, to eliminate the slightest risk of this randomly failing.

There are many more instances of this in the tests; this PR doesn’t fix that.

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Nov 19, 2021

Another place where random values are assumed never to conflict is the lockfile.lw value.

So, this PR is now trying to make it consistently unique. That code complexity is bordering on not worth it — feel free to tell me to drop this part. The one possibly redeeming quality of it is that we now record a timestamp and PID of the lock owner into the lock state, which might help with later debugging.

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented Nov 19, 2021

I think we need a PR opened against Podman and maybe buildah to make sure there are no regressions.

@TomSweeneyRedHat
Copy link
Copy Markdown
Member

@rhatdan tests aren't happy and you might need a rebase.

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented Nov 19, 2021

@TomSweeneyRedHat You meant @mtrmac or commented in the wrong PR.

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented Nov 22, 2021

@mtrmac Any idea why this is blowing up?

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Nov 22, 2021

Not yet, looking…

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented Apr 20, 2022

LGTM

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented May 17, 2022

@mtrmac Is this still worth doing?

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented May 17, 2022

I think so; but I’m not comfortable with merging this without someone much more familiar with c/storage reviewing it and making that call.

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented May 17, 2022

@giuseppe @nalind PTAL

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented May 17, 2022

LGTM

@nalind
Copy link
Copy Markdown
Member

nalind commented May 17, 2022

Yes, this all looks fine.
Using strong entropy for 44 bytes of the ID might be overkill, but I can't guarantee that people aren't trying to share lock files across PID namespaces, so I can't rule out the need for it definitively.
I don't see any particular reason that we'd need to force the layer ID in getMaxSizeFromImage(). In retrospect, though, not retrofitting it to use store.MountImage() when we added that method was a missed opportunity.

@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented May 17, 2022

Using strong entropy for 44 bytes of the ID might be overkill

True. The previous version used crypto/rand for all of the 64 bytes, so this is at least a reduction. I can make the random part shorter (padding with zeroes), but then we get to discuss how short exactly :) Probably at least 64 bits, to be commensurate with the PID namespace, and to account for the birthday paradox. 128 bits? 256?

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented Jun 15, 2022

@mtrmac @nalind Is this ready to merge?

mtrmac added 7 commits June 15, 2022 17:44
The global math/rand RNG is a shared resource, and
various packages like to seed it, in particular
usually just using time.Now().UnixNano(), making
our attempts to use a better seed unreliable.

So, use a private RNG instead.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Include the image ID (name?) being searched,
not the ID of the container that would use the image.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
There isn't any easy-to-see reason to use the same ID for
the two objects, and not needing to known the container ID
at that point will help generating it correctly without
breaking the ContainerStore abstraction.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
If callers to Store don't provide container/image/layer
IDs, Store generates random values without checking
for collisions.

Meanwhile, the underlying per-object stores do check for
collisions, which is more reliable, and nothing in
Store needs to know the values in advance (any more), so
just delete the code that generates the random values,
exposing the better per-object ID generation code.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... to have only one place that generates the value,
and make it easier to change how it is constructed.

For now, this is just a simple function call forwarding,
but we'll change that shortly.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... simplifying the code.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
... by including the time, PID, and a per-process counter.

Pad the rest with rando values for continuity.

This feels a bit like overkill, OTOH adding the PID might be
useful for debugging, so there's that.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@mtrmac
Copy link
Copy Markdown
Collaborator Author

mtrmac commented Jun 15, 2022

I have updated the comment with @nalind ’s non-trivial point about collisions across PID namespaces. Other than that I think this is done, if @nalind is fine with it.

@nalind
Copy link
Copy Markdown
Member

nalind commented Jun 15, 2022

LGTM

@rhatdan rhatdan merged commit 587fa86 into containers:main Jun 15, 2022
@mtrmac mtrmac deleted the random branch June 15, 2022 20:30
@kolyshkin
Copy link
Copy Markdown
Contributor

OTOH moby/moby removed the non-crypto functions from the package: moby/moby#39336

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented Sep 13, 2022

Want to open a PR to make the same change here?

@kolyshkin
Copy link
Copy Markdown
Contributor

Want to open a PR to make the same change here?

Sure: #1337

TomSweeneyRedHat pushed a commit to TomSweeneyRedHat/storage that referenced this pull request Feb 8, 2023
Backport the commits in containers#1065
to address https://bugzilla.redhat.com/show_bug.cgi?id=2166429

Make sure the values returned by newLastWriterID() are unique
... by including the time, PID, and a per-process counter.

Pad the rest with rando values for continuity.

This feels a bit like overkill, OTOH adding the PID might be
useful for debugging, so there's that.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: tomsweeneyredhat <tsweeney@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants