
Periscope will only run correctly in the aks-periscope namespace #165

Closed
peterbom opened this issue Apr 6, 2022 · 1 comment · Fixed by #196
peterbom commented Apr 6, 2022

Describe the bug
Periscope cannot generate a blob name for upload to storage if it is deployed in a namespace other than aks-periscope. I'm not sure whether this is a known and accepted limitation, but I couldn't find it documented, and it tripped me up.

The reason for the hard-coded namespace dependency is:

  • We need to generate a blob with a timestamp that's unique to the deployment/run, but shared between all pods.
  • The method for getting this timestamp is to list all pods in the aks-periscope namespace and take the creation timestamp of the last one.

Aside from tying us to a particular namespace, this is vulnerable to timing inconsistencies if an additional pod is created after another pod starts executing this code path. It also prevents us from addressing an issue raised in #157 (being unable to re-run Periscope for an existing DaemonSet). So we should probably be looking for a different approach anyway.

A bit of careful planning might enable us to address both this and #157 together. If the timestamp can be passed to periscope from something external to the pod (like the contents of a mounted file that can be watched), it could solve all of the above as well as giving us a potential means to trigger further 'runs' of an existing DaemonSet.

To Reproduce

  1. Deploy using the usual yaml resource specification, but change the namespace to (e.g.) aks-periscope-test.
  2. Wait for the pods to be running.
  3. Check storage in the Azure Portal. Assuming everything else was set up correctly, the logs should have uploaded successfully, but the blob name will be <no name>.

Expected behavior
The blob name should be a timestamp like 2022-03-28T21-26-02Z.

Screenshots
[Screenshot: Azure Portal storage browser showing the uploaded blob listed as <no name>]

Desktop (please complete the following information):
N/A

Additional context
N/A

peterbom (author) commented

This is related to an intermittent bug (which occurs more frequently on newer Windows clusters), in which the node folders in the output file structure are divided between more than one timestamped root container.

By using the DIAGNOSTIC_RUN_ID variable now being supplied to Periscope by consuming tools, we can fix both of these bugs.
