Feature: Add a new container Logs API #36

Closed
pranavs18 opened this issue Jan 14, 2015 · 4 comments
Labels
kind/enhancement Issues that improve or augment existing functionality
@pranavs18

Overview

Logs are always useful for figuring out what's going on in a distributed system, especially when something goes wrong. Docker currently supports getting logs from a container that logs to stdout/stderr: everything the process running in the container writes to stdout or stderr, Docker converts to JSON and stores in a file on the host machine's disk, which you can then retrieve with the docker logs command. This feature aims to provide a way to monitor logs from the Rancher UI.
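For reference, with Docker's default json-file behavior, each line in that per-container file on the host looks roughly like `{"log":"starting server\n","stream":"stdout","time":"2015-01-14T18:32:10.000000000Z"}` (the log content and timestamp here are illustrative).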

What does Docker already support?

The current way of fetching a container's logs is to run the docker logs command, which maps to the container logs endpoint of the Docker remote API. As of version 1.16 (https://docs.docker.com/reference/api/docker_remote_api_v1.16/#get-container-logs), it takes 5 possible options -

Query Parameters:

follow – 1/True/true or 0/False/false, return stream. Default false
stdout – 1/True/true or 0/False/false, show stdout log. Default false
stderr – 1/True/true or 0/False/false, show stderr log. Default false
timestamps – 1/True/true or 0/False/false, print timestamps for every log line. Default false
tail – Output specified number of lines at the end of logs: all or <number>. Default all
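
For concreteness, here's a minimal sketch (not the planned Rancher implementation) of calling this endpoint directly from Go over the Docker unix socket; the container ID is a placeholder and error handling is kept minimal:

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"os"
)

func main() {
	// Talk to the Docker daemon over its unix socket rather than TCP.
	client := &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
				return (&net.Dialer{}).DialContext(ctx, "unix", "/var/run/docker.sock")
			},
		},
	}

	containerID := "CONTAINER_ID" // placeholder: substitute a real container ID
	url := "http://localhost/v1.16/containers/" + containerID +
		"/logs?stdout=1&stderr=1&timestamps=1&tail=10"

	resp, err := client.Get(url)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	// Without a TTY the body is a multiplexed stream with an 8-byte
	// header per frame; a raw copy is enough for a quick look.
	io.Copy(os.Stdout, resp.Body)
}
```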

Need for this feature

Currently, there is no way to monitor Docker container logs from the Rancher UI. This feature aims to obtain the Docker logs from each container, then collect and ship them through this new API to the Rancher UI, so that the user can always monitor them through the console there. It thus gives the end user an alternative to the existing options of using the Docker remote API or the Docker CLI directly.

Tentative Design

The result of this new action (API) should return a URL to a websocket on the host, along with a JWT token. The websocket should connect to the host-api Go process which runs on the host. (I'll update this section as I progress further.)
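
A rough sketch of what the action response might carry (field names are illustrative, not final):

```go
package model

// LogsActionResponse is a hypothetical shape for the logs action
// response; the field names are illustrative, not final.
type LogsActionResponse struct {
	URL   string `json:"url"`   // e.g. wss://<host-ip>:<port>/v1/logs/...
	Token string `json:"token"` // JWT minted by the Rancher server, validated by host-api
}
```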

Implementation Details

The implementation of this new API would probably be based on how the stats action and the exec action on instance have been implemented in Rancher. Currently, the host-api Go process creates a secure websocket that proxies information; Rancher uses it to grab information from cAdvisor. For this new action API, I plan to call the docker logs command with the appropriate input parameters required for fetching a container's logs.
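
As a rough sketch of that flow inside host-api (assuming the gorilla/websocket package, a hypothetical container query parameter, docker on the host's PATH, and an illustrative port; error handling trimmed):

```go
// Sketch of a host-api style logs handler: upgrade the HTTP request to
// a websocket, shell out to `docker logs`, and relay each line.
package main

import (
	"bufio"
	"net/http"
	"os/exec"

	"github.com/gorilla/websocket"
)

var upgrader = websocket.Upgrader{
	CheckOrigin: func(r *http.Request) bool { return true }, // tighten in real code
}

func logsHandler(w http.ResponseWriter, r *http.Request) {
	// JWT validation against the Rancher server's public key would go here.
	conn, err := upgrader.Upgrade(w, r, nil)
	if err != nil {
		return
	}
	defer conn.Close()

	id := r.URL.Query().Get("container") // hypothetical parameter name
	cmd := exec.Command("docker", "logs", "--follow", "--timestamps", "--tail", "100", id)
	out, err := cmd.StdoutPipe()
	if err != nil {
		return
	}
	if err := cmd.Start(); err != nil {
		return
	}
	defer cmd.Wait()

	// Relay each log line to the client as a websocket text message.
	scanner := bufio.NewScanner(out)
	for scanner.Scan() {
		if err := conn.WriteMessage(websocket.TextMessage, scanner.Bytes()); err != nil {
			return
		}
	}
}

func main() {
	http.HandleFunc("/v1/logs/", logsHandler)
	http.ListenAndServe(":9345", nil) // port is illustrative
}
```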

Query Parameters Format -

Name - Format
"follow" - boolean
"lines" - integer (corresponds to tail in Docker's remote API), default value 100
"stdOut" - boolean
"stdErr" - boolean
"timestamps" - boolean

Example API usage -

http://<cattle-server-ip>:8080/v1/containers/<container-id>/?action=logs&lines=10&stdOut=true&follow=true

Use Cases That Need to Be Explored/Supported

  1. Tailing the logs - This functionality will be supported, wherein the user can input the number of lines of logs to retrieve. The default value will be set to 100 lines.
  2. Downloading the logs from the UI - yet to be decided.
  3. Scrolling through the logs on the UI - Once the logs are being viewed on the UI, the user should be able to scroll through them.
  4. Searching through the logs on the UI?
@pranavs18
Author

The progress of this feature can be tracked here - rancher/cattle#138

@pranavs18
Author

A few questions, @ibuildthecloud:

  1. Currently, if I follow the pattern for execHandler, we return the URL to the websocket, but when we talk about JWT authentication, how and where do we do its validation? Where can I get the public key from, so that once the JWT token is created by the Rancher server, it can be validated when the request comes in from the user? (A sketch of what this could look like follows this list.)

  2. Once the URL to the websocket with a port, container UUID and other arguments has been returned, clicking that URL should trigger a request to the host-api, which I believe is listening for these requests. But looking at the Go code, it seems somewhat specific to the "Stats" command, so do you think I should modify/restructure it and use it to make calls to the Docker daemon? Or write a new Go agent?

  3. When you talk about calling docker twice, once for stdOut and once for stdErr, could you point me to any existing module which handles the multiplexing, and how any concurrency issues might have been handled there?

  4. What should actually be returned once the "docker logs ..." call has been executed? This "object" would need to be passed back to the UI.
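
Regarding question 1, a minimal sketch of what that validation could look like in host-api, assuming the server signs with RS256, the public key is distributed as a PEM file at a hypothetical path, and the github.com/dgrijalva/jwt-go package is used:

```go
// Sketch of JWT validation; the key path and package choice are assumptions.
package auth

import (
	"fmt"
	"io/ioutil"

	jwt "github.com/dgrijalva/jwt-go"
)

func validateToken(tokenString string) (*jwt.Token, error) {
	pem, err := ioutil.ReadFile("/etc/cattle/server-public-key.pem") // hypothetical path
	if err != nil {
		return nil, err
	}
	key, err := jwt.ParseRSAPublicKeyFromPEM(pem)
	if err != nil {
		return nil, err
	}
	return jwt.Parse(tokenString, func(t *jwt.Token) (interface{}, error) {
		// Reject tokens signed with anything other than RSA.
		if _, ok := t.Method.(*jwt.SigningMethodRSA); !ok {
			return nil, fmt.Errorf("unexpected signing method: %v", t.Header["alg"])
		}
		return key, nil
	})
}
```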

@ibuildthecloud
Contributor

The frame format is

0 1 TS Content

where 0 is the version number (always 0 for now), the second field is either 1 or 2 depending on stdout/stderr, and TS is a timestamp in the format that Docker returns.
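
For illustration, assuming the fields are space-delimited, a frame and a minimal parser could look like this (the timestamp and log content are made up):

```go
// Minimal parser for the frame format above, assuming space-delimited
// fields: version, stream (1=stdout, 2=stderr), timestamp, content.
package main

import (
	"fmt"
	"strings"
)

func parseFrame(frame string) (version, stream, ts, content string, err error) {
	parts := strings.SplitN(frame, " ", 4)
	if len(parts) != 4 {
		return "", "", "", "", fmt.Errorf("malformed frame: %q", frame)
	}
	return parts[0], parts[1], parts[2], parts[3], nil
}

func main() {
	v, s, ts, msg, _ := parseFrame("0 1 2015-01-14T18:32:10.000000000Z starting server on :8080")
	fmt.Println(v, s, ts, msg)
}
```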

@will-chan will-chan added the kind/enhancement Issues that improve or augment existing functionality label Jan 22, 2015
@will-chan will-chan added this to the Rancher 1.0 milestone Jan 22, 2015
@pranavs18 pranavs18 removed their assignment Feb 7, 2015
@deniseschannon

Closing as #40 is closed.

JeffersonBledsoe pushed a commit to JeffersonBledsoe/rancher-cli that referenced this issue Apr 28, 2022
Added basic HTTP auth to template details request
anupama2501 pushed a commit to anupama2501/rancher that referenced this issue May 18, 2023
restricted admin additional tests