Tracing is not working for Google Stackdriver #8

Open · cqx820 opened this issue Sep 22, 2021 · 34 comments

@cqx820 commented Sep 22, 2021

Hi,

I'm trying to send trace data to Google Cloud Trace. A Python script using a credentials file works for me, but I'm unable to do the same in Rust. The Cargo.toml looks like this:

opentelemetry = { version = "0.16.0", default_features = false, features = ["rt-tokio", "rt-async-std", "rt-tokio-current-thread", "trace", "metrics"] }
opentelemetry-stackdriver = { version = "0.12.0", default_features = false, features = ["gcp_auth", "yup-authorizer"] }
tracing = "0.1.28"
tracing-futures = "0.2.5"
tracing-opentelemetry = "0.15.0"
tracing-subscriber = "0.2.24"

My workflow is like this:

  1. Get exporter
let handle = tokio::runtime::Handle::current();
let spawner = tokio_adapter::TokioSpawner::new(handle);
let exporter = StackDriverExporter::connect(
    GcpAuthorizer::new().await.unwrap(), &spawner, None, None).await.unwrap();  // Tried YupAuthorizer as well; it didn't work either.
  2. Get tracer
let tracer = sdk::trace::TracerProvider::builder()
      .with_simple_exporter(exporter)
      .build()
      .tracer("tracer_name", Some(VERSION));
  3. Get subscriber
let subscriber = tracing_subscriber::registry().with(tracing_opentelemetry::layer().with_tracer(tracer));
  4. Set global default
tracing::subscriber::set_global_default(subscriber);
  5. Call tracing::span!(tracing::Level::TRACE, span_name, ...)

That's my entire workflow. After running it I can't find any traces in Google Cloud Trace, and I'm sure the credentials file has enough permissions. I've also tried raising the trace level, but no luck. There were no runtime errors. I wonder if I'm doing something wrong when getting the exporter and the tracer.

I also wrote the sample code below just for testing, but it didn't work for me either:

let exporter = StackDriverExporter::connect(
    GcpAuthorizer::new().await.unwrap(),
    &spawner, Some(Duration::from_secs(5)), Some(5usize),
  ).await.unwrap();

let tracer = sdk::trace::TracerProvider::builder()
      .with_simple_exporter(exporter)
      .build()
      .tracer("tracer_name", Some(VERSION));

let attributes =
      vec![KeyValue::new("version", "0.0.0")];

let _span = tracer
        .span_builder("my_test_span")
        .with_attributes(attributes)
        .start(&tracer);

tracer.with_span(_span, |cx| {
      sleep(Duration::from_millis(500));
    });

Any suggestions would be appreciated, thank you very much!!

@TommyCpp (Contributor)

Could you try adding opentelemetry::global::shutdown_tracer_provider() at the end of your main function?
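
A minimal sketch of where that call could go, assuming the tokio main function and the tracer/subscriber setup from the first post:

    #[tokio::main]
    async fn main() {
        // ... build the exporter, tracer, and subscriber as in the first post ...

        // ... emit spans here ...

        // Shut down the global tracer provider before the process exits so any
        // spans still queued in the exporter get a chance to be flushed.
        opentelemetry::global::shutdown_tracer_provider();
    }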

@cqx820 (Author) commented Sep 23, 2021

> Could you try adding opentelemetry::global::shutdown_tracer_provider() at the end of your main function?

Thanks for your reply. I tried your suggestion, but unfortunately it didn't work; I'm still not seeing anything in Google Cloud Trace. I also found another issue with YupAuthorizer: I provided a path for persistent_token_file, but the token isn't being persisted to disk either. I wondered whether the authorization had failed, but I didn't see any panic, so I assumed the auth worked.

@djc (Contributor) commented Sep 23, 2021

We set it up like this:

        let authorizer = GcpAuthorizer::new().await.unwrap();
        let spawner = TokioSpawner::new(Handle::current());
        let exporter = StackDriverExporter::connect(authorizer, &spawner, None, None)
            .await
            .unwrap();
        let provider = TracerProvider::builder()
            .with_batch_exporter(exporter, Tokio)
            .with_config(Config {
                sampler: Box::new(Sampler::TraceIdRatioBased(1.0)),
                ..Default::default()
            })
            .build();

        tracing_subscriber::registry()
            .with(tracing_opentelemetry::layer().with_tracer(provider.tracer("tracing", None)))
            .try_init()
            .unwrap();

This has been proven to work quite recently.

@cqx820 (Author) commented Sep 23, 2021

> We set it up like this: […] This has been proven to work quite recently.

Thank you very much for your reply, I really appreciate your help. I tried your setup, but I'm still not able to see any traces. I'm sure the environment variable is set properly and the credentials file is valid. What did I miss?

#[tokio::main]
async fn main() {
    let name = "GOOGLE_APPLICATION_CREDENTIALS";
    match env::var(name) {
      Ok(v) => println!("{}: {}", name, v),
      Err(e) => panic!("${} is not set ({})", name, e)
    }
    init_tracing().await.unwrap();
    let span = tracing::trace_span!("my_span_test", version="0.0.0", uuid="0000");
    span.in_scope(|| {
      println!("Hello, world!");
      sleep(Duration::from_millis(500));
    });
    opentelemetry::global::shutdown_tracer_provider();
}

async fn init_tracing() -> Result<(), anyhow::Error> {
  let authorizer = GcpAuthorizer::new().await.unwrap();
  let spawner = TokioSpawner::new(tokio::runtime::Handle::current());
  let exporter = StackDriverExporter::connect(
    authorizer, &spawner, None, None,
  ).await.unwrap();
  let provider = sdk::trace::TracerProvider::builder()
    .with_batch_exporter(exporter, opentelemetry::runtime::Tokio)
    .with_config(Config {
      sampler: Box::new(Sampler::TraceIdRatioBased(1.0)),
      ..Default::default()
    })
    .build();
  tracing_subscriber::registry()
    .with(tracing_opentelemetry::layer().with_tracer(provider.tracer("tracing", None))).try_init()
    .unwrap();
  Ok(())
}

@djc (Contributor) commented Sep 29, 2021

There are a few log::error() calls in the exporter. Maybe set up a simple logger (env_logger usually suffices) and see if any of the existing log calls trigger?
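
A minimal sketch of wiring that up, assuming env_logger is added as a dependency (RUST_LOG controls the verbosity):

    // Install a simple logger early in main() so the exporter's log::error!()
    // output becomes visible; default to "debug" if RUST_LOG is not set.
    env_logger::Builder::from_env(
        env_logger::Env::default().default_filter_or("debug"),
    )
    .init();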

@dvtkrlbs commented Jan 18, 2022

I am having the same problem. No traces show up on the Google Cloud Trace panel. My init_telemetry function looks like this

async fn init_telemetry() {
    let app_name = "app";
    // Spans are exported in batch - recommended setup for a production application.
    global::set_text_map_propagator(TraceContextPropagator::new());
    let spawner = TokioSpawner::new(tokio::runtime::Handle::current());
    let authorizer = opentelemetry_stackdriver::GcpAuthorizer::new()
        .await
        .unwrap();
    let exporter = opentelemetry_stackdriver::StackDriverExporter::connect(
        authorizer,
        &spawner,
        None,
        Some(4),
    )
    .await
    .unwrap();
    let provider = TProvider::builder()
        .with_batch_exporter(exporter, opentelemetry::runtime::Tokio)
        .with_config(Config {
            sampler: Box::new(Sampler::TraceIdRatioBased(1.0)),
            ..Default::default()
        })
        .build();

    // Filter based on level - trace, debug, info, warn, error
    // Tunable via `RUST_LOG` env variable
    let env_filter = EnvFilter::try_from_default_env().unwrap_or(EnvFilter::new("debug"));
    // Create a `tracing` layer using the Jaeger tracer
    // let telemetry = tracing_opentelemetry::layer().with_tracer(tracer);
    // let telemetry = tracing_opentelemetry::layer().with_tracer(tracer);
    let telemetry = tracing_opentelemetry::layer().with_tracer(provider.tracer("tracing", None));
    // Create a `tracing` layer to emit spans as structured logs to stdout
    let formatting_layer = BunyanFormattingLayer::new(app_name.into(), std::io::stdout);
    // Combined them all together in a `tracing` subscriber
    tracing_subscriber::registry()
        .with(env_filter)
        .with(telemetry)
        .with(JsonStorageLayer)
        .with(formatting_layer)
        .try_init()
        .unwrap();
    // tracing::subscriber::set_global_default(subscriber)
    //     .expect("Failed to install `tracing` subscriber.")
}

The same setup works with Jaeger. I suspect my TokioSpawner struct, which looks like this:

struct TokioSpawner {
    handle: tokio::runtime::Handle,
}

impl TokioSpawner {
    fn new(handle: tokio::runtime::Handle) -> Self {
        Self { handle }
    }
}

impl Spawn for TokioSpawner {
    fn spawn_obj(&self, future: FutureObj<'static, ()>) -> std::result::Result<(), SpawnError> {
        self.handle.spawn(future);

        Ok(())
    }

    fn status(&self) -> std::result::Result<(), SpawnError> {
        Ok(())
    }
}

@djc (Contributor) commented Jan 19, 2022

You don't need to define the TokioSpawner -- opentelemetry-stackdriver provides it, gated by an optional feature.

Did you already try my hint about enabling logging? You'll want to set the tracing-subscriber features to default-features = false to make sure it doesn't catch logging stuff.

You'll also need to give the associated service account permissions to write traces. Anyway, I've been testing opentelemetry-stackdriver myself for the past week or so, and I can definitely get it to work.

Current test setup:

    env_logger::init();
    let authorizer = GcpAuthorizer::new().await.unwrap();
    let spawner = TokioSpawner::new(Handle::current());
    let exporter = StackDriverExporter::connect(authorizer, &spawner, None, None)
        .await
        .unwrap();
    let provider = TracerProvider::builder()
        .with_batch_exporter(exporter.clone(), Tokio)
        .with_config(Config {
            sampler: Box::new(Sampler::TraceIdRatioBased(1.0)),
            ..Default::default()
        })
        .build();

    tracing_subscriber::registry()
        .with(tracing_opentelemetry::layer().with_tracer(provider.tracer("tracing", None)))
        .try_init()
        .unwrap();

@dvtkrlbs

There are no errors printed. It just prints a bunch of zeros without any other information. The service account I am using has admin access to the Trace API.

@djc (Contributor) commented Jan 19, 2022

Well, I'm confident the code above works, because I've recently tested it. If it doesn't work, I suspect it's something about your environment. You could clone this project, change opentelemetry-stackdriver's dependency on opentelemetry to skip the path dependency (and fix up the occurrences of to_bytes() -> to_byte_array()), then add some dbg/println statements to see what's going on.

@etiescu commented Apr 3, 2022

Hi!

Is there any documentation/example on how to use the most recent version of this library? The code snippets above seem a bit outdated, since the most recent code no longer has a TokioSpawner and introduces a builder that returns a tuple containing a future :)

@djc (Contributor) commented Apr 4, 2022

Here's some current code from the same test project:

    env_logger::init();

    let authentication_manager = AuthenticationManager::new().await.unwrap();
    let project_id = authentication_manager.project_id().await.unwrap();
    let log_context = LogContext {
        log_id: "cloud-trace-test".into(),
        resource: MonitoredResource::GenericNode {
            project_id,
            namespace: Some("test".to_owned()),
            location: None,
            node_id: None,
        },
    };

    let authorizer = GcpAuthorizer::new().await.unwrap();
    let (exporter, driver) = StackDriverExporter::builder()
        .log_context(log_context)
        .build(authorizer)
        .await
        .unwrap();

    tokio::spawn(driver);
    let provider = TracerProvider::builder()
        .with_batch_exporter(exporter.clone(), Tokio)
        .with_config(Config {
            sampler: Box::new(Sampler::TraceIdRatioBased(CLOUD_TRACE_RATE)),
            ..Default::default()
        })
        .build();

    tracing_subscriber::registry()
        .with(tracing_opentelemetry::layer().with_tracer(provider.tracer("tracing")))
        .try_init()
        .unwrap();

@richardbrodie

I'm also having this problem. My code is essentially identical to @djc's latest snippet, and I get the following output:

OpenTelemetry trace error occurred. Exporter stackdriver encountered the following error(s): authorizer error: authorizer error: Could not establish connection with OAuth server
OpenTelemetry trace error occurred. Exporter stackdriver encountered the following error(s): authorizer error: Could not establish connection with OAuth server

I've confirmed using curl manually that I can successfully fetch the bearer token from https://oauth2.googleapis.com/token so I don't think it's anything wrong with my service_account json.

@djc (Contributor) commented Apr 19, 2022

Try using gcp_auth to get a token for the correct authz scope without opentelemetry-stackdriver, does that work?

https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-stackdriver/src/lib.rs#L191
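
A standalone check along those lines might look like this (a sketch, assuming the gcp_auth crate and a tokio runtime; the two scopes are the ones the exporter requests):

    use gcp_auth::AuthenticationManager;

    #[tokio::main]
    async fn main() -> Result<(), Box<dyn std::error::Error>> {
        // Picks up GOOGLE_APPLICATION_CREDENTIALS or other ambient credentials.
        let manager = AuthenticationManager::new().await?;
        let scopes = &[
            "https://www.googleapis.com/auth/trace.append",
            "https://www.googleapis.com/auth/logging.write",
        ];
        // If this fails, the problem is in the credentials/environment rather
        // than in opentelemetry-stackdriver.
        let token = manager.get_token(scopes).await?;
        println!("fetched token: {:?}", token);
        Ok(())
    }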

@richardbrodie commented Apr 19, 2022

Yep, that works fine. Using scopes=["https://www.googleapis.com/auth/trace.append", "https://www.googleapis.com/auth/logging.write"] I successfully get a Token:

Token { 
    access_token: "****", 
    expires_at: Some(OffsetDateTime { 
        utc_datetime: PrimitiveDateTime { 
            date: Date { year: 2022, ordinal: 109 }, 
            time: Time { hour: 12, minute: 43, second: 58, nanosecond: 457575700 } 
        }, 
        offset: UtcOffset { hours: 0, minutes: 0, seconds: 0 } 
    }) 
}

I'm assuming you meant gcp_auth::AuthenticationManager::get_token(scopes), right?

@djc (Contributor) commented Apr 19, 2022

Yes, that's what I meant.

Did you configure either of the TLS roots Cargo features for opentelemetry-stackdriver?

@richardbrodie

I have this in my Cargo.toml:

    opentelemetry-stackdriver = { version = "0.14.0", default_features = false, features = ["tls-native-roots", "gcp_auth"] }

@djc (Contributor) commented Apr 19, 2022

If you run with debug-level logging enabled, which kind of authentication manager is selected?

@richardbrodie commented Apr 19, 2022

It's tracing, not logging, but I get a couple of these:
2022-04-19T13:04:26.227593Z DEBUG refresh_token{self=CustomServiceAccount { credentials: ApplicationCredentials { type: Some("service_account"), project_id: Some(...), private_key_id: Some(...), private_key: ..., client_email: ..., client_id: Some(...), auth_uri: Some("https://accounts.google.com/o/oauth2/auth"), token_uri: "https://oauth2.googleapis.com/token", auth_provider_x509_cert_url: Some("https://www.googleapis.com/oauth2/v1/certs"), client_x509_cert_url: Some("https://www.googleapis.com/robot/v1/metadata/x509/...") }, signer: Signer, tokens: RwLock { data: {}, poisoned: false, .. } } client=Client scopes=["https://www.googleapis.com/auth/trace.append", "https://www.googleapis.com/auth/logging.write"]}: gcp_auth::custom_service_account: requesting token from service account: Request { method: POST, uri: https://oauth2.googleapis.com/token, version: HTTP/1.1, headers: {"content-type": "application/x-www-form-urlencoded"}, body: Body(Full(b"grant_type=urn%3Aietf%3Aparams%3Aoauth%3Agrant-type%3Ajwt-bearer&assertion=...")) }

@djc (Contributor) commented Apr 19, 2022

That seems okay. I'm afraid you'll have to debug for yourself where stuff is going wrong inside opentelemetry-stackdriver. There's not a lot of code there, I'd recommend cloning a local git repo and patching your test setup to use the local version and add some debug statements there.

@richardbrodie

Thought so, but it's good to get a pair of fresh eyes on it before I start. Thanks for the help :)

@richardbrodie

Update: after a lot of digging and debugging I must embarrassingly report that my issue was simply that, in my test program, the Tokio runtime was being terminated before the gcp_auth task requesting a new token had finished. I added a 1000 ms sleep just before opentelemetry::global::shutdown_tracer_provider() and it works properly now.
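
In code, the workaround looks roughly like this (a sketch; 1000 ms is just the value mentioned above, not a tuned number):

    // At the end of main(): give the gcp_auth token request and the exporter
    // time to finish before the runtime is torn down, then flush the provider.
    tokio::time::sleep(std::time::Duration::from_millis(1000)).await;
    opentelemetry::global::shutdown_tracer_provider();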

I now get tonic error: status: PermissionDenied, message: "The caller does not have permission", but that seems outside the scope of this crate :)

@djc (Contributor) commented Apr 20, 2022

You'll want some of these:

cloudtrace.traces.patch
logging.logEntries.create
monitoring.timeSeries.create

@AlexandrBurak commented Sep 19, 2022

(screenshots of compile errors attached)

Hi, could you please show a working example for the current versions? I've tried everything in this thread but I can't get it to compile. With the last example from April 4th, I'm not able to get the tracer and provider.
Thanks.

@djc (Contributor) commented Sep 20, 2022

It sounds like you might have a version mismatch going on between some of the crates you're using.

@roshaans

Does an exporter need to be set up even when my service is deployed on Google Cloud Run itself? I currently get logs showing up just fine in Cloud Tracing, but no information about spans seems to be propagating through, while the spans do show up in my local Jaeger setup when running my service.

@djc (Contributor) commented Jan 17, 2023

AFAIK GCP has a setup that forwards stuff logged to stderr to their logging service by default, but that doesn't work for tracing. So yes, if you want to get spans into Cloud Tracing I think you have to set up an exporter explicitly.

@hdost hdost transferred this issue from open-telemetry/opentelemetry-rust Nov 12, 2023
@cyberbudy

I have the same problem: no traces with the stackdriver exporter.

I've tried to find the source code, but the links show a 404.

@ivan-brko (Contributor)

I got everything to work with the latest versions and documented it here.
I also linked the small repo where I put it all together and it correctly exports logs in my GCP Cloud Run application (as you can see in the screenshots), so you can look there.

Some things could probably be improved, but it should work.

@ivan-brko (Contributor)

@djc thanks for all the work you did for opentelemetry-stackdriver, it was great to be able to export traces to GCP rather quickly.

Do you know if there is somewhere I can read an overview of the current state and plans for the crate? A GitHub thread or something?
Currently, the docs are not up to date (for example, GcpAuthorizer is missing), and if you follow the link to the repo in the docs it leads to https://github.com/open-telemetry/opentelemetry-rust, which doesn't have the code for stackdriver (not in the main branch, anyway).
I can also see that in this repo's workspace Cargo.toml, stackdriver is commented out with the following comment: TODO: Add back in once this relies on a published version.

I'm willing to try to help if it's needed.

@djc (Contributor) commented Jan 15, 2024

@ivan-brko I'm using it at work so I'm also passively maintaining it. However, I'm no longer active as an opentelemetry-rust maintainer and (unrelated to that) this project got moved and apparently the opentelemetry-rust maintainers missed/postponed some things when moving this crate to its new home. PRs still welcome.

@ivan-brko (Contributor)

@djc Should PRs be made against this repo?

Currently, I can't get stackdriver in this repo (main branch) to compile. I don't know if I'm doing something wrong, as I didn't investigate it; I wanted to check whether this code is even supposed to build.
(screenshot of the compile errors attached)

I already made a small PR to fix a documentation issue created here, and I would gladly work on a couple of other improvements (fixing the compile issue I posted if needed, documentation improvements, plus things like this issue), as I would like to get my feet wet with Rust OSS anyway.

If I understand correctly, opentelemetry-stackdriver in this repository hasn't been used to publish new versions of the crate yet?

@djc (Contributor) commented Jan 15, 2024

No -- unfortunately this repo was forked off the original repo only after some semver-incompatible changes to crates in the original repo were made. It might make sense to change the opentelemetry dependencies for opentelemetry-stackdriver to reference the upstream crates as a Git dependency instead for now so you can move forward here? I would probably pin them to the last commit before the extraction of the -contrib repo.

@ivan-brko (Contributor)

Yeah, that sounds reasonable for developing new features, but I might actually try fixing the issues with the current build first. A new version of the crate can't be published from this repo before fixing this problem anyway, right? Publishing a crate that depends on a specific commit would not work well with other libraries that depend on the OTEL crates.

Do you happen to know whether just fixing the compile errors would be enough, or are there other things to look into?

@ivan-brko (Contributor)

@djc if you manage to catch some time, could you take a look at the PR? It uses the latest opentelemetry and opentelemetry_sdk versions and fixes the compile issues.
