
What's the problem with duplicated logs? #316

Closed
macieyng opened this issue Sep 26, 2023 · 12 comments

Comments

@macieyng

I saw this comment: https://github.com/microsoft/ApplicationInsights-Python/blob/main/azure-monitor-opentelemetry/azure/monitor/opentelemetry/_constants.py#L86 and wanted to ask if someone can elaborate?

We used to have duplicated logs with the custom OTel Azure Functions extension, and configuring the logging level for Host.Function.Console in host.json kind of solved it for us. Another way would be to remove the StreamHandler from the logging configuration. I don't know if that's related, which is why I'd like to learn more about this duplicated logs issue.

@lzchen
Contributor

lzchen commented Sep 26, 2023

@jeremydvoss

@jeremydvoss
Member

jeremydvoss commented Sep 26, 2023

Hi @macieyng, it's unimportant. The original plan for our diagnostic logging system was an "enable" method that would fork all of the logs emitted under additional loggers, such as the exporter logger, to a file for debugging purposes. I found that when extending this to the exporter from the distro, certain logs could be duplicated. This was a note to see if there was a way to extend that approach without duplicating any logs.

However, I have a PR out that changes how diagnostic logging is done, making this irrelevant. So, in short, this was a comment about a potential issue with a proposed future feature that is no longer relevant. Apologies if the comment was overly broad and vague.

@jeremydvoss
Member

Reopening to ask: unrelated, but could you explain more what you mean by "configuring logging level for Host.Function.Console in host.json"? Some customers notice duplicate logs when they have the Function host's .NET logging enabled in addition to their own Python logging. I usually solve this a different way and am curious how you did it.

@WaterKnight1998

WaterKnight1998 commented Sep 28, 2023

@jeremydvoss I have duplicated logs too, but with different custom dimensions, in my Azure Function. It's like they are not being merged.

This is my code:

import logging
import os

import azure.functions as func
import fastapi
from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace

configure_azure_monitor(connection_string=os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"])
logger = logging.getLogger(__name__)
tracer = trace.get_tracer(__name__)

app = fastapi.FastAPI()

@app.get("/x")
async def x():
    logger.info("a", extra={"b": 1})

def main(req: func.HttpRequest, context: func.Context) -> func.HttpResponse:
    return func.AsgiMiddleware(app).handle(req, context)

@WaterKnight1998

WaterKnight1998 commented Sep 28, 2023

Fixed by doing:

import logging
import os

import azure.functions as func
import fastapi
from azure.monitor.opentelemetry import configure_azure_monitor
from opentelemetry import trace

# Clear the handlers that the Functions Python worker pre-registered on the root logger.
logging.getLogger().handlers = []

configure_azure_monitor(connection_string=os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"])
tracer = trace.get_tracer(__name__)

app = fastapi.FastAPI()

@app.get("/x")
async def x():
    with tracer.start_as_current_span("aaaaa"):
        logging.info("a", extra={"b": 1})

def main(req: func.HttpRequest, context: func.Context) -> func.HttpResponse:
    return func.AsgiMiddleware(app).handle(req, context)

The Azure Functions Python worker registers a handler by default, which causes the duplicates.
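
For reference, a less drastic variant of that fix would be to drop only the plain StreamHandler(s) from the root logger instead of clearing every handler - a minimal sketch, assuming the worker's console handler is an ordinary logging.StreamHandler and that this runs before configure_azure_monitor():

import logging

root = logging.getLogger()
# Keep everything except exact StreamHandler instances (the worker's console handler);
# running this before configure_azure_monitor() means its LoggingHandler is untouched.
root.handlers = [h for h in root.handlers if type(h) is not logging.StreamHandler]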

@WaterKnight1998

However, now I am seeing too many traces: 3 traces every 5 s. They look like this:

Request URL: 'https://westeurope-5.in.applicationinsights.azure.com//v2.1/track' Request method: 'POST' Request header
Response status: 200 Response headers: 'Transfer-Encoding': 'chunked' 'Content-Type': 'application/json; charset=utf-8
Transmission succeeded: Item received: 3. Items accepted: 3

I tried excluding these URLs from tracing:

ENV VARS

OTEL_PYTHON_EXCLUDED_URLS=.*applicationinsights\.azure\.com/.*
OTEL_PYTHON_REQUESTS_EXCLUDED_URLS=.*applicationinsights\.azure\.com/.*
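
For reference, a minimal sketch of setting the same exclusions from code (assuming it runs before configure_azure_monitor(), since the excluded-URL lists are only read when instrumentation is applied; note they only suppress request spans for matching URLs, not the exporter's own log lines):

import os

# Must be set before configure_azure_monitor() / any instrumentation is applied.
os.environ.setdefault("OTEL_PYTHON_EXCLUDED_URLS", r".*applicationinsights\.azure\.com/.*")
os.environ.setdefault("OTEL_PYTHON_REQUESTS_EXCLUDED_URLS", r".*applicationinsights\.azure\.com/.*")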

@macieyng
Author

macieyng commented Sep 28, 2023

@jeremydvoss

Reopening to ask: unrelated, but could you explain more what you mean by "configuring logging level for Host.Function.Console in host.json"? Some customers notice duplicate logs when they have the Function host's .NET logging enabled in addition to their own Python logging. I usually solve this a different way and am curious how you did it.

I thought this was a well-known issue with currently no way around it. I'm not that familiar with how those runners, hosts, and Python workers work, but we're using Azure Functions v1 for Python. We have those function.json files, and we can have an additional host.json file living in the root that provides extra configuration - as per the docs here. In general we want our Python devs to be able to configure as much as possible, so we decided to use AzureMonitorLogExporter together with the logger provider, log processors, etc. provided by OTel.

We noticed that when we do that, we get duplicated logs. By inspecting the sdkVersion property that is added to the traces/logs in AI, we could tell that some of them were coming from azurefunctions=* and others from py=*;otel=* (I'm writing those from memory). This is probably because we have not only the OTel handler but also a StreamHandler added to our logging dict config. I'd guess we could remove the StreamHandler from the config since we have the OTel handler set up, and that might resolve our issue with duplicated logs. I've actually never asked myself or my teammates whether we need this stream handler, or whether they use it for debugging, live logging, etc. - probably not.

Nevertheless, the duplicates are probably caused by the fact that our app has a handler sending logs to AI directly and also to the Azure Functions host console, which then sends them to AI too.

As of now, we just changed the Host.Function.Console level to Warning, because we still want to see warning-and-higher logs from it to be notified about potential issues.

Most of our host.json looks like this right now:

{
  "version": "2.0",
  ...
  "logging": {
    "fileLoggingMode": "never",
    "logLevel": {
      "Default": "Information",
      "Microsoft": "Warning",
      "Azure.Messaging.ServiceBus": "Warning",
      "Azure.Core": "Warning",
      "Microsoft.AspNetCore": "Warning",
      "Host": "Information",
      "Host.Controllers.Host": "Warning",
      "Host.Function.Console": "Warning"
    },
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request,Trace,Exception"
      }
    }
  },
  ...
}

I want to add that I would not want to remove the default things logged by the host, because they contain some useful stuff we haven't quite figured out yet, like invocationId, operation_name, etc., and logs such as when the function was triggered, what the trigger was, and the final result (success/failure).


@WaterKnight1998

However, now I am seeing too many traces: 3 traces every 5 s. They look like this:

Request URL: 'https://westeurope-5.in.applicationinsights.azure.com//v2.1/track' Request method: 'POST' Request header
Response status: 200 Response headers: 'Transfer-Encoding': 'chunked' 'Content-Type': 'application/json; charset=utf-8
Transmission succeeded: Item received: 3. Items accepted: 3

That's because azure.monitor.opentelemetry.exporter.export is way too verbose at INFO level. To me, those logs should be at DEBUG level. There's another one, azure.core.pipeline.policies.http_logging_policy, that can also produce noise. Simply set those two to WARNING level. My recommendation is to configure all your loggers with dictConfig, because it gives the best control over logging.
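
For example, a minimal dictConfig sketch that just quiets those two loggers (illustrative only - merge it into your full logging config):

import logging.config

logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    # Raise only the noisy SDK/exporter loggers to WARNING; leave everything else alone.
    "loggers": {
        "azure.monitor.opentelemetry.exporter.export": {"level": "WARNING"},
        "azure.core.pipeline.policies.http_logging_policy": {"level": "WARNING"},
    },
})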

Watch this excellent talk https://youtu.be/5f4o3nyAz5w?si=2EJ2t1027wv37-mt - so much distilled information about logging.

When it comes to those two

OTEL_PYTHON_EXCLUDED_URLS=.*applicationinsights\.azure\.com/.*
OTEL_PYTHON_REQUESTS_EXCLUDED_URLS=.*applicationinsights\.azure\.com/.*

These were excluded by default in OpenCensus to avoid self-reporting. I'm not sure why or how they were lost during the transition to OTel.

As for why logging.info("a", extra={"b": 1}) fixed the duplicated logs for you, I think it can be logically explained - I'm assuming you see those logs in Application Insights like me.
When we call logging.info and there's no handler set up yet, basicConfig will be called, setting up a StreamHandler at WARNING level for us (source). So if we call configure_azure_monitor before Python logging is used for the first time, it adds a LoggingHandler for us and basicConfig is never called. Therefore we have just one handler - no duplicates. Translated to the dictConfig equivalent:

{
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "default": {
            "class": "opentelemetry.sdk._logs.LoggingHandler",
            "logger_provider": get_logger_provider(),
        },
    },
    "loggers": {
        "": {
            "handlers": ["default"],
        },
    },
}

If we follow this logic, it would mean that if we create new loggers via getLogger(__name__), they end up in a namespace that might already have a handler and propagate=True (the default), so the record is also logged under the root namespace. AFAIK __name__ in Functions returns the string __app__. The dictConfig equivalent would be:

{
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "default": {
            "class": "opentelemetry.sdk._logs.LoggingHandler",
            "logger_provider": get_logger_provider(),
        },
        "other": {
            "class": "logging.StreamHandler",
        },
    },
    "loggers": {
        "": {
            "handlers": ["default"],
        },
        "__app__": {
            "handlers": ["other"],
        },
    },
}

If you could look at the sdkVersion attribute in AI, that would tell you something about where they are coming from. Also, some fancy formatting might tell you this, because OTel ignores the log formatter (from my experience), so you could distinguish which one is which.
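
A sketch of that formatting trick (assuming a plain StreamHandler is still attached; the tag is just an arbitrary marker): records showing the tag in AI went through the console/Host path, while records without it came through the OTel exporter, which ignores the formatter.

import logging

console = logging.StreamHandler()
# Arbitrary marker; only the console/Host path applies this formatter.
console.setFormatter(logging.Formatter("[via-console] %(levelname)s %(name)s: %(message)s"))
logging.getLogger().addHandler(console)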

As for the different custom dimensions, your examples also differ in the body of x:

@app.get("/x")
async def x():
   with tracer.start_as_current_span("aaaaa"):
      logging.info("a",extras={"b":1})

vs

@app.get("/x")
async def x():
   logger.info("a",extras={"b":1})

Try the second one with the span context manager and it should yield similar results in customDimensions, like so:

@app.get("/x")
async def x():
   with tracer.start_as_current_span("aaaaa"):
       logger.info("a",extras={"b":1})

As a side note, I think we used to have a problem where all logs sent through StreamHandler -> Host -> AI were displayed with the Error level, simply because StreamHandler logs to stderr and everything written there was interpreted as ERROR. Our DevOps team was complaining that we push everything to stderr, so we changed that, and the correct levels started to show up in AI.
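
A minimal sketch of that change (the formatter is illustrative): StreamHandler defaults to sys.stderr, so pointing it at sys.stdout stops the host from treating every console record as an error.

import logging
import sys

handler = logging.StreamHandler(sys.stdout)  # default stream is sys.stderr
handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
logging.getLogger().addHandler(handler)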


If anything I wrote contradicts any other part, it's me trying to figure these things out over and over again. Sorry if it adds communication noise.

@WaterKnight1998

These were excluded by default in OpenCensus to avoid self-reporting. I'm not sure why or how they were lost during the transition to OTel.

@jeremydvoss I set those 2 vars but I continue to see the logs. I even disabled all automatic instrumentations with OTEL_PYTHON_DISABLED_INSTRUMENTATIONS=django,asgi,wsgi,fastapi,flask,requests,urllib,urllib3, but I still see them. I have seen that core tracing is always on.


Do you know how I can get the Python function not to record that in the logs?

@macieyng
Author

@WaterKnight1998 I think that you should create a new issue for that. It has nothing to do with the original question.

@WaterKnight1998

@macieyng @jeremydvoss I created a new issue #317

@lzchen
Contributor

lzchen commented Oct 2, 2023

@macieyng

Our monitoring story with Azure Functions has not been fleshed out that clearly. There hasn't been clear guidance or a clear separation on which component (the distro or Azure Functions) is responsible for collecting what. We will be working closely with the Functions team to sort out these bugs and provide more clarity.

That's because azure.monitor.opentelemetry.exporter.export is way too verbose at INFO level. To me, those logs should be at DEBUG level.

We could possibly change the exporter to send those logs at the DEBUG level. Our recommendation to users is to use namespaced loggers so they will not get SDK/exporter logs.
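
For example, a minimal sketch of the namespaced-logger approach (the "my_app" name is illustrative, the logger_name option is the distro's documented way to scope collection - check the docs for your version; the connection string is read from the APPLICATIONINSIGHTS_CONNECTION_STRING environment variable):

import logging
from azure.monitor.opentelemetry import configure_azure_monitor

# Collect only logs emitted under the "my_app" namespace, so azure.* SDK and
# exporter logs are not exported back to Application Insights.
configure_azure_monitor(logger_name="my_app")

logger = logging.getLogger("my_app")
logger.info("hello from a namespaced logger")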

@jeremydvoss
Member

Closed in favor of #317, which was then migrated to Azure/azure-sdk-for-python#32503.
