Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom logger doesn't work with created resources (like producers) #194

Closed
Samreay opened this issue Jan 15, 2024 · 5 comments · Fixed by #195
Closed

Custom logger doesn't work with created resources (like producers) #194

Samreay opened this issue Jan 15, 2024 · 5 comments · Fixed by #195

Comments

@Samreay
Copy link

Samreay commented Jan 15, 2024

There is an option to pass in a Logger when creating the pulsar client to log to. It seems this does not work when creating resources.

Reproduction:

First, run a pulsar standalone instance:

docker run -it -p 6650:6650 -p 8080:8080 --tmpfs /pulsar/data apachepulsar/pulsar:3.1.0 bin/pulsar standalone

And then run this code:

import logging

import pulsar

pulsar_logger = logging.getLogger("pulsarDummyLogger")
pulsar_logger.setLevel(logging.ERROR)
logging.basicConfig(level=logging.INFO, format="{message}", style="{")

logging.info("This should be logged")
pulsar_logger.info("This isn't logged (correctly)")

client = pulsar.Client("pulsar://localhost:6650", logger=pulsar_logger)
producer = client.create_producer("A")
producer.close()
client.close()

It will write this to your terminal:

This should be logged
2024-01-15 14:12:08.947 DEBUG [140254733069184] ProducerImpl:128 | [persistent://public/default/A, standalone-0-4] ~ProducerImpl
2024-01-15 14:12:08.948 INFO  [140254733069184] ProducerImpl:722 | Producer - [persistent://public/default/A, standalone-0-4] , [batching  = off]

Potentially related to apache/pulsar#7713

I note that, if you set the log level to INFO for the pulsar logger, you do get more output:

This should be logged
This isn't logged (correctly)
[<none> -> pulsar://localhost:6650] Create ClientConnection, timeout=10000
Created connection for pulsar://localhost:6650
[127.0.0.1:53354 -> 127.0.0.1:6650] Connected to broker
[persistent://public/default/A, ] Getting connection from pool
[persistent://public/default/A, ] Created producer on broker [127.0.0.1:53354 -> 127.0.0.1:6650] 
[persistent://public/default/A, standalone-0-6] Closing producer for topic 0x7f4478002510
[persistent://public/default/A, standalone-0-6] Closed producer 0
Closing Pulsar client with 0 producers and 0 consumers
[127.0.0.1:53354 -> 127.0.0.1:6650] Connection disconnected
[127.0.0.1:53354 -> 127.0.0.1:6650] Destroyed connection
2024-01-15 14:13:51.273 DEBUG [139932214320000] ProducerImpl:128 | [persistent://public/default/A, standalone-0-6] ~ProducerImpl
2024-01-15 14:13:51.273 INFO  [139932214320000] ProducerImpl:722 | Producer - [persistent://public/default/A, standalone-0-6] , [batching  = off]

But I would expect that the logger is still used when creating resources like producers, readers, and consumers.

Expected behaviour

We can control the formatting and level of the logs produced by the pulsar client. Hopefully this is an easy fix, and the custom logger in the parent client can simply be passed on, if set, to the producer/consumer/readers when they are created

@Samreay Samreay changed the title Custom logger doesn't work Custom logger doesn't work with created resources (like producers) Jan 15, 2024
@merlimat
Copy link
Contributor

merlimat commented Jan 15, 2024

@Samreay If you just want to set the log level, a possible workaround is to pass the FileLogger: https://pulsar.apache.org/api/python/3.4.x/pulsar.FileLogger.html or the ConsoleLogger https://pulsar.apache.org/api/python/3.4.x/pulsar.ConsoleLogger.html

@Samreay
Copy link
Author

Samreay commented Jan 15, 2024

Thanks @merlimat, can confirm that

import logging

import pulsar

pulsar_logger = pulsar.ConsoleLogger(log_level=pulsar.LoggerLevel.Error)
logging.basicConfig(level=logging.INFO, format="{message}", style="{")

client = pulsar.Client("pulsar://localhost:6650", logger=pulsar_logger)
producer = client.create_producer("A")
producer.close()
client.close()

does seem to silence the loggers better than using the standard library logging.

I'll leave the issue open if possible though, because the inconsistency between the behaviour is something which should probably be looked into at some point.

@BewareMyPower
Copy link
Contributor

BewareMyPower commented Jan 16, 2024

But I would expect that the logger is still used when creating resources like producers, readers, and consumers.

Did you mean the last two lines?

2024-01-15 14:12:08.947 DEBUG [140254733069184] ProducerImpl:128 | [persistent://public/default/A, standalone-0-4] ~ProducerImpl
2024-01-15 14:12:08.948 INFO  [140254733069184] ProducerImpl:722 | Producer - [persistent://public/default/A, standalone-0-4] , [batching  = off]

The logs above are printed by the default logger because they are logged during the exit process of the Python interpreter. At this moment, the Python logger cannot be used. So we fall back to the default logger from the C extension instead.

See

if (!Py_IsInitialized()) {
// Python logger is unavailable - fallback to console logger
_fallbackLogger->log(level, line, message);

Reference: https://docs.python.org/3/c-api/init.html#c.Py_IsInitialized

@Samreay
Copy link
Author

Samreay commented Jan 16, 2024

Hmm, that does make it tricky. Effectively my concern here is just that we have our logs from all services routed into a common place, and they all undergo a json transformation. So I'm trying to ensure everything which is collected looks correct. When using python logging, this is easy, because I can attach a json handler or intercept third party library logging, just not when its a direct stdout call.

I noticed in another thread that the C++ client now has an ability to silence all logging, maybe exposing that to the python client would be the easiest way to ensure that developers can have a clean stdout under their control? If not, perhaps just documenting the difference between passing in a python logger and a C++ logger has might alleviate any confusion for other developers trying to integrate the pulsar client

@BewareMyPower
Copy link
Contributor

Here is a workaround to trigger the cleanup (destruction of the objects from the C++ library) explicitly.

import logging

import pulsar

pulsar_logger = logging.getLogger("pulsarDummyLogger")
pulsar_logger.setLevel(logging.INFO)
logging.basicConfig(level=logging.INFO, format="{message}", style="{")

logging.info("This should be logged")
pulsar_logger.info("This isn't logged (correctly)")

client = pulsar.Client("pulsar://localhost:6650", logger=pulsar_logger)
producer = client.create_producer("A")
producer.close()
client.close()
del producer
del client

Then producer and client will be destroyed before the Python interceptor exits.

This should be logged
...
[127.0.0.1:52053 -> 127.0.0.1:6650] Destroyed connection to pulsar://localhost:6650-0
Producer - [persistent://public/default/A, standalone-0-7] , [batching  = off]

BewareMyPower added a commit to BewareMyPower/pulsar-client-python that referenced this issue Jan 16, 2024
Fixes apache#194

### Motivation

When the Python logger is configured as the logger, during the exit
process of a Python script, the default logger will be used for logging
instead because the Python logger is unavailable after the interpreter
is finalized. It might lead to some unexpected outputs to standard
output.
merlimat pushed a commit that referenced this issue Jan 16, 2024
Fixes #194

### Motivation

When the Python logger is configured as the logger, during the exit
process of a Python script, the default logger will be used for logging
instead because the Python logger is unavailable after the interpreter
is finalized. It might lead to some unexpected outputs to standard
output.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants