Fix schema_avro.py module's exception when fast avro not installed#90
Fix schema_avro.py module's exception when fast avro not installed#90keenborder786 wants to merge 2 commits intoapache:mainfrom
Conversation
pulsar/schema/schema_avro.py
Outdated
| class AvroSchema(Schema): | ||
| def __init__(self, _record_cls, _schema_definition=None): | ||
| raise Exception("Avro library support was not found. Make sure to install Pulsar client " + | ||
| raise Exception("Avro library support was not found. Make sure to install Pulsar client " + |
There was a problem hiding this comment.
Wouldn't this fail even if one is not trying to use Avro schema?
There was a problem hiding this comment.
The bug is an interesting finding... but yeah, right now AvroSchema gets imported in the schema init, and attempting to do so without fastavro installed (and importable) will lead to this exception being raised, when really it should only be raised when attempting to initialize a new AvroSchema instance. So i don't think this should be merged in as is.
With that said, why you're experiencing the bug that you are @keenborder786 , no clue when it comes to that yet...
There was a problem hiding this comment.
Nope, I tested it out with other Schemas and it works.
There was a problem hiding this comment.
@keenborder786 the build checks are failing because we're importing everything from the schema module:
from pulsar.schema import *
Basically the issue is just that if fastavro isn't available at import time, then the exception is raised... now someone actually intending to import AvroSchema, of course may want to know about the missing dependency right away (before initializing an instance of the class) - But i guess like the tests they could be trying to import everything - not the best practice of course. Still, I imagine we don't want to break that (i could see some users without fastavro also importing the entire schema module)
There was a problem hiding this comment.
@erichare , okay I understand the point that you have raised about AvroSchema Class being used in other places and exceptions being raised when fastavro has not been installed. Therefore I have remodified my PR. Can you please review the latest commit? Thank you.
There was a problem hiding this comment.
I too am a little surprised that this change caused any functional difference in the code that would fix the original bug you found. I think syntactically the change is fine, even improved, but the code should operate the same way as prior. I think it would be good to get a handle on why this would solve the issue before we merge this in.
There was a problem hiding this comment.
@erichare makes sense, can you please try the above steps on your machine. Thank you.
There was a problem hiding this comment.
@keenborder786 I haven't worked too much with AvroSchema specifically, but when i run your example (and the example from the pulsar docs, I get the following error:
UnknownError Traceback (most recent call last)
Cell In[11], line 1
----> 1 producer = client.create_producer(
2 topic=topic,
3 schema=avro_schema)
File /opt/homebrew/lib/python3.11/site-packages/pulsar/__init__.py:645, in Client.create_producer(self, topic, producer_name, schema, initial_sequence_id, send_timeout_millis, compression_type, max_pending_messages, max_pending_messages_across_partitions, block_if_queue_full, batching_enabled, batching_max_messages, batching_max_allowed_size_in_bytes, batching_max_publish_delay_ms, chunking_enabled, message_routing_mode, lazy_start_partitioned_producers, properties, batching_type, encryption_key, crypto_key_reader)
642 raise ValueError("Batching and chunking of messages can't be enabled together.")
644 p = Producer()
--> 645 p._producer = self._client.create_producer(topic, conf)
646 p._schema = schema
647 p._client = self._client
Note i don't think line 642 is relevant - that's just showing the lines around the line that failed. I don't have this problem with other schema types, just AvroSchema...
I don't get far enough to test what you're experiencing. Have you seen this?
There was a problem hiding this comment.
okay let me have a look at it.
…ing a sperate HAS_AVRO boolean variable, directly creating the class in try except block of importing fastavro
BewareMyPower
left a comment
There was a problem hiding this comment.
I cannot see what problem will this PR resolve. I wrote an example script:
# 1.py
from pulsar.schema import *
class Example(Record):
a = Integer()
b = Integer()
schema = AvroSchema(Example)Then I installed the Pulsar Python client inside a python:3.11.2-bullseye container and run the following commands:
$ python3 1.py
Traceback (most recent call last):
File "//1.py", line 7, in <module>
schema = AvroSchema(Example)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pulsar/schema/schema_avro.py", line 89, in __init__
raise Exception("Avro library support was not found. Make sure to install Pulsar client " +
Exception: Avro library support was not found. Make sure to install Pulsar client with Avro support: pip3 install 'pulsar-client[avro]'
$ cp schema_avro.py /usr/local/lib/python3.11/site-packages/pulsar/schema/schema_avro.py
$ python3 1.py
Traceback (most recent call last):
File "//1.py", line 7, in <module>
schema = AvroSchema(Example)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pulsar/schema/schema_avro.py", line 83, in __init__
raise Exception("Avro library support was not found. Make sure to install Pulsar client " +
Exception: Avro library support was not found. Make sure to install Pulsar client with Avro support: pip3 install 'pulsar-client[avro]'I cannot see any difference.
|
Close it. Please open another PR if you have something new. |
Fixed the issue raised at #89