Skip to content

Fix schema_avro.py module's exception when fast avro not installed#90

Closed
keenborder786 wants to merge 2 commits intoapache:mainfrom
keenborder786:main
Closed

Fix schema_avro.py module's exception when fast avro not installed#90
keenborder786 wants to merge 2 commits intoapache:mainfrom
keenborder786:main

Conversation

@keenborder786
Copy link
Copy Markdown

Fixed the issue raised at #89

class AvroSchema(Schema):
def __init__(self, _record_cls, _schema_definition=None):
raise Exception("Avro library support was not found. Make sure to install Pulsar client " +
raise Exception("Avro library support was not found. Make sure to install Pulsar client " +
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this fail even if one is not trying to use Avro schema?

Copy link
Copy Markdown
Contributor

@erichare erichare Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug is an interesting finding... but yeah, right now AvroSchema gets imported in the schema init, and attempting to do so without fastavro installed (and importable) will lead to this exception being raised, when really it should only be raised when attempting to initialize a new AvroSchema instance. So i don't think this should be merged in as is.

With that said, why you're experiencing the bug that you are @keenborder786 , no clue when it comes to that yet...

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, I tested it out with other Schemas and it works.

Copy link
Copy Markdown
Contributor

@erichare erichare Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keenborder786 the build checks are failing because we're importing everything from the schema module:

from pulsar.schema import *

Basically the issue is just that if fastavro isn't available at import time, then the exception is raised... now someone actually intending to import AvroSchema, of course may want to know about the missing dependency right away (before initializing an instance of the class) - But i guess like the tests they could be trying to import everything - not the best practice of course. Still, I imagine we don't want to break that (i could see some users without fastavro also importing the entire schema module)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erichare , okay I understand the point that you have raised about AvroSchema Class being used in other places and exceptions being raised when fastavro has not been installed. Therefore I have remodified my PR. Can you please review the latest commit? Thank you.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I too am a little surprised that this change caused any functional difference in the code that would fix the original bug you found. I think syntactically the change is fine, even improved, but the code should operate the same way as prior. I think it would be good to get a handle on why this would solve the issue before we merge this in.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erichare makes sense, can you please try the above steps on your machine. Thank you.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any updates?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@keenborder786 I haven't worked too much with AvroSchema specifically, but when i run your example (and the example from the pulsar docs, I get the following error:

UnknownError                              Traceback (most recent call last)
Cell In[11], line 1
----> 1 producer = client.create_producer(
      2     topic=topic,
      3     schema=avro_schema)

File /opt/homebrew/lib/python3.11/site-packages/pulsar/__init__.py:645, in Client.create_producer(self, topic, producer_name, schema, initial_sequence_id, send_timeout_millis, compression_type, max_pending_messages, max_pending_messages_across_partitions, block_if_queue_full, batching_enabled, batching_max_messages, batching_max_allowed_size_in_bytes, batching_max_publish_delay_ms, chunking_enabled, message_routing_mode, lazy_start_partitioned_producers, properties, batching_type, encryption_key, crypto_key_reader)
    642     raise ValueError("Batching and chunking of messages can't be enabled together.")
    644 p = Producer()
--> 645 p._producer = self._client.create_producer(topic, conf)
    646 p._schema = schema
    647 p._client = self._client

Note i don't think line 642 is relevant - that's just showing the lines around the line that failed. I don't have this problem with other schema types, just AvroSchema...

I don't get far enough to test what you're experiencing. Have you seen this?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay let me have a look at it.

…ing a sperate HAS_AVRO boolean variable, directly creating the class in try except block of importing fastavro
@BewareMyPower BewareMyPower changed the title [fix] exception bug as per #89 [fix] schema_avro.py module's exception when fast avro not installed Mar 16, 2023
@BewareMyPower BewareMyPower changed the title [fix] schema_avro.py module's exception when fast avro not installed Fix schema_avro.py module's exception when fast avro not installed Mar 16, 2023
Copy link
Copy Markdown
Contributor

@BewareMyPower BewareMyPower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot see what problem will this PR resolve. I wrote an example script:

# 1.py
from pulsar.schema import *

class Example(Record):
    a = Integer()
    b = Integer()

schema = AvroSchema(Example)

Then I installed the Pulsar Python client inside a python:3.11.2-bullseye container and run the following commands:

$ python3 1.py
Traceback (most recent call last):
  File "//1.py", line 7, in <module>
    schema = AvroSchema(Example)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pulsar/schema/schema_avro.py", line 89, in __init__
    raise Exception("Avro library support was not found. Make sure to install Pulsar client " +
Exception: Avro library support was not found. Make sure to install Pulsar client with Avro support: pip3 install 'pulsar-client[avro]'
$ cp schema_avro.py /usr/local/lib/python3.11/site-packages/pulsar/schema/schema_avro.py
$ python3 1.py
Traceback (most recent call last):
  File "//1.py", line 7, in <module>
    schema = AvroSchema(Example)
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/pulsar/schema/schema_avro.py", line 83, in __init__
    raise Exception("Avro library support was not found. Make sure to install Pulsar client " +
Exception: Avro library support was not found. Make sure to install Pulsar client with Avro support: pip3 install 'pulsar-client[avro]'

I cannot see any difference.

@BewareMyPower
Copy link
Copy Markdown
Contributor

Close it. Please open another PR if you have something new.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants