[Bug] ConnectError while using pyarrow.fs #53

Closed
1 of 2 tasks
emanueledomingo opened this issue Sep 12, 2022 · 7 comments
Labels
bug Something isn't working

Comments

@emanueledomingo

Search before asking

  • I searched in the issues and found nothing similar.

Version

OS: Ubuntu 22.04.1 LTS
Pulsar: 2.9.2
python: 3.10.6
pulsar-client: 2.9.3
pyarrow: 7.0.0

Minimal reproduce step

The client fails to connect to the broker if pyarrow.fs is imported before pulsar.

from pyarrow import fs
import pulsar

c = pulsar.Client("pulsar://localhost:6650")
p = c.create_producer("test")

If I comment out the pyarrow.fs import, or move it after the pulsar import, it works.
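Since the failure depends on import order, it can help to check which modules were already loaded before pulsar. The following is a minimal diagnostic sketch, not part of pulsar-client or pyarrow: the helper name `imported_before` is hypothetical, and it assumes that `sys.modules` (a regular dict on Python 3.7+) lists modules in the order their imports started.

```python
import sys


def imported_before(target, suspects, modules=None):
    """Return the suspects that appear before `target` in import order.

    `modules` defaults to sys.modules; any mapping keyed by module name
    in insertion order can be passed instead (useful for testing).
    """
    names = list(sys.modules if modules is None else modules)
    if target not in names:
        # The target has not been imported yet, so nothing precedes it.
        return []
    earlier = set(names[:names.index(target)])
    return [name for name in suspects if name in earlier]
```

Calling `imported_before("pulsar", ["pyarrow.fs", "grpc"])` right before creating the client would show whether one of the suspected modules was loaded first. The module names passed here are assumptions based on this report.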

What did you expect to see?

2022-09-12 10:37:42.449 INFO  [140586932254528] ClientConnection:189 | [<none> -> pulsar://localhost:6650] Create ClientConnection, timeout=10000
2022-09-12 10:37:42.449 INFO  [140586932254528] ConnectionPool:96 | Created connection for pulsar://localhost:6650
2022-09-12 10:37:42.450 INFO  [140586872165952] ClientConnection:375 | [127.0.0.1:51696 -> 127.0.0.1:6650] Connected to broker
2022-09-12 10:37:42.462 INFO  [140586872165952] HandlerBase:64 | [persistent://public/default/test, ] Getting connection from pool
2022-09-12 10:37:42.465 INFO  [140586872165952] ClientConnection:189 | [<none> -> pulsar://localhost:6650] Create ClientConnection, timeout=10000
2022-09-12 10:37:42.466 INFO  [140586872165952] ConnectionPool:96 | Created connection for pulsar://ac3b9ea4f607:6650
2022-09-12 10:37:42.466 INFO  [140586872165952] ClientConnection:377 | [127.0.0.1:51698 -> 127.0.0.1:6650] Connected to broker through proxy. Logical broker: pulsar://ac3b9ea4f607:6650
2022-09-12 10:37:42.472 INFO  [140586872165952] ProducerImpl:189 | [persistent://public/default/test, ] Created producer on broker [127.0.0.1:51698 -> 127.0.0.1:6650]

What did you see instead?

0000-00-00 00:00:00.000 INFO  [0000-00-00 00:00:00.000 INFO  [0000-00-00 00:00:00.000 INFO  [0000-00-00 00:00:00.000 INFO  [0000-00-00 00:00:00.000 INFO  [0000-00-00 00:00:00.000 INFO  [0000-00-00 00:00:00.000 INFO  [0000-00-00 00:00:00.000 INFO  [0000-00-00 00:00:00.000 INFO  [0000-00-00 00:00:00.000 ERROR [0000-00-00 00:00:00.000 INFO  [---------------------------------------------------------------------------
ConnectError                              Traceback (most recent call last)
Cell In [4], line 1
----> 1 p = c.create_producer("test")

File ~/mambaforge/envs/xxx/lib/python3.10/site-packages/pulsar/__init__.py:642, in Client.create_producer(self, topic, producer_name, schema, initial_sequence_id, send_timeout_millis, compression_type, max_pending_messages, max_pending_messages_across_partitions, block_if_queue_full, batching_enabled, batching_max_messages, batching_max_allowed_size_in_bytes, batching_max_publish_delay_ms, message_routing_mode, lazy_start_partitioned_producers, properties, batching_type, encryption_key, crypto_key_reader)
    639     conf.crypto_key_reader(crypto_key_reader.cryptoKeyReader)
    641 p = Producer()
--> 642 p._producer = self._client.create_producer(topic, conf)
    643 p._schema = schema
    644 p._client = self._client

ConnectError: Pulsar error: ConnectError

Anything else?

The same error occurs using the dagster library. I noticed that both pyarrow and dagster use grpcio under the hood.

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@emanueledomingo
Author

Hi all, I ran more tests at the office today. I found that the problem lies in the combination of the Python version and the grpcio library. I tried the connection script with

  - python==3.9
  - pulsar-client==2.10.1
  - grpcio==1.48.0

and it doesn't work. The same script with

  - python<3.9
  - pulsar-client==2.10.1
  - grpcio==1.48.0

works.
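To compare combinations like the two above across machines, the installed versions can be collected programmatically. This is a small sketch using only the standard library; the helper name `environment_report` is hypothetical, and the distribution names passed to it are assumed to match what is installed.

```python
from importlib import metadata
import platform


def environment_report(packages):
    """Map 'python' and each distribution name to its installed version.

    Distributions that are not installed are reported as None.
    """
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report
```

For example, `environment_report(["pulsar-client", "grpcio", "pyarrow"])` returns a dict that can be pasted into a comment alongside the result of the connection script.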

@github-actions

The issue had no activity for 30 days, mark with Stale label.

@laurent-chriqui
Contributor

laurent-chriqui commented Oct 20, 2022

Hello,

I have the same issue with the pyproj library using Python>=3.9.

When I import pyproj first, I have the exact same issue with Pulsar.
If I import pulsar first, I have a bug in pyproj when I try to do this for example:

import pulsar
import pyproj

pyproj.Transformer.from_crs('epsg:4326', 'epsg:3035')
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/site-packages/pyproj/transformer.py", line 600, in from_crs
    cstrencode(CRS.from_user_input(crs_from).srs),
  File "/usr/local/lib/python3.9/site-packages/pyproj/crs/crs.py", line 501, in from_user_input
    return cls(value, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/pyproj/crs/crs.py", line 348, in __init__
    self._local.crs = _CRS(self.srs)
  File "pyproj/_crs.pyx", line 2352, in pyproj._crs._CRS.__init__
pyproj.exceptions.CRSError: Invalid projection: epsg:4326: (Internal Proj Error: proj_create: cannot build geodeticCRS 4326: cannot build unit of measure 9122: non double value)

So there seems to be some kind of conflict between C++ libraries on Python 3.9+, since pyproj is also a wrapper around a C++ library (PROJ).
I do not need to install any version of grpcio for this to happen though.
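One way to investigate a suspected conflict between native libraries is to list the shared objects loaded into the Python process after both imports. The sketch below is Linux-only and assumes the standard `/proc/<pid>/maps` layout (address, permissions, offset, device, inode, pathname); the function names are hypothetical, and the sketch only lists libraries without diagnosing the conflict itself.

```python
from pathlib import Path


def shared_objects(maps_text):
    """Extract unique shared-object paths from /proc/<pid>/maps content."""
    paths = []
    for line in maps_text.splitlines():
        # The pathname, when present, is the sixth whitespace-separated field.
        fields = line.split(None, 5)
        if len(fields) == 6 and ".so" in fields[5]:
            path = fields[5].strip()
            if path not in paths:
                paths.append(path)
    return paths


def loaded_shared_objects():
    """Return shared objects mapped into the current process (Linux only)."""
    maps = Path("/proc/self/maps")
    return shared_objects(maps.read_text()) if maps.exists() else []
```

Comparing the output of `loaded_shared_objects()` for both import orders may show whether the two packages bring in different copies of the same C++ dependency.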

Thank you for looking into this because we are stuck at python 3.8 for now...

Also posted in pyproj here

@erichare
Contributor

@emanueledomingo are you still experiencing this? Using the latest stable releases of pyarrow, pulsar, and Python 3.10 and 3.11 I can't reproduce this currently. Your example code works as expected.

@emanueledomingo
Author

@erichare I tested the same script with pulsar-client 3.0.0 and Python 3.10, and this seems to be solved. I haven't tested with Python 3.11 yet.

@erichare
Contributor

> @erichare I tested the same script with pulsar-client 3.0.0 and Python 3.10, and this seems to be solved. I haven't tested with Python 3.11 yet.

Cool, good to know! If you have time to check with 3.9, that would be great. I plan to do so myself soon as well, but I'm thinking this was resolved as a consequence of an update along the way...

@emanueledomingo
Author

Tested also with python 3.9, it works!
