[Bug]: Importing Python module apache_beam.dataframe.convert raises AttributeError #27410

Closed
1 of 15 tasks
jzxu opened this issue Jul 8, 2023 · 2 comments

jzxu commented Jul 8, 2023

What happened?

In Python, with Apache Beam 2.48.0 and pandas >= 2.0.0 installed, running the following import statement raises an exception:

>>> import apache_beam.dataframe.convert
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jzxu/env/lib/python3.11/site-packages/apache_beam/dataframe/convert.py", line 33, in <module>
    from apache_beam.dataframe import transforms
  File "/home/jzxu/env/lib/python3.11/site-packages/apache_beam/dataframe/transforms.py", line 33, in <module>
    from apache_beam.dataframe import frames  # pylint: disable=unused-import
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jzxu/env/lib/python3.11/site-packages/apache_beam/dataframe/frames.py", line 1231, in <module>
    class DeferredSeries(DeferredDataFrameOrSeries):
  File "/home/jzxu/env/lib/python3.11/site-packages/apache_beam/dataframe/frames.py", line 1338, in DeferredSeries
    @frame_base.populate_defaults(pd.Series)
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jzxu/env/lib/python3.11/site-packages/apache_beam/dataframe/frame_base.py", line 600, in wrap
    base_argspec = getfullargspec(unwrap(getattr(base_type, func.__name__)))
                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: type object 'Series' has no attribute 'append'. Did you mean: '_append'?

This happens because the Series.append method was removed in pandas 2.0.0:

https://pandas.pydata.org/docs/dev/whatsnew/v2.0.0.html
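
To illustrate why a removed method breaks the import itself, here is a minimal stdlib-only sketch of the lookup shown in the traceback (`getattr(base_type, func.__name__)`). `FakeSeries`, `try_define`, and this simplified `populate_defaults` are hypothetical stand-ins written for this example, not Beam's or pandas' actual code. The point is that the decorator runs while the class body is evaluated, so a missing attribute fails at import time rather than at call time.

```python
from inspect import getfullargspec, unwrap


class FakeSeries:
    """Hypothetical stand-in for pandas.Series in 2.x: no public `append`."""

    def _append(self, to_append, ignore_index=False):
        return self

    def head(self, n=5):
        return self


def populate_defaults(base_type):
    """Simplified sketch of a decorator that reads defaults from base_type.

    Assumption: this only mirrors the lookup visible in the traceback and
    is not the real frame_base.populate_defaults implementation.
    """
    def wrap(func):
        # Raises AttributeError as soon as the decorator is applied if
        # base_type has no attribute named like the decorated function.
        base_argspec = getfullargspec(unwrap(getattr(base_type, func.__name__)))
        func.base_defaults = base_argspec.defaults
        return func
    return wrap


def try_define(method_name):
    """Apply the decorator to a method with the given name.

    Returns None on success, or the AttributeError message on failure.
    """
    def method(self, *args, **kwargs):
        return None
    method.__name__ = method_name
    try:
        populate_defaults(FakeSeries)(method)
    except AttributeError as exc:
        return str(exc)
    return None
```

Calling `try_define("head")` succeeds, while `try_define("append")` returns the error message, which matches the shape of the failure reported above.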

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
copybara-service bot pushed a commit to google-research/skai that referenced this issue Jul 9, 2023
- Removed dependency on deprecated tensorflow addons library. Switched to using OpenCV connected components function instead.

- Pandas version set to <2.0.0 as workaround for
apache/beam#27410

PiperOrigin-RevId: 546618412
copybara-service bot pushed a commit to google-research/skai that referenced this issue Jul 10, 2023
- Removed dependency on deprecated tensorflow addons library. Switched to using OpenCV connected components function instead.

- Pandas version set to <2.0.0 as workaround for
apache/beam#27410

PiperOrigin-RevId: 546924081
@tvalentyn
Contributor

Thanks, this is a duplicate of #27221. Left a comment there.

tvalentyn closed this as not planned on Jul 11, 2023
@tvalentyn
Contributor

tvalentyn commented Jul 11, 2023

Closing as a duplicate, but this is an issue that should be fixed; let's continue on #27221.
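
The workaround used by the commits that reference this issue is to pin pandas below 2.0.0 (for example, a `pandas<2.0.0` requirement). As a rough sketch of a runtime guard, the helpers below check a version string against that boundary. The function names are hypothetical, and the check assumes only what this issue reports (Beam 2.48.0 failing with pandas >= 2.0.0); it says nothing about other Beam releases.

```python
from importlib import metadata
from typing import Optional


def pandas_version_is_affected(version: str) -> bool:
    """Return True if the given pandas version string is 2.0.0 or later.

    Only the major component is inspected, so pre-release tags such as
    "2.1.0rc0" are also treated as affected.
    """
    major = int(version.split(".")[0])
    return major >= 2


def installed_pandas_is_affected() -> Optional[bool]:
    """Check the installed pandas, or return None if pandas is absent."""
    try:
        return pandas_version_is_affected(metadata.version("pandas"))
    except metadata.PackageNotFoundError:
        return None
```

A project could call `installed_pandas_is_affected()` before importing `apache_beam.dataframe.convert` and print a clearer message than the raw AttributeError.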

panford pushed a commit to panford/skai that referenced this issue Aug 10, 2023
- Removed dependency on deprecated tensorflow addons library. Switched to using OpenCV connected components function instead.

- Pandas version set to <2.0.0 as workaround for
apache/beam#27410

PiperOrigin-RevId: 546924081
panford added a commit to instadeepai/skai that referenced this issue Aug 24, 2023
* Add accelerator desc to readme

* correct accelerator comment and add example for TPUs choices

* Bug fix: No automatic conversion of sets to lists when using random.sample() in python 3.11

See https://docs.python.org/3/whatsnew/3.11.html#porting-to-python-3-11 on random.sample() for details

PiperOrigin-RevId: 540935775

* Process multiple images in single Beam stage. Speed up window grouping.

Previously, the generate_examples_pipeline will create a Beam stage for each image it reads. This change merges all before image processing into a single stage and all after image processing into a single stage.

Also, window grouping is now 10x faster (42 minutes -> 3.5 minutes) due to the removal of the expensive rtree node deletion operation from the algorithm.

PiperOrigin-RevId: 540939377

* Add int64_id to examples.

The int64 id is the first 64 bits of the hex string making up the current string example_id feature. The int64 id is needed for certain ops to run on TPU, as TPU doesn't support string tensors.

After this CL, we will gradually transition from string example ids to using int64 example ids.

PiperOrigin-RevId: 542079778

* Bug fix: Remove empty string from string_label list

PiperOrigin-RevId: 542891589

* Update colab_utils.py

Update for including the processing of labeled data into the generate example command

* Update Colab Run_SKAI_Colab_Pipeline Notebook : add generate pre-labeled examples, add train supervised method

* Revert "Update Colab Run_SKAI_Colab_Pipeline Notebook : add generate pre-labeled examples, add train supervised method"

This reverts commit 850690c.

* Update Colab Run_SKAI_Colab_Pipeline Notebook : add generate pre-labeled examples, add train supervised method

* Fix dependencies to make detect_buildings.py work again.

- Removed dependency on deprecated tensorflow addons library. Switched to using OpenCV connected components function instead.

- Pandas version set to <2.0.0 as workaround for
apache/beam#27410

PiperOrigin-RevId: 546924081

* Internal change

PiperOrigin-RevId: 546938192

* Fix tests.

PiperOrigin-RevId: 547006716

* Internal change

PiperOrigin-RevId: 547895216

* Fix create_labeled_dataset flag typo.

PiperOrigin-RevId: 548236182

* Apply changes suggested by Jlee

* Create Colab instruction Document

* Uses default method for tf.image.resize (bilinear).

PiperOrigin-RevId: 550044281

* Update colab_instructions.md

* Update colab_instructions.md

* Update colab_instructions.md

* Include Instruction Document references

* Add inference beam job.

PiperOrigin-RevId: 550460355

* Add blank colab_instructions.md so that PR 108 will import correctly.

PiperOrigin-RevId: 550661005

* Update Python version for running tests to 3.11.4.

PiperOrigin-RevId: 550665171

* Disable external IPs for dataflow workers.

This avoids getting the number of dataflow workers capped by the external IPs quota. See

https://medium.com/google-cloud/eliminate-auto-scaling-bottlenecks-by-using-private-ips-for-dataflow-workers-23a8a73cecd5

YOU MUST ENABLE "GOOGLE PRIVATE ACCESS" IN THE SUBNETS YOUR WORKERS RUN ON!

https://cloud.google.com/vpc/docs/configure-private-google-access#config-pga

PiperOrigin-RevId: 550675533

* No public description

PiperOrigin-RevId: 550944094

* Removes functionality to specify dataflow container image in the pipeline config as this flag is never set and just introduces a lot of extra code.

Also fixes bug introduced by previous change.

PiperOrigin-RevId: 551017760

* Move documentation files to the right directory.

PiperOrigin-RevId: 552466883

* Update default hyperparameters.

PiperOrigin-RevId: 555425020

* merge with main

* clean up branch - remove unwanted files/folders

* modify docker image for python 3.8 tf2.13

* Add support for inference on gpu

* Add fixes for gpu images

* modify memory allocation

* address PR reviews

* address PR review comments

---------

Co-authored-by: Boris Lami Fonyuy <fonyuy@google.com>
Co-authored-by: Joseph Xu <jzxu@google.com>
Co-authored-by: ambaha1 <baha.amine@hotmail.fr>
Co-authored-by: Jihyeon Lee <jihyeonlee@google.com>
Co-authored-by: Luke Granger-Brown <lukegb@google.com>
Co-authored-by: Mohamedelfatih Mohamedkhair <melfatih@google.com>