antareepsarkar commented Nov 26, 2025

Follows from #63173. I'm sorry, but the last commit of that PR showed "Processing updates" for a long time, so I closed it.

Author

antareepsarkar commented Nov 26, 2025

@Alvaro-Kothe
This time, I have used np.array() for it to be time efficient like the _from_sequence method.
Many parts of pandas rely on _from_sequence converting objects to their string representations, so changing it would have required changes in many files. I therefore left it unmodified.

If I have made a mistake or any change is required, please tell me.

Member

Alvaro-Kothe left a comment

> This time, I have used np.array() for it to be time efficient like the _from_sequence method.

There are some benchmarks for array construction in asv_bench/benchmarks/array.py that you can use to measure performance impact. For guidance on how to run it, refer to the docs.


# to avoid returning an array of string representation of objects.
if dtype == StringDtype():
    ndarr = np.array(data)

I don't like this for the following reasons:

  1. It's weird to convert to np.array while calling pd.array and not use it.
  2. The conversion in np.array can raise, e.g., when we call np.array([[1, 2], ["a"]]).
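
The second point can be checked with a minimal sketch like the one below. It is not code from the PR: `probe_ndim` is a hypothetical helper introduced only for illustration, and the behaviour depends on the installed NumPy version. As far as I know, NumPy 1.24 and later raise `ValueError` on ragged nested sequences, while older releases emit a deprecation warning and fall back to a 1-D object array.

```python
import warnings

import numpy as np


def probe_ndim(data):
    """Return the ndim that np.array(data) produces, or None if it raises ValueError.

    Hypothetical helper for illustration only; it is not part of pandas or the PR.
    """
    with warnings.catch_warnings():
        # Older NumPy releases warn instead of raising on ragged input.
        warnings.simplefilter("ignore")
        try:
            return np.array(data).ndim
        except ValueError:
            return None


rectangular = [[1, 2], ["a", "b"]]  # converts cleanly to a 2-D array
ragged = [[1, 2], ["a"]]            # the reviewer's example

print("rectangular:", probe_ndim(rectangular))
print("ragged:", probe_ndim(ragged))
```

If the ragged case returns `None`, an unguarded `np.array(data)` inside `pd.array` would surface that `ValueError` to the caller before any dimensionality check runs.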

Development

Successfully merging this pull request may close these issues.

BUG: pandas.array works fine when 2-D array contains string