Improve structured array doc regarding extracting key-value pairs #22682

Open
yasirroni opened this issue Nov 28, 2022 · 5 comments

@yasirroni

yasirroni commented Nov 28, 2022

Issue with current documentation:

The structured array doc has no clear and concise tutorial on how to get the key names, let alone the key-value pairs.

Currently, the tutorial only shows how to list the keys implicitly:

d = np.dtype([('x', 'i8'), ('y', 'f4')])
d.names
# ('x', 'y')

Idea or request for content:

Tell users that the keys live under the dtype, and that to get key-value pairs they need to do the following:

for key in x.dtype.names:
    value = x[key]

Also, possibly related, please:

  1. support dict() conversion:

    dict_ = dict(x)
    
    # equal to
    dict_ = {}
    for key in x.dtype.names:
        dict_[key] = x[key]

  2. support .items() method to iterate over key-value pairs:

    for key, value in x.items():
        key, value
    
    # equal to
    for key in x.dtype.names:
        value = x[key]

I might be able to submit PR to implement that functionality.
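
Editor's sketch: the two requests above can be approximated today with a small helper. The name `structured_items` is hypothetical and not part of NumPy; it only wraps the `x.dtype.names` loop shown in this comment, assuming the input is an array with a structured dtype.

```python
import numpy as np


def structured_items(arr):
    """Yield (field_name, field_values) pairs for a structured array.

    Hypothetical helper, not a NumPy API: it wraps the loop over
    arr.dtype.names shown in the issue.
    """
    if arr.dtype.names is None:
        raise TypeError("expected an array with a structured dtype")
    for name in arr.dtype.names:
        # Indexing with a single field name returns a view of that field.
        yield name, arr[name]


x = np.array([(1, 2.0), (3, 4.0)], dtype=[('x', 'i8'), ('y', 'f4')])

# Stand-in for the requested x.items()
for key, value in structured_items(x):
    print(key, value)

# Stand-in for the requested dict(x)
as_dict = dict(structured_items(x))
```

Because each value is a view, writing into `as_dict['x']` would also modify `x`; copy the field first if that is not wanted.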

@yasirroni yasirroni changed the title DOC: <Please write a comprehensive title after the 'DOC: ' prefix> Improve structured array doc regarding extracting key-value pairs Nov 28, 2022
@chethanreddy123
Contributor

Hey @yasirroni, I would like to work on this. Could you please give me some more insight into the issue? This is my first open-source contribution.

@Mukulikaa
Contributor

Hi @yasirroni, can you please elaborate on your issue? If you are referring to this section in the doc, it says that the names and fields attributes of the dtype object will give you the keys and key-value pairs for the structured datatype. I'm not sure if that's what you are looking for?
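
Editor's sketch: for reference, this is roughly what those two dtype attributes return. It reuses the dtype from the original post; the variable names are illustrative. Note that `fields` describes the layout of each field (its dtype and byte offset), not the data stored in an array.

```python
import numpy as np

d = np.dtype([('x', 'i8'), ('y', 'f4')])

# names: an ordered tuple of the field names
names = d.names

# fields: a read-only mapping from name to (dtype, byte offset)
for name in names:
    field_dtype, offset = d.fields[name][:2]
    print(name, field_dtype, offset)
```

The values of `d.fields` are metadata about the structured datatype, so they are the same for every array that uses `d`.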

@rossbar
Contributor

rossbar commented Dec 7, 2022

IMO this is indeed covered in the section linked above. I'd be reticent to expand too much re: recommending translating between structured dtypes and dicts, as there may be better data structures suited to the cases, e.g. Dataframes.

@yasirroni
Author

yasirroni commented Dec 8, 2022

@chethanreddy123 Hi, I'm not a NumPy maintainer, please ask them for that.

@Mukulikaa Hi, it seems that you don't understand the problem. Please try my piece of code and re-read the docs before further discussion.

@rossbar Hi, sadly, that section does not cover how to retrieve the key-value pairs. My use case is that scipy.io.loadmat gives me a structured NumPy array whose fields have no relationship with each other. They even have non-matching sizes and shapes! Thus, converting to a DataFrame is useless. From the doc, I know that I can get the list of keys from x.dtype.names, but to get all the data by name, I need to use:

x = np.array([('Rex', 9, 81.0), ('Fido', 3, 27.0)],
             dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])

for key in x.dtype.names:
    val = x[key]
    print(key, val)

There should be a doc about that, or better still, an easier way to do that.

The current docs only cover name retrieval and data retrieval separately.
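
Editor's sketch: the use case described above, with fields of unrelated shapes, can be imitated without SciPy. The array below is built by hand with object fields; the 1x1 shape and the field names `a` and `b` are assumptions chosen for illustration and are not taken from real scipy.io.loadmat output.

```python
import numpy as np

# Hand-built stand-in for a loaded struct: one record, two object fields.
rec = np.empty((1, 1), dtype=[('a', object), ('b', object)])
rec['a'][0, 0] = np.arange(3)        # 1-D payload, shape (3,)
rec['b'][0, 0] = np.zeros((2, 2))    # 2-D payload, shape (2, 2)

# Collect each field's payload by name; shapes need not match.
fields = {name: rec[name][0, 0] for name in rec.dtype.names}

for key, val in fields.items():
    print(key, val.shape)
```

A plain dict suits this case because nothing forces the payloads to share a length, which a tabular structure would require.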

@yasirroni
Author

import numpy as np
x = np.array([('Rex', 9, 81.0), ('Fido', 3, 27.0)],
             dtype=[('name', 'U10'), ('age', 'i4'), ('weight', 'f4')])
x
# array([('Rex', 9, 81.), ('Fido', 3, 27.)],
#       dtype=[('name', '<U10'), ('age', '<i4'), ('weight', '<f4')])

x.names
# AttributeError: 'numpy.ndarray' object has no attribute 'names'

x.fields
# AttributeError: 'numpy.ndarray' object has no attribute 'fields'

x['name']
# array(['Rex', 'Fido'], dtype='<U10')

for key in x.dtype.names:
    val = x[key]
    print(key, val)
# name ['Rex' 'Fido']
# age [9 3]
# weight [81. 27.]
