add slice to __getitem__ for Node(Data)View #4086

MridulS · 2020-07-17T18:26:23Z

@dschult I started playing around with returning sliced lists when the user explicitly asks for them. I have done this for Node(Data)View so far, and I would love some feedback on the design before going further.

The following code will be valid if we have slices in Node(Data)View.

In [13]: G.nodes[0:5]
Out[13]: [8433035090, 6272075181, 1180114949, 3701032566, 4290388111]

In [14]: G.nodes(data=True)[0:5]
Out[14]: 
[(8433035090, {'w': 2716738093, 't': '6076217445'}),
 (6272075181, {'w': 7046427529, 't': '891215421'}),
 (1180114949, {'w': 9901849457, 't': '4451045765'}),
 (3701032566, {'w': 2467916768, 't': '3668632860'}),
 (4290388111, {'w': 505061655, 't': '5801405547'})]

In [15]: G.nodes(data='w')[0:5]
Out[15]: 
[(8433035090, 2716738093),
 (6272075181, 7046427529),
 (1180114949, 9901849457),
 (3701032566, 2467916768),
 (4290388111, 505061655)]
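
For illustration only (this is not the code in the PR), behaviour like the session above could come from a `__getitem__` that branches on the key type. `SketchNodeView` is a hypothetical toy class backed by a plain dict, not NetworkX's actual NodeView.

```python
class SketchNodeView:
    """Toy stand-in for a node view; not NetworkX's actual NodeView."""

    def __init__(self, nodes):
        # Maps node -> attribute dict, in insertion order.
        self._nodes = nodes

    def __iter__(self):
        return iter(self._nodes)

    def __len__(self):
        return len(self._nodes)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Slice: positional selection over iteration order, returns nodes.
            return list(self._nodes)[key]
        # Anything else: dict-style lookup, returns the attribute dict.
        return self._nodes[key]


view = SketchNodeView({n: {"w": n * 10} for n in range(8)})
first_two = view[0:2]  # a list of nodes
attrs = view[4]        # the attribute dict of node 4
pair = view[4:6]       # a list of nodes again
```

In this sketch the same bracket syntax returns two different kinds of object depending on the key: a list of nodes for a slice, an attribute dict for a single node.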

dschult · 2020-07-18T05:10:55Z

This looks good to me.
It's a little strange to have G.nodes[4] return an attribute dict while G.nodes[4:6] returns a list of nodes. Can we explain that in a succinct way to users?

rossbar · 2020-07-21T18:45:53Z

This certainly seems like a usability improvement! My knee-jerk reaction was that this might be a pretty significant API change (I'm really not familiar with the current *View APIs), in the sense that it could have implications for things like backwards compatibility. Maybe a change like this is a nice candidate for a design document (NXEP)? Maybe that would be overkill - I'm not sure at all :)

jarrodmillman · 2020-07-21T18:49:11Z

I started writing this before @rossbar commented, so this is redundant. I am going to leave it here just for the record.

@rossbar Would you have time to look at this in the next couple of days? I have some general concerns about introducing new "convenience-oriented" API, but don't have specific thoughts about this PR yet. I believe @MridulS mentioned recently that this is frequently requested by new users. (I may be misremembering, so please correct me if so.) I don't know if it is appropriate for this change, but maybe we should consider this for our first (new feature/API) NXEP:

https://networkx.github.io/documentation/latest/developer/nxeps/nxep-0000.html

If it seems like a good idea to create a NXEP for this, maybe you can work with @MridulS to draft it. Mridul will have much more experience with the code base and working with new users than you, but you may have more experience with NXEPs (since I basically copied the NEP process).

It would be good for us to gain experience with NXEPs. Hopefully, we will have several as we move closer to NX 3.0 and it would be good for us to see if there is anything we need to improve about the process sooner rather than later. It will also be helpful to have a couple of examples for future contributors to look at.

jarrodmillman · 2020-07-21T18:53:10Z

@MridulS It looks like @rossbar and I are both wondering whether it would make sense to create a NXEP for this. Among other things, that would allow us to record use cases and would serve as useful design documentation going forward. What are your thoughts?

MridulS · 2020-07-21T19:02:51Z

Oh yes, definitely. Now that I think about it, this will lead to significant API changes so we should have a more formal design doc rather than just PRs. I’ll start up a NXEP for this (or if anyone else wants to do that please go ahead).

jarrodmillman · 2020-07-21T19:15:07Z

The NXEP process is very similar to the PEP process and should be near identical to the NEP and SKIP process. Since it looks like you may be the first person to try this out for NX, please don't hesitate to ask questions about the format and process. If things are too cumbersome or don't make sense, we should use this opportunity to refine the NXEP process.

MridulS · 2020-07-23T02:25:11Z

> It's a little strange to have G.nodes[4] return an attribute dict while G.nodes[4:6] returns a list of nodes. Can we explain that in a succinct way to users?

I agree with this, and this will be even stranger for G.edges as G.edges[0, 1] will be an attr dict while G.edges[0:1] will be a list.

While working on the NXEP I had a wild-ish idea of copying the pandas .head() feature as an alternative implementation of this. We could have something like G.nodes.head() and G.edges.head(), which print out the first x nodes/edges by default. One of the main reasons users write list(G.edges(data=True))[0:10] (IMO) is to declutter the screen (especially inside a Jupyter notebook) when they have large graphs and G.edges would print out some 1,000 edges. Thoughts?

dschult · 2020-07-23T03:23:46Z

G.nodes.head(n) gives the first n nodes. This separates the dict-like lookup from the list-like slicing. Also, "tail" could give the last n nodes.

But, maybe we should call it slice to allow more generality:

G.nodes.slice(10)
G.nodes.slice(1:2:20)
G.nodes.slice(-10:)
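
A rough sketch of these ideas, for illustration only. Slice notation such as `1:2:20` is valid Python only inside square brackets, so a method would have to accept start/stop/step arguments or a `slice` object instead. The free functions `head`, `tail`, and `take_slice` below are hypothetical names, not NetworkX API, and a plain dict stands in for `G.nodes`.

```python
from collections import deque
from itertools import islice


def head(iterable, n=5):
    """First n items, without materializing the whole iterable."""
    return list(islice(iterable, n))


def tail(iterable, n=5):
    """Last n items; a bounded deque keeps only the most recent n."""
    return list(deque(iterable, maxlen=n))


def take_slice(iterable, *args):
    """Arbitrary slice; args are passed to slice(), so negatives work."""
    return list(iterable)[slice(*args)]


nodes = dict.fromkeys(range(30))  # stand-in for G.nodes; iterates over keys

first_three = head(nodes, 3)
last_three = tail(nodes, 3)
every_other = take_slice(nodes, 1, 20, 2)  # what nodes[1:20:2] would mean
last_ten = take_slice(nodes, -10, None)    # what nodes[-10:] would mean
```

Here `head` stops iterating after n items, while `take_slice` builds the full list first, which is what makes negative indices possible.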

dschult · 2021-03-25T21:08:56Z

It occurs to me that users probably don't want to process "the first 10 edges".
They probably want to see them. We could provide for that use case by adding a __str__ method that produced something like the __repr__ only stopping after 5 edges (or some other number).

I think this fixes the primary motivating use-case -- are there other compelling use-cases?
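
A minimal sketch of what such a truncating `__str__` might look like, assuming a cutoff of 5 items. `SketchEdgeView` and its `_str_limit` attribute are hypothetical and only mimic the shape of a view; this is not NetworkX's EdgeView.

```python
from itertools import islice


class SketchEdgeView:
    """Toy stand-in for an edge view; not NetworkX's actual EdgeView."""

    _str_limit = 5  # hypothetical cutoff for __str__

    def __init__(self, edges):
        self._edges = list(edges)

    def __iter__(self):
        return iter(self._edges)

    def __len__(self):
        return len(self._edges)

    def __repr__(self):
        # Full listing of every edge.
        return f"{self.__class__.__name__}({list(self)})"

    def __str__(self):
        # Same shape as __repr__, but stops after _str_limit items.
        shown = list(islice(self, self._str_limit))
        body = ", ".join(map(repr, shown))
        if len(self) > len(shown):
            body += ", ..."
        return f"{self.__class__.__name__}([{body}])"


edges = SketchEdgeView((i, i + 1) for i in range(8))
short_text = str(edges)   # truncated after five edges
full_text = repr(edges)   # all eight edges
```

Note that `print(edges)` goes through `__str__`, whereas echoing `edges` at an interactive prompt uses `__repr__`, so the truncation would only apply when printing.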

jarrodmillman · 2021-03-26T14:27:34Z

I like Dan's suggestion about adding a __str__ method. The more I think about G.nodes[4:6], the less I like it. It allows new users who don't know about Python data structures to get a response, but it makes everything more confusing and will require us to carefully explain what is happening. I also am leaning against adding a slice method.

It doesn't save any characters:

G.nodes.slice(10)
list(G.nodes)[10]

And converting an iterable to a list using list() is standard Python that users should learn if they are using NetworkX.

rossbar · 2021-03-26T18:11:47Z

One thing about a __str__ method is that it's not parametrized, so users would be stuck with whatever number of elements is set in the method, or we'd have to provide an external way of setting it via e.g. something like np.printoptions or an rc file. IMO these latter options seem like more trouble than they're worth. I'm not sure

>>> with nx.printoptions(num_elem=5):
...     G.nodes()

is more convenient than

>>> list(G.nodes())[:5]

and is certainly less flexible in any case (what if a user wanted every other node?).

MridulS marked this pull request as draft July 17, 2020 18:26

MridulS added the type: Enhancements label Jul 17, 2020

add slice to __getitem__ for Node(Data)View

c9dec44

MridulS force-pushed the slice_view branch from 5a06a84 to c9dec44 Compare July 17, 2020 18:46

dschult approved these changes Jul 18, 2020

MridulS mentioned this pull request Jul 30, 2020

NXEP 2 — API design of view slices #4101

Merged

Base automatically changed from master to main March 4, 2021 18:20

MridulS closed this Feb 3, 2022
