Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in Exploring and Analyzing Network Data with Python #1587

Open
svmelton opened this issue Dec 9, 2019 · 16 comments
Open

Error in Exploring and Analyzing Network Data with Python #1587

svmelton opened this issue Dec 9, 2019 · 16 comments
Assignees
Labels

Comments

@svmelton
Copy link
Contributor

@svmelton svmelton commented Dec 9, 2019

A reader reported the following error via email:
for n in G.nodes(): # Loop through every node, in our data "n" will be the name of the person print(n, G.node[n]['birth_year']) # Access every node by its name, and then by the attribute "birth_year"

Should be:
for n in G.nodes(): # Loop through every node, in our data "n" will be the name of the person print(n, G.nodes[n]['birth_year']) # Access every node by its name, and then by the attribute "birth_year"

I'll submit a PR.

@svmelton svmelton self-assigned this Dec 9, 2019
@walshbr

This comment has been minimized.

Copy link
Contributor

@walshbr walshbr commented Dec 9, 2019

Hmm…wonder if the API for networkx might have changed, as that syntax shows up a few times in the lesson - "G.node[" shows up several times times. And all the code blocks definitely ran when I was editing the thing back when it live. Let me know if you'd rather me do a PR, @svmelton. Or happy to just review yours - all the same to me. Thanks! Good catch.

@svmelton

This comment has been minimized.

Copy link
Contributor Author

@svmelton svmelton commented Dec 9, 2019

If you've got the time to do a PR, that would be great!

@walshbr

This comment has been minimized.

Copy link
Contributor

@walshbr walshbr commented Dec 9, 2019

Actually @svmelton - it looks like updating that syntax exposed another issue that is indeed from ways that API changed in a newer version in the last couple years. So I think it won't be as easy as just updating that one piece of syntax.

@jrladd - can you take a look at this and let me know what you'd like to do? One option might be to freeze the version number on the lesson, or we could update it for the more recent networkx version. Besides the piece flagged here, the line here -

"communities = community.best_partition(G)"

is throwing an error because "edges_iter" was removed in the more recent version of networkx. Let me know what you think!

@svmelton

This comment has been minimized.

Copy link
Contributor Author

@svmelton svmelton commented Dec 9, 2019

@walshbr @jrladd I had to run the pip3 install python-louvain --upgrade command and install numpy to get that section to work.

I am running into issues on the eigenvector section, however—I'm getting a keyerror on eigenvector.

@jrladd

This comment has been minimized.

Copy link
Contributor

@jrladd jrladd commented Dec 9, 2019

Thanks @svmelton @walshbr for looking into this. These difficulties are definitely because NetworkX is changing all the time. I'd prefer to update rather than leave it with an old version, but we should probably note the version this time in case this happens again. It's unlikely we'll be able to keep up with NetworkX forever!

For the nodes issue, updating to G.nodes[n] in all the places it occurs makes perfect sense. @svmelton where exactly is the KeyError coming up in the eigenvector section? Is it not finding what it expects in eigenvector_dict? I suspect this is also due to a change in the NetworkX API, but I'm wondering why it isn't affecting betweenness. It may be that I forgot something the last time I updated the code.

As for community detection, NetworkX now supports modularity inside the library itself. I might prefer to update to the greedy modularity communities function and get rid of python-louvain altogether. That would simplify the tutorial a bit.

If you'd like to go that route w/ modularity, I could make those updates early in the new year.

@svmelton

This comment has been minimized.

Copy link
Contributor Author

@svmelton svmelton commented Dec 9, 2019

@jrladd @walshbr The error is coming up in class0_eigenvector = {n:G.node[n]['eigenvector'] for n in class0}

The error is:
Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 1, in <dictcomp> KeyError: 'eigenvector'

@walshbr

This comment has been minimized.

Copy link
Contributor

@walshbr walshbr commented Dec 9, 2019

That all seems fine to me @jrladd. And it seems like the vast bulk of the intellectual work of the lesson would remain the same, so as long as @svmelton agrees I don't think we would need to retire the old lesson or peer review a new one. I think we could just focus on updating this in the next few weeks. That sound alright to you @svmelton?

I'll let @jrladd take a look at the error!

@svmelton

This comment has been minimized.

Copy link
Contributor Author

@svmelton svmelton commented Dec 9, 2019

Sounds good @walshbr!

@jrladd

This comment has been minimized.

Copy link
Contributor

@jrladd jrladd commented Dec 9, 2019

Thanks, both! The error may be related to python-louvain somehow (maybe the data structure for nodes in classes doesn't include attributes anymore). Since I'll be adjusting that section to work with the new modularity function, I can make sure to avoid the error in the new version. (So long as it's okay to wait a few weeks for a fix.)

@walshbr

This comment has been minimized.

Copy link
Contributor

@walshbr walshbr commented Dec 10, 2019

@jrladd - if networkx changes that much, I wonder if it'd be worth giving a requirements.txt file and instructions for how to install pip from that? Or otherwise giving instructions in the installation section for how to install the specific version of networkx so that it's sort of frozen in time.

@jrladd

This comment has been minimized.

Copy link
Contributor

@jrladd jrladd commented Dec 10, 2019

@walshbr Yes, I think especially if we move away from the python-louvain module and only have one requirement, it'll be easiest just to update the instructions to something like:

pip3 install networkx==2.4

Then readers should get the same version every time. I'll check to make sure this is the right syntax and make the change alongside the others we discussed.

@walshbr

This comment has been minimized.

Copy link
Contributor

@walshbr walshbr commented Dec 10, 2019

That sounds like a good solution to me @jrladd. I'll stick an alert on the top of the lesson for now that the newer version of networkx will raise errors. Are you comfortable submitting a PR to this repository with the changes when you've got them?

@jrladd

This comment has been minimized.

Copy link
Contributor

@jrladd jrladd commented Dec 10, 2019

@walshbr Yep! I may not get it done before break begins at the end of this week, but I'll have it sometime in early Jan. at the latest.

@mdlincoln

This comment has been minimized.

Copy link
Member

@mdlincoln mdlincoln commented Jan 13, 2020

Any progress on this?

@jrladd

This comment has been minimized.

Copy link
Contributor

@jrladd jrladd commented Jan 13, 2020

Thanks for flagging this again, @mdlincoln! I got caught up in MLA and post-holiday activities. I’ll make the changes this week.

@jrladd

This comment has been minimized.

Copy link
Contributor

@jrladd jrladd commented Jan 17, 2020

As you can see above, I've sent a pull request that should fix all the things we talked about. Let me know if this will work! Many thanks @svmelton @walshbr @mdlincoln

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.