Definition of "production code" #2

Open
skilfullycurled opened this issue Mar 7, 2018 · 3 comments

Comments

@skilfullycurled

Hi,

Thanks so much for this implementation. Per your promise, it is quite readable and therefore very educational. You write in the readme that:

The code aims at people who want to learn about algorithms for social graphs. By far, this won't do for production code. We aim at readable code for educational purposes. main.py implements the algorithm, util/generate_data.py generates data and ui/index.html helps us with plotting our social graph.

Can you be a bit more specific about what you mean by "production code"? Do you mean it would fall short on speed, on accuracy, or on graphs above a certain size?

Thanks for your help!

@RobRomijnders
Owner

RobRomijnders commented Mar 7, 2018 via email

@skilfullycurled
Author

I don't mind at all!

I'm going to be analyzing two Twitter networks: the first has about 2 million friends/followers, and the second is an interaction graph (e.g. mentions, retweets, replies) whose size I don't know yet, though it will be in the millions. I'm not sure whether BigClam will yield better results than other community detection methods, but I am very attracted to the possibility of finding overlapping communities, since I think this more closely mirrors reality.

As far as the meaning of "production code" is concerned, I'll have access to some pretty hefty computing resources, so if it's only a matter of inefficient code, that might not be too much of a problem.

A follow-up question, if I may (I'm happy to open a new issue to keep things organized), about data generation vs. data input:

p2c = datagen.person2comm
adj = datagen.adj

I trust I can replace datagen.adj with an actual adjacency matrix, but what is the replacement for person2comm?
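
For the first half of that question, here is a minimal sketch of building an adjacency matrix from real edge data. It assumes that datagen.adj is a dense, symmetric 0/1 NumPy array, which this thread does not confirm; the helper name edges_to_adjacency and the sample edge list are hypothetical. It does not address what should stand in for person2comm.

```python
import numpy as np


def edges_to_adjacency(edges):
    """Build a dense, symmetric 0/1 adjacency matrix from (u, v) pairs.

    Hypothetical helper: the repository may expect a different layout.
    """
    # Map arbitrary node labels (e.g. account names) to row/column indices.
    nodes = sorted({node for edge in edges for node in edge})
    index = {node: i for i, node in enumerate(nodes)}

    adj = np.zeros((len(nodes), len(nodes)), dtype=np.int8)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj[index[u], index[v]] = 1
        adj[index[v], index[u]] = 1  # treat the graph as undirected
    return adj, nodes


# Made-up sample data, for illustration only.
edges = [("alice", "bob"), ("bob", "carol"), ("alice", "carol"), ("carol", "dave")]
adj, nodes = edges_to_adjacency(edges)
```

A dense matrix stores n × n entries, so at 2 million nodes even one byte per entry comes to about 4 TB. If the implementation does rely on a dense matrix, that alone could be one sense in which it is not "production code", regardless of the hardware available.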

@RobRomijnders
Owner

RobRomijnders commented Mar 9, 2018 via email
