Network analysis #7
Comments
I think this is a great idea!
Anthony Suen |
Can you show the legend for the color code in the first graph (the color code for the second one is the correlation)? What does the bar chart with the lines on top represent? It is interesting to see that there is high correlation in the middle of the matrix. Maybe we can coordinate the seasonal buying among these departments. |
Ah good point @kaiweitan, the color code is actually relatively arbitrary. Matplotlib chooses the colors to accentuate the differences in the data, so in this case "white" isn't necessarily 0. When we make a final output of this, then we will make sure to get the colors right. For the second big correlation matrix, it's currently sorted according to the clustering trees that you see on the margins. We could define "cuts" of those trees as clusters, though doing this is a bit of a dark art. Definitely worth looking into. |
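As a rough illustration of what "cutting" a clustering tree into clusters could look like, here is a minimal sketch using SciPy's hierarchical clustering. The toy correlation matrix, the `1 - correlation` distance, the average linkage, and the helper name `cut_correlation_tree` are all assumptions for illustration, not what the plots in this thread actually used.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def cut_correlation_tree(corr, n_clusters):
    """Cluster the rows of a correlation matrix and cut the tree into flat clusters.

    Hypothetical helper: the distance and linkage choices are assumptions.
    """
    # Turn correlation into a distance: highly correlated items end up close together.
    dist = 1.0 - np.asarray(corr, dtype=float)
    np.fill_diagonal(dist, 0.0)
    dist = (dist + dist.T) / 2.0  # enforce exact symmetry
    # linkage expects a condensed (upper-triangle) distance vector.
    condensed = squareform(dist, checks=False)
    tree = linkage(condensed, method="average")
    # Cut the tree so that at most n_clusters flat clusters remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Toy correlation matrix for four made-up departments.
corr = np.array([
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
])
labels = cut_correlation_tree(corr, n_clusters=2)
print(labels)
```

The "dark art" part is choosing where to cut: `criterion="maxclust"` fixes the number of clusters, while `criterion="distance"` would cut at a fixed height instead, and the two can give quite different groupings.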
Very cool - we could choose different uses for colors. E.g., color code by manufacturer or supplier ID, rather than their category. That way we could see which organizations are persistently connected to others across time. |
hi nick, nice, but can you explain a bit how you subsetted by months? one way to clean up the network graph is to remove nodes with centrality =
Darius Mehri |
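The centrality criterion in the comment above is cut off, so the following is only a sketch of one possible reading: dropping nodes whose degree centrality falls below a cutoff. The toy graph, the choice of degree centrality, and the `threshold` value are all hypothetical.

```python
import networkx as nx

# Toy co-occurrence graph with made-up node names.
G = nx.Graph()
G.add_edges_from([
    ("A", "B"), ("A", "C"), ("B", "C"),  # a tightly connected triangle
    ("A", "D"),                          # a node hanging off the triangle
    ("E", "F"),                          # an isolated pair
])

# Degree centrality: fraction of the other nodes each node is connected to.
centrality = nx.degree_centrality(G)

threshold = 0.3  # hypothetical cutoff, not taken from the thread
low = [node for node, score in centrality.items() if score < threshold]

# Work on a copy so the original graph stays available for comparison.
pruned = G.copy()
pruned.remove_nodes_from(low)
print(sorted(pruned.nodes))
```

Other measures such as `nx.betweenness_centrality` could be swapped in the same way; which one suits the purchasing data is an open question.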
Hi Darius, Basically I converted the creation_time variable into a datetime variable which allows ease in specifying a range of dates. For the graph this was the first three months of the current data (1/1/2012-3/1/2012). |
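A minimal sketch of that kind of date subsetting in pandas, assuming the data sits in a DataFrame with a `creation_time` column stored as strings. The sample rows, the `department` column, and the inclusive end date are assumptions; only `creation_time` and the date range come from the comment above.

```python
import pandas as pd

# Made-up records; only the creation_time column name comes from the thread.
orders = pd.DataFrame({
    "creation_time": ["2012-01-15", "2012-02-20", "2012-03-01", "2012-04-10"],
    "department": ["A", "B", "A", "C"],
})

# Convert the string column to real datetimes so range comparisons work.
orders["creation_time"] = pd.to_datetime(orders["creation_time"])

# Keep rows from 1/1/2012 through 3/1/2012 (both endpoints included by default).
start, end = pd.Timestamp("2012-01-01"), pd.Timestamp("2012-03-01")
in_range = orders["creation_time"].between(start, end)
subset = orders[in_range]
print(subset)
```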
hi nick, i see, i did the same exact thing for 2013, there is some issue
Darius Mehri |
if you used drop_duplicates, there is a chance you may be throwing out too
Darius Mehri |
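The comment above is cut off, but one way `drop_duplicates` can discard more rows than intended is when it is applied to only a few columns. A small sketch with made-up data (all column names here are hypothetical):

```python
import pandas as pd

# Made-up purchase records: orders 1 and 2 share a department and supplier
# but are distinct purchases.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "department": ["A", "A", "A", "B"],
    "supplier": ["X", "X", "Y", "X"],
})

# Deduplicating on a subset keeps one row per (department, supplier) pair,
# which silently drops the repeat purchase in order 2.
by_pair = orders.drop_duplicates(subset=["department", "supplier"])

# The default compares every column, so only fully identical rows are dropped.
by_row = orders.drop_duplicates()

print(len(orders), len(by_pair), len(by_row))
```

If repeated co-occurrences are meant to count as edge weights in the network, counting them with something like `groupby(...).size()` before deduplicating would preserve that information.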
hey @nlin3330 it looks like you worked on the color-coding stuff in a recent commit...do you have any interesting output / plots from that analysis, or still a work-in-progress? |
hey guys, i am back online to work on the project, sorry again for the absence. my objective is to get the group some nice network graphs and some hard data by the end of the week. nick is away all week (he is out of the country), but i will be in touch with him on and off. here are some of the plans:
|
This issue is where we'll discuss the network analysis component. We can post graphs, code snippets, and brainstorms.
The network analysis project aims to find clusters of co-occurrence between departments, manufacturers, suppliers, product types, etc.
Project lead is @dariusmehri along with @nlin3330