Cytoscape_tute.txt

**Install***For Cytoscape, we need to install the program on our computer: https://cytoscape.org/download.htmlWe also want to install the yfiles package for layouts: https://apps.cytoscape.org/apps/yfileslayoutalgorithms**Files**We will pull down the file we want by git cloning the rep. In a terminal, navigate to a directory where you can easily remember where your data lives. Then type: git clone https://github.com/biovcnet/NetworkScienceIntroduction.gitThis gives you several files and a directory called ÒAnalysisÓ. This directory has the file we want to work with in Cytoscape. **Import the data**Launch Cytoscape and click the ÔImport Network from fileÕ button. The file we want is TaraOceansSpearSparP1000GlassoAnalysis.csv. First, hit the Òselect noneÓ buttonÑthis deselects all columns from your imported CSV. In the interactive dialog box select your target and source nodes, as well as edge attributes. The little triangle in each column allows you to define your features. VarA will be our source node, so click the little triangle and click the green button under ÒmeaningÓ to designate it as such.  VarB is the target node, so click the triangle in this column and assign it as such by clicking the bullseye icon. (The order does not actually matter because this is an undirected network.) Next, Select rho.spear, p.spear, and fdr.spear all as edge attributes, which you mark with the purple icon. For simplicity, we will only examine the Spearman network right now. Hit the OK button! We have a network!!**Quick examination**First thing to look at is the number of nodes and edges, which can be found in the left navigation pane. Our network has 27 nodes and 378 edges. Some edges are self-loops. LetÕs remove those by selecting Edit-->Remove self-loops-->TaraOceansSpearSparP1000GlassoAnalysis.csv (if they donÕt go away right away, click anywhere in the viewer window). 
Let's rename this network by going to Edit-->Rename Network (or right-click the network in the left pane). Let's call it "Spearman".

This network is hard to look at, since all the nodes are stacked on top of each other. Let's change the layout so we can read the labels. Do this by going to Layout-->yFiles Organic. Read about this and other layout algorithms here: https://www.yworks.com/products/yfiles-layout-algorithms-for-cytoscape

The nodes are still pretty scrunched up, so let's stretch the network. This will not change the relative node positions; it simply scales the spacing between the nodes proportionally. Do this (on Mac) by selecting Layout-->Node Layout Tools, then dragging the scale slider to the right. This layout tool may look different on Windows. Whoa! Now we're zoomed way in... let's fix that by clicking the magnifying glass at the top with the arrows pointing NE and SW. This fits the whole network in our viewer window.

Now we can see all the node names. Cool! So, it looks like there are a lot of connections... surely not all of them have a significant p-value...

**Selecting edges with statistics**

Because we read in the p.spear value, we have a p-value for each edge in our network. We want to be selective and look only at connections that are significant, both positive and negative. We can create a new network from the edges with significant p-values by clicking "Select" in the control panel. From here, click the plus button to add our first filtering criterion. This will be a "column filter", so click that and scroll down in the drop-down box to "Edge: p.spear". We now have a range of p-values. Let's make our cutoff 0.05. That means we set our range between 0 and 0.05, inclusive. This highlights 312 of our 351 edges. Make a new network from this first filter by going to Select (at the top)-->Nodes-->Nodes connected by selected edges. This highlights all of our nodes in yellow. Now, go to File-->New Network-->From selected nodes, selected edges.
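The column filter and the "Nodes connected by selected edges" step can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the column names (VarA, VarB, rho.spear, p.spear, fdr.spear) come from the tutorial's CSV, but the four sample rows and the helper name `significant_edges` are hypothetical.

```python
import csv
import io

# Hypothetical sample rows; only the column names follow the tutorial's CSV.
SAMPLE = """VarA,VarB,rho.spear,p.spear,fdr.spear
A,B,0.82,0.001,0.004
A,C,-0.61,0.030,0.060
B,C,0.10,0.400,0.500
C,D,0.05,0.900,0.900
"""


def significant_edges(csv_text, cutoff=0.05):
    """Return rows whose p.spear lies in [0, cutoff], like the column filter."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [r for r in rows if 0.0 <= float(r["p.spear"]) <= cutoff]


selected = significant_edges(SAMPLE)

# "Nodes connected by selected edges": the endpoints of the surviving edges.
connected_nodes = sorted({n for r in selected for n in (r["VarA"], r["VarB"])})

print(len(selected), connected_nodes)  # 2 ['A', 'B', 'C']
```

Node D drops out of the toy result because its only edge fails the cutoff. In the tutorial's network, every node stays selected because each one keeps at least one significant edge.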
Woo, new network! It's called "Spearman(1)". Let's rename it "Spearman_p0.05".

**Edge attribute visualization**

Now that we have only statistically significant edges, let's change their rendering to understand what they mean.

Let's first change the width of the line to indicate the value of rho. We do this by clicking "Style" in the control panel. At the bottom, there are options for "Node", "Edge", and "Network". We want "Edge". There are lots of things we can change here, so don't get distracted! Let's focus on changing the width (at the bottom). We want continuous mapping based on the rho.spear column. Once you have those options selected, double-click the little box that shows the values being mapped. We can use this editor to change the range of values it uses and the width of the edges. In the editor, make three "breaks": one on the left, one on the right, and one in the middle. The min and max should be set automatically, but if not, click "Set min and max". The third break goes in the middle at 0.0. Drag the left and right edge widths to about 10 to 15 (make sure the left and right are the same). Make the width at the middle point (0.0) about 1. This makes the edges with the highest absolute rho values the thickest, showing us which correlations are strongest.

This is fun! But what about negative vs. positive connections?? We can address this with COLOR, YAY!!!

Going back to the edge style pane, we find "Stroke color". Again, we select continuous mapping, with rho.spear as our column. Click in the box again to open the color editing pane. You have so many color options, and even the option to change palettes. But let's keep it simple, with three colors (one on the left, one on the right, one in the middle at 0). Left, for negative connections, will be red. Right, for positive connections, will be blue. And the middle will be white.
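To see what a three-break continuous mapping does to a single rho value, here is a minimal sketch that interpolates linearly between the break points. It assumes rho ranges from -1 to 1 and uses an end width of 12 (the tutorial suggests 10 to 15). The function names are hypothetical, and Cytoscape's own editor may differ in its details.

```python
def interpolate(x, points):
    """Piecewise-linear interpolation through sorted (x, value) break points."""
    if x <= points[0][0]:
        return points[0][1]
    for (x0, v0), (x1, v1) in zip(points, points[1:]):
        if x <= x1:
            t = (x - x0) / (x1 - x0)
            return v0 + t * (v1 - v0)
    return points[-1][1]


# Assumed rho range of -1..1; thick at both ends, thin at zero.
WIDTH_BREAKS = [(-1.0, 12.0), (0.0, 1.0), (1.0, 12.0)]

# Red (negative) -> white (zero) -> blue (positive), as RGB tuples.
COLOR_BREAKS = [(-1.0, (255, 0, 0)), (0.0, (255, 255, 255)), (1.0, (0, 0, 255))]


def edge_width(rho):
    """Map rho to a line width using the width break points."""
    return interpolate(rho, WIDTH_BREAKS)


def edge_color(rho):
    """Map rho to an RGB color by interpolating each channel separately."""
    channels = []
    for i in range(3):
        pts = [(x, rgb[i]) for x, rgb in COLOR_BREAKS]
        channels.append(round(interpolate(rho, pts)))
    return tuple(channels)


print(edge_width(-0.5), edge_width(0.0), edge_width(1.0))
print(edge_color(-1.0), edge_color(0.0), edge_color(1.0))
```

Because the width breaks are symmetric around zero, a rho of -0.5 and a rho of 0.5 get the same thickness, and only the color tells them apart.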
Now our network tells us which statistically supported edges are the "strongest" in terms of rho AND the sign of those rho values (positive vs. negative).

**Reading in node table**

Now, let's do something with those nodes! For this, I want to introduce more data. I have a file that says which Kingdom (or Domain) of life (Bacteria or Archaea) each of our phyla falls into. It is in the git repo as "kingdom_phylum.csv".

Navigate back to "Network" in the control panel and select our Spearman_p0.05 network if it's not already highlighted. At the top of the window, next to the "Import Network from File" button, there is an "Import Table from File" button. Select this and navigate to the correct file ("kingdom_phylum.csv"). The process for importing this data is very similar to the process for importing a network. Next to the "phylum" column, hit the triangle and change "phylum" to "name" (this is important because it has to match the column header in our existing Node Table; otherwise, the import cannot map the rows onto our nodes). This will be our "key", so select the key icon under "meaning". You should now have 3 columns in your Node Table: shared name, name, and Kingdom. Let's use this information to change the color of our nodes, whooooo!!

Go to "Style" in the control panel, and at the bottom, find "Node". Find "Fill color"; our mapping will be discrete this time, using the new "Kingdom" column. You have two values to assign colors to: Archaea and Bacteria. Select the colors you want by clicking in the column next to these names and selecting the icon with 3 dots. Go wild! Keep in mind that black is the default label color, so if you want a black node, you have to change the label color to white. You can assign a universal color for node labels by clicking the far-left box in the appropriate row, which lets you change the "default value".

**Network statistics**

Looking good.
Let's let Cytoscape calculate some things about our pretty network and then use those calculations to make additional changes to how it looks. To do network statistics, simply select the network you want to analyze on the left, and select Tools-->NetworkAnalyzer-->Network Analysis-->Analyze Network. Treat the edges as undirected.

Now, in addition to the 3 columns we had before, the Node Table has several more. These include things like degree, average shortest path length, and closeness centrality. A results panel may also have popped up to let you immediately see the clustering coefficient (0.895) and the node degree distribution.

Let's change the size of our nodes to reflect their degree, that is, the number of connections they have. We can do this by going to "Style" and, under the node options, selecting "Size" (if this option is grayed out, make sure the checkbox at the bottom that locks node width and height is selected). The column is Degree and the mapping is continuous. Again, we can change the range of node sizes by clicking on the box and playing around.

**Saving**

What a pretty network! Let's save all this work by first saving the Cytoscape session (File-->Save). But we also want a nice image of our network (you can take a screenshot... orrr). We can export a .png or .jpg file by clicking the "Export to File" button just underneath the visualizer window. Let's export as Image. Browse to the correct directory and name the file Spearman_p0.05PosNeg.jpg. Zoom to 250% (100% is fine, too) and hit OK.

We did it!!! There are so many options in Cytoscape for color rendering, line types, node shapes, outline colors, and even entire themes. I encourage you to explore and find what you like best. Especially keep in mind ways to maximize the amount of data you're able to display. Happy networking!!
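As a postscript to the network statistics step, here is a minimal sketch of what node degree and an averaged clustering coefficient measure, in case you want to check values by hand. The four-edge toy network is hypothetical. Treating nodes with fewer than two neighbors as having a clustering coefficient of 0 is an assumed convention here, so results from Cytoscape's analyzer may not match this calculation exactly.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy network: a triangle A-B-C plus a pendant node D.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")]


def build_adjacency(edges):
    """Build undirected adjacency sets; self-loops are skipped."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0  # assumed convention for nodes with fewer than 2 neighbors
    pairs = list(combinations(nbrs, 2))
    linked = sum(1 for u, v in pairs if v in adj[u])
    return linked / len(pairs)


adj = build_adjacency(EDGES)
degree = {n: len(adj[n]) for n in sorted(adj)}
avg_clustering = sum(local_clustering(adj, n) for n in adj) / len(adj)

print(degree)
print(round(avg_clustering, 3))
```

In this toy network, A has the highest degree (3), so a continuous size mapping on Degree would draw it as the largest node.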