# Making Publication Quality Figures

## 0. Make sure the figure fits with the context of where it is going to be used

**Figure outline**: By the time publication is in the picture, often the outline of figures is already planned out. This is not the place to discuss proper figure outlining (proper overview figure, methods figure, etc), but it is important to keep in mind that the figure you intend to publish must be placed in context of a larger plan. Often this limits the number of figures significantly and should be considered relatively early to focus data analysis efforts.

**Purpose - poster or paper?**: Poster presentations are much more accepting of cutting corners in figure preparation than paper submissions. However, keep in mind that posters sometimes require figures to be presented much larger, so resolution can be more important in that case.

**Submission instructions**: Most journals publish clear instructions for figures, so it helps to read those of the journal where you intend to submit (or of a similar journal, for a rough guess). Most expect sizes that fit the paper's column widths, though if you are approximately at single-column or double-column size this can be adjusted in production. Resolution is generally expected to be at least 300 dots per inch, which should be the goal for raster-based images, though this rule can be bent too. PDF or common high-resolution raster formats are generally requested.
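As a rough illustration of the sizing arithmetic, the sketch below converts a print size and resolution into pixel dimensions. The column widths and heights used here are assumed placeholder values, not any particular journal's requirements, and `pixels_needed` is a hypothetical helper name.

```python
def pixels_needed(width_in, height_in, dpi=300):
    """Return the (width, height) in pixels for a print size given in inches."""
    return round(width_in * dpi), round(height_in * dpi)


# Assumed example widths in inches; real journals publish their own values.
SINGLE_COLUMN_IN = 3.5
DOUBLE_COLUMN_IN = 7.0

single = pixels_needed(SINGLE_COLUMN_IN, 2.5)
double = pixels_needed(DOUBLE_COLUMN_IN, 4.0)

print(single)  # (1050, 750)
print(double)  # (2100, 1200)
```

A raster image with fewer pixels than this will have to be stretched to fill the column, which is where visible blurring or blockiness comes from.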

## 1. Get a reasonable, rough visual

**Pilot/prototype the plot**: Sometimes a sketch or mental picture can’t be easily rendered, so it’s important to quickly take the data and see what the basic visual looks like. You may see that your original intent doesn’t convey the information well and that you should try another approach. Much like how pilot studies uncover flaws in basic logic, it’s important to quickly plot your data and see if what you want to demonstrate is well represented by your visual.

**Choose the right form**: Maybe your bar plot would be better represented as a pie chart (e.g. since it’s a proportion). Maybe your surface plot, though it looks interesting, would be easier to observe as a contour plot. Line plot or scatter plot? Put it all on a single plot, or split it into multiple plots? This is the time to decide, because if you decide this later you will waste time.

## 2. Add the basics to where someone can minimally interpret the figure

Sure, you may know what is going on with your figure, but anyone else coming to it likely doesn’t have the same experience as the person who collected the data or created the figure. Be sure to add appropriate text to make the figure interpretable. The following list is roughly ranked from most important to least important “basics” for a figure to be interpretable:
* a figure title
* x-axis label (with units)
* y-axis label (with units)
* a legend (especially for multiple line plots, contour plots with colors, etc)
* a figure caption (a short text description below the figure to describe what it represents). This is critical in posters and papers where people are often drawn to the figure and need context to understand what is going on and what is important, but is often not a step of consequence in the initial generation of the figure.
* Critical text or shape annotations: an outlier to highlight, the value of a data point or level to state explicitly, etc.
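The basics above could look something like the following in Python with matplotlib. This is only a sketch: the library choice, the made-up sensor data, and all labels are assumptions for illustration, and most plotting libraries have equivalent calls.

```python
import math

import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

# Made-up data purely for illustration
t = [0.1 * i for i in range(101)]  # time from 0 to 10 s
signal_a = [math.sin(x) for x in t]
signal_b = [math.cos(x) for x in t]

fig, ax = plt.subplots()
ax.plot(t, signal_a, label="Sensor A")
ax.plot(t, signal_b, label="Sensor B")

# Title and axis labels, with units
ax.set_title("Sensor output over time")
ax.set_xlabel("Time (s)")
ax.set_ylabel("Voltage (V)")

# Legend built from the label= arguments above
ax.legend(loc="lower left")

# Text annotation pointing at the highest sample of Sensor A
peak_index = max(range(len(t)), key=lambda i: signal_a[i])
ax.annotate(
    "peak",
    xy=(t[peak_index], signal_a[peak_index]),
    xytext=(t[peak_index] + 1.0, 0.6),
    arrowprops={"arrowstyle": "->"},
)
```

The caption is the one item not handled here, since it is usually written in the paper or poster itself rather than drawn into the image.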

## 3. Edit aesthetics using formatting options in your plot library

Now that the figure is “complete” in the sense that it is legible and has the right information, it is important to note that you are far from finished for a publication quality figure. There are aesthetic rules which can be bent, and some can be broken, but most figure submissions would be rejected without taking a number of these into account. These are roughly ordered by most to least important:
1. Fix x-axis and y-axis limits to best show the regions of interest
    * There should not be significant amounts of empty space
    * … but legends and annotations should not cover up data
2. Change the figure size
    * Especially with subplots or complex figures, you may want the figure canvas to be taller or wider than the defaults.
3. Change line colors, types, and data point markers
    * If you expect a lot of points to overlap in a scatter plot or other plot, use lighter, empty marker shapes and consider transparency
    * *Note, 4-5% of the population is color blind, with most of those red/green colorblind, so be sure to not use only color to differentiate. Consider using different line types (dots, dashes, and solid lines) along with different markers. Consider textures in addition to different colors (e.g. for bar plots)*
    * Make sure the legend reflects the changes appropriately
4. Keep major tick marks minimal to give room for labeling
    * Minor tick marks have no labels, while most plotting libraries put labels on major tick marks
    * Generally, for publication quality figures it is less about having very densely spaced major tick marks for precise determinations in a figure, and more about using very few tick marks so the data itself is not obscured and the figure is not overly complicated
5. Change font sizes (and possibly font type as well)
    * Wherever there is text, there is almost always an option to change the font size so it is legible. Usually this means you need to make the text larger, as figures are often smaller in print than when you are working with them. Also, many publications suggest only certain fonts, and most of those are simpler sans-serif fonts like Arial rather than serif fonts like Times New Roman.
6. Change the aspect ratio
    * Plots often change the aspect ratio (the “stretch” of the x axis compared to the y axis) automatically to improve readability. However, this can warp certain figures. E.g. if the aspect ratio is anything other than 1, a circle will look like an ellipse, and a y=x line won’t look like it’s at 45 degrees. Many plotting packages allow you to fix the aspect ratio, or at least force it to be 1 (e.g. “axis equal”). Keep in mind that with an aspect ratio of 1 your plot can still be a rectangle; it just means that, for example, if the plot spans 4 units in x and 2 units in y it will be the same shape as two squares side by side.
7. Remove the top and right borders of plots where appropriate.
    * Many publication figures just show the x and y axes, and don’t have any grid or bounding box for the figure. Again, this emphasizes the data and makes publications appear cleaner.
8. Consider additional annotations (e.g. highlighting certain data points, including a trend line, etc)
    * It is helpful to do this in code rather than in post-processing using an image editor, because if you have to change the figure in any way it can be auto-generated effortlessly, whereas any post-processing has to be repeated each time you regenerate a figure.

Note: if you are fairly sure you won’t have to re-generate the figure for any reason, and you are fairly handy with image editing software, **you can skip some of the previous steps if you think it is faster to perform them directly in image editing software**. However, if you are not sure you can do this, it is best to spend some time to see if this can be done in code, because once you track down how to do it in code, you can do the same for every figure you’ll ever create in the future.
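Several of the steps above can be sketched in Python with matplotlib as follows. The library, the made-up data, and the specific sizes, colors, and tick counts are all assumptions chosen for illustration; treat them as starting points to tune, not recommended values.

```python
import math

import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

# Step 5: larger, sans-serif text (set before any text is created)
plt.rcParams.update({"font.size": 12, "font.family": "sans-serif"})

# Made-up data purely for illustration
x = [0.25 * i for i in range(41)]  # 0 to 10
decay = [math.exp(-0.3 * v) for v in x]
growth = [1 - math.exp(-0.3 * v) for v in x]

# Step 2: set the figure size explicitly, in inches
fig, (ax_lines, ax_circle) = plt.subplots(1, 2, figsize=(7.0, 3.0))

# Step 3: differ in line style and marker shape, not only in color;
# hollow markers stay readable where points overlap
ax_lines.plot(x, decay, color="tab:blue", linestyle="-", marker="o",
              markerfacecolor="none", markevery=4, label="Decay")
ax_lines.plot(x, growth, color="tab:orange", linestyle="--", marker="s",
              markerfacecolor="none", markevery=4, label="Growth")

# Step 1: limit the axes to the region of interest
ax_lines.set_xlim(0, 10)
ax_lines.set_ylim(0, 1.05)

# Step 4: ask for only a handful of major ticks on each axis
ax_lines.xaxis.set_major_locator(MaxNLocator(nbins=5))
ax_lines.yaxis.set_major_locator(MaxNLocator(nbins=4))

# Step 7: hide the top and right borders
ax_lines.spines["top"].set_visible(False)
ax_lines.spines["right"].set_visible(False)

ax_lines.set_xlabel("Time (s)")
ax_lines.set_ylabel("Fraction")
ax_lines.legend(frameon=False)

# Step 6: force an aspect ratio of 1 so the circle is not drawn as an ellipse
theta = [2 * math.pi * i / 100 for i in range(101)]
ax_circle.plot([math.cos(a) for a in theta],
               [math.sin(a) for a in theta], color="black")
ax_circle.set_aspect("equal")

fig.tight_layout()
```

Because every choice lives in the script, regenerating the figure after a data change reapplies all of the styling with no manual work.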

## 4. Save in a vectorized format (.svg, .pdf, .eps) rather than a raster format (.jpg, .png, .gif, etc)

Once you have tuned your figure aesthetics programmatically, it is time to save the figure for dissemination outside of your code or IDE. Publishing cropped screenshots is generally frowned upon for a number of reasons, chief among them quality (but if you do this, please at least maximize the size of your figure before grabbing a screenshot!). You are better off using the features of your language to save a version of your figure for dissemination. However, there is an important choice to make when you are saving a figure: vector or raster graphics format?

**Raster graphics (.jpg, .png, .gif, etc)**: These formats save the image as a grid of X by Y pixels. Each pixel is described by one number if it’s grayscale, 3 if it is color, and possibly 4 if you include a level of transparency from completely opaque to completely transparent. In any case, these formats are resolution-dependent: detail that doesn’t fit in the pixel grid is lost, especially upon resizing (and some, like .jpg, also discard detail during compression). Take an image in a raster format at 500x500 pixels, reduce its size to 50x50 pixels, then expand it back to 500x500 pixels, and you’ll likely see blocks 10 pixels on a side. To get an idea of how finely these pixels need to be spaced, publications commonly request 300 pixels per inch for images to make sure they look sharp.
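The shrink-and-expand experiment can be imitated in plain Python on a synthetic image. This sketch assumes the simplest possible (nearest-neighbor) resizing and uses hypothetical helper names `downsample` and `upsample`; real image editors usually interpolate, which produces blur rather than hard-edged blocks, but the loss of detail is the same.

```python
def downsample(image, factor):
    """Keep every `factor`-th pixel in each direction (nearest neighbor)."""
    return [row[::factor] for row in image[::factor]]


def upsample(image, factor):
    """Repeat each pixel `factor` times in each direction (nearest neighbor)."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Synthetic 500 x 500 grayscale image: a smooth left-to-right ramp from 0 to 255
size, factor = 500, 10
original = [[x * 255 // (size - 1) for x in range(size)] for _ in range(size)]

small = downsample(original, factor)   # 50 x 50
restored = upsample(small, factor)     # back to 500 x 500

print(len(restored), len(restored[0]))  # 500 500
print(len(set(original[0])))            # 256 gray levels in the original row
print(len(set(restored[0])))            # 50 gray levels after the round trip
print(restored[0][:12])                 # first 10 pixels identical: one block
```

The restored image has the same pixel count as the original, but each 10 x 10 block now carries a single value, so the detail removed by shrinking does not come back.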

**Vector graphics (.svg, .pdf, .eps)**: These formats define the image using geometric shapes - lines, circles, text by fonts and sizes, etc. The advantage of this is that the image can be shrunk and expanded and remain the same quality. This is why vector graphics formats are the standard way to save your images for publication. This way, if someone puts your image in a paper, on a poster, or in a presentation, it looks crisp every time.

Explore the options in your data analysis software and find a vector graphics format to publish your images. Also, it’s not unreasonable to create a raster graphics version at the same time in case you are working with someone who can’t open the vector graphics version, since vector formats are less widely supported.
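In Python with matplotlib, for example, saving both kinds of file might look like the sketch below. The library choice, the toy data, the file names, and the temporary output directory are all assumptions for illustration; other plotting tools have their own export commands.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

# Toy figure purely for illustration
fig, ax = plt.subplots(figsize=(3.5, 2.5))
ax.plot([0, 1, 2, 3], [0, 1, 4, 9], marker="o")
ax.set_xlabel("x (arbitrary units)")
ax.set_ylabel("y (arbitrary units)")

# Placeholder output location; use your project's figure folder instead
out_dir = tempfile.mkdtemp()
pdf_path = os.path.join(out_dir, "figure1.pdf")
svg_path = os.path.join(out_dir, "figure1.svg")
png_path = os.path.join(out_dir, "figure1.png")

# Vector versions: the format is inferred from the file extension
fig.savefig(pdf_path, bbox_inches="tight")
fig.savefig(svg_path, bbox_inches="tight")

# Raster fallback at 300 dots per inch for collaborators without a vector viewer
fig.savefig(png_path, dpi=300, bbox_inches="tight")
```

Here `bbox_inches="tight"` trims surplus whitespace around the figure, and `dpi` only matters for the raster file, since the vector files scale to any size.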

## 5. Finishing touches with image editing software

After an image has been saved, there are a number of options available for post-processing a figure. It is in your best interest to become minimally proficient with at least one raster graphics editor and one vector graphics editor. The same editing rules apply to this stage of processing as to previous stages, but now you are not constrained by the limitations of what the plotting software makes available. However, do note that if you regenerate the figure from data, all post-processing edits are wasted, so it is important to do this only minimally and generally only directly prior to submission for publication.

Example raster graphics editors:
* GIMP (note: free!)
* Adobe Photoshop

Example vector graphics editors:
* Inkscape (note: free!)
* Adobe Illustrator

### Additional Resources

[Ten Simple Rules for Better Figures](http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003833) from PLOS Computational Biology
