combining two plots #26

davidaray · 2020-10-27T15:56:09Z

I have two independently produced plots for two related species and would like to combine them onto a single set of axes for comparison. I see this in multiple manuscripts but I don't know how to do it for my data.

Does anyone have some clear instructions on how to accomplish this?

I've tried just about every variation of psmc_plot.pl -M "geomys=0.1,thomomys=0.2" -u "geomys=0.0000000022,thomomys=0.0000000022" -g "geomys=3,thomomys=2" twogophers geomys.psmc thomomys.psmc
possible with no luck.

David

laiqiang0728 · 2021-01-08T13:00:56Z

Hi, I'm on the same issue, have you got any methods to solve this ?
Thanks!

Qiang

davidaray · 2021-01-08T13:53:34Z

Hi, I'm on the same issue, have you got any methods to solve this ?
Thanks!

Qiang

I didn't get anything from the author but I did get some help from another researcher. It involves taking some output data from psmc and then plotting it with Excel or another software package. It's rather easy and I'd be happy to share it with you if you contact me directly.

davidaray · 2021-01-11T15:05:16Z

For anyone interested, here is a workaround for plotting via psmc directly:

This is the documentation and commands from the script I ran. If you're familiar with linux, you should be able to work out the steps. If you have trouble, let me know and I'll try to explain what I did in each step.

#The first step is to run the analysis just once so that you get a single set of data to serve as the main line in the plot.

fq2psmcfa geomys.fq.gz >geomys.single.psmcfa

psmc -N25 -t15 -r5 -p "4+25*2+4+6" -o geomys.single.psmc geomys.psmcfa

psmc_plot.pl -u 2.2e-9 -g 3.08 -R -p geomys.plotsingle.psmc geomys.psmc

#Note the -R option. It produces a .txt file with the data to be plotted.

#Afterwards, you will do the bootstrap that is described in the psmc documentation.

Split sequences to perform bootstrapping

splitfa $PREFIX".psmcfa" > $PREFIX".split.psmcfa"

Run PSMC bootstrap , using the default options

psmc -N25 -t15 -r5 -p "4+25*2+4+6" -o $PREFIX".psmc" $PREFIX".psmcfa"

psmc -N25 -t15 -r5 -b -p "4+25*2+4+6" -o round-{}.psmc $PREFIX"2"$MAP2".split.psmcfa" | sh

seq 100 | parallel -I% --max-args 1 $PSMC/psmc -N25 -t15 -r5 -b -p "4+25*2+4+6" -o round-%.psmc $PREFIX".split.psmcfa" | sh

cat $PREFIX".psmc" round-*.psmc > $PREFIX".combined.psmc"

rm round-*.psmc

Generate PSMC plot, using the per-generation mutation rate -u and the generation time in years -g for each bootstrap iteration.

psmc_plot.pl -u 2.2e-9 -g $GT -R -p $PREFIX".plot.psmc" $PREFIX".combined.psmc"

#Again note the -R option.

#Concatenate the .txt files from the bootstrap analysis, leaving one space between each iteration's data.

#Use excel to plot the data as described below

I created an excel file that has the concatenated data in columns B-F for taxon 1 and columns I-M for taxon 2. The single-run data are in columns P-T for taxon 1 and W-AA for taxon 2.

Open your excel file and highlight columns B and C.
Click the 'Insert' toolbar and choose 'Recommended Charts'.
Select a 'Scatter' plot.
Right click the x-axis and choose 'Format Axis' change the Axis Options to 'Logarithmic scale'.
Right click any plotted point and choose 'Format Data Series'.
Click the paint can and choose 'Marker' and then 'Marker Options'. Change to 'None'.
Click 'Line' and choose 'Solid line'. Change the transparency and color to whatever suits you. I generally choose a light color.
Right click anywhere in the plotted area and choose 'Select Data'.
Click Series 1 and select 'Edit'. Change the name to 'bootstrap' or whatever you deem appropriate. Don't change any of the other fields.
Click 'Add'. Name the series, 'main' or whatever is appropriate.
Click the Series X box and then select column P.
Click the Series Y box, delete what's there and then select column Q.
Adjust the color for the new line as needed. i usually choose a darker version of the color that I used for the bootstrap.
To add your second data set,, Right click in the plotted area and choose 'Add'. Name the new series 'boostrap' and select column I for Series X and column J for Series Y.
Add the 'main' line as described in steps 10 - 13.

The excel file I created is located here: https://github.com/davidaray/test/blob/master/Pocket_gopher.xlsx

fredjaya · 2021-02-08T11:04:16Z

I provided a solution on a previous issue which utilises the multiline mode in psmc_plot.pl - might find it helpful.

JuliaLopezDelgado · 2021-11-02T20:24:01Z

Thanks for the instructions on the combined bootstrap plot @davidaray !

Just one question - is the estimated effective population size (e.g. column C in your data for Geomys) meant to be multiplied by 1,000 (e^03) or by 10,000 (e^04)? And, is this always the case or is it data-specific? I'm unsure about the units in my output .txt file and I cannot find any official documentation about this.

Thanks in advance,
Julia

davidaray · 2021-11-09T16:10:57Z

Thanks for the instructions on the combined bootstrap plot @davidaray !

Just one question - is the estimated effective population size (e.g. column C in your data for Geomys) meant to be multiplied by 1,000 (e^03) or by 10,000 (e^04)? And, is this always the case or is it data-specific? I'm unsure about the units in my output .txt file and I cannot find any official documentation about this.

Thanks in advance, Julia

Julia, I saw this last week and forgot to respond. I apologize. Did you find an answer? I honestly don't remember and our HPCC, where all this work is housed, is down for maintenance this week. I won't be able to access until Monday.

JuliaLopezDelgado · 2021-11-24T10:35:48Z

Thanks for your reply @davidaray! No, I still am unsure about the psmc output. Any insight would be really appreciated.

g-pacheco · 2022-03-02T17:04:55Z

Hello,

I also have the .txt files and I have been trying to find a way to plot these results using ggplot. As it has been said, it would be lovely to understand what are each of the 5 columns in the .txt output, and which one and how should be scaled.

I have tried to scale (X 1,000) the fifth column, but my numbers simply do not match... so I suppose things are a bit more complicated than that. It would be great to be able to plot these results with ggplot though.

Has anyone been luckier?

Many thanks in advance, George.

elizeng · 2022-03-04T07:27:11Z

Thanks @davidaray for the helpful instructions!
In the example excel file you have provided, I note that you only use the first 2 columns of the text output which are years from present (X axis) and effective population (Y axis). Any idea what the remaining 3 columns are?

elizeng · 2022-03-04T07:28:35Z

Hello,

I also have the .txt files and I have been trying to find a way to plot these results using ggplot. As it has been said, it would be lovely to understand what are each of the 5 columns in the .txt output, and which one and how should be scaled.

I have tried to scale (X 1,000) the fifth column, but my numbers simply do not match... so I suppose things are a bit more complicated than that. It would be great to be able to plot these results with ggplot though.

Has anyone been luckier?

Many thanks in advance, George.

You can probably plot it using the first 2 columns years from present (X axis - 1st column) and effective population (Y axis - 2nd column). Not sure what the rest of the columns refer to

linhai1221 · 2022-04-21T06:41:21Z

include the "-R" in the cmd,
and vim the .gp file as you like
then run cat sample_plot.gp | gnuplot

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

combining two plots #26

combining two plots #26

davidaray commented Oct 27, 2020 •

edited

laiqiang0728 commented Jan 8, 2021

davidaray commented Jan 8, 2021

davidaray commented Jan 11, 2021

fredjaya commented Feb 8, 2021 •

edited

JuliaLopezDelgado commented Nov 2, 2021

davidaray commented Nov 9, 2021

JuliaLopezDelgado commented Nov 24, 2021

g-pacheco commented Mar 2, 2022

elizeng commented Mar 4, 2022

elizeng commented Mar 4, 2022

linhai1221 commented Apr 21, 2022

combining two plots #26

combining two plots #26

Comments

davidaray commented Oct 27, 2020 • edited

laiqiang0728 commented Jan 8, 2021

davidaray commented Jan 8, 2021

davidaray commented Jan 11, 2021

Split sequences to perform bootstrapping

Run PSMC bootstrap , using the default options

Generate PSMC plot, using the per-generation mutation rate -u and the generation time in years -g for each bootstrap iteration.

fredjaya commented Feb 8, 2021 • edited

JuliaLopezDelgado commented Nov 2, 2021

davidaray commented Nov 9, 2021

JuliaLopezDelgado commented Nov 24, 2021

g-pacheco commented Mar 2, 2022

elizeng commented Mar 4, 2022

elizeng commented Mar 4, 2022

linhai1221 commented Apr 21, 2022

davidaray commented Oct 27, 2020 •

edited

fredjaya commented Feb 8, 2021 •

edited