Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

combining two plots #26

Open
davidaray opened this issue Oct 27, 2020 · 11 comments
Open

combining two plots #26

davidaray opened this issue Oct 27, 2020 · 11 comments

Comments

@davidaray
Copy link

davidaray commented Oct 27, 2020

I have two independently produced plots for two related species and would like to combine them onto a single set of axes for comparison. I see this in multiple manuscripts but I don't know how to do it for my data.

Does anyone have some clear instructions on how to accomplish this?

I've tried just about every variation of psmc_plot.pl -M "geomys=0.1,thomomys=0.2" -u "geomys=0.0000000022,thomomys=0.0000000022" -g "geomys=3,thomomys=2" twogophers geomys.psmc thomomys.psmc
possible with no luck.

David

@laiqiang0728
Copy link

Hi, I'm on the same issue, have you got any methods to solve this ?
Thanks!

Qiang

@davidaray
Copy link
Author

Hi, I'm on the same issue, have you got any methods to solve this ?
Thanks!

Qiang

I didn't get anything from the author but I did get some help from another researcher. It involves taking some output data from psmc and then plotting it with Excel or another software package. It's rather easy and I'd be happy to share it with you if you contact me directly.

@davidaray
Copy link
Author

For anyone interested, here is a workaround for plotting via psmc directly:

This is the documentation and commands from the script I ran. If you're familiar with linux, you should be able to work out the steps. If you have trouble, let me know and I'll try to explain what I did in each step.

#The first step is to run the analysis just once so that you get a single set of data to serve as the main line in the plot.

fq2psmcfa geomys.fq.gz >geomys.single.psmcfa

psmc -N25 -t15 -r5 -p "4+25*2+4+6" -o geomys.single.psmc geomys.psmcfa

psmc_plot.pl -u 2.2e-9 -g 3.08 -R -p geomys.plotsingle.psmc geomys.psmc

#Note the -R option. It produces a .txt file with the data to be plotted.

#Afterwards, you will do the bootstrap that is described in the psmc documentation.

Split sequences to perform bootstrapping

splitfa $PREFIX".psmcfa" > $PREFIX".split.psmcfa"

Run PSMC bootstrap , using the default options

psmc -N25 -t15 -r5 -p "4+25*2+4+6" -o $PREFIX".psmc" $PREFIX".psmcfa"

psmc -N25 -t15 -r5 -b -p "4+25*2+4+6" -o round-{}.psmc $PREFIX"2"$MAP2".split.psmcfa" | sh

seq 100 | parallel -I% --max-args 1 $PSMC/psmc -N25 -t15 -r5 -b -p "4+25*2+4+6" -o round-%.psmc $PREFIX".split.psmcfa" | sh

cat $PREFIX".psmc" round-*.psmc > $PREFIX".combined.psmc"

rm round-*.psmc

Generate PSMC plot, using the per-generation mutation rate -u and the generation time in years -g for each bootstrap iteration.

psmc_plot.pl -u 2.2e-9 -g $GT -R -p $PREFIX".plot.psmc" $PREFIX".combined.psmc"

#Again note the -R option.

#Concatenate the .txt files from the bootstrap analysis, leaving one space between each iteration's data.

#Use excel to plot the data as described below

I created an excel file that has the concatenated data in columns B-F for taxon 1 and columns I-M for taxon 2. The single-run data are in columns P-T for taxon 1 and W-AA for taxon 2.

  1. Open your excel file and highlight columns B and C.

  2. Click the 'Insert' toolbar and choose 'Recommended Charts'.

  3. Select a 'Scatter' plot.

  4. Right click the x-axis and choose 'Format Axis' change the Axis Options to 'Logarithmic scale'.

  5. Right click any plotted point and choose 'Format Data Series'.

  6. Click the paint can and choose 'Marker' and then 'Marker Options'. Change to 'None'.

  7. Click 'Line' and choose 'Solid line'. Change the transparency and color to whatever suits you. I generally choose a light color.

  8. Right click anywhere in the plotted area and choose 'Select Data'.

  9. Click Series 1 and select 'Edit'. Change the name to 'bootstrap' or whatever you deem appropriate. Don't change any of the other fields.

  10. Click 'Add'. Name the series, 'main' or whatever is appropriate.

  11. Click the Series X box and then select column P.

  12. Click the Series Y box, delete what's there and then select column Q.

  13. Adjust the color for the new line as needed. i usually choose a darker version of the color that I used for the bootstrap.

  14. To add your second data set,, Right click in the plotted area and choose 'Add'. Name the new series 'boostrap' and select column I for Series X and column J for Series Y.

  15. Add the 'main' line as described in steps 10 - 13.

The excel file I created is located here: https://github.com/davidaray/test/blob/master/Pocket_gopher.xlsx

@fredjaya
Copy link

fredjaya commented Feb 8, 2021

I provided a solution on a previous issue which utilises the multiline mode in psmc_plot.pl - might find it helpful.

@JuliaLopezDelgado
Copy link

Thanks for the instructions on the combined bootstrap plot @davidaray !

Just one question - is the estimated effective population size (e.g. column C in your data for Geomys) meant to be multiplied by 1,000 (e^03) or by 10,000 (e^04)? And, is this always the case or is it data-specific? I'm unsure about the units in my output .txt file and I cannot find any official documentation about this.

Thanks in advance,
Julia

@davidaray
Copy link
Author

Thanks for the instructions on the combined bootstrap plot @davidaray !

Just one question - is the estimated effective population size (e.g. column C in your data for Geomys) meant to be multiplied by 1,000 (e^03) or by 10,000 (e^04)? And, is this always the case or is it data-specific? I'm unsure about the units in my output .txt file and I cannot find any official documentation about this.

Thanks in advance, Julia

Julia, I saw this last week and forgot to respond. I apologize. Did you find an answer? I honestly don't remember and our HPCC, where all this work is housed, is down for maintenance this week. I won't be able to access until Monday.

@JuliaLopezDelgado
Copy link

Thanks for your reply @davidaray! No, I still am unsure about the psmc output. Any insight would be really appreciated.

@g-pacheco
Copy link

Hello,

I also have the .txt files and I have been trying to find a way to plot these results using ggplot. As it has been said, it would be lovely to understand what are each of the 5 columns in the .txt output, and which one and how should be scaled.

I have tried to scale (X 1,000) the fifth column, but my numbers simply do not match... so I suppose things are a bit more complicated than that. It would be great to be able to plot these results with ggplot though.

Has anyone been luckier?

Many thanks in advance, George.

@elizeng
Copy link

elizeng commented Mar 4, 2022

Thanks @davidaray for the helpful instructions!
In the example excel file you have provided, I note that you only use the first 2 columns of the text output which are years from present (X axis) and effective population (Y axis). Any idea what the remaining 3 columns are?

@elizeng
Copy link

elizeng commented Mar 4, 2022

Hello,

I also have the .txt files and I have been trying to find a way to plot these results using ggplot. As it has been said, it would be lovely to understand what are each of the 5 columns in the .txt output, and which one and how should be scaled.

I have tried to scale (X 1,000) the fifth column, but my numbers simply do not match... so I suppose things are a bit more complicated than that. It would be great to be able to plot these results with ggplot though.

Has anyone been luckier?

Many thanks in advance, George.

You can probably plot it using the first 2 columns years from present (X axis - 1st column) and effective population (Y axis - 2nd column). Not sure what the rest of the columns refer to

@linhai1221
Copy link

include the "-R" in the cmd,
and vim the .gp file as you like
then run cat sample_plot.gp | gnuplot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants