-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Problem making f2 file from plink .bed .fam and .bim files #74
Comments
I did a third run, removed the ' marks from around auto_only = FALSE and got this result:
ℹ Reading allele frequencies from PLINK files... Just some extra information. |
Your PLINK files might not be formatted as described here. If you can share the first few lines of the "indfile" is just the variable name used in some places for the I would generally avoid putting named arguments before positional arguments like in |
Thank you very much for your help! If I can use just a single sample in one file and a second file containing a population with which to compare it, it would be sufficient for my purposes at this time. Let me give it a try and see what I can do with it.
Just a little background: I am a former university professor (orchestra conductor), with a DMA from the University of Texas at Austin. About 15 years ago I became very interested in human population genetics and returned to school to earn a master's degree in biology (emphasis on population genetics) from Texas State University in 2012. Since 2013 I have been with the Texas Department of State Health Services in Austin. With the recent explosion of aDNA samples now available publicly, I am specifically interested in comparing aDNA male samples that fall into subclades of E-V13, a haplogroup found most abundantly today in the Balkan peninsula, with other potential source populations. I hope to use these samples to better understand the migration of this particular haplogroup from the Balkans to the rest of Europe.
Once again, thanks for your help.
Steve Bird
3765 Castle Rock Dr
…________________________________
From: Robert Maier ***@***.***>
Sent: Tuesday, June 4, 2024 7:01 PM
To: uqrmaie1/admixtools ***@***.***>
Cc: sbird1000 ***@***.***>; Author ***@***.***>
Subject: Re: [uqrmaie1/admixtools] Problem making f2 file from plink .bed .fam and .bim files (Issue #74)
Your PLINK files might not be formatted as described here<https://uqrmaie1.github.io/admixtools/articles/io.html>. If you can share the first few lines of the .bim and .fam file, I might be able to see what the problem is. One thing I can tell from the log messages is that it's only recognizing a single sample. If there is indeed only one sample or one population in your data, then no f-statistics can be computed; at least two populations with one sample each are needed to do anything.
"indfile" is just the variable name used in some places for the .fam or the .ind file, it doesn't mean it's expecting EIGENSTRAT format.
I would generally avoid putting named arguments before positional arguments like in extract_f2(prefix,auto_only = FALSE, my_f2_dir). I think R will interpret it correctly in this case, but generally the argument position matters for unnamed arguments, so better to write extract_f2(prefix, my_f2_dir, auto_only = FALSE)
—
Reply to this email directly, view it on GitHub<#74 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BITDDKXSKJNY5RJT24UYSBTZFZINRAVCNFSM6AAAAABH6WTSMWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBYGU4TSMZVGE>.
You are receiving this because you authored the thread.Message ID: ***@***.***>
|
I'm glad you're finding Admixtools 2 useful! One of my goals when working on it was to make these methods a bit more accessible to non-experts. But I realize it's still challenging. A single sample per file won't work. It's necessary that all samples are in the same set of files. You can use the PLINK |
Hi Robert,
I've had a chance to play around with ADMIXTOOLS2 over the weekend and had some questions about its operation. I am using the admixtools::run_shiny_admixtools() shell to carry out operations. For a test file, I am using the aDNA set from Olalde 2023, available here:
https://reich.hms.harvard.edu/datasets
I was able to produce the qpGraph once, but have not been able to do it again. If you could look at these screen shots, perhaps you may be able to let me know where I'm going wrong?
I loaded the "Select data directory" with this path to the following folder:
~/admix2_data/f2_Olalde/
(I had created the folder; it was empty at the start.)
After that, I clicked on "Select .ind or .fam file" and selected:
~/admix2_data/Olalde_2023/1240K_1kCE_Balkan.ind
The screen displayed the individual samples by population (see attachment Screenshot 1 - ADMIXTOOLS2.png)
and the geno file:
~/admix2_data/Olalde_2023/1240K_1kCE_Balkan.geno
However, when I hit the "Extract Data" button, it seemed to run for a while and then I got the following screen instead: (Screenshot 2 - ADMIXTOOLS2.png)
(The screenshots are attached as png files.)
Here is the command line information from the terminal output:
Listening on http://127.0.0.1:6133
Warning: Good news!
You don't need to call `useShinyalert()` anymore. Please remove this line from your code.
If you really want to pre-load {shinyalert} to the UI for any reason, use:
`useShinyalert(force = TRUE)`
[1] "navbar detected"
[1] "f2_options"
[1] "navbar selected"
[1] "f2_options"
[1] "navbar:"
[1] "f2_options"
[1] "multiprocess"
[1] "dashboardbody"
[1] "expanded!"
[1] "data"
[1] "navbar == data!"
[1] "load_data"
[1] "global$bod change detected!"
Warning: The 'plotly_click' event tied a source ID of 'src' is not registered. In order to obtain this event data, please add `event_register(p, 'plotly_click')` to the plot (`p`) that you wish to obtain event data from.
[1] "global$iscountdata"
[1] FALSE
[1] "global$allinds"
[1] "CEE_EarlyMedieval" "Croatia_Brekinjova"
[3] "Croatia_Cepinski" "Croatia_Gardun"
[5] "Croatia_Gornji" "Croatia_Jagodnjak"
[7] "Croatia_MalaMetaljka" "Croatia_Nustar"
[9] "Croatia_Osijek" "Croatia_Sipar"
[11] "Croatia_SisakPogorelec" "Croatia_TrogirDobric"
[13] "Croatia_TrogirDragulin" "Croatia_Zadar"
[15] "Serbia_Gomolava" "Serbia_Kormadin"
[17] "Serbia_Lepenski" "Serbia_Mediana"
[19] "Serbia_TimacumKuline" "Serbia_TimacumSlog"
[21] "Serbia_ViminaciumGrobalja" "Serbia_ViminaciumPecine"
[23] "Serbia_ViminaciumPirivoj" "Serbia_ViminaciumRit"
[25] "Serbia_ViminaciumRudine" "Serbia_ViminaciumSvetinja"
[1] "popselectors"
[1] "CEE_EarlyMedieval" "Croatia_Brekinjova"
[3] "Croatia_Cepinski" "Croatia_Gardun"
[5] "Croatia_Gornji" "Croatia_Jagodnjak"
[7] "Croatia_MalaMetaljka" "Croatia_Nustar"
[9] "Croatia_Osijek" "Croatia_Sipar"
[11] "Croatia_SisakPogorelec" "Croatia_TrogirDobric"
[13] "Croatia_TrogirDragulin" "Croatia_Zadar"
[15] "Serbia_Gomolava" "Serbia_Kormadin"
[17] "Serbia_Lepenski" "Serbia_Mediana"
[19] "Serbia_TimacumKuline" "Serbia_TimacumSlog"
[21] "Serbia_ViminaciumGrobalja" "Serbia_ViminaciumPecine"
[23] "Serbia_ViminaciumPirivoj" "Serbia_ViminaciumRit"
[25] "Serbia_ViminaciumRudine" "Serbia_ViminaciumSvetinja"
[1] "popselectors2"
[1] "global$iscountdata"
[1] FALSE
[1] "dashboardbody"
[1] "expanded!"
NULL
[1] "navbar == data!"
[1] "load_data"
Warning in dir.create(input$textdirinput) :
cannot create dir '', reason 'No such file or directory'
Warning in normalizePath(input$textdirinput) :
path[1]="": No such file or directory
[1] "global$iscountdata"
[1] FALSE
[1] "popselectors"
[1] "CEE_EarlyMedieval" "Croatia_Brekinjova"
[3] "Croatia_Cepinski" "Croatia_Gardun"
[5] "Croatia_Gornji" "Croatia_Jagodnjak"
[7] "Croatia_MalaMetaljka" "Croatia_Nustar"
[9] "Croatia_Osijek" "Croatia_Sipar"
[11] "Croatia_SisakPogorelec" "Croatia_TrogirDobric"
[13] "Croatia_TrogirDragulin" "Croatia_Zadar"
[15] "Serbia_Gomolava" "Serbia_Kormadin"
[17] "Serbia_Lepenski" "Serbia_Mediana"
[19] "Serbia_TimacumKuline" "Serbia_TimacumSlog"
[21] "Serbia_ViminaciumGrobalja" "Serbia_ViminaciumPecine"
[23] "Serbia_ViminaciumPirivoj" "Serbia_ViminaciumRit"
[25] "Serbia_ViminaciumRudine" "Serbia_ViminaciumSvetinja"
[1] "popselectors2"
[1] "global$iscountdata"
[1] FALSE
[1] "popselectors"
[1] "CEE_EarlyMedieval" "Croatia_Brekinjova"
[3] "Croatia_Cepinski" "Croatia_Gardun"
[5] "Croatia_Gornji" "Croatia_Jagodnjak"
[7] "Croatia_MalaMetaljka" "Croatia_Nustar"
[9] "Croatia_Osijek" "Croatia_Sipar"
[11] "Croatia_SisakPogorelec" "Croatia_TrogirDobric"
[13] "Croatia_TrogirDragulin" "Croatia_Zadar"
[15] "Serbia_Gomolava" "Serbia_Kormadin"
[17] "Serbia_Lepenski" "Serbia_Mediana"
[19] "Serbia_TimacumKuline" "Serbia_TimacumSlog"
[21] "Serbia_ViminaciumGrobalja" "Serbia_ViminaciumPecine"
[23] "Serbia_ViminaciumPirivoj" "Serbia_ViminaciumRit"
[25] "Serbia_ViminaciumRudine" "Serbia_ViminaciumSvetinja"
[1] "popselectors2"
At that point, it ceased to function, or at least I didn't know how to proceed from that point.
I really appreciate your help!
Steve Bird
…________________________________
From: Robert Maier ***@***.***>
Sent: Wednesday, June 5, 2024 11:48 AM
To: uqrmaie1/admixtools ***@***.***>
Cc: sbird1000 ***@***.***>; Author ***@***.***>
Subject: Re: [uqrmaie1/admixtools] Problem making f2 file from plink .bed .fam and .bim files (Issue #74)
I'm glad you're finding Admixtools 2 useful! One of my goals when working on it was to make these methods a bit more accessible to non-experts. But I realize it's still challenging.
A single sample per file won't work. It's necessary that all samples are in the same set of files. You can use the PLINK --merge-list option to merge the file sets. Feel free to email me if you run into other issues!
—
Reply to this email directly, view it on GitHub<#74 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BITDDKVADVROKFQQQPZH62TZF46NTAVCNFSM6AAAAABH6WTSMWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNJQGUYTOMJYGM>.
You are receiving this because you authored the thread.
|
Hello again,
I tried to run the program again and this time clicked on the f2 button despite the screen showing in Screenshot2 in my previous email. I then went to the terminal and found some additional information:
[1] "popselectors2"
[1] "global$bod change detected!"
[1] "f2plot"
[1] "get f2"
[1] "get f2"
ℹ Reading precomputed data for 26 populations...
ℹ Reading f2 data for pair 351 out of 351...
Warning in f2_from_precomp(global$countdir, inds = inds, pops = pops, apply_corr = input$f2corr, :
Some f2-statistic estimates are negative across blocks. This is probably caused by too many missing or rare SNPs in populations with low sample size, which makes the f2 bias correction unreliable. Consider running 'f2_from_precomp' with 'afprod = TRUE'.
[1] "get f2 done"
Warning: Error in pivot_longer: Arguments in `...` must be used.
✖ Problematic argument:
• ..1 = "pop2"
ℹ Did you misspell an argument name?
129: <Anonymous>
128: signalCondition
127: signal_abort
126: action
124: action_dots
123: check_dots
122: <Anonymous>
121: pivot_longer
120: %>%
119: <reactive:plotly_f2> [/home/steve/R/x86_64-pc-linux-gnu-library/4.1/admixtools/shiny_admixtools/app.R#1938]
103: plotly_f2
102: renderPlotly [/home/steve/R/x86_64-pc-linux-gnu-library/4.1/admixtools/shiny_admixtools/app.R#2061]
101: func
98: shinyRenderWidget
97: func
84: renderFunc
83: output$f2heatmap
2: shiny::runApp
1: admixtools::run_shiny_admixtools
Input to asJSON(keep_vec_names=TRUE) is a named vector. In a future version of jsonlite, this option will not be supported, and named vectors will be translated into arrays instead of objects. If you want JSON object output, please use a named list instead. See ?toJSON.
Maybe this is helpful. Thanks again for setting me on the right track.
Steve
…________________________________
From: Steven Bird ***@***.***>
Sent: Monday, June 10, 2024 12:42 PM
To: uqrmaie1/admixtools ***@***.***>
Subject: Re: [uqrmaie1/admixtools] Problem making f2 file from plink .bed .fam and .bim files (Issue #74)
Hi Robert,
I've had a chance to play around with ADMIXTOOLS2 over the weekend and had some questions about its operation. I am using the admixtools::run_shiny_admixtools() shell to carry out operations. For a test file, I am using the aDNA set from Olalde 2023, available here:
https://reich.hms.harvard.edu/datasets
I was able to produce the qpGraph once, but have not been able to do it again. If you could look at these screen shots, perhaps you may be able to let me know where I'm going wrong?
I loaded the "Select data directory" with this path to the following folder:
~/admix2_data/f2_Olalde/
(I had created the folder; it was empty at the start.)
After that, I clicked on "Select .ind or .fam file" and selected:
~/admix2_data/Olalde_2023/1240K_1kCE_Balkan.ind
The screen displayed the individual samples by population (see attachment Screenshot 1 - ADMIXTOOLS2.png)
and the geno file:
~/admix2_data/Olalde_2023/1240K_1kCE_Balkan.geno
However, when I hit the "Extract Data" button, it seemed to run for a while and then I got the following screen instead: (Screenshot 2 - ADMIXTOOLS2.png)
(The screenshots are attached as png files.)
Here is the command line information from the terminal output:
Listening on http://127.0.0.1:6133
Warning: Good news!
You don't need to call `useShinyalert()` anymore. Please remove this line from your code.
If you really want to pre-load {shinyalert} to the UI for any reason, use:
`useShinyalert(force = TRUE)`
[1] "navbar detected"
[1] "f2_options"
[1] "navbar selected"
[1] "f2_options"
[1] "navbar:"
[1] "f2_options"
[1] "multiprocess"
[1] "dashboardbody"
[1] "expanded!"
[1] "data"
[1] "navbar == data!"
[1] "load_data"
[1] "global$bod change detected!"
Warning: The 'plotly_click' event tied a source ID of 'src' is not registered. In order to obtain this event data, please add `event_register(p, 'plotly_click')` to the plot (`p`) that you wish to obtain event data from.
[1] "global$iscountdata"
[1] FALSE
[1] "global$allinds"
[1] "CEE_EarlyMedieval" "Croatia_Brekinjova"
[3] "Croatia_Cepinski" "Croatia_Gardun"
[5] "Croatia_Gornji" "Croatia_Jagodnjak"
[7] "Croatia_MalaMetaljka" "Croatia_Nustar"
[9] "Croatia_Osijek" "Croatia_Sipar"
[11] "Croatia_SisakPogorelec" "Croatia_TrogirDobric"
[13] "Croatia_TrogirDragulin" "Croatia_Zadar"
[15] "Serbia_Gomolava" "Serbia_Kormadin"
[17] "Serbia_Lepenski" "Serbia_Mediana"
[19] "Serbia_TimacumKuline" "Serbia_TimacumSlog"
[21] "Serbia_ViminaciumGrobalja" "Serbia_ViminaciumPecine"
[23] "Serbia_ViminaciumPirivoj" "Serbia_ViminaciumRit"
[25] "Serbia_ViminaciumRudine" "Serbia_ViminaciumSvetinja"
[1] "popselectors"
[1] "CEE_EarlyMedieval" "Croatia_Brekinjova"
[3] "Croatia_Cepinski" "Croatia_Gardun"
[5] "Croatia_Gornji" "Croatia_Jagodnjak"
[7] "Croatia_MalaMetaljka" "Croatia_Nustar"
[9] "Croatia_Osijek" "Croatia_Sipar"
[11] "Croatia_SisakPogorelec" "Croatia_TrogirDobric"
[13] "Croatia_TrogirDragulin" "Croatia_Zadar"
[15] "Serbia_Gomolava" "Serbia_Kormadin"
[17] "Serbia_Lepenski" "Serbia_Mediana"
[19] "Serbia_TimacumKuline" "Serbia_TimacumSlog"
[21] "Serbia_ViminaciumGrobalja" "Serbia_ViminaciumPecine"
[23] "Serbia_ViminaciumPirivoj" "Serbia_ViminaciumRit"
[25] "Serbia_ViminaciumRudine" "Serbia_ViminaciumSvetinja"
[1] "popselectors2"
[1] "global$iscountdata"
[1] FALSE
[1] "dashboardbody"
[1] "expanded!"
NULL
[1] "navbar == data!"
[1] "load_data"
Warning in dir.create(input$textdirinput) :
cannot create dir '', reason 'No such file or directory'
Warning in normalizePath(input$textdirinput) :
path[1]="": No such file or directory
[1] "global$iscountdata"
[1] FALSE
[1] "popselectors"
[1] "CEE_EarlyMedieval" "Croatia_Brekinjova"
[3] "Croatia_Cepinski" "Croatia_Gardun"
[5] "Croatia_Gornji" "Croatia_Jagodnjak"
[7] "Croatia_MalaMetaljka" "Croatia_Nustar"
[9] "Croatia_Osijek" "Croatia_Sipar"
[11] "Croatia_SisakPogorelec" "Croatia_TrogirDobric"
[13] "Croatia_TrogirDragulin" "Croatia_Zadar"
[15] "Serbia_Gomolava" "Serbia_Kormadin"
[17] "Serbia_Lepenski" "Serbia_Mediana"
[19] "Serbia_TimacumKuline" "Serbia_TimacumSlog"
[21] "Serbia_ViminaciumGrobalja" "Serbia_ViminaciumPecine"
[23] "Serbia_ViminaciumPirivoj" "Serbia_ViminaciumRit"
[25] "Serbia_ViminaciumRudine" "Serbia_ViminaciumSvetinja"
[1] "popselectors2"
[1] "global$iscountdata"
[1] FALSE
[1] "popselectors"
[1] "CEE_EarlyMedieval" "Croatia_Brekinjova"
[3] "Croatia_Cepinski" "Croatia_Gardun"
[5] "Croatia_Gornji" "Croatia_Jagodnjak"
[7] "Croatia_MalaMetaljka" "Croatia_Nustar"
[9] "Croatia_Osijek" "Croatia_Sipar"
[11] "Croatia_SisakPogorelec" "Croatia_TrogirDobric"
[13] "Croatia_TrogirDragulin" "Croatia_Zadar"
[15] "Serbia_Gomolava" "Serbia_Kormadin"
[17] "Serbia_Lepenski" "Serbia_Mediana"
[19] "Serbia_TimacumKuline" "Serbia_TimacumSlog"
[21] "Serbia_ViminaciumGrobalja" "Serbia_ViminaciumPecine"
[23] "Serbia_ViminaciumPirivoj" "Serbia_ViminaciumRit"
[25] "Serbia_ViminaciumRudine" "Serbia_ViminaciumSvetinja"
[1] "popselectors2"
At that point, it ceased to function, or at least I didn't know how to proceed from that point.
I really appreciate your help!
Steve Bird
________________________________
From: Robert Maier ***@***.***>
Sent: Wednesday, June 5, 2024 11:48 AM
To: uqrmaie1/admixtools ***@***.***>
Cc: sbird1000 ***@***.***>; Author ***@***.***>
Subject: Re: [uqrmaie1/admixtools] Problem making f2 file from plink .bed .fam and .bim files (Issue #74)
I'm glad you're finding Admixtools 2 useful! One of my goals when working on it was to make these methods a bit more accessible to non-experts. But I realize it's still challenging.
A single sample per file won't work. It's necessary that all samples are in the same set of files. You can use the PLINK --merge-list option to merge the file sets. Feel free to email me if you run into other issues!
—
Reply to this email directly, view it on GitHub<#74 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/BITDDKVADVROKFQQQPZH62TZF46NTAVCNFSM6AAAAABH6WTSMWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNJQGUYTOMJYGM>.
You are receiving this because you authored the thread.
|
I'm a first time user of Admixtools2. I have been trying to create the f2 file using 'extract_f2' but no luck. Here are various command line combos I've tried, so far with no luck:
I entered:
prefix = '
/plink2/EAS006uc'/F2/'my_f2_dir = '
extract_f2(prefix, my_f2_dir)
This is what I got back:
ℹ Reading allele frequencies from PLINK files...
ℹ EAS006uc.bed has 1 samples and 2503505 SNPs
ℹ Calculating allele frequencies from 1 samples in 1 populations
ℹ Expected size of allele frequency data: 600 MB
2503k SNPs read...
✔ 2503505 SNPs read in total
Error in discard_snps(snpdat, maxmiss = maxmiss, auto_only = auto_only, :
Could not parse chromosome numbers! Set 'auto_only = FALSE' to ignore chromosome labels!
In addition: Warning message:
In discard_snps(snpdat, maxmiss = maxmiss, auto_only = auto_only, :
Keeping only chromosomes 1 to 22! Set
auto_only = FALSE
to keep all chromosomes!I then tried the same command syntax but with the 'auto_only = FALSE" included and got this:
prefix = '
/plink2/EAS006uc'/F2/'my_f2_dir = '
extract_f2(prefix,'auto_only = FALSE', my_f2_dir)
The result was:
ℹ Reading allele frequencies from PLINK files...
Error in match_samples(fam$X2, fam$X1, inds, pops) :
Individuals missing in indfile:
~/F2/
Why is 'extract_f2' looking for a file called "indfile" when it was not looking for that file in the first run? Rather than .bed .bin and .fam, files, it appears to be looking for files perhaps associated with the EIGENSTRAT format. Why won't it recognize the Plink format files? Do I need to add some other function or flag to my input code?
Thanks for your help!
The text was updated successfully, but these errors were encountered: