iandunham/Forge
For details of the analysis approach, see the documentation in the web version at http://www.1000genomes.org/forge-analysis

Running Forge requires the following:

1. The script itself, currently called forge.pl, written in Perl. It has the following Perl dependencies:

    use 5.012;
    use warnings;
    use DBI;
    use Sort::Naturally;
    use Cwd;
    use Storable;
    use Getopt::Long;

2. The sqlite3 db file that stores the bitstrings, called forge.db. This is a 98 GB file for forge2.0.

3. Four stored hashes containing the parameters for the background selection: two files each for either the omni genotyping array or all the other GWAS SNP arrays.

    76M  omni.snp_bins.10
    231B omni.snp_params.10
    63M  snp_bins.10
    231B snp_params.10

   Both the database and the hashes are downloadable from

    ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/browser/forge_11/ forge_db_20141112.tar.gz for forge1.1
    ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/browser/forge/ forge_20131009.tar.gz for forge1.0

4. A forge.ini file in the same directory as the script. Edit this to provide the directory in which the database and hashes are stored.

5. An R 3.0 installation with the "devtools" and "rCharts" packages installed. See https://github.com/ramnathv/rCharts. You will need to install the latest version, e.g.

    require(devtools)
    install_github('rCharts', 'ramnathv', ref = "dev")

The input data is one of several options:

a. A list of rsids for SNPs
b. A pseudobed format of chr\tbeg\tend\trsid
c. bed format of locations
d. vcf

The analysis requires a minimum of 20 SNPs (this is not a strict limit, but operationally it works best). To be analysed, SNPs have to be in phase 1 of 1000 Genomes. The script gives warnings for SNPs that are not found. It also warns about background sets that do not have the right number of SNPs chosen, but this is for information only.
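Since the analysis works best with at least 20 SNPs, it can help to check an rsid list before starting a run. The following is a minimal sketch of such a check and is not part of Forge. The function names `count_rsids` and `check_rsid_list` are hypothetical, and it assumes input option (a): one rsid per line in the form `rs` followed by digits, with no trailing whitespace.

```shell
# Hypothetical pre-flight helpers; not part of forge.pl.

# Count distinct rsids (lines of the form rs<digits>) in a one-per-line file.
count_rsids() {
    grep -E '^rs[0-9]+$' "$1" | sort -u | wc -l | tr -d ' '
}

# Warn (and return 1) if the list falls below the recommended 20 SNPs.
check_rsid_list() {
    n=$(count_rsids "$1")
    if [ "$n" -lt 20 ]; then
        echo "warning: only $n distinct rsids in $1 (20 or more recommended)" >&2
        return 1
    fi
    echo "ok: $n distinct rsids in $1"
}
```

Calling `check_rsid_list rsidfile` before a run reports the number of distinct rsids. It only checks the size of the list and cannot tell whether each SNP is in phase 1 of 1000 Genomes; the script's own warnings cover that.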
It takes a series of command line options as follows:

-f : the file to run on
-data : whether to analyse ENCODE (encode) or EpigenomeRoadmap (erc) data
-label : a name for the files that are generated and for the plot titles where there is a title
-format : the input data format. If this is location data, e.g. bed or tabix or some vcf lines, the rsid is obtained from the sqlite3 database.
-bkgd : whether the background selection should be from Omni array SNPs (omni) or a general set of GWAS typing arrays (gwas)

Some of these have defaults, as described in the perldoc.

Minimally the command line is

    forge.pl -f rsidfile -label Some_label

which will by default run on Epigenome Roadmap data with the gwas background.

OUTPUT
======

The following outputs are generated:

1. A pdf static chart, suitable for download.
2. Two alternate d3 interactive charts. The *dchart.html charts are the best in terms of axis labelling but have some minor quirks.
3. A Datatables table.
4. A tsv file of the results.
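To make the -data and -bkgd choices explicit rather than relying on defaults, the command line can be assembled by a small wrapper. This is a sketch only: `forge_cmd` is a hypothetical helper, it uses just the -f, -label, -data and -bkgd options and values named in this README, and it prints the command instead of running it. Whether the script is started as `forge.pl`, `./forge.pl` or `perl forge.pl` depends on the local installation.

```shell
# Hypothetical helper; prints a forge.pl command line, does not run it.
# Usage: forge_cmd INPUT_FILE LABEL [DATA] [BKGD]
forge_cmd() {
    data=${3:-erc}    # README default: Epigenome Roadmap data
    bkgd=${4:-gwas}   # README default: gwas background
    case "$data" in
        encode|erc) ;;
        *) echo "bad -data value: $data" >&2; return 1 ;;
    esac
    case "$bkgd" in
        omni|gwas) ;;
        *) echo "bad -bkgd value: $bkgd" >&2; return 1 ;;
    esac
    echo "forge.pl -f $1 -label $2 -data $data -bkgd $bkgd"
}
```

For example, `forge_cmd rsidfile Some_label encode omni` prints a command for ENCODE data with the Omni array background, which can be inspected and then pasted into the shell.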
About
GWAS SNP Regulatory Analysis Tool